The PCRE4J project's goal is to bring the power of the PCRE library to Java.
PCRE4J provides 100% coverage of the PCRE2 API, giving you access to every feature of the PCRE2 library from Java.
Java's built-in java.util.regex covers many use cases,
but PCRE2 offers capabilities that go beyond what the standard library
provides:

- Features not available in java.util.regex, including \K
  (match reset), recursive patterns, callouts, and the DFA matching
  algorithm.
- A java.util.regex-compatible API: the regex module mirrors Pattern and
  Matcher, so you can switch engines without rewriting
  application code.

| Feature | java.util.regex | PCRE4J |
|---|---|---|
| Recursive patterns ((?R), (?1)) | No | Yes |
| \K match reset | No | Yes |
| Callouts | No | Yes |
| DFA matching | No | Yes |
| JIT compilation | No | Yes (default) |
| ReDoS protection (match/depth/heap limits) | No | Yes |
| Compiled pattern serialization to bytes | No | Yes |
| Glob/POSIX pattern conversion | No | Yes |
| Native library bundles (no system install) | N/A | Yes |
| GraalVM native-image | Yes | Yes |
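
The first two rows of the table can be checked against the JDK alone. The sketch below uses only java.util.regex and reports whether the standard engine accepts a given syntax; the class and method names are illustrative and not part of PCRE4J.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class JdkSyntaxProbe {
    /** Returns true if java.util.regex accepts the given pattern syntax. */
    public static boolean jdkAccepts(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            // The JDK engine rejects constructs it does not implement
            return false;
        }
    }

    public static void main(String[] args) {
        // Balanced parentheses via whole-pattern recursion (?R)
        System.out.println("recursion: " + jdkAccepts("\\((?:[^()]|(?R))*\\)"));
        // \K match reset
        System.out.println("\\K reset: " + jdkAccepts("foo\\Kbar"));
        // Named groups, which both engines support
        System.out.println("named group: " + jdkAccepts("(?<year>\\d{4})"));
    }
}
```

Patterns that the JDK rejects here are the ones the table lists as available through PCRE4J.
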
Add the dependency and start matching — the API mirrors
java.util.regex:

    import org.pcre4j.regex.Pattern;

    var matcher = Pattern.compile("(?<year>\\d{4})-(?<month>\\d{2})").matcher("2026-02");
    if (matcher.find()) {
        System.out.println(matcher.group("year")); // "2026"
    }

Maven (pom.xml):

    <properties>
        <pcre4j.version>1.0.0</pcre4j.version>
    </properties>
    <dependencies>
        <dependency>
            <groupId>org.pcre4j</groupId>
            <artifactId>regex</artifactId>
            <version>${pcre4j.version}</version>
        </dependency>
        <dependency>
            <groupId>org.pcre4j</groupId>
            <!-- TODO: Select one of the following artifacts corresponding to the backend you want to use -->
            <artifactId>jna</artifactId>
            <!-- <artifactId>ffm</artifactId> -->
            <version>${pcre4j.version}</version>
        </dependency>
    </dependencies>

Gradle (build.gradle.kts):

    val pcre4jVersion = "1.0.0"

    dependencies {
        implementation("org.pcre4j:regex:$pcre4jVersion")
        // TODO: Select one of the following artifacts corresponding to the backend you want to use
        implementation("org.pcre4j:jna:$pcre4jVersion")
        // implementation("org.pcre4j:ffm:$pcre4jVersion")
    }

By default, JIT compilation is used when both the platform and the
library support it. To override this behavior, pass the
pcre2.regex.jit system property with the value
false to the JVM.
Add a platform-specific bundle to your dependencies and PCRE4J loads the library automatically:
| Artifact | Platform |
|---|---|
| org.pcre4j:pcre4j-native-linux-x86_64 | Linux x86_64 |
| org.pcre4j:pcre4j-native-linux-aarch64 | Linux aarch64 |
| org.pcre4j:pcre4j-native-macos-x86_64 | macOS x86_64 (Intel) |
| org.pcre4j:pcre4j-native-macos-aarch64 | macOS aarch64 (Apple Silicon) |
| org.pcre4j:pcre4j-native-windows-x86_64 | Windows x86_64 |
| org.pcre4j:pcre4j-native-all | All supported platforms |

Example (Gradle):

    implementation("org.pcre4j:pcre4j-native-linux-x86_64:1.0.0")

Install the PCRE2 library on your system:

- Debian/Ubuntu: sudo apt install libpcre2-8-0
- macOS (Homebrew): brew install pcre2
- Windows: make the PCRE2 library available on PATH

The library is located automatically via pcre2-config,
pkg-config, or well-known platform paths. You can also set
the path explicitly with
-Dpcre2.library.path=/path/to/lib.
Automatic library discovery can be disabled with
-Dpcre2.library.discovery=false.
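
As a rough illustration of the lookup order described above, the sketch below checks the explicit property first, honours the discovery opt-out, and then asks pkg-config for the library directory. It is not PCRE4J's actual implementation: the class name is hypothetical, only the pkg-config step of discovery is covered, and treating pcre2.library.path as a plain string is an assumption.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

public class LibraryPathSketch {
    /** Resolves a location to search for the PCRE2 shared library, if any. */
    public static Optional<String> resolve() {
        // 1. An explicitly configured path always wins.
        String explicit = System.getProperty("pcre2.library.path");
        if (explicit != null && !explicit.isBlank()) {
            return Optional.of(explicit);
        }
        // 2. Respect the opt-out switch for automatic discovery.
        if ("false".equalsIgnoreCase(System.getProperty("pcre2.library.discovery"))) {
            return Optional.empty();
        }
        // 3. Ask pkg-config where the 8-bit PCRE2 library is installed.
        return firstOutputLine("pkg-config", "--variable=libdir", "libpcre2-8");
    }

    private static Optional<String> firstOutputLine(String... command) {
        try {
            Process process = new ProcessBuilder(command).redirectErrorStream(true).start();
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
                String line = reader.readLine();
                boolean succeeded = process.waitFor() == 0;
                return succeeded && line != null && !line.isBlank()
                        ? Optional.of(line.trim())
                        : Optional.empty();
            }
        } catch (IOException e) {
            return Optional.empty(); // the tool is not installed
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return Optional.empty();
        }
    }
}
```
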
PCRE4J is organized into layered modules published as separate Maven
artifacts under org.pcre4j:

    regex ──→ lib ──→ api ←── jna
                       ↑
                       └────── ffm

| Artifact | Description |
|---|---|
| api | Backend interface contract (IPcre2) with PCRE2 constants |
| lib | Core wrapper (Pcre2Code, match data, compile/match options, utilities). Depends on api |
| jna | JNA backend. Depends on api |
| ffm | FFM backend. Depends on api |
| regex | java.util.regex-compatible API (Pattern, Matcher). Depends on api and lib |

Each API tier requires a different set of artifacts. The
regex and lib artifacts declare their upstream
dependencies as transitive, so your dependency manager pulls them
automatically:
| API Tier | You Declare | Resolved Transitively |
|---|---|---|
| java.util.regex-compatible | regex + jna or ffm | api, lib |
| PCRE4J wrapper | lib + jna or ffm | api |
| Direct PCRE2 | jna or ffm | api |

A backend (jna or ffm) is always required
at runtime but is intentionally not a transitive dependency of
regex or lib, letting consumers choose which
native access mechanism to use.
The regex and lib convenience APIs use a
global backend held by Pcre4j. The backend is initialized
automatically — just add a backend artifact (jna or
ffm) to your classpath and start using PCRE4J. No explicit
setup call is required.
On first use, Pcre4j.api() discovers available backends
via ServiceLoader. When both the FFM and JNA backends are
present, the FFM backend is preferred for its better performance.
For explicit control, you can still call Pcre4j.setup()
to install a specific backend:

    import org.pcre4j.Pcre4j;
    import org.pcre4j.jna.Pcre2; // or org.pcre4j.ffm.Pcre2

    static {
        Pcre4j.setup(new Pcre2());
    }

setup() takes priority over auto-discovery and may be
called again to replace the backend; existing compiled patterns are
unaffected because each Pcre2Code instance captures the
backend it was created with.
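
The precedence rules above (an explicit setup() wins, otherwise the preferred discovered backend is used) can be sketched in isolation. Everything in this example is hypothetical: Backend is a stand-in record rather than PCRE4J's IPcre2, the priority numbers are invented, and the discovered list is passed in directly instead of coming from ServiceLoader.

```java
import java.util.Comparator;
import java.util.List;

public class BackendSelectionSketch {
    /** Hypothetical stand-in for a native backend; this is not PCRE4J's IPcre2. */
    public record Backend(String name, int priority) {}

    // Explicitly installed backend; stays null until setup() is called.
    private static volatile Backend installed;

    /** Installs a backend explicitly; may be called again to replace it. */
    public static void setup(Backend backend) {
        installed = backend;
    }

    /** Returns the explicit backend if present, else the highest-priority discovered one. */
    public static Backend api(List<Backend> discovered) {
        Backend explicit = installed;
        if (explicit != null) {
            return explicit;
        }
        return discovered.stream()
                .max(Comparator.comparingInt(Backend::priority))
                .orElseThrow(() -> new IllegalStateException("No backend found on the classpath"));
    }
}
```

Giving the FFM stand-in a higher priority than the JNA one reproduces the documented preference, and calling setup() afterwards overrides it.
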
Every convenience constructor and factory method (e.g.
Pcre2Code(String), Pattern.compile(String))
has an explicit-API overload that accepts an IPcre2
parameter directly, bypassing the global singleton entirely.
Multi-classloader note: Pcre4j stores the backend in a static field, so each classloader that loads PCRE4J gets its own independent singleton. In application servers or plugin frameworks, either place the PCRE4J JARs in a shared classloader, or use the explicit-API overloads to avoid relying on the global state.
Add lib and a backend to your dependencies:
Maven (pom.xml):

    <dependencies>
        <dependency>
            <groupId>org.pcre4j</groupId>
            <artifactId>lib</artifactId>
            <version>1.0.0</version>
        </dependency>
        <dependency>
            <groupId>org.pcre4j</groupId>
            <!-- TODO: Select one of the following artifacts corresponding to the backend you want to use -->
            <artifactId>jna</artifactId>
            <!-- <artifactId>ffm</artifactId> -->
            <version>1.0.0</version>
        </dependency>
    </dependencies>

Gradle (build.gradle.kts):

    dependencies {
        implementation("org.pcre4j:lib:1.0.0")
        // TODO: Select one of the following artifacts corresponding to the backend you want to use
        implementation("org.pcre4j:jna:1.0.0")
        // implementation("org.pcre4j:ffm:1.0.0")
    }

Usage:

    import org.pcre4j.*;
    import org.pcre4j.option.*;

    import java.util.EnumSet;

    public class Usage {
        public static String[] example(String pattern, String subject) {
            // Prefer the JIT-compiled variant when the backend reports JIT support
            final Pcre2Code code;
            if (Pcre4jUtils.isJitSupported(Pcre4j.api())) {
                code = new Pcre2JitCode(
                        pattern,
                        EnumSet.noneOf(Pcre2CompileOption.class),
                        null,
                        null
                );
            } else {
                code = new Pcre2Code(
                        pattern,
                        EnumSet.noneOf(Pcre2CompileOption.class),
                        null
                );
            }

            // Match from offset 0 with no match options set
            final var matchData = new Pcre2MatchData(code);
            code.match(
                    subject,
                    0,
                    EnumSet.noneOf(Pcre2MatchOption.class),
                    matchData,
                    null
            );

            // Extract the matched groups from the match data
            return Pcre4jUtils.getMatchGroups(code, subject, matchData);
        }
    }

Add a backend artifact directly:
Maven (pom.xml):

    <dependencies>
        <dependency>
            <groupId>org.pcre4j</groupId>
            <!-- TODO: Select one of the following artifacts corresponding to the backend you want to use -->
            <artifactId>jna</artifactId>
            <!-- <artifactId>ffm</artifactId> -->
            <version>1.0.0</version>
        </dependency>
    </dependencies>

Gradle (build.gradle.kts):

    dependencies {
        // TODO: Select one of the following artifacts corresponding to the backend you want to use
        implementation("org.pcre4j:jna:1.0.0")
        // implementation("org.pcre4j:ffm:1.0.0")
    }

Usage:

    // TODO: Select one of the following imports for the backend you want to use:
    import org.pcre4j.jna.Pcre2;
    // import org.pcre4j.ffm.Pcre2;

    public class Usage {
        public static void example() {
            final var pcre2 = new Pcre2();

            // Single-element arrays receive the error code and offset on failure
            final var errorcode = new int[1];
            final var erroroffset = new long[1];

            final var code = pcre2.compile("pattern", 0, errorcode, erroroffset, 0);
            if (code == 0) {
                // A zero handle signals that compilation failed
                throw new RuntimeException(
                        "PCRE2 compilation failed with error code " + errorcode[0] + " at offset " + erroroffset[0]
                );
            }

            // Release the native compiled pattern when it is no longer needed
            pcre2.codeFree(code);
        }
    }

See the complete list of exposed PCRE2 API functions here.

java.util.regex API Compatibility

The regex module provides a complete implementation of
the java.util.regex API backed by PCRE2. All
Pattern and Matcher methods are supported.

Pattern Flags

| Flag | PCRE2 Mapping | Notes |
|---|---|---|
| CASE_INSENSITIVE | PCRE2_CASELESS | |
| COMMENTS | PCRE2_EXTENDED | |
| DOTALL | PCRE2_DOTALL | |
| LITERAL | PCRE2_LITERAL | |
| MULTILINE | PCRE2_MULTILINE | |
| UNICODE_CHARACTER_CLASS | PCRE2_UCP | |
| UNICODE_CASE | None | PCRE2 with UTF mode already performs Unicode-aware case folding |
| UNIX_LINES | PCRE2_NEWLINE_LF | |
| CANON_EQ | None | Via NFD normalization; see Pattern.CANON_EQ Javadoc for limitations |

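
Flags are combined with bitwise OR, as in the standard library. The example below is written against java.util.regex so that it runs without a native library. Since the regex module is described as mirroring Pattern and Matcher, replacing the two imports with their org.pcre4j.regex counterparts is expected to behave the same, though that swap is an assumption not exercised here. The class and method names are illustrative.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FlagsExample {
    /** Returns the first line consisting solely of "world" in any letter case, or null. */
    public static String firstWorldLine(String text) {
        // MULTILINE lets ^ and $ match at line boundaries; CASE_INSENSITIVE ignores case
        Pattern pattern = Pattern.compile("^world$", Pattern.MULTILINE | Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(text);
        return matcher.find() ? matcher.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstWorldLine("Hello\nWORLD\n"));
    }
}
```
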
Regular Expression Denial of Service (ReDoS) occurs when a crafted input causes catastrophic backtracking in a regex engine, leading to excessive CPU usage. PCRE4J provides several layers of protection against ReDoS attacks.
For reporting security vulnerabilities, please see the Security Policy.
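
To make the failure mode concrete, the sketch below runs a classic nested-quantifier pattern against a near-miss input, next to a variant hardened with a possessive quantifier. It uses java.util.regex purely so that it runs without a native library; the class name and the chosen input length are illustrative, and timings will vary by machine and JDK version.

```java
import java.util.regex.Pattern;

public class BacktrackingDemo {
    // Nested quantifiers: many ways to split a run of 'a' between the inner and outer loop
    public static final Pattern VULNERABLE = Pattern.compile("(a+)+$");
    // Possessive inner quantifier: consumed characters are never given back
    public static final Pattern HARDENED = Pattern.compile("(a++)+$");

    /** Searches a run of 'a' followed by 'b'; the trailing 'b' makes every attempt fail. */
    public static boolean matchesNearMiss(Pattern pattern, int length) {
        String input = "a".repeat(length) + "b";
        return pattern.matcher(input).find();
    }

    public static void main(String[] args) {
        for (Pattern pattern : new Pattern[] {VULNERABLE, HARDENED}) {
            long start = System.nanoTime();
            boolean found = matchesNearMiss(pattern, 20);
            long micros = (System.nanoTime() - start) / 1_000;
            System.out.println(pattern.pattern() + " found=" + found + " in " + micros + " us");
        }
    }
}
```

Both patterns accept and reject the same inputs here; only the amount of work done before giving up differs, and that gap is what resource limits are meant to bound.
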
The regex module enables PCRE2 JIT compilation by
default when the platform supports it. JIT-compiled patterns use a
fixed-size machine stack, which can mitigate some forms of catastrophic
backtracking and runaway recursion, but explicit match limits (see
below) should still be used for stronger guarantees on CPU and memory
usage.
To disable JIT: -Dpcre2.regex.jit=false
PCRE2 provides configurable limits that terminate match operations
exceeding resource thresholds. The regex module exposes
these via system properties:
| System Property | Description | PCRE2 Default |
|---|---|---|
| pcre2.regex.match.limit | Maximum number of internal match function calls | ~10,000,000 |
| pcre2.regex.depth.limit | Maximum backtracking depth | ~250 |
| pcre2.regex.heap.limit | Maximum heap memory in KiB | ~20,000 |

When a limit is exceeded, a MatchLimitException is
thrown (a RuntimeException subclass) with the specific
PCRE2 error code indicating which limit was hit.
Example: Tightening limits for untrusted input:

    java -Dpcre2.regex.match.limit=1000000 -Dpcre2.regex.depth.limit=100 -jar myapp.jar

Note: The PCRE2 library's compiled-in defaults already provide baseline protection. The system properties allow applications to tighten these limits further for security-sensitive use cases. When not set, the library defaults are used.
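
When the launch command cannot be changed, the same properties could be set from code at startup. The sketch below only fills in values the operator has not already supplied. It assumes the regex module reads these properties after this code has run, which the text above does not confirm; the class and method names are hypothetical, and the values are taken from the command-line example.

```java
public class MatchLimitDefaults {
    /** Sets a limit property only if no value was supplied on the command line. */
    public static void applyIfUnset(String property, long value) {
        if (value <= 0) {
            throw new IllegalArgumentException("Limit must be positive: " + value);
        }
        if (System.getProperty(property) == null) {
            System.setProperty(property, Long.toString(value));
        }
    }

    /** Applies tightened limits; call this before the first use of PCRE4J (assumed ordering). */
    public static void applyDefaults() {
        applyIfUnset("pcre2.regex.match.limit", 1_000_000L);
        applyIfUnset("pcre2.regex.depth.limit", 100L);
    }
}
```

Leaving operator-supplied values untouched keeps the command line authoritative.
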
For fine-grained control, the lib module provides
Pcre2MatchContext with setMatchLimit(),
setDepthLimit(), and setHeapLimit() methods
that can be applied on a per-match basis.
The PCRE4J library supports several backends to invoke the
pcre2 API.

jna

The jna backend uses the Java Native Access
library to invoke the pcre2 shared library. For this
backend to work, the pcre2 shared library must be installed
on the system. The library is located via jna.library.path,
or automatically discovered using pcre2-config,
pkg-config, or well-known platform paths as a fallback.

ffm

The ffm backend uses the Foreign
Function and Memory API to invoke the pcre2 shared
library. For this backend to work, the pcre2 shared library
must be installed on the system. The library is located via
java.library.path, or automatically discovered using
pcre2-config, pkg-config, or well-known
platform paths as a fallback.

The ffm module is packaged as a Multi-Release JAR
supporting both:

- Java 21, with the --enable-preview flag (FFM was a preview feature)
- Java 22+, where FFM is finalized

Note: The ffm backend is incompatible with OpenJ9-based JVMs (including IBM Semeru) on Java 21 due to a JVM bug in the preview FFM implementation that causes memory corruption assertions. OpenJ9 Java 22+, where FFM is finalized, works correctly. Use the jna backend on OpenJ9 Java 21.

PCRE4J supports GraalVM native-image compilation using the FFM backend. All modules ship with GraalVM reachability metadata, so no additional configuration is required.
Requirements:

- A GraalVM release with reachability-metadata.json foreign section support
- The FFM backend (ffm artifact); the JNA backend is not supported for native-image

The 1.0.0 release marks PCRE4J's first stable API with a comprehensive set of new features, including Pcre4j.withBackend().

See the Changelog for the full list of changes.
See the Roadmap for planned features and project direction.
Contributions are welcome! Please see CONTRIBUTING.md for guidelines.
All commits must be signed off to certify the Developer Certificate of Origin
(DCO). Use git commit -s to add the required
Signed-off-by line.
This project is brought to you by Alexey Pelykh, with great gratitude to the PCRE library author Philip Hazel and its contributors.
The source code is hosted on GitHub.
Please see the Javadoc Index for the detailed API documentation.