Java Regular Expressions
What is a Regular Expression?
A regular expression is a sequence of characters that forms a search pattern. When you search for data in a text, you can use this search pattern to describe what you are searching for.
A regular expression can be a single character, or a more complicated pattern.
Regular expressions can be used to perform all types of text search and text replace operations.
Java has no dedicated regular expression type in the language itself, but we can import the java.util.regex package from the standard library to work with regular expressions. The package includes the following classes:
- Pattern Class - Defines a pattern (to be used in a search)
- Matcher Class - Used to search for the pattern
- PatternSyntaxException Class - Indicates a syntax error in a regular expression pattern
Example
Find out if there are any occurrences of the word "w3schools" in a sentence:
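A minimal sketch of such a search. The class name `FindWordDemo`, the helper `containsWord`, and the sample sentence are assumptions made for illustration:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindWordDemo {
    // Returns true if "w3schools" occurs anywhere in the sentence, ignoring letter case.
    static boolean containsWord(String sentence) {
        Pattern pattern = Pattern.compile("w3schools", Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(sentence);
        return matcher.find();
    }

    public static void main(String[] args) {
        // The sample sentence is an assumption chosen for illustration.
        if (containsWord("Visit W3Schools!")) {
            System.out.println("Match found");
        } else {
            System.out.println("Match not found");
        }
    }
}
```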
Example Explained
In this example, the word "w3schools" is being searched for in a sentence.
First, the pattern is created using the Pattern.compile() method. The first parameter indicates which pattern is being searched for, and the second parameter is a flag indicating that the search should be case-insensitive. The second parameter is optional.
The matcher() method is used to search for the pattern in a string. It returns a Matcher object which contains information about the search that was performed.
The find() method returns true if the pattern was found in the string and false if it was not found.
Flags
Flags in the compile() method change how the search is performed. Here are a few of them:
- Pattern.CASE_INSENSITIVE - The case of letters will be ignored when performing a search.
- Pattern.LITERAL - Special characters in the pattern will not have any special meaning and will be treated as ordinary characters when performing a search.
- Pattern.UNICODE_CASE - Use it together with the CASE_INSENSITIVE flag to also ignore the case of letters outside of the English alphabet
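The effect of these flags can be sketched as follows. The class `FlagsDemo`, the helper `found`, and the sample strings are assumptions for illustration; multiple flags are combined with the bitwise OR operator:

```java
import java.util.regex.Pattern;

class FlagsDemo {
    // Compiles the regex with the given flags and reports whether it occurs in the input.
    static boolean found(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).find();
    }

    public static void main(String[] args) {
        // CASE_INSENSITIVE: letter case is ignored.
        System.out.println(found("hello", Pattern.CASE_INSENSITIVE, "Say HELLO")); // true

        // LITERAL: the dot loses its special meaning and only matches a literal dot.
        System.out.println(found("a.c", Pattern.LITERAL, "abc")); // false
        System.out.println(found("a.c", Pattern.LITERAL, "a.c")); // true

        // UNICODE_CASE together with CASE_INSENSITIVE: the pattern is a lowercase
        // e with acute accent, the input is the uppercase form of the same letter.
        int flags = Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE;
        System.out.println(found("\u00e9", flags, "\u00c9")); // true
    }
}
```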
Regular Expression Patterns
The first parameter of the Pattern.compile() method is the pattern. It describes what is being searched for.
Brackets are used to find a range of characters:
Expression | Description |
---|---|
[abc] | Find one character from the options between the brackets |
[^abc] | Find one character NOT between the brackets |
[0-9] | Find one character from the range 0 to 9 |
Metacharacters
Metacharacters are characters with a special meaning:
Metacharacter | Description |
---|---|
| | Find a match for any one of the patterns separated by | as in: cat|dog|fish |
. | Find just one instance of any character |
^ | Finds a match at the beginning of a string as in: ^Hello |
$ | Finds a match at the end of the string as in: World$ |
\d | Find a digit |
\s | Find a whitespace character |
\b | Find a match at the beginning of a word like this: \bWORD, or at the end of a word like this: WORD\b |
\uxxxx | Find the Unicode character specified by the hexadecimal number xxxx |
Quantifiers
Quantifiers define quantities:
Quantifier | Description |
---|---|
n+ | Matches any string that contains at least one n |
n* | Matches any string that contains zero or more occurrences of n |
n? | Matches any string that contains zero or one occurrences of n |
n{x} | Matches any string that contains a sequence of X n's |
n{x,y} | Matches any string that contains a sequence of X to Y n's |
n{x,} | Matches any string that contains a sequence of at least X n's |
Note: If your expression needs to search for one of the special characters you can use a backslash ( \ ) to escape them. In Java, backslashes in strings need to be escaped themselves, so two backslashes are needed to escape special characters. For example, to search for one or more question marks you can use the following expression: "\\?"
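As a small sketch of the escaping rule in the note above, the following counts literal question marks in a string. The class `EscapeDemo`, the helper `countQuestionMarks`, and the sample text are assumptions for illustration:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EscapeDemo {
    // Counts literal question marks. In Java source, "\\?" is the regex \?,
    // which matches one question mark instead of acting as a quantifier.
    static int countQuestionMarks(String text) {
        Matcher matcher = Pattern.compile("\\?").matcher(text);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countQuestionMarks("Is it? Really??")); // 3
    }
}
```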
Java Regex
The Java Regex or Regular Expression is an API to define a pattern for searching or manipulating strings.
It is widely used to define constraints on strings, such as password and email validation.
Java Regex API provides 1 interface and 3 classes in java.util.regex package.
java.util.regex package
The Matcher and Pattern classes provide the core regular expression functionality. The java.util.regex package provides the following classes and interfaces for regular expressions.
- MatchResult interface
- Matcher class
- Pattern class
- PatternSyntaxException class

Matcher class
It implements the MatchResult interface. It is a regex engine which is used to perform match operations on a character sequence.
No. | Method | Description |
---|---|---|
1 | boolean matches() | tests whether the entire input matches the pattern. |
2 | boolean find() | finds the next subsequence of the input that matches the pattern. |
3 | boolean find(int start) | resets the matcher and finds the next subsequence that matches the pattern, starting at the given index. |
4 | String group() | returns the matched subsequence. |
5 | int start() | returns the starting index of the matched subsequence. |
6 | int end() | returns the ending index of the matched subsequence. |
7 | int groupCount() | returns the number of capturing groups in the matcher's pattern. |
Pattern class
It is the compiled version of a regular expression. It is used to define a pattern for the regex engine.
No. | Method | Description |
---|---|---|
1 | static Pattern compile(String regex) | compiles the given regex and returns the instance of the Pattern. |
2 | Matcher matcher(CharSequence input) | creates a matcher that matches the given input with the pattern. |
3 | static boolean matches(String regex, CharSequence input) | It works as the combination of compile and matcher methods. It compiles the regular expression and matches the given input with the pattern. |
4 | String[] split(CharSequence input) | splits the given input string around matches of given pattern. |
5 | String pattern() | returns the regex pattern. |
Example of Java Regular Expressions
There are three ways to write the regex example in Java.
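A sketch of what the three ways might look like, assuming the pattern `.s` and the input `as` as sample values. The class `ThreeWaysDemo` and its method names are hypothetical:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ThreeWaysDemo {
    // 1st way: compile, create a matcher, then match, in separate steps.
    static boolean firstWay(String regex, String input) {
        Pattern p = Pattern.compile(regex);
        Matcher m = p.matcher(input);
        return m.matches();
    }

    // 2nd way: the same calls chained into one expression.
    static boolean secondWay(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    // 3rd way: the static convenience method Pattern.matches().
    static boolean thirdWay(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // ".s" means: any single character followed by 's'.
        System.out.println(firstWay(".s", "as"));  // true
        System.out.println(secondWay(".s", "as")); // true
        System.out.println(thirdWay(".s", "as"));  // true
    }
}
```

The first form is usually preferable when the same pattern is applied to many inputs, because the compiled Pattern can be reused.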
Regular Expression . Example
The . (dot) represents a single character.
Regex Character classes
No. | Character Class | Description |
---|---|---|
1 | [abc] | a, b, or c (simple class) |
2 | [^abc] | Any character except a, b, or c (negation) |
3 | [a-zA-Z] | a through z or A through Z, inclusive (range) |
4 | [a-d[m-p]] | a through d, or m through p: [a-dm-p] (union) |
5 | [a-z&&[def]] | d, e, or f (intersection) |
6 | [a-z&&[^bc]] | a through z, except for b and c: [ad-z] (subtraction) |
7 | [a-z&&[^m-p]] | a through z, and not m through p: [a-lq-z](subtraction) |
Regular Expression Character classes Example
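A minimal sketch of the character classes from the table above. The class `CharClassDemo`, the helper `matchesClass`, and the inputs are assumptions for illustration. Because Pattern.matches() requires the whole input to match, a single character class only accepts one-character inputs:

```java
import java.util.regex.Pattern;

class CharClassDemo {
    // Tests whether the entire input matches the given character class expression.
    static boolean matchesClass(String charClass, String input) {
        return Pattern.matches(charClass, input);
    }

    public static void main(String[] args) {
        System.out.println(matchesClass("[amn]", "a"));        // true
        System.out.println(matchesClass("[amn]", "abcd"));     // false: more than one character
        System.out.println(matchesClass("[^amn]", "b"));       // true (negation)
        System.out.println(matchesClass("[a-d[m-p]]", "n"));   // true (union)
        System.out.println(matchesClass("[a-z&&[^bc]]", "b")); // false (subtraction)
        System.out.println(matchesClass("[a-z&&[^bc]]", "d")); // true
    }
}
```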
Regex Quantifiers
The quantifiers specify the number of occurrences of a character.
Regex | Description |
---|---|
X? | X occurs once or not at all |
X+ | X occurs once or more times |
X* | X occurs zero or more times |
X{n} | X occurs n times only |
X{n,} | X occurs n or more times |
X{y,z} | X occurs at least y times but not more than z times |
Regular Expression Character classes and Quantifiers Example
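A short sketch combining character classes with quantifiers. The class `QuantifierDemo`, the helper `wholeMatch`, and the inputs are assumptions for illustration:

```java
import java.util.regex.Pattern;

class QuantifierDemo {
    // Tests whether the entire input matches the regex.
    static boolean wholeMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(wholeMatch("[amn]?", "a"));       // true: once
        System.out.println(wholeMatch("[amn]?", "aaa"));     // false: more than once
        System.out.println(wholeMatch("[amn]+", "aammmnn")); // true: one or more
        System.out.println(wholeMatch("[amn]*", ""));        // true: zero occurrences allowed
        System.out.println(wholeMatch("a{2}", "aa"));        // true: exactly two
        System.out.println(wholeMatch("a{2,3}", "aaa"));     // true: upper bound is inclusive
        System.out.println(wholeMatch("a{2,3}", "aaaa"));    // false: more than three
    }
}
```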
Regex Metacharacters
The regular expression metacharacters work as shorthand codes.
Regex | Description |
---|---|
. | Any character (may or may not match line terminators) |
\d | Any digit, short for [0-9] |
\D | Any non-digit, short for [^0-9] |
\s | Any whitespace character, short for [\t\n\x0B\f\r] |
\S | Any non-whitespace character, short for [^\s] |
\w | Any word character, short for [a-zA-Z_0-9] |
\W | Any non-word character, short for [^\w] |
\b | A word boundary |
\B | A non word boundary |
Regular Expression Metacharacters Example
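A minimal sketch of a few metacharacters from the table above. The class `MetacharDemo`, its helpers, and the inputs are assumptions for illustration. Note that each backslash is doubled inside a Java string literal:

```java
import java.util.regex.Pattern;

class MetacharDemo {
    // Tests whether the entire input matches the regex.
    static boolean wholeMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    // Tests whether the regex occurs anywhere in the input.
    static boolean occursIn(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(wholeMatch("\\d", "1"));              // true: a digit
        System.out.println(wholeMatch("\\d", "a"));              // false
        System.out.println(wholeMatch("\\D+", "abc"));           // true: non-digits only
        System.out.println(wholeMatch("\\w+", "user_42"));       // true: word characters
        System.out.println(wholeMatch("\\s", " "));              // true: whitespace
        System.out.println(occursIn("\\bcat\\b", "concat cat")); // true: "cat" as a whole word
        System.out.println(occursIn("\\bcat\\b", "concat"));     // false: only part of a word
    }
}
```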
Regular Expression Question 1
Regular Expression Question 2
Java Regex Finder Example
Java - Regular Expressions
Java provides the java.util.regex package for pattern matching with regular expressions. Java regular expressions are very similar to those of the Perl programming language.
A regular expression is a special sequence of characters that helps you match or find other strings or sets of strings, using a specialized syntax held in a pattern. They can be used to search, edit, or manipulate text and data.
The java.util.regex package primarily consists of the following three classes −
Pattern Class − A Pattern object is a compiled representation of a regular expression. The Pattern class provides no public constructors. To create a pattern, you must first invoke one of its public static compile() methods, which will then return a Pattern object. These methods accept a regular expression as the first argument.
Matcher Class − A Matcher object is the engine that interprets the pattern and performs match operations against an input string. Like the Pattern class, Matcher defines no public constructors. You obtain a Matcher object by invoking the matcher() method on a Pattern object.
PatternSyntaxException − A PatternSyntaxException object is an unchecked exception that indicates a syntax error in a regular expression pattern.
Capturing Groups
Capturing groups are a way to treat multiple characters as a single unit. They are created by placing the characters to be grouped inside a set of parentheses. For example, the regular expression (dog) creates a single group containing the letters "d", "o", and "g".
Capturing groups are numbered by counting their opening parentheses from the left to the right. In the expression ((A)(B(C))), for example, there are four such groups −
- ((A)(B(C)))
- (A)
- (B(C))
- (C)
To find out how many groups are present in the expression, call the groupCount method on a matcher object. The groupCount method returns an int showing the number of capturing groups present in the matcher's pattern.
There is also a special group, group 0, which always represents the entire expression. This group is not included in the total reported by groupCount.
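The group numbering described above can be checked with a short sketch. The class `GroupsDemo` and the helper `groups` are hypothetical names, and the input "ABC" is an assumed sample:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupsDemo {
    // Returns group 0 through group groupCount() for a full match,
    // or null if the input does not match the regex.
    static String[] groups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.matches()) {
            return null;
        }
        String[] result = new String[m.groupCount() + 1];
        for (int i = 0; i <= m.groupCount(); i++) {
            result[i] = m.group(i);
        }
        return result;
    }

    public static void main(String[] args) {
        String[] g = groups("((A)(B(C)))", "ABC");
        // groupCount() is 4, so the array holds group 0 plus four capturing groups.
        for (int i = 0; i < g.length; i++) {
            System.out.println("group " + i + ": " + g[i]);
        }
        // group 0: ABC, group 1: ABC, group 2: A, group 3: BC, group 4: C
    }
}
```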
Example
Following example illustrates how to find a digit string from the given alphanumeric string −
```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexMatches {

   public static void main(String[] args) {
      // String to be scanned to find the pattern.
      String line = "This order was placed for QT3000! OK?";
      String pattern = "(.*)(\\d+)(.*)";

      // Create a Pattern object
      Pattern r = Pattern.compile(pattern);

      // Now create matcher object.
      Matcher m = r.matcher(line);
      if (m.find()) {
         System.out.println("Found value: " + m.group(0));
         System.out.println("Found value: " + m.group(1));
         System.out.println("Found value: " + m.group(2));
      } else {
         System.out.println("NO MATCH");
      }
   }
}
```
This will produce the following result −
Output
```
Found value: This order was placed for QT3000! OK?
Found value: This order was placed for QT300
Found value: 0
```
Regular Expression Syntax
Here is the table listing down all the regular expression metacharacter syntax available in Java −
Subexpression | Matches |
---|---|
^ | Matches the beginning of the line. |
$ | Matches the end of the line. |
. | Matches any single character except a line terminator. Using the DOTALL flag (embedded as (?s)) allows it to match line terminators as well. |
[...] | Matches any single character in brackets. |
[^...] | Matches any single character not in brackets. |
\A | Beginning of the entire string. |
\z | End of the entire string. |
\Z | End of the entire string except allowable final line terminator. |
re* | Matches 0 or more occurrences of the preceding expression. |
re+ | Matches 1 or more occurrences of the preceding expression. |
re? | Matches 0 or 1 occurrence of the preceding expression. |
re{n} | Matches exactly n number of occurrences of the preceding expression. |
re{n,} | Matches n or more occurrences of the preceding expression. |
re{n,m} | Matches at least n and at most m occurrences of the preceding expression. |
a|b | Matches either a or b. |
(re) | Groups regular expressions and remembers the matched text. |
(?:re) | Groups regular expressions without remembering the matched text. |
(?>re) | Matches the independent pattern without backtracking. |
\w | Matches the word characters. |
\W | Matches the nonword characters. |
\s | Matches the whitespace. Equivalent to [ \t\n\x0B\f\r]. |
\S | Matches the nonwhitespace. |
\d | Matches the digits. Equivalent to [0-9]. |
\D | Matches the nondigits. |
\A | Matches the beginning of the string. |
\Z | Matches the end of the string. If a newline exists, it matches just before newline. |
\z | Matches the end of the string. |
\G | Matches the point where the last match finished. |
\n | Back-reference to capture group number "n". |
\b | Matches the word boundaries when outside the brackets. Matches the backspace (0x08) when inside the brackets. |
\B | Matches the nonword boundaries. |
\n, \t, etc. | Matches newlines, carriage returns, tabs, etc. |
\Q | Escape (quote) all characters up to \E. |
\E | Ends quoting begun with \Q. |
Methods of the Matcher Class
Here is a list of useful instance methods −
Index Methods
Index methods provide useful index values that show precisely where the match was found in the input string −
Sr.No. | Method & Description |
---|---|
1 | public int start() Returns the start index of the previous match. |
2 | public int start(int group) Returns the start index of the subsequence captured by the given group during the previous match operation. |
3 | public int end() Returns the offset after the last character matched. |
4 | public int end(int group) Returns the offset after the last character of the subsequence captured by the given group during the previous match operation. |
Study Methods
Study methods review the input string and return a Boolean indicating whether or not the pattern is found −
Sr.No. | Method & Description |
---|---|
1 | public boolean lookingAt() Attempts to match the input sequence, starting at the beginning of the region, against the pattern. |
2 | public boolean find() Attempts to find the next subsequence of the input sequence that matches the pattern. |
3 | public boolean find(int start) Resets this matcher and then attempts to find the next subsequence of the input sequence that matches the pattern, starting at the specified index. |
4 | public boolean matches() Attempts to match the entire region against the pattern. |
Replacement Methods
Replacement methods are useful methods for replacing text in an input string −
Sr.No. | Method & Description |
---|---|
1 | public Matcher appendReplacement(StringBuffer sb, String replacement) Implements a non-terminal append-and-replace step. |
2 | public StringBuffer appendTail(StringBuffer sb) Implements a terminal append-and-replace step. |
3 | public String replaceAll(String replacement) Replaces every subsequence of the input sequence that matches the pattern with the given replacement string. |
4 | public String replaceFirst(String replacement) Replaces the first subsequence of the input sequence that matches the pattern with the given replacement string. |
5 | public static String quoteReplacement(String s) Returns a literal replacement String for the specified String. This method produces a String that will work as a literal replacement s in the appendReplacement method of the Matcher class. |
The start and end Methods
Following is the example that counts the number of times the word "cat" appears in the input string −
Example
```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexMatches {

   private static final String REGEX = "\\bcat\\b";
   private static final String INPUT = "cat cat cat cattie cat";

   public static void main(String[] args) {
      Pattern p = Pattern.compile(REGEX);
      Matcher m = p.matcher(INPUT);   // get a matcher object
      int count = 0;

      while (m.find()) {
         count++;
         System.out.println("Match number " + count);
         System.out.println("start(): " + m.start());
         System.out.println("end(): " + m.end());
      }
   }
}
```
This will produce the following result −
Output
```
Match number 1
start(): 0
end(): 3
Match number 2
start(): 4
end(): 7
Match number 3
start(): 8
end(): 11
Match number 4
start(): 19
end(): 22
```
You can see that this example uses word boundaries to ensure that the letters "c" "a" "t" are not merely a substring in a longer word. It also gives some useful information about where in the input string the match has occurred.
The start method returns the start index of the previous match (or, when given a group number, of the subsequence captured by that group during the previous match operation), and end returns the index of the last character matched, plus one.
The matches and lookingAt Methods
The matches and lookingAt methods both attempt to match an input sequence against a pattern. The difference, however, is that matches requires the entire input sequence to be matched, while lookingAt does not.
Both methods always start at the beginning of the input string. Here is the example explaining the functionality −
Example
```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexMatches {

   private static final String REGEX = "foo";
   private static final String INPUT = "fooooooooooooooooo";
   private static Pattern pattern;
   private static Matcher matcher;

   public static void main(String[] args) {
      pattern = Pattern.compile(REGEX);
      matcher = pattern.matcher(INPUT);

      System.out.println("Current REGEX is: " + REGEX);
      System.out.println("Current INPUT is: " + INPUT);

      System.out.println("lookingAt(): " + matcher.lookingAt());
      System.out.println("matches(): " + matcher.matches());
   }
}
```
This will produce the following result −
Output
```
Current REGEX is: foo
Current INPUT is: fooooooooooooooooo
lookingAt(): true
matches(): false
```
The replaceFirst and replaceAll Methods
The replaceFirst and replaceAll methods replace the text that matches a given regular expression. As their names indicate, replaceFirst replaces the first occurrence, and replaceAll replaces all occurrences.
Here is the example explaining the functionality −
Example
```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexMatches {

   private static String REGEX = "dog";
   private static String INPUT = "The dog says meow. " + "All dogs say meow.";
   private static String REPLACE = "cat";

   public static void main(String[] args) {
      Pattern p = Pattern.compile(REGEX);

      // get a matcher object
      Matcher m = p.matcher(INPUT);
      INPUT = m.replaceAll(REPLACE);
      System.out.println(INPUT);
   }
}
```
This will produce the following result −
Output
```
The cat says meow. All cats say meow.
```
The appendReplacement and appendTail Methods
The Matcher class also provides appendReplacement and appendTail methods for text replacement.
Here is the example explaining the functionality −
Example
```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexMatches {

   private static String REGEX = "a*b";
   private static String INPUT = "aabfooaabfooabfoob";
   private static String REPLACE = "-";

   public static void main(String[] args) {
      Pattern p = Pattern.compile(REGEX);

      // get a matcher object
      Matcher m = p.matcher(INPUT);
      StringBuffer sb = new StringBuffer();
      while (m.find()) {
         m.appendReplacement(sb, REPLACE);
      }
      m.appendTail(sb);
      System.out.println(sb.toString());
   }
}
```
This will produce the following result −
Output
```
-foo-foo-foo-
```
PatternSyntaxException Class Methods
A PatternSyntaxException is an unchecked exception that indicates a syntax error in a regular expression pattern. The PatternSyntaxException class provides the following methods to help you determine what went wrong −
Sr.No. | Method & Description |
---|---|
1 | public String getDescription() Retrieves the description of the error. |
2 | public int getIndex() Retrieves the error index. |
3 | public String getPattern() Retrieves the erroneous regular expression pattern. |
4 | public String getMessage() Returns a multi-line string containing the description of the syntax error and its index, the erroneous regular expression pattern, and a visual indication of the error index within the pattern. |
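A minimal sketch of catching this exception and querying it with the methods listed above. The class `SyntaxCheckDemo`, the helper `isValid`, and the malformed pattern `(abc` are assumptions for illustration; the exact wording returned by getDescription() depends on the Java runtime:

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class SyntaxCheckDemo {
    // Returns true if the regex compiles, false if it has a syntax error.
    static boolean isValid(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            // Report what went wrong and where.
            System.out.println("Description: " + e.getDescription());
            System.out.println("Index: " + e.getIndex());
            System.out.println("Pattern: " + e.getPattern());
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("[a-z]+")); // true
        System.out.println(isValid("(abc"));   // false: the group is never closed
    }
}
```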
Java regex
Regular Expressions in Java
Regular expressions (regex for short) are an API for defining string patterns that can be used for searching, manipulating, and editing a string in Java. Email validation and password checks are two areas where regex is widely used to define constraints. Regular expressions are provided under the java.util.regex package, which consists of 3 classes and 1 interface. The three classes are listed in the table below:
Class | Description |
---|---|
java.util.regex.Pattern | Used for defining patterns |
java.util.regex.Matcher | Used for performing match operations on text using patterns |
java.util.regex.PatternSyntaxException | Used for indicating syntax error in a regular expression pattern |
Most regex work in Java is done through two of these classes:
- Pattern Class
- Matcher Class
Class 1: Pattern Class
A Pattern object is the compiled representation of a regular expression and can be used to define various types of patterns. The class provides no public constructors. A pattern is created by invoking the static compile() method, which accepts a regular expression as its first argument and returns a Pattern.
Method | Description |
---|---|
compile(String regex) | It is used to compile the given regular expression into a pattern. |
compile(String regex, int flags) | It is used to compile the given regular expression into a pattern with the given flags. |
flags() | It is used to return this pattern’s match flags. |
matcher(CharSequence input) | It is used to create a matcher that will match the given input against this pattern. |
matches(String regex, CharSequence input) | It is used to compile the given regular expression and attempts to match the given input against it. |
pattern() | It is used to return the regular expression from which this pattern was compiled. |
quote(String s) | It is used to return a literal pattern String for the specified String. |
split(CharSequence input) | It is used to split the given input sequence around matches of this pattern. |
split(CharSequence input, int limit) | It is used to split the given input sequence around matches of this pattern. The limit parameter controls the number of times the pattern is applied. |
toString() | It is used to return the string representation of this pattern. |
Example: Pattern class
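A short sketch of some of the Pattern methods listed above. The class `PatternClassDemo`, the helper `wholeMatch`, and the sample patterns and inputs are assumptions for illustration:

```java
import java.util.regex.Pattern;

class PatternClassDemo {
    // Pattern.matches() compiles the regex and tests it against the whole input.
    static boolean wholeMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // "e*" allows any number of 'e' characters after "geeksforg".
        System.out.println(wholeMatch("geeksforge*ks", "geeksforgeeks")); // true

        // The pattern covers "geeks" but the trailing "for" is left unmatched.
        System.out.println(wholeMatch("g*geeks*", "geeksfor")); // false

        // pattern() returns the regex the Pattern was compiled from.
        Pattern p = Pattern.compile("ge*");
        System.out.println(p.pattern()); // ge*

        // quote() turns a string into a pattern that matches it literally.
        System.out.println(wholeMatch("1+1", "1+1"));                // false: '+' is a quantifier
        System.out.println(wholeMatch(Pattern.quote("1+1"), "1+1")); // true
    }
}
```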
Class 2: Matcher class
A Matcher object performs match operations on an input string by interpreting a pattern. This class too defines no public constructors. A matcher is obtained by invoking matcher() on a Pattern object.
Method | Description |
---|---|
find() | It is mainly used for searching multiple occurrences of the regular expressions in the text. |
find(int start) | It is used for searching occurrences of the regular expressions in the text starting from the given index. |
start() | It is used for getting the start index of a match that is being found using find() method. |
end() | It is used for getting the end index of a match that is being found using find() method. It returns index of character next to last matching character. |
groupCount() | It is used to find the number of capturing groups in this matcher's pattern. |
group() | It is used to find the matched subsequence. |
matches() | It is used to test whether the entire input matches the pattern. |
Note that Pattern.matches() checks whether the whole text matches a pattern. Other methods (demonstrated below) are mainly used to find multiple occurrences of patterns in the text.
Let us look at a few sample programs, as we did for the Pattern class. They demonstrate the workings of compile(), find(), start(), end(), and split() to give a better understanding of the Matcher class.
Example 1: Pattern searching
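A minimal sketch of searching for every occurrence of a pattern with find(), start(), and end(). The class `FindAllDemo`, the helper `findAll`, and the sample text are assumptions for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindAllDemo {
    // Collects "start..end" for every match; end is the index just past the match.
    static List<String> findAll(String regex, String text) {
        List<String> spans = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            spans.add(m.start() + ".." + m.end());
        }
        return spans;
    }

    public static void main(String[] args) {
        // "geeks" occurs at indices 0 to 4 and 8 to 12 of the text.
        System.out.println(findAll("geeks", "geeksforgeeks")); // [0..5, 8..13]
    }
}
```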
Example 2: Simple regular expression searching
Example 3: Case Insensitive pattern searching
Example 4: split() method to split a text based on a delimiter pattern.
The split() method breaks a given input around matches of the given regular expression. It has two variants, one of which takes an additional limit argument, as listed in the Pattern class table above.
Illustration:
Input --> String: 016-78967
Output --> {"016", "78967"}
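The illustration above can be sketched as follows, assuming "-" as the delimiter pattern. The class `SplitDemo` and the helper `splitOn` are hypothetical names:

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SplitDemo {
    // Splits the text around every match of the regex.
    static String[] splitOn(String regex, String text) {
        return Pattern.compile(regex).split(text);
    }

    // Splits with a limit; a positive limit caps the number of resulting parts.
    static String[] splitOn(String regex, String text, int limit) {
        return Pattern.compile(regex).split(text, limit);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitOn("-", "016-78967"))); // [016, 78967]
        System.out.println(Arrays.toString(splitOn("-", "a-b-c", 2)));  // [a, b-c]
    }
}
```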
Having discussed both classes, we now introduce two related concepts: the exception class PatternSyntaxException and the MatchResult interface.
Concept 1: PatternSyntaxException class
PatternSyntaxException is an unchecked exception used to indicate a syntax error in a regular expression pattern. Its methods are listed in the table below.
Method | Description |
---|---|
getDescription() | It is used to retrieve the description of the error. |
getIndex() | It is used to retrieve the error-index. |
getMessage() | It is used to return a multi-line string containing the description of the syntax error and its index, the erroneous regular-expression pattern, and a visual indication of the error index within the pattern. |
getPattern() | It is used to retrieve the erroneous regular-expression pattern. |
Concept 2: MatchResult Interface
This interface represents the result of a match operation for a regular expression. Note that although the match boundaries, groups, and group boundaries can be queried, they cannot be modified through a MatchResult. Its methods are listed in the table below:
Method | Description |
---|---|
end() | It is used to return the offset after the last character is matched. |
end(int group) | It is used to return the offset after the last character of the subsequence captured by the given group during this match. |
group() | It is used to return the input subsequence matched by the previous match. |
group(int group) | It is used to return the input subsequence captured by the given group during the previous match operation. |
groupCount() | It is used to return the number of capturing groups in this match result’s pattern. |
start() | It is used to return the start index of the match. |
start(int group) | It is used to return the start index of the subsequence captured by the given group during this match. |
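A minimal sketch of obtaining a MatchResult snapshot from a Matcher via toMatchResult() and reading it with the methods above. The class `MatchResultDemo`, the helper `firstMatch`, and the sample pattern and input are assumptions for illustration:

```java
import java.util.regex.MatchResult;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchResultDemo {
    // Returns a read-only snapshot of the first match, or null if there is none.
    static MatchResult firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.toMatchResult() : null;
    }

    public static void main(String[] args) {
        MatchResult r = firstMatch("(\\d+)-(\\d+)", "call 016-78967 now");
        System.out.println(r.group());      // 016-78967
        System.out.println(r.start());      // 5
        System.out.println(r.end());        // 14
        System.out.println(r.groupCount()); // 2
        System.out.println(r.group(1));     // 016
        System.out.println(r.group(2));     // 78967
    }
}
```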
Lastly, here are some important observations from the above article:
- We create a Pattern object by calling Pattern.compile(); there is no public constructor. compile() is a static method in the Pattern class.
- Like above, we create a Matcher object using matcher() on objects of Pattern class.
- Pattern.matches() is also a static method that is used to check if given text as a whole matches pattern or not.
- find() is used to find multiple occurrences of patterns in the text.
- We can split a text based on a delimiter pattern using the split() method.
This article is contributed by Akash Ojha.