Search and replace with regular expressions

It is possible to perform search and replace operations on strings using regular expressions. How complicated this is naturally depends on how much flexibility you need:

  • if you want to replace instances of an expression in a string with a fixed string, then you can use a simple call to String.replaceAll();
  • if the replacement string isn't fixed, then you can use a loop with a Pattern and Matcher, which gives you complete control over the replacement string.

Replacing with a fixed string

If you just want to replace all instances of a given expression within a string with another fixed string, then things are fairly straightforward. For example, the following replaces every digit with the letter X:

str = str.replaceAll("[0-9]", "X");

The following replaces each run of two or more spaces with a single space:

str = str.replaceAll(" {2,}", " ");
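As a quick check of the two calls above, here is a minimal sketch that wraps them in methods and runs them on sample input. The class name, method names and sample strings are made up for illustration.

```java
class FixedReplaceDemo {

    // Replaces every ASCII digit with the letter X.
    static String maskDigits(String s) {
        return s.replaceAll("[0-9]", "X");
    }

    // Collapses each run of two or more spaces into a single space.
    static String collapseSpaces(String s) {
        return s.replaceAll(" {2,}", " ");
    }

    public static void main(String[] args) {
        System.out.println(maskDigits("Room 101, floor 3"));        // Room XXX, floor X
        System.out.println(collapseSpaces("too   many    spaces")); // too many spaces
    }
}
```

Note that replaceAll() returns a new string; the original is left unchanged, which is why the examples assign the result back to str.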

As we'll see below, we should be careful about passing "raw" strings as the second parameter, since two characters in this string (the dollar sign and the backslash) have special meanings.

Replacing with a sub-part of the matched portion

In the replacement string, we can refer to captured groups from the regular expression. For example, the following expression removes instances of the HTML 'bold' tag from a string, but leaves the text inside the tag intact:

str = str.replaceAll("<b>([^<]*)</b>", "$1");

In the expression <b>([^<]*)</b>, we capture the text between the open and close tags as group 1. Then, in the replacement string, we can refer to the text of group 1 with the expression $1. (The second group would be $2 etc.)
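The sketch below runs the bold-tag example and adds a second, hypothetical case that uses both $1 and $2 to swap two captured groups. The class name, method names and the "Surname, Forename" format are assumptions chosen for illustration.

```java
class GroupReplaceDemo {

    // Removes <b>...</b> tags but keeps the text between them (group 1).
    static String stripBold(String html) {
        return html.replaceAll("<b>([^<]*)</b>", "$1");
    }

    // Turns "Surname, Forename" into "Forename Surname" by emitting
    // group 2 first and then group 1.
    static String swapNames(String s) {
        return s.replaceAll("(\\w+), (\\w+)", "$2 $1");
    }

    public static void main(String[] args) {
        System.out.println(stripBold("This is <b>important</b> text")); // This is important text
        System.out.println(swapNames("Smith, Jane"));                   // Jane Smith
    }
}
```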

Including a dollar sign in the replacement string

To include a literal dollar sign in the replacement string, we need to put a backslash before the dollar symbol. Inside a Java string literal the backslash must itself be escaped, so we write "\\$":

str = str.replaceAll("USD", "\\$");

The static method Matcher.quoteReplacement() will replace instances of dollar signs and backslashes in a given string with the correct form to allow them to be used as literal replacements:

str = str.replaceAll("USD",
  Matcher.quoteReplacement("$"));

In general:

  • If there is a chance that the replacement string will include a dollar sign or a backslash character, then you should wrap it in Matcher.quoteReplacement() (see note 1).
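To see why this matters, consider a replacement string that comes from outside the program, such as a price. The sketch below is illustrative only: the PRICE placeholder, the class name and the method names are assumptions. It shows the quoted version working and the unquoted version throwing, because "$5" is read as a reference to a group that does not exist.

```java
import java.util.regex.Matcher;

class QuoteReplacementDemo {

    // Replaces every "PRICE" token with the given text, treated literally
    // whatever characters it contains.
    static String fillPrice(String template, String literalPrice) {
        return template.replaceAll("PRICE", Matcher.quoteReplacement(literalPrice));
    }

    // Returns true if passing the text unquoted makes replaceAll() throw.
    static boolean failsUnquoted(String template, String literalPrice) {
        try {
            template.replaceAll("PRICE", literalPrice);
            return false;
        } catch (IllegalArgumentException | IndexOutOfBoundsException e) {
            // "$5" was interpreted as a group reference rather than as text.
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(fillPrice("Total: PRICE", "$5.00"));     // Total: $5.00
        System.out.println(failsUnquoted("Total: PRICE", "$5.00")); // true
    }
}
```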

More flexible find and replacement operations

The replaceAll() method is suitable for cases where the replacement string is fixed or of a fixed format. For more flexibility, we can call Matcher.find() in a loop and build the replacement for each match in code.
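As a rough sketch of what such a loop can look like, the example below doubles every number found in a string, something a fixed replacement string cannot express. It uses Matcher.appendReplacement() and Matcher.appendTail() to assemble the result, which is one common way to do this and not necessarily the only one. The class name, method name and the doubling rule are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindReplaceDemo {

    private static final Pattern NUMBER = Pattern.compile("[0-9]+");

    // Replaces each run of digits with twice its numeric value.
    // Assumes every number fits in an int; larger values would make
    // Integer.parseInt() throw.
    static String doubleNumbers(String input) {
        Matcher m = NUMBER.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            int value = Integer.parseInt(m.group());
            String replacement = Integer.toString(value * 2);
            // Copies the text before this match, then the replacement.
            // quoteReplacement() is not strictly needed for digits, but it
            // is a safe habit when the replacement is computed.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        // Copies whatever follows the last match.
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(doubleNumbers("3 apples and 12 pears")); // 6 apples and 24 pears
    }
}
```

StringBuffer is used here because the appendReplacement() overload that accepts it is available in all Java versions that have java.util.regex.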


1. Note that this method was added in Java 5.

Written by Neil Coffey. Copyright © Javamex UK 2012. All rights reserved.