Search and replace with regular expressions

It is possible to perform search and replace operations on strings in Java using regular expressions. The Java String and Matcher classes offer relatively simple methods for matching and replacing strings, which bring the benefit of string-matching optimisations that would be cumbersome to implement from scratch. The complexity of using these methods depends on how much flexibility you need:

Replacing one "fixed" substring with another

This is the "simplest" form of search and replace. We want to find exact instances of a specific substring and replace them with another given substring. To do so, we can call replaceAll() on the String, but we need to put Pattern.quote() around the substring we are searching for. For example, this will replace all instances of the substring "1+" with "one plus":

str = str.replaceAll(Pattern.quote("1+"), "one plus");

If you are familiar with regular expressions, then you will know that a plus sign normally has a special meaning. But provided you remember to put Pattern.quote() around the first string, we can use replaceAll() as a simple search and replace call. (If the replacement substring contains a dollar sign or backslash, then we also need to use Matcher.quoteReplacement(): see below.)
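
As a minimal sketch of the point above, the two quoting calls can be wrapped in one small method so that both arguments are treated as plain text. The class name LiteralReplaceSketch and the method replaceLiteral() are hypothetical names chosen for this illustration; only replaceAll(), Pattern.quote() and Matcher.quoteReplacement() come from the Java library.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LiteralReplaceSketch {

    // Hypothetical helper: treats both the search text and the replacement as plain text.
    static String replaceLiteral(String input, String target, String replacement) {
        return input.replaceAll(
                Pattern.quote(target),                   // '+', '.', '*' etc. are matched literally
                Matcher.quoteReplacement(replacement));  // '$' and '\' are inserted literally
    }

    public static void main(String[] args) {
        System.out.println(replaceLiteral("1+1=2", "1+", "one plus "));  // one plus 1=2
        System.out.println(replaceLiteral("cost: 5 USD", "USD", "$"));   // cost: 5 $
    }
}
```

If no regular expression features are needed at all, String.replace(CharSequence, CharSequence) also replaces every occurrence of a literal substring, without any quoting.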

Replacing substrings with a fixed string

If you simply want to replace everything in a Java string that matches a given expression with another fixed string, then things are fairly straightforward. For example, the following replaces every digit with the letter X:

str = str.replaceAll("[0-9]", "X");

The following replaces each run of two or more spaces with a single space:

str = str.replaceAll(" {2,}", " ");
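
To see the effect of these two calls on actual input, here is a small self-contained sketch. The class and method names (FixedReplacementSketch, maskDigits, collapseSpaces) and the sample strings are made up for illustration; the patterns are the two shown above.

```java
class FixedReplacementSketch {

    // Replaces every digit with the letter X.
    static String maskDigits(String input) {
        return input.replaceAll("[0-9]", "X");
    }

    // Collapses each run of two or more spaces into a single space.
    static String collapseSpaces(String input) {
        return input.replaceAll(" {2,}", " ");
    }

    public static void main(String[] args) {
        System.out.println(maskDigits("Card 1234, PIN 99"));        // Card XXXX, PIN XX
        System.out.println(collapseSpaces("too    many   spaces")); // too many spaces
    }
}
```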

We'll see in the following sections that we should be careful about passing "raw" strings as the second parameter, since certain characters in this string (the dollar sign and the backslash) actually have special meanings.

Replacing with a sub-part of the matched portion

In the replacement string, we can refer to captured groups from the regular expression. For example, the following expression removes instances of the HTML 'bold' tag from a string, but leaves the text inside the tag intact:

str = str.replaceAll("<b>([^<]*)</b>", "$1");

In the expression <b>([^<]*)</b>, we capture the text between the open and close tags as group 1. Then, in the replacement string, we can refer to the text of group 1 with the expression $1. (The second group would be $2 etc.)
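
The following sketch runs the bold-tag example and adds a second case that uses several groups at once, rearranging a date by referring to $1, $2 and $3 in a different order. The class name, method names and sample strings are hypothetical, and the date pattern is an assumption made for this illustration rather than part of the original text.

```java
class GroupReferenceSketch {

    // Removes <b>...</b> tags but keeps the text between them (group 1).
    static String stripBold(String html) {
        return html.replaceAll("<b>([^<]*)</b>", "$1");
    }

    // Rewrites dates of the form yyyy-mm-dd as dd/mm/yyyy by reordering groups 1 to 3.
    static String isoToDayFirst(String text) {
        return text.replaceAll("(\\d{4})-(\\d{2})-(\\d{2})", "$3/$2/$1");
    }

    public static void main(String[] args) {
        System.out.println(stripBold("a <b>bold</b> move"));  // a bold move
        System.out.println(isoToDayFirst("Due 2024-01-15.")); // Due 15/01/2024.
    }
}
```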

Including a dollar sign or backslashes in the replacement string

To include a literal dollar sign or backslash in the replacement string, we need to "escape" it by putting a backslash in front of it, remembering that within a Java string literal, each backslash must itself be doubled up. For example, the following replaces "USD" with a dollar sign:

str = str.replaceAll("USD", "\\$");

The static method Matcher.quoteReplacement() will replace instances of dollar signs and backslashes in a given string with the correct form to allow them to be used as literal replacements:

str = str.replaceAll("USD",
  Matcher.quoteReplacement("$"));
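
As a rough sketch, the two approaches can be compared side by side. The class name, method names and sample input below are hypothetical; the point is only that manual escaping and Matcher.quoteReplacement() should produce the same result, and that quoteReplacement() simply puts a backslash in front of each dollar sign and backslash.

```java
import java.util.regex.Matcher;

class DollarReplacementSketch {

    // Escapes the dollar sign by hand: "\\$" in source code is the two characters \$.
    static String manualEscape(String input) {
        return input.replaceAll("USD", "\\$");
    }

    // Lets the library escape whatever replacement text it is given.
    static String quoted(String input, String replacement) {
        return input.replaceAll("USD", Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        System.out.println(manualEscape("Total: USD 20"));        // Total: $ 20
        System.out.println(quoted("Total: USD 20", "$"));         // Total: $ 20
        // The escaped form that quoteReplacement() produces for the text a\b$c:
        System.out.println(Matcher.quoteReplacement("a\\b$c"));   // a\\b\$c
    }
}
```

Using quoteReplacement() is the safer choice when the replacement text is not known in advance, for example when it comes from user input.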

In general:

Further information: more flexible find and replacement operations

For additional flexibility: