Basic regular expressions with String.matches()

On this page we'll look at how to form a basic regular expression and how to test if a string matches the expression. As our example, we'll consider a case fairly typical in data conversion or data cleansing applications:

  • we want to say if a given string represents the value "true", and return a boolean value accordingly;
  • but we need to be flexible in what string values we consider to mean "true" (e.g. the user could have entered "yes", could have capitalised the word etc).

Basics of regular expressions

A regular expression is a sequence of characters describing a pattern that we want to match strings against. The first general notion is that:

  • a "normal" character matches against itself.

By "normal", we mean excluding a few characters that have special meanings. We'll introduce these as we go along.

In Java, the easiest way to see if a given string matches a particular regular expression is to call the matches() method on the string, passing in the expression. For example, here is how we would check whether a string matches the regular expression true:

public boolean isTrueValue(String str) {
  return str.matches("true");
}

Since each character of the regular expression matches against itself, we have no "special" characters in our expression, and matches() only succeeds when the whole string matches, this effectively means that the string matches when (and only when) it equals the string "true". In other words, in this particular example, we could equally have written the following:

public boolean isTrueValue(String str) {
  return str.equals("true");
}
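
To see the "when (and only when)" behaviour in practice, here is a minimal sketch that runs the first version of the method against a few inputs. The class name WholeStringMatchDemo and the sample strings are just illustrative, and the method is made static so that it can be called from main():

```java
class WholeStringMatchDemo {
  // Same check as in the text, made static so main() can call it.
  static boolean isTrueValue(String str) {
    return str.matches("true");
  }

  public static void main(String[] args) {
    System.out.println(isTrueValue("true"));    // true
    System.out.println(isTrueValue("True"));    // false: 'T' is not 't'
    System.out.println(isTrueValue("true "));   // false: trailing space
    System.out.println(isTrueValue("untrue"));  // false: extra leading characters
  }
}
```

Note that matches() has to match the entire string: a string that merely contains "true" somewhere inside it does not count.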

Character classes ("character choices")

OK, so a regular expression with just "normal" characters isn't very interesting. But now for a more interesting example:

  • If we put several characters inside square brackets, [...], this means a choice between characters.

Technically, the choice is called a character class. So if we write [Tt], that means "either upper or lower case T". So to accept the values true or True we can write the following:

public boolean isTrueValue(String str) {
  return str.matches("[Tt]rue");
}
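
As a quick check, the following sketch tries the character class version on a few strings. The class name CharacterClassDemo and the inputs are illustrative only. The point to notice is that a character class stands for exactly one character, so it does not make the rest of the word case-insensitive:

```java
class CharacterClassDemo {
  // Same check as in the text, made static so main() can call it.
  static boolean isTrueValue(String str) {
    return str.matches("[Tt]rue");
  }

  public static void main(String[] args) {
    System.out.println(isTrueValue("true"));   // true
    System.out.println(isTrueValue("True"));   // true
    System.out.println(isTrueValue("TRUE"));   // false: only the first letter has a choice
    System.out.println(isTrueValue("Ttrue"));  // false: [Tt] matches one character, not two
    System.out.println(isTrueValue("rue"));    // false: the class must match something
  }
}
```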

Alternatives

The square brackets are useful when we want a choice for a single character. When we want to match alternatives for a whole string, we instead put a pipe character, |, between the alternatives:

public boolean isTrueValue(String str) {
  return str.matches("true|yes");
}
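
A short sketch of how this behaves, using an illustrative class name (AlternationDemo) and made-up inputs. Each side of the pipe is a complete alternative, so the string must be one or the other in full:

```java
class AlternationDemo {
  // Same check as in the text, made static so main() can call it.
  static boolean isTrueValue(String str) {
    return str.matches("true|yes");
  }

  public static void main(String[] args) {
    System.out.println(isTrueValue("true"));     // true
    System.out.println(isTrueValue("yes"));      // true
    System.out.println(isTrueValue("trueyes"));  // false: not one alternative on its own
    System.out.println(isTrueValue("Yes"));      // false: capitals are not covered yet
  }
}
```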

The above expression will match either true or yes. To make it additionally match True and Yes, we can combine the two techniques:

public boolean isTrueValue(String str) {
  return str.matches("[Tt]rue|[Yy]es");
}
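
To put the combined expression through its paces, here is a small sketch that loops over some sample inputs and prints the result for each. The class name TrueValueDemo and the list of samples are assumptions made for illustration:

```java
class TrueValueDemo {
  // Same check as in the text, made static so main() can call it.
  static boolean isTrueValue(String str) {
    return str.matches("[Tt]rue|[Yy]es");
  }

  public static void main(String[] args) {
    String[] samples = { "true", "True", "yes", "Yes", "TRUE", "y", "no" };
    for (String s : samples) {
      // Prints e.g. "true -> true" or "TRUE -> false"
      System.out.println(s + " -> " + isTrueValue(s));
    }
  }
}
```

The first four samples are accepted. "TRUE", "y" and "no" are rejected, because the expression only allows a choice of case for the first letter of each alternative.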

On the next page, we continue by looking in more detail at character classes, with features such as matching against a range of characters.

Written by Neil Coffey. Copyright © Javamex UK 2012. All rights reserved.