Regular expression example: IP location (ctd)

We now have a total of three expressions to extract the country code from Yahoo and Google referrer strings. To get the country code from a referrer string, we simply try matching the string against each pattern in turn. Since in each case the captured country code will be group 1, we can declare a single Matcher variable, which we reassign to a matcher for the next pattern whenever the previous match fails. The code looks something like this. Note that it is written so that matches() is never called more than once on the same matcher:

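  import java.util.regex.Matcher;
  import java.util.regex.Pattern;

  // google.com referrer: country guessed from the hl= language parameter
  Pattern pGoogle1 =
    Pattern.compile("(?:http://)?www\\.google\\.com/.*hl=([a-z]{2}).*");
  // Country-specific Google domain, e.g. www.google.fr or www.google.co.uk
  Pattern pGoogle2 = Pattern.compile("(?:http://)?" +
      "www\\.google(?:\\.com|\\.co)?\\.([a-z]{2})/.*");
  // Yahoo search on a country subdomain, e.g. fr.search.yahoo.com
  Pattern pYahoo = Pattern.compile("(?:http://)?" +
    "([a-z]{2})\\.search\\.yahoo\\.com/.*");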

  public String guessCountryCode(String referrer) {
    Matcher m = pGoogle1.matcher(referrer);
    if (!m.matches()) {
      m = pGoogle2.matcher(referrer);
      if (!m.matches()) {
        m = pYahoo.matcher(referrer);
        if (!m.matches()) {
          return null;
        }
      }
    }
    String code = m.group(1).toUpperCase();
    if ("UK".equals(code)) {
      code = "GB";
    }
    return code;
  }

Of course, if we had a large number of patterns (as may well happen in real life), we would probably want to put them in an array and cycle through them in a loop. Notice that at the end of this method we can make any corrections needed to turn the domain suffixes and/or language codes into standard country codes (e.g. the domain suffix uk corresponds to the standard country code GB).
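
As a rough sketch of the array-and-loop variant, the method might look something like the following. The class name CountryCodeGuesser, the PATTERNS array and the CORRECTIONS map are illustrative names rather than part of the original code, and the use of Map.of assumes Java 9 or later. Passing Locale.ROOT to toUpperCase() is an extra precaution so that the result does not depend on the default locale.

```java
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper class; the name is illustrative only.
final class CountryCodeGuesser {

  // The same three patterns as above; each captures the code as group 1.
  private static final Pattern[] PATTERNS = {
    Pattern.compile("(?:http://)?www\\.google\\.com/.*hl=([a-z]{2}).*"),
    Pattern.compile("(?:http://)?www\\.google(?:\\.com|\\.co)?\\.([a-z]{2})/.*"),
    Pattern.compile("(?:http://)?([a-z]{2})\\.search\\.yahoo\\.com/.*")
  };

  // Corrections from captured codes to standard country codes
  // (assumption: only UK -> GB is needed here; add more as required).
  private static final Map<String, String> CORRECTIONS = Map.of("UK", "GB");

  static String guessCountryCode(String referrer) {
    for (Pattern p : PATTERNS) {
      // A fresh Matcher per pattern, so matches() is called once on each.
      Matcher m = p.matcher(referrer);
      if (m.matches()) {
        String code = m.group(1).toUpperCase(Locale.ROOT);
        return CORRECTIONS.getOrDefault(code, code);
      }
    }
    return null; // no pattern matched
  }
}
```

With this version, supporting another referrer format should only mean adding a pattern to the array, provided the country code is still captured as group 1. For example, guessCountryCode("http://www.google.co.uk/search?q=regex") returns "GB".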

Written by Neil Coffey. Copyright © Javamex UK 2012. All rights reserved.