Using wait/notify to coordinate threads (latch)

Prior to Java 5 at least, one use for the wait-notify mechanism was to coordinate threads performing a particular multithreaded job. We can use wait() and notify() to implement something called a latch: effectively a counter that triggers an event when it reaches zero. The idea is as follows: each worker thread counts the latch down as it completes its portion of the job, while a coordinating thread waits on the latch; when the count reaches zero, the coordinator knows that all the workers have finished and can safely proceed.

We can separate out the counter or latch into a separate component and implement it something like this:

public class Latch {
  private final Object synchObj = new Object();
  private int count;

  public Latch(int noThreads) {
    synchronized (synchObj) {
      this.count = noThreads;
    }
  }
  public void awaitZero() throws InterruptedException {
    synchronized (synchObj) {
      while (count > 0) {
        synchObj.wait();
      }
    }
  }
  public void countDown() {
    synchronized (synchObj) {
      if (--count <= 0) {
        synchObj.notifyAll();
      }
    }
  }
}

Note that this implementation has the features we mentioned in our introduction to wait/notify: we must synchronize on the object on which we are going to call wait() and notify() (we actually use an internal object created for that purpose: this hides the internals of the latch from outside callers), and we must call wait() inside a loop. Subtly, we must also synchronize on this "synch obj" when we set the initial count inside the constructor. (This is just a specific case of the general rule that if we read and write a variable from different threads, we must provide some form of synchronization around all accesses to that variable.) Inside countDown(), a strict implementation might throw an IllegalStateException if the counter was already zero, but we'll keep things simple here.

Using our latch class to coordinate threads

Now let's say we want to coordinate some threads running in parallel (for example, to parallelise a loop). We do so as follows:

Latch l = new Latch(noThreads);
for (int i = 0; i < noThreads; i++) {
  Thread j = new JobSlice(l, ...);
  j.start();
}
l.awaitZero();

using a definition of JobSlice something like this (we'd pass other parameters to the constructor, defining which portion of the loop or job the individual thread was to perform):

class JobSlice extends Thread {
  private final Latch latch;
  public JobSlice(Latch l, ...) {
    this.latch = l;
  }
  @Override
  public void run() {
    try {
      // do calculation
    } finally {
      latch.countDown();
    }
  }
}

Note that we don't deal fully with error handling here. Putting the count down in a finally means that the controller thread won't wait forever in the case of an error, but it would probably need to query whether or not the job threads all completed successfully. One option would be to add a raiseError() method to the latch, which could be called by the job threads and which would cause awaitZero() to throw an error (inside raiseError(), set the count to zero, but also set a boolean variable indicating that an error occurred; inside awaitZero(), check this variable after the wait loop and then either return normally or raise the exception).
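As a sketch of that option, here is a hypothetical variant of the latch implementing the raiseError() idea just described (the class name ErrorReportingLatch and the choice of IllegalStateException are our own inventions for illustration):

```java
// Sketch: a latch whose waiter is released with an exception if any job fails.
public class ErrorReportingLatch {
  private final Object synchObj = new Object();
  private int count;
  private boolean errorOccurred;

  public ErrorReportingLatch(int noThreads) {
    synchronized (synchObj) {
      this.count = noThreads;
    }
  }

  public void awaitZero() throws InterruptedException {
    synchronized (synchObj) {
      while (count > 0) {
        synchObj.wait();
      }
      // Check the flag after the wait loop, as described above
      if (errorOccurred) {
        throw new IllegalStateException("A job thread reported an error");
      }
    }
  }

  public void countDown() {
    synchronized (synchObj) {
      if (--count <= 0) {
        synchObj.notifyAll();
      }
    }
  }

  public void raiseError() {
    synchronized (synchObj) {
      errorOccurred = true;
      count = 0;              // release the waiter immediately
      synchObj.notifyAll();
    }
  }

  // Tiny demonstration: a failing job releases the waiter with an exception.
  public static void main(String[] args) throws InterruptedException {
    ErrorReportingLatch latch = new ErrorReportingLatch(2);
    new Thread(latch::countDown).start();   // one job succeeds
    new Thread(latch::raiseError).start();  // one job fails
    try {
      latch.awaitZero();
      System.out.println("completed normally");
    } catch (IllegalStateException e) {
      System.out.println("error reported");
    }
  }
}
```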

CountDownLatch in Java 5

As of Java 5, we don't need to invent our own latch class to do simple thread coordination such as this: the java.util.concurrent package provides an implementation in the form of CountDownLatch. As in our example above, a CountDownLatch is initialised with the number of calls to the countDown() method required before the waiter is allowed to proceed. In this case, the method called by the waiter is simply called await(); it is also possible to call await() with a maximum wait time (this version returns a boolean to indicate whether or not the latch actually reached zero).
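As an illustrative sketch, here is the earlier coordination pattern rewritten with CountDownLatch (the thread bodies and the count of 4 are invented for the example; a real job would do its slice of the work where the counter is incremented):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class CountDownLatchExample {
  public static void main(String[] args) throws InterruptedException {
    final int noThreads = 4;
    final CountDownLatch latch = new CountDownLatch(noThreads);
    final AtomicInteger jobsDone = new AtomicInteger();

    for (int i = 0; i < noThreads; i++) {
      new Thread(() -> {
        try {
          jobsDone.incrementAndGet();     // stand-in for this thread's slice of work
        } finally {
          latch.countDown();              // always count down, even on error
        }
      }).start();
    }

    latch.await();                        // blocks until the count reaches zero
    System.out.println("Jobs completed: " + jobsDone.get());

    // Timed variant: returns true immediately here (count already zero);
    // it would return false if the latch hadn't reached zero within the timeout.
    boolean reachedZero = latch.await(5, TimeUnit.SECONDS);
    System.out.println("Reached zero: " + reachedZero);
  }
}
```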

It is important not to muddle up await() and wait() on a CountDownLatch. The wait() method is, of course, part of the wait-notify mechanism, just as on any object; on a latch object you would essentially never use it, but Java provides no means to remove these inherited methods.

When would you use CountDownLatch?

At first glance, this may seem like quirky functionality that we won't use very often. However, the problem of coordinating threads in this way is likely to become increasingly important as ordinary desktop computers become increasingly multiprocessor. We can no longer write a task as a single-threaded loop and expect it magically to get faster on newer computers. To take advantage of newer machines' capabilities, even fairly mundane problems will need to be broken down into parallel jobs, which could be controlled in a way similar to the method outlined here.

So a short answer is: wherever you can run parallel threads to collectively complete a given task, and those threads will start and stop at roughly the same time; a typical case is parallelising loops.
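To make that concrete, a parallelised summation loop might be sketched along these lines (the four-way split, the partial[] array and the variable names are all illustrative choices, not the only way to do it):

```java
import java.util.concurrent.CountDownLatch;

// Sketch: parallelising a summation loop with CountDownLatch.
public class ParallelSum {
  public static void main(String[] args) throws InterruptedException {
    final int[] data = new int[1000];
    for (int i = 0; i < data.length; i++) data[i] = i + 1;   // 1..1000

    final int noThreads = 4;
    final CountDownLatch latch = new CountDownLatch(noThreads);
    final long[] partial = new long[noThreads];   // one slot per thread: no contention

    final int chunk = data.length / noThreads;
    for (int t = 0; t < noThreads; t++) {
      final int from = t * chunk;
      final int to = (t == noThreads - 1) ? data.length : from + chunk;
      final int slot = t;
      new Thread(() -> {
        long sum = 0;
        for (int i = from; i < to; i++) sum += data[i];
        partial[slot] = sum;
        latch.countDown();
      }).start();
    }

    // await() also establishes happens-before: the writes to partial[]
    // made before each countDown() are visible to this thread afterwards.
    latch.await();
    long total = 0;
    for (long p : partial) total += p;
    System.out.println(total);   // 500500
  }
}
```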