Thread Scheduling

In our introduction to how threads work, we introduced the thread scheduler, part of the OS (usually) that is responsible for sharing the available CPUs out between the various threads. How exactly the scheduler works depends on the individual platform, but various modern operating systems (notably Windows and Linux) use largely similar techniques that we'll describe here. We'll also mention some key varitions between the platforms.

Note that we'll continue to talk about a single thread scheduler. On multiprocessor systems, there is generally some kind of scheduler per processor, which then need to be coordinated in some way. (On some systems, switching on different processors is staggered to avoid contention on shared scheduling tables.) Unless otherwise specified, we'll use the term thread scheduler to refer to this overall system of coordinated per-CPU schedulers.

Across platforms, thread scheduling¹ tends to be based on at least the following criteria:

a priority, or in fact usually multiple "priority" settings that we'll discuss below;
a quantum, or number of allocated timeslices of CPU, which essentially determines the amount of CPU time a thread is allotted before it is forced to yield the CPU to another thread of the same or lower priority (the system will keep track of the remaining quantum at any given time, plus its default quantum, which could depend on thread type and/or system configuration);
a state, notably "runnable" vs "waiting";
metrics about the behaviour of threads, such as recent CPU usage or the time since it last ran (i.e. had a share of CPU), or the fact that it has "just received an event it was waiting for".

Most systems use what we might dub priority-based round-robin scheduling to some extent. The general principles are:

a thread of higher priority (which is a function of base and local priorities) will preempt a thread of lower priority;
otherwise, threads of equal priority will essentially take turns at getting an allocated slice or quantum of CPU;
there are a few extra "tweaks" to make things work.

States

Depending on the system, there are various states that a thread can be in. Probably the two most interesting are:

runnable, which essentially means "ready to consume CPU"; being runnable is generally the minimum requirement for a thread to actually be scheduled on to a CPU;
waiting, meaning that the thread currently cannot continue as it is waiting for a resource such as a lock or I/O, for memory to be paged in, for a signal from another thread, or simply for a period of time to elapse (sleep).

Other states include terminated, which means the thread's code has finished running but not all of the thread's resources have been cleared up, and a new state, in which the thread has been created, but not all resources necessary for it to be runnable have been created. Internally, the OS may distinguish between various different types of wait states² (for example "waiting for a signal" vs "waiting for the stack to be paged in"), but this level of granularity is generally not available or so important to Java programs. (On the other hand, Java generally exposes to the programmer things the JVM can reasonly know about, for example, if a thread is waiting to acquire the lock on a Java object— roughly speaking, "entering a synchronized block".)

Next: quanta and switching

On the next page, we continue our description of thread scheduling with a look at thread quanta and switching, and discuss typical thread scheduling algorithms.

1. Note that in this description, we're going to assume that the OS performs specifically thread scheduling: that is, the unit that the scheduler "juggles" is the individual user thread rather than a process (or a kernel-level thread that sits halfway between a process and a user thread). Solaris 9 and Linux 2.6 onwards use this threading model, but earlier versions don't. To my knowledge, all versions of Windows use thread-level scheduling.
2. At a very low level, this can lead to some complex interactions such as a thread that was scheduled on to the processor getting preempted "at the last minute" while waiting to have its stack paged in. In other words, the notion that a thread goes from, say, "waiting" to "runnable" in an atomic step is actually a simplification. But it's usually a sufficient simplification from the point of view of the programmer.

If you enjoy this Java programming article, please share with friends and colleagues. Follow the author on Twitter for the latest news and rants. Follow @BitterCoffey