The Java Stream interface
The Java Stream interface defines an "iterator" that includes logic of how we
want to iterate through the elements in the stream. Another way of seeing things is that a Stream defines
a "query" into the data in question.
In our introduction to streams, we saw, for
example, that we could call limit() to specify that only a certain number of elements would be iterated through.
We also saw an example of the filter() method combined with a lambda expression to determine specific
items of data that we wanted to include or exclude from the stream.
In fact, the
Stream interface defines a host of calls to specify particular properties that we would like the stream to have.
The main ones are summarised in the table below.
Stream method | Example | Purpose |
distinct() | Stream<Integer> distinctNos = nos.stream().distinct(); | Returns a stream that presents each distinct item from the original stream only once. |
limit() | Stream<Integer> firstIDs = ids.stream().limit(5); | Returns a stream that stops iteration after the given number of elements. |
map() | Stream<Integer> nos = strings.stream().map(Integer::parseInt); | Returns a stream that applies the given function to each object pulled from the original stream and presents the results. |
sorted() | Stream<Integer> sortedIDs = ids.stream().sorted(); | Returns a stream that pulls successive objects out of the original stream in sorted order. |
unordered() | Stream<?> unordered = items.stream().unordered(); | Returns a stream that does not guarantee any particular ordering on iteration. The reason for using unordered() is that by specifying that ordering is not important, certain optimisations may be possible. |
skip() | Stream<String> middleNames = forenames.stream().skip(1); | Returns a stream that pulls successive objects out of the original stream, having skipped past the specified number of items. |
dropWhile() | Stream<String> skipInitials = forenames.stream().dropWhile(s -> s.length() < 2); | Returns a stream that skips the leading items for as long as they match the given condition, then returns all remaining items, including any later ones that match. Available from Java 9. |
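To make the table more concrete, here is a minimal sketch that runs several of these methods end to end. The class name StreamQueryDemo, its helper methods and the sample data are invented for illustration; only the Stream calls themselves come from the standard library.

```java
import java.util.List;
import java.util.stream.Collectors;

class StreamQueryDemo {

    // distinct() then sorted(): each value once, in natural order.
    static List<Integer> distinctSorted(List<Integer> nos) {
        return nos.stream().distinct().sorted().collect(Collectors.toList());
    }

    // map(): converts each string to an Integer.
    static List<Integer> parseAll(List<String> strings) {
        return strings.stream().map(Integer::parseInt).collect(Collectors.toList());
    }

    // skip() then limit(): a "window" of count elements starting at index from.
    static List<String> window(List<String> items, long from, long count) {
        return items.stream().skip(from).limit(count).collect(Collectors.toList());
    }

    // dropWhile() (Java 9+): drops only the leading one-letter names.
    static List<String> dropLeadingInitials(List<String> forenames) {
        return forenames.stream()
                .dropWhile(s -> s.length() < 2)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(distinctSorted(List.of(3, 1, 3, 2, 1)));     // [1, 2, 3]
        System.out.println(parseAll(List.of("10", "20")));             // [10, 20]
        System.out.println(window(List.of("a", "b", "c", "d"), 1, 2));  // [b, c]
        // The second "R" is kept: dropWhile() stops dropping at the first non-match.
        System.out.println(dropLeadingInitials(
                List.of("J", "R", "Ronald", "R", "Tolkien")));         // [Ronald, R, Tolkien]
    }
}
```

Note that dropWhile() behaves differently from filter(): it only removes a leading run of matching items, which is why the later "R" survives in the last example.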
Lazy execution of stream operations
It is important to note that the above methods define how the stream will be iterated
when it is terminated. So calling sorted(), for example, does not actually cause the
data to be sorted. Only when you call a terminating operation such as forEach() is the data
in the stream actually iterated through, and at that moment, operations such as sorting occur, if they
are necessary:
strings.stream()
    .distinct()                     // <- Defines that we will filter on distinct strings
    .sorted()                       // <- Defines that we will sort
    .forEach(System.out::println);  // <- Actually filters, sorts and iterates
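One way to observe this laziness is to record when each step actually runs. The following is a small sketch, not production code: the class LazyStreamDemo and its log list are hypothetical, and the side effect inside the filter() lambda is there purely so that the order of events becomes visible.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

class LazyStreamDemo {

    // Records the order of events so that lazy execution becomes visible.
    static List<String> run() {
        List<String> log = new ArrayList<>();
        List<String> strings = List.of("b", "a", "b");

        // Building the pipeline: the filter lambda is NOT called at this point.
        Stream<String> pipeline = strings.stream()
                .filter(s -> { log.add("filter " + s); return true; })
                .distinct()
                .sorted();
        log.add("pipeline built");

        // The terminal operation triggers the filtering, sorting and iteration.
        pipeline.forEach(s -> log.add("forEach " + s));
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

On a sequential stream such as this one, "pipeline built" is logged first. The "filter" entries appear only once forEach() has been called, and the "forEach" entries follow them because sorted() has to see every element before it can release the first.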
Stream state and optimisations
As we have mentioned, a key advantage of a Stream compared to a simple Iterator is that a stream
encapsulates the information and logic that defines the iteration. In other words, a stream potentially "knows" whether
its elements are sorted, distinct etc. Calling distinct() on a Stream can be a no-op if the stream
originated from a Set, for example, since by definition, a set cannot contain two equal objects.
This potential for optimisations means that the developer should avoid certain assumptions:
- stream methods like distinct(), unordered() etc may in general return either a new stream or the selfsame stream, depending on
what is required to implement the requested logic;
you should generally not make any assumptions either way;
- you should not rely on side-effects of lambdas passed into these methods, as it is possible that the
lambda expression may not be called (a Consumer passed to peek() exists precisely to allow side effects, but even it
is only called for elements that the stream actually processes: since Java 9, for instance, count() may not run the
pipeline at all if the number of elements can be computed directly from the source).
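The "knowledge" that a stream has about its elements can be inspected through the characteristics reported by its Spliterator. The sketch below is illustrative only: the class StreamCharacteristicsDemo and the helper has() are made up for this example, and it checks the source collections rather than showing how any particular JDK version optimises distinct() or sorted().

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Spliterator;
import java.util.TreeSet;

class StreamCharacteristicsDemo {

    // Reports whether a stream over the collection advertises the given characteristic.
    // Note: spliterator() is itself a terminal operation, so the stream is consumed here.
    static boolean has(Collection<?> source, int characteristic) {
        return source.stream().spliterator().hasCharacteristics(characteristic);
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>(List.of("b", "a", "b"));

        // A list may contain duplicates, so its stream does not claim DISTINCT.
        System.out.println(has(list, Spliterator.DISTINCT));                // false
        // A set cannot contain duplicates, so its stream reports DISTINCT.
        System.out.println(has(new HashSet<>(list), Spliterator.DISTINCT)); // true
        // A sorted set additionally reports SORTED.
        System.out.println(has(new TreeSet<>(list), Spliterator.SORTED));   // true
    }
}
```

These flags are what allow an implementation to skip work that is already guaranteed by the source, which is also why code should not depend on a given lambda or step being executed.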
Combining distinct(), sorted(), limit() etc
The above stream "filtering" or "query" methods can be combined in potentially powerful ways, allowing you to query or search through the contents
of a Java list or other collection in relatively few lines of code. Although some optimisations are possible because streams "know" about their
state, it is important to stress that in general, stream filtering methods are executed independently of one another.
As an example of what we mean and why this is a potential limitation, imagine that we want to return the first 20 items of a list in sorted order.
We could achieve this as follows:
strings.stream().sorted().limit(20)...
Now strictly speaking, to achieve this, there is no need to sort the entire list of strings. If there are 1000 strings in
the list, we only need to know that strings 21-1000 occur somewhere after the first 20 strings. But the specific ordering among those remaining 980 strings
themselves is irrelevant. However, Stream.sorted() will force the entire list of 1000 strings to be sorted when they are iterated. In effect, the
sort and limit operations are "unaware" of one another.
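If the cost of the full sort matters, one possible workaround is to select the first n items by hand with a bounded heap. The following is a sketch of that general technique under stated assumptions, not something the Stream API does for you: the class FirstNSorted and the method firstSorted() are hypothetical names, and the code assumes non-null strings compared by their natural ordering.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

class FirstNSorted {

    // Returns the n smallest strings in sorted order without sorting the whole input.
    // A max-heap of at most n elements holds the current candidates.
    static List<String> firstSorted(Iterable<String> strings, int n) {
        if (n <= 0) {
            return new ArrayList<>();
        }
        PriorityQueue<String> heap = new PriorityQueue<>(n, Comparator.reverseOrder());
        for (String s : strings) {
            if (heap.size() < n) {
                heap.add(s);
            } else if (s.compareTo(heap.peek()) < 0) {
                heap.poll();   // evict the largest of the current candidates
                heap.add(s);
            }
        }
        List<String> result = new ArrayList<>(heap);
        Collections.sort(result);   // only the n surviving items are sorted
        return result;
    }

    public static void main(String[] args) {
        List<String> strings = List.of("d", "a", "c", "b", "e");
        System.out.println(firstSorted(strings, 2));   // [a, b]
    }
}
```

For the example in the text, firstSorted(strings, 20) would give the same items as strings.stream().sorted().limit(20), but the heap never holds more than 20 strings at a time. Whether this is actually faster in practice depends on the data and should be measured.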
Editorial page content written by Neil Coffey. Copyright © Javamex UK 2021. All rights reserved.