A Comprehensive Guide to Java Streams

Learn to harness the power of declarative programming in your Java application using streams!

Java streams, introduced in Java 8, are a powerful tool for processing sequences of elements in a declarative manner. Streams allow developers to express complex data transformations in a concise and readable way. Modern applications are moving away from the traditional procedural style toward streams in order to create more maintainable, understandable, and robust code bases. This article gives an overview of streams along with examples of their usage.

Overview

The primary purpose of Java streams is to simplify the process of performing bulk operations on collections of data. Traditional looping structures such as for and while loops require detailed instructions on how to traverse and manipulate underlying data, creating a convoluted mess of nested statements and data type definitions. Streams allow a developer to focus on what an operation should do rather than how to do it.
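
To see the difference, here is a small sketch that sums the even numbers of a list, first with a traditional loop and then with a stream (the list and its values are purely illustrative):

List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6);

// Imperative: we spell out how to traverse the list and accumulate the result.
int loopSum = 0;
for (Integer n : numbers) {
    if (n % 2 == 0) {
        loopSum += n;
    }
}

// Declarative: we describe what should happen to the data.
int streamSum = numbers.stream()
                       .filter(n -> n % 2 == 0)
                       .mapToInt(Integer::intValue)
                       .sum();

System.out.println(loopSum + " " + streamSum); // Output: 12 12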

Benefits

Java Streams offer the following advantages when writing code.

  • Declarative Data Processing: Instead of writing looping statements and conditionals, operations such as filtering, mapping, and reducing can be achieved in a fluent and chainable manner.
  • Functional Programming: Streams leverage functional programming principles, encouraging the use of functions as first-class citizens, which helps keep data immutable and avoid side effects.
  • Conciseness and Readability: Streams enable developers to express complex data operations in fewer lines, making code easier to follow and understand.
  • Lazy Evaluation: By default, streams are lazy. This means their intermediate operations are not executed until a terminal operation is invoked (see the sketch after this list). Laziness can provide performance optimizations by avoiding unnecessary computations.
  • Parallel Processing: Instead of manually managing and synchronizing a set of threads to process a large dataset, you can call parallel() (or parallelStream()) and let the Stream API handle partitioning, synchronization, and concurrency for you.
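
As a quick illustration of lazy evaluation, here is a minimal sketch; the println inside the filter exists only to make the laziness visible:

List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5);

Stream<Integer> pipeline = numbers.stream()
                                  .filter(n -> {
                                      System.out.println("filtering " + n);
                                      return n % 2 == 0;
                                  });
// Nothing has been printed yet: filter() only records the operation.

long evenCount = pipeline.count(); // The "filtering ..." lines appear now.
System.out.println(evenCount);     // Output: 2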

Anatomy of a Java Stream

A Java Stream is broken down into three main components (the short example after this list labels each one):

  • Source: The data you wish to transform. This can be a collection, an array, or an I/O channel.
  • Intermediate Operations: These manipulate the data within a stream. Examples include filter, map, and sorted. They are considered lazy and always return a new stream.
  • Terminal Operations: These signify the end of your stream and trigger the actual data processing. They are also responsible for outputting a desired result once processing is complete. Some examples include forEach, collect, and reduce.
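
Here is a minimal sketch with each component labeled (the names and filter condition are arbitrary):

List<String> names = Arrays.asList("Alice", "Bob", "Charlie");  // Source: a collection

long count = names.stream()                     // Create a stream from the source
                  .filter(n -> n.length() > 3)  // Intermediate operation (lazy, returns a new stream)
                  .count();                     // Terminal operation (triggers processing)

System.out.println(count); // Output: 2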

Practical Examples

Let’s examine several examples to illustrate how Java Streams work in practice.

Filtering and Collecting Data

Suppose we have a list of integers and want to remove all odd elements, leaving only even numbers. This would normally be done by performing modulus arithmetic inside a looping statement. Here we can invoke the filter() function, which accepts a lambda as an argument and retains only the even numbers. After all elements have been processed, we can collect the data and convert it back into a list.

List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);

List<Integer> evenNumbers = numbers.stream()
                                   .filter(num -> num % 2 == 0)
                                   .collect(Collectors.toList());
                                   
System.out.println(evenNumbers); // Output: [2, 4, 6, 8, 10]

First, we define our list of numbers one through ten. Notice that we call the stream() method, which converts our list into a Java Stream. The filter function iterates over every element for us and applies our predicate. Whenever the predicate returns true, the element is retained; whenever it returns false, that number is filtered out of the resulting stream. The collect method specifies the data structure we wish to output.

Mapping Data Into a New Form

Imagine we have a scenario where a list contains a collection of names. We want to convert those strings into uppercase names. The map function is perfect for this circumstance.

List<String> names = Arrays.asList("Alice", "Bob", "Charlie");

List<String> upperCaseNames = names.stream()
                                   .map(String::toUpperCase)
                                   .collect(Collectors.toList());

System.out.println(upperCaseNames); // Output: [ALICE, BOB, CHARLIE]

Here the procedure is much the same, except we pass a method reference (String::toUpperCase), which automatically invokes this method on each string in our stream and returns the resulting value. Once again the result is collected into a list.
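
For clarity, the method reference is simply shorthand for writing the equivalent lambda explicitly:

List<String> upperCaseNames = names.stream()
                                   .map(name -> name.toUpperCase())
                                   .collect(Collectors.toList());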

Collecting Data into a Map

A common requirement of algorithms is building a map that contains a string as a key, along with a value containing the number of characters within that string. Here we can use the Collectors.toMap() collector to achieve our desired result.

List<String> names = Arrays.asList("Alice", "Bob", "Charlie");

Map<String, Integer> nameLengthMap = names.stream()
    .collect(Collectors.toMap(
        name -> name,           // Key: the name itself
        name -> name.length()   // Value: the length of the name
    ));

System.out.println(nameLengthMap);
// Output: {Alice=5, Bob=3, Charlie=7}

This time we utilize the Collectors.toMap() function to output a key/value pair for each element. This method accepts two lambdas as arguments (one for the key and another for the value) and collects the elements into a map. Voila! We have a map of string names and lengths using a very intuitive and simple operation.
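
One caveat worth keeping in mind: the two-argument toMap() throws an IllegalStateException if two elements map to the same key. If duplicates are possible, a three-argument overload accepts a merge function. A minimal sketch, using a hypothetical list with a repeated name:

List<String> people = Arrays.asList("Alice", "Bob", "Alice");

Map<String, Integer> lengths = people.stream()
    .collect(Collectors.toMap(
        name -> name,
        String::length,
        (existing, replacement) -> existing   // Keep the first value when a key repeats
    ));

System.out.println(lengths); // Output (entry order may vary): {Bob=3, Alice=5}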

Reducing Data into a Single Element

Oftentimes you want to calculate the sum of all elements in a list or array. This can be achieved by calling the reduce function on a stream of integers.

List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5);

int sum = numbers.stream()
                 .reduce(0, Integer::sum);

System.out.println(sum); // Output: 15

The first argument of the reduce function is called the identity. This acts as a starting value from which the operation begins. In our case we start by adding 1 to 0, take the result, and apply the same arithmetic to the next value in the list. The second argument is called the accumulator. The accumulator is a function that takes two arguments (here Integer.sum(int1, int2)) and combines them into a single result. We could also have supplied a lambda here, like in the following example:

List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5);

int sum = numbers.stream()
                 .reduce(0, (a, b) -> a + b);

System.out.println(sum); // Output: 15

Finding the Minimum and Maximum Values of a List

We can use Streams to find the maximum and minimum values in a list of integers, using the stream min() and max() functions.

List<Integer> numbers = Arrays.asList(10, 20, 30, 40, 50);

int max = numbers.stream()
                 .max(Integer::compare)
                 .orElseThrow();

int min = numbers.stream()
                 .min(Integer::compare)
                 .orElseThrow();

System.out.println("Max: " + max + ", Min: " + min); // Output: Max: 50, Min: 10

The Integer.compare() method of the java.lang.Integer class compares the two int values (x, y) given as parameters: it returns zero if x == y, a value less than zero if x < y, and a value greater than zero if x > y.

The min() and max() functions return an Optional, and therefore you need to call either the get() method to retrieve the value inside the Optional, or the orElseThrow() method, which throws a runtime exception if the Optional is empty (for example, when the stream has no elements).
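
If throwing is not desirable, a default can be supplied instead. A quick sketch using orElse() with an arbitrary fallback value of 0:

List<Integer> empty = Collections.emptyList();

int maxOrDefault = empty.stream()
                        .max(Integer::compare)
                        .orElse(0); // Fallback instead of throwing when the Optional is empty

System.out.println(maxOrDefault); // Output: 0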

Grouping Data into a Map

Imagine we have a list of words, and we want to group them together by their length.

List<String> words = Arrays.asList("apple", "banana", "cherry", "date");

Map<Integer, List<String>> groupedByLength = words.stream()
                                                  .collect(Collectors.groupingBy(String::length));

System.out.println(groupedByLength);
// Output (entry order may vary): {4=[date], 5=[apple], 6=[banana, cherry]}

The Collectors.groupingBy() function accepts a single Function as its argument and automatically produces a map whose keys are the string lengths and whose values are lists of the strings with that length.
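
groupingBy() can also take a second, downstream collector that further reduces each group. As a sketch, here is how the same words could be counted by length using Collectors.counting(), which produces Long values:

Map<Integer, Long> countByLength = words.stream()
    .collect(Collectors.groupingBy(String::length, Collectors.counting()));

System.out.println(countByLength);
// Output (entry order may vary): {4=1, 5=1, 6=2}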

Parallel Streams for Performance

Oftentimes you will need to transform very large datasets, and allowing multiple threads to partition the data into smaller chunks can give you a performance boost on a multi-core CPU. In this example we will filter all even numbers between 1 and 1,000,000 and then count how many even numbers exist. This time we will use parallelStream() to automatically do our calculations concurrently.

List<Integer> largeList = IntStream.range(1, 1000000).boxed().collect(Collectors.toList());

long count = largeList.parallelStream()
                      .filter(n -> n % 2 == 0)
                      .count();

System.out.println(count); // Output: 499999

Stream Pipelines

A Stream pipeline consists of a source, zero or more intermediate operations, and a terminal operation. The pipeline is built in stages, with each stage transforming the data and passing it to the next stage. For example:

List<String> names = Arrays.asList("John", "Jane", "Jack", "Jill", "Bob");

names.stream()
     .filter(name -> name.startsWith("J"))
     .sorted()
     .map(String::toUpperCase)
     .forEach(System.out::println);
// Output (each on its own line): JACK, JANE, JILL, JOHN

Conclusion

Java Streams are a powerful addition to the language, enabling developers to write clean, declarative, and efficient code for data processing tasks. By understanding their purpose, leveraging their benefits, and applying the various examples provided in this article, you can harness the full potential of Java Streams in your projects.

Whether you’re filtering, mapping, reducing, or parallelizing operations, Java Streams offer a modern approach to handling collections and other data sources. As with any tool, practice and experimentation are key to mastering its capabilities.
