GroupByKey

GroupByKey is a Beam transform for processing collections of key/value pairs. It’s a parallel reduction operation, analogous to the Shuffle phase of a Map/Shuffle/Reduce-style algorithm. The input to GroupByKey is a collection of key/value pairs that represents a multimap: several pairs can share the same key while carrying different values. Given such a collection, you use GroupByKey to collect all of the values associated with each unique key.
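
To make the multimap idea concrete, here is a minimal plain-Python sketch of the result GroupByKey produces. It does not use the Beam API: the helper group_by_key and the sample pairs are hypothetical, chosen only to illustrate the semantics.

```python
from collections import defaultdict


def group_by_key(pairs):
    """Collect every value that shares a key (hypothetical helper, not Beam)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


# A multimap written as (key, value) pairs: keys repeat, values differ.
pairs = [("cat", 1), ("dog", 5), ("and", 1), ("cat", 5), ("dog", 2), ("and", 2)]

grouped = group_by_key(pairs)
print(grouped)
```

Each unique key ends up paired with the collection of all its values. This sketch runs sequentially and keeps values in input order. Beam performs the grouping in parallel and does not guarantee the order of the values within a group.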

Kata: Implement a GroupByKey transform that groups words by their first letter.


Refer to KV and GroupByKey to solve this problem.
Refer to the Beam Programming Guide "GroupByKey" section for more information.