Class BeamWordCount

  • public final class BeamWordCount
    extends java.lang.Object
    An example that counts words in Shakespeare and includes Beam best practices. THIS EXAMPLE IS TAKEN FROM THE APACHE BEAM REPOSITORY.

    For a detailed walkthrough of this example, see

    Basic concepts, also in the MinimalWordCount example: Reading text files; counting a PCollection; writing to text files

    New Concepts:

       1. Executing a Pipeline both locally and using the selected runner
       2. Using ParDo with static DoFns defined out-of-line
       3. Building a composite transform
       4. Defining your own pipeline options

    Concept #1: you can execute this pipeline either locally or on another runner of your choice. The runner and the other settings are now command-line options rather than being hard-coded as they were in the MinimalWordCount example.

    To change the runner, specify:


    To execute this pipeline, specify a local output file (if using the DirectRunner) or output prefix on a supported distributed file system.


    The input file defaults to a public data set containing the text of King Lear, by William Shakespeare. You can override it and choose your own input with --inputFile.
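
The options described above are passed as `--name=value` arguments to `main`. The sketch below only illustrates the shape of such an argument list; it is not how Beam parses options (Beam does that itself through its pipeline options machinery). The class `OptionFlagsSketch`, the helper `parseFlags`, and the input path are hypothetical. `--inputFile` comes from the text above, `--runner=DirectRunner` names the DirectRunner mentioned above, and `--output` is an assumed flag name borrowed from the upstream Beam WordCount example, since this page does not state it.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class OptionFlagsSketch {

    // Stand-in parser for illustration only: splits "--name=value" arguments
    // into a map. Beam's own option parsing is richer than this.
    static Map<String, String> parseFlags(String[] args) {
        Map<String, String> flags = new LinkedHashMap<>();
        for (String arg : args) {
            int eq = arg.indexOf('=');
            if (!arg.startsWith("--") || eq < 0) {
                throw new IllegalArgumentException("Expected --name=value, got: " + arg);
            }
            flags.put(arg.substring(2, eq), arg.substring(eq + 1));
        }
        return flags;
    }

    public static void main(String[] args) {
        String[] example = {
            "--runner=DirectRunner",            // run locally
            "--inputFile=/path/to/input.txt",   // hypothetical path overriding the default input
            "--output=/tmp/wordcounts"          // assumed flag name; local output file
        };
        parseFlags(example)
            .forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

In a real run, the same three strings would be given on the command line of `BeamWordCount` rather than built in code.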

    • Field Summary

      Modifier and Type          Field                Description
      static java.lang.String    TOKENIZER_PATTERN
    • Method Summary

      Modifier and Type    Method                           Description
      static void          main(java.lang.String[] args)
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail


        public static final java.lang.String TOKENIZER_PATTERN
        See Also:
        Constant Field Values
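
This page does not show the value of `TOKENIZER_PATTERN`. As a rough illustration of how such a pattern is typically used to split lines into words before counting, here is a plain-Java sketch without any Beam dependency. The pattern value `[^\p{L}]+` (one or more non-letter characters) is an assumption based on the upstream Beam examples, and the class `TokenizerSketch` and method `countWords` are hypothetical names, not part of `BeamWordCount`.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class TokenizerSketch {

    // Assumed value: any run of non-letter characters separates two words.
    static final String TOKENIZER_PATTERN = "[^\\p{L}]+";

    // Splits each line on the pattern and counts the non-empty tokens.
    static Map<String, Long> countWords(Iterable<String> lines) {
        Map<String, Long> counts = new TreeMap<>();
        for (String line : lines) {
            for (String word : line.split(TOKENIZER_PATTERN, -1)) {
                // Leading or trailing separators produce empty tokens; skip them.
                if (!word.isEmpty()) {
                    counts.merge(word, 1L, Long::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Sample input lines chosen for illustration.
        List<String> lines = List.of("Blow, winds, and crack your cheeks!", "Rage! Blow!");
        countWords(lines).forEach((word, count) -> System.out.println(word + ": " + count));
    }
}
```

Counting here is case-sensitive, so `Blow` and `blow` would be tallied separately. In the pipeline itself, the splitting would happen inside a DoFn and the counting inside a composite transform, as listed under New Concepts.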
    • Method Detail

      • main

        public static void main(java.lang.String[] args)