public final class DataSkewHashPartitioner extends Object implements Partitioner
A Partitioner which hashes the output data of a source task in a way suited to detecting data skew. It hashes data at a finer granularity than
HashPartitioner. Elements are hashed by their key, and a modulo operation is then applied to the hash. When the output data of a task needs to be split or recombined after it is stored, the hash range is multiplied by a multiplier that is known to both the source and destination tasks, which avoids an extra deserialize-rehash-serialize pass. For more information, please check
Divides the output data from a task into multiple blocks.
public List<Partition> partition(Iterable elements, int dstParallelism, KeyExtractor keyExtractor)
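The key-hash-then-modulo scheme described above can be illustrated with a minimal sketch. This is not the class's actual implementation: the class name SkewHashSketch, the constant HASH_RANGE_MULTIPLIER and its value of 10, the use of Math.floorMod, and the plain nested lists standing in for Partition objects are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical illustration only; not the real DataSkewHashPartitioner.
final class SkewHashSketch {
  // Assumed multiplier. The real value is whatever the source and
  // destination tasks agree on.
  static final int HASH_RANGE_MULTIPLIER = 10;

  // Maps a key to a block index in [0, dstParallelism * multiplier).
  // floorMod keeps the index non-negative for negative hash codes.
  static int blockIndex(final Object key, final int dstParallelism) {
    final int hashRange = dstParallelism * HASH_RANGE_MULTIPLIER;
    return Math.floorMod(key.hashCode(), hashRange);
  }

  // Splits the elements into (dstParallelism * multiplier) blocks by key hash,
  // which is finer than one block per destination task.
  static <T> List<List<T>> partition(final Iterable<T> elements,
                                     final int dstParallelism,
                                     final Function<T, Object> keyExtractor) {
    final int hashRange = dstParallelism * HASH_RANGE_MULTIPLIER;
    final List<List<T>> blocks = new ArrayList<>(hashRange);
    for (int i = 0; i < hashRange; i++) {
      blocks.add(new ArrayList<>());
    }
    for (final T element : elements) {
      final Object key = keyExtractor.apply(element);
      blocks.get(blockIndex(key, dstParallelism)).add(element);
    }
    return blocks;
  }

  public static void main(final String[] args) {
    final List<String> words = Arrays.asList("a", "b", "a", "c");
    final List<List<String>> blocks = partition(words, 2, w -> w);
    System.out.println("number of blocks: " + blocks.size());
  }
}
```

Because the blocks are finer than the destination parallelism, stored blocks could later be regrouped across destination tasks by index alone, without rehashing individual elements.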
Copyright © 2018. All rights reserved.