Flink executes distinct() internally as a GroupBy followed by a ReduceGroup operator, where the reduce function returns only the first element of each group.
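To make the two steps concrete, here is a minimal plain-Java sketch of "group, then keep the first element of each group". It uses only java.util collections as a stand-in and is not Flink's actual implementation; the class and method names (DistinctSketch, groupBy, firstOfEachGroup) are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class DistinctSketch {

    // Step 1 ("GroupBy"): collect the elements into groups that share the same key.
    // A LinkedHashMap keeps the groups in first-seen order.
    static <T, K> Map<K, List<T>> groupBy(List<T> input, Function<T, K> keyOf) {
        Map<K, List<T>> groups = new LinkedHashMap<>();
        for (T element : input) {
            groups.computeIfAbsent(keyOf.apply(element), k -> new ArrayList<>()).add(element);
        }
        return groups;
    }

    // Step 2 ("ReduceGroup"): emit only the first element of every group.
    static <T, K> List<T> firstOfEachGroup(Map<K, List<T>> groups) {
        List<T> result = new ArrayList<>();
        for (List<T> group : groups.values()) {
            result.add(group.get(0)); // every group holds at least one element
        }
        return result;
    }

    // distinct() on whole elements: the element itself serves as the grouping key.
    static <T> List<T> distinct(List<T> input) {
        return firstOfEachGroup(groupBy(input, Function.identity()));
    }

    public static void main(String[] args) {
        // prints [a, b, c]
        System.out.println(distinct(Arrays.asList("a", "b", "a", "c", "b")));
    }
}
```

Note that this in-memory version materializes every group before reducing it, which is only meant to show the semantics, not how a distributed engine would organize the work.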
GroupBy [...]. The sort that Flink performs for the GroupBy can fail with an OutOfMemoryError.
As an alternative, consider DataSet.distinct(KeySelector ks). [...] MapFunction [...].
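The idea behind DataSet.distinct(KeySelector ks) is that duplicates are decided by an extracted key rather than by the whole record. The sketch below shows that idea in plain Java, assuming a hypothetical Event record and a hypothetical distinctBy helper; java.util.function.Function stands in for Flink's KeySelector, and none of this is Flink code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

public class KeySelectorDistinctSketch {

    // Hypothetical record type, used only for this illustration.
    static final class Event {
        final String user;
        final long timestamp;

        Event(String user, long timestamp) {
            this.user = user;
            this.timestamp = timestamp;
        }
    }

    // Keeps the first element seen for each extracted key.
    // Only the (usually small) keys are remembered, not the full records.
    static <T, K> List<T> distinctBy(List<T> input, Function<T, K> keySelector) {
        Set<K> seenKeys = new HashSet<>();
        List<T> result = new ArrayList<>();
        for (T element : input) {
            // Set.add returns false if the key was already present.
            if (seenKeys.add(keySelector.apply(element))) {
                result.add(element);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Event> events = Arrays.asList(
                new Event("alice", 1L),
                new Event("bob", 2L),
                new Event("alice", 3L));

        // Two events remain: alice@1 and bob@2; the second "alice" event is dropped.
        for (Event e : distinctBy(events, ev -> ev.user)) {
            System.out.println(e.user + "@" + e.timestamp);
        }
    }
}
```

Choosing a simple key type such as String or Long for the selector keeps the comparison cheap; whether that is enough to avoid the memory problem in a given job is something to verify on the actual data.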