This parameter is spark.sql.shuffle.partitions, which sets the number of partitions used for the shuffle produced by the grouping. Its default value is 200, but it can be increased; whether that helps depends on the cluster and the data.
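A minimal sketch of where this setting is applied, assuming a standalone SparkSession with made-up application and column names; the values 400 and 800 are arbitrary examples, not recommendations:

```scala
import org.apache.spark.sql.SparkSession

object ShufflePartitionsExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("shuffle-partitions-demo")            // hypothetical app name
      .config("spark.sql.shuffle.partitions", "400") // raise from the default of 200
      .getOrCreate()

    // The setting can also be changed at runtime, before a shuffle-heavy query:
    spark.conf.set("spark.sql.shuffle.partitions", "800")

    import spark.implicits._
    val df = Seq(("a", 1), ("b", 2), ("a", 3)).toDF("key", "value")
    // groupBy triggers a shuffle; its output uses the configured partition count.
    df.groupBy("key").sum("value").show()

    spark.stop()
  }
}
```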
What is the difference between reduceByKey/combineByKey and groupByKey, and which is preferable?
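A minimal sketch contrasting the two RDD APIs named in the question, using a made-up key/value dataset: reduceByKey combines values on the map side before the shuffle, so only partial results cross the network, while groupByKey ships every value for a key and materializes the whole group.

```scala
import org.apache.spark.sql.SparkSession

object ByKeyComparison {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("by-key-demo").getOrCreate()
    val sc = spark.sparkContext

    val pairs = sc.parallelize(Seq(("a", 1), ("b", 2), ("a", 3), ("b", 4)))

    // Map-side combine: each partition pre-aggregates before shuffling.
    val sums = pairs.reduceByKey(_ + _)

    // All values for a key are shuffled and held in memory as a group,
    // which is more expensive when a key has many values.
    val groups = pairs.groupByKey().mapValues(_.sum)

    sums.collect().foreach(println)
    groups.collect().foreach(println)

    spark.stop()
  }
}
```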