Consider a Flink cluster with several nodes, each with a multi-core processor. If we configure the number of slots per node based on its number of cores, with an equal share of memory per slot, how does Apache Flink distribute tasks across the nodes and their free slots? Are all nodes treated equally?
Is there a way to configure Flink to use the slots evenly across nodes when the task slots are set up based on the number of cores available on each node?
For example, suppose the data is partitioned evenly and the same task runs on every partition. Flink uses all the slots on some nodes while, at the same time, other nodes stay completely idle. A node with fewer cores involved produces its output much faster than a node with many cores involved, and the speedup is not proportional to the number of cores used on each node. In other words, if one core is occupied on one node and two cores are occupied on another, and each core is correctly treated as one slot, then every slot should produce the result for the same task in roughly the same amount of time, regardless of which node it belongs to. That is not what happens. Based on this, I would say the nodes are not treated equally. As a result, performance does not scale in proportion to the number of nodes available, and we cannot say that increasing the number of slots necessarily reduces the running time.
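The placement pattern described above can be illustrated with a toy model. This is only a minimal Python sketch, not Flink's actual scheduler: the function names `fill_first` and `spread_evenly`, the slot counts, and the task count are all hypothetical and chosen for illustration. It contrasts a policy that exhausts one node's slots before touching the next with a policy that keeps the used-slot ratio balanced across nodes.

```python
def fill_first(slots_per_node, num_tasks):
    """Hypothetical policy: use up one node's slots before moving to the next."""
    used = [0] * len(slots_per_node)
    remaining = num_tasks
    for i, capacity in enumerate(slots_per_node):
        take = min(capacity, remaining)
        used[i] = take
        remaining -= take
    if remaining > 0:
        raise ValueError("not enough slots for all tasks")
    return used


def spread_evenly(slots_per_node, num_tasks):
    """Hypothetical policy: place each task on the node with the lowest used/total ratio."""
    used = [0] * len(slots_per_node)
    for _ in range(num_tasks):
        # Only nodes that still have a free slot are candidates.
        candidates = [i for i, cap in enumerate(slots_per_node) if used[i] < cap]
        if not candidates:
            raise ValueError("not enough slots for all tasks")
        target = min(candidates, key=lambda i: used[i] / slots_per_node[i])
        used[target] += 1
    return used


# Assumed cluster: three nodes with 4, 4 and 2 slots (one slot per core), 4 tasks.
slots = [4, 4, 2]
print(fill_first(slots, 4))     # [4, 0, 0]: one node saturated, two idle
print(spread_evenly(slots, 4))  # [2, 1, 1]: every node does some work
```

With the first policy, four tasks land on a single node and compete for its memory bandwidth, caches, and I/O, which is one possible reason the per-slot time would differ between nodes even when each slot maps to one core.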
I would appreciate any comments from the Apache Flink community!