Is there a way to calculate the size in bytes of an Apache Spark DataFrame using PySpark?
Why don't you just cache the df and then look in the Spark UI under the Storage tab, converting the units shown there to bytes?
df.cache()
df.count()  # run an action so the cache is materialized and shows up in the Storage tab
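If you'd rather read the same numbers programmatically instead of off the UI, here is a minimal sketch that pulls the cached sizes from the SparkContext's storage info. Caveats: `_jsc` is a private py4j bridge into the JVM SparkContext, not a stable public API, and the `spark.range(...)` DataFrame is just a placeholder for your own df.

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
df = spark.range(1_000_000)  # placeholder DataFrame; substitute your own

df.cache()
df.count()  # force materialization so the storage stats are populated

# _jsc is a private handle and may change between Spark versions;
# this mirrors what the Storage tab reports, in raw bytes.
for info in spark.sparkContext._jsc.sc().getRDDStorageInfo():
    print(info.name(), "memory:", info.memSize(), "bytes,",
          "disk:", info.diskSize(), "bytes")

Each entry corresponds to a cached RDD/DataFrame, so you can sum `memSize()` and `diskSize()` across entries if your df is spilled partly to disk.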