Is there a way to calculate the size in bytes of an Apache Spark DataFrame using PySpark?
Why don't you just cache the df and then look in the Spark UI under the Storage tab, converting the units shown there to bytes?
df.cache()
df.count()  # run an action so the cache is materialized and shows up in the Storage tab
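If you'd rather read the same numbers programmatically instead of off the UI, here is a minimal sketch that pulls the cached sizes from the SparkContext's storage info. Caveats: `_jsc` is a private py4j bridge into the JVM SparkContext, not a stable public API, and the `spark.range(...)` DataFrame is just a placeholder for your own df.

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
df = spark.range(1_000_000)  # placeholder DataFrame; substitute your own

df.cache()
df.count()  # force materialization so the storage stats are populated

# _jsc is a private handle and may change between Spark versions;
# this mirrors what the Storage tab reports, in raw bytes.
for info in spark.sparkContext._jsc.sc().getRDDStorageInfo():
    print(info.name(), "memory:", info.memSize(), "bytes,",
          "disk:", info.diskSize(), "bytes")

Each entry corresponds to a cached RDD/DataFrame, so you can sum `memSize()` and `diskSize()` across entries if your df is spilled partly to disk.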