In another case, I have an RDD that comes from a Spark SQL statement run through a HiveContext. After some experimentation, the solution that worked for me was to re-create the RDD itself.
It doesn't matter whether you use Spark SQL DDL or send SQL statements directly through hiveContext.sql.
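As a rough illustration of "re-create the RDD", here is a minimal sketch in Python. The `FreshQuery` helper and the `events` table are hypothetical names, not part of Spark. It only assumes that the context object exposes `sql(text)` and that the result has an `rdd` attribute, as PySpark DataFrames do. Whether this picks up new data may depend on your Spark version and table type.

```python
class FreshQuery:
    """Hypothetical helper: rebuilds the RDD from its SQL text on every call."""

    def __init__(self, hive_context, sql_text):
        # Any object exposing .sql(str) works here, e.g. a HiveContext.
        self._ctx = hive_context
        self._sql_text = sql_text

    def rdd(self):
        # Issue the statement again so a new DataFrame and RDD are built,
        # instead of reusing an RDD object created before the table changed.
        return self._ctx.sql(self._sql_text).rdd


# Usage sketch (assumes a running Spark installation with Hive support):
#
#   from pyspark import SparkContext
#   from pyspark.sql import HiveContext
#
#   sc = SparkContext(appName="example")
#   hive_context = HiveContext(sc)
#   events = FreshQuery(hive_context, "SELECT * FROM events")
#   before = events.rdd().count()
#   # ... the table is modified by another job ...
#   after = events.rdd().count()  # uses a newly created RDD
```

The point of the design is simply that no reference to the old RDD is kept around; every read goes back through `sql`.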
I have seen people use the "count trick" to force the data set to be recomputed, but at least in my attempts the new data did not show up this way.
In any case, attempts with cache, update, and similar calls did not work for me. If someone has the correct pattern, please share.