I have some data that needs to be classified in a Spark stream. The classification key/value pairs are loaded into a HashMap at the start of the program, so each incoming record must be looked up against these keys and labeled accordingly.
I understand that Spark has broadcast variables and accumulators for sharing objects across the cluster, but the examples in the tutorials only use simple variable types.
How can I share my HashMap with all the Spark workers? Alternatively, is there a better way to do this?
I am writing my Spark code in Java.
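Here is a minimal sketch of what I am trying to do. The class name, socket source, ports, and the example key/label values are placeholders I made up; the real map is loaded at startup:

```java
import java.util.HashMap;
import java.util.Map;

import org.apache.spark.SparkConf;
import org.apache.spark.broadcast.Broadcast;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;

public class ClassifyStream {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf()
                .setAppName("classify-stream")
                .setMaster("local[2]"); // local mode for testing
        JavaStreamingContext jssc =
                new JavaStreamingContext(conf, Durations.seconds(1));

        // Classification keys loaded once at program start
        // (example values; the real data comes from elsewhere)
        Map<String, String> keys = new HashMap<>();
        keys.put("foo", "labelA");
        keys.put("bar", "labelB");

        // Broadcast the map so every executor gets a read-only copy
        Broadcast<Map<String, String>> bcKeys =
                jssc.sparkContext().broadcast(keys);

        // Placeholder source; the real stream comes from elsewhere
        JavaDStream<String> lines =
                jssc.socketTextStream("localhost", 9999);

        // Label each incoming record via the broadcast map
        JavaDStream<String> labeled = lines.map(
                line -> bcKeys.value().getOrDefault(line, "unknown"));
        labeled.print();

        jssc.start();
        jssc.awaitTermination();
    }
}
```

My uncertainty is whether referencing `bcKeys.value()` inside the `map` closure is the right pattern, or whether the whole HashMap gets serialized with the closure on every batch anyway.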