I am building a back-end Java component that processes a moderate amount of data every day. We have a POJO, let's call it Widget, with about 10 properties. My software needs to process groups of Widget lists: other processes (completely different systems) each put together their own List<Widget> and send it to my software. My software actually receives a wrapper POJO that looks like this:
public class Payload {
    private List<Widget> widgets;
    // remaining members omitted
}
My software combines all of these List<Widget> instances, each created by a different system, and then processes them together in one large batch.
I originally chose ArrayList<ArrayList<Widget>> as the data structure for this batch of Widget lists. There will be about 500,000 List<Widget> groups (the outer ArrayList), and each List<Widget> will hold about 5 Widgets, for a total of ~2.5 million Widget instances across the inner ArrayLists.
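To make the layout concrete, here is a minimal sketch of the nested-list batch as I understand it. The NestedListBatch class, its method names, and the single-field Widget are hypothetical stand-ins, not the real code; declaring the field as List<List<Widget>> rather than ArrayList<ArrayList<Widget>> is also just an illustrative choice.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the real Widget POJO (which has about 10 properties).
class Widget {
    final String name;

    Widget(String name) {
        this.name = name;
    }
}

// Hypothetical holder for the batch: an outer list of groups, each group a list of Widgets.
class NestedListBatch {
    private final List<List<Widget>> groups;

    NestedListBatch(int expectedGroups) {
        // Passing the expected size up front avoids repeated growth of the backing array.
        this.groups = new ArrayList<>(expectedGroups);
    }

    // "Add": append one incoming group to the batch.
    void addGroup(List<Widget> group) {
        groups.add(group);
    }

    int groupCount() {
        return groups.size();
    }

    // "Read": visit every Widget in every group, here just to count them.
    int widgetCount() {
        int total = 0;
        for (List<Widget> group : groups) {
            total += group.size();
        }
        return total;
    }
}

class NestedListBatchDemo {
    public static void main(String[] args) {
        NestedListBatch batch = new NestedListBatch(500_000);
        for (int g = 0; g < 3; g++) {
            List<Widget> group = new ArrayList<>(5);
            for (int w = 0; w < 5; w++) {
                group.add(new Widget("widget-" + g + "-" + w));
            }
            batch.addGroup(group);
        }
        System.out.println(batch.groupCount() + " groups, " + batch.widgetCount() + " widgets");
    }
}
```

With this layout the groups stay separate simply because each one is its own element of the outer list, and they are visited in the order they were added.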
In a recent code review, some tech leads told me that I had chosen the wrong data structure for this batch of widgets. They said I should use a HashMap<String,List<Widget>> instead because it is more efficient and easier to work with. The map key would be the GUID contained in the Payload that my software receives. I don't need the GUID for any other reason; it would just serve as a key to keep the ~500,000 List<Widget> groups separate, which I do need to do.
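For comparison, here is a minimal sketch of the map-based variant the reviewers suggested. It assumes Payload exposes the GUID as a String through a getGuid() accessor; that accessor, the constructor, the GuidKeyedBatch class, and the single-field Widget are all hypothetical names for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the real Widget POJO (which has about 10 properties).
class Widget {
    final String name;

    Widget(String name) {
        this.name = name;
    }
}

// Assumed shape of the wrapper: a GUID plus the list of Widgets it carries.
class Payload {
    private final String guid;
    private final List<Widget> widgets;

    Payload(String guid, List<Widget> widgets) {
        this.guid = guid;
        this.widgets = widgets;
    }

    String getGuid() {
        return guid;
    }

    List<Widget> getWidgets() {
        return widgets;
    }
}

// Hypothetical holder for the batch, keyed by the Payload's GUID.
class GuidKeyedBatch {
    private final Map<String, List<Widget>> groupsByGuid = new HashMap<>();

    // "Add": store the group under its GUID. A repeated GUID replaces the earlier group.
    void add(Payload payload) {
        groupsByGuid.put(payload.getGuid(), payload.getWidgets());
    }

    int groupCount() {
        return groupsByGuid.size();
    }

    // "Read": visit every Widget in every group, here just to count them.
    int widgetCount() {
        int total = 0;
        for (List<Widget> group : groupsByGuid.values()) {
            total += group.size();
        }
        return total;
    }
}

class GuidKeyedBatchDemo {
    public static void main(String[] args) {
        GuidKeyedBatch batch = new GuidKeyedBatch();
        for (int g = 0; g < 3; g++) {
            List<Widget> group = new ArrayList<>(5);
            for (int w = 0; w < 5; w++) {
                group.add(new Widget("widget-" + g + "-" + w));
            }
            batch.add(new Payload("guid-" + g, group));
        }
        System.out.println(batch.groupCount() + " groups, " + batch.widgetCount() + " widgets");
    }
}
```

Two behaviours differ from a plain list of lists here: HashMap does not guarantee any iteration order, and two Payloads carrying the same GUID would leave only the later group in the batch.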
It made me wonder: who is right? The only operations we perform on this data structure are "add" (in the case of ArrayList, just adding a Widget or a List<Widget> via add(...)) and then "read" (my software has to go through every Widget and inspect it):
for (List<Widget> widgetList : myDoublyNestedArrayOfWidgets) {
    for (Widget widget : widgetList) {
        ...
    }
}
These are the only operations we need: add the scattered List<Widget> instances to one large "batch" data structure, and then at a later time iterate over all of them and do something with each Widget. This software runs on powerful servers with plenty of memory and processing power.
So I ask: is ArrayList<ArrayList<Widget>> the right choice, or HashMap<String,List<Widget>>, or something else, and why?