I come from a Java background and have a CPU-bound problem that I am trying to parallelize to improve performance. I have broken my code up so that it runs in a modular way and can (hopefully) be distributed and run in parallel.
@Transactional(readOnly = false, propagation = Propagation.REQUIRES_NEW)
public void runMyJob(List<String> someParams) {
    doComplexEnoughStuffAndWriteToMysqlDB();
}
I have been considering the following options for parallelizing this problem, and I would like to hear people's thoughts and concerns in this area.
The options I'm thinking of now:
1) Use a Java EE cluster (e.g. JBoss) and Message-Driven Beans. The MDBs would sit on the slave nodes of the cluster, and each MDB could pick up a message that kicks off a job like the one above. AFAIK, Java EE MDBs are run multi-threaded by the application server, so this should hopefully also take advantage of multiple cores. Thus it should scale both vertically and horizontally.
2) I could use something like Hadoop and MapReduce. My concern here is that my processing logic is actually quite high level, so I am not sure how well it would translate to MapReduce. Also, I am new to MR.
3) I could look at something like Scala, which in my opinion makes concurrent programming a lot easier. However, although it scales vertically, it is not a clustered / horizontally scalable solution.
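For comparison, the vertical-only baseline behind option 3 can be sketched in plain Java with a thread pool. This is only a minimal sketch under assumptions: the class name ParallelJobRunner and the helper runAll are made up for illustration, runMyJob here returns a dummy checksum instead of doing the real work, and the MySQL write and the @Transactional handling are left out entirely.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelJobRunner {

    // Hypothetical stand-in for runMyJob(...): some CPU-bound work on one batch.
    // The real method would do the complex processing and write to MySQL.
    public static int runMyJob(List<String> someParams) {
        int checksum = 0;
        for (String p : someParams) {
            checksum += p.length();
        }
        return checksum;
    }

    // Submits one task per batch to a pool sized to the available cores
    // and collects the results in the same order as the input batches.
    public static List<Integer> runAll(List<List<String>> batches) throws Exception {
        ExecutorService pool =
                Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (List<String> batch : batches) {
                futures.add(pool.submit(() -> runMyJob(batch)));
            }
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> f : futures) {
                results.add(f.get()); // blocks until that batch has finished
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<List<String>> batches = Arrays.asList(
                Arrays.asList("a", "bb"),
                Arrays.asList("ccc"));
        System.out.println(runAll(batches));
    }
}
```

This uses every core on one machine but, like option 3, does nothing for spreading work across nodes; that is the part the MDB or Hadoop options would have to provide.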
In any case, I hope all of this makes sense, and thank you for any help provided.
java-ee scala architecture mapreduce cluster-computing
Brian