Spreading Scala across a cluster?

Question

So, I recently started learning Scala and have been using graphs as my project to improve my Scala, and everything is going well: I have since managed to easily parallelize some graph algorithms (those that benefit from data parallelism), courtesy of Scala 2.9's amazing support for parallel collections.

However, I want to take this a step further and parallelize it not just on a single machine but across several. Does Scala offer any clean way to do this, as it does with parallel collections, or will I have to wait until I get to the actors chapter in my book / learn more about Akka?

Thanks! -kstruct

+8

scala parallel-processing graph scala-collections distributed

adelbertc Mar 11 '12 at 4:32

2 answers

You can use Akka ( http://akka.io ). It has long been the most advanced and powerful actor and concurrency framework for Scala, and the freshly released version 2.0 provides transparent actor remoting, hierarchies and supervision. The canonical way to do parallel computation is to create as many actors as your algorithm has parallel parts, optionally spread them across several machines, send them the data to process, and then gather the results (see here ).
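To make the scatter/gather shape above concrete, here is a minimal sketch. It deliberately swaps Akka actors for standard-library Futures so that it runs without any dependencies, which means it only uses the cores of one JVM; spreading the workers over several machines would need Akka remoting, which is not shown. The toy graph, the out-degree task, and the names ScatterGather, outDegrees and run are all made up for illustration.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object ScatterGather {
  // Hypothetical toy graph: vertex id -> ids of its out-neighbours.
  val graph: Map[Int, Seq[Int]] = Map(
    1 -> Seq(2, 3),
    2 -> Seq(3),
    3 -> Seq(1),
    4 -> Seq.empty[Int]
  )

  // The work one "worker" does: out-degree of every vertex in its partition.
  def outDegrees(partition: Seq[Int]): Map[Int, Int] =
    partition.map(v => v -> graph(v).size).toMap

  // Scatter: one Future per partition. Gather: merge the partial maps.
  def run(workers: Int): Map[Int, Int] = {
    val vertices = graph.keys.toSeq.sorted
    val chunk = math.max(1, math.ceil(vertices.size.toDouble / workers).toInt)
    val partitions = vertices.grouped(chunk).toSeq
    val partials = Future.traverse(partitions)(p => Future(outDegrees(p)))
    // Blocking is acceptable in a small demo; real code would compose the Future.
    Await.result(partials, 10.seconds).foldLeft(Map.empty[Int, Int])(_ ++ _)
  }
}
```

Calling ScatterGather.run(2) splits the four vertices into two partitions and merges the two partial results into one map. With actors, each partition would be a message sent to a worker actor and the merge would happen in the actor that collects the replies.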

+2

Oleg Kunov Mar 11 '12 at 5:28

om-nom-nom · Accepted Answer · 2012-03-11T05:27:19+0000

There was an attempt to create distributed collections (the project is currently frozen).

Alternatives would be Akka (which recently received a great addition: Akka Cluster), which you already mentioned, or full-fledged cluster engines. These are not parallel collections in any sense and are more about distributing Scala work across a cluster, but they could be used for your task in some way: for example, Scoobi for Hadoop, Storm, or even Spark (in particular, Bagel for graph processing). There is also Swarm, which is built on top of delimited continuations. Last but not least is Menthor: its authors claim that it is especially well suited to graph processing, and it makes use of actors.

Since you are focused on working with graphs, you might also consider Cassovary, which was recently open-sourced by Twitter.

Signal-collect is a framework for parallel data processing backed by Akka.
