Consider the classic "Word Count" program. It counts the number of words in all files in a directory. The master receives a directory and splits the work among worker actors (each worker handles one file). This is the pseudocode:
class WordCountWorker extends Actor {
  def receive = {
    case FileToCount(fileName: String) =>
      val count = countWords(fileName)
      sender ! WordCount(fileName, count)
  }
}

class WordCountMaster extends Actor {
  def receive = {
    case StartCounting(docRoot) =>
But I want to run the Word Count program on a schedule (for example, every 1 minute), providing different directories for scanning.
Akka provides a convenient way to schedule messages:
system.scheduler.schedule(0.seconds, 1.minute, wordCountMaster, StartCounting(directoryName))
The problem arises when the scheduler sends a new message on a tick while the previous message has not yet been fully processed (for example, I send a message to scan a large directory, and one second later I send another message to scan a different directory, so processing of the first directory has not finished yet). As a result, my WordCountMaster will receive WordCount messages from workers that are processing different directories.
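One way to keep results from overlapping scans apart, without creating a new master, might be to tag every result with the directory it belongs to and let the master keep separate bookkeeping per directory. Below is a minimal sketch of that bookkeeping in plain Scala, not a definitive implementation. Note the assumptions: the directory field on WordCount is not part of the original message, and JobState and Jobs are hypothetical names introduced only for illustration.

```scala
// Assumption: WordCount is extended with the directory it belongs to.
final case class WordCount(directory: String, fileName: String, count: Int)

// Progress of one directory scan: files still outstanding and the running word total.
final case class JobState(pending: Int, total: Int)

// Immutable per-directory bookkeeping that a single master could keep as its state.
final case class Jobs(state: Map[String, JobState] = Map.empty) {

  // Register a new scan once the number of files in the directory is known.
  def start(directory: String, fileCount: Int): Jobs =
    copy(state = state.updated(directory, JobState(fileCount, 0)))

  // Record one worker result. Returns the new bookkeeping and, when the last
  // file of a directory has arrived, Some(total word count for that directory).
  def record(result: WordCount): (Jobs, Option[Int]) =
    state.get(result.directory) match {
      case None =>
        // Result for an unknown or already finished directory: ignore it.
        (this, None)
      case Some(JobState(pending, total)) =>
        val newTotal = total + result.count
        if (pending <= 1)
          (copy(state = state - result.directory), Some(newTotal))
        else
          (copy(state = state.updated(result.directory, JobState(pending - 1, newTotal))), None)
    }
}
```

Inside the master, StartCounting would call start with the number of files found, and each incoming WordCount would go through record; a Some(total) result signals that the directory is finished, regardless of how results from other directories are interleaved.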
As a workaround, instead of scheduling messages, I can schedule the execution of a block of code that creates a new WordCountMaster each time. That is, one directory = one WordCountMaster. But I believe this is inefficient, and I would also need to provide a unique name for each WordCountMaster to avoid an InvalidActorNameException.
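For the naming part of this workaround, readable but unique names could be produced by appending a counter to a fixed prefix. This is only a sketch; ActorNames and next are hypothetical names, not part of Akka.

```scala
import java.util.concurrent.atomic.AtomicLong

// Hypothetical helper: produces names such as "wordCountMaster-1", "wordCountMaster-2", ...
object ActorNames {
  // Thread-safe counter, since scheduled blocks may run on different threads.
  private val counter = new AtomicLong(0)

  // Each call returns a name that has not been handed out before in this JVM.
  def next(prefix: String): String =
    s"$prefix-${counter.incrementAndGet()}"
}

// Intended use with the classic actor API (not executed here):
//   system.actorOf(Props[WordCountMaster], ActorNames.next("wordCountMaster"))
```

The names stay recognizable in logs, unlike the auto-generated ones, while the counter keeps them unique among siblings.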
So my question is: should I create a new WordCountMaster for each tick, as described above? Or are there better ideas/patterns for redesigning this program to support scheduling?
Update: when creating one master actor per directory, I run into some problems:
- Problem with actor naming:
InvalidActorNameException: actor name [WordCountMaster] is not unique!
and
InvalidActorNameException: actor name [WordCountWorker] is not unique!
I can solve this problem simply by not providing an actor name. But in this case, my actors get auto-generated names like $a, $b, etc. This is not good for me.
- Problem with configuration:
I want to externalize the configuration of my routers into application.conf. That is, I want to provide the same configuration for each WordCountWorker router. But since I do not control the actor names, I cannot use the configuration below, because I do not know the actor's name:
/wordCountWorker {
  router = smallest-mailbox-pool
  nr-of-instances = 5
  dispatcher = word-counter-dispatcher
}
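Akka's deployment configuration accepts an asterisk as a wildcard for one section of the actor path, which may help when the master's name is not known in advance. A possible sketch, assuming that masters are top-level actors with arbitrary unique names and that each master creates its router as a child named wordCountWorker via FromConfig (both are assumptions, not something the code above shows):

```hocon
akka.actor.deployment {
  # "*" stands for any master name, so every master's child router shares this block.
  "/*/wordCountWorker" {
    router = smallest-mailbox-pool
    nr-of-instances = 5
    dispatcher = word-counter-dispatcher
  }
}
```

Because actor names only have to be unique among children of the same parent, the fixed child name wordCountWorker would not clash between different masters.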