I am trying to build a high-performance distributed system with Akka and Scala.
If a message arrives requesting an expensive, side-effect-free calculation, and the same calculation has already been requested before, I want to avoid recalculating the result. If the earlier calculation has already completed and the result is available, I can cache it and reuse it.
However, the time window in which duplicate calculations are requested may be arbitrarily small. For example, I could receive a thousand or a million messages requesting the same costly calculation at what is, for all practical purposes, the same moment.
There is a commercial product called GigaSpaces that supposedly copes with this situation.
However, there currently seems to be no framework support in Akka for handling duplicate work requests. Given that the Akka infrastructure already has access to all messages routed through the framework, a framework-level solution could make a lot of sense here.
Here's what I suggest for the Akka framework:
1. Create a tag indicating the type of messages (for example, "ExpensiveComput" or something similar) that are subject to the caching approach described below.
2. Intelligently (by hashing, etc.) identify identical messages received by the same or different actors within a user-configurable time window. Other parameters: the maximum size of the memory buffer used for this purpose, with a replacement policy (for example, LRU), etc. Akka could also cache only the results of messages that were expensive to process; messages that took very little time to process can simply be reprocessed if necessary, so there is no need to waste precious buffer space on them and their results.
3. When identical messages are identified (received within the time window, possibly at the same instant), avoid unnecessary duplicate calculations. The framework would do this automatically: duplicate messages would never reach a new actor for processing. They would silently disappear, and the result of processing the message once (whether that calculation was completed in the past or is still in progress) would be sent to all relevant recipients (immediately if it is already available, and upon completion of the calculation if not).
Note that messages should be considered identical even if their reply-to fields differ, as long as the semantics/calculation they represent are the same in all other respects. Also note that the calculation must be purely functional, that is, free of side effects, so that the proposed caching works purely as an optimization and does not change the semantics of the program at all.
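To make the idea concrete, here is a minimal sketch of the coalescing behaviour from step 3, written against the plain Scala standard library, not Akka. The `Coalescer` class and its `request` method are hypothetical names I made up for illustration; nothing like this exists in Akka as far as I know. It assumes that the request key already excludes any reply-to information, and it leaves out the time window, the LRU eviction, and the "only cache expensive results" rule from step 2.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.collection.concurrent.TrieMap
import scala.concurrent.{Await, ExecutionContext, Future, Promise}
import scala.concurrent.duration._

// Hypothetical helper (not an Akka API): identical requests share one computation.
final class Coalescer[K, V](compute: K => V)(implicit ec: ExecutionContext) {
  // Holds both in-flight and completed results, keyed by the request.
  private val results = TrieMap.empty[K, Future[V]]

  def request(key: K): Future[V] = {
    val promise = Promise[V]()
    // putIfAbsent is atomic, so exactly one caller per key "wins" and computes.
    results.putIfAbsent(key, promise.future) match {
      case Some(existing) =>
        existing // duplicate: reuse the pending or finished result
      case None =>
        promise.completeWith(Future(compute(key)))
        // Assumption: failed results should not stay cached, so later requests can retry.
        promise.future.failed.foreach(_ => results.remove(key, promise.future))
        promise.future
    }
  }
}

object CoalescerDemo {
  def main(args: Array[String]): Unit = {
    import ExecutionContext.Implicits.global
    val calls = new AtomicInteger(0)
    // Stand-in for the expensive, side-effect-free calculation.
    val coalescer = new Coalescer[Int, Int]({ n =>
      calls.incrementAndGet()
      Thread.sleep(100)
      n * n
    })
    // A burst of identical requests arriving at practically the same moment.
    val replies = Future.sequence((1 to 1000).map(_ => coalescer.request(12)))
    val values = Await.result(replies, 10.seconds)
    println(s"distinct results: ${values.distinct}, computations run: ${calls.get}")
  }
}
```

In an actor-based version, I imagine a front actor would hold a similar map from request to pending result, plus the list of senders waiting for each one, and reply to all of them when the calculation finishes. Doing that transparently inside the framework is the part I am asking about.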