I recently started using the SemaphoreSlim class to limit the concurrency of a parallelizable operation on a (large) streamed resource:
    // The code below is an example of the structure of the code; there are some
    // omissions around handling of tasks that do not run to completion that
    // should be in production code.
    SemaphoreSlim semaphore = new SemaphoreSlim(Environment.ProcessorCount * someMagicNumber);
    foreach (var result in StreamResults())
    {
        semaphore.Wait();
        var task = DoWorkAsync(result).ContinueWith(t => semaphore.Release());
        ...
    }
The throttle is there to avoid bringing too many results into memory at once, which the program cannot handle (typically this shows up as an OutOfMemoryException). Although the code works and performs reasonably well, it still feels awkward. In particular, the multiplier someMagicNumber, although tuned through profiling, may not be as optimal as it could be, and it is not resilient to changes in the implementation of DoWorkAsync.
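For reference, the failure handling omitted from the snippet could look roughly like the sketch below. It is only an illustration under assumptions: the StreamResults and DoWorkAsync bodies are hypothetical stand-ins (every tenth item is made to fail on purpose), the limit parameter plays the role of Environment.ProcessorCount * someMagicNumber, and ThrottledStream, RunAsync and ProcessAsync are names invented for the example. The point is just that releasing the slot in a finally block keeps a faulted task from leaking it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottledStream
{
    // Hypothetical stand-in for the real (large) stream of results.
    public static IEnumerable<int> StreamResults() => Enumerable.Range(0, 100);

    // Hypothetical stand-in for the real work; every tenth item fails on purpose.
    public static async Task DoWorkAsync(int result)
    {
        await Task.Delay(5);
        if (result % 10 == 0)
            throw new InvalidOperationException($"item {result} failed");
    }

    // Processes the stream with at most `limit` items in flight.
    // Returns the number of free slots at the end and the number of faulted tasks.
    public static async Task<(int FreeSlots, int Faulted)> RunAsync(int limit)
    {
        using var semaphore = new SemaphoreSlim(limit);
        var tasks = new List<Task>();

        foreach (var result in StreamResults())
        {
            // Stop pulling from the stream until a slot is free.
            await semaphore.WaitAsync();
            tasks.Add(ProcessAsync(result, semaphore));
        }

        try
        {
            await Task.WhenAll(tasks);
        }
        catch (Exception)
        {
            // Individual failures are inspected per task below.
        }

        return (semaphore.CurrentCount, tasks.Count(t => t.IsFaulted));
    }

    private static async Task ProcessAsync(int item, SemaphoreSlim semaphore)
    {
        try
        {
            await DoWorkAsync(item);
        }
        finally
        {
            // Release the slot whether the work succeeded or threw.
            semaphore.Release();
        }
    }
}
```

Calling ThrottledStream.RunAsync(Environment.ProcessorCount * someMagicNumber) would reproduce the original throttle. This does nothing about the magic number itself, which is still the part I am unhappy with.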
Just as thread pooling overcomes the obstacle of scheduling many work items for execution, I would like something that overcomes the obstacle of scheduling many things to be loaded into memory, based on the resources available.
Since it is impossible to decide deterministically whether an OutOfMemoryException will occur, I understand that what I'm looking for may only be achievable by statistical means, or not at all, but I hope that I'm missing something.