I've been working with a team where I've been putting a WPF UI over a set of implementations of a custom interface. Each implementation is a long running process, handled in the standard manner by creating an asynchronous Task<T> and displaying the results from the continuation.
The typical CPU utilisation for the majority of the interface implementations is shown below - this represents the ideal: not hogging the CPUs and behaving like a good citizen...
Hopefully the difference between the majority and this particular instance is obvious - the majority are single-threaded long running processes, while this one is a multi-threaded long running process. It is not multi-threaded by a factor of 2, 3 or 4, but by a factor of n, where n is considerably larger than the number of available logical processors - a value greater than 100.
The code causing this issue generates simulation data for a set of stochastic models. The generation is CPU bound for around 5 minutes; after that, the long running process returns to being single-threaded and behaves like all the other implementations. I can't show the actual code, but something similar is shown below:
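A minimal sketch of the pattern - `RunSimulation`, the class name and the task count are illustrative placeholders, not the actual model code:

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class SimulationRunner
{
    private static int _completedRuns;

    public static void GenerateSimulations()
    {
        var tasks = new List<Task>();
        for (int i = 0; i < 1000; i++)
        {
            int seed = i; // capture the loop variable for the closure
            // Default overload: no scheduler and no concurrency limit,
            // so the TPL is free to run as many tasks as it sees fit.
            tasks.Add(Task.Factory.StartNew(() => RunSimulation(seed)));
        }
        // Block until every simulation has completed.
        Task.WaitAll(tasks.ToArray());
    }

    private static void RunSimulation(int seed)
    {
        // The CPU-bound stochastic model generation would live here.
        Interlocked.Increment(ref _completedRuns);
    }

    public static int CompletedRuns => _completedRuns;
}
```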
As you can see the for-loop creates a lot of tasks - 1000 in the example. The method then waits for all of them to complete before returning.
The scheduling of tasks (onto the logical processors) can be configured, but in this case it isn't - and that's the problem: the default method overload has been used.
Because we aren't attempting to control the number of concurrent tasks, the TPL maximizes throughput - in this case running 8 tasks in parallel across the logical processors until all the scheduled tasks have completed.
How do I control the level of concurrency for so many tasks?
The overloads for StartNew on the TaskFactory class don't expose an obvious solution, such as an integer value for the maximum concurrency. The answer is to pass a custom scheduler to the method, which then controls the level of concurrency. MSDN has an article on the subject, and it's perfectly suited to this problem. The modified method now looks like this:
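A sketch of the modified method. The scheduler class below is a simplified adaptation of the one in the MSDN article ("How to: Create a Task Scheduler That Limits the Degree of Concurrency"), and `RunSimulation` is still a placeholder - the instrumentation exists only to demonstrate that the limit holds:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Simplified from the MSDN sample: queues tasks and never lets more
// than maxDegreeOfParallelism execute at once.
public sealed class LimitedConcurrencyLevelTaskScheduler : TaskScheduler
{
    private readonly LinkedList<Task> _tasks = new LinkedList<Task>();
    private readonly int _maxDegreeOfParallelism;
    private int _runningWorkers;

    public LimitedConcurrencyLevelTaskScheduler(int maxDegreeOfParallelism)
    {
        if (maxDegreeOfParallelism < 1)
            throw new ArgumentOutOfRangeException(nameof(maxDegreeOfParallelism));
        _maxDegreeOfParallelism = maxDegreeOfParallelism;
    }

    public override int MaximumConcurrencyLevel => _maxDegreeOfParallelism;

    protected override void QueueTask(Task task)
    {
        lock (_tasks)
        {
            _tasks.AddLast(task);
            if (_runningWorkers < _maxDegreeOfParallelism)
            {
                _runningWorkers++;
                ThreadPool.UnsafeQueueUserWorkItem(ProcessQueue, null);
            }
        }
    }

    private void ProcessQueue(object state)
    {
        while (true)
        {
            Task next;
            lock (_tasks)
            {
                if (_tasks.Count == 0) { _runningWorkers--; return; }
                next = _tasks.First.Value;
                _tasks.RemoveFirst();
            }
            TryExecuteTask(next);
        }
    }

    // Disallow inlining so the concurrency limit is never bypassed.
    protected override bool TryExecuteTaskInline(Task task, bool taskWasPreviouslyQueued)
        => false;

    protected override IEnumerable<Task> GetScheduledTasks()
    {
        lock (_tasks) return _tasks.ToArray();
    }
}

public static class ThrottledSimulationRunner
{
    private static int _current;
    private static int _peak;

    public static void GenerateSimulations()
    {
        // A factory whose tasks are limited to 6 concurrent workers.
        var scheduler = new LimitedConcurrencyLevelTaskScheduler(6);
        var factory = new TaskFactory(scheduler);

        var tasks = new List<Task>();
        for (int i = 0; i < 1000; i++)
        {
            int seed = i;
            tasks.Add(factory.StartNew(() => RunSimulation(seed)));
        }
        Task.WaitAll(tasks.ToArray());
    }

    private static void RunSimulation(int seed)
    {
        // Track peak concurrency; a stand-in for the CPU-bound work.
        int now = Interlocked.Increment(ref _current);
        int peak;
        do { peak = _peak; }
        while (now > peak && Interlocked.CompareExchange(ref _peak, now, peak) != peak);
        Thread.SpinWait(10000);
        Interlocked.Decrement(ref _current);
    }

    public static int PeakConcurrency => _peak;
}
```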
I've hardcoded the maximum degree of concurrency to 6 in the above example, but you could do something more dynamic using the System.Environment.ProcessorCount property. The CPU utilisation now looks something like this:
The CPU utilisation is now around 75%, which reflects the level of concurrency I've specified - the scheduler was created with a concurrency of 6 (out of a possible 8 logical processors). This still seems rather high, but there is enough spare capacity for the dispatcher thread to be scheduled, and therefore the UI remains responsive.
Note: The code above could be re-written more elegantly using the Parallel.For method, which would remove the need for the custom scheduler - but it's not my code to change, apparently :)
If I were to re-write it, something like the following would be better IMO:
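A hedged sketch, assuming the same placeholder `RunSimulation` work as before - ParallelOptions.MaxDegreeOfParallelism does the throttling that the custom scheduler did:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelSimulationRunner
{
    private static int _completedRuns;

    public static void GenerateSimulations()
    {
        // MaxDegreeOfParallelism replaces the custom scheduler; 6 is again
        // hardcoded, but Environment.ProcessorCount could be used instead.
        var options = new ParallelOptions { MaxDegreeOfParallelism = 6 };

        // Parallel.For blocks until all iterations finish, so there is no
        // need for the explicit Task list and Task.WaitAll.
        Parallel.For(0, 1000, options, i => RunSimulation(i));
    }

    private static void RunSimulation(int seed)
    {
        // Placeholder for the CPU-bound simulation work.
        Interlocked.Increment(ref _completedRuns);
    }

    public static int CompletedRuns => _completedRuns;
}
```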
More info about the differences can be found in the Patterns of Parallel Programming PDF, in particular its discussion of Parallel.ForEach vs Task.Factory.StartNew.