This post was inspired by a conversation with @leeoades & @hamishdotnet at work earlier this week, in which we discussed using the yield keyword to short-cut method execution. Lee has a great post about sequences here. I wanted to get down how my thinking about the yield keyword has changed.
Historically I've only thought about using yield when defining a custom enumerator method, something like GetFooEnumerator - typically I'd have placed this on some kind of model object to allow easy iteration of its child model instances. Lee was saying why not use yield wherever you have the following usage pattern:
    public IEnumerable<int> Range(int start, int end)
    {
        var numbers = new List<int>();
        for (var i = start; i < end; i++)
        {
            // Do some business logic here...
            numbers.Add(i);
        }

        return numbers;
    }
Whenever you have a method which internally creates & populates a collection and then returns it, why not use yield to make it more efficient?
This got me thinking: what kind of efficiency improvement are we talking about?
Time for a quick test. First we need a couple of methods to compare: one very similar to the above and another making use of the yield keyword, both achieving the same logic - returning a sequence of numbers to the caller:
    using System.Collections.Generic;

    public class Numbers
    {
        // Eager: builds the whole list before returning anything.
        public IEnumerable<int> Range(int start, int end)
        {
            var numbers = new List<int>();
            for (var i = start; i < end; i++)
            {
                numbers.Add(i);
            }

            return numbers;
        }

        // Lazy: produces one value each time the caller asks for the next item.
        public IEnumerable<int> YieldRange(int start, int end)
        {
            for (var i = start; i < end; i++)
            {
                yield return i;
            }
        }
    }
Then a test program - use each method as part of a LINQ call to check whether the returned sequence contains the number 420:
    using System;
    using System.Diagnostics;
    using System.Linq;

    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine();

            var numbers = new Numbers();

            // Time the eager version.
            var sw = new Stopwatch();
            sw.Start();
            var found1 = numbers.Range(0, 999).Any(n => n == 420);
            sw.Stop();

            Console.WriteLine("Normal Loop: ticks = " + sw.ElapsedTicks);

            // Time the yielding version.
            var sw2 = new Stopwatch();
            sw2.Start();
            var found2 = numbers.YieldRange(0, 999).Any(n => n == 420);
            sw2.Stop();

            Console.WriteLine("   Yielding: ticks = " + sw2.ElapsedTicks);
            Console.ReadLine();
        }
    }
Running this up, the output showed the yielding version taking more ticks than the normal loop!
What!
I wasn't expecting it to be slower; after all, the YieldRange method only has to iterate 421 times (0 through 420) before finding a match, whereas the Range method has to build all 999 numbers before returning...
Then @hamishdotnet pointed out the cost of yielding, and I remembered the chapter in Jon Skeet's C# in Depth about iterator blocks and how the compiler generates a hidden state-machine class whenever you use the yield keyword.
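To make that cost a bit more concrete, here is a rough hand-written stand-in for the kind of class the compiler produces for YieldRange. This is only a sketch: HandWrittenRange and IteratorDemo are names I've made up, and the real generated class has a different name, more states and extra thread checks. The point is simply that every element costs a MoveNext call plus a read of Current, which is overhead the plain loop over a List doesn't have.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified equivalent of a compiler-generated iterator class.
public sealed class HandWrittenRange : IEnumerable<int>, IEnumerator<int>
{
    private readonly int _start;
    private readonly int _end;
    private int _next;     // next value to hand out
    private int _current;  // value exposed through Current

    public HandWrittenRange(int start, int end)
    {
        _start = start;
        _end = end;
        _next = start;
    }

    // Each enumeration gets its own instance so state is never shared.
    public IEnumerator<int> GetEnumerator() => new HandWrittenRange(_start, _end);
    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();

    public int Current => _current;
    object IEnumerator.Current => _current;

    // One "loop iteration" per call: this is the per-element overhead.
    public bool MoveNext()
    {
        if (_next >= _end) return false;
        _current = _next++;
        return true;
    }

    public void Reset() => _next = _start;
    public void Dispose() { }
}

public static class IteratorDemo
{
    public static void Main()
    {
        var found = new HandWrittenRange(0, 999).Any(n => n == 420);
        Console.WriteLine(found); // True
    }
}
```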
Then I had the realisation that the methods didn't really contain any 'business logic' - DOH!
Basically the two implementations don't do any real work per item, so to simulate some I put a Thread.Sleep in to represent an I/O- or processor-bound call:
    public class Numbers
    {
        public IEnumerable<int> Range(int start, int end)
        {
            var numbers = new List<int>();
            for (var i = start; i < end; i++)
            {
                System.Threading.Thread.Sleep(10);
                numbers.Add(i);
            }

            return numbers;
        }

        public IEnumerable<int> YieldRange(int start, int end)
        {
            for (var i = start; i < end; i++)
            {
                System.Threading.Thread.Sleep(10);
                yield return i;
            }
        }
    }
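More promising results now: the Range method pays for all 999 sleeps before Any even starts, whereas YieldRange stops after 421 of them.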
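If you'd rather not wait on Thread.Sleep, the same difference can be seen by counting how often the 'business logic' runs. This is a minimal sketch, not the code I timed above: WorkCounter, Calls and CountCalls are hypothetical names, and the counter increment stands in for the expensive call.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WorkCounter
{
    // Hypothetical counter standing in for the expensive "business logic".
    public static int Calls;

    public static IEnumerable<int> Range(int start, int end)
    {
        var numbers = new List<int>();
        for (var i = start; i < end; i++)
        {
            Calls++;          // work is done for every element up front
            numbers.Add(i);
        }
        return numbers;
    }

    public static IEnumerable<int> YieldRange(int start, int end)
    {
        for (var i = start; i < end; i++)
        {
            Calls++;          // work is done only when the next element is requested
            yield return i;
        }
    }

    // Resets the counter, runs the same Any query as the test program,
    // and reports how many times the "business logic" ran.
    public static int CountCalls(Func<IEnumerable<int>> source)
    {
        Calls = 0;
        source().Any(n => n == 420);
        return Calls;
    }

    public static void Main()
    {
        Console.WriteLine(CountCalls(() => Range(0, 999)));      // 999
        Console.WriteLine(CountCalls(() => YieldRange(0, 999))); // 421
    }
}
```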
So what do I mean by 'Applying LINQ principles to business logic'?
By thinking about concepts such as 'lazy evaluation' & 'materialization' when designing your methods, you give your code the opportunity to execute more efficiently when it is used as part of a LINQ query.
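Here's a small sketch of those two ideas side by side. LazyDemo, Squares and Evaluated are names invented for this example. Building the query runs none of the method body (lazy evaluation); the work only happens when ToList forces the results into a collection (materialization), and even then only as far as the query needs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyDemo
{
    // Hypothetical counter recording how many elements were actually produced.
    public static int Evaluated;

    public static IEnumerable<int> Squares(int count)
    {
        for (var i = 0; i < count; i++)
        {
            Evaluated++;
            yield return i * i;
        }
    }

    public static void Main()
    {
        // Lazy evaluation: composing the query does not run Squares at all.
        var query = Squares(1000).Where(n => n % 2 == 0).Take(3);
        Console.WriteLine(Evaluated); // 0

        // Materialization: ToList pulls items through the pipeline.
        var materialized = query.ToList();
        Console.WriteLine(string.Join(",", materialized)); // 0,4,16

        // Only as many squares as were needed got produced, not all 1000.
        Console.WriteLine(Evaluated < 1000); // True
    }
}
```

Had Squares built and returned a List, all 1000 values would have been computed before Where or Take saw a single one.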