Bill Blogs in C#

Bill Wagner discusses C#, LINQ, and other items of interest

Deferred vs. Immediate query
Do it now, or do it later

Today, I’m going to discuss two important LINQ concepts: custom sequence operators and deferred vs. immediate execution.

First, let’s consider the custom sequence operator. This method computes the dot product of two vectors:

public void Linq98() {            
    int[] vectorA = { 0, 2, 4, 5, 6 };
    int[] vectorB = { 1, 3, 5, 7, 8 };
    int dotProduct = vectorA.Combine(vectorB, (a, b) => a * b).Sum();
    Console.WriteLine("Dot product: {0}", dotProduct);
}

The standard LINQ libraries do not contain a Combine method, but LINQ is completely extensible, so you can just build your own:

public static class CustomSequenceOperators
{
    public static IEnumerable<T> Combine<T>(
        this IEnumerable<T> first,
        IEnumerable<T> second,
        Func<T, T, T> func) {
        using (IEnumerator<T> e1 = first.GetEnumerator(), e2 = second.GetEnumerator()) {
            while (e1.MoveNext() && e2.MoveNext()) {
                yield return func(e1.Current, e2.Current);
            }
        }
    }
}

There’s a lot going on here, so let’s look at it carefully.

The Combine method is a generic method with one type parameter. In this example, T will be an int, so you can mentally perform that substitution if it makes it easier.

Combine takes three parameters: two IEnumerable<T> instances, representing the two sequences to combine, and a Func<T,T,T>, which represents the combining function. It enumerates both sequences in step and calls the function with the Nth element from each, stopping as soon as either sequence runs out. The return value is the sequence containing the result of each call. To get the dot product, the first method simply sums that sequence.

The important lesson of this sample is that you can write your own extension methods (like Combine()) to provide extra capabilities that you need.
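To make that concrete, here is a small sketch that exercises the operator. It repeats Combine so it compiles on its own; the CombineDemo class and the DotProduct helper are names I made up for illustration, not part of LINQ. The second call shows what happens when the sequences differ in length: the loop stops as soon as either enumerator is exhausted.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CombineDemo
{
    // Same operator as in the post, repeated so this sketch is self-contained.
    public static IEnumerable<T> Combine<T>(
        this IEnumerable<T> first,
        IEnumerable<T> second,
        Func<T, T, T> func)
    {
        using (IEnumerator<T> e1 = first.GetEnumerator())
        using (IEnumerator<T> e2 = second.GetEnumerator())
        {
            // Stops when either sequence runs out of elements.
            while (e1.MoveNext() && e2.MoveNext())
            {
                yield return func(e1.Current, e2.Current);
            }
        }
    }

    // Hypothetical helper: multiply element-wise, then sum the products.
    public static int DotProduct(int[] a, int[] b)
    {
        return a.Combine(b, (x, y) => x * y).Sum();
    }

    public static void Main()
    {
        int[] vectorA = { 0, 2, 4, 5, 6 };
        int[] vectorB = { 1, 3, 5, 7, 8 };
        // 0*1 + 2*3 + 4*5 + 5*7 + 6*8 = 109
        Console.WriteLine("Dot product: {0}", DotProduct(vectorA, vectorB));

        int[] shorter = { 1, 3 };
        // Only the first two pairs are combined: 0*1 + 2*3 = 6
        Console.WriteLine("Truncated: {0}", DotProduct(vectorA, shorter));
    }
}
```

Whether silently truncating is the right behavior for mismatched lengths is a design choice. For a real dot product you might prefer to throw instead.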

The next two methods demonstrate the difference between deferred execution (the default), and immediate execution (which you can request).

Look at these two methods; the only difference is the call to ToList() in the second:

public void Linq99() {
    int[] numbers = new int[] { 5, 4, 1, 3, 9, 8, 6, 7, 2, 0 };
    int i = 0;
    var q =
        from n in numbers
        select ++i;

    foreach (var v in q) {
        Console.WriteLine("v = {0}, i = {1}", v, i);         
    } 
}

public void Linq100() {
    int[] numbers = new int[] { 5, 4, 1, 3, 9, 8, 6, 7, 2, 0 };
    int i = 0;
    var q = (
        from n in numbers
        select ++i)
        .ToList();

    foreach (var v in q) {
        Console.WriteLine("v = {0}, i = {1}", v, i);         
    } 
}

The first method produces this:

v = 1, i = 1
v = 2, i = 2
v = 3, i = 3
v = 4, i = 4
v = 5, i = 5
v = 6, i = 6
v = 7, i = 7
v = 8, i = 8
v = 9, i = 9
v = 10, i = 10

The second, this:

v = 1, i = 10
v = 2, i = 10
v = 3, i = 10
v = 4, i = 10
v = 5, i = 10
v = 6, i = 10
v = 7, i = 10
v = 8, i = 10
v = 9, i = 10
v = 10, i = 10

The difference is that a query is executed only when the caller requests data from it (in this case, in the foreach loop at the bottom of the method). In the first method, i is incremented one step at a time as the loop pulls each element, so v and i advance together from 1 to 10. In the second method, the call to ToList() executes the query immediately and stores all of the results in a list. By the time the loop starts, i is already 10, and enumerating the list does not change it.
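Another way to see this is to watch the counter directly. The sketch below is my own illustration (the DeferredDemo class and CountSideEffects method are made-up names) and uses the method-call form of the same select. It records i right after the query is defined, after one pass over it, and after a second pass. Nothing runs at definition time, and each enumeration runs the query again.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Returns the counter observed after defining the query,
    // after one enumeration, and after a second enumeration.
    public static int[] CountSideEffects()
    {
        int[] numbers = { 5, 4, 1 };
        int i = 0;
        IEnumerable<int> q = numbers.Select(n => ++i);

        int afterDefinition = i;   // the query has not run yet

        foreach (int v in q) { }   // first pass runs the select once per element
        int afterFirstPass = i;

        foreach (int v in q) { }   // second pass runs the select all over again
        int afterSecondPass = i;

        return new[] { afterDefinition, afterFirstPass, afterSecondPass };
    }

    public static void Main()
    {
        int[] r = CountSideEffects();
        // Prints: 0, 3, 6
        Console.WriteLine("{0}, {1}, {2}", r[0], r[1], r[2]);
    }
}
```

The practical takeaway is that a query with side effects repeats those side effects every time it is enumerated, unless you cache the results with something like ToList().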

Another example of this same behavior can be seen in this method, which reuses a query object after changing the underlying collection. The second iteration executes the query again, producing new results.  Here’s the method:

public void Linq101() {
    int[] numbers = new int[] { 5, 4, 1, 3, 9, 8, 6, 7, 2, 0 };
    var lowNumbers =
        from n in numbers
        where n <= 3
        select n;
    Console.WriteLine("First run numbers <= 3:");
    foreach (int n in lowNumbers) {
        Console.WriteLine(n);
    }
    for (int i = 0; i < 10; i++) {
        numbers[i] = -numbers[i];
    }
    Console.WriteLine("Second run numbers <= 3:");
    foreach (int n in lowNumbers) {
        Console.WriteLine(n);
    }
}

The results are:

First run numbers <= 3:
1
3
2
0
Second run numbers <= 3:
-5
-4
-1
-3
-9
-8
-6
-7
-2
0

Note how the results are based on the contents of the source collection when the query is executed, not when it's created.
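If you want the results as they were before the source changed, you can request immediate execution and keep the list. This is a minimal sketch under that assumption (SnapshotDemo and Run are hypothetical names of mine). It keeps both the live query and a ToList() snapshot, negates the array, and then compares how many elements each one yields.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SnapshotDemo
{
    // Returns { snapshot count, live query count } after the source is modified.
    public static int[] Run()
    {
        int[] numbers = { 5, 4, 1, 3, 9, 8, 6, 7, 2, 0 };

        // Deferred: evaluated against whatever numbers holds at enumeration time.
        IEnumerable<int> live = numbers.Where(n => n <= 3);

        // Immediate: evaluated right now, yielding 1, 3, 2, 0.
        List<int> snapshot = live.ToList();

        for (int i = 0; i < numbers.Length; i++)
        {
            numbers[i] = -numbers[i];
        }

        // The snapshot still holds 4 items; the live query now matches all 10.
        return new[] { snapshot.Count, live.Count() };
    }

    public static void Main()
    {
        int[] counts = Run();
        // Prints: snapshot = 4, live = 10
        Console.WriteLine("snapshot = {0}, live = {1}", counts[0], counts[1]);
    }
}
```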

In the next installment we’ll look at Join methods in LINQ.



Part 1
The general Query Syntax
Part 2
The one where I discuss Object and Collection Initializers
Part 3
The one where I finish restriction operators
Part 4
Beginning to discuss projections
Part 5
Anonymous types and projections
Part 6
Discussing indexed, filtered, and compound queries
Part 7
Finishing up the projection items
Part 8
Projection operators and extension methods
Part 9
OrderBy, ThenBy, and Descending, oh my
Part 10
Grouping operators and building nested groups
Part 11
Set Operations, you bet
Part 12
Conversions: caching collections
Part 13
Where U at item, where U at?
Part 14
Boolean tests on sequences
Part 15
Aggregation operators: Sum, Product, Averages, and more
Part 16
Concatenation and EqualAll
Published Wednesday, June 21, 2006 9:34 PM by wwagner