Grouping operators, and building nested groups
So, we’ve covered the ordering operations; next comes grouping. The LINQ grouping operators let you subdivide the results of your queries based on any property of the objects returned. You can use this functionality for subtotals, to arrange objects in categories, or to provide other calculations based on sub-groups of objects.
The grouping operators perform their actions by using the capabilities of anonymous types to produce the groupings, and a generic class to store the group. These samples will show you some of the interesting things you can build with the new features in C# 3.0. Implementing grouping does not need any features that I haven’t blogged about before, but uses them to build some new classes that you haven’t seen.
To begin, let’s look at the first grouping sample in its entirety:
public void Linq40() {
    int[] numbers = { 5, 4, 1, 3, 9, 8, 6, 7, 2, 0 };
    var numberGroups =
        from n in numbers
        group n by n % 5 into g
        select new {Remainder = g.Key, Numbers = g.Group};
    foreach (var g in numberGroups) {
        Console.WriteLine(
            "Numbers with a remainder of {0} when divided by 5:",
            g.Remainder);
        foreach (var n in g.Numbers) {
            Console.WriteLine(n);
        }
    }
}
The output shows different collections for each possible remainder:
Numbers with a remainder of 0 when divided by 5:
5
0
Numbers with a remainder of 4 when divided by 5:
4
9
Numbers with a remainder of 1 when divided by 5:
1
6
Numbers with a remainder of 3 when divided by 5:
3
8
Numbers with a remainder of 2 when divided by 5:
7
2
There is one new aspect to this query, this line of code:
group n by n % 5 into g
That calls the GroupBy extension method, so the whole query could be written like this:
numbers.GroupBy(n => n % 5)
    .Select(g => new {Remainder = g.Key, Numbers = g.Group});
The GroupBy method is simply another extension method defined in Sequence.cs. There are four versions that return a sequence of groups. The sequence is a dictionary that contains Key / Group elements. The Key is the item that determines the group membership (in the case above, that’s the remainder). The Group is a list of all the elements that match the key.
The sequence is an instance of another generic class defined in Sequence.cs: the Grouping<K, T> class.
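If you are reading this against the released framework rather than the preview bits, the shape is slightly different: in the System.Linq that shipped with .NET 3.5, GroupBy returns IEnumerable<IGrouping<TKey, TElement>>, and a grouping exposes Key but is itself enumerable instead of carrying a Group property. Here is a minimal sketch of the first sample against that API. The RemainderGroups class and its Build method are names I made up for illustration; they are not part of the samples.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RemainderGroups
{
    // Illustrative helper (not from the samples): groups numbers by n % 5.
    // Each grouping is enumerated directly; there is no .Group property.
    public static List<KeyValuePair<int, List<int>>> Build(int[] numbers)
    {
        return numbers
            .GroupBy(n => n % 5)
            .Select(g => new KeyValuePair<int, List<int>>(g.Key, g.ToList()))
            .ToList();
    }

    public static void Main()
    {
        int[] numbers = { 5, 4, 1, 3, 9, 8, 6, 7, 2, 0 };
        foreach (var pair in Build(numbers))
        {
            Console.WriteLine(
                "Remainder {0}: {1}", pair.Key, string.Join(", ", pair.Value));
        }
    }
}
```

Groups come back in the order their keys first appear in the source, which is why the remainders are not sorted.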
Other simple groups can be formed to create lists of words that start with the same letter:
string[] words = { "blueberry", "chimpanzee", "abacus",
                   "banana", "apple", "cheese" };
var wordGroups =
    from w in words
    group w by w[0] into g
    select new {FirstLetter = g.Key, Words = g.Group};
Or, for an example closer to real life, products that are part of the same category:
List<Product> products = GetProductList();
var orderGroups =
    from p in products
    group p by p.Category into g
    select new {Category = g.Key, Products = g.Group};
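Since I mentioned subtotals at the top, here is a rough sketch of what that might look like for the category grouping, written against the released System.Linq API. The Product class below is a hypothetical stand-in with only two members, the CategorySubtotals helper is my own name, and the three products are made-up data for the demo.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the sample's Product type; only the members used here.
public class Product
{
    public string Category { get; set; } = "";
    public decimal UnitPrice { get; set; }
}

public static class CategorySubtotals
{
    // Returns the summed UnitPrice for each category.
    public static Dictionary<string, decimal> Build(IEnumerable<Product> products)
    {
        var subtotals =
            from p in products
            group p by p.Category into g
            select new { Category = g.Key, Total = g.Sum(item => item.UnitPrice) };

        return subtotals.ToDictionary(s => s.Category, s => s.Total);
    }

    public static void Main()
    {
        // Made-up data, only to exercise the query.
        var products = new List<Product>
        {
            new Product { Category = "Beverages", UnitPrice = 18.00m },
            new Product { Category = "Condiments", UnitPrice = 10.00m },
            new Product { Category = "Beverages", UnitPrice = 19.00m },
        };

        foreach (var pair in Build(products))
        {
            Console.WriteLine("{0}: {1}", pair.Key, pair.Value);
        }
    }
}
```

The aggregate runs over the grouping itself, so the subtotal is just one more expression in the select.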
Like everything else I’ve discussed so far in the LINQ code, these are small building blocks that you can combine into larger, more complicated queries. For example, the following query builds a list of orders grouped by customer, sub-grouped by year, and then sub-grouped by month.
public void Linq43() {
    List<Customer> customers = GetCustomerList();
    var customerOrderGroups =
        from c in customers
        select
            new {c.CompanyName,
                 YearGroups =
                     from o in c.Orders
                     group o by o.OrderDate.Year into yg
                     select
                         new {Year = yg.Key,
                              MonthGroups =
                                  from o in yg.Group
                                  group o by o.OrderDate.Month into mg
                                  select new {Month = mg.Key, Orders = mg.Group}
                             }
                };
    ObjectDumper.Write(customerOrderGroups, 3);
}
A section of the output looks like this:
CompanyName=Alfreds Futterkiste YearGroups=...
YearGroups: Year=1997 MonthGroups=...
MonthGroups: Month=8 Orders=...
Orders: OrderID=10643 OrderDate=8/25/1997 Total=814.50
MonthGroups: Month=10 Orders=...
Orders: OrderID=10692 OrderDate=10/3/1997 Total=878.00
Orders: OrderID=10702 OrderDate=10/13/1997 Total=330.00
YearGroups: Year=1998 MonthGroups=...
MonthGroups: Month=1 Orders=...
Orders: OrderID=10835 OrderDate=1/15/1998 Total=845.80
MonthGroups: Month=3 Orders=...
Walk through this a bit, and you’ll see it’s not really so hard.
from c in customers                      // Iterate every customer.
select                                   // Select something from the customer.
    new {c.CompanyName,                  // Get the customer name.
         YearGroups =                    // Create a new group object.
             from o in c.Orders          // From the orders.
             group o by o.OrderDate.Year into yg  // yg is the year group.
             select                      // The group is the output of another select.
                 new {Year = yg.Key,     // The key is the year.
                      MonthGroups =      // The list is a list of groups.
                          from o in yg.Group  // Taken from a year’s worth of orders.
                          group o by o.OrderDate.Month into mg  // The group is named mg.
                          // The key is the month, the orders are the list.
                          select new {Month = mg.Key, Orders = mg.Group}
                     }
        };
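To try the year and month nesting without the sample database, here is a cut-down sketch against the released System.Linq API, where you enumerate yg directly instead of yg.Group. The Order class is a hypothetical stand-in with two members, OrdersByYearAndMonth is my own helper name, and the four orders are the ones visible in the output above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the sample's order type; only the members used here.
public class Order
{
    public int OrderID { get; set; }
    public DateTime OrderDate { get; set; }
}

public static class OrdersByYearAndMonth
{
    // Maps year -> month -> order IDs, using the same nested group/select shape.
    public static Dictionary<int, Dictionary<int, List<int>>> Build(IEnumerable<Order> orders)
    {
        var yearGroups =
            from o in orders
            group o by o.OrderDate.Year into yg
            select new
            {
                Year = yg.Key,
                MonthGroups =
                    from item in yg   // the grouping itself is the sequence
                    group item by item.OrderDate.Month into mg
                    select new
                    {
                        Month = mg.Key,
                        OrderIDs = mg.Select(x => x.OrderID).ToList()
                    }
            };

        return yearGroups.ToDictionary(
            y => y.Year,
            y => y.MonthGroups.ToDictionary(m => m.Month, m => m.OrderIDs));
    }

    public static void Main()
    {
        var orders = new List<Order>
        {
            new Order { OrderID = 10643, OrderDate = new DateTime(1997, 8, 25) },
            new Order { OrderID = 10692, OrderDate = new DateTime(1997, 10, 3) },
            new Order { OrderID = 10702, OrderDate = new DateTime(1997, 10, 13) },
            new Order { OrderID = 10835, OrderDate = new DateTime(1998, 1, 15) },
        };

        foreach (var year in Build(orders))
        {
            foreach (var month in year.Value)
            {
                Console.WriteLine("{0}/{1}: {2}",
                    year.Key, month.Key, string.Join(", ", month.Value));
            }
        }
    }
}
```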
The last two samples I'll discuss today show that you can form groups based on some calculated property of the items in your list. The first shows how to create groups of anagrams. There are two points to notice here. First, the key is the word, trimmed of all blanks. Second, group membership is decided by the equality comparer defined in the AnagramEqualityComparer class, which considers two strings equal if they contain the same letters.
public void Linq44() {
    string[] anagrams = {"from ", " salt", " earn ",
                         " last ", " near ", " form "};
    var orderGroups = anagrams.GroupBy(w => w.Trim(),
        new AnagramEqualityComparer());
    ObjectDumper.Write(orderGroups, 1);
}
public class AnagramEqualityComparer : IEqualityComparer<string>
{
    public bool Equals(string x, string y) {
        return getCanonicalString(x) == getCanonicalString(y);
    }
    public int GetHashCode(string obj) {
        return getCanonicalString(obj).GetHashCode();
    }
    private string getCanonicalString(string word) {
        char[] wordChars = word.ToCharArray();
        Array.Sort<char>(wordChars);
        return new string(wordChars);
    }
}
The output is shown below:
Key=from Group=...
Group: from
Group: form
Key=salt Group=...
Group: salt
Group: last
Key=earn Group=...
Group: earn
Group: near
And, of course, there’s nothing magical about those operations. The next sample shows that you can create a new object to use as the contents of the list. In this version, each string is converted to upper case before being added to the list:
string[] anagrams = {"from ", " salt", " earn ",
                     " last ", " near ", " form "};
var orderGroups = anagrams.GroupBy(w => w.Trim(), a => a.ToUpper(),
    new AnagramEqualityComparer());
The output is below:
Key=from Group=...
Group: FROM
Group: FORM
Key=salt Group=...
Group: SALT
Group: LAST
Key=earn Group=...
Group: EARN
Group: NEAR
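If you don't care that the key is the first word seen, one alternative to a custom comparer is to group on the canonical string directly. To be clear, this is a different technique from the sample above, not what it does: the key becomes the sorted letters (for example "fmor") instead of "from". It is a sketch against the released System.Linq API, and AnagramKeys, Canonical, and Build are names I invented for it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AnagramKeys
{
    // Sorts the letters of the trimmed word, so all anagrams share one key.
    public static string Canonical(string word)
    {
        char[] chars = word.Trim().ToCharArray();
        Array.Sort(chars);
        return new string(chars);
    }

    // Groups trimmed words by canonical key; no custom comparer involved.
    public static Dictionary<string, List<string>> Build(IEnumerable<string> words)
    {
        return words
            .GroupBy(w => Canonical(w), w => w.Trim())
            .ToDictionary(g => g.Key, g => g.ToList());
    }

    public static void Main()
    {
        string[] anagrams = { "from ", " salt", " earn ",
                              " last ", " near ", " form " };

        foreach (var pair in Build(anagrams))
        {
            Console.WriteLine("{0}: {1}", pair.Key, string.Join(", ", pair.Value));
        }
    }
}
```

The comparer version keeps a readable key; this version trades that for less code, so which one fits depends on what you plan to show the user.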
Next, I’ll discuss the set operators, which help you manage the contents of queries.
Part 1: The general query syntax
Part 2: The one where I discuss Object and Collection initializers
Part 3: The one where I finish restriction operators
Part 4: Beginning to discuss projections
Part 5: Anonymous types and projections
Part 6: Discussing indexed, filtered, and compound queries
Part 7: Finishing up the projection items
Part 8: Projection Operators and Extension methods
Part 9: OrderBy, ThenBy, and Descending. Oh my