Bill Blogs in C#

Bill Wagner discusses C#, LINQ, and other items of interest

February 2006 - Posts

Anonymous types and projections

The next set of LINQ samples takes us into the realm of Anonymous Types. It’s really not a hard concept. It’s just several small steps that add up to one huge leap in terms of programming language power.

Let’s look at the first anonymous type sample. Here’s the code:

public void Linq9() {
  string[] words = { "aPPLE", "BlUeBeRrY", "cHeRry" };
  var upperLowerWords =
    from w in words
    select new {Upper = w.ToUpper(), Lower = w.ToLower()};
    
  foreach (var ul in upperLowerWords) {
    Console.WriteLine("Uppercase: {0}, Lowercase: {1}", ul.Upper, ul.Lower);
  }
}

The output is simply the upper and lower case versions of all the input strings:

Uppercase: APPLE, Lowercase: apple
Uppercase: BLUEBERRY, Lowercase: blueberry
Uppercase: CHERRY, Lowercase: cherry

Conceptually, this is a simple example: from every string, you want to produce a pair of strings. The pair contains the upper case and lower case versions of the input string.

Simple, right?

Of course it is. In C# today, you can create this class by hand in minutes: two private fields to contain the strings, and two public read / write properties to provide access to the private fields. It’s so simple that we could assign it to our favorite intern, and expect a reasonably correct solution, including unit tests, by lunchtime.
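To make the by-hand version concrete, here is a rough sketch of what that intern might turn in. The names UpperLowerPair and HandWrittenProjection are mine, made up for illustration; they are not what the compiler or the LINQ samples use.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical hand-written pair type: two private fields,
// two public read/write properties.
public class UpperLowerPair
{
    private string _upper;
    private string _lower;

    public string Upper
    {
        get { return _upper; }
        set { _upper = value; }
    }

    public string Lower
    {
        get { return _lower; }
        set { _lower = value; }
    }
}

public static class HandWrittenProjection
{
    // Builds one pair per input word, without any query syntax.
    public static List<UpperLowerPair> Project(string[] words)
    {
        List<UpperLowerPair> result = new List<UpperLowerPair>();
        foreach (string w in words)
        {
            UpperLowerPair pair = new UpperLowerPair();
            pair.Upper = w.ToUpper();
            pair.Lower = w.ToLower();
            result.Add(pair);
        }
        return result;
    }

    public static void Main()
    {
        string[] words = { "aPPLE", "BlUeBeRrY", "cHeRry" };
        foreach (UpperLowerPair ul in Project(words))
        {
            Console.WriteLine("Uppercase: {0}, Lowercase: {1}", ul.Upper, ul.Lower);
        }
    }
}
```

That is roughly forty lines of ceremony for a type whose only job is to carry two strings from the query to the loop.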

But, hopefully, even our interns have something better to do, so the C# team is making the compiler do the grunt work of creating the new type for us. By writing:

  select new {Upper = w.ToUpper(), Lower = w.ToLower()};

We tell the compiler to define a new type, with two string properties “Upper” and “Lower”, and corresponding private fields. Furthermore, the compiler should return an instance of that new type for each result.

Great. I’m loving it. But, to make it work, the C# language needs some new features. How does the compiler determine the names for the public properties? You actually name them. Whatever label you place on the left hand side of each assignment inside the new expression determines the name of the public property. The type of the right hand side of each assignment determines the type of the property.

Now comes the part of the ocean with the dragons: What is the name of the new type? It really doesn’t have a name. At least, it doesn’t have a name you can use. Looking at the sample running in the debugger, my version gives the name “<Projection>f__28” to this particular type. You don’t want to type that in your code. In fact, you can’t because the compiler might change the name on the next compilation. So, you use var to declare variables of an anonymous type. In the example above, the type of ‘ul’ is <Projection>f__28. That’s because each member of the upperLowerWords collection is of the type “<Projection>f__28”. Furthermore, the type of upperLowerWords is “System.Query.Sequence.Select<string, <Projection>f__28>”.

If you use the debugger to look inside the anonymous type, you’ll find that <Projection>f__28 contains two private string fields: _Lower, and _Upper. It also contains two public read/write string properties: Lower and Upper.

Simple, right?

In a sense it is. The compiler did exactly what you would have done by hand, but did so automatically. The only hard part for you is to understand that ‘var’ is really a substitute for ‘whatever name the compiler gave that type I told it to produce’.

OK, I lied. ‘var’ is really a substitute for ‘the type on the right hand side of the initialization statement.’ But, ‘var’ exists, in part, because sometimes you don’t know the name of the type on the right hand side of the assignment. That makes it hard to declare a variable for it.
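Here is a small sketch of that point using ordinary named types. VarDemo and its method are hypothetical names, and I am assuming a current C# compiler rather than the preview bits.

```csharp
using System;

public static class VarDemo
{
    // Returns the type names of two locals declared with var.
    public static string[] InferredTypeNames()
    {
        var count = 5;          // compile-time type is int
        var name = "cherry";    // compile-time type is string

        // count = "five";      // would not compile: a string cannot be assigned to an int

        return new[] { count.GetType().Name, name.GetType().Name };
    }

    public static void Main()
    {
        foreach (string typeName in InferredTypeNames())
        {
            Console.WriteLine(typeName);
        }
    }
}
```

The commented-out assignment is the interesting line: the compiler rejects it, which is exactly what you would expect from a strongly typed int and not from a loosely typed variable.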

The next two samples show that this feature is not limited to strings. The next sample shows that you can create an anonymous type with a Boolean value:

public void Linq10() {
  int[] numbers = { 5, 4, 1, 3, 9, 8, 6, 7, 2, 0 };
  string[] strings = { "zero", "one", "two", "three", 
    "four", "five", "six", "seven", "eight", "nine" };
  var digitOddEvens =
  from n in numbers
  select new {Digit = strings[n], Even = (n % 2 == 0)};

  foreach (var d in digitOddEvens) {
    Console.WriteLine("The digit {0} is {1}.", d.Digit, 
      d.Even ? "even" : "odd");
  }
}

The compiler-generated type (this time called <Projection>f__32) contains a string and a Boolean value.

The last sample shows one obvious use case for this feature. Here, you’re creating a new type that contains a subset of the source type.

public void Linq11() {
  List<Product> products = GetProductList();
  var productInfos =
    from p in products
    select new {p.ProductName, p.Category, Price = p.UnitPrice};

  Console.WriteLine("Product Info:");
  foreach (var productInfo in productInfos) {
    Console.WriteLine("{0} is in the category {1} and costs {2} per unit.", 
    productInfo.ProductName, productInfo.Category, productInfo.Price);
  }
}

The compiler generated type contains three properties: Category (a string), Price (a decimal), and ProductName (also a string).

Note that with two of the properties (Category and ProductName), the name of the property is projected from the source object. The Price property is explicitly named.
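If you want to try this without the sample database, here is a self-contained sketch. The Product class below is a hypothetical stand-in I wrote for illustration (the real one ships with the LINQ samples), and I am assuming a current compiler where the query operators live in System.Linq rather than the preview's System.Query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the Product class from the LINQ samples.
public class Product
{
    public string ProductName { get; set; }
    public string Category { get; set; }
    public decimal UnitPrice { get; set; }
    public int UnitsInStock { get; set; }
}

public static class ProjectionNames
{
    public static List<string> Describe(List<Product> products)
    {
        var productInfos =
            from p in products
            // ProductName and Category keep their source names; Price is named explicitly.
            select new { p.ProductName, p.Category, Price = p.UnitPrice };

        var lines = new List<string>();
        foreach (var info in productInfos)
        {
            lines.Add(string.Format("{0} is in the category {1} and costs {2} per unit.",
                info.ProductName, info.Category, info.Price));
        }
        return lines;
    }

    public static void Main()
    {
        var products = new List<Product>
        {
            new Product { ProductName = "Chai", Category = "Beverages", UnitPrice = 18m, UnitsInStock = 39 },
            new Product { ProductName = "Tofu", Category = "Produce", UnitPrice = 23.25m, UnitsInStock = 35 }
        };
        foreach (string line in Describe(products))
        {
            Console.WriteLine(line);
        }
    }
}
```

Notice that UnitsInStock never makes it into the result: the anonymous type carries only the three members the select clause asked for.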

In this installment, I gave a brief overview of anonymous types, and the other C# syntax elements that are necessary to support them. There are two key points to remember on anonymous types. First, the concept really is not that hard. You define the contents of the anonymous type in your initialization statements. And, the compiler creates the simplest definition that matches your request. Second, don’t get too hung up on var. It’s simply a placeholder for the type on the right hand side of the initialization statement. Almost everyone mistakes this for a loosely typed variable. It’s not. It’s just a shorthand way to say ‘The type of this variable matches the compile-time type of the right hand side of the assignment.’



Posted by wwagner | with no comments
Dianne chimes in

So, Dianne wanted to call my bluff on the “what’s your biggest failure?” question. If I want to ask the question, I should be willing to answer it:

Several years ago, I was working on a software game project aimed at small children (3 – 8 years old). One of the features was to navigate immediately to the home screen. It let the user go pick the next activity, no matter what he or she was doing at the time. We figured that this feature should be easily accessible and easy to remember, so we picked the space bar: Press the space bar, go to the home screen.

There was only one problem: little kids had a tendency to hit the space bar. A lot. See, it’s the biggest key on the keyboard, and it’s right in the middle, and closest to those small hands. We actually had some smaller children crying during our first user trials. They’d hit the space bar, promptly quitting the game they were enjoying.

That experience had us re-examine every navigation key, and rework quite a few of them.

What did I learn? Simply that we are not the user community for what we create. And, even when we recognize that, we still mess up because we miss important facets of our users’ experiences. We thought children would want the ‘home’ navigation key to be easy to get to. By picking the space bar, we certainly succeeded. But, we failed in our goal of making the game enjoyable for small children: We were causing frustration by making a certain action too easy to perform accidentally.

The point of these questions, in my not-so-humble opinion, is to try and assess someone’s potential for growth, and whether or not they can be constructively self-critical. Anyone that really thinks their weaknesses are strengths, or doesn’t admit mistakes, can’t grow (professionally or personally). And, those folks that want to hide their mistakes, maybe by avoiding giving co-workers and managers the bad news that inevitably shows up during a project, will probably cause larger problems by delaying corrective action that could be taken.

I don’t expect perfect people, but I do want to work with people that will continue to improve themselves, their craft, and their contribution.

And, now that I’ve given away my answer, I need to come up with a new interview question to assess this.



Posted by wwagner | with no comments
Beginning to discuss projections

In this installment, I’ll discuss the next three LINQ samples, which introduce the projection operators. Remember that the initial samples all discussed Restriction operators. The restriction operators all help determine which items get extracted from a collection. By contrast, the projection operators determine where those items go. Or, how those query results are transformed into some new object, based on what you need.

For example, the first projection sample creates a sequence of numbers where every element is the corresponding element from the source array, plus 1:

public void Linq6() {
  int[] numbers = { 5, 4, 1, 3, 9, 8, 6, 7, 2, 0 };

  var numsPlusOne =
    from n in numbers
    select n + 1;

  Console.WriteLine("Numbers + 1:");
  foreach (var i in numsPlusOne) {
    Console.WriteLine(i);
  }
}

By selecting (n + 1), you are projecting a new value into the result, based on some operation using the input.
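It may help to see the same projection written both ways: as a query expression, and as a direct call to the Select extension method with a lambda. This is only a sketch; SelectForms and its method names are mine, and I am assuming a current compiler with System.Linq in place of the preview's System.Query.

```csharp
using System;
using System.Linq;

public static class SelectForms
{
    // Projection written as a query expression.
    public static int[] WithQuerySyntax(int[] numbers)
    {
        var numsPlusOne =
            from n in numbers
            select n + 1;
        return numsPlusOne.ToArray();
    }

    // The same projection written as a call to the Select extension method.
    public static int[] WithMethodSyntax(int[] numbers)
    {
        return numbers.Select(n => n + 1).ToArray();
    }

    public static void Main()
    {
        int[] numbers = { 5, 4, 1, 3, 9, 8, 6, 7, 2, 0 };
        int[] fromQuery = WithQuerySyntax(numbers);
        int[] fromMethod = WithMethodSyntax(numbers);

        Console.WriteLine("Numbers + 1:");
        for (int i = 0; i < fromQuery.Length; i++)
        {
            Console.WriteLine("{0} {1}", fromQuery[i], fromMethod[i]);
        }
    }
}
```

Both methods return the same elements in the same order, because the query expression is just another way of spelling the Select call.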

Some projection operators are even simpler. The next sample produces a sequence of strings from a list of products:

public void Linq7() {
  List<Product> products = GetProductList();

  var productNames =
    from p in products
    select p.ProductName;

  Console.WriteLine("Product Names:");
  foreach (var productName in productNames) {
    Console.WriteLine(productName);
  }
}

A projection can operate on more than one collection. The next sample shows how you can project an element from one collection (strings) by querying another collection (numbers).

public void Linq8() {
  int[] numbers = { 5, 4, 1, 3, 9, 8, 6, 7, 2, 0 };
  string[] strings = { "zero", "one", "two", "three", 
    "four", "five", "six", "seven", "eight", "nine" };

  var textNums = 
    from n in numbers
    select strings[n];

  Console.WriteLine("Number strings:");
  foreach (var s in textNums) {
    Console.WriteLine(s);
  }
}

Which produces the string representation of the corresponding number array:

Number strings:
five
four
one
three
nine
eight
six
seven
two
zero

Well, this turned out to be a rather boring set of samples, unless you look into the future a little bit. It’s trivially obvious that you can extract a single field from the object returned by a query. And yet, that simple fact is very important to the power of LINQ. By executing query expressions on a collection, you can extract some information from the source objects and project that information, and possibly additional information, into the destination set. From that simple and obvious fact, you can create result sets that are compositions or transformations based on the original query set.



Posted by wwagner | with no comments
Bad interview questions

The folks over at JobsBlog are offering advice about that old standby interview question "What's your greatest weakness?"

One suggestion was to answer it like this:  "I’m goal oriented. I get satisfaction out of implementing things and seeing them come alive. Sometimes when there are projects where the decision makers don’t all agree... {Go read the full article for even more marketing buzzwords}"

In my opinion:  This is a crock.  If you can answer the "What's your biggest weakness?" question as a positive, all it shows is that you have mastered the B.S. that is marketing (see here: http://www.dilbert.com/comics/dilbert/games/career/bin/ms.cgi)

Or here:  http://www.netinsight.co.uk/portfolio/mission/missgen.asp

I hate that question, because it's a trick. If you have a "greatest weakness", you should be doing something to address it, right?  I mean, if it is a real weakness, and not one of those "it's a weakness that's really a strength". 

So, in my opinion, that's the better way to answer this loaded question:  Mention a real weakness, and discuss what you are doing to improve that weakness.

Which is why I ask a different question: "What was your biggest failure, and what did you learn from it?"

That gives the interviewee a real chance to discuss how a past experience has helped them grow. We've all made mistakes at different times. The real mark of growth is that you haven't continued to make the same ones: you learned from those experiences, and you can explain how those lessons will help you do a better job next time.

Otherwise, all you show me is your ability to deny responsibility for the mistakes you've made. That's not going to help our organization.

Your thoughts?



Posted by wwagner | with no comments
The one where I finish the restriction operators

In this entry, I’ll finish examining the Restriction operator samples that were delivered with the current LINQ preview.

The third restriction sample shows that you can have multiple predicates in a where clause. There really isn’t much more to add.

The fourth sample demonstrates a few more of the cool features inside the LINQ project. The sample method Linq4 simply shows how you can use the results of one query to chain into another query or report. It’s really nothing but an obvious extension of the previous examples. If a single collection contains more sub-collections, you can use the same techniques on the sub-collections.

Linq5 shows how you can operate on the elements of the collection using extension methods and lambda expressions.

Here’s the code:

public void Linq5() {
    string[] digits = { "zero", "one", "two", "three", "four", "five", "six",
        "seven", "eight", "nine" };
    var shortDigits = digits.Where((digit, index) => digit.Length < index);
    Console.WriteLine("Short digits:");
    foreach (var d in shortDigits) {
        Console.WriteLine("The word {0} is shorter than its value.", d);
    }
}

The concept is quite simple. It creates a set of strings where the length of the string is shorter than the number it represents. It does this by creating a lambda expression to compare the length of the string to its index in the array. The lambda expression:

(digit, index) => digit.Length < index

Defines a function that compares the length of the string ‘digit’ to its index in the digits array. Notice that digit and index are parameters to the delegate, and their types are inferred from the collection ‘digits’. Because digits is an array of strings, digit is a string. At least, in the compiler’s world. The index parameter is an int, and is the current index into the collection.
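To see what the two-parameter lambda is doing, compare it with a plain for loop that does the same filtering by hand. This is a sketch under my own names (IndexedWhere, WithLambda, WithLoop), and it assumes a current compiler where Where lives in System.Linq rather than the preview's System.Query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IndexedWhere
{
    // The lambda receives each element together with its position in the source.
    public static List<string> WithLambda(string[] digits)
    {
        return digits.Where((digit, index) => digit.Length < index).ToList();
    }

    // The same filter, spelled out with an explicit index.
    public static List<string> WithLoop(string[] digits)
    {
        var result = new List<string>();
        for (int index = 0; index < digits.Length; index++)
        {
            if (digits[index].Length < index)
            {
                result.Add(digits[index]);
            }
        }
        return result;
    }

    public static void Main()
    {
        string[] digits = { "zero", "one", "two", "three", "four", "five", "six",
            "seven", "eight", "nine" };
        foreach (string d in WithLambda(digits))
        {
            Console.WriteLine("The word {0} is shorter than its value.", d);
        }
    }
}
```

With the ten digit names above, both versions keep five, six, seven, eight, and nine: "four" drops out because its length (4) is not less than its index (4).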

Next, we’ll move through more samples that discuss the projection queries. Those help you format the output of different queries to create new types or initialize known types.



Posted by wwagner | with no comments
And a call for comments

A member of our local user group community sent the following question to me via email (slightly edited):

As a resident of Ann Arbor, and a .NET professional, I have heard you talk many times at the Computer Society. I was just wondering if you could comment (maybe thru your blog?) on the value of Microsoft Certifications. I know Microsoft loves them, but what does the industry think?
 
I have been a software developer for about 18 years now and apart from my 1996 Masters degree in Computing, I don't have a single certification. I'm wondering if having MCSD certification would be a benefit or a hindrance on my resume. This may seem a silly question, but I'm wondering if there is a situation where MCSD cert. looks as though someone is trying to compensate for weaknesses?
 
I can't think of any scenario where having a certification could be seen as a weakness. It's another credential, and you have to earn it.

Of course, having a Masters degree is a more weighty credential than an MCSD (or other industry cert), so you should view that as much more significant than an MCSD.

Of course, the question underlying this is how seriously the industry weighs MCSD credentials, in general. I can only speak for myself, but it's a minor consideration when I examine a resume. I would give formal education and work experience quite a bit more weight than any industry certification. The MCSD is a positive, but it's a small one.

I welcome other comments. What do you think when you see a resume with and without industry certifications?



Posted by wwagner | with no comments
For something so standard, they seem to generate more debate than anything else

I received the following question from a reader of Effective C#:

After reading Item 10: Understanding the pitfalls of GetHashCode() in your Effective C# book, I was confused why you stated that performance would suffer if the GetHashCode() method did not return a random distribution across all integers. I read up on Hashtables to see if I was missing something and could not find anything that indicated that the range had any impact on performance. At first glance, it seems that having sequential numbers (like the default implementation you describe) would actually decrease the number of collisions, thus helping performance. For example, 5 objects with hash codes of 1 – 5 could fit in the hashtable’s underlying array in the first 5 spaces. If the hashcodes were randomly distributed across all integers, then doing the modulus operation to get them to fit into the array creates the opportunity for collisions, thus hurting performance.

In any case, can you provide a hint as to why the range of hash codes affects performance?

Here's my answer:

First, let me start with a disclaimer that writing the optimal hashcode depends on the usage of the type.  So, there is no general ‘best’ answer.
 
Having said that, let’s look at the case of the default hash code, which is the object ID, and increments with each new object created.
 
Every reference-type object your application creates (strings, the hash table itself, window controls, or web controls, and any other business objects) uses one of these codes. So, chances are high that the first five keys you create would not have the hash codes of 1-5. But, that’s not the point…

More importantly, in the fullness of time, the default GetHashCode() does generate a random distribution over all integer values.  You just have to create a lot of objects.

The default hash code works fine if, and only if, you have not overridden the Equals() method for your type. If two keys are equal (as defined by the Equals() method), they must return the same hash code. Therefore, if you use the default Object.Equals() and the default Object.GetHashCode(), all is well.  The implementation of Object.GetHashCode() does distribute its return values equally among all integers.

However, if you have overridden Equals(), then you must override GetHashCode() so that equal objects produce equal hash codes. In that case, it is your responsibility to ensure that the values cover as wide a range as possible. Ideally, the range of possible hash codes should be the range of all Int32 values. The fewer values you use, the less efficient your hash code will be. For example, the performance of a hashtable containing objects where GetHashCode() always returns 0 will be quite poor. In fact, the performance of search operations on such a container is O(n). If the hash code values are equally distributed across all integers, the performance of a search operation should approach O(1).

Finally, the internal storage of the hashtable is not a simple array. More than one hash value may be assigned to the same bucket. You cannot assume it’s a modulo operator used to determine which bucket. Further, you cannot assume the number of buckets, so that even if a modulo operator is in use, you can’t know what base is being used. Lacking those assumptions, you have no choice but to distribute the hash values evenly across all integer values.
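As an illustration only, here is one way a type that overrides Equals() might keep GetHashCode() consistent with it. GridKey and HashDemo are hypothetical names of mine, the multiply-and-add mixing with 17 and 31 is just a common convention and not a recommendation from the book, and I am using the generic Dictionary as the hash-based container.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical key type, used only to illustrate the Equals/GetHashCode contract.
public sealed class GridKey
{
    // Fields are read-only so the hash code cannot change while the key is in a table.
    private readonly int _row;
    private readonly int _column;

    public GridKey(int row, int column)
    {
        _row = row;
        _column = column;
    }

    public override bool Equals(object obj)
    {
        GridKey other = obj as GridKey;
        return other != null && _row == other._row && _column == other._column;
    }

    // Built from the same fields Equals compares, so equal keys
    // always produce equal hash codes.
    public override int GetHashCode()
    {
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + _row;
            hash = hash * 31 + _column;
            return hash;
        }
    }
}

public static class HashDemo
{
    // Stores a value under one key, then looks it up with a different but equal key.
    public static bool LookupByEqualKey()
    {
        var table = new Dictionary<GridKey, string>();
        table[new GridKey(2, 3)] = "found";
        return table.ContainsKey(new GridKey(2, 3));
    }

    public static void Main()
    {
        Console.WriteLine(LookupByEqualKey());
    }
}
```

If GetHashCode() were left at its default here, the second GridKey(2, 3) would almost certainly hash differently from the first, and the lookup would miss even though Equals() says the keys match.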



Posted by wwagner | with no comments
I got a fever...

And the only prescription is:

More Cowbell



Posted by wwagner | with no comments
The Ann Arbor .NET User Group is off to a great start

Well, this evening we held the first Ann Arbor .NET Developers Group meeting.  We had 40 people there. Almost everyone heard about it from word of mouth, so I'm hopeful that we'll continue to grow from here.

Our website is up, and we'll be posting the calendar soon:  www.aadnd.org

If you're in the Ann Arbor area, stop by next month.



Posted by wwagner | with no comments
Generics and a taste of LINQ

I gave the same talks in both Toledo (Jan 31), and Ann Arbor (Feb 1).

Slides and samples are available at the links below.



The slides, as a PDF file: The presentation
Introductory Sample: This zip contains the original sample
File load and Save: XML serialization using generics
Pi Calc: thread management using generics
Posted by wwagner | with no comments
A short commercial announcement

I’m posting this to announce the inaugural meeting of the Ann Arbor .NET Developers Group.  In a short time, you’ll be able to read all about our group at www.aadnd.org.  (It’s not live yet).  Our mission is to educate software developers in the greater Ann Arbor MI area about .NET development practices with the goal of helping our community of developers grow professionally.  

2/8/2006 - First Meeting

6:00 to 6:45 – An Introduction to ASP.NET 2.0

Speaker: Jay Wren

7:00 to 8:30 - It’s all about you! Personalized Web Sites with the ASP.NET 2.0 Portal Framework

Speaker: Josh Holmes

Your users can go to thousands of web sites. Why do they come to yours? It’s for content that they are interested in. The more focused this content is on them, the more likely they are to re-visit. Through the ASP.NET 2.0 Portal Framework, you can allow the user to narrow that focus and refine it to just the content that they want to see. http://my.msn.com has been using this framework for some time now but it’s available to us now so you will see a lot more personal portal sites popping up. Rather than building web sites, the portal web site developer will focus on building panels of content, called web parts, similar to SharePoint web parts. In this talk, we will discuss how the framework operates as we build a simple web portal application completely with web parts.

 

We will also be supplying pizza and pop between our tutorial and our main presentation.

Our ongoing meetings will be on the 2nd Wednesday of each month, from 6:00 pm until 8:30 pm at Ann Arbor Spark Central. (330 E. Liberty, Lower Level).  

We’d like to see you, and as many other .NET developers as you know, join us.

Bill Wagner,
President - Ann Arbor .NET Developers Group



Posted by wwagner | with no comments