Bill Blogs in C#

Bill Wagner discusses C#, LINQ, and other items of interest

October 2006 - Posts

Will duck typing work for collections?

Mads wrote this blog post about one of the more controversial changes in C# 3.0. Namely, how the compiler determines if collection initializers are appropriate for the requested type.

As a quick aside, a collection initializer is a sequence of element initializers that the compiler uses to add items to a collection. A trivial example is this:

List<string> foo = new List<string>
{ "hello", "goodbye", "That's all" };

That’s pretty straightforward. The compiler is doing the magic to change the above code to this:

List<string> foo = new List<string>();
foo.Add("hello");
foo.Add("goodbye");
foo.Add("That's all");

The interesting trick is how the compiler figures out whether or not something is a collection. List<T> is pretty clear. Arrays are also clear. But what if you (or someone else) create a new collection?  Mads points out some very interesting problems. In summary, not all collections support ICollection<T>.  The language team made the following decision (see Mads’ post for more of a justification):

‘A collection is a type that implements IEnumerable and has a public Add method’.

At first analysis, this pained my statically typed brain.  “No!” I screamed (at least in my head). “Collections should be something concrete, like ICollection<T>, or ICollection, or even IEnumerable, if that makes sense.”

Well, the more I thought about it, the more sense this design for the C# language made.

The above definition for a collection is a form of type inference (at least for collections).  A collection is not some base class or a set of interfaces. Instead, it’s a type that looks like a collection, and acts like a collection.

Mads points out some advantages to this:

  • It’s not constrained to a single signature of Add.
  • It supports dictionaries, custom collections that have some other Add() method, or even multiple Add() overloads.
  • The compiler can provide stronger type-checking when the type in question does support ICollection<T>.
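To make the rule concrete, here is a minimal sketch of a type that is not an ICollection<T> but still satisfies "implements IEnumerable and has a public Add method". PhoneBook and PhoneBookDemo are hypothetical names invented for this illustration, and the behavior shown is what I'd expect from the rule as quoted above, not something taken from Mads' post.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical type: it implements only the non-generic IEnumerable,
// but exposes two public Add overloads.
public class PhoneBook : IEnumerable
{
    private readonly List<string> entries = new List<string>();

    // One-argument Add: matches elements written as "text".
    public void Add(string entry)
    {
        entries.Add(entry);
    }

    // Two-argument Add: matches elements written as { "name", "number" }.
    public void Add(string name, string number)
    {
        entries.Add(name + ": " + number);
    }

    public int Count
    {
        get { return entries.Count; }
    }

    public IEnumerator GetEnumerator()
    {
        return entries.GetEnumerator();
    }
}

public static class PhoneBookDemo
{
    public static PhoneBook Build()
    {
        // Each element becomes a call to whichever Add overload matches it.
        return new PhoneBook
        {
            "Front desk",
            { "Bill", "555-0100" },
            { "Dianne", "555-0101" }
        };
    }

    public static void Main()
    {
        foreach (object entry in Build())
        {
            Console.WriteLine(entry);
        }
    }
}
```

The two-argument overload is the same mechanism that lets dictionaries take { key, value } elements.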

There are a few weaknesses to this design choice. Many have already been noted by Mads, or in the comments:

  • You may choose a different name for your add method:  AddNewItem(), AddBook(), AddPeriodical(), or whatever else.  There is an even more insidious bug lurking here. Suppose you refactor a class to change an Add() method to AddThing(). That’s bad, because refactoring shouldn’t cause your code to stop compiling, but this one would.
  • An explicit ICollection<T> implementation won’t work with this pattern-based approach.

The first one is troubling, but there really isn’t a good C# solution. Suppose the language designers fell back on mandating ICollection<T>. You couldn’t rename the method that implements a particular interface function and have it still compile correctly. My own opinion is to leave this as an unsolvable problem. (Note that if you try and refactor an interface method, the VS 2005 IDE warns you about it, but lets you proceed and break your code).

The second one is more troubling.  Frankly, I don’t use explicit interface implementation often, but my own opinion is that collection initializers should work with a class that explicitly implements ICollection<T>.
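Here is a small sketch of the explicit-implementation case. ExplicitBag and ExplicitBagDemo are hypothetical names made up for the example. Under the pattern-based rule, I'd expect the commented-out initializer to be rejected because the type has no public Add method; treat that as my reading of the rule, not a quote from the compiler team.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical collection: ICollection<int>.Add is implemented explicitly,
// so the class itself exposes no public Add method.
public class ExplicitBag : ICollection<int>
{
    private readonly List<int> items = new List<int>();

    void ICollection<int>.Add(int item) { items.Add(item); }

    public int Count { get { return items.Count; } }
    public bool IsReadOnly { get { return false; } }
    public void Clear() { items.Clear(); }
    public bool Contains(int item) { return items.Contains(item); }
    public void CopyTo(int[] array, int arrayIndex) { items.CopyTo(array, arrayIndex); }
    public bool Remove(int item) { return items.Remove(item); }
    public IEnumerator<int> GetEnumerator() { return items.GetEnumerator(); }
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

public static class ExplicitBagDemo
{
    public static ExplicitBag Build()
    {
        // Assumed to be rejected: there is no public Add to bind to.
        // ExplicitBag bag = new ExplicitBag { 1, 2, 3 };

        // Workaround: add the items through the interface instead.
        ExplicitBag bag = new ExplicitBag();
        ICollection<int> view = bag;
        view.Add(1);
        view.Add(2);
        view.Add(3);
        return bag;
    }

    public static void Main()
    {
        Console.WriteLine(Build().Count);
    }
}
```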

I’d like to see Mads change his definition to say this:

‘A collection is a type that implements ICollection<T>, IDictionary<TKey, TValue>, or implements IEnumerable and has a public Add method’.

Mads argues that language users will see the benefits of ICollection<T> (I’ll add IDictionary<TKey, TValue>) and support it even though it’s not strictly necessary for collection initializers.  I think he’s right: ICollection<T> is a big improvement over ICollection, and it will become a habit. And I like the solution of inferring that a type is a collection. But I think that once it’s implemented, the C# community will find interesting ways to create collection-like types that don’t necessarily support ICollection<T> but still work with collection initializers, and do it by design. If a type supports IEnumerable<T> and has some number of Add() methods, it looks like a collection and it acts like a collection, so it should be treated like a collection.

Update: Wesner Moise (see comments) pointed out a mistake in my description of arrays and generic interfaces. The runtime adds support for IList<T>, ICollection<T> and IEnumerable<T> for the specific type in the Array.



Posted by wwagner | with no comments
One of the best reasons I've had to write a small sample

About a week ago I was getting ready for my Grok talk at GANG. Because of some special events last Friday, I decided to change my talk.

I planned on discussing Sandcastle, which is cool, but I had other pressing coding issues.  I wanted to get these pictures posted earlier, but I wanted to have thumbnails in place first.  So, I wrote a quick .NET application that generated the thumbnails for the full-size images.

So, here's where I was last week on Friday:

[Photo: Early in the game]
[Photo: Bottom of the first]
[Photo: More tigers at the bat]
[Photo: Going deep]
[Photo: Runners ready to score]
[Photo: ALCS Logo]
[Photo: Comerica Park Logo]
[Photo: The fountain in center field]
[Photo: Late in the game]
[Photo: Late hits]
[Photo: Polanco at the bat]
[Photo: Toward the end of the game]
[Photo: My son's first playoff game]
[Photo: Todd Jones closing]

At the end of the game, I took some video of the celebration.



Posted by wwagner | with no comments
I need to learn more about Python

Dianne Marsh pointed me at this blog entry last week.

It's certainly making me want to learn more about Python.

Personally, I'm impressed by LINQ for a few reasons. 

First, unlike Python, C# with LINQ lets me mix statically typed constructs with dynamically typed constructs.  (Python is always dynamically typed.) There are a lot of places where static typing has advantages: numeric processing is usually faster, and the compiler certainly helps minimize mistakes.

The post closes with a question about whether or not you can use the LINQ query syntax anywhere, as you can with Python's generator expression (genexp) syntax.  I believe that you can.  You can perform query operations on any type that supports IEnumerable (or IEnumerable<T>).
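As a rough illustration of that claim, here is a sketch that runs query expressions over two sources that have nothing to do with a database: an iterator method and a non-generic ArrayList. QueryAnywhereDemo, Countdown, EvenSquares, and ShortWords are names I made up for the example.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

public static class QueryAnywhereDemo
{
    // Hypothetical iterator method: any IEnumerable<T> can be a query source.
    public static IEnumerable<int> Countdown(int from)
    {
        for (int i = from; i > 0; i--)
        {
            yield return i;
        }
    }

    // Query syntax over the iterator: keep even numbers, sort ascending, square them.
    public static List<int> EvenSquares(int from)
    {
        var q = from n in Countdown(from)
                where n % 2 == 0
                orderby n
                select n * n;
        return q.ToList();
    }

    // Non-generic IEnumerable: naming the element type in the from clause
    // makes the compiler insert a Cast<string>() call.
    public static List<string> ShortWords(IEnumerable source)
    {
        var q = from string s in source
                where s.Length <= 4
                select s.ToUpperInvariant();
        return q.ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", EvenSquares(6)));   // 4, 16, 36

        ArrayList words = new ArrayList { "linq", "python", "c#" };
        Console.WriteLine(string.Join(", ", ShortWords(words))); // LINQ, C#
    }
}
```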

Understand, this post is not meant to slam Python. What I have seen of the language impresses me, and makes me want to learn more. At the moment, any comparison I made between C# and Python would be slanted toward C#, simply because I know it better. My hunch is that Brett's post is similarly slanted because he knows Python better than C#. Call it a push, and we all win.



Posted by wwagner | with no comments
It's really a wealth of knowledge

Pointed out by Kate Gregory (see, those RDs are a wealth of knowledge).

InDepthTalk has an aggregation of Regional Director blogs here:  http://indepthtalk.net/Community/RD.category

It's a great one-stop shop for insight from some impressive people:  Everyone from Andrew Brust to Yann Faure.  There are a lot of very recognizable names in between.  (I'd start listing them, but once I start, I'd have to list everyone, or risk leaving out someone important.)

Check it out.



Posted by wwagner | with no comments
all the time, when you're developing

Karen Liu, one of the Product Managers for the C# IDE, was briefly blogging this summer.

One of the posts (http://blogs.msdn.com/karenliu/archive/2006/06/12/628756.aspx) contains all the C# keyboard bindings for the Visual Studio 2005 IDE.  It's a great, handy reference.

(BTW, if you attended her Tech Ed talk this summer, you have a copy of this in paper format).



Posted by wwagner | with no comments
Combining Take and Skip to paginate output

Take and Skip give you the capability to grab subsets of records based on the records' position in the set. Combining these commands helps you create pages of output (for a web page, etc).  Here's how:

Take grabs the first N elements from a query, and returns only those:

var q = (
        from e in db.Employees
        orderby e.HireDate
        select e)
        .Take(5);

Skip bypasses the first N elements of a query, and returns everything after them:

var q = (
        from p in db.Products
        orderby p.UnitPrice descending
        select p)
        .Skip(10);

The SQL to implement Skip, once again, makes me feel much better about writing LINQ code vs. SQL:

SELECT [t0].[ProductID], [t0].[ProductName], 
    [t0].[SupplierID], [t0].[CategoryID],
    [t0].[QuantityPerUnit], [t0].[UnitPrice],
    [t0].[UnitsInStock], [t0].[UnitsOnOrder],
    [t0].[ReorderLevel], [t0].[Discontinued]
FROM [Products] AS [t0]
WHERE NOT (EXISTS(
    SELECT NULL AS [EMPTY]
    FROM (
        SELECT TOP 10 [t1].[ProductID]
        FROM [Products] AS [t1]
        ORDER BY [t1].[UnitPrice] DESC
        ) AS [t2]
    WHERE [t0].[ProductID] = [t2].[ProductID]
    ))
ORDER BY [t0].[UnitPrice] DESC

Now, it’s time to go to the cool concept that comes from combining Skip and Take:  paginating your results.  For example, you could do something like this:

var q = (
    from c in db.Customers
    orderby c.ContactName
    select c)
    .Skip(50)   // Skip the first 50 records.
    .Take(10);  // And grab the next 10.

Note that one query gets generated to retrieve only the elements you want. As a variation, you could retrieve a page of records by using a where clause on a unique key.  That works only if you have constructed your database with unique key values that form the ordering for pagination:

var q = (
    from p in db.Products
    where p.ProductID > 50
    orderby p.ProductID
    select p)
    .Take(10);
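To compare the two approaches without a database, here is a small in-memory sketch. It uses LINQ to Objects over a list of integer keys, so it says nothing about the SQL that gets generated. PagingDemo, PageByOffset, and PageAfterKey are hypothetical names for this example only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PagingDemo
{
    // Offset paging: skip whole pages, then take one. pageIndex is zero-based.
    public static List<int> PageByOffset(IEnumerable<int> ids, int pageIndex, int pageSize)
    {
        return ids.OrderBy(id => id)
                  .Skip(pageIndex * pageSize)
                  .Take(pageSize)
                  .ToList();
    }

    // Key-based paging: start just after the last key seen on the previous page.
    public static List<int> PageAfterKey(IEnumerable<int> ids, int lastSeenId, int pageSize)
    {
        return ids.Where(id => id > lastSeenId)
                  .OrderBy(id => id)
                  .Take(pageSize)
                  .ToList();
    }

    public static void Main()
    {
        // Stand-in for a table whose unique keys run from 1 to 100 with no gaps.
        IEnumerable<int> ids = Enumerable.Range(1, 100);

        // Both calls return the keys 51 through 60.
        Console.WriteLine(string.Join(",", PageByOffset(ids, 5, 10)));
        Console.WriteLine(string.Join(",", PageAfterKey(ids, 50, 10)));
    }
}
```

The two methods agree here only because the keys have no gaps. Once rows are deleted, "ProductID > 50" no longer means "after the first 50 rows", so the key-based version has to be driven by the last key actually shown on the previous page.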

While writing this entry, I thought about writing a single method to return all the pages as separate objects in a collection of pages.  I'm not going to show the code, because while it was pretty simple to write, it was a bad idea.  Remember that LINQ evaluates queries when you request items from the query. So, a design that performs different query operations in succession to retrieve the 1st page, 2nd page, 3rd page, etc. makes a separate DB request for each page. That's a bad idea.

That means you should use this technique when you need one page from the middle of a set, not when you need all the pages from the set.
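The deferred evaluation is easy to see with in-memory sequences. In this sketch a counter stands in for a database round trip: it goes up each time the source starts being enumerated. DeferredDemo and its members are hypothetical names, and an in-memory iterator is only an analogy for what a real query provider does.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Number of times the source has started enumerating (our stand-in for a DB request).
    public static int Requests;

    public static IEnumerable<int> Source()
    {
        // Iterator bodies run on the first MoveNext, not when Source() is called.
        Requests++;
        for (int i = 1; i <= 30; i++)
        {
            yield return i;
        }
    }

    // Builds a page query but never reads from it.
    public static int RequestsWithoutEnumerating()
    {
        Requests = 0;
        IEnumerable<int> q = Source().Skip(10).Take(10);
        return Requests;
    }

    // Reads every page in succession, one query per page.
    public static int RequestsForAllPages(int pageSize, int pageCount)
    {
        Requests = 0;
        for (int page = 0; page < pageCount; page++)
        {
            IEnumerable<int> q = Source().Skip(page * pageSize).Take(pageSize);
            q.ToList();   // enumerating the query is what triggers the work
        }
        return Requests;
    }

    public static void Main()
    {
        Console.WriteLine(RequestsWithoutEnumerating()); // 0
        Console.WriteLine(RequestsForAllPages(10, 3));   // 3
    }
}
```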

Next, we get to the most interesting technology in LINQ to SQL:  Insert, Update, and Delete operations on the database, along with mapping your object edits into new database records and saving them.



Posted by wwagner | with no comments
Josh Holmes back in Ann Arbor

On Wednesday, Josh Holmes gave his first public presentation as a newly minted Microsoft Architect Evangelist. He spoke on a number of lesser known ASP.NET features that can help you create quality applications more quickly.

In the tutorial, Josh gave an overview of the personalization features that are part of ASP.NET: profiles, login controls, master pages, and skinning libraries. This was rather basic, but that’s the whole point of a tutorial, right?

 

The main talk went deeper into the personalization that is available in ASP.NET 2.0:  Profiles, custom profiles, anonymous profiles, and the different ways that you can extend them.

 

What was most interesting for me was how easy it is to extend the personalization system if you have your own storage model for user data (say, your own database, or another personalization database).

 

Josh will be posting the demos and slides to his personal website here:  www.joshholmes.com



Posted by wwagner | with no comments
with Bruce Eckel and Turbo Gears

Dianne did all the work here, so she got to publicize the event (locally) first. And, of course, Bruce Eckel is the driving force, so he got to announce the event.

But, I'm excited as well.  Turbo Gears is a web development toolkit, built on the Python language, and a number of other tools and libraries. (You should see Dianne's review of the last Ann Arbor Computer Society meeting to learn more). For those who want the quick overview, Turbo Gears is a close competitor with Ruby on Rails.  Developer productivity is the key reason to look at Turbo Gears for your web development.

So, even if you are a C# or .NET developer, you should consider attending the event or learning more about Turbo Gears (and other technologies).  Ignorance is never a competitive advantage. You can even take away some of the techniques from Python and Turbo Gears to use in your own development efforts.



Posted by wwagner | with no comments
Python, Turbo Gears and Pizza, oh my

Dianne wrote a great summary of the Ann Arbor Python User Group meeting last week, and the Ann Arbor Computer Society meeting on Turbo Gears.

The Python group discussed testing tools, and UI toolkits for Python.

TurboGears is a Python-based competitor to Ruby on Rails. Dianne's been leading our first project using this tool, and it is a productive way to create web applications. Like Ruby on Rails, its productivity comes from being designed for one specific purpose, rather than being a general purpose language or framework (like the Java or .NET class library).

Extra points because TurboGears got its start in Ann Arbor.

Read the full summaries here:



Python User Group Meeting Summary
Testing Python apps with Twill, and Selenium
Ann Arbor Computer Society Summary
Turbo Gears and other toolkits with funny names
Posted by wwagner | with no comments
You do remember your set theory, don't you?

To begin, we have Concatenation.  The Concat extension method builds a sequence that contains the first sequence followed by the second sequence followed by… well, you get the idea.

var q = (
        from c in db.Customers
        select c.Phone
    ).Concat(
        from c in db.Customers
        select c.Fax
    ).Concat(
        from e in db.Employees
        select e.HomePhone
    );

Notice that the sequences being concatenated must be the same type (in this case, phone numbers are stored as strings). That does mean I could concatenate phone numbers and names, because the compiler knows them both as strings. The compiler enforces syntax, not semantics.

And, well, you can build a common type and concatenate instances of those types:

var q = (
        from c in db.Customers
        select new {Name = c.CompanyName, c.Phone}
    ).Concat(
        from e in db.Employees
        select new {Name = e.FirstName + " " + e.LastName, Phone = e.HomePhone}
    );

The query above shows how to build a sequence of a new anonymous type, from the concatenation of two queries that return names and phone numbers.  Notice how the first ‘select’ implicitly sets the Phone property, and the second must explicitly set the name of the Phone property. Otherwise, the type definitions don’t match.
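The same shape works against in-memory collections, which makes it easy to try out. In this sketch Customer, Employee, ConcatDemo, and BuildDirectory are stand-ins I wrote for the example; they are not the classes generated for the database above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the mapped entity classes.
public class Customer
{
    public string CompanyName { get; set; }
    public string Phone { get; set; }
}

public class Employee
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string HomePhone { get; set; }
}

public static class ConcatDemo
{
    public static List<string> BuildDirectory(IEnumerable<Customer> customers, IEnumerable<Employee> employees)
    {
        // Both projections produce { string Name, string Phone } in the same
        // order, so the compiler treats them as one anonymous type and
        // Concat accepts the pair.
        var q = (
                from c in customers
                select new { Name = c.CompanyName, c.Phone }
            ).Concat(
                from e in employees
                select new { Name = e.FirstName + " " + e.LastName, Phone = e.HomePhone }
            );

        return q.Select(entry => entry.Name + ": " + entry.Phone).ToList();
    }

    public static void Main()
    {
        var customers = new[] { new Customer { CompanyName = "Acme", Phone = "555-0100" } };
        var employees = new[] { new Employee { FirstName = "Pat", LastName = "Lee", HomePhone = "555-0199" } };

        foreach (string line in BuildDirectory(customers, employees))
        {
            Console.WriteLine(line);
        }
    }
}
```

Renaming Phone to HomePhone in the second projection, or swapping the order of the two properties, produces a different anonymous type, and the Concat call stops compiling.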

Other set operations are available. 

Union:

var q = (
        from c in db.Customers
        select c.Country
    ).Union(
        from e in db.Employees
        select e.Country
    );

Intersection:

var q = (
        from c in db.Customers
        select c.Country
    ).Intersect(
        from e in db.Employees
        select e.Country
    );

Except:

var q = (
        from c in db.Customers
        select c.Country
    ).Except(
        from e in db.Employees
        select e.Country
    );
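To see what each operator returns, here is a small LINQ to Objects sketch over two arrays of country names. The data and the SetOpsDemo class are made up for the illustration; a database provider may return rows in a different order.

```csharp
using System;
using System.Linq;

public static class SetOpsDemo
{
    // Hypothetical sample data standing in for the two Country columns.
    public static readonly string[] CustomerCountries = { "USA", "Germany", "USA", "Mexico" };
    public static readonly string[] EmployeeCountries = { "USA", "UK" };

    // Concat keeps every element, duplicates included.
    public static string[] All()
    {
        return CustomerCountries.Concat(EmployeeCountries).ToArray();
    }

    // Union returns each distinct value that appears in either sequence.
    public static string[] Either()
    {
        return CustomerCountries.Union(EmployeeCountries).ToArray();
    }

    // Intersect returns the distinct values that appear in both sequences.
    public static string[] Both()
    {
        return CustomerCountries.Intersect(EmployeeCountries).ToArray();
    }

    // Except returns the distinct values of the first sequence that are missing from the second.
    public static string[] OnlyCustomers()
    {
        return CustomerCountries.Except(EmployeeCountries).ToArray();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", All()));           // USA, Germany, USA, Mexico, USA, UK
        Console.WriteLine(string.Join(", ", Either()));        // USA, Germany, Mexico, UK
        Console.WriteLine(string.Join(", ", Both()));          // USA
        Console.WriteLine(string.Join(", ", OnlyCustomers())); // Germany, Mexico
    }
}
```

Unlike Concat, the three set operators remove duplicates, which is why "USA" appears only once in their results.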

Next, we'll discuss how to create queries that support paging.



Posted by wwagner | with no comments
Scott Collins: "On the Design and Application of Programming Languages"

If you've never heard Scott Collins, you have to attend this.

Scott has done a lot in his life: everything from Mozilla to LucasArts' SoundDroid.  He's worked in almost every computer language I've ever heard of, and he's funny to boot.



The Full Announcement
First Wednesday in November

Posted by wwagner | with no comments
And document them

Rico Mariani posted another Performance Quiz last month (I meant to post this earlier, but I didn't get to it).

In this installment, he discusses the performance benefits of a struct with public fields.

He broke the rules in the Annotated Design Guidelines, and many in Effective C# as well. That's fine, because every time he broke one of the rules, he had a specific and clear reason for it.

That gets to the point of my post:  If you break the guidelines in your code, the burden of justification is on you. You have to have a reason, and it has to be a darn good one. Finally, if that reason is performance, there had better be some data behind your assertion that breaking the rules made the code faster.

Of course, on the other side, the nature of creating guidelines and advice books (like Effective C#) is that no matter what you say, there are almost always edge cases or other conditions where the advice doesn't apply. So we authors have a choice: document all the edge cases and fill our books with digressions away from the main topic, or assume you'll know when you see them. In most cases the latter is preferred. I've certainly taken that course, and Rico, with his emphasis on performance, will often point out different times when the standard guidance can have a negative effect on performance. We're both right, it just depends on the goals.



Posted by wwagner | with no comments
Kind of a bittersweet post

I've known for a while, but Dianne and I wanted to make sure Josh got to make the announcement.

I'm happy for him, but sorry to see him go.  Luckily he's not going far: he'll still be in our community, and working with us and our customers (both existing customers, and the new ones we will be adding in the coming months).

And for SRT Solutions, I'm confident our company and our people are strong enough to continue on the growth path we've been on, even though one of our key people has left. We've got a great staff (as Josh mentioned), and they can grow to assume some of the responsibilities Josh had.



Posted by wwagner | with no comments
which I deftly don't answer

I received the following question in email recently:

While working on a problem report related to making sure that all our database updates are enclosed in try...catches.  Within the try we throw exceptions if anything goes wrong.  The catch clause contains all of the cleanup code.  Along the way, a question came to mind.  What kind of overhead is there in throwing an exception or is there any?  Your first book does not address that subject.

The answer is not so simple.  I discussed my general principles on exceptions here recently (http://www.srtsolutions.com/blogs/billwagner/2006/09/04/id140872.aspx).  To recap, if a routine cannot do what it is supposed to do, it should throw an exception. That doesn’t necessarily lead to throw … all through your code, because a lot of code simply calls other methods (that might fail).

For performance questions, I’ll defer to Rico Mariani. He covered this very question here:  http://blogs.msdn.com/ricom/archive/2003/12/19/44697.aspx . There are a couple of points that are good to remember.  First, even Rico points out that every project is different.  Without proper measurements, you just don’t know if exceptions are a cost or not.

Second, the static cost of exception handling code is minimal.  Which leads me to the final point: the measured cost will be a function of how often the exception is thrown (which reflects how often the exceptional event happens).

That does lead to a good design guideline that you should always follow:  Rico points out that it would be a good idea to limit exceptions to cases that have a less than 1 in 1000 probability (chance of failure is less than 0.1%).  Then, the runtime cost of exceptions often (please measure your circumstances) won’t have that much of an effect.
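If you want to take that measurement yourself, here is a rough sketch of one way to do it. It parses an array of strings twice, once by catching the FormatException from int.Parse and once with int.TryParse, and times both at two different failure rates. ExceptionCostDemo and its methods are names I made up; the timings depend entirely on your machine and build settings, so I'm not quoting any numbers.

```csharp
using System;
using System.Diagnostics;

public static class ExceptionCostDemo
{
    // Counts bad inputs by catching the exception that int.Parse throws.
    public static int CountFailuresWithExceptions(string[] inputs)
    {
        int failures = 0;
        foreach (string s in inputs)
        {
            try
            {
                int.Parse(s);
            }
            catch (FormatException)
            {
                failures++;
            }
        }
        return failures;
    }

    // Counts bad inputs without throwing.
    public static int CountFailuresWithTryParse(string[] inputs)
    {
        int failures = 0;
        int value;
        foreach (string s in inputs)
        {
            if (!int.TryParse(s, out value))
            {
                failures++;
            }
        }
        return failures;
    }

    // Builds inputs where one entry in every failureEvery is not a number.
    public static string[] MakeInputs(int count, int failureEvery)
    {
        string[] inputs = new string[count];
        for (int i = 0; i < count; i++)
        {
            inputs[i] = (i % failureEvery == 0) ? "oops" : i.ToString();
        }
        return inputs;
    }

    public static void Main()
    {
        // Compare a high failure rate (1 in 2) with a rare one (1 in 1000).
        foreach (int failureEvery in new[] { 2, 1000 })
        {
            string[] inputs = MakeInputs(100000, failureEvery);

            Stopwatch sw = Stopwatch.StartNew();
            int thrown = CountFailuresWithExceptions(inputs);
            long withExceptions = sw.ElapsedMilliseconds;

            sw = Stopwatch.StartNew();
            CountFailuresWithTryParse(inputs);
            long withTryParse = sw.ElapsedMilliseconds;

            Console.WriteLine("1 in {0}: {1} failures, exceptions {2} ms, TryParse {3} ms",
                failureEvery, thrown, withExceptions, withTryParse);
        }
    }
}
```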



Posted by wwagner | with no comments