On inferring collections
Will duck typing work for collections?
Mads wrote this blog post about one of the more controversial changes in C# 3.0: how the compiler determines whether collection initializers are appropriate for the requested type.
As a quick aside, a collection initializer is a sequence of element initializers that are used to add items to a collection. A trivial example is this:
List<string> foo = new List<string>
{ "hello", "goodbye", "That's all" };
That’s pretty straightforward. The compiler does the magic of changing the above code to this:
List<string> foo = new List<string>();
foo.Add("hello");
foo.Add("goodbye");
foo.Add("That's all");
The interesting trick is how the compiler figures out whether or not something is a collection. List<T> is pretty clear. Arrays are also clear. But what if you (or someone else) make a new collection? Mads points out some very interesting problems. To summarize: not all collections support ICollection<T>. The language team made the following decision (see Mads’ post for more of the justification):
‘A collection is a type that implements IEnumerable and has a public Add method’.
At first, this pained my statically typed brain. “No!” I screamed (at least in my head). “Collections should be something concrete, like ICollection<T>, or ICollection, or even IEnumerable, if that makes sense.”
Well, the more I thought about it, the more this design for the C# language made sense.
The above definition for a collection is a form of type inference (at least for collections). A collection is not some base class or a set of interfaces. Instead, it’s a type that looks like a collection, and acts like a collection.
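As a minimal sketch of that rule, assuming it ships as quoted above: the Playlist type below is hypothetical (it is not from Mads’ post). It implements only the non-generic IEnumerable and exposes a public Add method, which should be enough for a collection initializer to bind to it.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical type for illustration: no ICollection<T>, no base class.
// It only implements IEnumerable and has a public Add method.
public class Playlist : IEnumerable
{
    private readonly List<string> _titles = new List<string>();

    // The compiler turns each element of the initializer into a call to this method.
    public void Add(string title)
    {
        _titles.Add(title);
    }

    public int Count
    {
        get { return _titles.Count; }
    }

    public IEnumerator GetEnumerator()
    {
        return _titles.GetEnumerator();
    }
}

public static class PlaylistDemo
{
    public static Playlist Build()
    {
        // Equivalent to: new Playlist(), then three Add(string) calls.
        return new Playlist { "hello", "goodbye", "That's all" };
    }
}
```

Nothing here declares Playlist to be a collection; it simply looks like one to the compiler.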
Mads points out some advantages to this:
- It’s not constrained to a single signature of Add.
- It supports dictionaries, custom collections that have some other Add() method, or even multiple Add() overloads.
- The compiler can provide stronger type-checking when the type in question does support ICollection<T>.
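The first two advantages can be sketched as follows. PriceList is a hypothetical type with two Add overloads, invented for illustration; Dictionary<TKey, TValue> is the real framework class, whose two-argument Add is reached by wrapping each element in braces.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical type with two Add overloads; the names are illustrative only.
public class PriceList : IEnumerable
{
    private readonly Dictionary<string, decimal> _prices =
        new Dictionary<string, decimal>();

    // One-argument overload: records the item with a price of zero.
    public void Add(string item)
    {
        _prices.Add(item, 0m);
    }

    // Two-argument overload: chosen for elements written as { item, price }.
    public void Add(string item, decimal price)
    {
        _prices.Add(item, price);
    }

    public decimal this[string item]
    {
        get { return _prices[item]; }
    }

    public IEnumerator GetEnumerator()
    {
        return _prices.GetEnumerator();
    }
}

public static class PriceListDemo
{
    public static PriceList BuildPrices()
    {
        // Ordinary overload resolution picks an Add for each element.
        return new PriceList
        {
            "sample",           // Add(string)
            { "book", 12.50m }  // Add(string, decimal)
        };
    }

    public static Dictionary<string, int> BuildDictionary()
    {
        // Each braced pair becomes a call to Add(key, value).
        return new Dictionary<string, int> { { "one", 1 }, { "two", 2 } };
    }
}
```

A rule tied to ICollection<T> alone could not express either initializer, since that interface only declares a single one-argument Add.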
There are a few weaknesses to this design choice. Many have already been noted by Mads, or in the comments:
- You may choose a different name for your add method: AddNewItem(), AddBook(), AddPeriodical(), or whatever else. There is an even more insidious bug lurking here. Suppose you refactor a class to change an Add() method to AddThing(). Every collection initializer that relied on Add() would stop compiling. That’s bad, because refactoring shouldn’t cause your code to stop compiling.
- An explicit ICollection<T> implementation won’t work with this pattern-based implementation.
The first one is troubling, but there really isn’t a good C# solution. Suppose the language designers fell back on mandating ICollection<T>. You still couldn’t rename the method that implements a particular interface function and have it compile correctly. My own opinion is to leave this as an unsolvable problem. (Note that if you try to refactor an interface method, the VS 2005 IDE warns you about it, but lets you proceed and break your code.)
The second one is more troubling. Frankly, I don’t use explicit interface implementation often, but my own opinion is that collection initializers should work with a class that explicitly implements ICollection<T>.
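To make the second weakness concrete, here is a sketch with a hypothetical Shelf class that implements ICollection<string> but makes Add explicit. Assuming the rule works as quoted, the initializer in the comment would be rejected, because the class itself has no public Add; the method is reachable only through the interface.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical type: a full ICollection<string>, but Add is an explicit
// interface implementation, so Shelf has no public Add member of its own.
public class Shelf : ICollection<string>
{
    private readonly List<string> _books = new List<string>();

    void ICollection<string>.Add(string item) { _books.Add(item); }

    public void Clear() { _books.Clear(); }
    public bool Contains(string item) { return _books.Contains(item); }
    public void CopyTo(string[] array, int arrayIndex) { _books.CopyTo(array, arrayIndex); }
    public bool Remove(string item) { return _books.Remove(item); }
    public int Count { get { return _books.Count; } }
    public bool IsReadOnly { get { return false; } }
    public IEnumerator<string> GetEnumerator() { return _books.GetEnumerator(); }
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

public static class ShelfDemo
{
    public static Shelf Build()
    {
        // Expected to fail under the "IEnumerable plus public Add" rule,
        // since lookup on Shelf finds no accessible Add method:
        // Shelf shelf = new Shelf { "first", "second" };

        Shelf shelf = new Shelf();
        ICollection<string> asCollection = shelf; // Add is visible only via the interface
        asCollection.Add("first");
        asCollection.Add("second");
        return shelf;
    }
}
```

The type is unquestionably a collection, yet the author of the calling code has to fall back on explicit Add calls through an interface reference.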
I’d like to see Mads change his definition to say this:
‘A collection is a type that implements ICollection<T>, IDictionary<TKey, TValue>, or implements IEnumerable and has a public Add method’.
Mads argues that language users will see the benefits of ICollection<T> (I’ll add IDictionary<TKey, TValue>) and support it even though it’s not strictly necessary for collection initializers. I think he’s right: ICollection<T> is a big improvement over ICollection, and using it will become a habit. And I like the solution of inferring that a type is a collection. I think that once it’s implemented, the C# community will find interesting ways to create collection-like types that don’t necessarily support ICollection<T> but still work with collection initializers, and do it by design. If a type supports IEnumerable<T> and has some number of Add() methods, it looks like a collection and it acts like a collection, so it should be treated like a collection.
Update: Wesner Moise (see comments) pointed out a mistake in my description of arrays and generic interfaces. The runtime adds support for IList<T>, ICollection<T>, and IEnumerable<T> for the array’s specific element type.
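A small sketch of what that runtime support looks like from the caller’s side; the class and method names are made up for the example.

```csharp
using System;
using System.Collections.Generic;

public static class ArrayInterfacesDemo
{
    public static int CountThroughInterfaces()
    {
        int[] numbers = { 1, 2, 3 };

        // An int[] converts implicitly to the generic interfaces for int.
        IList<int> asList = numbers;
        ICollection<int> asCollection = numbers;
        IEnumerable<int> asSequence = numbers;

        int enumerated = 0;
        foreach (int n in asSequence)
        {
            enumerated++;
        }

        // All three views refer to the same three-element array.
        if (asList[1] != 2 || asCollection.Count != enumerated)
        {
            throw new InvalidOperationException("views disagree");
        }
        return asCollection.Count;
    }
}
```

Arrays are fixed in size, so calling Add through one of these interfaces throws NotSupportedException; the conversions are useful for reading, not for growing the array.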