February 2005 - Posts

They are slightly different than those from the P & P Guide.

Question on Array Recommendations.

I finished your book a few days ago. It was a fun read, and I think you covered a lot of good areas for code improvement. I found one of your recommendations that contradicts something in the Patterns and Practices Performance and Scalability book from Microsoft, dealing with Multidimensional Arrays vs Jagged Arrays. You seem to favor Multidimensional Arrays over Jagged (Item 40, pages 231-232). Here's what the P&P book has to say on that:

Chapter 5, pages 241-242:

"Use Jagged Arrays Instead of Multidimensional Arrays

A jagged array is a single dimensional array of arrays. The elements of a jagged array can be of different dimensions and sizes. Use jagged arrays instead of multidimensional arrays to benefit from MSIL performance optimizations. MSIL has specific instructions that target single dimensional zero-based arrays (SZArrays) and access to this type of array is optimized. In contrast, multidimensional arrays are accessed using the same generic code for all types, which results in boxing and unboxing for arrays of primitive types."

They then get into some MSIL code comparing the two.

My Response:

This is one of those low-level performance issues where the performance metrics depend on the particular program. I'll get to the performance characteristics in a moment.

My overriding reason for recommending multi-dimensional arrays is ease of use and maintenance. A multi-dimensional array is a single structure. Simulating multiple dimensions using jagged arrays is more complex: you create one array per row, plus the outer array. Quite simply, it's easier to get wrong, resulting in null reference exceptions.

This added complexity manifests itself in many ways. First, you write more initialization code (shown on pp. 230-231 of Effective C#). Iterating all the members of a jagged array is more complex as well: a single for or foreach loop gets replaced with nested loops. I discuss this on pp. 232-233.

The nested loops may be simple code when you intend to model a multi-dimensional array using jagged arrays, but there is no guarantee that all rows have the same number of elements. This also complicates column-wise traversals: you should check that each row has enough elements rather than simply assuming it does. If you want to support a single enumerator for a jagged array, you must write your own. It's already available in a multi-dimensional array.
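To make the comparison concrete, here is a minimal sketch of both approaches. The class and method names (ArrayShapes, MakeRectangular, MakeJagged, Sum, SumColumn) are my own inventions for illustration; this is not the code from the book or from the P&P guide.

```csharp
using System;

public static class ArrayShapes
{
    // Multi-dimensional: one allocation, and every row has the same length.
    public static int[,] MakeRectangular(int rows, int cols)
    {
        return new int[rows, cols];
    }

    // Jagged: the outer array plus one allocation per row.
    // Forgetting the loop leaves every row null.
    public static int[][] MakeJagged(int rows, int cols)
    {
        int[][] grid = new int[rows][];
        for (int r = 0; r < rows; r++)
            grid[r] = new int[cols];
        return grid;
    }

    // A single foreach visits every element of a multi-dimensional array.
    public static int Sum(int[,] grid)
    {
        int total = 0;
        foreach (int value in grid)
            total += value;
        return total;
    }

    // A jagged array needs nested loops, and rows may differ in length.
    public static int Sum(int[][] grid)
    {
        int total = 0;
        foreach (int[] row in grid)
            foreach (int value in row)
                total += value;
        return total;
    }

    // A column-wise traversal has to check each row before indexing into it.
    public static int SumColumn(int[][] grid, int col)
    {
        int total = 0;
        foreach (int[] row in grid)
            if (row != null && col < row.Length)
                total += row[col];
        return total;
    }

    public static void Main()
    {
        int[,] rect = MakeRectangular(2, 3);
        rect[1, 2] = 5;

        int[][] jagged = MakeJagged(2, 3);
        jagged[1][2] = 5;

        Console.WriteLine(Sum(rect));
        Console.WriteLine(Sum(jagged));
        Console.WriteLine(SumColumn(jagged, 2));
    }
}
```

Notice how much of the jagged version exists only to guard against rows that are missing or too short. None of that is needed for the multi-dimensional array.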

These differences make me lean toward simplicity rather than speed. Quite frankly, a little slower but more maintainable code will scale better than overly optimized but harder to maintain code. That means I prefer the multi-dimensional array.

Finally, I'll address the performance issues. It depends on what you measure, and where your program's bottlenecks are:

A multi-dimensional array is one allocation; a jagged array causes N+1 allocations, where N is the number of rows. That has some cost.

Row-wise traversals will be faster for jagged arrays, as pointed out in the P&P guide.

Column-wise traversals will be faster for multidimensional arrays, as I point out on p. 231.

In all three cases, the differences will be small. You will only see the difference for an operation that is repeated very often. Before you make this kind of change, you should profile your code and determine which version is fastest. You should also confirm that the speed difference is significant enough to justify the change.
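If you want a rough first measurement before reaching for a real profiler, a Stopwatch-based sketch like the one below is one way to start. Everything here is an assumption made for illustration: the TraversalTiming class, the 1000 x 1000 array size, and the repetition count are mine, and the jagged versions assume every row has the same length. The numbers will vary by machine and runtime, so treat them as a hint about your own program and nothing more.

```csharp
using System;
using System.Diagnostics;

public static class TraversalTiming
{
    public static long SumRowWise(int[,] grid)
    {
        long total = 0;
        for (int r = 0; r < grid.GetLength(0); r++)
            for (int c = 0; c < grid.GetLength(1); c++)
                total += grid[r, c];
        return total;
    }

    public static long SumColumnWise(int[,] grid)
    {
        long total = 0;
        for (int c = 0; c < grid.GetLength(1); c++)
            for (int r = 0; r < grid.GetLength(0); r++)
                total += grid[r, c];
        return total;
    }

    public static long SumRowWise(int[][] grid)
    {
        long total = 0;
        for (int r = 0; r < grid.Length; r++)
            for (int c = 0; c < grid[r].Length; c++)
                total += grid[r][c];
        return total;
    }

    // Assumes every row is as long as the first one.
    public static long SumColumnWise(int[][] grid)
    {
        long total = 0;
        int cols = grid.Length == 0 ? 0 : grid[0].Length;
        for (int c = 0; c < cols; c++)
            for (int r = 0; r < grid.Length; r++)
                total += grid[r][c];
        return total;
    }

    // Runs the work once as a warm-up, then times the repetitions.
    public static TimeSpan Time(Action work, int repetitions)
    {
        work();
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < repetitions; i++)
            work();
        watch.Stop();
        return watch.Elapsed;
    }

    public static void Main()
    {
        const int rows = 1000, cols = 1000;
        int[,] rect = new int[rows, cols];
        int[][] jagged = new int[rows][];
        for (int r = 0; r < rows; r++)
            jagged[r] = new int[cols];

        Console.WriteLine("multi-dim, row-wise:    " + Time(() => SumRowWise(rect), 20));
        Console.WriteLine("multi-dim, column-wise: " + Time(() => SumColumnWise(rect), 20));
        Console.WriteLine("jagged, row-wise:       " + Time(() => SumRowWise(jagged), 20));
        Console.WriteLine("jagged, column-wise:    " + Time(() => SumColumnWise(jagged), 20));
    }
}
```

Measure the traversal pattern your program actually uses, on data sized like your real data. A micro-benchmark of the wrong access pattern tells you nothing about your bottleneck.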



Posted by wwagner | with no comments
This pertains to Item 22 and Item 45.

Question 1:

Can you further explain the idiom you recommend at the top of page 133 (see below) to raise an event? Specifically, how does making a copy of the reference provide protection?

// add a message, and log it.
public void AddMsg ( int priority, string msg )
{
  // This idiom discussed below.
  AddMessageEventHandler l = Log;
  if ( l != null )
    l ( null, new LoggerEventArgs( priority, msg ) );
}

In my mind a "copy of a reference" is, well, just a copy-of-a-reference, which would not prevent the referenced object from being changed elsewhere. I would agree that a copy of the underlying _object_ would provide protection, but that doesn't seem to be what's going on here.

Answer 1:

This idiom works because of the behavior of the Add and Remove accessors for an event. If another thread calls Remove(), that does not modify the invocation list you’ve copied to the local variable ‘l’. Rather, it creates a new invocation list without the removed handler (or null, if that was the last handler) and assigns that to the event. Your invocation list is still valid. You are not making a deep copy of the invocation list, but any other thread that modifies the event’s invocation list does.
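You can see this behavior in a small single-threaded sketch. The unsubscribe on the same thread stands in for what another thread might do between the null check and the call. The DelegateCopyDemo class and its members are hypothetical names of mine, and I use the framework's EventHandler type rather than the AddMessageEventHandler from the book to keep the example short.

```csharp
using System;

public static class DelegateCopyDemo
{
    public static int Calls;

    public static event EventHandler Log;

    private static void Handler(object sender, EventArgs e)
    {
        Calls++;
    }

    // Returns true if the local copy still reaches the handler
    // after the handler has been removed from the event.
    public static bool CopySurvivesRemoval()
    {
        Calls = 0;
        Log += Handler;

        // Copy the reference to the current invocation list.
        EventHandler copy = Log;

        // Stand-in for another thread unsubscribing. This builds a new
        // delegate (here null, since no handlers remain) and stores it
        // in the event. The list that 'copy' refers to is untouched.
        Log -= Handler;

        bool eventIsNowEmpty = (Log == null);

        // Safe: 'copy' is not null and still contains Handler.
        copy(null, EventArgs.Empty);

        return eventIsNowEmpty && Calls == 1;
    }

    public static void Main()
    {
        Console.WriteLine(CopySurvivesRemoval());
    }
}
```

Had the code tested Log for null and then invoked Log directly, the removal in between would have produced a null reference exception. The local copy closes that window.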

Question 2:

Also while on this topic, do you recommend protecting the actual event call with a Try/Catch as a way to protect against exception-throwing event handlers (that others might write, of course)? Not sure that there would be anything to do in the Catch, maybe just Catch {}. Interested to know what you think.

Answer 2:

It varies. I believe that event handlers should never throw exceptions. (See Item 45, p. 270). However, I don’t believe that you can always count on others to follow that rule. If you are creating libraries that will be used by a large audience, you should be very defensive and wrap event handler invocations with try / catch. When you are creating closed systems, and you can ensure that none of the event handlers could possibly throw exceptions, I find the simpler syntax easier to read and maintain.
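One possible shape for the defensive version is sketched below. It walks the invocation list so that one misbehaving handler does not stop the rest from running. The DefensiveLogger class, the failure count it returns, and the fields on LoggerEventArgs are assumptions of mine for illustration; only the AddMsg signature, the delegate name, and the copy-then-check idiom come from the snippet in the question.

```csharp
using System;

// Hypothetical event args: the original only shows the two-argument constructor.
public class LoggerEventArgs : EventArgs
{
    public readonly int Priority;
    public readonly string Message;

    public LoggerEventArgs(int priority, string msg)
    {
        Priority = priority;
        Message = msg;
    }
}

public delegate void AddMessageEventHandler(object sender, LoggerEventArgs e);

public class DefensiveLogger
{
    public event AddMessageEventHandler Log;

    // Raises Log and returns how many handlers threw.
    public int AddMsg(int priority, string msg)
    {
        // Same idiom as before: copy the reference, then test the copy.
        AddMessageEventHandler l = Log;
        if (l == null)
            return 0;

        int failures = 0;
        LoggerEventArgs args = new LoggerEventArgs(priority, msg);

        // Invoke each handler separately so an exception in one
        // does not prevent the remaining handlers from being called.
        foreach (AddMessageEventHandler handler in l.GetInvocationList())
        {
            try
            {
                handler(null, args);
            }
            catch (Exception)
            {
                // Nothing sensible to do here except note the failure.
                failures++;
            }
        }
        return failures;
    }
}
```

Whether to swallow, count, or log the exception is a judgment call for your library. The point is that the decision gets made in one place instead of being left to whichever handler happens to fail.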



Posted by wwagner | with no comments