Bill Blogs in C#

Bill Wagner discusses C#, LINQ, and other items of interest

April 2008 - Posts

I received this question in email from one of my readers, and I thought it would be of general interest:

I find myself with two arrays of the same size and I want to create a third that combines

each element. What I want is something similar to

var S3 = S1.foreach(S2, (s1, s2) => s1 + s2);

this would give S3[i] = S1[i] + S2[i]; for all i

Since this is a common pattern, I was wondering if you know what is a good functional solution.

Thank you for any input you would have

I wrote a couple methods that makes this rather quite simple.  First, I wrote an extension method that merges two sequence:

   1: public static IEnumerable<T> Merge<T>(this IEnumerable<T> leftSequence, 
   2:     IEnumerable<T> rightSequence,
   3:     Func<T, T, T> mergeFunc)
   4: {
   5:     IEnumerator<T> leftEnumerator = leftSequence.GetEnumerator();
   6:     IEnumerator<T> rightEnumerator = rightSequence.GetEnumerator();
   7:     while (leftEnumerator.MoveNext() && rightEnumerator.MoveNext())
   8:     {
   9:         yield return mergeFunc(leftEnumerator.Current, rightEnumerator.Current);
  10:     }
  11: }

 

Once that's done, it provides almost exactly the syntax you requested:

   1: int[] leftVector = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
   2: int[] rightVector = { 9, 8, 7, 6, 5, 4, 3, 2, 1, 0 };
   3:  
   4: var sum = leftVector.Merge(rightVector, (x, y) => x + y).ToArray();
   5:  
   6: foreach (var value in sum)
   7:     Console.WriteLine(value);

 

There are a few caveats here.  This doesn't test for any errors if the two vectors aren't the same length. That may or may not matter for you. I like this idiom, because you can do other operations on sequences as well.  Here's multiply:

   1: var sum = leftVector.Merge(rightVector, (x, y) => x * y).ToArray();

I hope that helps.

Posted by wwagner | 2 comment(s)

Well, I'm back from the MVP Summit, and it seems that tradition mandates a summary of the trip.

But there is one problem: The best content was all under NDA. Most MVPs spend their time with members of the product team for their award, and other teams that are related. For me, that obviously means the C# language team, and other Developer Tools teams.

The NDA nature of the meetings means that everyone blogs about the parties, the time in the bar, the friends. Those are great fun, but I don't want to give everyone the impression that the MVP summit is one big party.

Really, it's the time when the Microsoft Product groups give us a good look behind the curtain and show us what they are thinking of building and ask for our feedback on each and every one of the ideas they are discussing.

It's my favorite MS based conference.

Posted by wwagner | with no comments
Filed under: , ,

It's been a while coming, but we finally convinced several of our new and brilliant folks to start blogging.

I've added them all to the blog roll, but here they are, along with some editorial comments from your host:

(Order is alphabetical by blog address and does not signify anything else)

Anne Marsan is a Mechanical Engineer by education, but has been writing software for that market for some time. She has a deep understanding of the darker corners of higher math, and how computers still sometimes don't do it correctly. It's probably significant that she's doing the Euler problems in Matlab.

Marina Fedner is a recent Carnegie-Mellon graduate that is keeping us older folks are learning new stuff everyday.

Mike Woelmer is a farmer turned silverlight developer, by way of several years of game development and medical imaging development. His first blog post discusses that journey. He's had quite the history

Bill Heitzeg is an engineer of all trades and styles. He's also got a great feel for the business issues of software development.

It's a lot of interesting discussions, and a good taste of what kind of culture we have around here.

Now if we can just get Chris, Nate and Alex blogging as well.

Posted by wwagner | 1 comment(s)
Filed under:

Euler Problem six asks you to find the difference between the sum of squares and the square of the sum for the natural numbers 1 through 100.

I took the easy route, and made a brute force implementation in C#. There are a couple new bits of LINQ syntax here. This query creates two different anonymous types. One for the number / square pair, the second is the aggregate answer for the sum and the sum of squares.

The important points are that this is done in a single iteration. It's pulling each value in from the initial sequence and doing all the calculations in one iteration of the sequence:

    1 var aggregate = (from number in Enumerable.Range(1, 100)
    2            select new { number, square = number * number }).
    3            Aggregate(new { Sum = 0, SumOfSquares = 0 },
    4            (sums, val) => new { Sum = sums.Sum+val.number,
    5                SumOfSquares = sums.SumOfSquares + val.square});
    6 
    7 int SquareOfSums = aggregate.Sum * aggregate.Sum;
    8 
    9 Console.WriteLine("{0} - {1} = {2}", SquareOfSums, aggregate.SumOfSquares,
   10     SquareOfSums - aggregate.SumOfSquares);

There you go. Another simple bit of code.

Posted by wwagner | 1 comment(s)
Filed under: , , ,

Well, it's time to post another solution and look at how LINQ and C# 3.0 can create elegant code for these problems.

The fifth problem asks you to find the smallest number that is divisible by all the natural numbers from 1 through 20. You can trivially find the answer like this:

    1 private static void BruteForce()

    2 {

    3     var divisible = (from n in Enumerable.Range(1, int.MaxValue)

    4                     where (n % 2 == 0

    5                     && n % 3 == 0

    6                     && n % 4 == 0

    7                     && n % 5 == 0

    8                     && n % 6 == 0

    9                     && n % 7 == 0

   10                     && n % 8 == 0

   11                     && n % 9 == 0

   12                     && n % 10 == 0

   13                     && n % 11 == 0

   14                     && n % 12 == 0

   15                     && n % 13 == 0

   16                     && n % 14 == 0

   17                     && n % 15 == 0

   18                     && n % 16 == 0

   19                     && n % 17 == 0

   20                     && n % 18 == 0

   21                     && n % 19 == 0

   22                     && n % 20 == 0

   23                         )

   24                     select n).First();

   25     Console.WriteLine(divisible);

   26 }

 

But don't do that. It's very slow. Let's think a bit about the math, and the answer gets much easier. If you remember middle school math, you are being asked to find the greatest common divisor for all numbers 1-20. The unique factorization theorem is what you need here. It states that every number can be written in exactly one way as the product of prime numbers. The greatest common divisor can be found by multiplying the highest powers of each prime factor. In code, it's a little easier to turn the definition around and peel off prime factors from every number in the list. It's faster and simpler than finding all the prime factors.

You start with a list like this: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10

1 really doesn't do much, so we'll start with 2. Keep the 2, and replace every number greater than 2 that is divisible by 2 with that number divided by 2:

1, 2, 3, 2, 5, 3, 7, 4, 9, 5

Move to the next number (3). Repeat:

1, 2, 3, 2, 5, 1, 7, 4, 3, 5

Move to the next number (another 2). Repeat:

1, 2, 3, 2, 5, 1, 7, 2, 3, 5

Continue until done:

1, 2, 3, 2, 5, 1, 7, 2, 1, 1

Multiply: 2520

That gives us this algorithm:

    1 private static void Better()

    2 {

    3     var numbers = Enumerable.Range(1, 20).ToArray();

    4     // remove common factors:

    5     for (int index = 0; index < numbers.Length; index++)

    6         for (int subIndex = index + 1; subIndex < numbers.Length; subIndex++)

    7             if (numbers[subIndex] % numbers[index] == 0)

    8                 numbers[subIndex] /= numbers[index];

    9 

   10     var answer = numbers.Aggregate(1, (product, number) => product * number);

   11 

   12     Console.WriteLine(answer);

   13 }

 

But you know what. I don't like those loops. I'd rather write methods that operate on a sequence. It's another rather simple example of tail recursion. For any sequence, peel off the first number. Then, return that number followed by the rest of the sequence divided by that first number, where possible. Pipe the remaining sequence back into the same function:

   11 private static void Best()

   12 {

   13     var numbers = Enumerable.Range(1, 20);

   14 

   15     var answer = numbers.LeastCommonFactor().Aggregate(1,

   16         (product, number) => product * number);

   17 

   18     Console.WriteLine(answer);

   19 }

   20 

   21 private static IEnumerable<int> LeastCommonFactor(this IEnumerable<int> list)

   22 {

   23     // Stop if the list is empty:

   24     if (!list.Any())

   25         yield break;

   26 

   27     int factor = list.First();

   28     yield return factor;

   29 

   30     var remaining = from item in list.Skip(1)

   31                     select (item % factor == 0) ? item / factor : item;

   32     foreach (int item in remaining.LeastCommonFactor())

   33         yield return item;

   34 }

 

Now that's a simple, elegant algorithm. And, it runs quite a bit faster than the brute force version. Stay tuned for more toward the end of the week.

Posted by wwagner | 1 comment(s)
Filed under: , ,

I've had some comments on the Euler problems, so I'm assuming there is some interest. There's also been interest among other developers in our office.

Darrell Hawley has started solving the problems in Python.

Marina Fedner has started solving the problems in Ruby.

I highly recommend reading.  It will help you grow as a developer to see how other languages support different idioms that make some of these problems easier or harder.

You'll learn more and discover how to add more idioms to your own language.

Posted by wwagner | with no comments
Filed under: , , ,

The fourth Euler problem asks you to find the largest palindrome number that is the product of two three digit numbers. For example, the number 919 is a palindrome: it's the same forward and backwards.

To solve this problem, we need to find the product of all combinations of three digit numbers. Well, if you remember your probability, this is a combinatorial problem. Order doesn't matter. That limits the number of combinations.

Next, we need to find palindromes. A little string conversion makes that simple. You can convert the number to a string, then see if the string is equal to its reverse. Strangely, there isn't a method in System.String that can reverse a string. But, there is a Reverse() extension method that's in scope.

Well, let's look at the code:

    1 static void Main(string[] args)

    2 {

    3     var allProducts = from x in Enumerable.Range(100, 900)

    4                       from y in Enumerable.Range(100, 900)

    5                       where x <= y

    6                       let product = x * y

    7                       orderby product descending

    8                       select new { x, y, product };

    9 

   10     var palindrome = (from n in allProducts

   11                        let chars = n.product.ToString()

   12                        let reverse = new string(chars.Reverse().ToArray())

   13                        where chars == reverse

   14                        select n).First();

   15     Console.WriteLine("{0}, {1} => {2}", palindrome.x, palindrome.y,

   16         palindrome.product);

   17 

   18 }

 

The first query is to generate all products of three digit numbers. Notice that the where clause removes permutations that are the same as two numbers. For example, 251*126 would have already been computed from 126*251. Next, notice the 'let' clause. That assigns a local variable in the query expression so that you don't have to compute the product more than once. While we're at it, we order the values by the products in descending order.

The next query finds the palindromes from all those products. As I said earlier, I do this by converting the product to a string, then comparing the string to its reverse. As with the earlier query, I use the let clause to cache some local results and avoid recomputing values.

You'll need to run the code yourself to see the answer.

Posted by wwagner | 6 comment(s)
Filed under: , ,

Earlier this week, Open XML won ISO approval.  By any objective measure, it won handily.  86% of the participants voted for the standard.

Does that mean that 16% of the voting countries are run by IBM officials?

One could make the case that it does.  Jan van den Beld's post provides far more evidence more eloquently than I ever could:  http://janvandenbeld.blogspot.com/2008/04/hypocrisy.html.  A few highlights:

 "I can see why IBM opposes more voices (at least those that don’t agree with its commercially motivated views). It has enjoyed unparalleled influence in international standardization for decades and may not now like more voices and decision makers in this process. Its allies could not have been clearer about that commercial agenda– to force the purchase of their products by blocking governments from procuring Microsoft Office, regardless of technical merit or actual demand."

"When IBM talks about independence, it really means that national standards bodies should be independent of anyone who disagrees with IBM’s position.

But, the only way to get the full picture is to read his post, including the evidence to which he provides links.

For my own standpoint, I have to look at this from the technology standpoint. I've looked inside a few Office XML documents. They are not simple, but they are understandable (if you understand XML). Can any company (or individual) create a parser to read or write Office XML documents? Sure. What's stopping you?  Your technical skill might, but legal restrictions and closed formats can't.  And to me, data transparency is more "open" that code transparency. If I can see the data (in XML format, possibly) I can certainly figure it out. Even if I can't see the internal algorithms inside Office, I can work with it's files. That's much more important than being able to see the code. Are you, or your customers, going to re-write your office apps?  Probably not. But, does an open format make changing vendors, or creating more addons easier?  Certainly. Open XML gives us that.

In fact, at this point, I would make the case that it's easier to work with office documents stored in Open XML than it is to work with google docs stored in the cloud. Can you program against google docs? Can you programmatically retrieve your data? If so, can you understand the internal format, and process those documents with your own algorithms? I don't believe so.

 

 

Posted by wwagner | with no comments
Filed under: , ,

When you design your API, make it easy to use and hard to misuse. Brad Abrams quotes Rico Mariani on the Pit of Success here: http://blogs.msdn.com/brada/archive/2003/10/02/50420.aspx 

The key point:

"we want our customers to simply fall into winning practices by using our platform and frameworks.  To the extent that we make it easy to get into trouble we fail." - Rico Mariani

Of course, everyone fails at this sometimes.  Here's my example for today.  Look at these two constructors for TcpClient:
public TcpClient(IPEndPoint addr) { /* ... */ }
public TcpClient(string hostName, int Port) { /* ... */}


 Would you believe that one of those specifies the ip address on the local machine, and one specifies the ip address of the destination machine.  Do you have a clue which one?  Me either.  And, your only clue is that one will fail by throwing an exception: "destination unreachable". Yup, how you specify the address changes whether you are specifying the source or destination addr of your connection.(BTW, the version with the IPEndPoint specifies the local side of the connection. The string / int pair specifies the remote connection).

 

Posted by wwagner | 1 comment(s)
Filed under: