Or, what I think about the Alt.Net Vote of No Confidence on the Entity Framework.
Almost a month ago, some folks posted the Entity Framework Vote of No Confidence petition. I didn't say anything at that time for a few reasons: I was on vacation, something that controversial requires a thoughtful response, and I wanted to wait for the predictable firestorm to die down.
Last week, the Alt.Net podcast featured Ward Bell and Jeremy Miller discussing, among other things, the petition. It's well worth a listen. For one, the length of the show, almost 44 minutes of content, gives enough space for Ward and Jeremy to go into much more detail about each of the points in the petition. Regardless of how you felt when you first saw the petition, you'll find yourself nodding in agreement at many of their points.
That's why I chose this title for this blog post. Seth Godin is great at marketing and getting attention. He preaches (among other things) to make a big splash when you want to make a point: knock your audience far out of their comfort zone immediately; it makes them much more reachable.
That works great for marketing, but it's lousy as a technical argument.
For the record, I agree with every technical point in the Entity Framework petition. It's v1 software, and it shows a long history of the 'everything is a table in a database' design philosophy that plagues almost all data access libraries. From that assumption several decisions follow: You won't want to test without attaching to a database. Your data is modeled as a set of tables and relations. All your entities are coming from and stored back into a relational database. What else is there? (If you don't fit that model, EF isn't for you.)
This data-centric view is difficult to reconcile with modern agile and object-oriented principles. It's no surprise that EF (v1) failed to deliver on the promise of bridging the gap between an arbitrary storage model and an in-memory object model. That to me is the biggest failing of EF as it exists today. If I start with a business problem, my first reaction isn't "Let's go design the database". It's "Let's build some scenarios." Those scenarios will start to reveal some objects that need both behavior and persistent storage. That persistent storage will be designed, but it will also evolve over several sprints. Some sprints will be about normalizing the data store, if a relational data store is the right answer. Those sprints shouldn't have a huge impact on the other layers. Unfortunately, nothing in the current Entity Framework helps me work that way. It expects, or even demands, that an existing database schema (and probably even sample data) exists. Northwind anyone? AdventureWorks?
But, a signed petition calling it a 'vote of no confidence' is obviously an attention grabber. It clearly grabbed attention. We don't yet know if it achieved the desired result. The EF team is clearly going to make some changes, but the open question is whether those changes will be what the signatories want, or a different direction, attempting to prove that the signatories had the wrong idea about what EF should be.
As for me, I'd much prefer that MS spent more time producing a library that enabled developers to write their own IQueryProvider implementations for specific vertical scenarios. Matt Warren has written quite a few blog posts discussing the design, and some of the implementation, of the IQueryProvider used by LINQ to SQL. Read those articles, and you'll get a feel for just how much work this is. You'll also get a good feel for how much of the code is common code to visit nodes and parse expressions. I'd really prefer seeing those common algorithms exposed for the rest of us. That would go a long way toward supporting many different application types: data in the cloud, data in other blogs, data in almost anything but relational structures in a database. From that standpoint, Entity Framework is an evolutionary dead end no matter what they add.
LINQ to Live Mesh anyone? Or even more powerful: LINQ to <your data stored on LiveMesh>...
To close, yes, I've got issues with the Entity Framework design. Yes, all the technical arguments in the vote of no confidence are valid. But that's not the way I want technical debates to proceed. It gives me visions of glib people spouting nonsense to make a point, not technologists discussing the merits of their work. Our work environments would be much better with more of the latter.
One of my last Visual Studio Magazine articles discussed object validation and object invariants.
I received a great email discussing questions about how to handle UI validation in this world. My recommendation in the article was that classes should be responsible for their own state. Furthermore, they should enforce that validity at all times. Objects are easier to use if they are always valid.
The question can be boiled down to "what about objects that provide the backing store for UI controls? How do you let the user edit a type and still mandate that it's always valid?"
As an example, let's imagine a set of UI fields to edit an address in the United States. You could code that imaginary Address type to query some map service and determine if the address really existed. For an Address object to pass its validity checks, all fields would need to be consistent.
However, users can't edit this type of Address type. Change the city, and the state and zip code are no longer valid. Change the zip, and the city and state are no longer valid. No matter what order a poor user tries, she cannot get past the first field to edit the other fields necessary to get the Address back in a valid state.
Yes, it's a problem. The reader posed two ways to solve this consistency issue.
For one, you can loosen the definition of 'valid' to allow the type to be invalid for a while. Other code would test stricter conditions when an Address object was being saved to persistent storage.
I don't like that approach. It weakens the concept of valid until it's almost meaningless. Weakening the concept of 'valid' simply pushes the test to other code: any type that uses an Address object must now perform extra tests because a 'valid' address might not really exist. It might be one of those weakened concepts of 'valid' to allow editing.
His second approach was to introduce two classes that represent an "Address": one that represents the business object concept, and one that represents an Address in the UI layer.
I prefer this approach, even though it appears to be more work at first examination. Let's start with the business object for an Address. It's always valid, and it always must exist. Any code using this address can assume the address exists, and is valid. That makes it much simpler for client code. It could provide one public method that allows client code to set all fields or properties at once, ensuring that it is still valid. The Address type could throw an exception when client code tried to set an invalid address.
The other Address type, used by the UI, would have much looser validation. Its visibility would be limited to the UI editor, and it would be only used by the UI. To perform the full validation, this EditingAddress type would rely on the Address type. Users would then be informed at save time that an object wasn't valid.
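The two-type split described above can be sketched in a few lines of C#. This is only an illustration, not code from the article: the member names (SetAll, TrySave) are hypothetical, and IsConsistent is a trivial stand-in for the map-service lookup imagined earlier.

```csharp
using System;
using System.Linq;

// Business object: every instance satisfies the validity rule at all times.
public sealed class Address
{
    public string Street { get; private set; }
    public string City { get; private set; }
    public string State { get; private set; }
    public string Zip { get; private set; }

    public Address(string street, string city, string state, string zip)
    {
        SetAll(street, city, state, zip);
    }

    // The only mutator: all fields change together, or none of them change.
    public void SetAll(string street, string city, string state, string zip)
    {
        if (!IsConsistent(street, city, state, zip))
            throw new ArgumentException("The address is not valid.");
        Street = street;
        City = city;
        State = state;
        Zip = zip;
    }

    // Hypothetical stand-in for a real lookup against a map service.
    public static bool IsConsistent(string street, string city, string state, string zip)
    {
        return !string.IsNullOrWhiteSpace(street)
            && !string.IsNullOrWhiteSpace(city)
            && state != null && state.Length == 2
            && zip != null && zip.Length == 5 && zip.All(char.IsDigit);
    }
}

// UI-layer type: fields are edited one at a time and may be inconsistent.
public sealed class EditingAddress
{
    public string Street { get; set; }
    public string City { get; set; }
    public string State { get; set; }
    public string Zip { get; set; }

    // Called at save time; full validation is delegated to Address.
    public bool TrySave(out Address address, out string error)
    {
        try
        {
            address = new Address(Street, City, State, Zip);
            error = null;
            return true;
        }
        catch (ArgumentException e)
        {
            address = null;
            error = e.Message;
            return false;
        }
    }
}
```

A form would bind its text boxes to an EditingAddress, let the user change fields in any order, and call TrySave when the user commits the edit. Code outside the UI layer only ever sees Address, so it never has to re-check validity.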
As a footnote, the same strategy, using two different types, would be appropriate for types that are loaded and saved using the XML Serializer. You can't rely on the order in which properties are set during the Deserialization process. Objects might not be valid during those transition points.
It's been repeated so often that now it's not even questioned: Software projects fail not because of technology missteps, but because of 'business issues'. That's a catch all phrase for misunderstanding business needs, implementing the wrong features, or poor resource management.
But like so many items that have become conventional wisdom, this one is also only right some of the time. Sometimes projects fail because of technical issues. Like every other profession, there is a bell curve of skills and abilities. Some developers (and architects, testers, and designers) just aren't real professionals, and don't follow a professional software development practice of any sort.
In particular, our business, SRT Solutions, often helps companies whose primary skill is not software development. It's in whatever domain they are targeting: healthcare, manufacturing, engineering research, or whatever. That means all these issues become problems:
Software Architecture: Architecture is a study in tradeoffs. Which goals are most important? Which goals are secondary? Of course, customers say everything is important, but which matters more? In particular, we run into companies that don't have answers for:
Scaling: How many users must the system support? When? How fast will the user base grow? How will that growth be accommodated?
Longevity of the code base: Release 1 is usually fairly simple. It's creating release N + 1, as N grows, that gets more and more difficult. What about backwards compatibility? What constitutes a breaking change? How will you migrate or update your data store to handle unexpected upgrades? And understand that every project will have unexpected upgrades. That's even more true as the release number keeps growing.
Platform: How will your users interact with your software? Is it a web site? A web-based application? Smart Client? Some combination? Something even more radical? Why?
In addition, many of the smaller startup companies we work with have absolutely no knowledge, or even awareness, of software development practices. Even rudimentary software engineering topics like source control are foreign to some of these companies. Moving beyond that, what processes should be adopted? Is it agile? If so, Scrum, XP, or something else? How do you handle change requests? How do you do validation? Rollout?
If you're reading my blog, you probably have very strong opinions on all these topics. You probably are a software engineer (or have some other title related to software development). But how often do you feel that the 'business side' just doesn't get it? You're correct that these issues are important. Without quality software, your business fails. But the business side doesn't understand that.
That attitude is just plain wrong. Labeling those functions 'the business side' and 'the technical side' causes some of the problems. Everyone is focused on succeeding as a business, generating revenue and profit by delivering solutions to customers. Mutual respect helps a great deal: regardless of your function, you must recognize that other functions are equally important to the overall success. Software people must understand that the software they develop must satisfy real business objectives. True success will be achieved only when we, the technical experts, champion issues of quality, tools, techniques and practices that will promote technical excellence, in parallel to business excellence.
Here's the punch line:
It's your job as a technical professional to engage the business leaders and help them understand why software engineering is important.
What are the business risks if you create software without source control?
What happens to your business if you can't create release 5 as quickly as you created release 1?
More importantly: What happens if your company has developers that don't care about those questions?
That's the real issue: If your company creates software, you need to have a strategy to create excellent software. Otherwise, the software becomes a burden, and a risk – not an asset. Often projects fail to achieve the business goals because of technical issues, not business issues, or communications issues.
I took a one week vacation at the end of June, and I've been going crazy working since then.
Our company has a rare policy of offering 4 weeks of vacation for new hires. A lot of folks are surprised by that, but it has many benefits.
For one, it seems everyone in this industry works hard, and spends quite a bit of extra time working, or growing. If you don't take the time to do something different, you'll burn out. Period.
Equally important, the week or so not looking at daily deadlines means you can take a longer view: I came back from vacation with 6 pages of notes on future strategy. I don't get the chance to do that with the ongoing deadlines while I'm working. It's an important benefit.
By far the most important benefit is restoring work / family balance. This was especially important this year, since I had just finished the manuscript for More Effective C#. That sprint made the rest and relaxation important.
That means this is one of the employee benefits we intend to keep, despite some conventional wisdom that we're spending too much money on this benefit. The benefits are too great, both to our employees and our company.
Jay posts about it here. Jay is going to make an excellent MVP. I really admire his technical integrity and the thoroughness of his comments on technology.
When Jay recommends a technology, it's authentic, not fan-boy praise. You know if it applies to your situation, and how you should leverage it.
More importantly, when he criticizes something, it's actionable criticism. It's well thought out, and he includes discussions about how the product group can improve the situation.
This Wednesday, July 9th, Jonathan Zuck from the Association for Competitive Technology will be visiting Ann Arbor to speak to our .NET User Group (www.aadnd.org). He will be discussing how public policy affects our businesses (or our employers' businesses).
I've chatted about this topic with Jonathan at several different events. He's always quite a fountain of knowledge and advice. You can get a taste of his wisdom from a recent Hanselminutes: http://www.hanselminutes.com/default.aspx?showID=104
As always, AADND meets at SRT Solutions in downtown Ann Arbor (206 S. Fifth Avenue, Suite 200), at 6:00. I hope to see you there. (And if you can't make it Wednesday, Jonathan is speaking at Grand Rapids on Tuesday, and Flint on Thursday).
Here's the full abstract for Jonathan's talk:
Some of you may have heard the name Jonathan Zuck who was, in his day, a bit of a VB, Smalltalk and Delphi maven, speaking and teaching all over the world. With hundreds of articles, components, 5 MVP awards and a few books to his name, some of the older of you may have learned a lot of programming from him.
After selling his third company 10 years ago, Jonathan became acutely aware of the growing influence politics and public policy were having on the IT industry generally and on small businesses in particular. The Association for Competitive Technology is a trade association that represents small IT companies around the world on issues as far reaching as trade, public procurement, intellectual property protection and overregulation on privacy and online safety. Jonathan and ACT have been devoted to creating an environment which is open to all comers and as free as possible from the kind of bureaucracy that cripples small businesses.
Every day, policies are being discussed in Lansing, DC and around the world that could have an adverse effect on your business including rules for your website, procurement biases in government agencies or internet taxation. You have made various technology decisions for yourself and have invested a great deal of time and money in those technologies to gain expertise and command premium dollars. Politicians are lobbied by big players to enact policies to advance their specific business objectives most often at the expense of your interests. Their size and incumbency lead to policies of stagnation and protectionism that are generally antithetical to entrepreneurship. If politicians create policies that would create a bias against those technologies you would potentially lose some of that hard earned competitive advantage. Very seldom is it a technical decision to do something sweeping but is, instead, more often a political one.
Jonathan is making a few stops here in Michigan to talk about these issues and how they affect you as a businessman or technology professional. Topics of discussion include:
1. Examples of legislation from Michigan and elsewhere that could stifle the potential of your business
2. Technology preferences that could devalue your investment in expertise.
3. Opportunities to become more involved at the right time for maximum impact.
You probably have followed some of this stuff in the news, so you might have questions about it as well, such as the implications of the ISO vote to accept OOXML as a document standard and the role small businesses played in making that happen.
Please join us for a dynamic and informative session about a part of your business and career that often gets overlooked.
I've finished my registration for PDC 2008.
See you in Los Angeles. I'll be the one thinking about software.
OK, here's the deal: Those of us working on organizing the Ann Arbor Give Camp have been reaching out to charities for the past month or so.
There's good news, and a challenge. We've had more than 35 local and regional charities respond with requests for software. The good news is that means Ann Arbor GiveCamp can help a lot of worthy causes. The challenge is that we need roughly 175 developers / designers / DBAs / all-around software people to help make this happen.
Are you up to it? The event is July 11 - 13th, and you can sign up here: http://annarborgivecamp.org/DevRegister.aspx
Let's see how many groups we can help.
Last month, during an event on software tools, we discussed whether it was better for a company to mandate a set of tools, or let every developer choose their own tools.
To be sure, there are some tools that have to be common: source control is a great example. However, I argued that wherever possible, companies should let developers choose their own tools. If you want a different editor, go for it. Refactor! vs. Resharper: it's a personal choice.
I used an analogy from my early work history. When I was in high school and college, I worked in a Green Giant factory, some of the time as a mechanic. Every week, tool vendors (SnapOn, DeWalt, Ridgid, etc.) would visit and any mechanic could order tools for use at work. Everyone picked their own tools. That led to increased pride of ownership, better care of the tools, and more identification with the tools that each person chose. We also discussed the relative merits of different tools, or vendors. Using that experience, I argued that whenever possible, companies should let developers pick their own tools.
Later though, I thought this analogy shows something wrong with the current state of the market in software tools. My analogy works great for individual general tools. But it doesn't work at all for those specialized tools that you only need once in a while. Tool vendors visited every week to sell hand tools. But they didn't try to sell expensive shared tools: welders, drill presses, milling machines, and the like. Those tools were purchased by the company, and located in the shop. But here's the interesting point: those were shared tools. As long as you'd been trained, you could use any of the tools in the shop.
That led me to this thought: Those very expensive, but only occasionally needed tools are priced and configured completely wrong. Our individual tools, those we use everyday, are priced reasonably. Whether it's Visual Studio, Refactor!, CodeRush, Resharper, or another add-in, it's fairly easy to argue that you get more than your money's worth in productivity.
But think of those tools that someone needs occasionally. They are priced rather high. That somewhat makes sense, because those tools are only needed occasionally. I'm thinking of tools like low-level debuggers, load testing, threat modeling, etc. When you need them, you need to use them. But, that's only some of the time. Unfortunately, buying one copy for a team doesn't work well. In fact, it starts to lower productivity. Suppose our team has one copy of ExtraSpiffyTool. When someone needs it, we'll need to uninstall it from whoever used it last. Then, the person that needs it now will need to add it. Finally, after doing the uninstall / install dance, you can get your work done. That's too much friction. By analogy, think back to that welder or milling machine. If you needed it, you went to the shop, and you used it. (Or, with the welder, you may take it to where you were working, but it was a relatively low-friction activity).
Of course, you could try to convince your company to buy one copy of ExtraSpiffyTool for every developer on your team. But that's not cost effective. Using the welder analogy, that would mean the company would buy a welder for everyone. That's silly: most of them would sit unused most of the time. Why not share?
Which brings me to my conclusion: Can tool vendors (those that produce ExtraSpiffyTools) build a licensing model that lets companies buy one copy, and really share the software? This is a solved technical problem: Some network piece could know that the tool is in use, and could queue up users waiting for it.
This would only work for those tools that are a) expensive and b) not used often. If you had developers waiting to use the tool, clearly you need more copies. But, if it really was only needed occasionally, you'd get more sales. You'd lower the friction of moving the tool from developer to developer, and still retain the "one person using it at a time" model.
Just a thought.
OK, this is why my blogging activity has slowed to a crawl, or downright stopped lately.
I've been working quite a bit on my next book: More Effective C#. It's getting closer, and it's now available on Rough Cuts. Rough Cuts is a Safari Books Online service that provides you with pre-publication first access to upcoming books. Chapters 1 & 2 are up right now.
You can see more about the Rough Cuts program here: http://safari.informit.com/roughcuts
More Effective C# is here: http://safari.informit.com/9780321580481
Back to more editing.
I received this question in email from one of my readers, and I thought it would be of general interest:
I find myself with two arrays of the same size and I want to create a third that combines
each element. What I want is something similar to
var S3 = S1.foreach(S2, (s1, s2) => s1 + s2);
this would give S3[i] = S1[i] + S2[i]; for all i
Since this is a common pattern, I was wondering if you know what is a good functional solution.
Thank you for any input you would have
I wrote a couple of methods that make this quite simple. First, I wrote an extension method that merges two sequences:
    public static IEnumerable<T> Merge<T>(this IEnumerable<T> leftSequence,
        IEnumerable<T> rightSequence,
        Func<T, T, T> mergeFunc)
    {
        // Dispose both enumerators when iteration ends.
        using (IEnumerator<T> leftEnumerator = leftSequence.GetEnumerator())
        using (IEnumerator<T> rightEnumerator = rightSequence.GetEnumerator())
        {
            // Stop as soon as either sequence runs out of elements.
            while (leftEnumerator.MoveNext() && rightEnumerator.MoveNext())
            {
                yield return mergeFunc(leftEnumerator.Current, rightEnumerator.Current);
            }
        }
    }
Once that's done, it provides almost exactly the syntax you requested:
    int[] leftVector = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
    int[] rightVector = { 9, 8, 7, 6, 5, 4, 3, 2, 1, 0 };

    var sum = leftVector.Merge(rightVector, (x, y) => x + y).ToArray();

    foreach (var value in sum)
        Console.WriteLine(value);
There are a few caveats here. This doesn't test for any errors if the two vectors aren't the same length. That may or may not matter for you. I like this idiom, because you can do other operations on sequences as well. Here's multiply:
    var product = leftVector.Merge(rightVector, (x, y) => x * y).ToArray();
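If the length caveat does matter for you, one option is a stricter variant that fails when one sequence ends before the other. The sketch below is an assumption of mine rather than part of the original answer, and the name MergeStrict is hypothetical:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StrictSequenceExtensions
{
    // Hypothetical strict variant of Merge: throws when the two
    // sequences turn out to have different lengths.
    public static IEnumerable<T> MergeStrict<T>(this IEnumerable<T> leftSequence,
        IEnumerable<T> rightSequence,
        Func<T, T, T> mergeFunc)
    {
        using (IEnumerator<T> left = leftSequence.GetEnumerator())
        using (IEnumerator<T> right = rightSequence.GetEnumerator())
        {
            while (true)
            {
                // Advance both enumerators so that neither is short-circuited.
                bool hasLeft = left.MoveNext();
                bool hasRight = right.MoveNext();

                if (hasLeft != hasRight)
                    throw new InvalidOperationException(
                        "The sequences have different lengths.");
                if (!hasLeft)
                    yield break; // both sequences ended together

                yield return mergeFunc(left.Current, right.Current);
            }
        }
    }
}
```

Because this is an iterator method, the exception surfaces only when the result is enumerated (for example, by ToArray()), not when MergeStrict is called. On .NET 4 and later, the built-in Enumerable.Zip behaves like the lenient Merge: it stops at the end of the shorter sequence.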
I hope that helps.
Well, I'm back from the MVP Summit, and it seems that tradition mandates a summary of the trip.
But there is one problem: The best content was all under NDA. Most MVPs spend their time with members of the product team for their award, and other teams that are related. For me, that obviously means the C# language team, and other Developer Tools teams.
The NDA nature of the meetings means that everyone blogs about the parties, the time in the bar, the friends. Those are great fun, but I don't want to give everyone the impression that the MVP summit is one big party.
Really, it's the time when the Microsoft Product groups give us a good look behind the curtain and show us what they are thinking of building and ask for our feedback on each and every one of the ideas they are discussing.
It's my favorite MS based conference.
It's been a while coming, but we finally convinced several of our new and brilliant folks to start blogging.
I've added them all to the blog roll, but here they are, along with some editorial comments from your host:
(Order is alphabetical by blog address and does not signify anything else)
Anne Marsan is a Mechanical Engineer by education, but has been writing software for that market for some time. She has a deep understanding of the darker corners of higher math, and how computers still sometimes don't do it correctly. It's probably significant that she's doing the Euler problems in Matlab.
Marina Fedner is a recent Carnegie-Mellon graduate who is keeping us older folks learning new stuff every day.
Mike Woelmer is a farmer turned Silverlight developer, by way of several years of game development and medical imaging development. His first blog post discusses that journey. He's had quite the history.
Bill Heitzeg is an engineer of all trades and styles. He's also got a great feel for the business issues of software development.
It's a lot of interesting discussions, and a good taste of what kind of culture we have around here.
Now if we can just get Chris, Nate and Alex blogging as well.
Euler Problem six asks you to find the difference between the sum of squares and the square of the sum for the natural numbers 1 through 100.
I took the easy route, and made a brute force implementation in C#. There are a couple new bits of LINQ syntax here. This query creates two different anonymous types. One for the number / square pair, the second is the aggregate answer for the sum and the sum of squares.
The important point is that this is done in a single iteration. It's pulling each value in from the initial sequence and doing all the calculations in one iteration of the sequence:
    var aggregate = (from number in Enumerable.Range(1, 100)
                     select new { number, square = number * number }).
                    Aggregate(new { Sum = 0, SumOfSquares = 0 },
                        (sums, val) => new { Sum = sums.Sum + val.number,
                            SumOfSquares = sums.SumOfSquares + val.square });

    int SquareOfSums = aggregate.Sum * aggregate.Sum;

    Console.WriteLine("{0} - {1} = {2}", SquareOfSums, aggregate.SumOfSquares,
        SquareOfSums - aggregate.SumOfSquares);
There you go. Another simple bit of code.
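As a cross-check on the brute force answer, the same quantity has a closed form: the sum of 1 through n is n(n + 1) / 2, and the sum of the squares is n(n + 1)(2n + 1) / 6. The sketch below is not from the original post, and the class and method names are hypothetical. It computes the difference both ways so the two can be compared:

```csharp
using System;
using System.Linq;

public static class Euler6
{
    // Brute force: add everything up, then take the difference.
    public static int BruteForce(int n)
    {
        int sum = Enumerable.Range(1, n).Sum();
        int sumOfSquares = Enumerable.Range(1, n).Sum(x => x * x);
        return sum * sum - sumOfSquares;
    }

    // Closed form using the standard formulas for both sums.
    // Both divisions are exact; the intermediates fit in an int for n = 100.
    public static int ClosedForm(int n)
    {
        int sum = n * (n + 1) / 2;
        int sumOfSquares = n * (n + 1) * (2 * n + 1) / 6;
        return sum * sum - sumOfSquares;
    }
}
```

For n = 100 the sum is 5050 and the sum of squares is 338350, so both methods return 5050 * 5050 - 338350 = 25164150.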
Well, it's time to post another solution and look at how LINQ and C# 3.0 can create elegant code for these problems.
The fifth problem asks you to find the smallest number that is divisible by all the natural numbers from 1 through 20. You can trivially find the answer like this:
    private static void BruteForce()
    {
        var divisible = (from n in Enumerable.Range(1, int.MaxValue)
                         where (n % 2 == 0
                             && n % 3 == 0
                             && n % 4 == 0
                             && n % 5 == 0
                             && n % 6 == 0
                             && n % 7 == 0
                             && n % 8 == 0
                             && n % 9 == 0
                             && n % 10 == 0
                             && n % 11 == 0
                             && n % 12 == 0
                             && n % 13 == 0
                             && n % 14 == 0
                             && n % 15 == 0
                             && n % 16 == 0
                             && n % 17 == 0
                             && n % 18 == 0
                             && n % 19 == 0
                             && n % 20 == 0
                             )
                         select n).First();
        Console.WriteLine(divisible);
    }
But don't do that. It's very slow. Let's think a bit about the math, and the answer gets much easier. If you remember middle school math, you are being asked to find the least common multiple of all the numbers 1-20. The unique factorization theorem is what you need here. It states that every integer greater than 1 can be written in exactly one way as a product of prime numbers. The least common multiple can be found by multiplying together the highest power of each prime factor that appears. In code, it's a little easier to turn the definition around and peel off common factors from every number in the list. It's faster and simpler than finding all the prime factors.
You start with a list like this: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
1 really doesn't do much, so we'll start with 2. Keep the 2, and replace every number greater than 2 that is divisible by 2 with that number divided by 2:
1, 2, 3, 2, 5, 3, 7, 4, 9, 5
Move to the next number (3). Repeat:
1, 2, 3, 2, 5, 1, 7, 4, 3, 5
Move to the next number (another 2). Repeat:
1, 2, 3, 2, 5, 1, 7, 2, 3, 5
Continue until done:
1, 2, 3, 2, 5, 1, 7, 2, 3, 1
Multiply: 2520
That gives us this algorithm:
    private static void Better()
    {
        var numbers = Enumerable.Range(1, 20).ToArray();
        // remove common factors:
        for (int index = 0; index < numbers.Length; index++)
            for (int subIndex = index + 1; subIndex < numbers.Length; subIndex++)
                if (numbers[subIndex] % numbers[index] == 0)
                    numbers[subIndex] /= numbers[index];

        var answer = numbers.Aggregate(1, (product, number) => product * number);

        Console.WriteLine(answer);
    }
But you know what? I don't like those loops. I'd rather write methods that operate on a sequence. It's another rather simple example of recursion. For any sequence, peel off the first number. Then, return that number followed by the rest of the sequence divided by that first number, where possible. Pipe the remaining sequence back into the same function:
    private static void Best()
    {
        var numbers = Enumerable.Range(1, 20);

        var answer = numbers.LeastCommonFactor().Aggregate(1,
            (product, number) => product * number);

        Console.WriteLine(answer);
    }

    private static IEnumerable<int> LeastCommonFactor(this IEnumerable<int> list)
    {
        // Stop if the list is empty:
        if (!list.Any())
            yield break;

        int factor = list.First();
        yield return factor;

        var remaining = from item in list.Skip(1)
                        select (item % factor == 0) ? item / factor : item;
        foreach (int item in remaining.LeastCommonFactor())
            yield return item;
    }
Now that's a simple, elegant algorithm. And, it runs quite a bit faster than the brute force version. Stay tuned for more toward the end of the week.
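For comparison, here is a different technique from the factor-peeling above: fold the range with lcm(a, b) = a / gcd(a, b) * b, using Euclid's algorithm for the gcd. This is only a sketch, and the class and method names are hypothetical. It is a handy way to check the answers produced by the other versions:

```csharp
using System;
using System.Linq;

public static class Euler5
{
    // Euclid's algorithm for the greatest common divisor.
    public static long Gcd(long a, long b)
    {
        while (b != 0)
        {
            long remainder = a % b;
            a = b;
            b = remainder;
        }
        return a;
    }

    // Divide before multiplying to keep the intermediate value small.
    public static long Lcm(long a, long b)
    {
        return a / Gcd(a, b) * b;
    }

    // Least common multiple of every number from 1 through n.
    public static long SmallestMultiple(int n)
    {
        return Enumerable.Range(1, n)
            .Select(x => (long)x)
            .Aggregate(1L, (lcm, x) => Lcm(lcm, x));
    }
}
```

SmallestMultiple(10) returns 2520, and SmallestMultiple(20) returns 232792560.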