Development Blog

 Sunday, August 26, 2007
« Optimizing NHibernate pt. 2 | Main | Optimizing NHibernate pt. 4 »

I feel like I'm taking crazy pills over here. Nobody seems to see this the way I am, so I must be the crazy one, right? Let me try and explain with a few examples and pose a few specific questions that may help me understand what I'm missing. The examples are going to be somewhat contrived, but please bear with me.

Let's say we're working with the Northwind Database and for some reason one of the most often hit pages in our app is a page that displays the Freight and ShipName for a particular customer (I told you this would be contrived). Let's examine a few ways we could build the query for this.

Normal query

"from Order where CustomerId='QUICK'"

Simple. This is what we'd normally do the first time we write the app.

Normal Projection

"select o.Id, o.Freight, o.ShipName from Order o where CustomerId='QUICK'"

This is simple too. It'll give us back a list of object arrays that we can enumerate to display what we want to display.

DTO Projection

"select new OrderDto(o.Id, o.Freight, o.ShipName) from Order o where CustomerId='QUICK'"

This is a cleaner way to do a projection. Here's a good post on exactly how to do this (the key is adding an import statement to your mappings).

Partial Query

"select o(Id, Freight, ShipName) from Order o where CustomerId='QUICK'"

I made up the syntax for this. The idea is that it wouldn't use a constructor (hence no new) it'd use NHibernate's mapping to inject the property values, whether it be property setting, field setting or whatever.

What do you think of the syntax?

So let's talk about the differences between Partial Query and DTO Projection because they're the two most interesting methods to me. Projection is nice because it's implemented. You can use it today. It's also very light, it's not added to the unit of work so there's no overhead on flush, and it seems like a good fit for displaying tables of data.

On the other hand, let's pretend you've got some stupid business rule like every Tuesday the Freight on the ship named Mayflower is reported as double. Yeah, stupid rule but I don't feel like coming up with a real example. Any ways, you'd probably have something in your Order class to handle this stupid rule, but if you're projecting you'll need to apply this rule manually. If you just used a partial query, the business logic comes with you.

I should also mention that projections and partial queries differ greatly in that you cannot update with a projection. It is always read-only. This means that any time you want to update an entity, you need to load the whole thing. Even if you don't care about much or anything that's on the entity at the moment.

Here's some test data showing the average time it takes to query for 373 out of 200k Orders by CustomerId with and without covering indexes.

Method Time without index Time with index
Normal 15.2ms -
Partial Query 8.3ms 6.85ms
DTO Projection 3.4ms 2ms
Normal Projection 2.3ms 1.55ms

The decreases percentage wise is pretty significant, especially when you compare projection to normal queries. There's an improvement all around when you're able to use a covering index, but not nearly as much.

However, I find myself wondering if shaving anywhere from 1 to 12ms off a page load time is enough to justify optimization. There are a few things to consider. For one, the database isn't working as hard. Databases are expensive. If this optimization is made throughout the app, you'd probably notice significant enough savings. I definitely wouldn't suggest going through every query by hand and using projections or partial, but if it were automatic... then why not?

Is it worth the effort to bring all these features and adaptive fetching strategies to NHibernate to realize the Partial Query gains? Or are these numbers just not impressive enough?

Another thing I noticed is that projects are significantly I used dotTrace to figure out why the projection is so much faster than the partial query and it looks like the difference is mostly due to adding the entity to the session and the way things are hydrated. So it looks like Projections would still have their uses for even more aggressive optimizations.

Let's take another example from a domain I'm more familiar with--our own. For this example, we have a User, a Course, and a CourseRegistrationService. To register a user for a course you'd do something like this:

User user = userDao.FindByUsername('someuser');
Course course = courseDao.FindByCourseNumber(101);
courseRegistrationService.RegisterForCourse(user, course);

Does RegisterForCourse need to know a course's name? Or it's description? Or a user's phone number or address? Definitely not. Does NHibernate need to know them in order to hydrate you a business object you can pass around? Yep. Does it matter that NHibernate pulls too much info out of the db for just about everything you do? I think it does.

 I think it's debatable whether or not we should query for DTO's when it comes to displaying information about our business objects, but I think it's clear that you should use business objects for business logic. It's also clear that you don't always need all of the data in your involved business objects for everything you do with them.

Thoughts? When should you use a DTO? When should you use the business object?

What I see when I close my eyes and think about ORMs is an ORM that will:

  • Allow me to get only what I need from the database to do my specific task. (Partial Queries)
  • Figure out over time exactly what data I need (select clause) for my specific task with a little bit of guidance (from and where clause). (Adaptive Fetching Strategies)
  • Allow me to specify intent for objects I pull from the database. If I know I'm not going to write them back, it shouldn't verify I didn't change anything and I shouldn't have to deal with the overhead of flushing them. (Read Only Queries)
  • Track specific task performance when I want it to so that I can quickly identify queries that are called most often or that take the longest so that I have a starting point for my optimization efforts. (Yeah, this is completely unrelated and I'm sure it'd be easy to add)

Now if only I'd just shut up and start writing some code...

by Aaron on Sunday, August 26, 2007 3:32:12 PM (Pacific Standard Time, UTC-08:00)  #    Disclaimer  |  Comments [8]  |  Trackback