First I want to admit something somewhat embarrassing: I don't know enough about databases. Until coming here, I've always had a DBA that could handle that. I've started to realize now how important it is for all developers to have a much better understanding about databases than I do. So I picked up this book. I was really looking to better understand optimization, especially knowing when to create indexes and optimizing queries. Though the book only had a few chapters on the subject, I felt like it greatly increased my understanding on the subject.
A few months ago I wrote about Adaptive Fetching Strategies. At that time I had little knowledge about covering indexes and how they improve performance. Oh well, it's just more reason something like that should be implemented. Before we even get to that however, NHibernate needs to have a few features added to it in order to be useful in scenarios where performance is important.
Remember folks, SELECT * is bad. NHibernate is for effectively always doing a SELECT * when you're asking for an object when it comes to index coverage. Yes, it specifies columns so it doesn't have all the problems associated with SELECT *, but it is still decidedly less performant than querying for only what you want. Compound this with the extra overhead the unneeded columns add during a flush, and you've got a pretty compelling argument to only query for what you need.
OK, so you've decided that you only want to query for what you need and you're using NHibernate. Well, you can do that... kind of. Not much differently than you can if you just used ADO.NET and DataSets though. NHibernate doesn't support lazy field initialization. This means that if you query for only username and email address from your user table, you don't get back a User. You get back a List of object[]'s. Arguably a list of object[]'s is less functional than a DataSet. Combine this with the fact that you're querying using HQL, which has a subset of functionality, predictability, and therefore optimizability of real SQL, and you start to see a big hole in whole ORM thing... if you want to optimize your pages. Yes, HQL does make writing queries more pretty and more in the domain, but it would be nice if it supported things like "on".
Obviously optimizing too early is evil, but we've had several pages now where we've needed to optimize, and just resorted to querying for a table of values. No longer are we querying for objects, that's just not performant enough. Especially in display scenarios where you're not making changes to anything as NHibernate lacks the ability to do read-only queries, so you end up with that (very) expensive flush unless you change your flushing strategy to be manual (awkward) or you detach your objects from the session (also awkward).
So in short, I feel NHibernate (and any ORM for that matter) needs the following features to really be optimization friendly:
- Lazy field initialization
- Querying for partial objects: select u(Username, Email) from User u
- Read-only queries that do not get flushed.
- Join qualifiers (on in T-SQL)
And yeah, I know it's open source and I could just do it myself, but I have nightmares about that codebase, and I hardly have the time to implement such large features. All I have the time to do is complain and wish :)