Thursday, July 10, 2008

A few months ago, Brian Dawson gave us a look at a bunch of perf tests for querying with ADO.NET and Entity Framework.

I adapted and enhanced those tests for my session at TechEd and then added even more logic to them as I dug into performance for my book.

I thought I would share some of what I found. All of the timings here are only meant to show how the technologies perform relative to one another. They are hardly scientific benchmarks. In my book I've called them "backyard benchmarks".

The first set of tests is similar to Brian's in that I merely query for all of the customers in the customer table of AdventureWorksLT. That's about 450.

However, thanks to some tips from Brad Sarsfield and Bruno Guardia Robles, who do performance testing on the DP team, we added an extra loop of queries to "prime the pump," so to speak, and get things like metadata loading and query caching, as well as SQL Server query caching, out of the way.
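As a rough, language-neutral illustration of that "prime the pump" idea (written in Python rather than .NET, and not the code used for these tests), the sketch below runs a few untimed passes before it starts collecting samples. The names `average_ms` and `FakeQuery` are hypothetical, and the fake query only simulates a one-time startup cost.

```python
import time
from statistics import mean


def average_ms(run_query, warmup=5, iterations=100):
    """Average wall-clock time of run_query in milliseconds, after warm-up."""
    # Untimed passes: one-time costs (metadata loading, cached query plans)
    # are paid here so they do not skew the samples below.
    for _ in range(warmup):
        run_query()

    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        run_query()
        samples.append((time.perf_counter() - start) * 1000.0)
    return mean(samples)


class FakeQuery:
    """Hypothetical stand-in for a data access call; only the first call is slow."""

    def __init__(self):
        self.calls = 0

    def __call__(self):
        self.calls += 1
        if self.calls == 1:
            time.sleep(0.02)  # pretend one-time setup cost
        return list(range(450))  # roughly the 450 customers


if __name__ == "__main__":
    query = FakeQuery()
    print(f"{average_ms(query):.3f} ms average over warm runs")
```

Because the slow first call lands in the warm-up passes, it never shows up in the average, which is the effect the extra loop is meant to have.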

Here are the average times comparing different ways of querying data.

Again, read these as relative to one another and nothing else.

[Chart: average query times for each data access method]

The EntityClient result looks funny, since you'd expect it to be faster. When the query is more complex and the data is shaped, EntityClient performs better than ObjectQuery, which makes sense to me.

This is not using any query caching or pre-compiled queries or other advantages. I actually do a ton of those tests in that particular book chapter though.

Note that Entity Client and Object Services cache compiled queries by default (you can turn this off), which is a little different from using Compiled Queries in LINQ but has a similar effect. So by default, the two Entity SQL tests are benefiting from that but the LINQ to Entities test is not. To be completely fair, I really should either create a compiled query for LINQ to Entities and LINQ to SQL or turn query caching off for the two Entity SQL tests. On the other hand, these *are* the defaults.

But what I found really interesting were updates.

There are a lot of ways to update data with ADO.NET. I chose to use DataAdapters since calling DataAdapter.Update is comparable to calling context.SaveChanges. Here I queried those same 450 or so customers, edited each one, added 10 new ones, then updated the database. I was also sure to clear out the 10 new records from the database after each iteration of the tests was complete. I didn't happen to do any deletes in this test.

I also did one DataAdapter.Update using UpdateBatch, just to compare.

In all cases I am starting the stopwatch, performing the update, then stopping the stopwatch. These times are the average of my 100 iterations of each test.
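To make that measurement boundary concrete, here is a minimal sketch in Python (not the .NET code behind these numbers). Only the update call sits between the stopwatch start and stop, while the query-and-edit step and the cleanup of the new records stay outside it. The in-memory `database` dictionary and the `setup`, `update` and `cleanup` helpers are hypothetical stand-ins for the real data access code.

```python
import time
from statistics import mean

ORIGINAL_COUNT = 450  # roughly the number of customers queried

# Hypothetical in-memory stand-in for the customer table.
database = {key: f"customer {key}" for key in range(ORIGINAL_COUNT)}


def setup():
    """Untimed: read every customer, edit each one, and add 10 new ones."""
    pending = {key: name.upper() for key, name in database.items()}
    for key in range(ORIGINAL_COUNT, ORIGINAL_COUNT + 10):
        pending[key] = f"NEW CUSTOMER {key}"
    return pending


def update(pending):
    """Timed: push all pending changes to the store in one call."""
    database.update(pending)


def cleanup(pending):
    """Untimed: remove the 10 new records so every iteration starts the same."""
    for key in pending:
        if key >= ORIGINAL_COUNT:
            del database[key]


def average_update_ms(setup, update, cleanup, iterations=100):
    samples = []
    for _ in range(iterations):
        pending = setup()
        start = time.perf_counter()   # start the stopwatch
        update(pending)               # the only work being measured
        elapsed = time.perf_counter() - start  # stop the stopwatch
        cleanup(pending)
        samples.append(elapsed * 1000.0)
    return mean(samples)


if __name__ == "__main__":
    avg = average_update_ms(setup, update, cleanup)
    print(f"{avg:.4f} ms average update time")
```

Keeping setup and cleanup outside the timed region means the average reflects only the cost of the update itself, not of preparing or resetting the data.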

[Chart: average update times for each data access method]

That is not a typo next to LINQ to SQL. I have not dug into why this is so different, but I had been told that EF was much faster than LINQ to SQL for updates. I had to see for myself. Bruno says this looks about right, but he's going to double check my code to make sure I didn't do anything completely wacky. And if I did, it wasn't intentional.

But I was also interested to see that Object Services was faster than DataAdapter.

I'm not trying to make a point here...just an observation because I have never seen these comparisons made before.

For all of these methods there are a lot of things you can do to improve this performance, not only in code but on the servers. And some server tweaks might benefit one method but not another. These tests just use all the defaults and give me a stake in the ground. Now if I didn't need to actually finish my book, I could just keep at it, which would be fun and interesting. But it wouldn't be so fun when Tim O'Reilly comes knocking on my door wondering where the heck my book is!

Thursday, July 10, 2008 4:07:06 PM (Eastern Standard Time, UTC-05:00) | Comments [4]
Thursday, July 10, 2008 5:22:35 PM (Eastern Standard Time, UTC-05:00)
Julie,

Who is "Bruno?" Brian's brother?

I'm still trying to figure out why EF exhibited such poor performance when updating 2155 Order_Details entities. I wanted to avoid query caching with repetition of identical queries to better simulate real-world conditions. However ~100:1 slower than out-of-band SQL was a shock.

Cheers,

--rj
Thursday, July 10, 2008 6:52:49 PM (Eastern Standard Time, UTC-05:00)
Julie,

"performance testing ont eh DP team" ... fingers going crazy :)

bill
Monday, July 14, 2008 10:24:58 AM (Eastern Standard Time, UTC-05:00)
Great stuff and a valuable service. I know how long the simplest test takes and I applaud your perseverance.

Now that I've buttered you up, I'd like to motivate you to check a couple of variations: 1) a big fetch and 2) a lot of small fetches. First, the "why".

Assume that I'm using EF to support a user-facing application. I'm not crunching payrolls; I'm retrieving data for people ... and they will linger over it for a while. The "client" in this example is never a "server" of data to other clients.

Acceptable performance, therefore, will be measured in terms of the pace at which a human being can request and digest business data.

Accordingly, it's probably ok if it takes ~0.1 second for a single moderate sized fetch of 450 customers; we'll happily pay <<5x>> that time for the convenience and maintainability of the EF approach relative to using DataReader.

[ ... as long as we're making only one trip and this isn't followed immediately by a series of other fetches while the user taps his toes ... but I digress.]

-- To Be Continued --
Monday, July 14, 2008 10:26:12 AM (Eastern Standard Time, UTC-05:00)
-- Continuing Previous Comment ---

1) Large Retrieve:

What if the user asks for a big slug of entities - say several thousand? Or, heaven forfend, the objects are drawn from an obscene 120 column table?

It is worth learning if the techniques alter their relative performance when the object or results are larger.

2) Many small fetches, back-to-back

App start-up time is a frequent user complaint. A lot of apps load a bunch of small code sets (filling ComboBoxes) upon launch. So what are the stats for retrieving 10 records from each of 50 different tables?

Note that in #2, discarding the early passes may be misleading; the app is only going to run this sequence once. So how bad was it really before you started counting later samples?

MAYBE you can fob this off on someone else. The DP team should take this challenge don't you think? But, since you're writing such a great book maybe you've got some ideas :-)