In part 2 of this series, we looked at the lengths SQL Server will go just to avoid the dreaded bookmark lookup, including ignoring an index altogether and choosing instead to scan the entire clustered index. But, we also showed how that decision was, in fact, a smart one, due to it resulting in fewer logical IOs.
But let’s face it, neither path (bookmark lookup or clustered index scan) is very efficient or scalable. So, what can we do to give SQL some better tools?
The answer, it seems, is surprisingly simple: create a covering index. That is, we need to create (or modify an existing one) an index that contains all the columns required for the query. But how do we determine what those columns are?
The easiest way I’ve found is to examine our execution plan, specifically the Bookmark Lookup node. Right click on the icon and choose “Properties”. Then, in the properties window, look for a line called “Output List”. Note: if your plan includes a scan, table or index, then use that operator as your root node instead.

If you click on that little “…” icon, what you’ll see is a nice list of the columns being requested by the query:

In this case, we see that the query is requesting the AccountID and Amount columns. So, how do we add these to the index?
Well, there are two options: 1) add them to the index key itself, 2) add them as included columns. For reasons I’ll illustrate in a moment we’ll see that number 2 is the better option, at least in this specific case.
First, adding them to the key:
DROP INDEX NCI_tempA_TransDate ON dbo.tmpA;
GO
CREATE INDEX NCI_tempA_TransDate ON dbo.tmpA (TransDate, AccountID, Amount);
Now, let’s run the same SELECT query:
SELECT AccountID, Amount FROM dbo.tmpA WHERE TransDate = '1/1/2010';
And the resulting execution plan:
…and STATISTICS IO:
Table 'tmpA'. Scan count 1, logical reads 4, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Both clearly an improvement over the previous results seen in part two (lookup and 217 logical reads).
“Great!” you say, “I’ll go start adding columns to my indexes right now!”
Hold on there… this isn’t all rainbows yet. You see, by adding those columns to the index key itself, we could be incurring additional overhead in terms of index maintenance. Plus, I’m pretty sure if index gurus like Paul Randal (blog | twitter) catch you adding things like large VARCHAR columns to an index key, without them actually being used in filter operations, they just might hunt you down and… well, let’s just say I doubt it’d be pretty. Note: if anyone out there can tell me how to show the difference in behavior via an execution plan or STATISTICS IO I will update this page and give them due props, as I was not able to get any difference between the two plans.
So now let’s look at option two: add the columns using the INCLUDE functionality in SQL 2005+:
DROP INDEX NCI_tempA_TransDate ON dbo.tmpA;
GO
CREATE INDEX NCI_tempA_TransDate_inc ON dbo.tmpA (TransDate)
INCLUDE (AccountID, Amount);
Running our SELECT statement gives an identical execution plan and number of logical IOs as when the columns are placed in the index key. But now we’ve skipped making our index key wider than necessary. Truly a great feature.
Before I close out the series, I should mention that even this second path doesn’t appear to be completely devoid of peril. By adding these INCLUDEd columns, you will be adding to the amount of storage required by the index. For example, the following information was returned by running a query against the sys.dm_db_index_physical_stats DMV, using the page_count column for indexes on our tmpA table:
For explanatory purposes, the indexes are named as follows:
- NCI_tempA_TransDate_inc – key = TransDate, included = AccountID, Amount
- NCI_tempA_TransDate_inckey – key = TransDate, AccountID, Amount
- NCI_tmpA_TransDate – key = TransDate, no included columns
So as you can see, adding INCLUDED columns to the key does indeed increase the storage required by the index. So do keep that in mind when you perform this kind of tuning work, especially on larger databases.
And so we’ve come to the end of our little series. Hopefully you will find this information useful the next time you are tuning a query and notice our little friend, the bookmark lookup (or even a index / tables scans). If anyone has any great stories to share on how this functionality helped them solve a production issue (or prevent one in development), I’d love to hear it in the comments. Props to anyone who can beat my best of reducing a query’s run time from 27 minutes to 44 seconds!
INSERT dbo.tmpA (AccountID, Amount, TransDate)
SELECT TOP 100 name, CAST(object_id as MONEY), ‘1/5/2010′
FROM sys.columns