
I have a table PaymentItems with 8 million rows. 100'000 rows have the foreign key PaymentItemGroupId = '662162c6-209c-4594-b081-55b89ce81fda'.

I have created a nonclustered index on the column PaymentItems.Date (ASC) so that I can sort and filter entries by date faster.

When running the following query, it will take about 3 minutes:

SELECT TOP 10 [p].[Id], [p].[Receivers]
FROM [PaymentItems] AS [p]
WHERE [p].[PaymentItemGroupId] = '662162c6-209c-4594-b081-55b89ce81fda'
ORDER BY [p].[Date]

Interestingly, without the TOP 10 it takes about 18 seconds and returns all 100'000 rows. When I order descending instead of ascending (ORDER BY [p].[Date] DESC) it takes about 1 second. When I remove the index, the ascending sort is also fast.

I analyzed the query plan for the slow query, and it looks like SQL Server does not filter the rows by the foreign key first, but instead reads all 8 million rows in date order (nonclustered index scan on the Date index).

In the fast query, it applies the WHERE condition first (clustered key lookup).

Is there anything I can do, other than removing the Date index, to keep SQL Server from building a bad query plan like this?

Here is the actual query plan: https://www.brentozar.com/pastetheplan/?id=xBBArQl9kh

Here is the create table script:

CREATE TABLE [dbo].[PaymentItems](
    [Id] [uniqueidentifier] NOT NULL,
    [PaymentItemGroupId] [uniqueidentifier] NOT NULL,
    [Date] [datetime2](7) NOT NULL,
 CONSTRAINT [PK_PaymentItems] PRIMARY KEY CLUSTERED 
(
    [Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [IX_PaymentItems_Date] ON [dbo].[PaymentItems]
(
    [Date] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
GO

CREATE NONCLUSTERED INDEX [IX_PaymentItems_PaymentItemGroupId] ON [dbo].[PaymentItems]
(
    [PaymentItemGroupId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
GO
  • A foreign key is not an index, so you probably need one for PaymentItemGroupId Commented Mar 7 at 9:36
  • There already is. Not sure if SQL Server or Entity Framework creates this automatically. Commented Mar 7 at 9:52
  • Can you include the query plan? Use pastetheplan for simplicity. I'm guessing since lookup on date is needed, it decides to use the clustered index which becomes slow. So guessing you need better indexes Commented Mar 7 at 9:53
  • @siggemannen Added it to the question. Commented Mar 7 at 10:03
  • Please also add the full CREATE TABLE definition with indexes. Commented Mar 7 at 10:26

3 Answers


For this query to run fast, you need the following index, which it seems you don't have. It prevents the key lookup entirely by completely covering the query.

CREATE INDEX IX_PaymentItems_PaymentItemGroupId_Date ON PaymentItems
    (PaymentItemGroupId, Date, Id)
    INCLUDE (Receivers);
-- The existing IX_PaymentItems_PaymentItemGroupId becomes redundant and can be dropped.

Foreign keys are not indexed by default (even though arguably they should be), and even if they were, you'd still be missing the Date key column.
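If you want to double-check what's actually there before creating the new index, something like the following lists the key columns of every index on the table (this uses STRING_AGG, so it assumes SQL Server 2017 or later):

```sql
-- List each index on PaymentItems with its key columns in order
SELECT i.name, i.type_desc,
       STRING_AGG(c.name, ', ') WITHIN GROUP (ORDER BY ic.key_ordinal) AS key_columns
FROM sys.indexes AS i
JOIN sys.index_columns AS ic
    ON ic.object_id = i.object_id AND ic.index_id = i.index_id
JOIN sys.columns AS c
    ON c.object_id = ic.object_id AND c.column_id = ic.column_id
WHERE i.object_id = OBJECT_ID('dbo.PaymentItems')
  AND ic.is_included_column = 0
GROUP BY i.name, i.type_desc;
```

For the schema in the question, this should show the primary key on Id plus the two single-column nonclustered indexes, neither of which has (PaymentItemGroupId, Date) together.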


6 Comments

I added the index, but only (PaymentItemGroupId, Date), and it solves the issue. Is there a general rule of thumb for when we should use indexes like this? And when not to use indexes like I did? Or was this rather an "execution plan bug"?
You should have the INCLUDE as well. No, this is not a bug, it's entirely expected. The server thinks it's faster to scan the whole table because it thinks it will find 10 matching rows quickly; this is the Row Goal Problem. Giving it a proper index means the index will be faster, and using INCLUDE means it won't need Key Lookups. See also learn.microsoft.com/en-nz/archive/blogs/bartd/… and use-the-index-luke.com/blog/2019-04/…
@Ben5 No, it is not a "bug." The optimizer can't invent an index you didn't create; it has to work within the confines of what you did create.
@Charlieface Great explanation. In my real scenario there are a lot more columns involved (also dynamic combinations) so I guess this is good enough? Or is it the same as without any index then?
Hard to say without a real example, but generally you want some kind of index that at least covers the more selective WHERE predicates. The problem in this case started when it picked a bad index ie IX_PaymentItems_Date because it had to scan many rows and do key lookups on all of them until it got 10 rows that actually matched all the predicates.
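As an aside, if changing the indexes really isn't an option, one possible workaround (assuming SQL Server 2016 SP1 or later, where USE HINT is available) is to disable the row-goal optimization for just this query, so the optimizer costs the plan as if all matching rows were needed rather than just 10:

```sql
SELECT TOP 10 [p].[Id], [p].[Receivers]
FROM [PaymentItems] AS [p]
WHERE [p].[PaymentItemGroupId] = '662162c6-209c-4594-b081-55b89ce81fda'
ORDER BY [p].[Date]
OPTION (USE HINT('DISABLE_OPTIMIZER_ROWGOAL'));
```

This is a blunt instrument: it trades the (sometimes genuinely good) row-goal plan for the filter-then-sort plan in all cases, so measure it against the covering-index approach before committing.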
1

In addition to the accepted answer, I think it's important to dwell a bit on (and upvote) the comment

"it picked a bad index ie IX_PaymentItems_Date because it had to scan many rows and do key lookups on all of them until it got 10 rows that actually matched all the predicates"

You have 8.6M rows in your table, of which 100'000 have a particular PaymentItemGroupId. It would be reasonable to assume that these are distributed evenly across the entire 8.6M rows; in other words, roughly 1 in every 86 rows. This is what the optimizer will say: "10 rows, you say? That's about 860 rows on average... I can do that really fast by scanning in date order and doing lookups."

But if the sought-after PaymentItemGroupId wasn't added to the system until late in the game, say after about 8.4M payment items had already been created, that assumption no longer holds, and the server gets caught off guard, because the date-ordered index scan is guaranteed to find nothing in the first 8.4M rows. The (PaymentItemGroupId, Date) distribution is skewed.

This is what's at play here!

If you select TOP 10 in descending date order, the server will probably only need to look at something like 15 or 20 rows, because so many of the newer paymentItems have the sought after paymentItemGroupId.
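The effect of the skew can be sketched numerically. This is a toy model, not SQL Server's actual costing, but it shows why the optimizer's uniform-distribution assumption is off by four orders of magnitude here:

```python
# Toy model: how many rows a date-ordered scan reads before hitting 10 matches,
# assuming matches are spread evenly over the rows after first_match_at.
def rows_scanned_for_top10(total, matches, first_match_at=0):
    density = matches / (total - first_match_at)  # matches per remaining row
    return first_match_at + round(10 / density)

total, matches = 8_600_000, 100_000

uniform = rows_scanned_for_top10(total, matches)             # optimizer's assumption
skewed  = rows_scanned_for_top10(total, matches, 8_400_000)  # actual distribution

print(uniform)  # 860 rows scanned: looks very cheap, so the scan plan wins
print(skewed)   # 8_400_020 rows scanned: nearly the whole table
```

Under the uniform assumption the scan looks like it touches ~860 rows; with the skew it has to wade through 8.4M non-matching rows first, which is essentially a full scan plus 8.4M wasted key lookups' worth of work.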

Why not use the predicate first? There is, after all, an index on PaymentItemGroupId.
But then SQL Server would have to make 100'000 lookups into the clustered index to find the date, sort those 100'000 intermediate results, and only then take TOP 10. Even if the guesstimate of 1/86 were off by a factor of 1000 (and the plan actually estimates ~1/79), it would still look cheaper on paper to scan by date and do lookups for PaymentItemGroupId.

By combining PaymentItemGroupId and Date in one index, you save all the lookups, because the index correlates the two columns. For TOP 10 you might not need to INCLUDE Receivers, but what if it were TOP 100 or TOP 1000 instead? It's just good practice to cover.



I had a slightly different issue, but possibly with the same root cause. To resolve the index problem, the only thing that worked in my case was adding a new column, something like [DateSmall] computed as CAST([Date] AS date), thus removing the time portion. Of course this creates new issues, such as UTC vs. local dates, but it did solve the performance problem.
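A sketch of that approach (the column and index names here are assumptions, adapted to the table from the question, since this answer's original schema isn't shown): add a computed column that strips the time, then index it. CAST to date is deterministic, so the column can be indexed; PERSISTED just materializes it on disk.

```sql
-- Assumed names; adapt to your own schema.
ALTER TABLE dbo.PaymentItems
    ADD [DateSmall] AS CAST([Date] AS date) PERSISTED;

CREATE NONCLUSTERED INDEX IX_PaymentItems_DateSmall
    ON dbo.PaymentItems ([DateSmall]);
```

Queries that filter or sort on the date-only value can then seek on IX_PaymentItems_DateSmall instead of scanning the full datetime2 index.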

PS: I did try tackling it as an ascending key problem, which helped somewhat, but only if I updated statistics with a full scan, which wasn't really viable on a table with over 3 million inserts per day.

