Please help me with this query (SQL Server 2008):

    ALTER PROCEDURE ReadNews
        @CategoryID INT,
        @Culture TINYINT = NULL,
        @StartDate DATETIME = NULL,
        @EndDate DATETIME = NULL,
        @Start BIGINT, -- for paging
        @Count BIGINT  -- for paging
    AS
    BEGIN
        SET NOCOUNT ON;

        -- ItemType for news is 0
        ;WITH Paging AS
        (
            SELECT news.ID, news.Title, news.Description, news.Date, news.Url,
                   news.Vote, news.ResourceTitle, news.UserID,
                   ROW_NUMBER() OVER(ORDER BY news.rank DESC) AS RowNumber,
                   TotalCount = COUNT(*) OVER()
            FROM dbo.News news
            JOIN ItemCategory itemCat ON itemCat.ItemID = news.ID
            WHERE itemCat.ItemType = 0 -- news item
              AND itemCat.CategoryID = @CategoryID
              AND (
                    (@StartDate IS NULL OR news.Date >= @StartDate)
                AND (@EndDate IS NULL OR news.Date <= @EndDate)
              )
              AND news.Culture = @Culture
              AND news.[status] = 1
        )
        SELECT *
        FROM Paging
        WHERE RowNumber >= @Start AND RowNumber <= (@Start + @Count - 1)
        OPTION (OPTIMIZE FOR (@CategoryID UNKNOWN, @Culture UNKNOWN))
    END

Here is the structure of the News and ItemCategory tables:

    CREATE TABLE [dbo].[News](
        [ID] [bigint] NOT NULL,
        [Url] [varchar](300) NULL,
        [Title] [nvarchar](300) NULL,
        [Description] [nvarchar](3000) NULL,
        [Date] [datetime] NULL,
        [Rank] [smallint] NULL,
        [Vote] [smallint] NULL,
        [Culture] [tinyint] NULL,
        [ResourceTitle] [nvarchar](200) NULL,
        [Status] [tinyint] NULL,
        CONSTRAINT [PK_News] PRIMARY KEY CLUSTERED ( [ID] ASC )
            WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF,
                  ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    ) ON [PRIMARY]

    CREATE TABLE [ItemCategory](
        [ID] [bigint] IDENTITY(1,1) NOT NULL,
        [ItemID] [bigint] NOT NULL,
        [ItemType] [tinyint] NOT NULL,
        [CategoryID] [int] NOT NULL,
        CONSTRAINT [PK_ItemCategory] PRIMARY KEY CLUSTERED ( [ID] ASC )
            WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF,
                  ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    ) ON [PRIMARY]

This query reads the news in a given category (sports, politics, ...). The @Culture parameter specifies the language of the news, for example 0 (English), 1 (French), etc. The ItemCategory table associates a news entry with one or more categories. The ItemType column in the ItemCategory table indicates what kind of item ItemID refers to. At the moment we only have ItemType 0, meaning that ItemID refers to a row in the News table.

Currently, I have the following index on the ItemCategory table:

    CREATE NONCLUSTERED INDEX [IX_ItemCategory_ItemType_CategoryID__ItemID]
    ON [ItemCategory] ( [ItemType] ASC, [CategoryID] ASC )
    INCLUDE ( [ItemID] )

and the following index on the News table (suggested by the query analyzer):

    CREATE NONCLUSTERED INDEX [_dta_index_News_8_1734000549__K1_K7_K13_K15]
    ON [dbo].[News] ( [ID] ASC, [Date] ASC, [Culture] ASC, [Status] ASC )

With these indexes, the query completes in less than a second for some parameter values, but for others (for example, a different @Culture or @CategoryID) it can take up to 2 minutes! I used OPTIMIZE FOR (@CategoryID UNKNOWN, @Culture UNKNOWN) to prevent parameter sniffing on @CategoryID and @Culture, but it does not seem to help for some parameter values.

The News table currently has about 2,870,000 entries, and the ItemCategory table has [count missing] entries.

I would really appreciate any advice on how to optimize this query or its indexes.

Update: execution plan:
[image: execution plan]
(In this image, ItemNetwork is the table I called ItemCategory above. They are the same table.)

7 answers

Have you had a look at some of the built-in SQL Server tools to help you with this?

That is, from the Management Studio menu:

  • "Query" → "Display Estimated Execution Plan"
  • "Query" → "Include Actual Execution Plan"
  • "Tools" → "Database Engine Tuning Advisor"
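
Alongside the actual execution plan, it can help to compare a fast and a slow parameter combination with I/O and timing statistics turned on. A minimal sketch; the parameter values below are made-up placeholders, not values from the question:

```sql
-- Report logical reads and CPU/elapsed time in the Messages tab
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Placeholder values: swap in a combination that is fast, then one that is slow
EXEC ReadNews
    @CategoryID = 1,
    @Culture = 0,
    @Start = 1,
    @Count = 20;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

Comparing the logical reads per table between the two runs usually shows which table the slow plan is scanning.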

Shouldn't the OPTION (OPTIMIZE FOR ...) clause be part of the inner SQL rather than the SELECT on the CTE?
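
For reference, T-SQL accepts a single OPTION clause per statement, placed at the very end, so with a CTE the hint attaches to the outer SELECT as it does in the question. A minimal self-contained sketch (the table variable and its values are made up purely for illustration):

```sql
DECLARE @CategoryID INT = 1;

-- Made-up sample data standing in for the real tables
DECLARE @Items TABLE (ID INT, CategoryID INT);
INSERT INTO @Items (ID, CategoryID) VALUES (1, 1), (2, 1), (3, 2);

;WITH Filtered AS
(
    SELECT ID
    FROM @Items
    WHERE CategoryID = @CategoryID
)
SELECT ID
FROM Filtered
-- The query hint belongs here, after the outer SELECT, not inside the CTE
OPTION (OPTIMIZE FOR (@CategoryID UNKNOWN));
```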


You should look at indexing the Culture column in the News table, as well as the ItemID and CategoryID columns in the ItemCategory table. You may not need all of these indexes. I would try them one at a time, and then in combination, until you find something that works. Your existing indexes don't seem to help your query very much.


You really need to look at the query plan. Note that you put the clustered index for News on News.ID, but this is not an identity column; it is an FK to the ItemCategory table. This will lead to some fragmentation of the News table over time, so it is less than ideal.

I suspect that the main problem is that your paging causes a table scan.

Updated:

According to the plan, these Sorts cost you 68% of the query time, which makes sense: at least one of these sorts has to support the ranking function you use, which is based on news.rank DESC, but you don't have an index that can support this ranking natively.

Getting an index to support it will be interesting. You can try a simple nonclustered index on news.rank; SQL Server may choose to combine indexes and avoid the sort, but this will require some experimentation.
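
As a starting point for that experiment, the simple index could be sketched like this. The index name is hypothetical, and whether the optimizer actually uses it to avoid the sort has to be checked in the plan:

```sql
-- Hypothetical index name; DESC matches ROW_NUMBER() OVER(ORDER BY news.rank DESC)
CREATE NONCLUSTERED INDEX IX_News_Rank
ON dbo.News ([Rank] DESC);
```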


Try a nonclustered index on (ItemID, CategoryID) for the ItemCategory table, and a nonclustered index on (Rank, Culture) for the News table.
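
Written out, those two indexes might look roughly as follows. The index names are assumptions chosen for illustration; the column order follows the suggestion above:

```sql
-- Hypothetical name: supports the join on ItemID plus the CategoryID filter
CREATE NONCLUSTERED INDEX IX_ItemCategory_ItemID_CategoryID
ON ItemCategory (ItemID ASC, CategoryID ASC);

-- Hypothetical name: Rank first for the ORDER BY, then the Culture filter
CREATE NONCLUSTERED INDEX IX_News_Rank_Culture
ON dbo.News ([Rank] ASC, Culture ASC);
```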


I finally came up with the following indexes, which work great: the stored procedure now runs in less than a second. I also removed TotalCount = COUNT(*) OVER() from the query, because I could not find a good index for it. Perhaps I will write a separate stored procedure to calculate the total number of records. I could even use a "more" button, like on Twitter and Facebook, instead of pagination buttons.

For the News table:

    CREATE NONCLUSTERED INDEX [IX_News_Rank_Culture_Status_Date]
    ON [dbo].[News] ( [Rank] DESC, [Culture] ASC, [Status] ASC, [Date] ASC )

For the ItemNetwork table:

    CREATE NONCLUSTERED INDEX [IX_ItemNetwork_ItemID_NetworkID]
    ON ItemNetwork ( [ItemID] ASC, [NetworkID] ASC )

I just don't know whether ItemNetwork needs a clustered index on the ID column. I never retrieve a row from this table by its ID. Do you think it would be better to have the clustered index on (ItemID, NetworkID)?
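
The separate stored procedure for the total count mentioned above could look roughly like this. It is only a sketch: the name ReadNewsCount is hypothetical, the filters are copied from ReadNews, and it uses the table and column names from the question (ItemCategory, CategoryID) rather than the real ItemNetwork names:

```sql
-- Hypothetical procedure: same filters as ReadNews, but no paging or sorting
CREATE PROCEDURE ReadNewsCount
    @CategoryID INT,
    @Culture TINYINT = NULL,
    @StartDate DATETIME = NULL,
    @EndDate DATETIME = NULL
AS
BEGIN
    SET NOCOUNT ON;

    SELECT COUNT_BIG(*) AS TotalCount
    FROM dbo.News news
    JOIN ItemCategory itemCat ON itemCat.ItemID = news.ID
    WHERE itemCat.ItemType = 0 -- news item
      AND itemCat.CategoryID = @CategoryID
      AND (@StartDate IS NULL OR news.Date >= @StartDate)
      AND (@EndDate IS NULL OR news.Date <= @EndDate)
      AND news.Culture = @Culture
      AND news.[status] = 1;
END
```

Since the count changes much more slowly than the pages are requested, the caller could fetch it once per filter combination rather than on every page.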


Try changing

    FROM dbo.News news
    JOIN ItemCategory itemCat ON itemCat.ItemID = news.ID

to

    FROM dbo.News news
    INNER HASH JOIN ItemCategory itemCat ON itemCat.ItemID = news.ID

or

    FROM dbo.News news
    INNER LOOP JOIN ItemCategory itemCat ON itemCat.ItemID = news.ID

I don't know what your data looks like, but the join between these tables may be the bottleneck.
