I've been tasked with improving the performance of a reporting stored procedure (this is my first real-world performance tuning task). The procedure is called by an SSRS front end and currently takes about 30 seconds to run against the largest amount of data (based on the filters set from the report front end).
The stored procedure consists of 19 queries, most of which transform the data from its initial (legacy) format in the base tables into a meaningful dataset to be displayed to the business side.
I've created a query based on a few DMVs in order to find the most resource-consuming queries in the stored procedure (small snippet below) and I have found one query which takes about 10 seconds, on average, to complete.
select
object_name(st.objectid) [Procedure Name]
, dense_rank() over (partition by st.objectid order by qs.last_elapsed_time desc) [rank-execution time]
, dense_rank() over (partition by st.objectid order by qs.last_logical_reads desc) [rank-logical reads]
, dense_rank() over (partition by st.objectid order by qs.last_worker_time desc) [rank-worker (CPU) time]
, dense_rank() over (partition by st.objectid order by qs.last_logical_writes desc) [rank-logical write]
...
from sys.dm_exec_query_stats as qs
cross apply sys.dm_exec_sql_text (qs.sql_handle) as st
cross apply sys.dm_exec_text_query_plan (qs.plan_handle, qs.statement_start_offset, qs.statement_end_offset) as qp
where st.objectid in ( object_id('SuperDooperReportingProcedure') )
order by [rank-execution time]
, [rank-logical reads]
, [rank-worker (CPU) time]
, [rank-logical write] desc
Now, this query is a bit strange in the sense that the execution plan shows that the bulk of the work (~80%) is done when inserting the data into the local temporary table, not when querying the other tables from which the source data is taken and then manipulated. (The screenshot below is from SQL Sentry Plan Explorer.)
Also, the row estimates in the execution plan are way off: only 4,218 rows are actually inserted into the local temporary table, as opposed to the ~248k rows the execution plan thinks it is moving into it. Because of this, I'm thinking "statistics", but do those even matter if ~80% of the work is the actual insert into the table?
One of my first recommendations was to rewrite the entire process and the stored procedure so that the moving and transforming of the data does not happen inside the reporting stored procedure, but rather nightly into some persisted tables (real-time data is not required, only data up to the end of the previous day). But the business side does not want to invest time and resources into redesigning this and instead "suggests" I do performance tuning in the sense of finding where and what indexes I can add to speed this up.
I don't believe that adding indexes to the base tables will improve the performance of the report, since most of the time needed for running the query is spent saving the data into a temporary table (which, from my knowledge, lives in tempdb, meaning the rows are written to disk and time increases due to I/O latency).
But, even so, as I've mentioned this is my first performance tuning task. I've tried to read as much as possible about this in the last couple of days and these are my conclusions so far, but I'd like to ask for advice from a broader audience and hopefully get a few more insights into what I can do to improve this procedure.
A few clear questions I'd appreciate being answered:
Is there anything incorrect in what I have said above (in my understanding of the database or my assumptions)?
Is it true that adding an index to a temporary table will actually increase the execution time, since the table (and its associated index(es)) is rebuilt on each execution?
Could anything else be done in this scenario without having to rewrite the procedure / queries, only via indexes or other tuning methods? (I've read a few article headlines saying you can also "tune tempdb", but I haven't gotten into the details of those yet.)
Any help is very much appreciated and if you need more details I'll be happy to post.
Update (2 Aug 2016):
The query in question is (partially) below. What is missing are a few more aggregate columns and their corresponding lines in the GROUP BY section:
select
b.ProgramName
,b.Region
,case when b.AM IS null and b.ProgramName IS not null
then 'Unassigned'
else b.AM
end as AM
,rtrim(ltrim(b.Store)) Store
,trd.Store_ID
,b.appliesToPeriod
,isnull(trd.countLeadActual,0) as Actual
,isnull(sum(case when b.budgetType = 0 and b.budgetMonth between @start_date and @end_date then b.budgetValue else 0 end),0) as Budget
,isnull(sum(case when b.budgetType = 0 and b.budgetMonth between @start_date and @end_date and (trd.considerMe = -1 or b.StoreID < 0) then b.budgetValue else 0 end),0) as CleanBudget
...
into #SalvesVsBudgets
from #StoresBudgets b
left join #temp_report_data trd on trd.store_ID = b.StoreID and trd.newSourceID = b.ProgramID
where (b.StoreDivision is not null or (b.StoreDivision is null and b.ProgramName = 'NewProgram'))
group by
b.ProgramName
,b.Region
,case when b.AM IS null and b.ProgramName IS not null
then 'Unassigned'
else b.AM
end
,rtrim(ltrim(b.Store))
,trd.Store_ID
,b.appliesToPeriod
,isnull(trd.countLeadActual,0)
I'm not sure whether this is actually helpful, but since @kcung requested it, I've added the information.
Also, to answer some of his questions:
the temporary tables have no indexes on them
RAM size: 32 GB
Update (3 Aug 2016):
I have tried @kcung's suggestion to move the CASE expressions out of the aggregate-generating query and unfortunately, overall, the procedure time has not noticeably improved; it still fluctuates within ±0.25 to ±1.0 seconds of the original (yes, both lower and higher than the original version of the stored procedure, but I'm guessing this is due to variable workload on my machine).
The execution plan for the same query, but modified to remove the CASE conditions, leaving only the SUM aggregates, is now:
Adding indexes to the temporary table will definitely improve reads, but it slows down the writes into that temporary table.
Here, as you mentioned, there are 19 queries executing in the procedure, so analysing the execution plan of only one of them will not tell the whole story.
In addition, if possible, execute this query on its own and check how much time it takes (and how many rows it affects).
Another approach you may try, if it is possible in your case, is to use a table variable instead of the temporary table. Using a table variable over a temporary table has additional advantages: the procedure is pre-compiled, no transaction logs are maintained, and you don't need to write DROP TABLE.
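For illustration only, a minimal sketch of what the table-variable version might look like; the column list and types here are assumptions, not taken from the real schema, and the literal row exists only to keep the sketch runnable (the real procedure would use the same INSERT ... SELECT that currently feeds #SalvesVsBudgets):
DECLARE @SalesVsBudgets TABLE (
    ProgramName varchar(100),
    Store_ID    int,
    Budget      decimal(18, 2)
);
INSERT INTO @SalesVsBudgets (ProgramName, Store_ID, Budget)
VALUES ('SampleProgram', 1, 100.00);   -- placeholder row; really an INSERT ... SELECT
SELECT ProgramName, Store_ID, Budget
FROM @SalesVsBudgets;
-- no DROP TABLE needed: the variable goes out of scope at the end of the batch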
Any chance I can see the query? And the indexes on both tables?
How big is your RAM? How big is a row in each table (roughly)?
Can you update statistics for both tables and re-post the query plan?
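For reference, a minimal way to refresh statistics on the two temp tables used by the query would simply be:
UPDATE STATISTICS #StoresBudgets;
UPDATE STATISTICS #temp_report_data;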
To answer your questions:
You're mostly right, except about adding indexes. Adding indexes will help the query do lookups. It will also give the query planner the chance to consider a nested loop join plan instead of the hash join plan. Unfortunately, I can't answer more until my questions are answered.
You shouldn't need to add an index to the temp table. Adding an index to this temp table (or any insert-destination table) will increase write time, because the insert will need to update that index. Just imagine an index as a copy of your table with less information: it sits on top of your table and needs to be kept in sync with it. Every write (insert, update, delete) needs to update this index.
Looking at both tables' total rows, this query should run way faster than 10 seconds, unless you have a lemon PC, in which case it's a different story.
EDIT:
Just want to point out, for point 2, that I didn't realise your source table is a temp table as well. A temporary table is destroyed after each session of a connection ends. Adding an index to a temporary table means you will add extra time to create this index every time you create the temporary table.
EDIT:
Sorry, I'm using phone now. I'm just gonna be short.
So essentially 2 things:
add a primary key on the temp table at creation time, so you do it in one go. Don't bother with adding a nonclustered index or any covering index; you will end up spending more time creating those.
see your query: all of those CASE WHEN expressions, instead of doing them in this query, why don't you add them as another column in the table? Essentially you want to avoid calculation on the fly when doing GROUP BY. You can leave the SUM() in the query, as it's an aggregate query, but try to reduce run-time calculation as much as possible. (A combined sketch follows after the sample below.)
Sample:
case when b.AM IS null and b.ProgramName IS not null
then 'Unassigned'
else b.AM
end as AM
You can create a column named AM when creating table b.
Also those RTRIM and LTRIM: please remove them and do the trimming at table-creation time. :)
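A rough combined sketch of both points, using guessed column types and a hypothetical source table dbo.SourceBudgets (the real procedure would keep whatever currently populates #StoresBudgets):
CREATE TABLE #StoresBudgets (
    RowID           int identity(1, 1) NOT NULL PRIMARY KEY,  -- PK created in one go
    ProgramID       int,
    ProgramName     varchar(100),
    Region          varchar(50),
    AM              varchar(100),   -- already resolved to 'Unassigned' on insert
    Store           varchar(100),   -- already trimmed on insert
    StoreID         int,
    StoreDivision   varchar(50),
    appliesToPeriod bit,
    budgetType      int,
    budgetMonth     date,
    budgetValue     decimal(18, 2)
);
INSERT INTO #StoresBudgets (ProgramID, ProgramName, Region, AM, Store, StoreID,
                            StoreDivision, appliesToPeriod, budgetType, budgetMonth, budgetValue)
SELECT s.ProgramID, s.ProgramName, s.Region,
       CASE WHEN s.AM IS NULL AND s.ProgramName IS NOT NULL
            THEN 'Unassigned' ELSE s.AM END,   -- CASE evaluated once, here
       RTRIM(LTRIM(s.Store)),                  -- trimming done once, here
       s.StoreID, s.StoreDivision, s.appliesToPeriod,
       s.budgetType, s.budgetMonth, s.budgetValue
FROM dbo.SourceBudgets AS s;                   -- hypothetical source
The aggregate query can then GROUP BY b.AM and b.Store directly, with no CASE or trimming at run time.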
One suggestion is to increase the command timeout for the stored procedure call:
cmd.CommandTimeout = 200 // in seconds.
You can also generate a report link and email it to the user when the report has been generated.
Other than that, use a CTE; never use temp tables, as they are more expensive.
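As a hedged sketch of that idea against the query in the question (only a couple of columns kept; whether the CTE actually beats the temp table depends on how often the result is reused downstream):
;WITH SalesVsBudgets AS (
    SELECT b.ProgramName,
           trd.Store_ID,
           SUM(b.budgetValue) AS Budget
    FROM #StoresBudgets b
    LEFT JOIN #temp_report_data trd
           ON trd.store_ID = b.StoreID AND trd.newSourceID = b.ProgramID
    GROUP BY b.ProgramName, trd.Store_ID
)
SELECT ProgramName, Store_ID, Budget
FROM SalesVsBudgets;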
Related
I have the following table:
CREATE TABLE public.shop_prices
(
shop_name text COLLATE pg_catalog."default",
product text COLLATE pg_catalog."default",
product_category text COLLATE pg_catalog."default",
price text COLLATE pg_catalog."default"
)
and for this table I have a dataset covering 18 months. Each file contains about 15M records. I have to do some analysis, like in which month a shop increased or decreased its prices. I imported two months into a table and ran the following query just to test:
select shop, product from shop_prices group by shop, product limit 10
I waited more than 5 minutes, but got no result and no response; it was still working. What is the best way to store these datasets and run queries against them efficiently? Is it a good idea to create a separate table for each dataset?
Using explain analyze select shop_name, product from shop_prices group by shop_name, product limit 10 you can see how Postgres is planning and executing the query and how long the execution takes. You'll see it needs to read the whole table (with the time-consuming disk reads) and then sort it in memory, which will probably need to be cached on disk, before returning the results. On the next run you might discover the same query is very snappy if the number of shop_name+product combinations is very limited, and the data is thus cached after that explain analyze. The point being that a simple query like this can be deceiving.
You will get faster execution by creating an index on the columns you are using (create index shop_prices_shop_prod_idx on public.shop_prices(shop_name, product)).
You should definitely change the price column type to numeric (or float/float8) if you plan to do any numerical calculations on it.
Having said all that, I suspect this table is not what you will be using as it does not have any timestamp to compare prices between months to begin with.
I suggest you complete the table design and then think about indexes to improve performance. You might even want to consider table partitioning: https://www.postgresql.org/docs/current/ddl-partitioning.html
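As a sketch only (the new table, column and partition names and types are assumptions, and PostgreSQL 11 or later is assumed for the index on the partitioned parent), a redesign with a numeric price, a date column to compare months, and monthly range partitioning could look like this:
CREATE TABLE public.shop_prices_v2 (
    shop_name        text,
    product          text,
    product_category text,
    price            numeric(12, 2),
    price_date       date NOT NULL
) PARTITION BY RANGE (price_date);
CREATE TABLE public.shop_prices_2020_01
    PARTITION OF public.shop_prices_v2
    FOR VALUES FROM ('2020-01-01') TO ('2020-02-01');
CREATE INDEX ON public.shop_prices_v2 (shop_name, product);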
You will probably be doing all sorts of queries on this data so there is no simple solution to them all.
By all means come back with more specific questions, a complete table description and the output from the explain analyze statement for the queries you are trying out, and you will get some good advice.
Best regards,
Bjarni
What is your PostgreSQL version?
First there is a typo: column shop should be shop_name.
Second, your query looks strange because it has only a LIMIT clause without any ORDER BY clause or WHERE clause: do you really want "random" rows for this query?
Can you try to post EXPLAIN output for the SQL statement:
explain select shop_name, product from shop_prices group by shop_name, product limit 10;
Can you also check if any statistics have been computed for this table with:
select * from pg_stats where tablename='shop_prices';
I'm using SQL Server and I want to benefit from query plan reuse. I found this document, but it remains unclear to me whether the plan for my query is being reused or not.
declare @su dbo.IntCollection -- TABLE (Value int not null)
insert into @su values (1),(2),(3) --... about 500 values
update mt
set mt.MyField = getutcdate()
from MyTable mt
join @su vsu on mt.Id = vsu.Value -- Clustered PK, int
Technically the text of the batch differs from run to run, as different values are being inserted into @su, but the text of the update query remains the same. If I were using .NET I would simply pass a table-valued parameter to the SQL command, but I'm using Python and it looks like there is no way to pass a table parameter from my program.
Question 1: does the plan for the update query get reused? Or does the optimizer see that the text of the batch is different and not analyze the individual queries in the batch? In other words, is it the same as
update MyTable
set MyField = getutcdate()
where Id in (1, 2, 3 ...)
Question 2: I can force SQL to remain the same between calls by introducing a stored procedure with table parameter, but will I benefit from it?
Question 3: how to identify for a given query whether its plan was reused or computed again?
Question 4: should I worry about all above in my specific case? After all it is just an update of table on bunch of IDs...
Just answers to your questions..
Question 1: does the plan for the update query get reused? Or does the optimizer see that the text of the batch is different and not analyze the individual queries in the batch? In other words, is it the same as
Both of your update statements are treated as new queries, since SQL Server calculates a hash of the query text and any simple change means it will not match the old hash.
Question 2: I can force SQL to remain the same between calls by introducing a stored procedure with table parameter, but will I benefit from it?
This sounds like a good approach to me, rather than a bunch of INs.
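A minimal sketch of such a procedure, reusing the dbo.IntCollection type from the question (the procedure name is made up):
CREATE PROCEDURE dbo.TouchMyTable
    @su dbo.IntCollection READONLY
AS
BEGIN
    SET NOCOUNT ON;
    UPDATE mt
    SET    mt.MyField = GETUTCDATE()
    FROM   dbo.MyTable AS mt
    JOIN   @su AS vsu
      ON   mt.Id = vsu.Value;
END;
The batch text your program sends then stays constant (an EXEC with the TVP attached), so the varying IDs travel as data rather than as SQL text.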
Question 3: how to identify for a given query whether its plan was reused or computed again?
select usecounts from sys.dm_exec_cached_plans ec
cross apply
sys.dm_exec_sql_text(ec.plan_handle) txt
where txt.text like '%your query text%'
Question 4: should I worry about all above in my specific case? After all it is just an update of table on bunch of IDs...
It seems to me you are worrying too much. There are many rules which enforce query plan reuse behaviour, as pointed out in the white paper you referred to, so most of the time the query plan will be reused.
I would start worrying about plan reusability only when I see high SQL Compilations/sec coupled with Batch Requests/sec.
Taken from the answer here: https://dba.stackexchange.com/questions/19544/how-badly-do-sql-compilations-impact-the-performance-of-sql-server
SQL Compilations/sec is a good metric, but only when coupled with Batch Requests/sec. By itself, compilations per sec doesn't really tell you much.
You are seeing 170. If batch req per sec is only 200 (a little exaggerated for effect) then yes, you need to get down to the bottom of the cause (most likely an overuse of ad hoc querying and single-use plans). But if your batch req per sec is measuring about 5000 then 170 compilations per sec is not bad at all. It's a general rule of thumb that Compilations/sec should be at 10% or less than total Batch Requests/sec.
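If you want to check those two numbers yourself, both counters are exposed in sys.dm_os_performance_counters; they are cumulative, so a rough sketch is to sample them twice and divide by the interval:
DECLARE @before TABLE (counter_name nvarchar(128), cntr_value bigint);
DECLARE @after  TABLE (counter_name nvarchar(128), cntr_value bigint);
INSERT INTO @before
SELECT counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name IN (N'Batch Requests/sec', N'SQL Compilations/sec');
WAITFOR DELAY '00:00:10';   -- sample again 10 seconds later
INSERT INTO @after
SELECT counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name IN (N'Batch Requests/sec', N'SQL Compilations/sec');
SELECT a.counter_name,
       (a.cntr_value - b.cntr_value) / 10.0 AS per_second
FROM @after AS a
JOIN @before AS b ON b.counter_name = a.counter_name;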
I have a database where one table has a little bit more than 2M rows. It has startIpNum and endIpNum columns (the ranges don't overlap). I am making some queries against that table:
table:
Id | startIpNum (numeric(18,0)) | endIpNum (numeric(18,0)) | locId
Query 1:
select locId from Blocks
where startIpNum <= 1550084098 and endIpNum >= 1550084098
Query 2 (added this query hoping for better results):
select top 1 locId from Blocks
where endIpNum >= 1550084098
These queries take a reasonable time individually, no problems. But I need to get around 100 different rows each time I open a web page, and that takes around 15 seconds, which is possibly expected, but not desired.
I believe that by working with indexes I can improve that performance, so I've added 2 indexes, one on the start column (asc) and one on the end column (desc), but performance is the same.
What else can I do to achieve a better query performance?
Update
I have run the create index query you guys have proposed. No changes for now.
As requested, I am including the SQL query execution plans below (since I am not familiar with execution plans I am only snipping screenshots from SSMS; go ahead and ask if something else is required to answer my case):
Execution plan of Query1:
Execution plan of Query2:
As mentioned, without an execution plan to look at this is going slightly blind, but the basic points are:
1) If the indexes are in place to support this query there is no point adding two. Only one of the indexes can be used. Therefore you need one index that contains both columns.
2) Bringing back "*" means that a key lookup will be inevitable: having used the index to get the rows it needs, the engine will have to fetch the data not included in the index from the clustered index. Key lookups can get very expensive if you are bringing back large numbers of rows. If you can limit the columns you bring back then you can use an INCLUDE to avoid the key lookup. You don't need to include the primary key in this list as it is part of the index anyway.
Having said this your best option will be something like:
CREATE INDEX ix_range ON dbo.yourTable (start, end) INCLUDE (<list_of_columns_in_your_select>)
Looking at your query plan it is also clear that a CONVERT_IMPLICIT is being performed on your parameters @1 and @2. These should be avoided, so do the following:
DECLARE @1numeric numeric(18, 0),
        @2numeric numeric(18, 0)
SELECT @1numeric = CAST(@1 AS numeric(18, 0)),
       @2numeric = CAST(@2 AS numeric(18, 0))
SELECT locId FROM Blocks
WHERE startIpNum <= @1numeric and endIpNum >= @2numeric
Try to explicitly cast the value compared to match the columns.
select * from that_table
where CAST(123123123 as Numeric(18,0)) between startIpNum and endIpNum
I assume SQL Server is losing the index seek due to the implicit cast.
I have 4 million records in one of my tables. I need to get the last 25 records that have been added in the last week.
This is how my current query looks
SELECT TOP(25) [t].[EId],
[t].[DateCreated],
[t].[Message]
FROM [dbo].[tblEvent] AS [t]
WHERE ( [t].[DateCreated] >= Dateadd(DAY, Datediff(DAY, 0, Getdate()) - 7, 0)
AND [t].[EId] = 1 )
ORDER BY [t].[DateCreated] DESC
Now I do not have any indexes on this table and do not intend to add one. This query takes about 10-15 seconds to run and my app times out; is there a way to make it better?
You should create an index on (EId, DateCreated), or at least on DateCreated.
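Something along these lines, for example (the index name is arbitrary; the INCLUDE just makes it covering for this particular query):
CREATE NONCLUSTERED INDEX IX_tblEvent_EId_DateCreated
    ON dbo.tblEvent (EId, DateCreated)
    INCLUDE ([Message]);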
Without that, the only way of optimising this that I can think of would be to maintain the last 25 rows in a separate table via an insert trigger (and possibly update and delete triggers as well).
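A rough sketch of that trigger approach (the side table and trigger names are made up, only inserts are handled, and the EId filter is ignored for brevity):
CREATE TABLE dbo.tblEventLast25 (
    EId         int,
    DateCreated datetime,
    [Message]   nvarchar(max)
);
GO
CREATE TRIGGER dbo.trg_tblEvent_KeepLast25
ON dbo.tblEvent
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;
    -- copy the newly inserted rows into the side table
    INSERT INTO dbo.tblEventLast25 (EId, DateCreated, [Message])
    SELECT EId, DateCreated, [Message]
    FROM inserted;
    -- trim the side table back down to the 25 most recent rows
    WITH ranked AS (
        SELECT *, ROW_NUMBER() OVER (ORDER BY DateCreated DESC) AS rn
        FROM dbo.tblEventLast25
    )
    DELETE FROM ranked
    WHERE rn > 25;
END;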
If you have a column in the table that is auto-incrementing (not the EId but a separate PK) you can ORDER BY that ID DESC instead of DateCreated; that might make your ORDER BY faster.
Otherwise you do need an index (but your question says you do not want one).
If the table has no indexes to support the query you are going to be forced to perform a table scan.
You are going to struggle to get around the table scan aspect of that - and as the table grows, the response time will get slower.
You are going to have to endeavour to educate your client about the problems they face going forward, and that they should consider an index. If they are saying no, you need to show the evidence to support the reasoning: show them the times with and without the index, and make sure the impact on record insertion is also shown. It's a relatively simple cost / benefit / detriment analysis for adding or not adding the index. If they still insist on no index, then you have no choice but to extend your timeouts.
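One cheap way to gather that with/without evidence is to capture timings and I/O while running the query in SSMS, e.g.:
SET STATISTICS TIME ON;
SET STATISTICS IO ON;
-- run the report query here, once before and once after adding the index,
-- then compare the elapsed time and logical reads printed in the Messages tab
SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;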
You could also try a query hint:
http://msdn.microsoft.com/en-us/library/ms181714.aspx
with OPTION (FAST n), where n is the number of rows.
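Applied to the query from the question, that would look something like:
SELECT TOP(25) [t].[EId],
       [t].[DateCreated],
       [t].[Message]
FROM   [dbo].[tblEvent] AS [t]
WHERE  [t].[DateCreated] >= DATEADD(DAY, DATEDIFF(DAY, 0, GETDATE()) - 7, 0)
  AND  [t].[EId] = 1
ORDER BY [t].[DateCreated] DESC
OPTION (FAST 25);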
Possible Duplicate:
Why does the Execution Plan include a user-defined function call for a computed column that is persisted?
In SQL Server 2008 I'm running the SQL profiler on a long running query and can see that a persisted computed column is being repeatedly recalculated. I've noticed this before and anecdotally I'd say that this seems to occur on more complex queries and/or tables with at least a few thousand rows.
This recalculation is definitely the cause of the long execution, as the query speeds up dramatically if I comment out that one column from the returned results. (The field is computed by running an XPath against an XML field.)
EDIT: Offending SQL has the following structure:
DECLARE @OrderBy nvarchar(50);
SELECT
A.[Id],
CASE
WHEN @OrderBy = 'Col1' THEN A.[ComputedCol1]
WHEN @OrderBy = 'Col2' THEN C.[ComputedCol2]
ELSE C.[ComputedCol3]
END AS [Order]
FROM
[Stuff] AS A
INNER JOIN
[StuffCode] AS SC
ON
A.[Code] = SC.[Code]
All columns are nvarchar(50) except for ComputedCol3 which is nvarchar(250).
The query optimizer always tries to pick the cheapest plan, but it may not make the right choice. By persisting a column you are putting it in the main table (in the clustered index or the heap) but in order to pull out these values, normal data access paths are still required.
This means that the engine may choose other indexes instead of the main table to satisfy the query, and it could choose to recalculate the computed column if it thinks doing so combined with its chosen I/O access pattern will cost less. In general, a fair amount of CPU is cheaper than a little I/O, but no internal analysis of the cost of the expression is done, so if your column calls an expensive UDF it may make the wrong decision.
Putting an index on the column could make a difference. Note that you don't have to make the column persisted to put an index on it. If after making an index, the engine is still making mistakes, check to see if you have proper statistics being collected and frequently updated on all the indexes on the table.
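Using the table and column names from the query above, that index would just be the following (the computed column's expression must be deterministic, and any UDF it calls must be schema-bound, for the index to be allowed):
CREATE NONCLUSTERED INDEX IX_Stuff_ComputedCol1
    ON dbo.Stuff (ComputedCol1);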
It would help us help you if you posted the structure of your table (just the important columns) and the definitions of any indexes, along with some ideas of what the execution plan looks like when things go badly.
One thing to consider is that it may actually be better to recompute the column in some cases, so make sure that it's really correct to force the engine to go get it before doing so.