Before indexing the tables, I backed up the database and restored the backup into my test database. After that, I created non-clustered indexes on the necessary tables.
Before the indexes, the query took around 20 minutes to execute.
After the indexes, it took around 10 seconds.
I then created the same indexes manually on the production tables, but after creating them the execution time was still around 10 minutes. While researching the problem online, I learned that index column order is important for performance, so I changed the column order. Performance is still bad: around 9 minutes.
What is wrong?
(Sorry for my bad English.)
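The indexes I created look roughly like this (the table, column, and index names are placeholders, not my real schema):

-- Non-clustered index; the leading column is the one the query filters on with equality
CREATE NONCLUSTERED INDEX IX_Orders_CustomerId_OrderDate
    ON dbo.Orders (CustomerId, OrderDate)
    INCLUDE (TotalAmount);   -- covering column so the query avoids key lookups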
From the question I understand that indexing improved performance in both the test and prod environments, but on test the query takes 10 seconds whereas on prod it takes roughly 10 minutes.
On the prod environment there are several factors to look at:
Any locks/blocking happening on the object.
Index fragmentation levels.
When statistics were last updated.
If you post the query and the indexing strategy, that would get you more help.
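In the meantime, here is a rough sketch of the checks I would run on prod (SQL Server DMVs; the table name is a placeholder):

-- Fragmentation of the indexes on the suspect table
SELECT i.name AS index_name, ips.avg_fragmentation_in_percent, ips.page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.YourTable'), NULL, NULL, 'LIMITED') AS ips
JOIN sys.indexes AS i ON i.object_id = ips.object_id AND i.index_id = ips.index_id;

-- When each statistics object on the table was last updated
SELECT s.name AS stats_name, STATS_DATE(s.object_id, s.stats_id) AS last_updated
FROM sys.stats AS s
WHERE s.object_id = OBJECT_ID('dbo.YourTable');

-- Any sessions currently blocked
SELECT session_id, blocking_session_id, wait_type, wait_time
FROM sys.dm_exec_requests
WHERE blocking_session_id <> 0;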
Related
We have a large database with several tables of 10-50 million rows each, but in practice we only need the data from the past 3 years. So we created new tables to hold only the latest data. The new tables are exactly like the original ones, e.g. the same indexes on the same partitions... same everything.
Everything went perfectly. The table row counts are now ~10-15 times smaller than before. Initial performance measurements showed a significant gain, but then we found that some other stored procedures perform worse than before: they now take 100% more time, e.g. from ~2 minutes to ~4.
The table swapping was done via sp_rename.
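Roughly, the swap looked like the pattern below (the table names are only illustrative):

BEGIN TRANSACTION;
EXEC sp_rename 'dbo.BigTable', 'BigTable_Full';       -- keep the original under a new name
EXEC sp_rename 'dbo.BigTable_Recent', 'BigTable';     -- promote the 3-year table
COMMIT TRANSACTION;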
We rebuilt all the indexes and even rebuilt the statistics, but the effect was very small.
Update: we cleared the buffer pool and all cached execution plans via:
DBCC DROPCLEANBUFFERS
DBCC FREEPROCCACHE
GO
Fortunately for me, when I switch back to the original tables, the problematic stored procedures run as fast as before. Right now I am comparing the execution plans, but it is painful because those SPs are huge.
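I am pulling the cached plans with something along these lines (the procedure name is a placeholder):

SELECT qs.execution_count,
       qs.total_elapsed_time / qs.execution_count AS avg_elapsed_us,
       qp.query_plan
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
WHERE st.objectid = OBJECT_ID('dbo.MyHugeProcedure');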
Any help will be appreciated.
People often ask how to reduce the time spent rebuilding the indexes on a huge SQL Server table.
But I have the reverse issue. It's a table of 5 million+ rows, and when we rebuilt one of the primary non-clustered indexes, which was fragmented up to 97%, it went quickly and was done in less than a minute.
Does SSMS report 'Rebuilding Index Completed' prematurely, while the actual re-indexing continues in the background for hours? We use SQL Server 2012.
This is the first time I have witnessed a non-clustered index rebuild on a table of any real size finish in literally less than a minute, which frankly boggles my mind, especially since people always seem to ask the exact opposite question.
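In case it's relevant, this is the kind of check I can run from a second session right after SSMS reports completion; I'm not sure it's the right way to tell whether anything is still running:

-- From a second session: is any long-running user request still executing?
SELECT session_id, command, status, total_elapsed_time, wait_type
FROM sys.dm_exec_requests
WHERE session_id > 50;   -- rough filter to skip system sessions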
Any explanation on this would be highly appreciated!
I have a huge SQL query, with probably 15-20 tables involved.
There are 6 or 7 subqueries which are then joined together again.
Most of the time this query takes about a minute to run and returns 5 million records.
So even if the query is badly written, there is a query plan that lets it finish in a minute. I have made sure the query actually ran and didn't use cached results.
Sometimes, though, the query plan goes bad and the query never finishes. I run VACUUM ANALYZE every night on the tables involved in the query. work_mem is currently set to 200 MB; I have tried increasing it to 2 GB as well. I never saw the query go bad while work_mem was 2 GB, but when I reduced it and ran the query, it went bad, and now that I have increased it back to 2 GB the query is still bad. Could this be the query plan not being refreshed for the new setting? I tried discarding plans in my session.
I can only think of work_mem and VACUUM ANALYZE at this point. What other factors can make a query that normally returns results in a minute suddenly never return anything?
Let me know if you need more details on any settings, or the query itself. I can paste the plan too, but the query and the plan are too big to paste here.
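For reference, this is how I have been changing work_mem and discarding plans in my session (the values shown are the ones I tried):

SET work_mem = '2GB';   -- previously '200MB'
DISCARD PLANS;          -- drop any plans cached for this session
SHOW work_mem;          -- confirm the setting took effect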
If there are more than geqo_threshold (typically 12) entries in the range table, the genetic optimiser will kick in, often resulting in the random behaviour described in the question. You can solve this by:
increasing geqo_threshold;
moving some of your table references into a CTE. If you already have some subqueries, promote one (or more) of them to a CTE. It is a kind of black art to identify the clusters of tables in your query that will fit into a compact CTE (with relatively few result tuples, and not too many key references to the outer query).
Setting geqo_threshold too high (20 is probably already too high...) will cause the planner to spend a lot of time evaluating all the plans (the number of plans grows essentially exponentially with the number of RTEs). If you expect your query to take a few minutes to run, a few seconds of planning time will probably do no harm.
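A sketch of both options (the parameter values and table names are only illustrative; note that in PostgreSQL versions before 12 a CTE is an optimisation fence, so it is planned separately and the outer query sees fewer range-table entries):

-- Option 1: keep the exhaustive planner for this session
SET geqo_threshold = 14;
SET join_collapse_limit = 14;
SET from_collapse_limit = 14;

-- Option 2: pull a cluster of tables into a CTE to shrink the outer join problem
WITH small_cluster AS (
    SELECT a.id, a.val, b.other_val
    FROM table_a AS a
    JOIN table_b AS b ON b.a_id = a.id
)
SELECT c.*, s.val, s.other_val
FROM table_c AS c
JOIN small_cluster AS s ON s.id = c.a_id;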
In my test environment, on a copy of my 4GB production database, I archived about 20% of the data, then ran a shrink on it from SSMS, specifying 20% maximum free space.
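As far as I know, the SSMS shrink task issues something like this under the hood (the database name is a placeholder):

DBCC SHRINKDATABASE (N'MyTestCopy', 20);   -- 20 = maximum free space, as a percentage, to leave in the files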
The result was a 2.7GB database with horrid performance. A particular query takes about 0.5 s in production and about 11 s now in test. If I remove the full-text portion of the query in test, execution time drops to about 2 seconds.
Actual execution plan is identical between production and test.
I rebuilt all the indexes and the full-text indexes. Performance is still about the same. No actual content in the test database has changed since duplication.
Any thoughts on where I'd look for the culprit (besides just behind the keyboard)? :)
EDIT: OK, I repeated the process three times, with the same results each time... HOWEVER, the performance degrades BEFORE I run the shrink, as soon as I archive inactive records: ~0 seconds before the archive, 18 after, and I get 7 seconds back after rebuilding some indexes. The archive process:
Creates a new "Archive" DB
Identifies 3 types of keys to delete, storing them in table variables
Performs a SELECT INTO the "Archive" DB for those three keys from 20 tables
Deletes rows from the 20 "Live" tables for those three keys (sketched below)
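Per table, the archive step is roughly this (the table, column, and database names are placeholders):

DECLARE @KeysToArchive TABLE (CustomerKey INT PRIMARY KEY);   -- one of the three key sets

-- Repeated for each of the 20 tables
SELECT t.*
INTO ArchiveDB.dbo.Orders
FROM dbo.Orders AS t
WHERE t.CustomerKey IN (SELECT CustomerKey FROM @KeysToArchive);

DELETE t
FROM dbo.Orders AS t
WHERE t.CustomerKey IN (SELECT CustomerKey FROM @KeysToArchive);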
That's it. Post-archive, when I look at the execution plan, 40% of the time is spent in the very first operation, a clustered index scan.
I'm going to delete this and repost with the question rephrased over at the SQL site.
relocated question: https://dba.stackexchange.com/questions/22337/option-force-order-improves-performance-until-rows-are-deleted
I'm going to delete this in a few days since the question is misleading, but just in case anyone is curious as to the outcome, it was solved here:
https://dba.stackexchange.com/questions/22337/option-force-order-improves-performance-until-rows-are-deleted
The shrink wasn't the cause; I only assumed it was because of the likelihood of fragmenting data with a shrink. The real issue was that deleting rows caused a bad statistical sample of the data shape to be taken. That in turn caused the query optimizer to produce a bad plan: it estimated the plan would scan about 900 rows, but instead it scanned over 52,000,000.
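If anyone hits the same thing, a full-scan statistics update replaces that bad sample with statistics built from every row, so the estimates reflect the post-delete data (the table name is a placeholder):

UPDATE STATISTICS dbo.BigLiveTable WITH FULLSCAN;   -- sample every row instead of the small default sample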
Thanks for all the help!
In a comment I read
Just as a side note, it's sometimes faster to drop the indices of your table and recreate them after the bulk insert operation.
Is this true? Under which circumstances?
Like Joel, I will echo the statement that yes, it can be true. I've found that the key to identifying the scenario he mentions lies in the distribution of the data and the size of the index(es) on the specific table.
An application I used to support did a regular bulk import of 1.8 million rows into a table with 90 columns and 4 indexes, one of which covered 11 columns. The import with the indexes in place took over 20 hours to complete. Dropping the indexes, inserting the data, and re-creating the indexes took only 1 hour and 25 minutes.
So it can be a big help, but a lot of it comes down to your data, your indexes, and the distribution of the data values.
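The pattern itself is simple; assuming SQL Server (which is what the rest of this thread is about), with placeholder object names and file path:

-- Drop the non-clustered index, load the data, then rebuild the index in one pass
DROP INDEX IX_BigImport_Col1 ON dbo.BigImport;

BULK INSERT dbo.BigImport
FROM 'C:\loads\nightly_extract.dat'
WITH (FIELDTERMINATOR = '|', ROWTERMINATOR = '\n', TABLOCK);

CREATE NONCLUSTERED INDEX IX_BigImport_Col1
    ON dbo.BigImport (Col1, Col2);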
Yes, it is true. When there are indexes on the table during an insert, the server constantly has to re-order and page the table to keep the indexes up to date. If you drop the indexes, it can simply add the rows without worrying about that, and then build the indexes all at once when you re-create them.
The exception, of course, is when the import data is already in index order. In fact, I'm working on a project right now where the opposite effect was observed. We wanted to reduce the run time of a large import (a nightly dump from a mainframe system), so we tried removing the indexes, importing the data, and re-creating them, and it actually increased the time the import took to complete significantly. But this is not typical; it just goes to show that you should always test first on your particular system.
One thing you should consider when dropping and recreating indexes is that it should only be done in automated processes that run during low-volume periods of database use. While an index is dropped, it can't be used by other queries that other users might be running at the same time. If you do this during production hours, your users will probably start complaining about timeouts.