We have a large database with several tables of 10-50 million rows each, but in reality we only need the data for the past 3 years. So we created new tables to contain only the latest data. These tables are exactly like the original ones, e.g. they have the same indexes on the same partitions ... same everything.
And everything went perfectly. The table record counts are now ~10-15 times smaller than the original sizes. Initial performance measurements showed significant gains, but then we found that some other stored procedures perform worse than before - now they take 100% more time, e.g. going from ~2 minutes to ~4.
The table swapping was done via sp_rename.
We rebuilt all the indexes and even rebuilt the statistics, but the effect was very small.
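For reference, this is roughly the maintenance we ran on each swapped table (the table name below is just a placeholder):
-- Rebuild every index on the new table and refresh its statistics with a full scan.
-- dbo.NewBigTable stands in for each of the swapped tables.
ALTER INDEX ALL ON dbo.NewBigTable REBUILD;
UPDATE STATISTICS dbo.NewBigTable WITH FULLSCAN;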
Update: we cleared the buffer pool and plan cache via:
DBCC DROPCLEANBUFFERS
DBCC FREEPROCCACHE
GO
Fortunately for me, when I go back to the original tables, the problematic stored procedures start running as fast as before. Right now I am comparing the execution plans, but it is painful because those SPs are huge.
Any help will be appreciated.
Related
I have a table of a little over 1 billion rows of time-series data with fantastic insert performance but (sometimes) awful select performance.
Table tblTrendDetails (PK is ordered as shown):
PK TrendTime datetime
PK CavityId int
PK TrendValueId int
TrendValue real
The table is continuously pulling in new data and purging old data, so insert and delete performance needs to remain snappy.
When executing a query such as the following, performance is poor (30 sec):
SELECT *
FROM tblTrendDetails
WHERE TrendTime BETWEEN @inMinTime AND @inMaxTime
AND CavityId = @inCavityId
AND TrendValueId = @inTrendId
If I execute the same query again (with similar times, but any @inCavityId or @inTrendId), performance is very good (1 sec). Performance counters show that disk access is the culprit the first time the query is run.
Any recommendations regarding how to improve performance without (significantly) adversely affecting the insert or delete performance? Any suggestions (including completely changing the underlying database) are welcome.
The fact that subsequent queries of the same or similar data run much faster is probably due to SQL Server caching your data. That said, is it possible to speed this initial query up?
Verify the query plan:
My guess is that your query should result in an Index Seek rather than an Index Scan (or worse, a Table Scan). Please verify this using SET SHOWPLAN_TEXT ON; or a similar feature. Using between and = as your query does should really take advantage of the clustered index, though that's debatable.
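For example, something like this (a sketch, with sample literals in place of your parameters):
-- Show the estimated plan as text instead of executing the query.
SET SHOWPLAN_TEXT ON;
GO
SELECT *
FROM tblTrendDetails
WHERE TrendTime BETWEEN '2008-09-07' AND '2008-09-13'  -- sample range
  AND CavityId = 1                                     -- sample values
  AND TrendValueId = 1;
GO
SET SHOWPLAN_TEXT OFF;
GO
Look for a Clustered Index Seek rather than a Scan in the output.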
Index Fragmentation:
It is possible that your clustered index (the primary key in this case) is quite fragmented after all of those inserts and deletes. I would probably check this with DBCC SHOWCONTIG (tblTrendDetails).
You can defrag the table's indexes with DBCC INDEXDEFRAG (MyDatabase, tblTrendDetails).
This may take some time, but will allow the table to remain accessible, and you can stop the operation without any nasty side-effects.
You might have to go further and use DBCC DBREINDEX (tblTrendDetails). This is an offline operation, though, so you should only do this when the table does not need to be accessed.
There are some differences described here: Microsoft SQL Server 2000 Index Defragmentation Best Practices.
Be aware that your transaction log can grow quite a bit from defragging a large table, and it can take a long time.
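Putting those commands together, a minimal sketch (MyDatabase is a placeholder as above, and the index name is assumed):
-- Check fragmentation of the table's indexes.
DBCC SHOWCONTIG (tblTrendDetails);
-- Online defragmentation of the clustered index (PK_tblTrendDetails is a placeholder name).
DBCC INDEXDEFRAG (MyDatabase, tblTrendDetails, PK_tblTrendDetails);
-- Heavier, offline rebuild of all indexes on the table.
DBCC DBREINDEX (tblTrendDetails);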
Partitioned Views:
If these do not remedy the situation (or fragmentation is not a problem), you may even wish to look at partitioned views, in which you create a number of underlying base tables for various ranges of records, then union them all together in a view (replacing your original table).
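A very rough sketch of such a view, assuming you split by year (all names here are made up for illustration):
-- Each base table holds one year; the CHECK constraint on TrendTime lets the
-- optimizer skip tables that cannot match the query's date range.
CREATE TABLE dbo.tblTrendDetails_2008 (
    TrendTime    datetime NOT NULL
        CHECK (TrendTime >= '20080101' AND TrendTime < '20090101'),
    CavityId     int      NOT NULL,
    TrendValueId int      NOT NULL,
    TrendValue   real     NULL,
    CONSTRAINT PK_tblTrendDetails_2008
        PRIMARY KEY (TrendTime, CavityId, TrendValueId)
);
-- ...one such table per year...
GO
CREATE VIEW dbo.vwTrendDetails
AS
SELECT TrendTime, CavityId, TrendValueId, TrendValue FROM dbo.tblTrendDetails_2007
UNION ALL
SELECT TrendTime, CavityId, TrendValueId, TrendValue FROM dbo.tblTrendDetails_2008;
GO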
Better Stuff:
If performance of these selects is a real business need, you may be able to make the case for better hardware: faster drives, more memory, etc. If your drives are twice as fast, then this query will run in half the time, yeah? Also, this may not be workable for you, but I've simply found newer versions of SQL Server to truly be faster, with more options, and easier to maintain. I'm glad to have moved most of my company's data to 2008R2. But I digress...
In my test environment, on a copy of my 4GB production database, I archived about 20% of my data, then ran a shrink on it from SSMS, specifying 20% max free space.
The result was a 2.7GB database with horrid performance. A particular query takes about 0.5s in production and about 11s now in test. If I remove the full-text portion of the query in test, execution time is about 2 seconds.
Actual execution plan is identical between production and test.
I rebuilt all the indexes and fulltext indexes. Performance is still about the same. No actual content in the test database has changed since duplication.
Any thoughts on where I'd look for the culprit (besides just behind the keyboard)? :)
EDIT: OK, I repeated the process three times, with the same results each time... HOWEVER, the performance degrades BEFORE I run the shrink - as soon as I archive inactive records. 0 seconds before the archive, 18 after. I get 7 seconds back after rebuilding some indexes. The archive process:
Creates a new "Archive" DB
Identifies 3 types of keys to delete, storing them in table variables
Performs a select into the "Archive" DB for those three keys from 20 tables
Deletes rows from 20 "Live" tables for those three keys.
That's it. Post-archive, when I look at the execution plan, 40% of the time is spent in the very first operation, a clustered index scan.
I'm going to delete this and repost with the question rephrased, over at the SQL site.
relocated question: https://dba.stackexchange.com/questions/22337/option-force-order-improves-performance-until-rows-are-deleted
I'm going to delete this in a few days since the question is misleading, but just in case anyone is curious as to the outcome, it was solved here:
https://dba.stackexchange.com/questions/22337/option-force-order-improves-performance-until-rows-are-deleted
The shrink wasn't the cause; I only assumed it was because of the likelihood of fragmenting data with a shrink. The real issue was that deleting rows caused a bad statistical sample of the data shape to be taken. That in turn caused the query optimizer to return a bad plan. It thought its plan would scan about 900 rows, but instead it scanned over 52,000,000.
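If anyone hits the same symptom, the quick remedy is to resample statistics on the affected tables right after the big delete (a sketch; the table name is a placeholder):
-- Force a full-scan statistics update so the optimizer sees the real data shape.
UPDATE STATISTICS dbo.SomeLiveTable WITH FULLSCAN;
-- Or, more bluntly, refresh statistics for every table in the database.
EXEC sp_updatestats;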
Thanks for all the help!
I have noticed an interesting performance change that happens around 1.5 million inserted values. Can someone give me a good explanation why this is happening?
The table is very simple. It consists of (bigint, bigint, bigint, bool, varbinary(max)).
I have a clustered PK index on the first three bigints. I insert only the boolean "true" as the varbinary(max) data.
From that point on, performance seems pretty constant.
[Graph omitted] Legend: Y = time in ms, X = inserts (in batches of 10K)
I am also curious about the constant, relatively small (and sometimes very large) spikes I see on the graph.
[Screenshot omitted] Actual execution plan from before the spikes.
Legend:
Table I am inserting into: TSMDataTable (see the DDL sketch below)
1. BigInt DataNodeID - FK
2. BigInt TS - main timestamp
3. BigInt CTS - modification timestamp
4. Bit ICT - keeps track of the last inserted value (increases read performance)
5. VarBinary(max) Data - the payload
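Roughly, the table looks like this (the constraint name is arbitrary):
CREATE TABLE dbo.TSMDataTable (
    DataNodeID bigint         NOT NULL,  -- FK
    TS         bigint         NOT NULL,  -- main timestamp
    CTS        bigint         NOT NULL,  -- modification timestamp
    ICT        bit            NOT NULL,  -- marks the last inserted value
    Data       varbinary(max) NULL,      -- payload
    CONSTRAINT PK_TSMDataTable PRIMARY KEY CLUSTERED (DataNodeID, TS, CTS)
);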
Environment
It is local.
It is not sharing any resources.
It is a fixed-size database (big enough that it does not expand).
(Computer: 4 cores, 8GB, 7200rpm, Win 7.)
(SQL Server 2008 R2 DC, processor affinity (cores 1, 2), 3GB.)
Have you checked the execution plan once the time goes up? The plan may change depending on statistics. Since your data grows fast, the stats will change and that may trigger a different execution plan.
Nested loops are good for small amounts of data, but as you can see, the time grows with volume. The SQL query optimizer then probably switches to a hash or merge plan which is consistent for large volumes of data.
To confirm this theory quickly, try to disable statistics auto update and run your test again. You should not see the "bump" then.
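Something like this (a sketch; substitute your database name, and remember to turn it back on afterwards):
-- Turn off automatic statistics updates for the test...
ALTER DATABASE MyTestDb SET AUTO_UPDATE_STATISTICS OFF;
-- ...run the insert benchmark, then re-enable them.
ALTER DATABASE MyTestDb SET AUTO_UPDATE_STATISTICS ON;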
EDIT: Since Falcon confirmed that performance changed due to statistics we can work out the next steps.
I guess you are doing one-by-one inserts, correct? In that case (if you cannot insert in bulk) you'll be much better off inserting into a heap work table and then, at regular intervals, moving the rows in bulk into the target table. This is because for each inserted row, SQL has to check for key duplicates and foreign keys, perform other checks, and sort and split pages all the time. If you can afford to postpone these checks for a little later, you'll get superb insert performance, I think.
I used this method for metrics logging. Logging would go into a plain heap table with no indexes, no foreign keys, no checks. Every ten minutes, I create a new table of this kind, then with two sp_rename calls within a transaction (a swift swap) I make the full table available for processing and the new table takes over the logging. Then you have the comfort of doing all the checking, sorting and splitting only once, in bulk.
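The swap looks roughly like this (a sketch; the table names are made up):
-- LogStage receives the live logging; LogStage_New is an empty, pre-created copy.
BEGIN TRANSACTION;
    EXEC sp_rename 'dbo.LogStage', 'LogWork';       -- current table goes off for processing
    EXEC sp_rename 'dbo.LogStage_New', 'LogStage';  -- empty table takes over the logging
COMMIT TRANSACTION;
-- Now move the rows from dbo.LogWork into the indexed target table in bulk,
-- paying for key checks, sorting and page splits only once.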
Apart from this, I'm not sure how to improve your situation. You certainly need to update statistics regularly, as that is key to good performance in general.
You might try using a single-column identity clustered key and an additional unique index on those three columns, but I'm doubtful it would help much.
You might try padding the indexes - if your inserted data are not sequential. This would eliminate excessive page splitting, shuffling and fragmentation. You'll need to maintain the padding regularly, which may require an off-time window.
You might try a hardware upgrade. You'll need to figure out which component is the bottleneck. It may be the CPU or the disk - my favourite in this case. Memory is not likely, imho, if you have one-by-one inserts. It should be easy then: if it's not the CPU (the line hanging at the top of the graph), then it's most likely your IO holding you back. Try a better controller, better-cached and faster disks...
In a comment I read
Just as a side note, it's sometimes faster to drop the indices of your table and recreate them after the bulk insert operation.
Is this true? Under which circumstances?
As with Joel, I will echo the statement that yes, it can be true. I've found that the key to identifying the scenario he mentioned is all in the distribution of the data and the size of the index(es) you have on the specific table.
An application that I used to support did a regular bulk import of 1.8 million rows into a table with 4 indexes (one of them on 11 columns) and a total of 90 columns. The import with the indexes in place took over 20 hours to complete. Dropping the indexes, inserting, and re-creating the indexes took only 1 hour and 25 minutes.
So it can be a big help, but a lot of it comes down to your data, the indexes, and the distribution of data values.
Yes, it is true. When there are indexes on the table during an insert, the server has to constantly re-order/page the table to keep the indexes up to date. If you drop the indexes, it can just add the rows without worrying about that, and then build the indexes all at once when you re-create them.
The exception, of course, is when the import data is already in index order. In fact, I should note that I'm working on a project right now where this opposite effect was observed. We wanted to reduce the run-time of a large import (nightly dump from a mainframe system). We tried removing the indexes, importing the data, and re-creating them. It actually significantly increased the time for the import to complete. But, this is not typical. It just goes to show that you should always test first for your particular system.
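For completeness, the basic pattern looks something like this (a sketch; the index, column and file names are placeholders):
-- Drop the non-clustered index, load in bulk, then rebuild the index once at the end.
DROP INDEX IX_BigTable_ImportKey ON dbo.BigTable;

BULK INSERT dbo.BigTable
FROM 'C:\imports\nightly_dump.dat'            -- placeholder path
WITH (TABLOCK, BATCHSIZE = 100000);

CREATE NONCLUSTERED INDEX IX_BigTable_ImportKey
    ON dbo.BigTable (ImportKey, LoadDate);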
One thing you should consider when dropping and recreating indexes is that it should only be done in automated processes that run during low-volume periods of database use. While an index is dropped it can't be used by other queries that other users might be running at the same time. If you do this during production hours, your users will probably start complaining about timeouts.
In your experience, how often should Oracle database statistics be run? Our team of developers recently discovered that statistics hadn't been run on our production box in over 2 1/2 months. That sounds like a long time to me, but I'm not a DBA.
Since Oracle 11g, statistics are gathered automatically by default.
Two Scheduler windows are predefined upon installation of Oracle Database:
WEEKNIGHT_WINDOW starts at 10 p.m. and ends at 6 a.m. every Monday through Friday.
WEEKEND_WINDOW covers whole days Saturday and Sunday.
When were statistics last gathered?
SELECT owner, table_name, last_analyzed FROM all_tables ORDER BY last_analyzed DESC NULLS LAST; --Tables.
SELECT owner, index_name, last_analyzed FROM all_indexes ORDER BY last_analyzed DESC NULLS LAST; -- Indexes.
Status of automated statistics gathering?
SELECT * FROM dba_autotask_client WHERE client_name = 'auto optimizer stats collection';
Window Groups?
SELECT window_group_name, window_name FROM dba_scheduler_wingroup_members;
Window Schedules?
SELECT window_name, start_time, duration FROM dba_autotask_schedule;
Manually gather Database Statistics in this Schema:
EXEC dbms_stats.gather_schema_stats(ownname=>NULL, cascade=>TRUE); -- cascade=>TRUE means include Table Indexes too.
Manually gather Database Statistics in all Schemas!
-- Probably need to CONNECT / AS SYSDBA
EXEC dbms_stats.gather_database_stats;
Whenever the data changes "significantly".
If a table goes from 1 row to 200 rows, that's a significant change. When a table goes from 100,000 rows to 150,000 rows, that's not a terribly significant change. When a table goes from 1000 rows all with identical values in commonly-queried column X to 1000 rows with nearly unique values in column X, that's a significant change.
Statistics store information about item counts and relative frequencies -- things that let the optimizer "guess" how many rows will match a given set of criteria. When it guesses wrong, the optimizer can pick a very suboptimal query plan.
At my last job we ran statistics once a week. If I remember correctly, we scheduled them for a Thursday night, and on Friday the DBAs were very careful to monitor the longest-running queries for anything unexpected. (Friday was picked because it was often just after a code release and tended to be a fairly low-traffic day.) When they saw a bad query they would find a better query plan and save that one so it wouldn't change again unexpectedly. (Oracle has tools to do this for you automatically: you tell it the query to optimize and it does.)
Many organizations avoid running statistics out of fear of bad query plans popping up unexpectedly. But this usually means that their query plans get worse and worse over time. And when they do run statistics then they encounter a number of problems. The resulting scramble to fix those issues confirms their fears about the dangers of running statistics. But if they ran statistics regularly, used the monitoring tools as they are supposed to, and fixed issues as they came up then they would have fewer headaches, and they wouldn't encounter them all at once.
What Oracle version are you using? Check this page which refers to Oracle 10:
http://www.acs.ilstu.edu/docs/Oracle/server.101/b10752/stats.htm
It says:
The recommended approach to gathering statistics is to allow Oracle to automatically gather the statistics. Oracle gathers statistics on all database objects automatically and maintains those statistics in a regularly-scheduled maintenance job.
When I was managing a large multi-user planning system backed by Oracle, our DBA had a weekly job that gathered statistics. Also, when we rolled out a significant change that could affect or be affected by statistics, we would force the job to run out of cycle to get things caught up.
With 10g and higher versions of Oracle, up-to-date statistics on tables and indexes are needed by the optimizer to make "good" execution plan decisions. How often you collect statistics is a tricky call. It depends on your application, schema, data rate and business practice. Some third-party apps that were written to be backward compatible with older versions of Oracle do not perform well with the new optimizer. Those applications require that tables have no stats so that the DB falls back to rule-based execution plans. But on average, Oracle recommends that stats be collected on tables with stale statistics. You can set tables to be monitored, check their state, and have them analyzed if/when stale. Often that is enough; sometimes it is not. It really depends on your database.
For my database, we have a set of OLTP tables that need nightly stats collection to maintain performance. Other tables are analyzed once a week. On our large DW database, we analyze as needed, as the tables are too large for regular analysis without affecting overall DB load and performance. So the correct answer is: it depends on the application, the rate of data change and the business needs.
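As an illustration of the monitor-and-gather-when-stale approach (a sketch; MYSCHEMA is a placeholder):
-- Which tables have stale statistics? (Monitoring is on by default in 10g and later.)
SELECT owner, table_name, stale_stats
FROM dba_tab_statistics
WHERE owner = 'MYSCHEMA' AND stale_stats = 'YES';
-- Gather statistics only for objects whose stats are stale.
EXEC dbms_stats.gather_schema_stats(ownname=>'MYSCHEMA', options=>'GATHER STALE', cascade=>TRUE);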
Make sure to balance the risk that fresh statistics cause undesirable changes to query plans against the risk that stale statistics can themselves cause query plans to change.
Imagine you have a bug database with a table ISSUE and a column CREATE_DATE where the values in the column increase more or less monotonically. Now, assume that there is a histogram on this column that tells Oracle that the values for this column are uniformly distributed between January 1, 2008 and September 17, 2008. This makes it possible for the optimizer to reasonably estimate the number of rows that would be returned if you were looking for all issues created last week (i.e. September 7 - 13). If the application continues to be used and the statistics are never updated, though, this histogram will be less and less accurate. So the optimizer's estimate of how many rows "issues created last week" will return becomes less and less accurate over time, which may eventually cause Oracle to change the query plan for the worse.
In the case of a data warehouse-type system you can consider collecting no statistics at all, and relying on dynamic sampling (setting optimizer_dynamic_sampling to level 2 or above).
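For example (a sketch; pick the level that suits your queries):
-- Session-level dynamic sampling; level 2 samples a small number of blocks,
-- higher levels sample more.
ALTER SESSION SET optimizer_dynamic_sampling = 2;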
Generally it's not recommended to gather statistics that frequently on the whole database unless you have a strong justification for it, such as bulk inserts or big data changes happening frequently on the database.
Gathering statistics on the database at that frequency MAY change query execution plans to new, poor ones, and that can cost you a lot of time trying to tune every query affected by the new poor plans. This is why you should test the impact of gathering new statistics on a test database first, or, if you don't have the time or the manpower for that, at least keep a fallback plan by backing up the original statistics before you gather new ones. That way, if the queries don't perform as expected after gathering new statistics, you can easily restore the original statistics.
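If you do it by hand, DBMS_STATS can back up and restore statistics directly; roughly (schema and backup-table names are placeholders):
-- Create a statistics table to hold the backup.
EXEC dbms_stats.create_stat_table(ownname=>'MYSCHEMA', stattab=>'STATS_BACKUP');
-- Export the current schema statistics into it before gathering new ones.
EXEC dbms_stats.export_schema_stats(ownname=>'MYSCHEMA', stattab=>'STATS_BACKUP');
-- Gather fresh statistics.
EXEC dbms_stats.gather_schema_stats(ownname=>'MYSCHEMA', cascade=>TRUE);
-- If plans regress, restore the old statistics from the backup table.
EXEC dbms_stats.import_schema_stats(ownname=>'MYSCHEMA', stattab=>'STATS_BACKUP');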
There is a very useful script that can help you back up the original statistics, gather new ones, and produce the SQL commands you can use to restore the original statistics in case things don't go as expected after gathering the new ones. You can find the script at this link:
http://dba-tips.blogspot.com/2014/09/script-to-ease-gathering-statistics-on.html