We had a DBA come in a few months ago and set up maintenance plans on our database. Looking at performance stats, we are seeing that the Update Statistics task is running overnight and spilling into operational hours. The vast majority of the time it is working on tblAudit. This is a very large table (60 GB) and we don't need it to be part of the maintenance plan, but we cannot see a way to exclude this one table. Please see attached pictures.
Is there an easy way to exclude this?
Short (but hesitant) answer: Yes, you could edit that task and remove the big table from the statistics update. I'm including a screenshot; I'm not sure why your screenshot does not seem to have anything in the "Object" drop-down. I would not recommend this as your solution though, and personally would not do this on any server.
If you do pick specific objects to update and exclude others, I believe it will make your life even worse, for two reasons. First, any new objects added to the database would not be covered by this task, so their statistics would never be updated by it. Second, excluding the big table from all statistics maintenance could lead to a potentially huge performance hit on that table.
Does that large table get any transactions (updates/inserts/deletes) at all, or is it fully static (never changes)? If it changes at all, whether daily, weekly or monthly, it will most likely need updated statistics. There is an internal threshold that triggers automatic statistics updates (if your database is configured with Auto Update and Auto Create), and the more rows you have in a table, the more data modifications are needed to trigger an automatic update. What that means is the table could perform badly, especially over time, if you leave it out of every maintenance task that updates stats.
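To put rough numbers on that threshold, here is a small sketch of the two rules as I understand them from the SQL Server documentation: the older rule of 500 plus 20% of the rows, and the newer one (database compatibility level 130 and up) that takes the smaller of that value and the square root of 1000 times the row count. The function names are mine, and which rule applies on your server is an assumption you would need to check.

```python
import math


def classic_threshold(rows: int) -> int:
    """Older rule: 500 modifications plus 20% of the table's rows."""
    return 500 + rows // 5


def dynamic_threshold(rows: int) -> int:
    """Newer rule: the smaller of the classic value and sqrt(1000 * rows)."""
    return min(classic_threshold(rows), math.isqrt(1000 * rows))


# Print how many modifications each rule needs before an automatic update.
for rows in (10_000, 1_000_000, 100_000_000):
    print(f"{rows:>11,} rows: classic={classic_threshold(rows):>10,} "
          f"dynamic={dynamic_threshold(rows):>8,}")
```

Under the older rule a table with 100 million rows needs over 20 million modifications before statistics refresh on their own, which is why a large, slowly changing table can drift for a long time if it is left out of maintenance.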
I would suggest you look into implementing Ola Hallengren's maintenance scripts (https://ola.hallengren.com/) instead of what the contract DBA created, which is merely the out-of-the-box, point-and-click maintenance plan shown in your screenshot. It is usually unnecessary to do full scans on all statistics.
If nothing else, perhaps use JUST the statistics maintenance pieces of the scripts shared on Ola's site. I've used a modified version of the scripts to do our statistics maintenance, with guidelines adjusted for our workload. A competent DBA would understand your workload, how often the data changes, and if and when a full scan is needed. If you have a good maintenance window on the weekend, perhaps kick off an update statistics for that large table then, but leave a nightly script for all other tables.
Related
We are running databases in SQL Server 2012 with multiple large datasets (some are in the 50M+ records range). The previous SQL developer designed the queries and optimized them but they still take 2+ hours to run.
He partially worked around this by creating a static table which gets updated every time the queries are run, so if the data hasn't changed, the query runs from this summarized table. The process performs checksums on the relevant tables and updates the static table if the checksums show the data has changed.
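For readers unfamiliar with the pattern, the checksum-gated refresh can be sketched roughly like this. It uses Python with an in-memory SQLite database purely as a stand-in; the table names (`sales`, `sales_summary`, `fingerprints`) and both helper functions are hypothetical, and on SQL Server the fingerprint would presumably come from something like `CHECKSUM_AGG(BINARY_CHECKSUM(*))` rather than a hash computed on the client.

```python
import hashlib
import sqlite3


def table_fingerprint(conn, table):
    """Hash every row in a stable order (stand-in for a checksum aggregate)."""
    digest = hashlib.sha256()
    for row in conn.execute(f"SELECT * FROM {table} ORDER BY 1"):
        digest.update(repr(row).encode())
    return digest.hexdigest()


def refresh_summary_if_changed(conn):
    """Rebuild the summary only when the source fingerprint differs.

    Returns True if the summary was rebuilt, False if it was left alone.
    """
    current = table_fingerprint(conn, "sales")
    stored = conn.execute(
        "SELECT value FROM fingerprints WHERE name = 'sales'").fetchone()
    if stored and stored[0] == current:
        return False
    with conn:  # rebuild and record the new fingerprint in one transaction
        conn.execute("DELETE FROM sales_summary")
        conn.execute("INSERT INTO sales_summary "
                     "SELECT region, SUM(amount) FROM sales GROUP BY region")
        conn.execute("INSERT OR REPLACE INTO fingerprints VALUES ('sales', ?)",
                     (current,))
    return True


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (id INTEGER PRIMARY KEY, region TEXT, amount INTEGER);
CREATE TABLE sales_summary (region TEXT, total INTEGER);
CREATE TABLE fingerprints (name TEXT PRIMARY KEY, value TEXT);
INSERT INTO sales (region, amount) VALUES ('north', 10), ('south', 20);
""")
first = refresh_summary_if_changed(conn)    # no fingerprint stored yet
second = refresh_summary_if_changed(conn)   # nothing changed since
conn.execute("INSERT INTO sales (region, amount) VALUES ('north', 5)")
third = refresh_summary_if_changed(conn)    # source data changed
print(first, second, third)
```

Note that the fingerprint itself still reads the whole source table, so the saving is in skipping the expensive summarizing query, not in avoiding the scan.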
I'm trying to speed up the whole process. We have designed an in-house GUI to run the queries for managers to be able to run reports themselves but I don't want them to waste 2 hours waiting for a report. I will be reviewing the indexes he was using to see if I can optimize them further and tweak his code as well but I suspect I might only get minimal performance improvement.
I like the idea of the static table for reporting but would like to have it updated more frequently (preferably nightly). Since the data can also change depending on tasks, I want to avoid any performance hits. For example, the team may be loading records overnight.
Any suggestions would be great. Thank you.
In my present team we have made it a practice to add 5-6 audit columns to all our tables, irrespective of whether they are required or not. I am concerned this will increase the number of pages occupied by the tables and the size of the database. Once live, the application may have 50k users hitting it concurrently.
How will it impact the performance of the application? What should I tell my boss to convince him this is a bad policy?
You need to test this and have some data to show. Test the impact with the expected workload of 50k users, with and without the audit columns, and document differences such as:
1. CPU usage
2. Memory usage
3. IO load
If you see any slowness, you can then present the testing you have done to your boss.
Here is a whitepaper from Microsoft which describes the impact of auditing on various OLTP workloads.
If size is the concern, you can maintain a separate database or table for the log. Log your changes/operations on tables through triggers or equivalent stored procedures.
Then periodically you can delete the old data.
Why are you convinced this is a bad policy?
They don't take a lot of room. If you get some performance issues and they are not used you can just remove them and reindex the table.
Is this really a position you want to take to your boss without some hard evidence?
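A back-of-the-envelope estimate can be a first piece of that evidence before any load test. The sketch below uses the rows-per-page arithmetic from Microsoft's table size estimation guidance (about 8096 usable bytes per 8 KB data page, plus 2 bytes per row for the slot array). The base row size, the audit column sizes and the row count are all hypothetical placeholders, and per-row overhead such as the null bitmap is ignored.

```python
PAGE_DATA_BYTES = 8096  # usable bytes for rows on an 8 KB data page
SLOT_BYTES = 2          # row-offset array entry per row


def pages_needed(row_bytes: int, row_count: int) -> int:
    """Estimate data pages for fixed-size rows (ignores indexes and fill factor)."""
    rows_per_page = PAGE_DATA_BYTES // (row_bytes + SLOT_BYTES)
    return -(-row_count // rows_per_page)  # ceiling division


BASE_ROW_BYTES = 200             # hypothetical row size without audit columns
AUDIT_BYTES = 4 + 8 + 4 + 8 + 1  # hypothetical: 2 int ids, 2 datetimes, 1 flag
ROWS = 10_000_000                # hypothetical table size

before = pages_needed(BASE_ROW_BYTES, ROWS)
after = pages_needed(BASE_ROW_BYTES + AUDIT_BYTES, ROWS)
growth_mb = (after - before) * 8 / 1024

print(f"pages before: {before:,}, after: {after:,}, growth: {growth_mb:.0f} MB")
```

Plug in your real row sizes and counts. Whether the resulting growth matters depends on how much of the table has to stay in memory, which is what the load test should show.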
I have read posts about transactional and reporting databases.
We have a single table which is used for both reporting (historical) and transactional purposes,
e.g. an order table with the fields:
orderid, ordername, orderdesc, datereceived, dateupdated, confirmOrder
Is it a good idea to split this table into neworder and orderhistory?
The neworder table would record the current day's transactions (select, insert and update activity every millisecond for the orders received that day). Later we would merge this table into orderhistory.
Is this a recommended approach?
Do you think this would minimize the load and processing time on the database?
PostgreSQL supports basic table partitioning which allows splitting what is logically one large table into smaller physical pieces. More info provided here.
To answer your second question: No. Moving data from one place to another is an extra load that you wouldn't have if you used the transactional table for reporting. But there are some other questions you need to ask before you make this decision.
How often are these reports run?
If you are running these reports once an hour, it may make sense to keep them in the same table. However, if this report takes a while to run, you'll need to take care not to tie up resources for the other clients using it as a transactional table.
How up-to-date do these reports need to be?
If the reports are run less than daily or weekly, it may not be critical to have up to the minute data in the reports.
And this is where the reporting table comes in. The approaches I've seen typically involve having a "data warehouse," whether that be implemented as a single table or an entire database. This warehouse is filled on a schedule with the data from the transactional table, which subsequently triggers the generation of a report. This seems to be the approach you are suggesting, and it is a completely valid one. Ultimately, the one question you need to answer is when you want your server to handle the load. If this can be done on a schedule during non-peak hours, I'd say go for it. If it needs to be run at any given time, then you may want to keep the single-table approach.
Of course there is nothing saying you can't do both. I've seen a few systems that have small on-demand reports run on transactional tables, scheduled warehousing of historical data, and then long-running reports against that historical data. It's really just a matter of how real-time you want the data to be.
I have a very large (100+ GB) SQL Server 2005 database that receives a large number of inserts and updates, with less frequent selects. The selects require a lot of indexes to keep them functioning well, but it appears the number of indexes is affecting the efficiency of the inserts and updates.
Question: Is there a method for keeping two copies of a database where one is used for the inserts and updates while the second is used for the selects? The second copy wouldn't need to be real-time updated, but shouldn't be more than an hour old. Is it possible to do this kind of replication while keeping different indexes on each database copy? Perhaps you have other solutions?
You're looking to set up a master/child database topology using replication. With SQL Server you'll need to set up replication between two databases (preferably on separate hardware). Use the master DB for inserts and updates; the child will service all your select queries. You'll also want to optimize each database's configuration settings for the type of work it will be performing. If you have heavy select queries on the child database you may also want to set up views that will make the queries perform better than complex joins on tables.
Some reference material on replication:
http://technet.microsoft.com/en-us/library/ms151198.aspx
Just google it and you'll find plenty of information on how to set up and configure it:
http://search.aim.com/search/search?&query=sql+server+2005+replication&invocationType=tb50fftrab
Transactional replication can do this, as the subscriber can have a number of additional indexes compared with the publisher. But you have to bear in mind a simple fact: all inserts/updates/deletes are going to be replicated to the reporting copy (the subscriber), and the additional indexes will... slow down replication. It is actually possible to slow replication down to a rate at which it is unable to keep up, causing the distribution DB to swell. But this only happens when you have a constant high rate of updates. If the problems only occur during spikes, then the distribution DB will act as a queue that absorbs the spikes and levels them off during off-peak hours.
I would not undertake this endeavour without solid evidence that it is the additional indexes that are slowing down the inserts/updates/deletes, and without testing that the inserts/updates/deletes actually perform significantly better without the extra indexes. Specifically, ensure that the culprit is not the other usual suspect: lock contention.
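The queueing behaviour is easy to see with a toy model. The sketch below is not a model of SQL Server's distribution agent; it is only simple arithmetic with made-up hourly numbers, where the subscriber can apply a fixed number of changes per hour and anything beyond that waits in the queue.

```python
def backlog_over_time(arrivals_per_hour, apply_capacity_per_hour):
    """Return the queued (not yet applied) change count after each hour."""
    backlog = 0
    history = []
    for arrivals in arrivals_per_hour:
        # Work carried over plus new work, minus what the subscriber can apply.
        backlog = max(0, backlog + arrivals - apply_capacity_per_hour)
        history.append(backlog)
    return history


# Hypothetical load: a two-hour spike followed by quiet hours.
spiky = backlog_over_time([100, 100, 500, 500, 100, 100, 100], 200)
# Hypothetical load: constantly above what the subscriber can apply.
sustained = backlog_over_time([300] * 5, 200)

print("spiky:    ", spiky)
print("sustained:", sustained)
```

With the spike the backlog peaks and then drains during the quiet hours; with the sustained load it grows every hour and never recovers, which is the swelling described above.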
Generally, all set-based operations (including updating indexes) are faster than non-set-based ones.
1,000 inserts will most probably be slower than one insert of 1,000 records.
You can batch the updates to the second database. This will, first, make the index updates faster and, second, smooth out the peaks.
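As a rough illustration of row-by-row versus batched inserts, here is a sketch using Python's built-in SQLite driver as a stand-in for SQL Server. The table and index names are hypothetical, and the timings depend entirely on the machine and engine, so treat the printed numbers as indicative only.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE row_by_row (id INTEGER, payload TEXT)")
conn.execute("CREATE TABLE batched (id INTEGER, payload TEXT)")
# Give both tables an index so every insert also has index work to do.
conn.execute("CREATE INDEX ix_row_by_row ON row_by_row (payload)")
conn.execute("CREATE INDEX ix_batched ON batched (payload)")

rows = [(i, f"item-{i}") for i in range(1000)]

# 1,000 separate inserts, each committed on its own.
start = time.perf_counter()
for row in rows:
    conn.execute("INSERT INTO row_by_row VALUES (?, ?)", row)
    conn.commit()
row_by_row_seconds = time.perf_counter() - start

# One batch of 1,000 rows in a single transaction.
start = time.perf_counter()
with conn:
    conn.executemany("INSERT INTO batched VALUES (?, ?)", rows)
batched_seconds = time.perf_counter() - start

print(f"row by row: {row_by_row_seconds:.4f}s, batched: {batched_seconds:.4f}s")
```

On a real server the gap is usually wider than an in-memory test suggests, because each separate commit also has to wait for the transaction log to be written.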
You could task schedule a bcp script to copy the data to the other DB.
You could also try transaction log shipping to update the read only db.
Don't forget to adjust the fill factor when you create your two databases. It should be low(er) on the database with frequent updates, and 100 on your "data warehouse"/read only database.
I run an online photography community and it seems that the site slows to a crawl on database access, sometimes hitting timeouts.
I consider myself to be fairly competent at writing SQL queries and designing tables, but am by no means a DBA... hence the problem.
Some background:
1. My site and SQL server are running on a remote host. I update the ASP.NET code from Visual Studio and the SQL via SQL Server Mgmt. Studio Express. I do not have physical access to the server.
2. All my stored procs (I think I got them all) are wrapped in transactions.
3. The main table is only 9400 records at this time. I add 12 new records to this table nightly.
4. There is a view on this main table that brings together data from several other tables into a single view.
5. Secondary tables are smaller records, but more of them. 70,000 in one, 115,000 in another. These are comments and ratings records for the items in #3.
6. Indexes are on the most needed fields. And I set them to Auto Recompute Statistics on the big tables.
When the site grinds to a halt, if I run code to clear the transaction log, update statistics, rebuild the main view, as well as rebuild the stored procedure to get the comments, the speed returns. I have to do this manually however.
Sadly, my users get frustrated at these issues and their participation dwindles.
So my question is... in a remote environment, what is the best way to set up and schedule a maintenance plan to keep my SQL db running at its peak?
My gut says you are doing something wrong. It sounds a bit like those stories you hear where some system cannot stay up unless you reboot the server nightly :-)
Something is wrong with your queries. The number of rows you have is almost always irrelevant to performance, and your database is very small anyway. I'm not too familiar with SQL Server, but I imagine it has some pretty sweet query analysis tools. I also imagine it has a way of logging slow queries.
It really sounds like you have a missing index. Sure, you might think you've added the right indexes, but until you verify they are being used, it doesn't matter. Maybe you think you have the right ones, but your queries suggest otherwise.
First, figure out how to log your queries. Odds are very good you've got a killer in there doing some sequential scan that an index would fix.
Second, you might have a bunch of small queries that are killing it instead. For example, you might have some "User" object that hits the database every time you look up a username from a user_id. Look for spots where you are querying the database a hundred times and replace them with a cache, even if that "cache" is nothing more than a private variable that gets wiped at the end of a request.
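A minimal sketch of that kind of per-request cache is below. It is written in Python rather than ASP.NET only to keep it short, and `RequestCache` and `load_username` are hypothetical names; `load_username` stands in for whatever call actually runs the query.

```python
class RequestCache:
    """Remembers lookups for the lifetime of one request, then is thrown away."""

    def __init__(self, fetch):
        self._fetch = fetch  # function that really hits the database
        self._values = {}

    def get(self, key):
        if key not in self._values:
            self._values[key] = self._fetch(key)  # only the first lookup queries
        return self._values[key]


query_count = 0


def load_username(user_id):
    """Stand-in for a database lookup; counts how often it is called."""
    global query_count
    query_count += 1
    return f"user{user_id}"


# One page render that needs the same few usernames over and over.
cache = RequestCache(load_username)
names = [cache.get(user_id) for user_id in [7, 7, 8, 7, 8]]
print(names, "queries run:", query_count)
```

Five lookups cost two queries here. Creating a fresh cache for each request sidesteps stale data, since nothing outlives the request.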
Bottom line is, I really doubt it is something mis-configured in SQL Server. I mean, if you had to reboot your server every night because the system ground to a halt, would you blame the system or your code? Same deal here... learn the tools provided by SQL Server, I bet they are pretty slick :-)
That all said, once you accept you are doing something wrong, enjoy the process. Nothing, to me, is more fun than optimizing slow database queries. It is simply amazing that you can take a query with a 10 second runtime and turn it into one with a 50ms runtime with a single, well-placed index.
You do not need to set up your maintenance tasks as a maintenance plan.
Simply create a stored procedure that carries out the maintenance tasks you wish to perform: index rebuilds, statistics updates, etc.
Then create a SQL Server Agent job that calls your stored procedure(s). The job can be configured to run on your desired schedule.
To create a job, use the procedure sp_add_job (in the msdb database), and add the step that calls your procedure with sp_add_jobstep.
To create a schedule, use the procedure sp_add_schedule, then attach it to the job with sp_attach_schedule.
I hope what I have detailed is clear and understandable but feel free to drop me a line if you need further assistance.
Cheers, John