I run an online photography community and it seems that the site draws to a crawl on database access, sometimes hitting timeouts.
I consider myself to be fairly compentent writing SQL queries and designing tables, but am by no means a DBA... hence the problem.
Some background:
My site and SQL server are running on a remote host. I update the ASP.NET code from Visual Studio and the SQL via SQL Server Mgmt. Studio Express. I do not have physical access to the server.
All my stored procs (I think I got them all) are wrapped in transactions.
The main table is only 9400 records at this time. I add 12 new records to this table nightly.
There is a view on this main table that brings together data from several other tables into a single view.
secondary tables are smaller records, but more of them. 70,000 in one, 115,000 in another. These are comments and ratings records for the items in #3.
Indexes are on the most needed fields. And I set them to Auto Recompute Statistics on the big tables.
When the site grinds to a halt, if I run code to clear the transaction log, update statistics, rebuild the main view, as well as rebuild the stored procedure to get the comments, the speed returns. I have to do this manually however.
Sadly, my users get frustrated at these issues and their participation dwindles.
So my question is... in a remote environment, what is the best way to setup and schedule a maintenance plan to keep my SQL db running at its peak???
My gut says you are doing something wrong. It sounds a bit like those stories you hear where some system cannot stay up unless you reboot the server nightly :-)
Something is wrong with your queries, the number of rows you have is almost always irrelevant to performance and your database is very small anyway. I'm not too familiar with SQL server, but I imagine it has some pretty sweet query analysis tools. I also imagine it has a way of logging slow queries.
I really sounds like you have a missing index. Sure you might think you've added the right indexes, but until you verify the are being used, it doesn't matter. Maybe you think you have the right ones, but your queries suggest otherwise.
First, figure out how to log your queries. Odds are very good you've got a killer in there doing some sequential scan that an index would fix.
Second, you might have a bunch of small queries that are killing it instead. For example, you might have some "User" object that hits the database every time you look up a username from a user_id. Look for spots where you are querying the database a hundred times and replace it with a cache--even if that "cache" is nothing more then a private variable that gets wiped at the end of a request.
Bottom line is, I really doubt it is something mis-configured in SQL Server. I mean, if you had to reboot your server every night because the system ground to a halt, would you blame the system or your code? Same deal here... learn the tools provided by SQL Server, I bet they are pretty slick :-)
That all said, once you accept you are doing something wrong, enjoy the process. Nothing, to me, is funner then optimizing slow database queries. It is simply amazing you can take a query with a 10 second runtime and turn it into one with a 50ms runtime with a single, well-placed index.
You do not need to set up your maintenance tasks as a maintenance plan.
Simply create a stored procedure that carries out the maintenance tasks you wish to perform, index rebuilds, statistics updates etc.
Then create a job that calls your stored procedure/s. The job can be configured to run on your desired schedule.
To create a job, use the procedure sp_add_job.
To create a schedule use the procedure sp_add_schedule.
I hope what I have detailed is clear and understandable but feel free to drop me a line if you need further assistance.
Cheers, John
Related
I am reviewing a SQL Server 2008 R2 instance with 30+ databases with the goal of moving to SQL Server 2014. In reviewing this I found a SQL job that a previous employee implemented. The job utilizes a set of scripts from this article https://www.sqlservercentral.com/forums/topic/indexing-views-1, to automatically create and drop all recommended indexes every half hour 24/7. When this was implemented the databases were roughly 40gb, but since have grown to over 1TB as we are a highly transactional company. With one of the databases running our primary ERP/ordering system. From everything I understand about indexing, this seems like a terrible idea as it could be creating and dropping indexes on very large tables. Is this a good practice, am I missing something?
Found this in a related post
"How to use it? Run AutoIndex.sql to install the SPs and sql agent
job. Upon every 30 minutes, the sql agent job will run the auto create
index and auto drop index scripts to make recommendations. Same
recommendation will not be stored multiple times, instead we just bump
up the count and change the latest recommendation time. You can view
the recommendations using the simple commands in the
viewrecommendations.sql. Look for the recommendations in the
recommendation table that have high counts, which means they have been
repetitively recommended thus are more valuable. You can also look at
the initial recommendation time and the last recommendation time to
get a sense of the freshness and the time range this recommendation is
valid for. After you made a decision to implement a recommendation,
simply run execute_recommendation with the recommendation id and the
recommendation will be implemented automatically." Thank U Snehal
Link Here
According to that user the script you're proc you have in your system should just aggregate index recommendations over time and allow you too see what indexes are constantly being recommended.
I believe the important distinction here is SQL doesn't log how many times it suggest a particular index so you may get a suggested index based on a one off query, which probably isn't something you want to implement. Instead you run this for a period and see what's being hit frequently and create those indexes.
I apologize in advance for not having all of the specifics available, but the machine is building an index probably for a good while still and is almost completely unresponsive.
I've got a table on SQL Server 2005 with a good number of columns, maybe 20, but a mammoth number of rows (tens, more likely hundreds of millions). In order to simplify the amount of JPA work I'd need to do to access it, I created a view that contained the bits I was interested in. The view was created as:
SELECT bigtable.ID, bigtable.external_identification, mediumtable.hostname,
CONVERT(VARCHAR, bigtable.datefield, 121) AS datefield
FROM schema.bigtable JOIN schema.mediumtable ON bigtable.joinID = mediumtable.ID;
When I want to select from the view, I do:
SELECT * FROM vwTable WHERE external_identification = 'some string';
This works just fine in SQL Management Studio. The external_identification column has a non-unique, non-clustered index in bigtable. This also worked just fine on our remotely executing Java program in our test environment. Now that we're a day or two away from production, the code has been changed a bit (although the fundamental JPA NamedQuery is still straightforward), but we have a new SQLServer installation on new hardware; the test version was on a 32-bit single core machine, the new hardware is 64-bit multi-core.
Whenever I try to run the code that uses this view on the new hardware, it either hangs indefinitely on the first call of this query or times out if I have a timeout specified. After doing some digging, something like:
SELECT status, command, wait_type, last_wait_type FROM sys.dm_exec_requests;
confirmed that the query was running, but showed it in the state:
suspended, SELECT, CXPACKET, CXPACKET
for as long as I cared to wait for it. Whenever I ran the exact same query from within the Management Studio, it completed immediately. So I did some research, and found out this is due to waiting on some kind of concurrent operation to start/finish. In an attempt to circumvent that, I set the server-wide MAXDOP to 1 (disabled concurrency). After that, the query still hangs, but the sys.dm_exec_requests would show:
suspended, SELECT, PAGEIOLATCH_SH, PAGEIOLATCH_SH
This indicates that it's some sort of HD/scanning issue. While certainly the machine is less responsive than I'd expect for newer hardware, I wouldn't expect this query (even over the view) to require much scanning, since the column I'm searching by is indexed in the underlying table and it works if I run it locally. But just because I'm out of ideas and under the gun, I'm adding indexes to the view; first I have to add the unique clustered index (over ID) before I can attempt to add the non-unique non-clustered index over external_identification.
I'm the only one using this database; when I select from sys.dm_exec_requests the only two results are the query I'm actively inspecting and the select from sys.dm_exec_requests query. So it's not like it's under legitimately heavy, or even at all concurrent, load.
But I suspect I'm grasping at straws. I'm no DBA, and every time I have to interact with SQL Server outside of querying it it baffles my intuitions. Does anyone have any ideas why a query executed remotely would immediately go into a suspended state while the same query locally would execute immediately?
Wow, this one caught me straight out of left field. It turns out that by default, the MSSQL JDBC driver sends its String datatypes as Unicode, which the table/view might not be prepared to handle specifically. In our case, the columns and indexes were not, so MSSQL would perform a full table scan for each lookup.
In our test environment, the table was small enough that this didn't matter, so I was tricked into thinking it worked fine. In retrospect, I'm glad it didn't -- I can't stand it when computers give the illusion of inconsistency.
When I added this little parameter to the end of my JDBC connection string:
jdbc:sqlserver://[IP]:1433;databaseName=[db];sendStringParametersAsUnicode=false
things immediately and magically started working. Sorry for the slightly misleading question (I barely even mentioned JPA), but I had no idea what the cause was and really did believe it was something SQL Server side. Task Manager didn't report heavy CPU/Memory usage while the query was suspended, so I just thought it was idling even though it was really under heavy disk usage.
More info about MSSQL JDBC and Unicode can be found where I stumbled across the solution, at http://server.pramati.com/blog/2010/06/02/perfissues-jdbcdrivers-mssqlserver/ . Thanks, Ed, for that detailed shot in the dark -- it may not have been the problem, but I certainly learned a lot (and fast!) about MSSQL's gritty parts!
It is likely that the query run in SSMS and by your application are using different query plans - from the wait types you're seeing in dm_exec_requests it sounds like the plan created for the application is doing a table scan where the plan for SSMS is using an index seek.
This is possible because the SSMS and application database connections likely use different connection options, some of which are used as a key to the database plan cache.
You can find out which options your application is using by running a default SQL server profiler trace against the server; the first command after the connection is created will be a number of SET... options:
SET DATEFORMAT dmy
SET ANSI_NULLS ON
...
I suspect this list will be different between your application and your SSMS connection - a common candidate is SET ARITHABORT {ON|OFF}, since that forms part of the key of the cached plan.
If you run the SET... commands in an SSMS window before executing the query, the same (bad) plan as is being used by the application should then be picked up.
Assuming this demonstrates the problem, the next step is to work out how to prevent the bad plan getting into cache. It's difficult to give generic instructions about how to do this, since there are a few possible causes.
It's a bit of a scattergun approach (there are other more targetted ways to attempt to resolve this problem but they require more detailed understanding of the issue that I have now), but one thing to try is to add OPTION (RECOMPILE) to the end of your query - this forces a new plan to be generated for every execution, and should prevent the bad plan being reused:
SELECT * FROM vwTable WHERE external_identification = 'some string' OPTION (RECOMPILE);
Assuming you can replicate the bad performance is SSMS using the steps above, you should be able to test this there.
Beware that this can have negative performance consequences if the query is executed very frequently (since each recompilation requires CPU) - this depends on the workload of your application and will need testing.
A couple of other thoughts:
Check the schemas between the test and production systems; this might be as simple as a missing index from one of the tables in the production database, although given that SSMS queries perform OK this is unlikely.
You should re-enable parallelism by taking the server-wide MAXDOP=1 off, since this will limit the performance of your system overall. The problem is almost certainly the query plan, not parallelism
You also need to beware of the consequences of adding indexes to the view - doing so effectively materialises the view, which will (given the size of the table) require a lot of storage overhead - the indexes will also need to be maintained when INSERT/UPDATE/DELETE statements take place on the base table. Indexing the view is probably unnecessary given that (from SSMS) you know it's possible for the query to perform.
I have a SP that has been worked on my 2 people now that still takes 2 minutes or more to run. Is there a way to have these pre run and stored in cache or somewhere else so when my client needs to look at this data in a web browser he doesn't want to hang himself or me?
I am no where near a DBA so I am kind of at the mercy of who I hire to figure this out for me, so having a little knowledge up front would really help me out.
If it truly takes that long to run, you could schedule the process to run using SQL Agent, and have the output go to a table, then change the web application to read the table rather than execute the stored procedure. You'd have to decide how often to run the refresh, and deal with the requests that occur while it is being refreshed, but that can be dealt with as well by having two output files, one live and one for the latest refresh.
But I would take another look at the procedure, look at the execution plan and see where it is slow, make sure it is not doing full table scans.
Preferred solutions in this order:
Analyze the query and optimize accordingly
Cache it in the application (you can use httpRuntime.Cache (even if not asp.net application)
Cache SPROC results in a table in the DB and then add triggers to invalidate the cache (delete the table) so a a call to the SPROC would first look to see if there is any data in the cache table. If none, run SPROC and store the result in the cache table, if so, return the data from that table. The triggers on the "source" tables for the SPROC would just delete * from CacheTable to "clear the cache" (depending on what you sproc is doing and its dependencies, you may even be able to partially update the cache table based on the trigger, but all of this quickly gets difficult to maintain...but sometimes you gotta do what you gotta do...This approach will allow the cache table to update itself as needed. You will always have the latest data and the SPROC will only run when needed.
Try "Analyze query in database engine tuning advisor" from the Query menu.
I usually script the procedure to a new window, take out the query definition part and try different combinations of temp tables, regular tables and table variables.
You could cache the result set in the application as opposed to the database, either in memory by keeping an instance of the datatable around, or by serializing it to disk. How many rows does it return?
Is it too long to post the code here?
OK first things first, indexes:
What indexes do you have on the tables and is the execution plan using them?
Do you have indexes on all the foreign key fields?
Second, does the proc use any of the following performance killers:
a cursor
a subquery
a user-defined function
select *
a search criteria that starts with a wildcard
third
Can the where clause be rewritten to be sargeable? There is more than one way to write almost everything and some ways are better performers than others.
I suggest you buy your developers some books on performance tuning.
Likely your proc can be fixed, but without seeing the code, it is hard to guess what the problems might be.
You know the one I am talking about.
We have all been there at some point. You get that awful feeling of dread and the realisation of oh my god did that actually just happen.
Sure you can laugh about it now though, right, so go on and share your SQL Server mishaps with us.
Even better if you can detail how you resolved your issue so that we can learn from our mistakes together.
So in order to get the ball rolling, I will go first……..
It was back in my early years as a junior SQL Server Guru. I was racing around Enterprise Manager, performing a few admin duties. You know how it is, checking a few logs, ensuring the backups ran ok, a little database housekeeping, pretty much going about business on autopilot and hitting the enter key on the usual prompts that pop up.
Oh wait, was that a “Are you sure you wish to delete this table” prompt. Too late!
Just to confirm for any aspiring DBA’s out there, deleting a production table is a very very bad thing!
Needless to say a world record was promptly set for the fastest database restore to a new database, swiftly followed by a table migration, oh yeah. Everyone else was none the wiser of course but still a valuable lesson learnt. Concentrate!
I suppose everyone has missed the WHERE clause off a DELETE or UPDATE at some point...
Inserted 5 million test persons into a production database. The biggest mistake in my opinion was to let me have write access to the production db in the first place. :P Bad dba!
My biggest SQL Server mistake was assuming it was as capable as Oracle when it came to concurrency.
Let me explain.
When it comes to transactional isolation level in SQL Server you have two choices:
Dirty reads: transactions can see uncommitted data (from other transactions); or
Selects block on uncommitted updates.
I believe these come from ANSI SQL.
(2) is the default isolation level and (imho) the lesser of two evils. But it's a huge problem for any long-running process. I had to do a batch load of data and could only do it out of hours because it killed the website while it was running (it took 10-20 minutes as it was inserting half a million records).
Oracle on the other hand has MVCC. This basically means every transaction will see a consistent view of the data. They won't see uncommitted data (unless you set the isolation level to do that). Nor do they block on uncommitted transactions (I was stunned at the idea an allegedly enterprise database would consider this acceptable on a concurrency basis).
Suffice it to say, it was a learning experience.
And ya knkow what? Even MySQL has MVCC.
I changed all of the prices to zero on a high-volume, production, eCommerce site. I had to take the site down and restore the DB from backup.. VERY UGLY.
Luckily, that was a LOOONG time ago.
forgetting to highlight the WHERE clause when updating or deleting
scripting procs and checking drop dependent objects first and running this on production
I was working on the payment system on a large online business. Multi-million Euro business.
Get a script from a colleague with a few small updates.
Run it on production.
Get an error report 30 minutes later from helpdesk, complaining about no purchases last 30 minutes.
Discover that all connections are waiting on a table lock to be released
Discover that the script from my colleague started with an explicit BEGIN TRANSACTION and expected me to manually type COMMIT TRANSACTION at the end.
Explain to boss why 30 minutes of sales were lost.
Blame myself for not reading the script documentation properly.
Starting off a restore from last week onto a production instance instead of the development instance. Not a good morning.
I've seen plenty other people miss a WHERE clause.
Myself, I always type the WHERE clause first, and then go back to the start of the line and type in the rest of the query :)
Thankfully we only ever make one cock-up before you realise that using transactions really is very, very trivial. I've amended thousands of records on accident before, luckily roll-back is there...
If you're querying the live environment without having thoroughly tested your scripts then I think embarrassing should really be foolhardy or perhaps unprofessional.
One of my favorites happened in an automated import when the client changed the data structure without telling us first. The Social Security number column and the amount of money we were to pay the person got switched. Luckily we found it before the system tried to pay someone his social security number. We now have checks in automated imports that look for funny data before running and stop it if the data seems odd.
Like zabzonk said, forgot the WHERE clause on an update or two in my day.
We had an old application that didn't handle syncing with our HR database for name updates very efficiently, mainly due to the way they keyed in changes to titles. Anyway, a certain woman got married, and I had to write a database change request to update her last name, I forgot the where clause and everyone in said application's name was now Allison Smith.
Columns are nullable, and parameter values fail to retrieve the correct information...
The biggest mistake was giving developers "write" access to the production DB
many DEV and TEST records were inserted / overwritten and backup- ed too production until it was wisely suggested (by me!) to only allow read access!
Sort of SQL-server related. I remember learning about how important it is to always dispose of a SqlDataReader. I had a system that worked fine in development, and happened to be running against the ERP database. In production, it brought down the database because I assumed it was enough to close SqlConnection, and had hundreds, if not thousands of open connections.
At the start of my co-op term I ended up expiring access to everyone who used this particular system (which was used by a lot of applications in my Province). In my defense, I was new to SQL Server Management Studio and didn't know that you could 'open' tables and edit specific entries with a sql statement.
I expired all the user access with a simple UPDATE statement (access to this application was given by a user account on the SQL box as well as a specific entry in an access table) but when I went to highlight that very statement and run it, I didn't include the WHERE clause.
A common mistake I'm told. The quick fix was unexpire everyones accounts (including accounts that were supposed to be expired) until the database could be backed up. Now I either open tables and select specific entries with SQL or I wrap absolutely everything inside a transaction followed by an immediate rollback.
Our IT Ops decided to upgrade to SQL 2005 from SQL 2000.
The next Monday, users were asking why their app didn't work. Errors like:
DTS Not found etc.
This lead to a nice set of 3 Saturdays in the office rebuilding the packages in SSIS with a good overtime package :)
Not, exactly a "mistake" but back when I was first learning PHP and MYSQL I would spend hours daily, trying to figure out why my code was not working, not knowing that I had the wrong password/username/host/database credentials to my SQL database. You cant believe how much time I wasted on that, and to make it even worse this was not a one time incident. But LOL, its all good, it builds character.
I once, and only once, typed something similar to the following:
psql> UPDATE big_table SET foo=0; WHERE bar=123
I managed to fix the error rather quickly. Since that and another error my updates always start out as:
psql> UPDATE table SET WHERE foo='bar';
Much easier to avoid errors that way.
I worked with a junior developer once who got confused and called "Drop" instead of "Delete" on a table.
It was a long night working to get the backup restored...
Edit: I should have mentioned, this was in a production environment, and the table was full of data...
This was before the days when Google could help. I didn't encounter this problem with SQL Server, but with it's ugly older cousin Sybase.
I updated a table schema in a production environment. Not understanding at the time that stored procedures that use SELECT * must be recompiled to pickup new fields, I proceeded to spend the next eight hours trying to figure out why the stored procedure that performed a key piece of work kept failing. Only after a server reboot did I clue in.
Losing thousands of dollars and hundreds of (end user) man-hours at your flagship customer's site is quite an educational experience. Highly recommended!!
A healthy amount of years ago I was working on a clients site, that had a nice script to clear the dev environment of all orders, carts and customers.. to ease up testing, so I of course put the damn script on the productions server query analyzer and ran it.
Took some 5-6 minutes to run too, I was bitching about how slow the dev server was until the number of deleted rows came up. :)
Fortunately I had just ran a full backup since I was about to do an installation..
Beyond the typical where clause error. Ran a drop on an incorrect DB and thus had to run a restore. Now I triple check my server name. Thankfully I had a good backup.
I set the maximum server memory to 0. I was thinking at the time that would automatically tell SQL server to use all available memory (It was early). No such luck. SQL server decided to use only 16 MB and I had to connect in single user mode to get the setting changed back.
Hit "Restore" instead of "Backup" in Management Studio.
There is a SqlServer2000 Database we have to update during weekend.
It's size is almost 10G.
The updates range from Schema changes, primary keys updates to some Million Records updated, corrected or Inserted.
The weekend is hardly enough for the job.
We set up a dedicated server for the job,
turned the Database SINGLE_USER
made any optimizations we could think of: drop/recreate indexes, relations etc.
Can you propose anything to speedup the process?
SQL SERVER 2000 is not negatiable (not my decision). Updates are run through custom made program and not BULK INSERT.
EDIT:
Schema updates are done by Query analyzer TSQL scripts (one script per Version update)
Data updates are done by C# .net 3.5 app.
Data come from a bunch of Text files (with many problems) and written to local DB.
The computer is not connected to any Network.
Although dropping excess indexes may help, you need to make sure that you keep those indexes that will enable your upgrade script to easily find those rows that it needs to update.
Otherwise, make sure you have plenty of memory in the server (although SQL Server 2000 Standard is limited to 2 GB), and if need be pre-grow your MDF and LDF files to cope with any growth.
If possible, your custom program should be processing updates as sets instead of row by row.
EDIT:
Ideally, try and identify which operation is causing the poor performance. If it's the schema changes, it could be because you're making a column larger and causing a lot of page splits to occur. However, page splits can also happen when inserting and updating for the same reason - the row won't fit on the page anymore.
If your C# application is the bottleneck, could you run the changes first into a staging table (before your maintenance window), and then perform a single update onto the actual tables? A single update of 1 million rows will be more efficient than an application making 1 million update calls. Admittedly, if you need to do this this weekend, you might not have a lot of time to set this up.
What exactly does this "custom made program" look like? i.e. how is it talking to the data? Minimising the amount of network IO (from a db server to an app) would be a good start... typically this might mean doing a lot of work in TSQL, but even just running the app on the db server might help a bit...
If the app is re-writing large chunks of data, it might still be able to use bulk insert to submit the new table data. Either via command-line (bcp etc), or through code (SqlBulkCopy in .NET). This will typically be quicker than individual inserts etc.
But it really depends on this "custom made program".