Limiting the number of updated/deleted rows in SQL Server Management Studio - sql-server

It is very easy to make mistakes with UPDATE and DELETE statements in SQL Server Management Studio. You can easily delete far more than you intend if you make a mistake in the WHERE condition or, even worse, wipe out the whole table if you mistakenly write an expression that always evaluates to TRUE.
Is there any way to disallow queries that affect a large number of rows from within SQL Server Management Studio? I know there is a feature like that in MySQL Workbench, but I couldn't find any in SQL Server Management Studio.

No.
It is your responsibility to ensure that:
Your data is properly backed up, so you can restore your data after making inadvertent changes.
You are not writing a new query from scratch and executing it directly on a production database without testing it first.
You execute your query in a transaction, and review the changes before committing the transaction.
You know how to properly filter your query to avoid issuing a DELETE/UPDATE statement against your entire table. If in doubt, always issue a SELECT * or a SELECT COUNT(*) statement with the same WHERE clause first, to see which records will be affected (see the sketch after this list).
You don't rely on some silly feature in the front-end that might save you at times, but that will completely screw you over at other times.
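For example, a minimal pattern along those lines (all table and column names here are placeholders): preview the affected rows first, then run the change inside a transaction you can still back out of.
-- Preview how many rows the filter matches (names are placeholders)
SELECT COUNT(*) FROM dbo.Customers WHERE Region = 'North';

-- Then make the change inside a transaction
BEGIN TRAN;

UPDATE dbo.Customers SET Discount = 0.10 WHERE Region = 'North';

-- Re-check the data, then either:
COMMIT TRAN;
-- or ROLLBACK TRAN; if something looks wrong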

A lot of good points have already been made. Just one tiny addition: I have created a solution that prevents the accidental execution of DELETE or UPDATE statements without any WHERE condition at all. It is implemented as the "Fatal actions guard" in my add-in, SSMSBoost.

(My comments were getting rather unwieldy)
One simple option if you are uncertain is to BEGIN TRAN, do the update, and if the rows-affected count is significantly different from what you expected, ROLLBACK; otherwise, do a few checks, e.g. SELECTs to ensure just the intended data was updated, and then COMMIT. The caveat here is that this will lock rows until you commit or roll back, and may trigger lock escalation to a table lock if a large number of rows are updated, so you need to have the checking scripts planned in advance.
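A rough sketch of that pattern (the UPDATE statement and the row-count threshold are placeholders):
DECLARE @affected INT;

BEGIN TRAN;

UPDATE dbo.Orders SET Status = 'Archived'
WHERE OrderDate < '20050101';        -- placeholder statement and filter

SET @affected = @@ROWCOUNT;

IF @affected > 5000                  -- hypothetical "far more than expected" threshold
    ROLLBACK TRAN;
ELSE
BEGIN
    -- run verification SELECTs against the changed rows here, then
    COMMIT TRAN;
END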
That said, in any half-serious system, no one, not even senior DBAs, should really be executing direct ad-hoc DML statements on a prod DB (and arguably the formal UAT DB too) - this is what tested applications are meant for (or tested, verified patch scripts executed only after change-control processes have been followed).
In less formal dev environments, does it really matter if things get broken? In fact, if you are an advocate of Chaos Monkey, having juniors break your data might be a good thing in the long run - it will ensure that your processes for scripting, migration, static data deployment and integrity checking are all in good order.

My suggestion is to disable auto-commit (i.e. enable implicit transactions), so that you can review your changes and commit them explicitly before ending the session.
For more details, see the MSDN documentation on SET IMPLICIT_TRANSACTIONS:
http://msdn.microsoft.com/en-us/library/ms187807.aspx
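In practice that looks something like this (the UPDATE is just a placeholder):
SET IMPLICIT_TRANSACTIONS ON;

-- The first data-modifying statement now opens a transaction automatically
UPDATE dbo.Customers SET City = 'Oslo' WHERE CustomerID = 42;   -- placeholder

-- Review the change (e.g. SELECT the affected row), then explicitly:
COMMIT TRAN;
-- or ROLLBACK TRAN; to discard it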

Related

Why Would Remote Execution of a Query Cause it to be Suspended?

I apologize in advance for not having all of the specifics available, but the machine will probably be building an index for a good while still and is almost completely unresponsive.
I've got a table on SQL Server 2005 with a good number of columns, maybe 20, but a mammoth number of rows (tens, more likely hundreds of millions). In order to simplify the amount of JPA work I'd need to do to access it, I created a view that contained the bits I was interested in. The view was created as:
SELECT bigtable.ID, bigtable.external_identification, mediumtable.hostname,
CONVERT(VARCHAR, bigtable.datefield, 121) AS datefield
FROM schema.bigtable JOIN schema.mediumtable ON bigtable.joinID = mediumtable.ID;
When I want to select from the view, I do:
SELECT * FROM vwTable WHERE external_identification = 'some string';
This works just fine in SQL Server Management Studio. The external_identification column has a non-unique, non-clustered index in bigtable. This also worked just fine from our remotely executing Java program in our test environment. Now that we're a day or two away from production, the code has been changed a bit (although the fundamental JPA NamedQuery is still straightforward), but we have a new SQL Server installation on new hardware; the test version was on a 32-bit single-core machine, the new hardware is 64-bit multi-core.
Whenever I try to run the code that uses this view on the new hardware, it either hangs indefinitely on the first call of this query or times out if I have a timeout specified. After doing some digging, something like:
SELECT status, command, wait_type, last_wait_type FROM sys.dm_exec_requests;
confirmed that the query was running, but showed it in the state:
suspended, SELECT, CXPACKET, CXPACKET
for as long as I cared to wait for it. Whenever I ran the exact same query from within Management Studio, it completed immediately. So I did some research, and found out this wait relates to parallel query execution (waiting on some concurrent operation to start/finish). In an attempt to circumvent that, I set the server-wide MAXDOP to 1 (disabling parallelism). After that, the query still hangs, but sys.dm_exec_requests shows:
suspended, SELECT, PAGEIOLATCH_SH, PAGEIOLATCH_SH
This indicates that it's some sort of HD/scanning issue. While certainly the machine is less responsive than I'd expect for newer hardware, I wouldn't expect this query (even over the view) to require much scanning, since the column I'm searching by is indexed in the underlying table and it works if I run it locally. But just because I'm out of ideas and under the gun, I'm adding indexes to the view; first I have to add the unique clustered index (over ID) before I can attempt to add the non-unique non-clustered index over external_identification.
I'm the only one using this database; when I select from sys.dm_exec_requests the only two results are the query I'm actively inspecting and the select from sys.dm_exec_requests query. So it's not like it's under legitimately heavy, or even at all concurrent, load.
But I suspect I'm grasping at straws. I'm no DBA, and every time I have to interact with SQL Server beyond simply querying it, it baffles my intuition. Does anyone have any ideas why a query executed remotely would immediately go into a suspended state while the same query executed locally completes immediately?
Wow, this one caught me straight out of left field. It turns out that, by default, the MSSQL JDBC driver sends its String parameters as Unicode (NVARCHAR), which the table/view may not be set up to handle. In our case, the columns and their indexes were plain VARCHAR, so MSSQL would perform a full table scan for each lookup.
In our test environment, the table was small enough that this didn't matter, so I was tricked into thinking it worked fine. In retrospect, I'm glad it didn't -- I can't stand it when computers give the illusion of inconsistency.
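To illustrate the effect in T-SQL terms (assuming, as in our case, that external_identification is a plain VARCHAR column):
-- What the driver effectively sent with sendStringParametersAsUnicode=true:
-- the NVARCHAR value forces a conversion of the VARCHAR column, which can prevent an index seek
SELECT * FROM vwTable WHERE external_identification = N'some string';

-- With the option set to false, a plain VARCHAR comparison can use the index
SELECT * FROM vwTable WHERE external_identification = 'some string';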
When I added this little parameter to the end of my JDBC connection string:
jdbc:sqlserver://[IP]:1433;databaseName=[db];sendStringParametersAsUnicode=false
things immediately and magically started working. Sorry for the slightly misleading question (I barely even mentioned JPA), but I had no idea what the cause was and really did believe it was something SQL Server side. Task Manager didn't report heavy CPU/Memory usage while the query was suspended, so I just thought it was idling even though it was really under heavy disk usage.
More info about MSSQL JDBC and Unicode can be found where I stumbled across the solution, at http://server.pramati.com/blog/2010/06/02/perfissues-jdbcdrivers-mssqlserver/ . Thanks, Ed, for that detailed shot in the dark -- it may not have been the problem, but I certainly learned a lot (and fast!) about MSSQL's gritty parts!
It is likely that the query run in SSMS and by your application are using different query plans - from the wait types you're seeing in dm_exec_requests it sounds like the plan created for the application is doing a table scan where the plan for SSMS is using an index seek.
This is possible because the SSMS and application database connections likely use different connection options, some of which are used as a key to the database plan cache.
You can find out which options your application is using by running a default SQL Server Profiler trace against the server; the first commands after the connection is created will be a number of SET... options:
SET DATEFORMAT dmy
SET ANSI_NULLS ON
...
I suspect this list will be different between your application and your SSMS connection - a common candidate is SET ARITHABORT {ON|OFF}, since that forms part of the key of the cached plan.
If you run the SET... commands in an SSMS window before executing the query, the same (bad) plan as is being used by the application should then be picked up.
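For example, if ARITHABORT turns out to be the option that differs (SSMS sets it ON by default, while many client libraries leave it OFF), you could reproduce the application's plan in SSMS like this:
SET ARITHABORT OFF;
SELECT * FROM vwTable WHERE external_identification = 'some string';
SET ARITHABORT ON;   -- restore the SSMS default afterwards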
Assuming this demonstrates the problem, the next step is to work out how to prevent the bad plan getting into cache. It's difficult to give generic instructions about how to do this, since there are a few possible causes.
It's a bit of a scattergun approach (there are other, more targeted ways to attempt to resolve this problem, but they require a more detailed understanding of the issue than I have now), but one thing to try is to add OPTION (RECOMPILE) to the end of your query - this forces a new plan to be generated for every execution, and should prevent the bad plan being reused:
SELECT * FROM vwTable WHERE external_identification = 'some string' OPTION (RECOMPILE);
Assuming you can replicate the bad performance in SSMS using the steps above, you should be able to test this there.
Beware that this can have negative performance consequences if the query is executed very frequently (since each recompilation requires CPU) - this depends on the workload of your application and will need testing.
A couple of other thoughts:
Check the schemas between the test and production systems; this might be as simple as a missing index from one of the tables in the production database, although given that SSMS queries perform OK this is unlikely.
You should re-enable parallelism by removing the server-wide MAXDOP=1 setting, since it will limit the performance of your system overall. The problem is almost certainly the query plan, not parallelism.
You also need to beware of the consequences of adding indexes to the view - doing so effectively materialises the view, which will (given the size of the table) require a lot of storage overhead - the indexes will also need to be maintained when INSERT/UPDATE/DELETE statements run against the base tables. Indexing the view is probably unnecessary given that (from SSMS) you know it's possible for the query to perform well.
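For reference, this is roughly what indexing the view would involve (a sketch using the question's table and column names; the CONVERT column is omitted here for simplicity, since indexed views impose determinism and schema-binding requirements), which is also where the storage and maintenance overhead comes from:
-- The view must be recreated WITH SCHEMABINDING before it can be indexed
CREATE VIEW dbo.vwTable WITH SCHEMABINDING AS
SELECT b.ID, b.external_identification, m.hostname
FROM [schema].bigtable AS b
JOIN [schema].mediumtable AS m ON b.joinID = m.ID;
GO
-- Unique clustered index first (this materialises the view's rows)...
CREATE UNIQUE CLUSTERED INDEX IX_vwTable_ID ON dbo.vwTable (ID);
-- ...then the non-clustered index used by the lookup
CREATE NONCLUSTERED INDEX IX_vwTable_ExtId ON dbo.vwTable (external_identification);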

SQL Server 2008 Auto Backup

We want to have our test servers' databases refreshed from our production server databases on a nightly basis, to ensure we're developing against the most recent data. However, we want to ensure that any functions, stored procedures, etc. that we're currently working on in the development environment don't get overwritten by the backup/restore process. What we were thinking of doing was having a pre-backup program that saves objects selected by our developers and a post-backup program that adds them back in after the process is complete.
I was wondering what other developers have been doing in a situation like this. Is there an existing tool that can run automatically on a daily basis and allow us to specify objects not to overwrite (without requiring the attention of a sysadmin to run it daily)?
All the objects in our databases are maintained in code - tables, views, triggers, stored procedures, everything - if we expect to find it in the database then it should be in DDL in code that we can run. Actual schema changes are versioned - so there's a table in the database that says this is schema version "n", and if this is not the current version (according to the update code) then we make the necessary changes.
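A rough sketch of that versioning mechanism (all names and version numbers here are invented for illustration):
-- Single-row table recording which schema version is deployed
CREATE TABLE dbo.SchemaVersion (Version INT NOT NULL);

-- Each upgrade step only runs if the database is still at the previous version
IF (SELECT Version FROM dbo.SchemaVersion) = 4
BEGIN
    ALTER TABLE dbo.Customer ADD MiddleName VARCHAR(50) NULL;   -- example change
    UPDATE dbo.SchemaVersion SET Version = 5;
END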
We endeavour to separate out triggers and views (we don't do much with stored procedures and functions, although we probably should) with drop-and-recreate code that is valid for the current schema version. Accordingly, it should be "safe" to drop and recreate anything that isn't a table, although there will be sequencing issues with both the drop and the create if there are dependencies between objects. The nice thing about this generally is that we can confidently bring a schema from new to current and have confidence that any instance of the schema is consistent.
Expanding this to your case: if you have the ability to run the schema update code, including the code to recreate all the database objects according to the current definitions, then your problem should substantially go away... backup, restore, run the schema maintenance logic. This would have the further benefit that you can introduce schema (table) changes on the dev servers and still keep the same update logic.
I know this isn't a completely generic solution, and it's worth noting that it probably works better with a database per developer (I'm an old-fashioned programmer, so I see all problems as having code-based solutions (-:), but as a general approach I think it has considerable merit because it gives you a consistent mechanism to address a number of problems.

SQL Server UNDO

I am a part-time developer (full-time student) and the company I am working for uses SQL Server 2005. The thing I find strange about SQL Server is that if you run a script that involves inserting, updating, etc., there isn't any real way to undo it except for a rollback or using transactions.
You might ask what's wrong with those two options. Well, if for example someone runs an UPDATE statement and forgets to put in a WHERE clause, you suddenly find yourself with 13k rows updated and every client in that table named 'bob'. Now you have the wrath of 13k Bobs to face, since that "someone" forgot to use a transaction, and if you roll the database back you are going to undo critical changes that were needed in other fields.
In my studies I have used Oracle. In Oracle you can run the script first and then commit it if you find there aren't any mistakes. I was wondering if there is something I missed in SQL Server, since I am still relatively new to the working developer world.
I don't believe you missed anything. Using transactions to protect against these kinds of errors is the best mechanism, and it is the same mechanism Oracle uses to protect the end user. The difference is that Oracle implicitly begins a transaction for you, whereas in SQL Server you must do it explicitly.
SET IMPLICIT_TRANSACTIONS is what you are probably looking for.
I'm no database/sql server expert and I'm not sure if this is what you're looking for, but there is the possibility to create snapshots of a database. A snapshot allows you to revert the database to that state at any time.
Check this link for more information:
http://msdn.microsoft.com/en-us/library/ms175158.aspx
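A rough example of creating and reverting to a snapshot (database, logical file and path names are placeholders; snapshots also require an edition that supports them):
-- Create a snapshot before running the risky script
-- (NAME must match the logical data file name of MyDb)
CREATE DATABASE MyDb_Snapshot
ON (NAME = MyDb_Data, FILENAME = 'C:\Snapshots\MyDb_Snapshot.ss')
AS SNAPSHOT OF MyDb;

-- If the script goes wrong, revert the whole database to the snapshot
RESTORE DATABASE MyDb FROM DATABASE_SNAPSHOT = 'MyDb_Snapshot';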
I think transactions work well. You could rollback the DB (to a previous backup or point in the log), but I think transactions are much simpler.
How about this: never make changes to a production database that have not 1st been tested on your development server, and always make a backup before trying anything that is un-proven.
From what I understand, SQL Server 2008 added an Auditing feature that logs all changes made by users to the various databases and also has the option to roll them back after the fact.
Again, this is from what I've read or overheard from our DBA, but might be worth looking into.
EDIT: After looking into it, it appears to only give the ability to roll back schema changes, not data modifications (DDL triggers).
If I am doing something with any risk in SQL Server, I write the script like this:
BEGIN TRAN
Insert .... whatever
Update .... whatever
-- COMMIT
The last line is a comment on purpose: I first run the lines above it, make sure there's no error, and then highlight just the word COMMIT and execute that. This works because in Management Studio you can select part of the T-SQL and execute just the selected portion.
There are a couple of advantages to this approach: implicit transactions work too, but they're not the default for SQL Server, so you have to remember to turn them on or set the options to do so. Also, if they're on all the time, I find it's easy for people to "forget" and leave uncommitted transactions open, which can block others. That's mainly because it's not the default behavior and SQL Server folks aren't used to it.

SpeedUp Database Updates

There is a SQL Server 2000 database we have to update over the weekend.
Its size is almost 10 GB.
The updates range from schema changes and primary key updates to a few million records being updated, corrected or inserted.
The weekend is hardly enough for the job.
We set up a dedicated server for the job,
switched the database to SINGLE_USER, and
made any optimizations we could think of: dropping/recreating indexes, relations, etc.
Can you propose anything to speed up the process?
SQL Server 2000 is not negotiable (not my decision). Updates are run through a custom-made program, not BULK INSERT.
EDIT:
Schema updates are done with Query Analyzer T-SQL scripts (one script per version update).
Data updates are done by a C# .NET 3.5 app.
Data comes from a bunch of text files (with many problems) and is written to the local DB.
The computer is not connected to any network.
Although dropping excess indexes may help, you need to make sure that you keep those indexes that will enable your upgrade script to easily find those rows that it needs to update.
Otherwise, make sure you have plenty of memory in the server (although SQL Server 2000 Standard is limited to 2 GB), and if need be pre-grow your MDF and LDF files to cope with any growth.
If possible, your custom program should be processing updates as sets instead of row by row.
EDIT:
Ideally, try and identify which operation is causing the poor performance. If it's the schema changes, it could be because you're making a column larger and causing a lot of page splits to occur. However, page splits can also happen when inserting and updating for the same reason - the row won't fit on the page anymore.
If your C# application is the bottleneck, could you run the changes first into a staging table (before your maintenance window), and then perform a single update onto the actual tables? A single update of 1 million rows will be more efficient than an application making 1 million update calls. Admittedly, if you need to do this this weekend, you might not have a lot of time to set this up.
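A sketch of that idea (table and column names are invented for illustration):
-- Corrections loaded ahead of time into a staging table,
-- then applied in one set-based statement during the window
UPDATE t
SET    t.Amount = s.Amount,
       t.Status = s.Status
FROM   dbo.Transactions AS t
       INNER JOIN dbo.StagingCorrections AS s
           ON s.TransactionID = t.TransactionID;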
What exactly does this "custom made program" look like? i.e. how is it talking to the data? Minimising the amount of network IO (from a db server to an app) would be a good start... typically this might mean doing a lot of work in TSQL, but even just running the app on the db server might help a bit...
If the app is re-writing large chunks of data, it might still be able to use bulk insert to submit the new table data. Either via command-line (bcp etc), or through code (SqlBulkCopy in .NET). This will typically be quicker than individual inserts etc.
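For example, from T-SQL (the file path, terminators and table name are placeholders):
-- Load one of the text files straight into a staging table
BULK INSERT dbo.StagingNewData
FROM 'C:\import\newdata.txt'
WITH (FIELDTERMINATOR = '\t', ROWTERMINATOR = '\n', TABLOCK);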
But it really depends on this "custom made program".

SQL Server Maintenance Suggestions?

I run an online photography community and it seems that the site slows to a crawl on database access, sometimes hitting timeouts.
I consider myself to be fairly competent at writing SQL queries and designing tables, but am by no means a DBA... hence the problem.
Some background:
My site and SQL server are running on a remote host. I update the ASP.NET code from Visual Studio and the SQL via SQL Server Mgmt. Studio Express. I do not have physical access to the server.
All my stored procs (I think I got them all) are wrapped in transactions.
The main table is only 9400 records at this time. I add 12 new records to this table nightly.
There is a view on this main table that brings together data from several other tables into a single view.
The secondary tables have smaller records, but more of them: 70,000 in one, 115,000 in another. These are comments and ratings records for the items in the main table.
Indexes are on the most needed fields, and I have the big tables set to auto-recompute statistics.
When the site grinds to a halt, the speed returns if I run code to clear the transaction log, update statistics, rebuild the main view, and rebuild the stored procedure that gets the comments. I have to do this manually, however.
Sadly, my users get frustrated at these issues and their participation dwindles.
So my question is... in a remote environment, what is the best way to setup and schedule a maintenance plan to keep my SQL db running at its peak???
My gut says you are doing something wrong. It sounds a bit like those stories you hear where some system cannot stay up unless you reboot the server nightly :-)
Something is wrong with your queries; the number of rows you have is almost always irrelevant to performance, and your database is very small anyway. I'm not too familiar with SQL Server, but I imagine it has some pretty sweet query analysis tools. I also imagine it has a way of logging slow queries.
It really sounds like you have a missing index. Sure, you might think you've added the right indexes, but until you verify they are being used, it doesn't matter. Maybe you think you have the right ones, but your queries suggest otherwise.
First, figure out how to log your queries. Odds are very good you've got a killer in there doing some sequential scan that an index would fix.
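As a starting point (assuming SQL Server 2005 or later, where the plan-cache DMVs are available), something like the following lists the most expensive cached queries:
-- Most expensive cached queries by total elapsed time
SELECT TOP 10
       qs.total_elapsed_time,
       qs.execution_count,
       st.text AS batch_text
FROM   sys.dm_exec_query_stats AS qs
       CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_elapsed_time DESC;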
Second, you might have a bunch of small queries that are killing it instead. For example, you might have some "User" object that hits the database every time you look up a username from a user_id. Look for spots where you are querying the database a hundred times and replace it with a cache--even if that "cache" is nothing more then a private variable that gets wiped at the end of a request.
Bottom line is, I really doubt it is something mis-configured in SQL Server. I mean, if you had to reboot your server every night because the system ground to a halt, would you blame the system or your code? Same deal here... learn the tools provided by SQL Server, I bet they are pretty slick :-)
That all said, once you accept you are doing something wrong, enjoy the process. Nothing, to me, is more fun than optimizing slow database queries. It is simply amazing that you can take a query with a 10-second runtime and turn it into one with a 50 ms runtime with a single, well-placed index.
You do not need to set up your maintenance tasks as a maintenance plan.
Simply create a stored procedure that carries out the maintenance tasks you wish to perform: index rebuilds, statistics updates, etc.
Then create a job that calls your stored procedure(s). The job can be configured to run on your desired schedule.
To create a job, use the procedure sp_add_job.
To create a schedule use the procedure sp_add_schedule.
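A rough sketch of that setup (the procedure body and all object, job and schedule names are invented; this assumes SQL Server 2005 or later and that SQL Server Agent is available to you):
-- Maintenance procedure (example body: rebuild indexes, refresh statistics)
CREATE PROCEDURE dbo.usp_NightlyMaintenance
AS
BEGIN
    ALTER INDEX ALL ON dbo.Photos REBUILD;
    EXEC sp_updatestats;
END;
GO

-- Agent job that runs the procedure every night at 02:00
EXEC msdb.dbo.sp_add_job         @job_name = N'Nightly maintenance';
EXEC msdb.dbo.sp_add_jobstep     @job_name = N'Nightly maintenance',
                                 @step_name = N'Run maintenance proc',
                                 @subsystem = N'TSQL',
                                 @command = N'EXEC dbo.usp_NightlyMaintenance;',
                                 @database_name = N'MyPhotoSite';
EXEC msdb.dbo.sp_add_schedule    @schedule_name = N'Nightly at 2am',
                                 @freq_type = 4,          -- daily
                                 @freq_interval = 1,      -- every day
                                 @active_start_time = 20000;   -- 02:00:00 (HHMMSS)
EXEC msdb.dbo.sp_attach_schedule @job_name = N'Nightly maintenance',
                                 @schedule_name = N'Nightly at 2am';
EXEC msdb.dbo.sp_add_jobserver   @job_name = N'Nightly maintenance';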
I hope what I have detailed is clear and understandable but feel free to drop me a line if you need further assistance.
Cheers, John
