What is your biggest SQL Server mistake or embarrassing incident?

You know the one I am talking about.
We have all been there at some point. You get that awful feeling of dread and the realisation: oh my god, did that actually just happen?
Sure, you can laugh about it now, right? So go on and share your SQL Server mishaps with us.
Even better if you can detail how you resolved your issue, so that we can learn from our mistakes together.
So, in order to get the ball rolling, I will go first...
It was back in my early years as a junior SQL Server Guru. I was racing around Enterprise Manager, performing a few admin duties. You know how it is, checking a few logs, ensuring the backups ran ok, a little database housekeeping, pretty much going about business on autopilot and hitting the enter key on the usual prompts that pop up.
Oh wait, was that an "Are you sure you wish to delete this table?" prompt? Too late!
Just to confirm for any aspiring DBAs out there: deleting a production table is a very, very bad thing!
Needless to say, a world record was promptly set for the fastest database restore to a new database, swiftly followed by a table migration. Oh yeah. Everyone else was none the wiser, of course, but still a valuable lesson learnt: concentrate!

I suppose everyone has missed the WHERE clause off a DELETE or UPDATE at some point...

Inserted 5 million test persons into a production database. The biggest mistake in my opinion was to let me have write access to the production db in the first place. :P Bad dba!

My biggest SQL Server mistake was assuming it was as capable as Oracle when it came to concurrency.
Let me explain.
When it comes to transaction isolation levels in SQL Server, you have two choices:
(1) Dirty reads: transactions can see uncommitted data from other transactions (READ UNCOMMITTED); or
(2) Selects block on uncommitted updates (READ COMMITTED).
I believe these come from ANSI SQL.
(2) is the default isolation level and (imho) the lesser of two evils. But it's a huge problem for any long-running process. I had to do a batch load of data and could only do it out of hours because it killed the website while it was running (it took 10-20 minutes as it was inserting half a million records).
Oracle on the other hand has MVCC. This basically means every transaction will see a consistent view of the data. They won't see uncommitted data (unless you set the isolation level to do that). Nor do they block on uncommitted transactions (I was stunned at the idea an allegedly enterprise database would consider this acceptable on a concurrency basis).
Suffice it to say, it was a learning experience.
And you know what? Even MySQL has MVCC.
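To be fair, SQL Server 2005 onwards does offer row versioning, which gets you most of the way there. A minimal sketch, assuming a database called MyDb:
ALTER DATABASE MyDb SET READ_COMMITTED_SNAPSHOT ON; -- readers see the last committed row version instead of blocking
-- or opt in per transaction to full snapshot isolation:
ALTER DATABASE MyDb SET ALLOW_SNAPSHOT_ISOLATION ON;
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
Either way, readers stop blocking on writers, much like Oracle's MVCC.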

I changed all of the prices to zero on a high-volume, production, eCommerce site. I had to take the site down and restore the DB from backup... VERY UGLY.
Luckily, that was a LOOONG time ago.

Forgetting to highlight the WHERE clause when updating or deleting.
Scripting procs with "drop dependent objects" checked, and running the result on production.

I was working on the payment system at a large online business. A multi-million Euro business.
Get a script from a colleague with a few small updates.
Run it on production.
Get an error report 30 minutes later from helpdesk, complaining about no purchases last 30 minutes.
Discover that all connections are waiting on a table lock to be released.
Discover that the script from my colleague started with an explicit BEGIN TRANSACTION and expected me to manually type COMMIT TRANSACTION at the end.
Explain to boss why 30 minutes of sales were lost.
Blame myself for not reading the script documentation properly.
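One defensive habit that would have caught this; a minimal sketch:
SELECT @@TRANCOUNT; -- anything greater than 0 means a transaction is still open and holding locks
IF @@TRANCOUNT > 0 COMMIT TRANSACTION; -- or ROLLBACK, once you know what it's holding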

Starting off a restore from last week onto a production instance instead of the development instance. Not a good morning.

I've seen plenty of other people miss a WHERE clause.
Myself, I always type the WHERE clause first, and then go back to the start of the line and type in the rest of the query :)

Thankfully, you only ever make one cock-up before you realise that using transactions really is very, very trivial. I've amended thousands of records by accident before; luckily, rollback is there...
If you're querying the live environment without having thoroughly tested your scripts, then I think "embarrassing" should really be "foolhardy", or perhaps "unprofessional".

One of my favorites happened in an automated import when the client changed the data structure without telling us first. The Social Security number column and the amount of money we were to pay the person got switched. Luckily we found it before the system tried to pay someone his social security number. We now have checks in automated imports that look for funny data before running and stop it if the data seems odd.
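A hedged sketch of that kind of pre-flight check; the staging table and columns are hypothetical, and it assumes the raw data is staged as text:
IF EXISTS (
    SELECT 1 FROM dbo.ImportStaging
    WHERE PayAmount LIKE '[0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9][0-9][0-9]' -- formatted like an SSN
       OR (LEN(PayAmount) = 9 AND PayAmount NOT LIKE '%[^0-9]%')           -- or nine bare digits
)
BEGIN
    RAISERROR('Import data looks odd; stopping the run.', 16, 1)
    RETURN
END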

Like zabzonk said, forgot the WHERE clause on an update or two in my day.

We had an old application that didn't handle syncing with our HR database for name updates very efficiently, mainly due to the way they keyed in changes to titles. Anyway, a certain woman got married, and I had to write a database change request to update her last name. I forgot the WHERE clause, and everyone in said application was now named Allison Smith.

Columns are nullable, and queries comparing them to parameter values silently fail to retrieve the correct information (NULL never compares equal to anything)...
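The classic version of this trap, sketched with hypothetical names: if @Region is NULL, the first query quietly returns nothing.
SELECT * FROM dbo.Customers WHERE Region = @Region;
-- one common workaround:
SELECT * FROM dbo.Customers WHERE Region = @Region OR (Region IS NULL AND @Region IS NULL);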

The biggest mistake was giving developers "write" access to the production DB.
Many DEV and TEST records were inserted or overwritten, and backed up in production too, until it was wisely suggested (by me!) to allow read access only!

Sort of SQL Server-related. I remember learning how important it is to always dispose of a SqlDataReader. I had a system that worked fine in development, and happened to be running against the ERP database. In production, it brought down the database, because I assumed it was enough to close the SqlConnection, and had hundreds, if not thousands, of open connections.
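You can at least spot that kind of leak from the server side. A hedged sketch (the DMV is SQL 2005+; on SQL 2000, sp_who2 gives a similar picture):
SELECT COUNT(*) AS open_connections FROM sys.dm_exec_connections;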

At the start of my co-op term, I ended up expiring access for everyone who used this particular system (which was used by a lot of applications in my Province). In my defense, I was new to SQL Server Management Studio and didn't know that you could 'open' tables and edit specific entries rather than using a SQL statement.
I expired all the user access with a simple UPDATE statement (access to this application was given by a user account on the SQL box, as well as a specific entry in an access table), but when I went to highlight that very statement and run it, I didn't include the WHERE clause.
A common mistake, I'm told. The quick fix was to unexpire everyone's accounts (including accounts that were supposed to be expired) until the database could be backed up. Now I either open tables and select specific entries with SQL, or I wrap absolutely everything inside a transaction followed by an immediate rollback.
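That last habit looks something like this; a minimal sketch with hypothetical names:
BEGIN TRANSACTION;
UPDATE dbo.UserAccess SET Expired = 1 WHERE UserID = 42;
SELECT @@ROWCOUNT; -- sanity-check how many rows were actually touched
ROLLBACK TRANSACTION; -- re-run with COMMIT only once the row count looks right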

Our IT Ops decided to upgrade to SQL 2005 from SQL 2000.
The next Monday, users were asking why their app didn't work, with errors like "DTS not found", etc.
This led to a nice set of three Saturdays in the office rebuilding the packages in SSIS, with a good overtime package :)

Not exactly a "mistake", but back when I was first learning PHP and MySQL, I would spend hours daily trying to figure out why my code was not working, not knowing that I had the wrong password/username/host/database credentials for my SQL database. You can't believe how much time I wasted on that, and to make it even worse, this was not a one-time incident. But LOL, it's all good, it builds character.

I once, and only once, typed something similar to the following:
psql> UPDATE big_table SET foo=0; WHERE bar=123
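-- the misplaced semicolon ends the statement early, so the UPDATE runs on every row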
I managed to fix the error rather quickly. Since that and another error my updates always start out as:
psql> UPDATE table SET WHERE foo='bar';
Much easier to avoid errors that way.

I worked with a junior developer once who got confused and called "Drop" instead of "Delete" on a table.
It was a long night working to get the backup restored...
Edit: I should have mentioned, this was in a production environment, and the table was full of data...

This was before the days when Google could help. I didn't encounter this problem with SQL Server, but with its ugly older cousin, Sybase.
I updated a table schema in a production environment. Not understanding at the time that stored procedures that use SELECT * must be recompiled to pick up new fields, I proceeded to spend the next eight hours trying to figure out why the stored procedure that performed a key piece of work kept failing. Only after a server reboot did I clue in.
Losing thousands of dollars and hundreds of (end user) man-hours at your flagship customer's site is quite an educational experience. Highly recommended!!
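In SQL Server, at least, you can force the recompile without a reboot; a hedged sketch with a hypothetical proc name:
EXEC sp_recompile 'dbo.usp_KeyPieceOfWork'; -- marks the proc so it recompiles on its next execution
-- on SQL 2005+, sp_refreshsqlmodule will also refresh the metadata behind a proc's SELECT *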

A healthy number of years ago, I was working on a client's site that had a nice script to clear the dev environment of all orders, carts and customers, to ease up testing. So I, of course, ran the damn script in the production server's query analyzer.
It took some 5-6 minutes to run, too. I was bitching about how slow the dev server was until the number of deleted rows came up. :)
Fortunately, I had just run a full backup, since I was about to do an installation...

Beyond the typical WHERE clause error: I ran a DROP on an incorrect DB and thus had to run a restore. Now I triple-check my server name. Thankfully I had a good backup.

I set the maximum server memory to 0. I was thinking at the time that it would automatically tell SQL Server to use all available memory (it was early). No such luck. SQL Server decided to use only 16 MB, and I had to connect in single-user mode to get the setting changed back.
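For anyone who hits the same thing, the repair from single-user mode looks roughly like this (the 2048 is just an example value):
EXEC sp_configure 'show advanced options', 1; RECONFIGURE;
EXEC sp_configure 'max server memory (MB)', 2048; RECONFIGURE;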

Hit "Restore" instead of "Backup" in Management Studio.

Related

What is the point of 'down' database migrations?

As all databases should be, the source for ours is versioned using source control. The database is upgraded using a series of SQL scripts generated by Red Gate's comparison tool, which is essentially the same as an 'up' migration in the numerous database migration frameworks that seem to have sprung up recently.
But what's the point in the 'down' migrations in these frameworks? Often the code for the 'up' migration is extremely complex (typically complex data migration as features evolve) and I struggle to see the purpose of having to write it all in reverse for the 'down' one. It's certainly something I've never felt the need for. Am I missing something here...?
It seems that the pertinent question here is:
Why would a scripted rollback ever be preferable to a full database restore from a backup taken immediately before the upgrade?
I can think of several reasons:
The database is very large - say a few hundred GB - and your company cannot afford the downtime and/or administrative overhead that would be involved in a full restore.
A bug was introduced that was not discovered until a week or two into production. If you've never experienced this before, you're lucky. Once you've got a week's worth of transactions in the new database, you can forget about just restoring from backup.
The bug was not discovered until months into the release. In other words, you don't even have the backup anymore, and you're officially in damage control/disaster recovery mode. I've never experienced this, but I've heard stories. It's a scary thought - how do you undo all the damage that was done? In this case your downgrade might not be perfect, but it might still be better than the alternative.
By contrast, perhaps the database changes were trivial - adding a few rows here, a few triggers there. In this case, a scripted rollback is going to take much less time than a restore. It's possible that some things that took hours to upgrade - such as creation of new indexes or addition of new columns - may only take seconds to downgrade (drop).
You're deploying to customer sites. Some of them may not have backups at all (yeah, it's pathetic, but there's nothing you can do about it). If one of them needs a rollback, this is your only option.
There may be other reasons to have downgrade scripts - this is just off the top of my head.
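To make the idea concrete, a minimal sketch of a paired migration, with hypothetical names; note how the 'down' undoes the 'up' in reverse order:
-- up
ALTER TABLE dbo.Orders ADD DiscountPct decimal(5,2) NOT NULL CONSTRAINT DF_Orders_DiscountPct DEFAULT 0;
CREATE INDEX IX_Orders_DiscountPct ON dbo.Orders (DiscountPct);
-- down
DROP INDEX IX_Orders_DiscountPct ON dbo.Orders;
ALTER TABLE dbo.Orders DROP CONSTRAINT DF_Orders_DiscountPct;
ALTER TABLE dbo.Orders DROP COLUMN DiscountPct;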
Customer: "We don't like the new version and want to go back to the old version."
Rollbacks. You push everything into production and it blows up - down migrations are a good safety net for rolling back.
Or you're developing with multiple code branches - you can go back and forth between versions to your heart's content.
If you upgrade, and subsequently data is added to your database that you want to preserve, a rollback script (as long as it is designed as such) should achieve this, whereas if you simply restore a backup you'll lose it.
But you could get round the above by restoring a backup and using SQL Data Compare to copy the additional data across.

SQL Server: What to learn if you're going for a dedicated solution

Today we're using a shared SQL Server database, and that is perfect, as I don't know anything about SQL Server maintenance. But for economic reasons we need to upgrade to a dedicated server.
Given that I don't have time to read the entire documentation, what do I absolutely need to know about SQL Server to not screw this up?
Resource suggestions appreciated!
The answer probably has to do with how data-intensive your application is. If it's like most business applications, you're probably OK reading a couple of quick-start guides and winging it (as long as you back up regularly... that's important, so read up on that carefully). SQL Server is generally pretty self-tuning, and if you're not talking millions of rows and high TPS, you're probably fine for a little while.
If it is a data-intensive application, or has high availability or throughput needs ... get a DBA, even just on contract. Don't put all your eggs in a basket you can't carry.
Backing up!
Oh, it's the accidental DBA!
Brent Ozar has a handful of useful articles: http://www.brentozar.com/sql/
Don't forget about SQLServerPedia - http://sqlserverpedia.com/wiki/Main_Page
Cheers!
In terms of backing up, don't forget to back up the transaction log as well as the database, unless you'd like your transaction log to grow until it takes over the entire drive.
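A minimal sketch of both, with hypothetical paths (assumes the FULL recovery model):
BACKUP DATABASE MyDb TO DISK = 'D:\Backups\MyDb.bak';
BACKUP LOG MyDb TO DISK = 'D:\Backups\MyDb.trn'; -- this is what keeps the log from growing forever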
I'd also read up on indexes and statistics, and on rebuilding each.
Also you should probably get a good understanding of how database security works.
If at all possible get a dev server as well as a prod one. Much much better to test changes on dev than directly in production! Then limit prod access to only a couple of people and make all changes to production happen through tested scripts.
In order of importance:
1. How to schedule backups
2. How to create indexes
3. How to rebuild indexes
The Profiler and Tuning wizard can help you with 2 and 3.
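For 2 and 3, the raw commands are short; a hedged sketch with hypothetical names (ALTER INDEX ... REBUILD is SQL 2005+; on 2000 it's DBCC DBREINDEX):
CREATE INDEX IX_Orders_CustomerID ON dbo.Orders (CustomerID);
ALTER INDEX IX_Orders_CustomerID ON dbo.Orders REBUILD;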
If you are programming the database and not just administering it, I'd recommend Robert Vieira's book. It's a great introduction.

SQL Server UNDO

I am a part-time developer (full-time student), and the company I am working for uses SQL Server 2005. The thing I find strange about SQL Server is that if you run a script that involves inserting, updating, etc., there isn't any real way to undo it except for a rollback or using transactions.
You might say, what's wrong with those two options? Well, if for example someone does an UPDATE statement and forgets to put in a WHERE clause, you suddenly find yourself with 13k rows updated, and suddenly all the clients in that table are named 'bob'. Now you have the wrath of 13k Bobs to face, since that "someone" forgot to use a transaction, and if you roll the whole database back you are going to undo critical changes that were needed in other fields.
In my studies I use Oracle. In Oracle you can first run the script, then commit it if you find that there aren't any mistakes. I was wondering if there is something I have missed in SQL Server, since I am still relatively new to the working developer world.
I don't believe you missed anything. Using transactions to protect against these kinds of errors is the best mechanism, and it is the same mechanism Oracle uses to protect the end user. The difference is that Oracle implicitly begins a transaction for you, whereas in SQL Server you must do it explicitly.
SET IMPLICIT_TRANSACTIONS is what you are probably looking for.
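A minimal sketch of what that buys you:
SET IMPLICIT_TRANSACTIONS ON;
UPDATE Clients SET Name = 'bob'; -- oops, forgot the WHERE, but nothing is committed yet
ROLLBACK; -- and it never happened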
I'm no database/sql server expert and I'm not sure if this is what you're looking for, but there is the possibility to create snapshots of a database. A snapshot allows you to revert the database to that state at any time.
Check this link for more information:
http://msdn.microsoft.com/en-us/library/ms175158.aspx
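A hedged sketch of the syntax (Enterprise Edition, SQL 2005+; names and paths are hypothetical, and the logical file name must match the source database's):
CREATE DATABASE MyDb_Snapshot ON (NAME = MyDb_Data, FILENAME = 'D:\Snapshots\MyDb.ss') AS SNAPSHOT OF MyDb;
-- and to revert:
RESTORE DATABASE MyDb FROM DATABASE_SNAPSHOT = 'MyDb_Snapshot';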
I think transactions work well. You could roll back the DB (to a previous backup or point in the log), but I think transactions are much simpler.
How about this: never make changes to a production database that have not 1st been tested on your development server, and always make a backup before trying anything that is un-proven.
From what I understand, SQL Server 2008 added an Auditing feature that logs all changes made by users to the various databases and also has the option to roll them back after the fact.
Again, this is from what I've read or overheard from our DBA, but might be worth looking into.
EDIT: After looking into it, it appears to only give the ability to rollback on schema changes, not data modifications (DDL triggers).
If I am doing something with any risk in SQL Server, I write the script like this:
BEGIN TRAN
INSERT ... -- whatever
UPDATE ... -- whatever
-- COMMIT
The last line is a comment on purpose: I first run the lines before, then make sure there's no error, and then highlight just the word Commit and execute that. This works because in Management Studio you can select a part of the T-SQL and just execute the selected portion.
There are a couple of advantages: Implicit Transactions works too, but it's not the default for SQL Server so you have to remember to turn it on or set options to do that. Also, if it's on all the time, I find it's easy for people to "forget" and leave uncommitted transactions open, which can block others. That's mainly because it's not the default behavior and SQL Server folks aren't used to it.

Find out SQL Server hardware or speed test

I use a SQL Server regularly and have recently been getting frustrated by its performance. It would be difficult for me to get direct access to find out the hardware, so:
Is there a direct way in Management Studio to assess performance or find out the exact hardware?
Alternatively, does someone have a set of test SQL procedures I could try, and ideally compare to other results, to get an idea of its performance?
So far I have set up a few quick queries on my local machine's SQL Express server just as a test. These seem to run quicker than the SQL Server on the network, which is meant to be high-performance, although no one knows when it was last upgraded; I have a feeling it hasn't been for 6 or 7 years. Obviously these tests don't account for the possibility of others querying at the same time, or for network transfers of results... Hopefully someone has a better solution.
You can't just ask your server guys? Seems like there's a fair bit of mistrust if you can't get hardware metrics. Count of CPUs, total memory, etc.
If there's that amount of mistrust, even if you found the answer from the database server, rectifying it would be impossible. If you can't get the current parameters, how could you get a change of hardware past the server guys?
Start building rapport. The best line in the world to get someone on your side is, "I'm in trouble and I need your help..." You've elevated them and subjugated yourself, you've put them in a position to save you. You'd be amazed at how much you can get out of people that way.
As far as standard queries go, you could look at the TPC queries.
If you are on 2005:
SELECT * FROM sys.dm_os_performance_counters
That will give you some sql only stats. You will not find much info about the machine without at least terminal access. In the sql startup log you can see some info on processors as well.
You also might try updating the statistics on your server. I had an issue a while back where one query returned in 100 ms and an identical query took 5+ minutes, and the only difference between the two was a capital letter in the table name in my query (which obviously shouldn't matter).
After some searching and SO-questioning, I found that I needed to update my statistics. Could something like this be needed for your database / SQL Server too?
This sort of thing can be very political, especially in a firm with an endemic CYA culture (which describes most financial services companies). If there's no reasonable expectation of a good working relationship with the production staff, a few approaches are:
Look at the query plans of the queries. Check that they are sensible (using indexes when they should, etc.).
Make it formal. Ask their manager for the specifications of the machine, the disk layout and server configuration, and the last time statistics were updated on all tables and indexes. Make it clear that the machine appears to be under-performing.
If the statistics are out of date, get them updated.
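The last one is a one-liner per table, or one call for the whole database; a minimal sketch:
UPDATE STATISTICS dbo.BigTable WITH FULLSCAN;
EXEC sp_updatestats; -- or hit everything at once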
and one more
SELECT * FROM sys.dm_os_sys_info

SQL Server Maintenance Suggestions?

I run an online photography community, and it seems that the site slows to a crawl on database access, sometimes hitting timeouts.
I consider myself to be fairly competent at writing SQL queries and designing tables, but am by no means a DBA... hence the problem.
Some background:
My site and SQL server are running on a remote host. I update the ASP.NET code from Visual Studio and the SQL via SQL Server Mgmt. Studio Express. I do not have physical access to the server.
All my stored procs (I think I got them all) are wrapped in transactions.
The main table is only 9400 records at this time. I add 12 new records to this table nightly.
There is a view on this main table that brings together data from several other tables into a single view.
Secondary tables hold smaller records, but more of them: 70,000 in one, 115,000 in another. These are the comments and ratings records for the items in the main table.
Indexes are on the most-needed fields, and I set the big tables to auto-recompute statistics.
When the site grinds to a halt, the speed returns if I run code to clear the transaction log, update statistics, rebuild the main view, and rebuild the stored procedure that gets the comments. I have to do this manually, however.
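Roughly that manual routine, sketched with hypothetical names (BACKUP LOG ... WITH TRUNCATE_ONLY is the SQL 2000/2005 idiom and was removed in later versions):
BACKUP LOG PhotoDb WITH TRUNCATE_ONLY; -- 'clearing' the log
EXEC sp_updatestats;
EXEC sp_refreshview 'dbo.vwPhotoDetails';
EXEC sp_recompile 'dbo.usp_GetComments';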
Sadly, my users get frustrated at these issues and their participation dwindles.
So my question is... in a remote environment, what is the best way to setup and schedule a maintenance plan to keep my SQL db running at its peak???
My gut says you are doing something wrong. It sounds a bit like those stories you hear where some system cannot stay up unless you reboot the server nightly :-)
Something is wrong with your queries; the number of rows you have is almost always irrelevant to performance, and your database is very small anyway. I'm not too familiar with SQL Server, but I imagine it has some pretty sweet query analysis tools. I also imagine it has a way of logging slow queries.
It really sounds like you have a missing index. Sure, you might think you've added the right indexes, but until you verify they are being used, it doesn't matter. Maybe you think you have the right ones, but your queries suggest otherwise.
First, figure out how to log your queries. Odds are very good you've got a killer in there doing some sequential scan that an index would fix.
Second, you might have a bunch of small queries that are killing it instead. For example, you might have some "User" object that hits the database every time you look up a username from a user_id. Look for spots where you are querying the database a hundred times and replace it with a cache, even if that "cache" is nothing more than a private variable that gets wiped at the end of a request.
Bottom line is, I really doubt it is something mis-configured in SQL Server. I mean, if you had to reboot your server every night because the system ground to a halt, would you blame the system or your code? Same deal here... learn the tools provided by SQL Server, I bet they are pretty slick :-)
That all said, once you accept you are doing something wrong, enjoy the process. Nothing, to me, is more fun than optimizing slow database queries. It is simply amazing that you can take a query with a 10-second runtime and turn it into one with a 50 ms runtime with a single, well-placed index.
You do not need to set up your maintenance tasks as a maintenance plan.
Simply create a stored procedure that carries out the maintenance tasks you wish to perform: index rebuilds, statistics updates, etc.
Then create a job that calls your stored procedure(s). The job can be configured to run on your desired schedule.
To create a job, use the procedure sp_add_job.
To create a schedule use the procedure sp_add_schedule.
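Strung together, it looks roughly like this; a hedged sketch with hypothetical names (these procs live in msdb, and a complete job also needs a step and a server assignment):
EXEC msdb.dbo.sp_add_job @job_name = 'NightlyMaintenance';
EXEC msdb.dbo.sp_add_jobstep @job_name = 'NightlyMaintenance', @step_name = 'Run maintenance proc', @command = 'EXEC dbo.usp_Maintenance';
EXEC msdb.dbo.sp_add_schedule @schedule_name = 'Nightly2am', @freq_type = 4, @freq_interval = 1, @active_start_time = 020000;
EXEC msdb.dbo.sp_attach_schedule @job_name = 'NightlyMaintenance', @schedule_name = 'Nightly2am';
EXEC msdb.dbo.sp_add_jobserver @job_name = 'NightlyMaintenance';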
I hope what I have detailed is clear and understandable but feel free to drop me a line if you need further assistance.
Cheers, John
