So for the second day in a row, someone has wiped out an entire table of data as opposed to the one row they were trying to delete because they didn't have the qualified where clause.
I've been all up and down the mgmt studio options, but can't find a confirm option. I know other tools for other databases have it.
I'd suggest that you always write a SELECT statement with the WHERE clause first and execute it to see exactly which rows your DELETE command will delete. Then execute the DELETE with the same WHERE clause. The same applies to UPDATEs.
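As a minimal sketch of that pattern (dbo.Orders and OrderID are just placeholder names):
-- 1. Dry run: see exactly which rows would be affected
SELECT *
FROM dbo.Orders
WHERE OrderID = 12345;
-- 2. Once the result set looks right, reuse the identical WHERE clause
DELETE FROM dbo.Orders
WHERE OrderID = 12345;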
Under Tools>Options>Query Execution>SQL Server>ANSI, you can enable the Implicit Transactions option which means that you don't need to explicitly include the Begin Transaction command.
The obvious downside of this is that you might forget to add a Commit (or Rollback) at the end, or worse still, your colleagues will add Commit at the end of every script by default.
You can lead the horse to water...
You might suggest that they always take an ad-hoc backup before they do anything (depending on the size of your DB) just in case.
Try using a BEGIN TRANSACTION before you run your DELETE statement.
Then you can choose to COMMIT or ROLLBACK it.
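A rough sketch of that workflow (the table and predicate are placeholders); the point is to check the affected row count before deciding:
BEGIN TRANSACTION;
DELETE FROM dbo.Customers
WHERE CustomerID = 42;
-- inspect how many rows were actually removed
SELECT @@ROWCOUNT AS RowsDeleted;
-- if the count is what you expected:
-- COMMIT TRANSACTION;
-- otherwise:
-- ROLLBACK TRANSACTION;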
In SSMS 2005, you can enable this option under Tools|Options|Query Execution|SQL Server|ANSI ... check SET IMPLICIT_TRANSACTIONS. That will require a commit to affect update/delete queries for future connections.
For the current query, go to Query|Query Options|Execution|ANSI and check the same box.
This page also has instructions for SSMS 2000, if that is what you're using.
As others have pointed out, this won't address the root cause: it's almost as easy to paste a COMMIT at the end of every new query you create as it is to fire off a query in the first place.
First, this is what audit tables are for. If you know who deleted all the records, you can either restrict their database privileges or deal with them from a personnel perspective. The last person who did this at my office is currently on probation. If she does it again, she will be let go. You have responsibilities if you have access to production data, and ensuring that you cause no harm is one of them.
This is a personnel problem as much as a technical problem. You will never find a way to prevent people from making dumb mistakes (the database has no way to know whether you meant delete table a or delete table a where id = 100, and a confirm prompt will get hit automatically by most people). You can only try to reduce them by making sure the people who run this code are responsible and by putting policies in place to help them remember what to do. Employees who have a pattern of behaving irresponsibly with your business data (particularly after they have been given a warning) should be fired.
Others have suggested the kinds of things we do to prevent this from happening. I always embed a select in a delete that I'm running from a query window to make sure it will delete only the records I intend. All our code on production that changes, inserts or deletes data must be enclosed in a transaction. If it is being run manually, you don't run the rollback or commit until you see the number of records affected.
Example of delete with embedded select
delete a
--select a.*
from table1 a
join table2 b on a.id = b.id
where b.somefield = 'test'
But even these techniques can't prevent all human error. A developer who doesn't understand the data may run the select and still not understand that it is deleting too many records. Running in a transaction may mean you have other problems when people forget to commit or rollback and lock up the system. Or people may put it in a transaction and still hit commit without thinking just as they would hit confirm on a message box if there was one. The best prevention is to have a way to quickly recover from errors like these. Recovery from an audit log table tends to be faster than from backups. Plus you have the advantage of being able to tell who made the error and exactly which records were affected (maybe you didn't delete the whole table but your where clause was wrong and you deleted a few wrong records.)
For the most part, production data should not be changed on the fly. You should script the change and check it on dev first. Then on prod, all you have to do is run the script with no changes rather than highlighting and running little pieces one at a time. Now, in the real world this isn't always possible, as sometimes you are fixing something broken only on prod that needs to be fixed now (for instance when none of your customers can log in because critical data got deleted). In a case like this, you may not have the luxury of reproducing the problem first on dev and then writing the fix. When you have these types of problems, you may need to fix directly on prod, and the fix should be done only by DBAs, database analysts, configuration managers, or others who are normally responsible for data on prod, not by a developer. Developers in general should not have access to prod.
That is why I believe you should always:
1. Use stored procedures that are tested on a dev database before deploying to production
2. Select the data before deletion
3. Screen developers using an interview and performance evaluation process :)
4. Base performance evaluation on how many database tables they do/do not delete
5. Treat production data as if it were poisonous and be very afraid
So for the second day in a row, someone has wiped out an entire table of data as opposed to the one row they were trying to delete because they didn't have the qualified where clause
Probably the only solution will be to replace someone with someone else ;). Otherwise they will always find a workaround.
Eventually, restrict the database access for that person, provide them with a stored procedure that takes the value used in the where clause as a parameter, and grant them access to execute only that stored procedure.
Put on your best Trogdor and Burninate until they learn to put in the WHERE clause.
The best advice is to get the muckety-mucks that are mucking around in the database to use transactions when testing. It goes a long way towards preventing "whoops" moments. The caveat is that now you have to tell them to COMMIT or ROLLBACK because for sure they're going to lock up your DB at least once.
Lock it down:
REVOKE delete rights on all your tables.
Put in an audit trigger and audit table.
Create parametrized delete SPs and only give rights to execute on an as needed basis.
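A rough sketch of what that could look like (the role, table, and procedure names are hypothetical, and the exact permission model will depend on your environment):
-- take away direct DELETE rights
REVOKE DELETE ON dbo.Orders FROM ReportingUsers;
GO
-- provide a parameterized delete instead; the WHERE clause is baked in
CREATE PROCEDURE dbo.DeleteOrder
    @OrderID int
AS
BEGIN
    SET NOCOUNT ON;
    DELETE FROM dbo.Orders
    WHERE OrderID = @OrderID;
END
GO
GRANT EXECUTE ON dbo.DeleteOrder TO ReportingUsers;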
Isn't there a way to give users the results they need without providing raw access to SQL? If you at least had a separate entry box for "WHERE", you could default it to "WHERE 1 = 0" or something.
I think there must be a way to back these out of the transaction journaling, too. But probably not without rolling everything back, and then selectively reapplying whatever came after the fatal mistake.
Another ugly option is to create a trigger to write all DELETEs (maybe over some minimum number of records) to a log table.
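A hedged sketch of such a trigger (all object names here are made up); it copies whatever was deleted into a log table along with who did it and when:
CREATE TABLE dbo.Orders_DeleteLog (
    DeletedAt  datetime NOT NULL DEFAULT GETDATE(),
    DeletedBy  sysname  NOT NULL DEFAULT SUSER_SNAME(),
    OrderID    int      NOT NULL,
    CustomerID int      NULL
);
GO
CREATE TRIGGER trg_Orders_Delete
ON dbo.Orders
AFTER DELETE
AS
BEGIN
    SET NOCOUNT ON;
    -- "deleted" is the pseudo-table holding the removed rows
    INSERT INTO dbo.Orders_DeleteLog (OrderID, CustomerID)
    SELECT OrderID, CustomerID
    FROM deleted;
END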
Can I find out when the last INSERT, UPDATE or DELETE statement was performed on a table in an Oracle database and if so, how?
A little background: The Oracle version is 10g. I have a batch application that runs regularly, reads data from a single Oracle table and writes it into a file. I would like to skip this if the data hasn't changed since the last time the job ran.
The application is written in C++ and communicates with Oracle via OCI. It logs into Oracle with a "normal" user, so I can't use any special admin stuff.
Edit: Okay, "Special Admin Stuff" wasn't exactly a good description. What I mean is: I can't do anything besides SELECTing from tables and calling stored procedures. Changing anything about the database itself (like adding triggers) is sadly not an option if I want to get it done before 2010.
I'm really late to this party but here's how I did it:
SELECT SCN_TO_TIMESTAMP(MAX(ora_rowscn)) from myTable;
It's close enough for my purposes.
Since you are on 10g, you could potentially use the ORA_ROWSCN pseudocolumn. That gives you an upper bound of the last SCN (system change number) that caused a change in the row. Since this is an increasing sequence, you could store off the maximum ORA_ROWSCN that you've seen and then look only for data with an SCN greater than that.
By default, ORA_ROWSCN is actually maintained at the block level, so a change to any row in a block will change the ORA_ROWSCN for all rows in the block. This is probably quite sufficient if the intention is to minimize the number of rows you process multiple times with no changes if we're talking about "normal" data access patterns. You can rebuild the table with ROWDEPENDENCIES which will cause the ORA_ROWSCN to be tracked at the row level, which gives you more granular information but requires a one-time effort to rebuild the table.
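A hedged Oracle sketch of that incremental-read idea (my_table is a placeholder and :last_scn is the value your batch job stored from its previous run):
-- at the end of each run, remember the highest SCN you saw
SELECT MAX(ora_rowscn) AS last_scn
FROM my_table;
-- on the next run, only fetch rows changed since then
SELECT *
FROM my_table
WHERE ora_rowscn > :last_scn;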
Another option would be to configure something like Change Data Capture (CDC) and to make your OCI application a subscriber to changes to the table, but that also requires a one-time effort to configure CDC.
Ask your DBA about auditing. He can start an audit with a simple command like:
AUDIT INSERT ON user.table
Then you can query the table USER_AUDIT_OBJECT to determine if there has been an insert on your table since the last export.
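For instance, something along these lines (MYTABLE is a placeholder and :last_export is a bind variable holding the time of the previous export; this assumes the audit trail is visible to your user):
SELECT COUNT(*) AS inserts_since_last_export
FROM user_audit_object
WHERE obj_name    = 'MYTABLE'
  AND action_name = 'INSERT'
  AND timestamp   > :last_export;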
google for Oracle auditing for more info...
SELECT * FROM all_tab_modifications;
Could you run a checksum of some sort on the result and store that locally? Then when your application queries the database, you can compare its checksum and determine if you should import it?
It looks like you may be able to use the ORA_HASH function to accomplish this.
Update: Another good resource: 10g’s ORA_HASH function to determine if two Oracle tables’ data are equal
Oracle can watch tables for changes and, when a change occurs, can execute a callback function in PL/SQL or OCI. The callback gets an object containing a collection of the tables which changed, each with a collection of the rowids which changed and the type of action (insert, update, delete).
So you don't even go to the table, you sit and wait to be called. You'll only go if there are changes to write.
It's called Database Change Notification. It's much simpler than CDC, as Justin mentioned, but both require some fancy admin stuff. The good part is that neither of these requires changes to the APPLICATION.
The caveat is that CDC is fine for high volume tables, DCN is not.
If auditing is enabled on the server, simply use
SELECT *
FROM ALL_TAB_MODIFICATIONS
WHERE TABLE_NAME IN ()
You would need to add a trigger on insert, update, delete that sets a value in another table to sysdate.
When you run the application, it would read the value and save it somewhere so that the next time it is run it has a reference to compare against.
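A hedged Oracle sketch of that approach (all names hypothetical):
CREATE TABLE my_table_last_change (
    last_change DATE NOT NULL
);
INSERT INTO my_table_last_change VALUES (SYSDATE);
CREATE OR REPLACE TRIGGER trg_my_table_changed
AFTER INSERT OR UPDATE OR DELETE ON my_table
BEGIN
    -- record when the table last changed
    UPDATE my_table_last_change SET last_change = SYSDATE;
END;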
Would you consider that "Special Admin Stuff"?
It would be better to describe what you're actually doing so you get clearer answers.
How long does the batch process take to write the file? It may be easiest to let it go ahead and then compare the file against a copy of the file from the previous run to see if they are identical.
If anyone is still looking for an answer, they can use the Oracle Database Change Notification feature that comes with Oracle 10g. It requires the CHANGE NOTIFICATION system privilege. You can register listeners that trigger a notification back to the application when a change occurs.
Please use the below statement
select * from all_objects ao where ao.OBJECT_TYPE = 'TABLE' and ao.OWNER = 'YOUR_SCHEMA_NAME'
I am debugging my database and I am finding that the replication is failing on a delete statement.
I look at the source and destination tables and they are the same. So someone deleted a row from the source and then put it back in. (The delete fails because of a FK reference to some manual data that I don't want to cascade delete.)
Is there a way to find out the PK of the row it is trying to delete?
(All Replication monitor will tell me is the name of the FK that is causing the delete statement to fail.)
There are a couple of ways. I'll tell you the easy one (because I'm lazy). Put a trace on your subscriber for non-successful stored procedure executions. You should get a hit for one called something like sp_MSdel_table (where table is the name of your table). The argument(s) to that procedure will be the primary key of the record that it's trying to delete.
Easy way number two is to modify the sproc identified in the previous method not to be angry at a missing row (after all, it's just going to delete it, so the fact that it's now missing isn't that big a deal). You might have other non-convergence issues, but at least you can get your commands flowing again. (EDIT: Just noticed the reason for your issue. I'd advise not having FK constraints at the subscriber, since any referential integrity should be taken care of at the publisher. It'll make your replication faster when SQL doesn't have to check them each time it does an applicable insert, update, or delete.)
Hard way number one involves looking at the error in Replication Monitor and noting that there's a transaction ID and sequence number specified. You then use a sproc in the distribution database to get the text of the command being executed.
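For reference, a hedged sketch of that lookup (the sequence number below is just a placeholder for the value Replication Monitor reports):
USE distribution;
GO
EXEC sp_browsereplcmds
    @xact_seqno_start = '0x00000015000000B50007',
    @xact_seqno_end   = '0x00000015000000B50007';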
Hard way number two involves diffing the tables, either with tablediff.exe, something like RedGate's SQLCompare, or a roll-your-own join over linked servers to show the differences. Keep this in your back pocket just in case one of the other one-row-at-a-time methods mentioned above doesn't do it for you. My threshold for such things is about three. YMMV.
I was reading an article the other day that showed how to run SQL UPDATEs, INSERTs, or DELETEs as a what-if type scenario. I don't remember the parameter that they talked about, and now I can't find the article. Not sure if I was dreaming.
Anyway, does anyone know if there is a parameter in SQL2008 that lets you try an insert, update, or delete without actually committing it? It will actually log or show you what it would have updated. You remove the parameter and run it if it behaves as you would expect.
I don't know of a SQL2008-specific feature, but with any SQL engine that supports transactions you can do this:
1. Start a transaction ("BEGIN TRANSACTION" in TSQL)
2. The rest of your INSERT/UPDATE/DELETE/what-ever code
3. (optional) Some extra SELECT statements and such if needed to output the result of the above actions, if the default output from step 2 (things like "X rows affected") is not enough
4. Rollback the transaction ("ROLLBACK TRANSACTION" in TSQL)
5. (optional) Repeat the testing code to show how things are without the code in step 2 having run
For example:
BEGIN TRANSACTION
-- make changes
DELETE people WHERE name LIKE 'X%'
DELETE people WHERE name LIKE 'D%'
EXEC some_proc_that_does_more_work
-- check the DB state after the changes
SELECT COUNT(*) FROM people
-- undo
ROLLBACK TRANSACTION
-- confirm the DB state without the changes
SELECT COUNT(*) FROM people
(you might prefer to do the optional "confirm" step before starting the transaction rather than after rolling it back, but I've always done it this way around as it keeps the two likely-to-be-identical sections of code together for easier editing)
If you use something like this rather than something SQL2008-specific, the technique should be transferable to other RDBMSs too (just update the syntax if needed).
OK, finally figured it out. I've confused this with another project I was working on with PowerShell. PowerShell has a "whatif" parameter that can be used to show you what files would be removed before they are removed.
My apologies to those who have spent time trying to find an answer to this post, and my thanks to those of you who have responded.
I believe you're talking about BEGIN TRANSACTION
BEGIN TRANSACTION starts a local transaction for the connection issuing the statement. Depending on the current transaction isolation level settings, many resources acquired to support the Transact-SQL statements issued by the connection are locked by the transaction until it is completed with either a COMMIT TRANSACTION or ROLLBACK TRANSACTION statement. Transactions left outstanding for long periods of time can prevent other users from accessing these locked resources, and also can prevent log truncation.
Do you perhaps mean SET NOEXEC ON ?
When SET NOEXEC is ON, SQL Server compiles each batch of Transact-SQL statements but does not execute them. When SET NOEXEC is OFF, all batches are executed after compilation.
Note that this won't warn/indicate things like key violations.
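A minimal sketch of using it (the table and column are hypothetical):
SET NOEXEC ON;
-- parsed and compiled, but not executed
DELETE FROM dbo.People
WHERE LastLogin < '20000101';
SET NOEXEC OFF;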
Toad for SQL Server has a "Validate SQL" feature that checks queries against wrong table/column names, etc. Maybe you are talking about some new feature in SSMS 2008 similar to that...
I'm more than seven years late to this particular party, but I suspect the feature in question may also have been the OUTPUT clause. Certainly, it can be used to implement what-if functionality similar to PowerShell's in a T-SQL stored procedure.
https://learn.microsoft.com/en-us/sql/t-sql/queries/output-clause-transact-sql
Use this in each insert/update/delete/merge query to let the SP output a meaningful resultset of the changes it makes, e.g. outputting the table name and action performed as the first two columns, then all the altered columns.
Then simply roll back the changes if a @whatif parameter is set to 1, or commit them if @whatif is set to 0.
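A hedged sketch of that pattern (the procedure, table, and column names are made up for illustration):
CREATE PROCEDURE dbo.PurgeInactivePeople
    @whatif bit = 1
AS
BEGIN
    SET NOCOUNT ON;
    BEGIN TRANSACTION;
    -- OUTPUT reports exactly what the DELETE touched
    DELETE FROM dbo.People
    OUTPUT 'dbo.People' AS TableName, 'DELETE' AS Action, deleted.*
    WHERE LastLogin < DATEADD(year, -5, GETDATE());
    IF @whatif = 1
        ROLLBACK TRANSACTION;  -- what-if mode: show the changes, keep nothing
    ELSE
        COMMIT TRANSACTION;    -- apply the changes for real
END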
I am moving a system from a VB/Access app to SQL server. One common thing in the access database is the use of tables to hold data that is being calculated and then using that data for a report.
eg.
delete from treporttable
insert into treporttable (.... this thing and that thing)
Update treportable set x = x * price where (...etc)
and then report runs from treporttable
I have heard that SQL Server does not like it when all records from a table are deleted, as it creates huge logs etc. I tried temp SQL tables but they don't persist long enough for the report, which is in a different process, to run and report off of.
There are a number of places where this is done to different report tables in the application. The reports can be run many times a day and have a large number of records created in the report tables.
Can anyone tell me if there is a best practice for this, or if my information about the logs is incorrect and this code will be fine in SQL Server?
If you do not need to log the deletion activity you can use the truncate table command.
From books online:
TRUNCATE TABLE is functionally identical to DELETE statement with no WHERE clause: both remove all rows in the table. But TRUNCATE TABLE is faster and uses fewer system and transaction log resources than DELETE.
http://msdn.microsoft.com/en-us/library/aa260621(SQL.80).aspx
delete from sometable
is going to allow you to roll back the change. So if your table is very large, then this can cause a lot of memory usage and take time.
However, if you have no fear of failure then:
truncate table sometable
Will perform nearly instantly, and with minimal memory requirements. There is no rollback though.
To Nathan Feger:
You can rollback from TRUNCATE. See for yourself:
CREATE TABLE dbo.Test(i INT);
GO
INSERT dbo.Test(i) SELECT 1;
GO
BEGIN TRAN
TRUNCATE TABLE dbo.Test;
SELECT i FROM dbo.Test;
ROLLBACK
GO
SELECT i FROM dbo.Test;
GO
i
(0 row(s) affected)
i
1
(1 row(s) affected)
You could also DROP the table, and recreate it...if there are no relationships.
The [DROP table] statement is transactionally safe whereas [TRUNCATE] is not.
So it depends on your schema which direction you want to go!!
Also, use SQL Profiler to analyze your execution times. Test it out and see which is best!!
The answer depends on the recovery model of your database. If you are in full recovery mode, then you have transaction logs that could become very large when you delete a lot of data. However, if you're backing up transaction logs on a regular basis to free the space, this might not be a concern for you.
Generally speaking, if the transaction logging doesn't matter to you at all, you should TRUNCATE the table instead. Be mindful, though, of any key seeds, because TRUNCATE will reseed the table.
EDIT: Note that even if the recovery model is set to Simple, your transaction logs will grow during a mass delete. The transaction logs will just be cleared afterward (without releasing the space). The point is that DELETE still writes to the transaction log, even if only temporarily.
Consider using temporary tables. Their names start with # and they are deleted when nobody refers to them. Example:
create table #myreport (
id int identity,
col1 int,
...
)
Temporary tables are made to be thrown away, and that happens very efficiently.
Another option is using TRUNCATE TABLE instead of DELETE. The truncate will not grow the log file.
I think your example has a possible concurrency issue. What if multiple processes are using the table at the same time? Adding a JOB_ID column or something like that would allow you to clear the relevant entries in this table without clobbering the data being used by another process.
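A hedged sketch of that idea (the column and source-table names are hypothetical; treporttable is from the question):
DECLARE @JobID uniqueidentifier;
SET @JobID = NEWID();
-- stage this run's data, tagged with its own id
INSERT INTO treporttable (JobID, ItemID, Amount)
SELECT @JobID, ItemID, Amount
FROM tsourcedata;
-- ... run the report filtered on @JobID ...
-- clean up only this run's rows, leaving other runs untouched
DELETE FROM treporttable
WHERE JobID = @JobID;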
Actually, tables such as treporttable do not need to be recovered to a point in time. As such, they can live in a separate database with simple recovery mode. That eases the burden of logging.
There are a number of ways to handle this. First, you can move the creation of the data into the running of the report itself. This, I feel, is the best way to handle it; then you can use temp tables to temporarily stage your data, and no one will have concurrency issues if multiple people try to run the report at the same time. Depending on how many reports we are talking about, it could take some time to do this, so you may need another short-term solution as well.
Second, you could move all your reporting tables to a different db that is set to simple mode and truncate them before running your queries to populate. This is closest to your current process, but could be an issue if multiple users are trying to run the same report.
Third, you could set up a job to populate the tables (still in a separate db set to simple recovery) once a day (truncating at that time). Then anyone running a report that day will see the same data and there will be no concurrency issues. However, the data will not be up-to-the-minute. You could also set up a reporting data warehouse, but that is probably overkill in your case.
For my customer I occasionally do work in their live database in order to fix a problem they have created for themselves, or in order to fix bad data that my product's bugs created. Much like Unix root access, it's just dangerous. What lessons should I learn ahead of time?
What is the #1 thing you do to be careful about operating on live data?
BEGIN TRANSACTION;
That way you can roll back after a mistake.
Three things I've learned the hard way over the years...
First, if you're doing updates or deletes on live data, first write a SELECT query with the WHERE clause you'll be using. Make sure it works. Make sure it's correct. Then prepend the UPDATE/DELETE statement to the known working WHERE clause.
You never want to have
DELETE FROM Customers
sitting in your query analyzer waiting for you to write the WHERE clause... accidentally hit "execute" and you've just killed your Customer table. Oops.
Also, depending on your platform, find out how to take a quick'n'dirty backup of a table. In SQL Server 2005,
SELECT *
INTO CustomerBackup200810032034
FROM Customer
will copy every row from the entire Customer table into a new table called CustomerBackup200810032034, which you can then delete once you've done your updates and made sure everything's OK. If the worst happens, it's a lot easier to restore missing data from this table than to try and restore last night's backup from disk or tape.
Finally, be wary of cascade deletes getting rid of stuff you didn't intend to delete - check your tables' relationships and key constraints before modifying anything.
Do a backup first: it should be the number 1 law of sysadmining anyways
EDIT: incorporating what others have said, make sure your UPDATES have appropriate WHERE clauses.
Ideally, changing a live database should never happen (beyond INSERTs and basic maintenance). Changing the live DB's structure is especially fraught with potential bad karma.
Make your changes to a copy, and when you're satisfied, then apply the fix to live.
Often before I do an UPDATE or DELETE, I write the equivalent SELECT.
NEVER do an update unless you are in a BEGIN TRAN t1 (not in a dev database, not in production, not anywhere). NEVER run a COMMIT TRAN t1 outside a comment; always type
--COMMIT TRAN t1
and then select the statement in order to run it. (Obviously, this only applies to GUI query clients.) If you do these things, they will become second nature and you'll hardly lose any time.
I actually have an "update" macro that types this. I always paste this in to set up my updates. You can make a similar one for deletes and inserts.
begin tran t1
update
set
where
rollback tran t1
--commit tran t1
Always make sure your UPDATEs and DELETEs have the proper WHERE clause.
To answer my own question:
When writing an update statement, write it out of order.
1. Write UPDATE [table-name]
2. Write WHERE [conditions]
3. Go back and write SET [columns-and-values]
Choosing the rows you want to update before you say what values you want to change is much safer than doing it in the other order. It makes it impossible for update person set email = 'bob@bob.com' to be sitting in your query window, ready to be run by a misplaced keystroke, ready to mess up every row in the table.
Edit: As others have said, write the WHERE clause for your deletes before you write DELETE.
As an example, I create SQL like this
--Update P Set
--Select ID, Name as OldName,
Name='Jones'
From Person P
Where ID = 1000
I highlight the text from the end up to the Select and run that SQL. Once I verify that it is pulling the record I want to update, I hit shift-up to highlight the Update statement and run that.
Note that I used an alias. I never update a table name explicitly. I always use an alias.
If I do this in conjunction with transactions and rollback/commits, I am really, really safe.
My #1 way to be careful with a live database? Don't touch it. :)
Backups can undo damage that you inflict on the database, but you're still likely to introduce negative side effects during that span of time.
No matter how solid you think the script you're working with is, run it through a test cycle. Even if a "test cycle" means running the script against your own instance of the database, make sure you do it. It's much better to introduce defects on your local box than a production environment.
Check, recheck, and check again any statement that is doing updates. Even if you think you're just doing a simple, single-column update, sooner or later you will not have enough coffee and forget a 'where' clause, nuking a whole table.
A couple other things I've found helpful:
if using MySQL, enable Safe updates (a minimal sketch follows at the end of this answer)
If you have a DBA, ask them to do it.
I've found these 3 things have kept me from doing any serious harm.
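For the MySQL safe-updates suggestion above, a minimal sketch (a session-level setting):
-- refuse UPDATE or DELETE statements that lack a key-based WHERE clause or a LIMIT
SET SQL_SAFE_UPDATES = 1;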
Nobody wants backup but everyone cries for recovery
Create your DB with foreign key references, because you should:
make it as hard as possible for yourself to update/delete data in a way that destroys the structural integrity, or anything else along with it
If possible, run on a system where you have to commit the changes before you permanently store them (i.e. deactivate autocommit while repairing the db)
Try to identify the classes of problems you see so that you get an understanding of how to fix them without trouble
Get into a routine of restoring backups into a database; always have a second database on a test server at hand so you can just work on that
Because remember: if something fails totally, you need to be up and running again as fast as possible.
Well, that's about all I can think of now. Take the bold passages and you'll see what's #1 for me. ;-)
Maybe consider not using any deletes or drops at all. Or maybe reduce the user permissions so that only a special DB user can delete/drop things.
If you're using Oracle or another database that supports it, verify your changes before doing a COMMIT.
Data should always be deployed to live via scripts, which can be rehearsed as many times as required to get them right on dev. When there's dependent data for the script to run correctly on dev, stage it appropriately; you cannot skip this step if you truly want to be careful.
Check twice, commit once!
Backup or dump the database before starting.
To add on to what @Wayne said, write your WHERE before the table name in a DELETE or UPDATE statement.
BACK UP YOUR DATA. Learned that one the hard way working with customer databases on a regular basis.
Always add a using clause.
My rule (as an app developer): Don't touch it! That's what the trained DBAs are for. Heck, I don't even want permission to touch it. :)
Different colors per environment: We've set up our PL/SQL Developer (IDE for Oracle) so that when you log on to the production DB all the windows are in bright red. Some have gone as far as assigning a different color for dev and test as well.
Make sure you specify a where clause when deleting records.
always test any queries beyond select on development data first to ensure they have the correct impact.
if possible, ask to pair with someone
always count to 3 before pressing Enter (if alone, as this will infuriate your pair partner!)
If I'm updating a database with a script, I always make sure I put a breakpoint or two at the start of my script, just in case I hit the run/execute by accident.
I'll add to recommendations of doing BEGIN TRAN before your UPDATE, just don't forget to actually do the COMMIT; you can do just as much damage if you leave your uncommitted transaction open. Don't get distracted by phones, co-workers, lunch etc when in the middle of updates or you'll find everyone else is locked up until you COMMIT or ROLLBACK.
I always comment out any destructive queries (insert, update, delete, drop, alter) when writing out adhoc queries in Query Analyzer. That way, the only way to run them, is to highlight them, without selecting the commented part, and press F5.
I also think it's a good idea, as already mentioned, to write your where statement first, with a select, and ensure that you are altering the right data.
Always back up before changing.
Always make mods (eg. ALTER TABLE) via a script.
Always modify data (eg. DELETE) via a stored procedure.
Create a read only user (or get the DBA to do it) and only use that user to look at the DB. Add the appropriate permissions to schema so that you can view the content of stored procedures/views/triggers/etc. but not have the ability to change them.