SQL Server - Rolling back particular transaction only at a later date

SQL Server - Rolling back particular transaction only at a later date - sql-server

I have SQL Server 2014, standard edition. We have several tables where we delete data from, then re-insert it under different primary keys (to merge records for two people in our system that are actually the same). All these changes are performed with a T-SQL transaction.
I understand how transactions and rollbacks work, but what I need is more of an audit/rollback since my users may need to rollback just this transaction only at a later date (not restoring the whole database or table). "Change Data Capture" is not an option since I only have standard edition.
My real question lies in how to store this auditing information. I imagine I'll need a unique key to keep track of this being one unit of work so all these table changes get tied to same group as far as the user is concerned. But if I have a DELETE WHERE ID = #ID query for example, how do I store all these deleted records before deleting so that I can re-insert them later if needed? I'm fine with even storing a large rollback T-SQL script of some kind, I'm just not sure how to generate INSERT scripts that I can store and run later for data that I'm about to delete.
I'm open to any ideas, I just need an architecture that's generic enough to handle multiple tables and the ability to rollback deletions and insertions. I care more about the rollback ability than keeping a pretty audit table.

You can not do that out of the box as even with full logging you can roll back an entire database to a point in time but not specific transactions.
You will have to code something for un-doing transactions but I believe simple audit triggers will give you the data you need to make it happen. Here is a good article to get you started.
https://www.mssqltips.com/sqlservertip/4055/create-a-simple-sql-server-trigger-to-build-an-audit-trail/

Related

Disable transactions on SQL Server

I need some light here. I am working with SQL Server 2008.
I have a database for my application. Each table has a trigger to stores all changes on another database (on the same server) on one unique table 'tbSysMasterLog'. Yes the log of the application its stored on another database.
Problem is, before any Insert/update/delete command on the application database, a transaction its started, and therefore, the table of the log database is locked until the transaction is committed or rolled back. So anyone else who tries to write in any another table of the application will be locked.
So...is there any way possible to disable transactions on a particular database or on a particular table?

You cannot turn off the log. Everything gets logged. You can set to "Simple" which will limit the amount of data saved after the records are committed.

" the table of the log database is locked": why that?
Normally you log changes by inserting records. The insert of records should not lock the complete table, normally there should not be any contention in insertion.
If you do more than inserts, perhaps you should consider changing that. Perhaps you should look at the indices defined on log, perhaps you can avoid some of them.

It sounds from the question that you have a create transaction at the start of your triggers, and that you are logging to the other database prior to the commit transaction.
Normally you do not need to have explicit transactions in SQL server.
If you do need explicit transactions. You could put the data to be logged into variables. Commit the transaction and then insert it into your log table.
Normally inserts are fast and can happen in parallel with out locking. There are certain things like identity columns that require order, but this is very lightweight structure they can be avoided by generating guids so inserts are non blocking, but for something like your log table a primary key identity column would give you a clear sequence that is probably helpful in working out the order.
Obviously if you log after the transaction, this may not be in the same order as the transactions occurred due to the different times that transactions take to commit.
We normally log into individual tables with a similar name to the master table e.g. FooHistory or AuditFoo
There are other options a very lightweight method is to use a trace, this is what is used for performance tuning and will give you a copy of every statement run on the database (including triggers), and you can log this to a different database server. It is a good idea to log to different server if you are doing a trace on a heavily used servers since the volume of data is massive if you are doing a trace across say 1,000 simultaneous sessions.
https://learn.microsoft.com/en-us/sql/tools/sql-server-profiler/save-trace-results-to-a-table-sql-server-profiler?view=sql-server-ver15
You can also trace to a file and then load it into a table, ( better performance), and script up starting stopping and loading traces.
The load on the server that is getting the trace log is minimal and I have never had a locking problem on the server receiving the trace, so I am pretty sure that you are doing something to cause the locks.

Create audit table for a big table with a lot of columns in SQL Server

I know this question has been asked many times. My question here is I have a table which is around 8000 records but with around 25 columns. I would like to monitor any changes we make in this table. my server is only 2008.
We usually create an audit table for the specific table we monitor and record any changes into that using cursors as we usually have a lot of columns to monitor. But I don't want that this time!
Do you think instead of cursors, I can use a trigger to create a table called audit table XYZ and monitor changes in it having columns like field name, old value, new value, update_date, username?
Many thanks!

Short answer
Yes, absolutely use triggers over cursors. Cursors have a bad reputation for being misused and performing terribly, so where possible, avoid using them
Longer answer
If you have control over the application which is reading/writing to this table, consider have it build the queries for auditing instead. The thing to watch out for with an INSERT/UPDATE/DELETE trigger (which I assume is what you're going for) is that it's going to increase your write time for queries on that table, whereas writing the audit in its own query will avoid this (there is a caveat that I'll detail in the next paragraph). A consideration you also need to make is how much metadata the audit table needs to contain. For example, if your application requires users to log in, you may want to log their username to the audit table, which may not be available to a trigger. It all comes down to the purpose the audit table needs to serve for your application.
An advantage that triggers do have in this scenario is that they are bound to the same transaction as the underlying query. So if your INSERT/UPDATE/DELETE query fails and is rolled back, the audit rows which were created by the trigger will also be rolled back along with it, so you'll never end up with an audit entry for rows which never existed. If you favour writing your own audit queries over a trigger, you'll need to be careful to ensure that they are in the same transaction and get rolled back correctly in the event of an error

Test data inserted to Prod Database how do i delete that record

I mistakenly inserted few test records using PROD GUI, which got written to PROD Database. Is there a way to find what tables and column did those record touched ?
Thanks

I suppose you don't have a running trace, CDC or other tracking mechanism enabled. So it seems like the following steps would be a reasonable solution:
Make sure that you can't find and drop that data from the
Application GUI
Run SQL Profiler Trace using Tuning Template (it will give you enough information). Include ApplicationName and HostName columns to identify your connection.
Insert one more test record using UI (try to do the same operations as you did before)
Stop the trace and find the data you've inserted in it.
Identify other modification which was done from your application using ApplicationName, HostName, and SPID.
Create a SQL Script to delete those records.
Identify records which you had inserted before (probably they were inserted into the same tables)
Write a query to delete them too
Open transaction
Delete those records
Check that you have deleted only needed records
Commit transaction
UPD: according to the comment to this answer (with which I completely agree), if you have DEV or TEST environment on which you can do the same operation, do it and find modified records there. After that find modified records in the same tables on PROD.
P.S. I can not guarantee that following these steps you will be able to clean the data you've inserted, but probably you will be able to do this. I also recommend creating a full backup before deleting data.

If you have proper transaction logging enabled and using SQL Server 2008 and above. You can try using the Change Data Capture Stored Procedures (Transact-SQL) and check the changes happened to the tables. Hope it helps

Well you could track through the code to see what tables it touches. Run Profiler on dev to see what code it sends or procs it calls when enter a new records the same way that you did on prod.
If you have formal PK and FK relationships, you likely will find out by trial and error because it won't let you delete the parent records until all the children are deleted. And test with some other record on the dev environment to figure out what tables might be involved. Or you could script the FKs to see what other tables are related to the parent table.
If you have auditing (as every Enterprise solution should have, but I digress), you can often find out by looking in the audit tables for transactions at that time. Our audit tables have both the datetime of the transaction and the user which makes it easier to filter for these things.
Of course if you know your data model, you should have a pretty good idea before you start. Or if you have a particular id that is in all the child tables and you do not have nice convenient FKs, then you could check out the system tables to find what tables have that column name. That assumes a fairly standard naming convention though. If you call the same column different things in different tables, yo u might miss some.
If you are using an ORM, there should be some way to check what tables are in the object related to the particular task you did. So if you inserted a test order of instance, check out what is contained in the order object.

Sql Server, find all rows that have been updated by a statement

Is there a way of finding all the rows that have been updated by a single statement, sql itself must be tracking this as it could roll back the update if required. I'm interested in finding all the changed rows as I'm getting performance hit using update triggers.
I have a some large (2M-10M) row tables in Sql Server, and I'm adding audit triggers to track when records are updated and by what, trouble is this is killing performance. Most of the updates against the table will touch 20,000+ rows and they're now taking 5-10 times longer than previously.
I've thought of some options
1) Ditch triggers entirely and add the audit fields to every update statement, but that relies on everyone's code being changed.
2) Use before/after checksum values on the fields and then use them to update the changed rows a second time, still a performance hit.
Has anyone else solved this problem?

An UPDATE trigger already has the records affected by an update statement in the inserted and deleted pseudo columns. You can select their primary key columns into a preliminary audit table serving as a queue, and move more complicated calculation into a separate job.
Another option is the OUTPUT clause for the UPDATE statement, which was introduced in SQL Server 2005. (updated after comment by Philip Kelley)

SqlServer knows how to rollback because it has the transaction log. Is not something that you can find in the data tables.
You can try to add a timestamp column to your rows, then save a "current" timestamp, update all the rows. The changed rows should be all the rows with the timestamp greater than your "current" timestamp. THis will help you to find the changed rows, but not to find what has changed them.

You can use Change Tracking or Change Data Capture. These are technologies built into the Engine for tracking changes and are leveraging the Replication infrastructure (log reader or table triggers). Both are only available in SQL Server 2008 or 2008 R2 and CDC requires Enterprise Edition licensing.
Anything else you'd try to do would ultimately boil down to either one of:
reading the log for changes (which is only doable by Replication, including Change Data Capture, otherwise the Engine will recycle the log before you can read it)
track changes in triggers (which is what Change Tracking would use)
track changes in application
There just isn't any Free Lunch. If audit is a requirement, then the overhead of auditing has to be taken into consideration and capacity planning must be done accordingly. All data audit solution will induce significant overhead, so the an increase of operating cost by factors of 2x, 4x or even 10x are not unheard of.

Migrate and Merge several databases into one

In an update project i have to do the following:
Move 3 databases from SQL2000 to SQL2005 and merge them at the same time. There are already quite a few cross database queries used in SP's and Views.
The current plan is to move each of the old databases into a separate schema in 1 database.
That means we will also have to change our current SP's and Views, we now have:
SELECT OrderId, OrderDate FROM Sales.dbo.Orders
and expect we will have to change that into
SELECT OrderId, OrderDate FROM Sales.Orders
The question is: how do we do that as automated as possible?
I know about SED and similar for changing the scripts. I would welcome tips about how to be 'smart' about this, like strategies for partitioning the scripts, performance (tons of INSERT INTO lines) etc.
Note: I did look at the Import/Export Wizard but apparently I would have to set the Schema manually on each output table and fix the SP's through ALTER scripts anyway.

I did this a couple of years ago, and I ran into a few problems that you want to be aware of.
Assumptions:
You've got a single SQL 2000 database server with 3 databases, A/B/C
You want all of the objects to end up in SQL 2005 in database A (we'll refer to that as the Target)
You want to get rid of databases B and C eventually (the old Sources)
You don't have a full-blown test environment where you can automatically restore your production databases every day, and script this again and again until it's right. (That's the best way, and I've taken that approach too, but it's labor-intensive.)
Here's my hard lessons learned:
Don't do the merge and the SQL 2005 change the same day. Either do the merge before you go to 2005, or after, but don't try to accomplish it all in a single outage. It'll be a finger-pointing mess. If it was me, I'd go to 2005 first just to get it out of the way. That way, I know anything that breaks isn't because of a schema change, and those types of breaks are easier to fix. You want at least a week of end user activity on the 2005 box before you declare victory and move on to the merge.
Build the new objects in Target ahead of time. Even if they're not being queried in your live production apps, go ahead and build 'em now. That way you can populate fake test data in there to test your applications ahead of time. Yes, this means mixing live and test data, but frankly, you're already out there working without a net. Be wary of identity fields, though, since you can end up with conflicting records with the same identity number but different data in the Target and Source databases.
Create views in Target ahead of time. You mentioned that you've got views that already do cross-database queries. Copy those from Source to Target now, and tell any other developers (report guys, power users) to start referring to the Target views instead. This isn't going to speed up your own work, but it speeds up THEIR work. If you can get to the point where you can verify that they're only hitting Target (even though the Target views still point to tables in Source) then it'll make troubleshooting easier on migration day. Then you can start denying permissions on the Source views ahead of time.
Sync tables ahead of time. Make a list of all of the tables that need to be moved out of the Sources, and for each one, analyze how it's being updated. If it's only being inserted into (not updated or deleted), like a log table, then write a T-SQL script to start keeping it in sync in Target. Run that script via a SQL Agent job during periods of low activity on your server, like nightly. This way, when it's go-live day, you won't have to push as many records around, meaning your go-live window will be smaller and your Target transaction logs can stay smaller. Tables that are being constantly updated or deleted aren't as easy, and it's up to you whether you decide to sync those as well. We did it for any tables over a million lines.
Check for record conflicts between the Source databases. It sounds like this one doesn't apply to you specifically, but I'm noting it here in case anybody else does a merge and they're reading it for tips. If you have more than one Source database, dump out the list of objects. If you've got two objects with the same name, check their schema. I've worked with instances where they had a State or Region table in each database, and they were supposed to be identical, but they had identity fields for their primary keys. Each child table (like Customers, which linked to a Region table) referred to the parent table (Region) by the primary key (identity field) - which didn't match from one database to the other. In that case, the smart thing to do is take an outage window ahead of time, before the migration day, to clean those records up with manual update scripts.
Disable any constraints or foreign key relationships
Change the identity fields (if they're lookup tables, you may be able to turn off the identity stuff and just run with manually specified pk numbers)
Modify the Region table to add a NewID field, matching to what it's going to become, and an OldID field, showing what it used to be
Update all of the child tables (Customers) to use the NewID number instead of the original
Update the Region table so that the real ID field now has the NewID value, and the OldID field has what the Region used to be. (You're probably going to screw something up like miss a child table you didn't know about, and you're going to wonder what it used to be.)
Break the migration into pieces. List every stored proc in all of the databases. If any of them can be moved without moving data, do that first. For example, if you've got Source.dbo.usp_RunReport, and it only refers to tables in the Target database, then do that in a first phase. If you've got small system lookup tables that are only used internally in your app, not visible to customers or reports, then put that in the first phase too. It sounds like it's too small to bother with, but the idea is to reduce the amount of panic on migration day. The less you wonder about, the better you can troubleshoot. We moved every static lookup table (State, Region, Calendar, etc) over ahead of time. The amount of work required in Phase 1 - just moving those small, static tables - got management to understand how huge it was going to be to move the rest, and it bought us resources and time we wouldn't have gotten otherwise.
Pre-grow the data files for Target. If you're not using SQL 2005's new Instant File Initialization, data file growths take quite a while. Enable Instant File Initialization if you've got a choice, then grow the data files to make sure they're not fragmented. If they just grow naturally during your migration day, they can be fragmented. If you can't use Instant File Initialization, you still need to pre-grow the files, but you want to do that ahead of time during periods of low activity to speed up the maintenance window.
On migration day, run your inserts one table at a time, or smaller. You want to keep your insert transactions as tight as possible. The smaller your insert transactions, the less space you'll need in the transaction log. Remember that the transaction log will grow with insert statements even in simple mode. After every round of inserts, do a sanity check to make sure that they worked, and that you're not going to run out of drive space for data files or t-log files.
After the updates finish, change security on the Source databases. Put every non-SA login into the dbdenydatareader and dbdenydatawriter roles in the Source databases. That way they can still log in if they've hard-coded the database name in the connection string, but they won't be able to do anything. This makes your troubleshooting easier too: if an app or a query runs into problems, you could consider taking their login out of the deny roles and see if it works - if it does, it's borked. The risk with that is that they might run a transaction that uses the Source database data to update the Target database (get customers from Source, update them in Target) and it might cause issues.
Other options for the Source databases are:
Rename them, so you can still query 'em but the apps won't touch 'em
Detach them, but keep the files available in case you need to troubleshoot
Strip out all logins, and use new logins to access the existing databases just in case. Then if somebody's read-only report is totally borked, you can let it work temporarily by issuing them a new login and telling them it's referring to the wrong database.
After the updates finish, rebuild indexes & statistics on Target. If you're just doing continuous inserts, this isn't a big deal, but if you're merging multiple databases (like two Sales databases that had been broken up into regions of the country) then you'll want to clean things up.
IMHO, use one schema unless you can justify a gain from multiple schemas. This last one is just my two cents, but it sounds like you're going through an awful lot of work to go from 3 databases 1 schema each, to 1 database with 3 schemas. If you're not really sure about the 3 schema thing, you might consider using 1 schema - or else you'll be in another messy rework later on down the road. 3 schemas does make sense if you have specific security needs, but otherwise, just make sure you're getting the bang for the buck that you want. Now would be a great time to go to one schema.

You could give Redgate SQL Compare and Data Compare a shot. They have a schema mapping feature that should let you map the dbo schema to the sales schema in another and then move the tables and procs. It would make it so you don't have to mess with the SQL export wizard. You still would have to refactor your other objects though.
I love these two tools.
edit:
I think you can get a fully functional demo too.
edit:
Additionally, they offer SQL Refactor, which does a 'smart' rename. Score!

Could you have a dummy database called SALES that has a VIEW called [Orders]:
CREATE VIEW Sales.dbo.Orders
AS
SELECT OrderId, OrderDate, ...
FROM CombinedDatabase.Sales.Orders
and then
SELECT ... FROM Sales.dbo.Order
will still work.
You won't be able to INSERT / UPDATE that table without some further jiggery-pokery though.
If you could have such VIEWs log that they were used that would enable you to fix the code that called them!! but I can't think of a way to do that; however you could disable each in turn, run some tests, fix whatever is broken, then move on to next one ... and thus eradicate them by refactoring, but have a largely working application during the process.
I've used SED for this type of thing, but we have unique names for all our tables and all our columns, and we use variable names within our application that match the database column names - so I would have high confidence that changing xxx_yyy_ID to aaa_bbb_ID in our application would work well, and not have accidental side effects.
If you have actual column/table names like "Sales" and "Orders" I think that something like SED would be risky

Ok, so my basic understanding of your problem is something like this:
You have three different databases (i.e. Sales, Manu, Inventory)
They have distinct table & procedure names (no table/proc names in Sales exist in Manu or Inventory)
You want all the tables/procs from all three databases in a single database (i.e. SaleManInv)
Some stored procedures in each database explicitly refer to tables in the other databases (i.e. Sales.dbo.lookupItem() explicitly refers to Inventory.dbo.Items table)
Exporting and importing the tables doesn't seem like it will be a problem, what I would do for the procs:
Export one proc from the SQL Server 2000 db to the SQL Server 2005 DB to determine if you need to get rid of the ".dbo." portion of the cross references.
Export all the procs to text files (same folder for all procs)
Use a text editor with a "Search and Replace in Files" (I use PSPAD) and replace all the "Sales.dbo." with "SaleManInv.dbo.", then all the "Iventory.dbo." with "SameManInv.dbo." etc. to convert all the references to the new db.
Then run the exported and modified procs into your new db.
Is that making any sense? :-)

I was in a similar position where I had several SQL Server 2008 databases that were merged into 1. My solution was to use Integration Services' Transfer Server Objects task into a new target database. All data was copied over along with tables. Afterwards - in what was a very complex query, I scripted out all stored procedures/functions/views/etc. to a file and changed all cross-database references and re-created the stored procedures and other objects.
The trick with the stored procedures was to script them out in the order or syscontraints in order to ensure that stored procedures or functions that were referencing other stored procedures/functions internally were created last.
If there was a tool that I felt could have handled this task in an automated fashion, I would have purchased it immediately.

I would like to know if it's same kind of data. Any way. I would create a new column with the name 'SourceSystem'. So when the boss comes running after:
" - what was the sales diff between databasesystem1 and db2 in 2004".
Then you can answer that. Then in a year or two, if that questions don't pop up. You can delete that column. Merging data removes the origin of the data.

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight