Change Data Capture - initial load of historical data - sql-server

I'm upgrading my SQL Server 2000 database to SQL Server 2008 R2. I want to make use of the Change Data Capture feature. In my existing application I have similar functionality, but I'm using triggers and historical tables with an Hst_ prefix and almost the same schema as the original tables.
My question is: is there any way to migrate my data from Hst_ tables to the tables used by CDC feature?
I was thinking of doing that like this:
I have the table Cases.
I'm using my custom historization mechanism, so I also have three triggers (on insert, update and delete) and a twin table Hst_Cases.
Now I'm enabling CDC on table Cases.
CDC creates a function which returns historical data (fn_cdc_get_all_changes_dbo_Cases) and also a system table which actually holds the data (cdc.dbo_Cases_CT).
I could insert data from Hst_Cases to cdc.dbo_Cases_CT, but I have the following problems:
I don't know how to get __$start_lsn and __$seqval.
It is difficult to figure out __$update_mask (I have to compare each two rows).
Is that the only way to do it? I want to avoid a situation where I have to join the "new" historical data with the "old" historical data from the Hst_ tables.
Thanks!

You typically don't want to use the capture tables to store long-term change data. It would be better to have an SSIS package move the capture data to permanent tables. If you do use them, I think that if you ever have to restore your database, they'll be empty after the restore unless you use the KEEP_CDC option when restoring. You'll also need to disable the job that automatically purges the capture tables.
If you create your own tables for storage, you can omit the lsn and mask fields.
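If you still want a "which columns changed" indicator in your own tables, it can be derived by comparing each history row with the previous one, as the question suggests. Below is a minimal Python sketch of that comparison. The column names, the sample rows and the bit layout (bit position = index in the column list) are assumptions made for illustration, and the resulting mask is not claimed to match the format of CDC's __$update_mask.

```python
from typing import Any, Dict, List, Sequence

Row = Dict[str, Any]


def changed_columns(prev: Row, curr: Row, columns: Sequence[str]) -> List[str]:
    """Return the names of the columns whose values differ between two rows."""
    return [name for name in columns if prev.get(name) != curr.get(name)]


def change_mask(prev: Row, curr: Row, columns: Sequence[str]) -> int:
    """Encode the changed columns as an integer bitmask.

    Hypothetical layout: bit i is set when columns[i] changed.
    """
    mask = 0
    for position, name in enumerate(columns):
        if prev.get(name) != curr.get(name):
            mask |= 1 << position
    return mask


# Hypothetical history rows for one case, ordered by change time.
columns = ["CaseId", "Status", "Owner"]
history = [
    {"CaseId": 1, "Status": "open", "Owner": "ann"},
    {"CaseId": 1, "Status": "open", "Owner": "bob"},
    {"CaseId": 1, "Status": "closed", "Owner": "bob"},
]

# Compare each row with its predecessor.
masks = [change_mask(a, b, columns) for a, b in zip(history, history[1:])]
```

Here the first transition changes only Owner and the second only Status, so each mask has a single bit set.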

Related

Trying to validate if a SQL Server data model uses temporal tables?

Looks like SQL Server 2008 and later uses the concept of "temporal tables" to manage table data history:
https://learn.microsoft.com/en-us/sql/relational-databases/tables/temporal-table-usage-scenarios
Looks like the following clause is used to accomplish this:
WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.MyTableHistory));
Let's assume that a data model has tables TableX and a TableXHistory and I select the following context menu path to generate a DDL script of TableX:
Script Table as > CREATE to > New Query Editor Window
If the generated SQL script does not have a text reference to "HISTORY_TABLE" then can I say 100% that the history table is not managed as a temporal table? Also, would a temporal table be explicitly displayed in the standard tables directory for the data model? Is there any reason not to use temporal tables in 2018 as opposed to manually created history tables? My first impression is that anyone who creates manual history tables in 2018 is most likely out of date with SQL Server capabilities.
Temporal tables are available only from SQL Server 2016 onward. The technology is not mature yet.
Temporal tables have their own Pros & Cons. Other options should be considered (classic triggers and history table, change data capture, replication, etc.)
The main disadvantages of temporal tables for me:
multiple changes made at the same time are invisible (only one row is returned)
history tables must be located at the same DB
limitation for transactional replication, merge replication is not supported
issues when system time has been changed - no way to know which update was first w/o implementing additional logic (version)
history tables can't be updated w/o disabling versioning
to get net changes you need to query the base table (which is not good).
how to detect which columns are changed? (CDC & triggers can detect that naturally, with temporal it may be very expensive)
...

Replicated database for storing historical data

Only part of the data in the database is being processed by the application, the rest is necessary for reporting purposes, but it causes poor application performance. I would like to archive historical data without modifying database schema.
Is it possible to replicate the database, delete old data from the primary instance and regularly synchronise new changes into the replicated database? That way the primary "transactional" database will be lightweight and the replicated database will contain the full set of both current and historical data for reporting purposes.
Could you recommend some tools or give some tips to achieve that on Oracle?
edit:
I'm wondering if I could use streams and somehow make DML handler to ignore DELETE operations on rows (docs.oracle.com/cd/B28359_01/server.111/b28321/…) so that during data replication historical rows will be preserved despite being deleted from transactional db.
You don't need to create two separate databases. Just create one transactional database where you will save all your transactions and then create views based on these tables to show required data. In this way you just have to maintain only one database.

Tools to update tables in SQL server 2000/2005

Is there any handy tool that can make updating tables easier? I usually get an Excel file with the original value in one column and the new value in another column. Then I write a formula in Excel to create the 'update' statement. Is there any way to simplify the updating task?
I believe the approach in SQL server 2000 and 2005 would be different, so could we discuss them both? Thanks.
In addition, these updates are usually requested by "non-programmers" (which means they don't understand SQL, so it may not be feasible to let them write queries). Is there any tool that lets them update the table directly without having DBAs do this task? The tool also needs to limit their privileges to modifying only certain tables, and ideally it should offer a way to roll back the change.
Create a DTS package that will import a csv file, make the updates and then archives the file. The user can drop the file in a specific folder designated for the task or this can be done by an ops person. Schedule the DTS to run every hour, day, etc.
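To illustrate the import step with the rollback the question asks for, here is a rough Python sketch. It is not a DTS package. It uses an in-memory SQLite database as a stand-in for SQL Server, and the table name products, the column name and the CSV headers old_value and new_value are all made up for the example. The whole file is applied in one transaction, so a bad row undoes everything.

```python
import csv
import io
import sqlite3


def apply_updates(conn: sqlite3.Connection, csv_text: str) -> bool:
    """Apply old -> new value pairs from CSV text, all or nothing.

    Returns True if every row was applied, False if the batch was rolled back.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            for row in rows:
                cur = conn.execute(
                    "UPDATE products SET name = ? WHERE name = ?",
                    (row["new_value"], row["old_value"]),
                )
                if cur.rowcount == 0:
                    # Treat an unmatched original value as an error in the file.
                    raise ValueError(f"no row matches {row['old_value']!r}")
    except ValueError:
        return False
    return True


# Hypothetical data standing in for the real table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name TEXT)")
conn.executemany("INSERT INTO products VALUES (?)", [("apple",), ("pear",)])
conn.commit()

ok = apply_updates(conn, "old_value,new_value\napple,apricot\n")
failed = apply_updates(conn, "old_value,new_value\npear,plum\nmissing,x\n")
names = sorted(r[0] for r in conn.execute("SELECT name FROM products"))
```

The second file contains a value that matches nothing, so its first update (pear to plum) is rolled back along with the rest.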
In case your users would insist that they keep using Excel, you've got several different possibilities of getting the data transferred to SQL Server. My preferred one would be to use DTS/SSIS, as mentioned by buckbova.
However, another method is by using OPENROWSET(), which makes it possible to query your Excel file as if it was a table. I wrote a small article about it here: http://blog.hoegaerden.be/2010/03/29/retrieving-data-from-excel/
Another approach that hasn't been mentioned yet (I'm not a big fan of letting regular users edit data directly in the DB), any possibility of creating a small custom application for them?
There you go, a couple more possible solutions :-)
Valentino.
I think the best approach is to expose a view on your data accessible to users who are allowed to do updates, and set up triggers on the view to perform the actual updates on the underlying data. Restrict change to only the columns they should be changing.
This technique can work on SQL Server 2000 and 2005.
I would add audit triggers on the underlying tables so you can always track changes.
You'll have complete control, and they can connect to it with Access or whatever and perform their maintenance.
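As a rough illustration of the view-plus-trigger idea, here is a small Python sketch that uses SQLite in place of SQL Server (both support INSTEAD OF triggers on views, but the syntax differs, so treat this as a sketch only). The table customers, the view customer_names and the column names are hypothetical. The view exposes only the columns users may edit, and the trigger writes only the name column through to the base table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Base table: users should never touch credit_limit.
    CREATE TABLE customers (
        id INTEGER PRIMARY KEY,
        name TEXT,
        credit_limit INTEGER
    );

    -- The view exposes only the editable column plus the key.
    CREATE VIEW customer_names AS
        SELECT id, name FROM customers;

    -- The trigger performs the real update and touches only name.
    CREATE TRIGGER customer_names_update
    INSTEAD OF UPDATE ON customer_names
    BEGIN
        UPDATE customers SET name = NEW.name WHERE id = OLD.id;
    END;

    INSERT INTO customers VALUES (1, 'Acme', 5000);
    """
)

# A user updates through the view; the trigger applies it to the base table.
conn.execute("UPDATE customer_names SET name = 'Acme Ltd' WHERE id = 1")
row = conn.execute(
    "SELECT name, credit_limit FROM customers WHERE id = 1"
).fetchone()
```

On SQL Server you would additionally grant the users permissions on the view only, not on the underlying table.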
You could create some accounts in SQL Server for these users and limit their access to only certain tables and columns, along with only select / update / insert privileges. Then you could create an Access database with linked tables to these.

How to make a log on relational-data in SQL-Server?

I need to create a log in my DB where every action in the program is recorded.
I also want to store additional data in it, for example the table and row the action was applied to.
In other words I want the log to be dynamic and should be able to refer to the other tables in the database.
The problem is, I don't know how to relate all the tables to this log.
Any ideas?
You have two choices here:
1) modify your program to add logging for every db access
2) add triggers to each table in your db to perform logging operations.
I don't recommend one logging table for all tables. You will have locking issues if you do that (every insert, update and delete in every table would have to hit this one table, which is a bad idea). Create an audit table for each table that you want to audit. There are lots of possible designs for the table, but they usually include some variant of old value, new value, date changed, and the user who made the change.
Then create triggers on each table to log the changes.
I know SQL Server 2008 also has a systemic way to set up auditing. This would be easier to set up than manual auditing and might be enough to lure your company into using 2008.
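For the manual route, the per-table audit table and triggers described above could look roughly like the sketch below. It runs against SQLite from Python purely so that it is self-contained, and SQL Server trigger syntax is different (it uses the inserted and deleted pseudo-tables instead of OLD and NEW). The tables cases and cases_audit and their columns are made-up names. SQLite has no notion of a login, so the "user who made the change" column is left out here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE cases (id INTEGER PRIMARY KEY, status TEXT);

    -- One audit table per audited table: old value, new value, date changed.
    CREATE TABLE cases_audit (
        audit_id   INTEGER PRIMARY KEY AUTOINCREMENT,
        case_id    INTEGER,
        action     TEXT,              -- 'I', 'U' or 'D'
        old_status TEXT,
        new_status TEXT,
        changed_at TEXT DEFAULT CURRENT_TIMESTAMP
    );

    CREATE TRIGGER cases_ai AFTER INSERT ON cases
    BEGIN
        INSERT INTO cases_audit (case_id, action, old_status, new_status)
        VALUES (NEW.id, 'I', NULL, NEW.status);
    END;

    CREATE TRIGGER cases_au AFTER UPDATE ON cases
    BEGIN
        INSERT INTO cases_audit (case_id, action, old_status, new_status)
        VALUES (NEW.id, 'U', OLD.status, NEW.status);
    END;

    CREATE TRIGGER cases_ad AFTER DELETE ON cases
    BEGIN
        INSERT INTO cases_audit (case_id, action, old_status, new_status)
        VALUES (OLD.id, 'D', OLD.status, NULL);
    END;
    """
)

# Exercise all three triggers.
conn.execute("INSERT INTO cases VALUES (1, 'open')")
conn.execute("UPDATE cases SET status = 'closed' WHERE id = 1")
conn.execute("DELETE FROM cases WHERE id = 1")

audit = conn.execute(
    "SELECT action, old_status, new_status FROM cases_audit ORDER BY audit_id"
).fetchall()
```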

MS SQL Auditing

I have a problem with a database at my work. There is currently auditing in place, but it's clunky, requires a lot of maintenance, and it falls short in a few regards. So I am replacing it.
I want to do this in as generic of a way as possible and have designed the tables, and how everything will link and be updated.
Now, that's all fine and good, but I want to be able to write a generic way to insert records into these audit tables. (Without having to enter a command for each column in each table being changed.)
Is there any way within a Stored Procedure to iterate over all the columns in a table? And I would like to write this in such a way that it will work with several tables, and automatically pick up and audit added columns and such.
Any ideas?
EDIT: Guess I should Clarify. I will be auditing data that is in the tables. But I will be using the same table(s) to store the audited data for every table in the database.
And I can not use Triggers because usually, when an update occurs, it occurs across multiple tables, but I would like all of these updates to be part of a single Change Set.
That is not a problem, because I can do all the updates from within a single Stored Proc. I would just prefer some way, like a loop, that lets me get all the updated fields, figure out which ones changed, and then insert those changed ones into the audit table.
And I would like to do this without having a long list of if statements and insert statements for each column. (By doing this in a generic loop, it will handle added columns automatically and not be bothered by deleted columns.)
By "added columns" I guess you are looking to audit DDL. If you use SQL 2005, then you want this link.
If you don't use SQL 2005, then you probably want to use one of the many SQL schema comparison tools; Red Gate's SQL tool set probably has something in there.
If you don't have $ for tools, then you might just want to run periodic queries against information_schema.tables and information_schema.columns. By periodically capturing these in permanent tables, you can identify when they have gained or lost rows (and hence that a schema change occurred).
If you are doing data audit instead, then you'll want to code generate some triggers, again using information_schema.tables and information_schema.columns.
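Here is a rough sketch of what "code generating" a trigger from column metadata can look like. It is written in Python against SQLite so that it runs as-is, with PRAGMA table_info standing in for information_schema.columns. On SQL Server you would generate T-SQL instead. The shared audit_log table, its columns and the helper names are assumptions for the example, and the table and key names are assumed to be trusted identifiers. Regenerating the trigger after a schema change picks up added columns.

```python
import sqlite3
from typing import List


def column_names(conn: sqlite3.Connection, table: str) -> List[str]:
    """List a table's columns (PRAGMA table_info rows: cid, name, type, ...)."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


def create_update_audit_trigger(conn: sqlite3.Connection, table: str, key: str) -> None:
    """(Re)create an AFTER UPDATE trigger that logs each changed column."""
    statements = []
    for col in column_names(conn, table):
        if col == key:
            continue
        # IS NOT is SQLite's NULL-safe inequality: only real changes are logged.
        statements.append(
            "INSERT INTO audit_log "
            "(table_name, row_key, column_name, old_value, new_value) "
            f"SELECT '{table}', OLD.{key}, '{col}', OLD.{col}, NEW.{col} "
            f"WHERE OLD.{col} IS NOT NEW.{col};"
        )
    body = "\n".join(statements)
    conn.executescript(
        f"DROP TRIGGER IF EXISTS {table}_audit_u;\n"
        f"CREATE TRIGGER {table}_audit_u AFTER UPDATE ON {table}\n"
        f"BEGIN\n{body}\nEND;"
    )


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE audit_log (
        table_name TEXT, row_key INTEGER, column_name TEXT,
        old_value, new_value
    );
    CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, amount INTEGER);
    INSERT INTO orders VALUES (1, 'new', 10);
    """
)

create_update_audit_trigger(conn, "orders", "id")
conn.execute("UPDATE orders SET status = 'paid' WHERE id = 1")

# A column is added later; regenerating the trigger picks it up.
conn.execute("ALTER TABLE orders ADD COLUMN note TEXT")
create_update_audit_trigger(conn, "orders", "id")
conn.execute("UPDATE orders SET note = 'rush', amount = 10 WHERE id = 1")

log = conn.execute(
    "SELECT column_name, old_value, new_value FROM audit_log ORDER BY rowid"
).fetchall()
```

Note that the second update sets amount to its existing value, so only the note column is logged for it.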
There would be performance considerations, but you could add insert and update triggers to all of your tables, and have the triggers insert into your audit tables.
Use DDL Triggers (assuming you have SQL Server 2005+)!
http://www.sqlteam.com/article/using-ddl-triggers-in-sql-server-2005-to-capture-schema-changes
http://technet.microsoft.com/en-us/library/ms189871.aspx
That could be done if you were using a data access layer that could trap which tables and columns are being updated and generate the insert statements for the audit table. In a stored procedure? Which stored procedure? Do you have a single one that does updates? Or are you creating one per table?
If it's an option for you, just upgrade to sql server 2008 and turn on Change Data Capture.

Resources