Copy huge table data from one database to another in SQL Server - sql-server

I have two databases, say DB_A and DB_B. The tables in DB_A hold huge amounts of data (a few tables have 2 to 10 million rows). I want to move all the table data from DB_A to DB_B. Please help me write stored procedures that move the data from one database to the other efficiently (fast).

The issue is how to handle your transaction logs: the move has to be logged in both databases, so you should process it in chunks.
So... try something like this:
-- Run this from DB_B (adjust the three-part names to your own databases and tables).
WHILE EXISTS (SELECT 1 FROM DB_A.dbo.TableName)
BEGIN
    DELETE TOP (100) FROM DB_A.dbo.TableName
    OUTPUT DELETED.* INTO dbo.TableName;
    CHECKPOINT;  -- lets the log truncate between batches under the simple recovery model
END

No need to reinvent the wheel: have you considered using SQL Server Replication to move your database data?
Using Transactional Replication, for example, you could define a Publication containing the database tables that you wish to be copied/replicated.
You can then either synchronize the data on an ad hoc basis using a Snapshot, or, if you wish to keep the data up to date in "near real time", use continuous replication.
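If you do go the replication route, the publication and subscription can also be scripted with the replication system stored procedures. The following is only a minimal sketch with made-up names (publication DB_A_Pub, table dbo.BigTable, subscriber SubscriberServer, destination DB_B); it assumes the distributor is already configured, and in practice the setup is usually done through the SSMS replication wizards:

USE DB_A;
EXEC sp_replicationdboption @dbname = N'DB_A', @optname = N'publish', @value = N'true';

-- Transactional publication; use @repl_freq = N'snapshot' for ad hoc refreshes instead.
EXEC sp_addpublication @publication = N'DB_A_Pub', @repl_freq = N'continuous', @status = N'active';
EXEC sp_addpublication_snapshot @publication = N'DB_A_Pub';

-- One article per table you want copied.
EXEC sp_addarticle @publication = N'DB_A_Pub', @article = N'BigTable',
     @source_owner = N'dbo', @source_object = N'BigTable';

-- Push subscription to the target server/database.
EXEC sp_addsubscription @publication = N'DB_A_Pub', @subscriber = N'SubscriberServer',
     @destination_db = N'DB_B', @subscription_type = N'Push';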

Related

Incrementally moving data across SQL Server instances without an ETL tool

I am working on a project that involves regularly copying data from identically structured database A (on a SQL Server instance) to database B (on a different instance). I do not have the option of using SSIS or another ETL tool, so I have to use already established linked servers and write a stored procedure run as an hourly job for the incremental loads.
To copy the delta records from database A to B, I have the following stored procedure to be run as a job:
CREATE PROCEDURE Incremental_Load
AS
BEGIN
SET NOCOUNT ON;
INSERT INTO New_Table (New_Date_Column, Copied_Attribute)
SELECT Datetime_Var, Attribute1
FROM Linked_Server.Database_A.dbo.Source_Table
WHERE Datetime_Var > (SELECT MAX(New_Date_Column) FROM New_Table);
END
It seems to work; only the new records that are not yet present in the target table are inserted, based on the most recent datetime. My question has to do with performance and potential issues. If the source table gets new records every 30 minutes, and this stored procedure is run hourly, should I expect any issues, especially with regard to the subquery? Also, is it possible I will not detect all deltas? Are there any other glaring issues with this approach?
In case it matters, I do not have access to SQL Server Profiler. I also cannot create jobs because of my privileges/roles; someone else must do that for me. So I would like to collect some information in advance of testing.
Thank you.
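One concrete gap in the MAX()-watermark approach worth noting: if New_Table is ever empty, MAX(New_Date_Column) returns NULL and the WHERE clause then matches nothing, so no rows load at all; rows that arrive late but share an already-loaded datetime would also be skipped. A hedged variation (same table and column names as in the question; the procedure name and the '19000101' floor date are made up) that guards the empty-table case and reads the watermark once into a variable:

CREATE PROCEDURE Incremental_Load_V2
AS
BEGIN
    SET NOCOUNT ON;

    -- Capture the watermark once; fall back to a fixed floor if the target is empty,
    -- because MAX() over an empty table returns NULL and would filter out every row.
    DECLARE @LastLoaded datetime;
    SELECT @LastLoaded = COALESCE(MAX(New_Date_Column), '19000101') FROM New_Table;

    INSERT INTO New_Table (New_Date_Column, Copied_Attribute)
    SELECT Datetime_Var, Attribute1
    FROM Linked_Server.Database_A.dbo.Source_Table
    WHERE Datetime_Var > @LastLoaded;
END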

SQL Server: Archiving old data

I have a database that is getting pretty big, but the client is only interested in the last 2 years' data. However, they would like to keep the older data "just in case".
Now we would like to archive the data to a different server over a WAN.
My plan is to create a stored proc to:
Copy all data from lookup tables, tables containing master data and foreign key tables over to the archive server.
Copy data from transactional tables over to the archive DB.
Delete transactional data from master db that's older than 2 years.
Although the approach will theoretically meet our needs, there are two main problems:
Performance: I'm copying the data over via SQL linked servers. Some of the big tables are really slow, because the process has to work out which records already exist and update them, while records that don't exist have to be inserted. It looks like it will run for 3-4 hours.
We need to copy the tables in the correct sequence to prevent foreign key violations, and tables that have a relationship to themselves (e.g. a Customers table with a ParentCustomer field) need to be transferred without the ParentCustomer and then have the ParentCustomer updated afterwards to prevent FK violations. This makes it difficult to auto-generate my INSERT and UPDATE statements (and I would like to auto-generate my statements as far as possible).
I just feel there might be a better way of archiving data that I do not yet know about. SSIS might be an option, but I'm not sure whether it will solve my existing challenges. I don't know much about SSIS, so I might need to find some material to study if that's the way to go.
I believe you need a batch process that will run as a scheduled task; perhaps every night. There are two options, which you have already discussed:
1) SQL Agent Job, which executes a Stored Procedure. The stored procedure will use Linked Servers.
2) SQL Agent Job, which will execute an SSIS package.
I believe you could benefit from a combination of both approaches, which would avoid linked servers. Here are the steps:
1) An SQL Agent Job executes an SSIS package, which transfers the data to be archived from the live database to the copy database. This should be done in a specific sequence to avoid foreign key violations.
2) Once the SSIS package has completed the transfer, it executes a stored procedure on the live database that deletes the information that is over two years old. The stored procedure will not require any linked servers.
You will have to use transactions to make sure duplicate data is not archived. For example, if the SSIS package fails then the transaction should be rolled back and the Stored Procedure should not be executed.
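For step 2, the clean-up procedure on the live database can delete in batches so locks and log growth stay manageable. A rough sketch, assuming a hypothetical transactional table dbo.TransactionalTable with a TransactionDate column:

CREATE PROCEDURE dbo.Purge_Archived_Rows
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @Cutoff datetime;
    SET @Cutoff = DATEADD(YEAR, -2, GETDATE());

    -- Delete in small batches until nothing older than the cutoff remains.
    WHILE 1 = 1
    BEGIN
        DELETE TOP (5000) FROM dbo.TransactionalTable
        WHERE TransactionDate < @Cutoff;

        IF @@ROWCOUNT = 0 BREAK;
    END
END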
You can use table partitions to create separate partitions for relevant date ranges.
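If you explore partitioning, the idea is to align the big tables with a date-based partition function so an entire year can be handled (or switched out) as a unit. A minimal sketch with made-up boundary dates (the unambiguous YYYYMMDD format avoids DATEFORMAT surprises):

-- One partition per year; RANGE RIGHT puts each boundary date into the partition to its right.
CREATE PARTITION FUNCTION pf_ByYear (datetime)
AS RANGE RIGHT FOR VALUES ('20120101', '20130101', '20140101');

-- Everything mapped to PRIMARY here for simplicity; in practice older years would map to an archive filegroup.
CREATE PARTITION SCHEME ps_ByYear
AS PARTITION pf_ByYear ALL TO ([PRIMARY]);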

insert data from different db server every second

The primary DB receives all the raw data every 10 minutes, but it only stores one week of it. I would like to keep all the raw data for one year in another DB on a different server. How can that be done?
I have written a T-SQL query to select the required data from the primary DB. How can I keep picking up new data from the primary DB and inserting it into the secondary DB accordingly? The table has a datetime column; could that be used to insert only the rows newer than the latest datetime already copied?
Notes: the source is SQL Server 2012;
the secondary DB is SQL Server 2005.
If you are on SQL Server 2008 or higher, the MERGE command (see the MS docs) may be very useful in your actual update process. Be sure you understand it.
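A minimal MERGE sketch (table and column names are made up) for upserting staged rows into the year-long table; note that MERGE needs SQL Server 2008 or later, so with a 2005 secondary it would only be an option on the 2012 side or after an upgrade:

MERGE dbo.Olap_Readings AS target
USING dbo.Staging_Readings AS source
    ON target.ReadingId = source.ReadingId
WHEN MATCHED THEN
    UPDATE SET target.ReadingTime  = source.ReadingTime,
               target.ReadingValue = source.ReadingValue
WHEN NOT MATCHED BY TARGET THEN
    INSERT (ReadingId, ReadingTime, ReadingValue)
    VALUES (source.ReadingId, source.ReadingTime, source.ReadingValue);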
Your table containing the full year of data sounds like it could be OLAP, so I refer to it that way occasionally (if you don't know what OLAP is, look it up sometime, but it does not matter to this answer).
If you are only updating 1 or 2 tables, log shipping, replication and failover may not work well for you, especially since you are not replicating the table exactly, due to the different retention policies if nothing else. So make sure you understand how replication, etc. work before you go down that path. If these tables make up more than perhaps 50% of the total database, log-shipping-style methods might still be your best option. They work well and handle downtime issues for you -- you just replicate the source database to the OLAP server and then update from the duplicate database into your OLAP database.
Doing an update like this every second is an unusual requirement. However, if you create a linked server, you will be able to insert your selected rows into a staging table on the remote server and then update from them into your OLAP table(s). If you can reliably update your OLAP table(s) on the remote server in 1 second, you have a potentially useful method. If not, you may fall behind on posting data to your OLAP tables. If you update once a minute instead, you are much less likely to fall behind on the update cycle (at the cost of being slightly less current at all times).
You may want to consider putting AFTER triggers on the source table(s) that copy the changes into staging table(s) (still on the source database), with an identity column on each staging table along with a flag to indicate Insert, Update or Delete. With that in place you are well positioned to ship updates for one or a few tables instead of the whole database. You don't need to re-query your source database repeatedly to determine what data needs to be transmitted: just select the top 1000 rows from your staging table(s) (ordered by the staging id) and move them to the remote staging table.
If you fall behind, a TOP 1000 loop keeps you from trying to post too much data in any one cross-server call.
Depending on your data, you may be able to optimize storage and reduce log churn by not copying all columns to your staging table, just the staging id and the primary key of the source table, and accepting that whatever data is in the source record at the time you post it to the OLAP database accurately reflects the data at the time the record was staged. It won't be 100% accurate in your OLAP table at all times, but it will be accurate eventually.
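A rough sketch of that staging approach, with hypothetical names (source table dbo.Readings keyed by ReadingId, staging table dbo.Readings_Stage, linked server RemoteServer); the trigger lives on the source database, and the shipping statement would run from the scheduled job:

-- Staging table on the source database.
CREATE TABLE dbo.Readings_Stage (
    StageId   bigint IDENTITY(1,1) PRIMARY KEY,
    ReadingId int     NOT NULL,   -- primary key of the changed source row
    Operation char(1) NOT NULL    -- 'I', 'U' or 'D'
);
GO
CREATE TRIGGER trg_Readings_Stage
ON dbo.Readings
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;
    INSERT INTO dbo.Readings_Stage (ReadingId, Operation)
    SELECT i.ReadingId,
           CASE WHEN EXISTS (SELECT 1 FROM deleted) THEN 'U' ELSE 'I' END
    FROM inserted AS i
    UNION ALL
    SELECT d.ReadingId, 'D'
    FROM deleted AS d
    WHERE NOT EXISTS (SELECT 1 FROM inserted);
END
GO
-- Shipping job: move up to 1000 staged keys per call to the remote staging table,
-- then delete (or mark) the shipped StageId values locally once the push succeeds.
INSERT INTO RemoteServer.OlapDb.dbo.Readings_Stage (ReadingId, Operation)
SELECT TOP (1000) ReadingId, Operation
FROM dbo.Readings_Stage
ORDER BY StageId;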
I cannot overemphasize that you need to accommodate downtime in your design -- unless you can live with data loss or just wrong data. Even reliable connections are not 100% reliable.

Recording all Sql Server Inserts and Updates

How can I record all the Inserts and Updates being performed on a database (MS SQL Server 2005 and above)?
Basically I want a table in which I can record all the inserts and updates issued against my database.
Triggers will be tough to manage because there are 100s of tables and growing.
We have hundreds of tables and growing and use triggers. In newer versions of SQL Server you can use Change Data Capture or Change Tracking, but we have not found them adequate for auditing.
What we have are two separate audit tables for each table (one recording the details of the operation -- 1 row even if you updated a million records -- and one recording the actual old and new values). Each pair has the same structure for every table and is created by running a dynamic SQL proc that looks for unaudited tables and creates the audit triggers. This proc is run every time we deploy.
You should also take the time to write a proc to pull the data back out of the audit tables if you want to restore the old values. This can be tricky to write on the fly with this structure, so it is best to have it handy before you have the CEO breathing down your neck while you restore the 50,000 user records someone accidentally deleted.
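A heavily simplified, single-table illustration of that kind of generated audit trigger (hypothetical Customer table and one audited column; the real scheme described above uses a header table plus a detail table, covers every column via dynamic SQL, and also handles deletes, which are omitted here for brevity):

CREATE TABLE dbo.Customer_Audit (
    AuditId    bigint IDENTITY(1,1) PRIMARY KEY,
    CustomerId int           NOT NULL,
    ColumnName sysname       NOT NULL,
    OldValue   nvarchar(max) NULL,
    NewValue   nvarchar(max) NULL,
    AuditDate  datetime      NOT NULL DEFAULT GETDATE(),
    AuditUser  sysname       NOT NULL DEFAULT SUSER_SNAME()
);
GO
CREATE TRIGGER trg_Customer_Audit
ON dbo.Customer
AFTER INSERT, UPDATE
AS
BEGIN
    SET NOCOUNT ON;
    -- Record old/new values for one column; generated code would repeat this per column.
    INSERT INTO dbo.Customer_Audit (CustomerId, ColumnName, OldValue, NewValue)
    SELECT i.CustomerId, N'CustomerName', d.CustomerName, i.CustomerName
    FROM inserted AS i
    LEFT JOIN deleted AS d ON d.CustomerId = i.CustomerId
    WHERE ISNULL(d.CustomerName, N'') <> ISNULL(i.CustomerName, N'');
END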
As of SQL Server 2008 and above you have change data capture.
Triggers, although unwieldy and a maintenance nightmare, will do the job on versions prior to 2008.
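For reference, enabling Change Data Capture on 2008 and above is roughly the following (made-up database and table names; CDC has historically been an Enterprise-edition feature, so check your edition first):

USE MyDatabase;
EXEC sys.sp_cdc_enable_db;

EXEC sys.sp_cdc_enable_table
    @source_schema = N'dbo',
    @source_name   = N'Orders',
    @role_name     = NULL;   -- no gating role; changes accumulate in cdc.dbo_Orders_CT

-- Changed rows can then be read via the generated cdc.fn_cdc_get_all_changes_dbo_Orders function.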

Daily Backups for a single table in Microsoft SQL Server

I have a table in a database that I would like to backup daily, and keep the backups of the last two weeks. It's important that only this single table will be backed up.
I couldn't find a way of creating a maintenance plan or a job that will backup a single table, so I thought of creating a stored procedure job that will run the logic I mentioned above by copying rows from my table to a database on a different server, and deleting old rows from that destination database.
Unfortunately, I'm not sure if that's even possible.
Any ideas how can I accomplish what I'm trying to do would be greatly appreciated.
You back up an entire database.
A table consists of entries in system tables (sys.objects) with permissions assigned (sys.database_permissions), indexes (sys.indexes) + allocated 8k data pages. What about foreign key consistency for example?
Upshot: There is no "table" to back up as such.
If you insist, then bcp the contents out and back up that file. YMMV for restore.
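A hedged command-prompt example of the bcp route (server, database, table and path are all made up; -T uses Windows authentication and -n keeps SQL Server's native format):

bcp MyDatabase.dbo.MyTable out D:\Backups\MyTable_20140101.bcp -S MyServer -T -n
rem Restore later with: bcp MyDatabase.dbo.MyTable in D:\Backups\MyTable_20140101.bcp -S MyServer -T -n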
You can create a DTS/SSIS package to do this.
I've never done this, but I think you can create another filegroup in your database and then move the table to that filegroup. Then you can schedule backups just for that filegroup. I'm not saying this will work, but it's worth your time to investigate.
To get you started...
http://decipherinfosys.wordpress.com/2007/08/14/moving-tables-to-a-different-filegroup-in-sql-2005/
http://msdn.microsoft.com/en-us/library/ms179401.aspx
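To make the filegroup idea concrete, the moving parts look roughly like this (names and paths are made up, and it assumes the table has a clustered index that can be rebuilt onto the new filegroup; filegroup backup/restore has its own caveats, e.g. under the simple recovery model it is intended for read-only filegroups, so test the restore path):

-- 1. Add a filegroup and a data file for it.
ALTER DATABASE MyDatabase ADD FILEGROUP TableBackupFG;
ALTER DATABASE MyDatabase
ADD FILE (NAME = N'TableBackupFG_Data', FILENAME = N'D:\Data\TableBackupFG_Data.ndf')
TO FILEGROUP TableBackupFG;

-- 2. Move the table by rebuilding its clustered index on the new filegroup.
CREATE UNIQUE CLUSTERED INDEX PK_MyTable
ON dbo.MyTable (MyTableId)
WITH (DROP_EXISTING = ON)
ON TableBackupFG;

-- 3. Back up just that filegroup.
BACKUP DATABASE MyDatabase
FILEGROUP = N'TableBackupFG'
TO DISK = N'D:\Backups\MyTable_FG.bak';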
