SQL Sync database tables to another server - sql-server

We have a system that creates a table in a database on our production server for each day/shift. I would like to grab the data from that server, move it to our archive server, and remove it from the production server once the data is more than X days old.
On the production server, the database is called "transformations" and the tables are named "yyyy-mm-dd_shift_table". I would like to move these into a database called "Archive", with the same table names, on another server running SQL Server 2012. Each table contains about 30k records for the day.
The way I see it, it would be something like:
Get the list of tables on the production server.
If a table exists on the archive server, look for any changes (only really relevant for the current table) and sync them.
If a table doesn't exist on the archive server, create the table and sync the data.
If a table's date is more than X days old, delete the table from the production server.
Ideally I would like to have this as a SQL procedure that can run daily, hourly, etc.
Suggestions on how to attack this would be great.
EDIT: Happy to do a select on all matching tables in the database and write them into a single table in my archive database.

After a lot of digging today I have come up with the following. It loads all the data from the remote server and inserts it into the table on the local server. It requires a linked server on your archive server, which you use to query the remote server. I'm sure you could reverse this and push the data, but I didn't want to chew up cycles on the production server.
-- Set up the variables
--Counter for the loop
DECLARE @i int
--Variable to hold the SQL queries
DECLARE @SQLCode nvarchar(300)
--Variable to hold the number of rows to process
DECLARE @numrows int
--Table to hold the SQL queries, with an index for looping
DECLARE @SQLQueries TABLE (
idx smallint Primary Key IDENTITY(1,1)
, SQLCode nvarchar(300)
)
--Set up a table with the SQL queries that will need to be run on the remote server. This section creates an INSERT statement
--which returns all the records in the remote table that do not exist in the local table.
INSERT INTO @SQLQueries
select 'INSERT INTO Local_Table_Name
SELECT S.* FROM [Remote_ServerName].[Transformations].[dbo].[' + name + '] AS S
LEFT JOIN Local_Table_Name AS T ON (T.Link_Field = S.Link_Field)
WHERE T.Link_Field IS Null'+
CHAR(13) + CHAR(10) + CHAR(13) + CHAR(10)
from [Remote_ServerName].[Transformations].sys.sysobjects
where type = 'U' AND name Like '%_Table_Suffix'
--Set up the loop to process all the tables
SET @i = 1
--Get the number of rows in the resultant table
SET @numrows = (SELECT COUNT(*) FROM @SQLQueries)
--Only process if there are rows to work through
IF @numrows > 0
--Loop while there are still records to go through
WHILE (@i <= (SELECT MAX(idx) FROM @SQLQueries))
BEGIN
--Load the code to run into a variable
SET @SQLCode = (SELECT SQLCode FROM @SQLQueries WHERE idx = @i);
--Execute the code
EXEC (@SQLCode)
--Increase the counter
SET @i = @i + 1
END
The initial run over 45 tables inserted about 1.2 million records and took 2.5 minutes. After that, each run took about 1.5 minutes and only inserted about 50-100 records.
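To cover the last step of the plan (dropping production tables once they are more than X days old), something along the same lines could be run afterwards. This is only a sketch and is untested: it assumes the yyyy-mm-dd_shift_table naming from the question, the same [Remote_ServerName] linked server, and that RPC Out is enabled on that linked server so the DROP statements can be executed remotely.
--Build DROP TABLE statements for remote tables older than @RetentionDays
DECLARE @RetentionDays int = 30        -- the "X days" from the question, adjust as needed
DECLARE @DropCode nvarchar(max) = N''
SELECT @DropCode = @DropCode
    + N'DROP TABLE [Transformations].[dbo].[' + name + N'];' + CHAR(13) + CHAR(10)
FROM [Remote_ServerName].[Transformations].sys.sysobjects
WHERE type = 'U'
  AND name Like '%_Table_Suffix'
  --table names start with yyyy-mm-dd, so the first 10 characters convert to a date
  AND TRY_CONVERT(date, LEFT(name, 10)) < DATEADD(day, -@RetentionDays, GETDATE())
--Run the generated DROP statements on the production server
IF @DropCode <> N''
    EXEC (@DropCode) AT [Remote_ServerName]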

I actually created a solution for this and have it posted on GitHub. It uses a library called EzAPI and will sync all the tables and columns from one server to another.
You're welcome to use it. The basic process works by first checking the metadata between the databases and generating any changed objects. After making the necessary modifications to the destination server, it generates one SSIS package per object and then executes the package. You can choose to remove or keep the packages after they are generated.
https://github.com/thevinnie/SyncDatabases

Related

Copy indexes from a "parent" SQL Server database to a subset of the tables

We have a customer that uses a 3rd party system running on SQL Server 2014. I downloaded a copy of the underlying database in .bak format from the 3rd party, which has about 1500 tables and no data. The customer uses a cut-down version of this schema with about 150 of the tables, deleting the ones they don't use.
Somehow they also managed to delete all the indexes... sigh. They have sent me a .bak to do some reporting queries on, but the performance is making me pull out my hair.
I would like to copy the indexes (and primary keys) from the original "pattern" database to the customer's copy. I have seen an excellent solution for doing this with two identical DBs, but using it here leaves me with 1400 references to tables that don't exist, and this causes problems in MSSQLAdmin. Can anyone offer a modification that does not assume the "child" DB has every table, and only issues CREATEs for the tables that do exist?
In case you are using the accepted answer from the question you mentioned:
WHILE (@@FETCH_STATUS = 0)
BEGIN
DECLARE @IXSQL NVARCHAR(4000) SET @IXSQL = 'IF OBJECT_ID(''' + @IxTable + ''') IS NOT NULL BEGIN ' --create index if table exists
SET @IXSQL = @IXSQL + 'CREATE '
..............
SET @IXSQL = @IXSQL + ') END'
-- Print out the CREATE statement for the index
.................
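To make the idea concrete, here is a minimal, self-contained version of that guard for a single made-up index; the real statement is assembled piece by piece by the cursor in the linked answer, so treat the index, table and column names below as placeholders only.
DECLARE @IxTable sysname = N'dbo.SomeCustomerTable'   -- normally supplied by the cursor
DECLARE @IXSQL nvarchar(4000)
SET @IXSQL = N'IF OBJECT_ID(''' + @IxTable + N''') IS NOT NULL
BEGIN
    CREATE NONCLUSTERED INDEX IX_SomeCustomerTable_SomeColumn
        ON ' + @IxTable + N' (SomeColumn)
END'
PRINT @IXSQL                   -- review the generated statement first
-- EXEC sp_executesql @IXSQL   -- then run it against the customer database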

Data Sync between tables from SQL Server & Db2 - how to delete rows

We have two databases, one in SQL Server and one in DB2. We have a scenario where we do inserts, updates and deletes in SQL Server, and at the same time we also do inserts, updates and deletes in DB2.
We sync data back and forth using some processes: whenever there is a change in SQL Server we sync it to DB2 (insert, update, delete), and whenever there is a change in DB2 we sync it to SQL Server. We use IBM MQ messages and dequeue them to sync the changes back and forth.
Everything was fine until we had some data sync issues from DB2 to SQL Server: one of the processes that syncs from DB2 to SQL Server was down. There is an on-demand job that runs every night and does a full data refresh from DB2 to SQL Server, but it only does a MERGE update and insert; it does not delete, because data that has not yet been synced to DB2 is also present in SQL Server. We cannot delete directly, since either database can have more or fewer records than the other, so some rows in SQL Server are left orphaned. We do have scoping, so data that is updated in SQL Server cannot be changed in DB2 and vice versa.
My question is: when we are syncing from DB2 to SQL Server, how do we identify records that were deleted in DB2 only, so that we can delete those from SQL Server? We don't want to delete records that were created in SQL Server but have not yet been sent to DB2. We have 114 tables, so maintaining a flag to differentiate is not really an option.
When you say you are synchronizing data back and forth between MS SQL Server and DB2, how are you capturing the changes? If you are using a CDC tool (IDR, GoldenGate, Informatica), these tools allow you to detect conflicts so you can decide which records to keep or delete.
If you are capturing your changes with in-house development (triggers or your own log scraper), you should keep at least the operation type and a timestamp in your change data set, so that you can recognize the operation.
If you are simply comparing the tables and dealing with the differences, you won't be able to tell whether rows missing on the DB2 side were deleted in DB2 or added on the SQL Server side. You can fix that by developing a proper change data capture mechanism.
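For what it's worth, a bare-bones version of that kind of in-house capture on the SQL Server side could look like the sketch below. The table and column names are purely illustrative (borrowing the LocationType naming that appears later in this thread), and a real implementation would also need batching and retention handling.
-- Hypothetical change-log table: one row per modification, with operation type and timestamp
CREATE TABLE dbo.LocationType_Change
(
    ChangeId         int IDENTITY(1,1) PRIMARY KEY,
    LocationTypeCode varchar(20)  NOT NULL,
    Operation        char(1)      NOT NULL,                       -- 'I', 'U' or 'D'
    ChangedAtUtc     datetime2(3) NOT NULL DEFAULT SYSUTCDATETIME()
);
GO
CREATE TRIGGER dbo.trg_LocationType_Capture
ON dbo.LocationType
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;
    -- Inserts and updates: keys present in "inserted"
    INSERT INTO dbo.LocationType_Change (LocationTypeCode, Operation)
    SELECT i.LocationTypeCode,
           CASE WHEN EXISTS (SELECT 1 FROM deleted) THEN 'U' ELSE 'I' END
    FROM inserted AS i;
    -- Deletes: keys present in "deleted" only
    INSERT INTO dbo.LocationType_Change (LocationTypeCode, Operation)
    SELECT d.LocationTypeCode, 'D'
    FROM deleted AS d
    WHERE NOT EXISTS (SELECT 1 FROM inserted);
END;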
Change tracking on the SQL Server side might be a viable option (as long as all the tables you would like to sync/"delete from" have a primary key).
With CT you could track, for each table, which rows were created on the SQL Server side since the last sync from SQL Server to DB2. Those rows should not be deleted yet:
DELETE T
FROM dbo.SQL_SERVER_TABLE AS T
-- not created on the SQL Server side since the last sync to DB2 ...
WHERE NOT EXISTS (SELECT * FROM CHANGETABLE(CHANGES dbo.SQL_SERVER_TABLE, @last_sync_version) AS CT
                  WHERE CT.PKColumn = T.PKColumn
                    AND CT.SYS_CHANGE_OPERATION = 'I')
-- ... and no longer present in the data refreshed from DB2
  AND NOT EXISTS (SELECT * FROM DB2_staging AS S WHERE S.PKColumn = T.PKColumn)
-- @last_sync_version and PKColumn are placeholders for your sync version and primary key column
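As a side note, CHANGETABLE() only returns rows once change tracking has been enabled at both the database and the table level; a minimal setup (retention values are just examples) would be something like:
ALTER DATABASE YourSqlServerDb
SET CHANGE_TRACKING = ON (CHANGE_RETENTION = 7 DAYS, AUTO_CLEANUP = ON);

ALTER TABLE dbo.SQL_SERVER_TABLE
ENABLE CHANGE_TRACKING WITH (TRACK_COLUMNS_UPDATED = OFF);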
I would connect SQL Server to DB2 via a linked server (more here: https://learn.microsoft.com/fr-fr/sql/relational-databases/system-stored-procedures/sp-addlinkedserver-transact-sql?view=sql-server-ver15) and then run queries to find out which records are missing on each side.
This can be accomplished with OPENQUERY. You can do something like this:
SELECT * FROM YourSqlTable
EXCEPT
SELECT * FROM OPENQUERY(YOURDB2SERVER, 'SELECT * FROM YourDB2Table')
And then the same thing inverted :
SELECT * FROM OPENQUERY(YOURDB2SERVER, 'SELECT * FROM YourDB2Table')
EXCEPT
SELECT * FROM YourSqlTable
You can then send the missing records to the right server.
If you have a lot of tables to compare, you can write these queries with dynamic SQL:
DECLARE @TABLENAME nvarchar(200);
DECLARE TABLE_CUR CURSOR FOR
SELECT TABLE_NAME FROM YourDatabaseName.INFORMATION_SCHEMA.TABLES;
OPEN TABLE_CUR
FETCH NEXT FROM TABLE_CUR INTO @TABLENAME;
WHILE @@FETCH_STATUS = 0
BEGIN
DECLARE @Query nvarchar(MAX);
SET @Query = 'SELECT * FROM OPENQUERY(YOURDB2SERVER, ''SELECT *
FROM '+ @TABLENAME + ' '')
EXCEPT
SELECT * FROM '+ @TABLENAME
-- Don't forget the double '' for OPENQUERY
EXEC sp_executesql @Query;
SET @Query = 'SELECT * FROM '+ @TABLENAME + '
EXCEPT
SELECT * FROM OPENQUERY(YOURDB2SERVER, ''SELECT *
FROM '+ @TABLENAME + ' '')'
-- Don't forget the double '' for OPENQUERY
EXEC sp_executesql @Query;
-- Fetch the next table name before looping again
FETCH NEXT FROM TABLE_CUR INTO @TABLENAME;
END
CLOSE TABLE_CUR;
DEALLOCATE TABLE_CUR;
Thanks for the suggestions. I am not using CDC, but I am maintaining the changes that are yet to be synced to DB2 in a LOG table.
DELETE TGT
FROM [IGP].[LocationType] AS TGT
INNER JOIN #locationType SRC ON
TGT.[LocationTypeCode] = SRC.[LocationTypeCode];
I first insert the log table data that is yet to be synced to DB2 into the #locationType temp table and delete those rows from IGP (the staging table holding the DB2 master data), so that the pending updates and deletes won't be overridden by the IGP staging data.
Now I need to take care of inserts that don't exist in DB2 but are present in SQL Server and not yet synced from the log table. I shouldn't delete those, as that would be data loss, so I use the MERGE query below:
MERGE INTO [dbo].[LocationType] AS TGT
USING [IGP].[LocationType] AS SRC
ON TGT.[LocationTypeCode] = SRC.[LocationTypeCode]
WHEN MATCHED AND (EXISTS
(SELECT TGT.[Description] EXCEPT SELECT SRC.[Description]))
THEN
UPDATE SET TGT.[LocationTypeCode] = SRC.[LocationTypeCode],
TGT.[Description] = SRC.[Description]
WHEN NOT MATCHED THEN
INSERT([LocationTypeCode], [Description])
VALUES([LocationTypeCode], [Description])
WHEN NOT MATCHED BY SOURCE
AND (EXISTS (SELECT TGT.[LocationTypeCode]
EXCEPT SELECT [LocationTypeCode] FROM #locationType)) THEN DELETE;

SQL Server : gather data from different databases

I have many different application databases, each with a [Log] table. I have one central database with a similar log table, but with one extra column called TenantId. There is also a Tenant table with a TenantId and a DatabaseName column; the DatabaseName column contains the names of the application databases.
Now I want to loop over all the application databases and copy the log entries to the central log table, with the TenantId that belongs to each application database name.
Would it be possible to write one procedure in the central database instead of creating many procedures in the application databases? All databases are on the same SQL Server instance.
Just some quick dynamic SQL. In the example below, CHINRUS is my central database and is therefore excluded from the consolidation.
I should add that the WHERE clause should be tailored to exclude any miscellaneous databases on the server. Yet another option would be to maintain a table that holds the proper definitions.
Declare @LogTable varchar(100)='[Chinrus].[dbo].[TransactionLog]'
Declare @CentralDB varchar(100)='Chinrus'
Declare @SQL varchar(max) = ''
Select @SQL = @SQL + SQL
From (
Select Name,SQL=';Insert Into '+@LogTable+' Select *,TenantId='''+Name+''' From ['+Name+'].[dbo].[TransactionLog] '
From master.dbo.sysdatabases
Where Name<>@CentralDB
) A
Select @SQL
--Exec(@SQL)
You can get the list of all databases with the following query:
SELECT name
FROM master.dbo.sysdatabases
You can then use a cursor to go through each database, read its data and insert it into one table in the current database.
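A rough sketch of that cursor approach, using the Tenant table from the question to map each database name to its TenantId. The central log column list (LogDate, Message) is made up for illustration and would need to match your real [Log] schema:
DECLARE @DbName sysname, @TenantId int, @Sql nvarchar(max);

DECLARE tenant_cur CURSOR FOR
    SELECT TenantId, DatabaseName FROM dbo.Tenant;

OPEN tenant_cur;
FETCH NEXT FROM tenant_cur INTO @TenantId, @DbName;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- Copy the application database's log rows, stamping them with the tenant's id
    SET @Sql = N'INSERT INTO dbo.CentralLog (TenantId, LogDate, Message)
                 SELECT @TenantId, LogDate, Message
                 FROM ' + QUOTENAME(@DbName) + N'.dbo.[Log]';
    EXEC sp_executesql @Sql, N'@TenantId int', @TenantId = @TenantId;

    FETCH NEXT FROM tenant_cur INTO @TenantId, @DbName;
END

CLOSE tenant_cur;
DEALLOCATE tenant_cur;
In practice you would also want a high-water mark (for example on a LogId or timestamp column) so that each run only copies new entries.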

Get names of multiple servers SSMS is connected to

I have one instance of SSMS open and I am connected to one remote server as well as localhost. How can I get the names of all the servers that SSMS is currently connected to? (The remote and local servers show different icons in Object Explorer.)
Also, I would like to know whether there are any problems with connecting to multiple servers from one instance of SSMS, and how to switch between servers in a script, without clicking on a table name and doing something like "select top 1000 rows".
Okay, there are multiple issues at work here, as this is not always a simple answer. Depending on your environment and rights, you may have one or many permission groups with access to one or many environments, which contain one or many servers, which in turn have access to one or many databases. However, if you do have permission and you have linked servers set up with data access, you can do something like the following to get a listing of what you have access to. You could run this similarly on different environments, turning it into a procedure that you could call with ADO.NET or similar.
--declare variables for the dynamic SQL
DECLARE
@SQL NVARCHAR(512)
, @x int
-- Table variable to hold the linked servers
Declare @Servers TABLE
(
Id int identity
, ServerName VARCHAR(128)
)
-- insert linked servers
insert into @Servers
select name
FROM sys.servers
-- remove the temp table if it exists, as it should not be prepopulated
IF object_ID('tempdb..#Databases') IS NOT NULL
DROP TABLE #Databases
;
-- Create a temp table to hold the list of databases per server
CREATE TABLE #Databases
(
ServerName varchar(64)
, DatabaseName VARCHAR(128)
)
SET @x = 1
-- Loop through the linked servers while there are still rows to examine
WHILE @x <= (SELECT count(*) FROM @Servers)
BEGIN
declare @DB varchar(128);
Select @DB = ServerName from @Servers where Id = @x -- get the server name for the current increment
-- Set up dynamic SQL, excluding master and the other system databases as no one cares about them
SET @SQL = 'insert into #Databases select ''' + @DB + ''', name from ' + @DB + '.master.sys.databases
where name not in (''master'',''tempdb'',''model'',''msdb'')'
-- Execute the dynamic SQL to insert into the collection table
exec sp_executesql @SQL
-- increment for the next iteration on the next server
SET @x = @x + 1
END
;
SELECT *
FROM #Databases
I'm not entirely sure what you are asking. If you are asking whether you can connect to multiple instances of SQL Server in a single query window, the answer is yes. I went into detail on how, and some of the implications, here: Multiple instances, single query window
If, on the other hand, you are asking how to tell what instance you are connected to, you can use @@SERVERNAME.
SELECT @@SERVERNAME
It will return the name of the instance you are connected to.
Typically you would connect to one instance per query window and flip between the windows to affect the specific instance you are interested in.
If you want to write a command to send you to a specific instance, you can set your query window to SQLCMD mode (Query menu -> SQLCMD Mode) and use the :CONNECT command.
:CONNECT InstanceName
SELECT @@SERVERNAME

Linked Server Insert-Select Performance

Assume I have a table on my local server, Local_Table, and another server with another database and table, Remote_Table (the table structures are the same).
Local_Table has data, Remote_Table doesn't. I want to transfer data from Local_Table to Remote_Table with this query:
Insert into RemoteServer.RemoteDb..Remote_Table
select * from Local_Table (nolock)
But the performance is quite slow.
However, when I use SQL Server import-export wizard, transfer is really fast.
What am I doing wrong? Why is it fast with Import-Export wizard and slow with insert-select statement? Any ideas?
The fastest way is to pull the data rather than push it. When the tables are pushed, every row requires a connection, an insert, and a disconnect.
If you can't pull the data, because you have a one way trust relationship between the servers, the work around is to construct the entire table as a giant T-SQL statement and run it all at once.
DECLARE @xml XML
SET @xml = (
SELECT 'insert Remote_Table values (' + '''' + isnull(first_col, 'NULL') + ''',' +
-- repeat for each col
'''' + isnull(last_col, 'NULL') + '''' + ');'
FROM Local_Table
FOR XML path('')
) --This concatenates all the rows into a single XML object; the empty path keeps it from having <colname> </colname> wrapped around each value
DECLARE @sql AS VARCHAR(max)
SET @sql = 'set nocount on;' + cast(@xml AS VARCHAR(max)) + 'set nocount off;' --Converts the XML back to a long string
EXEC ('use RemoteDb;' + @sql) AT RemoteServer
It seems like it's much faster to pull data from a linked server than to push data to a linked server: Which one is more efficient: select from linked server or insert into linked server?
Update: My own, recent experience confirms this. Pull if possible -- it will be much, much faster.
Try this on the other server:
INSERT INTO Local_Table
SELECT * FROM RemoteServer.RemoteDb..Remote_Table
The Import/Export wizard will essentially be doing this as a bulk insert, whereas your code is not.
Assuming that you have a clustered index on the remote table, make sure that you have the same clustered index on the local table, set trace flag 610 globally on your remote server, and make sure the remote database is in SIMPLE or BULK_LOGGED recovery mode.
If your remote table is a heap (which will speed things up anyway), make sure your remote database is in SIMPLE or BULK_LOGGED mode, and change your code to read as follows:
INSERT INTO RemoteServer.RemoteDb..Remote_Table WITH(TABLOCK)
SELECT * FROM Local_Table WITH (nolock)
The reason it's so slow to insert into the remote table from the local table is that it inserts a row, checks that it inserted, then inserts the next row, checks that it inserted, and so on.
Don't know if you figured this out or not, but here's how I solved this problem using linked servers.
First, I have a LocalDB.dbo.Table with several columns:
IDColumn (int, PK, Auto Increment)
TextColumn (varchar(30))
IntColumn (int)
And I have a RemoteDB.dbo.Table that is almost the same:
IDColumn (int)
TextColumn (varchar(30))
IntColumn (int)
The main difference is that the remote IDColumn isn't set up as an identity column, so that I can do inserts into it.
Then I set up a trigger on the remote table that fires on delete:
Create Trigger Table_Del
On Table
After Delete
AS
Begin
Set NOCOUNT ON;
Insert Into Table (IDColumn, TextColumn, IntColumn)
Select IDColumn, TextColumn, IntColumn from MainServer.LocalDB.dbo.table L
Where not exists (Select * from Table R Where L.IDColumn = R.IDColumn)
END
Then when I want to do an insert, I do it like this from the local server:
Insert Into LocalDB.dbo.Table (TextColumn, IntColumn) Values ('textvalue', 123);
Delete From RemoteServer.RemoteDB.dbo.Table Where IDColumn = 0;
--And if I want to clean the table out and make sure it has all the most up to date data:
Delete From RemoteServer.RemoteDB.dbo.Table
By triggering the remote server to pull the data from the local server and then do the insert, I was able to turn a job that took 30 minutes to insert 1258 rows into a job that takes 8 seconds to do the same insert.
This does require a linked server connection on both sides, but once that's set up it works pretty well.
Update:
So in the last few years I've made some changes, and have moved away from the delete trigger as a way to sync the remote table.
Instead I have a stored procedure on the remote server that has all the steps to pull the data from the local server:
CREATE PROCEDURE [dbo].[UpdateTable]
-- Add the parameters for the stored procedure here
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
-- Insert statements for procedure here
--Fill Temp table
Insert Into WebFileNamesTemp Select * From MAINSERVER.LocalDB.dbo.WebFileNames
--Fill normal table from temp table
Delete From WebFileNames
Insert Into WebFileNames Select * From WebFileNamesTemp
--empty temp table
Delete From WebFileNamesTemp
END
And on the local server I have a scheduled job that does some processing on the local tables, and then triggers the update through the stored procedure:
EXEC sp_serveroption @server='REMOTESERVER', @optname='rpc', @optvalue='true'
EXEC sp_serveroption @server='REMOTESERVER', @optname='rpc out', @optvalue='true'
EXEC REMOTESERVER.RemoteDB.dbo.UpdateTable
EXEC sp_serveroption @server='REMOTESERVER', @optname='rpc', @optvalue='false'
EXEC sp_serveroption @server='REMOTESERVER', @optname='rpc out', @optvalue='false'
If you must push data from the source to the target (e.g., for firewall or other permissions reasons), you can do the following:
In the source database, convert the recordset to a single XML string (i.e., multiple rows and columns combined into a single XML string).
Then push that XML over as a single row (as a varchar(max), since XML isn't allowed over linked databases in SQL Server).
DECLARE @xml XML
SET @xml = (select * from SourceTable FOR XML path('row'))
Insert into TempTargetTable values (cast(@xml AS VARCHAR(max)))
In the target database, cast the varchar(max) as XML and then use XML parsing to turn that single row and column back into a normal recordset.
DECLARE @X XML = (select '<toplevel>' + ImportString + '</toplevel>' from TempTargetTable)
DECLARE @iX INT
EXEC sp_xml_preparedocument @iX output, @X
insert into TargetTable
SELECT [col1],
[col2]
FROM OPENXML(@iX, '//row', 2)
WITH ([col1] [int],
[col2] [varchar](128)
)
EXEC sp_xml_removedocument @iX
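On SQL Server 2005 and later you could also skip sp_xml_preparedocument and shred the XML with the nodes() method instead; a sketch using the same hypothetical column names:
DECLARE @X XML = (SELECT CAST('<toplevel>' + ImportString + '</toplevel>' AS XML) FROM TempTargetTable)

INSERT INTO TargetTable ([col1], [col2])
SELECT r.value('(col1)[1]', 'int'),
       r.value('(col2)[1]', 'varchar(128)')
FROM @X.nodes('/toplevel/row') AS t(r)
This avoids the shared XML document handle and the explicit sp_xml_removedocument cleanup step.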
I've found a workaround. Since I'm not a big fan of GUI tools like SSIS, I've reused a bcp script to unload the table into a CSV file and load it back in. Yeah, it's an odd case to have bulk operation support for files but not for tables. Feel free to edit the following script to fit your needs:
exec xp_cmdshell 'bcp "select * from YourLocalTable" queryout C:\CSVFolder\Load.csv -w -T -S .'
exec xp_cmdshell 'bcp YourAzureDBName.dbo.YourAzureTable in C:\CSVFolder\Load.csv -S yourdb.database.windows.net -U youruser@yourdb.database.windows.net -P yourpass -q -w'
Pros:
No need to define table structures every time.
I've tested it and it worked way faster than inserting directly through the linked server.
It's easier to manage than XML (which is limited to varchar(max) length anyway).
No need for an extra layer of abstraction (tools like SSIS).
Cons:
Uses the external tool bcp through the xp_cmdshell interface.
Table properties are lost after exporting/importing the CSV (i.e. datatype, nulls, length, separator within a value, etc.).
