One-Way Replication Into Different Schema - sql-server

I'm looking for the best (best practice) option for one way replication between two databases. I would like to keep this purely SQL but, can write something in C# or use an ETL tool if there are no other good options.
Current setup:
DB1 - There are three instances of this database. It is a large relational database, the schema is the same for each but, they are separate data pots (no replication). Two databases on a 2012 server and one on a 2014 server
DB2 - There are two instances of this database on seperate servers (Europe, Americas) and the data is merge replicated between the two. The publisher is the 2014 server.
The Goal:
DB2 is tied to some reports. It has one table and a small application attached to that table. Users from many different countries enter data via a small application into DB2 and generate reports out of the application.
DB1 is a relational database that has a very large application on top of it but with fewer users. If users are using the application for DB1 then they should not need to duplicate their records into DB2.
There should be one-way replication from the multiple seperate DB1s into DB2. How quickly this happens is not too important.
The important things are:
No backwards replication occurs from DB2s > DB1s (Data only flows from DB1s into one of the DB2s)
Create, Update, and Delete actions should occur in DB2 based on the results
of a comparisson with DB1 (the one way replication)
Current Approach:
I currently have a flat sql view on each DB1 database that has the same schema as the table in the DB2 db's that the data needs to go into.
The servers are also joined as linked servers.
My though was to do a sort of manually written replication script on one of the DB2 databases that calls the views from the DB1s and does the CUD actions on a timed basis.
It seems to me that there should be an easier way though!?
Any thoughts on how to do this would be very much appreciated.
Keep in mind that since several of the DB1s exist on a SQL 2012 server that there may be some issues as 2012 might not be allowed to be a publisher for replication to a 2014 server.

Related

How to execute a stored procedure present in database A/Server A and query database B/Server B as well?

I have a configurations database with a lot of stored procedures and then I have a large number of databases, all are part of one system and on the same database server.
When the stored procedures in the config database are executing, they often query other databases as well and it is possible to do so because the configuration and all the other database are on the same database server.
But with time, as the data is growing, customers are growing, databases are growing, this one database server is slowing down. So now we want to take some of our databases and put them on a different database server but we are unable to do so because these databases and the configuration database are tightly coupled to each other because many of the stored procedures in the config database query other databases as well.
Is there some way I can execute a stored procedure present in config database / Server A, but this stored procedure is also querying database 2 on Server B?
If not then what could be the best approach to decouple all the other databases from the configurations database? I know getting rid of the stored procedures by implementing an ORM or something could be an option but that would be very time taking as we 1000+ stored procedures.
Let's say your configure database is Server A and all your user databases are on Server A and your SP are using 3 part naming to query multiple databases.
Now you want to migration one of the databases (DB1) to Servers B. Now DB1 does not exist on Server A, it is now on Server B. You can then Create a place holder DB on Server A called DB1. Then you will create a linked server to Server B and in the Place holder DB1, you will create alias which uses the linked server object with the same table name. This way, no changes is required on your thousands of SP and its configuration.
However, Linked Server may introduce performance problems as joining and indexing is less efficient.

Compare millions of records from Oracle to SQL server

I have an Oracle database and a SQL Server database. There is one table say Inventory which contains millions of rows in both database tables and it keeps growing.
I want to compare the Oracle table data with the SQL Server data to find out which records are missing in the SQL Server table on daily basis.
Which is best approach for this?
Create SSIS package.
Create Windows service.
I want to consume less resource to achieve this functionality which takes less time and less resource.
Eg : 18 millions records in oracle and 16/17 millions in SQL Server
This situation of two different database arise because two different application online and offline
EDIT : How about connecting SQL server from oracle through Oracle Gateway to SQL server to
1) Direct query to SQL server from Oracle to update missing record in SQL server for 1st time.
2) Create a trigger on Oracle which gets executed when record is deleted from Oracle and it insert deleted record in new oracle table.
3) Create SSIS package to map newly created oracle table with SQL server to update SQL server record.This way only few records have to process daily through SSIS.
What do you think of this approach ?
I would create an SSIS package and load the data from the Oracle table use a Data Flow / OLE DB Data Source. If you have SQL Enterprise, the Attunity Connectors are a bit faster.
Then I would load key from the SQL Server table into a Lookup transformation, where I would match the 2 sources on the key, and direct unmatched rows into a separate output.
Finally I would direct the unmatched rows output to a OLE DB Command, to update the SQL Server table.
This SSIS package will require a lot of memory, but as the matching is done in memory with minimal IO, it will probably outperform other solutions for speed. It will need enough free memory to cache all the keys from the SQL Server Table.
SSIS also has the advantage that it has lots of other transformation functions available if you need them later.
What you basically want to do is replication from Oracle to SQL Server.
You could do this in SSIS, A windows Service or indeed a multitude of platforms.
The real trick is using the correct design pattern.
There are two general design patterns
Snapshot Replication
You take all records from both systems and compare them somewhere (so far we have suggestions to compare in SSIS or compare on Oracle but not yet a suggestion to compare on SQL Server, although this is valid)
You are comparing 18 million records here so this is a lot of work
Differential replication
You record the changes in the publisher (i.e. Oracle) since the last replication then you apply those changes to the subscriber (i.e. SQL Server)
You can do this manually by implementing triggers and log tables on the Oracle side, then use a regular ETL process (SSIS, command line tools, text files, whatever), probably scheduled in SQL Agent to apply these to the SQL Server.
Or you could do this by using the out of the box replication capability to set up Oracle as a publisher and SQL as a subscriber: https://msdn.microsoft.com/en-us/library/ms151149(v=sql.105).aspx
You're going to have to try a few of these and see what works for you.
Given this objective:
I want to consume less resource to achieve this functionality which takes less time and less resource
transactional replication is far more efficient but complicated. For maintenance purposes, which platforms (.Net, SSIS, Python etc.) are you most comfortable with?
Other alternatives:
If you can use Oracle gateway for SQL Server then you do not need to transfer data and can make the query directly.
If you can't use Oracle gateway, you can use Pentaho data integration or another ETL tool to compare tables and get results. Is easy to use.
I think the best approach is using oracle gateway.Just follow the steps. I have similar type of experience.
Install and Configure Oracle Database Gateway for SQL Server.
https://docs.oracle.com/cd/B28359_01/gateways.111/b31042/installsql.htm
Now you can create a dblink from oracle to sql server.
Create a procedure which compare the missing records in oracle database and insert into sql server database.
For example, you can use this statement inside your procedure.
INSERT INTO "dbo"."sql_server_table"#dblink_name("column1","column2"...."column5")
VALUES
(
select column1,column2....column5 from oracle_table
minus
select "column1","column2"...."column5" from "dbo"."sql_server_table"#dblink_name
)
Create a scheduler which execute the procedure daily.
When both databases are online, missing records will be inserted to sql server. Otherwise the scheduler fail or you can execute the procedure manually.
It takes minimum resource.
I will suggest having a homemade ETL solution.
Schedule an oracle job to export source table data (on a daily
manner based on the application logic ) to plain CSV format.
Schedule a SQL-Server job (with acceptable delay from first oracle job) to read this CSV file and import it
to a medium table inside sql-servter using BULK INSERT.
Last part of the SQL-Server job will be reading medium table data
and do the logic(insert, update target table). I suggest having another table to store reports of this daily job result.

MS SQL Server: central database and foreign keys

I'm am currently developing one project of many to come which will be using its own database and also data from a central database.
Example:
the database "accountancy" with all accountancy package specific tables.
the database "personelladministration" with its specific tables
But we also use data which is general and will be used in all projects like "countries", "cities", ...
So we have put these tables in a separate database called "general"
We come from a db2 environment where we could create foreign keys between databases.
However, we are switching to MS SQL server where it is not possible to put foreign keys between databases.
I have seen that a workaround would be to use triggers, but I'm not convinced that is a clean solution.
Are we doing something wrong in our setup? Because it seems right to me to put tables with general data in a separate database instead of having a table "countries" in every database, that seams difficult to maintain and inefficiënt.
What could be a good approach to overcome this?
I would say that countries is not a terrible table to reproduce in multiple databases. I would rather duplicate static data like that than use more elaborate techniques. There is one physical schema per database in sql server and the schema can not be shared. That is why people use replication or triggers for shared data.
I can across this problem a while back. We have one database for authentication, however, those users have to be shared across multiple applications some of which have their own database.
Here is my question on this topic.
We resorted to replication and using an custom Authentication/Registration service agent to keep the data up to data.
Using views, in what Sourav_Agasti suggested in his answer, would be the most straight forward approach for static data. You can create views and indexed views and join data from databases on linked servers.
Create a loopback linked server and then create a view(if required, on each database) which accesses the table in this "central database" through this linked server. There will be a minor performance impact but it more than enough compensates by being very simiplistic.

SQL Server 2005 Auditing

Background
I have a production SQL Server 2005 server to which 4 different applications connect and make changes.
There are no foreign keys and in some cases no primary keys.
Unfortunately throwing the whole thing out and starting from scratch is not an option.
So my solution is to start migrating each of the applications to a service layer approach so that there is only one application directly connecting to the database.
However there are problems that need to be fixed before that service layer is written and all the applications are migrated over.
So rather than make changes and hope they don't break any one of the 4 badly written applications (with no way of quickly testing all functionality) my solution is to start auditing the database
Problem
How do I audit what stored procedures, tables, columns, views are being accessed/updated/called by each user on SQL Server 2005.
I can find out which tables are being updated but I have no idea which columns and by what users.
I also don't know if certain tables are being accessed only through stored procedures/views.
I know that SQL Server 2008 has better auditing features but if I could do this without spending money that would be great. That said if the best solution is to upgrade or buy software that's also an option.
Check out SQL Server 2008's CDC feature. You can't use this directly in 2005 but you can write a trigger for each table to log all data changes to a new audit table. i.e. you'd have an audit table for each table in your db, with all the same columns plus some additional columns saying what the operation was and when it occurred.
If the nature of your applications means you can get user information and/or application information from CURRENT_USER and APP_NAME() you could include that information in the audit table too.
And check out this answer for more goodness.

Linking tables between databases

I’m after a bit of advice on the best way to go about this is SQL server 2008R2 express. I have a number of applications that are in separate databases on the same server. They are all “plugins” that use a central staff/structure list that will be in a separate database. The application is in the process of being migrated from JET.
What I’m looking for is the best way of all the “plugin” databases being able to see the central database and use those tables in standard queries and views etc.
As I’m using express that rules out any replication solution and so far the only option I can think of is to use triggers or a stored procedure to “push” out all the changes to the plugins. The information needs to be populated on a near enough real time basis however the number of changes will be very small maybe up to 100 a day and the biggest table only has about 1000 rows at the moment (the staff names table).
Hopefully that will cover all everything but if anyone needs any more details then just ask
Thanks
Apologies if I've misunderstood, but from your description it sounds like all these databases are hosted on the same instance of SQL Server - it's your mention of replication that makes me uncertain.
Assuming that's the case, you should be able to replace any copies of tables from the central database which are held in the "plugin" databases with views or synonyms which reference the central tables directly, since SQL server allows you to make references between databases on the same server using three-part naming (database_name.schema_name.object_name)
For example, if each plugin db has a table StaffNames, you could replace this with a view by dropping the table, then creating a view:
drop table StaffNames
go
create view StaffNames
as
select * from <centraldbname>.<schema - probably dbo>.StaffNames
go
and your code should continue to work seamlessly, as long as permissions are set up.
Alternatively, you could replace all the references to the shared tables in the plugin databases with three-part name references to the central database, but the view method requires less work.

Resources