One question though: let's say the publisher database has 100 tables and I use transactional replication to move the data from those 100 tables to the subscriber database. That would be fine.
But let's say I don't want all 100 tables; instead I want to create 3-4 views which contain the key information I need from those 100 tables. How would I achieve this?
1) Firstly, I guess the views need to be created on the publisher database.
2) Secondly, do I then need to create 3-4 tables in the subscriber database which have the same columns as the views from the publisher database?
3) What sort of replication (or maybe even SSIS or something else) would move the data from the publisher views to the subscriber database?
Replication probably wouldn't be as viable or as performant an option as creating an SSIS package for transferring data from those views into the small set of tables in the remote database. SSIS's strongest feature is its ability to transfer large volumes of data quickly from a source into a destination. With a little upkeep, you could potentially transfer just the differences between the two databases at some scheduled interval and have a fairly flexible solution.
SSIS will be the better solution. You would create the tables in your target database, then create the SSIS package(s) to populate the target tables.
SSIS can use queries against tables or views, and it can also execute a stored procedure to retrieve the data.
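For illustration, the setup could look something like this (a minimal sketch; all object names are made up):

-- Publisher side: a view consolidating the key columns from the base tables
CREATE VIEW dbo.vw_KeyCustomerInfo
AS
SELECT c.CustomerID, c.CustomerName, a.City
FROM dbo.Customers AS c
JOIN dbo.Addresses AS a ON a.CustomerID = c.CustomerID;
GO

-- Subscriber side: a matching table. An SSIS data flow would use
-- SELECT * FROM dbo.vw_KeyCustomerInfo as its source query and this
-- table as its destination.
CREATE TABLE dbo.KeyCustomerInfo (
    CustomerID   int           NOT NULL PRIMARY KEY,
    CustomerName nvarchar(200) NOT NULL,
    City         nvarchar(100) NULL
);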
This seems like a design question, but I wanted to know if there is a pattern or design consideration that should lead us to create a new database rather than a new schema.
Why not create one big database with separate schemas? Under what circumstances should we create a new database?
They are just logical divisions, so for the most part it's a matter of preference. There is one place where it's not a matter of preference: replication.
As of September 2022, the unit of replication is the database. It's possible to specify which databases you want to replicate, but not which schemas within a database to replicate.
If you plan to replicate, you'll want to think about keeping only the schemas/tables that are important to replicate in one or more databases that get replicated and keep other data in databases that do not get replicated.
Another thought: in a large enterprise DWH solution, there can be a variety of flavours of tables which you can map to different databases, for example a Sales DB, a Master DB, and a Finance DB. Then, inside each database, you may want to have schemas for tables, views, procedures, and other objects.
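A minimal sketch of that layout (all names are hypothetical):

-- Separate databases per subject area
CREATE DATABASE SalesDB;
CREATE DATABASE FinanceDB;
GO
USE SalesDB;
GO
-- Schemas group related objects inside one database
CREATE SCHEMA staging;
GO
CREATE SCHEMA reporting;
GO
CREATE TABLE staging.Orders (OrderID int NOT NULL PRIMARY KEY, OrderDate date NOT NULL);
GO
CREATE VIEW reporting.DailyOrders
AS
SELECT OrderDate, COUNT(*) AS OrderCount
FROM staging.Orders
GROUP BY OrderDate;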
I need to sync data from several tables in a legacy SQL Server db (source) to a single table in a Postgres db (target). The schema of the source db is absurd, so the query to select the data takes a very long time to run. I'm planning to create an indexed view in the source db, and then somehow sync that indexed view to the Postgres table.
Right now, I simply have a scheduled task that drops the Postgres table (target) and then recreates it from scratch by running the complex query in the source db. This was quick to set up, and it ensures that changes in the source db always eventually make it to the target db, but recreating the table every few hours is (understandably) very slow and expensive. I need a way to replicate ongoing changes (only the new/updated data) from the source view to the target table. Is there a (relatively) simple way to do this?
I'm somewhat familiar with CDC, but I understand that CDC cannot be used on a view, so I don't believe that's an option. Adding "updated at" timestamps to the source tables is not an option, so I can't use that approach. I could add a hash column to the source tables, or maybe to the view, if that would work. Is there an existing tool/service that does what I need?
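For illustration, the hash-column idea could be pushed into the view itself, something like this (table and column names are made up; the target side would store pk_id plus row_hash and upsert only rows whose hash changed):

CREATE VIEW dbo.vw_SyncSource
AS
SELECT t1.pk_id,
       t1.col_a,
       t2.col_b,
       -- Delimiter guards against adjacent values concatenating ambiguously;
       -- note CONCAT treats NULL as an empty string
       HASHBYTES('SHA2_256', CONCAT(t1.col_a, '|', t2.col_b)) AS row_hash
FROM dbo.table1 AS t1
JOIN dbo.table2 AS t2 ON t2.pk_id = t1.pk_id;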
If you want to view SQL Server data in PostgreSQL, you can also use tds_fdw:
https://github.com/tds-fdw/tds_fdw
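A minimal sketch of the Postgres-side setup, assuming placeholder connection details and the usual tds_fdw options:

CREATE EXTENSION tds_fdw;

CREATE SERVER mssql_src
  FOREIGN DATA WRAPPER tds_fdw
  OPTIONS (servername 'mssql-host', port '1433', database 'SourceDb');

CREATE USER MAPPING FOR CURRENT_USER
  SERVER mssql_src
  OPTIONS (username 'sync_user', password 'secret');

-- Point a foreign table at the SQL Server view; it can then be queried
-- (or used to refresh a local table) from the Postgres side.
CREATE FOREIGN TABLE sync_source (
  pk_id integer,
  col_a text,
  col_b text
) SERVER mssql_src OPTIONS (table_name 'dbo.vw_SyncSource');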
Also, there are third-party tools which could help you achieve your goal, for example SymmetricDS:
http://www.symmetricds.org/about/overview
I have a SQL Server database where we have created some views based on dim and fact tables. I need to build an SSAS tabular model based on my tables and views, but one of the views takes 1.5 hours to run as a SQL query (in SSMS). I need to use this same view to build my SSAS tabular model, and 1.5 hours is not acceptable. The view is made up of more than 10 table joins and a lot of WHERE conditions.
1) Can I bring all of the tables used in this view into my SSAS tabular model? I am not sure how to join them all and apply the WHERE clauses inside SSAS to build something similar to my view. Is that possible? If yes, how?
or
2) I build the SSAS model once from that view; if I then want to incrementally load the data daily, what is the best way to do that?
The best option is to set up a proper ETL process. That is:
Extract the tables from your source SQL database into a new SQL database that you control.
Transform the data into a star schema.
Load the data from the star schema into SSAS.
On SQL Server, the most common approach is to use SSIS packages for data extraction, movement, and orchestration, and SQL Server Agent jobs for scheduling.
To answer your questions:
Yes, it is certainly possible to bring all of the tables directly from your source system into your tabular model, but please don't do this! You will only create problems for yourself later on when creating DAX calculations.
Incremental loading is something you decide for each table that is imported into your tabular model. Again, this is much easier if you have a proper star schema, as you would typically run full processing on all your dimension tables and do incremental processing only on the largest fact tables.
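For the fact side, a watermark-based load is a common pattern. A rough sketch, assuming hypothetical staging and warehouse tables and a bigint change marker:

DECLARE @watermark bigint;

-- Highest change marker already present in the fact table
SELECT @watermark = COALESCE(MAX(SourceRowVersion), 0)
FROM dw.FactSales;

-- Append only rows that arrived since the last load
INSERT INTO dw.FactSales (DateKey, ProductKey, Quantity, Amount, SourceRowVersion)
SELECT s.DateKey, s.ProductKey, s.Quantity, s.Amount, CAST(s.RowVersion AS bigint)
FROM staging.Sales AS s
WHERE CAST(s.RowVersion AS bigint) > @watermark;

The corresponding SSAS partition can then be processed with Process Add rather than a full reprocess.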
I need to create an hourly .SQB backup file of some specific tables, each filtered with a WHERE clause, from a SQL Server database. As an example, I need this data:
SELECT * FROM table1 WHERE pk_id IN (2,5,7)
SELECT * FROM table2 WHERE pk_id IN (2,5,7)
SELECT * FROM table3 WHERE pk_id IN (2,5,7)
SELECT * FROM table4 WHERE pk_id IN (2,5,7)
The structure of the tables on the source database may change over time, e.g. columns may be added or removed, indexes added, etc.
One option is to do some kind of export, script generation, etc. into a staging database on the same instance of SQL Server. Efficiency aside, I have no problem dropping or truncating the tables on the destination database each time. In short, I'm looking to have both the schema and data of the tables duplicated to the destination database. That's completely acceptable.
Another is to just create a .SQB backup from the source database. Since the .SQB file is all that I really need (it's going to be sent via SFTP), that would be fine, too.
What's the recommended approach in this scenario?
Well, if I understand your requirement correctly, you want data from some tables in your database to be shipped somewhere else periodically.
One thing that is not possible in SQL Server is taking a backup of a subset of tables in a database, so that is not an option.
Since you mentioned you will be using SFTP to send the data, using the BCP command to extract the data is one option, but BCP may or may not perform well, and it definitely will not scale out very well.
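For reference, a single filtered extract with bcp might look like this (server, database, and file names are placeholders; repeat per table):

rem Export one filtered table to a flat file, using a trusted
rem connection (-T) and character format (-c)
bcp "SELECT * FROM MyDb.dbo.table1 WHERE pk_id IN (2,5,7)" queryout table1.dat -S MyServer -T -c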
Instead of BCP, I would prefer an SSIS package: you can do everything (extract files, add WHERE clauses, drop files on the SFTP server, tune your queries, logging, monitoring, etc.) in your SSIS package.
Finally, SQL Server replication can be used to create a subscriber, publishing only the articles (tables) you are interested in; you can also add WHERE clauses to your publications.
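A sketch of such a filtered article, assuming the publication already exists (publication and table names are placeholders):

-- Publish only table1, restricted by a horizontal (row) filter
EXEC sp_addarticle
    @publication   = N'MyFilteredPub',
    @article       = N'table1',
    @source_owner  = N'dbo',
    @source_object = N'table1',
    @filter_clause = N'pk_id IN (2,5,7)';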
Again, there are a few options with the replication subscriber database:
- Give your data clients access to the subscriber database, so there is no need for extracts.
- Use BCP on the subscriber database to extract data without putting load on your production server.
- Use an SSIS package to extract data from the subscriber database.
- Create a backup of the subscriber database and ship the whole backup (.bak) file to SFTP.
In short, there is more than one way to skin a cat; now you have to decide which one suits your requirements best.
I'm looking for the best-practice option for one-way replication between two databases. I would like to keep this purely SQL, but I can write something in C# or use an ETL tool if there are no other good options.
Current setup:
DB1 - There are three instances of this database. It is a large relational database; the schema is the same for each, but they are separate data pots (no replication). Two of the databases are on a 2012 server and one is on a 2014 server.
DB2 - There are two instances of this database on separate servers (Europe, Americas), and the data is merge replicated between the two. The publisher is the 2014 server.
The Goal:
DB2 is tied to some reports. It has one table and a small application attached to that table. Users from many different countries enter data via a small application into DB2 and generate reports out of the application.
DB1 is a relational database that has a very large application on top of it but with fewer users. If users are using the application for DB1 then they should not need to duplicate their records into DB2.
There should be one-way replication from the multiple separate DB1s into DB2. How quickly this happens is not too important.
The important things are:
- No backwards replication occurs from the DB2s to the DB1s (data only flows from the DB1s into one of the DB2s).
- Create, update, and delete actions should occur in DB2 based on the results of a comparison with DB1 (the one-way replication).
Current Approach:
I currently have a flat SQL view on each DB1 database that has the same schema as the table in the DB2 databases that the data needs to go into.
The servers are also joined as linked servers.
My thought was to write a sort of manual replication script on one of the DB2 databases that calls the views from the DB1s and performs the create/update/delete (CUD) actions on a timed basis.
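A rough sketch of what that script might look like for one DB1, using MERGE over the linked server (all names are placeholders; the DELETE clause is restricted so rows entered directly into DB2, or synced from the other DB1s, are left alone, which assumes a source_system marker column on the DB2 table):

MERGE INTO dbo.ReportTable AS t
USING DB1SERVER.DB1.dbo.vw_ReportFeed AS s
    ON t.pk_id = s.pk_id
WHEN MATCHED AND (t.col_a <> s.col_a OR t.col_b <> s.col_b) THEN
    UPDATE SET col_a = s.col_a, col_b = s.col_b
WHEN NOT MATCHED BY TARGET THEN
    INSERT (pk_id, col_a, col_b, source_system)
    VALUES (s.pk_id, s.col_a, s.col_b, 'DB1_EU')
WHEN NOT MATCHED BY SOURCE AND t.source_system = 'DB1_EU' THEN
    DELETE;

Each DB1 would get its own run of this statement with its own marker value; if pk_id values can collide across the DB1s, a composite key including the source marker would be needed.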
It seems to me that there should be an easier way though!?
Any thoughts on how to do this would be very much appreciated.
Keep in mind that since several of the DB1s exist on a SQL 2012 server, there may be some issues, as 2012 might not be allowed to be a publisher for replication to a 2014 server.