I have a transactional database (SQL Server 2014) with around 60 tables, and there is a requirement to create a separate reporting database for reporting purposes.
This will only need to run every 24 hours. However, I will need to move the data into a different, more query-friendly schema.
Because of this, I was hoping I could just create some views on the transactional DB, then create a table based on each view in the reporting DB and copy the data across.
I originally thought of writing a scheduled Windows Service that extracts data from the tables and inserts it into the new ones, but if the schema changes it would have to be updated in two places. I also thought that surely an Enterprise SQL Server license must have some tricks for this.
I then looked into 'database mirroring' on specific tables, but this looks like it will soon be deprecated.
'Log shipping' looks like more of a disaster recovery solution!
Is there an industry 'best' approach to this problem?
You will need to devise an ETL process to extract data from your source database, transform it and load it into your reporting database. There are many tools available to you to make this easier. You can use SSIS, Azure Data Factory for Azure SQL, and there are many other options. You can use the SQL Agent to schedule stored procedures to run your ETL process.
Your target database will look much different than your source database. There is really no quick way (quick as in scheduling a backup) to accomplish this. There is a lot of information on data warehouse and ETL design available to you to assist you in deciding how to proceed.
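As a rough illustration of the extract, transform and load steps described above, here is a minimal sketch in Python. It uses the standard-library sqlite3 module purely as a stand-in for SQL Server, and the table names (customers, orders, customer_sales) are hypothetical, not taken from the question. In practice the same shape could live in a stored procedure scheduled by the SQL Agent.

```python
import sqlite3


def run_nightly_etl(source, target):
    """Rebuild a denormalized reporting table from normalized source tables."""
    # Extract + transform: join and aggregate into the reporting shape.
    rows = source.execute(
        """
        SELECT c.name, COUNT(o.id), COALESCE(SUM(o.amount), 0)
        FROM customers AS c
        LEFT JOIN orders AS o ON o.customer_id = c.id
        GROUP BY c.id, c.name
        ORDER BY c.id
        """
    ).fetchall()
    # Load: full refresh in one transaction, so a failure rolls back
    # to the previous contents of the reporting table.
    with target:
        target.execute("DELETE FROM customer_sales")
        target.executemany(
            "INSERT INTO customer_sales (customer_name, order_count, total_amount)"
            " VALUES (?, ?, ?)",
            rows,
        )
    return len(rows)


def demo():
    # Hypothetical transactional schema (stand-in for the source database).
    source = sqlite3.connect(":memory:")
    source.executescript(
        """
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL,
            amount REAL NOT NULL
        );
        INSERT INTO customers VALUES (1, 'Ada'), (2, 'Bob'), (3, 'Cy');
        INSERT INTO orders VALUES (1, 1, 10.0), (2, 1, 5.5), (3, 2, 7.0);
        """
    )
    # Hypothetical query-friendly reporting table.
    target = sqlite3.connect(":memory:")
    target.execute(
        "CREATE TABLE customer_sales"
        " (customer_name TEXT, order_count INTEGER, total_amount REAL)"
    )
    run_nightly_etl(source, target)
    run_nightly_etl(source, target)  # a second run replaces rather than duplicates
    return target.execute(
        "SELECT customer_name, order_count, total_amount"
        " FROM customer_sales ORDER BY customer_name"
    ).fetchall()
```

A full refresh like this is the simplest option for a 24-hour cycle; whether it is fast enough depends on data volume, which the question does not state.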
I'm modeling a new microservice architecture migrating some part of a monolithic software to microservices.
I'm adding a new PostgreSQL database, and the idea is to use only that database in the future. For now, though, I still need to keep the old SQL Server database up to date and also synchronize the PostgreSQL database whenever something new appears in the old database.
I've searched for ETL tools, but they are meant to move data to a data warehouse (that's not what I need). I can't simply replicate the information because the DB model is not the same.
Basically I need a way to detect new rows inserted in the SQL Server database, transform that information and insert it in my PostgreSQL.
Any suggestions?
PostgreSQL's foreign data wrappers might be useful. My approach would be to change the frontend to use PostgreSQL and let PostgreSQL handle the split via its various features (triggers, rules, ...).
Take a look at StreamSets Data Collector. It can detect changes in SQL Server and insert/update/delete to any DB that has a JDBC driver including Postgres. It is open source but you can buy support. You can also make field changes/additions/removals/renaming to the data stream so that the fields match the target table.
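For anyone who wants to see the "detect new rows, transform, insert" loop from the question in code before picking a tool, here is a minimal high-water-mark sketch in Python. It is not how StreamSets or foreign data wrappers work internally. sqlite3 stands in for both SQL Server and PostgreSQL, and the tables (legacy_users, users, sync_state) are hypothetical. It assumes the source table has an ever-increasing integer id and only picks up inserts; updates and deletes would need something like SQL Server change tracking or CDC instead.

```python
import sqlite3


def sync_new_rows(source, target):
    """Copy rows inserted since the last run, reshaping them for the target."""
    # Read the highest source id that has already been copied.
    last_id = target.execute(
        "SELECT last_source_id FROM sync_state WHERE table_name = 'legacy_users'"
    ).fetchone()[0]
    new_rows = source.execute(
        "SELECT id, first_name, last_name FROM legacy_users"
        " WHERE id > ? ORDER BY id",
        (last_id,),
    ).fetchall()
    if not new_rows:
        return 0
    # Insert the transformed rows and advance the mark in one transaction.
    with target:
        for row_id, first, last in new_rows:
            target.execute(
                "INSERT INTO users (legacy_id, full_name) VALUES (?, ?)",
                (row_id, f"{first} {last}"),  # transform: two columns become one
            )
        target.execute(
            "UPDATE sync_state SET last_source_id = ?"
            " WHERE table_name = 'legacy_users'",
            (new_rows[-1][0],),
        )
    return len(new_rows)


def demo():
    source = sqlite3.connect(":memory:")
    source.executescript(
        """
        CREATE TABLE legacy_users (
            id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT
        );
        INSERT INTO legacy_users VALUES
            (1, 'Ada', 'Lovelace'), (2, 'Alan', 'Turing');
        """
    )
    target = sqlite3.connect(":memory:")
    target.executescript(
        """
        CREATE TABLE users (legacy_id INTEGER PRIMARY KEY, full_name TEXT);
        CREATE TABLE sync_state (
            table_name TEXT PRIMARY KEY, last_source_id INTEGER
        );
        INSERT INTO sync_state VALUES ('legacy_users', 0);
        """
    )
    counts = [sync_new_rows(source, target)]
    source.execute("INSERT INTO legacy_users VALUES (3, 'Grace', 'Hopper')")
    source.commit()
    counts.append(sync_new_rows(source, target))
    counts.append(sync_new_rows(source, target))  # nothing new this time
    names = [
        row[0]
        for row in target.execute("SELECT full_name FROM users ORDER BY legacy_id")
    ]
    return counts, names
```

Keeping the mark in the target database, updated in the same transaction as the inserts, means a failed run is simply retried from the old mark.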
I have two SQL Server 2014 DBs with different schemas. These DBs served two distinct web applications operating in the same area of interest, hence I have similar tables in the two DBs. What is the easiest way to migrate data between them? I was thinking about a Transact-SQL script. Is there a tool that could solve this task more easily?
If the migration is relatively simple, or if you want to reduce the number of tools involved, you can stick with a T-SQL script. If you want to run it on a schedule, you can execute it with SQL Server Agent as a T-SQL step, or wrap it in a stored procedure and call that from the Agent. If different servers are involved, you can create a linked server.
If you like a visual tool, or if the process is very complex and you do not want to write T-SQL scripts, then SSIS is a great tool that specializes in taking data from disparate sources, applying transformations and conversions, and importing the result. Some people also like to use SSIS for simple tasks because of the visual design surface.
Without more details it is hard to say which route is best. If I had two DBs that were very similar, I would consider merging the designs to accommodate both business lines / customers, and add flexibility to allow more business lines / customers into the same design in the future.
We have a requirement to move data between different database instances on a regular basis (e.g. when some customers are willing to pay more for better performance). So this is not going to be a one-off.
The database tables have referential integrity. Is there a way to do this without rewriting a SQL script (or some other method) every time we migrate a customer's data?
I came across the question "How to move data between multiple database's table while maintaining foreign-key relationships/referential integrity?". However, it appears that we would have to write a script every time we migrate data (please correct me if I misunderstood the answer on that thread).
Thanks
Edit:
Both servers are using SQL Server 2012 (same version). It's an Azure SQL Server database.
They are not necessarily linked (no firewall between them).
We are only transferring some data, not the whole database. This is only for certain customers who opted to pay more.
The schemas are exactly the same in both databases.
Preyash - please see the documentation on the Split-Merge tool. The Split-Merge tool enables you to move data between databases, as you have described, based on a sharding key (e.g., customer ID). One modification that you will need for your application is to add a shard map (i.e., a database that understands the global state of which customers reside in which databases).
Have a look at Azure Data Sync. It is more closely aligned with your requirements, but you may end up with another SQL Azure DB to maintain as a hub. Azure Data Sync follows a hub-and-spoke pattern and lets you configure sync direction flexibly, with a syncing gap of a few minutes. It is simpler and can be set up very quickly without any scripts, as you wanted.
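To make the sharding-key idea more concrete, here is a small Python sketch of moving one customer's rows between two databases with identical schemas, driven by a fixed table list rather than a per-migration script. This is only an illustration of the concept, not how Split-Merge or Data Sync are implemented. sqlite3 stands in for Azure SQL, and the table names, the KEY_COLUMN mapping and the in-memory shard_map dictionary are all hypothetical.

```python
import sqlite3

# Parent tables first, so foreign keys are satisfied on insert.
TABLES_PARENT_FIRST = ["customers", "orders"]
# Column that holds the sharding key (customer id) in each table.
KEY_COLUMN = {"customers": "id", "orders": "customer_id"}

SCHEMA = """
    PRAGMA foreign_keys = ON;
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers (id),
        amount REAL NOT NULL
    );
"""


def move_customer(source, target, customer_id):
    """Copy one customer's rows to the target, then delete them from the source."""
    moved = {}
    with target:  # copy parents before children, in one transaction
        for table in TABLES_PARENT_FIRST:
            rows = source.execute(
                f"SELECT * FROM {table} WHERE {KEY_COLUMN[table]} = ?",
                (customer_id,),
            ).fetchall()
            if rows:
                placeholders = ", ".join("?" * len(rows[0]))
                target.executemany(
                    f"INSERT INTO {table} VALUES ({placeholders})", rows
                )
            moved[table] = len(rows)
    with source:  # delete children before parents
        for table in reversed(TABLES_PARENT_FIRST):
            source.execute(
                f"DELETE FROM {table} WHERE {KEY_COLUMN[table]} = ?",
                (customer_id,),
            )
    return moved


def demo():
    source = sqlite3.connect(":memory:")
    source.executescript(SCHEMA)
    source.executescript(
        """
        INSERT INTO customers VALUES (1, 'Acme'), (2, 'Globex');
        INSERT INTO orders VALUES (10, 1, 99.0), (11, 1, 1.0), (12, 2, 5.0);
        """
    )
    target = sqlite3.connect(":memory:")
    target.executescript(SCHEMA)
    # Hypothetical shard map: which database each customer lives in.
    shard_map = {1: "standard", 2: "standard"}
    moved = move_customer(source, target, 1)
    shard_map[1] = "premium"
    left = source.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
    arrived = target.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
    return moved, left, arrived, shard_map
```

Note that the copy and the delete are two separate transactions on two databases, so a real implementation would need to handle a failure between them; that is part of what a dedicated tool takes care of.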
A little background:
I have a remote, stand alone SQL Server database that is truncated at the end of every weekend. The data is hardly relational, not normalized at all, and pretty annoying to work with. On top of that, the schema for this database cannot be modified at all, because it is recreated by a third party application. Before the database is destroyed each week, a backup is created of that week's data. On average each database will have between 500,000 and 2,000,000 records.
My task is to create a historical version of this database that is a superset of all of these database backups. It should tie into our other databases, which contain related sets of information. I have already started on an application to perform this task, and I've gotten to the point where I'm able to match data with our other databases, but I'm wondering if there's any best practice for handling this kind of import.
How do I make sure that I have unique IDs in my historical version of this database? Are there any features in SQL Server that can do some of the heavy lifting in this for me?
Thanks for your time on this.
There's definitely a feature in SQL Server that can assist you, and that feature is called SSIS (SQL Server Integration Services). One of the main uses of SSIS is ETL (Extract, Transform, Load), which means extracting data from several diverse sources, transforming it into whatever you need to get into your destination database (such as a data warehouse - any linking with existing data will also happen here), and finally loading it into your destination DB.
I think the best way to get started, if that's what you want of course, is to pick up a good book on SSIS and go through it. While reading, don't forget to play around with the BIDS (Business Intelligence Development Studio - one of the SQL Server tools) to create some test packages.
Furthermore, on the internet you'll find plenty of "getting started" articles.
For your case in particular what I would do is:
create a generic package that can import the data from a source DB (one of your weekly DBs) and insert it into the destination DB - this package can be parameterized using Parent Package Configuration.
create a main package that loops over all backups in a certain folder, restores them one by one and calls the generic import package for each restore. After each successful import, the Control Flow would delete the previously-restored DB.
I think I've given you enough material to investigate on now :-)
Good luck,
Valentino.
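On the unique-ID part of the question, one common pattern (sketched here in Python, not as an SSIS package) is to give each history row a new surrogate key and keep the original ID together with a label for the week it came from. The sketch uses sqlite3 as a stand-in for SQL Server, where the surrogate key would typically be an IDENTITY column. The table names, column names and week labels are hypothetical.

```python
import sqlite3


def import_week(history, weekly, week_label):
    """Append one weekly database to the history table; return rows added."""
    before = history.execute("SELECT COUNT(*) FROM records_history").fetchone()[0]
    rows = weekly.execute("SELECT id, payload FROM records ORDER BY id").fetchall()
    with history:
        # The UNIQUE (source_week, source_id) constraint makes a repeated
        # import of the same week a no-op instead of a duplicate.
        history.executemany(
            "INSERT OR IGNORE INTO records_history"
            " (source_week, source_id, payload) VALUES (?, ?, ?)",
            [(week_label, rid, payload) for rid, payload in rows],
        )
    after = history.execute("SELECT COUNT(*) FROM records_history").fetchone()[0]
    return after - before


def make_weekly(payloads):
    """Build a stand-in for one restored weekly backup (ids restart at 1)."""
    weekly = sqlite3.connect(":memory:")
    weekly.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, payload TEXT)")
    weekly.executemany(
        "INSERT INTO records (payload) VALUES (?)", [(p,) for p in payloads]
    )
    weekly.commit()
    return weekly


def demo():
    history = sqlite3.connect(":memory:")
    history.execute(
        """
        CREATE TABLE records_history (
            history_id INTEGER PRIMARY KEY AUTOINCREMENT,  -- new surrogate key
            source_week TEXT NOT NULL,
            source_id INTEGER NOT NULL,
            payload TEXT,
            UNIQUE (source_week, source_id)
        )
        """
    )
    backups = {
        "2024-W01": make_weekly(["a", "b"]),
        "2024-W02": make_weekly(["c", "d"]),
    }
    # Loop over all restored backups, as the main package would.
    added = [import_week(history, db, week) for week, db in backups.items()]
    # Re-importing an already loaded week adds nothing.
    added.append(import_week(history, backups["2024-W01"], "2024-W01"))
    rows = history.execute(
        "SELECT history_id, source_week, source_id"
        " FROM records_history ORDER BY history_id"
    ).fetchall()
    return added, rows
```

Because the weekly database is truncated and its IDs may restart, the pair (source_week, source_id) identifies a row across backups while history_id stays unique within the historical database.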
I have been googling a lot and I couldn't find if this even exists or I'm asking for some magic =P
Ok, so here's the deal.
I need to have a way to create a "master-structured" database which will only contain the schemas, structures, tables, stored procedures, UDFs, etc., everything but real data, in SQL Server 2005 (if this is available in 2008 let me know, I could try to convince my client to pay for it =P)
Then I want to have several "children" of that master db which implement those schemas, tables, etc but each one has different data.
So when I need to create a new stored procedure or something like that, I just create it on the master database (and of course it's available on its children).
Currently I have several different databases with the same schema and different data. But the problem is maintaining consistency between them. Every time I create a script to create some SP or add some index or whatever, I have to execute it in every database, and sometimes I could miss one =P
So let's say you have a UNIVERSE (would be the master db) and the universe has SPACES (each one represented by a child db). The application I'm working on needs to dynamically "clone" SPACES. To do that, we have to create a new database. At the moment I create a backup of the db being cloned, restore it as a new one and truncate the tables.
I want to be able to create a new "child" of the "master" db, which will maintain the schemas and everything, but will start with empty data.
Hope it's clear... My English is not perfect, sorry about that =P
Thanks to all!
What you really need is to version-control your database schema.
See do-you-source-control-your-databases
If you use SQL Server, I would recommend dbGhost - not expensive and does a great job of:
synchronizing 2 databases
diff-ing 2 databases
creating a database from a set of scripts (I would recommend this version).
batch support, so that you can upgrade all your databases using a single batch
You can use this infrastructure for both:
rolling development versions to test, integration and production systems
rolling your 'updated' system to multiple production deployments (especially in a hosted environment)
I would write my changes as a sql file and use OSQL or SQLCMD via a batchfile to ensure that I repeatedly executed on all the databases without thinking about it.
As an alternative I would use the Visual Studio Database Pro tools or Red Gate SQL Compare to compare and propagate the changes.
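The "run one script against every database" idea can be sketched like this in Python. It is not a replacement for SQLCMD or the comparison tools, just an illustration of the loop plus a small bookkeeping table so a database is never missed or upgraded twice. sqlite3 stands in for SQL Server, and the schema_changes table, the database names and the script name are hypothetical.

```python
import sqlite3


def apply_script_everywhere(databases, script_name, script_sql):
    """Run one change script against every database, at most once per database."""
    results = {}
    for name, conn in databases.items():
        # Bookkeeping table recording which scripts this database has seen.
        conn.execute(
            "CREATE TABLE IF NOT EXISTS schema_changes"
            " (script_name TEXT PRIMARY KEY)"
        )
        already = conn.execute(
            "SELECT 1 FROM schema_changes WHERE script_name = ?", (script_name,)
        ).fetchone()
        if already:
            results[name] = "skipped"
            continue
        conn.executescript(script_sql)
        conn.execute(
            "INSERT INTO schema_changes (script_name) VALUES (?)", (script_name,)
        )
        conn.commit()
        results[name] = "applied"
    return results


def demo():
    # Three hypothetical child databases sharing one schema.
    databases = {
        name: sqlite3.connect(":memory:")
        for name in ("space_a", "space_b", "space_c")
    }
    script = "CREATE TABLE audit_log (id INTEGER PRIMARY KEY, message TEXT);"
    first = apply_script_everywhere(databases, "001_add_audit_log", script)
    second = apply_script_everywhere(databases, "001_add_audit_log", script)
    return first, second
```

Keeping the change scripts themselves in source control, as the other answers suggest, gives you the ordered list of script names to feed into a loop like this.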
There are kludges, but the mainstream way to handle this is still to use Source Code Control (with all its other attendant benefits.) And SQL Server is increasingly SCC friendly.
Also, for many (most robust) sites it's a per-server issue as much as a per-database issue.
You can put things in master like SPs and call them from anywhere. As far as other objects like tables, you can put them in model and new databases will get them when you create a new database.
However, there is nothing built in that makes new tables simply pop up in the child databases after being added to the parent.
It would be possible to create something to look through the databases and script them from a template database, and there are also commercial tools which can help discover differences between databases. You could also have a DDL trigger in the "master" database which went out and did this when you created a new table.
If you kept a nice SPACES template, you could script it out (without data) and create the new database - so there would be no need to TRUNCATE. You can script it out from SQL or an external tool.
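Scripting a template out without data and replaying it into a new database could look roughly like the following Python sketch. sqlite3 stands in for SQL Server here, so the catalog it reads (sqlite_master) is SQLite-specific; on SQL Server you would generate the script with the management tools or an external tool instead. The template objects (items, idx_items_name, item_names) are hypothetical.

```python
import sqlite3


def clone_schema(template, new_db):
    """Recreate the template's tables, indexes and views in new_db, without rows."""
    ddl = template.execute(
        """
        SELECT sql FROM sqlite_master
        WHERE sql IS NOT NULL AND name NOT LIKE 'sqlite_%'
        ORDER BY CASE type WHEN 'table' THEN 0 ELSE 1 END, name
        """
    ).fetchall()
    # Tables are created first, then the indexes and views that depend on them.
    for (statement,) in ddl:
        new_db.execute(statement)
    new_db.commit()
    return len(ddl)


def demo():
    # Hypothetical SPACES template; it may contain rows, which are not copied.
    template = sqlite3.connect(":memory:")
    template.executescript(
        """
        CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT);
        CREATE INDEX idx_items_name ON items (name);
        CREATE VIEW item_names AS SELECT name FROM items;
        INSERT INTO items (name) VALUES ('alpha'), ('beta');
        """
    )
    new_space = sqlite3.connect(":memory:")
    created = clone_schema(template, new_space)
    names = sorted(
        row[0] for row in new_space.execute("SELECT name FROM sqlite_master")
    )
    row_count = new_space.execute("SELECT COUNT(*) FROM items").fetchone()[0]
    return created, names, row_count
```

Since only the definitions are replayed, the new database starts empty and there is nothing to truncate afterwards.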
A little trivia here. The mssqlsystemresource database works as you describe: it is defined once and 'appears' in every database as the special sys schema. Unfortunately the special 'magic' needed to get this working is not available to user databases. You'll have to use deployment techniques to keep your schema in sync. That is, apply the changes to every database, as the other answers already suggested.
In theory, you could put a trigger on your UNIVERSE.sysobjects table (assuming SQL Server; note that from SQL Server 2005 onward sysobjects is a compatibility view, so a DDL trigger is the practical way to get this effect), and then you could enumerate master.dbo.sysdatabases to find all the child databases. If you have a special table that indicates it's a child database, you can reference child.dbo.sysobjects to find it.
Make no mistake, it would be difficult to implement. But it's one way you could do it.