We need to create an automated process for cloning small SQL Server databases, but in the destination database all primary keys should be distinct from the source (we are using UNIQUEIDENTIFIER ids for all primary keys). We have thousands of databases that all have the same schema, and need to use this "clone" process to create new databases with all non-key data matching, but referential integrity maintained.
Is there an easy way to do this?
Update - Example:
Each database has ~250 transactional tables that need to be cloned. Consider the following simple example of a few tables and their relationships (each table has a UniqueIdentifier primary key = id):
location
doctor
doctor_location (to doctor.id via doctor_id, to location.id via location_id)
patient
patient_address (to patient.id via patient_id)
patient_medical_history (to patient.id via patient_id)
patient_doctor (to patient.id via patient_id, to doctor.id via doctor_id)
patient_visit (to patient.id via patient_id)
patient_payment (to patient.id via patient_id)
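For illustration, the pattern for a couple of these tables is roughly as follows (the non-key columns are just placeholders):

CREATE TABLE patient
(
    id   UNIQUEIDENTIFIER NOT NULL PRIMARY KEY,
    name NVARCHAR(200)
);

CREATE TABLE patient_address
(
    id         UNIQUEIDENTIFIER NOT NULL PRIMARY KEY,
    patient_id UNIQUEIDENTIFIER NOT NULL REFERENCES patient (id),
    address    NVARCHAR(400)
);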
The reason we need to clone the databases is due to offices being bought out or changing ownership (due to partnership changes, this happens relatively frequently). When this occurs, the tax and insurance information changes for the office. Legally this requires an entirely new corporate structure, and the financials between offices need to be completely separated.
However, most offices want to maintain all of their patient history, so they opt to "clone" the database. The new database will be stripped of financial history, but all patient/doctor data will be maintained. The old database will have all information up to the point of the "clone".
The reason new GUIDs are required is that we consolidate all databases into a single relational database for reporting purposes. Since all transactional tables have GUIDs, this works great ... except for the cases of the clones.
Our only solution so far has been to dump the database to a text file and search and replace GUIDs. This is ridiculously time-consuming, so we're hoping for a better way.
I'd do this by creating a basic restore of the database and then updating every primary key value to a new GUID.
To make this automatically update all the foreign keys, you need to add the foreign key constraints with the ON UPDATE CASCADE option, i.e.
CREATE TABLE Orders
(
    OrderID uniqueidentifier PRIMARY KEY,
    CustomerID uniqueidentifier REFERENCES Customer (CustomerID) ON UPDATE CASCADE
    -- etc...
);
Now when you update the Customer table's CustomerID, the Orders table's CustomerID is updated too.
You can do this to a whole table using a simple update query:
UPDATE Customer SET CustomerID = NEWID();
You'd need to do this for each table with a uniqueidentifier as its primary key.
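Since every transactional table follows the same pattern, you could also generate those UPDATE statements from the catalog views instead of writing ~250 of them by hand. A rough sketch, assuming each primary key is a single uniqueidentifier column and the cascading foreign keys are already in place:

-- Build one UPDATE per table whose primary key is a uniqueidentifier column,
-- then execute the whole batch. The cascades take care of the foreign keys.
DECLARE @sql nvarchar(max) = N'';

SELECT @sql += N'UPDATE ' + QUOTENAME(s.name) + N'.' + QUOTENAME(t.name)
            + N' SET ' + QUOTENAME(c.name) + N' = NEWID();' + CHAR(13)
FROM sys.tables t
JOIN sys.schemas s        ON s.schema_id = t.schema_id
JOIN sys.indexes i        ON i.object_id = t.object_id AND i.is_primary_key = 1
JOIN sys.index_columns ic ON ic.object_id = i.object_id AND ic.index_id = i.index_id
JOIN sys.columns c        ON c.object_id = ic.object_id AND c.column_id = ic.column_id
JOIN sys.types ty         ON ty.user_type_id = c.user_type_id
WHERE ty.name = 'uniqueidentifier';

EXEC sys.sp_executesql @sql;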
You could create an Integration Services (SSIS) package to do this. You would create the new database in the control flow, then copy the data from the source to the destination using the data flow, which would also replace the GUIDs or make other needed transformations along the way.
If the DBs have a large number of tables, and only a few of them need to be modified, then you might be better off just making a copy of the MDF/LDF files, re-attaching them with a new DB name, and using a script to update the IDs.
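If you go the file-copy route, the core of it would look something along these lines (the paths and database names are placeholders, and the source database is offline while it is detached):

-- Detach the source, copy its files at the OS level, then attach both copies.
EXEC sp_detach_db @dbname = N'SourceDb';

-- Copy SourceDb.mdf / SourceDb_log.ldf to CloneDb.mdf / CloneDb_log.ldf here
-- (xp_cmdshell, PowerShell, or an SSIS file task), then re-attach:

CREATE DATABASE SourceDb
    ON (FILENAME = N'D:\Data\SourceDb.mdf'), (FILENAME = N'D:\Data\SourceDb_log.ldf')
    FOR ATTACH;

CREATE DATABASE CloneDb
    ON (FILENAME = N'D:\Data\CloneDb.mdf'), (FILENAME = N'D:\Data\CloneDb_log.ldf')
    FOR ATTACH;

A BACKUP/RESTORE ... WITH MOVE would achieve the same copy without taking the source offline.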
The advantage of using SSIS is that it's easier to fully automate. The downside is that it might take a little longer to get things set up.
One of the requirements of a recent project I was working on was maintaining a history of database table data as part of an audit trail. My first thought about the technical solution was to use triggers, but after some research I learned about SQL Server temporal tables (part of core SQL Server 2016). I did a lot of research around this and can see that temporal tables can be put to good use.
More on temporal tables: Managing Temporal Table History in SQL Server 2016
However, I want rows to be written to the temporal history table only when certain columns are changed.
CREATE TABLE dbo.Persons
(
ID BIGINT IDENTITY(1,1) NOT NULL,
FirstName NVARCHAR(50) NOT NULL,
LastName NVARCHAR(50),
PhoneNumber NVARCHAR(20)
)
Now if I create the temporal table on top of this (SYSTEM_VERSIONING = ON), I want data to be inserted into the history table only when PhoneNumber is changed, and not when FirstName or LastName are changed.
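For reference, enabling versioning would look roughly like this (the history table and constraint names are just placeholders; note that system versioning also requires a primary key, which the table above doesn't have yet):

-- System versioning needs a primary key on the table.
ALTER TABLE dbo.Persons
    ADD CONSTRAINT PK_Persons PRIMARY KEY (ID);

-- Add the period columns, with defaults so existing rows are valid.
ALTER TABLE dbo.Persons ADD
    ValidFrom DATETIME2 GENERATED ALWAYS AS ROW START
        CONSTRAINT DF_Persons_ValidFrom DEFAULT SYSUTCDATETIME(),
    ValidTo DATETIME2 GENERATED ALWAYS AS ROW END
        CONSTRAINT DF_Persons_ValidTo DEFAULT CONVERT(DATETIME2, '9999-12-31 23:59:59.9999999'),
    PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo);

-- Turn versioning on; every update to any column now writes a history row.
ALTER TABLE dbo.Persons
    SET (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.PersonsHistory));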
Unfortunately, that's not the way it works. Like the link in your post says, "system versioning is all-or-nothing". Honestly, your first instinct is likely your best option - every other method of doing it (CDC, replication, system versioning..) will capture more data than you want and you will have to pare the results down after the fact.
If you really want to use system versioning, you'd just have to use one of the options presented in the provided link: delete unwanted rows and/or update unwanted columns to NULL values.
I would recommend going with your first instinct and use triggers to implement something like a type 4 slowly changing dimension. It's the most straightforward method of getting the specific data you want.
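A minimal sketch of that trigger approach, with a hypothetical dbo.PersonsPhoneHistory table; only rows where PhoneNumber actually changed are recorded:

-- History table for just the column we care about.
CREATE TABLE dbo.PersonsPhoneHistory
(
    ID          BIGINT NOT NULL,
    PhoneNumber NVARCHAR(20),
    ChangedAt   DATETIME2 NOT NULL DEFAULT SYSUTCDATETIME()
);
GO

CREATE TRIGGER dbo.trg_Persons_PhoneHistory
ON dbo.Persons
AFTER UPDATE
AS
BEGIN
    SET NOCOUNT ON;

    -- Capture the old value only when PhoneNumber actually changed.
    INSERT INTO dbo.PersonsPhoneHistory (ID, PhoneNumber)
    SELECT d.ID, d.PhoneNumber
    FROM deleted d
    JOIN inserted i ON i.ID = d.ID
    WHERE ISNULL(d.PhoneNumber, N'') <> ISNULL(i.PhoneNumber, N'');
END;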
You could create one table for the attributes you want history for (and you'll set system_versioning = ON) and a second table with the attributes you don't want history for. Between the two tables you'll have a 1-to-1 relation.
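A sketch of that split, with hypothetical table names; only the versioned table carries SYSTEM_VERSIONING, so FirstName/LastName edits never generate history rows:

-- Columns you want history for.
CREATE TABLE dbo.PersonsVersioned
(
    ID          BIGINT NOT NULL PRIMARY KEY,
    PhoneNumber NVARCHAR(20),
    ValidFrom   DATETIME2 GENERATED ALWAYS AS ROW START NOT NULL,
    ValidTo     DATETIME2 GENERATED ALWAYS AS ROW END NOT NULL,
    PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo)
)
WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.PersonsVersionedHistory));

-- Columns you don't want history for, 1-to-1 with the versioned table.
CREATE TABLE dbo.PersonsStatic
(
    ID        BIGINT NOT NULL PRIMARY KEY REFERENCES dbo.PersonsVersioned (ID),
    FirstName NVARCHAR(50) NOT NULL,
    LastName  NVARCHAR(50)
);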
I am trying to migrate an Oracle database to Postgres using a foreign data wrapper (oracle_fdw) by creating foreign tables.
But since a foreign table acts like a view over the Oracle data, every query fetches the data on the fly from the original source, which increases the processing time.
As of now, in order to keep a physical copy of the data on the Postgres side, I am creating a table and inserting the data into it.
eg: create table employee_details as select * from emp_det;
where employee_details is a physical table and emp_det is a foreign table
But this process feels redundant, and from time to time we need to keep the table in sync (new inserts, updates, or deletes).
So if anyone could share another way to preserve this data on the Postgres side, that would be appreciated.
Regards,
See the identical GitHub issue.
oracle_fdw does not store the Oracle data on the PostgreSQL side.
Each access to a foreign table directly accesses the Oracle database.
If you want a copy of the data physically located in the PostgreSQL database, you can either do it like you described, or you could use a materialized view:
CREATE MATERIALIZED VIEW emp_det_mv AS SELECT * FROM emp_det;
That does the same thing, but more simply. To refresh the data, you can run
REFRESH MATERIALIZED VIEW emp_det_mv;
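If the copy needs to stay readable while it refreshes, a unique index on the materialized view allows a concurrent refresh (the id column here is an assumption about emp_det):

-- CONCURRENTLY requires a unique index on the materialized view.
CREATE UNIQUE INDEX emp_det_mv_id_idx ON emp_det_mv (id);

-- Refresh without blocking readers of the materialized view.
REFRESH MATERIALIZED VIEW CONCURRENTLY emp_det_mv;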
My company has an application with a bunch of database tables that used to use a sequence table to determine the next value to use. Recently, we switched this to using an identity property. The problem is that in order to upgrade a client to the latest version of the software, we have to change about 150 tables to identity. To do this manually, you can right click on a table, choose design, change (Is Identity) to "Yes" and then save the table. From what I understand, in the background, SQL Server exports this to a temporary table, drops the table and then copies everything back into the new table. Clients may have their own unique indexes and possibly other things specific to the client, so making a generic script isn't really an option.
It would be really awesome if there were a stored procedure for scripting this task rather than doing it in the GUI (which takes FOREVER). We made a macro that can go through and do this, but even then it takes a long time to run and is error-prone. Something like: exec sp_change_to_identity 'table_name', 'column name'
Does something like this exist? If not, how would you handle this situation?
Update: This is SQL Server 2008 R2.
This is what SSMS seems to do:
1. Obtain and drop all the foreign keys pointing to the original table.
2. Obtain the indexes, triggers, foreign keys and statistics of the original table.
3. Create a temp_table with the same schema as the original table, but with the identity field.
4. Insert into temp_table all the rows from the original table (with IDENTITY_INSERT ON).
5. Drop the original table (this will drop its indexes, triggers, foreign keys and statistics).
6. Rename temp_table to the original table name.
7. Recreate the foreign keys obtained in (1).
8. Recreate the objects obtained in (2).
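For a single table, the core of what you'd script looks roughly like this (the table shape is a placeholder; the foreign keys, indexes, triggers and statistics from steps 1, 2, 7 and 8 would be scripted out and recreated around the swap):

BEGIN TRANSACTION;

-- 1. New table with the same shape, but with IDENTITY on the key column.
CREATE TABLE dbo.SomeTable_tmp
(
    ID   INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    Name NVARCHAR(100) NOT NULL
);

-- 2. Copy the data, preserving the existing key values.
SET IDENTITY_INSERT dbo.SomeTable_tmp ON;
INSERT INTO dbo.SomeTable_tmp (ID, Name)
SELECT ID, Name FROM dbo.SomeTable;
SET IDENTITY_INSERT dbo.SomeTable_tmp OFF;

-- 3. Swap the tables.
DROP TABLE dbo.SomeTable;
EXEC sp_rename 'dbo.SomeTable_tmp', 'SomeTable';

-- 4. Make sure the identity seed continues from the current maximum.
DBCC CHECKIDENT ('dbo.SomeTable', RESEED);

COMMIT TRANSACTION;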
I have to migrate a Firebird database to MS SQL, and the original author designed the database in such a way that it has mutable primary keys. For example, we have these tables:
PRODUCTS with primary key (NAME,CATEGORY,SHORTNAME) and some data
PRICE_LISTING (NAME,CATEGORY,SHORTNAME) - foreign key to PRODUCT, and some data like price
PRODUCT_ALTERNATIVES (ORIGINAL_NAME,ORIGINAL_CATEGORY,ORIGINAL_SHORTNAME; REPLACEMENT_NAME,REPLACEMENT_CATEGORY,REPLACEMENT_SHORTNAME) - foreign keys to PRODUCTS
As you can see, whenever a product name changes, a lot of data in other tables has to change. The original author solved this with cascading updates on the foreign keys. For some reason, MS SQL doesn't allow me to use cascading updates on these foreign keys, so the database cannot be migrated so easily.
What makes things worse is that the software has a lot of editing screens that use DataSets, DataGridViews and binding in Windows Forms, so the editor works pretty much automagically. For example, the editor for PRODUCTS is a simple Windows form, and whenever a user renames a product (changes its NAME), the changes are propagated through the database (via the cascading update) to the PRICE_LISTING and PRODUCT_ALTERNATIVES tables. And also, if the app wants to show price listings, it does a simple SELECT * FROM PRICE_LISTING that includes product names and prices.
One approach might be to add surrogate keys (ID identity columns) to each table, and then modify, for example, the PRICE_LISTING table to (PRODUCT_ID, price), etc. But then, to show the price, I'd have to define a view that will
SELECT PRICE_LISTING.*, PRODUCT.NAME, PRODUCT.CATEGORY, PRODUCT.SHORTNAME FROM PRICE_LISTING JOIN PRODUCT ON ...
Then the automatically generated DataSets etc. will have to be modified to select from the view and update through the view.
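Spelled out, such a view might look like this (the PRICE column and the new ID column are just how I imagine the surrogate-key version of the schema):

CREATE VIEW PRICE_LISTING_VIEW
AS
SELECT pl.PRODUCT_ID,
       pl.PRICE,
       p.NAME,
       p.CATEGORY,
       p.SHORTNAME
FROM PRICE_LISTING pl
JOIN PRODUCTS p ON p.ID = pl.PRODUCT_ID;

and to keep the DataSet-based editors working, an INSTEAD OF UPDATE trigger on the view would have to translate edits back to the base tables.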
In a nutshell, and from the original author's perspective, we are breaking the program and making it more complicated. The only thing that does not work when ported in a straightforward manner is the cascading update, but I can see that the database design is very flawed and it will be a maintenance nightmare if not fixed.
What to do?
Using Merge Replication, I have a table that for the most part is synchronized normally. However, the table contains one column that is used to store temporary, client-side data, which is only meaningfully edited and used on the client, and which I don't have any desire to have replicated back to the server. For example:
CREATE TABLE MyTable (
ID UNIQUEIDENTIFIER NOT NULL PRIMARY KEY,
Name NVARCHAR(200),
ClientCode NVARCHAR(100)
)
In this case, even if subscribers make changes to the ClientCode column in the table, I don't want those changes getting back to the server. Does Merge Replication offer any means to accomplish this?
An alternate approach, which I may fall back on, would be to publish an additional table, and configure it to be "Download-only to subscriber, allow subscriber changes", and then reference MyTable.ID in that table, along with the ClientCode. But I'd rather not have to publish an additional table if I don't absolutely need to.
Thanks,
-Dan
Yes, when you create the article in the publication, don't include this column. Then, create a script that adds this column back to the table, and in the publication properties, under snapshot, specify that this script executes after the snapshot is applied.
This means that the column will exist on both the publisher and subscriber, but will be entirely ignored by replication. Of course, you can only use this technique if the column(s) to ignore are nullable.
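For the example table above, the post-snapshot script would just be something like this (the column has to be nullable, since replication never populates it at the subscriber):

-- Runs at the subscriber after the snapshot is applied; the column exists
-- on both sides but is not part of the published article.
ALTER TABLE dbo.MyTable
    ADD ClientCode NVARCHAR(100) NULL;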