We're looking to migrate from MSSQL to Postgres. I intend to use SQL Server's bcp tool to generate CSV files that we'll import into Postgres with its bulk COPY features. We are, however, having trouble getting the DDL migrated. I've gotten it to work by hand-massaging the DDL generated by MSSQL, but we need something automated, since we have a moving target (we're still adding tables, columns, etc.) and will need to do this more than once.
We're open to commercial and open source products, but we have not found anything that does everything we need: tables, serial columns, indexes (unique, multi-column, etc.), defaults, and foreign key constraints.
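For reference, this is the kind of per-table data pipeline we have in mind (table, database, and server names are placeholders, and delimiter/quoting handling is one of the details that still needs care):

bcp MyDb.dbo.MyTable out MyTable.csv -c -t, -S MyServer -T
psql -d mydb -c "\copy my_table FROM 'MyTable.csv' WITH (FORMAT csv)"

It's the DDL side of this that we haven't been able to automate.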
Check this link http://dbconvert.com/convert-mssql-to-postgre-pro.php?DB=6
Specifically look at the "SQL Azure to PostgreSQL" feature. Hopefully that will handle your table DDL.
NOTE: I have not used this; I just ran across it at http://www.postgresql.org/ in the latest news section a few days ago.
We finally settled on using http://www.enterprisedb.com/products-services-training/products-overview/postgres-plus-solution-pack/migration-toolkit. It's the most thorough and handles both data and DDL. There have been some issues with escaping backslashes that are followed by numbers, since Postgres can interpret these as escape sequences.
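A minimal illustration of that pitfall, assuming the affected values end up in escape-string literals (with standard_conforming_strings on, which has been the default since PostgreSQL 9.1, plain literals keep the backslash as-is):

SELECT E'C:\101';  -- escape-string syntax: \101 is read as an octal escape for 'A', so this returns 'C:A'
SELECT 'C:\101';   -- standard-conforming string: the backslash stays literal, so this returns 'C:\101'

So when loading, either double the backslashes or make sure standard_conforming_strings is on.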
Related
I have a PostgreSQL database that I want to move to SQL Server -- both schema and data. I am poor so I don't want to pay any money. I am also lazy, so I don't want to do very much work. Currently I'm doing this table by table, and there are about 100 tables to do. This is extremely tedious.
Is there some sort of trick that does what I want?
You should be able to find some useful information in the accepted answer in this Serverfault page: https://serverfault.com/questions/65407/best-tool-to-migrate-a-postgresql-database-to-ms-sql-2005.
If you can get the schema converted without the data, you may be able to shorten the steps for the data by using this command:
pg_dump --data-only --column-inserts your_db_name > data_load_script.sql
This load will be quite slow, but the --column-inserts option generates the most generic INSERT statements possible for each row of data and should be compatible.
EDIT: Suggestions on converting the schema follow:
I would start by dumping the schema, but removing anything that has to do with ownership or permissions. This should be enough:
pg_dump --schema-only --no-owner --no-privileges your_db_name > schema_create_script.sql
Edit this file to add the line BEGIN TRANSACTION; to the beginning and ROLLBACK TRANSACTION; to the end. Now you can load it and run it in a query window in SQL Server. If you get any errors, make sure you go to the bottom of the file, highlight the ROLLBACK statement and run it (by hitting F5 while the statement is highlighted).
Basically, you have to resolve each error until the script runs through cleanly. Then you can change the ROLLBACK TRANSACTION to COMMIT TRANSACTION and run one final time.
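In other words, the file you run in SQL Server ends up shaped like this (the middle comment stands for the generated schema script):

BEGIN TRANSACTION;
-- ... contents of schema_create_script.sql ...
ROLLBACK TRANSACTION;  -- switch to COMMIT TRANSACTION once the script runs through cleanly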
Unfortunately, I cannot help with which errors you may see as I have never gone from PostgreSQL to SQL Server, only the other way around. Some things that I would expect to be an issue, however (obviously, NOT an exhaustive list):
PostgreSQL implements auto-increment fields by linking a NOT NULL INTEGER column to a SEQUENCE via a DEFAULT. In SQL Server, the closest equivalent is an IDENTITY column, but they're not exactly the same thing. SQL Server 2012 and later do support CREATE SEQUENCE, but on older versions you may have to remove the sequences and switch those "id" columns to IDENTITY (see the sketch after this list).
Database functions / Stored Procedures do not translate between RDBMS platforms. You'll need to remove any CREATE FUNCTION statements and translate the algorithms manually.
Be careful about encoding of the data file. I'm a Linux person, so I have no idea how to verify encoding in Windows, but you need to make sure that what SQL Server expects is the same as the file you are importing from PostgreSQL. pg_dump has an option --encoding= that will let you set a specific encoding. I seem to recall that Windows tends to use two-byte, UTF-16 encoding for Unicode where PostgreSQL uses UTF-8. I had some issue going from SQL Server to PostgreSQL due to UTF-16 output so it would be worth researching.
The PostgreSQL data type TEXT is simply a VARCHAR without a maximum length. In SQL Server, TEXT is... complicated (and deprecated). Each field in your original schema that is declared as TEXT will need to be reviewed for an appropriate SQL Server data type (VARCHAR(MAX) or NVARCHAR(MAX) is usually the closest match).
SQL Server has separate data types for Unicode data (NCHAR/NVARCHAR). I'm not familiar enough with it to make suggestions. I'm just pointing out that it may be an issue.
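For the sequence point above, a rough before/after sketch with placeholder names (on SQL Server 2012+ you could instead keep a real SEQUENCE):

-- What pg_dump emits for a serial column looks roughly like:
--   CREATE TABLE my_table (id integer NOT NULL, name text);
--   CREATE SEQUENCE my_table_id_seq OWNED BY my_table.id;
--   ALTER TABLE my_table ALTER COLUMN id SET DEFAULT nextval('my_table_id_seq');
-- A SQL Server equivalent using IDENTITY:
CREATE TABLE my_table (
    id int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    name nvarchar(max) NULL
);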
I have found a faster and easier way to accomplish this.
First copy your table (or query) to a tab delimited file like so:
COPY (SELECT siteid, searchdist, listtype, list, sitename, county, street,
city, state, zip, georesult, elevation, lat, lng, wkt, unlocated_bool,
id, status, standard_status, date_opened_or_reported, date_closed,
notes, list_type_description FROM mlocal) TO 'c:\SQLAzureImportFiles\data_script_mlocal.tsv' NULL E''
Next you need to create your table in SQL Server; this approach will not handle the schema for you. The table must match your exported TSV file in column order and data types.
Finally, run SQL Server's bcp utility to bring in the TSV file, like so:
bcp MyDb.dbo.mlocal in "\\NEWDBSERVER\SQLAzureImportFiles\data_script_mlocal.tsv" -S tcp:YourDBServer.database.windows.net -U YourUserName -P YourPassword -c
A couple of things of note that I encountered: Postgres and SQL Server handle boolean fields differently. Your SQL Server schema needs to have the boolean fields set to varchar(1), and the resulting data will be 'f', 't', or null. You will then have to convert each such field to a bit by doing something like:
ALTER TABLE mlocal ADD unlocated bit;
GO  -- run the ALTER in its own batch so the following UPDATEs can see the new column
UPDATE mlocal SET unlocated=1 WHERE unlocated_bool='t';
UPDATE mlocal SET unlocated=0 WHERE unlocated_bool='f';
ALTER TABLE mlocal DROP COLUMN unlocated_bool;
Another thing is that the geography/geometry fields are very different between the two platforms. Export the geometry fields as WKT using ST_AsText(geo) and convert them appropriately on the SQL Server end.
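A sketch of what that conversion can look like on the SQL Server side, assuming the WKT landed in a text column named wkt and that your data uses SRID 4326 (adjust both to your schema):

-- The PostgreSQL export uses ST_AsText(geo) AS wkt (PostGIS), as noted above.
-- On SQL Server, rebuild a geography column from the imported WKT text:
ALTER TABLE mlocal ADD geo geography;
GO
UPDATE mlocal SET geo = geography::STGeomFromText(wkt, 4326);
ALTER TABLE mlocal DROP COLUMN wkt;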
There may be more incompatibilities needing tweaks like this.
EDIT: So whereas this technique does technically work, I am trying to transfer several million records from 100+ tables to SQL Azure, and it turns out that bcp to SQL Azure is pretty flaky. I keep getting intermittent "Unable to open BCP host data-file" errors, the server intermittently times out, and for some reason some records simply do not get transferred, with no indication of errors or problems. So this technique is not stable for transferring large amounts of data to Azure SQL.
You can use Navicat, a powerful GUI tool for working with various databases, including Postgres and SQL Server.
You can transfer both schema and data easily as follows:
Create two connections, one for the source database and one for the target database
Go to Tools -> Data Transfer
Select the source database and the target database along with the IP, database name, and schema
As you can see in the options, if a target table does not exist, it will be created
Ta-da! It takes about 10 minutes to transfer all 63 of my tables and their data from Postgres to SQL Server.
Enjoy it!
Dear all, I am currently researching how to handle changing the collation on our database.
Somebody made the unusual decision to create an accent-sensitive database for global use... but I am on my way to handling this!
REASON for changing the collation: the database contains data collected from different countries, and as we all know, some cultures have their own letters.
Out of respect for our customers, our organization would like to have an accent-insensitive database. That will allow users to request data from the server without any limitations when using local characters.
As far as I have found out, one option may be to drop the constraints etc., change the collation, and then bring everything back. In this case, I am afraid of how this might affect the already existing data (columns).
Another way: I have found an article, Collation change on 2005 and 2008 server. However, it does not cover SQL Server 2012.
I am also taking the complexity of this example into consideration.
I believe I am not in an easy position, but I am hoping to get a few pieces of advice on what would be the best and safest way to handle this.
Thank you for your concerns and assistance.
UPDATE: Let me add what architecture we have: the complete system contains 4 databases and more than 1,000 tables in total. So my expectation is that not all of the possible approaches will work in an optimal way.
I too had to deal with a similar issue, though for a different reason: ancient databases with an old SQL collation, installed ages ago on a SQL 6.5 server that has been in-place upgraded through every version from SQL 7 to SQL 2005 and now has to be updated to SQL 2012.
Why all these in-place upgrades? Because the actual collation was the server collation, and it was so old that it is not available during the install process of a recent version (2000+) of SQL Server...
I decided to drop all that old rubbish, so I had to find a way that allowed me to move to a new installation with a Windows collation.
I had to rule out a data migration (creating a new database and importing the data) because of the lack of documentation and the huge number of customizations, triggers, hidden rules, and so on.
The solution I used (the order matters):
Disable automatic statistics generation
Script the creation of all foreign keys and then drop them
Script the unique and primary key indexes and then drop them
Script all remaining indexes and then drop them
Script custom statistics and then drop them
Script CHECK and DEFAULT constraints and then drop them
Now you can run the ALTER commands needed to change the collation of the columns and of the database itself (see the sketch after this list).
When done, repeat the above in reverse order to rebuild all the needed objects.
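A rough sketch of those ALTER commands (table, column, and type names are placeholders; the target collation is just an example):

-- Per column: repeat for every character column, keeping its original type and nullability:
ALTER TABLE dbo.Customers
    ALTER COLUMN CustomerName nvarchar(100) COLLATE Latin1_General_CI_AI NOT NULL;

-- Then the database default, so new columns pick up the new collation:
ALTER DATABASE MyDatabase COLLATE Latin1_General_CI_AI;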
If the database is as old as mine, you may run into something funny like an existing foreign key that references columns with different data types.
Changing collation of all existing columns is a real pain. I suggest a side-by-side migration rather than alter each column individually. Create a new database with the desired collation containing only empty tables. Copy data from the old db to the new one using INSERT...SELECT (or the ETL tool of your choice), and then create constraints, indexes, and other database objects.
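A minimal sketch of the copy step (database, table, and column names are placeholders; CustomerId is assumed to be an IDENTITY column, hence the IDENTITY_INSERT toggles):

SET IDENTITY_INSERT NewDb.dbo.Customers ON;
INSERT INTO NewDb.dbo.Customers (CustomerId, CustomerName)
SELECT CustomerId, CustomerName
FROM   OldDb.dbo.Customers;
SET IDENTITY_INSERT NewDb.dbo.Customers OFF;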
Consider upvoting the Make it easy to change collation on a database SQL Server feature request.
There are a number of complicated solutions on the internet for in-place collation changes, but the simplest (and safest) way we have found is to script out the database, alter the script so that it creates a new database with the collation set at the start, and then import the data into the new database.
We achieve this using MS SQL Server 2012 Management Studio in the following way:
Script out all database objects with Tasks -> Generate Scripts -> Script entire Database and all Database objects
Alter the script with the following 2 changes and then run it to create a new database:
a) Change DB name to MY-NEW-DB
b) Under the CREATE DATABASE statement, add: ALTER DATABASE [MY-NEW-DB] COLLATE Latin1_General_CS_AS (see the sketch after these steps)
If desired, use a tool like RG SQL Compare to compare the old and new databases and verify that all indexes, constraints, types, etc. are the same and that only the collation on the relevant columns has changed.
Run Tasks -> Import Data, ensuring 'Enable Identity Insert' is checked. All data transferred to the new case-sensitive database correctly.
Run DBCC CHECKDB if you wish to check consistency.
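The edited top of the generated script ends up looking roughly like this (the database name and collation are the example values from the steps above; the generated CREATE DATABASE will also contain file specifications, omitted here):

CREATE DATABASE [MY-NEW-DB]
GO
ALTER DATABASE [MY-NEW-DB] COLLATE Latin1_General_CS_AS
GO
-- ... the rest of the scripted objects (tables, indexes, constraints, etc.) follow ...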
I've got a reasonably large / complicated DB which I need to upgrade in the field from version 1 to version 2. There's a lot of changes in schema and importantly data between the two.
Yes, I know this should have been version controlled, à la:
http://www.codinghorror.com/blog/2008/02/get-your-database-under-version-control.html
but it wasn't - it will be when I am done.
So, the current problem: I'm faced with the choice of either going through all the commits or trying to diff between the two versions of the db. So far I've tried:
http://opendbiff.codeplex.com/
http://www.sqldelta.com/
http://www.red-gate.com/
However, none of them seem to be able to successfully generate schema upgrade scripts, because they don't also handle the data at the same time. This results in foreign key violations when adding new keys to tables: the referenced table is new, and while the schema for that table has been created, the data it should contain has not. Well, it could be, but that requires me to use a different part of the tool and then mix the two scripts together.
I know this may look like a duplicate of:
What is best tool to compare two SQL Server databases (schema and data)?
which is where I found most of the existing tools I've tried, but so far I've not managed to get any of them to produce a working schema migration script (I'm really not too fussed about the data, but I do need the data that is required for the foreign keys, which to be honest is all the difference, since I've deployed both the old version and the new version).
Am I expecting too much?
Should I give up and start manually stitching together what I do have?
Or do I go through all the commits and manually create upgrade scripts?
I can't think of more powerful tools available than the ones you seem to have tried. If those fail, my homegrown versioning system probably won't help you much either.
However, you should be able to generate an update script and then manually edit it to add the data transformations to it.
And/or you could disable the foreign key constraints for the time that the update script runs.
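One common pattern for that, as a sketch (sp_MSforeachtable is an undocumented but long-standing system procedure; you could also generate explicit ALTER statements from sys.foreign_keys if you prefer to avoid it):

-- Stop checking all foreign key constraints while the upgrade script runs:
EXEC sp_MSforeachtable 'ALTER TABLE ? NOCHECK CONSTRAINT ALL';

-- ... run the schema and data upgrade here ...

-- Re-enable the constraints and re-validate existing rows afterwards:
EXEC sp_MSforeachtable 'ALTER TABLE ? WITH CHECK CHECK CONSTRAINT ALL';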
There is no such thing as doing schema and data "at the same time". Even if you have them in one big script, you would still be doing the schema first and then the data. If the schema script creates a new table and adds a constraint to it, there is no reason you should get a referential integrity violation error, as there are no rows in those tables.
In any case, you should give our xSQL Schema Compare and Data Compare tools a try; you will be impressed with the performance and the level of control you get.
We have customers who are upgrading from one database version to another (Oracle 9i to Oracle 10g or 11g to be specific). In one case, a customer exported the old database and imported it into the new one, but for some reason the indexes and constraints didn't get created. They may have done this on purpose to speed up the import process, but we're still looking into the reason why.
The real question is, is there a simple way that we can verify that the structure of the database is complete after the import? Is there some sort of checksum that we can do on the structure? We realize that we could do a bunch of queries to see if all the tables, indexes, aliases, views, sequences, etc. exist, but this would probably be difficult to write and maintain.
Update
Thanks for the answers suggesting commercial and/or GUI tools, but we really need something free that we can package with our product. It also has to be command-line or script driven so our customers can run it in any environment (Unix, Linux, Windows).
Presuming a single schema, do something like this: dump USER_OBJECTS into a table before the migration.
CREATE TABLE SAVED_USER_OBJECTS AS SELECT * FROM USER_OBJECTS
Then, to validate after your migration:
SELECT object_type, object_name FROM SAVED_USER_OBJECTS
MINUS
SELECT object_type, object_name FROM USER_OBJECTS
One issue: if you have intentionally dropped objects between versions, you will also need to delete them from SAVED_USER_OBJECTS. Also, this will not pick up cases where the wrong version of an object exists.
If you have multiple schemas, then the same thing is required for each schema OR use ALL_OBJECTS and extract/compare for the relevant user schemas.
You could also do a hash/checksum on object_type || object_name for the whole schema (save before, compare after), but the cost of the calculation wouldn't be that different from simply comparing the two tables.
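A sketch of that checksum idea (assumes ORA_HASH, available in Oracle 10g and later; summing the per-row hashes makes the result independent of row order):

-- Run before the migration, save the number, then run again afterwards and compare:
SELECT SUM(ORA_HASH(object_type || '|' || object_name)) AS schema_checksum
FROM   user_objects;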
If you are willing to spend some money, DBDiff is an efficient utility that does exactly what you need.
http://www.dkgas.com/oradbdiff.htm
In SQL Developer (the free Oracle utility) there is a Database Schema Differences feature.
It's worth trying.
Hope it helps.
SQL Developer - download
Roni.
I wouldn't write the check script; I'd write a program to generate the check script from a particular version of the database. Just go through the metadata, record what's there, and write it to a file, then compare the values in that file against the values in the customer's database. This won't work so well if you use system-generated names for your constraints, but it is probably enough to just verify that things are there. Dropping indexes and constraints is pretty common when migrating a database, so you might not even need to check too much; if two or three things are missing, then it's not unreasonable to assume they all are. You might also want to write a script that drops all the constraints and indexes and re-creates them, and just have your customers run that as a post-migration step. Just be sure you drop everything by name, so you don't delete any custom indexes your customer might have created.
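A sketch of the kind of metadata snapshot such a generated check could be built on (the schema name is a placeholder; spool this from your reference database, ship the output with the product, and diff it against the same query run on the customer's database):

SELECT object_type, object_name
FROM   all_objects
WHERE  owner = 'APP_SCHEMA'          -- placeholder: your application schema
ORDER  BY object_type, object_name;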