How to migrate a PostgreSQL database into a SQL Server one?

I have a PostgreSQL database that I want to move to SQL Server -- both schema and data. I am poor so I don't want to pay any money. I am also lazy, so I don't want to do very much work. Currently I'm doing this table by table, and there are about 100 tables to do. This is extremely tedious.
Is there some sort of trick that does what I want?

You should be able to find some useful information in the accepted answer in this Serverfault page: https://serverfault.com/questions/65407/best-tool-to-migrate-a-postgresql-database-to-ms-sql-2005.
If you can get the schema converted without the data, you may be able to shorten the steps for the data by using this command:
pg_dump --data-only --column-inserts your_db_name > data_load_script.sql
This load will be quite slow, but the --column-inserts option generates the most generic INSERT statements possible for each row of data and should be compatible.
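The dump file also contains PostgreSQL-only statements around the INSERTs, which SQL Server will reject. A minimal Python sketch for stripping them is below. The prefix list is an assumption based on typical pg_dump output, and the function name is made up for illustration. It works line by line, so it is only safe if no text value in your data spans several lines.

```python
# Prefixes of PostgreSQL-only lines that pg_dump writes around the data.
# This list is an assumption based on typical pg_dump output; extend as needed.
POSTGRES_ONLY_PREFIXES = ("SET ", "SELECT pg_catalog.", "--")


def keep_portable_lines(dump_text: str) -> str:
    """Drop blank lines, comments and PostgreSQL-specific statements.

    Caveat: this is line based, so a multi-line string value whose
    continuation line starts with one of the prefixes would be damaged.
    """
    kept = []
    for line in dump_text.splitlines():
        if not line.strip():
            continue  # blank line
        if line.startswith(POSTGRES_ONLY_PREFIXES):
            continue  # comment or PostgreSQL-only statement
        kept.append(line)
    return "\n".join(kept) + "\n"
```

You would read data_load_script.sql, pass its contents through keep_portable_lines, and write the result to a new file before running it against SQL Server.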
EDIT: Suggestions on converting the schema follow:
I would start by dumping the schema, but removing anything that has to do with ownership or permissions. This should be enough:
pg_dump --schema-only --no-owner --no-privileges your_db_name > schema_create_script.sql
Edit this file to add the line BEGIN TRANSACTION; to the beginning and ROLLBACK TRANSACTION; to the end. Now you can load it and run it in a query window in SQL Server. If you get any errors, make sure you go to the bottom of the file, highlight the ROLLBACK statement and run it (by hitting F5 while the statement is highlighted).
Basically, you have to resolve each error until the script runs through cleanly. Then you can change the ROLLBACK TRANSACTION to COMMIT TRANSACTION and run one final time.
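If you end up editing the file many times, the BEGIN/ROLLBACK wrapping can be scripted rather than done by hand. A small Python sketch, where the function name and the commit flag are my own invention:

```python
def wrap_in_transaction(script: str, commit: bool = False) -> str:
    """Wrap a SQL script in a transaction.

    With commit=False the script ends in ROLLBACK, so a trial run
    leaves nothing behind. Switch to commit=True for the final run.
    """
    closing = "COMMIT TRANSACTION;" if commit else "ROLLBACK TRANSACTION;"
    return f"BEGIN TRANSACTION;\n{script.rstrip()}\n{closing}\n"


# Trial version of a one-statement schema script.
print(wrap_in_transaction("CREATE TABLE t (id INT);"))
```

Keeping the unwrapped schema file as the thing you edit, and regenerating the wrapped version each time, avoids forgetting to flip ROLLBACK back after a test.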
Unfortunately, I cannot help with which errors you may see as I have never gone from PostgreSQL to SQL Server, only the other way around. Some things that I would expect to be an issue, however (obviously, NOT an exhaustive list):
PostgreSQL does auto-increment fields by linking a NOT NULL INTEGER field to a SEQUENCE using a DEFAULT. In SQL Server, this is an IDENTITY column, but they're not exactly the same thing. I'm not sure how closely they match, so if your original schema is full of "id" fields, you may be in for some trouble. SQL Server only supports CREATE SEQUENCE from version 2012 onward, so on older versions you will have to remove those statements.
Database functions / Stored Procedures do not translate between RDBMS platforms. You'll need to remove any CREATE FUNCTION statements and translate the algorithms manually.
Be careful about encoding of the data file. I'm a Linux person, so I have no idea how to verify encoding in Windows, but you need to make sure that what SQL Server expects is the same as the file you are importing from PostgreSQL. pg_dump has an option --encoding= that will let you set a specific encoding. I seem to recall that Windows tends to use two-byte, UTF-16 encoding for Unicode where PostgreSQL uses UTF-8. I had some issue going from SQL Server to PostgreSQL due to UTF-16 output so it would be worth researching.
The PostgreSQL datatype TEXT is simply a VARCHAR without a max length. In SQL Server, TEXT is... complicated (and deprecated). Each field in your original schema that is declared as TEXT will need to be reviewed for an appropriate SQL Server data type.
SQL Server has extra data types for UNICODE data. I'm not familiar enough with it to make suggestions. I'm just pointing out that it may be an issue.
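To make the type review less tedious, a lookup table of candidate replacements can help. The Python sketch below is only a starting point: the mapping reflects commonly used equivalents, not an authoritative list, and the names are hypothetical. It does not cover parameterized types such as character varying(50), nor the sequence-backed id columns mentioned above.

```python
# Candidate SQL Server types for some PostgreSQL types.
# These pairs are common choices, not the only valid ones; review each column.
TYPE_MAP = {
    "text": "NVARCHAR(MAX)",
    "boolean": "BIT",
    "bytea": "VARBINARY(MAX)",
    "uuid": "UNIQUEIDENTIFIER",
    "double precision": "FLOAT",
    "timestamp without time zone": "DATETIME2",
}


def suggest_sql_server_type(pg_type: str) -> str:
    """Return a candidate SQL Server type, or the input if none is known."""
    key = " ".join(pg_type.lower().split())  # normalize case and spacing
    return TYPE_MAP.get(key, pg_type)


print(suggest_sql_server_type("TEXT"))
print(suggest_sql_server_type("integer"))
```

Types that are missing from the table come back unchanged, so anything SQL Server still rejects shows up as an error when you run the schema script.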

I have found a faster and easier way to accomplish this.
First copy your table (or query) to a tab delimited file like so:
COPY (SELECT siteid, searchdist, listtype, list, sitename, county, street,
city, state, zip, georesult, elevation, lat, lng, wkt, unlocated_bool,
id, status, standard_status, date_opened_or_reported, date_closed,
notes, list_type_description FROM mlocal) TO 'c:\SQLAzureImportFiles\data_script_mlocal.tsv' NULL E''
Next you need to create your table in SQL Server; this technique will not handle any schema for you. The table must match your exported tsv file in field order and data types.
Finally you run SQL Server's bcp utility to bring in the tsv file like so:
bcp MyDb.dbo.mlocal in "\\NEWDBSERVER\SQLAzureImportFiles\data_script_mlocal.tsv" -S tcp:YourDBServer.database.windows.net -U YourUserName -P YourPassword -c
A couple of things of note that I encountered. Postgres and SQL Server handle boolean fields differently. Your SQL Server schema needs to have your boolean fields set to varchar(1), and the resulting data will be 'f', 't' or null. You will then have to convert this field to a bit by doing something like:
ALTER TABLE mlocal ADD unlocated bit;
UPDATE mlocal SET unlocated=1 WHERE unlocated_bool='t';
UPDATE mlocal SET unlocated=0 WHERE unlocated_bool='f';
ALTER TABLE mlocal DROP COLUMN unlocated_bool;
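An alternative to the varchar(1) detour is to rewrite the boolean fields in the tsv file before loading it, so the target column can be declared as bit from the start. The Python sketch below assumes that bcp in character mode accepts 1 and 0 for a bit column (worth verifying on a small file first) and that you know the zero-based positions of the boolean columns. The function name is hypothetical.

```python
from typing import Iterable

# PostgreSQL writes booleans as 't' and 'f' in COPY output.
BOOL_TO_BIT = {"t": "1", "f": "0"}


def booleans_to_bits(tsv_line: str, bool_columns: Iterable[int]) -> str:
    """Replace 't'/'f' with '1'/'0' in the given zero-based columns.

    Empty fields (NULLs, given NULL E'' in the COPY command) are left as-is.
    """
    fields = tsv_line.rstrip("\n").split("\t")
    for index in bool_columns:
        fields[index] = BOOL_TO_BIT.get(fields[index], fields[index])
    return "\t".join(fields) + "\n"


print(booleans_to_bits("42\tt\tMain St\n", [1]), end="")
```

Only the listed columns are touched, so a text column that happens to contain a lone "t" or "f" is safe as long as its index is not in bool_columns.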
Another thing is the geography/geometry fields are very different between the two platforms. Export the geometry fields as WKT using ST_AsText(geo) and convert appropriately on the SQL Server end.
There may be more incompatibilities needing tweaks like this.
EDIT: While this technique does technically work, I am trying to transfer several million records from 100+ tables to SQL Azure, and bcp to SQL Azure turns out to be pretty flaky. I keep getting intermittent "Unable to open BCP host data-file" errors, the server intermittently times out, and for some reason some records are not transferred, with no indication of errors or problems. So this technique is not stable for transferring large amounts of data to Azure SQL.

You can use Navicat, a powerful GUI tool for working with various databases including Postgres and SQL Server.
You can transfer both schema and data easily as follows:
Create two connections, one for the source database and one for the target
Go to Tools -> Data Transfer
Select the source database and the target database, each with its IP, database name and schema
As the options show, if a target table does not exist, it will be created
Tada, it took 10 mins to transfer all 63 of my tables and their data from Postgres to SQL Server.
Enjoy it!

Related

Export SQLite integer as DateTime

I have an SQLite3 database. I also have an SQL Server database with the same structure. I need to export the data from SQLite and insert it into the SQL Server database.
The export from SQLite and the modification of the generated export needs to be 100% scripted. Inserting into the SQL Server database will be done manually through SQL Server Management Studio.
I have a mostly good dump of the database through this answer here. I can modify most of the script as needed with sed.
The one thing I'm stuck on right now is that the SQLite database stores timestamps as number of seconds since UNIX epoch. The equivalent column in SQL Server is DATETIME. As far as I know, inserting an integer into a DateTime won't work.
Is there a way to specify that certain fields be converted a certain way upon dumping from SQLite? Meaning, specify that the integer fields be dumped as proper DateTime strings that SQL Server will understand?
Or, is there something I can run on the Linux command line that will somehow find these Integer timestamps and convert them?
EDIT: Anything that runs in a Bash script on Ubuntu is acceptable.
Three basic solutions: (1) modify the data before the dump; (2) manipulate the file after the dump, or (3) modify the data on import. Which you choose will depend on how much freedom you have to modify schemas.
If you wish to do it in SQLite, I'd suggest adding text columns with the dates stored as needed for import to SQL Server, then ignore or remove the original columns on dump. The SQLite doc page for datetime() may help, as might answers to this question.
Or, you can write a function in SQL Server that handles the import. Perhaps set it on an insert trigger.
Otherwise, a script that manipulates your dump file would work too. It sounds like you have a good handle on how to do this.
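As a sketch of the first option, SQLite's datetime() function with the 'unixepoch' modifier turns the integer into a text timestamp at query time, so no schema change is strictly required. The Python example below uses an in-memory database with a made-up events table and created_at column. The resulting strings are in UTC.

```python
import sqlite3


def rows_with_datetimes(conn: sqlite3.Connection) -> list:
    """Select rows with the epoch column rendered as 'YYYY-MM-DD HH:MM:SS' (UTC)."""
    query = "SELECT id, datetime(created_at, 'unixepoch') FROM events ORDER BY id"
    return conn.execute(query).fetchall()


# Hypothetical table standing in for the real schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, created_at INTEGER)")
conn.executemany("INSERT INTO events (created_at) VALUES (?)", [(0,), (1700000000,)])

for row in rows_with_datetimes(conn):
    print(row)
```

The same SELECT can be run from the sqlite3 command line tool inside a Bash script. You would then build the INSERT statements from its output instead of from the raw dump.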

Collation change on MS sql server 2012

Dear all, I am currently researching how to handle a collation change on our database.
Somebody made the unusual decision to create an accent-sensitive database for global use... but I am on the way to handling this!
REASON: the database contains data collected from different countries, and as we all know, some cultures have their own letters.
Out of respect for our customers, our organization would like to have an accent-insensitive database. That will allow users to request data from the server without any limitations when using local characters.
As far as I have found out, one option may be to drop constraints etc., change the collation, and then bring everything back. In this case I am not sure whether that would be enough to affect already existing data (columns).
As another option, I have found an article, Collation change on 2005 and 2008 server. However, it does not cover the 2012 server.
I am also taking the complexity of that example into consideration.
I realize this is not an easy task, but I am hoping to get some advice on the best and safest way to handle it.
Thank you for your concerns and assistance.
UPDATE: let me add what architecture we have. The complete system contains 4 databases and more than 1,000 tables in total, so I expect that not all of the possible approaches will work in an optimal way.
I had to deal with a similar issue too, for a different reason: ancient databases with an old SQL collation installed ages ago on a SQL 6.5 server that has been upgraded in place for each version from SQL 7 to SQL 2005 and now should be updated to SQL 2012.
Why all these in-place upgrades? Because the actual collation was the server collation and was so old that it is not available during the install process of a recent version (2000+) of SQL Server...
I decided to drop all that old rubbish, so I had to find a way that allowed me to move to a new installation with a Windows collation.
I had to rule out data migration (create a new database and import data) because of the lack of documentation and the huge number of customizations, triggers, hidden rules and so on.
The solution I used (the order matters):
disable automatic statistics generation
script the creation of all foreign keys and then drop them
script unique and primary indexes and then drop them
script all remaining indexes and then drop them
script custom statistics and then drop them
script CHECK and DEFAULT constraints and then drop them
Now you can run the ALTER commands needed to change the collation of the columns and change the collation of the database itself.
When done, repeat the above in reverse order to rebuild all the needed objects.
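With more than 1,000 tables, the ALTER commands themselves are best generated rather than typed. A rough Python sketch that builds the T-SQL text from a list of column descriptions is below. The TextColumn structure, the function names, the example table and the target collation Latin1_General_CI_AI are all assumptions for illustration. In practice the column list would come from the database's own metadata.

```python
from typing import NamedTuple


class TextColumn(NamedTuple):
    """Description of one character column (hypothetical structure)."""
    schema: str
    table: str
    name: str
    type_decl: str   # full declaration, e.g. "nvarchar(50)"
    nullable: bool


def alter_column_collation_sql(col: TextColumn, collation: str) -> str:
    """Build the ALTER TABLE statement that re-collates one column."""
    null_clause = "NULL" if col.nullable else "NOT NULL"
    return (
        f"ALTER TABLE [{col.schema}].[{col.table}] "
        f"ALTER COLUMN [{col.name}] {col.type_decl} "
        f"COLLATE {collation} {null_clause};"
    )


def alter_database_collation_sql(database: str, collation: str) -> str:
    """Build the statement that changes the database default collation."""
    return f"ALTER DATABASE [{database}] COLLATE {collation};"


col = TextColumn("dbo", "Customer", "LastName", "nvarchar(50)", False)
print(alter_column_collation_sql(col, "Latin1_General_CI_AI"))
```

The type and nullability have to be repeated in each statement because ALTER COLUMN redefines the whole column, which is why they are part of the column description.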
If the database is as old as mine, you may run into something funny, like an existing foreign key that references fields with different datatypes.
Changing collation of all existing columns is a real pain. I suggest a side-by-side migration rather than alter each column individually. Create a new database with the desired collation containing only empty tables. Copy data from the old db to the new one using INSERT...SELECT (or the ETL tool of your choice), and then create constraints, indexes, and other database objects.
Consider upvoting the Make it easy to change collation on a database SQL Server feature request.
There are a number of complicated solutions on the internet for inplace collation changes but the simplest (and safest) way we have found is to script out the database, alter the script to create a new db with the collation set at the start and then import the data to the new database.
We achieve this using MS SQL Server 2012 Management Studio in the following way:
Script out all database objects with Tasks -> Generate Scripts -> Script entire Database and all Database objects
Alter the script with the following 2 changes and then run it to create a new database:
a) Change DB name to MY-NEW-DB
b) Under the CREATE DATABASE statement add: ALTER DATABASE [MY-NEW-DB] collate Latin1_General_CS_AS
If desired, use a tool like RG SQL Compare to compare the old and new database to verify that all indexes, constraints, types etc. are the same and that only the collation on relevant columns has changed.
Run Tasks -> Import Data, ensuring 'Enable Identity Insert' is checked. All data transferred to the new case sensitive database correctly.
Run DBCC CHECKDB if you wish to check consistency

Fast batch insert/update with SQL Server 2008 and BCP

I'm not a good SQL programmer, I've got only the basics, but I've heard of some BCP thing for fast data loading. I've searched the internet and it seems to be a command-line only utility, and not something you can use in code.
The thing is, I want to be able to make very fast inserts and updates in a SQL Server 2008 database. I would like to have a function in the database that would accept:
The name of the table I want to execute an insert/update operation against
The names of the columns I'll be feeding data to
The data in a CSV format or something that SQL can read stupid-fast
A flag indicating whether the function should perform an insert or update operation
This function would then read this CSV string and generate the necessary code for inserting/updating the table.
I would then write code in C# to call that function passing it the table name, column names, a list of objects serialized as a CSV string and the insert/update flag.
As you can see, this is intended to be both fast and generic, suitable for any project dealing with large amounts of data, and thus a candidate to my company's framework.
Am I thinking right? Is this a good idea? Can I use that BCP thing, and is it suitable to every case?
As you can see, I need some directions on this... thanks in advance for any help!
In C#, look at SqlBulkCopy. It's what SSIS uses in the background.
For true bcp/BULK INSERT, you'd need bulkadmin rights, which may not be allowed.
Have you considered using SQL Server Integration Services (SSIS)? It's designed to do exactly what you describe. It is very fast. You can insert data on a transactional basis. And you can set it up to run on a schedule. And much more.

MaxDB Data and Schema Export to SQL Server 2005/8

I am tasked with exporting the data contained inside a MaxDB database to SQL Server 200x. I was wondering if anyone has gone through this before and what your process was.
Here is my idea, but it's not automated.
1) Export data from MaxDB for each table as a CSV.
2) Clean the CSV to remove ? (which it uses for nulls) and fix the date strings.
3) Use SSIS to import the data into tables in SQL Server.
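The null cleanup in step 2 could be scripted along these lines. This is a Python sketch that assumes an ordinary comma-separated export, and the function name is invented. It only handles the ? null marker. The date strings are left alone because their format is not given here.

```python
import csv
import io


def blank_out_null_markers(csv_text: str, null_marker: str = "?") -> str:
    """Replace fields that consist solely of the null marker with empty fields.

    Caveat: a field whose real value is exactly "?" is blanked as well.
    """
    reader = csv.reader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        writer.writerow(["" if field == null_marker else field for field in row])
    return out.getvalue()


print(blank_out_null_markers('1,?,"Why?"\n2,Bob,?\n'), end="")
```

Going through the csv module rather than a plain text replace means question marks inside longer values are left alone.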
I was wondering if anyone has tried linking MaxDB to SQL Server or what other suggestions or ideas you have for automating this.
Thanks.
AboutDev.
I managed to find a solution to this. There is an open source MaxDB library that will allow you to connect to it through .Net much like the SQL provider. You can use that to get schema information and data, then write a little code to generate scripts to run in SQL Server to create tables and insert the data.
MaxDb Data Provider for ADO.NET
If this is a one time thing, you don't have to have it all automated.
I'd pull the CSVs into SQL Server tables and keep them forever; they will help with any questions a year from now. You can prefix them all the same, "Conversion_" or whatever. There are no constraints or FKs on these tables. You might consider using varchar for every column (or the ones that cause problems, or not at all if the data is clean), just to be sure there are no data type conversion issues.
Then pull the data from these conversion tables into the proper final tables. I'd use a single conversion stored procedure to do everything (but I like tsql). If the data isn't that large (millions and millions of rows or less), just loop through and build out all the tables, printing log info as necessary, or inserting into exception/bad data tables as necessary.
