I have an SQLite3 database. I also have an SQL Server database with the same structure. I need to export the data from SQLite and insert it into the SQL Server database.
The export from SQLite and the modification of the generated export need to be 100% scripted. Inserting into the SQL Server database will be done manually through SQL Server Management Studio.
I have a mostly good dump of the database through this answer here. I can modify most of the script as needed with sed.
The one thing I'm stuck on right now is that the SQLite database stores timestamps as the number of seconds since the UNIX epoch. The equivalent column in SQL Server is DATETIME. As far as I know, inserting an integer into a DATETIME column won't work.
Is there a way to specify that certain fields be converted a certain way upon dumping from SQLite? Meaning, specify that the integer fields be dumped as proper DateTime strings that SQL Server will understand?
Or, is there something I can run on the Linux command line that will somehow find these Integer timestamps and convert them?
EDIT: Anything that runs in a Bash script on Ubuntu is acceptable.
There are three basic solutions: (1) modify the data before the dump, (2) manipulate the file after the dump, or (3) modify the data on import. Which you choose will depend on how much freedom you have to modify schemas.
If you wish to do it in SQLite, I'd suggest adding text columns with the dates stored as needed for import to SQL Server, then ignore or remove the original columns on dump. The SQLite doc page for datetime() may help, as might answers to this question.
Or, you can write a function in SQL Server that handles the import. Perhaps set it on an insert trigger.
Otherwise, a script that manipulates your dump file would work too. It sounds like you have a good handle on how to do this.
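As a rough illustration of the first option, the conversion can also be done in the SELECT that produces the dump, without adding columns. This is a minimal sketch, not a drop-in tool: the table name events, the column name created, and the helpers dump_inserts and sql_literal are made up for the example, and it assumes the timestamps are whole seconds in UTC. It uses SQLite's strftime() with the 'unixepoch' modifier to emit ISO 8601 strings, and it can be called from a Bash script with python3.

```python
import sqlite3

ISO_FORMAT = "%Y-%m-%dT%H:%M:%S"  # ISO 8601, e.g. 1970-01-02T00:00:00


def sql_literal(value):
    """Render a Python value as a SQL literal (BLOBs are not handled here)."""
    if value is None:
        return "NULL"
    if isinstance(value, (int, float)):
        return str(value)
    return "'" + str(value).replace("'", "''") + "'"


def dump_inserts(conn, table, epoch_columns):
    """Yield one INSERT per row, converting epoch-second columns to strings."""
    # LIMIT 0 returns no rows but still exposes the column names.
    cursor = conn.execute(f"SELECT * FROM [{table}] LIMIT 0")
    columns = [d[0] for d in cursor.description]
    select_list = ", ".join(
        f"strftime('{ISO_FORMAT}', [{c}], 'unixepoch')" if c in epoch_columns else f"[{c}]"
        for c in columns
    )
    column_list = ", ".join(f"[{c}]" for c in columns)
    for row in conn.execute(f"SELECT {select_list} FROM [{table}]"):
        values = ", ".join(sql_literal(v) for v in row)
        yield f"INSERT INTO [{table}] ({column_list}) VALUES ({values});"


# Hypothetical demo data; a real script would open the database file instead.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, name TEXT, created INTEGER)")
conn.execute("INSERT INTO events VALUES (1, 'O''Brien', 86400)")
statements = list(dump_inserts(conn, "events", {"created"}))
for statement in statements:
    print(statement)
```

Redirecting the printed output to a file gives a script that can be pasted into SQL Server Management Studio.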
I'm trying to import data into SQL Server using SQL Server Management Studio and I keep getting the "output column... failed because truncation occurred" error. This is because I'm letting the Studio autodetect the field length which it isn't very good at.
I know I can go back and extend the column length, but I'm thinking there must be a better way to get it right the first time without having to manually work out how long each column is.
I know that this must be a common issue but my Google searches aren't coming up with anything as I'm more looking for a technique rather than a specific issue.
One approach you may take, assuming the import is not something which would take hours to complete, is to just set every text column to VARCHAR(MAX), and then complete the CSV import. Once you have the actual table in SQL Server, you can inspect each column using LEN to see how wide it is. Based on that, you can either alter columns, or you could just take notes, drop the table, and reimport using appropriate widths.
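A different way to get the same information, sketched here as an alternative to the LEN inspection above, is to scan the CSV before importing it and record the longest value in each column. The function name max_column_widths and the sample data are invented for the example, and it assumes the file has a header row. Note that it counts characters, which may differ from the byte length a VARCHAR column needs for non-ASCII data.

```python
import csv
import io


def max_column_widths(csv_text):
    """Return {column name: longest value length} for CSV text with a header."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    widths = [0] * len(header)
    for row in reader:
        # Ignore any extra fields beyond the header width.
        for i, value in enumerate(row[:len(header)]):
            widths[i] = max(widths[i], len(value))
    return dict(zip(header, widths))


# Hypothetical sample; a real script would read the file contents instead.
sample = "name,city\nAlice,Paris\nBartholomew,Rio\n"
widths = max_column_widths(sample)
print(widths)
```

The resulting widths can then be rounded up and used as the column lengths in the import wizard.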
You should look into leveraging SSIS for this task. There is somewhat of a fixed cost in terms of the time spent setting up the process for importing the CSV file and creating a physical table in the database. Ultimately, though, you will be able to set the data type for each column/field in your file. Further, SSIS will let you transform or reformat the data during the import.
I would suggest downloading Visual Studio and SQL Server Data Tools. The latter contains the tools you would need to complete this task, including SSIS, SSRS, and SSAS.
The main point is being able to automate this task, especially if it's an ongoing project of uploading csv files into the database.
I have a PostgreSQL database that I want to move to SQL Server -- both schema and data. I am poor so I don't want to pay any money. I am also lazy, so I don't want to do very much work. Currently I'm doing this table by table, and there are about 100 tables to do. This is extremely tedious.
Is there some sort of trick that does what I want?
You should be able to find some useful information in the accepted answer in this Serverfault page: https://serverfault.com/questions/65407/best-tool-to-migrate-a-postgresql-database-to-ms-sql-2005.
If you can get the schema converted without the data, you may be able to shorten the steps for the data by using this command:
pg_dump --data-only --column-inserts your_db_name > data_load_script.sql
This load will be quite slow, but the --column-inserts option generates the most generic INSERT statements possible for each row of data and should be compatible.
EDIT: Suggestions on converting the schema follow:
I would start by dumping the schema, but removing anything that has to do with ownership or permissions. This should be enough:
pg_dump --schema-only --no-owner --no-privileges your_db_name > schema_create_script.sql
Edit this file to add the line BEGIN TRANSACTION; to the beginning and ROLLBACK TRANSACTION; to the end. Now you can load it and run it in a query window in SQL Server. If you get any errors, make sure you go to the bottom of the file, highlight the ROLLBACK statement and run it (by hitting F5 while the statement is highlighted).
Basically, you have to resolve each error until the script runs through cleanly. Then you can change the ROLLBACK TRANSACTION to COMMIT TRANSACTION and run one final time.
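The file edit described above can be scripted so that it is repeatable after each round of fixes. This is only a sketch: the function name wrap_in_transaction is invented here, and it does nothing beyond adding the two transaction lines around the existing script text.

```python
def wrap_in_transaction(script, commit=False):
    """Surround a SQL script with BEGIN TRANSACTION and ROLLBACK or COMMIT."""
    ending = "COMMIT TRANSACTION;" if commit else "ROLLBACK TRANSACTION;"
    return "BEGIN TRANSACTION;\n" + script.rstrip("\n") + "\n" + ending + "\n"


# Hypothetical one-statement schema; a real script would read the dump file.
schema = "CREATE TABLE t (id INT);\n"
trial_run = wrap_in_transaction(schema)
final_run = wrap_in_transaction(schema, commit=True)
print(trial_run)
```

Keeping the rollback version as the default means a forgotten flag cannot commit a half-converted schema.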
Unfortunately, I cannot help with which errors you may see as I have never gone from PostgreSQL to SQL Server, only the other way around. Some things that I would expect to be an issue, however (obviously, NOT an exhaustive list):
PostgreSQL does auto-increment fields by linking a NOT NULL INTEGER field to a SEQUENCE using a DEFAULT. In SQL Server, this is an IDENTITY column, but they're not exactly the same thing. I'm not sure if they are equivalent, but if your original schema is full of "id" fields, you may be in for some trouble. I don't know if SQL Server has CREATE SEQUENCE, so you may have to remove those.
Database functions / Stored Procedures do not translate between RDBMS platforms. You'll need to remove any CREATE FUNCTION statements and translate the algorithms manually.
Be careful about encoding of the data file. I'm a Linux person, so I have no idea how to verify encoding in Windows, but you need to make sure that what SQL Server expects is the same as the file you are importing from PostgreSQL. pg_dump has an option --encoding= that will let you set a specific encoding. I seem to recall that Windows tends to use two-byte, UTF-16 encoding for Unicode where PostgreSQL uses UTF-8. I had some issue going from SQL Server to PostgreSQL due to UTF-16 output so it would be worth researching.
The PostgreSQL datatype TEXT is simply a VARCHAR without a max length. In SQL Server, TEXT is... complicated (and deprecated). Each field in your original schema that is declared as TEXT will need to be reviewed for an appropriate SQL Server data type.
SQL Server has extra data types for UNICODE data. I'm not familiar enough with it to make suggestions. I'm just pointing out that it may be an issue.
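If the encoding point above turns out to matter, re-encoding the dump is a small step. The sketch below assumes the dump is UTF-8 and that the tool on the SQL Server side wants little-endian UTF-16 with a byte order mark; both are assumptions to verify for your setup, and the function name utf8_to_utf16 is made up for the example.

```python
import codecs


def utf8_to_utf16(data):
    """Convert UTF-8 bytes to UTF-16 LE bytes with a leading byte order mark."""
    text = data.decode("utf-8")
    return codecs.BOM_UTF16_LE + text.encode("utf-16-le")


# Hypothetical sample; a real script would read and write files in binary mode.
converted = utf8_to_utf16("é".encode("utf-8"))
print(converted)
```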
I have found a faster and easier way to accomplish this.
First copy your table (or query) to a tab delimited file like so:
COPY (SELECT siteid, searchdist, listtype, list, sitename, county, street,
city, state, zip, georesult, elevation, lat, lng, wkt, unlocated_bool,
id, status, standard_status, date_opened_or_reported, date_closed,
notes, list_type_description FROM mlocal) TO 'c:\SQLAzureImportFiles\data_script_mlocal.tsv' NULL E''
Next, you need to create your table in SQL Server; this process will not handle any schema for you. The schema must match your exported TSV file in field order and data types.
Finally, run SQL Server's bcp utility to bring in the TSV file like so:
bcp MyDb.dbo.mlocal in "\\NEWDBSERVER\SQLAzureImportFiles\data_script_mlocal.tsv" -S tcp:YourDBServer.database.windows.net -U YourUserName -P YourPassword -c
A couple of things of note that I encountered. Postgres and SQL Server handle boolean fields differently. Your SQL Server schema needs to have your boolean fields set to varchar(1), and the resulting data will be 'f', 't' or null. You will then have to convert this field to a bit by doing something like:
ALTER TABLE mlocal ADD unlocated bit;
UPDATE mlocal SET unlocated=1 WHERE unlocated_bool='t';
UPDATE mlocal SET unlocated=0 WHERE unlocated_bool='f';
ALTER TABLE mlocal DROP COLUMN unlocated_bool;
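An alternative to the varchar(1) staging column is to rewrite the values in the TSV before loading it. The sketch below is not the method used above: it assumes bcp will accept 1 and 0 for a bit column in character mode, which is worth testing on a small file first, and the function name convert_bool_column is invented. It relies on the COPY text format escaping tabs and newlines inside values, so splitting on the literal characters is safe.

```python
def convert_bool_column(tsv_text, column_index):
    """Rewrite 't'/'f' in one column of TSV text as '1'/'0'; leave others as-is."""
    mapping = {"t": "1", "f": "0"}
    lines = tsv_text.split("\n")
    if lines and lines[-1] == "":
        lines.pop()  # drop the empty piece after the final newline
    converted = []
    for line in lines:
        fields = line.split("\t")
        if column_index < len(fields):
            value = fields[column_index]
            fields[column_index] = mapping.get(value, value)
        converted.append("\t".join(fields))
    return "\n".join(converted) + "\n"


# Hypothetical three-row sample with the boolean in the second column.
sample = "1\tt\tfoo\n2\tf\tbar\n3\t\tbaz\n"
result = convert_bool_column(sample, 1)
print(result, end="")
```

Empty values pass through unchanged, matching the NULL E'' setting in the COPY command.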
Another thing is the geography/geometry fields are very different between the two platforms. Export the geometry fields as WKT using ST_AsText(geo) and convert appropriately on the SQL Server end.
There may be more incompatibilities needing tweaks like this.
EDIT: While this technique does technically work, I am trying to transfer several million records from 100+ tables to SQL Azure, and it turns out that bcp to SQL Azure is pretty flaky. I keep getting intermittent "Unable to open BCP host data-file" errors, the server intermittently times out, and for some reason some records are not transferred, with no indication of errors or problems. So this technique is not stable for transferring large amounts of data to Azure SQL.
You can use Navicat, a powerful GUI tool for working with various databases, including Postgres and SQL Server.
You can transfer both schema and data easily as follows:
Create two connections, one for the source database and one for the target database
Go to Tools -> Data Transfer
Select the source database and the target database, each with its IP, database name and schema
As you can see in the options, if a target table does not exist, it will be created
Tada, it took 10 minutes to transfer all 63 of my tables and their data from Postgres to SQL Server.
Enjoy it!
I'm not a good SQL programmer, I've got only the basics, but I've heard of some BCP thing for fast data loading. I've searched the internet and it seems to be a command-line only utility, and not something you can use in code.
The thing is, I want to be able to make very fast inserts and updates in a SQL Server 2008 database. I would like to have a function in the database that would accept:
The name of the table I want to execute an insert/update operation against
The names of the columns I'll be feeding data to
The data in a CSV format or something that SQL can read stupid-fast
A flag indicating whether the function should perform an insert or update operation
This function would then read this CSV string and generate the necessary code for inserting/updating the table.
I would then write code in C# to call that function passing it the table name, column names, a list of objects serialized as a CSV string and the insert/update flag.
As you can see, this is intended to be both fast and generic, suitable for any project dealing with large amounts of data, and thus a candidate to my company's framework.
Am I thinking right? Is this a good idea? Can I use that BCP thing, and is it suitable to every case?
As you can see, I need some directions on this... thanks in advance for any help!
In C#, look at SqlBulkCopy. It's what SSIS uses in the background.
For true bcp/BULK INSERT, you'd need bulkadmin rights, which may not be allowed.
Have you considered using SQL Server Integration Services (SSIS)? It's designed to do exactly what you describe. It is very fast. You can insert data on a transactional basis. And you can set it up to run on a schedule. And much more.
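For comparison, the generic "table name, column names, rows" interface from the question can be sketched on the client side with batched parameterized inserts. This is neither bcp nor SqlBulkCopy and will be slower than a true bulk load. The helper names are invented, the ? parameter marker is an assumption about the driver, and an in-memory SQLite database stands in for SQL Server so the example runs on its own.

```python
import sqlite3


def quote_identifier(name):
    """Bracket-quote an identifier, doubling any closing bracket inside it."""
    return "[" + name.replace("]", "]]") + "]"


def build_insert(table, columns):
    """Build a parameterized INSERT statement for the given table and columns."""
    column_list = ", ".join(quote_identifier(c) for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    return f"INSERT INTO {quote_identifier(table)} ({column_list}) VALUES ({placeholders})"


def insert_in_batches(conn, table, columns, rows, batch_size=1000):
    """Insert rows in fixed-size batches and return the statement that was used."""
    statement = build_insert(table, columns)
    for start in range(0, len(rows), batch_size):
        conn.executemany(statement, rows[start:start + batch_size])
    conn.commit()
    return statement


# Stand-in database and hypothetical data for the demo.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER, name TEXT)")
rows = [(i, f"name{i}") for i in range(5)]
statement = insert_in_batches(conn, "people", ["id", "name"], rows, batch_size=2)
row_count = conn.execute("SELECT COUNT(*) FROM people").fetchone()[0]
print(statement, row_count)
```

Passing values as parameters rather than building them into a CSV string avoids quoting problems in the data itself.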
I am tasked with exporting the data contained inside a MaxDB database to SQL Server 200x. I was wondering if anyone has gone through this before and what your process was.
Here is my idea, but it's not automated.
1) Export data from MaxDB for each table as a CSV.
2) Clean the CSV to remove ? (which it uses for nulls) and fix the date strings.
3) Use SSIS to import the data into tables in SQL Server.
I was wondering if anyone has tried linking MaxDB to SQL Server or what other suggestions or ideas you have for automating this.
Thanks.
AboutDev.
I managed to find a solution to this. There is an open source MaxDB library that will allow you to connect to it through .Net much like the SQL provider. You can use that to get schema information and data, then write a little code to generate scripts to run in SQL Server to create tables and insert the data.
MaxDb Data Provider for ADO.NET
If this is a one time thing, you don't have to have it all automated.
I'd pull the CSVs into SQL Server tables and keep them forever; they will help with any questions a year from now. You can give them all the same prefix, "Conversion_" or whatever. There are no constraints or FKs on these tables. You might consider using varchar for every column (or only for the ones that cause problems, or not at all if the data is clean), just to be sure there are no data type conversion issues.
Then pull the data from these conversion tables into the proper final tables. I'd use a single conversion stored procedure to do everything (but I like T-SQL). If the data isn't that large (millions and millions of rows or less), just loop through and build out all the tables, printing log info or inserting into exception/bad data tables as necessary.
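The CSV cleanup step from the question (removing the ? that stands for null) could be scripted before the files are loaded into the conversion tables. This is a minimal sketch with an invented function name, clean_nulls. It assumes a field that is exactly ? always means null, which would also catch a genuine one-character "?" string, and it does not attempt the date fixes because the source date format is not given.

```python
import csv
import io


def clean_nulls(csv_text, null_marker="?"):
    """Replace fields that consist only of the null marker with empty fields."""
    reader = csv.reader(io.StringIO(csv_text))
    output = io.StringIO()
    writer = csv.writer(output, lineterminator="\n")
    for row in reader:
        # Only whole-field matches are replaced; "What?" is left untouched.
        writer.writerow(["" if value == null_marker else value for value in row])
    return output.getvalue()


# Hypothetical sample rows; a real script would process each exported file.
sample = "1,?,What?\n2,x,?\n"
cleaned = clean_nulls(sample)
print(cleaned, end="")
```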