I'm trying to import data into SQL Server using SQL Server Management Studio and I keep getting the "output column... failed because truncation occurred" error. This is because I'm letting the Studio autodetect the field length which it isn't very good at.
I know I can go back and extend the column length but I'm thinking there must be a better way to get it right first time without having to manaully work out how long each column is.
I know that this must be a common issue but my Google searches aren't coming up with anything as I'm more looking for a technique rather than a specific issue.
One approach you may take, assuming the import is not something which would take hours to complete, is to just set every text column to VARCHAR(MAX), and then complete the CSV import. Once you have the actual table in SQL Server, you can inspect each column using LEN to see how wide it is. Based on that, you can either alter columns, or you could just take notes, drop the table, and reimport using appropriate widths.
You should look into leveraging SSIS for this task. There is somewhat of a fixed cost in terms of spending time setting up the process for importing the csv file and creating a physical table in the database. Ultimately, though, you will be able to set the data types for each column/field in your file. Further, SSIS will enable you to transform or reformat the data to say the least.
I would suggest downloading Visual Studio and SQL Server Data Tools. The latter contains the necessary tools, including SSIS, SSRS, and SSAS, for which you would need to complete this task.
The main point is being able to automate this task, especially if it's an ongoing project of uploading csv files into the database.
Related
I have an SQLite3 database. I also have an SQL Server database with the same structure. I need to export the data from SQLite and insert it into the SQL Server database.
The export from SQLite and the modification of the generated export needs to be 100% scripted. Inserting into the SQL Server database will be done manually through SQL Server Management Studio.
I have a mostly good dump of the database through this answer here. I can modify most of the script as needed with sed.
The one thing I'm stuck on right now is that the SQLite database stores timestamps as number of seconds since UNIX epoch. The equivalent column in SQL Server is DATETIME. As far as I know, inserting an integer into a DateTime won't work.
Is there a way to specify that certain fields be converted a certain way upon dumping from SQLite? Meaning, specify that the integer fields be dumped as proper DateTime strings that SQL Server will understand?
Or, is there something I can run on the Linux command line that will somehow find these Integer timestamps and convert them?
EDIT: Anything that runs in a Bash script on Ubuntu is acceptable.
Three basic solutions: (1) modify the data before the dump; (2) manipulate the file after the dump, or (3) modify the data on import. Which you choose will depend on how much freedom you have to modify schemas.
If you wish to do it in SQLite, I'd suggest adding text columns with the dates stored as needed for import to SQL Server, then ignore or remove the original columns on dump. The SQLite doc page for datetime() may help, as might answers to this question.
Or, you can write a function in SQL Server that handles the import. Perhaps set it on an insert trigger.
Otherwise, a script that manipulates your dump file would work too. It sounds like you have a good handle on how to do this.
I have more than 500s of tables in SQL server that I want to move to Dynamics 365. I am using SSIS so far. The problem with SSIS is the destination entity of dynamics CRM is to be specified along with mappings and hence it would be foolish to create separate data flows for entities for 100s of SQL server table sources. Is there any better way to accomplish this?
I am new to SSIS. I don't feel this is the correct approach. I am just simulating the import/export wizard of SQL server. Please let me know if there are better ways
It's amazing how often this gets asked!
SSIS cannot have dynamic dataflows because the buffer size (the pipeline) is calculated at design time (as opposed to execution time).
The only way you can re-use a dataflow is if all the source to target mappings are the same - Eg if you have 2 tables with exactly the same DDL structure.
One option (horrible IMO) is to concatenate all columns into a massive pipe-separated VARCHAR and then write this to your destination into a custom staging table with 2 columns eg (table_name, column_dump) & then "unpack" this in your target system via a post-Load SQL statement.
I'd bite the bullet, put on your headphones and start churning out the SSIS dataflows one by one - you'd be surprised how quick you can bang them out!
ETL works that way. You have to map source, destination & column mapping. If you want that to be dynamic that’s possible in Execute SQL task inside foreach loop container. Read more
But when we are using Kingswaysoft CRM destination connector - this is little tricky (may or may not be possible?) as this need very specific column mapping between source & destination.
That too when the source schema is from OLEDB, better to have separate Dataflow tasks for each table.
I would like some advice on the best way to go about doing this. I have multiple files all with different layouts and I would like to create a procedure to import them into new tables in sql.
I have written a procedure which uses xp_cmdshell to get the list of file names in a folder and the use a cursor to loop through those file names and use a bulk insert to get them into sql but I dont know the best way to create a new table with a new layout each time.
I thought if I could import just the column row into a temp table then I could use that to create a new table to do my bulk insert into. but I couldn't get that to work.
So whats the best way to do this using SQL? I am not that familiar with .net either. I have thought about doing this in SSIS, I know its easy enough to load multiple files which have the same layout in SSIS but can it be doe with variable layouts?
thanks
You could use BimlScript to make the whole process automated where you just point it at the path of interest and it writes all the SSIS and T-SQL DDL for you, but for the effort involved in writing the C# you'd need, you may as well just put the data dump into SQL Server in the C#, too.
You can use SSIS to solve this issue, though, and there are a few levels of effort to pick from.
The easiest is to use the SQL Server Import and Export Wizard to create SSIS packages from your Excel spreadsheets that will dump the sheet into its own table. You'd have to run this wizard every time you had a new spreadsheet you wanted to import, but you could save the package(s) so that you could re-import that spreadsheet again.
The next level would be to edit a saved SSIS package (or write one from scratch) to parameterize the file path and the destination table names, and you could then re-use that package for any spreadsheets that followed the same format.
Further along would be to write a package that determined with of the packages from the previouw level to call. If you can query the header rows effectively, you could probably write an SSIS package that accepted a path as an input parameter, found all the Excel sheets in that path, queried the header rows to determine the spreadsheet format, and then pass that information to the parameterized package for that format type.
SSIS development is, of course, its own topic - Integration Services Features and Tasks on MSDN is a good place to start. SSIS has its quirks, and I highly recommend learning BimlScript if you want to do a lot of SSIS development. If you'd like to talk over what the ideas above would require in more detail, please feel free to message me.
I understand this may be a little far-fetched, but is there a way to take an existing SSIS package and get an output of the job it's doing as T-SQL? I mean, that's basically what it is right? Transfering data from one database to another can be done with T-SQL as well.
I'm wondering this because I'm trying to get away from using SSIS packages for data transfer and instead using EF/linq to do this on the fly in my application. My thought process is that currently I have an SSIS package that transfers and formats data from one database to another in preparation to be spit out to an excel. This SSIS package runs nightly and helps speed up the generation of the excel as once the data is transferred to the second db, it's already nice and formatted correctly.
However, if I could leverage EF and maybe some linq to sql in order to format the data from the first database on the fly and spit it out to excel quickly without having to use this second db, that would be great. So can my original question be done, can I extract the t-sql representation of an SSIS package some how?
SSIS packages are not exclusively T-SQL. They can consist of custom back-end code, file system changes, Office document creation steps, etc, to name only a few. As a result, generating the entirety of an SSIS package's work into T-SQL isn't possible, because the full breadth of it's work isn't limited to SQL Server.
We are trying to devise an optimal method for importing very large Excel files into SQL database. Using SSIS is somewhat troublesome because it scans top X records to determine the format of the file, but rows further down may be different, so it takes a lot of trial and error, with us having to bring the unusual columns to the top so SSIS can "learn".
When we get new file formats to import, they conform to specification in terms of row formatting etc - so we can say we know the schema in advance. The SQL destination tables have the same schema, with couple of extra columns such as date inserted and original filename.
Is there an easier way to create format definitions for new files we are going to insert? We don't have to use SSIS, we are open to any other tool, with a view for as much automation as possible. There's a question of testing the sanity of data we will import, we were planning on doing basic queries against staging datasets such as "less than 1% of records can miss postal code" etc.
Many thanks
Maybe you can import data as text and after that you can convert that using Derived Column transformation. You can read data from Excel as Text using IMEX option in Connection String. More information about this parameter you can find here.