I need to import a .CSV file into a SQL Server table, and I'm having problems because a " appears within one of the strings.
I have found the problem line:
,"32" Leather Bike Trs ",
It never splits the column.
I've been trying to solve this for hours. What am I missing here?
If it can't be done with the SSMS Import wizard, can it be done in SSIS, for example by importing everything as one big column and then using SQL or a C# script? What would be my next step to research?
Thanks.
Below is a sample (header row plus one data row) to put into a CSV file to try.
"Company","Customer No","Store No","Store Name","Channel","POS Terminal No","Currency Code","Exchange Rate","Sales Order No","Date of Sales Order","Date of Transaction","Transaction No","Line No","Division Code","Item Category Code","Budget Group Description","Item Description","Item Status","Item Variant Season Code","Item No","Variant Code","Colour Code","Size","Original Price","Price","Quantity","Cost Amount","Net Amount","Value Including Tax","Discount Amount","Original Store No","Original POS Terminal No","Original Trans No","Original Line No","Original Sales Order No","Discount Code","Refund Code","Web Return Description" "Motor City","","561","Outback","In-store","P12301","HKD","1","","","20160218","185","10000","MT","WW","Jeans","32" Leather Bike Trs ","In Stock","9902","K346T4","BK12","BK","12","180.00000000000000000000","149.00000000000000000000","1.00000000000000000000","34.12500000000000000000","135.45000000000000000000","149.00000000000000000000",".00000000000000000000","","","0","0","","","",""
You're right, the issue comes from the single " placed in your text. The fun fact is that if the " were doubled ("") in your text, SSMS could handle it (as could many other tools).
Maybe you should consider changing the text qualifier of your file before implementing an SSIS package?
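If changing the qualifier at the source is not possible, the stray quote can be doubled in a pre-processing pass before the import. The following is only a minimal Python sketch, not something from the answer above. The helper name repair_line is hypothetical, and it assumes that every field in the file is quoted, that no field contains the literal sequence "," and that the file does not already double its embedded quotes.

```python
import csv
import io


def repair_line(line: str) -> str:
    """Double any quote that appears inside a field of a fully quoted CSV line.

    Assumption: fields are separated by the exact sequence "," and the line
    starts and ends with a quote. Lines that do not fit are returned as-is.
    """
    body = line.rstrip("\r\n")
    if not (body.startswith('"') and body.endswith('"')):
        return body
    # Strip the outer quotes, then split on the quote-comma-quote boundary.
    fields = body[1:-1].split('","')
    # Any quote still left inside a field is an embedded one: double it.
    return ",".join('"' + field.replace('"', '""') + '"' for field in fields)


# A shortened version of the problem row from the question.
sample = '"MT","WW","Jeans","32" Leather Bike Trs ","In Stock","9902"'
repaired = repair_line(sample)

# A standard CSV parser now splits the repaired line into six fields.
parsed = next(csv.reader(io.StringIO(repaired)))
print(repaired)
print(parsed)
```

After this pass the file follows the usual doubled-quote convention, which is the form the answer says SSMS can handle.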
Every time that I try to import an Excel file into SQL Server I'm getting a particular error. When I try to edit the mappings the default value for all numerical fields is float. None of the fields in my table have decimals in them and they aren't a money data type. They're only 8 digit numbers. However, since I don't want my primary key stored as a float when it's an int, how can I fix this? It gives me a truncation error of some sort, I'll post a screen cap if needed. Is this a common problem?
It should be noted that I cannot import Excel 2007 files (I think I've found the remedy to this), but even when I try to import .xls files every value that contains numerals is automatically imported as a float and when I try to change it I get an error.
http://imgur.com/4204g
SSIS doesn't implicitly convert data types, so you need to do it explicitly. The Excel connection manager can only handle a few data types and it tries to make a best guess based on the first few rows of the file. This is fully documented in the SSIS documentation.
You have several options:
Change your destination data type to float
Load to a 'staging' table with data type float using the Import Wizard and then INSERT into the real destination table using CAST or CONVERT to convert the data
Create an SSIS package and use the Data Conversion transformation to convert the data
You might also want to note the comments in the Import Wizard documentation about data type mappings.
Going off of what Derloopkat said, which still can fail on conversion (no offense Derloopkat) because Excel is terrible at this:
Paste from Excel into Notepad and save as normal (.txt file).
From within Excel, open said .txt file.
Select Next, as it is obviously tab delimited.
Select "none" for the text qualifier, then Next again.
Select the first column, hold Shift, select the last column, and select the Text radio button. Click Finish.
It will open. Check it to make sure it's accurate, and then save it as an Excel file.
There is a workaround.
Import the Excel sheet with numbers as float (the default).
After importing, go to the table's Design view.
Change DataType of the column from Float to Int or Bigint
Save Changes
Change DataType of the column from Bigint to any Text Type (Varchar, nvarchar, text, ntext etc)
Save Changes.
That's it.
When Excel finds mixed data types in the same column, it guesses the right format for the column (the majority of the values determines the column type) and dismisses all other values by inserting NULLs. But Excel does this rather badly: for example, if a column is considered text and Excel finds a number, it decides that the number is a mistake and inserts a NULL instead; or, if some cells containing numbers are formatted as "text", one may get NULL values in an integer column of the database.
Solution:
Create a new Excel sheet with the names of the columns in the first row
Format the columns as text
Paste the rows without formatting (use CSV format or copy/paste via Notepad to get only text)
Note that formatting the columns on an existing Excel sheet is not enough.
There seems to be a really easy solution when dealing with data type issues.
Basically, add IMEX=1 to the Extended Properties at the end of the Excel connection string:
Provider=Microsoft.Jet.OLEDB.4.0;Data Source=\\YOURSERVER\shared\Client Projects\FOLDER\Data\FILE.xls;Extended Properties="EXCEL 8.0;HDR=YES;IMEX=1";
This will resolve data type issues such as columns where values are mixed with text and numbers.
To get to the connection properties, right-click the Excel connection manager below the control flow and choose Properties. The pane will be on the right, under Solution Explorer. Hope that helps.
To avoid float type field in a simple way:
Open your Excel sheet.
Insert a blank row after the header row and type any text in all of its cells.
Right-click the header of each column that causes a float issue and select Format Cells, then choose the Text category and press OK.
Then export the Excel sheet to your SQL Server.
This simple approach worked for me.
A workaround to consider in a pinch:
save a copy of the excel file, modify the column to format type 'text'
copy the column values and paste to a text editor, save the file (call it tmp.txt).
modify the data in the text file so that each value starts and ends with a character that makes the SQL Server import mechanism recognize it as text. If you have a fancy editor, use its included tools. I use awk in cygwin on my Windows laptop. For example, I start and end the column value with a single quote, like "$ awk '{print "\x27"$1"\x27"}' ./tmp.txt > ./tmp2.txt"
copy and paste the data from tmp2.txt over top of the necessary column in the excel file, and save the excel file
run the sql server import for your modified excel file... be sure to double check the data type chosen by the importer is not numeric... if it is, repeat the above steps with a different set of characters
The data in the database will have the quotes once the import is done... you can update the data later on to remove the quotes, or use the "replace" function in your read query, such as "replace([dbo].[MyTable].[MyColumn], '''', '')"
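For anyone without awk at hand, the quoting step above can be approximated in Python. This is a rough sketch rather than part of the original workaround. The function name quote_values is made up for illustration, and it mirrors the awk one-liner by wrapping only the first whitespace-separated token of each line in single quotes.

```python
def quote_values(lines):
    """Wrap the first whitespace-separated token of each line in single quotes.

    Mirrors: awk '{print "\x27"$1"\x27"}'
    An empty line yields '' (two quotes), as awk would print.
    """
    quoted = []
    for line in lines:
        tokens = line.split()
        first = tokens[0] if tokens else ""
        quoted.append("'" + first + "'")
    return quoted


# Example with two values as they might appear in tmp.txt.
result = quote_values(["00012345\n", "00098765\n"])
print("\n".join(result))
```

To process the actual files, read tmp.txt with open("tmp.txt") and write the joined result to tmp2.txt.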
I need to create an export of data from SQL Server (multiple tables) into a fixed width text file. The text file will have rows that are different based on the record type.
Header Info (Customer, Address)
Line Item Info (Customer, Item, Qty)
Summary Info (Customer, Total Qty)
Any suggestions to accomplish this efficiently?
I'm currently re-casting all columns into char to create the "fixed width" then using SSIS to merge the tables before exporting as a ragged right text file. However, because not all the widths are the same, I'm having to concatenate the line item info into one column to make the merge work. Also, the header info is being merged AFTER the line item info, not before so there's a sorting problem there. Not sure if I'm going down the right path?
Hope that made sense... this export is used to import into a COBOL type system.
Thanks,
Using SSIS create three data flow tasks, each for creating a single text file with the fixed-width format.
File 1: Header Info
File 2: Line Item Info
File 3: Summary Info
Then concatenate them together into a fourth file using the approach described in the following link:
How to concatenate 2 files in SSIS (Integration Services)?
Hope this helps.
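As an illustration of the layout itself, outside of SSIS, here is a small Python sketch that writes the three record types in the order header, line items, summary for each customer. It is not the approach from the answer above. The column widths (10, 30, 15, 6, 8) and all function names are assumptions chosen for the example, and the real widths would come from the COBOL-side record definition.

```python
def fixed(value, width, align="<"):
    """Pad (or truncate) a value to exactly `width` characters."""
    text = str(value)[:width]
    return f"{text:{align}{width}}"


def header_row(customer, address):
    # Header Info (Customer, Address); widths are hypothetical.
    return fixed(customer, 10) + fixed(address, 30)


def item_row(customer, item, qty):
    # Line Item Info (Customer, Item, Qty); quantity is right-aligned.
    return fixed(customer, 10) + fixed(item, 15) + fixed(qty, 6, ">")


def summary_row(customer, total_qty):
    # Summary Info (Customer, Total Qty).
    return fixed(customer, 10) + fixed(total_qty, 8, ">")


def build_export(headers, items):
    """Emit header, then line items, then summary, one customer at a time."""
    lines = []
    for customer, address in sorted(headers):
        lines.append(header_row(customer, address))
        total = 0
        for item_customer, item, qty in items:
            if item_customer == customer:
                lines.append(item_row(item_customer, item, qty))
                total += qty
        lines.append(summary_row(customer, total))
    return lines


export = build_export(
    headers=[("C001", "1 Main St")],
    items=[("C001", "WIDGET", 3), ("C001", "GADGET", 2)],
)
for line in export:
    print(repr(line))
```

Because each record type is formatted by its own function, rows of different widths can sit in the same ragged-right file without being forced into one merged column.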
For these sorts of problems, I reach for SSIS. It eats this kind of thing for lunch.
There is an example csv file:
category,fruits,cost
'Fruits','Apple,banana,lemon','10.58'
When I import this CSV into SQL Server 2014 by right-clicking the database in "Object Explorer" => Tasks => Import Data, no matter how I play around with the column delimiter options, row 2 always becomes 5 columns (Fruits,Apple,banana,lemon,10.58) instead of the desired 3 columns ('Fruits','Apple,banana,lemon','10.58'). (So I want 'Apple,banana,lemon' to be in one column.)
The solution in "How do I escape a single quote in SQL Server?" doesn't work. Could any guru enlighten me? Python, Linux bash, SQL or simple editor tricks are welcome! Thank you!
No matter how I play around with column delimiter options
That's not the option you need to play with - it's the Text Qualifier:
And it now imports easily.
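Since the question also welcomed Python, the same idea can be checked quickly with the standard csv module by declaring the single quote as the text qualifier. This is just a small sketch to show the effect of the qualifier; it does not replace the wizard setting.

```python
import csv
import io

# The example file from the question, inlined so the sketch is self-contained.
raw = "category,fruits,cost\n'Fruits','Apple,banana,lemon','10.58'\n"

# quotechar plays the role of the wizard's "Text qualifier" setting.
rows = list(csv.reader(io.StringIO(raw), delimiter=",", quotechar="'"))

for row in rows:
    print(len(row), row)
```

With the qualifier set, the second row comes back as three fields and the commas inside 'Apple,banana,lemon' are kept as data.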
I'm using Pervasive 10 with PCC (Pervasive Control Center) and I need to export a lot of results (over 100,000) to a TXT file. I know it's possible to use "Execute in Text", but this feature does not work for me because the program stops after exporting about 20,000 records. I have also changed the settings in PCC (Windows->Preferences->Text Output->Maximum number of rows to display = 500,000).
Anyone know a way to export my query result to a txt file?
You should be able to use the Export Data function. Right click on the table name in the PCC and select Export Data. From there, you can either execute the standard "select * from " or make a more complex query to pull only the data you need. You can set the delimiter to Comma, Tab, or Colon.
Nice answer mirtheil, I was wondering about this myself as well.
To add something to the answer:
It does not matter which table you right-click and choose "Export Data" on, because your query will override the default table query.
I have a CSV file with quote text delimiters. Most of the 90,000 rows are fine, but I have a few rows with a text field that contains both a quote and a comma. For example, the field's value would be:
AB",AB
When delimited, this becomes
"AB"",AB"
When SQL 2005 attempts to import this I get errors such as...
Messages
Error 0xc0202055: Data Flow Task: The column delimiter for column "Column 4" was not found.
(SQL Server Import and Export Wizard)
This only seems to happen when a quote and comma are in a text value together. Values like
AB"AB which becomes "AB""AB"
or
AB,AB which becomes "AB,AB"
work fine.
Here are some example rows...
"1464885","LEVER WM","","B","MP17"
"1465075",":PLT-BC !!NOTE!!","","B",""
"1465076","BRKT-STR MTR !NOTE!","","B",""
"1465172",":BRKT-SW MTG !NOTE!","","B","MP16"
"1465388","BUSS BAR !NOTE!","","B","MP10"
"1465391","PLT-BLKHD ""NOTE""","","B","MP20"
"1465564","SPROCKET:13TEETH,74MM OD,66MM","ID W/.25"" SETSCR","B","MP6"
"S01266330002","CABLE:224"",E122/261,8 CO","","B","MP11"
The last row is an example of the problem - the "", causes the error.
I've had MAJOR problems with SSIS. Things that Access, Excel and even DTS seemed to do very well, SSIS chokes on. Variable record-length data is another problem but, yes, these embedded qualifiers are a major problem. Especially if you do not have access to the import files because they're on someone else's server that you pay to gain access to and might even be 4 to 5 GB in size! You can't just do a "replace all" on that for every import.
You may want to check into this at Microsoft Downloads called "UnDouble" and here is another workaround you might try.
It seems that with SSIS in SQL Server 2008, the bug is still there. I don't know why they haven't addressed this in the parser, but it's like we went back in time with SSIS in basic import functionality.
UPDATE 11-18-2010: This bug still exists in SSIS. Amazing.
How about just:
Search/replace all "", with ''; (fix all the broken fields)
Search/replace all ,''; with ,"", (to "unfix" properly empty fields.)
Search/replace all '';''; with "","", (to "unfix" properly empty fields which follow a correct encapsulation of embedded delimiters.)
That converts your original to:
"1464885","LEVER WM","","B","MP17"
"1465075",":PLT-BC !!NOTE!!","","B",""
"1465076","BRKT-STR MTR !NOTE!","","B",""
"1465172",":BRKT-SW MTG !NOTE!","","B","MP16"
"1465388","BUSS BAR !NOTE!","","B","MP10"
"1465391","PLT-BLKHD ""NOTE""","","B","MP20"
"1465564","SPROCKET:13TEETH,74MM OD,66MM","ID W/.25"" SETSCR","B","MP6"
"S01266330002","CABLE:224'';E122/261,8 CO","","B","MP11"
Which seems to run the gauntlet fine in SSIS. You may have to repeat step 3 to account for 3 empty fields in a row ('';'';'';, etc.), but the bottom line here is that when you have embedded text qualifiers, you have to either escape them or replace them. Let this be a lesson in your CSV creation processes going forward.
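A different way to sidestep the embedded qualifiers, sketched here in Python and not taken from any of the answers, is to parse the file with a reader that understands doubled quotes and write it back out with another delimiter and no text qualifier at all. The function name requalify and the pipe delimiter are assumptions, and the sketch refuses to continue if a field already contains the new delimiter or a line break.

```python
import csv
import io


def requalify(text, delimiter="|"):
    """Re-emit standard doubled-quote CSV as unquoted, delimiter-separated lines."""
    lines = []
    for row in csv.reader(io.StringIO(text)):
        for field in row:
            # Without a qualifier, these characters could not be told apart
            # from the structure of the file, so stop instead of guessing.
            if delimiter in field or "\n" in field or "\r" in field:
                raise ValueError(f"field not safe for delimiter {delimiter!r}: {field!r}")
        lines.append(delimiter.join(row))
    return "\n".join(lines) + "\n"


# Two rows from the question, including the one that triggers the error.
source = (
    '"1465391","PLT-BLKHD ""NOTE""","","B","MP20"\n'
    '"S01266330002","CABLE:224"",E122/261,8 CO","","B","MP11"\n'
)
converted = requalify(source)
print(converted, end="")
```

The converted file would then be imported with the new column delimiter and the text qualifier set to none, so the quote and comma inside the field are plain data.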
Microsoft says doubled double quotes inside double quote delimited fields just don't work. A fix is planned for the end of 2011...
In the meantime, we will have to use workarounds like those described in the other answers.
I would just do a search/replace for ", and replace it with ,
Do you have access to the original file?