SSIS Text Qualifier not working correctly - sql-server

I have a CSV file I am importing through SSIS. Below is a sample of the data in my file:
"MEM1001","OTHER","P" ,20101001,20781231,,20781231,20101001,
"Medic","General >21" ,
"A100100" ,"2210",20101001,20781231
I have set , as the column delimiter and " as the text qualifier in the connection manager.
But columns like "P" ,"Medic","General >21" ,"A100100" , still come through enclosed in double quotes when I preview the data, while the rest of the string columns come through without double quotes.
I am guessing it has something to do with the spaces after the quotes.
Can somebody explain why this is happening, and how I can make these columns come through without double quotes when importing the data from the file into a table?

I just stumbled across this post. I had the same issue, tried a few things, and could not find any other solution.
In CSV files, the text qualifier " only works when the quote sits directly next to the comma, with no space between the delimiter and the text qualifier. I have no idea why.
If you aren't able to fix the input data, an option would be to create a derived column and to replace the double quotes.
This worked for me:
How to replace double quotes in derived column transformation?
Trim(REPLACE(COLA, "\"", ""))
You should also add the Trim(), otherwise you have empty spaces before and maybe after the word. This could be problematic in a merge join (in my case it was).

I don't know why these extra spaces cause this issue.
Here is what I would do. It may not be the best idea, but it should work.
You will need to add a Script Task before the Data Flow Task that replaces all " ," and ", " with ",".
Thank you
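The pre-processing step suggested above could look roughly like the following sketch. It is written in Python rather than as an SSIS Script Task, and the function name `tighten_qualifiers` is made up for illustration. It assumes that spaces and tabs outside quoted fields are never meaningful, which holds for the sample rows in the question but may not hold for every file.

```python
def tighten_qualifiers(line: str, qualifier: str = '"') -> str:
    """Drop spaces and tabs that sit outside quoted fields.

    Assumption: whitespace outside a quoted field is padding only.
    """
    out = []
    in_quotes = False
    for ch in line:
        if ch == qualifier:
            # A doubled qualifier ("") toggles twice, so the state survives it.
            in_quotes = not in_quotes
            out.append(ch)
        elif ch in " \t" and not in_quotes:
            continue  # padding between a qualifier and a delimiter
        else:
            out.append(ch)
    return "".join(out)


row1 = tighten_qualifiers('"MEM1001","OTHER","P" ,20101001,20781231,,20781231,20101001,')
row2 = tighten_qualifiers('"Medic","General >21" ,')
print(row1)
print(row2)
```

The space inside "General >21" is kept because it sits between the qualifiers; only the padding before the comma is dropped.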

Why not just go to the Connection Manager for that csv file, click on Columns, and under the Column delimiter box just enter a space followed by a comma? Worked for me.

Related

Extra double quotes on export from SSMS to CSV

Upon using a text editor to review exported results from SSMS to CSV I'm witnessing extra double quotes around the result values - not field names. I've used the concat function in my script to manually add a single pair of double quotes around each value and field name. So where I would expect "012345678" I'm actually seeing """012345678""".
It may be that my code is a bit too rudimentary
ex.:
SELECT CONCAT('"',ISNULL('012345678',''),'"') AS '"employee_id"'
FROM employees
More fields are selected I just included one as an example.
Any direction is greatly appreciated.
You shouldn't manually add in the " around your data fields. If you want a character around the fields use the Text qualifier option. In SSMS when using Task > Export to open the SQL Server Import and Export Wizard, set the Text qualifier as shown below and whatever character you set for the Text qualifier will be automatically put around each field.
You can find some documentation about the text qualifier here: https://learn.microsoft.com/en-us/sql/integration-services/import-export-data/connect-to-a-flat-file-data-source-sql-server-import-and-export-wizard?view=sql-server-ver15#options-to-specify-general-page
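One plausible reading of the tripled quotes is that the export applies standard CSV quoting on top of the quotes added by hand: each embedded quote is doubled, and the whole value is then wrapped in the qualifier. The sketch below reproduces that effect with Python's `csv` module. It is an illustration of the CSV quoting rule, not a description of SSMS internals, and the helper name `export_row` is hypothetical.

```python
import csv
import io


def export_row(values):
    """Write one row with every field qualified, doubling embedded quotes."""
    buf = io.StringIO()
    writer = csv.writer(buf, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerow(values)
    return buf.getvalue().rstrip("\n")


# Value that already carries hand-added quotes, as produced by CONCAT('"', ..., '"').
already_quoted = export_row(['"012345678"'])
# Plain value, leaving the quoting to the writer.
plain = export_row(["012345678"])

print(already_quoted)
print(plain)
```

Under these assumptions, leaving the value unquoted in the query and letting the qualifier setting do the wrapping gives the single pair of quotes the question expected.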

Netezza CSV load ignore comma within value

I am loading a CSV file in Netezza. One of the columns in this file has value like: $500,000-$749,999.
Even though this value is enclosed within double quotes, Netezza is not ignoring the comma. It throws an error like - expected end of row, "999".
There are two more columns after this field in the file. I tried adding EscapeChar ',' but it again gave an error that Delimeter and EscapeChar cannot have the same character.
Has anyone faced a similar issue?
Workaround:
I can add two more columns to my table, but then the load would fail for rows where the field does not contain a comma.
Try setting the QuotedValue option to DOUBLE
https://www.ibm.com/support/knowledgecenter/en/SSULQD_7.2.1/com.ibm.nz.load.doc/r_load_quotedvalue.html
Also, if all your columns are quoted, you could also set the requirequotes option to true
https://www.ibm.com/support/knowledgecenter/en/SSULQD_7.2.1/com.ibm.nz.load.doc/r_load_requirequotes.html

SSIS Flat File Source Not Splitting Column by Comma

I have a flat file connector in SSIS, but for some reason it is not splitting the rows into columns at the commas. I have the column delimiter set to comma, and you can see that "Column 0" contains commas ",", but it just doesn't want to split them. Has anyone come across this before? Any help would be amazing!
The file has LF line terminators (the UNIX way). Is this an issue for SSIS? There is an option for it, which I have selected.
I've found the solution: I needed to skip a few rows at the top of the file so that the headers would split, using the "Header rows to skip" setting. Thanks for all your help!

Is there a way to escape a double quote within a text qualified string on a SSIS Csv import?

I have a CSV I'm trying to import into SQL using SSIS packages through code.
A line might look something like this
321,1234,"SOME MACHINE, MACHINE ACCESSORIES 1 1/2"" - 4"""
In this example they're using a double quote to symbolize inches. They are trying to escape the inches double quote with another double quote. SSIS, however, does not honor this escaping and fails.
Is there any way I can still use the double quote symbol for inches and escape it within the quoted text?
Many suggestions are to replace the double quote with two single quotes. Is this the only workaround, or can I use some other escape technique?
I've seen people talk about using the Derived Column transformation but in my case SSIS fails at the Flat File Source step and I therefore cannot get to a derived column transform step.
I'm currently running a script task in the control flow, just before the data flow, to manipulate the Csv with some regex's to cleanup the data.
I need the string to be text qualified with the 2 outer double quotes because of potential commas in the description column.
What can I do about the double quotes within the text qualified string?
Wow, I expected to be able to answer with "Just set the text qualifier", but figured you would have already tried that so I tried it before I answered. Surprise, SSIS doesn't support standard CSV files!
Looks like this is a common complaint. There is one comment in there from Microsoft about some samples that may help; Here is the codeplex project, they mentioned that the Regular Expression Flat File Source sample and the Delimited File Reader Source sample in particular may help -- I'm guessing the Delimited File Reader would be more worthwhile.
I ran into a similar problem yesterday.
We got a CSV file that uses a comma , as the delimiter and a double quote " as the text qualifier, but one field contains a double quote inside the qualified text (a non-escaped double quote within a string).
After spending half day searching, came up with the solution below:
// Requires: using System.IO; using System.Linq; using System.Text.RegularExpressions;
// Load the file into a one-dimensional string array.
// fullFilePath is the full path + file name.
var fileContent = File.ReadAllLines(fullFilePath);
// Match double quotes that are not at the start or end of the line and not
// next to a comma, and replace each one with a single quote.
var quoteInsideField = new Regex(@"(?<!^)(?<!\,)""(?!\,)(?!$)");
var fileContentUpdated = fileContent.Select(x => quoteInsideField.Replace(x, "'")).ToArray();
// Write the string array back to the CSV file.
File.WriteAllLines(fullFilePath, fileContentUpdated);
I don't see any other way than replace the double quote with something else to avoid the issue.
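To see what that pattern does to the sample line from the question, here is a rough Python equivalent of the same regular expression. It is only a sketch for experimenting outside SSIS; the names `INNER_QUOTE` and `replace_inner_quotes` are made up for this example.

```python
import re

# A double quote that is not at the start of the line, not preceded by a comma,
# not followed by a comma, and not at the end of the line.
INNER_QUOTE = re.compile(r'(?<!^)(?<!,)"(?!,)(?!$)')


def replace_inner_quotes(line: str) -> str:
    """Replace quotes inside a qualified field with single quotes."""
    return INNER_QUOTE.sub("'", line)


sample = '321,1234,"SOME MACHINE, MACHINE ACCESSORIES 1 1/2"" - 4"""'
cleaned = replace_inner_quotes(sample)
print(cleaned)
```

Each doubled inch mark becomes two single quotes, while the outer qualifiers stay in place. One limitation follows from the pattern itself: a quote inside a field that happens to sit directly next to a comma is not matched, so it would be left untouched.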
This answer is not applicable to 2005 as referenced here, but in case someone comes across this while searching and is using 2008, this other question appears to have a working answer: SSIS 2008 and Undouble
There is a workaround: if you remove " as the text qualifier in the file connection, you can remove all the double quotes later with a derived column expression such as REPLACE(Item_Name,"\"",""). The downside is that you will need to do it for every field.
I didn't find a direct way to achieve this so I wrote a script:
Add a Script component to the workflow (make sure to connect the input arrow or it won't recognize the columns)
Right click on the Script component -> Input Columns, change the column Usage Type to READWRITE
Click Ok
Edit the Script, replace double quotes with two single quotes
public override void Input0_ProcessInputRow(Input0Buffer Row)
{
    // Replace every double quote in the Description column with two single quotes.
    Row.Description = Row.Description.Replace("\"", "''");
}
Probably old news now, but this issue was fixed in SQL Server 2012. I was able to import the same file on a 2012 server that failed on my 2008 server.

Commas within CSV Data

I have a CSV file which I am importing directly into a SQL Server table. In the CSV file each column is separated by a comma. My problem is that I have a column "address", and the data in this column contains commas. As a result, some of the address data ends up in the other columns when importing into SQL Server.
What should I do?
The solution to this problem is very simple.
Select the flat file source and browse to your file. Then go to "Text qualifier", which is set to none by default, enter a double quote (") there, and follow the instructions of the wizard.
Good Luck
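The effect of setting a text qualifier can be illustrated outside the wizard with a small Python sketch. The sample row below is made up for the example, and `parse_rows` is a hypothetical helper; the point is only that a parser that honors the qualifier keeps the address in one field, while a plain split on commas does not.

```python
import csv
import io


def parse_rows(text, qualifier='"'):
    """Parse comma-delimited text, treating `qualifier` as the text qualifier."""
    return list(csv.reader(io.StringIO(text), delimiter=",", quotechar=qualifier))


data = 'id,address,name\n1,"Some Address, Some Street, 10452",David\n'

rows = parse_rows(data)
naive = data.splitlines()[1].split(",")  # ignores the qualifier entirely

print(rows[1])
print(naive)
```

This only helps if the address values are actually wrapped in the qualifier in the source file.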
If there is a comma in a column, then that column should be surrounded by single quotes or double quotes. If there is a single or double quote inside that column, it should have an escape character before it, usually a \
Example format of CSV
ID - address - name
1, "Some Address, Some Street, 10452", 'David O\'Brian'
Newer versions of SQL Server (2017 and later) support the CSV format fully in BULK INSERT, including mixed use of " and , .
BULK INSERT Sales.Orders
FROM '\\SystemX\DiskZ\Sales\data\orders.csv'
WITH ( FORMAT='CSV');
I'd suggest either using a format other than CSV or trying other characters as the field separator and/or text delimiter. Try looking for a character that isn't used in your data, e.g. |, # or ^. The format of a single row would become
|foo|,|bar|,|baz, qux|
A well-behaved parser must not interpret 'baz' and 'qux' as two columns.
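As a quick check of that claim, the row above can be fed to Python's `csv` reader with `|` configured as the quote character. This is just one example of a parser that honors a custom text delimiter; other tools expose the same idea under different option names.

```python
import csv
import io

line = "|foo|,|bar|,|baz, qux|"

# Use a comma as the field separator and the pipe as the text delimiter.
reader = csv.reader(io.StringIO(line), delimiter=",", quotechar="|")
fields = next(reader)

print(fields)
```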
Alternatively, you could write your own import voodoo that fixes any problems. For the latter, you might find this Groovy skeleton useful (not sure what languages you're fluent in, though).
Most systems, including Excel, will allow for the column data to be enclosed in single quotes...
col1,col2,col3
'test1','my test2, with comma',test3
Another alternative is to use the Macintosh version of CSV, which uses TAB's as delimiters.
The best, quickest and easiest way to resolve the comma in data issue is to use Excel to save a comma separated file after having set Windows' list separator setting to something other than a comma (such as a pipe). This will then generate a pipe (or whatever) separated file for you that you can then import. This is described here.
I don't think adding quotes will help. The best way, I suggest, is to replace the commas in the content with another character, such as a space.
replace(COLUMN,',',' ') as COLUMN
Wrapping the selected column in speech marks on both sides works. You must also cast the column to NVARCHAR(MAX) to turn it into a string if the column is of type TEXT.
SQLCMD -S DB-SERVER -E -Q "set nocount on; set ansi_warnings off; SELECT '""' + cast ([Column1] as nvarchar(max)) + '""' As TextHere, [Column2] As NormalColumn FROM [Database].[dbo].[Table]" /o output.tmp /s "," -W
