How to import a .csv file with double quotes in column values to SQL table - sql-server

I am trying to import the data from a .csv file to SQL table using SSIS data flow task. One row in my .csv file is like
Col1,Col2,Col3
1200,"ABC","Value is \"greater\" than expected"
While creating the Flat file connection, I have given Comma as Delimiter and " as Qualifier. And created a derived column (REPLACE(Col3,"\"","")) as the second step to remove \" from column3.
But as soon as I start running the package I get an error in the Flat file source itself as "Column delimiter for col3 was not found".
Can someone please guide me in solving this issue?

You may need to escape the slash too, try this please and let us know:
(REPLACE(Col3,"\\\"",""))

Related

Issue with commas in the column while exporting SQL Server query result to .csv file with double quotes the data is splitting up to next columns

I have an issue with commas in the column while exporting SQL Server query results to a .csv file. The data is splitting up to next columns with double quotes, using SSIS.
This is my sample data:
EmpId Location Dept
---------------------------
101 Nyc,AUS It,HR
102 Nyc,AUS It,HR
When exporting this data into a .csv using SSIS, the data is splitting next column though I used text-qualifier double quote (").
Any suggestions will be appreciated
You might want to use another column delimiter, such as ";" instead of "," when exporting it to .csv
When using text qualifier you have to make sure they are applied on the export also.
Open your .csv by notepad++ or any text editor and see if each Location-entry is surrounded by double quotes (")

Error importing data from CSV with OpenRowset in SQL Server - Mysterious value of "S7"

I have a file dump which needs to be imported into SQL Server on a daily basis, which I have created a scheduled task to do this without any attendant. All CSV files are decimated by ',' and it's a Windows CR/LF file encoded with UTF-8.
To import data from these CSV files, I mainly use OpenRowset. It works well until I ran into a file in which there's a value of "S7". If the file contains the value of "S7" then that column will be recognized as datatype of numeric while doing the OpenRowset import and which will lead to a failure for other alphabetic characters to be imported, leaving only NULL values.
This is by far I had tried:
Using IMEX=1: openrowset('Microsoft.ACE.OLEDB.15.0','text;IMEX=1;HDR=Yes;
Using text driver: OpenRowset('MSDASQL','Driver=Microsoft Access Text Driver (*.txt, *.csv);
Using Bulk Insert with or without a format file.
The interesting part is that if I use Bulk Insert, it will give me a warning of unexpected end of file. To solve this, I have tried to use various row terminator indicators like '0x0a','\n', '\r\n' or not designated any, but they all failed. And finally I managed to import some of the records which using a row terminator of ',\n'. However the original file contains like 1000 records and only 100 will be imported, without any notice of errors or warnings.
Any tips or helps would be much appreciated.
Edit 1:
The file is ended with a newline character, from which I can tell from notepad++. I managed to import files which give me an error of unexpected end of file by removing the last record in those files. However even with this method, that I still can not import all records, only a partial of which can be imported.

Tilde (~) Delimited File Read in SSIS

I'm trying to load a Tilde (~) delimited .DAT to SQL Server DB using SSIS. When I use a flat file source to read the file, I don't see the option of a ~ delimiter. I'm pasting a row from my file below:
7318~97836: LRX PAIN MONTHLY DX~001~ALL OTHER NSAIDs~1043676~001~1043676~001~OSR~401~01~ORALS,SOL,TAB/CAP RE~156720~50MG~ANSAID~100 0170-07
In here, I need to get the data between the columns separated by a ~ i.e.
Column 1 should have '7318', Column 2 should have '97836: LRX PAIN MONTHLY DX'.
Can someone help me with this? Can this be done using a Flat File Source or do I need to use a Script Task?
Sure you can, you just need to configure the "Column delimiter" property in the "Flat File Connection Manager Editor". There are some predetermined choices there, but you can click and type any separator you want:
After that you can click "refresh" and then "OK".

Import CSV data into SQL Server

I have data in the csv file similar to this:
Name,Age,Location,Score
"Bob, B",34,Boston,0
"Mike, M",76,Miami,678
"Rachel, R",17,Richmond,"1,234"
While trying to BULK INSERT this data into a SQL Server table, I encountered two problems.
If I use FIELDTERMINATOR=',' then it splits the first (and sometimes the last) column
The last column is an integer column but it has quotes and comma thousand separator whenever the number is greater than 1000
Is there a way to import this data (using XML Format File or whatever) without manually parsing the csv file first?
I appreciate any help. Thanks.
You can parse the file with http://filehelpers.sourceforge.net/
And with that result, use the approach here: SQL Bulkcopy YYYYMMDD problem or straight into SqlBulkCopy
Use MySQL load data:
LOAD DATA LOCAL INFILE 'path-to-/filename.csv' INTO TABLE `sql_tablename`
CHARACTER SET 'utf8'
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"'
IGNORE 1 LINES;
The part optionally enclosed by '\"', or escape character and quote, will keep the data in the first column together for the first field.
IGNORE 1 LINES will leave the field name row out.
UTF8 line is optional but good to use if names have diacritics, like in José.

Talend: Write data to PostgreSQL database error

I am trying to write data from a .csv file to my postgreSQL database. The connection is fine, but when I run my job i get the following error:
Exception in component tPostgresqlOutput_1
org.postgresql.util.PSQLException: ERROR: zero-length delimited identifier at or near """"
at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:1592)
at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1327)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:192)
at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:451)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:336)
at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:328)
at talend_test.exporttoexcel_0_1.exportToExcel.tFileInputDelimited_1Process(exportToExcel.java:568)
at talend_test.exporttoexcel_0_1.exportToExcel.runJobInTOS(exportToExcel.java:1015)
at talend_test.exporttoexcel_0_1.exportToExcel.main(exportToExcel.java:886)
My job is very simple:
tFileInputDelimiter -> PostgreSQL_Output
I think that the error means that the double quotes should be single quotes ("" -> ''), but how can i edit this in Talend?
Or is it another reason?
Can anyone help me on this one?
Thanks!
If you are using the customer.csv file from the repository then you have to change the properties of customer file by clicking through metadata->file delimited->customer in the repository pane.
You should be able to right click the customer file and then choose Edit file delimited. In the third screen, if the file extension is .csv then in Escape char settings you have to select CSV options. Typical escape sequences (as used by Excel and other programs) have escape char as "\"" and text enclosure is also "\"".
You should also check that encoding is set to UTF-8 in the file settings. You can then refresh your preview to view a sample of your file in a table format. If this matches your expectations of the data you should then be able to save the metadata entry and update this to your jobs.
If your file is not in the repository, then click on the component with your file and do all of the above CSV configuration steps in the basic settings of the component.

Resources