Bulk Import CSV file into SQL Server - remove double quotes - sql-server

I am running SQL Server 2008 and using the BULK INSERT command. While inserting the data, I am trying to remove the double quotes (") from the CSV file. This works partially, but not for all the records. Please check my code.
Bulk Insert tblUsersXTemp
from 'C:\FR0250Members161212_030818.csv'
WITH (FIELDTERMINATOR = '","',
ROWTERMINATOR = '"\n"',
--FormatFile =''
ERRORFILE = 'C:\bulk_insert_BadData.txt')

After you do the bulk insert, you could replace the double quotes.
UPDATE tblUsersXTemp
SET usxMembershipID = REPLACE(usxMembershipID, CHAR(34), '')

I believe you need a format file.
If you run a BULK INSERT command like yours without a format file, you will end up with a quotation mark prefixed to the first column value and a quotation mark suffixed to the last column value.
Reference
Example from reference:
BULK INSERT tblPeople
FROM 'bcp.txt'
WITH (
DATAFILETYPE = 'char',
FIELDTERMINATOR = '","',
ROWTERMINATOR = '\n',
FORMATFILE = 'bcp.fmt');
You could also potentially have dirty data that uses quotes for more than just delimiters.
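As a hedged illustration of the point above: when '","' is the field terminator, the opening quote of the first field and the closing quote of the last field are not part of any terminator, so they stay in the data. The sketch below simulates that with a table variable (the column names FirstCol, MiddleCol and LastCol are hypothetical, not taken from the original table) and strips only those two outer quotes rather than every quote in the value.

```sql
-- Hypothetical staging table standing in for the real import target
DECLARE @staging TABLE (
    FirstCol  varchar(50),
    MiddleCol varchar(50),
    LastCol   varchar(50)
);

-- Simulates what FIELDTERMINATOR = '","' leaves behind for the line "A1","B1","C1"
INSERT INTO @staging (FirstCol, MiddleCol, LastCol)
VALUES ('"A1', 'B1', 'C1"');

-- Remove only the stray leading quote of the first column
-- and the stray trailing quote of the last column
UPDATE @staging
SET FirstCol = CASE WHEN LEFT(FirstCol, 1) = '"'
                    THEN STUFF(FirstCol, 1, 1, '')
                    ELSE FirstCol END,
    LastCol  = CASE WHEN RIGHT(LastCol, 1) = '"'
                    THEN LEFT(LastCol, LEN(LastCol) - 1)
                    ELSE LastCol END;

SELECT FirstCol, MiddleCol, LastCol FROM @staging;
```

Unlike a blanket REPLACE, this leaves any quote characters inside a value untouched, which matters if the data is "dirty" in the sense described above.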

Related

BULK INSERT some rows being added with quotation marks

I'm attempting to BULK INSERT a tab-separated text file into a database table containing only VARCHAR columns. For some reason, some of the values end up wrapped in double quotation marks, seemingly at random, while others do not:
domain sku type product
amazon.com b0071n529i laptop hp_4535s_a7k08ut#aba_15.6-inch_laptop
amazon.com b00715sj82 laptop "dell_64gb_mini_pcie_ssd_pata,_f462n"
The statement I'm using looks like this:
BULK INSERT database
FROM 'file.txt' WITH (FIRSTROW = 1, FIELDTERMINATOR = '\t', ROWTERMINATOR = '0x0a');
If the only issue is those double quotes, you can strip them after the insert, which is the simpler solution:
-- A stands for the name of the imported table
UPDATE A
SET A.Product = REPLACE(A.Product, '"', '')
WHERE LEFT(A.Product, 1) = '"' OR RIGHT(A.Product, 1) = '"'
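If a value can legitimately contain a quote character, removing every quote may be too aggressive. Here is a minimal sketch, assuming the exporting tool followed the common CSV convention of wrapping a value in one pair of quotes and doubling any embedded quote. It uses a hypothetical table variable in place of the real table:

```sql
-- Hypothetical stand-in for the imported table
DECLARE @products TABLE (Product varchar(200));

INSERT INTO @products (Product) VALUES ('"dell_64gb_mini_pcie_ssd_pata,_f462n"');
INSERT INTO @products (Product) VALUES ('hp_4535s_a7k08ut#aba_15.6-inch_laptop');

-- Strip one wrapping pair of quotes, then collapse doubled quotes inside the value
UPDATE @products
SET Product = REPLACE(SUBSTRING(Product, 2, LEN(Product) - 2), '""', '"')
WHERE LEN(Product) >= 2
  AND LEFT(Product, 1) = '"'
  AND RIGHT(Product, 1) = '"';

SELECT Product FROM @products;
```

Rows that are not wrapped in quotes are left exactly as imported.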

SQL Server Import from csv file

I'm trying to import data from a .csv file into a SQL Server table.
Using the code below, I can read from the file:
BULK INSERT #TempTable
FROM '\\Data\TestData\ImportList.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR ='\n', FIRSTROW = 2, Lastrow = 3)
GO
(I added LastRow = 3 so I was just getting a subset of the data rather than dealing with all 2000 rows)
But I am getting multiple columns into a single column:
If I use the Import/Export wizard in SSMS, with the below settings, I see the expected results in the preview:
Can anyone give me some pointers as to how I need to update my query so that it performs correctly?
Here is a sample of what the CSV data looks like:
TIA.
You probably need to specify " as the text qualifier.
Your fields seem to be quoted and most likely contain commas, which are currently splitting your fields.
Or, if it works fine using <none> as the text qualifier, try FIELDQUOTE = '' or FIELDQUOTE = '\b' in your query. FIELDQUOTE defaults to '"'. Note that FIELDQUOTE requires SQL Server 2017 or later and is intended for use with FORMAT = 'CSV'.
It's hard to tell what's really wrong without looking at some raw csv data that includes those quotes (as seen in your first screenshot).
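On SQL Server 2017 or later, the text-qualifier suggestion can be expressed directly in the query. The following is a sketch under that version assumption, reusing the table and path from the question. It has not been run against the asker's file.

```sql
-- Assumes SQL Server 2017+; FORMAT = 'CSV' makes BULK INSERT honor quoted fields
BULK INSERT #TempTable
FROM '\\Data\TestData\ImportList.csv'
WITH (
    FORMAT = 'CSV',
    FIELDQUOTE = '"',        -- the default, shown here for clarity
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n',
    FIRSTROW = 2,
    LASTROW = 3
);
```

With the quote character declared, commas inside quoted values should no longer split a field into several columns.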

Bulk insert with text qualifier in SQL Server

I am trying to bulk insert a few records into a test table from a CSV file:
CREATE TABLE Level2_import
(wkt varchar(max),
area VARCHAR(40),
)
BULK
INSERT level2_import
FROM 'D:\test.csv'
WITH
(
FIRSTROW = 2,
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n'
)
The bulk insert should skip the first row and insert the data into the table. It skips the first row all right, but gets confused by the delimiters. The first column is wkt, and its value is double-quoted and contains commas.
So I guess my question is: is there a way to tell BULK INSERT that the double-quoted part is one column, regardless of the commas within it?
The CSV file looks like this:
"MULTIPOLYGON (((60851.286135090661 510590.66974495345,60696.086128673756 510580.56976811233,60614.7860844061 510579.36978015327,60551.486015895614)))", 123123.22
You need to use a format file to implement a text qualifier for BULK INSERT. Essentially, you need to teach BULK INSERT that each field may have a different delimiter.
Create a text file called "level_2.fmt" with the following content and save it:
11.0
2
1 SQLCHAR 0 8000 "\"," 1 wkt SQL_Latin1_General_CP1_CI_AS
2 SQLCHAR 0 40 "\r\n" 2 area SQL_Latin1_General_CP1_CI_AS
The first line, "11.0", is the format-file version, which corresponds to your SQL Server version (11.0 is SQL Server 2012). The second line says that the data file has two fields. Each line after that describes one field and has the following layout:
[Source Field Number] [Data Type] [Prefix Length] [Max Data Length] [Terminator] [Destination Column Number] [Destination Column Name] [Collation]
Once you've created that file, you can read in your data with the following bulk insert statement:
BULK INSERT level2_import
FROM 'D:\test.csv'
WITH
(
FIRSTROW = 2,
FORMATFILE='D:\level_2.fmt'
);
Refer to this blog for a detailed explanation of the format file.
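One caveat with the format file above, as far as I can tell: the terminator "\"," consumes the closing quote and the comma, but nothing consumes the opening quote, so wkt is likely to arrive with a leading ". The sample row also has a space after the comma, which would end up at the start of area. A minimal cleanup sketch, assuming that is what you see after the import:

```sql
-- Run after the BULK INSERT; only touches rows that actually start with a quote
UPDATE level2_import
SET wkt  = CASE WHEN LEFT(wkt, 1) = '"'
                THEN STUFF(wkt, 1, 1, '')   -- drop the first character
                ELSE wkt END,
    area = LTRIM(area);                     -- drop the space that followed the comma
```

An alternative some people use is an extra leading field in the format file whose terminator is the quote itself, mapped to destination column 0 so that it is skipped. That avoids the post-processing step but changes the field count in the format file.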
SQL Server 2017 finally added support for text qualifiers and the CSV format defined in RFC 4180. It should be enough to write:
BULK INSERT level2_import
FROM 'D:\test.csv'
WITH ( FORMAT = 'CSV', ROWTERMINATOR = '\n', FIRSTROW = 2 )
Try changing the format file's extension from .fmt to .txt; that worked for me.
I have had this issue working with LDAP data: the DN contains commas, as do other fields that contain DNs. Try changing your field terminator to another, unused character, such as a pipe (|) or a semicolon (;). Do this in both the data and the file definition.
So the code should be:
CREATE TABLE Level2_import
(wkt varchar(max),
area VARCHAR(40),
)
BULK
INSERT level2_import
FROM 'D:\test.csv'
WITH
(
FIRSTROW = 2,
FIELDTERMINATOR = ';',
ROWTERMINATOR = '\n'
)
and your CSV:
"MULTIPOLYGON (((60851.286135090661 510590.66974495345,60696.086128673756 510580.56976811233,60614.7860844061 510579.36978015327,60551.486015895614)))"; 123123.22

Bulk INSERT without FIELDDELIMITER

How can I bulk insert a file like below?
test.txt
012341231
013212313
011312321
012312312
The text file does not contain a delimiter. I have used:
BULK INSERT tbl_import_#id#
FROM '../test.txt'
WITH
(FIELDTERMINATOR = '\t',
ROWTERMINATOR = '\n')
and I got an error for that. Appreciate any help thanks.
There is no problem. You can specify a field terminator even if your file doesn't have any field terminators like \t or ,.
Please post the error you got. Check the location of your FROM file ('../test.txt') and the schema of the table you are importing into. I cannot reproduce your error; it works fine for me with your values.
Just run the query without FIELDTERMINATOR:
BULK INSERT tbl_import_#id#
FROM '../test.txt'
WITH (ROWTERMINATOR = '\n')
The FIELDTERMINATOR argument would be helpful in case you had multiple columns in your table (more values per row). But I can see that this is not the case, so you don't need to separate values except by rows, which will be records in your table.
EDIT:
In case you can use a different table, just create a table with only one column (the ID column) and run the import (the query above).
After that, run an ALTER TABLE command to add the other columns that you want.
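The steps in the edit above can be sketched as follows. The table name tbl_import_demo and the added column ImportedOn are hypothetical; the original uses a templated table name. The ID column is declared as varchar on the assumption that the leading zeros in the sample values should be preserved.

```sql
-- Hypothetical single-column table to receive one value per line
CREATE TABLE tbl_import_demo (ID varchar(20) NOT NULL);

-- No FIELDTERMINATOR needed: each row of the file holds exactly one value
BULK INSERT tbl_import_demo
FROM '../test.txt'
WITH (ROWTERMINATOR = '\n');

-- Hypothetical extra column added once the load has finished
ALTER TABLE tbl_import_demo ADD ImportedOn datetime NULL;
```

Adding the extra columns only after the load keeps the file layout and the table layout identical during the import, so no format file is required.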

SQL Server - Bulk insert without losing CR or LF characters

I am trying to import email communication into a database table using Bulk Insert but I can't seem to be able to preserve the CR and LF characters. Let's consider the following:
CREATE TABLE myTable (
Email_Id int,
Email_subject varchar(200) NULL,
Email_Body TEXT NULL
)
The bulk insert statement has the following:
codepage = '1250',
fieldterminator = '<3P4>',
rowterminator = '<3ND>',
datafiletype = 'char'
The file contains full emails (including CR and LF characters). I would like to import the data and include the CR and LF characters. I have read that BULK INSERT treats each entry as a single row but does that mean it strips out the CR and LF characters? If so, what can I use to import this CSV file? I don't have access to SSIS and I would prefer to use SQL code to do it.
Example data:
11324<3P4>Read this email because it's urgent<3P4>Haha John,
I lied, the email was just to mess with you!
Your Nemesis,
Steve
P.S. I still hate you!
<3ND>
11355<3P4>THIS IS THE LAST STRAW<3P4>Steve,
I have had it with you stupid jokes, this email is going to the manager.
Good day,
John
<3ND>
It should import with the carriage returns and linefeeds, even if you don't see them in some tools. We would import XSL this way and it would preserve all of the line formatting.
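To confirm that, one option is to look for the characters explicitly instead of trusting how the client displays them. Here is a sketch against the myTable definition from the question. The CAST is there because Email_Body is TEXT, which CHARINDEX does not accept directly.

```sql
-- A non-zero position means the line break is stored in the column
SELECT Email_Id,
       CHARINDEX(CHAR(13) + CHAR(10), CAST(Email_Body AS varchar(max))) AS FirstCrLfPos,
       CHARINDEX(CHAR(10), CAST(Email_Body AS varchar(max)))            AS FirstLfPos
FROM myTable;
```

If FirstLfPos is non-zero but FirstCrLfPos is 0, the file most likely used bare LF line endings, and those were preserved as such.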

Resources