SQL Server BCP: Quotes around fields but some fields contain quotes within - sql-server

bcp DBName..vieter out c:\test003.txt -c -T -t"\",\"" -r"\"\n\"" -S SERVER
The above field terminator (i.e. -t"\",\"" -r"\"\n\"") works and gets all .csv data fields surrounded by quotation marks.
However, one of the fields in the data stores written articles which contain quotes themselves. When I import the data into a database, it doesn't copy perfectly because the parser interprets the quotation marks inside the articles as field terminators. Is there an easy fix for this?
I tried multiple variations of the options for FIELDS TERMINATED BY, ENCLOSED BY, and ESCAPED BY, but can't seem to get the files to import perfectly.
This is the query structure, in case you aren't familiar with it:
LOAD DATA LOCAL INFILE '/home/myinfotel/dump_new/NewsItemImages.csv' INTO TABLE NewsItemImages FIELDS TERMINATED BY ',' ENCLOSED BY '"' LINES TERMINATED BY '\n';
And here is an example record that isn't copying correctly into the database:
"CP1000066268","BX101-1219_2016_205649.jpg","FILE - In this Monday, Dec. 19, 2016 file photo Maine Republican Gov. Paul LePage, right, and House Speaker Sara Gideon, D-Freeport, attend the Electoral College vote at the State House in Augusta, Maine. LePage says he had weight loss surgery and jokes that now "there's 50 less pounds of me to hate." The Republican revealed the bariatric surgery for the first time Wednesday, Jan. 11. He says he underwent the procedure on Sept. 29 and returned to work a day later. (AP Photo/Robert F. Bukaty)","","100","69","650","447","0","","","1","","1","0","live","2017-01-11 16:56:18.000"
Any help is appreciated!

Turns out exporting from SQL Server Management Studio was the solution (or I suppose using the sqlcmd utility on the command line would also do the trick). Bizarre. I tried exporting to a CSV with both bcp and Excel (its built-in Get Data function), but all to no avail.
All I did was connect to the database in SSMS, dump the table with a query, and use "Save Results As" to save it as a .csv. Everything is parsed perfectly now. Here is a link that guided me to this solution.
https://blog.devart.com/how-to-export-sql-server-data-from-table-to-a-csv-file.html
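If you need a scripted alternative to SSMS, the usual fix is to write RFC-4180-style CSV, where an embedded quote is doubled rather than left bare. A minimal Python sketch (the row content is shortened from the example record above):

```python
import csv
import io

# Rows as they might come out of the database; the third field contains
# an embedded quote, which is what broke the plain bcp export.
rows = [
    ("CP1000066268", "BX101-1219_2016_205649.jpg",
     'LePage jokes that "there\'s 50 less pounds of me to hate."'),
]

buf = io.StringIO()
# QUOTE_ALL wraps every field in quotes, and the writer doubles any
# embedded quote character, which standard CSV parsers understand.
writer = csv.writer(buf, quoting=csv.QUOTE_ALL)
writer.writerows(rows)

print(buf.getvalue())
```

MySQL's LOAD DATA with ENCLOSED BY '"' recognizes a doubled enclosure character inside a field, so a file written this way should import without the parser treating the article quotes as field boundaries.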

Related

Reading csv files and importing data into SQL Server

I'm successfully reading numbers from a .csv file into SQL Server with the statement below, assuming that I've created a linked server named CSV_IMPORT.
select *
from CSV_IMPORT...Sophos#csv
However, the problem is that if a number contains a comma, the result shows NULL instead of the correct value. How can I read the "54,375" into SQL Server correctly? Thank you very much for your help.
Below is the data in csv file.
09/07/2017,52029,70813,10898,6691,6849,122,25,147427
09/08/2017,47165,61253,6840,5949,5517,75,2,126801
09/14/2017,"54,375","16944","15616","2592","3280",380,25,"96390"
This is the result from the statement:
2017-09-07 00:00:00.000 52029 70813 10898 6691 6849 122 25 147427
2017-09-08 00:00:00.000 47165 61253 6840 5949 5517 75 2 126801
2017-09-14 00:00:00.000 NULL 16944 15616 2592 3280 380 25 96390
One way to go would be to use a temporary table. Read all the data as text, then replace every comma in the whole table with a dot (.) if you want it as a decimal separator, or with an empty string ('') if it's a thousands separator, then load the data into the existing table, converting everything (you don't have to do it explicitly; SQL does it implicitly).
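The same strip-the-separator idea can be prototyped outside SQL Server. A small Python sketch, with the sample file contents inlined for illustration: a real CSV parser treats "54,375" as one field, and the thousands separator is removed before conversion.

```python
import csv

# The sample data from the question, inlined for illustration.
data = '''09/07/2017,52029,70813,10898,6691,6849,122,25,147427
09/08/2017,47165,61253,6840,5949,5517,75,2,126801
09/14/2017,"54,375","16944","15616","2592","3280",380,25,"96390"'''

rows = []
for fields in csv.reader(data.splitlines()):
    date, *numbers = fields
    # Treat the comma as a thousands separator and strip it before
    # converting, mirroring the REPLACE step described above.
    rows.append([date] + [int(n.replace(",", "")) for n in numbers])

print(rows[2])
```

The quoted "54,375" comes through as the single integer 54375 instead of splitting into two columns (which is what produced the NULL).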
Last year I did a project for a client which involved importing csv files which were meant to share the same format, but which came from different sources and hence were inconsistent (even to the point of using different separators depending on the source). I ended up writing a CLR routine which read the csv line by line, parsed the content, and added it to a DataTable. This DataTable I then inserted into SQL Server using the SqlBulkCopy class.
The advantage of this approach was that I was totally in control of dealing with all anomalies in the file. It was also much faster than the alternative of inserting the whole file into a temporary table of varchars and then parsing within SQL Server. Effectively I did one line-by-line parse in C# and one bulk insert of parsed data.
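The C# routine itself isn't shown in the answer; as an illustration of the same idea (detect each source's separator, then parse row by row before the bulk load), here is a Python sketch using csv.Sniffer. The feed contents are made up for the example:

```python
import csv
import io

def parse_feed(text):
    # Different sources used different separators; Sniffer guesses the
    # delimiter from the data before the row-by-row parse.
    dialect = csv.Sniffer().sniff(text)
    return list(csv.reader(io.StringIO(text), dialect))

comma_feed = "a,b,c\n1,2,3\n"   # hypothetical source 1
semi_feed = "a;b;c\n1;2;3\n"    # hypothetical source 2

print(parse_feed(comma_feed)[1])
print(parse_feed(semi_feed)[1])
```

Both feeds parse to the same row structure, which could then be handed to a single bulk-insert step, analogous to the DataTable + SqlBulkCopy flow the answer describes.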

SQL Server 2014: reading single quote comma delimiter error (',')

There is an example csv file:
category,fruits,cost
'Fruits','Apple,banana,lemon','10.58'
When I import this csv into SQL Server 2014
by right-clicking the database in Object Explorer => Tasks => Import Data,
no matter how I play around with the column delimiter options, row 2 always becomes
5 columns (Fruits, Apple, banana, lemon, 10.58) instead of the desired 3 columns
('Fruits', 'Apple,banana,lemon', '10.58'). (So I want 'Apple,banana,lemon' to be in one column.)
The solution in How do I escape a single quote in SQL Server? doesn't work. Could any guru enlighten me? Python, Linux bash, SQL or simple editor tricks are all welcome! Thank you!
No matter how I play around with column delimiter options
That's not the option you need to play with; set the Text Qualifier to a single quote (') instead.
It then imports easily.
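The role of the Text Qualifier can be demonstrated with Python's csv module: setting the quote character to a single quote keeps the commas inside 'Apple,banana,lemon' from splitting the field.

```python
import csv
import io

# The example file from the question, inlined for illustration.
raw = """category,fruits,cost
'Fruits','Apple,banana,lemon','10.58'"""

# quotechar plays the same role as the wizard's Text Qualifier: commas
# inside a quoted field no longer act as column delimiters.
rows = list(csv.reader(io.StringIO(raw), quotechar="'"))
print(rows[1])
```

Row 2 parses to the desired 3 columns rather than 5.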

Sql Server - Encoding issue, replace strange characters

After importing some data into a SQL Server 2014 database, I realized that there are some fields in which German characters such as ü, ß, ä, ö, etc. were replaced with some weird characters. Ex.
MÃ¼nchen should be München
ChiemgaustraÃŸe should be Chiemgaustraße
KÃ¶nigstr should be Königstr
I would like to replace these characters with the right German letter. Ex.
Ã¼ -> ü
ÃŸ -> ß
Ã¶ -> ö
However, when I run queries like the following to try to identify which rows contain these characters, the queries return 0 rows.
select address
from Directory
where street like N'%ChiemgaustraÃŸe 50%'
select address
from Directory
where street like N'%Ã¼%'
Is there a query I can run to identify and replace these characters?
I must clarify that most of the data was imported correctly, in fact I believe the strange characters were already part of the original data.
Also, I think I can export the data to a text file, replace the characters and re-import, but I was wondering if there is a way to do it directly in sql.
Thanks in advance for the help.
I couldn't get it fixed using only SQL.
FutbolFa's suggestion worked for the most part, but there were a couple of symbols, in particular "Ã", that weren't picked up by any query I tried. I ended up exporting the data to a text file and replacing the symbols there. Then I just re-imported the info.
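If the export/replace step is scripted, the repair can be done in one pass. This kind of damage is typically UTF-8 bytes that were decoded as Windows-1252, and reversing that decode restores the text. A hedged Python sketch (it assumes the data really is cp1252-misread UTF-8; rows damaged some other way are left alone, as the stubborn "Ã" cases above were):

```python
def fix_mojibake(s):
    # "Ã¼ instead of ü" is the signature of UTF-8 bytes read as
    # cp1252; re-encoding to cp1252 and decoding as UTF-8 undoes it.
    try:
        return s.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Differently damaged rows can't round-trip; leave them for
        # manual repair, as the answer above ended up doing.
        return s

print(fix_mojibake("M\u00c3\u00bcnchen"))          # München
print(fix_mojibake("Chiemgaustra\u00c3\u0178e"))   # Chiemgaustraße
```

cp1252 rather than latin-1 matters here: characters like Ÿ (the second byte of a mangled ß) exist only in cp1252.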

Format fields during bulk insert SQL 2008

I am currently working on a project that requires data from a report generated by third party software to be inserted into a local SQL database. So far I have the data stored as a tab delimited .txt file and the following bulk insert SQL statement:
BULK INSERT ExampleTable
FROM 'c:\temp\Example.txt'
WITH
(
FIRSTROW = 2,
FIELDTERMINATOR = '\t',
ROWTERMINATOR = '\n'
)
GO
The two problems I am encountering are quotation marks around any value that includes its own comma, and dollar signs in every field that holds a dollar amount.
For instance one of the columns of the table is a description field and some of the values come out looking like:
"this is an example description, some more information, I don't know why the author would use commas in the first place here"
I don't care about the description field nearly as much as the other fields that include dollar amounts. Each of these fields is already prefixed with a $ sign, so I have to load them as nvarchar instead of decimal or float, which would be A LOT more useful for reporting. Furthermore, when the dollar amount is greater than 1,000, the field will also contain a comma, and thus quotation marks, e.g. "$1,084.59".
I am familiar with SSMS, but I have never made a format or bcp file (the solutions I have found online).
Any help would be greatly appreciated.
You can use a format file, but only if your metadata remains constant, which it does not appear to be in your case. You state that the dollar amounts are enclosed in quotes only when they exceed 999 and a comma is inserted. A format file would let you define per-column delimiters such as [,] or [","], but if the delimiter shifts throughout the file, you will have to pre-process it. Text qualifiers themselves are not supported.
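The pre-processing can be a short script that parses each tab-delimited line with a real CSV reader (so quoted values with commas survive), strips the currency formatting, and rewrites the file for BULK INSERT. A minimal Python sketch; the sample line is made up to match the description:

```python
import csv
import io

# One hypothetical line of the report: a quoted description containing
# commas, plus quoted and unquoted dollar amounts.
raw = '"some description, with commas"\t"$1,084.59"\t$12.50\n'

out = io.StringIO()
for fields in csv.reader(io.StringIO(raw), delimiter="\t"):
    cleaned = []
    for f in fields:
        if f.startswith("$"):
            # Strip the currency sign and thousands separator so the
            # column can be loaded as decimal instead of nvarchar.
            f = f.lstrip("$").replace(",", "")
        cleaned.append(f)
    out.write("\t".join(cleaned) + "\n")

print(out.getvalue())
```

The rewritten file has plain numeric fields and no stray quotes, so the simple FIELDTERMINATOR = '\t' bulk insert from the question works unchanged.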
For reference:
CSV import in SQL Server 2008
http://jessesql.blogspot.com/2010/05/bulk-insert-csv-with-text-qualifiers.html
I don't see why, but ThiefMaster deleted my answer :-(
Probably a mistake and he did not check the link, as the link is the full answer to your question. I will try again for the last time here...
Tip: if your CSV file doesn't have a consistent format, for example ON THE SAME COLUMN some of the values are double-quoted and some are not, then this blog will help you do it in an easy way (using OPENROWSET in the last step makes it one simple query): http://ariely.info/Blog/tabid/83/EntryId/122/Using-Bulk-Insert-to-import-inconsistent-data-format-using-pure-T-SQL.aspx
There is a new wiki at http://social.technet.microsoft.com/wiki based on this blog, if you prefer to read it on the Microsoft site.

SQL 2005 CSV Import Quote Delimited with inner Quotes and Commas

I have a CSV file with quote text delimiters. Most of the 90000 rows are fine, but I have a few rows that have a text field that contains both a quote and a comma. For example the fields value would be:
AB",AB
When Delimited this becomes
"AB"",AB"
When SQL 2005 attempts to import this I get errors such as...
Messages
Error 0xc0202055: Data Flow Task: The column delimiter for column "Column 4" was not found.
(SQL Server Import and Export Wizard)
This only seems to happen when a quote and comma are in a text value together. Values like
AB"AB which becomes "AB""AB"
or
AB,AB which becomes "AB,AB"
work fine.
Here are some example rows...
"1464885","LEVER WM","","B","MP17"
"1465075",":PLT-BC !!NOTE!!","","B",""
"1465076","BRKT-STR MTR !NOTE!","","B",""
"1465172",":BRKT-SW MTG !NOTE!","","B","MP16"
"1465388","BUSS BAR !NOTE!","","B","MP10"
"1465391","PLT-BLKHD ""NOTE""","","B","MP20"
"1465564","SPROCKET:13TEETH,74MM OD,66MM","ID W/.25"" SETSCR","B","MP6"
"S01266330002","CABLE:224"",E122/261,8 CO","","B","MP11"
The last row is an example of the problem - the "", causes the error.
I've had MAJOR problems with SSIS. Things that Access, Excel, and even DTS seemed to do very well, SSIS chokes on. Variable record-length data is another problem, but yes, these embedded qualifiers are a major problem, especially if you do not have access to the import files because they're on someone else's server that you pay to gain access to, and they might even be 4 to 5 GB in size! You can't just do a "replace all" on that every import.
You may want to check out the tool on Microsoft Downloads called "UnDouble", and here is another workaround you might try.
Seems like with SSIS in SQL Server 2008, the bug is still there. I don't know why they haven't addressed this in the parser, but it's like we went back in time with SSIS in basic import functionality.
UPDATE 11-18-2010: This bug still exists in SSIS. Amazing.
How about just:
Search/replace all "", with ''; (fix all the broken fields)
Search/replace all ;''; with ,"", (to "unfix" properly empty fields.)
Search/replace all '';''; with "","", (to "unfix" properly empty fields which follow a correct encapsulation of embedded delimiters.)
That converts your original to:
"1464885","LEVER WM","","B","MP17"
"1465075",":PLT-BC !!NOTE!!","","B",""
"1465076","BRKT-STR MTR !NOTE!","","B",""
"1465172",":BRKT-SW MTG !NOTE!","","B","MP16"
"1465388","BUSS BAR !NOTE!","","B","MP10"
"1465391","PLT-BLKHD ""NOTE""","","B","MP20"
"1465564","SPROCKET:13TEETH,74MM OD,66MM","ID W/.25"" SETSCR","B","MP6"
"S01266330002","CABLE:224'';E122/261,8 CO","","B","MP11"
Which seems to run the gauntlet fine in SSIS. You may have to apply step 3 recursively to account for 3 empty fields in a row ('';'';'';, etc.), but the bottom line here is that when you have embedded text qualifiers, you have to either escape them or replace them. Let this be a lesson for your CSV-creation processes going forward.
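A more robust variant of the same replace idea is to round-trip the file through a parser that understands doubled quotes (which the file already uses correctly; it's the SSIS flat-file parser that chokes), and swap the embedded qualifier for something harmless on the way out. A Python sketch over the problem row from the question:

```python
import csv
import io

# The problem row: "", inside a quoted field is a doubled embedded
# quote, valid CSV but rejected by the SSIS flat-file parser.
raw = '"S01266330002","CABLE:224"",E122/261,8 CO","","B","MP11"\n'

fixed = io.StringIO()
writer = csv.writer(fixed, quoting=csv.QUOTE_ALL)
for row in csv.reader(io.StringIO(raw)):
    # Same spirit as the search/replace above: swap each embedded
    # double quote for two apostrophes so no field contains the
    # text qualifier at all.
    writer.writerow([f.replace('"', "''") for f in row])

print(fixed.getvalue())
```

Unlike the manual search/replace, the embedded comma can stay put, since the field remains properly quoted after the rewrite; only the qualifier character itself is substituted.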
Microsoft says doubled double quotes inside double quote delimited fields just don't work. A fix is planned for the end of 2011...
In the mean time we will have to use workarounds like described in the other answers.
I would just do a search/replace for ", and replace it with ,
Do you have access to the original file?
