Import CSV files using BCP to Azure SQL Server

My BCP command looks like this:
BCP azuredatabase.dbo.rawdata IN "clientPathToCSVFile" -S servername -U user#servername -P pw -c -t,-r\n
My CSV file is in {cr}{lf} format.
My CSV looks like this
125180896918,20,9,57.28,2020-01-04 23:02:21,282992,1327,4,2850280,49552
125180896919,20,10,57.82,2020-01-04 23:02:21,282992,1298,4,2850280,48881
125180896920,16,11,58.20,2020-01-04 23:02:21,282992,1065,4,2850280,48612
125180896921,20,12,69.10,2020-01-04 23:02:21,282992,515,4,2850280,10032
125180896922,20,13,70.47,2020-01-04 23:02:21,282992,1280,4,2850280,48766
125180896923,1,1,105.04,2020-01-04 23:02:21,,1296,4,2969398,49161
As you can see, there are also empty fields.
My output looks like this
Starting copy...
0 rows copied.
Network packet size (bytes): 4096
Clock Time (ms.) Total : 547
So how do I correctly set up my BCP command?

You stated that your data is in CRLF format (that means \r\n). But your bcp command is told to look for a line terminator of \n (using the -r option).
I would have expected your actual CRLF line terminator of "\r\n" to be split in half, with the \r being included in your last column and the \n being found as the line terminator, but it looks like BCP loaded no rows because it found no \n in your file.
I have not worked with Azure/BCP much, so maybe someone else knows how BCP for Azure would handle this, but the SQL Server version of BCP would still find your \n and then load the \r into your last column.
Either that, or your line terminator is not what you think it is. Have you viewed the file with a text editor (not Notepad, not WordPad... something that will show hidden characters like line terminators and tabs)?
Usually, when BCP loads no rows (and there are rows in the file to load), it comes down to a mix-up with line terminators.
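For reference, a minimal sketch of the same command with the row terminator spelled out as CRLF, using the hexadecimal terminator form that bcp accepts (the server, login, and file path placeholders are taken from the question):
BCP azuredatabase.dbo.rawdata IN "clientPathToCSVFile" -S servername -U user#servername -P pw -c -t , -r 0x0d0a
If the file really does end its lines with \r\n, this tells BCP to consume both characters instead of hunting for a bare \n.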

Related

Postgres query result into CSV in terminal wrongly quotes text values

I am using the following Postgres command in the terminal to output a very large query result in CSV format:
psql -d ecoprod -t -A -F"," -f queries/query.sql > exports/output.csv
It works just fine except it's not valid CSV format. Text values should be wrapped in quotes (""), but they aren't, and that causes many problems parsing the CSV when there are commas in the text and so on.
Of course I could use another delimiter like a semicolon, but that has the same problem. In addition, some text values contain line break characters, which also breaks the parsing.
I didn't find any way to modify the command in the documentation. Hope you can help me. Thank you.
-F doesn't promise to generate valid CSV. There is a --csv option you could use instead, which is at least intended for this purpose. But it seems like COPY or \copy would be more suited.
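A sketch of both options, assuming a psql recent enough to have --csv and reusing the database and query file from the question (the SELECT in the \copy form is a placeholder, since \copy needs the query inline rather than in a file):
psql -d ecoprod --csv -t -f queries/query.sql > exports/output.csv
psql -d ecoprod -c "\copy (SELECT ...) TO 'exports/output.csv' WITH (FORMAT csv, HEADER)"
Either way the output is real CSV, so values containing commas or line breaks get quoted instead of breaking your parser.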

BCP Fixed Width Import -> Unexpected EOF encountered in BCP data-file?

I have some sensitive information that I need to import into SQL Server that is proving to be a challenge. I'm not sure what the original database that housed this information was, but I do know it is provided to us in a Unix fixed length text file with LF row terminator. I have two files: a small file that covers a month's worth of data, and a much larger file that covers 5 years worth of data. I have created a BCP format file and command that successfully imports and maps the data to my SQL Server table.
The 5 year data is supposedly in the same format, so I've used the same command and format file on the text file. It starts processing some records, but somewhere in the processing (after several thousand records) it throws "Unexpected EOF encountered", and I can see in the database that some of the rows are mapped correctly according to the fixed lengths, but then something goes horribly wrong and parts of the data end up in columns they most definitely do not belong in. Is there a character that would cause BCP to mess up and terminate early?
BCP Command: BCP DBTemp.dbo.svc_data_temp in C:\Test\data2.txt -f C:\test\txt2.fmt -T -r "0x0A" -S "stageag,90000" -e log.rtf
Again, the format file and command work perfectly for the smaller data set, but something in the 5 year dataset is screwing up BCP.
Thanks in advance for the replies!
So I found the offending characters in my fixed width file. Somehow whoever pulled the data originally (I don't have access to the source) escaped (or did not escape correctly) the double quotes in some of the text, which injected extra spaces that broke the fixed width layout we were supposed to be following. After correcting the double quotes by hex editing the file, BCP was able to process all records using the format file without issue. I had used the -F and -L flags to examine certain rows of the data and narrow it down to where I could visually compare the rows that were OK against the rows where the problems started, which led me to discover the double quotes issue. Hope this helps somebody else with a similar issue!
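For reference, narrowing the load to a window of rows with -F (first row) and -L (last row) looks roughly like this, reusing the command from the question; the row numbers are illustrative:
BCP DBTemp.dbo.svc_data_temp in C:\Test\data2.txt -f C:\test\txt2.fmt -T -r "0x0A" -S "stageag,90000" -e log.rtf -F 5000 -L 5100
Combined with the -e error file, this makes it practical to bisect a large file until the rows that break the fixed widths are in view.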

BCP escape \n in data

I'm using BCP to download data from SQL Server, using queryout option.
However, I notice that if the data in any column contains '\n', BCP writes it out as an actual newline in the exported file.
For example, if the data in SQL Server is:
COLUMN_1 COLUMN_2
AAA NAME\nSURNAME
BBB NAMESURNAME
The exported file looks like:
AAA NAME
SURNAME
BBB NAMESURNAME
Referring to the BCP documentation, as I understand it, -c should not treat \n as a newline:
-c
Performs the operation using a character data type. This option does not prompt for each field; it uses char as the storage type, without prefixes and with \t (tab character) as the field separator and \r\n (newline character) as the row terminator. -c is not compatible with -w.
I'm not sure what I misunderstood.
Here is the command I use:
bcp "select [col_name] from [table_name] where [condition]" queryout test.dat -U[username] -P[password] -S[serverip.port] -c
Thank you.
If your data contains newline or CRLF control characters, then those characters WILL, naturally, be included in the data that is copied out.
Are the control characters supposed to be there? If so, then leave them and they will be imported into whatever your destination is. Just because your text view shows a "broken" line does not mean that SQL Server cannot ingest that line again and keep the control character tucked into the data (again, if those control characters belong there... I've seen plenty of cases where they would be).
If the newline character "\n" (or any control character for that matter) is NOT desired, then it's just a matter of doing as you commented in Martin's answer. Just clean the data either before you query it ("update") or during query (as you commented with "select/replace") or after you've copied the data out.
I've used "file cleaner" applications in the past to "clean" a file of unwanted characters (this can be an issue with long-lived data or data that has traversed various platforms or has been touched by humans!!! Yuck!).
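A sketch of the "clean it during the query" approach, reusing the placeholders from the question; CHAR(13) and CHAR(10) are the CR and LF control characters, and the nested REPLACE calls strip both from the exported column:
bcp "select replace(replace([col_name], char(13), ''), char(10), '') from [table_name] where [condition]" queryout test.dat -U[username] -P[password] -S[serverip.port] -c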
I assume that your text includes the actual \n control character rather than simply the characters \ and n next to each other?
Where this exists, your options are either to use native mode or to change the row terminator to something other than \n so that the correct pattern is recognised.
I'd suggest using native mode and testing whether that re-imports the data correctly with the \n in place.
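A minimal sketch of that native-mode round trip (-n replaces -c), again with the placeholders from the question; the destination table in the second command is a hypothetical example:
bcp "select [col_name] from [table_name] where [condition]" queryout test.dat -U[username] -P[password] -S[serverip.port] -n
bcp [destination_table] in test.dat -U[username] -P[password] -S[serverip.port] -n
Native format does not rely on terminator characters, so embedded \n characters survive the export and re-import unchanged.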

BCP unexpected EOF

I am trying to do a bulk copy from a file into a table in SQL Server 2008. I only want to copy part of the file (in the BCP command example below, only line number 3 should be copied).
Sometimes I receive the "Unexpected EOF encountered" error message.
The command is:
BCP tblBulkCopyTest in d:\bc_file.txt -F3 -L3 -c -t, -S(local) -Utest -Ptest
When the file looks like the following, the copy works fine (line number 3 is inserted into my table):
44261233,207,0,0,2168
44261234,207,0,0,2570
44261235,207,0,0,2572
When the file looks like the following, I receive "Unexpected EOF encountered" error message:
Test
44261233,207,0,0,2168
44261234,207,0,0,2570
44261235,207,0,0,2572
It seems like when the file starts with something not in the correct format, the BCP command fails, even though it is not the line I want to copy (in the bcp command I have specified line number 3).
When the file looks like the following, the copy works fine:
44261233,207,0,0,2168
44261234,207,0,0,2570
44261235,207,0,0,2572
Test
So it is only when the file has some incorrect data "before" the lines I want to copy that I receive the error.
Any suggestions on how to make the BCP command ignore the lines that are not in question?
The way I solve that kind of error every day:
1) Create a table with a single column: tblRawBulk(RawRow varchar(1000)).
2) Insert all of the rows from the file into it.
3) Remove the unnecessary unformatted rows (e.g. 'Test' in your example) with a WHERE clause. Or even remove all unnecessary rows (except the rows that you need to load) to simplify step 5.
4) Export this table with BCP to some work folder.
5) Load this new file instead of the original one.
It's not exactly what you want, but it may help if you write a corresponding stored procedure that can do all of this automatically.
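A rough sketch of those steps, reusing the file, table, server, and login from the question; the intermediate file name and the LIKE filter are illustrative, and the commands assume the same default database as the original command.
In SQL Server, create the staging table and, after loading, delete the rows that do not look like data:
CREATE TABLE tblRawBulk (RawRow varchar(1000));
DELETE FROM tblRawBulk WHERE RawRow NOT LIKE '%,%,%,%,%';
From the command line, load the raw file (with -c and the default tab field terminator, each whole comma-separated line lands in the single column), export the cleaned rows, and load that file with your original command, adjusting or dropping -F/-L now that the junk lines are gone:
BCP tblRawBulk in d:\bc_file.txt -c -S(local) -Utest -Ptest
BCP "SELECT RawRow FROM tblRawBulk" queryout d:\bc_file_clean.txt -c -S(local) -Utest -Ptest
BCP tblBulkCopyTest in d:\bc_file_clean.txt -c -t, -S(local) -Utest -Ptest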

Bulk insert fixed length file with ragged right lines

I have a fixed length text file, except some of the lines end early, with a carriage return/line feed. I'm using a .fmt file.
Q: How do I tell SQL Server to use an empty string for the fields that are unaccounted for?
I should probably ask my client to pad his text file, but it would be easier to just process it with the lines that are terminated early.
You should write a pre-processor to condition the text file before doing the bulk insert.
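A minimal sketch of such a pre-processor in PowerShell, padding every short line out to the full record width before the bulk insert; the 100-character width and the file names are illustrative:
# Pad each ragged-right line to the fixed record width expected by the .fmt file
Get-Content .\input.txt | ForEach-Object { $_.PadRight(100) } | Set-Content .\padded.txt
The format file can then be used unchanged, since every row now has the expected length.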
