I am trying to do a bulk copy from a file into a table in SQL Server 2008. I only want to copy part of the file (in the BCP command example below, it is only line number 3 I want to copy).
Sometimes I receive the "Unexpected EOF encountered" error message.
The command is:
BCP tblBulkCopyTest in d:\bc_file.txt -F3 -L3 -c -t, -S(local) -Utest -Ptest
When the file looks like the following, the copy works fine (line number 3 is inserted into my table):
44261233,207,0,0,2168
44261234,207,0,0,2570
44261235,207,0,0,2572
When the file looks like the following, I receive "Unexpected EOF encountered" error message:
Test
44261233,207,0,0,2168
44261234,207,0,0,2570
44261235,207,0,0,2572
It seems that when the file starts with something in the wrong format, the BCP command fails, even though the bad line is not one of the lines I want to copy (in the BCP command I have specified line number 3).
When the file looks like the following, the copy works fine:
44261233,207,0,0,2168
44261234,207,0,0,2570
44261235,207,0,0,2572
Test
So it is only when the file has some incorrect data "before" the lines I want to copy that I receive the error.
Any suggestions on how to make the BCP command ignore the lines that are not in question?
The way I solve this kind of error every day:
1) Create a table with a single column: tblRawBulk(RawRow varchar(1000)).
2) BCP all of the file's rows into it.
3) Delete the unformatted rows (e.g. 'Test' in your example) with a suitable WHERE clause. Or even delete all rows except the ones you need to load, to simplify step 5.
4) Export this table with BCP to a work folder.
5) Load this new file into your target table.
It's not exactly what you want, but it may help: you could write a stored procedure that does all of these things automatically, as in the sketch below.
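Not the exact procedure, just a minimal sketch of steps 1-5 reusing the file name and login from the question (the staging table, the WHERE filter, and the work-file name are illustrative assumptions):
1) CREATE TABLE tblRawBulk (RawRow varchar(1000));
2) BCP tblRawBulk in d:\bc_file.txt -c -S(local) -Utest -Ptest
   (with a single column, each whole line lands in RawRow)
3) DELETE FROM tblRawBulk WHERE RawRow NOT LIKE '[0-9]%';
   (keep only lines that start with a digit; adjust the filter to your data)
4) BCP tblRawBulk out d:\bc_file_clean.txt -c -S(local) -Utest -Ptest
5) BCP tblBulkCopyTest in d:\bc_file_clean.txt -c -t, -S(local) -Utest -Ptest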
I have some sensitive information that I need to import into SQL Server that is proving to be a challenge. I'm not sure what the original database that housed this information was, but I do know it is provided to us in a Unix fixed-length text file with an LF row terminator. I have two files: a small file that covers a month's worth of data, and a much larger file that covers five years' worth of data. I have created a BCP format file and command that successfully imports and maps the data to my SQL Server table.
The five-year data is supposedly in the same format, so I've used the same command and format file on that text file. It starts processing some records, but somewhere in the processing (after several thousand records) it throws "Unexpected EOF encountered", and I can see in the database that some of the rows are mapped correctly according to the fixed lengths, but then something goes horribly wrong and parts of the data end up in columns they most definitely do not belong in. Is there a character that would cause BCP to mess up and terminate early?
BCP Command: BCP DBTemp.dbo.svc_data_temp in C:\Test\data2.txt -f C:\test\txt2.fmt -T -r "0x0A" -S "stageag,90000" -e log.rtf
Again, the format file and command work perfectly for the smaller data set, but something in the five-year dataset is screwing up BCP.
Thanks in advance for the replies!
So I found the offending characters in my fixed-width file. Whoever pulled the data originally (I don't have access to the source) escaped (or failed to escape correctly) the double quotes in some of the text, injecting extra spaces that broke the fixed-width layout we were supposed to be following. After correcting the double quotes by hex-editing the file, BCP was able to process all records using the format file without issue. I had used the -F and -L flags to examine certain rows of the data and narrow it down to where I could visually compare the rows that were OK with the rows where the problems started, which led me to discover the double-quotes issue. Hope this helps somebody else with a similar issue!
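For anyone needing to do the same kind of narrowing: -F (first row) and -L (last row) load just a slice of the file, so you can binary-search your way to the bad region. A sketch based on the command from the question (the row numbers are invented):
BCP DBTemp.dbo.svc_data_temp in C:\Test\data2.txt -f C:\test\txt2.fmt -T -r "0x0A" -S "stageag,90000" -F 5000 -L 6000 -e log.rtf
If that slice loads cleanly, move the window; if it fails, halve it.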
My BCP command looks like this:
BCP azuredatabase.dbo.rawdata IN "clientPathToCSVFile" -S servername -U user@servername -P pw -c -t,-r\n
My CSV file is in {cr}{lf} format.
My CSV looks like this:
125180896918,20,9,57.28,2020-01-04 23:02:21,282992,1327,4,2850280,49552
125180896919,20,10,57.82,2020-01-04 23:02:21,282992,1298,4,2850280,48881
125180896920,16,11,58.20,2020-01-04 23:02:21,282992,1065,4,2850280,48612
125180896921,20,12,69.10,2020-01-04 23:02:21,282992,515,4,2850280,10032
125180896922,20,13,70.47,2020-01-04 23:02:21,282992,1280,4,2850280,48766
125180896923,1,1,105.04,2020-01-04 23:02:21,,1296,4,2969398,49161
As you can see there are also empty fields.
My output looks like this:
Starting copy...
0 rows copied.
Network packet size (bytes): 4096
Clock Time (ms.) Total : 547
So how do I correctly set up my BCP command?
You stated that your data is in CRLF format (that means \r\n). But your bcp command is told to look for a line terminator of \n (using the -r option).
I would have expected your actual CRLF line terminator ("\r\n") to be split in half, with the \r being included in your last column and the \n being found as the line terminator, but it looks like BCP loaded no rows because it found no \n in your file.
I have not worked with Azure/BCP much, so maybe someone else knows how BCP for Azure would handle this, but the SQL Server version of BCP would still find your \n and then load the \r into your last column.
Either that, or your line terminator is not what you think it is. Have you viewed the file with a text editor? (Not Notepad, not WordPad... something that will show hidden characters like line terminators and tabs.)
Usually, when BCP loads no rows (and there are rows in the file to load), it could be a mixup with line terminators.
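If the file really is CRLF, the likely fix is simply to tell bcp so. bcp accepts hexadecimal notation for terminators (the earlier question on this page uses -r "0x0A" the same way), so a sketch of the corrected command, keeping everything else from the question, would be:
BCP azuredatabase.dbo.rawdata IN "clientPathToCSVFile" -S servername -U user@servername -P pw -c -t, -r "0x0d0a"
Here 0x0d0a is the hex spelling of \r\n.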
(Using postgres 9.4beta2)
I have a dump that I want to import. I have done this with the 'psql' command, as elsewhere it is noted that this is required when using COPY FROM stdin:
psql publishing < publishing.dump.20150211160001
I get this syntax error:
ERROR: syntax error at or near "fbc61bc4"
LINE 1: fbc61bc4-2875-4a3a-8dec-91c8d8b60bcc root
The offending line in the dump file is the one after the COPY statement, here are both those lines together:
COPY content_fragment (id, content, name, content_item_id, entity_version) FROM stdin;
fbc61bc4-2875-4a3a-8dec-91c8dcontent Content for root content fbc61bc4-2875-4a3a-8dec-91c8d8b60bcc 0
The items in the data appear to be tab-separated. Given that the error message says at or near "fbc61bc4" but the full string is "fbc61bc4-2875-4a3a-8dec-91c8dcontent", I am wondering: does psql not like the '-' character?
This kind of error happens when the COPY itself fails because the table doesn't exist, or one of the mentioned columns doesn't, or the user lacks permission to write to it, etc.
Because the COPY failed, the SQL interpreter continues to the next line and interprets it as if it were an SQL statement, although it's actually data meant to be fed to COPY. Generally this leads to a syntax error, preceded by the error telling you why the COPY failed (and often followed by tons of errors if there are many lines of data).
See another question: psql invalid command \N while restore sql, which shares the same root cause and has some useful comments.
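One practical tip: make psql stop at the first error instead of ploughing on through the data lines, so the real cause of the failed COPY is the last thing printed. ON_ERROR_STOP is a standard psql variable:
psql -v ON_ERROR_STOP=1 publishing < publishing.dump.20150211160001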
I have a file with the last line like this:
[End of File]
Can I read all the lines before it and skip this one without getting an error?
You would be better off creating a script to remove that line.
The only way for that line not to cause issues would be to set the bcp in batch size to 1 (-b1). Depending on how much data you are working with, a batch size of 1 will take a long time to finish.
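If you take the script route, a one-line filter is enough on Windows; findstr /v prints every line that does not contain the given literal string (the file names here are placeholders, since the question doesn't name the file):
findstr /v /c:"[End of File]" d:\data.txt > d:\data_clean.txt
Then run your usual bcp in command against d:\data_clean.txt.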
I'm working on a project where I have to parse a bunch of .csv files, all of different formats and containing different kinds of data through some C++ functions. After that I extract data from the files and create a .sql file that can be imported in psql to insert the data into a PostgreSQL database at a later stage.
But I am not able to figure out the correct syntax for the .sql file. Here is a sample table and a sample .sql file reproducing the same errors I am getting:
Table Creation Code:
CREATE TABLE "Sample_Table"
(
"Col_ID" integer NOT NULL,
"Col_Message" character varying(50),
CONSTRAINT "Sample_Table_pkey" PRIMARY KEY ("Col_ID" )
);
insertion.sql (after the copy line, fields separated by a single tab character)
copy Sample_Table (Col_ID, Col_Message) from stdin;
1 This is Spaaarta
2 Why So Serious
3 Baazinga
\.
Now if I execute the above .sql file, I get the following error:
ERROR: syntax error at or near "1"
LINE 2: 1 This is Spaaarta
^
********** Error **********
In case it helps: I'm running a PostgreSQL 9.1 release, and all the above queries were executed through the pgAdmin III software.
PgAdmin doesn't support executing COPY commands in the same way that psql does (or at least, it didn't the last time I tried it with version 1.14). Use psql to execute the script, or use INSERT statements.
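For example, run the script from a shell like this (the database name is a placeholder):
psql -d mydb -f insertion.sql
or replace the COPY block with equivalent INSERT statements, which pgAdmin executes fine:
INSERT INTO "Sample_Table" ("Col_ID", "Col_Message") VALUES (1, 'This is Spaaarta');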
Three things to check:
Is there actually exactly one tab character between the columns? Spaces are a no-go.
Are there more error messages? I'm missing at least one. (See below)
When you force case-sensitive table and column names, you have to do so consistently. Therefore you must write this:
copy "Sample_Table" ("Col_ID", "Col_Message") from stdin;
Otherwise you will get these errors:
psql:x.sql:1: ERROR: relation "sample_table" does not exist
psql:x.sql:5: invalid command \.
psql:x.sql:5: ERROR: syntax error at or near "1"
LINE 1: 1 This is Spaaarta
^
With these things in place I can use your example data successfully.
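For reference, the corrected insertion.sql in full (the whitespace between the columns must be a single literal tab):
copy "Sample_Table" ("Col_ID", "Col_Message") from stdin;
1	This is Spaaarta
2	Why So Serious
3	Baazinga
\.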
EDIT (the question has changed): The questioner now gets
ERROR: invalid input syntax for integer: "1 'This is Spaaarta'"
So something about the 1 is not OK.
My guess is that this is an encoding problem. Windows with its UTF-16 stuff might be the culprit here.
Debugging this kind of problem over the web is not easy, because too many semi-intelligent programs sit in the chain, and most of them like to adjust "a few" things.
But first check a few things in psql:
\encoding
show client_encoding;
show server_encoding;
According to the pastebin data these should be the same and one of "SQL_ASCII", "LATIN1" or "UTF-8".
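If the client setting turns out to be the odd one out, you can override it for the session before feeding in the script; both forms are standard PostgreSQL/psql:
SET client_encoding = 'UTF8';
or, from within psql:
\encoding UTF8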
If they already are, or if adjusting them does not help: Unix/Linux/Cygwin has a hexdump program; run hexdump -C x.sql and post its output to pastebin. DO NOT USE the hex dump from any Windows editor like UltraEdit - they have fooled me several times. When transferring the file to Linux, be sure to use a binary transfer.