BCP shifts values when importing into a SQL Server table with an identity (ID) primary key

I'm using bcp to import a .csv file into a table in SQL Server like this:
bcp test1 in "./file.csv" -S server_name -U login_id -P password -d database_name -c -t
I get this message/warning:
Unexpected EOF encountered in BCP data-file.
bcp copy in failed
file.csv data:
A, B, C
A, B, C
A, B, C
A, B, C
My tables:
CREATE TABLE test2
(
a VARCHAR(8) PRIMARY KEY,
b VARCHAR(8),
c VARCHAR(8)
);
CREATE TABLE test1
(
ID INT NOT NULL IDENTITY(1,1) PRIMARY KEY,
a VARCHAR(8),
b VARCHAR(8),
c VARCHAR(8),
FOREIGN KEY (a) REFERENCES test2(a)
);
Here is what I get from SELECT * FROM test1:
ID | a | b | c
1  | B | C | A B C
A  | B | C | A B C
Here is what I expected:
ID | a | b | c
1  | A | B | C
2  | A | B | C
3  | A | B | C
4  | A | B | C
I have no issues with test2, which is nearly identical to test1 but without the ID column, so the CSV files are well formatted. Why do I get a shift like this?
EDIT 1
If I add a header to the CSV file, I get this from SELECT * FROM test1:
ID | a | b | c
1  | b | c | A B,C,A,B,C,A,B,C,A,B,C
EDIT 2
I generated a format file to describe my data:
13.0
3
1 SQLCHAR 0 40 "\t" 1 a SQL_Latin1_General_CP1_CI_AS
2 SQLCHAR 0 40 "\t" 2 b SQL_Latin1_General_CP1_CI_AS
3 SQLCHAR 0 40 "\t" 3 c SQL_Latin1_General_CP1_CI_AS
The modified one, trying to "jump" over the ID column:
13.0
4
1 SQLCHAR 0 40 "\t" 2 a SQL_Latin1_General_CP1_CI_AS
2 SQLCHAR 0 40 "\t" 3 b SQL_Latin1_General_CP1_CI_AS
3 SQLCHAR 0 40 "\t" 4 c SQL_Latin1_General_CP1_CI_AS
But I can't manage to make it work.
bcp test1 in "./file.csv" -S server_name -U login_id -P password -d database_name -f file_name -t
SQLState = S1002, NativeError = 0
Error = [Microsoft][ODBC SQL Server Driver]Invalid Descriptor Index
EDIT 3
I found a workaround, but it's still not a good solution. What I did:
Change my test1 table, putting the ID column at the end.
Use sed to append two commas to the end of each line in my CSV, creating a new empty column, then run a plain bcp in. I still want to keep ID as the first column; I just don't want to create an extra view to put ID in front.
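The sed step described above can be sketched like this (a minimal sketch; the file names are placeholders):

```shell
# Sample CSV matching the question's data
printf 'A,B,C\nA,B,C\n' > file.csv
# Append two commas to every line, creating empty trailing field(s)
# for the ID column that now sits at the end of the table
sed 's/$/,,/' file.csv > file_padded.csv
cat file_padded.csv
```

bcp can then map the empty trailing field to the ID column at the end of the table, as described above.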

First edit, due to the inverted meaning of the -E switch.
You are missing the -E switch, which tells bcp that you want to supply values for the identity column. With this switch the identity values are taken from the file and SQL Server's own identity generation is ignored. You should perform DBCC CHECKIDENT('table_name', RESEED) afterwards.
Second edit, due to the user's more specific wishes.
After the edit, @BeGreen specified that he wants to use a format file, so I'll post an example of how to do it. There is no need for the view I saw suggested in another Stack Overflow post: What will be BCP format for inserting a identity column.
The question's examples are inconsistent: the CSV uses "," while the format file uses "\t". I'll use only "," in the examples below.
This is possible, but you have to use the correct approach.
First, generate the format file with this command:
bcp <db_name>.dbo.test1 format nul -c -f "generated_format_file.fmt" -t "," -S "<server>\<instance>" -T
Then take the generated .fmt file; in this case it looks as follows (I used SQL Server 2005, as I have a quick testing instance there; change the version number on the first line to fit your needs, e.g. 13.0):
8.0
4
1 SQLCHAR 0 12 "," 1 ID ""
2 SQLCHAR 0 8 "," 2 a SQL_Latin1_General_CP1_CI_AS
3 SQLCHAR 0 8 "," 3 b SQL_Latin1_General_CP1_CI_AS
4 SQLCHAR 0 8 "\r\n" 4 c SQL_Latin1_General_CP1_CI_AS
The .csv file looks like this (I added ID values for testing, deliberately different from the identity values that will be inserted into the table):
10,A1,B1,C1
20,A2,B2,C2
30,A3,B3,C3
40,A4,B4,C4
Then the correct command for importing (note: using a trusted connection, -T, instead of -U -P, to make the line shorter):
bcp <db_name>.<schema>.test1 IN "import_data.bcp" -f "generated_format_file.fmt" -t "," -S "<server>\<instance>" -T
Starting copy...
4 rows copied.
Network packet size (bytes): 4096
Clock Time (ms.) Total : 218 Average : (18.35 rows per sec.)
Data is imported as (test1):
ID a b c
1 A1 B1 C1
2 A2 B2 C2
3 A3 B3 C3
4 A4 B4 C4
To check the identity value currently on test1:
select IDENT_CURRENT('[database_name].[dbo].[test1]')
Result:
(No column name)
4

Related

How to import a CSV file into Sybase ASE with fewer columns than the table has, using a format file?

I'm using BCP to load data into a Sybase ASE table under UNIX.
I have a temp.csv file with 4 columns:
name | id | attr1 | attr2
FIERA|20138||
SECOR|73328||
WELLINGTON|92413||
The template table, with two extra columns, was defined like below:
create table template
(name varchar(10),
id int,
attr1 varchar(5) default '',
attr2 varchar(5) default '',
creation_time datetime null,
active_flag char(1) null)
bcp.fmt format file:
10.0
7
1 SYBCHAR 0 10 "|" 1 name
2 SYBINT4 0 4 "|" 2 id
3 SYBCHAR 0 5 "|" 3 attr1
4 SYBCHAR 0 5 "|" 4 attr2
5 SYBDATETIME 0 8 "|" 0 creation_time
6 SYBCHAR 0 1 "|" 0 active_flag
7 SYBCHAR 0 10 "\r\n" 0 end
My purpose is to import all values, including blanks, from the temp.csv file into the template table, leaving the last two fields, creation_time and active_flag, as null.
I use command:
bcp client..template in temp.csv -F2 -f bcp.fmt -U -P -S
However, I always got the following error:
Unexpected EOF encountered in BCP data-file.
bcp copy in partially failed
I double-checked my temp.csv file; every row terminator is \r\n, as listed in the format file, so why do I still get the unexpected EOF error?
I've struggled with this many times and all attempts failed. Could anybody help me out? Thanks.
===================update on Feb.06=================
Thank you James, I updated the format file as you indicated:
10.0
6
1 SYBCHAR 0 0 "|" 5 creation_time
2 SYBCHAR 0 0 "|" 6 active_flag
3 SYBCHAR 0 10 "|" 1 name
4 SYBCHAR 0 4 "|" 2 id
5 SYBCHAR 0 5 "|" 3 attr1
6 SYBCHAR 0 5 "\r\n" 4 attr2
then I was prompted with "Incorrect host-column number found in bcp format file"
===========================================================================
============SOLUTION IS HERE=============
first solution:
10.0
4
1 SYBCHAR 0 10 "|" 1 name
2 SYBCHAR 0 4 "|" 2 id
3 SYBCHAR 0 5 "|" 3 attr1
4 SYBCHAR 0 5 "\r\n" 4 attr2
second solution:
10.0
6
1 SYBCHAR 0 10 "|" 1 name
2 SYBCHAR 0 4 "|" 2 id
3 SYBCHAR 0 5 "|" 3 attr1
4 SYBCHAR 0 5 "\r\n" 4 attr2
5 SYBCHAR 0 0 "" 5 active_flag
6 SYBCHAR 0 0 "" 6 creation_time
Both work perfectly
There are a few problems with your format file.
According to the Sybase documentation, you should be using SYBCHAR exclusively:
Host file datatype
The host file datatype refers to the storage format of the field in
the host data file, not the datatype of the database table column.
The DBMS knows the datatype of its columns; it does not know how the input file is encoded.
Remember that the first element in the lines describing a column (line 3 onward) indicates the column's position in the data file. Your data file has no columns 5-7; I suspect that's the field provoking the error message.
Also, AFAIK 0 is not a valid column ID in the target table. If you want to indicate NULL for a particular column, say it starts at the beginning and has zero length:
1 SYBCHAR 0 0 "|" 7 active_flag
Finally, there's no need to account for the row terminator in the format file. You do that on the bcp command line with the -r option. If you're using Windows, IIRC that would be
bcp client..template in temp.csv -F2 -f bcp.fmt -r \r\n -U -P -S
On Linux, of course, you'd have to quote or escape the backslashes.
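For instance, in bash you can use ANSI-C quoting to hand bcp a literal CR+LF; a sketch (the bcp line is shown commented out, with placeholder credentials):

```shell
# $'\r\n' expands to a real carriage return + line feed in bash;
# an unquoted \r\n would reach bcp as four literal characters.
ROW_TERM=$'\r\n'
# bcp client..template in temp.csv -F2 -f bcp.fmt -r "$ROW_TERM" -U user -P pass -S server
printf '%s' "$ROW_TERM" | od -An -tx1
```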
Edit: for clarity, here's what I think your format file should look like:
10.0
6
1 SYBCHAR 0 0 "|" 5 creation_time
1 SYBCHAR 0 0 "|" 6 active_flag
1 SYBCHAR 0 10 "|" 1 name
2 SYBCHAR 0 4 "|" 2 id
3 SYBCHAR 0 5 "|" 3 attr1
4 SYBCHAR 0 5 "" 4 attr2
If that doesn't work, you'll have to find someone who, um, knows the answer. I don't have a system handy to test on.
Nothing prevents any part of the data file from being mapped to many columns. In field 1 of your format file, though, you mention data file columns 5 & 6, but your data file has only 4 columns. I think that's what the error message is telling you.
Do you mean every datatype I put in the format file should be SYBCHAR?
Yes. The format file can describe text or binary files. Your file is text, so all your data (in the file) are SYBCHAR.

Importing utf-8 encoded data from csv to SQL Server using bulk insert

I am trying to import raw data in .csv format into my SQL Server table, [raw].[sub_brand_channel_mapping].
The last column, ImportFileId, is generated by my Python code. I first set every row of that column to the generated default value. Then I use a bulk insert to load my UTF-8 CSV data into the table. However, during the process a lot of special characters get changed. I am using a format file like this:
10.0
8
1 SQLCHAR 0 100 "\t" 1 sub_brand_id ""
2 SQLCHAR 0 1024 "\t" 2 sub_brand_name SQL_Latin1_General_CP1_CI_AS
3 SQLCHAR 0 100 "\t" 3 channel_country_id ""
4 SQLCHAR 0 1024 "\t" 4 channel_id ""
5 SQLCHAR 0 1024 "\t" 5 channel_name SQL_Latin1_General_CP1_CI_AS
6 SQLCHAR 0 256 "\t" 6 status ""
7 SQLCHAR 0 256 "\t" 7 eff_start_date ""
8 SQLCHAR 0 256 "\r\n" 8 eff_end_date ""
My bulk insert command looks like this:
bcp "{table}" in "{file}" {connect_string} -f {format_file} -F {first_row} -b 10000000 -e {error_file} -q -m {max_errors}
My CSV files use "\t" as the delimiter. I need to import the names exactly, without any changes. What should I do?
P.S. I did try converting my UTF-8 encoded CSV to UTF-16-LE and then using "-w" in my bcp command, but it gave a lot of errors; in short, it didn't work. Please advise me on how to do this.
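As a sketch, the UTF-8 to UTF-16LE conversion itself can be done with iconv before calling bcp -w (file names here are placeholders, and whether -w then accepts the data depends on your table's collation and column types). Note also that recent bcp versions (SQL Server 2016+ tooling) accept -C 65001 to declare the file as UTF-8 directly, which may avoid the conversion entirely:

```shell
# Write a small UTF-8 sample (the octal escapes spell "café")
printf 'caf\303\251\n' > in.csv
# Convert UTF-8 -> UTF-16LE, the encoding bcp -w expects
iconv -f UTF-8 -t UTF-16LE in.csv > out.csv
# Inspect the bytes: each ASCII char gains a 00 byte, é becomes e9 00
od -An -tx1 out.csv
# then, e.g.: bcp "[raw].[sub_brand_channel_mapping]" in out.csv -w -S server -T
```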

Bulk insert fmt text qualifier

I have a BULK INSERT task that takes data from a CSV and imports it into a table. The only problem is that one of the columns can contain a comma, so the import doesn't work as expected.
I've tried to fix this by creating a format (fmt) file, the contents of which are detailed below:
9.0
6
1 SQLCHAR 0 50 "," 1 "Identifier" Latin1_General_CI_AS
2 SQLCHAR 0 50 "," 2 "Name" Latin1_General_CI_AS
3 SQLCHAR 0 50 "," 3 "Date of Birth" Latin1_General_CI_AS
4 SQLCHAR 0 50 "," 4 "Admission" Latin1_General_CI_AS
5 SQLCHAR 0 50 "," 5 "Code" Latin1_General_CI_AS
6 SQLCHAR 0 50 "\r\n" 6 "Length" Latin1_General_CI_AS
The column causing me pain is column 2 "Name".
I've tried a couple of things to identify the column as being text qualified and containing a comma but I'm not getting the result I want.
If I change to the following:
"\"," - I get something like this -- "Richardson, Mat
This isn't correct, so I tried this instead, as suggested on some other forums / sites:
"\",\""
This doesn't work at all and actually gives me the error:
Cannot obtain the required interface ("IID_IColumnsInfo") from OLE DB provider "BULK" for linked server "(null)".Bulk load: An unexpected end of file was encountered in the data file.
I've tried a few other combinations and just can't get this right. Any help or guidance would be massively appreciated.
Not really answering your question regarding format files, but here's a possible get-you-working solution.
Format files are incomprehensible arcana from the 1980s to me, and BULK INSERT is uber fussy and unforgiving. Therefore I tend to clean data with a few lines of PowerShell instead. Here's an example I used recently to convert a CSV to pipe-separated form, remove some random quoting in the output, and allow for commas in the records:
Import-Csv -Path $dirtyCsv |
ConvertTo-CSV -NoType -Delimiter '|' |
%{ $_.Replace('"','') } |
Out-File $cleanCsv
You get the idea...
This then simply imported:
BULK INSERT SomeTable FROM 'clean.csv' WITH ( FIRSTROW = 2, FIELDTERMINATOR = '|', ROWTERMINATOR = '\n' )
Hope this helps.
This is occurring because you are telling the bulk insert that the field terminator for the column before Name is a plain comma, and that the field terminator for the Name column itself is a double quote followed by a comma. You need to change the field terminator for the column before Name to comma-then-double-quote if you want to take care of the remaining double quote.
I believe the field terminator for the column before Name should be ",\"", where:
, = a comma
\" = an escaped double quote
and the whole thing is enclosed in another pair of double quotes, which delimit the terminator value.
Then flip the comma and the double quote for the field terminator of your Name column.
So it should look like this:
9.0
6
1 SQLCHAR 0 50 ",\"" 1 "Identifier" Latin1_General_CI_AS
2 SQLCHAR 0 50 "\"," 2 "Name" Latin1_General_CI_AS
3 SQLCHAR 0 50 "," 3 "Date of Birth" Latin1_General_CI_AS
4 SQLCHAR 0 50 "," 4 "Admission" Latin1_General_CI_AS
5 SQLCHAR 0 50 "," 5 "Code" Latin1_General_CI_AS
6 SQLCHAR 0 50 "\r\n" 6 "Length"

Errors with bcp and bulk insert

I have a .dat file which I have to upload to my SQL Server 2012 database. The table is as follows:
TAB_KEY (bigint, not null) - primary key
SESSION_KEY (varchar(32), null)
HIT_KEY (varchar(32), null)
NAME (nvarchar(256), null)
VALUE (nvarchar(1024), null)
SESSION_TIMESTAMP (datetime, null)
The data in the .dat file is like this:
NOTE: when attempting to import this data via BCP I get an error:
Column 3: String data, right truncation
BTW, column 3 is the NAME column.
Sample data for column 3 (the NAME field):
_2__Kart_Ücreti_Yans_t_l_rken_180_Gün_Aktiflik_Kontrolü_Yap_lmal_d_r__Kart_ücreti_yans_rken__kart_n_en_son_hangi_tarihte_al__veri__nakit_çekim_veya__Axess_kartlarda__chip_para_harcamas__yap_ld____kontrol_edilecektir__E_er_günün_tarihi_ve_bu_son_aktiflik_t
Format file:
9.0
5
1 SQLCHAR 0 32 "\t" 2 SESSION_KEY RAW
2 SQLCHAR 0 32 "\t" 3 HIT_KEY RAW
3 SQLCHAR 0 512 "\t" 4 NAME RAW
4 SQLCHAR 0 1024 "\t" 5 VALUE RAW
5 SQLCHAR 0 24 "\r\n" 6 SESSION_TIMESTAMP ""
Error message:
Starting copy... SQLState = 22001, NativeError = 0 Error = [Microsoft][SQL Server Native Client 11.0]String data, right truncation SQLState = 22001,
My BCP command is:
bcp TLWEB.dbo.TLWEB_URLFIELD_8X in BulkUrlField8x.20141209_000000_20141209_235959.cxconnect_2_1418121388.1418121852_10032_1.dat -F 2 -b 250000 -m 50 -a 32000 -U username -P xxxxx -S ServerName\InstanceName,Port_Number -f UrlField8x.fmt

How can I insert newlines in the correct places in isql output in a bash script?

I am querying a SQL Server database using isql in a bash script. For example:
RESULT="$(${ISQL} -Q -U ${DB_USER} -S ${DB_SERVER} -D ${DB_NAME} << __END
SELECT order_id % 5 as mod5, count(*) as count
FROM orders
GROUP BY order_id % 5
ORDER BY order_id % 5
GO
__END
)"
My output:
mod5 count -------------------- ----------- 0 17640 1 17640 2 17638 3 17637 4 17638
How can I get the output to have new lines in it? For example:
mod5 count
-------------------- -----------
0 18118
1 18118
2 18116
3 18116
4 18117
I could do something hacky like also selecting '!!!' and then using sed to replace '!!!' with a newline, but I still don't know what to do about the first two lines (headers and dashes). In this case I know I'll have two fields of output, so I could count the tokens somehow and insert newlines, but what if I don't know how many columns a query like "select * from orders" will return?
I can think of various solutions, but they all seem incredibly hacky. Is there a standard way to deal with output like this?
No idea about isql or SQL Server, but for your problem, this works:
your_command_that_gives_output | xargs -n2
with your data:
kent$ echo "mod5 count -------------------- ----------- 0 17640 1 17640 2 17638 3 17637 4 17638"|xargs -n2
mod5 count
-------------------- -----------
0 17640
1 17640
2 17638
3 17637
4 17638
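When the column count isn't known in advance, one option (a sketch, assuming the dashed separator row survives in the flattened output) is to derive the count from the dash tokens before wrapping with xargs:

```shell
# Flattened isql output, as in the question (shortened)
ROW='mod5 count -------------------- ----------- 0 17640 1 17640 2 17638'
# Count the tokens that consist only of dashes: one per column
NCOLS=$(printf '%s\n' "$ROW" | tr -s ' ' '\n' | grep -c '^--*$')
# Re-wrap the token stream at that many tokens per line
printf '%s\n' "$ROW" | xargs -n"$NCOLS"
```

This keeps working for a "select *" query, since the separator row always has one dash run per column.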
