SQL Server bulk insert via file separated with special characters - sql-server

I have a file to import into a SQL Server table. I run the import with the bcp command from the command line, invoked from C#. When I pass a comma (,) as the separator in the format file, it works fine, but when I use a special character as the separator, it fails with the error below.
XML parsing: line 2, character 0, incorrect document syntax
Note: for some reason Stack Overflow does not display my special character. Please copy the format file into a text editor such as Notepad++ to see it.
My format file for the special character is as follows.
12.0
20
1 SQLCHAR 0 0 "" 2 Column1 SQL_Latin1_General_CP1_CI_AS
2 SQLCHAR 0 0 "" 3 Column2 SQL_Latin1_General_CP1_CI_AS
3 SQLCHAR 0 0 "" 0 XyzColumnToBypass ""
4 SQLCHAR 0 0 "" 0 XyzColumnToBypass ""
5 SQLCHAR 0 0 "" 0 XyzColumnToBypass ""
6 SQLCHAR 0 0 "" 0 XyzColumnToBypass ""
7 SQLCHAR 0 0 "" 0 XyzColumnToBypass ""
8 SQLCHAR 0 0 "" 0 XyzColumnToBypass ""
9 SQLCHAR 0 0 "" 0 XyzColumnToBypass ""
10 SQLCHAR 0 0 "" 0 XyzColumnToBypass ""
11 SQLCHAR 0 0 "" 0 XyzColumnToBypass ""
12 SQLCHAR 0 0 "" 0 XyzColumnToBypass ""
13 SQLCHAR 0 0 "" 7 Column4 SQL_Latin1_General_CP1_CI_AS
14 SQLCHAR 0 0 "" 6 Column5 SQL_Latin1_General_CP1_CI_AS
15 SQLCHAR 0 0 "" 5 Column6 SQL_Latin1_General_CP1_CI_AS
16 SQLCHAR 0 0 "" 0 XyzColumnToBypass ""
17 SQLCHAR 0 0 "" 0 XyzColumnToBypass ""
18 SQLCHAR 0 0 "" 0 XyzColumnToBypass ""
19 SQLCHAR 0 0 "" 0 XyzColumnToBypass ""
20 SQLCHAR 0 0 "\r\n" 0 XyzColumnToBypass ""
I also tried from SQL Server using the query below, but got the same error.
BULK INSERT mySqlServerTable
FROM '\\machinexxx\Shared\BCP\TextfileToImport_SpecialChar.dat'
WITH(FORMATFILE = '\\machinexxx\Shared\BCP\formatfile.fmt')
I don't know why it treats my non-XML format file as an XML format file, given that the format file's encoding is ASCII.
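A character that a web page cannot display is usually outside 7-bit ASCII, so the format file may not really be plain ASCII on disk. The sketch below is only a diagnostic, not a confirmed fix: it reports any byte-order mark and any non-ASCII bytes in the file's raw contents. The helper name inspect_format_bytes is made up for illustration.

```python
import codecs

# Known byte-order marks, longest first so UTF-32 is not mistaken for UTF-16.
BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32 LE"),
    (codecs.BOM_UTF32_BE, "UTF-32 BE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16 LE"),
    (codecs.BOM_UTF16_BE, "UTF-16 BE"),
]


def inspect_format_bytes(data: bytes) -> dict:
    """Report a leading BOM and the offsets of any non-ASCII bytes."""
    bom = next((name for mark, name in BOMS if data.startswith(mark)), None)
    non_ascii = [(i, hex(b)) for i, b in enumerate(data) if b > 0x7F]
    return {"bom": bom, "non_ascii": non_ascii}


# Hypothetical usage with the path from the question:
# with open(r"\\machinexxx\Shared\BCP\formatfile.fmt", "rb") as fh:
#     print(inspect_format_bytes(fh.read()))
```

If the report shows a BOM or multi-byte sequences where the terminator should be, the file was saved in some Unicode encoding rather than ASCII, which is worth ruling out before looking elsewhere.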

Related

Unicode/Collation Issue in Openrowset SQL Server

My CSV has text like this:
Côté fenêtres,
carré
I'm trying to open this CSV file using OPENROWSET in SQL Server, as below:
select * from openrowset(BULK 'C:\Import_Orders\Files\PO.csv',
FORMATFILE = 'C:\Import_Orders\Format\Cust_441211.fmt.txt') as PO
But the result is like this:
C+¦t+¬ fen+¬tres,
Carr+¬
How can I tackle this issue? Let me know if I need to add anything more to this question.
SQL version:
Microsoft SQL Server 2017 (RTM-CU29-GDR) (KB5014553) - 14.0.3445.2 (X64)
This is the format file:
11.0
8
1 SQLCHAR 0 250 "|" 1 PARTNO ""
2 SQLCHAR 0 250 "|" 2 CODE ""
3 SQLCHAR 0 250 "|" 3 PRICEKG ""
4 SQLCHAR 0 250 "|" 4 FOOTKG ""
5 SQLCHAR 0 250 "|" 5 LENGTH ""
6 SQLCHAR 0 250 "|" 6 QTY ""
7 SQLCHAR 0 250 "|" 7 COLOR ""
8 SQLCHAR 0 250 "\r\n" 8 TOTKG ""
(1) You can try adding the parameter CODEPAGE = '65001', which selects the UTF-8 code page.
(2) You may try the SQLNCHAR data type instead of SQLCHAR in the format file. Note, however, that for a text file you should specify SQLCHAR for all fields unless the file is Unicode in UTF-16 encoding, in which case you should use SQLNCHAR.
SELECT * FROM openrowset(BULK 'C:\Import_Orders\Files\PO.csv',
FORMATFILE = 'C:\Import_Orders\Format\Cust_441211.fmt.txt',
CODEPAGE = '65001') as PO;
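The garbled output in the question is consistent with UTF-8 bytes being read through a single-byte OEM code page. The sketch below reproduces the effect outside SQL Server, assuming PO.csv is UTF-8 and using code page 437 as a stand-in for the OEM code page. Both are assumptions, not facts from the question.

```python
original = "Côté fenêtres"

# Encode as UTF-8 (assumed encoding of PO.csv), then decode as code page 437
# (an assumed OEM code page) to mimic reading the file without CODEPAGE.
garbled = original.encode("utf-8").decode("cp437")
print(garbled)

# Reversing the wrong decode recovers the text, so no data was lost in the file.
restored = garbled.encode("cp437").decode("utf-8")
print(restored)
```

Each accented letter turns into two characters because UTF-8 stores it as two bytes. The "+" and "¦" seen in the question are probably a further substitution of the box-drawing characters that this sketch prints.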

bulk insert (SQL) format file last line

I have the following csv I wish to import into my db
"LE";"SOURCE";"VAR_SCTARGET_NAME"
"B";"A/K";"A/K"
"A";"A/B";"A/B"
"A";"A/B";"A/C"
I arranged the following format file
10.0
3
1 SQLCHAR 0 0 "\";\"" 1 A ""
2 SQLCHAR 0 0 "\";\"" 2 B ""
3 SQLCHAR 0 0 "\"\r\n\"" 3 AA ""
This works fine except for the last line. The output in my database is the following:
LE SOURCE VAR_SCTARGET_NAME
B A/K A/K
A A/B A/B
A A/B A/C"
How can I remove the quote on the last row? I'm working on SQL Server, if that helps.
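The third terminator in the format file is a quote, a line break, and the opening quote of the next row. The last row has no next row, so presumably the terminator never matches there and the closing quote stays in the field. One way around it is to strip the quotes before the import. The sketch below does that with Python's csv module, assuming no field contains a semicolon or a line break. The function name strip_quotes is made up for illustration.

```python
import csv
import io


def strip_quotes(src, dst):
    """Copy a ';'-separated, double-quoted file to dst without the quotes."""
    reader = csv.reader(src, delimiter=";", quotechar='"')
    writer = csv.writer(dst, delimiter=";", quoting=csv.QUOTE_NONE,
                        escapechar="\\", lineterminator="\r\n")
    for row in reader:
        writer.writerow(row)


# Two rows from the question, held in memory instead of a real file.
sample = '"LE";"SOURCE";"VAR_SCTARGET_NAME"\r\n"A";"A/B";"A/C"\r\n'
out = io.StringIO()
strip_quotes(io.StringIO(sample, newline=""), out)
print(repr(out.getvalue()))
```

With the quotes gone, the format file terminators could be a plain ";" for the first two fields and "\r\n" for the last one.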

How to add column to SQL Server bcp query?

I'm a beginner in SQL Server. When I write this query:
select ANUMBER
from CDRTABLE
it shows me the data, but I want to add a new column to the result, so I changed the query to this:
select '028', ANUMBER
from CDRTABLE
This query adds a new column to the result. I then wrote this bcp command to save the results to a text file:
EXEC xp_cmdshell 'bcp "SELECT rtrim(ltrim(ANUMBER)),rtrim(ltrim(BNUMBER)),rtrim(ltrim(DATE)),rtrim(ltrim(TIME)),rtrim(ltrim(DURATION)) FROM [myTestReport].[dbo].[CDRTABLE]" queryout f:\newOUTPUT.txt -S DESKTOP-A5CFJSH\MSSQLSERVER1 -Umyusername -Pmypassword -f "f:\myFORMAT.fmt" '
and my format file is this:
9.0
5
1 SQLNCHAR 0 5 "," 1 ANUMBER ""
2 SQLNCHAR 0 10 "," 2 BNUMBER ""
3 SQLNCHAR 0 10 "," 3 DATE ""
4 SQLNCHAR 0 10 "," 4 TIME ""
5 SQLNCHAR 0 10 "\r\n" 5 DURATION ""
Everything works, but I want to add a new column to the bcp output, for example '028'. How can I do that? Thanks.
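Since you're adding a constant character string to the front of the SELECT list, add '028' as the first column of the query passed to bcp and describe it as an extra first field in the format file. Something like this should work: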
9.0
6
1 SQLCHAR 0 3 "," 1 NEWCOLUMN "SQL_Latin1_General_CP1_CI_AS"
2 SQLNCHAR 0 5 "," 2 ANUMBER ""
3 SQLNCHAR 0 10 "," 3 BNUMBER ""
4 SQLNCHAR 0 10 "," 4 DATE ""
5 SQLNCHAR 0 10 "," 5 TIME ""
6 SQLNCHAR 0 10 "\r\n" 6 DURATION ""
See https://msdn.microsoft.com/en-us/library/ms191479.aspx for more details on the format of the format file.
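The format file only describes the output fields, so the query passed to bcp has to select the constant too. The sketch below assembles such a command as an argument list, reusing the server, login, and paths from the question. The helper build_bcp_queryout is a made-up name, and the command is built but not executed.

```python
def build_bcp_queryout(query, out_file, server, user, password, format_file):
    """Return the argument list for a bcp queryout call (not executed here)."""
    return ["bcp", query, "queryout", out_file,
            "-S", server, "-U", user, "-P", password, "-f", format_file]


# '028' is selected as a constant first column, matching field 1 of the format file.
query = ("SELECT '028', rtrim(ltrim(ANUMBER)), rtrim(ltrim(BNUMBER)), "
         "rtrim(ltrim(DATE)), rtrim(ltrim(TIME)), rtrim(ltrim(DURATION)) "
         "FROM [myTestReport].[dbo].[CDRTABLE]")

args = build_bcp_queryout(query, r"f:\newOUTPUT.txt",
                          r"DESKTOP-A5CFJSH\MSSQLSERVER1",
                          "myusername", "mypassword", r"f:\myFORMAT.fmt")
print(args)
# To run it: import subprocess; subprocess.run(args, check=True)
```

The number of columns in the SELECT (six here) has to match the field count on the second line of the format file.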

BULK INSERT from file which has extra values

How can I tell the format file that a column in the CSV file should be ignored? I tried putting 0s and I get an invalid column number error:
Format file:
10.0
9
0 SQLCHAR 0 12 "\t" 1 ID ""
2 SQLCHAR 0 10 "\t" 2 Symbol SQL_Latin1_General_CP1_CI_AS
0 SQLCHAR 0 11 "\t" 3 DateDone ""
0 SQLCHAR 0 19 "\t" 4 TimeDone ""
4 SQLCHAR 0 10 "\t" 5 Side SQL_Latin1_General_CP1_CI_AS
5 SQLCHAR 0 12 "\t" 6 Size ""
6 SQLCHAR 0 41 "\t" 7 Price ""
7 SQLCHAR 0 10 "\t" 8 Exchange SQL_Latin1_General_CP1_CI_AS
8 SQLCHAR 0 12 "\r\n" 9 Position ""
Sample row of csv data
------------------------------------------------------------------------------------------------------------------------
|AccountName || ExecSymbol || ExecDateTime || ExecSide || ExecSize || ExecPrice || ExecExchange || PositionSize|
------ ------------ ---------------- ------------ ----------- ---------- ------------- -------------
PRIMU$ || SCO || 1/2/2013 || B || 100 || 38.87 || ARCA || 100
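The easiest way is to create an 'fmt' file that specifies what you want to import and what you want to ignore. In a non-XML format file, a data-file field is skipped by setting its server column order (the sixth value on the row) to 0, while the host file field order (the first value) keeps counting 1, 2, 3, and so on. The format file in the question puts the 0 in the first position instead, which would explain the invalid column number error. See: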
https://msdn.microsoft.com/en-us/library/ms179250.aspx
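As a rough illustration of that layout, the sketch below generates non-XML format file text in which a skipped field gets server column order 0. The function make_format_file and the three-field example are hypothetical. The field names are borrowed from the sample header in the question, but the mapping to table columns is an assumption.

```python
def make_format_file(version, fields, terminator=r"\t", row_end=r"\r\n"):
    """Build non-XML format file text.

    fields: list of (name, length, table_column_or_None). None means the
    field is read from the data file but not loaded (server order 0).
    """
    lines = [version, str(len(fields))]
    for order, (name, length, column) in enumerate(fields, start=1):
        # The last field ends the row; all others use the field terminator.
        term = row_end if order == len(fields) else terminator
        lines.append(f'{order} SQLCHAR 0 {length} "{term}" '
                     f'{column or 0} {name} ""')
    return "\r\n".join(lines) + "\r\n"


# Hypothetical mapping: skip the first data field, load the other two.
text = make_format_file("10.0", [
    ("AccountName", 12, None),   # ignored on import
    ("ExecSymbol", 10, 2),       # goes to table column 2
    ("ExecSide", 10, 5),         # goes to table column 5
])
print(text)
```

The first number on each row always counts up from 1. Only the sixth value changes to 0 for a field that should be ignored.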

BCP fails to export with error "Unable to resolve column level collation"

I've thoroughly searched Stack Overflow as well as many other resources but still have an issue. Here's my export script, running under Cygwin:
#!/usr/bin/env bash
#-*- coding: cp1255; -*-
bcp "declare #billing_types table(k int null, t varchar(14)
collate SQL_Latin1_General_CP1255_CI_AS)
insert #billing_types
values (null, 'לא פעיל')
,(1, 'אשרי')
,(2, 'צ׳ק')
,(3, 'הוראת קבע')
declare #standing_order_status table(i int null, s varchar(14)
collate SQL_Latin1_General_CP1255_CI_AS)
insert #standing_order_status
values (null, 'אין')
,(4, 'מבותל')
,(3, 'לא מאושר')
,(2, 'ממתין')
,(1, 'מאושר')
select billing_company_id
,internal_company_name
, t collate SQL_Latin1_General_CP1255_CI_AS as payment_type_string
,isnull(company_email, '') collate SQL_Latin1_General_CP1255_CI_AS as email
,company_fax
,company_address
,company_comments
,invoice_send_with_details
,invoice_send_fax
,invoice_print
,cc_name
,cc_number
,cc_cvv
,cc_id
,cc_expire
,bank_number
,bank_branch
,bank_account
,bank_hoshen
,s collate SQL_Latin1_General_CP1255_CI_AS
from billing_companies
join #billing_types bt on bt.k = payment_type
join #standing_order_status os on os.i = bank_standing_order_status" \
queryout billing-companies.csv -t"," -r"\n" -S server -T \
-U user -P password -f ./billing-companies.fmt
Here's the format file:
9.0
20
1 BIGINT 0 1 "" 1 billing_company_id ""
2 VARCHAR 0 1000 "" 2 internal_company_name SQL_Latin1_General_CP1255_CI_AS
3 VARCHAR 0 14 "" 3 payment_type_string SQL_Latin1_General_CP1255_CI_AS
4 VARCHAR 0 200 "" 4 email SQL_Latin1_General_CP1255_CI_AS
5 VARCHAR 0 100 "" 5 company_fax SQL_Latin1_General_CP1255_CI_AS
6 VARCHAR 0 4000 "" 6 company_address SQL_LATIN1_GENERAL_CP1255_CI_AS
7 NTEXT 0 1 "" 7 company_comments SQL_LATIN1_GENERAL_CP1255_CI_AS
8 BIT 0 1 "" 8 invoice_send_with_details ""
9 BIT 0 1 "" 9 invoice_send_fax ""
10 BIT 0 1 "" 10 invoice_print ""
11 VARCHAR 0 200 "" 11 cc_name SQL_LATIN1_GENERAL_CP1255_CI_AS
12 VARCHAR 0 50 "" 12 cc_number SQL_LATIN1_GENERAL_CP1255_CI_AS
13 VARCHAR 0 50 "" 13 cc_cvv SQL_LATIN1_GENERAL_CP1255_CI_AS
14 VARCHAR 0 50 "" 14 cc_id SQL_LATIN1_GENERAL_CP1255_CI_AS
15 VARCHAR 0 50 "" 15 cc_expire SQL_LATIN1_GENERAL_CP1255_CI_AS
16 VARCHAR 0 100 "" 16 bank_number SQL_LATIN1_GENERAL_CP1255_CI_AS
17 VARCHAR 0 100 "" 17 bank_branch SQL_LATIN1_GENERAL_CP1255_CI_AS
18 VARCHAR 0 100 "" 18 bank_account SQL_LATIN1_GENERAL_CP1255_CI_AS
19 INT 0 1 "" 19 bank_hoshen ""
20 varchar 0 14 "" 20 standing_order_status SQL_LATIN1_GENERAL_CP1255_CI_AS
The collation matches the collation in the database. When I run the query in Management Studio, I get the expected result with no warnings.
Perhaps here's a caveat: the database uses a prehistoric single-byte Hebrew encoding, and I'm not sure whether Cygwin, or something further down the line, is trying to convert between encodings. However, I've double-checked that I did my part properly, i.e. the script file itself is in cp-1255.
It works if I remove all mentions of Hebrew from the script, so I'm guessing this must be the problem. However, I have no idea how to solve it.
Have you tried using the -C switch to specify a specific code page?
Here's the BCP syntax page from Books Online:
http://msdn.microsoft.com/en-us/library/ms162802.aspx
Looking here, it looks like you may want to use code page 1255:
http://msdn.microsoft.com/en-us/library/ms186356(v=sql.105).aspx
HTH.
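To see why the code page matters here, the sketch below compares how one of the Hebrew strings from the script is stored in cp1255 and in UTF-8, and what happens when cp1255 bytes are read with a different single-byte code page. It only illustrates the encoding mismatch and does not involve bcp or Cygwin. Code page 1252 is picked arbitrarily as the "wrong" one.

```python
text = "אשרי"

cp1255_bytes = text.encode("cp1255")   # single-byte Hebrew code page
utf8_bytes = text.encode("utf-8")      # two bytes per Hebrew letter

print(cp1255_bytes, len(cp1255_bytes))
print(utf8_bytes, len(utf8_bytes))

# Decoding cp1255 bytes with the wrong code page raises no error,
# it just silently yields different characters.
misread = cp1255_bytes.decode("cp1252")
print(misread)
```

If any step between the script file and the server assumes a different code page than 1255, the Hebrew literals arrive as other characters without any error, which is why stating the code page explicitly is worth trying.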
