I'm trying to populate a table in SQL Server from a text file using BULK INSERT, but I keep getting a "bulk load data conversion error (truncation)". Is there something I'm doing wrong? The top part below shows how the data looks in the text file, and the code follows it.
'01','INPATIENT FACILITY','010','ACUTE CARE HOSPITAL'
'01','INPATIENT FACILITY','011','PRIVATE PSYCHIATRIC HOSPITAL'
'01','INPATIENT FACILITY','012','INPATIENT MEDICAL REHAB HOSPITAL'
CREATE TABLE [dbo].[PROVIDER_TYPE]
(
[PROVIDER_TYPE_ID] [VARCHAR](2) NULL,
[PROVIDER_TYPE] [VARCHAR](50) NULL,
[PROVIDER_SPECIALITY_ID] [VARCHAR](3) NULL,
[PROVIDER_SPECIALITY] [VARCHAR](50) NULL
) ON [PRIMARY]
BULK INSERT DBO.PROVIDER_TYPE FROM 'C:\SQL\t2.txt'
WITH (
datafiletype = 'char'
,fieldterminator = ','
,ROWTERMINATOR = '\n'
)
The first value isn't 2 characters long; it's 4. The value is '01', and that's inclusive of the single quotes ('). This is why you're getting a truncation error: '01' (which you'd write as '''01''' in T-SQL) doesn't fit in a varchar(2).
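You can see the length for yourself with a quick check (plain T-SQL, nothing specific to your setup):
-- The raw field from the file, quotes included, is 4 characters long
SELECT LEN('''01''') AS raw_field_length;  -- returns 4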
If you're on SQL Server 2017+ you can use the FORMAT and FIELDQUOTE options. Note that I also use \r\n for ROWTERMINATOR, as my test file had both characters; if yours only contains a line feed (and no carriage return), just use \n:
BULK INSERT dbo.PROVIDER_TYPE FROM '/mnt/WDBlue/t2.txt' --This was my test file
WITH (DATAFILETYPE = 'char',
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\r\n',
FORMAT = 'CSV',
FIELDQUOTE = '''');
If you aren't on SQL Server 2017+, then BULK INSERT simply does not support quoted fields, and I suggest using a different tool.
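If upgrading isn't an option and you'd still rather stay in T-SQL, one workaround is to bulk load into a wide staging table and strip the quotes afterwards. A rough sketch only; the staging table name and column widths here are made up:
-- Staging table; columns are wide enough to hold the quoted values
CREATE TABLE dbo.PROVIDER_TYPE_STAGE
(
    [PROVIDER_TYPE_ID]        VARCHAR(10) NULL,
    [PROVIDER_TYPE]           VARCHAR(60) NULL,
    [PROVIDER_SPECIALITY_ID]  VARCHAR(10) NULL,
    [PROVIDER_SPECIALITY]     VARCHAR(60) NULL
);

BULK INSERT dbo.PROVIDER_TYPE_STAGE FROM 'C:\SQL\t2.txt'
WITH (DATAFILETYPE = 'char', FIELDTERMINATOR = ',', ROWTERMINATOR = '\n');

-- Remove the surrounding single quotes while copying into the real table
INSERT INTO dbo.PROVIDER_TYPE (PROVIDER_TYPE_ID, PROVIDER_TYPE, PROVIDER_SPECIALITY_ID, PROVIDER_SPECIALITY)
SELECT REPLACE(PROVIDER_TYPE_ID, '''', ''),
       REPLACE(PROVIDER_TYPE, '''', ''),
       REPLACE(PROVIDER_SPECIALITY_ID, '''', ''),
       REPLACE(PROVIDER_SPECIALITY, '''', '')
FROM dbo.PROVIDER_TYPE_STAGE;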
I have a CSV file in UTF-8 encoding and I would like to import the data into a SQL Server table.
In some cells I have stored values like:
±40%;
16.5±10%;
All columns load perfectly, but the columns containing the ± character don't show it correctly in the DB.
For all columns where I would like to store this character I use nvarchar(50) with the collation Latin1_General_100_CS_AS_WS_SC_UTF8.
Is there any way to store this character in the DB?
Thank you
EDIT
I use this to load the CSV file:
BULK INSERT [dbo].[x]
FROM 'c:\Users\x\Downloads\x.csv'
WITH
(
FIRSTROW = 2,
FIELDTERMINATOR = ';', --CSV field delimiter
ROWTERMINATOR = '\n', --Use to shift the control to next row
ERRORFILE = 'c:\Users\x\Downloads\xx.csv',
TABLOCK
);
I also tried changing the SSMS options:
'Tools' -> 'Options' -> 'Environment' -> 'Fonts and Colors' -> select 'Grid Results'
and setting the font to Arial, but without positive results.
I have over 20 million records across many files that I want to import.
Have you tried adding CODEPAGE = 65001 (UTF-8) to the WITH clause of the BULK INSERT statement?
BULK INSERT [dbo].[x]
FROM 'c:\Users\x\Downloads\x.csv'
WITH
(
FIRSTROW = 2,
FIELDTERMINATOR = ';', --CSV field delimiter
ROWTERMINATOR = '\n', --Use to shift the control to next row
ERRORFILE = 'c:\Users\x\Downloads\xx.csv',
CODEPAGE = 65001,
TABLOCK
);
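If the codepage helps, a quick spot check afterwards should show the character intact (the table name is from your post; the column name is a placeholder):
-- Spot-check one of the affected nvarchar columns for the ± character
SELECT TOP (10) *
FROM [dbo].[x]
WHERE [your_nvarchar_column] LIKE N'%±%';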
I want to bulk import from a CSV file in SQL Server, but \n is not working as the row terminator. It does not read any records from the CSV file if I use \n, but when I use
ROWTERMINATOR = '0x0A'
it mixes up all the records.
This is the code I am using in my stored procedure.
Create Table #temp
(
Field1 nvarchar(max) null,
Field2 nvarchar(max) null,
Field3 nvarchar(max) null
)
BULK INSERT #temp
FROM 'c:\file.csv'
WITH
(
FIRSTROW = 2,
FIELDTERMINATOR = ',', --CSV field delimiter
ROWTERMINATOR = '\n', --not working
--ROWTERMINATOR = '\r', --not working
--ROWTERMINATOR = char(10), ---not working
--ROWTERMINATOR = char(13), ---not working
TABLOCK
)
INSERT INTO table_name
(
tbl_field1,tbl_field2,tbl_field3
)
SELECT
field1,
field2,
field3
FROM #temp
Thanks in Advance
I did it with the help of #DVO. Thank you #DVO for the answer. It is working fine as per your instructions. I used Notepad++ to see the hidden characters and handled them accordingly.
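For anyone landing here later: once Notepad++ (View -> Show Symbol -> Show All Characters) shows whether the lines end in CR LF or a bare LF, pass the matching terminator explicitly. A sketch assuming the file turned out to be CR LF:
BULK INSERT #temp
FROM 'c:\file.csv'
WITH
(
    FIRSTROW = 2,
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\r\n',   -- CR LF; use '0x0a' if the file only has a bare LF
    TABLOCK
);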
I am new to Azure and PolyBase. I am trying to read a CSV file into a SQL external table.
On some forums I read that it is not possible to skip the first row, i.e. the header.
I'm hoping that's not the case. Can you help me?
The code I used is below.
Thanks in advance
CREATE EXTERNAL TABLE dbo.Test2External (
[Guid] [varchar](36) NULL,
[Year] [smallint] NULL,
[SysNum] [bigint] NULL,
[Crc_1] [decimal](15, 2) NULL,
[Crc_2] [decimal](15, 2) NULL,
[Crc_3] [decimal](15, 2) NULL,
[Crc_4] [decimal](15, 2) NULL,
[CreDate] [date] NULL,
[CreTime] [datetime] NULL,
[UpdDate] [date] NULL,
...
WITH (
LOCATION='/20160823/1145/FIN/',
DATA_SOURCE=AzureStorage,
FILE_FORMAT=TextFile
);
-- Run a query on the external table
SELECT count(*) FROM dbo.Test2External;
There is a workaround: use an EXTERNAL FILE FORMAT with FIRST_ROW = 2.
E.g. if we create a file format
CREATE EXTERNAL FILE FORMAT [CsvFormatWithHeader] WITH (
FORMAT_TYPE = DELIMITEDTEXT,
FORMAT_OPTIONS (
FIELD_TERMINATOR = ',',
FIRST_ROW = 2,
STRING_DELIMITER = '"',
USE_TYPE_DEFAULT = False
)
)
GO
And then use this file format with CREATE EXTERNAL TABLE:
CREATE EXTERNAL TABLE [testdata].[testfile1]
(
[column1] [nvarchar](4000) NULL
)
WITH (DATA_SOURCE = data_source,
LOCATION = file_location,
FILE_FORMAT = [CsvFormatWithHeader],
REJECT_TYPE = PERCENTAGE,
REJECT_VALUE = 100,
REJECT_SAMPLE_VALUE = 1000)
It will skip the first row when executing queries against 'testdata.testfile1'.
You have a few options:
Get the file headers removed permanently, because PolyBase isn't really meant to work with file headers.
Use Azure Data Factory, which does have options for skipping header rows when the file is in Blob storage.
Set the rejection options of the PolyBase table to try and ignore the header row, i.e. set REJECT_TYPE to VALUE and REJECT_VALUE to 1, e.g.
this is a bit hacky as you don't have any control over whether or not this is actually the header row, but it would work if you only have one header row and it is the only error in the file. Example below.
For a file called temp.csv with this content:
a,b,c
1,2,3
4,5,6
A command like this will work:
CREATE EXTERNAL TABLE dbo.mycsv (
colA INT NOT NULL,
colB INT NOT NULL,
colC INT NOT NULL
)
WITH (
DATA_SOURCE = eds_esra,
LOCATION = N'/temp.csv',
FILE_FORMAT = eff_csv,
REJECT_TYPE = VALUE,
REJECT_VALUE = 1
)
GO
SELECT *
FROM dbo.mycsv
My results: the two data rows came back and the header row was rejected.
Set the data types of the external table to VARCHAR just for staging the data, then remove the header row when converting to an internal table using something like ISNUMERIC, e.g.
CREATE EXTERNAL TABLE dbo.mycsv2 (
colA VARCHAR(5) NOT NULL,
colB VARCHAR(5) NOT NULL,
colC VARCHAR(5) NOT NULL
)
WITH (
DATA_SOURCE = eds_esra,
LOCATION = N'/temp.csv',
FILE_FORMAT = eff_csv,
REJECT_TYPE = VALUE,
REJECT_VALUE = 0
)
GO
CREATE TABLE dbo.mycsv3
WITH (
CLUSTERED INDEX ( colA ),
DISTRIBUTION = ROUND_ROBIN
)
AS
SELECT
colA,
colB,
colC
FROM dbo.mycsv2
WHERE ISNUMERIC( colA ) = 1
GO
HTH
Skip header rows on SQL Data Warehouse PolyBase load
Delimited text files are often created with a header row that contains the column names. These rows need to be excluded from the data set during the load. Azure SQL Data Warehouse users can now skip these rows by using the First_Row option in the delimited text file format for PolyBase loads.
The First_Row option defines the first row that is read in every file loaded. By setting the value to 2, you effectively skip the header row for all files.
For more information, see the documentation for the CREATE EXTERNAL FILE FORMAT statement.
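As a minimal sketch (the format name is arbitrary), that looks like:
CREATE EXTERNAL FILE FORMAT [TextFileSkipHeader] WITH (
    FORMAT_TYPE = DELIMITEDTEXT,
    FORMAT_OPTIONS (
        FIELD_TERMINATOR = ',',
        FIRST_ROW = 2   -- the header row of every file is skipped
    )
);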
I'm trying to import a .txt file into Advanced Query Tool (the SQL client I use). So far, I have:
CREATE TABLE #tb_test
(
id INTEGER,
name varchar(10),
dob date,
city char(20),
state char(20),
zip integer
);
insert into #tb_test
values
(1,'TEST','2015-01-01','TEST','TEST',11111)
;
bulk insert #tb_test
from 'h:\tbdata.txt'
with
(
fieldterminator = '\t',
rowterminator = '\n'
);
I receive an error message saying there's a syntax error on line 1. Am I missing a database from which #tb_test comes (like db.#tb_test)?
Here's a line from the tbdata.txt file:
2,'TEST2','2012-01-01','TEST','TEST',21111
I was curious about this question and found the following solution:
Your data is comma-separated, but you are trying to split it by TAB.
Two options: change the file data to be TAB-separated, or change fieldterminator = '\t' to fieldterminator = ','.
The DATE format has issues when loading directly from a file; my best solution is to change the temp field dob to type VARCHAR(20) and then convert it to DATE when passing it on to the final display/data storage.
Here is the corrected code:
CREATE TABLE #tb_test
(
id INTEGER,
name varchar(10),
dob varchar(20),
city char(20),
state char(20),
zip integer
);
insert into #tb_test
values
(1,'TEST','2015-01-01','TEST','TEST',11111)
;
bulk insert #tb_test
from 'h:\tbdata.txt'
with
(
fieldterminator = ',',
rowterminator = '\n'
);
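When the rows are later moved out of #tb_test, the dob column can be converted back to DATE at that point. A sketch (it assumes SQL Server 2012+ for TRY_CONVERT, and uses REPLACE to strip the single quotes that come in from the file):
-- Strip the quotes carried over from the file, then convert dob to DATE
SELECT id,
       REPLACE(name, '''', '')                    AS name,
       TRY_CONVERT(date, REPLACE(dob, '''', ''))  AS dob,
       REPLACE(city, '''', '')                    AS city,
       REPLACE(state, '''', '')                   AS state,
       zip
FROM #tb_test;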
Basically, I want to import hundreds of CSV files into SQL Server 2008.
File format is as following :
<Ticker>,<DTYYYYMMDD>,<Open>,<High>,<Low>,<Close>,<Volume>
AAM,20120110,21.6,22.8,21.4,21.6,3510
AAM,20120109,22.2,22.9,22.0,22.2,1130
AAM,20120105,0.0,23.0,22.2,22.2,210
I tried:
BULK
INSERT BBB
FROM 'D:\FIFA\excel_aam.csv'
WITH
(
FIRSTROW = 2,
FIELDTERMINATOR = ',',
ROWTERMINATOR = '/n'
)
GO
but it didn't work. So I was thinking of importing the CSV file with every column as varchar, then changing each column to the proper data type later, like this:
CREATE TABLE BBB (
TICKER VARCHAR(15) NULL,
INDEXDATE VARCHAR(15) PRIMARY KEY,
OPENPRICE VARCHAR(15) NULL,
HIGHPRICE VARCHAR(15) NULL,
LOWPRICE VARCHAR(15) NULL,
CLOSEPRICE VARCHAR(15) NOT NULL,
VOLUME VARCHAR(15))
GO
but it gave me the error:
Msg 4863, Level 16, State 1, Line 1
Bulk load data conversion error (truncation) for row 1, column 7 (VOLUME).
So, how could I import these files (too many to use the Import and Export Wizard) into SQL Server properly?
For importing so many files, it sounds like you'll need SSIS.
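If SSIS isn't available, another option is to drive BULK INSERT from T-SQL by looping over the file names. A rough sketch only; the file-list table and the second file name below are made up, and you'd have to fill the list yourself:
-- Hypothetical list of files to load; populate it however suits you
CREATE TABLE #csv_files (file_path NVARCHAR(260) NOT NULL);
INSERT INTO #csv_files VALUES
    (N'D:\FIFA\excel_aam.csv'),
    (N'D:\FIFA\excel_abc.csv');   -- made-up second file name

DECLARE @path NVARCHAR(260), @sql NVARCHAR(MAX);

DECLARE file_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT file_path FROM #csv_files;
OPEN file_cursor;
FETCH NEXT FROM file_cursor INTO @path;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- One BULK INSERT per file, built dynamically because the path can't be a variable
    SET @sql = N'BULK INSERT BBB FROM ''' + @path
             + N''' WITH (FIRSTROW = 2, FIELDTERMINATOR = '','', ROWTERMINATOR = ''\n'');';
    EXEC sys.sp_executesql @sql;
    FETCH NEXT FROM file_cursor INTO @path;
END

CLOSE file_cursor;
DEALLOCATE file_cursor;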
It works just fine in my case when I simply change the ROWTERMINATOR to \n (not /n):
--CREATE TABLE BBB (
--TICKER VARCHAR(15)NULL,
--INDEXDATE DATETIME,
--OPENPRICE DECIMAL(12,4),
--HIGHPRICE DECIMAL(12,4),
--LOWPRICE DECIMAL(12,4),
--CLOSEPRICE DECIMAL(12,4),
--VOLUME DECIMAL(20,4))
--GO
BULK INSERT BBB
FROM 'D:\FIFA\excel_aam.csv'
WITH
(
FIRSTROW = 2,
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n'
)
GO
(3 row(s) affected)
and I have the rows in the BBB table now....
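If you do go the all-VARCHAR staging route from the question instead, the later conversion step could look something like this (BBB_TYPED is a hypothetical target table; style 112 matches the YYYYMMDD dates in the file):
-- Convert the staged VARCHAR columns to proper types on the way into the final table
INSERT INTO BBB_TYPED (TICKER, INDEXDATE, OPENPRICE, HIGHPRICE, LOWPRICE, CLOSEPRICE, VOLUME)
SELECT TICKER,
       CONVERT(datetime, INDEXDATE, 112),   -- 112 = YYYYMMDD
       CONVERT(decimal(12,4), OPENPRICE),
       CONVERT(decimal(12,4), HIGHPRICE),
       CONVERT(decimal(12,4), LOWPRICE),
       CONVERT(decimal(12,4), CLOSEPRICE),
       CONVERT(decimal(20,4), VOLUME)
FROM BBB;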