TSQL BULK INSERT with auto incremented key from .txt file - sql-server

This is to insert into an already created table:
CREATE TABLE SERIES(
SERIES_NAME VARCHAR(225) NOT NULL UNIQUE, --MADE VARCHAR(225) & UNIQUE FOR FK REFERENCE
ONGOING_SERIES BIT, --BOOL FOR T/F IF SERIES IS COMPLETED OR NOT
RUN_START DATE,
RUN_END DATE,
MAIN_CHARACTER VARCHAR(20),
PUBLISHER VARCHAR(12),
S_ID INT NOT NULL IDENTITY(1,1) PRIMARY KEY,
CONSTRAINT chk_DATES CHECK (RUN_START < RUN_END)
)
and the text file is organized as:
GREEN LANTERN,0,2005-07-01,2011-09-01,HAL JORDAN,DC
SPIDERMAN,0,2005-07-01,2011-09-01,PETER PARKER,MARVEL
I have already tried adding a comma to the end of each line in the .txt file.
I have also tried adding ,' ' to the end of each line.
Any suggestions?

Indeed, KEEPIDENTITY prevents the bulk insert from taking place. Removing that option, however, doesn't resolve the problem:
Msg 4864, Level 16, State 1, Line 13
Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 1, column 7 (S_ID).
The bulk insert expects the file to supply a value for every column in the table. Another way of solving this issue is to add a format file for the text file; see MS Docs - Use a Format File to Bulk Import Data.
You can create a format file for your text file with the following command.
bcp yourdatabase.dbo.series format nul -c -f D:\test.fmt -t, -T
Remove the last row (the one for the S_ID identity column), update the number of columns on the second line to 6, and replace the comma terminator on what is now the last field with the row terminator. The result will look like the file shown below.
13.0
6
1 SQLCHAR 0 255 "," 1 SERIES_NAME SQL_Latin1_General_CP1_CI_AS
2 SQLCHAR 0 1 "," 2 ONGOING_SERIES ""
3 SQLCHAR 0 11 "," 3 RUN_START ""
4 SQLCHAR 0 11 "," 4 RUN_END ""
5 SQLCHAR 0 510 "," 5 MAIN_CHARACTER SQL_Latin1_General_CP1_CI_AS
6 SQLCHAR 0 510 "\r\n" 6 PUBLISHER SQL_Latin1_General_CP1_CI_AS
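With the format file in place, the load itself can then reference it. This is only a sketch: the data file path D:\series.txt and the format file path D:\test.fmt are placeholders.
BULK INSERT SERIES
FROM 'D:\series.txt'  -- placeholder path to the comma-separated data file
WITH (FORMATFILE = 'D:\test.fmt');  -- the edited 6-column format file; S_ID is generated by the IDENTITY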

Remove KEEPIDENTITY from your BULK INSERT, since that option specifies that you want to use the values in the source text file as your IDENTITY values.
If this still fails, try adding a VIEW on the table that excludes the IDENTITY field, and INSERT into that instead, e.g.:
CREATE VIEW SeriesBulkInsertTarget
AS
SELECT Series_Name,
Ongoing_Series,
Run_Start,
Run_End,
Main_Character,
Publisher
FROM SERIES
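A load against the view might then look like this (a sketch; the file path and terminators are assumptions and depend on how the text file was saved):
BULK INSERT SeriesBulkInsertTarget
FROM 'D:\series.txt'  -- placeholder path
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\r\n');
Because the view exposes only the six data columns, SQL Server generates S_ID from the IDENTITY as each row is inserted.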

Related

SQL Server: Select data from tab delimited file with OPENROWSET and BULK returns empty result

I want to query data from a tab delimited file using SQL Server and OPENROWSET.
I have the following sample source file:
FirstName LastName EMail
Marny Haney sed.dictum.eleifend#sem.com
Alexa Carpenter Vivamus.non.lorem#consectetuereuismod.com
Wyatt Mosley est#tortoratrisus.org
Cedric Johns lectus.a.sollicitudin#quisurna.ca
Lavinia Fischer nibh#insodales.net
Vera Marshall scelerisque#sapienAeneanmassa.co.uk
Beau Frost vel.quam.dignissim#mauris.net
Halla Fisher amet.metus.Aliquam#ullamcorpervelit.co.uk
Sierra Randall Nulla#magnis.net
Noel Malone semper#porttitor.org
I'm using the following format file:
12.0
3
1 SQLCHAR 0 5 "" 1 FirstName SQL_Latin1_General_CP1_CI_AS
2 SQLCHAR 0 5 "" 2 LastName SQL_Latin1_General_CP1_CI_AS
3 SQLCHAR 0 27 "0x0A" 3 EMail SQL_Latin1_General_CP1_CI_AS
I'm trying to query the data from the file with the following statement:
SELECT *
FROM
OPENROWSET(
BULK 'C:\data\Source\sample_data.dwo'
,FORMATFILE= 'C:\data\Format\sample_data.FMT'
,FIRSTROW = 2
) AS a
Unfortunately, the query returns an empty result. I don't get an error.
As far as I understand, the default terminator for fields is \t. I also tried using t and \t explicitly as terminators, but still got no result.
Any suggestions what I can try next?
Link to both files:
https://github.com/LordTakeshiXVII/files/blob/master/sample_data.FMT
https://github.com/LordTakeshiXVII/files/blob/master/sample_data.dwo
You need to adapt your format file:
First, change the max-length of the fields to something appropriate (100 in the example); you can also set it to zero for unlimited input length.
Second, set the terminator for the first two fields to \t and for the third field to \r\n:
12.0
3
1 SQLCHAR 0 100 "\t" 1 FirstName SQL_Latin1_General_CP1_CI_AS
2 SQLCHAR 0 100 "\t" 2 LastName SQL_Latin1_General_CP1_CI_AS
3 SQLCHAR 0 100 "\r\n" 3 EMail SQL_Latin1_General_CP1_CI_AS
Here you can find more information on format files: https://learn.microsoft.com/en-us/sql/relational-databases/import-export/create-a-format-file-sql-server?view=sql-server-2017

nzload - skiprows not working when 1st column is not matching the table elements

When using nzload for fixed-width data where the first row is a header, -skipRows works fine as long as that row has the same number of elements as the data rows:
1HelloWorld2011-12-07
1HelloWorld2011-12-07
2Netezza 2010-02-16
But when the first row is a single piece of text that I want nzload to skip, nzload throws an error because the row does not have the same number of elements:
DummyRow
1HelloWorld2011-12-07
2Netezza 2010-02-16
Script example:
nzload -t "textFixed_tbl" -format fixed -layout "col1 int bytes 1, col2 char(10) bytes 10, col3 date YMD '-' bytes 10" -df /tmp/fixed_width.dat -bf /tmp/testFixedWidth.bad -lf /tmp/testFixedWidth.nzlog -skipRows 1 -maxErrors 1
Data File
DummyRow
1HelloWorld2011-12-07
2Netezza 2010-02-16
Error:
Error: Operation canceled
Error: External Table : count of bad input rows reached maxerrors limit
Record Format: FIXED Record Null-Indicator: 0
Record Length: 0 Record Delimiter:
Record Layout: 3 zones : "col1" INT4 DECIMAL BYTES 1 NullIf &&1 = '', "col2" CHAR(10) INTERNAL BYTES 10, "col3" DATE YMD '-' BYTES 10 NullIf &&3 = ''
Statistics
number of records read: 1
number of bytes read: 22
number of records skipped: 0
number of bad records: 1
number of records loaded: 0
Elapsed Time (sec): 0.0
The skipRows option for nzload / external tables discards the specified number of rows, but it still processes the skipped rows. Consequently those rows must be properly formed, and this behavior won't act as you hoped/intended.
This is noted in the documentation:
You cannot use the SkipRows option for header row processing in a data file, because even the skipped rows are processed first. Therefore, data in the header rows should be valid with respect to the external table definition

How to split a column with 2 data type in 2 columns (sql server)

I have a huge .CSV file with information about triathlon races (People, Times, Country, Overalltime...etc) all in varchar...
The problem is that one column (Overalltime) stores both datetime and varchar values.
The varchar values are (DNS, DNF, DQ) while the datetime values look like (09:09:30), for example.
When I am creating the table, I have a column like this:
overalltime
-------------
09:09:30
09:10:22
DNF
DNS
But I want to split that column in the table to have two columns: one with the datetime values and another one with the varchar values.
What will be the best way to split that column?
One approach is to use a case expression to conditionally break your values into columns:
-- Using CASE to split one column's values into several columns.
WITH SampleData AS
(
-- Provides sample data to play with.
SELECT
r.overalltime
FROM
(
VALUES
('09:09:30'),
('09:09:30'),
('DNF'),
('DNS')
) AS r(overalltime)
)
SELECT
CASE WHEN ISDATE(overalltime) = 1 THEN overalltime ELSE NULL END AS [Time],
CASE WHEN overalltime = 'DNS' THEN 1 ELSE 0 END AS DNS,
CASE WHEN overalltime = 'DNF' THEN 1 ELSE 0 END AS DNF
FROM
SampleData
;
Returns:
Time DNS DNF
09:09:30 0 0
09:09:30 0 0
NULL 0 1
NULL 1 0
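If the goal is to store the split permanently rather than just select it, the same idea can populate two new columns. This is only a sketch: the table name Results and the new column names are assumptions, and TRY_CONVERT requires SQL Server 2012 or later.
-- Hypothetical table and column names; adjust to your schema.
ALTER TABLE Results ADD FinishTime TIME NULL, Status VARCHAR(10) NULL;
GO
UPDATE Results
SET FinishTime = TRY_CONVERT(TIME, overalltime),  -- NULL when the value is DNS/DNF/DQ
    Status = CASE WHEN TRY_CONVERT(TIME, overalltime) IS NULL
                  THEN overalltime END;           -- keeps DNS/DNF/DQ, NULL otherwise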
I'm no pro, but the best way to do it would be to filter the data before importing it into the database. Since you are working with a .csv file, it wouldn't be a problem to split the values into a datetime column and a second varchar column first, and then upload the file once you already have two separate columns.

Derived column expression only capture 10 characters

I am new to SSIS and I have searched to find the solution to this question. Any help is most appreciated!
I have a flat file with data defined as DT_WSTR; to change the data type I am using a Data Conversion transformation to set the [column] to DT_STR(50).
I am also using a Derived Column transformation to add a new column. The goal is to write an expression.
I have a [column] which is defined as 11 characters.
My question is how to write an expression that captures only 10 characters; anything greater than 10 should change the [column] to -1, else cast it with (DT_I8) [column].
I've tried:
FINDSTRING([Column],"9999999999",1) == 10 ? -1 : (DT_I8)TRIM([Column])
FINDSTRING([Column],"9999999999",1) > 10 ? -1 : (DT_I8)TRIM([Column])
LEN([Column]) == 10 ? -1 : (DT_I8)[column]
SUBSTRING( [Copy of Member ID] ,1,10)
The package runs without errors; however, the results in the table are not correct: the values with more than 10 characters are not showing up in the table.
I am using Visual Studio 2012.
Thank you Dawana
I don't know why your SUBSTRING attempt didn't work, but this returns the first 10 characters of the column:
LEFT(column,10)
If you also need values longer than 10 characters to come through as -1, your third attempt is close; testing LEN([Column]) > 10 instead of == 10 should give you that.
https://msdn.microsoft.com/en-us/library/hh231081(v=sql.110).aspx

CSV import in SQL Server 2008

I have a csv file that has column values enclosed within double quotes.
I want to import a csv file from a network path using an sql statement.
I tried bulk insert, but it imports the values along with the double quotes. Is there any other way to import a csv file into SQL Server 2008 with an SQL statement, ignoring the double-quote text qualifier?
Thanks
-Vivek
You could use a non-XML format file to specify a different delimiter per column. For values enclosed in double quotes and separated by commas, the delimiter can be \",\". You'd have to add an initial unused column to capture the first quote. For example, to read this file:
"row1col1","row1col2","row1col3"
"row2col1","row2col2","row2col3"
"row3col1","row3col2","row3col3"
You could use this format file:
10.0
4
1 SQLCHAR 0 50 "\"" 0 unused ""
2 SQLCHAR 0 50 "\",\"" 1 col1 ""
3 SQLCHAR 0 50 "\",\"" 2 col2 ""
4 SQLCHAR 0 50 "\"\r\n" 3 col3 ""
(The number on the first line depends on the SQL Server version. The number on the second line is the number of columns to read. Don't forget to adjust it.)
The bulk insert command accepts a formatfile = 'format_file_path' parameter where you can specify the format file. For example:
BULK INSERT YourTable
FROM 'c:\test\test.csv'
WITH (FORMATFILE = 'c:\test\test.cfmt')
This results in:
select * from YourTable
-->
col1 col2 col3
row1col1 row1col2 row1col3
row2col1 row2col2 row2col3
row3col1 row3col2 row3col3
This is a known issue when importing files with text qualifiers, as the bcp/BULK INSERT utilities don't let you specify a text qualifier. See this link for a good discussion.
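As an aside, on SQL Server 2017 and later the text qualifier can be handled without a format file by using the built-in CSV support; this is only a sketch (the table name and path are the same placeholders as above) and it does not apply to the 2008 version asked about here.
BULK INSERT YourTable
FROM 'c:\test\test.csv'
WITH (FORMAT = 'CSV', FIELDQUOTE = '"');  -- FIELDQUOTE defaults to the double quote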
@Andomar's answer got me 99% of the way there with a very similar problem. However, I found SQL Server 2014 failed to import the last row because the last field didn't have the newline characters \r\n.
So my format file looked more like:
12.0
4
1 SQLCHAR 0 50 "\"" 0 unused ""
2 SQLCHAR 0 50 "\",\"" 1 col1 ""
3 SQLCHAR 0 50 "\",\"" 2 col2 ""
4 SQLCHAR 0 50 "\"" 3 col3 ""
And so for my file, which had a row with field names, the import SQL became:
BULK INSERT MyTable
FROM 'C:\mypath\datafile.csv'
WITH (
FIRSTROW = 2,
FORMATFILE = 'C:\mypath\formatfile.cfmt',
ROWTERMINATOR = '\r\n'
)
The actual CSV had 40 fields, so it was helpful to read on Microsoft's website that it is not necessary to write real column names (col1 through col40 works just fine), and also that the fourth parameter on each line (50 in the example) only needs to be the maximum field length, not the exact length.
