BULK INSERT with identity (auto-increment) column - sql-server

I am trying to add bulk data in database from CSV file.
Employee table has a column ID (PK) auto-incremented.
CREATE TABLE [dbo].[Employee](
[id] [int] IDENTITY(1,1) NOT NULL,
[Name] [varchar](50) NULL,
[Address] [varchar](50) NULL
) ON [PRIMARY]
I am using this query:
BULK INSERT Employee FROM 'path\tempFile.csv'
WITH (FIRSTROW = 2, KEEPIDENTITY, FIELDTERMINATOR = ',', ROWTERMINATOR = '\n');
The .CSV file:
Name,Address
name1,addr test 1
name2,addr test 2
but it results in this error message:
Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 2, column 1 (id).

Add an id column to the csv file and leave it blank:
id,Name,Address
,name1,addr test 1
,name2,addr test 2
Remove the KEEPIDENTITY keyword from the query:
BULK INSERT Employee FROM 'path\tempFile.csv'
WITH (FIRSTROW = 2, FIELDTERMINATOR = ',', ROWTERMINATOR = '\n');
The id identity field will be auto-incremented.
If you assign values to the id field in the CSV, they'll be ignored unless you use the KEEPIDENTITY keyword, in which case they'll be used instead of the auto-increment values.

Don't BULK INSERT into your real tables directly.
I would always:
1. insert into a staging table dbo.Employee_Staging (without the IDENTITY column) from the CSV file,
2. possibly edit / clean up / manipulate the imported data,
3. and then copy the data across to the real table with a T-SQL statement like:
INSERT INTO dbo.Employee(Name, Address)
SELECT Name, Address
FROM dbo.Employee_Staging
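A minimal sketch of the staging-table step, mirroring the question's columns (only dbo.Employee_Staging is named in this answer; the rest is assumed):
-- staging table: same columns as dbo.Employee, minus the IDENTITY column
CREATE TABLE dbo.Employee_Staging(
    [Name] [varchar](50) NULL,
    [Address] [varchar](50) NULL
);
-- load the raw CSV into staging first
BULK INSERT dbo.Employee_Staging FROM 'path\tempFile.csv'
WITH (FIRSTROW = 2, FIELDTERMINATOR = ',', ROWTERMINATOR = '\n');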

I had a similar issue, but I needed to be sure that the order of the IDs matches the order in the source file.
My solution is to use a VIEW for the BULK INSERT:
Keep your table as it is and create this VIEW (select everything except the ID column):
CREATE VIEW [dbo].[VW_Employee]
AS
SELECT [Name], [Address]
FROM [dbo].[Employee];
Your BULK INSERT should then look like:
BULK INSERT [dbo].[VW_Employee] FROM 'path\tempFile.csv'
WITH (FIRSTROW = 2, FIELDTERMINATOR = ',', ROWTERMINATOR = '\n');

You can also do the bulk insert with a format file:
BULK INSERT Employee FROM 'path\tempFile.csv'
WITH (FORMATFILE = 'path\tempFile.fmt');
where format file (tempFile.fmt) looks like this:
11.0
2
1 SQLCHAR 0 50 "\t"   2 Name    SQL_Latin1_General_CP1_CI_AS
2 SQLCHAR 0 50 "\r\n" 3 Address SQL_Latin1_General_CP1_CI_AS
More details here: http://msdn.microsoft.com/en-us/library/ms179250.aspx

My solution is to add the ID field as the LAST field in the table; that way bulk insert ignores it and it gets automatic values. Clean and simple...
For instance, if inserting into a temp table:
CREATE TABLE #TempTable
(field1 varchar(max), field2 varchar(max), ...
ROW_ID int IDENTITY(1,1) NOT NULL)
Note that the ROW_ID field MUST always be specified as LAST field!
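For instance, adapted to the question's two CSV fields (a sketch on this answer's premise; the temp-table name is made up):
CREATE TABLE #EmployeeImport
(
    [Name] varchar(50) NULL,
    [Address] varchar(50) NULL,
    ROW_ID int IDENTITY(1,1) NOT NULL  -- MUST be last
);
-- the two CSV fields map to Name and Address; ROW_ID sits beyond the
-- last field in the file and gets its values from the identity
BULK INSERT #EmployeeImport FROM 'path\tempFile.csv'
WITH (FIRSTROW = 2, FIELDTERMINATOR = ',', ROWTERMINATOR = '\n');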

1. Create a table with the identity column + other columns;
2. Create a view over it and expose only the columns you will bulk insert;
3. BCP into the view.
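For step 3, a hedged sketch with bcp, reusing the VW_Employee view from the earlier answer (dbname is a placeholder, and -T assumes a trusted connection):
rem -c = character data, -t, = comma field terminator, -F 2 = skip the header row
bcp dbname.dbo.VW_Employee in path\tempFile.csv -c -t, -F 2 -T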

I had this exact same problem, which cost me hours, so I'm inspired to share my findings and the solutions that worked for me.
1. Use an excel file
This is the approach I adopted. Instead of using a csv file, I used an excel file (.xlsx) with content like below.
id   username   email                   token   website
     johndoe    johndoe@divostar.com            divostar.com
     bobstone   bobstone@divosays.com           divosays.com
Notice that the id column has no value.
Next, connect to your DB using Microsoft SQL Server Management Studio, right-click on your database, and select Import Data (a submenu under Tasks). Select Microsoft Excel as the source. When you arrive at the step called "Select Source Tables and Views", click Edit Mappings. For the id column under Destination, click on it and select ignore. Don't check Enable Identity Insert unless you want to maintain IDs, e.g. in cases where you are importing data from another database and would like to keep the auto-increment IDs of the source DB. Proceed to Finish and that's it. Your data will be imported smoothly.
2. Using CSV file
In your CSV file, make sure your data looks like below:
id,username,email,token,website
,johndoe,johndoe@divostar.com,,divostar.com
,bobstone,bobstone@divosays.com,,divosays.com
Run the query below:
BULK INSERT Metrics FROM 'D:\Data Management\Data\CSV2\Production Data 2004 - 2016.csv'
WITH (FIRSTROW = 2, FIELDTERMINATOR = ',', ROWTERMINATOR = '\n');
The problem with this approach is that the CSV must be on the DB server, or in a shared folder the DB server can access; otherwise you may get an error like "Cannot open file. The operating system returned error code 21 (The device is not ready)".
If you are connecting to a remote database, then you can upload your CSV to a directory on that server and reference the path in bulk insert.
3. Using CSV file and Microsoft SQL Server Management Studio import option
Launch the import wizard as in the first approach. For the source, select Flat File Source and browse for your CSV file. Make sure the settings in each menu (General, Columns, Advanced, Preview) are OK, and in particular set the right delimiter under the Columns menu (Column delimiter). Just like in the excel approach above, click Edit Mappings. For the id column under Destination, click on it and select ignore.
Proceed to finish and that's it. Your data will be imported smoothly.

This is a very old post to answer, but none of the given answers solves the problem without changing the posed conditions, which I can't do.
I solved it by using the OPENROWSET variant of BULK INSERT. This uses the same format file and works in the same way, but it allows the data file to be read with a SELECT statement.
Create your table:
CREATE TABLE target_table(
id bigint IDENTITY(1,1),
col1 varchar(256) NULL,
col2 varchar(256) NULL,
col3 varchar(256) NULL)
Open a command window and run:
bcp dbname.dbo.target_table format nul -c -x -f C:\format_file.xml -t; -T
This creates the format file based on how the table looks.
Now edit the format file and remove the entire rows where FIELD ID="1" and COLUMN SOURCE="1", since these do not exist in our data file.
Also adjust the terminators as needed for your data file:
<?xml version="1.0"?>
<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<RECORD>
<FIELD ID="2" xsi:type="CharTerm" TERMINATOR=";" MAX_LENGTH="256" COLLATION="Finnish_Swedish_CI_AS"/>
<FIELD ID="3" xsi:type="CharTerm" TERMINATOR=";" MAX_LENGTH="256" COLLATION="Finnish_Swedish_CI_AS"/>
<FIELD ID="4" xsi:type="CharTerm" TERMINATOR="\r\n" MAX_LENGTH="256" COLLATION="Finnish_Swedish_CI_AS"/>
</RECORD>
<ROW>
<COLUMN SOURCE="2" NAME="col1" xsi:type="SQLVARYCHAR"/>
<COLUMN SOURCE="3" NAME="col2" xsi:type="SQLVARYCHAR"/>
<COLUMN SOURCE="4" NAME="col3" xsi:type="SQLVARYCHAR"/>
</ROW>
</BCPFORMAT>
Now we can bulk load the data file into our table with a SELECT, giving us full control over the columns; in this case, we simply don't insert data into the identity column:
INSERT INTO target_table (col1,col2, col3)
SELECT * FROM openrowset(
bulk 'C:\data_file.txt',
formatfile='C:\format_file.xml') as t;

Another option, if you're using temporary tables instead of staging tables, is to create the temporary table your import expects and then add the identity column after the import.
So your SQL does something like this:
If temp table exists, drop
Create temp table
Bulk Import to temp table
Alter temp table add identity
< whatever you want to do with the data >
Drop temp table
Still not very clean, but it's another option... you might have to take locks to be safe, too.
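A minimal sketch of those steps, borrowing the question's columns (the temp-table name #Import is made up):
-- if temp table exists, drop it
IF OBJECT_ID('tempdb..#Import') IS NOT NULL DROP TABLE #Import;
-- create the temp table shaped like the file
CREATE TABLE #Import([Name] varchar(50) NULL, [Address] varchar(50) NULL);
-- bulk import into the temp table
BULK INSERT #Import FROM 'path\tempFile.csv'
WITH (FIRSTROW = 2, FIELDTERMINATOR = ',', ROWTERMINATOR = '\n');
-- alter the temp table to add the identity
ALTER TABLE #Import ADD ROW_ID int IDENTITY(1,1) NOT NULL;
-- < whatever you want to do with the data >
DROP TABLE #Import;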

Related

Inserting data from excel sheet into sql temp table

I have created the following temp table in SQL Server Management Studio:
CREATE TABLE ##LoginMap
(
ObjectId NVARCHAR(50),
UserPrincipleName NVARCHAR(500),
Username NVARCHAR(250),
Email NVARCHAR(500),
Name NVARCHAR(250)
)
These are the same names as the column headers in the excel sheet I am trying to get the data from.
The file is called Test Loadsheet and the sheet name is AgilityExport_04Aug2022_164839.
I am trying to insert the data into the temp table like so:
INSERT INTO ##LoginMap
SELECT *
FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',
'Excel 12.0; Database=C:\temp\Test LoadSheet.xlsx', [AgilityExport_04Aug2022_164839]);
GO
However, I am getting the error:
The OLE DB provider "Microsoft.ACE.OLEDB.12.0" for linked server "(null)" does not contain the table "AgilityExport_04Aug2022_164839". The table either does not exist or the current user does not have permissions on that table.
Where have I gone wrong with this? And what do I need to do in order to successfully get the data from each column into my table?
You have the file name as Test Loadsheet in one spot, but in your query you have it as Test LoadSheet.xlsx. Try fixing that and see if it's what's holding things up.
Found a link on importing data from excel to SQL if you are interested:
https://learn.microsoft.com/en-us/sql/relational-databases/import-export/import-data-from-excel-to-sql?view=sql-server-ver16#openrowset
I went about this a different way. In the excel sheet I am using, I made a formula like so:
="INSERT INTO ##LoginMap(ObjectId, DisplayName, AzureEmail, AzureUsername) VALUES('"&A2&"', '"&B2&"', '"&C2&"', '"&D2&"');"
I then had this repeated for each row, which gave me an insert statement per row that I could simply copy, paste into SSMS, and run.

How to parse string into multiple tables in SQL Server 2017

I have a text file that was created by dumping 8 SQL tables into it. Now I need to import this data back into SQL Server.
Using BULK INSERT, I was able to load the data into one table with a single column, 'FileData'.
DECLARE @FileTable TABLE (FileData NVARCHAR(MAX))
INSERT INTO @FileTable
SELECT BulkColumn
FROM OPENROWSET(BULK N'C:\My\Path\Name\FileName.txt', SINGLE_CLOB) AS Contents
SELECT * FROM @FileTable
So now I have this huge string that I need to organize into different tables.
For example, this part of the string corresponds to the table below:
FileData
00001 00000009716496000000000331001700000115200000000000
Table: (layout shown as an image in the original post)
It also seems like all fields have a set length and I can get that length.
I can see doing something like this:
select SUBSTRING('00001 00000009716496000000000331001700000115200000000000 ', 1,5) as RecordKey
select SUBSTRING('00001 00000009716496000000000331001700000115200000000000 ', 6,17) as Filler
select SUBSTRING('00001 00000009716496000000000331001700000115200000000000 ', 23,16) as BundleAnnualPremium
But is there any faster and better way to load this data into different tables?
You could just bulk insert with a format file right from the start. But since the data is already loaded into a big table, if you'd rather use pure T-SQL, you can pull elements out of the string using left(), right(), and substring().
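For example, a hedged sketch of that pure-T-SQL route, reusing the fixed positions already worked out in the question (the target table dbo.ParsedRecords and its columns are hypothetical, and @FileTable must still be in scope):
INSERT INTO dbo.ParsedRecords (RecordKey, Filler, BundleAnnualPremium)
SELECT SUBSTRING(FileData, 1, 5),    -- RecordKey: positions 1-5
       SUBSTRING(FileData, 6, 17),   -- Filler: positions 6-22
       SUBSTRING(FileData, 23, 16)   -- BundleAnnualPremium: positions 23-38
FROM @FileTable;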

Pulling rows from .log file into SQL Server table

I have a very flat, simple log file (6 rows, of which one row is blank) that I want to insert into a simple 5-column SQL Server table.
Please excuse my SQL ignorance, as my knowledge around this topic is not educated.
Below is the .log file content:
-----------Log File content start----------
07/30/2016 00:02:03 : BATCH CLOSE SUMMARY
MerchantID - 000022673665
TerminalID - 013
BatchItemCount - 650
NetBatchTotal - 5095.00
----------Log file content end-------------
Below is the simple SQL Server table layout:
CREATE TABLE dbo.CCClose
(
CloseTime NVARCHAR(50) NOT NULL,
MercID NVARCHAR(50) NOT NULL,
TermID NVARCHAR(50) NOT NULL,
BatchCount NVARCHAR(30) NOT NULL,
NetBatchTotal NVARCHAR(50) NOT NULL
);
I'm hoping to somehow have each row looked at by SQL, for example:
if .log file like 'Batch close Summary' then insert into CloseTime else
if .log file like 'MerchantID' then insert into MercID else
if .log file like 'BatchItemCount' then insert into BatchCount else
if .log file like 'NetBatchTotal' then insert into NetBatchTotal
Of course it would be great if the proper formatting for each column were in place, but at this time I'm just looking at getting the .log file data populated from a directory of these logs.
I plan to use Crystal Reports to build on the SQL Server tables.
This is not going to be a simple process. You can probably do it with bulk insert. The idea is to read it into a staging table, using:
a record terminator of something like "----------Log file content end-------------" + newline
a field separator of a newline
a staging table with several columns of varchars
Then process the staging table to extract the values (and types) that you want. There are probably other options, if you set up a format file, but that adds another level of complexity.
I would read the file into a staging table, with one row per line of the file. Then, I would:
use window functions to assign a record number to rows, based on the "content start" lines
aggregate based on the record number
extract the values using aggregations, string functions, and conversions
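A hedged sketch of that plan (everything here except dbo.CCClose and its columns is hypothetical, and SUM() OVER an ORDER BY needs SQL Server 2012 or later):
-- staging: one log line per row, with an IDENTITY to keep load order
CREATE TABLE dbo.LogStaging(
    LineId int IDENTITY(1,1) NOT NULL,
    Line nvarchar(4000) NULL
);
-- after bulk inserting the log lines into dbo.LogStaging:
WITH Numbered AS (
    SELECT Line,
           -- record number: increases at every "content start" marker line
           SUM(CASE WHEN Line LIKE '%Log File content start%' THEN 1 ELSE 0 END)
               OVER (ORDER BY LineId ROWS UNBOUNDED PRECEDING) AS RecNo
    FROM dbo.LogStaging
)
INSERT INTO dbo.CCClose (CloseTime, MercID, TermID, BatchCount, NetBatchTotal)
SELECT MAX(CASE WHEN Line LIKE '%BATCH CLOSE SUMMARY%' THEN LEFT(Line, 19) END),
       MAX(CASE WHEN Line LIKE 'MerchantID%'     THEN LTRIM(SUBSTRING(Line, CHARINDEX('-', Line) + 1, 50)) END),
       MAX(CASE WHEN Line LIKE 'TerminalID%'     THEN LTRIM(SUBSTRING(Line, CHARINDEX('-', Line) + 1, 50)) END),
       MAX(CASE WHEN Line LIKE 'BatchItemCount%' THEN LTRIM(SUBSTRING(Line, CHARINDEX('-', Line) + 1, 50)) END),
       MAX(CASE WHEN Line LIKE 'NetBatchTotal%'  THEN LTRIM(SUBSTRING(Line, CHARINDEX('-', Line) + 1, 50)) END)
FROM Numbered
GROUP BY RecNo;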

How to insert xml file into xml field using bcp?

I have a table:
USE [testdb]
GO
CREATE TABLE [dbo].[a](
[n] [int] IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED,
[x] [xml] NULL)
GO
How do I insert an XML file into field x from the client?
The MSDN example doesn't suit me:
INSERT INTO T(XmlCol)
SELECT * FROM OPENROWSET(
BULK 'c:\SampleFolder\SampleData3.txt',
SINGLE_BLOB) AS x;
I'm not the administrator of this server, and I only have access to the database. I cannot put a file in a directory on the server, but I can use BCP and other tools that access the database.
The XML file is very large (> 50 MB), so pasting the text of the file as a constant into an SSMS query is not an option.
Little known fact: the bcp utility supports arbitrary strings as column and row delimiters. Construct a file with delimiters not present in your data, and invoke bcp accordingly.
For example, your column delimiter could be -t \0Field\0. Just check the data first. :-)
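A hedged sketch of such an invocation (the file name and delimiter strings are made up; pick strings that never occur in your XML, and swap -T for -U/-P as your access requires):
rem xmldata.dat holds one row: an empty first field for the identity
rem column n (its value is ignored without -E), then the whole XML
rem document as the second field.
bcp testdb.dbo.a in xmldata.dat -c -t "@@col@@" -r "@@row@@" -T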

Easiest way to import CSV into SQL Server 2005

I have several files of CSV data, about 5k each, that I need to import into SQL Server 2005.
This used to be simple with DTS. I tried to use SSIS previously and it seemed to be about 10x as much effort and I eventually gave up.
What would be the simplest way to import the CSV data into SQL Server? Ideally, the tool or method would create the table as well; since there are about 150 fields in it, this would simplify things.
Sometimes with this data, there will be 1 or 2 rows that may need to be manually modified because they are not importing correctly.
Try this:
http://blog.sqlauthority.com/2008/02/06/sql-server-import-csv-file-into-sql-server-using-bulk-insert-load-comma-delimited-file-into-sql-server/
Here is a summary of the code from the link:
Create Table:
CREATE TABLE CSVTest
(ID INT,
FirstName VARCHAR(40),
LastName VARCHAR(40),
BirthDate SMALLDATETIME)
GO
import data:
BULK INSERT CSVTest
FROM 'c:\csvtest.txt'
WITH
(
FIELDTERMINATOR = ','
,ROWTERMINATOR = '\n'
--,FIRSTROW = 2
--,MAXERRORS = 0
)
GO
use the content of the table:
SELECT *
FROM CSVTest
GO
