How to insert xml file into xml field using bcp? - sql-server

I have a table:
USE [testdb]
GO
CREATE TABLE [dbo].[a](
[n] [int] IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED,
[x] [xml] NULL)
GO
How can I insert an XML file into field x from the client?
The MSDN example doesn't suit me.
INSERT INTO T(XmlCol)
SELECT * FROM OPENROWSET(
BULK 'c:\SampleFolder\SampleData3.txt',
SINGLE_BLOB) AS x;
I'm not an administrator of this server, and I only have access to the database. I cannot put a file in a directory on the server, but I can use bcp and other client tools to access the database.
The XML file is very large (> 50 MB), so pasting the text of the file as a constant into a query in SSMS is not an option.

Little known fact: the bcp utility supports arbitrary strings as column and row delimiters. Construct a file with delimiters not present in your data, and invoke bcp accordingly.
For example, your column delimiter could be -t \0Field\0. Just check the data first. :-)
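A rough sketch of how that can look for the table above (the server name, the file name, and the exact marker strings are assumptions, not anything bcp mandates): prepare a data file that contains the identity value, the field marker, the whole XML content, and a closing row marker, then point bcp at it with matching switches.
rem data file xmlrow.dat contains:  1\0Field\0<entire XML document>\0Row\0
rem -c = character mode (use -w for Unicode), -t/-r = custom field/row terminators, -E = keep the supplied identity value
bcp testdb.dbo.a in xmlrow.dat -c -t "\0Field\0" -r "\0Row\0" -S yourserver -T -E
If you would rather let the identity column generate its own values, a format file that maps the single field to column x and skips column n works as well; either way, confirm first that the marker strings never occur inside the XML.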

Related

External table always inserts into a new file, is there any way to write to the same file?

I have an external table in SQL Server that points to CSV files in a folder in Azure Blob Storage. I enabled PolyBase export and am trying to insert data using an INSERT query. It works, but it always creates a new file.
Is there any way I can write to a single file, or specify the file name during the insert?
Here's my table
CREATE EXTERNAL TABLE archive.filetransferauditlog (
[id] [int] NULL,
[STATUS] [varchar](10) NULL,
[EVENT] [varchar](10) NULL,
[fileNameWithPath] [varchar](2048) NULL,
[eventStartDate] [datetime] NOT NULL,
[eventEndDate] [datetime] NOT NULL,
[description] [varchar](4096) NULL,
[loggedInUserId] [int] NULL,
[transferType] [int] NULL
)
WITH (
LOCATION = '/filetransferauditlog/',
DATA_SOURCE = archivepurgedataExternalDataSource,
FILE_FORMAT = ParquetFile
)
GO
Query I am using:
INSERT INTO archive.filetransferauditlog
SELECT TOP(5) *
FROM dbo.filetransferauditlog
Please suggest any way to specify the file name during the insert.
When I point the table's LOCATION at a single file instead of a directory, I can run SELECT queries but not INSERT.
It returns the error below:
java.sql.SQLException: Cannot execute the query "Remote Query" against OLE DB provider "SQLNCLI11" for linked server "SQLNCLI11". CREATE EXTERNAL TABLE AS SELECT statement failed as the path name 'wasbs://demoarchive@testarchivedemo.blob.core.windows.net/filetransferauditlogText/QID5060_20220607_54101_0.txt' could not be used for export. Please ensure that the specified path is a directory which exists or can be created, and that files can be created in that directory.
As per the Microsoft docs, the reason for this error is that a PolyBase external table points at a directory and repeatedly reads all the files in it, so pointing it at a single file can lead to column or data discrepancies.
I hope you created the external table first and then used INSERT INTO ... SELECT to export to the external location. While exporting, only the data can be exported, not the column definitions.
Please suggest any way we can give the file name while inserting
If you want the data of each table to end up in a single file, include the file name in the LOCATION clause along with the directory.
If you want multiple files per table, point the tables at different directories.
If you want to create a table on top of a CSV file, just use LOCATION = '/warehouse/develop/myfile'.
Make sure you are giving the correct path in the LOCATION attribute; give the location along with your file name, as below:
CREATE EXTERNAL TABLE <your_table_name>
(col1 int) WITH (LOCATION = '/filetransferauditlog/<your file name>', DATA_SOURCE = <your_data_source>, FILE_FORMAT = <your_file_format>);
or
CREATE EXTERNAL TABLE <your_table_name>
(col1 int) WITH (LOCATION = '/filetransferauditlog/', DATA_SOURCE = <your_data_source>, FILE_FORMAT = <your_file_format>);
For inserting into the same file, use a JOIN query while exporting; please check whether you are using the correct format, as in the sample below:
INSERT INTO dbo.filetransferauditlog
SELECT T1.* FROM Insured_Customers T1 JOIN CarSensor_Data T2
ON (T1.CustomerKey = T2.CustomerKey)
WHERE T2.YearMeasured = 2009 AND T2.Speed > 40;
For more information, please refer to the links below:
PolyBase query scenarios
PolyBase errors and possible solutions
Here is the query that you should execute
SELECT *
INTO OUTFILE 'C:\\Donnees\\dev\\table_exp.txt'
FIELDS TERMINATED BY ';' ENCLOSED BY '"'
LINES STARTING BY 'export-table' TERMINATED BY '$\n'
FROM name_of_table;
Note: INTO OUTFILE: this is the file path
FIELDS TERMINATED BY: the symbol that separates the column values
ENCLOSED BY: the symbol that frames the column values
LINES STARTING BY: this is how each record in the file will start
TERMINATED BY: the symbol with which each record ends
We posted the same question to the Microsoft community and found the answer and a workaround there.
https://techcommunity.microsoft.com/t5/sql-server/external-table-always-insert-in-a-new-file-is-there-any-way-to/m-p/3480998#M1680
Thanks everyone for the quick help.

Easy way to load a CSV file from the command line into a new table of an Oracle database without specifying the column details

I often want to quickly load a CSV into an Oracle database. The CSV (Unicode) is on a machine with Oracle Instant Client version 19.5; the Oracle database is version 18c.
I am looking for a command-line tool that uploads the rows without my having to specify a column structure.
I know I can use sqlldr with a .ctl file, but then I need to define column types, etc. I am interested in a tool that figures out the column attributes itself from the data in the CSV (or uses a generic default for all columns).
The CSVs I have to ingest always contain a header row that the tool in question could use to determine appropriate columns for the table.
Starting with Oracle 12c, you can use sqlldr in express mode, so you don't need a control file.
In Oracle Database 12c onwards, SQL*Loader has a new feature called express mode that makes loading CSV files faster and easier. With express mode, there is no need to write a control file for most CSV files you load. Instead, you can load the CSV file with just a few parameters on the SQL*Loader command line.
An example
Imagine I have a table like this
CREATE TABLE EMP
(EMPNO number(4) not null,
ENAME varchar2(10),
HIREDATE date,
DEPTNO number(2));
Then a csv file that looks like this
7782,Clark,09-Jun-81,10
7839,King,17-Nov-81,12
I can use sqlldr in express mode :
sqlldr userid=xxx table=emp
You can read more about express mode in this white paper
Express Mode in SQLLDR
Forget about using sqlldr in a script file. Your best bet is to use an external table. This is a CREATE TABLE statement with SQL*Loader-style access parameters that reads a file from a directory and exposes it as a table. Super easy, really convenient.
Here is an example:
create table thisTable (
"field1" varchar2(10)
,"field2" varchar2(100)
,"field3" varchar2(100)
,"dateField" date
) organization external (
type oracle_loader
default directory <createDirectoryWithYourPath>
access parameters (
records delimited by newline
load when (fieldname != BLANKS)
skip 9
fields terminated by ',' optionally ENCLOSED BY '"' ltrim
missing field values are null
(
"field1"
,"field2"
,"field3"
,"dateField" date 'mm/dd/yyyy'
)
)
location ('filename.csv')
);

Insert fixed values using BCP

I'm trying to import a TXT file using bcp.
My TXT file is like this:
abc|cba
xyz|zyx
My Table is like this:
Field_1 -> Identity field
Field_2 -> Varchar(3)
Field_3 -> Varchar(3)
Field_4 -> Varchar(1) In this case I must set it with default value 'P'
Field_5 -> Varchar(1) In this case I must set it with default value 'C'
My table with values must be:
1,abc,cba,P,C
2,xyz,zyx,P,C
Note -> My TXT file is huge (around 200 GB), so I can't import it into another table and then move all the values to this table (just saying).
Version -> SQL Server 2014 (SP2)
You cannot generate data via bcp; you must depend on SQL Server to do that, as Jeroen commented. To add to his comment: the identity value is not a default, so you should continue to use the identity property of the column.
For both (identity and default), you must use the -f option of bcp. This option supplies a format file that directs the bcp utility to see and handle the data as stated in the format file.
Using a format file, you can specify which fields in the file are mapped to which columns in the destination table. To exclude a column, just set its destination value to "0".
Format files and the bcp utility are much larger topics in and of themselves, but to answer your question: yes, it is possible, and using a format file with modified destination values (set to "0") is the way to do it.
Doing this, you can process the data in a single pass. Using PowerShell to append the fixed values is possible, but unnecessary and less efficient. To do this in one action with bcp, you need to use a format file.
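As a rough sketch (the database, server, and file names are placeholders, and it assumes Field_4 and Field_5 carry DEFAULT constraints of 'P' and 'C'; the collation is also an assumption): a non-XML format file that maps the two pipe-delimited fields of the TXT file to Field_2 and Field_3, and mentions nothing else, could look like this (fixedvals.fmt).
12.0
2
1  SQLCHAR  0  3  "|"     2  Field_2  SQL_Latin1_General_CP1_CI_AS
2  SQLCHAR  0  3  "\r\n"  3  Field_3  SQL_Latin1_General_CP1_CI_AS
Then load with:
bcp YourDb.dbo.YourTable in data.txt -f fixedvals.fmt -S yourserver -T
Because the identity column and the two defaulted columns are not mapped as destinations, SQL Server generates the identity values and applies the column defaults during the load.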

Pulling rows from .log file into SQL Server table

I have a very flat, simple log file (6 rows of which one row is blank) that I want to insert into a simple 5 column SQL Server table.
Please excuse my SQL ignorance as my knowledge around this topic is not educated.
Below is the .log file content :-
-----------Log File content start----------
07/30/2016 00:02:03 : BATCH CLOSE SUMMARY
MerchantID - 000022673665
TerminalID - 013
BatchItemCount - 650
NetBatchTotal - 5095.00
----------Log file content end-------------
Below is the simple SQL Server table layout:
CREATE TABLE dbo.CCClose
(
CloseTime NVARCHAR(50) NOT NULL,
MercID NVARCHAR(50) NOT NULL,
TermID NVARCHAR(50) NOT NULL,
BatchCount NVARCHAR(30) NOT NULL,
NetBatcTotal NVARCHAR(50) NOT NULL
);
I'm hoping to somehow have each row looked at by SQL, for example:
if .log file like 'Batch close Summary' then insert into CloseTime else
if .log file like 'MerchantID' then insert into MercID else
if .log file like 'BatchItemCount' then insert into BatchCount else
if .log file like 'NetBatchTotal' then insert into NetBatchTotal
Of course it would be great if the proper formatting for each column were in place, but at this time I'm just looking at getting the .log file data populated from a directory of these logs.
I plan to use Crystal Reports to build on the SQL Server tables.
This is not going to be a simple process. You can probably do it with bulk insert. The idea is to read it into a staging table, using:
a record terminator of something like "----------Log file content end-------------" + newline
a field separator of a newline
a staging table with several columns of varchars
Then process the staging table to extract the values (and types) that you want. There are probably other options, if you set up a format file, but that adds another level of complexity.
I would read the file into a staging table with one line per row. Then, I would (a sketch follows the list):
use window functions to assign a record number to rows, based on the "content start" lines
aggregate based on the record number
extract the values using aggregations, string functions, and conversions
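A rough sketch of that approach, assuming the log has been copied to a path the SQL Server service account can read (the staging table, file path, and string positions are assumptions):
CREATE TABLE dbo.CCClose_Staging (LineText nvarchar(4000) NULL);

BULK INSERT dbo.CCClose_Staging
FROM 'C:\logs\batchclose.log'
WITH (ROWTERMINATOR = '\n');  -- strip trailing CHAR(13) later if the file has Windows line endings

-- group lines into records: each BATCH CLOSE SUMMARY line starts a new record
-- (with no guaranteed line order this is only reliable when each file holds one record, as in the sample)
WITH numbered AS (
    SELECT LineText,
           SUM(CASE WHEN LineText LIKE '%BATCH CLOSE SUMMARY%' THEN 1 ELSE 0 END)
               OVER (ORDER BY (SELECT NULL) ROWS UNBOUNDED PRECEDING) AS RecordNo
    FROM dbo.CCClose_Staging
)
INSERT INTO dbo.CCClose (CloseTime, MercID, TermID, BatchCount, NetBatcTotal)
SELECT
    MAX(CASE WHEN LineText LIKE '%BATCH CLOSE SUMMARY%' THEN LEFT(LineText, 19) END),
    MAX(CASE WHEN LineText LIKE 'MerchantID%'     THEN LTRIM(SUBSTRING(LineText, CHARINDEX('-', LineText) + 1, 50)) END),
    MAX(CASE WHEN LineText LIKE 'TerminalID%'     THEN LTRIM(SUBSTRING(LineText, CHARINDEX('-', LineText) + 1, 50)) END),
    MAX(CASE WHEN LineText LIKE 'BatchItemCount%' THEN LTRIM(SUBSTRING(LineText, CHARINDEX('-', LineText) + 1, 50)) END),
    MAX(CASE WHEN LineText LIKE 'NetBatchTotal%'  THEN LTRIM(SUBSTRING(LineText, CHARINDEX('-', LineText) + 1, 50)) END)
FROM numbered
GROUP BY RecordNo;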

BULK INSERT with identity (auto-increment) column

I am trying to bulk-load data into the database from a CSV file.
The Employee table has an auto-incremented ID column (PK).
CREATE TABLE [dbo].[Employee](
[id] [int] IDENTITY(1,1) NOT NULL,
[Name] [varchar](50) NULL,
[Address] [varchar](50) NULL
) ON [PRIMARY]
I am using this query:
BULK INSERT Employee FROM 'path\tempFile.csv '
WITH (FIRSTROW = 2,KEEPIDENTITY,FIELDTERMINATOR = ',' , ROWTERMINATOR = '\n');
.CSV File -
Name,Address
name1,addr test 1
name2,addr test 2
but it results in this error message:
Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 2, column 1 (id).
Add an id column to the csv file and leave it blank:
id,Name,Address
,name1,addr test 1
,name2,addr test 2
Remove KEEPIDENTITY keyword from query:
BULK INSERT Employee FROM 'path\tempFile.csv '
WITH (FIRSTROW = 2,FIELDTERMINATOR = ',' , ROWTERMINATOR = '\n');
The id identity field will be auto-incremented.
If you assign values to the id field in the csv, they'll be ignored unless you use the KEEPIDENTITY keyword, then they'll be used instead of auto-increment.
Don't BULK INSERT into your real tables directly.
I would always
insert into a staging table dbo.Employee_Staging (without the IDENTITY column) from the CSV file
possibly edit / clean up / manipulate your imported data
and then copy the data across to the real table with a T-SQL statement like:
INSERT INTO dbo.Employee(Name, Address)
SELECT Name, Address
FROM dbo.Employee_Staging
I had a similar issue, but I needed to be sure that the order of the IDs aligns with the order in the source file.
My solution is using a VIEW for the BULK INSERT:
Keep your table as it is and create this VIEW (select everything except the ID column)
CREATE VIEW [dbo].[VW_Employee]
AS
SELECT [Name], [Address]
FROM [dbo].[Employee];
Your BULK INSERT should then look like:
BULK INSERT [dbo].[VW_Employee] FROM 'path\tempFile.csv '
WITH (FIRSTROW = 2,FIELDTERMINATOR = ',' , ROWTERMINATOR = '\n');
You have to do the bulk insert with a format file:
BULK INSERT Employee FROM 'path\tempFile.csv '
WITH (FORMATFILE = 'path\tempFile.fmt', FIRSTROW = 2);
where the format file (tempFile.fmt) looks like this:
11.0
2
1  SQLCHAR  0  50  ","     2  Name     SQL_Latin1_General_CP1_CI_AS
2  SQLCHAR  0  50  "\r\n"  3  Address  SQL_Latin1_General_CP1_CI_AS
more details here - http://msdn.microsoft.com/en-us/library/ms179250.aspx
My solution is to add the ID field as the LAST field in the table, thus bulk insert ignores it and it gets automatic values. Clean and simple ...
For instance, if inserting into a temp table:
CREATE TABLE #TempTable
(field1 varchar(max), field2 varchar(max), ...
ROW_ID int IDENTITY(1,1) NOT NULL)
Note that the ROW_ID field MUST always be specified as LAST field!
Create a table with Identity column + other columns;
Create a view over it and expose only the columns you will bulk insert;
BCP in the view
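For example, reusing the VW_Employee view from the answer above (the database and server names are placeholders), the import could look like this:
rem -c = character mode, -t, = comma field terminator, -F2 skips the header row
bcp YourDb.dbo.VW_Employee in "path\tempFile.csv" -c -t, -r\n -F2 -S yourserver -T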
I had this exact same problem, which cost me hours, so I'm inspired to share my findings and the solutions that worked for me.
1. Use an excel file
This is the approach I adopted. Instead of using a csv file, I used an excel file (.xlsx) with content like below.
id username email token website
johndoe johndoe@divostar.com divostar.com
bobstone bobstone@divosays.com divosays.com
Notice that the id column has no value.
Next, connect to your DB using Microsoft SQL Server Management Studio, right-click your database and select Import Data (a submenu under Tasks). Select Microsoft Excel as the source. When you arrive at the step called "Select Source Tables and Views", click Edit Mappings. For the id column under Destination, click it and select <ignore>. Don't check Enable Identity Insert unless you want to maintain IDs in cases where you are importing data from another database and would like to keep the auto-increment IDs of the source DB. Proceed to Finish and that's it. Your data will be imported smoothly.
2. Using CSV file
In your CSV file, make sure your data looks like below:
id,username,email,token,website
,johndoe,johndoe@divostar.com,,divostar.com
,bobstone,bobstone@divosays.com,,divosays.com
Run the query below:
BULK INSERT Metrics FROM 'D:\Data Management\Data\CSV2\Production Data 2004 - 2016.csv '
WITH (FIRSTROW = 2, FIELDTERMINATOR = ',', ROWTERMINATOR = '\n');
The problem with this approach is that the CSV must be on the DB server or in a shared folder that the DB can access; otherwise you may get an error like "Cannot open file. The operating system returned error code 21 (The device is not ready)".
If you are connecting to a remote database, then you can upload your CSV to a directory on that server and reference the path in bulk insert.
3. Using CSV file and Microsoft SQL Server Management Studio import option
Launch the Import Data wizard as in the first approach. For the source, select Flat File Source and browse to your CSV file. Make sure the settings on each page (General, Columns, Advanced, Preview) are OK, and in particular set the right delimiter on the Columns page (Column delimiter). Just like in the Excel approach above, click Edit Mappings. For the id column under Destination, click it and select <ignore>.
Proceed to finish and that's it. Your data will be imported smoothly.
This is a very old post to answer, but none of the given answers solves the problem without changing the posed conditions, which I can't do.
I solved it by using the OPENROWSET variant of BULK INSERT. This uses the same format file and works in the same way, but it allows the data file to be read with a SELECT statement.
Create your table:
CREATE TABLE target_table(
id bigint IDENTITY(1,1),
col1 varchar(256) NULL,
col2 varchar(256) NULL,
col3 varchar(256) NULL)
Open a command window and run:
bcp dbname.dbo.target_table format nul -c -x -f C:\format_file.xml -t; -T
This creates the format file based on how the table looks.
Now edit the format file and remove the entire rows where FIELD ID="1" and COLUMN SOURCE="1", since this field does not exist in our data file.
Also adjust the terminators as needed for your data file:
<?xml version="1.0"?>
<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<RECORD>
<FIELD ID="2" xsi:type="CharTerm" TERMINATOR=";" MAX_LENGTH="256" COLLATION="Finnish_Swedish_CI_AS"/>
<FIELD ID="3" xsi:type="CharTerm" TERMINATOR=";" MAX_LENGTH="256" COLLATION="Finnish_Swedish_CI_AS"/>
<FIELD ID="4" xsi:type="CharTerm" TERMINATOR="\r\n" MAX_LENGTH="256" COLLATION="Finnish_Swedish_CI_AS"/>
</RECORD>
<ROW>
<COLUMN SOURCE="2" NAME="col1" xsi:type="SQLVARYCHAR"/>
<COLUMN SOURCE="3" NAME="col2" xsi:type="SQLVARYCHAR"/>
<COLUMN SOURCE="4" NAME="col3" xsi:type="SQLVARYCHAR"/>
</ROW>
</BCPFORMAT>
Now we can bulk load the data file into our table with a SELECT, thus having full control over the columns, in this case by not inserting data into the identity column:
INSERT INTO target_table (col1,col2, col3)
SELECT * FROM openrowset(
bulk 'C:\data_file.txt',
formatfile='C:\format_file.xml') as t;
Another option, if you're using temporary tables instead of staging tables, could be to create the temporary table as your import expects, then add the identity column after the import.
So your sql does something like this:
If temp table exists, drop
Create temp table
Bulk Import to temp table
Alter temp table add identity
< whatever you want to do with the data >
Drop temp table
Still not very clean, but it's another option... might have to get locks to be safe, too.
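A rough sketch of those steps, assuming the same Employee CSV as in the question (the temp-table name and file path are placeholders):
IF OBJECT_ID('tempdb..#EmployeeImport') IS NOT NULL
    DROP TABLE #EmployeeImport;

-- create the temp table without the identity column, matching the CSV layout
CREATE TABLE #EmployeeImport
(
    [Name]    varchar(50) NULL,
    [Address] varchar(50) NULL
);

BULK INSERT #EmployeeImport FROM 'path\tempFile.csv'
WITH (FIRSTROW = 2, FIELDTERMINATOR = ',', ROWTERMINATOR = '\n');

-- add the identity column only after the load
ALTER TABLE #EmployeeImport ADD [id] int IDENTITY(1,1) NOT NULL;

-- ... whatever you want to do with the data, e.g. copy it into the real table ...
INSERT INTO dbo.Employee ([Name], [Address])
SELECT [Name], [Address] FROM #EmployeeImport;

DROP TABLE #EmployeeImport;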
