Bulk import of XML data into SQL Server - sql-server

I have a set of XML files whose data I want to parse and import into a SQL Server 2012 database. The XML files will be validated against a schema.
I am looking for the best method of doing this. I have found this: http://msdn.microsoft.com/en-us/library/ms171878.aspx
Is this the best way, or are there others?

You have several options:
SSIS XML Source. This does not validate against the schema. If you want to detect and properly handle invalid XML files, create a Script Task that validates them against the schema in C#.
Parse the XML in a stored procedure.
Insert the entire XML file into one column. Depending on your schema validation requirements, you can use an untyped or a typed XML column (or both).
Parse the XML using the xml data type's value() method with XPath expressions. This is actually very fast.
INSERT INTO SomeTable (Column1, Column2, Column3, Column4)
SELECT
YourXmlColumn.value('(/root/col1)[1]','int'),
YourXmlColumn.value('(/root/col2)[1]','nvarchar(10)'),
YourXmlColumn.value('(/root/col3)[1]','nvarchar(2000)'),
YourXmlColumn.value('(/root/col4)[1]','datetime2(0)')
FROM YourXmlTable

Related

How to get insert fields from sql?

I am using Flink SQL to analyze SQL lineage.
I use the Flink planner to parse a statement such as
insert into target_table(dest_f1, dest_f2) select source_f1, source_f2 from source_table
Obviously, source_f1 is the source of dest_f1.
When I get a CatalogSinkModifyOperation from the Flink planner, it doesn't contain any information about the insert columns, that is, no dest_f1 or dest_f2.
How can I get the names of the insert columns of target_table?
You can use the following code to get the column information of the target table:
List<String> targetColumnList = tableEnv.from(sinkTable)
.getResolvedSchema()
.getColumnNames();
or
relNode.getRowType().getFieldNames()
If you want to parse field-level lineage of Flink SQL, you can refer to this open-source project:
https://github.com/HamaWhiteGG/flink-sql-lineage

How to Import/export SQL Server tables via XML files

I am receiving data files in XML format from a public agency. The structure of the files and the notations within make me think that what I am seeing is a bulk table dump from SQL Server. Each XML file starts with a schema definition. This is followed by multiple elements, each element looking like it contains one table row of data. The structure of the files makes me think that they were created by some facility within SQL Server itself.
I have looked at the SQL Server docs and online articles, but I cannot find information on how this would be accomplished. I am thinking that if there is a built in facility to export the data in this format, there must be a built in facility to import the data back into SQL Server.
A snippet of one XML file showing the opening schema section and a couple of data elements can be found here: sample data XML file
Am I correct in thinking that these files are SQL Server dumps? If so, how are they exported and then imported?
You can use the XML functionality of SQL Server, something like this.
Import
declare @x xml = N'--your xml here'
; with xmlnamespaces('urn:schemas-microsoft-com:sql:SqlRowSet1' as p) --required because the xml has an xmlns attribute
--insert mytable
select t.v.value('p:pool_idn[1]','int') pool_idn, --note the p: prefix
t.v.value('p:eff_dte[1]','datetime') eff_dte,
t.v.value('p:pool_nam[1]','varchar(50)') pool_nam
--and so on
from @x.nodes('root/p:pool') t(v)
Export
select * from mytable
for xml auto, elements, root, xmlschema('urn:my:schema')
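To check what such a file contains before loading it, the same namespace-qualified lookup can be sketched outside SQL Server. This is a minimal Python sketch using only the standard library; the sample document, its values, and the read_pools helper are made up for illustration, and the element names simply mirror the ones in the import query above.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample shaped like the import query expects: a <root> element
# whose <pool> rows live in the SqlRowSet1 namespace.
SAMPLE = """<root>
  <pool xmlns="urn:schemas-microsoft-com:sql:SqlRowSet1">
    <pool_idn>1</pool_idn>
    <eff_dte>2015-01-31T00:00:00</eff_dte>
    <pool_nam>Alpha</pool_nam>
  </pool>
  <pool xmlns="urn:schemas-microsoft-com:sql:SqlRowSet1">
    <pool_idn>2</pool_idn>
    <eff_dte>2015-02-28T00:00:00</eff_dte>
    <pool_nam>Beta</pool_nam>
  </pool>
</root>"""

# Same prefix-to-URI mapping as WITH XMLNAMESPACES in the T-SQL version.
NS = {"p": "urn:schemas-microsoft-com:sql:SqlRowSet1"}


def read_pools(xml_text):
    """Return one dict per <pool> row; dates are left as strings."""
    root = ET.fromstring(xml_text)
    rows = []
    for pool in root.findall("p:pool", NS):
        rows.append({
            "pool_idn": int(pool.findtext("p:pool_idn", namespaces=NS)),
            "eff_dte": pool.findtext("p:eff_dte", namespaces=NS),
            "pool_nam": pool.findtext("p:pool_nam", namespaces=NS),
        })
    return rows
```

As in the T-SQL version, omitting the p: prefix would match nothing, because the row elements are in the SqlRowSet1 namespace rather than in no namespace.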

SSIS ETL: Xml column into varchar destination

What would be the most efficient way to do the following?
Requirement
Copy an XML column into the staging DB and then
Staging to DWH :
Column Size
Source : Xml
StagingDB : Xml or Varchar ?
Datawarehouse : Varchar(8000)
I need to decide on the best-performing way to copy an XML column into the staging DB. Is it better to copy the XML from the source into an xml column in the staging DB, or to convert it to Varchar(max), given that the data transferred will number in the millions?
If you don't need any XML-specific features and only want to store the whole XML in a Varchar(8000) column in the data warehouse, it is easier to read the XML as text (less validation required, faster).
Note that you can read the file using a Script Task or another component instead of the XML Source.
So it is up to you: if you need to extract XML nodes or use XML functions or properties, use Xml; otherwise use Varchar.

extracting and saving xml results in file using SSIS

I am trying to query results as XML from an Oracle DB using the XMLElement and XMLAgg functions, which return the result as a CLOB. When I use this query as the source of a Data Flow Task in SSIS, I get an unsupported data format error.
Query:
select XMLElement("root",
XMLAgg(XMLElement("person",
XMLForest(person.first_name, person.last_name)))) AS "XMLResult"
from person
Question:
How do I use this query in SSIS (2008 R2) without getting that error, or is there a workaround? I also need to write the results to a file.
You will need to convert the result into VARCHAR or VARCHAR2 datatype as SSIS doesn't support the XML datatype.

SQL Server Insert failure due to XML Schema validation error

I have an XML column in a table, and it is bound to a schema. I am trying to insert values into this table using Insert into tbl1 Select * from tbl for xml, but this fails because one of the records does not pass schema validation. I want to at least insert the records that pass validation and capture the others later. Can someone help me with this?
SQL Server validates the dataset as a whole, not row by row, so a single failure rejects the entire insert. If you want to validate row by row using SQL Server tools, the options are:
SQLCLR (fastest)
SSIS (easy to create): use a Foreach Loop to insert one row at a time. All failed rows are redirected to another table.
T-SQL TRY/CATCH block: assign the XML from a single row to a schema-validated variable. Slowest one.
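Whichever tool performs the check, the row-by-row pattern has the same control flow: try each row on its own, keep the ones that pass, and set the failures aside with the reason. A minimal Python sketch of that flow follows. It assumes a well-formedness check from the standard library as a stand-in for real schema validation (XSD validation is not in the Python standard library), and the split_rows helper and its (row_id, xml_text) input shape are hypothetical.

```python
import xml.etree.ElementTree as ET


def split_rows(rows):
    """Split (row_id, xml_text) pairs into accepted and rejected lists.

    Stand-in check: a row is accepted when its XML is well-formed.
    Real schema validation would replace the ET.fromstring call.
    """
    accepted, rejected = [], []
    for row_id, xml_text in rows:
        try:
            ET.fromstring(xml_text)  # raises ParseError on malformed XML
        except ET.ParseError as err:
            # Keep the id and the reason so the row can be handled later.
            rejected.append((row_id, str(err)))
        else:
            accepted.append((row_id, xml_text))
    return accepted, rejected
```

The accepted list corresponds to the rows that would be inserted, and the rejected list plays the role of the error table that failed rows are redirected to.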
