I have XML with a complex structure. I was able to pull the set of data I need from this sensor, such as the measurement "from", "to", and "count" values, but I also have to pull data about the sensor itself, like the IP address and serial number, which lives in a different tag that doesn't share an id with the data tags. Here is the XML:
<response xmlns="http://www.test.com/sensor-api/v2">
<sensor-time timezone="America/New_York">2017-07-18T15:45:03-04:00</sensor-time>
<status>
<code>OK</code>
</status>
<sensor-info>
<serial-number>Q3:80:39:40:9Z:N2</serial-number>
<ip-address>192.163.135.10</ip-address>
<name>Test</name>
<group />
<device-type>PC2 - UL</device-type>
</sensor-info>
<content>
<elements>
<element>
<element-id>2</element-id>
<element-name>Conf_Lower_Zone</element-name>
<sensor-type>SINGLE_SENSOR</sensor-type>
<data-type>ZONE</data-type>
<from>2017-07-18T15:40:00-04:00</from>
<to>2017-07-18T15:45:00-04:00</to>
<resolution>ONE_MINUTE</resolution>
<measurements>
<measurement>
<from>2017-07-18T15:40:00-04:00</from>
<to>2017-07-18T15:41:00-04:00</to>
<values>
<value label="count">0</value>
</values>
</measurement>
<measurement>
<from>2017-07-18T15:41:00-04:00</from>
<to>2017-07-18T15:42:00-04:00</to>
<values>
<value label="count">0</value>
</values>
</measurement>
I used an SSIS package with a Merge Join transformation and was able to push the data into a SQL table. Now I have to add the sensor info (IP, serial number) to the same table, so the serial number and IP would repeat for every row of data, of course.
How do I do this in the SSIS package? What process should I use to add two additional columns whose values repeat all the way down for every row?
Here is the SSIS package so far:
OK, so I edited the SSIS package to derive two different outputs from the XML Source: one with sensor-info that feeds the Sensor_Info table in SQL Server, and another that feeds the Count_Data table in SQL Server.
Then I added an Execute SQL Task inside the Foreach Loop Container, as in the image below, with this query:
USE SANDBOX
GO
INSERT INTO ALL_DATA
SELECT *
FROM [SANDBOX].[dbo].[Sensor_Info],[dbo].[Count_Data]
This is an attempt to combine these two tables after each XML load. However, I am getting garbage data: the tables do get combined, but the result makes no sense.
What am I doing wrong now?
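For reference, a minimal sketch of how that per-load combine could be made deterministic, assuming Sensor_Info holds exactly one row per XML load and the staging tables are emptied between loads (the column names below are illustrative, not taken from the actual tables):
-- Column names are assumptions; replace with the actual staging columns
INSERT INTO dbo.ALL_DATA (SerialNumber, IpAddress, MeasureFrom, MeasureTo, MeasureCount)
SELECT  si.SerialNumber,
        si.IpAddress,
        cd.MeasureFrom,
        cd.MeasureTo,
        cd.MeasureCount
FROM dbo.Count_Data AS cd
CROSS JOIN dbo.Sensor_Info AS si;   -- the single sensor row repeats for every data row

TRUNCATE TABLE dbo.Sensor_Info;     -- clear staging so the next XML load starts clean
TRUNCATE TABLE dbo.Count_Data;
The cross join against a single-row Sensor_Info is what makes the serial number and IP repeat on every data row; if Sensor_Info or Count_Data accumulates rows across loads, every data row gets duplicated per sensor row, which matches the "garbage data" symptom described above.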
One of your XML outputs should be sensor-info.
You should run that through a Script Component transformation:
Set up two package variables, IP and SerNum, and add them to the script's ReadWriteVariables.
Check both the IP and SerialNumber columns as input columns.
Open the script; the code is simply this:
Variables.IP = Row.IP.ToString();
Variables.SerNum = Row.SerialNumber.ToString();
Now add a Derived Column transformation in the data flow where you want these values, and set the two new columns to the variables you just defined.
Basically, my purpose (in an IT urbanization / architecture-mapping context) is to retrieve ALL input data sources (sqlcommand or tableorviewname) and ALL output data sources of our packages.
I made a first successful attempt with one of our dtsx packages:
SELECT CONVERT(XML, BulkColumn) AS xContent INTO Packages
FROM OPENROWSET(BULK 'F:\Repos\DW_all\DW_all\anaplan_sales_mrr_v2.dtsx', SINGLE_BLOB) AS x;
WITH XMLNAMESPACES ('www.microsoft.com/SqlServer/Dts' AS DTS)
SELECT
X.Exe.value('(./@DTS:Description)','nvarchar(20)') as description
,X.Exe.value('(./@DTS:ObjectName)','nvarchar(20)') as ObjectName
,X.Exe.value('(./DTS:ObjectData/pipeline/components/component/properties/property)[1]','nvarchar(25)') as TargettedTable
,X.Exe.value('(./DTS:ObjectData/pipeline/components/component/connections/connection/@connectionManagerRefId)[1]','nvarchar(100)') as TargettedConnect
,X.Exe.value('(./DTS:ObjectData/pipeline/components/component[2]/properties/property[1]/@name)[1]','nvarchar(max)') as SourceType
,X.Exe.value('(./DTS:ObjectData/pipeline/components/component[2]/properties/property)[1]','nvarchar(max)') as [Source]
,X.Exe.value('(./DTS:ObjectData/pipeline/components/component[2]/connections/connection/@connectionManagerRefId)[1]','nvarchar(100)') as SourceConnect
FROM (SELECT xContent AS pkgXML FROM [dbo].[Packages]) t
CROSS APPLY pkgXML.nodes('/DTS:Executable/DTS:Executables/DTS:Executable/DTS:Executables/DTS:Executable') X(Exe)
WHERE X.Exe.value('(./@DTS:Description)[1]','nvarchar(max)') = 'Data Flow Task'
RESULT:

description      ObjectName       TargettedTable   ...
Data Flow Task   1rst flowname    First Table
Data Flow Task   Other Flow       other Table
Data Flow Task   Another Flow     another table
Well, up to this point everything is fine.
Unfortunately, the properties' positions and the path are not always the same. In my example above, the target is the first component in the XML file (that's why [1]) and the source comes after it (that's why [2]), but in other packages it's the reverse.
Along the same lines, the property indicating the type of the data source (name=sqlcommand or name=tableorviewname) is not always in the same position, so the positional pointer [1] is not reliable.
Moreover, in my example above the path is '/DTS:Executable/DTS:Executables/DTS:Executable/DTS:Executables/DTS:Executable' (with one Sequence Container), but other packages have no container and therefore a different path (e.g. '/DTS:Executable/DTS:Executables/DTS:Executable').
I have tried some tests with wildcard-like expressions such as [.] or [*], but I'm not comfortable with this and my tests are still failing. Here is an extract of the package XML:
<DTS:ObjectData xmlns:DTS="DTS">
<pipeline
version="1">
<components>
<component
refId="Package\blabla my description id"
componentClassID="Microsoft.OLEDBSource"
contactInfo="OLE DB Source;Microsoft Corporation; Microsoft SQL Server; (C) Microsoft Corporation; All Rights Reserved; http://www.microsoft.com/sql/support;7"
description="OLE DB Source"
name="My DataFlow Name"
usesDispositions="true"
version="7">
<properties>
<property
dataType="System.Int32"
description="The number of seconds before a command times out. A value of 0 indicates an infinite time-out."
name="CommandTimeout">0</property>
<property
dataType="System.String"
description="Specifies the name of the database object used to open a rowset."
name="OpenRowset"></property>
<property
dataType="System.String"
description="Specifies the variable that contains the name of the database object used to open a rowset."
name="OpenRowsetVariable"></property>
<property
dataType="System.String"
description="The SQL command to be executed."
name="SqlCommand"
UITypeEditor="Microsoft.DataTransformationServices.Controls.ModalMultilineStringEditor, Microsoft.DataTransformationServices.Controls, Version=15.0.0.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91">SELECT * From OneTable Inner Join AnotherTable ...
</property>
</properties>
</component>
</components>
</pipeline>
</DTS:ObjectData>
Can anyone please help me improve the initial script so that it works for any package, returning every sqlcommand or tableorviewname used as input by the data flows in the package, and the same for the outputs?
TIA for your help and advice :-)
Fred.M.
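One possible direction (a sketch only, untested against real packages): filter the component elements on @componentClassID and the property elements on @name instead of relying on position, and use the descendant axis (//) so the depth of DTS:Executable nesting no longer matters. The destination class ID Microsoft.OLEDBDestination and the dbo.Packages table / xContent column are assumptions based on the extract and load statement above:
WITH XMLNAMESPACES ('www.microsoft.com/SqlServer/Dts' AS DTS)
SELECT
     X.Exe.value('@DTS:ObjectName', 'nvarchar(200)') AS DataFlowName
    -- source side: one of these two is populated depending on the access mode
    ,X.Exe.value('(.//component[@componentClassID="Microsoft.OLEDBSource"]/properties/property[@name="SqlCommand"])[1]', 'nvarchar(max)') AS SourceSqlCommand
    ,X.Exe.value('(.//component[@componentClassID="Microsoft.OLEDBSource"]/properties/property[@name="OpenRowset"])[1]', 'nvarchar(max)') AS SourceTableOrView
    ,X.Exe.value('(.//component[@componentClassID="Microsoft.OLEDBSource"]/connections/connection/@connectionManagerRefId)[1]', 'nvarchar(400)') AS SourceConnection
    -- destination side (class ID is an assumption; verify against your packages)
    ,X.Exe.value('(.//component[@componentClassID="Microsoft.OLEDBDestination"]/properties/property[@name="OpenRowset"])[1]', 'nvarchar(max)') AS TargetTableOrView
    ,X.Exe.value('(.//component[@componentClassID="Microsoft.OLEDBDestination"]/connections/connection/@connectionManagerRefId)[1]', 'nvarchar(400)') AS TargetConnection
FROM (SELECT xContent AS pkgXML FROM dbo.Packages) t
-- // matches the Data Flow Task whether or not it sits inside a Sequence Container
CROSS APPLY pkgXML.nodes('//DTS:Executable[@DTS:Description="Data Flow Task"]') X(Exe);
If a data flow contains more than one source or destination component, the [1] still returns only the first match; in that case you would CROSS APPLY a second .nodes() call on .//component rather than taking a single value per data flow.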
I have read dozens of posts and have tried numerous SQL queries to try and get this figured out. Sadly, I'm not a SQL expert (not even a novice), nor am I an XML expert. I understand basic SQL queries, and I mostly understand XML tags.
I'm trying to query a database table, and have the data show a list of values from a column that contains XML. I'll give you an example of the data. I won't burden you with everything I have tried.
Here is an example of the field inside the column I need. This is just one row; I would need to query the whole table to get all of the data I need.
When I SELECT * FROM [table name] it returns hundreds of rows, and when I double-click the 'Document' column on one row, I get the information I need.
It looks like this:
<code_set xmlns="">
<name>ExampleCodeTable</name>
<last_updated>2010-08-30T17:49:58.7919453Z</last_updated>
<code id="1" last_updated="2010-01-20T17:46:35.1658253-07:00"
start_date="1998-12-31T17:00:00-07:00"
end_date="9999-12-31T16:59:59.9999999-07:00">
<entry locale="en-US" name="T" description="Test1" />
</code>
<code id="2" last_updated="2010-01-20T17:46:35.1658253-07:00"
start_date="1998-12-31T17:00:00-07:00"
end_date="9999-12-31T16:59:59.9999999-07:00">
<entry locale="en-US" name="Z" description="Test2" />
</code>
<displayExpression>[Code] + ' - ' + [Description]</displayExpression>
<sortColumn>[Description]</sortColumn>
</code_set>
Ideally I would write it so it runs the query on the table and produces results like this:
Code Description
--------------------
(Data) (Data)
Any ideas? Is it even possible? The dozens of things I have tried from other Stack Overflow posts either return NULLs or fail.
Thanks for your help
Try something like this:
SELECT
CodeSetId = xc.value('@id', 'int'),
Description = xc.value('(entry/@description)[1]', 'varchar(50)')
FROM
dbo.YourTableNameHere
CROSS APPLY
YourXmlColumn.nodes('/code_set/code') AS XT(XC)
This basically uses the built-in XQuery support to get an "in-memory" table (XT) with a single column (XC), each row containing an XML fragment that represents one <code> node inside your <code_set> root node.
Once you have each of these XML fragments, you can use the .value() XQuery method to "reach in" and grab pieces of information from it, e.g. its @id (the attribute named id), or the @description attribute on the contained <entry> subelement.
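Applied to the sample document in the question, a sketch of the full query might look like this; it assumes the XML lives in the Document column of [table name] (as described above) and guesses that "Code" corresponds to the name attribute of the <entry> element:
SELECT
    Code        = xc.value('(entry/@name)[1]',        'varchar(10)'),
    Description = xc.value('(entry/@description)[1]', 'varchar(100)')
FROM
    dbo.[table name]
CROSS APPLY
    -- shred one row per <code> element in the Document column
    [Document].nodes('/code_set/code') AS XT(xc);
If the Document column is stored as text rather than the xml data type, cast it first, e.g. CROSS APPLY (SELECT CAST(Document AS XML)) AS src(doc) and then call doc.nodes(...) instead.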
The following query will read the xml field in every row, then shred certain values into a tabular result set.
SELECT
-- get attribute [attribute name] from the parent node
parent.value('./@attributeName','varchar(max)') as ParentAttributeValue,
-- get the text value of the first child node
child.value('./text()', 'varchar(max)') as ChildNodeValueFromFirstChild,
-- get attribute [attribute name] from the first child node
child.value('./@attributeName', 'varchar(max)') as ChildAttributeValueFromFirstChild
FROM
[table name]
CROSS APPLY
-- create a handle named parent that references that <parent node> in each row
[xml field name].nodes('//xpath to parent name') AS ParentName(parent)
CROSS APPLY
-- create a handle named child that references first <child node> in each row
parent.nodes('(xpath from parent/to child)[1]') AS FirstChildNode(child)
GO
Please provide the exact values you want to shred from the XML for a more precise answer.
I have an XML document that I'm working to build a schema for in order to bulk load these documents into a SQL Server table. The XML I'm focusing on looks like this:
<Coverage>
<CoverageCd>BI</CoverageCd>
<CoverageDesc>BI</CoverageDesc>
<Limit>
<FormatCurrencyAmt>
<Amt>30000.00</Amt>
</FormatCurrencyAmt>
<LimitAppliesToCd>PerPerson</LimitAppliesToCd>
</Limit>
<Limit>
<FormatCurrencyAmt>
<Amt>85000.00</Amt>
</FormatCurrencyAmt>
<LimitAppliesToCd>PerAcc</LimitAppliesToCd>
</Limit>
</Coverage>
<Coverage>
<CoverageCd>PD</CoverageCd>
<CoverageDesc>PD</CoverageDesc>
<Limit>
<FormatCurrencyAmt>
<Amt>50000.00</Amt>
</FormatCurrencyAmt>
<LimitAppliesToCd>Coverage</LimitAppliesToCd>
</Limit>
</Coverage>
Inside the Limit element, there's a child LimitAppliesToCd that I need to use to determine where the Amt element's value actually gets stored inside my table. Is this possible to do using the standard XML Bulk Load feature of SQL Server? Normally in XML I'd expect that the element would have an attribute containing the "PerPerson" or "PerAcc" information, but this standard we're using does not call for that.
If anyone has worked with the ACORD standard before, you might know what I'm working with here. Any help is greatly appreciated.
I don't know exactly what you are talking about, but this is a solution to get the information out of your XML.
Assumption: your XML is already bulk-loaded into a declared variable @xml of type XML.
A CTE will pull the information out of your XML. The final query will then use PIVOT to put your data into the right column.
With a fitting table structure, the actual insert should be simple...
WITH DerivedTable AS
(
SELECT cov.value('CoverageCd[1]','varchar(max)') AS CoverageCd
,cov.value('CoverageDesc[1]','varchar(max)') AS CoverageDesc
,lim.value('(FormatCurrencyAmt/Amt)[1]','decimal(14,4)') AS Amt
,lim.value('LimitAppliesToCd[1]','varchar(max)') AS LimitAppliesToCd
FROM @xml.nodes('/root/Coverage') AS A(cov)
CROSS APPLY cov.nodes('Limit') AS B(lim)
)
SELECT p.*
FROM
(SELECT * FROM DerivedTable) AS tbl
PIVOT
(
MIN(Amt) FOR LimitAppliesToCD IN(PerPerson,PerAcc,Coverage)
) AS p
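For completeness, a minimal end-to-end sketch, assuming the <Coverage> fragments are wrapped in a <root> element when loaded into @xml and assuming a hypothetical target table dbo.CoverageLimits with one column per LimitAppliesToCd value:
-- hypothetical target table; adjust names and types to your schema
CREATE TABLE dbo.CoverageLimits
(
    CoverageCd   varchar(10),
    CoverageDesc varchar(50),
    PerPerson    decimal(14,4),
    PerAcc       decimal(14,4),
    Coverage     decimal(14,4)
);

DECLARE @xml XML = N'<root>
                       <!-- paste the <Coverage> elements from the question here -->
                     </root>';

WITH DerivedTable AS
(
    SELECT cov.value('CoverageCd[1]','varchar(10)')                AS CoverageCd
          ,cov.value('CoverageDesc[1]','varchar(50)')              AS CoverageDesc
          ,lim.value('(FormatCurrencyAmt/Amt)[1]','decimal(14,4)') AS Amt
          ,lim.value('LimitAppliesToCd[1]','varchar(20)')          AS LimitAppliesToCd
    FROM @xml.nodes('/root/Coverage') AS A(cov)
    CROSS APPLY cov.nodes('Limit') AS B(lim)
)
INSERT INTO dbo.CoverageLimits (CoverageCd, CoverageDesc, PerPerson, PerAcc, Coverage)
SELECT CoverageCd, CoverageDesc, PerPerson, PerAcc, Coverage
FROM DerivedTable
PIVOT (MIN(Amt) FOR LimitAppliesToCd IN (PerPerson, PerAcc, Coverage)) AS p;
Each LimitAppliesToCd value becomes a column, so a Coverage element with limits "PerPerson" and "PerAcc" ends up as a single row with both amounts filled in and the remaining column NULL.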
My database component has the following configuration
<db:insert config-ref="Oracle_Configuration" bulkMode="true" doc:name="Database">
<db:dynamic-query><![CDATA[#[flowVars.dbquery]]]></db:dynamic-query>
</db:insert>
I have declared the "dbquery" variable as follows
<set-variable variableName="dbquery" value="INSERT INTO WBUSER.EMP VALUES('#[payload.FullName]','#[payload.SerialNumber]')" doc:name="Variable"/>
On running the application, the values inserted into the DB are the literal strings "#[payload.FullName]" and "#[payload.SerialNumber]".
But when my database component has the following configuration, the actual values of FullName and SerialNumber are inserted into the database:
<db:insert config-ref="Oracle_Configuration" bulkMode="true" doc:name="Database">
<db:dynamic-query><![CDATA[INSERT INTO WBUSER.EMP VALUES('#[payload.FullName]','#[payload.SerialNumber]')]]></db:dynamic-query>
</db:insert>
Here FullName and SerialNumber are not flow variables; they are the column names (keys) of the entries in the payload list, e.g. [{FullName=yo, SerialNumber=129329}, {FullName=he, SerialNumber=129329}].
Can someone tell me the difference here? And is there a way I can achieve the database insert using just the variable, as in the first case?
This is caused by the different approach to inserting the data. The second configuration works correctly because the query is written inside db:insert, the payload is a List, and the Bulk Mode option is selected.
To make the first configuration (the SQL query declared in a variable) work, you have to do the following:
Iterate over each payload value by using a collection-splitter.
Deselect Bulk Mode on the database connector.
The configuration should be:
<collection-splitter doc:name="Collection Splitter"/>
<set-variable variableName="dbquery" value="INSERT INTO WBUSER.EMP VALUES('#[payload.FullName]','#[payload.SerialNumber]')" doc:name="Variable"/>
<db:insert config-ref="MySQL_Configuration" doc:name="Database">
<db:dynamic-query><![CDATA[#[flowVars.dbquery]]]></db:dynamic-query>
</db:insert>
I created an SSIS package where I need to use a Foreach Loop Container (FELC). The first step before the loop is an Execute SQL Task that obtains all the SQL statements, stored in a source table, that are designed to generate different XML files. Inside the FELC I would like to process the statements to generate the XML files and send them to various folder locations, with the file names and target folders also coming from the source table. There are hundreds of files that need to be refreshed on a regular basis. Instead of running a separate job for each XML file, I would like to amalgamate it all into one process.
Is it possible?
This is the basic Shred Recordset pattern.
I have 3 variables declared: QuerySource, CurrentStatement and rsQueryData. The first 2 are Strings, the last is an Object type.
SQL - Get source data
This is my query. It simulates your table and induces a failing SQL Statement if I take out the filter.
SELECT
ProcessID
, Stmt_details
FROM
(
VALUES
( 1, 'SELECT 1;', 1)
, ( 20, 'SELECT 20;', 1)
, ( 30, 'SELECT 1/0;', 0)
) Stmt_collection (ProcessID, Stmt_details, xmlFlag)
WHERE
xmlFlag = 1
The Execute SQL Task is set with ResultSet = Full result set, and I assign it to the variable User::rsQueryData, which has a Result Name of 0 on the Result Set tab.
FELC
This is a standard Foreach ADO Recordset Loop container. I use my User::rsQueryData as the source, and since I only care about the second element, ordinal position 1, that's the only thing I map. I assign the current value to User::CurrentStatement.
SQL - Execute CurrentStatement
This is an Execute SQL Task whose source is the variable User::CurrentStatement. There's no scripting involved: the FELC handles assigning values to that variable, and this task simply executes whatever statement it currently holds. This is very much how native SSIS developers approach this kind of problem; if you reach for a Script Task or Component as your first resort, you're likely doing it wrong.
Biml
If you're doing any level of SSIS/SSRS/SSAS development, you want BIDS Helper. It is a free add-on to Visual Studio that makes your development life so much easier. The feature I'm going to leverage here is the ability to declaratively define an SSIS package. The language is called the Business Intelligence Markup Language, Biml, and I love it for many reasons, but on Stack Overflow I love it because I can give you the code to reproduce my solution exactly. Otherwise, I would have to build out a few hundred screenshots showing you everywhere I have to click and every value I have to set.
Or, you can:
1. Download and install BIDS Helper
2. Open up your existing SSIS project
3. Right click on the Project and select "Add new Biml file"
4. In the resulting BimlScript.biml file, open it up and paste all of the following code into it
5. Fix the value for your database connection string. This one assumes you have an instance on your local machine called Dev2014
6. Save the biml file
7. Right click that BimlScript.biml and select "Generate SSIS Packages"
8. Marvel at the resulting so_28867703.dtsx package that was added to your solution
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<Connections>
<OleDbConnection Name="CM_OLE" ConnectionString="Data Source=localhost\dev2014;Initial Catalog=tempdb;Provider=SQLNCLI10.1;Integrated Security=SSPI;Auto Translate=False;" />
</Connections>
<Packages>
<Package ConstraintMode="Linear" Name="so_28867703">
<Variables>
<Variable DataType="String" Name="QuerySource">SELECT ProcessID, Stmt_details FROM (VALUES (1, 'SELECT 1;', 1), (20, 'SELECT 20;', 1), (30, 'SELECT 1/0;', 0))Stmt_collection(ProcessID, Stmt_details, xmlFlag) WHERE xmlFlag = 1 </Variable>
<Variable DataType="String" Name="CurrentStatement">This statement is invalid</Variable>
<Variable DataType="Object" Name="rsQueryData"></Variable>
</Variables>
<Tasks>
<ExecuteSQL
ConnectionName="CM_OLE"
Name="SQL - Get source data"
ResultSet="Full"
>
<VariableInput VariableName="User.QuerySource" />
<Results>
<Result VariableName="User.rsQueryData" Name="0" />
</Results>
</ExecuteSQL>
<ForEachAdoLoop
SourceVariableName="User.rsQueryData"
ConstraintMode="Linear"
Name="FELC - Shred RS"
>
<VariableMappings>
<!--
0 based system
-->
<VariableMapping VariableName="User.CurrentStatement" Name="1" />
</VariableMappings>
<Tasks>
<ExecuteSQL ConnectionName="CM_OLE" Name="SQL - Execute CurrentStatement">
<VariableInput VariableName="User.CurrentStatement" />
</ExecuteSQL>
</Tasks>
</ForEachAdoLoop>
</Tasks>
</Package>
</Packages>
</Biml>
That package will run, assuming you fixed the connection string to point at a valid instance. If you put breakpoints on the Execute SQL Task, you'll see it light up two times, and if you have a watch window on CurrentStatement, you can see it change from the design-time value to the values shredded from the result set.
While we await clarification on the XML and files: if the goal is to take the query from the FELC and export it to a file, I answered that here: https://stackoverflow.com/a/9105756/181965. Although in this case, I'd restructure your package to just the Data Flow and eliminate the shredding, as there's no need to complicate matters to export a single row N times.
If I understand you correctly, you can add a Script Task from the Toolbox as the first step of the loop container, store the selected statement from the database in a global variable, and pass it for execution in the next step.