StaxMate: Accessing plain XML - stax

I have the following (in reality huge) XML to process:
<root>
<item attr="hello world">
<subitem></subitem>
<subitem></subitem>
<subitem></subitem>
<subitem></subitem>
</item>
<item attr="hello world">
<subitem></subitem>
<subitem></subitem>
<subitem></subitem>
<subitem></subitem>
</item>
.
.
.
</root>
With StaxMate, processing this is pretty easy.
But how do I tell StaxMate to "record" the plain XML for each item (see the XML above),
so that after processing an item I have done my processing on it and also have the String
<item attr="hello world">
<subitem></subitem>
<subitem></subitem>
<subitem></subitem>
<subitem></subitem>
</item>
somewhere.
Thank you very very much,
Fabian

You need to write the events out yourself; XMLStreamWriter2 has a handy method, XMLStreamWriter2.copyEventFromReader(...), which can be used to make an exact copy of the current token.
But why are you creating Strings at all? Wouldn't it make more sense to either write the XML fragments out directly or process them as they are read? Strings are an inefficient way to handle XML: they take up memory, need to be encoded when written out, and need to be parsed again when they contain XML. About the only reason to do this is if the fragments must be passed as Strings to another library.
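For comparison, here is the same "process each item and keep its text" idea as a minimal Python sketch. It is not StaxMate or Stax2 code; it assumes the standard library's xml.etree.ElementTree, and the function name items_with_source is made up for illustration. Note that the text is re-serialized from the parsed element, so it is equivalent to the input rather than guaranteed to be a byte-for-byte copy.

```python
import io
import xml.etree.ElementTree as ET


def items_with_source(stream, tag="item"):
    """Yield (element, xml_text) for each <tag> element, one at a time.

    Hypothetical helper: streams the document instead of loading it whole.
    """
    context = ET.iterparse(stream, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == tag:
            elem.tail = None  # ignore whitespace that follows the closing tag
            text = ET.tostring(
                elem, encoding="unicode", short_empty_elements=False
            )
            yield elem, text
            root.clear()  # drop processed children so memory stays bounded


SAMPLE = (
    b'<root>'
    b'<item attr="hello world"><subitem></subitem><subitem></subitem></item>'
    b'<item attr="second"><subitem></subitem></item>'
    b'</root>'
)

if __name__ == "__main__":
    for item, xml_text in items_with_source(io.BytesIO(SAMPLE)):
        print(item.get("attr"), len(xml_text))
```

If the String is only needed for logging or for handing to another library, building it once per item like this keeps the cost proportional to one item, not to the whole document.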

Related

Concatenating values from the repeating nodes under repeating nodes in BizTalk Maps

I have something like this in an input XML
<Root>
<OrderText>
<item>item1</item>
<item>item2</item>
</OrderText>
<OrderText>
<item>item3</item>
<item>item4</item>
<item>item5</item>
</OrderText>
</Root>
From this input, the desired output is
<Root>
<OrderItems>
<Items>item1#item2</Items>
</OrderItems>
<OrderItems>
<Items>item3#item4#item5</Items>
</OrderItems>
</Root>
I tried to find a solution by following a question I asked a while back (How to Concatenate multiple repetitive nodes into a single node - BizTalk),
but with that approach I get a result like the one below
<Root>
<OrderItems>
<Items>item1#item2#item3#item4#item5</Items>
</OrderItems>
<OrderItems>
<Items>item1#item2#item3#item4#item5</Items>
</OrderItems>
</Root>
which is wrong. Can somebody help me, please?
Have a look at the documentation for the Cumulative Concatenate functoid.
That gives you the first clue:
Parameter 2: An optional numeric value that indicates the scope to which the accumulation should be performed. The default value is zero (0), indicating that the accumulation scope is the entire input instance message.
Try adding the second parameter and setting it to 1. This results in the output below, which is closer to what you want.
<Root>
<OrderItems>
<Items>item1#item2#</Items>
<Items>item3#item4#item5#</Items>
</OrderItems>
</Root>
The second clue can be found by going to the Error List, showing Messages, and clicking on "Double-click here to show/hide compiler links". That causes orange lines to appear on the map surface showing how the map thinks it should loop. The screenshot above shows this as well. Note how it is only looping on the root?
So the second fix is to draw a line from OrderText to OrderItems and, when prompted, select Direct Link, which tells the map that you want it to loop there as well.
This gives you an output close to your desired output:
<Root>
<OrderItems>
<Items>item1#item2#</Items>
</OrderItems>
<OrderItems>
<Items>item3#item4#item5#</Items>
</OrderItems>
</Root>
The extra # at the end can be removed either with a combination of functoids, such as String Size, String Left and a Subtraction functoid, or with a Scripting functoid.
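To make the target behaviour concrete, here is a minimal Python sketch of the logic the map is meant to implement: concatenate per OrderText (the scoped accumulation) and drop one trailing separator. This is not BizTalk code; the function names concat_per_order and trim_trailing are made up for illustration.

```python
import xml.etree.ElementTree as ET


def trim_trailing(value, sep="#"):
    """Remove a single trailing separator, like Left(value, Size(value) - 1)."""
    return value[: -len(sep)] if value.endswith(sep) else value


def concat_per_order(source_xml, sep="#"):
    """Build one <OrderItems> per <OrderText>, joining that node's items only."""
    source = ET.fromstring(source_xml)
    target = ET.Element("Root")
    for order_text in source.findall("OrderText"):
        # Accumulate within this OrderText only, appending the separator
        # after every item the way the cumulative functoid output does.
        accumulated = "".join(
            (item.text or "") + sep for item in order_text.findall("item")
        )
        order_items = ET.SubElement(target, "OrderItems")
        ET.SubElement(order_items, "Items").text = trim_trailing(accumulated, sep)
    return ET.tostring(target, encoding="unicode")


INPUT_XML = (
    "<Root>"
    "<OrderText><item>item1</item><item>item2</item></OrderText>"
    "<OrderText><item>item3</item><item>item4</item><item>item5</item></OrderText>"
    "</Root>"
)

if __name__ == "__main__":
    print(concat_per_order(INPUT_XML))
```

The outer loop corresponds to the Direct Link from OrderText to OrderItems, and the inner join corresponds to setting the scope parameter to 1.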

Best XML format for storing it in a SQL Server database

I have a workflow in xml format like this:
...
<workflow>
<tasks>
<task type="start" id="Task_038517r" name="addRequest">
<form id="Form_3y245d1"/>
</task>
...
<task type="final" id="Task_1sytah6" name="confirmationRequest">
<form id="Form_3y245d1"/>
</task>
</tasks>
</workflow>
...
And I can change this to another format:
...
<workflow>
<tasks>
<task>
<type>start</type>
<id>Task_038517r</id>
<name>addRequest</name>
<form>
<id>from_3jfu845</id>
</form>
</task>
...
<task>
<type>final</type>
<id>Task_1sytah6</id>
<name>confirmationRequest</name>
<form>
<id>form_3y245d1</id>
</form>
</task>
</tasks>
</workflow>
...
I need to store this XML in the workflowXML field, which is a column of the workflow table in SQL Server. I need to read the attribute values using Entity Framework in a web application. The first format is smaller; the second has a better structure.
It would be very helpful if someone could explain which format is better.
Thanks.
One advantage of attribute-oriented storage is that each value belongs to its element and cannot occur twice on it.
If human readability is of any importance, the second format might be easier to read (you call it a better structure), but, to be honest, this should not bother you: XML is, in most cases, read by a machine. Attributes are closely bound to their elements.
As your data is probably generated, you should not have to worry about a (possible) duplication of a sub-element either, which might break your data.
So my final statement is that it comes down to personal taste. I'd prefer the attributes.
I think you can get the attribute values of XML nodes when using Entity Framework. To do that, read up on LINQ to XML and use it to query the stored XML. So I think the first format can be made to work.
See these links:
how to get Attribute Value using linq to xml?
Linq to XML simple get attribute from node statement
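To show that both layouts are equally easy to read programmatically, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. It is not Entity Framework or LINQ to XML; the function names and the trimmed sample documents are made up for illustration.

```python
import xml.etree.ElementTree as ET

ATTRIBUTE_FORM = """
<workflow><tasks>
  <task type="start" id="Task_038517r" name="addRequest">
    <form id="Form_3y245d1"/>
  </task>
</tasks></workflow>
"""

ELEMENT_FORM = """
<workflow><tasks>
  <task>
    <type>start</type><id>Task_038517r</id><name>addRequest</name>
    <form><id>Form_3y245d1</id></form>
  </task>
</tasks></workflow>
"""


def tasks_from_attributes(xml_text):
    """Read each task from the attribute-oriented layout."""
    root = ET.fromstring(xml_text)
    return [
        {
            "type": task.get("type"),
            "id": task.get("id"),
            "name": task.get("name"),
            "form": task.find("form").get("id"),
        }
        for task in root.iter("task")
    ]


def tasks_from_elements(xml_text):
    """Read each task from the element-oriented layout."""
    root = ET.fromstring(xml_text)
    return [
        {
            "type": task.findtext("type"),
            "id": task.findtext("id"),
            "name": task.findtext("name"),
            "form": task.findtext("form/id"),
        }
        for task in root.iter("task")
    ]


if __name__ == "__main__":
    print(tasks_from_attributes(ATTRIBUTE_FORM))
    print(tasks_from_elements(ELEMENT_FORM))
```

Either way the reading code is a few lines, which supports the point that the choice is mostly about size and taste.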

SQL Server 2008 Xml Issue With Xml Escape Characters

Our current Point of Sale system executes too many queries in nested transactions that leave duplicated or partial data in place. I changed the entire thing to a single stored procedure where all sale item data is passed in as Xml, iterated through in a temp table, and saved to the database, then committed. However, SQL rejects special characters in the xml.
For example:
<?xml version="1.0" encoding="utf-16"?>
<list>
<item>
<objectid>bd99fcb6-3031-48b7-9a71-5f8cefe0a614</objectid>
<amount>50.00</amount>
<fee>1.50</fee>
<waivedfee>0.00</waivedfee>
<tax>0.00</tax>
<name>TEST &amp; TEST PERSON</name>
<payeeid>197</payeeid>
<accountnumber>5398520352</accountnumber>
<checknumber />
<comedreceiptnumber />
<isexpedited>0</isexpedited>
<echeckrefnumber />
</item>
</list>
This fails. It tells me that there is an illegal character where the & is located. I don't know why. It's escaped properly with &amp;. I can't find any solutions online, anywhere. Everywhere people tell me to replace & with &amp;, which is what I am doing!
Use FOR XML PATH(''); it will encode the special characters for you.
SELECT 'TEST & TEST PERSON' FOR XML PATH('')
I figured it out. UTF-16 is correct, and that XML is fine. There was a final piece of XML, the ledgers, that consisted of plain strings with no encoding and no escaping of special characters. Once I corrected that, it all worked.
Thanks for the help!
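The root cause here (text dropped into XML without escaping) is easy to reproduce on the client side. The following is a minimal Python sketch, assuming the standard library's xml.sax.saxutils and xml.etree.ElementTree; it is not the poster's Point of Sale code, and the helper name is_well_formed is made up for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

RAW_NAME = "TEST & TEST PERSON"


def is_well_formed(xml_text):
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Pasting the raw string between tags produces XML a parser must reject.
unescaped_xml = "<name>" + RAW_NAME + "</name>"

# Escaping by hand replaces &, < and > with their entities.
escaped_xml = "<name>" + escape(RAW_NAME) + "</name>"

# Building the document through an XML API escapes text automatically.
item = ET.Element("item")
ET.SubElement(item, "name").text = RAW_NAME
built_xml = ET.tostring(item, encoding="unicode")

if __name__ == "__main__":
    print(is_well_formed(unescaped_xml))
    print(is_well_formed(escaped_xml))
    print(built_xml)
```

Generating every fragment through an XML writer, instead of concatenating strings, avoids the situation where one forgotten piece (like the ledgers above) breaks the whole document.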

How to stop SQL from storing an XML <option></option> as <option />?

When I try to store XML that has an empty element in SQL, SQL changes it and stores the element as a single self-closing tag.
For example, the XML to store is:
<ROOT>
<FIRSTNAME>ROGER</FIRSTNAME>
<MIDDLENAME></MIDDLENAME>
</ROOT>
SQL then stores it as
<ROOT>
<FIRSTNAME>ROGER</FIRSTNAME>
<MIDDLENAME />
</ROOT>
The SQL update is very simple:
UPDATE
SESIONESREPORTES
SET
SER_PARAMETROS = '
<ROOT>
<FIRSTNAME>ROGER</FIRSTNAME>
<MIDDLENAME></MIDDLENAME>
</ROOT>'
WHERE SER_ID=7
I need it stored this way because I have a query that fails when an element is empty; you can see it here:
Merging many rows in a single
I don't think you can, judging by the following link:
XML Data Type and Columns
According to this (XML Storage Options section):
The data is stored in an internal representation that preserves the
XML content of the data. This internal representation includes
information about the containment hierarchy, document order, and
element and attribute values. Specifically, the InfoSet content of the
XML data is preserved. For more information about InfoSet, visit
http://www.w3.org/TR/xml-infoset. The InfoSet content may not be an
identical copy of the text XML, because the following information is
not retained: insignificant white spaces, order of attributes,
namespace prefixes, and XML declaration.
So the internal storage strips out everything it deems unnecessary. The document goes on to state that if you need an exact copy of the XML document and not just its content, you should use either [n]varchar(max) or varbinary(max).
<MIDDLENAME></MIDDLENAME>
and
<MIDDLENAME/>
are equivalent; any XML parser will treat them identically, as an empty element. If your query fails on an empty element, it will fail on either of them. You'll need to either rewrite your query to handle empty elements, put some content in the <MIDDLENAME> element, or omit the element entirely (if your query can handle its absence).
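The equivalence is easy to check with any parser. Here is a minimal Python sketch using the standard library's xml.etree.ElementTree (an assumption for illustration; SQL Server's own parser is not involved), with a made-up helper middle_name that handles the empty case explicitly.

```python
import xml.etree.ElementTree as ET

LONG_FORM = "<ROOT><FIRSTNAME>ROGER</FIRSTNAME><MIDDLENAME></MIDDLENAME></ROOT>"
SHORT_FORM = "<ROOT><FIRSTNAME>ROGER</FIRSTNAME><MIDDLENAME /></ROOT>"


def middle_name(xml_text):
    """Return the middle name, or an empty string if it is empty or missing."""
    root = ET.fromstring(xml_text)
    return root.findtext("MIDDLENAME", default="")


long_root = ET.fromstring(LONG_FORM)
short_root = ET.fromstring(SHORT_FORM)

# Once parsed, nothing distinguishes the two spellings: serializing both
# trees again yields the same text.
same_after_parsing = ET.tostring(long_root) == ET.tostring(short_root)

if __name__ == "__main__":
    print(same_after_parsing)
    print(repr(middle_name(LONG_FORM)), repr(middle_name(SHORT_FORM)))
```

A query that treats "empty" and "missing" the same way, as middle_name does here, no longer depends on how the empty element was written.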

<data> tag for tmx file

What is stored in the data tag of a tmx file such as the following?
<data encoding="base64" compression="gzip">
H4sIAAAAAAAAA+3YIQ6AMAwF0AEKEATwSO5/RCoRmGHY2BMvaVLzRb/pkVI6gOZ0oQ9DAVlynbd5DFOYH3Y1WcMW9gKytGbJ8HXWFtXaaQAAAAAA/s8Pm1xuBvLpDW9ciGmfRhAnAAA=
</data>
Also, if this is key info, how is it read or extracted using C?
This looks like binary octet data that has been compressed using gzip and then encoded in Base64 to make it XML-safe. Implementations for both should be easy to obtain, though I don't know enough C libraries to recommend one.
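The two decoding steps can be sketched as follows. This is Python, not C, and it only illustrates the order of operations: Base64-decode first, then gunzip. Interpreting the result as little-endian unsigned 32-bit integers is an assumption based on how Tiled map files usually store tile layers; the function names and the sample values are made up for illustration.

```python
import base64
import gzip
import struct


def decode_data(text):
    """Base64-decode, then gunzip, the text content of a <data> element."""
    return gzip.decompress(base64.b64decode(text.strip()))


def as_uint32_list(raw):
    """Interpret bytes as little-endian unsigned 32-bit integers (assumed layout)."""
    count = len(raw) // 4
    return list(struct.unpack("<%dI" % count, raw[: count * 4]))


def encode_data(values):
    """Inverse of the two functions above, used here to build a test payload."""
    raw = struct.pack("<%dI" % len(values), *values)
    return base64.b64encode(gzip.compress(raw)).decode("ascii")


SAMPLE_VALUES = [1, 2, 0, 7]
SAMPLE_TEXT = encode_data(SAMPLE_VALUES)

if __name__ == "__main__":
    print(as_uint32_list(decode_data(SAMPLE_TEXT)))
```

In C the same pipeline needs a Base64 decoder plus a gzip-capable inflate routine, applied in the same order.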
