Solr ScriptTransformer return value - solr

I have the following fields:
In the database I have the field property_industry_sector which is a list of comma separated ints, null or empty string.
In the Solr schema configuration I have the same field property_industry_sector of type int and multivalued.
My problem is that I have to handle the difference in the DataImportHandler configuration, and my attempt looks like this:
<entity
name="property_industry_sector_extractor"
transformer="script:SplitIndustrySector"
query="
SELECT property_industry_sector
FROM job
WHERE job.id = ${job.id}
">
<field column="property_industry_sector" name="property_industry_sector" />
</entity>
Where the ScriptTransformer has the following definition:
function SplitIndustrySector(row) {
//var logger = java.util.logging.Logger.getLogger("org.apache.solr");
if(row.get('property_industry_sector') !== null) {
if(false === row.get('property_industry_sector').isEmpty()) {
var pieces = row.get('property_industry_sector').split(',');
var arr = new java.util.ArrayList();
for(var i=0, len=pieces.length; i<len; i++) {
arr.add(new java.lang.Integer(pieces[i]));
}
row.put('property_industry_sector', arr);
return row;
}
}
var arr = new java.util.ArrayList();
arr.add(new java.lang.Integer(0));
row.put('property_industry_sector', arr);
return row;
}
The problem is with the general case, when the value is null or empty string, because no matter what the transformer does, I still get the following Exception
property_industry_sector=property_industry_sector(1.0)={[, 0]}}]
java.lang.NumberFormatException: For input string: ""
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:493)
at java.lang.Integer.parseInt(Integer.java:514)
at org.apache.solr.schema.TrieField.createField(TrieField.java:374)
at org.apache.solr.schema.SchemaField.createField(SchemaField.java:97)
at org.apache.solr.update.DocumentBuilder.addField(DocumentBuilder.java:203)
at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:276)
at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:60)
at org.apache.solr.handler.dataimport.SolrWriter.upload(SolrWriter.java:73)
at org.apache.solr.handler.dataimport.DataImportHandler$1.upload(DataImportHandler.java:294)
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:631)
at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:267)
at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:186)
at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:353)
at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:411)
at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:392)
I do not understand where the empty string comes from (which it tries to convert to Integer) while also being confused by the values it tries to insert above the exception:
property_industry_sector=property_industry_sector(1.0)={[, 0]}}]
I've tried clearing the row prior to the put() call, returning null, and, as in the current example, returning the row with a single value of 0.

I haven't found a way to make the ScriptTransformer work, but I managed to solve the issue with an alternative solution. Instead of using the ScriptTransformer, I was able to achieve the same goal with SQL transformations.
<entity name="industry_sector_hack" query='
SELECT property_industry_sector AS property_industry_sector_ids
FROM job
WHERE id = ${job.id} AND
property_industry_sector IS NOT NULL AND
property_industry_sector <> ""
'>
<entity name="property_industry_sector" query='
SELECT property.id AS property_industry_sector
FROM property
WHERE property.id IN (${industry_sector_hack.property_industry_sector_ids})
'>
<field column="property_industry_sector" name="property_industry_sector" />
</entity>
</entity>
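For reference, the splitting rule the transformer is aiming for can be sketched in plain Java. This is only an illustration of the parsing logic, not a fix for the DataImportHandler behaviour above: the class name IndustrySectorParser is hypothetical, and returning an empty list (rather than a list containing 0) for null or empty input is an assumption. The point is that no empty token, like the one visible in "[, 0]", ever reaches the integer conversion.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Solr or the DataImportHandler.
final class IndustrySectorParser {

    // Parses "1,2,3" into [1, 2, 3]; null, empty, or blank input yields an empty list.
    static List<Integer> parse(String raw) {
        List<Integer> ids = new ArrayList<>();
        if (raw == null) {
            return ids;
        }
        for (String piece : raw.split(",")) {
            String trimmed = piece.trim();
            // Skip empty tokens so Integer.valueOf never sees "".
            if (!trimmed.isEmpty()) {
                ids.add(Integer.valueOf(trimmed));
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        System.out.println(parse("3, 7,12"));
        System.out.println(parse(""));
        System.out.println(parse(null));
    }
}
```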


SSIS Xpath query expression error

I'm trying to call a SalesForce web service via SSIS, and I am trying to retrieve the value of the sessionID node.
Here is the XML:
<?xml version="1.0" encoding="utf-16"?>
<LoginResult xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<metadataServerUrl xmlns="urn:partner.soap.sforce.com">https://xxxxxx/services/Soap/m/31.0/xxxxxx</metadataServerUrl>
<passwordExpired xmlns="urn:partner.soap.sforce.com">false</passwordExpired>
<sandbox xmlns="urn:partner.soap.sforce.com">true</sandbox>
<serverUrl xmlns="urn:partner.soap.sforce.com">https://xxx/services/Soap/u/31.0/xxx</serverUrl>
<sessionId xmlns="urn:partner.soap.sforce.com">xxxxxxxxxx</sessionId>
<userId xmlns="urn:partner.soap.sforce.com">xxx</userId>
<userInfo xmlns="urn:partner.soap.sforce.com">
<accessibilityMode>false</accessibilityMode>
<currencySymbol>$</currencySymbol>
<orgAttachmentFileSizeLimit>5242880</orgAttachmentFileSizeLimit>
<orgDefaultCurrencyIsoCode>USD</orgDefaultCurrencyIsoCode>
<orgDisallowHtmlAttachments>false</orgDisallowHtmlAttachments>
<orgHasPersonAccounts>false</orgHasPersonAccounts>
<organizationId>xxxxxxxx</organizationId>
<organizationMultiCurrency>false</organizationMultiCurrency>
<organizationName>xxxxx</organizationName>
<profileId>xxxxx</profileId>
<roleId xsi:nil="true" />
<sessionSecondsValid>7200</sessionSecondsValid>
<userDefaultCurrencyIsoCode xsi:nil="true" />
<userEmail>xxxxxxx</userEmail>
<userFullName>xxxxx</userFullName>
<userId>xxxxxxx</userId>
<userLanguage>en_US</userLanguage>
<userLocale>en_US</userLocale>
<userName>xxxxxxx</userName>
<userTimeZone>America/New_York</userTimeZone>
<userType>Standard</userType>
<userUiSkin>Theme3</userUiSkin>
</userInfo>
</LoginResult>
I successfully tested this expression via http://www.xpathtester.com/xpath.
Edit:
Trying a script task now, and still I'm not finding the exact right combination to select this information. There are multiple namespaces in the XML, and the below code returns 0 nodes. Quite frustrating!
public void Main()
{
string loginResult;
string sessionID;
loginResult = Dts.Variables["User::loginResult"].Value.ToString();
XmlDocument doc = new XmlDocument();
doc.LoadXml(loginResult);
var xmlnsManager = new System.Xml.XmlNamespaceManager(doc.NameTable);
//xmlnsManager.AddNamespace("t1", "http://www.w3.org/2001/XMLSchema-instance");
xmlnsManager.AddNamespace("ns", "urn:partner.soap.sforce.com");
XmlNodeList list = doc.SelectNodes("/LoginResult/ns:sessionID", xmlnsManager);
for (int i = 0; i < list.Count; i++)
{
sessionID = list[i].Value;
}
Dts.TaskResult = (int)ScriptResults.Success;
}
}
A correct expression to find the sessionId element is:
//*[local-name() = 'sessionId']/text()
In your input XML, the element you'd like to find:
<sessionId xmlns="urn:partner.soap.sforce.com">xxxxxxxxxx</sessionId>
does not have a prefix - it is in a default namespace that does not require elements to be prefixed.
Therefore, a possible explanation is that an expression like
//*:sessionId
only finds elements that are prefixed in the input XML. That should not be a problem but as far as I can see, SSIS is known for problems with namespaced XML (see e.g. here or here).
As far as the XPath specification is concerned, an expression like //*:root should be able to find an element like
<root xmlns="www.example.com"/>
EDIT: Apparently, you have changed the question altogether, and now there is another problem: the outermost element LoginResult is in no namespace at all, not in the xsi: namespace:
<LoginResult xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"/>
The schema instance namespace just happens to be declared on this element, but it is not used there. So, change your code to:
var xmlnsManager = new System.Xml.XmlNamespaceManager(doc.NameTable);
xmlnsManager.AddNamespace("ns", "urn:partner.soap.sforce.com");
XmlNodeList list = doc.SelectNodes("/LoginResult/ns:sessionId", xmlnsManager);

Using LINQ to find Excel columns that don't exist in array?

I have a solution that works for what I want, but I'm hoping to get some slick LINQ types to help me improve what I have, and learn something new in the process.
The code below is used to verify that certain column names exist on a spreadsheet. I was torn between using column index values or column names to find them. They both have good and bad points, but I decided to go with column names. They'll always exist, and sometimes in a different order, though I'm working on this.
Details:
GetData() method returns a DataTable from the Excel spreadsheet. I cycle through all the required field names from my array, looking to see if it matches with something in the column collection on the spreadsheet. If not, then I append the missing column name to an output parameter from the method. I need both the boolean value and the missing fields variable, and I wasn't sure of a better way than using the output parameter. I then remove the last comma from the appended string for the display on the UI. If the StringBuilder object isn't null (I could have used the missingFieldCounter too) then I know there's at least one missing field, bool will be false. Otherwise, I just return output param as empty, and method as true.
So, Is there a more slick, all-in-one way to check if fields are missing, and somehow report on them?
private bool ValidateFile(out string errorFields)
{
data = GetData();
List<string> requiredNames = new [] { "Site AB#", "Site#", "Site Name", "Address", "City", "St", "Zip" }.ToList();
StringBuilder missingFields = null;
var missingFieldCounter = 0;
foreach (var name in requiredNames)
{
var foundColumn = from DataColumn c in data.Columns
where c.ColumnName == name
select c;
if (!foundColumn.Any())
{
if (missingFields == null)
missingFields = new StringBuilder();
missingFieldCounter++;
missingFields.Append(name + ",");
}
}
if (missingFields != null)
{
errorFields = missingFields.ToString().Substring(0, (missingFields.ToString().Length - 1));
return false;
}
errorFields = string.Empty;
return true;
}
Here is a LINQ solution that does the same.
Except() yields the required names that have no matching column, and string.Join builds the list without a trailing separator:
var missing = requiredNames.Except(
from DataColumn dataCol in data.Columns
select dataCol.ColumnName
).ToList();
errorFields = string.Join(", ", missing);
Console.WriteLine(errorFields);
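The underlying idea is a set difference: required names minus the names that are present. As a rough illustration in Java rather than C#, the same check could look like the sketch below. The class name MissingColumns and the sample column names in main are assumptions, not taken from the spreadsheet code.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: reports required column names that are absent.
final class MissingColumns {

    // Returns the missing names joined by ", " (empty string when nothing is missing).
    static String missing(List<String> required, List<String> present) {
        Set<String> missing = new LinkedHashSet<>(required); // keeps the required order
        missing.removeAll(present);
        return String.join(", ", missing);
    }

    public static void main(String[] args) {
        List<String> required = Arrays.asList("Site#", "City", "Zip");
        List<String> present = Arrays.asList("City", "Notes");
        String errorFields = missing(required, present);
        // An empty result means the file is valid.
        System.out.println(errorFields.isEmpty() ? "all columns present" : errorFields);
    }
}
```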

Solr / Lucene: Get all field names sorted by number of occurrences in index

I want to get the list of all fields (i.e. field names) sorted by the number of times they occur in the Solr index, i.e.: most frequently occurring field, second most frequently occurring field and so on.
Alternatively, getting all fields in the index and the number of times they occur would also be sufficient.
How do I accomplish this either with a single solr query or through solr/lucene java API?
The set of fields is not fixed and ranges in the hundreds. Almost all fields are dynamic, except for id and perhaps a couple more.
As stated in "Solr: Retrieve field names from a solr index?", you can do this by using the LukeRequestHandler.
To do so you need to enable the requestHandler in your solrconfig.xml
<requestHandler name="/admin/luke" class="org.apache.solr.handler.admin.LukeRequestHandler" />
and call it
http://solr:8983/solr/admin/luke?numTerms=0
If you want the fields sorted by something, you have to do the sorting yourself. I would suggest using Solrj if you are in a Java environment.
Fetch fields using Solrj
@Test
public void lukeRequest() throws SolrServerException, IOException {
SolrServer solrServer = new HttpSolrServer("http://solr:8983/solr");
LukeRequest lukeRequest = new LukeRequest();
lukeRequest.setNumTerms(1);
LukeResponse lukeResponse = lukeRequest.process(solrServer );
List<FieldInfo> sorted = new ArrayList<FieldInfo>(lukeResponse.getFieldInfo().values());
Collections.sort(sorted, new FieldInfoComparator());
for (FieldInfo infoEntry : sorted) {
System.out.println("name: " + infoEntry.getName());
System.out.println("docs: " + infoEntry.getDocs());
}
}
The comparator used in the example
public class FieldInfoComparator implements Comparator<FieldInfo> {
@Override
public int compare(FieldInfo fieldInfo1, FieldInfo fieldInfo2) {
if (fieldInfo1.getDocs() > fieldInfo2.getDocs()) {
return -1;
}
if (fieldInfo1.getDocs() < fieldInfo2.getDocs()) {
return 1;
}
return fieldInfo1.getName().compareTo(fieldInfo2.getName());
}
}
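On Java 8 or later, the same ordering (most documents first, ties broken by name) can be expressed with comparator combinators. The sketch below uses a hypothetical FieldCount class as a stand-in for Solrj's FieldInfo so that it runs without Solr on the classpath; the names FieldCountSort and namesByFrequency are likewise assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class FieldCountSort {

    // Hypothetical stand-in for a field name plus its document count.
    static final class FieldCount {
        final String name;
        final int docs;

        FieldCount(String name, int docs) {
            this.name = name;
            this.docs = docs;
        }
    }

    // Descending by document count, then ascending by field name.
    static final Comparator<FieldCount> BY_DOCS_DESC_THEN_NAME =
            Comparator.comparingInt((FieldCount f) -> f.docs)
                    .reversed()
                    .thenComparing((FieldCount f) -> f.name);

    static List<String> namesByFrequency(List<FieldCount> counts) {
        List<FieldCount> sorted = new ArrayList<>(counts); // do not mutate the input
        sorted.sort(BY_DOCS_DESC_THEN_NAME);
        List<String> names = new ArrayList<>();
        for (FieldCount c : sorted) {
            names.add(c.name);
        }
        return names;
    }

    public static void main(String[] args) {
        List<FieldCount> counts = Arrays.asList(
                new FieldCount("title", 5), new FieldCount("id", 10), new FieldCount("author", 5));
        System.out.println(namesByFrequency(counts));
    }
}
```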

How to filter on the value of a specific element in a list?

Using GAE-Java-JDO, is it possible to filter on the value of a specific element in a list?
WHAT WORKS
Normally, I would have the following:
@PersistenceCapable
class A {
String field1;
String field2;
// id, getters and setters
}
Then I would build a simple query:
Query q = pm.newQuery(A.class, "field1 == val");
q.declareParameters("String val");
List<A> list = new ArrayList<A>((List<A>) q.execute("foo"));
WHAT I WOULD LIKE
The above works fine. But what I would like to have is all of the fields stored in a list:
@PersistenceCapable
class AA {
ArrayList<String> fields;
// id, getters and setters
}
and then be able to query on a specific field in the list:
int index = 0;
Query q = pm.newQuery(AA.class, "fields.get(index) == val");
q.declareParameters("int index, String val");
List<AA> list = new ArrayList<AA>((List<AA>) q.execute(index, "foo"));
But this throws an exception:
org.datanucleus.store.appengine.query.DatastoreQuery$UnsupportedDatastoreFeatureException:
Problem with query
<SELECT FROM xxx.AA WHERE fields.get(index) == val PARAMETERS int index, String val,>:
Unsupported method <get> while parsing expression:
InvokeExpression{[PrimaryExpression{strings}].get(ParameterExpression{ui})}
My impression from reading the GAE-JDO doc is that this is not possible:
"The property value must be supplied by the application; it cannot refer to or be calculated in terms of other properties"
So... any ideas?
Thanks in advance!
If you only need to filter by index+value, then I think prefixing the actual list-values with their index should work. (If you need to also filter by actual values, then you'll need to store both lists.)
i.e. instead of the equivalent of
fields= ['foo', 'bar', 'baz'] with query-filter fields[1] == 'bar'
you'd have
fields= ['0-foo', '1-bar', '2-baz'] with query-filter fields == '1-bar'
(but in java)
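In Java, the encoding step might look like the following sketch. The class name IndexedValues and the "-" separator are assumptions; the entity would store the encoded list, and the query parameter would be built with the same key() helper so that both sides agree on the format.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper that prefixes each list value with its position.
final class IndexedValues {

    // Builds the stored form of one element, e.g. key(1, "bar") gives "1-bar".
    static String key(int index, String value) {
        return index + "-" + value;
    }

    // Encodes a whole list: ["foo", "bar"] becomes ["0-foo", "1-bar"].
    static List<String> encode(List<String> values) {
        List<String> encoded = new ArrayList<>(values.size());
        for (int i = 0; i < values.size(); i++) {
            encoded.add(key(i, values.get(i)));
        }
        return encoded;
    }

    public static void main(String[] args) {
        System.out.println(encode(Arrays.asList("foo", "bar", "baz")));
        // The value to pass as the query parameter when filtering on index 1:
        System.out.println(key(1, "bar"));
    }
}
```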

Parse XML with Linq

I have the following XML document which I would like to parse into a DataSet.
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<Response Status="OK">
<Item>
<Field Name="ID">767147519</Field>
<Field Name="Name">Music</Field>
<Field Name="Path">Family\Music</Field>
<Field Name="Type">Playlist</Field>
</Item>
</Response>
I am wanting to get the attribute values for ID, Name, and Path.
The following is what I have attempted:
Dim loaded As XDocument = XDocument.Load(uriString)
Dim name = From c In loaded.Descendants("Item") Select c
For Each result In name
Dim str1 = result.Attribute("ID").Value 'Returns Nothing and causes a validation error
Dim str2 = result.Value ' Returns all the attribute values in one long string (ie "767147519MusicFamilyPlaylist")
Next
Any help would be greatly appreciated.
Thanks,
Matt
EDIT:
Following one of the answers below, I have been attempting to implement an anonymous type in my Linq, however I keep encountering the error
Object reference not set to an
instance of an object.
My updated code is as follows:
Dim name = From c In loaded.Descendants("Item") Select c Select sID = c.Element("Field").Attribute("Name").Value, sName = c.Attribute("ID").Value.FirstOrDefault
Dim Id As String = String.Empty
For Each result In name
Id = result.sID
Next
I think this error means that the attribute ("ID") cannot be located, so I have attempted several variations of this with similar results.
Is anyone able to identify where I am going wrong and point me in the right direction.
Thanks,
Matt
You can use XPath:
Dim data = From item In loaded.Descendants("Item")
Select
ID = item.XPathSelectElement("Field[@Name='ID']").Value,
Name = item.XPathSelectElement("Field[@Name='Name']").Value,
Path = item.XPathSelectElement("Field[@Name='Path']").Value,
Type = item.XPathSelectElement("Field[@Name='Type']").Value
(Be sure to import the System.Xml.XPath namespace)
Or to add it directly to a DataTable:
Dim dt As New DataTable()
dt.Columns.Add("ID")
dt.Columns.Add("Name")
dt.Columns.Add("Path")
dt.Columns.Add("Type")
For Each item In loaded.Descendants("Item")
dt.Rows.Add(
item.XPathSelectElement("Field[@Name='ID']").Value,
item.XPathSelectElement("Field[@Name='Name']").Value,
item.XPathSelectElement("Field[@Name='Path']").Value,
item.XPathSelectElement("Field[@Name='Type']").Value
)
Next
Another solution, using anonymous types:
var doc = XDocument.Load("c:\\test");
var list = doc.Root
.Elements("Item")
.Select(item =>
new
{
Id = item.Elements("Field").Where(e => e.Attribute("Name").Value == "ID").Select(e => e.Value).FirstOrDefault(),
Path = item.Elements("Field").Where(e => e.Attribute("Name").Value == "Path").Select(e => e.Value).FirstOrDefault(),
Name = item.Elements("Field").Where(e => e.Attribute("Name").Value == "Name").Select(e => e.Value).FirstOrDefault(),
})
.ToArray();
foreach (var item in list)
{
var id = item.Id;
var name = item.Name;
}
The ugly expression inside the new operator can be shortened with the following anonymous function:
Func<XElement, string, string> getAttrValue = (node, attrName) =>
{
return node.Elements("Field")
.Where(e => e.Attribute("Name").Value == attrName)
.Select(e => e.Value)
.FirstOrDefault();
};
Then the new operator looks like:
new
{
Id = getAttrValue(item, "ID"),
Path = getAttrValue(item, "Path"),
Name = getAttrValue(item, "Name"),
}
Here is my attempt at solution to your problem. I just noticed that you wish to go with as much LINQ as possible so I've structured my LINQ query accordingly. Please note result type (for "IDs") will be IEnumerable() i.e. you will need to run a for each loop on it to get individual ids even with a single item:
Dim loaded As XDocument = XDocument.Load(uriString)
Dim IDs = From items In loaded.Descendants("Item") _
Let fields = items.Descendants("Field") _
From field In fields _
Where field.Attribute("Name").Value = "ID" _
Select field.Value
On a side note: for future reference, if you run into the C# implicitly typed "var" in examples, the equivalent in VB is a plain Dim, as in my query above (without the "As type" part).
Hope this helps.
Maverik
Use XPath and save everyone the headaches?
XmlDocument xml = new XmlDocument();
xml.Load(xmlSource);
string id = xml.SelectSingleNode("/Response/Item/Field[@Name='ID']").InnerText;
string name = xml.SelectSingleNode("/Response/Item/Field[@Name='Name']").InnerText;
string path = xml.SelectSingleNode("/Response/Item/Field[@Name='Path']").InnerText;
I am wanting to get the attribute values for ID, Name, and Path.
If you don't mind using something other than XDocument, I'd just use an XmlDocument:
XmlDocument doc = new XmlDocument();
doc.Load(new XmlTextReader("XData.xml"));
XmlNodeList items = doc.GetElementsByTagName("Item");
foreach (XmlElement item in items.Cast<XmlElement>())
{
XmlElement[] fields = item.GetElementsByTagName("Field").Cast<XmlElement>().ToArray();
string id = (from s in fields where s.Attributes["Name"].InnerText == "ID" select s).First().InnerText;
string name = (from s in fields where s.Attributes["Name"].InnerText == "Name" select s).First().InnerText;
string path = (from s in fields where s.Attributes["Name"].InnerText == "Path" select s).First().InnerText;
//Do stuff with data.
}
Performance-wise this might be abysmal. You could also have a loop on the Fields and then use a switch on the Name-Attribute so you don't check the same field more than once. Why would you need any linq for this anyway?
XmlDocument doc = new XmlDocument();
doc.Load(new XmlTextReader("XData.xml"));
XmlNodeList items = doc.GetElementsByTagName("Item");
foreach (XmlElement item in items.Cast<XmlElement>())
{
foreach (XmlNode field in item.GetElementsByTagName("Field"))
{
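// Named fieldName so it does not clash with the "name" variable declared in the switch below.
string fieldName = field.Attributes["Name"].InnerText;
switch (fieldName)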
{
case "ID":
string id = field.InnerText;
//Do stuff with data.
break;
case "Path":
string path = field.InnerText;
//Do stuff with data.
break;
case "Name":
string name = field.InnerText;
//Do stuff with data.
break;
default:
break;
}
}
}
Your linq query returns all the Item elements in the document:
Dim name = From c In loaded.Descendants("Item") Select c
The code that follows is trying to obtain an 'ID' attribute from the 'Item' element:
Dim str1 = result.Attribute("ID").Value
However, 'ID' is not an attribute name: it is the value of the 'Name' attribute on a 'Field' child element.
What you need is the following:
// find all the Item elements
var items = loaded.Descendants("Item");
foreach(var item in items)
{
// find all the Field child elements
var fields = item.Descendants("Field");
// find the Field element whose Name attribute is "ID", and obtain the element value
string id = fields.Where(field => (string)field.Attribute("Name") == "ID")
.Single()
.Value;
// etc ...
}
A simple solution is:
var result = doc.Root.Descendants(XName.Get("Item")).Select(x => x.Descendants(XName.Get("Field")));
foreach (var v in result)
{
string id = v.Single(x => x.Attribute(XName.Get("Name")).Value == "ID").Value;
string name = v.Single(x => x.Attribute(XName.Get("Name")).Value == "Name").Value;
string path = v.Single(x => x.Attribute(XName.Get("Name")).Value == "Path").Value;
string type = v.Single(x => x.Attribute(XName.Get("Name")).Value == "Type").Value;
}
It can easily be converted into VB code.
Here is a generic solution that handles all fields with different field names in several items. It saves the result in one table containing all distinct field names as column names.
Module Module1
Function createRow(ByVal table As DataTable, ByVal item As XElement) As DataRow
Dim row As DataRow = table.NewRow
Dim fields = item.Descendants("Field")
For Each field In fields
row.SetField(field.Attribute("Name").Value, field.Value)
Next
Return row
End Function
Sub Main()
Dim doc = XDocument.Load("XMLFile1.xml")
Dim items = doc.Descendants("Item")
Dim columnNames = From attr In items.Descendants("Field").Attributes("Name") Select attr.Value
Dim columns = From name In columnNames.Distinct() Select New DataColumn(name)
Dim dataSet As DataSet = New DataSet()
Dim table As DataTable = New DataTable()
dataSet.Tables.Add(table)
table.Columns.AddRange(columns.ToArray())
Dim rows = From item In items Select createRow(table, item)
For Each row In rows
table.Rows.Add(row)
Next
' TODO Handle Table
End Sub
End Module
I tried to use as much Linq as possible, but Linq is a bit inflexible when it comes to handling nested elements recursively.
Here's the sample XML file I've used:
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<Response Status="OK">
<Item>
<Field Name="ID">767147519</Field>
<Field Name="Name">Music</Field>
<Field Name="Path">Family\Music</Field>
<Field Name="Type">Playlist</Field>
</Item>
<Item>
<Field Name="ID">123</Field>
<Field Name="Name">ABC</Field>
<Field Name="RandomFieldName">Other Value</Field>
<Field Name="Type">FooBar</Field>
</Item>
</Response>
And the result:
ID         Name   Path          Type      RandomFieldName
767147519  Music  Family\Music  Playlist
123        ABC                  FooBar    Other Value
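The same idea (one row per Item, keyed by each Field's Name attribute, with missing fields simply absent) can be sketched in Java with the standard DOM API. This is an illustration only, not a port of the VB module above: the class name ItemFields and the use of a list of maps instead of a DataTable are assumptions.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: one map per <Item>, keyed by the Name attribute of each <Field>.
final class ItemFields {

    static List<Map<String, String>> read(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            List<Map<String, String>> rows = new ArrayList<>();
            NodeList items = doc.getElementsByTagName("Item");
            for (int i = 0; i < items.getLength(); i++) {
                Element item = (Element) items.item(i);
                Map<String, String> row = new LinkedHashMap<>(); // keeps field order
                NodeList fields = item.getElementsByTagName("Field");
                for (int j = 0; j < fields.getLength(); j++) {
                    Element field = (Element) fields.item(j);
                    row.put(field.getAttribute("Name"), field.getTextContent());
                }
                rows.add(row);
            }
            return rows;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse response XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<Response Status=\"OK\"><Item>"
                + "<Field Name=\"ID\">123</Field><Field Name=\"Name\">ABC</Field>"
                + "</Item></Response>";
        System.out.println(read(xml));
    }
}
```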
After some further research and with the assistance of parts from the answers provided, I have come up with the following, which returns the information that I am after.
Dim Query = From items In loaded.Descendants("Item") _
Let sID = ( From q In items.Descendants("Field") _
Where q.Attribute("Name").Value = "ID" ) _
Let sName = ( From r In items.Descendants("Field") _
Where r.Attribute("Name").Value = "Name" ) _
Let sPath = ( From s In items.Descendants("Field") _
Where s.Attribute("Name").Value = "Path" ) _
Where (Ctype(sPath.Value,String) Like "Family\*") _
Select pId=sID.Value, pName=sName.Value, pPath = sPath.Value
If this can be improved in any way to enable better performance, please let me know.
Thank you all for your assistance. While no one answer was able to entirely solve the problem, I was able to learn a great deal about Linq through everyone's assistance.
Matt
I hope you expected something like this short answer and not another implementation:
Dim items = From c In loaded.Descendants("Item") Select c (...)
Ok so far nothing should run into any trouble. The variable name 'name' was a bit confusing, so I changed it to 'items'.
The second part contains the error:
Dim items = (...) Select sID = c.Element("Field").Attribute("Name").Value, sName = c.Attribute("ID").Value.FirstOrDefault
The following works because there is an attribute called Name, although the result is 'ID', which surely wasn't expected:
c.Element("Field").Attribute("Name").Value
Here comes the error:
c.Attribute("ID").Value.FirstOrDefault
c is the XElement <Item> ... </Item>, which has no attributes, so c.Attribute("ID") returns Nothing and reading .Value from it throws the exception.
I guess you wanted something like the following:
Dim loaded = XDocument.Load("XMLFile1.xml")
Dim items = From item In loaded.Descendants("Item") Select _
sID = (From field In item.Descendants("Field") _
Where field.Attribute("Name").Value = "ID" _
Select field.Value).FirstOrDefault() _
, _
sName = (From field In item.Descendants("Field") _
Where field.Attribute("Name").Value = "Name" _
Select field.Value).FirstOrDefault()
There are a few errors in your code:
You should get the Descendants that have the XName equal to Field instead of Item
Dim name = From c In loaded.Descendants("Field") Select c
The attribute you are after is called Name, not ID
Dim str1 = result.Attribute("Name").Value
At the first iteration of your for each str1 will be "ID", the next one it will be "Name", etc.
Total code:
Dim loaded As XDocument = XDocument.Load(uriString)
Dim name = From c In loaded.Descendants("Field") Select c
For Each result In name
Dim str1 = result.Attribute("Name").Value 'Returns "ID"
Dim str2 = result.Value ' Returns "767147519"
Next
There's another way to fix this problem. Transform this XML into the format that the DataSet wants, and then load it using DataSet.ReadXml. This is something of a pain if you don't know XSLT. But it's really important to know XSLT if you work with XML.
The XSLT you'd need is pretty simple. Start with the XSLT identity transform. Then add a template that transforms the Response and Item elements into the format that the DataSet expects:
<xsl:template match="Response">
<MyDataSetName>
<xsl:apply-templates select="Item"/>
</MyDataSetName>
</xsl:template>
<xsl:template match="Item">
<MyDataTableName>
<xsl:apply-templates select="Field[@Name='ID' or @Name='Name' or @Name='Path']"/>
</MyDataTableName>
</xsl:template>
<xsl:template match="Field">
<xsl:element name="{@Name}">
<xsl:value-of select="."/>
</xsl:element>
</xsl:template>
That will change your XML to a document that looks like this:
<MyDataSetName>
<MyDataTableName>
<ID>767147519</ID>
<Name>Music</Name>
<Path>Family\Music</Path>
</MyDataTableName>
</MyDataSetName>
...and you can just feed that to DataSet.ReadXml.
Edit:
I should point out, since it's not obvious unless you do this a lot, that one effect of this is that the amount of C# code that you need to create and populate the DataSet is minimal:
private DataSet GetDataSet(string inputFilename, string transformFilename)
{
StringBuilder sb = new StringBuilder();
using (XmlReader xr = XmlReader.Create(inputFilename))
using (XmlWriter xw = XmlWriter.Create(new StringWriter(sb)))
{
XslCompiledTransform xslt = new XslCompiledTransform();
xslt.Load(transformFilename);
xslt.Transform(xr, xw);
}
using (StringReader sr = new StringReader(sb.ToString()))
{
DataSet ds = new DataSet();
ds.ReadXml(sr);
return ds;
}
}
It's also reusable. You can use this method to populate as many different DataSets from as many different possible input formats as you need; you just need to write a transform for each format.
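To check what such a stylesheet produces without involving a DataSet, the transform can be run on its own. The sketch below does this in Java with the standard javax.xml.transform API rather than the C# shown above. It inlines the three templates (written with the standard @Name attribute syntax) without the identity transform, which is an assumption that holds only because the templates already cover every element they select; the class name FieldTransform is hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper that applies the Response/Item/Field templates to an XML string.
final class FieldTransform {

    static final String XSLT =
            "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
            + "<xsl:template match=\"Response\"><MyDataSetName>"
            + "<xsl:apply-templates select=\"Item\"/></MyDataSetName></xsl:template>"
            + "<xsl:template match=\"Item\"><MyDataTableName>"
            + "<xsl:apply-templates select=\"Field[@Name='ID' or @Name='Name' or @Name='Path']\"/>"
            + "</MyDataTableName></xsl:template>"
            + "<xsl:template match=\"Field\"><xsl:element name=\"{@Name}\">"
            + "<xsl:value-of select=\".\"/></xsl:element></xsl:template>"
            + "</xsl:stylesheet>";

    static String transform(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transform failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<Response Status=\"OK\"><Item>"
                + "<Field Name=\"ID\">767147519</Field><Field Name=\"Type\">Playlist</Field>"
                + "</Item></Response>";
        System.out.println(transform(xml));
    }
}
```

Only the ID, Name, and Path fields are selected, so a Type field in the input does not appear in the output.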
