SQL Server: parse XML in field with unknown namespace

I'm trying to parse XML from a column so that I can add it to a view for reports, but I don't know the namespace. It may be more complicated because the fields within the XML are only created once the field is used in the base application.
The pfm.Entity table, whose Custom# column holds the XML I'm trying to parse, has the following columns:
[RootId#] ,[Id#] ,[LastId#] ,[Guid#] ,[Custom#] ,[Type#]
My goal is to parse the XML contained within Custom# into columns so that I can join the table with others on the RootId# field. Below is the select statement I'm testing with:
WITH XMLNAMESPACES ('http://www.w3.org/2001/XMLSchema' as ns)
SELECT TOP (1000)
Custom#.value('(/ns:Fields/ns:Field)[1]', 'VARCHAR(50)') AS xmlfield
FROM
[SelectDb].[pfm].[Entity]
My guess is that I'll need to select with something like the statement below, but that throws an error on the # character.
SELECT
Custom#.value('(/ns:Fields/ns:MarketingSrcPercentage_901419#)[1]', 'VARCHAR(50)') AS mktsrcp
Below are two examples of what the XML in the field can look like. Keep in mind that there are rows where this is NULL as well.
<Fields>
<Field name="PurchaseDate_NJ#">6/11/2018</Field>
<Field name="AddOrImpDate_NJ#">8/1/2018</Field>
<Field name="AffidavitAddOrExcep_NJ#">nfui fevtt[th40thijfkrkl grwgr ijg rgmrk gmkmr pkgi</Field>
<Field name="DateOfPropertyAcquisition_DC#">8/7/2018</Field>
<Field name="AuthorizedAppointedAgent_PA#">NAME OF PERSON AUTHORIZED/APPONTED AGENT PROMPT BOX</Field>
<Field name="SpouseWaivingMaritalRights_PA#">NAME OF SPOUSE WAIVING MARTIAL RIGHTS PROMPT BOX</Field>
<Field name="PersonalFunds_TXFA#">true</Field>
<Field name="Currency_TXFA#">true</Field>
<Field name="CashiersCheck_TXFA#">true</Field>
<Field name="TravelersCheck_TXFA#">true</Field>
<Field name="MoneyOrder_TXFA#">true</Field>
<Field name="BusinessCheck_TXFA#">true</Field>
<Field name="PersonalCheck_TXFA#">true</Field>
<Field name="BankruptcyFilings_NY#">true</Field>
</Fields>
and
<Fields>
<Field name="">1</Field>
<Field name="MarketingSrcPercentage_901419#">1</Field>
<Field name="MarketingRep1Name_901419#">Brian</Field>
</Fields>
In case it's helpful, this is a database for SoftPro Select. I've looked through their documentation but I haven't found anything of help.
EDIT: Added in a clearer definition of the source table.

The XML you provide does not include any namespace, so I don't know why you think you need WITH XMLNAMESPACES at all...
Try this:
DECLARE @xml XML=
N'<Fields>
<Field name="PurchaseDate_NJ#">6/11/2018</Field>
<Field name="AddOrImpDate_NJ#">8/1/2018</Field>
<Field name="AffidavitAddOrExcep_NJ#">nfui fevtt[th40thijfkrkl grwgr ijg rgmrk gmkmr pkgi</Field>
<Field name="DateOfPropertyAcquisition_DC#">8/7/2018</Field>
<Field name="AuthorizedAppointedAgent_PA#">NAME OF PERSON AUTHORIZED/APPONTED AGENT PROMPT BOX</Field>
<Field name="SpouseWaivingMaritalRights_PA#">NAME OF SPOUSE WAIVING MARTIAL RIGHTS PROMPT BOX</Field>
<Field name="PersonalFunds_TXFA#">true</Field>
<Field name="Currency_TXFA#">true</Field>
<Field name="CashiersCheck_TXFA#">true</Field>
<Field name="TravelersCheck_TXFA#">true</Field>
<Field name="MoneyOrder_TXFA#">true</Field>
<Field name="BusinessCheck_TXFA#">true</Field>
<Field name="PersonalCheck_TXFA#">true</Field>
<Field name="BankruptcyFilings_NY#">true</Field>
</Fields>';
--Just some examples to get your named values in a type-safe way.
--The predicate [@name="..."] picks the Field element by its name attribute.
--The rest works the same...
SELECT @xml.value('(/Fields/Field[@name="PurchaseDate_NJ#"])[1]','date') AS [PurchaseDate_NJ#]
,@xml.value('(/Fields/Field[@name="AuthorizedAppointedAgent_PA#"])[1]','nvarchar(max)') AS [AuthorizedAppointedAgent_PA#]
,@xml.value('(/Fields/Field[@name="Currency_TXFA#"])[1]','bit') AS [Currency_TXFA#]

I found a solution for the problem but not the ideal solution which would pull all fields and sort them into columns. I'm still looking for the namespace but the below is working for now.
SELECT TOP (1000)
entity.Custom#.value('data(//Field[@name="MarketingRep1Name_901419#"])[1]','VARCHAR(100)') AS MarketingRep1
FROM
[SelectDb].[pfm].[Entity] entity
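For the "pull all fields" part, the general idea is to treat every Field element as a name/value pair instead of addressing one name at a time. The following is only a sketch of that idea in Python, using the standard library's xml.etree.ElementTree; it is not T-SQL and not anything taken from the SoftPro documentation. The function name fields_to_dict and the SAMPLE constant are made up for illustration, and skipping the entry with the empty name attribute is an assumption about what a report would want.

```python
import xml.etree.ElementTree as ET

# Second sample document from the question (no namespace declared anywhere).
SAMPLE = """<Fields>
<Field name="">1</Field>
<Field name="MarketingSrcPercentage_901419#">1</Field>
<Field name="MarketingRep1Name_901419#">Brian</Field>
</Fields>"""


def fields_to_dict(custom_xml):
    """Return {name: text} for every named <Field>; NULL or empty input gives {}."""
    if not custom_xml:
        return {}
    root = ET.fromstring(custom_xml)
    result = {}
    for field in root.findall("Field"):
        name = field.get("name")
        if name:  # assumption: entries with an empty name are not useful
            result[name] = field.text
    return result


if __name__ == "__main__":
    print(fields_to_dict(SAMPLE))
    print(fields_to_dict(None))
```

Once the data is in name/value form, turning selected names into columns is a separate pivoting step, and rows whose Custom# is NULL simply contribute no pairs.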

Related

Query for xml values in sql server

I have xml field in the below table in Data column
ID | website | Data
Following is the xml field
<Product>
<field name="IsCustomer" type="System.Boolean, mscorlib">
<boolean>false</boolean>
</field>
</product>
I need to retrieve all the IsCustomer values in my table.
Following is the code part that I tried so far.
SELECT EMP.ED.value() as EmployeeID
FROM [dbo].[Products]
CROSS APPLY Data.nodes('/Product/Field[@Name="IsCustomer"]/Boolean') as EMP(ED)
Can anyone please help me?
First of all: XML is strictly case sensitive! Your XML is not even valid... The leading <Product> is a different element name than the closing </product>. As you seem to use lowercase letters in all places, I changed it this way.
Your own query is close, but some capital letters are wrong and you did not use the .value() function properly (missing parameters).
Try this:
DECLARE @mockup TABLE(ID INT IDENTITY,Descr VARCHAR(100),Data XML);
INSERT INTO @mockup VALUES
('Your Sample','<product>
<field name="IsCustomer" type="System.Boolean, mscorlib">
<boolean>false</boolean>
</field>
</product>')
,('Your sample plus another field','<product>
<field name="IsCustomer" type="System.Boolean, mscorlib">
<boolean>true</boolean>
</field>
<field name="SomeOther" type="System.Boolean, mscorlib">
<boolean>true</boolean>
</field>
</product>')
,('No "IsCustomer" at all','<product>
<field name="SomeOther" type="System.Boolean, mscorlib">
<boolean>true</boolean>
</field>
</product>')
,('Two of them','<product>
<field name="IsCustomer" type="System.Boolean, mscorlib">
<boolean>true</boolean>
</field>
<field name="IsCustomer" type="System.Boolean, mscorlib">
<boolean>false</boolean>
</field>
</product>');
SELECT * FROM @mockup;
--Your query returning various variants, one of them should be okay for you:
--OUTER APPLY keeps rows without an "IsCustomer" field (value is NULL there).
SELECT m.ID
,m.Descr
,fld.value('(boolean/text())[1]','bit') AS IsCustomer
FROM @mockup AS m
OUTER APPLY m.Data.nodes('/product/field[@name="IsCustomer"]') AS A(fld);
I already wrote a post about XQuerying which might be helpful to you.
Note: XQuery is case sensitive, and you need to position yourself properly within the document. Take a look at the post I linked; if it does not help, I will update this comment with a solution for your query.
Update
To answer your question
First and foremost, you need properly formatted XML, which means opening and closing tags must match in case, and the same rule applies when you reference a tag in an XQuery expression.
So one of the solutions would be:
DECLARE @XML as XML
SET @XML = '<Product>
<field name="IsCustomer" type="System.Boolean, mscorlib">
<boolean>false</boolean>
</field>
</Product>'
SELECT EMP.t.value('boolean[1]','varchar(20)') as EmployeeID
FROM @XML.nodes('/Product/field') as EMP(t)
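Both answers turn on the same two points: the document must be well formed (tag case must match), and path expressions must use the exact case of element and attribute names. A small sketch of those two points follows, written in Python with the standard library's xml.etree.ElementTree rather than SQL Server's XQuery, so it only illustrates the rule and says nothing about SQL Server's error messages. The helper names and the two sample constants are hypothetical.

```python
import xml.etree.ElementTree as ET

# The question's XML: opening <Product> does not match closing </product>.
MISMATCHED = (
    "<Product><field name='IsCustomer'>"
    "<boolean>false</boolean></field></product>"
)
# Same content with the root tag case made consistent.
FIXED = (
    "<product><field name='IsCustomer'>"
    "<boolean>false</boolean></field></product>"
)


def is_well_formed(text):
    """True if the parser accepts the text as XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


def is_customer_values(text):
    """Text of every <boolean> under a <field name='IsCustomer'>."""
    root = ET.fromstring(text)
    return [b.text for b in root.findall("field[@name='IsCustomer']/boolean")]


def wrong_case_matches(text):
    """Same path with the capitalisation used in the question's query."""
    root = ET.fromstring(text)
    return root.findall("Field[@Name='IsCustomer']/Boolean")


if __name__ == "__main__":
    print(is_well_formed(MISMATCHED), is_well_formed(FIXED))
    print(is_customer_values(FIXED))
    print(wrong_case_matches(FIXED))
```

The wrongly cased path does not raise an error; it just matches nothing, which is why a query like the one in the question can silently return no rows.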

Solr templated sql print nothing in dataimport

I have the following dataimport configuration:
<?xml version="1.0" encoding="UTF-8" ?>
<dataConfig>
<dataSource driver="net.ucanaccess.jdbc.UcanaccessDriver" type="JdbcDataSource" url="jdbc:ucanaccess://C:/feqh/main.mdb;memory=false" />
<document>
<entity name="Book"
query="select bkid AS id, bkid AS BookID,bk AS BookTitle, betaka AS BookInfo, cat as cat from 0bok">
<field column="id" name="id"/>
<field column="BookID" name="BookID"/>
<field column="BookTitle" name="BookTitle"/>
<field column="cat" name="cat"/>
<entity name="Category"
query="select name as CatName, catord as CatWeight, Lvl as CatLevel from 0cat where id = ${Book.cat}">
<field column="CatName" name="CatName"/>
<field column="CatWeight" name="CatWeight"/>
<field column="CatLevel" name="CatLevel"/>
</entity>
</entity>
</document>
</dataConfig>
This data import fails with the following error from the log:
Unable to execute query: select name as CatName, catord as CatWeight,
Lvl as CatLevel from 0cat where id = Processing Document # 1
When I replace ${Book.cat} with any fixed number such as 128, the import works fine.
So it seems that ${Book.cat} does not resolve to any value. The database I import from is an MS Access .mdb file, read through ucanaccess version 2.0.9. I'm using Solr 4.9.0 on Java 8. How can I solve this issue?
I found that, for some unknown reason, the column name has to be written in upper case in the template, so ${Book.cat} should be ${Book.CAT}. I say unknown because I'm sure that the column name in the database is written in lower case: cat.
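The symptom in the log (the query ending at "where id =" with nothing after it) is what a case-sensitive placeholder lookup produces when the key is missing. The toy resolver below is a sketch of that behaviour only; it is not Solr's implementation, the function name resolve is invented, and the assumption that the parent row arrives keyed by an upper-case column name (CAT) is taken from the fix above, not from any driver documentation.

```python
import re

# Matches placeholders of the form ${Entity.column}.
PLACEHOLDER = re.compile(r"\$\{(\w+)\.(\w+)\}")


def resolve(template, rows_by_entity):
    """Replace ${Entity.column} using an exact, case-sensitive key match.

    Unknown keys resolve to an empty string, which mimics the truncated
    query seen in the log.
    """
    def lookup(match):
        entity, column = match.group(1), match.group(2)
        return str(rows_by_entity.get(entity, {}).get(column, ""))

    return PLACEHOLDER.sub(lookup, template)


if __name__ == "__main__":
    # Assumption: the parent row exposes the column as "CAT", not "cat".
    rows = {"Book": {"CAT": 128}}
    print(resolve("select name from 0cat where id = ${Book.cat}", rows))
    print(resolve("select name from 0cat where id = ${Book.CAT}", rows))
```

Under that assumption the lower-case placeholder yields an incomplete WHERE clause, while the upper-case one yields "where id = 128", the same query that worked when the number was hard-coded.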

Timestamp compatibility while performing delta import in solr

I'm new to Solr. I have successfully indexed an Oracle 10g XE database, and I'm now trying to perform a delta import on it.
The delta query requires comparing a last_modified column of the table with ${dih.last_index_time}.
However, my application has no such column, and I cannot add one. Therefore I used 'scn_to_timestamp(ora_rowscn)' to obtain the required timestamps. This query returns a value of type timestamp in the format 24-JUL-13 12.42.32.000000000 PM, while dih.last_index_time is in the format 2013-07-24 12:18:03. So I converted dih.last_index_time with to_timestamp('${dih.last_index_time}', 'YYYY/MM/DD HH:MI:SS').
My Data-config looks like this -
<dataConfig>
<dataSource type="JdbcDataSource" driver="oracle.jdbc.OracleDriver" url="jdbc:oracle:thin:@XXX.XXX.XX.XX:XXXX:xe" user="XXXXXXXX" password="XXXXXXX" />
<document name="product_info">
<entity name="PRODUCT" pk="PID" query="SELECT * FROM PRODUCT" deltaImportQuery="SELECT * FROM PRODUCT WHERE PID=${dih.delta.id}" deltaQuery="SELECT PID FROM PRODUCT WHERE scn_to_timestamp(ora_rowscn) > to_timestamp('${dih.last_index_time}', 'YYYY/MM/DD HH:MI:SS')">
<field column="PID" name="id" />
<field column="PNAME" name="itemName" />
<field column="INITQTY" name="itemQuantity" />
<field column="REMQTY" name="remQuantity" />
<field column="PRICE" name="itemPrice" />
<field column="SPECIFICATION" name="specifications" />
<entity name="SUB_CATEGORY" query="SELECT * FROM SUB_CATEGORY WHERE SCID=${PRODUCT.SCID}">
<field column="SUBCATNAME" name="brand" />
<entity name="CATEGORY" query="SELECT CNAME FROM CATEGORY WHERE CID=${SUB_CATEGORY.CID}">
<field column="CNAME" name="itemCategory" />
</entity>
</entity>
</entity>
</document>
</dataConfig>
However, this is not working and I'm getting the following error:
Unable to execute query: SELECT * FROM PRODUCT WHERE PID= Processing Document # 1
Caused by: java.sql.SQLException: ORA-00936: missing expression
Please help me out!!!
I had a similar issue and had more success with to_date. But looking at this again, it looks like you may just need to quote your delta id in the deltaImportQuery:
deltaImportQuery="SELECT * FROM PRODUCT WHERE PID='${dih.delta.id}'"

solr: import from different datasources using DIH

I am trying to fill a Solr index from 2 different data-sources (xml and db) using the DataImportHandler.
1st try: I created 2 data-config.xml files, one for the XML import and one for the DB import.
The db-config would read id and, let's say, field A; the xml-config would read id and field B.
That works for both (I could import from both data sources), but the index got overwritten each time (with clean=false, of course), so I ended up with either id and A or id and B.
So on to the 2nd try: I merged the 2 files into one:
<?xml version="1.0" encoding="UTF-8"?>
<dataConfig>
<dataSource
name="cr-db"
jndiName="xyz"
type="JdbcDataSource" />
<dataSource
name="cr-xml"
type="FileDataSource"
encoding="utf-8" />
<document name="doc">
<entity
dataSource="cr-xml"
name="f"
processor="FileListEntityProcessor"
baseDir="/path/to/xml"
filename="*.xml"
recursive="true"
rootEntity="false"
onError="skip">
<entity
name="xml-data"
dataSource="cr-xml"
processor="XPathEntityProcessor"
forEach="/root"
url="${f.fileAbsolutePath}"
transformer="DateFormatTransformer"
onError="skip">
<field column="id" xpath="/root/id" />
<field column="A" xpath="/root/a" />
</entity>
<entity
name="db-data"
dataSource="cr-db"
query="
SELECT
id, b
FROM
a_table
WHERE
id = '${f.file}'">
<field column="B" name="b" />
</entity>
</entity>
</document>
</dataConfig>
The id = '${f.file}' part is a bit odd, I guess, but that is the id that is used. The select statement is correctly formed, but I get an exception when trying to run that file in dataimport.jsp. The first part (XML) works fine, but when it gets to the DB part it raises:
java.lang.RuntimeException: java.io.FileNotFoundException:
Could not find file: SELECT id, b FROM a_table WHERE id = '12345678.xml'
at org.apache.solr.handler.dataimport.FileDataSource.getFile[..]
Any advice? Thanks in advance
EDIT
I found the cause of the FileNotFoundException: within the entity tags, the data source attribute needs to be camelCased, i.e. dataSource.
Now it runs through, but with the same outcome as in the first try: only field B gets into the index. If I take the db entity out, then the file contents are indexed (field A).
Try:
<entity name="db-data" dataSource="cr-db"
The attributes are case-sensitive, so your wrong-cased attribute name is ignored and you fall back to the default one (which somehow is the file one).

Solr data import multi-valued field into a single valued field

I have a multi-valued field
<arr name="colors">
<str>Blue</str>
<str>Red</str>
<str>Orange</str>
<str>Pink</str>
<str>Violet</str>
</arr>
Filled like this:
<entity name="pub_attributes" query=" SELECT name [description] FROM dbo.Colors">
<field name="colors" column="description" />
</entity>
And I need another field with all the colors on one line, separated by white space, like
<str name="Colors_All">Blue Red Orange Pink Violet</str>
How can I do this without accessing the Colors table all over again??
Maybe something like this
<entity name="Properites_all" query="
DECLARE @all VARCHAR(MAX)
SET @all = ''
Select @all = @all + ... from '${pub_attributes.colors}'
UNION
Another SELECT that will add more info than just the colors
">
<field name="colors_all" column="description" />
</entity>
I think what you are looking for is copyField: see the copyField wiki page, and you can also take a look at how to use it.
Hope that helps.
