Getting junk character in C structure after parsing - c

I am using a simple XML file:
<COMPANY>
<EMPLOYEES>
<EMPLOYEE>
<NAME>BOB</NAME>
<EMPID>51211</EMPID>
<SEX>M</SEX>
<DOB>10-1-1982</DOB>
<DOJ>12-7-2001</DOJ>
</EMPLOYEE>
</EMPLOYEES>
</COMPANY>
The XMLBooster meta-definition file for it is as follows:
<SYSTEM NAME="testmeta" >
<CCONFIG MAXLEN="100"
ARRAYSIZE="5"
FLATMODE="TRUE"/>
<ELEMENT NAME="COMPANY" TAG="COMPANY" MAIN="TRUE" >
<FIELDS>
<FIELD NAME="EMPLOYEES" REFTYPE="EMPLOYEE" MODE="DEFAULT" />
</FIELDS>
<FORMULA>
<ENCLOSED NAME="EMPLOYEES" >
<META NAME="COMMENT" >Target field is EMPLOYEES</META>
<REPEAT TARGET="EMPLOYEES" ATLEASTONE="TRUE" >
<ELEMENTREF NAME="EMPLOYEE" />
</REPEAT>
</ENCLOSED>
</FORMULA>
</ELEMENT>
<ELEMENT NAME="EMPLOYEE" TAG="EMPLOYEE" >
<FIELDS>
<FIELD NAME="NAME" TYPE="STRING" />
<FIELD NAME="EMPID" TYPE="INTEGER" />
<FIELD NAME="SEX" TYPE="STRING" />
<FIELD NAME="DOB" TYPE="STRING" />
<FIELD NAME="DOJ" TYPE="STRING" />
</FIELDS>
<FORMULA>
<CONCAT>
<ENCLOSED NAME="NAME" >
<META NAME="COMMENT" >Target field is NAME</META>
<PCDATA TARGET="NAME" />
</ENCLOSED>
<ENCLOSED NAME="EMPID" >
<META NAME="COMMENT" >Target field is EMPID</META>
<PCDATA TARGET="EMPID" />
</ENCLOSED>
<ENCLOSED NAME="SEX" >
<META NAME="COMMENT" >Target field is SEX</META>
<PCDATA TARGET="SEX" />
</ENCLOSED>
<ENCLOSED NAME="DOB" >
<META NAME="COMMENT" >Target field is DOB</META>
<PCDATA TARGET="DOB" />
</ENCLOSED>
<ENCLOSED NAME="DOJ" >
<META NAME="COMMENT" >Target field is DOJ</META>
<PCDATA TARGET="DOJ" />
</ENCLOSED>
</CONCAT>
</FORMULA>
</ELEMENT>
</SYSTEM>
I generated a .c and a .h file from it with XMLBooster Lite, using the following command:
xmlblit.exe -C testmeta.xmlb
In my application's main function I call the accept_COMPANY function and pass it an S_XMLB_CONTEXT context object. The call succeeds, but when I print each employee's values using
printf("%s, %d, %s, %s, %s", En->aNAME,
En->aEMPID,
En->aSEX,
En->aDOB,
En->aDOJ);
printf("\n");
I get junk characters printed, even for the integer aEMPID value.
I am using Visual Studio 2010 to compile and run the C program.
I tried both Unicode and Multi-Byte project settings, but neither produced the correct values.
The output I get is:
UOB, 78, M, j0-1-1982, 1t-7-2001
After debugging the generated .c file, I found that the generator writes unwanted values into PCDATA fields. For example, for the NAME field, after the name has been retrieved from the XML, the code contains the following statement:
/* Regexp */
if (strlen(obj->aNAME) > 0)
(obj->aNAME)[0] = 'U';
Has anyone faced a similar situation?

I wrote to XMLBooster support and got the following reply:
XMLBooster Lite only supports Java; C is only provided for evaluation purposes, as values are scrambled as you’ve noticed in your sample programs and in the generated code. You need to purchase XMLBooster Pro to generate production-level parsers in C.
That answers it. I hope this helps others who are experimenting too.
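To tell parser scrambling apart from bad input, it can help to read the same file with an independent parser and compare the values. The following is a minimal sketch that uses Python's standard xml.etree.ElementTree instead of XMLBooster; the function name parse_employees and the SAMPLE constant are hypothetical and only serve this illustration.

```python
import xml.etree.ElementTree as ET

# The employee file from the question, inlined so the sketch is self-contained.
SAMPLE = """<COMPANY>
<EMPLOYEES>
<EMPLOYEE>
<NAME>BOB</NAME>
<EMPID>51211</EMPID>
<SEX>M</SEX>
<DOB>10-1-1982</DOB>
<DOJ>12-7-2001</DOJ>
</EMPLOYEE>
</EMPLOYEES>
</COMPANY>"""


def parse_employees(xml_text):
    """Return one dict per EMPLOYEE element (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    employees = []
    for node in root.iter("EMPLOYEE"):
        employees.append({
            "NAME": node.findtext("NAME"),
            "EMPID": int(node.findtext("EMPID")),  # INTEGER field in the meta definition
            "SEX": node.findtext("SEX"),
            "DOB": node.findtext("DOB"),
            "DOJ": node.findtext("DOJ"),
        })
    return employees


if __name__ == "__main__":
    for e in parse_employees(SAMPLE):
        print("%s, %d, %s, %s, %s" % (e["NAME"], e["EMPID"], e["SEX"], e["DOB"], e["DOJ"]))
```

If this prints the values exactly as they appear in the file, the input is fine and the scrambling comes from the generated parser, which matches the reply from support.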

Related

Apache Solr: index files (pdf, docx, ...) over FTP

How can I index files over FTP?
The FTP repository contains all my documents in different formats. I can do this for a local system folder, but it does not work over FTP.
This is my configuration (via DIH):
<dataConfig>
<dataSource type="BinFileDataSource" />
<dataSource type="BinURLDataSource" name="binSource" baseUrl="ftp://localhost:21/" onError="skip" user="solr_ftp" password="solr_ftp_pass" />
<document>
<!-- baseDir: path to the folder that contains the files (pdf | doc | docx | ...) -->
<entity name="files" dataSource="binSource" baseDir="ftp://localhost" rootEntity="false" processor="FileListEntityProcessor" fileName=".*\.(doc)|(pdf)|(docx)|(txt)|(rtf)|(html)|(htm)" onError="skip" recursive="true">
<field column="fileAbsolutePath" name="filePath" />
<field column="resourceName" name="resourceName" />
<field column="fileSize" name="size" />
<field column="fileLastModified" name="lastModified" />
<!-- tika -->
<entity name="documentImport" processor="TikaEntityProcessor" url="${files.fileAbsolutePath}" format="text">
<field column="title" name="title" meta="true"/>
<field column="subject" name="subject" meta="true"/>
<field column="description" name="description" meta="true"/>
<field column="comments" name="comments" meta="true"/>
<field column="Author" name="author" meta="true"/>
<field column="Keywords" name="keywords" meta="true"/>
<field column="category" name="category" meta="true"/>
<field column="xmpTPg:NPages" name="Page-Count" meta="true"/>
<field column="text" name="content"/>
</entity>
</entity>
</document>
</dataConfig>
Error:
failed:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.solr.handler.dataimport.DataImportHandlerException: 'baseDir' value: ftp://localhost is not a directory Processing Document # 1
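The error says that baseDir must be a directory, and FileListEntityProcessor lists files on the local file system, so an ftp:// URL is rejected. One possible workaround, sketched below under assumptions, is to copy the files from the FTP server into a local folder first and point baseDir at that folder. The helper names wanted and mirror, the extension list, and the local path in the comment are hypothetical; the sketch only reads the top-level FTP directory and does not recurse.

```python
import os
from ftplib import FTP

# Extensions taken from the fileName pattern in the question.
EXTENSIONS = (".doc", ".docx", ".pdf", ".txt", ".rtf", ".html", ".htm")


def wanted(name):
    """True if the file name ends with one of the document extensions."""
    return name.lower().endswith(EXTENSIONS)


def mirror(host, user, password, local_dir):
    """Download matching files from the FTP root into local_dir.

    Returns the list of local paths written. Non-recursive by design.
    """
    os.makedirs(local_dir, exist_ok=True)
    fetched = []
    with FTP(host) as ftp:
        ftp.login(user, password)
        for name in ftp.nlst():
            if not wanted(name):
                continue
            target = os.path.join(local_dir, os.path.basename(name))
            with open(target, "wb") as out:
                ftp.retrbinary("RETR " + name, out.write)
            fetched.append(target)
    return fetched


# Example call (needs a reachable FTP server, so it is left commented out):
# mirror("localhost", "solr_ftp", "solr_ftp_pass", "/tmp/solr_docs")
```

After a run like this, the DIH entity could use the local folder as baseDir with the default file data source, as in the working system-folder setup mentioned in the question.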

SQL Templating in SmartGWT

I am trying to use SQL templating in SmartGWT to load data from my database into a ListGrid, but I am not getting the desired result. This is the raw SQL query that I am trying to adapt for SmartGWT:
SELECT
dbo.province.provincename as province,
dbo.province.capital,
dbo.province.code,
dbo.province.telcode,
dbo.province.taxcode,
dbo.county.countyname as county,
dbo.district.districtname as district,
dbo.zone.alternateName as zone,
dbo.neighbourhood.alternateName as neigbhour,
dbo.city.cityname as city,
dbo.city.taxcode,
dbo.city.fdocode,
count(dbo.customer.customeraltname) as countCustomer
FROM
dbo.county
INNER JOIN
dbo.province
ON
(
dbo.county.provinceID = dbo.province.id)
INNER JOIN
dbo.district
ON
(
dbo.county.id = dbo.district.countyID)
INNER JOIN
dbo.city
ON
(
dbo.district.id = dbo.city.districtID)
INNER JOIN
dbo.zone
ON
(
dbo.city.id = dbo.zone.cityId)
INNER JOIN
dbo.neighbourhood
ON
(
dbo.zone.id = dbo.neighbourhood.zoneId)
INNER JOIN
dbo.customer
ON
(
dbo.neighbourhood.id = dbo.customer.neighbourhoodId)
Group By dbo.province.provincename,
dbo.province.capital,
dbo.province.code,
dbo.province.telcode,
dbo.province.taxcode,
dbo.county.countyname,
dbo.district.districtname,
dbo.zone.alternateName,
dbo.neighbourhood.alternateName,
dbo.city.cityname,
dbo.city.taxcode,
dbo.city.fdocode
Below is my ds.xml:
<DataSource ID="CusNeiGroupDS" serverType="sql">
<fields>
<field name="id" type="integer" />
<field name="provincename" title="province" type="text"/>
<field name="capital" title="capital" type="text"/>
<field name="code" title="code" type="text"/>
<field name="telcode" title="telcode" type="text"/>
<field name="taxcode" title="taxcode" type="text" />
<field name="provinceId" type="integer" tableName="county"/>
<field name="countyname" title="county" type="text" tableName="county"/>
<field name="district" title="district" type="text" />
<field name="city" title="city" type="text" />
<field name="zone" title="zone" type="text" />
<field name="neighbour" title="neighbour" type="text" />
<field name="taxcodecity" title="taxcodecity" type="text"
/>
<field name="fdocode" title="fdocode" type="text" />
<!-- <field name="countCustomer" title="countCustomer" type="int" /> -->
</fields>
<operationBindings>
<operationBinding operationId="summary"
operationType="fetch">
<selectClause>
districtname as
district,
alternateName as zone,
alternateName as neigbhour,
cityname
as city,
fdocode
</selectClause>
<tableClause>province, county, district, zone, neighbourhood, city
</tableClause>
<whereClause>
province.Id = county.provinceId
AND district.countyId = county.Id
AND city.districtId = district.Id
AND neighbourhood.zoneId = zone.Id
</whereClause>
</operationBinding>
</operationBindings>
</DataSource>
The error I get is:
Execute of select: SELECT COUNT(*) FROM CusNeiGroupDS WHERE ('1'='1')
on db: SQLServer threw exception: java.sql.SQLException: Invalid
object name 'CusNeiGroupDS'. - assuming stale connection and retrying
query.
But when I put a table name in the DataSource like this, I get output, but only from the table I name and not from the other tables that are joined to it by foreign keys.
<DataSource ID="CusNeiGroupDS" serverType="sql" tableName="province">
I was able to achieve this by using the includeFrom and foreignKey attributes in the datasources. I then created another datasource that includes all the columns I need from the different tables, like so:
<DataSource ID="neighbourDS_1" serverType="sql" tableName="neighbourhood" inheritsFrom="neighbourDS">
<fields>
<field name="provincename" includeFrom="provinceDS.provincename" />
<field name="capital" includeFrom="provinceDS.capital" />
<field name="code" includeFrom="provinceDS.code" />
<field name="telcode" includeFrom="provinceDS.telcode" />
<field name="countyname" includeFrom="countyDS.countyname" />
<field name="district" includeFrom="districtDS.districtname" />
<field name="city" includeFrom="cityDS.cityname" />
<field name="taxcodecity" includeFrom="cityDS.taxcodecity" />
<field name="fdocode" includeFrom="cityDS.fdocode" />
<field name="zone" includeFrom="zoneDS.zone" />
</fields>
</DataSource>

How to get an XML node based on a condition in SQL Server

This is my XML:
DECLARE @XMLValues XML
SET @XMLValues = '<?xml version="1.0" encoding="UTF-8"?>
<DOCUMENTS name="NYSPIT">
<DOCUMENT ID="140208512T200911101">
<REPEATS>
<REPEAT NAME="EXCEPTIONS">
<ROW>
<FIELD VALUE="09_NYC-3A_2" NAME="PageType"/>
<FIELD VALUE="" NAME="KeyWord"/>
<FIELD VALUE="020852009111001.002" NAME="ImageName"/>
<FIELD VALUE="2" NAME="PageNo"/>
<FIELD VALUE="" NAME="Qualifier"/>
</ROW>
</REPEAT>
</REPEATS>
</DOCUMENT>
<DOCUMENT ID="140208512T200911102">
<REPEATS>
<REPEAT NAME="EXCEPTIONS">
<ROW>
<FIELD VALUE="09_NYC-3A_2" NAME="PageType"/>
<FIELD VALUE="" NAME="KeyWord"/>
<FIELD VALUE="020852009111001.002" NAME="ImageName"/>
<FIELD VALUE="2" NAME="PageNo"/>
<FIELD VALUE="" NAME="Qualifier"/>
</ROW>
</REPEAT>
</REPEATS>
</DOCUMENT>
</DOCUMENTS>'
I need to retrieve the XML node for ID 140208512T200911101 alone. I have tried various methods but have not found one that returns the correct result.
My desired result looks like this:
<DOCUMENT ID="140208512T200911101">
<REPEATS>
<REPEAT NAME="EXCEPTIONS">
<ROW>
<FIELD VALUE="09_NYC-3A_2" NAME="PageType"/>
<FIELD VALUE="" NAME="KeyWord"/>
<FIELD VALUE="020852009111001.002" NAME="ImageName"/>
<FIELD VALUE="2" NAME="PageNo"/>
<FIELD VALUE="" NAME="Qualifier"/>
</ROW>
</REPEAT>
</REPEATS>
</DOCUMENT>
Please help with this.
Thanks for your support, it is working fine. To pass the @ID value dynamically from a variable, use sql:variable as in the first query below; the second query uses a literal value:
DECLARE @DCN Varchar(50)
SET @DCN = '140208512T200911101'
select @XMLValues.query('/DOCUMENTS/DOCUMENT[@ID = sql:variable("@DCN")]')
select @XMLValues.query('/DOCUMENTS/DOCUMENT[@ID = "140208512T200911101"]')
Result
<DOCUMENT ID="140208512T200911101">
<REPEATS>
<REPEAT NAME="EXCEPTIONS">
<ROW>
<FIELD VALUE="09_NYC-3A_2" NAME="PageType" />
<FIELD VALUE="" NAME="KeyWord" />
<FIELD VALUE="020852009111001.002" NAME="ImageName" />
<FIELD VALUE="2" NAME="PageNo" />
<FIELD VALUE="" NAME="Qualifier" />
</ROW>
</REPEAT>
</REPEATS>
</DOCUMENT>
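The same selection, picking the DOCUMENT child whose ID attribute equals a value held in a variable, can be checked outside SQL Server. The following is a minimal sketch using Python's standard xml.etree.ElementTree, not the SQL Server XML API; select_document is a hypothetical helper, and SAMPLE is a trimmed copy of the XML from the question (two FIELD elements per row, no XML declaration).

```python
import xml.etree.ElementTree as ET

# Trimmed version of the question's XML: same structure, fewer FIELD rows.
SAMPLE = """<DOCUMENTS name="NYSPIT">
<DOCUMENT ID="140208512T200911101">
<REPEATS><REPEAT NAME="EXCEPTIONS"><ROW>
<FIELD VALUE="09_NYC-3A_2" NAME="PageType"/>
<FIELD VALUE="2" NAME="PageNo"/>
</ROW></REPEAT></REPEATS>
</DOCUMENT>
<DOCUMENT ID="140208512T200911102">
<REPEATS><REPEAT NAME="EXCEPTIONS"><ROW>
<FIELD VALUE="09_NYC-3A_2" NAME="PageType"/>
<FIELD VALUE="2" NAME="PageNo"/>
</ROW></REPEAT></REPEATS>
</DOCUMENT>
</DOCUMENTS>"""


def select_document(xml_text, doc_id):
    """Return the DOCUMENT element whose ID attribute equals doc_id, or None.

    Mirrors /DOCUMENTS/DOCUMENT[@ID = <variable>]: only direct children of the
    root are inspected, and the ID is compared as a plain string.
    """
    root = ET.fromstring(xml_text)
    for doc in root.findall("DOCUMENT"):
        if doc.get("ID") == doc_id:
            return doc
    return None


if __name__ == "__main__":
    node = select_document(SAMPLE, "140208512T200911101")
    if node is not None:
        print(ET.tostring(node, encoding="unicode"))
```

Comparing the attribute in code rather than building an XPath string avoids quoting problems when the ID comes from user input.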

Indexing multiple tables in Solr using DIH

I'm developing a search engine using Solr, and I've been successful in indexing data from one table using DIH (DataImportHandler). What I need is to get search results from 5 different tables, and I have not been able to work this out on my own.
If each table has x rows, I expect x documents from each table, which gives 5x documents in total for 5 tables. In dataconfig.xml, I created 5 separate entities in a single document, as shown below. When I query *:*, the index contains only 6 documents from the users entity and 3 from the classes entity, 9 in total, which is the number of rows in the users table.
Clearly this approach does not work for me, so how can I achieve this using only a single core?
Note: I followed DIHQuickStart and the DIH tutorial, which did not help.
<document>
<!-- Users -->
<entity dataSource="JdbcDataSource" name="users" >
<field column="name" name="name" sourceColName="name" />
<field column="username" name="username" sourceColName="username"/>
<field column="email" name="email" sourceColName="email" />
<field column="country" name="country" sourceColName="country" />
</entity>
<!-- Classes -->
<entity dataSource="JdbcDataSource" name="classes" >
<field column="code" name="code" sourceColName="code" />
<field column="title" name="title" sourceColName="title" />
<field column="description" name="description" sourceColName="description" />
</entity>
<!-- Schools -->
<entity dataSource="JdbcDataSource" name="schools" >
<field column="school_name" name="school_name" sourceColName="school_name" />
<field column="country" name="country" sourceColName="country" />
<field column="city" name="city" sourceColName="city" />
</entity>
<!-- Resources -->
<entity dataSource="JdbcDataSource" name="resources" >
<field column="title" name="title" sourceColName="title" />
<field column="description" name="description" sourceColName="description" />
</entity>
<!-- Tasks -->
<entity dataSource="JdbcDataSource" name="tasks" >
<field column="title" name="title" sourceColName="title" />
<field column="description" name="description" sourceColName="description" />
</entity>
</document>
You need to look at the structure of your tables and then either create queries with joins or create nested entities, like this for example:
<dataConfig>
<dataSource driver="org.hsqldb.jdbcDriver" url="jdbc:hsqldb:/temp/example/ex" user="sa" />
<document name="schools">
<entity name="school" query="select * from schools s ">
<field column="school_name" name="school_Name" />
<entity name="school_class" query="select * from schools_classes sc where sc.school_name = '${school.school_name}'">
<field column="class_code" name="class_code" />
<entity name="class" query="select class_name from classes c where c.class_name= '${school_class.class_code}'">
<field column="class_name" name="class_name" />
</entity>
</entity>
</entity>
</document>
</dataConfig>
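To make the nested-entity idea concrete: the outer entity yields one document per parent row, and each inner entity runs its query with a value from the parent row and adds its columns to that same document. The sketch below imitates this with Python's sqlite3 module; it is not Solr code, and the table contents, make_school_db, and build_documents are hypothetical names invented for the illustration.

```python
import sqlite3


def make_school_db():
    """Create a small in-memory database with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE schools (school_name TEXT)")
    conn.execute("CREATE TABLE schools_classes (school_name TEXT, class_code TEXT)")
    conn.executemany("INSERT INTO schools VALUES (?)", [("East",), ("West",)])
    conn.executemany(
        "INSERT INTO schools_classes VALUES (?, ?)",
        [("East", "MATH1"), ("East", "BIO1"), ("West", "MATH1")],
    )
    return conn


def build_documents(conn):
    """One document per school; the child query is run once per parent row."""
    docs = []
    parents = conn.execute(
        "SELECT school_name FROM schools ORDER BY school_name"
    ).fetchall()
    for (school_name,) in parents:
        children = conn.execute(
            "SELECT class_code FROM schools_classes "
            "WHERE school_name = ? ORDER BY class_code",
            (school_name,),
        ).fetchall()
        docs.append({
            "school_name": school_name,
            "class_code": [row[0] for row in children],  # multi-valued field
        })
    return docs


if __name__ == "__main__":
    for doc in build_documents(make_school_db()):
        print(doc)
```

The number of documents equals the number of parent rows, which is why five sibling entities and one nested hierarchy produce very different document counts.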

Solr DIH - a problem with Solr delta-imports

I have a problem when I use Solr 1.3 delta-imports to update the index. I have added a "last_modified" column to the table. After I use the "full-import" command to index the database data, the "dataimport.properties" file contains nothing, and when I use the "delta-import" command to update the index, Solr lists all the data in the database rather than only the latest changes. My db-data-config.xml:
<?xml version="1.0" encoding="UTF-8" ?>
<dataConfig>
<dataSource driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost:3306/funguide" user="root" password="root"/>
<document name="shopinfo">
<entity name="shop" pk="shop_id"
query="select shop_id,title,description,tel,address,longitude,latitude from shop"
deltaQuery="select shop_id from shop where last_modified > '${dataimporter.last_index_time}'">
<field column="shop_id" name="id" />
<field column="title" name="title" />
<field column="description" name="description" />
<field column="tel" name="tel" />
<field column="address" name="address" />
<field column="longitude" name="longitude" />
<field column="latitude" name="latitude" />
</entity>
</document>
</dataConfig>
Does anybody know how to solve this problem? Thanks!
enzhaohoo@gmail.com
I would also recommend upgrading to Solr 1.4 RC, as quite a few improvements have been made to delta-imports in DataImportHandler. See the "DataImportHandler - Using delta-import command" wiki page for specifics.
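As a way to reason about what the deltaQuery in the question should return, here is a small sketch using Python's sqlite3 module rather than Solr or MySQL. The sample rows, make_shop_db, and changed_ids are hypothetical. It also shows that a very early timestamp selects every row, which would look like the behaviour described if the stored last index time were missing and an early default were used instead (an assumption, not something the question confirms).

```python
import sqlite3


def make_shop_db():
    """In-memory stand-in for the shop table, with made-up timestamps."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE shop (shop_id INTEGER PRIMARY KEY, title TEXT, last_modified TEXT)"
    )
    conn.executemany(
        "INSERT INTO shop VALUES (?, ?, ?)",
        [
            (1, "a", "2009-10-01 08:00:00"),
            (2, "b", "2009-10-05 12:30:00"),
            (3, "c", "2009-10-09 17:45:00"),
        ],
    )
    return conn


def changed_ids(conn, last_index_time):
    """Same shape as the deltaQuery: ids of rows modified after the last index run.

    Timestamps are 'YYYY-MM-DD HH:MM:SS' strings, so string comparison
    orders them chronologically.
    """
    rows = conn.execute(
        "SELECT shop_id FROM shop WHERE last_modified > ? ORDER BY shop_id",
        (last_index_time,),
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = make_shop_db()
    print(changed_ids(conn, "2009-10-04 00:00:00"))  # only rows changed since then
    print(changed_ids(conn, "1970-01-01 00:00:00"))  # every row
```

Running the real deltaQuery by hand against MySQL with a recent timestamp is a quick way to check whether the query or the stored timestamp is at fault.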
