iBATIS sqlmap with groupBy: how to prevent null record on child

Need help with an sqlmap groupBy. I'm getting an empty child object when there is no child relationship.
<resultMap id="GrpMap" class="Grp" groupBy="GroupId">
<result column="grp_id" property="GroupId" jdbcType="UUID"/>
<result column="nm" property="name" jdbcType="VARCHAR"/>
<result property="children" resultMap="Groups.childMap"/>
</resultMap>
<resultMap id="childMap" class="child">
<result column="child_ky" property="childKey" jdbcType="UUID"/>
<result column="name" property="name" jdbcType="VARCHAR"/>
</resultMap>
Is there a way to specify that, if there are no children, the relationship should not be populated? My SQL query uses a left outer join, so it returns null records for the child.
I want to do something like isNotNull column="child_ky" so the child does not get populated:
<resultMap id="GrpMap" class="Grp" groupBy="GroupId">
<result column="grp_id" property="GroupId" jdbcType="UUID"/>
<result column="nm" property="name" jdbcType="VARCHAR"/>
<isnotnull child_ky>
<result property="children" resultMap="Groups.childMap"/>
</inotnull>
</resultMap>

iBatis offers the notNullColumn parameter for the result tag:
<resultMap id="GrpMap" class="Grp" groupBy="GroupId">
[...]
<result property="children" resultMap="Groups.childMap" notNullColumn="child_ky"/>
</resultMap>
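For reference, a minimal sketch of the kind of left-join select that would feed this map; the statement id and table aliases are assumptions, only grp_id, nm, child_ky and name come from the maps above:
<select id="getGroupsWithChildren" resultMap="GrpMap">
  <!-- Hypothetical table names. Groups with no children produce rows with
       a NULL child_ky; notNullColumn="child_ky" makes iBATIS skip the
       child object for those rows instead of adding an empty one. -->
  SELECT g.grp_id, g.nm, c.child_ky, c.name
  FROM grp g
  LEFT OUTER JOIN child c ON c.grp_id = g.grp_id
</select>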

Related

How does SOLR Cell add document content?

SOLR has a module called Cell. It uses Tika to extract content from documents and index it with SOLR.
From the sources at https://github.com/apache/lucene-solr/tree/master/solr/contrib/extraction, I conclude that Cell places the raw extracted document text into a field called "content". The field is indexed by SOLR, but not stored. When you query for documents, "content" doesn't come up.
My SOLR instance has no schema (I left the default schema in place).
I'm trying to implement a similar kind of behavior using the default UpdateRequestHandler (POST to /solr/corename/update). The POST request goes:
<add commitWithin="60000">
  <doc>
    <field name="content">lorem ipsum</field>
    <field name="id">123456</field>
    <field name="someotherfield_i">17</field>
  </doc>
</add>
With documents added in this manner, the content field is indexed and stored. It's present in query results. I don't want it to be; it's a waste of space.
What am I missing about the way Cell adds documents?
If you don't want your field to store the contents, you have to set the field as stored="false".
Since you're using the schemaless mode (there still is a schema, it's just generated dynamically when new fields are added), you'll have to use the Schema API to change the field.
You can do this by issuing a replace-field command:
curl -X POST -H 'Content-type:application/json' --data-binary '{
  "replace-field":{
    "name":"content",
    "type":"text",
    "stored":false
  }
}' http://localhost:8983/solr/collection/schema
You can see the defined fields by issuing a request against /collection/schema/fields.
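For example, assuming the default port and a collection named collection:
curl http://localhost:8983/solr/collection/schema/fields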
The Cell code indeed adds the content to the document as content, but there's a built-in field translation rule that maps content to _text_. In schemaless SOLR, _text_ is defined as not stored.
The rule is invoked by the following line in the SolrContentHandler.addField():
String name = findMappedName(fname);
In the params object there's an fmap.content rule that says the content field should be mapped to _text_. It comes from corename\conf\solrconfig.xml, where by default there's the following fragment:
<requestHandler name="/update/extract"
                startup="lazy"
                class="solr.extraction.ExtractingRequestHandler" >
  <lst name="defaults">
    <str name="lowernames">true</str>
    <str name="fmap.meta">ignored_</str>
    <str name="fmap.content">_text_</str> <!-- This one! -->
  </lst>
</requestHandler>
Meanwhile, in corename\conf\managed_schema there's a line:
<field name="_text_" type="text_general" multiValued="true" indexed="true" stored="false"/>
And that's the whole story.
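A corollary: if you do want Cell's extracted text stored, you can override the mapping per request instead of editing solrconfig.xml. A sketch, assuming a stored field named content_stored exists in your schema (fmap.content, literal.id and commit are standard ExtractingRequestHandler parameters; content_stored is a hypothetical field name):
curl "http://localhost:8983/solr/corename/update/extract?literal.id=123456&fmap.content=content_stored&commit=true" -F "file=@document.pdf"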

Hi, I want the file name using FileListEntityProcessor and LineEntityProcessor

This is my data-config.xml. I can't use the TikaEntityProcessor. Is there any way I can do it with the LineEntityProcessor?
I am using Solr 4.4 to index millions of documents. I want the file names and modified times to be indexed as well, but could not find a way to do it.
In the data-config.xml I am fetching files using the FileListEntityProcessor and then parsing each line using the LineEntityProcessor.
<dataConfig>
  <dataSource encoding="UTF-8" type="FileDataSource" name="fds" />
  <document>
    <entity
        name="files"
        dataSource="null"
        rootEntity="false"
        processor="FileListEntityProcessor"
        baseDir="C:/Softwares/PlafFiles/"
        fileName=".*\.PLF"
        recursive="true">
      <field column="fileLastModified" name="last_modified" />
      <entity name="na_04"
          processor="LineEntityProcessor"
          dataSource="fds"
          url="${files.fileAbsolutePath}"
          transformer="script:parseRow23">
        <field column="url" name="Plaf_filename"/>
        <field column="source" />
        <field column="pict_id" name="pict_id" />
        <field column="pict_type" name="pict_type" />
        <field column="hierarchy_id" name="hierarchy_id" />
        <field column="book_id" name="book_id" />
        <field column="ciscode" name="ciscode" />
        <field column="plaf_line" />
      </entity>
    </entity>
  </document>
</dataConfig>
From the documentation of FileListEntityProcessor:
The implicit fields generated by the FileListEntityProcessor are fileDir, file, fileAbsolutePath, fileSize, fileLastModified and these are available for use within the entity [..].
You can move these values into differently named fields by referencing them:
<field column="file" name="filenamefield" />
<field column="fileLastModified" name="last_modified" />
This will require that you have a schema.xml that actually allows those two names.
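A sketch of such declarations; the field types here are assumptions, pick whatever matches your data:
<field name="filenamefield" type="string" indexed="true" stored="true" />
<field name="last_modified" type="date" indexed="true" stored="true" />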
If you need to embed them in another string or manipulate them further before inserting:
You're already using files.fileAbsolutePath, so by using ${files.file} and ${files.fileLastModified} you should be able to extract the values you want.
You can modify these values and insert them into a specific field by using the TemplateTransformer and referencing the generated fields:
<field column="filename" template="file:///${files.file}" />

solr 4.4 multiple datasource connection

In my db-data-config.xml I have configured two dataSources, each with its own name, for example:
<dataSource name="test1"
type="JdbcDataSource"
driver="com.mysql.jdbc.Driver"
url="jdbc:mysql://localhost/firstdb"
user="username1"
password="psw1"/>
<dataSource name="test2"
type="JdbcDataSource"
driver="com.mysql.jdbc.Driver"
url="jdbc:mysql://localhost/seconddb"
user="username2"
password="psw2"/>
<document name="content">
<entity name="news" datasource="test1" query="select...">
<field column="OTYPE_ID" name="otypeID" />
<field column="NWS_ID" name="cntID" />
....
</entity>
<entity name="news_update" datasource="test2" query="select...">
<field column="OTYPE_ID" name="otypeID" />
<field column="NWS_ID" name="cntID" />
....
</entity>
</document>
</dataConfig>
But when I execute the second entity's query from the Solr dataimport, it throws an exception:
"Table 'firstdb.secondTable' doesn't exist\n\tat"
Could someone help me? Thank you in advance.
I think that your query for news_update is wrong. You must have an error in the table name.
I'm pretty sure this question showed up on the solr-user mailing list. The answer given there was that you are using datasource in your entity tags instead of dataSource. It's case sensitive. If I recall the thread correctly, changing this solved your problem.
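In other words, only the attribute casing in the entity tags from the question needs to change:
<entity name="news" dataSource="test1" query="select...">
...
<entity name="news_update" dataSource="test2" query="select...">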

How do you use the Solr DIH to select XML based on descriptive values?

The XML has some descriptive fields and I would like to use them to select specific fields. Is there a way to get the data import handler to pick only "Text Block A" and "Text Block B" using "code=34089-3" as a key? The code field has no data but it is unique to the type of information I want to pick. When I use xpath="/document/component/section/text/paragraph" I end up with text blocks A, B, C and D. Ideally I would like to be able to pick only text block A. Is this even possible?
<component>
  <section>
    <id root="f915965e-fe3b-44eb-a2ed-c11f807e7f23"/>
    <code code="34089-3"/>
    <title>Title A</title>
    <text>
      <paragraph>Text Block A</paragraph>
      <paragraph>Text Block B</paragraph>
    </text>
  </section>
</component>
<component>
  <section>
    <id root="80b7e2f1-f49f-4309-a340-210536705d4a"/>
    <code code="34090-1"/>
    <title>Title B</title>
    <text>
      <paragraph>Text Block C</paragraph>
      <paragraph>Text Block D</paragraph>
    </text>
  </section>
</component>
<entity
    name="IUPAC"
    processor="XPathEntityProcessor"
    forEach="/document"
    url="${f.fileAbsolutePath}">
  <field column="chemical_name" xpath="/document/component/section/code[@code='34089-3']/access below values???" />
</entity>
Try something like this:
/document/component/section[code/@code='34089-3']/text/paragraph
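Be aware that the XPathEntityProcessor only supports a limited XPath subset, so whether a predicate on a child element works depends on your Solr version. As a sketch (not a verified configuration), it would slot into the entity from the question like this:
<entity
    name="IUPAC"
    processor="XPathEntityProcessor"
    forEach="/document"
    url="${f.fileAbsolutePath}">
  <!-- If the DIH XPath subset rejects this predicate, pre-filter the
       XML (for example with XSLT) before handing it to the DIH. -->
  <field column="chemical_name"
         xpath="/document/component/section[code/@code='34089-3']/text/paragraph" />
</entity>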

unsupported type Exception on importing documents from Database with Solr 4.0

I looked up information provided on a related question to set up an import of all documents stored within a MySQL database.
You can find the original question here.
Thanks to the steps provided, I was able to make it work with my MySQL DB. My config looks identical to the one mentioned at the above link.
<dataConfig>
  <dataSource name="db"
              jndiName="java:jboss/datasources/somename"
              type="JdbcDataSource"
              convertType="false" />
  <dataSource name="dastream" type="FieldStreamDataSource" />
  <dataSource name="dareader" type="FieldReaderDataSource" />
  <document name="docs">
    <entity name="doc" query="select * from document" dataSource="db">
      <field name="id" column="id" />
      <field name="name" column="descShort" />
      <entity name="comment"
              transformer="HTMLStripTransformer" dataSource="db"
              query="select id, body, subject from comment where iddoc='${doc.id}'">
        <field name="idComm" column="id" />
        <field name="detail" column="body" stripHTML="true" />
        <field name="subject" column="subject" />
      </entity>
      <entity name="attachments"
              query="select id, attName, attContent, attContentType from Attachment where iddoc='${doc.id}'"
              dataSource="db">
        <field name="attachment_name" column="attName" />
        <field name="idAttachment" column="id" />
        <field name="attContentType" column="attContentType" />
        <entity name="attachment"
                dataSource="dastream"
                processor="TikaEntityProcessor"
                url="attContent"
                dataField="attachments.attContent"
                format="text"
                onError="continue">
          <field column="text" name="attachment_detail" />
        </entity>
      </entity>
    </entity>
  </document>
</dataConfig>
I have a variety of attachments in the DB, such as JPEG, PDF, Excel, Word and plain text. Everything works great for most of the binary data (JPEG, PDF, Word and such), but the import fails for certain files. It appears that the datasource is set up to throw an exception when it encounters a String instead of an InputStream. I set onError="continue" on the "attachment" entity to ensure that the DataImport went through despite this error, and noticed that this problem happens for a number of files. The exception is given below. Ideas?
Exception in entity : attachment:java.lang.RuntimeException: unsupported type : class java.lang.String
at org.apache.solr.handler.dataimport.FieldStreamDataSource.getData(FieldStreamDataSource.java:89)
at org.apache.solr.handler.dataimport.FieldStreamDataSource.getData(FieldStreamDataSource.java:48)
at org.apache.solr.handler.dataimport.TikaEntityProcessor.nextRow(TikaEntityProcessor.java:103)
at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:243)
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:465)
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:491)
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:491)
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:404)
at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:319)
at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:227)
at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:422)
at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:487)
at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:468)
I know this is an outdated question, but it appears to me that this exception is thrown when the BLOB (I work with Oracle) is null. When I add a where clause like "blob_column is not null", the problem disappears for me (Solr 4.10.1).
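Applied to the attachments query from the question's config, the only change is the added condition:
select id, attName, attContent, attContentType
from Attachment
where iddoc='${doc.id}' and attContent is not null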
