I managed to set up a connection to my SQL Server instance to import data into
Solr. The goal is to import FileTables eventually, but first I want to get it
working with regular tables. So I created the following:
data-config.xml
<dataConfig>
<dataSource type="JdbcDataSource"
driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"
url="jdbc:sqlserver://localhost;databaseName=inConnexion_Tenant2;integratedSecurity=true" />
<document>
<entity name="Dashboard" pk="id" query="SELECT Id,PublicId FROM foundation.Shops">
<field column="Id" name="Id"/>
<field column="PublicId" name="PublicId" />
</entity>
</document>
</dataConfig>
schema.xml
I added
<field name="Id"
type="string" indexed="true" stored="true" required="true" multiValued="false" />
<field name="PublicId"
type="string" indexed="true" stored="true" multiValued="false"/>
and changed the uniqueKey entry to
<uniqueKey>Id</uniqueKey>
When I try to import my data (which is just data like Id: 5, PublicId:
"test"), I get the following error in the log:
Error creating document : SolrInputDocument(fields: [PublicId=10065,
Id=117])
I have one multivalued date field; its definition in schema.xml is shown below:
<field name="fecha_referencia" type="pdates" uninvertible="true" indexed="true" stored="true"/>
It holds at most three values. Here is an example of a document that is already indexed:
fecha_referencia:["2015-12-04T00:00:00Z",
"2014-12-15T00:00:00Z",
"2014-02-03T00:00:00Z"]
I want to know whether the values can be split at index time (I am indexing via DIH) into other dynamic fields or separate fields.
Example of what I am looking for:
fecha_referencia:["2015-12-04T00:00:00Z",
"2014-12-15T00:00:00Z",
"2014-02-03T00:00:00Z"],
fecha1:2015-12-04T00:00:00Z,
fecha2:2014-12-15T00:00:00Z,
fecha3:2014-02-03T00:00:00Z
Note: I have tried using a regex but have had no luck.
Any contribution would be of great help and much appreciated.
This is my data.config.xml structure:
<dataConfig>
<dataSource type="JdbcDataSource" driver="org.postgresql.Driver" url="jdbc:postgresql://10.152.11.47:5433/meta" user="us" password="ntm" URIEncoding="UTF-8" />
<document >
<entity name="tr_ident" query="SELECT id_ident, titulo,proposito,descripcion,palabra_cve
FROM ntm_p.tr_ident">
<field column="id_ident" name="id_ident" />
<field column="titulo" name="titulo" />
<field column="proposito" name="proposito" />
<field column="descripcion" name="descripcion" />
<field column="palabra_cve" name="palabra_cve" />
<entity name="tr_fecha_insumo" query="select fecha_creacion,fech_ini_verif,
fech_fin_verif from ntm_p.tr_fecha_insumo where id_fecha_insumo='${tr_ident.id_ident}'">
<field name="fecha_creacion" column="fecha_creacion" />
<field name="fech_ini_verif" column="fech_ini_verif" />
<field name="fech_fin_verif" column="fech_fin_verif" />
</entity>
<entity name="ti_fecha_evento"
query="select tipo_fecha,fecha_referencia from ntm_p.ti_fecha_evento where id_fecha_evento='${tr_ident.id_ident}'">
<field column="fecha_referencia" name="fecha_referencia" />
<entity name="tc_tipo_fecha" query="select des_tipo_fecha,id_tipo_fecha from ntm_p.tc_tipo_fecha where id_tipo_fecha='${ti_fecha_evento.tipo_fecha}'">
<field column="des_tipo_fecha" name="des_tipo_fecha" />
<field column="id_tipo_fecha" name="id_tipo_fecha" />
</entity>
</entity>
</entity>
</document>
</dataConfig>
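One possible approach, sketched here under several assumptions, is to number the dates in SQL and let a ScriptTransformer copy each one into its own field. The function name splitFechas and the column alias pos are hypothetical, the parent query is trimmed to two columns for brevity, and the sketch assumes that schema.xml declares single-valued date fields fecha1, fecha2 and fecha3 (or a dynamic field that matches them).

```xml
<dataConfig>
  <script><![CDATA[
    // Hypothetical helper: copies each date into fecha1, fecha2 or fecha3,
    // depending on the position computed by the SQL query.
    function splitFechas(row) {
      var pos = row.get('pos');
      if (pos != null) {
        row.put('fecha' + pos, row.get('fecha_referencia'));
      }
      return row;
    }
  ]]></script>
  <dataSource type="JdbcDataSource" driver="org.postgresql.Driver"
              url="jdbc:postgresql://10.152.11.47:5433/meta" user="us" password="ntm" />
  <document>
    <entity name="tr_ident" query="SELECT id_ident, titulo FROM ntm_p.tr_ident">
      <field column="id_ident" name="id_ident" />
      <field column="titulo" name="titulo" />
      <!-- row_number() numbers the dates of one parent row: 1, 2, 3 -->
      <entity name="ti_fecha_evento" transformer="script:splitFechas"
              query="select fecha_referencia, cast(row_number() over (order by fecha_referencia desc) as text) as pos from ntm_p.ti_fecha_evento where id_fecha_evento='${tr_ident.id_ident}'">
        <field column="fecha_referencia" name="fecha_referencia" />
      </entity>
    </entity>
  </document>
</dataConfig>
```

The multivalued fecha_referencia field is still filled as before; the transformer only adds the extra per-position fields.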
I have a Solr project. I want to load the data from my CSV file into Solr using the DataImportHandler. I wrote this db-data-config.xml:
<dataConfig>
<dataSource type="FileDataSource"/>
<document>
<entity name="item" processor="FileListEntityProcessor" fileName="TableArchive.csv" baseDir="${solr.install.dir}/server/solr/archiveCore" dataSource="null" recursive="true" rootEntity="false">
<field column="NameAdded" name="NameAdded" />
<field column="DateAdded" name="DateAdded" />
<field column="NameModified" name="NameModified" />
<field column="DateModified" name="DateModified" />
<field column="strSO" name="strSO" />
<field column="strCust" name="strCust" />
<field column="strOperator" name="strOperator" />
<field column="PackName" name="PackName" />
<field column="DocName" name="DocName" />
</entity>
</document>
</dataConfig>
When I run the DataImportHandler from the Solr admin panel, it does not index the file. I don't know how to solve this.
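One thing worth noting is that FileListEntityProcessor only lists files; it does not read their contents. A minimal sketch of one way to read the lines, assuming a simple comma-separated file with nine unquoted columns and a header row starting with NameAdded, nests a LineEntityProcessor with a RegexTransformer inside it. The data source name fds and the entity names files and line are hypothetical, and the schema is assumed to declare the nine fields.

```xml
<dataConfig>
  <dataSource name="fds" type="FileDataSource" encoding="UTF-8" />
  <document>
    <!-- Outer entity: only finds the file (fileName is a regular expression) -->
    <entity name="files" processor="FileListEntityProcessor"
            fileName="TableArchive\.csv"
            baseDir="${solr.install.dir}/server/solr/archiveCore"
            recursive="true" rootEntity="false" dataSource="null">
      <!-- Inner entity: one document per line; each line arrives in the rawLine column -->
      <entity name="line" processor="LineEntityProcessor"
              url="${files.fileAbsolutePath}" dataSource="fds"
              skipLineRegex="^NameAdded,.*"
              transformer="RegexTransformer">
        <!-- Naive split: breaks if a value contains a quoted comma -->
        <field column="rawLine"
               regex="^([^,]*),([^,]*),([^,]*),([^,]*),([^,]*),([^,]*),([^,]*),([^,]*),([^,]*)$"
               groupNames="NameAdded,DateAdded,NameModified,DateModified,strSO,strCust,strOperator,PackName,DocName" />
      </entity>
    </entity>
  </document>
</dataConfig>
```

If the schema has a required uniqueKey, one of the columns has to be mapped to it, or an id has to be generated by a transformer.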
I want to use multiple data sources in the DataImportHandler in Solr and pass a URL value to a child entity after querying the database in the parent entity.
Here is my rss-data-config file:
<dataConfig>
<dataSource type="JdbcDataSource" name="ds-db" driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost:3306/HCDACoreDB"
user="root" password="CDA#318"/>
<dataSource type="URLDataSource" name="ds-url"/>
<document>
<entity name="feeds" query="select f.feedurl, f.feedsource, c.categoryname from feeds f, category c where f.feedcategory = c.categoryid">
<field column="feedurl" name="url" dataSource="ds-db"/>
<field column="categoryname" name="category" dataSource="ds-db"/>
<field column="feedsource" name="source" dataSource="ds-db"/>
<entity name="rss"
transformer="HTMLStripTransformer"
forEach="/RDF/channel | /RDF/item"
processor="XPathEntityProcessor"
url="${dataimporter.functions.encodeUrl(feeds.feedurl)}" >
<field column="source-link" dataSource="ds-url" xpath="/rss/channel/link" commonField="true" />
<field column="Source-desc" dataSource="ds-url" xpath="/rss/channel/description" commonField="true" />
<field column="title" dataSource="ds-url" xpath="/rss/channel/item/title" />
<field column="link" dataSource="ds-url" xpath="/rss/channel/item/link" />
<field column="description" dataSource="ds-url" xpath="/rss/channel/item/description" stripHTML="true"/>
<field column="pubDate" dataSource="ds-url" xpath="/rss/channel/item/pubDate" />
<field column="guid" dataSource="ds-url" xpath="/rss/channel/item/guid" />
<field column="content" dataSource="ds-url" xpath="/rss/channel/item/content" />
<field column="author" dataSource="ds-url" xpath="/rss/channel/item/creator" />
</entity>
</entity>
</document>
</dataConfig>
What I am doing is querying the database in the first entity, named feeds, and I want to use feedurl as the URL for the child entity named rss.
The error I get when I run the data import is:
java.net.MalformedURLException: no protocol: nullselect f.feedurl, f.feedsource, c.categoryname from feeds f, category c where f.feedcategory = c.categoryid
The URL is null, meaning it is not assigning feedurl to the URL.
Any suggestions on what I am doing wrong?
Here's an example:
<?xml version="1.0" encoding="UTF-8"?>
<dataConfig>
<dataSource name="db1" ... />
<dataSource name="db2"... />
<document>
<entity name="outer" dataSource="db1" query=" ... ">
<field column="id" />
<entity name="inner" dataSource="db2" query=" select from ... where id = ${outer.id} ">
<field column="innercolumn" splitBy=":::" />
</entity>
</entity>
</document>
</dataConfig>
The idea is to have a nested entity definition that runs the extra query against the other database. The dataSource attribute goes on each entity, not on the individual fields.
You can access the parent entity's fields like this: ${outer.id}
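Applied to the RSS question, this might look roughly like the sketch below. The error message there contains the SQL text, which suggests the query was being handed to the URLDataSource; putting dataSource on each entity should avoid that. Two further changes are assumptions rather than confirmed fixes: the url uses ${feeds.feedurl} directly, since encoding a complete URL would also encode its protocol part, and forEach is aligned with the /rss/... xpaths on the assumption that the feeds are RSS 2.0. Only a few of the fields are shown.

```xml
<dataConfig>
  <dataSource type="JdbcDataSource" name="ds-db" driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost:3306/HCDACoreDB"
              user="root" password="CDA#318" />
  <dataSource type="URLDataSource" name="ds-url" />
  <document>
    <!-- Parent entity reads from the database -->
    <entity name="feeds" dataSource="ds-db"
            query="select f.feedurl, f.feedsource, c.categoryname from feeds f, category c where f.feedcategory = c.categoryid">
      <field column="feedurl" name="url" />
      <field column="categoryname" name="category" />
      <field column="feedsource" name="source" />
      <!-- Child entity fetches the feed found in the parent row -->
      <entity name="rss" dataSource="ds-url"
              processor="XPathEntityProcessor"
              url="${feeds.feedurl}"
              forEach="/rss/channel | /rss/channel/item"
              transformer="HTMLStripTransformer">
        <field column="source-link" xpath="/rss/channel/link" commonField="true" />
        <field column="title" xpath="/rss/channel/item/title" />
        <field column="link" xpath="/rss/channel/item/link" />
        <field column="description" xpath="/rss/channel/item/description" stripHTML="true" />
      </entity>
    </entity>
  </document>
</dataConfig>
```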
I'm new to Solr and I'm struggling to import some XML data that does not contain an id field, although my schema.xml requires one.
An XML example:
<results>
<estacions>
<estacio id="72400" nom="Aeroport"/>
<estacio id="79600" nom="Arenys de Mar"/>
...
</estacions>
</results>
Schema.xml:
<uniqueKey>id</uniqueKey>
I need to fetch this XML over HTTP and import it, so I am using the DataImportHandler.
This is my data-config.xml:
<dataConfig>
<dataSource type="URLDataSource" />
<document>
<entity name="renfe"
url="http://host_url/myexample.xml"
processor="XPathEntityProcessor"
forEach="/results/estacions/estacio"
transformer="script:generateCustomId">
<field column="idestacio" xpath="/results/estacions/estacio/@id" commonField="true" />
<field column="nomestacio" xpath="/results/estacions/estacio/@nom" commonField="true" />
</entity>
</document>
</dataConfig>
The import seems to run, but I get the following error:
org.apache.solr.common.SolrException: [doc=null] missing required field: id
This makes me think that I should generate an id automatically during the import, using data-config.xml, but I can't see how to do it.
How should I do this? With a ScriptTransformer? Any ideas would be appreciated.
And another question: can I force a value during the import?
For example: <field column="site" value="estacions"/> (obviously this does not work)
You can use the code below to generate an ID:
<dataConfig>
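<script><![CDATA[
// Counter shared by all rows of one import run; ids follow the row order.
var id = 1;
function GenerateId(row) {
// Store the next counter value as a string in the required 'id' field.
row.put('id', (id++).toFixed());
return row;
}
]]></script>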
<dataSource type="URLDataSource" />
<document>
<entity name="renfe"
url="http://host_url/myexample.xml"
processor="XPathEntityProcessor"
forEach="/results/estacions/estacio"
transformer="script:GenerateId">
<field column="idestacio" xpath="/results/estacions/estacio/@id" commonField="true" />
<field column="nomestacio" xpath="/results/estacions/estacio/@nom" commonField="true" />
</entity>
</document>
</dataConfig>
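Since each estacio element in the sample XML already carries an id attribute, an alternative worth considering is to map that attribute straight to the uniqueKey instead of generating a counter. The same sketch also shows one way to force a constant value, using the TemplateTransformer and its template attribute. It assumes that the station ids are unique and that schema.xml declares the nomestacio and site fields.

```xml
<dataConfig>
  <dataSource type="URLDataSource" />
  <document>
    <entity name="renfe"
            url="http://host_url/myexample.xml"
            processor="XPathEntityProcessor"
            forEach="/results/estacions/estacio"
            transformer="TemplateTransformer">
      <!-- Reuse the station's own id attribute as the Solr uniqueKey -->
      <field column="id" xpath="/results/estacions/estacio/@id" />
      <field column="nomestacio" xpath="/results/estacions/estacio/@nom" />
      <!-- Constant value added to every document -->
      <field column="site" template="estacions" />
    </entity>
  </document>
</dataConfig>
```

Unlike a running counter, ids taken from the source data stay the same across re-imports, so documents are updated rather than duplicated.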
I have this "catch all" field in my schema.xml:
<dynamicField name="*_s" type="string" indexed="true" stored="true" />
In the example below, let's say I have a table with two fields, "custom_key" and "custom_value", with these values:
custom_key: "mykey"
custom_value: "myvalue"
My goal is to index a document that has a field called "mykey" with the value "myvalue". How can I do that?
<dataConfig>
<dataSource type="JdbcDataSource"
driver="com.mysql.jdbc.Driver"
url="jdbc:mysql://localhost/MY_DB"
user="MYUSER"
password="MYPASS"
batchSize="-1"/>
<document>
<entity name="article" query="SELECT id, custom_key, custom_value FROM mytable">
<field column="id" name="id"/>
<field column="custom_value" name=":::WHAT TO PUT HERE?:::_s"/>
</entity>
</document>
</dataConfig>
I found a (hacky?) solution that works for my purposes. I will not mark this question as answered for a few days, in case someone comes up with a cleaner or better solution.
<dataConfig>
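<script><![CDATA[
// Copy custom_value into a field whose name is built from custom_key.
// The suffix must match a dynamicField pattern in schema.xml
// (for the *_s field shown in the question it would be '_s').
function insertVariants(row) {
var key = row.get('custom_key');
if (key != null) {
row.put(key + '_custom', row.get('custom_value'));
}
return row;
}
]]></script>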
<dataSource type="JdbcDataSource"
driver="com.mysql.jdbc.Driver"
url="jdbc:mysql://localhost/MY_DB"
user="MYUSER"
password="MYPASS"
batchSize="-1"/>
<document>
<entity name="article" query="SELECT id, custom_key, custom_value FROM mytable" transformer="script:insertVariants">
<field column="id" name="id"/>
</entity>
</document>
</dataConfig>