Multiple indexes in the same Solr core? - solr

I am using Apache Solr and have the following scenario:
I have two tables in my PostgreSQL database. One is "Cars", the other is "Dealers".
I have a data-config file for Cars like the following:
<document name="offerings">
<entity name="jc_offerings" query="select * from jc_offerings" >
<field column="id" name="id" />
<field column="name" name="name" />
<field column="display_name" name="display_name" />
<field column="extra" name="extra" />
</entity>
</document>
I have a similar data-config.xml for "Dealers". It has the same fields as Cars: name, extra, etc.
In my schema.xml, I have defined the following fields:
<fields>
<field name="id" type="string" indexed="true" />
<field name="name" type="name" indexed="true" />
<field name="extra" type="extra" indexed="true" />
<field name="CarsText" type="text_general" indexed="true"
stored="true" multiValued="true"/>
</fields>
<uniqueKey>id</uniqueKey>
<defaultSearchField>CarsText</defaultSearchField>
<copyField source="name" dest="CarsText"/>
<copyField source="extra" dest="CarsText"/>
Now I want to search for something like "where name is Maruti". How will Solr know whether to search the Cars field "name" or the Dealers field "name"?
I have read the following link: http://wiki.apache.org/solr/MultipleIndexes
But I am not able to understand how it works.
After reading that link, I added another field to my Cars and Dealers data-config.xml files, something like:
<field name="type" value="car" /> : in the Cars data-config.xml
and
<field name="type" value="dealer" /> : in the Dealers data-config.xml
And then in schema.xml I created a new field:
<field name="type" type="string" indexed="true" stored="true" />
And then I ran a query like:
localhost:8983/solr/select?q=name:Maruti&fq=type:dealer
But it didn't work.
So what should I do?

If the fields are the same for both cars and dealers, you could use one index with the fields defined like so:
<fields>
<field name="id" type="string" indexed="true" stored="true"/>
<field name="name" type="name" indexed="true" stored="true" />
<field name="extra" type="extra" indexed="true" stored="true" />
<field name="description_text" type="text_general" indexed="true" stored="true" multiValued="true"/>
<field name="type" type="string" indexed="true" stored="true" />
</fields>
This will work for both cars and dealers (so you don't need two indexes), and you use the "type" field to choose whether you want a "dealer" or a "car". (I'm using the same system to filter similar types of objects with only a minor semantic difference.)
Also, you'll need to add stored="true" to the fields you want to retrieve; otherwise you'll only be able to use them for searching (hence indexed="true").

Adding a default value to the type field ensures that the type value is always set to car or dealer.
You will have to index the sources separately. Then use copyField, and you can easily filter on either car or dealer.
This does seem a bit tricky and is not explained well in the multi-indexes link referred to above.
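One way to set the type value during import is to select it as a constant column in each entity's SQL query. The following data-config.xml is a minimal sketch under stated assumptions: the dealers table name jc_dealers is hypothetical (the question only shows jc_offerings), and the ids are prefixed because both tables share one uniqueKey field and could otherwise overwrite each other. The string concatenation may need adjusting to your column types.

```xml
<document name="offerings">
  <!-- Cars: the SQL literal 'car' is imported as the value of the "type" field. -->
  <entity name="cars"
          query="select 'car-' || id as id, name, extra, 'car' as type from jc_offerings">
    <field column="id" name="id" />
    <field column="name" name="name" />
    <field column="extra" name="extra" />
    <field column="type" name="type" />
  </entity>
  <!-- Dealers: jc_dealers is a hypothetical table name, not taken from the question. -->
  <entity name="dealers"
          query="select 'dealer-' || id as id, name, extra, 'dealer' as type from jc_dealers">
    <field column="id" name="id" />
    <field column="name" name="name" />
    <field column="extra" name="extra" />
    <field column="type" name="type" />
  </entity>
</document>
```

With both entities imported and type declared as an indexed string field, a request such as localhost:8983/solr/select?q=name:Maruti&fq=type:dealer should return only dealer documents.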

Related

SOLR: copy 2 fields into another field and add filters to that new field

While importing, I have the fields below in a CSV file:
<field name="Brand" type="string" indexed="true"/>
<field name="Colour" type="lowercaseExactMatch"/>
<field name="Keywords" type="text_general"/>
<field name="Name" type="text_general" indexed="true"/>
<field name="Price" type="string" indexed="true"/>
<field name="SKU" type="string" multiValued="false" indexed="true" required="true" stored="true"/>
I want to create another field dynamically, NameKeywords, in which I want to concatenate the Name and Keywords fields.
I also want to apply lowercasing, EnglishPorterFilterFactory, EnglishPossessiveFilter, and HyphenatedWordsFilter to it.
I can apply filters to that field by creating a custom field type, but how do I combine two fields into another field?
I saw a copyField in my schema.xml:
<copyField source="Name" dest="Name_str" maxChars="256"/>
But I am not sure whether it is displayed anywhere or how to combine fields with it.
Create a field named NameKeywords as below.
<field name="NameKeywords" type="customFieldType" indexed="true" stored="true" multiValued="true"/>
Then copy the source fields to the destination field as below.
<copyField source="Name" dest="NameKeywords"/>
<copyField source="Keywords" dest="NameKeywords"/>
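The answer refers to a customFieldType without defining it. Below is a minimal sketch of such a type covering the filters the question lists. The tokenizer choice and the filter order are assumptions, and PorterStemFilterFactory stands in for EnglishPorterFilterFactory, which, as far as I know, recent Solr releases no longer ship. Note that copyField does not concatenate strings: each source value is added as a separate value of the multiValued destination, which is then analyzed by the destination's own field type.

```xml
<fieldType name="customFieldType" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Whitespace tokenizing keeps a trailing hyphen on a token, which the
         hyphenated-words filter needs in order to rejoin split words. -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.HyphenatedWordsFilterFactory"/>
    <!-- Strips a trailing 's from tokens. -->
    <filter class="solr.EnglishPossessiveFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Assumed replacement for EnglishPorterFilterFactory. -->
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>
```

Because analysis only affects the indexed terms, the stored values returned for NameKeywords would still be the original Name and Keywords text.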

Parent-child indexing in Apache Solr

I'm new to Apache Solr search. I can't work out how to get Solr search results with child documents.
My entity in data-config.xml:
<entity name="products" query="SELECT DISTINCT IDENTIFIER,PDT_NAME,PDT_DESCRIPTION FROM PARENT_TABLE"
deltaQuery="SELECT IDENTIFIER FROM PARENT_TABLE WHERE LAST_MODIFIED_DATE > '${dataimporter.last_index_time}'">
<field column="IDENTIFIER" name="pdtid" />
<field column="PDT_NAME" name="productname" />
<field column="PDT_DESCRIPTION" name="productdescription" />
<entity name="productVersions" child="true" query="SELECT DISTINCT child_id , child_name FROM WHERE IDENTIFIER = '${products.IDENTIFIER}'">
<field column="IDENTIFIER" name="productVersions.pdtesat" />
<field column="VERSION_NUMBER" name="productVersions.versionnum" />
<field column="DISPLAY_NAME" name="productVersions.displayname" />
</entity>
</entity>
Field details in the managed-schema file:
<field name="pdtid" type="text_general" indexed="true" stored="true" multiValued="false" />
<field name="productname" type="text_general" indexed="true" stored="true" multiValued="true" />
<field name="productnamerrr" type="text_general" indexed="true" stored="true" multiValued="false" />
<field name="productdescription" type="text_general" indexed="true" stored="true" multiValued="false" />
<field name="productVersions.childid" type="text_general" indexed="true" stored="true" multiValued="false" />
<field name="productVersions.versionnum" type="text_general" indexed="true" stored="true" multiValued="false" />
<field name="productVersions.displayname" type="text_general" indexed="true" stored="true" multiValued="false" />
I'm expecting my Solr result to be:
"response":{"numFound":26,"start":0,"docs":[
{
"productdescription":" Java",
"productnamerrr":"pdtid",
"pdtid":"6591",
"child_docs" : [
"productVersions":[
"productVersions.childid":"123"
"productVersions.versionnum":"V1"
"productVersions.displayname":"disp"],
"productVersions":[
"productVersions.childid":"456"
"productVersions.versionnum":"V2"
"productVersions.displayname":"disp2"]
],
"id":"92689209-dc5f-4ae6-bd3c-d55dbd0e200c",
"_version_":1599132440456069120},
Please help me get the multiple child docs in JSON format after indexing.
Edit, May 2:
My query result from the Solr search looks like the below.
"response":{"numFound":38,"start":0,"docs":[
{
"productdescription":" JIRA provides issue (bug) and project tracking
for the software development team.",
"productnamerrr":"Atlassian JIRA",
"productVersions":
["childid:6.x,versionnum:Jira 6.x,displayname :Withdrawn",
"childid:2.0.3,versionnum:Atlassian JIRA,displayname:Planning",
"childid:JIRA Server 5.0.1 - 6.3.15,versionnum:JIRA - JEditor,displayname :Withdrawn",
"childid:1.x,versionnum:Jira 1.x,displayname :Withdrawn"
],
"id":"0b5ba528-ef7a-49ba-a97b-2ea94922cbb5",
"_version_":1599297669816123392},
Edited on May 3, 2018:
The returned data is correct, but I'm expecting explicit parent and child documents. I am getting the child docs as below.
"productVersions":["childid:6.x,versionnum:Jira 6.x,displayname :Withdrawn",
"childid:2.0.3,versionnum:Atlassian JIRA,displayname:Planning",
"childid:JIRA Server 5.0.1 - 6.3.15,versionnum:JIRA - JEditor,displayname :Withdrawn",
"childid:1.x,versionnum:Jira 1.x,displayname :Withdrawn"
],
I expect them to look like the below.
"productVersions":[
"productVersions.childid":"123"
"productVersions.versionnum":"V1"
"productVersions.displayname":"disp"],
"productVersions":[
"productVersions.childid":"456"
"productVersions.versionnum":"V2"
"productVersions.displayname":"disp2"]
],
How can I change the query to get the child docs as separate entities?
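One commonly suggested approach for nested results is to index the versions as real child documents and ask Solr to attach them at query time with the [child] document transformer. The sketch below makes several assumptions: CHILD_TABLE and the doc_type field are hypothetical names, the child entity's column attributes are aligned with the columns its query actually selects, and doc_type would also have to be declared as an indexed string field in the schema. It is an illustration of the idea, not a verified fix for this setup.

```xml
<entity name="products"
        query="SELECT DISTINCT IDENTIFIER, PDT_NAME, PDT_DESCRIPTION, 'parent' AS DOC_TYPE FROM PARENT_TABLE">
  <field column="IDENTIFIER" name="pdtid" />
  <field column="PDT_NAME" name="productname" />
  <field column="PDT_DESCRIPTION" name="productdescription" />
  <!-- doc_type is a hypothetical marker field used to tell parents from children. -->
  <field column="DOC_TYPE" name="doc_type" />
  <!-- child="true" asks the import handler to index each row as a nested child document.
       CHILD_TABLE and its column names are hypothetical. -->
  <entity name="productVersions" child="true"
          query="SELECT DISTINCT CHILD_ID, VERSION_NUMBER, DISPLAY_NAME FROM CHILD_TABLE WHERE IDENTIFIER = '${products.IDENTIFIER}'">
    <field column="CHILD_ID" name="productVersions.childid" />
    <field column="VERSION_NUMBER" name="productVersions.versionnum" />
    <field column="DISPLAY_NAME" name="productVersions.displayname" />
  </entity>
</entity>
<!-- Query-time sketch: q=doc_type:parent&fl=*,[child parentFilter=doc_type:parent] -->
```

The parentFilter has to match all parent documents and no children, which is why a marker field such as doc_type is assumed here.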

Indexing is returning only a few of the columns specified in the query in the data-import XML

Indexing is returning only a few of the columns specified in the query in the data-import XML.
<entity
name="All_Manuals"
query="SELECT Query........"
dataSource="JdbcDataSource">
<field column="Column1" name="id" />
<field column="Column2" name="deptId" />
<field column="Column3" name="groupId" />
<field column="Column4" name="subGrpId" />
<field column="Column5" name="manualId" />
</entity>
We are indexing all of the columns above, but when we fetch results, only the first two columns are returned.
You need to add all the columns to your schema.xml, like this:
<field name="id" type="string" indexed="true" stored="true" />
<field name="deptId" type="string" indexed="true" stored="true" />
<field name="groupId" type="string" indexed="true" stored="true" />
..............................
And if you don't want a column to be indexed but still want that column in your results:
<field name="xxxxx" type="string" indexed="false" stored="true" />

How to index columns with same name but different data in solr

I have two tables, and both have a delete_status column, but these columns hold different data.
Code (data-config.xml):
<entity name="category_masters" query="SELECT
category_updated,delete_status,category_id,category_name FROM category_masters
where category_id='${type_masters.category_id}'">
<field column="category_id" name="id"/>
<field column="category_name" name="category_name" indexed="true" stored="true" />
<field column="delete_status" name="delete_status" indexed="true" stored="true" />
<field column="category_updated" name="category_updated" indexed="true"
stored="true" />
</entity>
<entity name="type_masters" pk="type_id" query="SELECT
type_updated,delete_status as type_masters_delte,type_id,category_id,type_name FROM type_masters
where type_id='${businessmasters.Business_Type}' ">
<field column="type_id" name="id"/>
<field column="category_id" name="category_id" indexed="true" stored="true" />
<field column="type_name" name="type_name" indexed="true" stored="true" />
<field column="delete_status" name="delete_status" indexed="true" stored="true" />
<field column="type_updated" name="type_updated" indexed="true" stored="true" />
How do I display data from both columns? I tried aliasing the columns, but it does not work.
When I query, I see only one delete_status column. Even if I make it multivalued, how do I tell which delete_status belongs to which table?
I want the data separately and can't make changes in the database.
In your case, I would use the DIH (DataImportHandler). You can define a join in data-config.xml to merge both tables.
The queries in that file support aliases for column names, like table1.delete_status as type_masters_delete
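The alias only helps if the field mapping refers to it: the column attribute has to name the alias, not the original column. A minimal sketch, assuming two hypothetical destination fields, type_delete_status and category_delete_status, that would also need to be declared in schema.xml. The queries are the ones from the question, trimmed for brevity, and the fragment is assumed to sit inside the businessmasters entity that the question references but does not show.

```xml
<entity name="type_masters" pk="type_id"
        query="SELECT type_id, category_id, type_name, delete_status AS type_delete_status FROM type_masters WHERE type_id='${businessmasters.Business_Type}'">
  <field column="type_name" name="type_name" />
  <!-- column matches the SQL alias, so this value lands in its own Solr field. -->
  <field column="type_delete_status" name="type_delete_status" />
  <entity name="category_masters"
          query="SELECT category_id, category_name, delete_status AS category_delete_status FROM category_masters WHERE category_id='${type_masters.category_id}'">
    <field column="category_name" name="category_name" />
    <!-- Second alias, second Solr field: the two statuses no longer collide. -->
    <field column="category_delete_status" name="category_delete_status" />
  </entity>
</entity>
```

The indexed and stored attributes are left out here on the assumption that they are declared on the fields in schema.xml.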

solr error: unknown field ignored_stream_source_info

I am working with Solr 4.2 on Windows 7.
My schema:
<field name="id" type="string" indexed="true" stored="true"
required="true"/>
<field name="author" type="string" indexed="true" stored="true"/>
<field name="comments" type="text" indexed="true" stored="true"/>
<field name="keywords" type="text" indexed="true" stored="true"/>
<field name="contents" type="text" indexed="true" stored="true"/>
<field name="title" type="text" indexed="true" stored="true"/>
<field name="revision_number" type="string" indexed="true"
stored="true" />
(The dynamic field name is ignored_*.)
I get an error:
[doc=8]unknown field ignored_stream_source_info
What does this error message mean?
Check the configuration of the fmap.content parameter.
By default, fmap.content maps to the text field, which is not defined in the schema.xml you posted. You can pass the parameter fmap.content=contents to map it to the contents field.
You can also set this in solrconfig.xml.
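As a sketch of what that solrconfig.xml change might look like, assuming the extracting handler is registered at /update/extract as in the stock example configuration. The uprefix value sends fields the schema does not know to ignored_*, so a matching dynamicField has to exist in schema.xml; an "unknown field ignored_..." error typically suggests it does not. The "ignored" field type shown is modeled on the Solr example schema and is an assumption about this setup.

```xml
<!-- solrconfig.xml: defaults for the extracting request handler -->
<requestHandler name="/update/extract"
                class="solr.extraction.ExtractingRequestHandler"
                startup="lazy">
  <lst name="defaults">
    <str name="lowernames">true</str>
    <!-- Prefix applied to extracted fields that are not defined in the schema. -->
    <str name="uprefix">ignored_</str>
    <!-- Send the extracted body text to the existing "contents" field. -->
    <str name="fmap.content">contents</str>
  </lst>
</requestHandler>

<!-- schema.xml: swallow everything that was prefixed with ignored_ -->
<fieldType name="ignored" class="solr.StrField" indexed="false" stored="false" multiValued="true"/>
<dynamicField name="ignored_*" type="ignored" multiValued="true"/>
```

With the dynamicField in place, metadata such as stream_source_info is accepted and discarded instead of failing the document.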
