solr accurate match through copyField and defaultOperator

In the Solr schema.xml, I configured the fields and copyField directives as follows:
<field name="title" type="text" indexed="true" stored="true" required="false" />
<field name="content" type="text" indexed="true" stored="true" required="false" />
<field name="all" type="text" indexed="true" stored="false" multiValued="true"/>
<copyField source="title" dest="all"/>
<copyField source="content" dest="all"/>
<solrQueryParser defaultOperator="AND"/>
Then I ran a data import that includes a document like this: title="sport", content="I like basket".
Now I set the query string to:
all:sport basket
I intend to retrieve this document by matching "sport" in the title field and "basket" in the content field.
That is, when I query "all:sport basket", Solr should hit the document built from this source:
<title>sport</title>
<content>I like basket</content>
But Solr's copyField doesn't seem to do this. Can anyone help?
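
A likely cause, going by standard Lucene query syntax: in all:sport basket the all: prefix binds only to sport, so basket is searched in the default field rather than in all. Grouping the terms as all:(sport basket) keeps both terms in the all field. Alternatively, here is a minimal sketch that makes all the default search field, so a bare query such as sport basket runs against it. It assumes a schema.xml of the same vintage as the one above; later Solr versions deprecate defaultSearchField in favor of the df request parameter.

```xml
<!-- Sketch: with "all" as the default field, unprefixed terms are searched in "all" -->
<defaultSearchField>all</defaultSearchField>
<!-- every term must match, as already configured in the question -->
<solrQueryParser defaultOperator="AND"/>
```

With either approach, the document matches because copyField puts the tokens of both title ("sport") and content ("I like basket") into all.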

Related

SOLR: copy 2 fields into another field and add filters to that new field

While importing, I have the fields below in a CSV file:
<field name="Brand" type="string" indexed="true"/>
<field name="Colour" type="lowercaseExactMatch"/>
<field name="Keywords" type="text_general"/>
<field name="Name" type="text_general" indexed="true"/>
<field name="Price" type="string" indexed="true"/>
<field name="SKU" type="string" multiValued="false" indexed="true" required="true" stored="true"/>
I want to create another field, NameKeywords, that combines the Name and Keywords fields.
I also want to apply lowercase, EnglishPorterFilterFactory, EnglishPossessiveFilter, and HyphenatedWordsFilter to it.
I can apply the filters to that field by creating a custom field type, but how do I combine two fields into another field?
I saw a copyField in my schema.xml:
<copyField source="Name" dest="Name_str" maxChars="256"/>
But I am not sure whether it is displayed anywhere, or how to combine fields with it.

Create a field named NameKeywords as below:
<field name="NameKeywords" type="customFieldType" indexed="true" stored="true" multiValued="true"/>
Then copy the source fields to the destination field:
<copyField source="Name" dest="NameKeywords"/>
<copyField source="Keywords" dest="NameKeywords"/>
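
The customFieldType referenced above is not shown in the answer. Here is a minimal sketch of what it might look like with the filters the question lists. The whitespace tokenizer and the filter order are assumptions, and PorterStemFilterFactory stands in for EnglishPorterFilterFactory, which was deprecated in later Solr releases.

```xml
<!-- Sketch only: tokenizer choice and filter order are assumptions -->
<fieldType name="customFieldType" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- rejoins words that were hyphenated across a line break -->
    <filter class="solr.HyphenatedWordsFilterFactory"/>
    <!-- strips trailing 's from tokens -->
    <filter class="solr.EnglishPossessiveFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>
```

Because NameKeywords is multiValued, Name and Keywords arrive as separate values rather than as one concatenated string; a search on NameKeywords matches terms from either.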

How to write nested schema.xml in solr?

The comment in the example schema.xml says:
<!-- points to the root document of a block of nested documents. Required for nested
document support, may be removed otherwise
-->
<field name="_root_" type="string" indexed="true" stored="false"/>
http://svn.apache.org/viewvc/lucene/dev/trunk/solr/example/solr/collection1/conf/schema.xml?view=markup
This field can be used with the block join query parsers:
https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-BlockJoinQueryParsers
What would schema.xml look like for nesting the following items?
Person: string
Address:
  city: string
  postcode: string

I know this is an old question, but I ran into a similar issue. Modifying my solution for yours, the fields you need to add to your schema.xml are as follows:
<field name="person" type="string" indexed="true" stored="true" />
<field name="address" type="string" indexed="true" stored="true" multiValued="true"/>
<field name="address.city" type="string" indexed="true" stored="true" />
<field name="address.postcode" type="string" indexed="true" stored="true" />
Then when you run it you should be able to add the following JSON to your Solr instance and see the matching output in the query:
{
"person": "John Smith",
"address": {
"city": "San Diego",
"postcode": 92093
}
}
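
For the block join parsers the question links to, child documents are nested inside the parent in the update message rather than declared as nested in schema.xml. Here is a sketch in Solr's XML update format. It assumes a uniqueKey field named id, a type field to tell parents from children, and flat city and postcode fields; all of these are hypothetical and differ from the field names in the answer above. The _root_ field is declared as in the question.

```xml
<add>
  <doc>
    <!-- parent document -->
    <field name="id">person-1</field>
    <field name="type">person</field>
    <field name="person">John Smith</field>
    <!-- child document, indexed in the same block as its parent -->
    <doc>
      <field name="id">address-1</field>
      <field name="type">address</field>
      <field name="city">San Diego</field>
      <field name="postcode">92093</field>
    </doc>
  </doc>
</add>
```

Under these assumptions, a query such as {!parent which="type:person"}city:"San Diego" would return the parent person whose address child matches.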

solr join function to query documents in multiple cores NullPointerException

I use a Solr join to query documents from two cores. My cores are defined as follows:
Post core:
<fields>
<!-- general -->
<field name="id" type="string" indexed="true" stored="true" multiValued="false" required="true"/>
<field name="creatorId" type="string" indexed="true" stored="true" multiValued="false" required="true"/>
.
.
.
</fields>
User core:
<fields>
<!-- general -->
<field name="id" type="string" indexed="true" stored="true" multiValued="false" required="true"/>
<field name="username" type="string" indexed="true" stored="true" multiValued="false" />
<field name="email" type="string" indexed="true" stored="true" multiValued="false" />
<field name="userBrief" type="string" indexed="true" stored="true" multiValued="false" />
<field name="jobNumber" type="string" indexed="true" stored="true" multiValued="false" />
</fields>
Now I want to query all users who have created a post. I use the join function, and my URL looks like this:
http://localhost:9080/solr/user/select?q=*:*&fq={!join from=creatorId to=id fromIndex=post}
But it doesn't work, and it throws an exception:
null: java.lang.NullPointerException
at org.apache.lucene.search.IndexSearcher.rewrite(IndexSearcher.java:559)
at org.apache.lucene.search.IndexSearcher.createNormalizedWeight(IndexSearcher.java:646)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:280)
.
.
.
I don't know why. Can you help me?

The fq parameter requires a valid query after the {!join ...} local params.
Try adding a match-all query to the end of the fq param, like this:
http://localhost:9080/solr/user/select?q=*:*&fq={!join from=creatorId to=id fromIndex=post}*:*
In a realistic setting you would likely want to filter the joined results in some way, for example, "Find me all action movies rated by this user updated in the past two weeks," where the movies and user ratings are stored as separate documents.

Solr 4 indexing creates a field with no name but all field values concatenated as value for this field

I have several text_en fields in Solr which are "Indexed" but not "Stored". I store these large text values for the document in MongoDB. However, when I look at the Solr index, every document has a field which has no name, and all the fields of the document (including the indexed but not stored ones) are stored in this field.
What is this field, and how can I eliminate it? It is increasing the size of my index.
<fields>
<dynamicField indexed="true" name="*_i" stored="true" type="int"/>
<dynamicField indexed="true" name="*_s" stored="true" type="string"/>
<dynamicField indexed="true" name="*_l" stored="true" type="long"/>
<dynamicField indexed="true" name="*_t" stored="true" type="text_en"/>
<dynamicField indexed="true" name="*_b" stored="true" type="boolean"/>
<dynamicField indexed="true" name="*_f" stored="true" type="float"/>
<dynamicField indexed="true" name="*_d" stored="true" type="double"/>
<dynamicField indexed="true" name="*_tiled" stored="false" type="double"/>
<dynamicField indexed="true" name="*_dt" stored="true" type="date"/>
<dynamicField indexed="true" name="*_p" stored="true" type="location"/>
<dynamicField name="random_*" type="random"/>
<dynamicField indexed="true" multiValued="true" name="attr_*" stored="true" type="string"/>
<dynamicField indexed="true" multiValued="true" name="*" stored="true" type="text_en"/>
<dynamicField indexed="true" multiValued="true" name="attr_*" stored="true" type="string"/>
<!-- My Custom Fields -->
<uniqueKey>id</uniqueKey>
<defaultSearchField>text_all</defaultSearchField>
<solrQueryParser defaultOperator="AND"/>
<copyField dest="author_display" source="author"/>
<copyField dest="keywords_display" source="keywords"/>
<copyField dest="text_all" source="id"/>
<copyField dest="text_all" source="url"/>
<copyField dest="text_all" source="title"/>
<copyField dest="text_all" source="description"/>
<copyField dest="text_all" source="keywords"/>
<copyField dest="text_all" source="author"/>
<copyField dest="text_all" source="body"/>
<copyField dest="text_all" source="*_t"/>
<copyField dest="spell" source="title"/>
<copyField dest="spell" source="body"/>
<copyField dest="spell" source="description"/>
<copyField dest="spell" source="author"/>
<copyField dest="autocomplete" source="title"/>
<copyField dest="autocomplete" source="body"/>
<copyField dest="autocomplete" source="description"/>
<copyField dest="autocomplete" source="author"/>
</fields>

You are seeing this behavior because of the following entry in your schema.xml file:
<dynamicField indexed="true" multiValued="true" name="*" stored="true" type="text_en"/>
This is a generic catch-all field that you have defined in your schema. If you send documents whose field names do not match any other field in the schema, either by pattern (via your other dynamicField settings) or by explicit field name, Solr will create that field "on the fly" as a text_en type that can hold multiple entries, since it is set up as multiValued="true". These fields are all being stored as well because of the stored="true" setting. I would recommend removing this field from your schema.xml and reindexing your data.
For more details on the settings in this file, see SchemaXml on the Solr Wiki.
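
If unknown fields should be silently dropped rather than rejected once the catch-all is removed, one option is the ignored type that ships in Solr's example schema. A sketch, assuming no type named ignored is defined in this schema yet:

```xml
<!-- neither indexed nor stored: values sent to this type are discarded -->
<fieldType name="ignored" class="solr.StrField" indexed="false" stored="false" multiValued="true"/>
<!-- catch-all: any field name not matched by another definition is ignored -->
<dynamicField name="*" type="ignored" multiValued="true"/>
```

This replaces the text_en catch-all, so unmatched fields no longer add to the index size.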

Multiple Indexes in same Solr Core..?

I am using Apache Solr and have the following scenario:
I have two tables in my PostgreSQL database. One is "Cars"; the other is "Dealers".
I have a data-config file for Cars like the following:
<document name="offerings">
<entity name="jc_offerings" query="select * from jc_offerings" >
<field column="id" name="id" />
<field column="name" name="name" />
<field column="display_name" name="display_name" />
<field column="extra" name="extra" />
</entity>
</document>
I have a similar data-config.xml for "Dealers". It has the same fields as Cars: name, extra, etc.
In my schema.xml, I have defined the following fields:
<fields>
<field name="id" type="string" indexed="true" />
<field name="name" type="name" indexed="true" />
<field name="extra" type="extra" indexed="true" />
<field name="CarsText" type="text_general" indexed="true"
stored="true" multiValued="true"/>
</fields>
<uniqueKey>id</uniqueKey>
<defaultSearchField>CarsText</defaultSearchField>
<copyField source="name" dest="CarsText"/>
<copyField source="extra" dest="CarsText"/>
Now I want to search for something like "where name is Maruti". How will Solr know whether to search the Cars field "name" or the Dealers field "name"?
I have read the following link: http://wiki.apache.org/solr/MultipleIndexes
But I am not able to understand how it works.
After reading that link, I added another field to my Cars and Dealers data-config.xml files, something like:
<field name="type" value="car" /> : in the Cars data-config.xml
and
<field name="type" value="dealer" /> : in the Dealers data-config.xml
Then in schema.xml I created a new field:
<field name="type" type="string" indexed="true" stored="true" />
And then I queried something like:
localhost:8983/solr/select?q=name:Maruti&fq=type:dealer
But it didn't work.
So what should I do?

If the fields are the same for both cars and dealers, you could use one index with an object defined like so:
<fields>
<field name="id" type="string" indexed="true" stored="true"/>
<field name="name" type="name" indexed="true" stored="true" />
<field name="extra" type="extra" indexed="true" stored="true" />
<field name="description_text" type="text_general" indexed="true" stored="true" multiValued="true"/>
<field name="type" type="string" indexed="true" stored="true" />
</fields>
This will work for both cars and dealers (so you don't need two indexes), and you use the "type" field to select whether you want a "dealer" or a "car". (I'm using the same system to filter similar types of objects with only a minor semantic difference.)
Also, you'll need to add stored="true" to the fields you want to retrieve; otherwise you'll only be able to use them for searching (which is what indexed="true" provides).

Adding a default value to the type field will ensure the type value is set to car or dealer.
You will have to index the sources separately. Then use copyField, and you can easily filter on either car or dealer.
This does seem a bit tricky and is not explained well on the MultipleIndexes page linked above.
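
One way to set the type value per source in the DataImportHandler config is the TemplateTransformer, which writes a constant into a column. This is a sketch, not the configuration from the question: the jc_dealers table and entity names are hypothetical, and it assumes ids are unique across both tables, since a duplicate id would overwrite the earlier document.

```xml
<document name="offerings">
  <!-- cars: every row is indexed with type=car -->
  <entity name="jc_offerings" transformer="TemplateTransformer"
          query="select * from jc_offerings">
    <field column="id" name="id" />
    <field column="name" name="name" />
    <field column="extra" name="extra" />
    <field column="type" template="car" />
  </entity>
  <!-- dealers: hypothetical table name; every row is indexed with type=dealer -->
  <entity name="jc_dealers" transformer="TemplateTransformer"
          query="select * from jc_dealers">
    <field column="id" name="id" />
    <field column="name" name="name" />
    <field column="extra" name="extra" />
    <field column="type" template="dealer" />
  </entity>
</document>
```

With the type field populated this way, the query from the question, q=name:Maruti&fq=type:dealer, restricts matches to dealer documents.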