Solr with String and Text field Queries - solr

<field name="NAME" type="string" indexed="true" stored="true" multiValued="true"/>
<field name="NAMETEXT" type="text" indexed="true" stored="true" multiValued="true"/>
My search queries are
NAME:22
NAMETEXT:22
I tried both queries. They return results with names such as
"index 2.2", "indexch 2.2", "index3 2.2"
Why are these values returned?

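Because the WordDelimiterFilter in your analyzer splits "2.2" on the period and, when number catenation is enabled, also indexes the joined form "22". That indexed term is what matches your query.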
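To make the effect concrete, here is a rough Python imitation of that splitting step. It is only a sketch: the function name and the catenate_numbers flag are made up for illustration, and the real filter has many more options than this.

```python
import re

def word_delimiter_tokens(text, catenate_numbers=True):
    """Very rough imitation of word-delimiter splitting for one token."""
    # Split on any run of characters that is neither a letter nor a digit.
    parts = [p for p in re.split(r"[^A-Za-z0-9]+", text) if p]
    tokens = list(parts)
    # Optionally add the joined form when every part is numeric: "2.2" -> "22".
    if catenate_numbers and len(parts) > 1 and all(p.isdigit() for p in parts):
        tokens.append("".join(parts))
    return tokens

indexed = word_delimiter_tokens("2.2")
print(indexed)          # the terms that would end up in the index
print("22" in indexed)  # whether a query for 22 would find one of them
```

With catenation switched off in this sketch, "2.2" only yields the two separate "2" parts, so a query for 22 would no longer match.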

Related

Search in json nested fields (schema + query)

I have the following JSON stored in a Riak bucket that is indexed for Solr search.
{
"date" : 1535673489,
"customer" : {
"name" : "X",
"id" : 1205643
}
}
My schema.xml fields currently look like this:
<field name="_version_" type="long" indexed="true" stored="true"/>
<field name="date" type="int" indexed="true" stored="true" multiValued="true"/>
Searching on date works perfectly fine with a query such as
$RIAK_HOST/search/query/order?wt=json&q=date:[1535553489%20TO%201535599999]
Unfortunately, I did not find any documentation that explains how to properly define and query a sub-field such as customer.name or customer.id.
Edit: According to the post "Riak search schema and nested fields", it seems that I need to create the fields as follows:
<field name="customer_name" type="string" indexed="true" stored="true" multiValued="true"/>
But when I query on these fields, I get no results.
Edit 2: I ran the following experiment and got no error from Riak.
I uploaded the file
{
"customer_name" : "toto",
"customer" : {
"name" : "tata"
}
}
When searching, Riak returned the value of the top-level "customer_name" field ("toto") and not the nested one ("tata"). Is it possible that nested search is disabled, or that nesting uses a different separator character?
The fields you need to add to your schema.xml are as follows:
<field name="date" type="string" indexed="true" stored="true"/>
<field name="customer" type="string" indexed="true" stored="true" multiValued="true"/>
<field name="customer.name" type="string" indexed="true" stored="true"/>
<field name="customer.id" type="string" indexed="true" stored="true"/>
Then query your index as follows:
$RIAK_HOST/search/query/order?wt=json&q=customer.name:t*
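The dotted names above suggest that nested keys are joined with a "." before they reach the schema. Assuming that is how the extractor behaves (the separator may be configurable, so treat it as an assumption), the mapping from the JSON document to field names can be sketched like this. The flatten helper is hypothetical and not part of Riak or Solr.

```python
def flatten(doc, separator=".", prefix=""):
    """Flatten nested dicts into {"parent.child": value} pairs."""
    flat = {}
    for key, value in doc.items():
        name = prefix + key
        if isinstance(value, dict):
            # Recurse, carrying the parent name as a prefix.
            flat.update(flatten(value, separator, name + separator))
        else:
            flat[name] = value
    return flat

order = {"date": 1535673489, "customer": {"name": "X", "id": 1205643}}
print(flatten(order))
```

Each key of the flattened result is a field name that would need a matching field (or dynamicField) in schema.xml.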

Find duplicates objects with solr4 and Haystack

I use Solr faceting to find duplicates. It works pretty well, but I can't figure out how to get the object IDs.
>>> from haystack.query import SearchQuerySet
>>> sqs = SearchQuerySet().facet('text_string', limit=-1)
>>> sqs.facet_counts()
{
'dates': {},
'fields': {
'text_string': [
('the red ballon', 4),
('my grand pa is an alien', 2),
('be kind rewind', 12),
],
},
'queries': {}
}
How can I get the IDs of the objects 'the red ballon', 'my grand pa is an alien', etc.? Do I have to add an id field to Solr's schema.xml?
I'm expecting something like this:
>>> sqs.facet_counts()
{
'dates': {},
'fields': {
'text_string': [
(object_id, 'the red ballon', 4),
(object_id, 'my grand pa is an alien', 2),
(object_id, 'be kind rewind', 12),
],
},
'queries': {}
}
EDIT: Added schema.xml and search_indexes.py
schema.xml for solr
...
<fields>
<!-- general -->
<field name="id" type="string" indexed="true" stored="true" multiValued="false" required="true"/>
<field name="django_ct" type="string" indexed="true" stored="true" multiValued="false"/>
<field name="django_id" type="string" indexed="true" stored="true" multiValued="false"/>
<field name="_version_" type="long" indexed="true" stored="true"/>
<dynamicField name="*_i" type="int" indexed="true" stored="true"/>
<dynamicField name="*_s" type="string" indexed="true" stored="true"/>
<dynamicField name="*_l" type="long" indexed="true" stored="true"/>
<dynamicField name="*_t" type="text_en" indexed="true" stored="true"/>
<dynamicField name="*_b" type="boolean" indexed="true" stored="true"/>
<dynamicField name="*_f" type="float" indexed="true" stored="true"/>
<dynamicField name="*_d" type="double" indexed="true" stored="true"/>
<dynamicField name="*_dt" type="date" indexed="true" stored="true"/>
<dynamicField name="*_p" type="location" indexed="true" stored="true"/>
<dynamicField name="*_coordinate" type="tdouble" indexed="true" stored="false"/>
<field name="text" type="text_en" indexed="true" stored="true" multiValued="false" termVectors="true" />
<field name="title" type="text_en" indexed="true" stored="true" multiValued="false" />
<!-- Used for duplicate content detection -->
<copyField source="title" dest="text_string" />
<field name="text_string" type="string" indexed="true" stored="true" multiValued="false" />
<field name="pk" type="long" indexed="true" stored="true" multiValued="false" />
</fields>
<!-- field to use to determine and enforce document uniqueness. -->
<uniqueKey>id</uniqueKey>
<!-- field for the QueryParser to use when an explicit fieldname is absent -->
<defaultSearchField>text</defaultSearchField>
<!-- SolrQueryParser configuration: defaultOperator="AND|OR" -->
<solrQueryParser defaultOperator="AND"/>
...
search_indexes.py
from haystack import indexes
# Video is the project's own model; import it from your app's models module.

class VideoIndex(indexes.SearchIndex, indexes.Indexable):
    text = indexes.CharField(document=True, use_template=True)
    pk = indexes.IntegerField(model_attr='pk')
    title = indexes.CharField(model_attr='title', boost=1.125)

    def index_queryset(self, using=None):
        # Only index videos that belong to the current site.
        return Video.on_site.all()

    def get_model(self):
        return Video
Faceting is the arrangement of search results into categories based on indexed terms. Within each category, Solr reports the number of hits for each relevant term; such a term is called a facet constraint. Faceting makes it easy for users to explore search results on sites such as movie sites and product review sites, where there are many categories and many items within a category.
Here are good examples of it:
faceting example by Yonik
faceting example on solr wiki
Facet counts contain only the term and its count, so in your case you will need to run a second query for each duplicated value to get the id and other details.
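A minimal sketch of that two-step approach is below. The ids_for_duplicates helper and the lookup_ids callback are hypothetical names. In a real project the callback would run a search (for instance a SearchQuerySet filtered on the facet value), but here a fake in-memory lookup stands in so the example runs on its own.

```python
def ids_for_duplicates(facet_counts, field, lookup_ids):
    """Map each facet value seen more than once to the ids of its documents."""
    duplicates = {}
    for value, count in facet_counts["fields"].get(field, []):
        if count > 1:
            # Second query: fetch the ids of documents holding this exact value.
            duplicates[value] = list(lookup_ids(field, value))
    return duplicates

# Stand-in for a real search call, used only to make the sketch runnable.
fake_index = {"the red ballon": [3, 8, 15, 21], "be kind rewind": [4]}

def fake_lookup(field, value):
    return fake_index.get(value, [])

counts = {"fields": {"text_string": [("the red ballon", 4), ("be kind rewind", 1)]}}
print(ids_for_duplicates(counts, "text_string", fake_lookup))
```

Passing the lookup as a callback keeps the duplicate detection independent of the search backend, which also makes it easy to test.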

How to write nested schema.xml in solr?

How do I write a nested schema.xml in Solr?
The comment in schema.xml says
<!-- points to the root document of a block of nested documents. Required for nested
document support, may be removed otherwise
-->
<field name="_root_" type="string" indexed="true" stored="false"/>
http://svn.apache.org/viewvc/lucene/dev/trunk/solr/example/solr/collection1/conf/schema.xml?view=markup
This field is used by the block join query parsers:
https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-BlockJoinQueryParsers
What would the schema.xml look like for nesting the following items?
Person string
Address
    city string
    postcode string
I know this is an old question, but I ran into a similar issue. Modifying my solution for yours, the fields you need to add to your schema.xml are as follows:
<field name="person" type="string" indexed="true" stored="true" />
<field name="address" type="string" indexed="true" stored="true" multiValued="true"/>
<field name="address.city" type="string" indexed="true" stored="true" />
<field name="address.postcode" type="string" indexed="true" stored="true" />
Then when you run it you should be able to add the following JSON to your Solr instance and see the matching output in the query:
{
"person": "John Smith",
"address": {
"city": "San Diego",
"postcode": 92093
}
}
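For comparison, the _root_ field and the block join parsers mentioned in the question rely on a different layout, where the address is indexed as a child document instead of as dotted fields. The sketch below only builds the JSON payload and the query string; the "id" and "type" fields and their values are assumptions made for illustration, and the schema would need matching field definitions.

```python
import json

# Parent document with one nested child, in Solr's JSON update format.
# The "id" and "type" fields are illustrative assumptions.
person = {
    "id": "person-1",
    "type": "person",
    "person": "John Smith",
    "_childDocuments_": [
        {"id": "person-1-address", "type": "address",
         "city": "San Diego", "postcode": "92093"},
    ],
}

# Body for a POST to the core's /update handler.
payload = json.dumps([person])

# Block join: return parent documents whose child matches the city.
query = '{!parent which="type:person"}city:"San Diego"'
print(payload)
print(query)
```

Which layout fits better depends on whether you need to query addresses as separate documents or only as attributes of the person.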

solr join function to query documents in multiple cores NullPointerException

I use a Solr join to query documents from two cores. My cores are defined as follows:
Post core:
<fields>
<!-- general -->
<field name="id" type="string" indexed="true" stored="true" multiValued="false" required="true"/>
<field name="creatorId" type="string" indexed="true" stored="true" multiValued="false" required="true"/>
.
.
.
</fields>
User core:
<fields>
<!-- general -->
<field name="id" type="string" indexed="true" stored="true" multiValued="false" required="true"/>
<field name="username" type="string" indexed="true" stored="true" multiValued="false" />
<field name="email" type="string" indexed="true" stored="true" multiValued="false" />
<field name="userBrief" type="string" indexed="true" stored="true" multiValued="false" />
<field name="jobNumber" type="string" indexed="true" stored="true" multiValued="false" />
</fields>
Now I want to query all users who have created a post. I use the join function, and my URL looks like this:
http://localhost:9080/solr/user/select?q=*:*&fq={!join from=creatorId to=id fromIndex=post}
But it does not work, and it throws an exception:
null: java.lang.NullPointerException
at org.apache.lucene.search.IndexSearcher.rewrite(IndexSearcher.java:559)
at org.apache.lucene.search.IndexSearcher.createNormalizedWeight(IndexSearcher.java:646)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:280)
.
.
.
I don't know why. Can you help me?
The fq parameter needs an actual query after the {!join ...} part.
Try adding a match-all query (*:*) to the end of the fq parameter, like this: http://localhost:9080/solr/user/select?q=*:*&fq={!join from=creatorId to=id fromIndex=post}*:*
In a realistic setting you would likely want to filter the joined results in some way, for example, "Find me all action movies rated by this user updated in the past two weeks," where the movies and user ratings are stored as separate documents.
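If you build the URL in code, it is worth letting a library escape the braces and spaces in the fq value. The following is a small sketch using only the Python standard library; join_url is a hypothetical helper, and the host, core and field names are the ones from the question.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def join_url(base, from_field, to_field, from_index, inner_query="*:*"):
    """Build a select URL whose fq is a cross-core join with an inner query."""
    fq = "{!join from=%s to=%s fromIndex=%s}%s" % (
        from_field, to_field, from_index, inner_query)
    # urlencode escapes the braces, spaces and '!' so the URL is valid.
    return base + "?" + urlencode({"q": "*:*", "fq": fq})

url = join_url("http://localhost:9080/solr/user/select", "creatorId", "id", "post")
print(url)
# Decoding the query string gives back the original fq value.
print(parse_qs(urlsplit(url).query)["fq"][0])
```

To narrow the join, pass a more specific inner_query instead of the default match-all.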

solr accurate match through copyField and defaultOperator

In the Solr schema.xml, I configured the fields and copyFields as follows:
<field name="title" type="text" indexed="true" stored="true" required="false" />
<field name="content" type="text" indexed="true" stored="true" required="false" />
<field name="all" type="text" indexed="true" stored="false" multiValued="true"/>
<copyField source="title" dest="all"/>
<copyField source="content" dest="all"/>
<solrQueryParser defaultOperator="AND"/>
Then I run a data import that includes a document like this: title="sport", content="I like basket".
Now I set the query string to:
all:sport basket
I intend to get this document by matching "sport" in the title field and "basket" in the content field.
I mean that when I use "all:sport basket", Solr should hit the document built from this source:
<title>sport</title>
<content>I like basket</content>
But Solr's copyField doesn't seem to do this. Can anyone help?
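One thing worth checking, offered as a possibility rather than a confirmed diagnosis: in the standard Lucene query syntax a field prefix applies only to the term right after it, so in all:sport basket the term basket is searched in the default field, not in all. Grouping the terms keeps both in the same field. The field_query helper below is a made-up name for illustration.

```python
def field_query(field, terms):
    """Group several terms under one field, e.g. all:(sport basket)."""
    return "%s:(%s)" % (field, " ".join(terms))

print(field_query("all", ["sport", "basket"]))
```

With defaultOperator="AND", the grouped form requires both terms to appear in the all field, which is the field the copyFields fill from title and content.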
