Solr: how do I use dismax instead of using copyField? - solr

I've been trying to figure this out for a bit now. If I create a schema without the directive:
<copyField source="*" dest="text" />
I can't seem to pull anything up. But when I add that directive, things magically appear. I'm trying my query with ?defType=dismax, but that doesn't seem to help.
Am I missing something? Do I need something special in my schema? I'm indexing all the fields I need to search against.
Thoughts?
Thanks!

With defType=lucene (the default query parser) you need to name the field in front of the search term, like this:
q=title:test
If you don't specify a field, Solr searches the default field from your configuration (the df parameter in solrconfig.xml or, in older versions, <defaultSearchField> in schema.xml). In the example configuration that field is text. Because your copyField directive copies every field into text, the search works.
If you decide to use dismax, the query structure changes. You pass only the search term:
q=test
and list the fields to search in a separate parameter, qf:
<str name="qf">field1 field2</str>
where field1 and field2 are the fields you want to search for the terms.
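To make the qf line above concrete, here is a minimal sketch of where it could live in solrconfig.xml. The handler name /select is an assumption based on the stock example configuration, and field1 and field2 are the placeholder names from the answer, not real fields.

```xml
<!-- solrconfig.xml: sketch of a search handler that defaults to dismax.
     The handler name is an assumption; adjust it to your own config. -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- use the dismax parser unless the request overrides defType -->
    <str name="defType">dismax</str>
    <!-- placeholder field names from the answer; list your indexed fields here -->
    <str name="qf">field1 field2</str>
  </lst>
</requestHandler>
```

With defaults like these, a request such as /select?q=test would search both fields without a catch-all copyField. The same parameters can also be sent per request (q=test&defType=dismax&qf=field1+field2) if you would rather not change the configuration.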

Related

Solr dynamicField not searched in query without field name

I'm experimenting with the Example database in Solr 4.10 and not understanding how dynamicFields work. The schema defines
dynamicField name="*_s" type="string" indexed="true" stored="true"
If I add a new item with a new field name (say "example_s":"goober" in JSON format), a query like
?q=goober
returns no matches, while
?q=example_s:goober
will find the match. What am I missing?
Could you post the SearchHandler definition from the solrconfig.xml you are using to run this query?
A SearchHandler usually defines the default query fields, i.e. the qf parameter (used by the dismax/edismax parsers; the standard lucene parser uses df instead).
Check that your dynamic field example_s is in that list of query fields in solrconfig.xml; if it is not, you can pass it as a request parameter when you send the query to the search handler.
Hope this helps you resolve the problem.
If you are using the default schema, here's what's happening:
You are probably using the default endpoint (/select), so the search type and parameters come from its definition. That means a default (lucene) search, and the field searched is text.
The text field is an aggregate, populated from other fields by copyField instructions.
Your dynamic field definition for *_s lets you index a value under any name ending in _s, such as example_s. It's indexed (so you can search against it directly) and stored (so you can see it when you ask for all fields). However, it is not searched as general text. Note that (unlike in ElasticSearch) Solr string fields have to be matched fully and exactly. If the value is multi-word text, there is barely any point searching it as a string. "goober" is a single word, so it's not a very good example for seeing the difference.
The easiest solution for you is to add another copyField instruction:
<copyField source="*_s" dest="text"/>, then all your *_s dynamic fields would also be searchable. But notice that the search analyzers will not be the ones from the *_s definition, but the ones from the text field's definition, which is not string but text_general, defined elsewhere in the file.
As to Solr vs. ElasticSearch, they err on different sides of magic. Solr makes you configure the system and makes it very easy to see the exact current configuration. ElasticSearch hides all of the configuration, but you have to rediscover it the second you want to change away from the default behaviour. In the end, the result is probably similar and meets somewhere in the middle.

dynamic fieldType assignments in solr

I have a scenario where I would like to dynamically assign incoming document fields to two different solr schema fieldTypes. One fieldType will be an 'exact match' fieldType while the other will be a 'full text' fieldType. The fields will follow a predictable pattern but the pattern can not be recognized using the dynamicField type and will not be known ahead of time.
So here is an example of the field names that I need to be able to process:
FOO_BAR_TEXT_1
FOO_BAR_TEXT_2
WIDGET_BAR_TEXT_3
WIDGET_BAR_TEXT_4
--
FOO_BAR_SELECT_1
FOO_BAR_SELECT_2
WIDGET_BAR_SELECT_1
The above fields will not be defined in advance. I need to map all fields with _BAR_SELECT_ in the name to a fieldType of 'exactMatch', and all fields with _BAR_TEXT_ in the name to a fieldType of 'fulltext'. I was hoping there might be a way of doing this dynamically when the document is indexed.
Have you tried using solr dynamic fields?
https://cwiki.apache.org/confluence/display/solr/Dynamic+Fields
Basically it would look something like this:
Obviously you'd need to make your own definitions (or use an existing one) for the types.
It is currently not possible to create dynamic fields with a pattern like *_BAR_SELECT_*.
The old Solr wiki, as well as the collection1 schema.xml file, mentions a restriction for dynamic fields:
RESTRICTION: the glob-like pattern in the name attribute must have a "*" only at the start or the end.
However, if you move the variable part to the end and name the patterns, say, BAR_TEXT_* and BAR_SELECT_*, then fields such as "BAR_TEXT_FOO_1", "BAR_TEXT_FOO_2", "BAR_TEXT_WIDGET_3", and so on can be created dynamically.
Like this:
<dynamicField name="BAR_TEXT_*" type="fulltext" />
<dynamicField name="BAR_SELECT_*" type="exactMatch" />
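The two dynamicField lines above refer to types named exactMatch and fulltext, which you would have to define yourself. The following is only a sketch of what such definitions might look like; the analyzer chain for fulltext is an assumption modelled on the text_general type from the example schema, and the indexed/stored attributes are illustrative choices.

```xml
<!-- exact match: StrField indexes the whole value as one untokenized term -->
<fieldType name="exactMatch" class="solr.StrField" sortMissingLast="true"/>

<!-- full text: tokenized and lowercased at both index and query time
     (assumed analyzer chain; tune it for your content) -->
<fieldType name="fulltext" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- prefix patterns, since "*" may only appear at the start or the end -->
<dynamicField name="BAR_TEXT_*" type="fulltext" indexed="true" stored="true"/>
<dynamicField name="BAR_SELECT_*" type="exactMatch" indexed="true" stored="true"/>
```

With this in place, a document field named BAR_SELECT_FOO_1 would match only on its complete value, while BAR_TEXT_FOO_1 would match on individual words. Incoming field names still have to be rewritten to the prefix form before indexing.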

Solr filter query for suggester

I'm using the Solr Suggester Component, and was wondering, if the results can be filtered by the fq parameter. I have a query like this:
http://localhost:8982/solr/core1/suggest?q=shirts&fq=category_id%3A321&wt=json&indent=true&spellcheck=true&spellcheck.build=true
Here, I try to get some suggestions for q=shirts. I want to filter this by fq=category_id:321, so that I don't get suggestions from other categories. Because category with the category_id:321 doesn't have any products related to shirts, it shouldn't return any suggestions. But it does. And when trying to search for that suggestion, it doesn't find anything, because the "original" search is filtered with the fq=... parameters.
I found something with collate here http://wiki.apache.org/solr/SpellCheckComponent#spellcheck.collate. It collates my results, but also returns suggestions for shirts.
So my question is, is the Suggester (or basically the SpellCheckerComponent) aware of the fq parameter, and how can I use this parameter to filter the suggestions (or in a later stage, the spelling corrections).
EDIT
I found out, that the "normal" spellcheck component (e.g. with the class solr.IndexBasedSpellChecker for example), does indeed take the fq parameter into account. I can set
<str name="spellcheck.collate">true</str>
<str name="spellcheck.collateExtendedResults">false</str>
and a suggestion for shitr is not returned when filtering by a certain category id, where the keyword shirt is not present.
I'm wondering, why this doesn't work with the suggest component. Any ideas?
I don't think so; you need to look at this a bit differently to understand why. As you may already know, the SpellChecker works from a dictionary built from the field you specify in the config:
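<lst name="spellchecker">
<str name="name">default</str>
<str name="field">text</str>
<str name="classname">solr.DirectSolrSpellChecker</str>
...
</lst>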
The copyField directives that make up your dictionary fill the "text" field during indexing, and hence the dictionary.
Example:
At this point the spell checker does not know which document a suggestion came from.
With collation you can do a little better. Did you try &spellcheck=true&spellcheck.extendedResults=true&spellcheck.collate=true? This should make sure the suggestion returns some results.
spellcheck.extendedResults provides additional information about each suggestion, such as its frequency (hits) in the index, which may help in your logic.
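The collation parameters mentioned above can also be set as handler defaults instead of being sent on every request. This is a sketch under assumptions: the handler name /spell and a search component registered as spellcheck are taken from the stock example configuration, and spellcheck.maxCollationTries is an addition that is not part of the original answer. As far as I know, collations are only tested against the index (with the request's fq applied) when that value is greater than 0.

```xml
<!-- solrconfig.xml: sketch only; handler and component names are assumptions -->
<requestHandler name="/spell" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="spellcheck">true</str>
    <!-- adds frequency (hit) information to each suggestion -->
    <str name="spellcheck.extendedResults">true</str>
    <!-- builds a rewritten query from the best suggestions -->
    <str name="spellcheck.collate">true</str>
    <!-- not in the original answer: number of collations to test against the index -->
    <str name="spellcheck.maxCollationTries">5</str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>
```

The individual word suggestions are still drawn from the whole dictionary; only the collated query is checked, so filtering logic should rely on the collation rather than on the raw suggestions.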

Show a portion of searchable text in Solr

I have indexed very large documents; in some cases a document has 100,000 characters. Is there a way to return only a portion of each document (let's say the first 300 characters) when I query Solr? Is there an attribute to set in schema.xml or solrconfig.xml to achieve this?
I have tried many things but nothing worked.
Thank you in advance,
Tom
If you want a preview, you need to use a copyField and specify maxChars:
<copyField source="searchedField" dest="previewField" maxChars="300" />
Then display previewField instead of searchedField in your results.
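For the copyField line above to work, both fields have to exist in schema.xml. The following is a sketch with assumed attributes: searchedField and previewField are the names from the answer, while the text_general and string types and the indexed/stored choices are illustrative assumptions based on the example schema.

```xml
<!-- full document text: searchable, but not returned in results -->
<field name="searchedField" type="text_general" indexed="true" stored="false"/>

<!-- preview: returned in results, not searched -->
<field name="previewField" type="string" indexed="false" stored="true"/>

<!-- copies the first 300 characters of the original input at index time -->
<copyField source="searchedField" dest="previewField" maxChars="300"/>
```

Because copyField copies the raw input value rather than the analyzed tokens, the source field does not need to be stored. Existing documents have to be reindexed before previewField is populated, and a request can then ask for it with fl=previewField.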
I'm assuming you do not want normal search highlighting. If you do, just use the built-in highlighting parameters with hl.fragsize as outlined in this answer.

Querying Solr without specifying field names

I'm new to using Solr, and I must be missing something.
I didn't touch much in the example schema yet, and I imported some sample data. I also set up LocalSolr, and that seems to be working well.
My issue is just with querying Solr in general. I have a document where the name field is set to tom. I keep looking at the config files, and I just can't figure out where I'm going awry. A bunch of fields are indexed and stored, and I can see the values in the admin, but I can't get querying to work properly. I've tried various queries (http://server.com/solr/select/?q=value), and here are the results:
**Query:** ?q=tom
**Result:** No results
**Query:** ?q=*:*
**Result:** 10 docs returned
**Query:** ?q=*:tom
**Result:** No results
**Query:** ?q=name:tom
**Result:** 1 result (the doc with name : tom)
I want to get the first case (?q=tom) working. Any input on what might be going wrong, and how I can correct it, would be appreciated.
Set <defaultSearchField> to name in your schema.xml.
The <defaultSearchField> is used by Solr when parsing queries to identify which field name should be searched in queries where an explicit field name has not been used.
You might also want to check out (e)dismax instead.
I just came across a similar problem. I have defined multiple fields (that did not exist in the schema.xml) to describe my documents, and I want to search across several fields of the document, not only one of them (like "name" in the example above).
To achieve this, I created a new field ("compoundfield") and used copyField to copy my defined fields into it (just like the "text" field in the schema.xml that comes with the Solr distribution). This results in something like this:
compoundfield definition:
<field name="compoundfield" type="text_general" indexed="true" stored="false" multiValued="true"/>
defaultSearchField:
<!-- field for the QueryParser to use when an explicit fieldname is absent -->
<defaultSearchField>compoundfield</defaultSearchField>
<!-- SolrQueryParser configuration: defaultOperator="AND|OR" -->
<solrQueryParser defaultOperator="OR"/>
<!-- copyField commands copy one field to another at the time a document
is added to the index. It's used either to index the same field differently,
or to add multiple fields to the same field for easier/faster searching. -->
<!-- ADDED Fields -->
<copyField source="field1" dest="compoundfield"/>
<copyField source="field2" dest="compoundfield"/>
<copyField source="field3" dest="compoundfield"/>
This works fine for me, but I am not sure it is the best way to do such a "multiple field" search...
Cheers!
It seems that the DisMax parser is the right thing to use for this purpose.
Related stackoverflow thread here.
The <defaultSearchField> solution is deprecated in newer versions of Lucene/Solr. To change the default search field, either pass the df parameter with the request or change the df value in:
<initParams
path="/update/**,/query,/select,/tvrh,/elevate,/spell,/browse">
<lst name="defaults">
<str name="df">default_field</str>
</lst>
</initParams>
inside solrconfig.xml.
Note: I am using a non-managed schema and Solr 7.0.0 at the time of writing.
Going through the solr tutorial is definitely worth your time:
http://lucene.apache.org/solr/tutorial.html
My guess is that the "name" field is not indexed, so you can't search on it. You'd need to change your schema to make it indexed.
Also make sure that the XML you post actually lines up with the schema. If you are adding a field named "name" in the XML but the schema doesn't know about it, then Solr will just ignore that field (i.e. it won't be "stored" or "indexed").
Good luck
Well, although setting a default search field is quite useful, I don't understand why you don't just use the Solr query syntax:
......./?q=name:tom
or
......./?q=*:*&fq=name:tom
