Show a portion of searchable text in Solr

I have indexed very large documents; in some cases they contain 100,000 characters. Is there a way to return only a portion of each document (say, the first 300 characters) when I query Solr? Is there an attribute I can set in schema.xml or solrconfig.xml to achieve this?
I have tried many things, but nothing has worked.
Thank you in advance,
Tom

If you want a preview, you need to use a copyField and specify maxChars:
<copyField source="searchedField" dest="previewField" maxChars="300" />
Then display previewField instead of searchedField in your results. For this to work, previewField must be declared with stored="true" so that Solr can return it.
I'm assuming you do not want normal search highlighting. If you do, just use the built-in highlighting parameters with hl.fragsize as outlined in this answer.
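If you go the highlighting route, the request could be assembled roughly as in the sketch below. The field name searchedField comes from the copyField example above; the base URL, the core name mycore, the id field and the search term are placeholders assumed for illustration, not taken from the question.

```python
from urllib.parse import urlencode

# Assumed location of the select handler; adjust host and core name.
SOLR_SELECT = "http://localhost:8983/solr/mycore/select"

def highlight_query(term, field="searchedField", fragsize=300):
    """Build a select URL that asks Solr for short highlighted snippets."""
    params = {
        "q": f"{field}:{term}",
        "fl": "id",               # return only the id (assumes an "id" field), not the large text
        "hl": "true",             # turn highlighting on
        "hl.fl": field,           # field to build snippets from
        "hl.fragsize": fragsize,  # approximate snippet size in characters
    }
    return f"{SOLR_SELECT}?{urlencode(params)}"

url = highlight_query("solr")
print(url)
```

Note that highlighting only returns snippets around matching terms, so it is not a substitute for a fixed "first 300 characters" preview.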

Related

Solr: how do I use dismax instead of using copyField?

I've been trying to figure this out for a bit now. If I create a schema without the directive:
<copyField source="*" dest="text" />
I can't seem to pull anything up. But when I add that directive, things magically appear. I'm trying my query with ?defType=dismax, but that doesn't seem to help.
Am I missing something? Do I need something special in my schema? I'm indexing all the fields I need to search against.
Thoughts?
Thanks!

If you use defType=lucene, you need to specify the field before your search term, like this:
q=title:test
If you don't specify a field, Solr uses the default field specified in solrconfig.xml. This field is text by default. Because all the fields are copied to text, the search works.
If you decide to use dismax, the query structure changes. You pass only the search term, like this:
q=test
and specify the fields to search in a separate parameter, qf, like this:
<str name="qf">field1 field2</str>
where field1 and field2 are the fields in which you want to search for the terms.
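The same parameters can also be sent per request instead of being set in solrconfig.xml. A minimal sketch of building that query string, using the field1 and field2 names from above (the helper name dismax_params is made up for this example):

```python
from urllib.parse import urlencode

def dismax_params(term, fields):
    """Build a dismax query string; qf takes a space-separated field list."""
    return urlencode({
        "defType": "dismax",     # select the dismax query parser
        "q": term,               # bare search term, no field prefix
        "qf": " ".join(fields),  # fields to search
    })

query_string = dismax_params("test", ["field1", "field2"])
print(query_string)  # defType=dismax&q=test&qf=field1+field2
```

With qf in place, the catch-all copyField is no longer needed for the search to find anything, as long as every field listed in qf is indexed.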

Solr - Get 2 separate highlights on same field

We are using Solr 4.9. We need two different highlights on the same field at the same time.
One highlight should check the whole content, and the other should check only the first N characters.
Please suggest how we can do this, or, if it is not possible, any alternative approach.
Thanks

As a simpler solution, you can define a copyField that copies only the first N characters from the source field to a destination field:
<copyField source="cat" dest="text" maxChars="30000" />
Then, when searching, you can specify both fields for highlighting:
&hl=true&hl.fl=cat,text
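A rough sketch of how a client might send that request and then separate the two sets of snippets. The field names cat and text come from the answer above; the helper names and the sample response are made up for illustration, with the sample shaped like the "highlighting" section of Solr's JSON response.

```python
from urllib.parse import urlencode

def two_highlight_params(term, full_field="cat", prefix_field="text"):
    """Request highlighting on the full field and on its truncated copy."""
    return urlencode(
        {"q": term, "hl": "true", "hl.fl": f"{full_field},{prefix_field}"},
        safe=",",  # keep the comma in hl.fl unescaped for readability
    )

def split_highlights(response, doc_id, full_field="cat", prefix_field="text"):
    """Return (snippets from the whole field, snippets from the first N chars)."""
    per_doc = response.get("highlighting", {}).get(doc_id, {})
    return per_doc.get(full_field, []), per_doc.get(prefix_field, [])

# Hand-written sample, not real Solr output: the term occurs late in the
# document, so only the full field produced a snippet.
sample = {"highlighting": {"doc1": {"cat": ["an <em>apple</em> a day"]}}}

params = two_highlight_params("apple")
whole, prefix = split_highlights(sample, "doc1")
print(params)         # q=apple&hl=true&hl.fl=cat,text
print(whole, prefix)  # ['an <em>apple</em> a day'] []
```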

Solr dynamicField not searched in query without field name

I'm experimenting with the Example database in Solr 4.10 and not understanding how dynamicFields work. The schema defines
<dynamicField name="*_s" type="string" indexed="true" stored="true"/>
If I add a new item with a new field name (say "example_s":"goober" in JSON format), a query like
?q=goober
returns no matches, while
?q=example_s:goober
will find the match. What am I missing?

It would help to see the SearchHandler definition from the solrconfig.xml file that you are using to execute this query.
A SearchHandler usually defines the default query fields through the qf parameter.
Check that your dynamic field example_s is present in that query field list in solrconfig.xml; otherwise, you can pass it as a parameter when sending the query to the search handler.
I hope this helps you resolve your problem.

If you are using the default schema, here's what's happening:
You are probably using the default endpoint (/select), so the search type and parameters come from its definition. That means it is the default (lucene) search, and the field searched is text.
The text field is an aggregate and is populated by copyField instruction from other fields.
Your dynamic field definition for *_s allows you to index text under any name ending in _s, such as example_s. It's indexed (so you can search against it directly) and stored (so you can see it when you ask for all fields). However, it is not searched as general text. Notice that (unlike in ElasticSearch) Solr strings have to be matched fully and completely. If a string field holds multi-word text, there is barely any point in searching it. "goober" is a single word, so it's not a very good example for understanding the difference here.
The easiest solution for you is to add another copyField instruction:
<copyField source="*_s" dest="text"/>, then all your *_s dynamic fields will also be searchable. But notice that the search analyzers will not be the ones from the *_s definition, but the ones from the text field's definition, which is not string but text_general, defined elsewhere in the file.
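To make the string versus text_general difference concrete, here is a toy illustration. It is not Solr code: the two functions are deliberately simplified stand-ins (exact comparison for a string field, lowercase whitespace tokens for an analyzed text field), and the real text_general analyzer chain does more than this.

```python
def string_field_matches(stored_value: str, query: str) -> bool:
    # A "string" field is indexed as a single untokenized term,
    # so only the exact, complete value matches.
    return stored_value == query

def text_field_matches(stored_value: str, query: str) -> bool:
    # Simplified stand-in for a tokenizing, lowercasing analyzer.
    tokens = stored_value.lower().split()
    return query.lower() in tokens

value = "Big Goober"
print(string_field_matches(value, "goober"))      # False
print(string_field_matches(value, "Big Goober"))  # True
print(text_field_matches(value, "goober"))        # True
```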
As for Solr vs. ElasticSearch, they err on different sides of magic. Solr makes you configure the system and makes it very easy to see the exact current configuration. ElasticSearch hides all of the configuration, but you have to rediscover it the moment you want to move away from the default behaviour. In the end, the result is probably similar and meets somewhere in the middle.

Solr terms component complete field match

I am new to Solr.
I am working with the terms component to get the top terms from a field.
For example:
I have the field "Firm", which contains many kinds of firm names ending in "gmbh" and "ag".
But I need the terms split by the full content of the field, not by individual words.
For example: Mustermann gmbh, max gmbh, etc.
I've tried many different field types in schema.xml, but nothing worked.
Thank you in advance.
Best regards,
Lorenzo :-)

You can use facets in your request to get the "top X of field Y".
E.g.
q=*:*&facet=true&facet.field=Firm&facet.limit=50&facet.mincount=1
With facet.limit you get the top X results.
Your field Firm in schema.xml should not use a tokenizer, because otherwise you would get "mustermann" and "gmbh" instead of "mustermann gmbh". (I believe the standard "string" field type is not tokenized.)
Don't forget to reindex if you change the field type.
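A small sketch of building such a facet request from a client. It uses the match-all query *:* and the lowercase parameter name facet.mincount, which is the spelling in the Solr documentation; the helper name top_terms_query and the rows=0 setting (to skip the document list) are additions for this example.

```python
from urllib.parse import urlencode

def top_terms_query(field, limit=50):
    """Build a query string that returns only facet counts for one field."""
    params = {
        "q": "*:*",           # match all documents
        "rows": 0,            # no documents needed, only the facet counts
        "facet": "true",
        "facet.field": field,
        "facet.limit": limit,  # top X values
        "facet.mincount": 1,   # skip values that occur in no document
    }
    return urlencode(params, safe="*:")  # leave *:* unescaped for readability

query_string = top_terms_query("Firm")
print(query_string)
# q=*:*&rows=0&facet=true&facet.field=Firm&facet.limit=50&facet.mincount=1
```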

Querying Solr without specifying field names

I'm new to using Solr, and I must be missing something.
I didn't touch much in the example schema yet, and I imported some sample data. I also set up LocalSolr, and that seems to be working well.
My issue is just with querying Solr in general. I have a document where the name field is set to tom. I keep looking at the config files, and I just can't figure out where I'm going awry. A bunch of fields are indexed and stored, and I can see the values in the admin, but I can't get querying to work properly. I've tried various queries (http://server.com/solr/select/?q=value), and here are the results:
**Query:** ?q=tom
**Result:** No results
**Query:** ?q=*:*
**Result:** 10 docs returned
**Query:** ?q=*:tom
**Result:** No results
**Query:** ?q=name:tom
**Result:** 1 result (the doc with name : tom)
I want to get the first case (?q=tom) working. Any input on what might be going wrong, and how I can correct it, would be appreciated.

Set <defaultSearchField> to name in your schema.xml.
The <defaultSearchField> is used by Solr when parsing queries to identify which field name should be searched in queries where an explicit field name has not been used.
You might also want to check out (e)dismax instead.

I just came across a similar problem. I have defined multiple fields (that did not exist in the stock schema.xml) to describe my documents, and I want to search across several fields of a document, not only one of them (like "name" in the example above).
To achieve this, I created a new field ("compoundfield") and used copyField to copy my fields into it (just like the "text" field in the schema.xml that ships with the Solr distribution). This results in something like this:
compoundfield definition:
<field name="compoundfield" type="text_general" indexed="true" stored="false" multiValued="true"/>
defaultSearchField:
<!-- field for the QueryParser to use when an explicit fieldname is absent -->
<defaultSearchField>compoundfield</defaultSearchField>
<!-- SolrQueryParser configuration: defaultOperator="AND|OR" -->
<solrQueryParser defaultOperator="OR"/>
<!-- copyField commands copy one field to another at the time a document
is added to the index. It's used either to index the same field differently,
or to add multiple fields to the same field for easier/faster searching. -->
<!-- ADDED Fields -->
<copyField source="field1" dest="compoundfield"/>
<copyField source="field2" dest="compoundfield"/>
<copyField source="field3" dest="compoundfield"/>
This works fine for me, but I am not sure whether it is the best way to do such a "multiple field" search.
Cheers!

It seems that a DisMax parser is the right thing to use for this purpose.
Related Stack Overflow thread here.

The <defaultSearchField> solution is deprecated in newer versions of Lucene/Solr. To change the default search field, either pass the df parameter with the query or change the df value in this block:
<initParams path="/update/**,/query,/select,/tvrh,/elevate,/spell,/browse">
  <lst name="defaults">
    <str name="df">default_field</str>
  </lst>
</initParams>
inside solrconfig.xml.
Note: I am using a non-managed schema and Solr 7.0.0 at the time of writing.
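For the per-request variant, df is passed as an ordinary query parameter. A minimal sketch, using the name field and the term tom from the question (the helper name is made up for this example):

```python
from urllib.parse import urlencode

def query_with_default_field(term, default_field):
    """Build a query string that sets the default search field per request."""
    return urlencode({"q": term, "df": default_field})

query_string = query_with_default_field("tom", "name")
print(query_string)  # q=tom&df=name
```

Appended to the select URL, this should behave like ?q=name:tom for a single unfielded term, without touching the configuration.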

Going through the solr tutorial is definitely worth your time:
http://lucene.apache.org/solr/tutorial.html
My guess is that the "name" field is not indexed, so you can't search on it. You'd need to change your schema to make it indexed.
Also make sure that your XML actually lines up with the schema. If you are adding a field named "name" in the XML but the schema doesn't know about it, then Solr will just ignore that field (i.e., it won't be "stored" or "indexed").
Good luck

Well, although setting a default search field is quite useful, I don't understand why you don't just use the Solr query syntax:
......./?q=name:tom
or
......./?q=*:*&fq=name:tom
