If I have lots of requests that each search for a different address, may I use a wildcard in the warming query configured in the query-related listeners, so that all addresses are selected? I would like to cache all addresses to make subsequent queries for individual addresses faster. Or is it not possible to use wildcards for caching?
<listener event="newSearcher" class="solr.QuerySenderListener">
<arr name="queries">
<lst>
<str name="q">address:*</str>
<str name="rows">10000</str>
</lst>
</arr>
</listener>
<listener event="firstSearcher" class="solr.QuerySenderListener">
<arr name="queries">
<lst>
<str name="q">address:*</str>
<str name="rows">10000</str>
</lst>
</arr>
</listener>
The query address:* retrieves all documents having a non-empty value in the field address, but that won't be very useful for Solr's filter cache, since the cached entry would only be hit by a subsequent request that uses the same wildcard expression as its filter.
You need to load documents where the address field actually matches a precise value. In this context the wildcard is treated as one distinct filter in the filter cache, not as a catch-all.
So it's not that caching a wildcard query doesn't work, but it doesn't warm the cache as you might expect or need, that is, for every distinct value in the field. (It could be useful as a "shortcut" to warm all possible results, but imagine the cost of warming a wildcard query if the field is not restricted to a finite set.)
Instead you may have to use filter queries, each intersecting the whole set of documents (this implies a main match-all query q=*:* to which you apply an fq), with one fq per possible value in the field, or per most frequently submitted value if the field is not restricted. This loads every (or the most frequently requested) subset of documents by address, which is what actually warms the filter cache for each of them.
https://lucene.apache.org/solr/guide/7_3/query-settings-in-solrconfig.html#filtercache
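A minimal sketch of such a warming configuration, assuming the address field holds a small set of known values. The two address values are made-up placeholders, and rows=0 is my own choice to keep the warming queries cheap; the fq clause is still evaluated and stored in the filter cache:

```xml
<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <!-- One warming query per address value (placeholder values). -->
    <lst>
      <str name="q">*:*</str>
      <str name="fq">address:"1 Main Street"</str>
      <str name="rows">0</str>
    </lst>
    <lst>
      <str name="q">*:*</str>
      <str name="fq">address:"22 High Street"</str>
      <str name="rows">0</str>
    </lst>
  </arr>
</listener>
```

The same queries would go into the firstSearcher listener. The warming only pays off if real requests also send the address as an fq rather than as part of q.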
Is it possible to tell Solr to use a specific filter value if no other filter is defined for that field?
Example:
If there is no other fq entry present for a field age then search by default for age > 18.
Yes, you can add these to the requestHandler definition:
<lst name="defaults">
<str name="fq">age:[18 TO *]</str>
</lst>
(or if you really meant larger than 18 and not 18 or older, {18 TO *] or [19 TO *]).
You can also use appends and invariants instead of defaults: appends adds the filter query to all queries, and invariants sets a parameter to a static value that a URL parameter can't override.
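As a rough illustration of the difference, here is a sketch of a handler that uses both sections. The handler name and the choice of wt as the invariant parameter are only assumptions for the example:

```xml
<requestHandler name="/select" class="solr.SearchHandler">
  <!-- appends: added to every request, on top of any fq the client sends -->
  <lst name="appends">
    <str name="fq">age:[18 TO *]</str>
  </lst>
  <!-- invariants: fixed value that request parameters cannot override -->
  <lst name="invariants">
    <str name="wt">json</str>
  </lst>
</requestHandler>
```

With defaults, by contrast, the age filter is dropped as soon as the request supplies any fq of its own.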
I am using Solr version 5 for searching data. I am using the query below, which searches for the keywords in all fields.
http://localhost:8983/solr/document/select?q=keyword1+keyword2&wt=json
Can anyone suggest a query that searches for the keyword only in the title field?
Thanks.
Use
http://localhost:8983/solr/document/select?q=title:*yourkeyword*&wt=json
or, for an exact match,
http://localhost:8983/solr/document/select?q=title:"yourkeyword"&wt=json
You can not search for a keyword in all fields without some extra work:
How can I search all field in SOLR that contain the keywords,.?
The "q" parameter contains the query string, and for the standard parser this means that you must specify the field with a colon, as in
fieldname:searchterm
or the standard parser will use the default field. The default field is specified in the "df" parameter, and if you did not change your solrconfig.xml you will search in the "text" field, because you will find something like
<requestHandler name="/select" class="solr.SearchHandler">
<lst name="defaults">
<str name="df">text</str>
</lst>
</requestHandler>
P.S. If you want to search in all fields, you either have to copy the content of all fields into one field or use a specific query parser such as the dismax parser, where you can list all your fields in the "qf" parameter.
P.P.S. You can not search in all fields but you can highlight in all fields :-)
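The two options from the P.S. could look roughly like this. It is only a sketch: it assumes a multiValued catch-all field named text already exists in the schema, and the handler name and the description field are hypothetical:

```xml
<!-- Schema: copy the content of every field into the catch-all field "text" -->
<copyField source="*" dest="text"/>

<!-- solrconfig.xml: hypothetical handler that searches several fields via dismax -->
<requestHandler name="/search-all" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <!-- qf is a whitespace-separated list of fields -->
    <str name="qf">title description</str>
  </lst>
</requestHandler>
```

You would normally pick one of the two approaches, not both.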
The best way is to run the query from the Admin console. When you run it, the console also shows the actual request URL that was executed. Just copy that URL and use it.
About the question of searching for a specific field value in Solr: in the Admin console, look for the 'q' text box, enter yourfield:value, and hit the 'Execute Query' button. The generated request URL is shown at the top right.
Generated Query: ......select?indent=on&q=YOURFIELD:"VALUE"&wt=json
I am using Solr for indexing different types of products. The product types (category) have different facets. For example:
camera: megapixel, type (slr/..), body construction, ..
processors: no. of cores, socket type, speed, type (core i5/7)
food: type, origin, shelf-life, weight
tea: type (black/green/white/..), origin, weight, use milk?
serveware: type, material, color, weight
...
And they have common facets as well:
Brand, Price, Discount, Listing timeframe (like new), Availability, Category
I need to display the relevant facets and breadcrumbs when a user clicks on any category or brand page, or runs a global search across all types of products. This is the same as what we see on several ecommerce sites.
The question that I have is:
Since the facet types are more or less unique across different types of products, do I put them in separate schemas? Is that the way to do it? The fundamental worry is that those fields will not have any data for other types of products. And are there any implementation principles here that make retrieving the respective facets for a given product type easier?
I would like to have a design that is scalable enough to accommodate lots of items in each product type as we go forward, and that is also easy to use and performance oriented, if possible. Right now I have a single instance of Solr.
The only risk of underpopulated facets is when they misrepresent the search. I'm sure you've used a search site where the metadata you want to facet on is underpopulated, so that when you apply the facet you also eliminate from your result set a number of records that should have been included. The thing to watch is that the facet values are populated consistently where they are appropriate. That means that your "tea" records don't need to have a number of cores listed, and it won't impact anything, but all of your "processor" records should, and (to whatever extent possible) they should be populated consistently. This means that if one processor lists its number of cores as "4", and another says "quadcore", these are two different values, and a user applying either facet value will eliminate the other processor from their result. If a third quadcore processor is entirely missing the "number of cores" stat from the no_cores facet field (field name is arbitrary), then your facet could become counterproductive.
So, we can throw all of these records into the same Solr, and as long as the facets are populated consistently where appropriate, it's not really necessary that they be populated for all records, especially when not applicable.
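A single schema for all product types might be sketched like this. Apart from no_cores and socket_type, which appear in the examples further down, the field names are assumptions, as is the use of the stock string field type. Fields are optional unless declared required, so a tea record can simply leave the processor fields out:

```xml
<!-- Common fields, populated for every product -->
<field name="category" type="string" indexed="true" stored="true"/>
<field name="brand" type="string" indexed="true" stored="true"/>
<!-- Processor-only fields; other product types simply omit them -->
<field name="no_cores" type="string" indexed="true" stored="true"/>
<field name="socket_type" type="string" indexed="true" stored="true"/>
<!-- Tea-only field (hypothetical name) -->
<field name="tea_type" type="string" indexed="true" stored="true"/>
```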
Applying facets dynamically
Most of what you need to know is in the faceting documentation of Solr. The important thing is to specify the appropriate arguments in your query to tell Solr which facets you want to use. (Until you actually facet on a field, it's not a facet but just a field that's both stored="true" and indexed="true".) For a very dynamic effect, you can specify all of these arguments as part of the query to Solr.
&facet=true
This may seem obvious, but you need to turn on faceting. This argument is convenient because it also allows you to turn off faceting with facet=false even if there are lots of other arguments in your query detailing how to facet. None of it does anything if faceting is off.
&facet.field=no_cores
You can include this argument over and over again, once for each field you're interested in faceting on.
&facet.limit=7
&f.no_cores.facet.limit=4
The first line here limits the number of values returned by Solr for each facet field to 7. The 7 most frequent values for the facet (within the search results) will be returned, with their record counts. The second line overrides this limit for the no_cores field specifically.
&facet.sort=count
You can list the facet field's values either in order of how many records each value appears in (count), or in index order (index). Index order generally means alphabetical order, but depends on how the field is indexed. This argument is used together with facet.limit, so if the number of facet values returned is limited by facet.limit, they will be either the most numerous values in the result set or the earliest in the index, depending on how this value is set.
&facet.mincount=1
There are very few circumstances in which you will want to see facet values that appear zero times in your search results, and this can fix the problem if it pops up.
The end result is a very long query:
http://localhost/solr/collection1/search?facet=true&facet.field=no_cores&
facet.field=socket_type&facet.field=processor_type&facet.field=speed&
facet.limit=7&f.no_cores.facet.limit=4&facet.mincount=1&defType=dismax&
qf=name+manufacturer+no_cores+description&
fl=id,name,no_cores,description,price,shipment_mode&q="Intel"
This is definitely effective, and allows for the greatest amount of on-the-fly decision-making about how the search should work, but isn't very readable for debugging.
Applying facets less dynamically
So these features allow you to specify which fields you want to facet on, and do it dynamically. But, it can lead to a lot of very long and complex queries, especially if you have a lot of facets you use in each of several different search modes.
One option is to formalize each set of commonly used options in a request handler within your solrconfig.xml. This way, you apply the exact same arguments but instead of listing all of the arguments in each query, you just specify which request handler you want.
<requestHandler name="/processors" class="solr.SearchHandler">
<lst name="defaults">
<str name="defType">dismax</str>
<str name="echoParams">explicit</str>
<str name="fl">id,name,no_cores,description,price,shipment_mode</str>
<str name="qf">name manufacturer no_cores description</str>
<str name="sort">score desc</str>
<str name="rows">30</str>
<str name="wt">xml</str>
<str name="q.alt">*:*</str>
<str name="facet">true</str>
<str name="facet.mincount">1</str>
<str name="facet.field">no_cores</str>
<str name="facet.field">socket_type</str>
<str name="facet.field">processor_type</str>
<str name="facet.field">speed</str>
<str name="facet.limit">10</str>
<str name="facet.sort">count</str>
</lst>
<lst name="appends">
<str name="fq">category:processor</str>
</lst>
</requestHandler>
If you set up a request handler in solrconfig.xml, all it does is serve as a shorthand for a set of query arguments. You can have as many request handlers as you want for a single Solr index, and you can alter them without rebuilding the index. To put changes into effect, reload the Solr core or restart the server application (JBoss or Tomcat, e.g.).
There are a number of things going on with this request handler that I didn't get into, but it's all just a way of representing default Solr request arguments so that your live queries can be simpler. This way, you might make a query like:
http://localhost/solr/collection1/processors?q="Intel"
to return a result set with all of your processor-specific facets populated, and filtered so that only processor records are returned. (This is the category:processor filter, which assumes a field called category where all the processor records have a value processor. This is entirely optional and up to you.) You will probably want to retain the default search request handler that doesn't filter by record category, and which may not choose to apply any of the available (stored="true" and indexed="true") fields as active facets.
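Following the same pattern, a handler for another product type could be sketched as below. The handler name and the facet field names (tea_type, origin, weight) are assumptions based on the product list in the question, not fields known to exist in your schema:

```xml
<requestHandler name="/tea" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <str name="qf">name description</str>
    <str name="q.alt">*:*</str>
    <str name="facet">true</str>
    <str name="facet.mincount">1</str>
    <!-- Tea-specific facets (hypothetical field names) -->
    <str name="facet.field">tea_type</str>
    <str name="facet.field">origin</str>
    <str name="facet.field">weight</str>
  </lst>
  <lst name="appends">
    <!-- Restrict every request on this handler to tea records -->
    <str name="fq">category:tea</str>
  </lst>
</requestHandler>
```

A category page for tea would then only need to call this handler, with or without a q parameter.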
Is it possible to search in Solr over two fields using two different words and get back only those results which contain both of them?
For example, if I have the fields "type" and "location", I want only those results that have type='furniture' and location='office'.
You can use boolean operators and search on individual fields.
q=type:furniture AND location:office
If the values are fixed, it is better to use filter queries for performance.
fq=type:furniture AND location:office
The suggested solutions have the drawback that you have to take care of escaping special characters.
If the user searches for "type:d'or AND location:coffee break", the query will fail.
I suggest combining two edismax queries in one request handler:
<requestHandler name="/combine" class="solr.SearchHandler" default="false">
<lst name="invariants">
<str name="q">
(_query_:"{!edismax qf='type' v=$uq1}"
AND _query_:"{!edismax qf='location' v=$uq2}")
</str>
</lst>
</requestHandler>
Call the request handler like this:
http://localhost:8983/solr/collection1/combine?uq1=furniture&uq2=office
Explanation
The variables $uq1 and $uq2 are replaced by the request parameters uq1 and uq2.
The result of the first edismax query (uq1) is combined by a logical AND with the result of the second edismax query (uq2).
Solr Docs
https://wiki.apache.org/solr/LocalParams
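You can also use the bq (boost query) parameter of the dismax query parser to rank matching documents higher; bq only affects scoring, so an fq is still needed to restrict the results:
defType=dismax&bq=type:furniture AND location:office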
fq=type:furniture AND location:office
Instead of using AND, this could be broken into two filter queries as well.
fq=type:furniture
fq=location:office
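If this combination is requested often, the two filters could also be fixed in a request handler. This is only a sketch, and the handler name is hypothetical. Each fq gets its own filter cache entry, so type:furniture can be reused by requests that filter on a different location:

```xml
<requestHandler name="/office-furniture" class="solr.SearchHandler">
  <lst name="appends">
    <!-- Two separate fq entries are intersected, like AND -->
    <str name="fq">type:furniture</str>
    <str name="fq">location:office</str>
  </lst>
</requestHandler>
```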