Solr End User Query Translation - solr

I am wondering if there is any way to transform an end-user query into a more complicated Solr query based on some rules.
For example, if the user types in 32" television, then I want to use the dismax query parser and let Solr handle the user's query string as is, like below:
http://localhost:8983/solr/select/?q=32" television&defType=dismax
However, if the user types in "televisions on sale", then I want to run a regular search for the token televisions with the on-sale flag set to true, like below:
http://localhost:8983/solr/select/?q=name:televisions AND isOnSale:true
Is this possible? Or does this logic require an advanced search form where the user can explicitly state, via a checkbox, that they only want on-sale items?
Thanks.

Transforming the user query is quite possible. You can do it in either of the following two ways:
1) Implement a servlet filter that intercepts the user query and transforms it before dispatching it to the Solr request handler.
2) Look at the query parser plugins in Solr and implement your own, based on an existing one such as the standard query parser, modified to apply your transformation rules.
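Whichever place the transformation lives in, the rule logic itself can stay small. Here is a minimal sketch in Python of the two cases from the question. The `ON_SALE` rule and the `translate` and `to_url` helpers are hypothetical names for illustration, not part of Solr; the field names `name` and `isOnSale` are taken from the question.

```python
import re
from urllib.parse import urlencode

# Hypothetical rule: the phrase "on sale" is a request for isOnSale:true.
ON_SALE = re.compile(r"\bon\s+sale\b", re.IGNORECASE)


def translate(user_query):
    """Map a raw end-user query to Solr request parameters."""
    if ON_SALE.search(user_query):
        # Drop the trigger phrase and search the remaining terms in `name`.
        # NOTE: real input would also need Lucene special characters escaped.
        terms = " ".join(ON_SALE.sub(" ", user_query).split())
        q = "name:({}) AND isOnSale:true".format(terms) if terms else "isOnSale:true"
        return {"q": q}
    # No rule matched: hand the string to dismax unchanged.
    return {"q": user_query, "defType": "dismax"}


def to_url(user_query, base="http://localhost:8983/solr/select/"):
    """Build the full request URL with properly encoded parameters."""
    return base + "?" + urlencode(translate(user_query))
```

The same idea carries over to a servlet filter or a custom query parser; only the place where the rewrite runs changes.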

Let the search happen through the whole index and let the user choose. If a review shows up, render it with the appropriate view. If a product shows up, offer to search for more products.
Samsung 32 in reviews --read more
LG 32 in offers --find more like this
Your offers page can offer more options, such as filtering products on sale.
You may use a global boost field on documents. For example, a product on sale has a value of 1.0 while out-of-stock products have 0.33. A review of a new product has 1.0; reviews of old products have less.

Maybe you can set up the search so that, whatever someone searches for, isOnSale is used as a secondary sort parameter. So by default sort by score and then by isOnSale, or just sort by isOnSale. That way you will still get all "television" ads in the results; the ones on sale are simply on top.
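As a rough sketch, the request parameters for that could be built like this in Python. It assumes isOnSale is a single-valued, sortable boolean field; the `search_params` helper is a made-up name.

```python
from urllib.parse import urlencode


def search_params(user_query):
    """Request parameters with isOnSale as a tie-breaker after relevance."""
    return {
        "q": user_query,
        "defType": "dismax",
        # Primary sort: relevance score. Secondary sort: on-sale items first
        # (assumes `true` sorts ahead of `false` in descending order).
        "sort": "score desc,isOnSale desc",
    }


query_string = urlencode(search_params("television"))
```

Keep in mind that the secondary sort only takes effect between documents with exactly the same score.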

Related

solr facet counts not correct with stats and group option

I am using Solr for product search on our web page. Until now, everything has worked fine.
But while implementing a price slider to filter the current results by price range, I got stuck on the following issue:
There is no way to exclude filters for the stats option in the same way as is possible for facets. I use stats to get the overall min and max price, no matter which price range is selected (on the slider) and which category is selected in the current search.
So the best way to get these values is to exclude the range filter from the stats request; otherwise the max and min price cover only the current (range-filtered) result.
Excluding a filter on facets (works on Solr 4.4):
...&fq={!tag=cat}categories:Electronics/Computers&facet=true&facet.field={!ex=cat}categories&...
But using this for stats is not possible (see https://issues.apache.org/jira/browse/SOLR-3177).
So I then tried using a group query, as suggested on that page.
My Solr call looks like this:
fq={!tag=cat}categories:Electronics/Computers&facet=true&
facet.field={!ex=cat}categories_raw&
facet.prefix=Electronics&stats=true&stats.field=minPrice&
stats.field=maxPrice&stats.field=vat&group=true&group.query=minPrice:[* TO 20]
maxPrice:[0 TO *]&group.main=true
All fine. I get the correct stats result and the correct result count with the price range applied, EXCEPT that the facet counts are now wrong, since the price range is no longer applied as a filter.
I know there is a group.facet option, which I also tried. But group.facet requires a group.field on which the results are grouped. As far as I can tell, I would need to use the price field as group.field (group.field=price).
But we have two price fields on our products (min and max price). I tried setting both as group.field parameters, but I still get the wrong facet counts.
It looks like I am just a small step away from the correct solution, but I can't figure it out.

Solr dynamic sorting

We have a website on which you can search through a large number of products from different shops. Say we have 5 products per result page and the 10 best matches for a search all have the same score. Eight of the products are from one shop (A), and the other two are from two other shops (B, C).
What we often get is (letter indicating a product of this shop)
A
A
A
A
A
---- second result page ----
A
B
A
C
A
but what we want to get is something like this:
A
C
B
A
A
---- second result page ----
A
A
A
A
A
Writing a custom function query seems to be one option:
http://www.solrtutorial.com/custom-solr-functionquery.html
What is the best way to achieve this?
You could group the results by shop using Field Collapsing and display the result either as a group or flattened list (depending on how you want it).
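If you go for the flattened list, one way to order it is a simple round-robin over the shops, done on the client after fetching the grouped (or equally scored) results. This is only a sketch: `interleave_by_shop` and the `shop` key are hypothetical names, and it only makes sense for documents whose scores are tied, since it ignores relevance otherwise.

```python
from collections import OrderedDict, deque


def interleave_by_shop(docs, shop_key="shop"):
    """Round-robin over shops, keeping the original order within each shop."""
    queues = OrderedDict()
    for doc in docs:
        queues.setdefault(doc[shop_key], deque()).append(doc)

    result = []
    while queues:
        # Take one document from every shop that still has some left.
        for shop in list(queues):
            result.append(queues[shop].popleft())
            if not queues[shop]:
                del queues[shop]
    return result


# The ten tied results from the question, in the order they came back.
docs = [{"id": i, "shop": s} for i, s in enumerate("AAAAAABACA")]
ordered = interleave_by_shop(docs)
```

With the example from the question, shops B and C move onto the first result page, and the remaining products of shop A fill the rest.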
Another trick that I've seen in use to help users see results from multiple groups is to use facets. You could have a sidebar (or something similar) that does two things:
By default, it lets the user know that there are other filter criteria (e.g. shops) in the result. This helps a lot when the result is paginated.
With facets present, it is up to the user to choose whatever criteria she/he wishes to apply, which relieves you of implementing heavy scenario-based logic.
Read more about faceting here.
Edit:
If you have to use custom sort logic, you could express it using Functions and use it in the sort parameter when querying Solr. Here is the reference from the docs.

Drupal Job board: Faceted Search with "OR" operator, but sort results by most matching facet criterias/term count

I'm stuck looking for a solution to my problem, and I hope you can help me.
In general, I want to build a small job platform. It includes an "Explore" section, which is just like a search page with facets.
The actual job nodes can be tagged with terms from the two vocabularies "skills" and "interests".
The facets on the search page allow the user to filter jobs by exactly these skills and interests.
However, I want to use the "OR" operator for the facets, so that the user gets a list of jobs that match their skills and interests almost perfectly, but also jobs that match only some of these terms.
So, here you can see the default listing page. On the left are the Facets for interest and type (Operator "OR"). On the right, you can see the result set with title, and the node's skills & interest terms:
See the image of the Jobsearch Default page
Now, I'm applying "Musik" and "Kultur" as interest-filters:
See the image of the Jobsearch with applied filters
As you can see in the result-set, the OR-operator delivers all the results.
However, I would like to sort these results by their "relevance", i.e. by the number of matched criteria.
The 4th and 5th results match both terms selected in the facet, so they should be listed ahead of all other results.
So, I hope you understand what I want to achieve. I started with Views to accomplish the goal, but then switched to search_api and Solr, as I think this approach is more extensible in the future.
The second aim is that users can store their individual interests and skills (the filters mentioned before) in their user profile. On their account page, they should then see individual job recommendations based on that profile.
So, any hints, tips, tricks, or links are very welcome, as I have no idea if I'm on the right track to solve my problem(s). :)
Robert
Maybe this approach could be an alternative:
Instead of using the tags as facets/filters, I could use them just as search input.
When I type my terms/tags into the search field of an Apache Solr search page, I get exactly the results sorted by their relevance:
Searching the tags instead of filtering
So, maybe I just have to write a small piece of code that automatically creates a search query based on the clicked terms/tags…
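That small piece of code could look roughly like the following Python sketch. The `tags_to_query` helper and the field name `interests` are assumptions for illustration; the field names that search_api actually writes to the Solr index will differ. The idea is that an OR query over the clicked tags lets Solr's scoring rank documents matching more tags higher, which is usually (not strictly always) what happens.

```python
def tags_to_query(tags, field="interests"):
    """Build an OR query over the clicked tags, each as a quoted phrase."""
    if not tags:
        return "*:*"  # nothing selected: match all documents
    quoted = [
        # Escape backslashes and double quotes inside the phrase.
        '"{}"'.format(tag.replace("\\", "\\\\").replace('"', '\\"'))
        for tag in tags
    ]
    return "{}:({})".format(field, " OR ".join(quoted))


query = tags_to_query(["Musik", "Kultur"])
```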

Click-through relevancy ranking

I want to implement click-through relevancy ranking in a search (Solr). Basically, depending on the users' feedback (their clicks), we want to change the ordering of search results. My approach is as follows.
We will add a new field to each document to index the queries for which that result/document has been accessed (or clicked). Whenever a result is clicked, we will update the index to include the query for which the result was clicked. We will use Solr's partial updates to add the new query to the index. Since we use the index as our data store as well, all our fields are stored, and I can afford to store one more field.
Is this the right approach to implement this feature?
Note: I have yet to evaluate logging and am still some way from implementing it. I was just drafting a requirement specification to start with.
It is as follows.
Evaluate user selection (click-through) for `query` and the position of the matched result.
The position is important because it determines the relevancy.
I chose the number of top results to be 3 (assume N=3).
If users are selecting a result at a position greater than N, it is important to increase that result's boost for the query.
If the position is within the top N, we're good.
If the position is consistently within the top N, demote the top results (maybe?)
However, we may get a lot of wrong information here. Assume a single user goes crazy and clicks absolutely irrelevant results.
So we need to monitor usage and also log user events, apart from just the basic position and click-through, to cover this.
So the log needs to capture:
Clicked results per page per {user-login|session}.
Click on result for {Query + Filters + Facets}. A special flag for {did you mean... | autocomplete} click events, with {TimeStamp + Location}
If a significant number of unique users click on low-score documents during a time range (months), I would boost the documents according to location.
Since clicks are also correlated with a user session (login), I might be able to map results to the user (if a user generates irrelevant noise, send it back to him ;P).
However, I would try my best not to put in too much boost. The search may look tampered with.
Also, a feedback form for the users to fill in might be a good idea to see how well you are doing.
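To make the "significant number of unique users" part concrete, here is a minimal Python sketch of the aggregation step. The log format (tuples of user, query, document id, position), the `boost_candidates` name, and the threshold are all assumptions for illustration, not a finished design.

```python
from collections import defaultdict

TOP_N = 3  # positions 1..TOP_N are treated as already well ranked


def boost_candidates(click_log, min_unique_users=2):
    """Find (query, doc_id) pairs clicked below the top N by enough distinct users.

    click_log: iterable of (user, query, doc_id, position) tuples.
    Returns {(query, doc_id): number_of_distinct_users}.
    """
    users = defaultdict(set)
    for user, query, doc_id, position in click_log:
        if position > TOP_N:
            # A set counts each user once, however often they click.
            users[(query, doc_id)].add(user)
    return {
        key: len(clickers)
        for key, clickers in users.items()
        if len(clickers) >= min_unique_users
    }


log = [
    ("u1", "tv", "doc7", 5),
    ("u1", "tv", "doc7", 5),
    ("u2", "tv", "doc7", 6),
    ("u1", "tv", "doc9", 8),  # one user clicking a lot is not enough
    ("u1", "tv", "doc9", 8),
    ("u3", "tv", "doc1", 1),  # already in the top N, ignored
]
candidates = boost_candidates(log)
```

Counting distinct users rather than raw clicks is what keeps a single "crazy" user from moving a document.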

Solr - How do I get the number of documents for each field containing the search term within that field in Solr?

Imagine an index like the following:
id partno name description
1 1000.001 Apple iPod iPod by Apple
2 1000.123 Apple iPhone The iPhone
When the user searches for "Apple" both documents would be returned. Now I'd like to give the user the possibility to narrow down the results by limiting the search to one or more fields that have documents containing the term "Apple" within those fields.
So, ideally, the user would see something like this in the filter section of the ui after his first query:
Filter by field
name (2)
description (1)
When the user applies the filter for field "description", only documents which contain the term "Apple" within the field "description" would be returned. So the result set of that second request would be the iPod document only. For that I'd use a query like ?q=Apple&qf=description (I'm using the Extended DisMax Query Parser)
How can I accomplish that with Solr?
I already experimented with faceting, grouping and highlighting components, but did not really come to a decent solution to this.
[Update]
Just to make that clear again: The main problem here is to get the information needed for displaying the "Filter by field" section. This includes the names of the fields and the hits per field. Sending a second request with one of those filters applied already works.
Solr just plain Doesn't Do This. If you absolutely need it, I'd try the multiple-requests solution and benchmark it. Solr tends to be a lot faster than what people put in front of it, so a couple of extra requests might not be that big of a deal.
You could achieve this with two different search requests/queries:
name:apple -> 2 hits
description:apple -> 1 hit
EDIT:
You also could implement your own SearchComponent that executes multiple queries in the background and put it in the SearchHandler processing chain so you only will need a single query in the frontend.
If you want the term to be searched over the same fields every time, you have 2 options that don't break the "single query" requirement:
1) copyField: at index time, you group together all the fields that should match. With just one copyField your problem doesn't exist; if you need more than one, you're back in the same spot.
2) You could filter the query each time by dynamically adding the "fq" parameter at the end:
http://<your_url_and_stuff>/?q=Apple&fq=name:Apple ...
This works if you'll always be searching on the same two fields (or you can set them up before querying); otherwise you'll always need at least a second query.
Since I said "you have 2 options" but you actually have 3 (and I rushed my answer), here's the third:
3) The dismax plugin, which its documentation describes like this:
The DisMaxQParserPlugin is designed to process simple user entered phrases
(without heavy syntax) and search for the individual words across several fields
using different weighting (boosts) based on the significance of each field.
So, if you can use it, you may want to give it a look and start from the qf parameter (that is what option number 2 was meant to be about, but I changed it in favor of fq... don't ask me why...)
SolrFaceting should solve your problem.
Have a look at the Examples.
This can be achieved with Solr faceting, but it's not neat. For example, I can issue this query:
/select?q=*:*&rows=0&facet=true&facet.query=title:donkey&facet.query=text:donkey&wt=json
to find the number of documents containing donkey in the title and text fields. I may get this response:
{
"responseHeader":{"status":0,"QTime":1,"params":{"facet":"true","facet.query":["title:donkey","text:donkey"],"q":"*:*","wt":"json","rows":"0"}},
"response":{"numFound":3365840,"start":0,"docs":[]},
"facet_counts":{
"facet_queries":{
"title:donkey":127,
"text:donkey":4108
},
"facet_fields":{},
"facet_dates":{},
"facet_ranges":{}
}
}
Since you also want the documents back for the field-disjunctive query, something like the following works:
/select?q=donkey&defType=edismax&qf=text+title&rows=10&facet=true&facet.query=title:donkey&facet.query=text:donkey&wt=json
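To turn this into the "Filter by field" list, the client only has to generate one facet.query per candidate field and read the counts back. A rough Python sketch, where `facet_query_params` and `field_counts` are made-up helper names and the term is assumed to be a single word without special characters:

```python
import json
from urllib.parse import urlencode


def facet_query_params(term, fields):
    """Request parameters: one edismax query plus one facet.query per field."""
    params = [
        ("q", term),
        ("defType", "edismax"),
        ("qf", " ".join(fields)),
        ("facet", "true"),
        ("wt", "json"),
    ]
    params += [("facet.query", "{}:{}".format(field, term)) for field in fields]
    return params


def field_counts(response_text):
    """Map each field to its hit count, skipping fields without hits."""
    data = json.loads(response_text)
    counts = {}
    for key, count in data["facet_counts"]["facet_queries"].items():
        field = key.split(":", 1)[0]
        if count > 0:
            counts[field] = count
    return counts


query_string = urlencode(facet_query_params("donkey", ["text", "title"]))
sample = '{"facet_counts": {"facet_queries": {"title:donkey": 127, "text:donkey": 4108}}}'
counts = field_counts(sample)
```

The list of candidate fields still has to be known up front; this does not discover them.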
