First, I'm fairly new to Solr and far from sure that Solr is the right tool for this problem. The documents I'm working on are already indexed there, so if Solr can solve it, that would be great :)
One of the fields in our documents is of type string and has the attribute multiValued set to true. It contains a list of ids that the current document relates to.
The task ahead is that I now have a second list of ids (same domain), and if any of these ids match, I would like to boost the score of the document. The more ids match, the higher the score should be.
Use a boost query (the bq parameter) if you are using dismax or edismax.
For example, bq=id:1 OR id:2 OR id:3 will boost documents that have at least one of the three ids. It will also give a higher boost to documents with more matching ids.
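As a minimal sketch of the boost query above, the following Python builds the bq value from a list of ids and URL-encodes it together with defType=edismax. The field name related_ids and the helper names are assumptions for illustration, and the ids are assumed to be simple tokens that need no query escaping.

```python
from urllib.parse import urlencode


def build_boost_query(field, ids):
    """Join one field:value clause per id with OR, as in the bq example."""
    return " OR ".join(f"{field}:{value}" for value in ids)


def build_params(user_query, field, ids):
    """URL-encode an edismax request that boosts documents matching any id."""
    return urlencode({
        "defType": "edismax",
        "q": user_query,
        "bq": build_boost_query(field, ids),
    })


# "related_ids" is a hypothetical name for the multiValued string field.
print(build_boost_query("related_ids", [1, 2, 3]))
print(build_params("some search terms", "related_ids", [1, 2, 3]))
```

The boost query only influences scoring; it does not restrict which documents the main query matches.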
I have a few near-duplicate documents stored in Solr. The schema uses an autogenerated uuid as the unique key, so duplicates can get into the index. I need to get the count of duplicated documents based on one or more fields in the schema.
I am trying to get quick numbers without writing a client program and going through the full result set, ideally something I can run from the Solr console itself.
I tried to use facets but was not able to get the total count. The query below gives the duplicates for each value of 'idfield', but the results have to be paged through to the last page and summed up (over a couple of million entries).
q=*:*&facet=true&facet.mincount=2&facet.field=idfield
A JSON facet query can be used to count the unique values, as explained in this blog post:
http://yonik.com/solr-count-distinct/
Alternatively, use the collapse filter and take the difference:
q=*:*&fq={!collapse field=idfield} - take numFound from this query and subtract it from the numFound of the match-all query (*:*)
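The JSON facet approach can be sketched roughly as follows, assuming the JSON Facet API's unique() aggregation and its usual response shape, where the top-level facets object carries count (matching documents) next to the requested statistic. The key name distinct, the helper names, and the sample response are made up for illustration; on a sharded collection the unique count may be an estimate, and documents without the field are counted in count but not in distinct.

```python
import json
from urllib.parse import urlencode


def distinct_count_params(field):
    """Build a rows=0 request asking for the number of unique values of field."""
    facet = {"distinct": f"unique({field})"}
    return urlencode({"q": "*:*", "rows": 0, "json.facet": json.dumps(facet)})


def duplicate_count(response):
    """Documents beyond the first for each value: total minus distinct values."""
    facets = response["facets"]
    return facets["count"] - facets["distinct"]


print(distinct_count_params("idfield"))

# Hypothetical response body, not real data.
sample = {"facets": {"count": 1000, "distinct": 940}}
print(duplicate_count(sample))
```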
You can also use facet.mincount=2 to find duplicate documents by faceting on the field that should be unique. Ex: /solr/core/select?q=*:*&facet=on&facet.field=uniqueidfield&facet.mincount=2&facet.missing=true
You can also add facet.limit=-1&rows=0 to return every duplicated value, with its count, and no document rows.
I'm new to Solr's faceting. The facet section in the search response seems to contain only facet terms and counts, with no documents associated. Now, if I would like to find the documents that belong to a facet, do I need to pass the facet in the search query and run the search again? What is the best way to use facets? Thanks
Yes, you have to issue a new search, typically by adding a filter query, in the form of myfacetfield:"facetterm" to your original request.
If you want the documents belonging to each aggregation bucket delivered up front, you can use grouping instead.
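A small sketch of the filter-query drill-down described in this answer: take the parameters of the original request and append an fq clause for the facet term the user picked. The field brand, the term, and the helper name are hypothetical; the escaping only covers quotes and backslashes inside a phrase.

```python
from urllib.parse import urlencode


def drill_down(params, field, term):
    """Return a copy of params with an fq that restricts results to one facet term."""
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    return params + [("fq", f'{field}:"{escaped}"')]


# Parameters of the original faceted search (illustrative values).
original = [("q", "laptop"), ("facet", "true"), ("facet.field", "brand")]

# Second request, issued after the user clicks the "Acme Corp" facet term.
print(urlencode(drill_down(original, "brand", "Acme Corp")))
```

Keeping the restriction in fq rather than in q leaves the relevance scoring of the original query unchanged.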
I'm trying to understand how to approach search requirements I have.
The first one is a normal product search that I know Solr can handle appropriately, where you search for a term and Solr returns relevant documents.
The second one is a search for products within a certain category. I have a hierarchical structure in my database that consists of categories with many subcategories, and those contain products.
The thing is, when certain very specific words are searched for, the first approach shouldn't be used. Instead, a category search should be done and only products within that category should be returned, which for me is a very basic SQL query (select * from products where categoryId = 1000).
Can or should Solr be used in the second case? If so, what is the normal approach?
Besides the filter queries that @Mysterion proposed, you should take a look at Solr facets, which give you very powerful category-like searching.
You might also want to consider a multivalued field, categoryParentIds, containing all the parent categories that the product is in. Combined with a filter query and/or facets, this gives you searching by parent category.
Yes, you could use a similar approach in Solr by indexing a categoryId with each product and then, while searching, adding a filter query similar to the SQL: categoryId:10000
For more info about filter query, take a look here - http://wiki.apache.org/solr/CommonQueryParameters#fq
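To illustrate how the two kinds of search could be routed, here is a rough sketch: if the user's input is one of the "very specific words", the request becomes a match-all query restricted by a categoryId filter query, otherwise it stays a normal term search. The keyword table and its contents are invented for the example; only categoryId and the fq parameter come from the thread.

```python
from urllib.parse import urlencode

# Hypothetical mapping from special search words to category ids.
CATEGORY_KEYWORDS = {"laptops": 1000}


def build_search_params(user_input):
    """Use a category filter for known keywords, a plain query otherwise."""
    category_id = CATEGORY_KEYWORDS.get(user_input.strip().lower())
    if category_id is not None:
        # Equivalent of: select * from products where categoryId = 1000
        return {"q": "*:*", "fq": f"categoryId:{category_id}"}
    return {"q": user_input}


print(urlencode(build_search_params("Laptops")))
print(urlencode(build_search_params("red running shoes")))
```

To include products from subcategories as well, the filter could target the multivalued categoryParentIds field instead of categoryId.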
I would like to perform a Solr search using the values of certain fields of an indexed document which I can identify by its id. With MLT this is somehow possible, but I would prefer a regular query parser. Can I somehow use subqueries to inject the result of a subquery into the main query?
For example, let's say I have indexed information about books into Solr, where each document represents a book with an id, title and author field. At query time I have only the document id available, and I would like to search for books by the same author in a single step. Is this possible without using MLT?
You can use the join query parser:
http://HOST:PORT/CORE/select?q={!join from=author to=author}id:<ID>
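Because the join query contains braces, spaces and an exclamation mark, it needs URL-encoding when sent over HTTP. A minimal sketch, in which the base URL and helper names are placeholders: the inner query id:<ID> selects the known book, and the join then matches every document whose author value equals that book's author (the original book included). This is assumed to work best when author is an untokenized string field.

```python
from urllib.parse import urlencode


def same_author_query(doc_id):
    """Join from the known document's author to all documents with that author."""
    return "{!join from=author to=author}id:" + str(doc_id)


def same_author_url(base_url, doc_id):
    """Build the full, URL-encoded select request."""
    return base_url + "/select?" + urlencode({"q": same_author_query(doc_id)})


# The host, port and core name here are placeholders.
print(same_author_url("http://localhost:8983/solr/books", 42))
```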
I'm working on improving the search for my e-commerce project, which is powered by Solr. Search queries are sent to Solr and the results are returned by Solr.
This is working fine. Now I need to offer facets on the search results. The first could be category; this is easy to implement, as category is common to all products, so in the query I just enable faceting and pass category as the facet field.
However, products of a different nature may have different attributes, and each kind has a few facets defined for it.
I'm clueless as to how I would know these in advance and pass them in the Solr search query. Can Solr return all facet fields along with the search results for a query? If yes, how?
If not, what would be the correct way to proceed?
Pass every unique facet field name that you want to facet on, and you will get counts for all records that have that field.
Define all the static field names in your facet query; if there are no hits for a field, you will not get any results back for that field.
Pass all the possible fields (those you need faceting on) as facet.field parameters with facet.mincount=1. That way you get values only for the fields that have at least one occurrence in your Solr data.
http://<hostname>:<portname>/solr/<core_name>/select?q=<fieldname>:<value>&fq=<field_name>:<value>&fl=<field1>,<field2>,<fieldn>&start=0&rows=10&facet=true&facet.field=<field1>&facet.field=<field2>&facet.field=<fieldn>&facet.mincount=1
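The request above can be sketched in Python as follows. The first helper repeats facet.field once per candidate field and sets facet.mincount=1; the second keeps only the facet fields that came back with at least one value. It assumes Solr's default JSON layout, where facet_counts.facet_fields maps each field to a flat list of alternating terms and counts. The field names and the sample response are invented for illustration.

```python
from urllib.parse import urlencode


def facet_params(query, facet_fields):
    """Encode a search that facets on every candidate field with mincount=1."""
    params = [("q", query), ("rows", 10), ("facet", "true"), ("facet.mincount", 1)]
    params += [("facet.field", name) for name in facet_fields]
    return urlencode(params)


def non_empty_facets(response):
    """Keep only the facet fields that returned at least one term."""
    fields = response["facet_counts"]["facet_fields"]
    return {name: values for name, values in fields.items() if values}


print(facet_params("category:shoes", ["color", "size", "screen_size"]))

# Hypothetical response: screen_size does not apply to shoes, so it is empty.
sample = {"facet_counts": {"facet_fields": {
    "color": ["black", 12, "red", 3],
    "size": ["42", 7],
    "screen_size": [],
}}}
print(sorted(non_empty_facets(sample)))
```

With this pattern the client does not need to know in advance which facets apply to a result set; it requests all of them and shows only those that come back non-empty.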