I'm implementing Solr in a project and I'm stuck on a specific search involving an arr (multivalued) field.
I'd like to search the sub-IDs of an object. These sub-IDs are stored in a multivalued field, e.g.:
<arr name="SubIds">
<int>12272</int>
<int>12304</int>
<int>12306</int>
</arr>
The query (or part of the query) that I want to use is as follows:
map(SubIds,i,i,1,0)
When I substitute, for example, 12304 for 'i' in the map function above, I expect the function to return 1. If I enter 12345, it should return 0. However, when I run the query with 12304 it returns 0, as if to say "There's no number 12304 in this field, I return 0".
When I remove the 0 (the default) from my map function, I can see the actual value returned to me (1 when it is 12304, otherwise the value itself); in this case that's 12306! I've tried this with several different multivalued fields and the result is the same; it looks like the function checks only the last value in the multivalued field against the ID I supplied.
Is this true? And if so, is there any way to look through the whole arr and return 0 only when the value doesn't exist anywhere in the multivalued field?
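To make the two observations concrete, here is a minimal Python model of map(x,min,max,target[,default]) applied to a single value. It assumes, as described above, that the function only ever sees one value of the multivalued field (12306 in this example). solr_map is a hypothetical helper written for illustration, not part of Solr.

```python
def solr_map(x, lo, hi, target, default=None):
    """Model of Solr's map() for one numeric value.

    Values inside [lo, hi] become `target`. Other values become `default`
    when one is given, and are returned unchanged otherwise.
    """
    if lo <= x <= hi:
        return target
    return x if default is None else default


# Assumption from the question: only 12306 is visible to the function.
seen_value = 12306
with_default = solr_map(seen_value, 12304, 12304, 1, 0)
without_default = solr_map(seen_value, 12304, 12304, 1)
```

With the default present the mismatch is reported as 0; without it the unchanged value 12306 comes back, which matches what the query showed.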
Edit: It's just a hunch, but could it be that the map() function automatically sorts the arr list when it sees that all the items are of type int (for example)? That could mean that map() returns the first number (the highest), which in my example would be 12306, not 12304.
Thanks!
... It looks like function queries don't work with multivalued fields ...
http://lucene.472066.n3.nabble.com/Using-multivalued-field-in-map-function-td3318843.html#a3322023:
Function queries don't work with multivalued field.
http://wiki.apache.org/solr/FunctionQuery#Vector_Functions
Given the following case, does anybody have a better idea of how I can query the data I want?
I've got a website full of blog posts, and every blog post has an owner;
this owner is referred to by his/her ID. For example: BloggerId
= 123. A blog can also have multiple co-writers, who
are also referred to by their BloggerId, but these IDs are stored in
the multivalued field, SubIds in my previous example.
When searching for a specific blogger, one searches on the BloggerId.
Search results are influenced by a number of variables: the
country/state/more specific geographical data, the blog category, etc.
For this I use a faceted query. Next I want to make some results more
important, depending on the BloggerId. I tried to do this with the
following query:
?q={!func}map(sum(map(BloggerId,12304,12304,2,0),map(BloggerId,12304,12304,1,0)),3,3,2)&fl=*,score&facet.field=Country&f.Country.facet.limit=6&facet.field=State&fq=(BlogCategory:internet%20OR%20BlogCategory:sports)&sort=score%20desc,Top%20desc,%20SortPriority%20asc&start=0&omitHeader=true
In the resulting list, blogs written by BloggerId 12304 should be on
top of the list, followed by the blogs where BloggerId 12304 was
co-writer. After that, all other blogs that follow the criteria but
aren't written (or co-written) by BloggerId 12304.
Maybe I could make this multivalued field a string field (where the IDs are separated by ";") and query my value, but if anyone has a better idea, you're always welcome!
In the end I chose to add a string-valued field that uses whitespace to separate the different values. After that I used the solr.WhitespaceTokenizerFactory class to quickly scan the string for occurrences of a specific ID.
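As a rough sketch of that approach, the Python below builds the whitespace-separated value at index time and mimics the token-level match that a whitespace-tokenized field would give. Both function names are hypothetical, and the matching is only an approximation of what the tokenizer does inside Solr.

```python
def to_whitespace_field(ids):
    """Join the sub-IDs into one string, separated by single spaces."""
    return " ".join(str(i) for i in ids)


def contains_id(field_value, wanted):
    """Token-level lookup: the ID must equal a whole token, not a substring.

    str.split() without arguments splits on runs of whitespace, which is
    roughly what a whitespace tokenizer produces.
    """
    return str(wanted) in field_value.split()


sub_ids_text = to_whitespace_field([12272, 12304, 12306])
```

Because the comparison is against whole tokens, a partial ID such as 1230 does not match 12304.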
Related
I'm looking at a very old Solr instance (4-6 years since it was last touched), and I am seeing these extra dynamic fields, 'f_' and 'fs_', for multi- and single-valued facet fields.
My understanding, though, is that faceting only happens at query time.
Also, it's just a straight copy: the fields don't change type.
So before I nuke these fields to kingdom come: is there a reason to have facet fields in an index when they are just copied fields?
Thanks
Saying that facets only happen at query time is a bit misleading: the content (the tokens) that the facet is built from is generated when indexing. The facet gives, for each distinct token, the number of documents that contain it.
That means that if the field type is identical and there is only one field being copied into the other named field, the behaviour between the source and the destination field should be identical.
However, if there are multiple fields copying content into the same field, the results will differ. Also be aware that the type is given from the schema for the field, it's not changed by the copyField instruction in any way. A copy field operation happens before any content runs through the indexing chain for the field.
Usually you want facets to be generated on string fields so that the indexed values are kept as-is, while you want to use a text field or similar for searching (with tokenization), since a string field would only give exact (including matching case) hits.
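The difference can be illustrated with a small Python model, offered only as a sketch: the same three values are faceted once as-is (like a string field) and once after lowercasing and splitting on whitespace (a simplified stand-in for a text field's analysis chain). The sample values and function names are made up for the example.

```python
from collections import Counter


def facet_counts(values, analyze):
    """Count, per token, how many documents contain that token."""
    counts = Counter()
    for value in values:
        # A facet counts documents, so a token counts once per document.
        counts.update(set(analyze(value)))
    return counts


def as_string(value):
    """String-like field: the whole value is the only token."""
    return [value]


def as_text(value):
    """Simplified text analysis: lowercase, then split on whitespace."""
    return value.lower().split()


docs = ["New York", "New Jersey", "New York"]
string_facets = facet_counts(docs, as_string)
text_facets = facet_counts(docs, as_text)
```

The string-style facet keeps "New York" and "New Jersey" as buckets, while the tokenized version produces buckets such as "new", which is rarely what a facet list should show.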
I would like to create an output based on the field-names of my Solr index objects.
What I have are objects like this e.g.:
{
"Id":"ID12345678",
"GroupKey":"Beta",
"PricePackage":5796.0,
"PriceCoupon":5316.0,
"PriceMin":5316.0
}
The Price* fields may vary from object to object: some have more than three of them, some fewer, but they are always prefixed with Price.
How can I query Solr to get a list with all field-names prefixed by Price?
I've looked into filters, facets but could not find any clue on how to do this, as all examples - e.g. regex facet - are in regard to the field-value, not the field-name itself. Or at least I could not adapt it to that.
You can get a comma separated list of all existing field names if you query for 0 documents and use the csv response writer (wt parameter) to generate the field name list.
For example, if you request /solr/collection/select?q=*:*&wt=csv you get a list of all fields. If you only want fields prefixed with Price, you can also add the field list parameter (fl) to limit the fields.
So a request to /solr/collection/select?q=*:*&wt=csv&fl=Price* should return the following response:
PricePackage,PriceCoupon,PriceMin
With this solution you get all existing fields, including dynamic fields.
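A minimal Python sketch of the same request, assuming a core reachable at a base URL you supply. The rows=0 parameter is my addition to make the "0 documents" part explicit, and both helper names are hypothetical. The code only builds the URL and parses a CSV header line; it does not contact Solr.

```python
import csv
import io
from urllib.parse import urlencode


def field_names_url(base_url, prefix):
    """Build a select URL that asks for a CSV header of matching fields."""
    params = {
        "q": "*:*",
        "rows": 0,           # assumption: return no documents, header only
        "wt": "csv",
        "fl": prefix + "*",  # glob that limits the field list
    }
    return base_url + "/select?" + urlencode(params)


def parse_header(csv_text):
    """Return the field names from the first line of a CSV response."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    return rows[0] if rows else []


url = field_names_url("http://localhost:8983/solr/collection", "Price")
names = parse_header("PricePackage,PriceCoupon,PriceMin\n")
```

The base URL above is a placeholder; urlencode percent-encodes the `*` and `:` characters, which Solr decodes back to the original values.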
I have a structure where I want to search my documents and filter/rank/set conditions on their parents. For example, a doc is a match because it contains my search string, but also because its parent contains a certain value.
Using the graph parser and experimenting with its filter is the best way I have found of doing this. I tried the block join child parser first, but it wouldn't do it for me.
The problem I am facing now is that I can't seem to get the filter to work in this way:
traversalFilter="(-field:x) OR (field2:y)"
Meaning: if field does not have value x, it is OK; if field has value x and field2 has value y, it's also OK. All other cases are filtered away.
But it won't work. Any help is appreciated!
Edit for more information:
I have set up a test core with all my fields stored as text_general fields, using the default solrconfig. I have a simple chain: the from parameter is the document id, and a field named to stores the ids of each document's children. The graph parser works fine; it's just this kind of filter that does not work for me.
I have documents whose field holds the value a or b.
A query like this:
q=*
fq={!graph from=id to=to returnRoot=false traversalFilter="(field:b)" }id:0
This query filters away any document and its children that do not have b as value on field.
q=*
fq={!graph from=id to=to returnRoot=false traversalFilter="(-field:b)" }id:0
This should then do the opposite: filter away documents with b as the value. But it does not work for some reason.
Edit:
from solrquerysyntax:
https://wiki.apache.org/solr/SolrQuerySyntax
Pure negative queries (all clauses prohibited) are allowed. -inStock:false finds all field values where inStock is not false
Which is why q=* fq=-(field:x) works fine, returning all documents that do not contain the value x in field.
So why can't I add the same filter in the graph traversal?
EDIT3:
I have now started looking into the graph parser and have noticed that, when filtering, -(-field:x) is the same as +field:x. But +(-field:x) is not the same as -field:x and does not work.
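One possible explanation, offered as an assumption rather than a confirmed diagnosis of the graph parser: in Lucene, a boolean query made up only of prohibited clauses matches nothing by itself, because negative clauses can only subtract from what positive clauses matched. Solr compensates for this in top-level queries by adding a match-all clause, and that rewrite may not be applied to a nested filter such as traversalFilter. The Python below models this with plain sets; the document ids and names are invented for the example.

```python
ALL_DOCS = {0, 1, 2, 3}
FIELD_B = {1, 3}  # ids of the documents where field has value b


def boolean_query(must=None, must_not=()):
    """Set model of a boolean query.

    Positive clauses select documents; negative clauses only remove
    documents from that selection.
    """
    result = set(must) if must is not None else set()
    for excluded in must_not:
        result -= excluded
    return result


# Only a prohibited clause: there is nothing to subtract from.
pure_negative = boolean_query(must_not=[FIELD_B])

# Match-all plus the prohibited clause, the model of "*:* -field:b".
with_match_all = boolean_query(must=ALL_DOCS, must_not=[FIELD_B])
```

If this is what happens here, writing the filter as traversalFilter="*:* -field:b" would be worth trying; I have not verified that against the graph parser.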
Suppose a user enters a two-word input for search. Since the default boolean operator applied is OR, all entries containing either or both words appear.
What I'd like to know is whether entries that specifically meet the AND condition could be boosted.
In the case of multiple words, can certain words be made to imply specific constraints on the search, or to boost some parameters when those words are present? For example, if the input is "with x and y without z", can I make Solr interpret it as (x AND y) AND (NOT z)? Or at least boost the entries that partially or fully meet the requirement?
EDIT:
I have tried using boost with edismax as shown here:
$query = $client->createSelect(); // create the search query
// fields to be searched, each paired with the search input
$query->setQuery('memberType:'.$searchQuery.' firstName:'.$searchQuery.' gender:'.$searchQuery);
$edismax = $query->getEDisMax();
$edismax->setQueryFields('firstName memberType^3 gender^2'); // boost fields
$query->setStart($start)->setRows($rows); // starting offset and number of rows to return; use variables instead of constants
$query->setFields(array('id', 'firstName', 'lastName', 'eid', 'gender', 'memberType')); // fields to return
//$query->addSort('id', $query::SORT_ASC); // optional sort field and direction
$resultSet = $client->select($query);
When I search for a name with a particular member type, like "sanjay candidate", I expect the order to be: entries matching both sanjay and candidate, then all users who are candidates, and then all users named sanjay. Instead I get entries matching sanjay and candidate, then everyone named sanjay, and then all candidates.
I am not able to figure out what the issue may be, or whether I can provide more customized boosting.
If you are using eDisMax, you have a whole collection of boosting options: phrase boosts, bigram boosts, a separate boost query and so on. Read through the wiki page and experiment. You should not need to do any custom coding for this scenario.
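As a starting point for such experiments, here is a hedged Python sketch of the raw request parameters. It assumes the user input is sent as plain text in q so that the qf boosts can apply; as far as I understand, terms written with an explicit field prefix (as in the PHP query above) are searched in that field rather than through qf. The function name is hypothetical and the field names are taken from the question.

```python
from urllib.parse import urlencode


def edismax_params(user_input, start=0, rows=10):
    """Build eDisMax request parameters for a free-text search."""
    return {
        "defType": "edismax",
        "q": user_input,  # plain terms, no field prefixes
        "qf": "firstName memberType^3 gender^2",  # per-field boosts
        "fl": "id,firstName,lastName,eid,gender,memberType",
        "start": start,
        "rows": rows,
    }


params = edismax_params("sanjay candidate")
query_string = urlencode(params)
```

Appending debugQuery=true to such a request shows how each document's score was computed, which helps when the resulting order is not the expected one.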
Can I, within a Solr function query, count the number of values in a multivalued field? How would I write a function query that returns documents with, say, 3 or more values for a particular field?
Here's the function query reference, and it doesn't list anything like that, so I think it's safe to assume that there's no such thing.
If the value count is somehow relevant in your case, add it as a separate int field, then operate on that field.
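A minimal sketch of that suggestion in Python, assuming documents are prepared as dictionaries before being sent to Solr. The "_count" suffix and both function names are hypothetical choices for the example, and the count field would need to exist in the schema as an int field.

```python
def with_value_count(doc, field, count_field=None):
    """Return a copy of the document with the value count added."""
    count_field = count_field or field + "_count"
    values = doc.get(field, [])
    if not isinstance(values, (list, tuple)):
        values = [values]  # a single value still counts as one
    return {**doc, count_field: len(values)}


def min_count_filter(count_field, minimum):
    """Range filter matching documents with at least `minimum` values."""
    return f"{count_field}:[{minimum} TO *]"


doc = with_value_count({"id": "1", "SubIds": [12272, 12304, 12306]}, "SubIds")
fq = min_count_filter("SubIds_count", 3)
```

The resulting filter string can be passed as an fq parameter to return only documents with three or more values.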