I want to find the names of the facets in the output based on the results. I mean, if I have the output as
<lst name="facet_fields">
<lst name="state">
<int name="kerala">3312</int>
<int name="andaman">10</int>
<int name="andhra">0</int>
<int name="arunachal">0</int>
<int name="assam">0</int>
</lst>
</lst>
I want the result to be kerala,andaman, as both of them have a count > 0.
Is there any way to do this? Please help me with this.
I guess you want to specify the minimum count as 1 in your query. This can be achieved using facet.mincount, e.g. facet.mincount=1.
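As a rough illustration of the answer above, here is a minimal Python sketch. The base URL is a hypothetical placeholder, and the second function is only a client-side fallback for responses that were already fetched without facet.mincount; it is not part of Solr itself.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Hypothetical base URL; replace with your own Solr host and core.
SOLR_SELECT = "http://localhost:8983/solr/select"


def facet_url(field, mincount=1):
    """Build a facet query that asks Solr to drop values below mincount."""
    params = {
        "q": "*:*",
        "rows": 0,
        "facet": "true",
        "facet.field": field,
        "facet.mincount": mincount,
    }
    return SOLR_SELECT + "?" + urlencode(params)


def nonzero_facets(xml_text, field):
    """Client-side fallback: facet names under `field` with a count > 0."""
    root = ET.fromstring(xml_text)
    names = []
    for lst in root.iter("lst"):
        if lst.get("name") == field:
            names = [i.get("name") for i in lst.findall("int") if int(i.text) > 0]
    return names


# The facet output from the question.
SAMPLE = """<lst name="facet_fields">
<lst name="state">
<int name="kerala">3312</int>
<int name="andaman">10</int>
<int name="andhra">0</int>
<int name="arunachal">0</int>
<int name="assam">0</int>
</lst>
</lst>"""

print(facet_url("state"))
print(",".join(nonzero_facets(SAMPLE, "state")))  # kerala,andaman
```

Letting Solr filter with facet.mincount is usually preferable, since zero-count values are then never sent over the wire.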
I am using faceting to find the counts of the top 3 most repeated words in a particular field, say "msgs", across more than 10,000 records,
and I get output similar to this:
word1 1600
word2 1536
word3 956
Now, along with the count, I want to display the particular fields that contain the above words. Any suggestions?
Okay. I hope I understand what you need. You could try a query similar to this one:
http://solrhost:solrport/solr/select?q=your_query&rows=0&facet=true&facet.limit=-1&facet.field=your_facet_field1&facet.field=your_facet_field2
where
solrhost - Solr address
solrport - Solr port (default 8983)
your_facet_field1, etc. - your field msgs
your_query could be *:* if you want to facet over every document
The result will be something like this:
<response>
<responseHeader>
<status>0</status>
<QTime>2</QTime>
</responseHeader>
<result numFound="4" start="0" />
<lst name="facet_counts">
<lst name="facet_queries" />
<lst name="facet_fields">
<lst name="your_facet_field1">
<int name="search">0</int>
<int name="memory">0</int>
<int name="graphics">0</int>
<int name="card">0</int>
<int name="music">1</int>
<int name="software">0</int>
<int name="electronics">3</int>
<int name="copier">0</int>
<int name="multifunction">0</int>
<int name="camera">0</int>
<int name="connector">2</int>
<int name="hard">0</int>
<int name="scanner">0</int>
<int name="monitor">0</int>
<int name="drive">0</int>
<int name="printer">0</int>
</lst>
<lst name="your_facet_field2">
<int name="false">3</int>
<int name="true">1</int>
</lst>
</lst>
</lst>
</response>
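To get from a response like the one above to the "top 3 words" the question asks about, the facet counts can be read out of the XML and sorted. The following is only a sketch: the function name top_terms is made up for illustration, and the sample is a shortened copy of the response shown above.

```python
import xml.etree.ElementTree as ET


def top_terms(xml_text, field, n=3):
    """Return the n (term, count) pairs with the highest counts for `field`."""
    root = ET.fromstring(xml_text)
    counts = []
    for fields in root.iter("lst"):
        # Only look inside the facet_fields section of the response.
        if fields.get("name") != "facet_fields":
            continue
        for lst in fields.findall("lst"):
            if lst.get("name") == field:
                counts = [(i.get("name"), int(i.text)) for i in lst.findall("int")]
    return sorted(counts, key=lambda pair: pair[1], reverse=True)[:n]


# Shortened version of the response above.
SAMPLE = """<response>
<result numFound="4" start="0" />
<lst name="facet_counts">
<lst name="facet_fields">
<lst name="your_facet_field1">
<int name="music">1</int>
<int name="software">0</int>
<int name="electronics">3</int>
<int name="connector">2</int>
</lst>
</lst>
</lst>
</response>"""

print(top_terms(SAMPLE, "your_facet_field1"))
# [('electronics', 3), ('connector', 2), ('music', 1)]
```

Each returned term could then be used in a follow-up query on the same field to fetch the matching documents.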
Is there a way in Solr to get a list of Terms by their distance to another term, similar to the TermsComponent which can return terms by their count in a document?
For instance, if I have the following text indexed:
The quick brown fox jumped over the lazy dogs.
and
What does the fox say?
And I searched for the term 'fox', I would expect the following output:
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">7</int>
</lst>
<lst name="terms">
<lst name="text">
<int name="brown">0</int>
<int name="jumped">0</int>
<int name="say">0</int>
<int name="the">1</int>
<int name="quick">1</int>
<int name="over">1</int>
<int name="does">1</int>
<int name="what">2</int>
<int name="lazy">3</int>
<int name="dogs">4</int>
</lst>
</lst>
</response>
In this example I am using a VERY simple algorithm to calculate the value (total_word_distance / number_of_docs_appeared). For example, 'the' occurs in both documents, once with a distance of 0 and once with a distance of 2, so (0 + 2) / 2 gives the answer of 1.
Again, what I am asking is whether something like this already exists, and if not, how one would go about doing this.
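For reference, the calculation described above can be sketched outside Solr in plain Python. This is not a Solr component; it only reproduces the arithmetic. The assumptions are labeled in the comments: distance is the number of words between a term and the nearest occurrence of the search term, a term that repeats in a document counts once using its smallest distance, and documents without the search term are skipped.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def average_term_distance(docs, target):
    """Average, per term, of its distance to `target` over the docs it shares with it."""
    totals = defaultdict(int)
    doc_counts = defaultdict(int)
    for text in docs:
        tokens = tokenize(text)
        target_positions = [i for i, t in enumerate(tokens) if t == target]
        if not target_positions:
            continue  # assumption: docs without the target do not count
        nearest = {}
        for i, term in enumerate(tokens):
            if term == target:
                continue
            # Distance = number of words strictly between term and target.
            gap = min(abs(i - p) for p in target_positions) - 1
            # Assumption: a repeated term contributes its smallest gap only.
            if term not in nearest or gap < nearest[term]:
                nearest[term] = gap
        for term, gap in nearest.items():
            totals[term] += gap
            doc_counts[term] += 1
    return {term: totals[term] / doc_counts[term] for term in totals}


DOCS = [
    "The quick brown fox jumped over the lazy dogs.",
    "What does the fox say?",
]

scores = average_term_distance(DOCS, "fox")
for term, score in sorted(scores.items(), key=lambda kv: kv[1]):
    print(term, score)
```

With the two example documents this yields the same values as the expected output in the question, for instance 1.0 for 'the' and 4.0 for 'dogs'.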
I want to write a query that, for example, looks like this in SQL pseudocode:
select * from temptable where price + 3 = 188;
The Solr query I tried is below:
http://127.0.0.1:8983/solr/select/?fl=score,id&defType=func&q=sum(price,3):188
but I get the error below. How can I write this query in Solr? Please do not advise using the "TO" keyword.
<response>
<lst name="responseHeader">
<int name="status">400</int>
<int name="QTime">1</int>
<lst name="params">
<str name="fl">score,id</str>
<str name="q">sum(price,3):188</str>
<str name="defType">func</str>
</lst>
</lst>
<lst name="error">
<str name="msg">
org.apache.solr.search.SyntaxError: Unexpected text after function: :188
</str>
<int name="code">400</int>
</lst>
</response>
An frange query will do:
{!frange l=188 u=188} sum(price,3)
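A small Python sketch of how such a request might be assembled. The helper names are made up for illustration, and the sketch assumes the frange expression is sent as q without defType=func, since the {!frange ...} local params already select the query parser.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def frange_query(function, value):
    """Build an frange query that matches docs where `function` equals `value`."""
    # l and u are the lower and upper bounds; equal bounds mean an exact match.
    return "{!frange l=%s u=%s}%s" % (value, value, function)


def build_url(base_url, function, value, fl="score,id"):
    """Assemble the select URL with properly escaped parameters."""
    params = {"q": frange_query(function, value), "fl": fl}
    return base_url + "?" + urlencode(params)


# Base URL taken from the question.
url = build_url("http://127.0.0.1:8983/solr/select/", "sum(price,3)", 188)
print(url)

# Decoding the URL again shows the query Solr would receive.
print(parse_qs(urlsplit(url).query)["q"][0])  # {!frange l=188 u=188}sum(price,3)
```

URL-encoding matters here because the braces, the exclamation mark and the spaces in the local params are not safe to send raw.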
My schema is like this:
product_id
category_id
A category contains products.
In Solr 3.6, I group results on category_id and it works well.
I just added a new field:
group_id
A group contains products that vary on size or color.
Example: shoes in blue, red and yellow are 3 different products and have the same group_id.
In addition to the result grouping on the field category_id, I would like my results to contain only one product per group_id, given that group_id can be null (for products that aren't part of a group).
To follow the example of the shoes, it means that for the request "shoe", only one of the 3 products should be in the results.
I thought of doing a second result grouping on group_id, but it doesn't seem possible to do it that way.
Any ideas?
EDIT: For now, I process the results in PHP to delete documents that have a group_id that is already in the results. I am leaving this question open in case someone finds out how to group on 2 fields.
If your aim is to get counts based on multiple "group by" fields, you can use pivot faceting to achieve this.
&facet.pivot=category_id,group_id
Solr will give you back a hierarchy of grouped result counts under the facet_pivot element, which follows the page of search results in the response.
http://wiki.apache.org/solr/SimpleFacetParameters?highlight=%28pivot%29#Pivot_.28ie_Decision_Tree.29_Faceting
It is not possible to group query results on two fields.
If you only need counts, you can use facet.field (for a single field) or facet.pivot (for multiple fields).
This is not actually grouping, but it gives you the count of each group across multiple fields.
Example Output:
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader">
<bool name="zkConnected">true</bool>
<int name="status">0</int>
<int name="QTime">306</int>
</lst>
<result name="response" numFound="667" start="0" maxScore="0.70710677">
<doc>
<int name="idField">7393</int>
<int name="field_one">12</int>
</doc>
</result>
<lst name="facet_counts">
<lst name="facet_queries"/>
<lst name="facet_fields"/>
<lst name="facet_ranges"/>
<lst name="facet_intervals"/>
<lst name="facet_heatmaps"/>
<lst name="facet_pivot">
<arr name="field_one,field_two">
<lst>
<str name="field">field_one</str>
<int name="value">3</int>
<int name="count">562</int>
<arr name="pivot">
<lst>
<str name="field">field_two</str>
<bool name="value">true</bool>
<int name="count">347</int>
</lst>
<lst>
<str name="field">field_two</str>
<bool name="value">false</bool>
<int name="count">215</int>
</lst>
</arr>
</lst>
<lst>
<str name="field">field_one</str>
<int name="value">12</int>
<int name="count">105</int>
<arr name="pivot">
<lst>
<str name="field">field_two</str>
<bool name="value">true</bool>
<int name="count">97</int>
</lst>
<lst>
<str name="field">field_two</str>
<bool name="value">false</bool>
<int name="count">8</int>
</lst>
</arr>
</lst>
</arr>
</lst>
</lst>
</response>
Example Query :
http://192.168.100.145:7983/solr/<collection>/select?facet.pivot=field_one,field_two&facet=on&fl=idField,field_one&indent=on&q=field_one:(3%2012)&rows=1&wt=xml
If you can change the data that you are posting to Solr, then I suggest that you create a string field that holds a concatenation of category_id and group_id. For example, if category_id = 5 and group_id = 2, then your string field can be '5,2' (using ',' or any other character as a delimiter). You can then group on this string field.
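A minimal sketch of that preprocessing step in Python, applied to documents before they are posted to Solr. The field name category_group is hypothetical, and so is the handling of a null group_id: the sketch falls back to the product_id so that ungrouped products do not collapse into one group, which is an assumption rather than something the answer specifies.

```python
def add_composite_key(doc, delimiter=","):
    """Return a copy of `doc` with a combined category/group string field."""
    group = doc.get("group_id")
    if group is None:
        # Assumption: ungrouped products get a key unique to the product.
        group = "product-%s" % doc["product_id"]
    enriched = dict(doc)
    # "category_group" is a hypothetical field name; it must exist in the schema
    # as a string field.
    enriched["category_group"] = "%s%s%s" % (doc["category_id"], delimiter, group)
    return enriched


docs = [
    {"product_id": 1, "category_id": 5, "group_id": 2},
    {"product_id": 2, "category_id": 5, "group_id": 2},
    {"product_id": 7, "category_id": 5, "group_id": None},
]

for doc in docs:
    print(add_composite_key(doc)["category_group"])
# 5,2
# 5,2
# 5,product-7
```

Note that grouping on this field returns one group per category and group_id pair, so it replaces the grouping on category_id rather than nesting inside it.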
Can you please explain to me what a facet is?
What I understood is this. Suppose I have the following documents:
State Country
karntaka India
Bangalore India
Delhi India
Noida India
It collapses multiple identical values of a field into a single value and returns the number of times that value occurred.
Now when I search on the field 'Country', I obviously get India 4 times. So I set facet=on and facet.field=Country, hoping to get India only once, but when I ran the query I instead got
some weird results:
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">6</int>
</lst>
<result name="response" numFound="4" start="0">
<doc>
<str name="country">India</str></doc>
<doc>
<str name="country">India</str></doc>
<doc>
<str name="country">India</str></doc>
<doc>
<str name="country">India</str></doc>
</result>
<lst name="facet_counts">
<lst name="facet_queries"/>
<lst name="facet_fields">
<lst name="country">
<int name="a">4</int>
<int name="d">4</int>
<int name="di">4</int>
<int name="dia">4</int>
<int name="i">4</int>
<int name="ia">4</int>
<int name="in">4</int>
<int name="ind">4</int>
<int name="indi">4</int>
<int name="india">4</int>
</lst>
</lst>
<lst name="facet_dates"/>
<lst name="facet_ranges"/>
</lst>
</response>
Can anyone help me understand this?
Thanks
If you had a Washington, USA entry, the facet would report 4 results for India and 1 for USA.
Use a string field type. You seem to have used a text field with lowercasing and n-gramming, which may help people who spell India as Inde, for example. A string field is not processed like this, and it is therefore best suited for a field meant to be faceted.