Grouping results and keeping facet counts consistent

Using Solr 3.3
Key  Store       Item Name  Description        Category          Price
=========================================================================
1    Store Name  Xbox 360   Nice game machine  Electronic Games  199.99
2    Store Name  Xbox 360   Nice game machine  Electronic Games  199.99
3    Store Name  Xbox 360   Nice game machine  Electronic Games  249.99
I have data similar to the table above loaded into Solr. Item Name, Description, Category, and Price are searchable.
Expected result:
Facet field: Category
  Electronic (1)
  Games (1)
**Store Name**
  Xbox 360, Nice game machine, priced from 199.99 - 249.99
What query parameters can I send to Solr to get the results above? Basically, I want to group by Store, Item Name, and Description, and show the minimum and maximum price.
I also want paging to stay consistent with the main group (Store Name): paging should be based on the Store Name groups, so if 20 stores are found, I should be able to page through them correctly.
Please suggest.

If using Solr 4.0, the new "Grouping" (which replaces FieldCollapsing) fixes this issue when you add the parameter "group.facet=true".
So to group your results, you would add the following parameters to your search request:
group=true // Enables grouping
group.facet=true // Computes facet counts per group instead of per document
group.field=Store // Groups results by the field "Store"
group.ngroups=true // Tells Solr to return the number of groups found
The number of groups found is what you would show to the user and use for paging, instead of the normal total count, which is the total number of matching documents.
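As a rough illustration, the parameters above could be assembled from Python as shown below. This is only a sketch: the field names `Category` and `Price` are assumed to match the table columns in the question, the helper names are made up for this example, and `price_range` only sees the documents Solr actually returns inside a group, so `group.limit` has to be large enough to cover a store's items.

```python
from urllib.parse import urlencode


def build_grouped_query(text, page, stores_per_page=10, docs_per_store=50):
    """Build the query string for one page of store groups (page is 1-based)."""
    params = [
        ("q", text),
        ("group", "true"),                # enable grouping
        ("group.field", "Store"),         # one group per store
        ("group.ngroups", "true"),        # return the number of groups found
        ("group.facet", "true"),          # facet counts per group (Solr 4.0+)
        ("group.limit", docs_per_store),  # documents returned inside each group
        ("facet", "true"),
        ("facet.field", "Category"),      # assumed field name
        # With grouping enabled, start/rows count groups, not documents.
        ("start", (page - 1) * stores_per_page),
        ("rows", stores_per_page),
    ]
    return urlencode(params)


def price_range(group):
    """Return (min, max) of the Price values of the documents in one group.

    `group` is one entry of the "groups" list in a grouped JSON response,
    i.e. a dict with a "doclist" that holds the returned "docs".
    """
    prices = [doc["Price"] for doc in group["doclist"]["docs"]]
    return min(prices), max(prices)
```

The number of pages would then come from the `ngroups` value in the response rather than from the document count.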

Have you looked into field collapsing? It is new in Solr 3.3.
http://wiki.apache.org/solr/FieldCollapsing

What I did was create another stored field that combines the required fields into a single value. Problem solved: now I group only on that field and get the correct count.
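A minimal sketch of building such a combined field at index time might look like this. The field names are taken from the question's table, while the separator and the function name are assumptions made for the example.

```python
def make_group_key(doc, fields=("Store", "Item Name", "Description"), sep="|"):
    """Join the values of the grouping fields into one string.

    The separator is an assumption; pick one that cannot occur in the data.
    """
    return sep.join(str(doc[name]).strip() for name in fields)
```

The resulting value would be stored in its own field on each document (a hypothetical `group_key`, say), and the query would then group on that single field.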

Related

Google Data Studio convert metric to dimension not working

I have imported my GA4 data into Google Data Studio and am trying to see how many giftcards have been sold by their value.
The item revenue metric in GA4 is equal to the giftcard value (i.e. revenue = $200 therefore $200 giftcard was sold).
I want to break down sales by giftcard value like so:
Giftcard (revenue)  Count
$200                4
$250                3
$300                6
To do this, I need to set a copy of item revenue as a dimension rather than a metric.
In Google Data Studio, I can create a calculated field with the following formula that should convert the item revenue into text:
CAST(Item Revenue AS TEXT)
The problem I'm having is that while the formula sets the field type as text, it is still regarded by GDS as a metric and can't be used as a dimension.
Even when I try to add text, GDS still recognises the field as a number:
CONCAT(CAST(Item Revenue AS TEXT), " giftcard")

To use a metric as a dimension, you can blend the data source with itself. When configuring the chart (a table, for example) and its data source, create a blend, but do not add any other source: define the blend using only the original data. You end up with the same data, just exposed through a blended source.
In a blend, Data Studio treats all calculated fields (metrics) as dimensions, so the conversion becomes possible.

Solr Boost-Function on Sales

I am using Apache Solr 8 with products as documents. Each document includes sales within the last X days that I want to boost, as well as a title and other fields.
Say productA has been sold 5 times, I want to boost it with score+10; a productB has been sold 50 times, I want to boost the score by 30.
I tried to use a boostFunction that looks like (edismax query parser)
q=Coffee&qf=title&bf=if(lt(sales,5),10,if(lt(sales,50),30))
Solr now returns documents that have nothing to do with my "Coffee"-Query but just match the boostfunction. There are even results with score "0".
E.g.
Rank;Score;Sales;Title
1;58.53;55;Coffee big
2;38.11;50;Coffee
3;30;55;Tea
Any idea how to get rid of those "boost function only" matches?

Found the answer!
My Query-Fields actually included boostings like
&qf=title^2 longDescription^0 whatever^0...
Instead of excluding the results found in those 0-boosted fields, Solr still includes them as matches, just with a score of 0.
When I remove the 0-boostings, everything works as intended.
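If the qf string is assembled in application code, the 0-boosted entries could be filtered out before the request is sent. The helper below is a hypothetical sketch, not part of Solr, and it assumes qf is a whitespace-separated list of `field` or `field^boost` entries.

```python
def drop_zero_boosts(qf):
    """Remove fields with a boost of 0 from an edismax qf string."""
    kept = []
    for entry in qf.split():
        _field, _, boost = entry.partition("^")
        if boost and float(boost) == 0:
            continue  # a 0-boosted field still produces matches, so leave it out
        kept.append(entry)
    return " ".join(kept)
```

For the qf from this answer, `drop_zero_boosts("title^2 longDescription^0 whatever^0")` keeps only `title^2`.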

Is it possible to create an SQL query that displays results like this?

Background
I have a database that holds records of all assets in an office. Each asset has a condition, a category name, and an age.
A ConditionID can be:
In use
Spare
In Circulation
A CategoryID can be:
Phone
PC
Laptop
Age is derived from a field called AquiredDate, which holds values like:
2009-04-24 15:07:51.257
Example
I've created an example of the inputs of the query to explain better what I need if possible.
NB.
Inputs are in Orange in the above example.
I've split the example into two separate queries.
Count would be the output
Question
Is this type of query and result set possible using SQL alone? If so, where do I start? Or would it be easier to use MS Excel as well?

Yes, it is possible. For your orange fields, you can simply filter, e.g.:
where CategoryID = 'Phone' and ConditionID in ('In use', 'In Circulation')
For the yellow one, you could take the datediff in days from the acquired date to now, divide it by 365, and floor the result. To get the last bucket (the 6+ years category), take the minimum of 5 and the calculated value, so you get 0 for everything between 0 and 1 year old, and so on, up to 5, which collects everything in the oldest category.
When you group by that calculated column and also select the count, you get what you want.
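The approach above can be tried out with a small self-contained sketch. It uses Python's built-in SQLite, so `julianday` stands in for the `datediff` mentioned in the answer. The table name `Assets`, the idea of storing the condition and category as text, and the fixed `now` argument are all assumptions made for the example.

```python
import sqlite3


def age_bucket_counts(rows, category, conditions, now):
    """Count assets per age bucket (0..5 years, where 5 is the open-ended bucket).

    rows: iterable of (CategoryID, ConditionID, AquiredDate) tuples.
    now:  reference timestamp as text, e.g. "2016-01-01 00:00:00".
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Assets (CategoryID TEXT, ConditionID TEXT, AquiredDate TEXT)"
    )
    conn.executemany("INSERT INTO Assets VALUES (?, ?, ?)", rows)
    placeholders = ", ".join("?" for _ in conditions)
    sql = f"""
        SELECT MIN(CAST((julianday(?) - julianday(AquiredDate)) / 365 AS INTEGER), 5)
                   AS AgeBucket,          -- whole years, capped at 5
               COUNT(*) AS AssetCount
        FROM Assets
        WHERE CategoryID = ? AND ConditionID IN ({placeholders})
        GROUP BY AgeBucket
        ORDER BY AgeBucket
    """
    result = conn.execute(sql, [now, category, *conditions]).fetchall()
    conn.close()
    return result
```

Calling it with the category and the list of conditions from the orange inputs returns one `(AgeBucket, AssetCount)` row per bucket that has at least one asset; empty buckets are simply absent.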

SOLR search array vs individual documents

I've got a business case where I need to check whether a search query is about businesses,
e.g. q="night clubs new york"
I've got a list of countries, states, cities, and regions in my database (3 million+ records), and I've got a list of business categories.
All I want to do is check whether the query contains a business category (night clubs) and the name of a city, state, or country (new york). So I'm checking the number of results returned for the query below. If I get 2 results, then this is a business query, and I then query my Solr index to search for businesses.
query: places_ss:(night clubs new york) OR categories_ss:(night clubs new york)
Speed question: how should I store the list of cities, states, and countries in Solr to get maximum search speed?
- Have one document with id:places and add the distinct cities, states, and countries to one array, places_ss.
- Have multiple documents with different ids, each holding 100,000 place names in an array.
- Have one or more documents with a place_s string (not an array), with places separated by spaces and the spaces within a place name replaced by underscores, e.g. new york becomes new_york.
At query time I would then generate the multiple combinations of night clubs new york,
e.g. night night_clubs night_clubs_new night_clubs_new_york clubs_new clubs_new_york new_york york, and query for the place.
Would it be a good idea to have a separate core just for the place documents above to increase speed?
Is this a good solution?
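For reference, the query-time combinations described in the underscore option could be generated along these lines. This is only a sketch with a made-up function name; it emits every contiguous word sequence, so it also includes single words such as clubs and new that the example list above leaves out.

```python
def underscore_ngrams(query):
    """Return every contiguous run of words in the query, joined by underscores."""
    words = query.split()
    return [
        "_".join(words[start:end])
        for start in range(len(words))
        for end in range(start + 1, len(words) + 1)
    ]
```

A four-word query yields ten candidates, and the count grows quadratically with the number of words, which is worth keeping in mind for longer queries.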

Document organisation:
It is better to use a document-based approach with:
- location
- activity
- other things needed!
Location
You should store your location like this:
Country:state:city:suburb... so that you can search for usa:new york:new york*
or ::new york
There is no need for underscores; avoid them.
Activity
The activity should be stored in a separate field, for search precision and speed.
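A small sketch of how such a location value might be built before indexing is shown below. The function name, the lowercasing, and the handling of missing levels are assumptions for illustration, chosen so that the output matches the two example forms above.

```python
def location_key(country="", state="", city="", suburb=""):
    """Join location levels as country:state:city:suburb.

    Missing leading levels stay empty (giving e.g. "::new york");
    missing trailing levels are dropped.
    """
    parts = [country, state, city, suburb]
    while parts and not parts[-1]:
        parts.pop()  # drop empty levels at the end
    return ":".join(part.strip().lower() for part in parts)
```

Appending `*` to the result would give the prefix form used in the answer's example.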

Solr: Searching a term in multiple, indexed fields and returning top 'N' hits from each search field

I have two indexed fields in my Solr schema:
Employee Name
Manager Name
Both are plain strings.
My question is: given a search term, I want to display the top 5 suggested completions from Manager Names and the next 5 from Employee Names.
I can use copy fields, but sometimes all of the top 10 results come from Employee Names.
I have a hunch that boosting can help me, but I could not figure out how.

Boosting can't help you control the results so that each field gets 5 of the top 10.
You could look at Field Collapsing, where you can group per role (Manager and Employee) and limit each group to 5 results.
You would then get 2 groups back with 5 results each.
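One possible way to express this, sketched in Python, is with Solr's `group.query` parameter, which is a variant of result grouping that defines each group by a query rather than by a shared role field. The field names `manager_name` and `employee_name` are hypothetical, and the term is assumed to be a single plain word that needs no escaping.

```python
from urllib.parse import urlencode


def build_suggest_query(term, per_group=5):
    """Build a query string returning up to `per_group` matches per name field."""
    params = [
        ("q", "*:*"),
        ("group", "true"),
        # One group per field; each group holds documents whose name starts with the term.
        ("group.query", f"manager_name:{term}*"),
        ("group.query", f"employee_name:{term}*"),
        ("group.limit", per_group),  # documents returned inside each group
    ]
    return urlencode(params)
```

The response would then contain two groups, which the client can render in whichever order it wants, managers first and employees second.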
