How to return the count of a field with each object in Solr
When I query with fq=verify_ix:1 I get the response below. I also want the count of documents where verify_ix = 1 to be included in the response. How can I do that?
"response": {
"numFound": 9484,
"start": 0,
"maxScore": 1,
"docs": [
{
"id": "10000000000965509",
"description_s": "No Description",
"recommendation_ix": 0,
"sId_lx": 30005938,
"sType_sx": "P",
"condition_ix": 1000,
"verify_ix": 1
},
.
.
.
{
"id": "10000000000965734",
"description_s": "No Description",
"recommendation_ix": 1,
"sId_lx": 30005947,
"sType_sx": "P",
"condition_ix": 2000,
"verify_ix": 1
}
]}
If you want counts of the different values for a given field, you can send a request to Solr with facet=true and facet.field=verify_ix. For counts over all records, set q=*:*. If you don't want to see any rows returned, you can set rows=0.
See here for more details on faceting:
https://cwiki.apache.org/confluence/display/solr/Faceting
(I tested this with Solr 5, but faceting should work with Solr 4 as well.)
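As a rough illustration of the request described above, here is a small Python sketch that builds the query string and reads the counts out of a response. The helper names and the sample response are made up for illustration (the count for "0" is invented; 9484 is the numFound from the question). It assumes Solr's default JSON layout, where facet counts come back as a flat list of alternating values and counts.

```python
from urllib.parse import urlencode


def build_facet_query(field, filter_query=None, rows=0):
    """Build a query string that asks Solr for value counts of `field`."""
    params = [
        ("q", "*:*"),          # count over all records
        ("rows", rows),        # rows=0 suppresses the documents themselves
        ("facet", "true"),
        ("facet.field", field),
        ("wt", "json"),
    ]
    if filter_query:
        params.append(("fq", filter_query))
    return urlencode(params)


def facet_counts(response, field):
    """Turn the flat [value, count, value, count, ...] list into a dict."""
    flat = response["facet_counts"]["facet_fields"][field]
    return dict(zip(flat[0::2], flat[1::2]))


# Hypothetical, trimmed-down response used only to exercise the helper.
sample_response = {
    "facet_counts": {"facet_fields": {"verify_ix": ["1", 9484, "0", 516]}}
}

print(build_facet_query("verify_ix"))
print(facet_counts(sample_response, "verify_ix"))
```

Passing filter_query="verify_ix:1" keeps the original filter in place, so the documents and the facet counts arrive in the same response.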
Related
I'm new to Neo4j, and want to implement a service that makes use of it.
I've read the docs and searched for it, however I still didn't get an answer to this simple question:
How do I specify which database to query in a Neo4j query?
E.g. I connected to bolt://localhost:7687, and have three databases in there: system, neo4j, and mydb. The neo4j database is the standard.
When I open the Neo4j browser and do a query such as MATCH (n) RETURN n, it automatically assumes that I want to query the standard DB which is called neo4j. However, I want to query another one, mydb.
When I run the aforementioned query, the output is:
{
"query": {
"text": "match (n) return n",
"parameters": {}
},
"queryType": "r",
"counters": {
"_stats": {
"nodesCreated": 0,
"nodesDeleted": 0,
"relationshipsCreated": 0,
"relationshipsDeleted": 0,
"propertiesSet": 0,
"labelsAdded": 0,
"labelsRemoved": 0,
"indexesAdded": 0,
"indexesRemoved": 0,
"constraintsAdded": 0,
"constraintsRemoved": 0
},
"_systemUpdates": 0
},
"updateStatistics": {
"_stats": {
"nodesCreated": 0,
"nodesDeleted": 0,
"relationshipsCreated": 0,
"relationshipsDeleted": 0,
"propertiesSet": 0,
"labelsAdded": 0,
"labelsRemoved": 0,
"indexesAdded": 0,
"indexesRemoved": 0,
"constraintsAdded": 0,
"constraintsRemoved": 0
},
"_systemUpdates": 0
},
"plan": false,
"profile": false,
"notifications": [],
"server": {
"address": "localhost:7687",
"version": "Neo4j/4.4.5",
"agent": "Neo4j/4.4.5",
"protocolVersion": 4.4
},
"resultConsumedAfter": {
"low": 2,
"high": 0
},
"resultAvailableAfter": {
"low": 8,
"high": 0
},
"database": {
"name": "neo4j"
}
}
The last JSON value shows that the query was executed on the neo4j database.
What do I have to add to my queries to instead query another database in the same DBMS?
You can change/specify the database using the following options.
From the Neo4j Browser, you can select the database in the sidebar.
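In the Neo4j Browser or cypher-shell, the :use command switches the current database:
:use mydb
In Cypher itself, a query can start with a USE clause to target a database, for example USE mydb MATCH (n) RETURN n.
If you connect to Neo4j through an application driver, you can specify the database when creating the session object.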
For example, if you are using the Python driver:
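from neo4j import GraphDatabase
# uri, user, and password are placeholders for your own connection details
driver = GraphDatabase.driver(uri, auth=(user, password))
session = driver.session(database="mydb")  # queries in this session run against mydb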
You can also set the default database system-wide by changing the dbms.default_database setting in the neo4j.conf file.
I'm retrieving multiple appointments via AppointmentCalendar.FindAppointmentsAsync. I'm evaluating the Recurrence.RecurrenceType and noticed an unexpected value of 1 for master appointments of a series. I expect the Recurrence.RecurrenceType to be 0 (Master) but instead it is 1 (Instance).
(Note: I added AppointmentProperties.Recurrence to the FindAppointmentsOptions.FetchProperties that are passed to FindAppointmentsAsync, so the Recurrence data should be fetched properly.)
To double check I retrieved the respective master appointment via GetAppointmentAsync (instead of FindAppointmentsAsync) using its LocalId - and here the RecurrenceType is correctly set to 0.
Here is demo output for a test appointment series:
Data returned by FindAppointmentsAsync (Instance??):
"Recurrence": {
"Unit": 0,
"Occurrences": 16,
"Month": 1,
"Interval": 1,
"DaysOfWeek": 0,
"Day": 1,
"WeekOfMonth": 0,
"Until": "2016-09-29T02:00:00+02:00",
"TimeZone": "Europe/Budapest",
"RecurrenceType": 1,
"CalendarIdentifier": "GregorianCalendar"
},
"StartTime": "2016-09-14T19:00:00+02:00",
"OriginalStartTime": "2016-09-14T19:00:00+02:00",
Data returned by GetAppointmentAsync for the same appointment (Master):
"Recurrence": {
"Unit": 0,
"Occurrences": 16,
"Month": 1,
"Interval": 1,
"DaysOfWeek": 0,
"Day": 1,
"WeekOfMonth": 0,
"Until": "2016-09-29T02:00:00+02:00",
"TimeZone": "Europe/Budapest",
"RecurrenceType": 0,
"CalendarIdentifier": "GregorianCalendar"
},
"StartTime": "2016-09-14T19:00:00+02:00",
"OriginalStartTime": null,
Notice the difference in RecurrenceType. Also note that OriginalStartTime is null for the master returned by GetAppointmentAsync but has a value for the appointment returned by FindAppointmentsAsync.
You can also see that the StartTime of the master appointment equals the start time of the alleged Instance (which in reality is the master).
Shouldn't FindAppointmentsAsync return a master as the first element of a series, instead of an instance?
(SDK: 10.0.14393.0, Anniversary)
Code to explicitly find such a master/instance situation for a given calendar:
var appointmentsCurrent = await calendar.FindAppointmentsAsync(DateTimeOffset.Now, TimeSpan.FromDays(365), findAppointmentOptions);
foreach (var a in appointmentsCurrent)
{
    var a2 = await calendar.GetAppointmentAsync(a.LocalId);
    if (a2.Recurrence?.RecurrenceType == RecurrenceType.Master &&
        a2.StartTime == a.StartTime &&
        a.Recurrence?.RecurrenceType == RecurrenceType.Instance &&
        a.OriginalStartTime == a2.StartTime)
    {
        Debug.WriteLine("Gotcha!");
    }
}
I tested the above code on my side. If you get the count of the appointments returned by FindAppointmentsAsync with var count = appointmentsCurrent.Count;, you will find that it is the count of appointment instances, not the count of master appointments. So FindAppointmentsAsync returns all instances of the appointments, not the master appointments. This is why the RecurrenceType is Instance.
It seems we can still get a master appointment with GetAppointmentAsync, as you mentioned above, so I suppose this does not block you.
If you think this is not a good design for this API, or you need an API for finding all the master appointments in a calendar, you can submit your ideas through the Windows 10 feedback tool or the UserVoice site.
I have a query string with 5 words, for example "cat dog fish bird animals".
I need to know how many matches each word has.
At this point I create 5 queries:
/q=name:cat&rows=0&facet=true
/q=name:dog&rows=0&facet=true
/q=name:fish&rows=0&facet=true
/q=name:bird&rows=0&facet=true
/q=name:animals&rows=0&facet=true
and get the match count for each word from its own query.
But this method takes too much time.
So is there a way to get the match count of each word with one query?
Any help appreciated!
In this case, function queries are your friends. In particular:
termfreq(field,term) returns the number of times the term appears in the field for that document. Example syntax:
termfreq(text,'memory')
totaltermfreq(field,term) returns the number of times the term appears in the field in the entire index. ttf is an alias of totaltermfreq. Example syntax:
ttf(text,'memory')
The following query for instance:
q=*%3A*&fl=cntOnSummary%3Atermfreq(summary%2C%27hello%27)+cntOnTitle%3Atermfreq(title%2C%27entry%27)+cntOnSource%3Atermfreq(source%2C%27activities%27)&wt=json&indent=true
returns the following results:
"docs": [
{
"id": [
"id-1"
],
"source": [
"activities",
"activities"
],
"title": "Ajones3 Activity Entry 1",
"summary": "hello hello",
"cntOnSummary": 2,
"cntOnTitle": 1,
"cntOnSource": 1,
"score": 1
},
{
"id": [
"id-2"
],
"source": [
"activities",
"activities"
],
"title": "Common activity",
"cntOnSummary": 0,
"cntOnTitle": 0,
"cntOnSource": 1,
"score": 1
}
]
Please note that while this works well on single-valued fields, it seems that for multivalued fields the functions consider only the first entry. For instance, in the example above, termfreq(source%2C%27activities%27) returns 1 instead of 2.
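To tie this back to the five-word question, here is a minimal Python sketch that builds one request containing a ttf pseudo-field per word. The helper name and the cnt_ aliases are hypothetical; the field name and the words are the ones from the question. Keep in mind that ttf counts term occurrences across the index, which is not necessarily the same as the number of matching documents.

```python
from urllib.parse import urlencode


def build_ttf_query(field, words):
    """Return a Solr query string with one ttf(field, word) pseudo-field per word."""
    # One aliased pseudo-field per word, e.g. cnt_cat:ttf(name,'cat')
    pseudo_fields = [f"cnt_{word}:ttf({field},'{word}')" for word in words]
    params = {
        "q": "*:*",
        "rows": 1,  # ttf is index-wide, so one document is enough to read the values
        "fl": " ".join(pseudo_fields),  # space-separated, like the example query above
        "wt": "json",
    }
    return urlencode(params)


query_string = build_ttf_query("name", ["cat", "dog", "fish", "bird", "animals"])
print(query_string)
```

The resulting string can be appended to the select handler URL, replacing the five separate requests with one.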
I am issuing the following query:
"responseHeader": {
"status": 0,
"QTime": 1,
"params": {
"q": "(test)",
"defType": "edismax",
"indent": "true",
"fl": "distributor_status,QOH_estimate,id,score",
"start": "0",
"sort": "score desc,id desc",
"fq": "(QOH_estimate:[1 TO *])+OR+(distributor_status:stock)+OR+(*:* -distributor_status:VENDORDISC)",
"rows": "10",
"wt": "json",
"_": "1446833368873"
}
}
I am getting back documents like the following:
{ "id": "5445a000e4b0fb20ffca4aba",
"QOH_estimate": 0,
"distributor_status": "VENDORDISC",
"score": 4.48295
}
How does this document get past the fq?
Its QOH_estimate is 0, so it fails QOH_estimate:[1 TO *]. Its distributor_status is VENDORDISC, so it fails distributor_status:stock, and I would also expect it to fail (*:* -distributor_status:VENDORDISC). Since it fails all 3 parts of the disjunctive query, I would expect it to be eliminated, yet it is not. Why?
I think the spaces between your clauses are being double-escaped. Otherwise, why would there be +OR+ in that output when the other spaces are fine?
If that does not help, try adding the debug flag (debugQuery=true) and see how it all gets parsed at the Lucene level. That should give a hint about the final expansion.
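The double-escaping suspicion can be illustrated with a short Python sketch. It only models URL encoding and decoding with the standard library, not Solr itself, and the variable names are made up. A filter written with plain spaces survives one encode/decode round trip unchanged, while a filter that already contains + for spaces gets those encoded again and arrives with literal + characters.

```python
from urllib.parse import urlencode, parse_qs

# The filter as it is meant to be read by the query parser.
intended = (
    "(QOH_estimate:[1 TO *]) OR (distributor_status:stock) "
    "OR (*:* -distributor_status:VENDORDISC)"
)
# The same filter with the spaces around OR already turned into '+' by hand.
pre_escaped = intended.replace(") OR (", ")+OR+(")

# Encode for the URL, then decode the way a server would.
received_once = parse_qs(urlencode({"fq": intended}))["fq"][0]
received_twice = parse_qs(urlencode({"fq": pre_escaped}))["fq"][0]

print(received_once)   # spaces are restored
print(received_twice)  # literal '+' characters remain around OR
```

If the echoed params show +OR+, as in the response header above, the parser most likely saw the second form.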
I have solr documents with two fields, one is a string and one is an integer. Both fields are allowed to be null. I am attempting to write a query that will eliminate documents with the following properties:
textField = "badValue" AND (numberField is null OR numberField = 0)
I added the following fq:
((NOT textField=badValue) OR numberField=[1 TO *])
This does not seem to have worked properly, because I am getting a document with textField = badValue and numberField = 0. What did I do wrong with my fq?
The full response header, including the query parameters, is:
"responseHeader": {
"status": 0,
"QTime": 245,
"params": {
"q": "(numi) AND (solr_specs:[* TO *] OR full_description:[* TO *])",
"defType": "edismax",
"bf": "log(sum(popularity,1))",
"indent": "true",
"qf": "categories^3.0 manufacturer^1.0 sku^0.2 split_sku^0.2 upc^1.0 invoice_description^2.6 full_description solr_specs^0.8 solr_spec_values^1.7 legacyid legacy_altcode id",
"fl": "distributor_status,QOH_estimate,id,score",
"start": "0",
"fq": "((*:* NOT distributor_status=VENDORDISC) OR QOH_estimate=[1 TO *])",
"sort": "score desc,id desc",
"rows": "20",
"wt": "json",
"_": "1441220051438"
}
}
QOH_estimate is numberField and distributor_status is textField.
Please try the following in your fq parameter:
((*:* NOT textField:badValue) OR numberField:[1 TO *])
With your actual field names:
((*:* NOT distributor_status:VENDORDISC) OR QOH_estimate:[1 TO *])
Note that field clauses use a colon, not an equals sign. Here you first select all documents that do not contain textField:badValue (the *:* gives NOT a full set to subtract from) and then OR them with the documents matching numberField:[1 TO *].
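To make the intended logic concrete, here is a plain Python model of that filter. It does not talk to Solr; the sample documents and the function name are invented purely to show which combinations should be kept.

```python
def passes_filter(doc):
    """Mirror ((*:* NOT textField:badValue) OR numberField:[1 TO *]) in plain Python."""
    text_ok = doc.get("textField") != "badValue"       # *:* NOT textField:badValue
    number = doc.get("numberField")
    number_ok = number is not None and number >= 1     # numberField:[1 TO *]
    return text_ok or number_ok


# Hypothetical documents covering the null, zero, and positive cases.
docs = [
    {"id": "a", "textField": "badValue", "numberField": 0},
    {"id": "b", "textField": "badValue"},                    # numberField is null
    {"id": "c", "textField": "badValue", "numberField": 5},
    {"id": "d", "textField": "ok", "numberField": 0},
    {"id": "e"},                                             # both fields null
]

kept = [doc["id"] for doc in docs if passes_filter(doc)]
print(kept)
```

Documents a and b, which have badValue together with a zero or missing number, are the only ones dropped, which matches the condition stated in the question.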