Is it possible to perform a search with Solr within a subset of data? I am using Solr combined with Tokyo Tyrant in Python.
I read "Searching within a subset of data - Solr", but I do not think it fits my problem because I am not using Solr.NET.
What I want is to:
1. find the elements of the data set with code = 'xxxx', and
2. perform the search within a subset of the data: documents whose id is in a given list, or whose id starts with 'yy'.
Item 1 is not a problem, but I do not know how to do item 2.
Thanks for your help.
Does a query like this work for you? q=id:(id1 OR id2 OR id3) OR id:yy*
You can use id:(id1 OR id2 OR id3) to match a list of ids in the id field, and id:yy* as a prefix query to match ids starting with yy.
If you have a server running, there is an administration panel where you can try out requests.
But all you need to do to run a query is send an HTTP request with the right parameters; the basics are explained here:
http://wiki.apache.org/solr/SolrQuerySyntax
http://wiki.apache.org/solr/CommonQueryParameters
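The two answers above can be combined into a small Python sketch that builds the request parameters. It puts the code condition in q and the id restriction in a filter query (fq), which is one reasonable choice rather than the only one. The function name build_subset_query is hypothetical, and the sketch assumes the ids contain no characters that need escaping in Solr query syntax.

```python
from urllib.parse import urlencode


def build_subset_query(code, ids=(), id_prefix=None):
    """Return URL-encoded Solr parameters that search for `code`
    only within documents whose id is listed or starts with a prefix."""
    clauses = []
    if ids:
        # id:(id1 OR id2 OR id3) matches any of the listed ids
        clauses.append("id:(" + " OR ".join(ids) + ")")
    if id_prefix:
        # id:yy* is a prefix query
        clauses.append("id:" + id_prefix + "*")
    params = [("q", "code:" + code)]
    if clauses:
        # fq restricts the result set without changing relevance scores
        params.append(("fq", " OR ".join(clauses)))
    return urlencode(params)


# Append the result to your Solr select URL after a "?"
print(build_subset_query("xxxx", ids=["id1", "id2", "id3"], id_prefix="yy"))
```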
In my Flutter/Dart app, I am using FlutterFire to sync data with my Firestore database.
I have a list of document IDs (all from the same collection) that I want to retrieve from Firestore.
I am trying to do the following:
await FirebaseFirestore.instance.collection('groups').doc([id1, id2, id3, id4]).get();
But ".doc()" won't accept a list as a parameter for the document ID to look for.
I tried with the "where" method like this:
await FirebaseFirestore.instance.collection('groups').where("id", whereIn: [id1, id2, id3, id4]).get();
But it won't work since the "id" isn't a field in the documents (it is their name/identifier).
The only two solutions I can think about are:
Doing one request per document I want to retrieve (not ideal at all)
Adding an "id" field to the documents that stores the same information as their name/identifier (I don't like duplicating data that can end up mismatched)
The third one (ideal IMO) would be to enable .doc to accept lists/arrays, so you can search for a list of documents you know the identifier of.
Am I missing something? Does anybody know any other solution?
THANKS!
(asked the same question here)
I found a solution here!
So adapting it to my case it looks like this:
await FirebaseFirestore.instance.collection('groups').where(FieldPath.documentId, whereIn: [id1, id2, id3, id4]).get();
Be aware that there is a limit of 10 to the number of values/IDs you can use in the search.
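If you have more IDs than the limit allows, one way around it is to split the list into batches and run one whereIn query per batch. The batching itself is language-agnostic; here is a minimal sketch in Python (the helper name chunk_ids is hypothetical, and the same slicing loop carries over to Dart):

```python
def chunk_ids(ids, size=10):
    """Split `ids` into consecutive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [ids[i:i + size] for i in range(0, len(ids), size)]


# 23 ids become batches of 10, 10 and 3; each batch would feed one query
batches = chunk_ids(["id%d" % n for n in range(23)])
print([len(batch) for batch in batches])  # prints [10, 10, 3]
```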
I have Solr documents that can have 3 possible states (state_s in {new, updated, lost}). These documents have a field named ip_s. These documents also have a field nlink_i that can be equal to 0.
What I want to know is how many new ip_s values I have, where a new IP is one that belongs to a document with state_s = "new" and does not appear in any document with state_s = "updated" or state_s = "lost".
Using Solr facet search I found a solution using the following query parameters:
q=state_s:"lost"+OR+state_s:"updated"
facet=true&facet.field=ip_s&facet.limit=-1
Basically, all ip in
"facet_fields":{
"ip_s":[
"105.25.12.114",1,
"105.25.15.114",1,
"114.28.65.76",0,
...]
with 0 occurrences (e.g. 114.28.65.76) are "new IPs".
Q1: Is there a better way to do this search? With the facet query described above I still need to read the list of ip_s values and count all IPs with occurrence = 0.
Q2: If I want to do the same search (i.e. get the new IPs) but consider only documents where nlink_i > 0, how can I do it? If I add a filter fq=nlink_i:[1 TO *], all IPs appearing in documents with nlink_i=0 will also have their number of occurrences set to 0, so I cannot apply the solution described above to get the new IPs.
Q1: To avoid the 0 count facets, you can use facet.mincount=1.
Q2: I think the solution above should also answer Q2?
As an alternative to facets, you can use Solr's grouping functionality. The aggregation of values for your Q1 does not get much nicer, but at least Q2 works as well. It would look something like:
select?q=*:*&group=true&group.field=ip_s&group.sort=state_s asc&group.limit=1
For your programmatic aggregation logic to work, you would have to change the state_s value for new entries to something that sorts first in ascending order. Then you would count all groups whose first entry is a document in the "new" state. The same logic still works if you add an fq parameter to address Q2.
I found another solution using facet.pivot that works for Q1 and Q2:
http://localhost:8983/solr/collection1/query?q=nbLink_i:[1%20TO%20*]&updated&facet=true&facet.pivot=ip_s,state_s&facet.limit=-1&rows=0
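The counting still has to happen on the client side. As a minimal sketch, assuming the JSON response has been parsed and the pivot list was taken from response["facet_counts"]["facet_pivot"]["ip_s,state_s"], a new IP is one whose only state in the pivot is "new". The function name count_new_ips and the sample data are made up for illustration.

```python
def count_new_ips(pivot_entries):
    """Count ip_s values whose only state_s in the pivot is "new"."""
    new_ips = 0
    for entry in pivot_entries:
        # Collect the states that actually occur for this IP
        states = {
            child["value"]
            for child in entry.get("pivot", [])
            if child["count"] > 0
        }
        if states == {"new"}:
            new_ips += 1
    return new_ips


# Hypothetical pivot data in the shape Solr returns for facet.pivot
sample = [
    {"field": "ip_s", "value": "105.25.12.114", "count": 2, "pivot": [
        {"field": "state_s", "value": "new", "count": 1},
        {"field": "state_s", "value": "updated", "count": 1},
    ]},
    {"field": "ip_s", "value": "114.28.65.76", "count": 1, "pivot": [
        {"field": "state_s", "value": "new", "count": 1},
    ]},
]
print(count_new_ips(sample))  # prints 1
```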
I need to search by DocId because I have files in Drive that I am also searching, and need to merge the results. I also need to limit the results by other fields. I tried this query:
INFO: Searching with query: DocId:(4842249208725504 5405199162146816 5510752278413312 5581121022590976 5827411627212800)
However it found 0 results even though they exist. I also tried doc_id and id.
log.info("Searching with query: " + q);
try {
Results<ScoredDocument> results = getIndex().search(q);
I will also need to filter by other fields, ex:
DocId:(123456789) year:(2012)
The other fields work during searching, but not DocId. In the Admin interface, it shows DocId as being one of the fields! http://localhost:8888/_ah/admin/search?subsection=searchIndex...
Give each document an atom field named docId and store the doc ID in that field. Then you can search as normal (as you suggested).
Here is a quote from the documentation:
While it is convenient to create readable, meaningful unique document
identifiers, you cannot include the doc_id in a search. Consider this
scenario: You have an index with documents that represent parts, using
the part's serial number as the doc_id. It will be very efficient to
retrieve the document for any single part, but it will be impossible
to search for a range of serial numbers along with other field values,
such as purchase date. Storing the serial number in an atom field
solves the problem.
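Once the ID is stored in an atom field, the query string can be built like any other field query. The sketch below is plain Python string handling and does not call the Search API itself. The helper name build_search_query is hypothetical, the field name docId follows the suggestion above, and it assumes (as I understand the query language) that space-separated terms are ANDed together, so alternatives inside the parentheses need an explicit OR.

```python
def build_search_query(doc_ids, **filters):
    """Build a query string matching any of `doc_ids` in the docId
    atom field, further restricted by the extra field filters."""
    parts = []
    if doc_ids:
        # Explicit OR: a document holds one docId, so AND would match nothing
        parts.append("docId:(" + " OR ".join(str(d) for d in doc_ids) + ")")
    for field, value in sorted(filters.items()):
        parts.append("{}:({})".format(field, value))
    # Top-level terms separated by a space must all match
    return " ".join(parts)


q = build_search_query([4842249208725504, 5405199162146816], year=2012)
print(q)  # prints docId:(4842249208725504 OR 5405199162146816) year:(2012)
```

The resulting string would then be passed to the index's search call in place of the original DocId query.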
If you know the doc ID in advance, rather than searching for it, why not just get it directly?
doc = index.get("AZ125")
https://developers.google.com/appengine/docs/python/search/#Python_Retrieving_documents_by_doc_ids
I need to implement some additional functionality; a picture of it is attached below. I have already built an application based on Solr search.
In a few words: the drop-down will contain similar search phrases within a specific category, along with the number of items found.
How can I make Solr collect this data, and how do I retrieve it?
Yes, you can do that in Solr using facets, which allow grouping results. The default behaviour of facets is to return the group name and the number of items found. You do that by adding these two parameters to your query string: facet=true and facet.field=category.
An example query in your case would be:
http://localhost:8983/solr/NAME_OF_YOUR_INDEX/select/?wt=json&indent=on&q=ipo&fl=category,name&facet=true&facet.field=category
Take a look at the tutorial for more details.
This is roughly equivalent to doing this in SQL:
SELECT category, COUNT(*) FROM items WHERE text LIKE "%ipo%" GROUP BY category;
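With wt=json, Solr's default output for a facet field is a flat list that alternates value and count, so the client has to pair them up. A minimal Python sketch, assuming the list was taken from response["facet_counts"]["facet_fields"]["category"] (the helper name and the sample values are made up):

```python
def facet_counts(flat):
    """Turn Solr's flat [value, count, value, count, ...] facet list
    into a {value: count} dictionary."""
    return dict(zip(flat[::2], flat[1::2]))


# Hypothetical facet list for the category field
print(facet_counts(["finance", 12, "news", 3]))  # prints {'finance': 12, 'news': 3}
```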
I'm unclear on this point from the documentation. Is it possible to give Solr X document IDs and tell it that I want documents similar to those?
Example:
The user is browsing 5 different articles
I send Solr the IDs of these 5 articles so I can present the user other similar articles
I am not clear about sending the document IDs, nor whether MoreLikeThis can operate on multiple documents as in this example.
You can try passing multiple IDs with the query q=id:(document_id1 OR document_id2 OR document_id3), e.g.:
http://localhost:8080/solr/select/?qt=mlt&q=id:(document_id1 OR document_id2 OR document_id3)&mlt.fl=[field1],[field2],[field3]&fl=id&rows=10
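The spaces and parentheses in that URL need to be URL-encoded when the request is sent from code. A small Python sketch that builds the same request with the standard library; the function name build_mlt_url and the field names in the usage line are placeholders, and the parameters are the ones from the example above:

```python
from urllib.parse import urlencode


def build_mlt_url(base_url, doc_ids, mlt_fields, rows=10):
    """Build a MoreLikeThis request URL seeded with several document ids."""
    params = {
        "qt": "mlt",
        # Seed documents: any document whose id is in the list
        "q": "id:(" + " OR ".join(doc_ids) + ")",
        # Fields used to compute similarity
        "mlt.fl": ",".join(mlt_fields),
        "fl": "id",
        "rows": rows,
    }
    return base_url + "?" + urlencode(params)


print(build_mlt_url(
    "http://localhost:8080/solr/select/",
    ["document_id1", "document_id2", "document_id3"],
    ["field1", "field2", "field3"],
))
```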