Querying Solr: get request handler returns a result but select handler does not

I have an id field in Solr that uniquely identifies a Solr document.
When querying Solr using the get handler:
solr/{collection}/get?id=p_1266762970&fl=*
Result:
"doc":
{
"lastIndexed":"2014-12-25T09:48:56.509Z",
"id":"1266762970",
"solrId":"p_1266762970",
.....
}
But when querying through the Solr admin UI with the select handler, no documents are returned.
The Solr queries look like:
solr/{collection}/select?q=*:*&fq=solrId:p_1266762970
solr/{collection}/select?q=*:*&fq=id:1266762970
I tried a hard commit, which completed successfully, but the results are still the same.
Other documents in Solr return correct results. The issue affects only some of the ids (8 out of 2.3 million).
Update: the uniqueKey is
<uniqueKey>solrId</uniqueKey>

Related

Solr schemaless mode query to match any word in a text_general field type

I'm trying to run a search query against Solr in schemaless mode, but a matching document is not found in the results.
q: i m from india
numResult: 0
q: parapfa551d3aef764ddca9e6e421fe8d50e8:i m from india
numResult: 1
My data set (600+ documents) is in Solr, and all document fields were added dynamically using Solr schemaless mode. See my Solr documents below.
The first query is the one I ran in Solr schemaless mode, but numResult is zero. [The query works in Solr standalone mode, but there the dynamic fields are not added.]
How can I find the best-matching document in Solr schemaless mode?
{
"id":"d9263e11",
"titleh4cd06d47basdsa6d14ed8838a123":["User _ name"],
"parapfa551d3aef764ddca9e6e421fe8d50e8":[" My name is XYZ "],
"parapffe001011d4346ad9ce9edb67b3b85e4":[" i m from USA ...."],
"_version_":1748577992052834304},
{
"id":"d9263e20",
"titleh4cd06d47b6d14ed883842ae4cedab224":[" User_name "],
"parap759981766b644e229bda2b0cc5bd0bd9":[" my name is ...."],
"parapfa551d3aef764ddca9e6e421fe8d50e8":[" i m from INDIA"],
"_version_":1748577992544616448},
{
"id":"d9263e45",
"titlehdd4a37c0b21e4d9bbd03a56ba0120f01":[" User_name"],
"parapa2aa4798c7fc44aab5e4f6447c529f83":["my name is .... "],
"parap8ee9090e8e054d78b8dc7ff06a7fb702":[" i m from Germany"],
"_version_":13204902384923489909}
I'm trying to find the best-matching document in Solr schemaless mode.
Your first query
q: i m from india
does not specify a field to search on, so Solr falls back to a default field (usually _text_). I suspect your index is not populating this default field, and hence there is no match.
Your second query
q: parapfa551d3aef764ddca9e6e421fe8d50e8:i m from india
is searching for the string in the parapfa551d3aef764ddca9e6e421fe8d50e8 field, and in this case the match is found.
You can use Solr's debugQuery parameter to see how Solr handles each of these searches on your particular configuration.
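As a rough sketch of the two points above, the snippet below builds a select URL that names the search field explicitly through the `df` parameter and turns on `debugQuery`. The host, port, and collection name `mycollection` are placeholders that do not come from the question, and `build_select_url` is a hypothetical helper; only `q`, `df`, and `debugQuery` are Solr request parameters.

```python
from urllib.parse import urlencode

# Placeholder base URL (assumption): replace host and collection with your own.
SOLR_SELECT = "http://localhost:8983/solr/mycollection/select"


def build_select_url(query, default_field=None, debug=False):
    """Return a /select URL for `query`, optionally setting df and debugQuery."""
    params = {"q": query}
    if default_field is not None:
        # df tells Solr which field to search when the query names none.
        params["df"] = default_field
    if debug:
        # debugQuery adds the parsed query and scoring details to the response.
        params["debugQuery"] = "true"
    return SOLR_SELECT + "?" + urlencode(params)


url = build_select_url(
    "i m from india",
    default_field="parapfa551d3aef764ddca9e6e421fe8d50e8",
    debug=True,
)
print(url)
```

Comparing the debug output of this request with the same request without `df` should show which field the unqualified query is actually being run against.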

Solr get latest updated / added documents

I have a Solr collection with the following fields: event_time, some_field1, some_field2. event_time is a long. Is there a filter query that returns only the documents with the maximum event_time? There could be more than one such document.

Solr field UUID is not unique

I'm running Solr 5.1.0 on our Plone 4.2.6 system. In my schema.xml I set UID to be the uniqueKey. The system holds approximately 82,000 documents. After building an index, I found the following counts of indexed docs:
numDocs: 54537
maxDocs: 82561
deletedDocs: 28024
Furthermore, under 'core-specific-tools > Schema Browser > Load Term Info > Histogram' I found that for the field UID there are 26,514 values contained in only one document, but also 28,022 values contained in 2 documents. That many indexed documents were therefore marked as deleted.
Can anyone tell me why so many UIDs could be the same although they should be distinct?

How to retrieve Solr documents having at least 2 numFound with grouping

I'm working with Solr and its grouping (field collapsing) feature.
The request I want to make is the following: get only documents that have the "rating" field, and only groups with at least 2 documents in the result[#name='doclist'] node.
The request:
/select/?q=rating:[* TO *]&version=2.2&start=0&rows=10&indent=on&group=true&group.field=HashKey&group.ngroups=true
Is there a way to do that?

How to index a URL in Solr so I can boost results by website

I have thousands of documents indexed in Solr, representing data crawled from different websites. One of the fields of a document is SourceURL, which contains the URL of the webpage that was crawled and indexed into this document.
I want to boost results from a specific website using a boost query.
For example, I have 4 documents whose SourceURL fields contain the following:
https://meta.stackoverflow.com/page1
http://www.stackoverflow.com/page2
https://stackoverflow.com/page3
https://stackexchange.com/page1
I want to boost all results that are from stackoverflow.com but not from its subdomains (in this case, results 2 and 3).
How can I index the URL field and then use a boost query to identify all the documents from a specific website, as in the case above?
One way would be to parse the URL before index time and record whether it belongs to the primary domain (for example, in a primarydomain boolean field in your schema.xml file).
Then you can boost on the primarydomain field at query time. See the DisMaxQParserPlugin page on the Solr Wiki for an example of how to boost fields at query time.
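A minimal sketch of that idea, assuming "primary domain" means the host is exactly stackoverflow.com or www.stackoverflow.com (so that results 2 and 3 from the question qualify). `is_primary_domain` is a hypothetical helper, and the boost factor of 10 and the query term are arbitrary illustrative values; `primarydomain` is the boolean field suggested above, and `bq` is the DisMax boost query parameter.

```python
from urllib.parse import urlparse


def is_primary_domain(source_url, domain="stackoverflow.com"):
    """Return True when the URL's host is the domain itself or its www. alias."""
    host = (urlparse(source_url).hostname or "").lower()
    return host == domain or host == "www." + domain


urls = [
    "https://meta.stackoverflow.com/page1",
    "http://www.stackoverflow.com/page2",
    "https://stackoverflow.com/page3",
    "https://stackexchange.com/page1",
]

# Computed before indexing and stored in the primarydomain boolean field.
flags = [is_primary_domain(u) for u in urls]
print(flags)

# Query-time parameters (illustrative values): bq boosts documents whose
# primarydomain field is true without excluding the others.
params = {
    "defType": "dismax",
    "q": "solr",
    "bq": "primarydomain:true^10",
}
```

Doing the host comparison at index time keeps the query simple: the boost only has to match a boolean field rather than pattern-match on the raw URL.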
