"There is no index available for this selector" despite the fact I made one - cloudant

In my data, I have two fields that I want to use as an index together. They are sensorid (any string) and timestamp (yyyy-mm-dd hh:mm:ss).
So I made an index for these two using the Cloudant index generator. This was created successfully and it appears as a design document.
{
"index": {
"fields": [
{
"name": "sensorid",
"type": "string"
},
{
"name": "timestamp",
"type": "string"
}
]
},
"type": "text"
}
However, when I try to make the following query to find all documents with a timestamp newer than some value, I am told there is no index available for the selector:
{
"selector": {
"timestamp": {
"$gt": "2015-10-13 16:00:00"
}
},
"fields": [
"_id",
"_rev"
],
"sort": [
{
"_id": "asc"
}
]
}
What have I done wrong?

It seems that Cloudant Query only allows sorting on fields that are part of the selector.
Therefore your selector should include the _id field and look like this:
"selector":{
"_id":{
"$gt":0
},
"timestamp":{
"$gt":"2015-10-13 16:00:00"
}
}
I hope this works for you!
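As a rough illustration of the suggestion above, the following sketch adds a catch-all condition for every sort field that is missing from the selector. The helper name addSortFieldsToSelector is hypothetical and not part of any Cloudant library; it only builds the request body and assumes the {"$gt": 0} condition from the answer is acceptable for your data.

```javascript
// Hypothetical helper: make sure every sort field also appears in the selector.
function addSortFieldsToSelector(query) {
  const selector = { ...query.selector };
  for (const entry of query.sort || []) {
    // A sort entry is either a field name or an object such as {"_id": "asc"}.
    const field = typeof entry === "string" ? entry : Object.keys(entry)[0];
    if (!(field in selector)) {
      // {"$gt": 0} is the catch-all condition used in the answer above.
      selector[field] = { $gt: 0 };
    }
  }
  // Return a copy so the original query object is left untouched.
  return { ...query, selector };
}

const query = {
  selector: { timestamp: { $gt: "2015-10-13 16:00:00" } },
  fields: ["_id", "_rev"],
  sort: [{ _id: "asc" }],
};
const fixed = addSortFieldsToSelector(query);
console.log(JSON.stringify(fixed.selector));
```

The resulting object can then be sent as the body of the query request in place of the original one.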

Related

Manipulate field value of copy-field in Apache Solr

I have a simple string "PART_NUMBER" value as a field in Solr. I would like to add an additional field which places that value in a URL field. To do this, I created a new field type, a field, and a copy field:
"add-field-type": {
"name": "endpoint_url",
"class": "solr.TextField",
"positionIncrementGap": "100",
"analyzer": {
"tokenizer": {
"class": "solr.KeywordTokenizerFactory"
},
"filters": [
{
"class": "solr.PatternReplaceFilterFactory",
"pattern": "([\\s\\S]*)",
"replacement": "http://myurl/$1.jpg"
}
]
}
},
"add-field": {
"name": "URL",
"type": "endpoint_url",
"stored": true,
"indexed": true
},
"add-copy-field":{ "source":"PART_NUMBER", "dest":"URL" }
As some of you probably guessed, my query output looks like this:
{
"id": "1",
"PART_NUMBER": "ABCD1234",
"URL": "ABCD1234",
"_version_": 1645658574812086272
}
This is because the endpoint_url field type only modifies the indexed value, not the stored value that is returned. Indeed, when I run the analysis, I get
http://myurl/ABCD1234.jpg
My question: Is there any way to apply a tokenizer or filter and feed it back in to the field value? I would prefer this output when returning the result:
{
"id": "1",
"PART_NUMBER": "ABCD1234",
"URL": "http://myurl/ABCD1234.jpg",
"_version_": 1645658574812086272
}
Is this possible to do in Solr?
The solution was posted here:
Custom Solr analyzers not being used during indexing
I need to use an update request processor in order to change the field value before analysis. The process is described here:
https://lucene.apache.org/solr/guide/8_1/update-request-processors.html

JMESPath query for nested array structures

I have the following data structure as a result of aws logs get-query-results:
{
"status": "Complete",
"statistics": {
"recordsMatched": 2.0,
"recordsScanned": 13281.0,
"bytesScanned": 7526096.0
},
"results": [
[
{
"field": "time",
"value": "2019-01-31T21:53:01.136Z"
},
{
"field": "requestId",
"value": "a9c233f7-0b1b-3326-9b0f-eba428e4572c"
},
{
"field": "logLevel",
"value": "INFO"
},
{
"field": "callerId",
"value": "a9b0f9c2-eb42-3986-33f7-8e450b1b72cf"
}
],
[
{
"field": "time",
"value": "2019-01-25T13:13:01.062Z"
},
{
"field": "requestId",
"value": "a4332628-1b9b-a9c2-0feb-0cd4a3f7cb63"
},
{
"field": "logLevel",
"value": "INFO"
},
{
"field": "callerId",
"value": "a9b0f9c2-eb42-3986-33f7-8e450b1b72cf"
}
]
]
}
The AWS CLI supports the JMESPath language for filtering output. I need to apply a query string that filters the returned "results" for the objects whose "field" is "callerId", retrieves their "value" property, and produces the following output:
[
{
callerId: "a9b0f9c2-eb42-3986-33f7-8e450b1b72cf"
},
{
callerId: "a9b0f9c2-eb42-3986-33f7-8e450b1b72cf"
}
]
The first step I take is to flatten the results array with the query string: results[]
This gets rid of the other root properties (status, statistics) and returns one big array with all of the {field: ..., value: ...} objects. But after this I can't manage to properly filter for the objects that match field=="callerId". I tried, among others, the following expressions without success:
'results[][?field=="callerId"]'
'results[][*][?field=="callerId"]'
'results[].{ callerId: #[?field=="callerId"].value }'
I'm not an expert in JMESPath and I was doing the tutorials of the jmespath.org site but couldn't manage to make it work.
Thanks!
Using jq is a good option because it is a more complete language, but if you want to do it with JMESPath, here is the solution:
results[*][?field=='callerId'].{callerId: value}[]
to get:
[
{
"callerId": "a9b0f9c2-eb42-3986-33f7-8e450b1b72cf"
},
{
"callerId": "a9b0f9c2-eb42-3986-33f7-8e450b1b72cf"
}
]
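To make the steps of that expression easier to follow, here is a minimal plain JavaScript sketch of the same transformation. It does not use a JMESPath library; the function name extractField and the shortened sample data are assumptions made for illustration.

```javascript
// Shortened sample shaped like the `aws logs get-query-results` output above.
const response = {
  results: [
    [
      { field: "time", value: "2019-01-31T21:53:01.136Z" },
      { field: "callerId", value: "a9b0f9c2-eb42-3986-33f7-8e450b1b72cf" },
    ],
    [
      { field: "time", value: "2019-01-25T13:13:01.062Z" },
      { field: "callerId", value: "a9b0f9c2-eb42-3986-33f7-8e450b1b72cf" },
    ],
  ],
};

// Hypothetical helper mirroring results[*][?field=='callerId'].{callerId: value}[]
function extractField(resp, name) {
  return resp.results
    .map((row) =>
      row
        .filter((cell) => cell.field === name)      // [?field=='callerId']
        .map((cell) => ({ [name]: cell.value }))    // .{callerId: value}
    )
    .flat();                                        // trailing []
}

const callerIds = extractField(response, "callerId");
console.log(JSON.stringify(callerIds, null, 2));
```

Each inner array is filtered on its own first, which is why the final flatten step is needed to end up with a single list.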
I'm not able to reproduce this fully since I don't have the same logs in my log stream, but I was able to do it using jq after putting the sample JSON object in a file:
cat sample_output.json | jq '.results[][] | select(.field=="callerId") | .value'
OUTPUT:
"a9b0f9c2-eb42-3986-33f7-8e450b1b72cf"
"a9b0f9c2-eb42-3986-33f7-8e450b1b72cf"
You could pipe the output from the AWS CLI to jq.
I was able to get pretty close with a native JMESPath query, using the built-in editor on this site:
http://jmespath.org/examples.html#filtering-and-selecting-nested-data
results[*][?field==`callerId`][]
OUTPUT:
[
{
"field": "callerId",
"value": "a9b0f9c2-eb42-3986-33f7-8e450b1b72cf"
},
{
"field": "callerId",
"value": "a9b0f9c2-eb42-3986-33f7-8e450b1b72cf"
}
]
but I'm not sure how to get callerId to be the key and the value to be the value from another key.

Cloudant find Query with $and and $or elements

I'm using the following JSON to find results in a Cloudant database:
{
"selector": {
"$and": [
{
"type": {
"$eq": "sensor"
}
},
{
"v": {
"$eq": 2355
}
},
{
"$or": [
{
"p": "#401000103"
},
{
"p": "#401000114"
}
]
},
{
"t_max": {
"$gte": 1459554894
}
},
{
"t_min": {
"$lte": 1459509591
}
}
]
},
"fields": [
"_id",
"p"
],
"limit": 200
}
If I run this against my Cloudant database, I get the following error:
{
"error": "unknown_error",
"reason": "function_clause",
"ref": 3379914628
}
If I remove one of the $or elements, I get the results for the query, for example after removing:
(,{"p":"#401000114"})
I also get a result if I replace #401000114 with #401000114.
But when I want to use both elements, I get the error above.
Can anybody tell me what error_reason: function_clause means?
error_reason: function_clause means there was a problem on the server. You should probably reach out to Cloudant Support and see if they can help you with your issue.
I contacted Cloudant Support.
This is their answer:
The issue affects Cloudant generally.
It affects both multi-tenant and dedicated clusters.
They are working on the solution.
A workaround: if the array to which the $or operator applies has two elements, you can get the correct result by repeating one of the items in the array.
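The workaround can be applied mechanically before sending the query. The sketch below walks a selector and, wherever an $or array has exactly two elements, repeats the last one; the function name padTwoElementOr is hypothetical, and the sample selector is a shortened version of the one in the question. Repeating an item does not change which documents match, since "A or B or B" is equivalent to "A or B".

```javascript
// Hypothetical helper: repeat one item in every two-element $or array.
function padTwoElementOr(node) {
  if (Array.isArray(node)) return node.map(padTwoElementOr);
  if (node === null || typeof node !== "object") return node;
  const out = {};
  for (const [key, value] of Object.entries(node)) {
    let converted = padTwoElementOr(value);
    if (key === "$or" && Array.isArray(converted) && converted.length === 2) {
      // Duplicate the second condition, as described in the workaround.
      converted = [...converted, converted[1]];
    }
    out[key] = converted;
  }
  return out;
}

const selector = {
  $and: [
    { type: { $eq: "sensor" } },
    { $or: [{ p: "#401000103" }, { p: "#401000114" }] },
  ],
};
const padded = padTwoElementOr(selector);
console.log(JSON.stringify(padded));
```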

Elasticsearch score results based partly on Popularity

I'm using Elasticsearch for this project but a Solr solution might be appropriate too. In the query I'd like to include a portion of a should clause that will return results even if none of the other terms can. This will be used for document popularity. I'll periodically calculate reading popularity and add a float field to each doc with a numeric value.
The idea is to return docs based on terms but when that fails, return popular docs ranked by popularity. These should be ordered by term match scores or magnitude of popularity score.
I realize that I could quantize the popularity and treat it like a tag "hottest", "hotter", "hot"... but would like to use numeric field since the ranking is well defined.
Here is the current form of my data (from fetch by id):
GET /index/docs/ipad
returns a sample object
{
"_index": "index",
"_type": "docs",
"_id": "doc1",
"_version": 1,
"found": true,
"_source": {
"category": ["tablets", "electronics"],
"text": ["buy", "an", "ipad"],
"popularity": 0.95347457,
"id": "doc1"
}
}
Current query format
POST /index/docs/_search
{
"size": 10,
"query": {
"bool": {
"should": [
{"terms": {"text": ["ipad"]}}
],
"must": [
{"terms": {"category": ["electronics"]}}
]
}
}
}
This may seem an odd query format but these are structured objects, not free form text.
Can I add popularity to this query so that it returns items ranked by popularity magnitude along with those returned by the should terms? I'd boost the actual terms above the popularity so they'd be favored.
Note I do not want to boost by popularity, I want to return popular if the rest of the query returns nothing.
One approach I can think of is wrapping a match_all filter in a constant_score query and sorting on score followed by popularity.
Example:
{
"size": 10,
"query": {
"bool": {
"should": [
{
"terms": {
"text": [
"ipad"
]
}
},
{
"constant_score": {
"filter": {
"match_all": {}
},
"boost": 0
}
}
],
"must": [
{
"terms": {
"category": [
"electronics"
]
}
}
],
"minimum_should_match": 1
}
},
"sort": [
{
"_score": {
"order": "desc"
}
},
{
"popularity": {
"order": "desc",
"unmapped_type": "double"
}
}
]
}
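To see how the two sort keys interact, here is a small sketch that ranks a few made-up hits the same way: by score first, then by popularity as a tie-breaker. It assumes popularity is meant to be sorted in descending order; the hit objects and the rank function are illustrative only and do not call Elasticsearch.

```javascript
// Made-up hits: a score of 0 stands for documents matched only by the
// zero-boost match_all clause.
const hits = [
  { id: "doc1", score: 0, popularity: 0.95 },
  { id: "doc2", score: 1.2, popularity: 0.1 },
  { id: "doc3", score: 0, popularity: 0.4 },
];

// Sort by score descending; when scores tie, fall back to popularity descending.
function rank(list) {
  return [...list].sort(
    (a, b) => b.score - a.score || b.popularity - a.popularity
  );
}

const order = rank(hits).map((hit) => hit.id);
console.log(order.join(", "));
```

Documents that match the term come first, and the remaining documents, which all share the same score, are ordered purely by popularity.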
You want to look into the function score query and a decay function for this.
Here's a gentle intro: https://www.found.no/foundation/function-scoring/

CouchDB View With OR Condition

I have two kinds of documents in CouchDB with the following JSON structure:
1.
{
"_id": "4a91f3e8-616a-431d-8199-ace00055763d",
"_rev": "2-9105188217acd506251c98cd4566e788",
"Vehicle": {
"type": "STRING",
"name": "Vehicle",
"value": "12345"
},
"Start": {
"type": "DATE",
"name": "Start",
"value": "2014-09-10T11:19:00.000Z"
}
}
2.
{
"_id": "4a91f3e8-616a-431d-8199-ace00055763d",
"_rev": "2-9105188217acd506251c98cd4566e788",
"Equipment": {
"type": "STRING",
"name": "Equipment",
"value": "12345"
},
"Start": {
"type": "DATE",
"name": "Start",
"value": "2014-09-10T11:19:00.000Z"
}
}
I want to make one view that finds all documents where doc.Vehicle.value=12345 OR doc.Equipment.value=12345.
How can I make a view that returns both kinds of documents?
Thanks in advance.
Just emit both values from your view (yes, a map function may emit several different key-value pairs for the same doc):
function(doc){
if (doc.Equipment) {
emit(doc.Equipment.value, null)
}
if (doc.Vehicle) {
emit(doc.Vehicle.value, null)
}
}
And request them by the same key:
http://localhost:5984/db/_design/ddoc/_view/by_equip_value?key="12345"
See also the Guide to Views for more info about CouchDB views.
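To check the behaviour without a running CouchDB, the map function can be exercised locally with a stub emit. This is only a sketch: runView, makeMap and the sample documents are made up for illustration, and CouchDB itself provides emit as a global rather than as a parameter.

```javascript
// Same logic as the map function above, with emit injected for local testing.
function makeMap(emit) {
  return function (doc) {
    if (doc.Equipment) {
      emit(doc.Equipment.value, null);
    }
    if (doc.Vehicle) {
      emit(doc.Vehicle.value, null);
    }
  };
}

// Hypothetical stand-in for the view engine: collects one row per emit call.
function runView(docs) {
  const rows = [];
  let currentId = null;
  const map = makeMap((key, value) => rows.push({ id: currentId, key, value }));
  for (const doc of docs) {
    currentId = doc._id;
    map(doc);
  }
  return rows;
}

const docs = [
  { _id: "vehicle-doc", Vehicle: { value: "12345" } },
  { _id: "equipment-doc", Equipment: { value: "12345" } },
  { _id: "other-doc", Vehicle: { value: "99999" } },
];

// Equivalent of querying the view with ?key="12345".
const matches = runView(docs)
  .filter((row) => row.key === "12345")
  .map((row) => row.id);
console.log(matches);
```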
With Kxepal's version, you cannot tell the type of a result ("12345" can be either a Vehicle or an Equipment). You can only find out by using "include_docs=true" and looking inside the doc, or by making a second query with the ids of the results.
If you want to see the type (or query by type), you need to extend the view:
..
if(doc.Equipment) {
emit (doc.Equipment.value,doc.Equipment.name);
}
if(doc.Vehicle) {
emit(doc.Vehicle.value,doc.Vehicle.name);
}
Here, the name is returned as the value of each result row.
You can also select the type in the query itself if you emit the name as the first element of the key:
if(doc.Equipment) {
emit([doc.Equipment.name,doc.Equipment.value],null);
}
if(doc.Vehicle) {
emit ([doc.Vehicle.name,doc.Vehicle.value],null);
}
Your query for vehicles:
/viewname?startkey=["Vehicle"]&endkey=["Vehicle",{}]
Equipment:
/viewname?startkey=["Equipment"]&endkey=["Equipment",{}]
Here, the name is the first item of the key array in the result rows.
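When building these range queries in code, the keys have to be JSON-encoded and then URL-encoded. A minimal sketch follows, assuming the placeholder view path /viewname used above; the function name typeRangeQuery is hypothetical. The empty object works as an upper bound because objects sort after strings in CouchDB's view collation.

```javascript
// Hypothetical helper: build the startkey/endkey query for one type name.
function typeRangeQuery(viewPath, typeName) {
  // Keys must be valid JSON, then percent-encoded for use in a URL.
  const startkey = encodeURIComponent(JSON.stringify([typeName]));
  const endkey = encodeURIComponent(JSON.stringify([typeName, {}]));
  return `${viewPath}?startkey=${startkey}&endkey=${endkey}`;
}

const url = typeRangeQuery("/viewname", "Vehicle");
console.log(url);
```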
Maybe this will help : http://de.slideshare.net/okurow/couchdb-mapreduce-13321353
BTW, a better solution would be to store the type and value as top-level fields:
{
"_id": "4a91f3e8-616a-431d-8199-ace00055763d",
"_rev": "2-9105188217acd506251c98cd4566e788",
"type": "Vehicle",
"value":"12345",
"Start": {
"type": "DATE",
"name": "Start", // ? maybe also obsolete, because already inside "Start" Element
"value": "2014-09-10T11:19:00.000Z"
}
}
{
"_id": "4a91f3e8-616a-431d-8199-ace00055763d",
"_rev": "2-9105188217acd506251c98cd4566e788",
"type": "Equipment",
"value":"12345",
"Start": {
"type": "DATE",
"name": "Start", // ? maybe also obsolete, because already inside "Start" Element
"value": "2014-09-10T11:19:00.000Z"
}
}
In this case you need only one emit:
emit([doc.type,doc.value],null)
