Loop over children properties in painless script

I have the following ElasticSearch data structure for products in a webshop. It's a main product with children that have a special price and a regular price. I have omitted some extraneous info in the children to make it more clear.
{
"_index": "vue_storefront_catalog_1_product_1617378559",
"_type": "_doc",
"_source": {
"configurable_children": [
{
"price": 49.99,
"special_price": 34.99
},
{
"price": 49.99,
"special_price": null
}
]
}
}
Using the following mapping in Elasticsearch:
{
"vue_storefront_catalog_1_product_1614928276" : {
"mappings" : {
"properties" : {
"configurable_children" : {
"properties" : {
"price" : {
"type" : "double"
},
"special_price" : {
"type" : "double"
}
}
}
}
}
}
}
I have created a loop in a Painless script to go through the entries in configurable_children. I need this to determine whether the main product is on sale, based on its children.
boolean hasSale = false;
for(item in doc['configurable_children']) {
hasSale = true;
if (1 - (item['special_price'].value / item['price'].value) > 0.5) {
hasSale = true;
}
}
return hasSale
When I look at the results I see the following error:
"failed_shards": [{
"shard": 0,
"index": "vue_storefront_catalog_1_product_1617512844",
"node": "2EQcMMqlQgiuT5GAFPo90w",
"reason": {
"type": "script_exception",
"reason": "runtime error",
"script_stack": ["org.elasticsearch.search.lookup.LeafDocLookup.get(LeafDocLookup.java:90)", "org.elasticsearch.search.lookup.LeafDocLookup.get(LeafDocLookup.java:41)", "for(item in doc['configurable_children']) { ", " ^---- HERE"],
"script": " boolean hasSale = false; for(item in doc['configurable_children']) { hasSale = true; if (1 - (item['special_price'].value / item['price'].value) > 0.5) { hasSale = true; } } return hasSale ",
"lang": "painless",
"position": {
"offset": 42,
"start": 26,
"end": 70
},
"caused_by": {
"type": "illegal_argument_exception",
"reason": "No field found for [configurable_children] in mapping with types []"
}
}
}]
No field found for [configurable_children] in mapping with types []
Does anyone know what I'm doing wrong? It looks like the for-in loop needs a different kind of data. How do I loop through all of the children to determine whether the product is on sale?
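For reference, the check the script is trying to express can be written outside Elasticsearch. The following is a minimal Python sketch, not Painless, that runs over a list shaped like the configurable_children array in _source. The function name has_sale and the threshold parameter are made up for this illustration, and skipping children with a null special_price is an assumption about the intended behavior. It does not address the mapping error itself.

```python
def has_sale(children, threshold=0.5):
    """Return True if any child is discounted by more than `threshold`."""
    for child in children:
        price = child.get("price")
        special = child.get("special_price")
        # Skip children without a usable price pair (e.g. special_price is null).
        if not price or special is None:
            continue
        if 1 - (special / price) > threshold:
            return True
    return False

# Children copied from the example document in the question.
children = [
    {"price": 49.99, "special_price": 34.99},
    {"price": 49.99, "special_price": None},
]
print(has_sale(children))  # False: the discount is about 30%, below the 0.5 threshold
```

Note that the sketch only sets the result to true inside the discount check, whereas the script in the question also sets hasSale = true unconditionally on every iteration.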

Related

ElasticSearch sort array size incoherent results

I am trying to sort by array size in ElasticSearch 7.1.
I indexed the following data without creating any custom mapping:
{
"myarray": [{
"field": {
"value": "test"
}
}]
}
When I look at the mapping, it is giving me:
{
"properties": {
"myarray": {
"properties": {
"field": {
"properties": {
"value": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
}
}
}
}
Now I want to query the index and sort by the highest number of elements in myarray. I have tried doing:
{
"sort": {
"_script": {
"type": "number",
"order": "desc",
"script": "doc.containsKey('myarray.field.value') ? doc['myarray.field.value'].values.size() : 0"
}
}
}
which gives me an error like "Fielddata is disabled on text fields by default. [...] Alternatively use a keyword field instead." So I tried with:
{
"sort": {
"_script": {
"type": "number",
"order": "desc",
"script": "doc.containsKey('myarray.field.value.keyword') ? doc['myarray.field.value.keyword'].values.size() : 0"
}
}
}
which gives me the error "Illegal list shortcut value [values]." So then I tried removing the values keyword:
{
"sort": {
"_script": {
"type": "number",
"order": "desc",
"script": "doc.containsKey('myarray.field.value.keyword') ? doc['myarray.field.value.keyword'].size() : 0"
}
}
}
and it works. However, some results are sorted correctly and then an element that should be at the top suddenly appears in the middle.
Is that because it is sorting by the length of the value as a string and not the length of myarray?
This is because the text mapping type does not support sorting. To add sorting, you must map the array field with the keyword type.
For more information and syntax, please refer to: https://www.elastic.co/guide/en/elasticsearch/reference/6.8/search-request-sort.html

elasticsearch aggregates some values in a single field

I have some raw data
{
{
"id":1,
"message":"intercept_log,UDP,0.0.0.0,68,255.255.255.255,67"
},
{
"id":2,
"message":"intercept_log,TCP,172.22.96.4,52085,239.255.255.250,3702,1:"
},
{
"id":3,
"message":"intercept_log,UDP,1.0.0.0,68,255.255.255.255,67"
},
{
"id":4,
"message":"intercept_log,TCP,173.22.96.4,52085,239.255.255.250,3702,1:"
}
}
Requirement
I want to group this data by the protocol value (TCP, UDP) contained in the message field.
The output should look like this:
{
{
"GroupValue":"TCP",
"DocCount":"2"
},
{
"GroupValue":"UDP",
"DocCount":"2"
}
}
Attempt
I tried the following, but it failed:
GET systemevent*/_search
{
"size": 0,
"aggs": {
"tags": {
"terms": {
"field": "message.keyword",
"include": " intercept_log[,,](.*?)[,,].*?"
}
}
},
"track_total_hits": true
}
Now I am trying to use pipelines to meet this need.
"aggs" seems to group only on whole fields.
Does anyone have a better idea?
Link
Terms aggregation
Update
My scenario is a little special. I collect logs from many different servers and then import them into ES, so the message fields differ a lot from one another. Using a script directly for grouping either fails or groups inaccurately. I tried filtering the data by condition first and then grouping with a script (comment code 1), but that code does not produce the correct groups.
Some more background on my scenario:
Our team uses ES to analyze server logs. rsyslog forwards the data to a central server, and Logstash then filters and extracts the data into ES. ES ends up with a field called message whose value is the detailed log information. We need to count the documents whose message contains certain values.
comment code 1
POST systemevent*/_search
{
"size": 0,
"query": {
"bool": {
"must": [
{
"match_phrase": {
"message": {
"query": "intercept_log"
}
}
}
]
}
},
"aggs": {
"protocol": {
"terms": {
"script": "def values = /,/.split(doc['message.keyword'].value); return values.length > 1 ? values[1] : 'N/A'",
"size": 10
}
}
},
"track_total_hits": true
}
comment code 2
POST test2/_search
{
"size": 0,
"aggs": {
"protocol": {
"terms": {
"script": "def values = /.*,.*/.matcher( doc['host.keyword'].value ); if( name.matches() ) {return values.group(1) } else { return 'N/A' }",
"size": 10
}
}
}
}
The easiest way to solve this is by leveraging scripts in the terms aggregation. The script would simply split on commas and take the second value.
POST systemevent*/_search
{
"size": 0,
"aggs": {
"protocol": {
"terms": {
"script": "def values = /,/.split(doc['message.keyword'].value); return values.length > 1 ? values[1] : 'N/A';",
"size": 10
}
}
}
}
Use Regex
POST test2/_search
{
"size": 0,
"aggs": {
"protocol": {
"terms": {
"script": "def m = /.*proto='(.*?)'./.matcher(doc['message.keyword'].value ); if( m.matches() ) { return m.group(1) } else { return 'N/A' }"
}
}
}
}
The results would look like
"buckets" : [
{
"key" : "TCP",
"doc_count" : 2
},
{
"key" : "UDP",
"doc_count" : 2
}
]
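As a rough illustration of what the split-based script computes, here is a minimal Python sketch (not Painless, and not run inside Elasticsearch) applied to the four sample messages from the question. The function name protocol_of is made up for this example.

```python
from collections import Counter

# Sample messages copied from the question.
messages = [
    "intercept_log,UDP,0.0.0.0,68,255.255.255.255,67",
    "intercept_log,TCP,172.22.96.4,52085,239.255.255.250,3702,1:",
    "intercept_log,UDP,1.0.0.0,68,255.255.255.255,67",
    "intercept_log,TCP,173.22.96.4,52085,239.255.255.250,3702,1:",
]

def protocol_of(message):
    """Return the second comma-separated field, or 'N/A' if there is none."""
    values = message.split(",")
    return values[1] if len(values) > 1 else "N/A"

# Count documents per extracted key, similar to the terms aggregation buckets.
buckets = Counter(protocol_of(m) for m in messages)
print(dict(buckets))
```

Messages without a comma fall into the 'N/A' group, which mirrors the fallback in the script.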
A better and more efficient way would be to split the message field into new fields using an ingest pipeline or Logstash.

elasticsearch how to use exact search and ignore the keyword special characters in keywords?

I have some ID values (combinations of digits and letters) in my Elasticsearch index, and in my program the user might enter special characters in the search keyword.
Is there any way to make Elasticsearch do an exact search while also removing some special characters from the search keyword?
I already use a custom analyzer to split the search keyword on some special characters, and I use a match query to search the data, but I still get no results.
data
{
"_index": "testdata",
"_type": "_doc",
"_id": "11112222",
"_source": {
"testid": "1MK444750"
}
}
custom analyzer
"analysis" : {
"analyzer" : {
"testidanalyzer" : {
"pattern" : """([^\w\d]+|_)""",
"type" : "pattern"
}
}
}
mapping
{
"article" : {
"mappings" : {
"_doc" : {
"properties" : {
"testid" : {
"type" : "text",
"analyzer" : "testidanalyzer"
}
}
}
}
}
}
here's my elasticsearch query
GET /testdata/_search
{
"query": {
"match": {
// "testid": "1MK_444-750" // no result
"testid": "1MK444750"
}
}
}
The analyzer successfully separates my keyword, but I still can't match anything in the results:
POST /testdata/_analyze
{
"analyzer": "testidanalyzer",
"text": "1MK_444-750"
}
{
"tokens" : [
{
"token" : "1mk",
"start_offset" : 0,
"end_offset" : 3,
"type" : "word",
"position" : 0
},
{
"token" : "444",
"start_offset" : 4,
"end_offset" : 7,
"type" : "word",
"position" : 1
},
{
"token" : "750",
"start_offset" : 8,
"end_offset" : 11,
"type" : "word",
"position" : 2
}
]
}
please help, thanks in advance!
First off, you should probably model the testid field as keyword rather than text; it's a more appropriate data type.
You want to put in a feature whereby some characters (_, -) are effectively ignored at search time. You can achieve this by giving your field a normalizer, which tells Elasticsearch how to preprocess data for this field prior to indexing or searching. Specifically, you can declare a mapping char filter in your normalizer that replaces these characters with an empty string.
This is how all these changes would fit into your mapping:
PUT /testdata
{
"settings": {
"analysis": {
"char_filter": {
"mycharfilter": {
"type": "mapping",
"mappings": [
"_ => ",
"- => "
]
}
},
"normalizer": {
"mynormalizer": {
"type": "custom",
"char_filter": [
"mycharfilter"
]
}
}
}
},
"mappings": {
"_doc": {
"properties": {
"testid" : {
"type" : "keyword",
"normalizer" : "mynormalizer"
}
}
}
}
}
The following searches would then produce the same results:
GET /testdata/_search
{
"query": {
"match": {
"testid": "1MK444750"
}
}
}
GET /testdata/_search
{
"query": {
"match": {
"testid": "1MK_444-750"
}
}
}
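The effect of the normalizer can be pictured with a small Python sketch. This is only an illustration of the idea (apply the same character stripping to the stored value and to the search input, then compare exactly), not how Elasticsearch implements it. The names STRIPPED and normalize are made up for this example.

```python
# Characters the mapping char filter replaces with an empty string.
STRIPPED = str.maketrans("", "", "_-")

def normalize(value):
    """Mimic the char filter: drop '_' and '-' and leave everything else."""
    return value.translate(STRIPPED)

indexed = normalize("1MK444750")    # value as stored for the keyword field
query = normalize("1MK_444-750")    # what the search input becomes

# Both sides went through the same normalization, so an exact comparison matches.
print(indexed == query)
```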

Remove elements/objects From Array in ElasticSearch Followed by Matching Query

I'm having issues trying to remove elements/objects from an array in elasticsearch.
This is the mapping for the index:
{
"example1": {
"mappings": {
"doc": {
"properties": {
"locations": {
"type": "geo_point"
},
"postDate": {
"type": "date"
},
"status": {
"type": "long"
},
"user": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
}
}
}
And this is an example document.
{
"_index": "example1",
"_type": "doc",
"_id": "8036",
"_score": 1,
"_source": {
"user": "kimchy8036",
"postDate": "2009-11-15T13:12:00",
"locations": [
[
72.79887719999999,
21.193036000000003
],
[
-1.8262150000000001,
51.178881999999994
]
]
}
}
Using the query below, I can add multiple locations.
POST /example1/_update_by_query
{
"query": {
"match": {
"_id": "3"
}
},
"script": {
"lang": "painless",
"inline": "ctx._source.locations.add(params.newsupp)",
"params": {
"newsupp": [
-74.00,
41.12121
]
}
}
}
But I'm not able to remove array objects from locations. I have tried the query below but it's not working.
POST example1/doc/3/_update
{
"script": {
"lang": "painless",
"inline": "ctx._source.locations.remove(params.tag)",
"params": {
"tag": [
-74.00,
41.12121
]
}
}
}
Kindly let me know what I am doing wrong here. I am using Elasticsearch version 5.5.2.
In Painless scripts, the Array.remove() method removes by index, not by value.
Here's a working example that removes array elements by value in an Elasticsearch script:
POST objects/_update_by_query
{
"query": {
... // use regular ES query to remove only in relevant documents
},
"script": {
"source": """
if (ctx._source[params.array_attribute] != null) {
for (int i=ctx._source[params.array_attribute].length-1; i>=0; i--) {
if (ctx._source[params.array_attribute][i] == params.value_to_remove) {
ctx._source[params.array_attribute].remove(i);
}
}
}
""",
"params": {
"array_attribute": "<NAME_OF_ARRAY_PROPERTY_TO_REMOVE_VALUE_IN>",
"value_to_remove": "<VALUE_TO_REMOVE_FROM_ARRAY>"
}
}
}
You might want to simplify the script if it only needs to remove values from one specific array attribute. For example, removing "green" from a document's color_list array:
_doc/001 = {
"color_list": ["red", "blue", "green"]
}
Script to remove "green":
POST objects/_update_by_query
{
"query": {
... // use regular ES query to remove only in relevant documents
},
"script": {
"source": """
for (int i=ctx._source.color_list.length-1; i>=0; i--) {
if (ctx._source.color_list[i] == params.color_to_remove) {
ctx._source.color_list.remove(i);
}
}
""",
"params": {
"color_to_remove": "green"
}
}
}
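The reason for iterating from the last index down to 0 can be seen in a plain Python sketch of the same loop. This is an illustration only, not Painless, and the function name remove_value is made up for this example.

```python
def remove_value(items, value_to_remove):
    """Delete every element equal to value_to_remove, in place."""
    # Walk from the last index down to 0 so that deleting an element
    # never shifts the position of elements that are still to be checked.
    for i in range(len(items) - 1, -1, -1):
        if items[i] == value_to_remove:
            del items[i]
    return items

colors = ["red", "blue", "green", "green"]
remove_value(colors, "green")
print(colors)  # ['red', 'blue']
```

Iterating forward while deleting would skip the element that slides into the slot just vacated, which is why two adjacent "green" entries are used in the example.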
Unlike add(), remove() takes the index of the element and removes the element at that position.
Your ctx._source.locations in painless is an ArrayList. It has List's remove() method:
E remove(int index)
Removes the element at the specified position in this list (optional operation). ...
See Painless API - List for other methods.
See this answer for example code.
"script" : {
"lang":"painless",
"inline":"ctx._source.locations.remove(params.tag)",
"params":{
"tag":indexToRemove
}
}
With ctx._source.locations.add(elt) you add an element; with ctx._source.locations.remove(indexToRemove) you remove the element at the given index in the array.

Mongo regex with array of elements not accurate

Document schema is as follows
"events": [
{
"title": "title1"
},
{
"title": "title2"
},
{
"title": "title3"
}
]
I have a requirement to search (by regex) in the events.title field and get only the matching element/object from the array. For that, I am querying like this:
db.collection.find({ "$or" : [ { "events.title" : { "$regex" : ".*title2.*" , "$options" : "i"}} , { "events.title" : { "$regex" : ".*title5.*" , "$options" : "i"}}] , "events" : { "$elemMatch" : { "title" : { "$ne" : null }}}},{"events.$":1,"_id":0});
I was expecting the result to be { "events" : [ { "title" : "title2"} ] }, but it returns
{ "events" : [ { "title" : "title1"} ] }
How can I update the query so that only the matching element in the array is returned in the result?
UPDATE
"events": {
$elemMatch: {
"$or": [{
"title": {
"$regex": ".*title2.*",
"$options": "i"
}
},
{
"title": {
"$regex": ".*title5.*",
"$options": "i"
}
}]
}
}
did the trick. Now it returns only the matching element in the array, irrespective of its position.
Thanks for the help.
Try this
db.collection.find({events: {
$elemMatch: {
title: {$in: [/.*title2.*/i,
/.*title5.*/i]
}
}
}
},
{"events.$": 1, _id: 0});
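What the combination of $elemMatch and the events.$ projection is meant to return can be sketched in plain Python: the first array element whose title matches any of the patterns. This is only an illustration of the intended result, not of how MongoDB evaluates the query, and the function name first_matching_event is made up for this example.

```python
import re

def first_matching_event(doc, patterns):
    """Return the first element of doc['events'] whose title matches any pattern."""
    compiled = [re.compile(p, re.IGNORECASE) for p in patterns]
    for event in doc.get("events", []):
        title = event.get("title")
        if title is not None and any(rx.search(title) for rx in compiled):
            return event
    return None

# Document shaped like the schema in the question.
doc = {"events": [{"title": "title1"}, {"title": "title2"}, {"title": "title3"}]}
print(first_matching_event(doc, ["title2", "title5"]))  # {'title': 'title2'}
```

Putting the regex conditions inside $elemMatch ties the match to a single array element, and that element is the one the positional projection is expected to return.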
{"events": {
$elemMatch: {
"$or": [{
"title": {
"$regex": ".*title2.*",
"$options": "i"
}
},
{
"title": {
"$regex": ".*title5.*",
"$options": "i"
}
}]
}
}
}
in the projection did the trick. Now it returns only the matching element in the array, irrespective of its position.
