I send a query from the client for docs where the current date falls between two date values attached to every doc. The docs have an expireDate and an availableDate.
How do I formulate this query? As a guess I came up with the following, which doesn't work:
{
"constant_score": {
"filter": {
"range" : {
"expiredate" : {
"lte": "2015-08-15T11:28:45.114-07:00"
},
"availabledate" : {
"gte": "2015-08-11T11:28:45.114-07:00"
}
}
}
}
}
This even looks wrong to me because, as far as I know, range won't work on a compound object or an array of one-sided ranges. I guess I could add an entirely separate filter for each field. Is there a way to use one date range across two fields?
The correct way of expressing that query is to use two range filters (one for each date field) and place them inside a bool/must filter:
{
"query": {
"constant_score": {
"filter": {
"bool": {
"must": [
{
"range": {
"expireDate": {
"lte": "2015-08-15T11:28:45.114-07:00"
}
}
},
{
"range": {
"availabledate": {
"gte": "2015-08-11T11:28:45.114-07:00"
}
}
}
]
}
}
}
}
}
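As a side note, if both bounds are always "right now", you don't need to compute the timestamp on the client; Elasticsearch date math can supply it. A minimal sketch, assuming a doc should match while availableDate <= now <= expireDate (swap the operators if your semantics differ):
{
  "query": {
    "constant_score": {
      "filter": {
        "bool": {
          "must": [
            { "range": { "availableDate": { "lte": "now" } } },
            { "range": { "expireDate": { "gte": "now" } } }
          ]
        }
      }
    }
  }
}
Date math also supports offsets such as "now-1d" or rounding such as "now/d" if you need a wider or snapped window.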
Related
I have some raw data:
[
{
"id":1,
"message":"intercept_log,UDP,0.0.0.0,68,255.255.255.255,67"
},
{
"id":2,
"message":"intercept_log,TCP,172.22.96.4,52085,239.255.255.250,3702,1:"
},
{
"id":3,
"message":"intercept_log,UDP,1.0.0.0,68,255.255.255.255,67"
},
{
"id":4,
"message":"intercept_log,TCP,173.22.96.4,52085,239.255.255.250,3702,1:"
}
]
Requirement
I want to group this data by the protocol value inside the message field (the second comma-separated token).
The desired output looks like this:
[
{
"GroupValue":"TCP",
"DocCount":"2"
},
{
"GroupValue":"UDP",
"DocCount":"2"
}
]
Try
I tried the following code, but it failed:
GET systemevent*/_search
{
"size": 0,
"aggs": {
"tags": {
"terms": {
"field": "message.keyword",
"include": " intercept_log[,,](.*?)[,,].*?"
}
}
},
"track_total_hits": true
}
Now I am trying to use pipelines to meet this need. aggs seems to only group on whole field values. Does anyone have a better idea?
Link
Terms aggregation
Update
My situation is a little special. I collect logs from many different servers and then import them into ES, so the message fields differ a lot between documents. Directly using script statements for grouping therefore fails or produces inaccurate groups. I tried filtering out some data by condition first and then using a script to group (comment code 1), but that code still doesn't group correctly.
To add some context about my setup: our team uses ES to analyze server logs. We use rsyslog to forward the data to a central server and then Logstash to filter and extract it into ES. The resulting documents have a field called message whose value is the full log line, and we need to count documents by values contained in that message.
comment code 1
POST systemevent*/_search
{
"size": 0,
"query": {
"bool": {
"must": [
{
"match_phrase": {
"message": {
"query": "intercept_log"
}
}
}
]
}
},
"aggs": {
"protocol": {
"terms": {
"script": "def values = /,/.split(doc['message.keyword'].value); return values.length > 1 ? values[1] : 'N/A'",
"size": 10
}
}
},
"track_total_hits": true
}
comment code 2
POST test2/_search
{
"size": 0,
"aggs": {
"protocol": {
"terms": {
"script": "def values = /.*,.*/.matcher( doc['host.keyword'].value ); if( name.matches() ) {return values.group(1) } else { return 'N/A' }",
"size": 10
}
}
}
}
The easiest way to solve this is by leveraging scripts in the terms aggregation. The script would simply split on commas and take the second value.
POST systemevent*/_search
{
"size": 0,
"aggs": {
"protocol": {
"terms": {
"script": "def values = /,/.split(doc['message.keyword'].value); return values.length > 1 ? values[1] : 'N/A';",
"size": 10
}
}
}
}
Use Regex
POST test2/_search
{
"size": 0,
"aggs": {
"protocol": {
"terms": {
"script": "def m = /.*proto='(.*?)'./.matcher(doc['message.keyword'].value ); if( m.matches() ) { return m.group(1) } else { return 'N/A' }"
}
}
}
}
The results would look like this:
"buckets" : [
{
"key" : "TCP",
"doc_count" : 2
},
{
"key" : "UDP",
"doc_count" : 2
}
]
A better and more efficient way would be to split the message field into new fields using an ingest pipeline or Logstash.
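For example, here is a minimal ingest-pipeline sketch; the pipeline name and the grok pattern are assumptions based on the sample messages above:
PUT _ingest/pipeline/parse_intercept_log
{
  "description": "Extract the protocol from message (pattern assumed from the sample logs)",
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": ["intercept_log,%{WORD:protocol},%{GREEDYDATA:rest}"]
      }
    }
  ]
}
Documents indexed with ?pipeline=parse_intercept_log then carry a real protocol field, so the aggregation becomes a plain terms aggregation on protocol.keyword with no per-document scripting cost.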
I want to query the nested array via GraphQL while eliminating the repeated properties. Below is the JSON file:
{
"MAIN_ARRAY": [
{
"One": [
{
"title": "Title",
"description": "Description",
"avatar": "../../assets/image/author-1.jpg"
}
],
"Two": [
{
"title": "Title",
"description": "Description",
"avatar": "../../assets/image/author-1.jpg"
}
]
}
]
}
I don't want to repeat the properties title, description, and avatar for One and Two because they are the same. Is there a workaround to avoid repeating them? The code below didn't work:
query {
fileJson {
MAIN_ARRAY {
One, Two {
title
description
avatar
}
}
}
}
Assuming the underlying types of both One and Two are the same, you can use fragments (replace MyType below with the actual GraphQL type name of those objects):
query {
fileJson {
MAIN_ARRAY {
One {
...MyFragment
}
Two {
...MyFragment
}
}
}
}
fragment MyFragment on MyType {
title
description
avatar
}
I am trying to write an Elasticsearch query that searches a text field ("body") and returns items that match at least one of two multi-word phrases I provide (i.e. "stack overflow" OR "the stackoverflow"). I would also like the query to only return results that occur after a given timestamp, with the results ordered by time.
My current solution is below. I believe the MUST is working correctly (gte a timestamp), but the BOOL + SHOULD with two match_phrases is not correct. I am getting the following error:
Unexpected character ('{' (code 123)): was expecting double-quote to start field name
Which I think is because I have two match_phrases in there?
The ES mapping and the details of the ES API I am using are here.
{"query":
{"bool":
{"should":
[{"match_phrase":
{"body":"a+phrase"}
},
{"match_phrase":
{"body":"another+phrase"}
}
]
},
{"bool":
{"must":
[{"range":
{"created_at:
{"gte":"thispage"}
}
}
]}
}
},"size":10000,
"sort":"created_at"
}
I think you were just missing a single " after created_at.
{
"query": {
"bool": {
"must": [
{
"range": {
"created_at": {
"gte": "1534004694"
}
}
},
{
"bool": {
"should": [
{
"match_phrase": {
"body": "a+phrase"
}
},
{
"match_phrase": {
"body": "another+phrase"
}
}
]
}
}
]
}
},
"size": 10,
"sort": "created_at"
}
Also, you are allowed to have both must and should as properties of the same bool object. Note that once a must clause is present, should clauses become optional for matching unless you set "minimum_should_match": 1, so the variant below includes that:
{
"query": {
"bool": {
"must": {
"range": {
"created_at": {
"gte": "1534004694"
}
}
},
"minimum_should_match": 1,
"should": [
{
"match_phrase": {
"body": "a+phrase"
}
},
{
"match_phrase": {
"body": "another+phrase"
}
}
]
}
},
"size": 10,
"sort": "created_at"
}
On a side note, Postman or any JSON formatter/validator would really help in determining where the error is.
I have the data below:
{
"results":[
{
"ID":"1",
"products":[
{
"product":"car",
"number":"5"
},
{
"product":"computer",
"number":"212"
}
]
},
{
"ID":"2",
"products":[
{
"product":"car",
"number":"9"
},
{
"product":"computer",
"number":"463"
},
{
"product":"bicycle",
"number":"5"
}
]
}
]
}
And my query is below:
{
"query":{
"bool":{
"must":[
{
"wildcard":{
"results.products.product":"*car*"
}
},
{
"wildcard":{
"results.products.number":"*5*"
}
}
]
}
}
}
What I expect is to get only ID 1, because only it has a product record with { "product":"car", "number":"5" }. But I get both ID 1 and ID 2, because ID 2's first record has "product":"car" and its third record has "number":"5", matched separately.
How can I fix this query?
You need to define products as a nested type when creating the mapping. Try the following mapping example:
PUT http://localhost:9200/indexname
{
"mappings": {
"typename": {
"properties": {
"products" : {
"type" : "nested"
}
}
}
}
}
Then you can use nested queries to match entire elements of your array, just as you need. (Note that the mapping of an existing field can't be changed in place, so existing documents have to be reindexed after adding the nested type.)
{
"query": {
"nested": {
"path": "products",
"query": {
"bool": {
"must": [
{ "wildcard": { "products.product": "*car*" }},
{ "wildcard": { "products.number": "*5*" }}
]
}
}
}
}
}
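If you also want to see which products element actually matched, nested queries support inner_hits; here is the same query with it enabled:
{
  "query": {
    "nested": {
      "path": "products",
      "inner_hits": {},
      "query": {
        "bool": {
          "must": [
            { "wildcard": { "products.product": "*car*" }},
            { "wildcard": { "products.number": "*5*" }}
          ]
        }
      }
    }
  }
}
Each hit then includes an inner_hits section listing only the matching products entries.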
While trying to perform an ES query, I ran into a problem doing nested filtering of objects in an array. Our data structure has changed from:
"_index": "events_2015-07-08",
"_type": "my_type",
"_source":{
...
...
"custom_data":{
"className:"....."
}
}
to:
"_index": "events_2015-07-08",
"_type": "my_type",
"_source":{
...
...
"custom_data":[ //THIS CHANGED FROM AN OBJECT TO AN ARRAY OF OBJECTS
{
"key":".....",
"val":"....."
},
{
"key":".....",
"val":"....."
}
]
}
This nested filter works fine on indices that have the new data structure:
{
"nested": {
"path": "custom_data",
"filter": {
"bool": {
"must": [
{
"term":
{
"custom_data.key": "className"
}
},
{
"term": {
"custom_data.val": "SOME_VALUE"
}
}
]
}
},
"_cache": true
}
}
However, it fails when run against indices that have the older data structure, so that feature cannot be added. Ideally I'd be able to match both data structures, but at this point I'd settle for a "graceful failure", i.e. just not returning results where the structure is old.
I have tried adding an "exists" filter on the field "custom_data.key", and an "exists" inside a "not" on the field "custom_data.className", but I keep getting "SearchParseException[[events_2015-07-01][0]: from[-1],size[-1]: Parse Failure [Failed to parse source"
There is an indices filter (and query) that you can use to perform conditional filters (and queries) based on the index that it is running against.
{
"query" : {
"filtered" : {
"filter" : {
"indices" : {
"indices" : ["old-index-1", "old-index-2"],
"filter" : {
"term" : {
"className" : "SOME_VALUE"
}
},
"no_match_filter" : {
"nested" : { ... }
}
}
}
}
}
}
Using this, you should be able to transition off of the old mapping and onto the new mapping.
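For reference, here is a fleshed-out sketch that plugs the nested filter from the question into the no_match_filter slot (the index names are placeholders for your actual old indices):
{
  "query" : {
    "filtered" : {
      "filter" : {
        "indices" : {
          "indices" : ["old-index-1", "old-index-2"],
          "filter" : {
            "term" : { "custom_data.className" : "SOME_VALUE" }
          },
          "no_match_filter" : {
            "nested" : {
              "path" : "custom_data",
              "filter" : {
                "bool" : {
                  "must" : [
                    { "term" : { "custom_data.key" : "className" } },
                    { "term" : { "custom_data.val" : "SOME_VALUE" } }
                  ]
                }
              }
            }
          }
        }
      }
    }
  }
}
Alternatively, no_match_filter also accepts the string values "all" and "none"; listing your new indices in indices with the nested filter and setting "no_match_filter": "none" gives exactly the "graceful failure" you described, with old-structure indices simply returning nothing.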