ElasticSearch - Append to integer array

I am new to ES, but I'm getting the hang of it.
It's a really powerful piece of software, but I have to say that the documentation is really lacking and confusing at times.
Here's my question:
I have an integer array that looks like this:
"hits_history" : [0,0]
I want to append an integer to that array via an "update_by_query" call, I searched and found this link: https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-update.html
which has this example:
POST test/type1/1/_update
{
  "script" : {
    "inline": "ctx._source.tags.add(params.tag)",
    "lang": "painless",
    "params" : {
      "tag" : "blue"
    }
  }
}
so I tried:
curl -XPOST 'localhost:9200/example/example/_update_by_query?pretty' -H 'Content-Type: application/json' -d'
{
  "script": {
    "inline": "ctx._source.hits_history.add(params.hits)",
    "params": {"hits": 0}
  },
  "query": {
    "match_all": {}
  }
}
'
but it gave me this error:
"ctx._source.hits_history.add(params.hits); ",
" ^---- HERE"
"type" : "script_exception",
"reason" : "runtime error",
"caused_by" : {
"type" : "illegal_argument_exception",
"reason" : "Unable to find dynamic method [add] with [1] arguments for class [java.lang.Integer]."
So, I looked further and found this: https://www.elastic.co/guide/en/elasticsearch/guide/current/partial-updates.html
which has this example:
We can also use a script to add a new tag to the tags array.
POST /website/blog/1/_update
{
  "script" : "ctx._source.tags+=new_tag",
  "params" : {
    "new_tag" : "search"
  }
}
So I tried it:
curl -XPOST 'localhost:9200/example/example/_update_by_query?pretty' -H 'Content-Type: application/json' -d'
{
  "script": {
    "inline": "ctx._source.hits_history += 0;"
  },
  "query": {
    "match_all": {}
  }
}
'
Result:
"type" : "script_exception",
"reason" : "runtime error",
"caused_by" : {
"type" : "class_cast_exception",
"reason" : "Cannot apply [+] operation to types [java.util.ArrayList] and [java.lang.Integer]."
So, how can I append items to the array? Is there more up-to-date documentation I should look at?
What I wanted to do was simply something like this:
ctx._source.hits_history.add(ctx._source.today_hits);
ctx._source.today_hits = 0;
Thank you

You should store the first value as an array (containing one value).
Then you can use the add() method.
POST /website/blog/1/_update
{
  "script" : "if (ctx._source.containsKey('tags')) { ctx._source.tags.add('next') } else { ctx._source.tags = ['first'] }"
}
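Adapted to the hits_history and today_hits fields from the question, an _update_by_query version might look like this sketch (untested; the instanceof guard covers documents where hits_history was stored as a single integer, which is what the java.lang.Integer error above suggests):
curl -XPOST 'localhost:9200/example/example/_update_by_query?pretty' -H 'Content-Type: application/json' -d'
{
  "script": {
    "inline": "if (ctx._source.hits_history instanceof List) { ctx._source.hits_history.add(ctx._source.today_hits); } else { ctx._source.hits_history = [ctx._source.today_hits]; } ctx._source.today_hits = 0;",
    "lang": "painless"
  },
  "query": {
    "match_all": {}
  }
}
'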


Unable to Parse JSON object with double quotes

I am unable to parse nested JSON wrapped in double quotes; here is a sample object. My flattening step does not pick up the "RawRequest" key because it is quoted in double quotes, although it is valid JSON:
"Business ID":"Sajidbutta",
"Acout":"Saji_12"
"Report": {
"ReportPDF": {
"RawRequest": "{ \"subcode\" : \"35656\", \"Hoppoes\" :\"Hello\" ,\"Hoppoes\":[{\"tsn\" : \"44544545\", \"title\" : \"Owner\"}] }"
}
}
}
You have two issues.
First, you're missing a comma between "Saji_12" and "Report":
"Acout":"Saji_12" "Report":
The second one is that you're missing the escaping \ in front of the " characters inside the JSON string. Consider the following:
{
  "Business ID": "Sajidbutta",
  "Acout": "Saji_12",
  "Report": {
    "ReportPDF": {
      "RawRequest": "{ \"subcode\" : \"35656\", \"Hoppoes\" :\"Hello\" ,\"Hoppoes\":[{\"tsn\" : \"44544545\", \"title\" : \"Owner\"}] }"
    }
  }
}
If the "RawRequest" is meant to actually be a json, then the following is appropriate (note the lack of quotation marks after the RawRequest key:
{
  "Business ID": "Sajidbutta",
  "Acout": "Saji_12",
  "Report": {
    "ReportPDF": {
      "RawRequest": {
        "subcode": "35656",
        "Hoppoes": "Hello",
        "Hoppoes": [
          {
            "tsn": "44544545",
            "title": "Owner"
          }
        ]
      }
    }
  }
}
Note also the duplicate key "Hoppoes" :)
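If RawRequest has to stay an escaped string, you can still expand it once the outer document is fixed, e.g. with jq's fromjson built-in (assuming the corrected document is saved as fixed.json):
# parse the embedded string as JSON and print it as a real object
jq '.Report.ReportPDF.RawRequest | fromjson' fixed.json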

Merging 2 json files into new json with no duplicates

My dedicated servers are generating 2 laptime arrays, and I would like to use a script to merge them into a single, new JSON file, with duplicate "steamids" removed (and kept grouped together as they still are) and both arrays under a single loggedTimes {}, so I can feed it to an HTML script that produces laptimes and a leaderboard. In other words, I want the structure to remain.
The first laptime file and the second laptime file go through the following command
jq 'reduce . as $item ({}; . * $item)' laptimes_data_ams.json laptimes_data_kow.json > laptimes.json
to then generate the (badly) merged laptime file.
I can get a file reduced but can't get any further than that. I checked threads by others around here, and whenever I try their suggestions the script just refuses to work. Is anybody available to lend me a hand in writing a script that keeps this final structure post-merge?
{
  "loggedTimes" : {
    "steamids" : {
      "idnumber1" : "name1",
      "idnumber2" : "name2"
    },
    "vehicles" : {
      "vehiclenumber1" : {
        "laptimes" : {
          "idnumber1" : {
            "lapTime" : time1,
            "logtime" : log1,
            "name" : "name 1",
            "rank" : rank1,
            "refId" : id1,
            "vehicleid" : vehiclenumber1,
            "wet" : 0
          },
          "idnumber2" : {
            "lapTime" : time2,
            "logtime" : log2,
            "name" : "name 2",
            "rank" : rank2,
            "refId" : id2,
            "vehicleid" : vehiclenumber1,
            "wet" : 0
          }
        }
      },
      "vehiclenumber2" : {
        //you get the idea by now
      }
    }
  }
}
You haven't specified how the merge is to be performed, but one option would be to let the key-value pairs in the second file dominate. In that case, you could write:
jq -n '
  input as $one
  | input as $two
  | ($one + $two)
  | .loggedTimes.steamids = ($one.loggedTimes.steamids + $two.loggedTimes.steamids)
' 1.json 2.json
With your input, this produces output from which the following is an extract:
{
  "loggedTimes": {
    "steamids": {
      "76561197960277005": "[DECOCO]koker_SZ",
      "76561197960436395": "JOJO",
      ...
    },
    "vehicles": {
      "-1142039519": {
        "lapTimes": {}
      },
      "-1201605905": {
        "lapTimes": {
          "76561197984026143": {
            "lapTime": 609101,
            "logtime": 1606516985,
            "name": "Night Dick",
            "rank": 1,
            "refId": 36032,
            "vehicleId": -1201605905,
            "wet": 0
          }
        }
      }
      ...
    }
  }
}
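As a side note on the original attempt: without -n, jq runs the reduce program once per input file, so it emits each file separately instead of one combined object. A deep-merge variant that does combine both files is the following sketch (with the recursive merge operator *, values from the second file win on conflicts):
jq -n 'reduce inputs as $item ({}; . * $item)' laptimes_data_ams.json laptimes_data_kow.json > laptimes.json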

bash split array into separate files with dynamic name

I have the following returned to me as a response of a mocking tool I'm using.
{
"mappings" : [
{
"id" : "bcf3559f-7ff7-406b-a4f1-6d3e9ac00e63",
"name" : "Hellow world 2",
"request" : {
"url" : "/hello-world-2",
"method" : "POST"
},
"response" : {
"status" : 200,
"body" : "\nBody content for stub 3\n\n",
"headers" : { }
},
"uuid" : "bcf3559f-7ff7-406b-a4f1-6d3e9ac00e63",
"persistent" : true,
"priority" : 5
},
{
"id" : "9086b24f-4f5e-465a-bbe5-73bbfb82cd5c",
"name": "Hello world",
"request" : {
"url" : "/hello-world",
"method" : "ANY"
},
"response" : {
"status" : 200,
"body" : "Hi!"
},
"uuid" : "9086b24f-4f5e-465a-bbe5-73bbfb82cd5c"
} ]
}
I'd like to know how I can split each object into its own file, with each file named after the id of the object.
E.g.:
bcf3559f-7ff7-406b-a4f1-6d3e9ac00e63.json
9086b24f-4f5e-465a-bbe5-73bbfb82cd5c.json
I have got as far as this but can't get it over the line:
jq -c '.mappings = (.mappings[] | [.])' mappings.json |
while read -r json ; do
  N=$((N+1))
  jq . <<< "$json" > "tmp/file${N}.json"
done
I'd recommend printing the id on one line, and the corresponding object on the next. For example:
# -r prints the id without surrounding JSON quotes; objects still print as compact JSON
jq -cr '.mappings[] | .id, .' mappings.json |
while read -r id ; do
  echo "id=$id"
  read -r json
  jq . <<< "$json" > "tmp/${id}.json"
done
I would write a simple Python script instead (or the equivalent in your favorite, general-purpose programming language).
import sys, os, json

d = json.load(sys.stdin)
for o in d['mappings']:
    with open(os.path.join('tmp', o['id'] + '.json'), 'w') as f:
        json.dump(o, f)
This would be more efficient and less error-prone, at least until jq gets some sort of output built-in:
# hypothetical
jq '.mappings[] | output("tmp/\(.id).json")' mappings.json
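In the meantime, a plain-bash workaround is to extract the ids and re-filter the file once per id (a sketch; quadratic in the number of mappings, but fine for small files):
mkdir -p tmp
# one pass to list the ids, then one jq call per id to pull out its object
jq -r '.mappings[].id' mappings.json |
while read -r id ; do
  jq --arg id "$id" '.mappings[] | select(.id == $id)' mappings.json > "tmp/${id}.json"
done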

MongoDB search using $in array not working

I'm using MongoDB shell version: 2.4.8, and would simply like to know why a nested array search doesn't work quite as expected.
Assume we have 2 document collections, (a) Users:
{
  "_id" : ObjectId("u1"),
  "username" : "user1",
  "org_ids" : [
    ObjectId("o1"),
    ObjectId("o2")
  ]
}
{
  "_id" : ObjectId("u2"),
  "username" : "user2",
  "org_ids" : [
    ObjectId("o1")
  ]
}
and (b) Organisations:
{
  "_id" : ObjectId("o1"),
  "name" : "Org 1"
}
{
  "_id" : ObjectId("o2"),
  "name" : "Org 2"
}
Collections have indexes defined for
Users._id, Users.org_ids, and Organisations._id
I would like to find all Organisations a specific user is a member of.
I've tried this:
> myUser = db.Users.find( { _id: ObjectId("u1") })
> db.Organisations.find( { _id : { $in : [myUser.org_ids] }})
yet it yields nothing as a result. I've also tried this:
> myUser = db.Users.find( { _id: ObjectId("u1") })
> db.Organisations.find( { _id : { $in : myUser.org_ids }})
but it outputs the error:
error: { "$err" : "invalid query", "code" : 12580 }
(which basically says you need to pass $in an array)... but that's what I thought I was doing originally? Baffled.
Any ideas what I'm doing wrong?
db.collection.find() returns a cursor, according to the documentation, so myUser.org_ids is undefined, but the $in field must be an array. Let's see the solution!
_id is unique in a collection. So you can do findOne:
myUser = db.Users.findOne( { _id: ObjectId("u1") })
db.Organisations.find( { _id : { $in : myUser.org_ids }})
If you are searching for a non-unique field you can use toArray:
myUsers = db.Users.find( { username: /^user/ }).toArray()
Then myUsers will be an array of objects matching to the query.
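Putting the non-unique case together, a sketch like the following should work (hypothetical database name mydb; it flattens each user's org_ids before handing them to $in):
mongo mydb --eval '
  var ids = db.Users.find({ username: /^user/ }).toArray()
    .map(function (u) { return u.org_ids; })
    .reduce(function (a, b) { return a.concat(b); }, []);
  printjson(db.Organisations.find({ _id: { $in: ids } }).toArray());
'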

Delete Elasticsearch index without deleting its mappings

How can I delete data from my elasticsearch database without deleting my index mapping?
I am using the Tire gem, and its delete command deletes all my mappings, after which the create command has to run once again. I want to avoid the create command being run again and again.
Please help me out with this.
Found it at http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-delete-by-query.html
DELETE <index>/_query
{
  "query" : {
    "match_all": {}
  }
}
You can also just delete a specific type by changing it to DELETE <index>/<type>/_query
This will delete the data and maintain the mappings, setting, etc.
You can use index templates, which will be applied to indices whose name matches a pattern.
That way you can simply delete an index, using the delete index api (way better than deleting all documents in it), and when you recreate the same index the matching index templates will get applied to it, so that you don't need to recreate its mappings, settings, warmers as well...
What happens is that the mappings will get deleted as they refer to the index that you deleted, but since they are stored in the index templates as well you won't need to resubmit them again when recreating the same index later on.
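A minimal sketch of such a template (hypothetical index pattern and mapping; note that older releases use the template key for the pattern, while newer ones use index_patterns):
curl -XPUT 'localhost:9200/_template/my_template' -H 'Content-Type: application/json' -d'
{
  "template": "myindex-*",
  "mappings": {
    "mytype": {
      "properties": {
        "title": { "type": "string" }
      }
    }
  }
}
'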
Due to the way Elasticsearch deletes its documents (by flagging them for deletion in a bitset), it wouldn't be worthwhile to iterate through X amount of documents and flag them for deletion. I believe that when you flush an index it frees memory by removing all documents with the delete bit set, which is an expensive operation that slows down the shards the index resides on.
Hope this helps.
Updating Yehosef's answer based on the latest docs (6.2 as of this post):
POST <index>/_delete_by_query
{
  "query" : {
    "match_all": {}
  }
}
Deleting by query was deprecated in 1.5.3.
You should use the scroll/scan API to find all matching ids and then issue a bulk request to delete them.
As documented here
curl -XGET 'localhost:9200/realestate/houses/_search?scroll=1m' -d '
{
  "query": {
    "match_all" : { }
  },
  "fields": []
}
'
and then the bulk delete (don't forget to put a new line after the last row)
curl -XPOST 'localhost:9200/_bulk' -d '
{ "delete" : { "_index" : "realestate", "_type" : "houses", "_id" : "1" } }
{ "delete" : { "_index" : "realestate", "_type" : "houses", "_id" : "2" } }
{ "delete" : { "_index" : "realestate", "_type" : "houses", "_id" : "3" } }
{ "delete" : { "_index" : "realestate", "_type" : "houses", "_id" : "4" } }
{ "delete" : { "_index" : "realestate", "_type" : "houses", "_id" : "5" } }
{ "delete" : { "_index" : "realestate", "_type" : "houses", "_id" : "6" } }
{ "delete" : { "_index" : "realestate", "_type" : "houses", "_id" : "7" } }
{ "delete" : { "_index" : "realestate", "_type" : "houses", "_id" : "8" } }
'
