I am trying to transform the JSON as below expected output. Stuck with the spec below. Can someone help in this?
There is an inner array with name "content" in the "results" array which I want to make it as a part of main array.
Input JSON
{
"total": 100,
"start": 1,
"page-length": 10,
"results": [
{
"index": 1,
"uri": "uri1/uri2",
"extracted": {
"kind": "object",
"content": [
{
"code": "A1",
"region": "APAC"
}
]
}
},
{
"index": 2,
"uri": "uri1/uri2",
"extracted": {
"kind": "object",
"content": [
{
"code": "B1",
"region": "AMER"
}
]
}
},
{
"index": 3,
"uri": "uri1/uri2",
"extracted": {
"kind": "object",
"content": [
{
"code": "C1",
"region": "APAC"
}
]
}
}
]
}
Expected json output
[
{
"code": "A1",
"region": "APAC"
},
{
"code": "B1",
"region": "AMER"
},
{
"code": "C1",
"region": "APAC"
}
]
Spec Tried
[
{
"operation": "shift",
"spec": {
"results": {
"*": {
"extracted": { "content": { "#": "&" } }
}
}
}
},
{
"operation": "shift",
"spec": {
"content": {
"#": "&"
}
}
}
]
Find the output below I am getting on Jolt tool
You can use # wildcard nested within square brackets in order to reach the level of the indexes of the "results" array such that
[
{
"operation": "shift",
"spec": {
"results": {
"*": {//the indexes of the "results" array
"extracted": {
"content": {
"*": {//the indexes of the "content" array
"*": "[#5].&"
}
}
}
}
}
}
}
]
Also the following spec, which repeats the content of the inner array without keys, will give the same result :
[
{
"operation": "shift",
"spec": {
"results": {
"*": {
"extracted": {
"content": {
"*": "[]"// [] seems like redundant but kept for the case the array has a single object.
}
}
}
}
}
}
]
which is quite similar to your tried one.
Related
Input is :
{
"Size": "2",
"done": "true",
"records": [
{
"Id": "a7g6s0000004GZuAAM",
"NN": "00096411.0",
"Name": "ISOLIN TRADE & INVEST"
},
{
"Id": "a7g6s0000004GZzAAM",
"Number": "00096412.0",
"Name": "ISOLIN"
}
]
}
Spec used:
[
{
"operation": "remove",
"spec": {
"records": {
"*": {
"attributes": " "
}
}
}
},
{
"operation": "shift",
"spec": {
"*": "&",
"records": {
"*": { //iterate on each object of records
"*": { //iterate on each element of object
"$": "Items[#1].Fields[].Name",
"#": "Items[#1].Fields[].Value"
}
}
}
}
}
]
Current Output:
{
"Size": "2",
"done": "true",
"Items": [
{
"Fields": [
{
"Name": "Id"
},
{
"Value": "a7g6s0000004GZuAAM"
},
{
"Name": "NN"
},
{
"Value": "00096411.0"
},
{
"Name": "Name"
},
{
"Value": "ISOLIN TRADE & INVEST"
},
{
"Name": "Id"
},
{
"Value": "a7g6s0000004GZzAAM"
},
{
"Name": "Number"
},
{
"Value": "00096412.0"
},
{
"Name": "Name"
},
{
"Value": "ISOLIN"
}
]
}
]
}
Expected output:
{
"Size": "2",
"done": "true",
"Items": [
{
"Fields": [
{
"Name": "Id",
"Value": "a7g6s0000004GZuAAM"
},
{
"Name": "NN",
"Value": "00096411.0"
},
{
"Name": "Name",
"Value": "ISOLIN TRADE & INVEST"
},
{
"Name": "Id",
"Value": "a7g6s0000004GZzAAM"
},
{
"Name": "Number",
"Value": "00096412.0"
},
{
"Name": "Name",
"Value": "ISOLIN"
}
]
}
]
}
You can seperate those attributes into individual objects through use of Items.&2.[#2]. pattern as prefix such as
[
{
"operation": "shift",
"spec": {
"*": "&",
"records": {
"*": {
"*": {
"$": "Items.&2.[#2].Name",
"#": "Items.&2.[#2].Value"
}
}
}
}
},
{
"operation": "shift",
"spec": {
"*": "&",
"Items": {
"*": {
"*": "Fields[]"
}
}
}
}
]
I am struggeling with some jolt transformation. I need to extract informations from an array, but also need some uppper level informations.
I have bills and some bills have multiple attachments. I want to store this attachments in a Postgress db and for each attachment aplly the bill id...
My input
[
{
"bill_id": 1,
"entities": [
{
"type": "alpha",
"data": "foo"
},
{
"type": "beta",
"data": "bar"
}
]
},
{
"bill_id": 2,
"entities": []
}
]
My desired output
[
{
"bill_id": 1,
"type": "alpha",
"data": "foo"
},
{
"bill_id": 1,
"type": "beta",
"data": "bar"
}
]
I would be very glad, if someone could help me out
Well, i found an answer, that perfectly matches my needs. A little bit tricky, but it is working fine with two shifts:
[
{
"operation": "shift",
"spec": {
"*": {
"entities": {
"*": {
"#(2,bill_id)": "[&3].[&1].bill_id",
"type": "[&3].[&1].type",
"data": "[&3].[&1].data"
}
}
}
}
},
{
"operation": "shift",
"spec": {
"*": {
"*": "[]"
}
}
}
]
I want to do a Jolt transformation with a json array, and i only want the ones with certain attribute.
For example:
Input:
{
"characteristic": [
{
"name": "BrandId",
"value": "b"
},
{
"name": "status",
"value": "SENT"
},
{
"name": "statusTxt",
"value": "sent"
}
]
}
I want output to be
{
"status":"SENT",
"statusTxt":"sent"
}
This will lead to the output you want based on your input:
[
{
"operation": "shift",
"spec": {
"characteristic": {
"*": {
"name": {
"status": {
"#(2,value)": "status"
},
"statusTxt": {
"#(2,value)": "statusTxt"
}
}
}
}
}
}
]
Can Jolt flatten an array of objects which contain array? For example, is there a way to write Jolt transformation specs for the following input and output?
Jolt https://github.com/bazaarvoice/jolt
Input:
[
{"Id": "111", ["mobile": "1111", "home": "1112"]},
{"Id": "222", ["mobile": "2221"]}
]
Output:
[
{"Id": "111", "mobile": "1111"},
{"Id": "111", "home": "1112"},
{"Id": "222", "mobile": "2221"}
]
I didn't find a way to represent the output array indexes.
assuming input is fixed as
[
{
"Id": "111",
"Val": [
{
"mobile": "1111"
},
{
"home": "1112"
}
]
},
{
"Id": "222",
"Val": [
{
"mobile": "2221"
}
]
}
]
then applying the following transformation
[
{
"operation": "shift",
"spec": {
"*": {
"Id": null,
"Val": {
"*": {
"#": "[&3].[&0]",
"#(2,Id)": "[&3].[&0].id"
}
}
}
}
},
{
"operation": "shift",
"spec": {
"*": {
"*": "[]"
}
}
}
]
reach the expected output:
[ {
"mobile" : "1111",
"id" : "111"
}, {
"home" : "1112",
"id" : "111"
}, {
"mobile" : "2221",
"id" : "222"
} ]
I need to aggregate an array as follows
Two document examples:
{
"_index": "log",
"_type": "travels",
"_id": "tnQsGy4lS0K6uT3Hwzzo-g",
"_score": 1,
"_source": {
"state": "saopaulo",
"date": "2014-10-30T17",
"traveler": "patrick",
"registry": "123123",
"cities": {
"saopaulo": 1,
"riodejaneiro": 2,
"total": 2
},
"reasons": [
"Entrega de encomenda"
],
"from": [
"CompraRapida"
]
}
},
{
"_index": "log",
"_type": "travels",
"_id": "tnQsGy4lS0K6uT3Hwzzo-g",
"_score": 1,
"_source": {
"state": "saopaulo",
"date": "2014-10-31T17",
"traveler": "patrick",
"registry": "123123",
"cities": {
"saopaulo": 1,
"curitiba": 1,
"total": 2
},
"reasons": [
"Entrega de encomenda"
],
"from": [
"CompraRapida"
]
}
},
I want to aggregate the cities array, to find out all the cities the traveler has gone to. I want something like this:
{
"traveler":{
"name":"patrick"
},
"cities":{
"saopaulo":2,
"riodejaneiro":2,
"curitiba":1,
"total":3
}
}
Where the total is the length of the cities array minus 1. I tried the terms aggregation and the sum, but couldn't output the desired output.
Changes in the document structure can be made, so if anything like that would help me, I'd be pleased to know.
in the document posted above "cities" is not a json array , it is a json object.
If changing the document structure is a possibility I would change cities in the document to be an array of object
example document:
cities : [
{
"name" :"saopaulo"
"visit_count" :"2",
},
{
"name" :"riodejaneiro"
"visit_count" :"1",
}
]
You would then need to set cities to be of type nested in the index mapping
"mappings": {
"<type_name>": {
"properties": {
"cities": {
"type": "nested",
"properties": {
"city": {
"type": "string"
},
"count": {
"type": "integer"
},
"value": {
"type": "long"
}
}
},
"date": {
"type": "date",
"format": "dateOptionalTime"
},
"registry": {
"type": "string"
},
"state": {
"type": "string"
},
"traveler": {
"type": "string"
}
}
}
}
After which you could use nested aggregation to get the city count per user.
The query would look something on these lines :
{
"query": {
"match": {
"traveler": "patrick"
}
},
"aggregations": {
"city_travelled": {
"nested": {
"path": "cities"
},
"aggs": {
"citycount": {
"cardinality": {
"field": "cities.city"
}
}
}
}
}
}