NiFi Jolt: how to convert arrays to a JSON array of objects

I don't know how to use Jolt's transformations such as shift, default, and modify-default-beta.
I have the following arrays as input:
{
"a": [
[
"1",
"2",
"3"
],
[
"11",
"22",
"33"
]
],
"b": [
"age",
"name",
"address"
]
}
Expected output:
[
{
"age": "1",
"name": "2",
"address": "3"
},
{
"age": "11",
"name": "22",
"address": "33"
}
]

You can use the following shift transformation spec:
[
{
"operation": "shift",
"spec": {
"a": {
"*": {
"*": {
"@": "[&2].@(4,b[&1])"
}
}
}
}
}
]
Here we walk through the sub-indexes of the array a in the order
a[0][0]
a[0][1]
a[0][2]
a[1][0]
a[1][1]
a[1][2]
and match them step by step with
b[0]
b[1]
b[2]
by fetching the value from the array b after traversing 4 levels up the tree
(one level for the : and three for the opening { characters),
while the leading [&2] arranges the results as an array, one object per outer index of a.
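To check the intended result outside NiFi, the same index walk can be sketched in plain Python. This only illustrates what the spec is meant to compute, not how Jolt works internally; the function name rows_to_objects is made up for the example.

```python
import json


def rows_to_objects(doc):
    """Pair each value a[i][j] with the key b[j], producing one object per row a[i]."""
    result = []
    for i, row in enumerate(doc["a"]):      # i plays the role of [&2] (outer index)
        obj = {}
        for j, value in enumerate(row):     # j plays the role of [&1] (inner index)
            obj[doc["b"][j]] = value        # key is looked up in the sibling array b
        result.append(obj)
    return result


doc = {
    "a": [["1", "2", "3"], ["11", "22", "33"]],
    "b": ["age", "name", "address"],
}
print(json.dumps(rows_to_objects(doc), indent=2))
```

The nested loops visit a[0][0], a[0][1], ... in the same order as listed above, so each row of a becomes one object keyed by the entries of b.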

Related

JOLT: Merge specific data from JSON array using id key

I'm getting data in a specific format from an API and I have to convert it to a cleaner version.
What I get from the API is JSON like this (you can see that the first fields are duplicated across records while the investor differs).
{
"clubhouse": [
{
"id": "01",
"statusId": "ok",
"stateid": "2",
"TypeId": "3",
"investors": [
{
"investor": {
"id": "1234",
"gender": "01"
},
"inamount": "1500000",
"ratio": "12"
}
]
},
{
"id": "01",
"statusId": "ok",
"stateid": "2",
"TypeId": "3",
"investors": [
{
"investor": {
"id": "4321",
"gender": "02"
},
"inamount": "1700000",
"ratio": "12"
}
]
},
{
"id": "02",
"statusId": "ok",
"stateid": "2",
"TypeId": "3",
"investors": [
{
"investor": {
"id": "1333",
"gender": "01"
},
"inamount": "1500000",
"ratio": "12"
}
]
},
{
"id": "03",
"statusId": "ok",
"stateid": "5",
"TypeId": "3",
"investors": [
{
"investor": {
"id": "",
"gender": ""
},
"inamount": "",
"ratio": ""
}
]
},
{
"id": "02",
"statusId": "ok",
"stateid": "2",
"TypeId": "3",
"investors": [
{
"investor": {
"id": "1334",
"gender": "02"
},
"inamount": "1900000",
"ratio": "12"
}
]
}
]
}
I need to merge the investors and eliminate the duplicated information. The expected result is
{
"clubhouse": [
{
"id": "01",
"statusId": "ok",
"stateid": "2",
"TypeId": "3",
"investors": [
{
"investor": {
"id": "1234",
"gender": "01"
},
"inamount": "1500000",
"ratio": "12"
},
{
"investor": {
"id": "4321",
"gender": "02"
},
"inamount": "1700000",
"ratio": "12"
}
]
},
{
"id": "02",
"statusId": "ok",
"stateid": "2",
"TypeId": "3",
"investors": [
{
"investor": {
"id": "1333",
"gender": "01"
},
"inamount": "1500000",
"ratio": "12"
},
{
"investor": {
"id": "1334",
"gender": "02"
},
"inamount": "1900000",
"ratio": "12"
}
]
},
{
"id": "03",
"statusId": "ok",
"stateid": "5",
"TypeId": "3",
"investors": [
{
"investor": {
"id": "1555",
"gender": "01"
},
"inamount": "2000000",
"ratio": "15"
}
]
}
]
}
I tried a couple of Jolt specs and managed to merge the fields, but not to eliminate the duplicates.
You can start by grouping by the id values, such as
[
{
// group by "id" values to create separate objects
"operation": "shift",
"spec": {
"*": {
"*": {
"*": "@(1,id).&",
"investors": {
"*": {
"*": {
// &3 goes 3 levels up to grab the literal "investors"; [&4] goes 4 levels up the tree to reach the indexes of the "clubhouse" array; & replicates the key of the current key-value pair
"@": "@(4,id).&3[&4].&"
}
}
}
}
}
}
},
{
// get rid of "null" values
"operation": "modify-overwrite-beta",
"spec": {
"*": "=recursivelySquashNulls"
}
},
{
// keep only the first of the repeated values for each field, but keep every entry of "investors"
"operation": "cardinality",
"spec": {
"*": {
"*": "ONE",
"investors": "MANY"
}
}
},
{
// get rid of object labels
"operation": "shift",
"spec": {
"*": ""
}
}
]
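As a cross-check of the target shape from the question, the grouping idea can be sketched in plain Python. This is an illustration under the assumption that records sharing an id carry identical scalar fields; the helper name merge_by_id is hypothetical, the sample data is trimmed, and the code is not a translation of Jolt's internals.

```python
def merge_by_id(clubhouse, merge_key="investors"):
    """Group records by "id": keep the first copy of every other field and
    concatenate the lists stored under merge_key."""
    merged = {}                                  # dicts keep insertion order
    for record in clubhouse:
        target = merged.setdefault(record["id"], {})
        for key, value in record.items():
            if key == merge_key:
                target.setdefault(key, []).extend(value)
            else:
                target.setdefault(key, value)    # first value wins, like cardinality ONE
    return list(merged.values())


clubhouse = [
    {"id": "01", "statusId": "ok", "investors": [{"investor": {"id": "1234"}, "ratio": "12"}]},
    {"id": "01", "statusId": "ok", "investors": [{"investor": {"id": "4321"}, "ratio": "12"}]},
    {"id": "02", "statusId": "ok", "investors": [{"investor": {"id": "1333"}, "ratio": "12"}]},
]
result = {"clubhouse": merge_by_id(clubhouse)}
print(result)
```

Comparing this output with the output of the Jolt chain on the same sample is a quick way to spot a level reference that points at the wrong depth.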

JOLT: Merge specific data from JSON array using id key and leave other arrays untouched

I previously had the issue of merging records to avoid duplicates and produce a cleaner version of the JSON. I got a solution here that worked like a charm for a while, but after more arrays were added inside the JSON, things got a little tricky.
I have this array:
{
"clubhouse": [
{
"id": "01",
"statusId": "ok",
"stateid": "2",
"nationalities": [
{
"nationalityid": "1"
},
{
"nationalityid": "2"
},
{
"nationalityid": "3"
}
],
"TypeId": "3",
"investors": [
{
"investor": {
"id": "1234",
"gender": "01"
},
"inamount": "1500000",
"ratio": "12"
}
]
},
{
"id": "01",
"statusId": "ok",
"stateid": "2",
"nationalities": [
{
"nationalityid": "1"
},
{
"nationalityid": "2"
},
{
"nationalityid": "3"
}
],
"TypeId": "3",
"investors": [
{
"investor": {
"id": "4321",
"gender": "02"
},
"inamount": "1700000",
"ratio": "12"
}
]
},
{
"id": "02",
"statusId": "ok",
"stateid": "2",
"nationalities": [
{
"nationalityid": "3"
},
{
"nationalityid": "4"
},
{
"nationalityid": "5"
}
],
"TypeId": "3",
"investors": [
{
"investor": {
"id": "1333",
"gender": "01"
},
"inamount": "1500000",
"ratio": "12"
}
]
},
{
"id": "03",
"statusId": "ok",
"stateid": "5",
"nationalities": [
{
"nationalityid": "3"
},
{
"nationalityid": "4"
},
{
"nationalityid": "5"
}
],
"TypeId": "3",
"investors": [
{
"investor": {
"id": "",
"gender": ""
},
"inamount": "",
"ratio": ""
}
]
},
{
"id": "02",
"statusId": "ok",
"stateid": "2",
"nationalities": [
{
"nationalityid": "3"
},
{
"nationalityid": "4"
},
{
"nationalityid": "5"
}
],
"TypeId": "3",
"investors": [
{
"investor": {
"id": "1334",
"gender": "02"
},
"inamount": "1900000",
"ratio": "12"
}
]
}
]
}
I was using this Jolt spec, but it doesn't work with the nationalities, since it loses the array they are in.
[
{
// group by "id" values to create separate objects
"operation": "shift",
"spec": {
"*": {
"*": {
"*": "@(1,id).&",
"investors": {
"*": {
"*": {
// &3 goes 3 levels up to grab the literal "investors"; [&4] goes 4 levels up the tree to reach the indexes of the "clubhouse" array; & replicates the key of the current key-value pair
"@": "@(4,id).&3[&4].&"
}
}
}
}
}
}
},
{
// get rid of "null" values
"operation": "modify-overwrite-beta",
"spec": {
"*": "=recursivelySquashNulls"
}
},
{
// pick only the first components from the repeated values populated within the arrays
"operation": "cardinality",
"spec": {
"*": {
"*": "ONE",
"investors": "MANY"
}
}
},
{
// get rid of object labels
"operation": "shift",
"spec": {
"*": ""
}
}
]
What I need to get is something like this:
{
"clubhouse": [
{
"id": "01",
"statusId": "ok",
"stateid": "2",
"nationalities": [
{
"nationalityid": "1"
},
{
"nationalityid": "2"
},
{
"nationalityid": "3"
}
],
"TypeId": "3",
"investors": [
{
"investor": {
"id": "1234",
"gender": "01"
},
"inamount": "1500000",
"ratio": "12"
},
{
"investor": {
"id": "4321",
"gender": "02"
},
"inamount": "1700000",
"ratio": "12"
}
]
},
{
"id": "02",
"statusId": "ok",
"stateid": "2",
"nationalities": [
{
"nationalityid": "3"
},
{
"nationalityid": "4"
},
{
"nationalityid": "5"
}
],
"TypeId": "3",
"investors": [
{
"investor": {
"id": "1333",
"gender": "01"
},
"inamount": "1500000",
"ratio": "12"
},
{
"investor": {
"id": "1334",
"gender": "02"
},
"inamount": "1900000",
"ratio": "12"
}
]
},
{
"id": "03",
"statusId": "ok",
"stateid": "5",
"nationalities": [
{
"nationalityid": "3"
},
{
"nationalityid": "4"
},
{
"nationalityid": "5"
}
],
"TypeId": "3",
"investors": [
{
"investor": {
"id": "",
"gender": ""
},
"inamount": "",
"ratio": ""
}
]
}
]
}
You can extend the first shift transformation with a new "nationalities" entry whose level references are each one lower than those of the existing "investors" entry, because it sits one level shallower. If the remaining specs are kept as they are, the existing cardinality transformation already picks only the first of the repeated, identical "nationalities" arrays:
[
{
"operation": "shift",
"spec": {
"*": {
"*": {
"*": "@(1,id).&",
"nationalities": {
"*": {
"@": "@(3,id).&2[&3][]"
}
},
"investors": {
"*": {
"*": {
"@": "@(4,id).&3[&4].&"
}
}
}
}
}
}
},
...
]
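Keeping only the first of the repeated "nationalities" arrays loses nothing only if records that share an id carry identical copies. That assumption can be checked on sample data with a short Python sketch; the helper name conflicting_fields is hypothetical and not part of Jolt or NiFi.

```python
def conflicting_fields(clubhouse, merge_key="investors"):
    """Return {id: [field, ...]} for fields that differ between records sharing an id.
    The merge_key field is skipped because it is expected to differ."""
    first_seen = {}
    conflicts = {}
    for record in clubhouse:
        baseline = first_seen.setdefault(record["id"], record)
        for key in record.keys() | baseline.keys():
            if key == merge_key:
                continue
            if record.get(key) != baseline.get(key):
                fields = conflicts.setdefault(record["id"], [])
                if key not in fields:
                    fields.append(key)
    return conflicts


sample = [
    {"id": "01", "nationalities": [{"nationalityid": "1"}], "investors": [{"ratio": "12"}]},
    {"id": "01", "nationalities": [{"nationalityid": "1"}], "investors": [{"ratio": "15"}]},
    {"id": "02", "nationalities": [{"nationalityid": "3"}], "investors": []},
    {"id": "02", "nationalities": [{"nationalityid": "4"}], "investors": []},
]
print(conflicting_fields(sample))
```

An empty result means the "first one wins" behaviour is safe for that data set; any id that shows up needs a decision on which copy to keep.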

$lookup: compare collection data with an array in an aggregate result in MongoDB

I want to compare a collection with an array in an aggregate result.
I have the following two collections.
chat collection
chat.tags is an array of reference keys that come from the tag collection.
"chat": [
{
"id": "test1",
"tags": [
"AAA",
"BBB",
"CCC",
"AAA"
]
},
{
"id": "test2",
"tags": [
"AAA",
"BBB",
"CCC"
]
}
]
tag collection
"tag": [
{
"id": "1234",
"key": "AAA",
"name": "a"
},
{
"id": "1235",
"key": "BBB",
"name": "b"
},
{
"id": "1236",
"key": "CCC",
"name": "c"
},
{
"id": "1237",
"key": "DDD",
"name": "d"
}
]
I want the result for the chat whose id is "test1", together with its unique tags.
I want to get the following result with a MongoDB aggregation.
Is it possible with from, let, and pipeline when using $lookup?
[
{
"chat": [
{
"id": "test1",
"setTags": [
"AAA",
"BBB",
"CCC"
]
}
],
"tag": [
{
"id": "1234",
"key": "AAA",
"name": "a"
},
{
"id": "1235",
"key": "BBB",
"name": "b"
},
{
"id": "1236",
"key": "CCC",
"name": "c"
}
]
}
]
Please help me.
This can be achieved with a simple $lookup, like so:
db.chat.aggregate([
{
$match: {
id: "test1"
}
},
{
$lookup: {
from: "tag",
localField: "tags",
foreignField: "key",
as: "tagDocs"
}
},
{
$project: {
chat: [
{
id: "$id",
setTags: { $setUnion: ["$tags", []] } // unique tags; $setUnion does not guarantee element order
}
],
tag: "$tagDocs"
}
}
])
I didn't fully understand the output structure you want, but it can easily be changed with a different $project stage.
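To see what the pipeline is meant to produce without a running MongoDB instance, the $match and $lookup stages can be imitated on plain Python lists. This is a rough in-memory sketch, not driver code: the lookup_tags helper is hypothetical, and the order-preserving de-duplication of setTags is an assumption based on the "unique tags" requirement in the question.

```python
def lookup_tags(chats, tags, chat_id):
    """Imitate $match on id, then $lookup from "tag" with localField "tags" and foreignField "key"."""
    results = []
    for chat in chats:
        if chat["id"] != chat_id:                 # $match: { id: chat_id }
            continue
        wanted = set(chat["tags"])
        # With an array localField, a foreign document matches if its key equals any element.
        tag_docs = [tag for tag in tags if tag["key"] in wanted]
        unique_tags = list(dict.fromkeys(chat["tags"]))   # drop repeats, keep first-seen order
        results.append({
            "chat": [{"id": chat["id"], "setTags": unique_tags}],
            "tag": tag_docs,
        })
    return results


chats = [
    {"id": "test1", "tags": ["AAA", "BBB", "CCC", "AAA"]},
    {"id": "test2", "tags": ["AAA", "BBB", "CCC"]},
]
tags = [
    {"id": "1234", "key": "AAA", "name": "a"},
    {"id": "1235", "key": "BBB", "name": "b"},
    {"id": "1236", "key": "CCC", "name": "c"},
    {"id": "1237", "key": "DDD", "name": "d"},
]
matched = lookup_tags(chats, tags, "test1")
print(matched)
```

The tag with key "DDD" is left out because no element of chat.tags refers to it, which matches the expected result in the question.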
--- EDIT ---
With the let/pipeline $lookup syntax introduced in MongoDB 3.6, the pipeline remains the same; only the $lookup stage changes:
{
$lookup: {
from: "tag",
let: {
tagKeys: "$tags"
},
pipeline: [
{
$match: {
$expr: {
$in: [
"$key",
"$$tagKeys"
]
}
}
}
],
as: "tagDocs"
}
},

Use Jolt to flatten an array of objects which contain array

Can Jolt flatten an array of objects that contain arrays? For example, is there a way to write Jolt transformation specs for the following input and output?
Jolt https://github.com/bazaarvoice/jolt
Input:
[
{"Id": "111", ["mobile": "1111", "home": "1112"]},
{"Id": "222", ["mobile": "2221"]}
]
Output:
[
{"Id": "111", "mobile": "1111"},
{"Id": "111", "home": "1112"},
{"Id": "222", "mobile": "2221"}
]
I didn't find a way to represent the output array indexes.
Assuming the input is corrected to valid JSON as
[
{
"Id": "111",
"Val": [
{
"mobile": "1111"
},
{
"home": "1112"
}
]
},
{
"Id": "222",
"Val": [
{
"mobile": "2221"
}
]
}
]
then applying the following transformation
[
{
"operation": "shift",
"spec": {
"*": {
"Id": null,
"Val": {
"*": {
"@": "[&3].[&0]",
"@(2,Id)": "[&3].[&0].id"
}
}
}
}
},
{
"operation": "shift",
"spec": {
"*": {
"*": "[]"
}
}
}
]
yields the expected output:
[ {
"mobile" : "1111",
"id" : "111"
}, {
"home" : "1112",
"id" : "111"
}, {
"mobile" : "2221",
"id" : "222"
} ]
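The flattening itself is easy to state outside Jolt, which helps when checking the spec's output. Below is a minimal Python sketch of the same reshaping, assuming the corrected input with a "Val" array as above; the function name flatten is chosen only for this example.

```python
def flatten(records):
    """Emit one object per entry of "Val", each tagged with the parent's "Id" as "id"."""
    flat = []
    for record in records:
        for entry in record["Val"]:
            flat.append({**entry, "id": record["Id"]})
    return flat


records = [
    {"Id": "111", "Val": [{"mobile": "1111"}, {"home": "1112"}]},
    {"Id": "222", "Val": [{"mobile": "2221"}]},
]
flattened = flatten(records)
print(flattened)
```

The output order follows the input order, so no explicit output index is needed: each (record, entry) pair simply appends one object.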

How to assign an array value to an object using JOLT?

I'm trying to flatten a json file and convert array elements to objects and assign those object values from another array. I will post both desired output and my current spec in a response:
Here's my input (required output in response):
{
"information": {
"Id": "2",
"cId": "P2",
"sId": 11,
"dataFrom": "4/15/2018T6:31:02Z",
"dataTo": "4/15/2018T6:42:02Z"
},
"description": {
"indicator": "SomeName",
"details": {
"headers": [
"id",
"aId",
"NAME4"
],
"values": [
[
1609,
12,
"NAME3"
],
[
1610,
11,
"NAME2"
]
]
}
}
}
