Merge json arrays with duplicate keys


I want to merge two JSON arrays with the help of jq. Each object in the arrays contains a name field, which lets me group by it and merge the two arrays into one.
LABELS
[
{
"name": "power_branch",
"description": "master"
},
{
"name": "test_branch",
"description": "main"
}
]
RUNNERS
[
{
"name": "power_branch",
"runner": "power",
"runner_tag": "macos"
},
{
"name": "power_branch",
"runner": "power",
"runner_tag": "ubuntu"
},
{
"name": "test_branch",
"runner": "tester",
"runner_tag": ""
},
{
"name": "development",
"runner": "dev",
"runner_tag": "ubuntu"
}
]
Desired Output
[
{
"name": "power_branch",
"description": "master",
"runner": "power",
"runner_tag": "macos"
},
{
"name": "power_branch",
"description": "master",
"runner": "power",
"runner_tag": "ubuntu"
},
{
"name": "test_branch",
"description": "main",
"runner": "tester",
"runner_tag": ""
}
]
I tried the following script, but the power_branch entry was overwritten. Instead, I want a separate entry for each runner_tag.
#!/usr/bin/bash
LABELS='[{"name": "power_branch","description": "master"},{"name": "test_branch","description": "main"}]'
RUNNERS='
[
{ "name": "power_branch", "runner": "power", "runner_tag": "macos" },
{ "name": "power_branch", "runner": "power", "runner_tag": "ubuntu" },
{ "name": "test_branch", "runner": "tester", "runner_tag": "" },
{ "name": "development", "runner": "dev", "runner_tag": "ubuntu" }
]
'
FINAL=$(jq -s '[ .[0] + .[1] | group_by(.name)[] | select(length > 1) | add]' <(echo "$LABELS") <(echo "$RUNNERS"))
echo "$FINAL"
OUTPUT
[
{
"name": "power_branch",
"description": "master",
"runner": "power",
"runner_tag": "ubuntu"
},
{
"name": "test_branch",
"description": "main",
"runner": "tester",
"runner_tag": ""
}
]

If you have two files, labels.json and runners.json, you can pass the latter (runners) in as a variable using --argjson. Then use map on the input array (labels) to add to each element the fields of every runner that select matches by name. A label with several matching runners yields one merged object per runner.
jq --argjson runners "$(cat runners.json)" '
map(.name as $name | . + ($runners[] | select(.name == $name)))
' labels.json
However, this places the whole runners array on your shell's command line (--argjson takes two strings: a name and a value), which can overflow the argument-length limit if the runners array gets big enough.
Therefore, instead of using command substitution "$(…)", you could read the runners file directly, using either --slurpfile at the cost of another iteration level ([][]), or (despite the manual saying not to use it) --argfile with just a single iteration level as before:
jq --slurpfile runners runners.json '
map(.name as $name | . + ($runners[][] | select(.name == $name)))
' labels.json
jq --argfile runners runners.json '
map(.name as $name | . + ($runners[] | select(.name == $name)))
' labels.json
To circumvent all these issues, @peak suggested using input for each file together with the -n option. Note that this requires the two files to be provided in this exact order, as they are read sequentially.
jq -n 'input as $runners | input |
map(.name as $name | . + ($runners[] | select(.name == $name)))
' runners.json labels.json
As labels can be passed on directly as the filter's main input (in contrast to runners, which is stored in a variable for later use), this can be simplified further by dropping the -n option. Without -n, the first file becomes the main input and input reads the next one, so the order of the files still matters, and labels.json must now come first:
jq 'input as $runners |
map(.name as $name | . + ($runners[] | select(.name == $name)))
' labels.json runners.json
Finally, here's yet another approach using the SQL-style operators INDEX and JOIN, which were introduced in jq v1.6. It also needs just one call to input, and the order of the files still matters because the runners array must be the filter's primary input.
jq '
JOIN(INDEX(input[]; .name); .name) | map(select(.[1]) | add)
' runners.json labels.json
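
To see why the attempt in the question loses a row, it can help to run both filters side by side. The following is a minimal sketch, assuming bash and jq 1.5 or later; the temporary directory and the variable names collapsed and merged are made up for illustration. group_by followed by add folds every object of a group into a single object, so the second power_branch runner overwrites the first, whereas pairing each label with every matching runner keeps one object per runner.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical scratch directory holding the sample data from the question.
tmpdir=$(mktemp -d)

cat > "$tmpdir/labels.json" <<'EOF'
[{"name":"power_branch","description":"master"},
 {"name":"test_branch","description":"main"}]
EOF

cat > "$tmpdir/runners.json" <<'EOF'
[{"name":"power_branch","runner":"power","runner_tag":"macos"},
 {"name":"power_branch","runner":"power","runner_tag":"ubuntu"},
 {"name":"test_branch","runner":"tester","runner_tag":""},
 {"name":"development","runner":"dev","runner_tag":"ubuntu"}]
EOF

# The question's filter: add merges each group into ONE object,
# so only the last runner_tag of a group survives.
collapsed=$(jq -c -s '[.[0] + .[1] | group_by(.name)[] | select(length > 1) | add]' \
  "$tmpdir/labels.json" "$tmpdir/runners.json")

# The answer's filter (-n variant): one output per (label, matching runner) pair.
merged=$(jq -c -n 'input as $runners | input
  | map(.name as $name | . + ($runners[] | select(.name == $name)))' \
  "$tmpdir/runners.json" "$tmpdir/labels.json")

echo "collapsed: $collapsed"
echo "merged:    $merged"

rm -rf "$tmpdir"
```

With the sample data, collapsed holds two objects and merged holds three, with development dropped in both cases because it has no label.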

Related

Creating json-file with json-objects array using jq in bash

I have an input row like this: 1374240, 1374241. I need to make json file:
{
"version": "1.0",
"tests": [
{
"id": 1374240,
"selector": ""
},
{
"id": 1374241,
"selector": ""
}
]
}
I built an array of JSON fragments:
idRow='1374240, 1374241'
IFS=',' read -r -a array <<<"$idRow"
trimmedArray=()
for id in "${array[@]}"; do
trimmedId="$(echo -e "${id}" | xargs)"
testRow="{\"id\":${trimmedId},\"selector\":\"\"}"
trimmedArray+=("$testRow")
done
echo "${trimmedArray[*]}"
Output:
{"id":1374240,"selector":""} {"id":1374241,"selector":""}
How can I insert it into the final JSON structure and write it to a file?
I have tried different variants with jq, but I can't get the final structure. Please help.

Read in the numbers as raw text using -R, split at the comma, use tonumber to convert the parts to numbers, and create the structure on the fly:
echo "1374240, 1374241" | jq -R '
{version:"1.0",tests:(
split(",") | map(
{id: tonumber, selector: ""}
)
)}
'
If you can omit the comma in the first place, it's even easier, because the numbers can then be read in directly, as they are themselves valid JSON:
echo "1374240 1374241" | jq -s '
{version:"1.0",tests: map(
{id: tonumber, selector: ""}
)}
'
Output:
{
"version": "1.0",
"tests": [
{
"id": 1374240,
"selector": ""
},
{
"id": 1374241,
"selector": ""
}
]
}
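
The question also asks how to write the result to a file, which is just a shell redirection. Here is a minimal sketch, assuming bash and jq 1.5 or later; the variable outfile and its temporary path are hypothetical. Instead of split and tonumber, this variant wraps the raw row in brackets and parses it with fromjson, which is a swapped-in technique that tolerates the spaces after the commas.

```shell
#!/usr/bin/env bash
set -euo pipefail

idRow='1374240, 1374241'
outfile=$(mktemp)   # hypothetical output path; use your own file name here

# -R reads the row as a raw string; "[" + . + "]" turns it into a JSON
# array literal, which fromjson parses into real numbers.
jq -R '
  {version: "1.0",
   tests: ("[" + . + "]" | fromjson | map({id: ., selector: ""}))}
' <<<"$idRow" > "$outfile"

cat "$outfile"
```

This only works while every item in the row is itself valid JSON (plain numbers here); for arbitrary text, splitting and quoting would be needed instead.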

jq - subtracting one array from another using a single command

I have three operations with jq to get the right result. How can I do it within one command?
Here is a fragment from the source JSON file
[
{
"Header": {
"Tenant": "tenant-1",
"Rcode": 200
},
"Body": {
"values": [
{
"id": "0b0b-0c0c",
"name": "NumberOfSearchResults"
},
{
"id": "aaaa0001-0a0a",
"name": "LoadTest"
}
]
}
},
{
"Header": {
"Tenant": "tenant-2",
"Rcode": 200
},
"Body": {
"values": []
}
},
{
"Header": {
"Tenant": "tenant-3",
"Rcode": 200
},
"Body": {
"values": [
{
"id": "cccca0003-0b0b",
"name": "LoadTest"
}
]
}
},
{
"Header": {
"Tenant": "tenant-4",
"Rcode": 200
},
"Body": {
"values": [
{
"id": "0f0g-0e0a",
"name": "NumberOfSearchResults"
}
]
}
}
]
I apply two filters and create two intermediate JSON files. First I create the list of all tenants
jq -r '[.[].Header.Tenant]' source.json >all-tenants.json
And then I select to create an array of all tenants not having a particular key present in the Body.values[] array:
jq -r '[.[] | select (all(.Body.values[]; .name !="LoadTest") ) ] | [.[].Header.Tenant]' source.json >filter1.json
Results - all-tenants.json
["tenant-1",
"tenant-2",
"tenant-3",
"tenant-4"
]
filter1.json
["tenant-2",
"tenant-4"
]
And then I subtract filter1.json from all-tenants.json to get the difference:
jq -r -n --argfile filter filter1.json --argfile alltenants all-tenants.json '$alltenants - $filter|.[]'
Result:
tenant-1
tenant-3
Tenant names (the values of the "Tenant" key) are unique, and each occurs only once in source.json.
Just to clarify: I understand that I could write a select condition that gives me the same result as subtracting the two arrays.
What I want to understand is how to assign these two arrays to variables and use them directly in a single command, without the intermediate files.
Thanks

Use your filters to fill in the values of a new object and use the keys to refer to the arrays.
jq -r '{
"all-tenants": [.[].Header.Tenant],
"filter1": [.[]|select (all(.Body.values[]; .name !="LoadTest"))]|[.[].Header.Tenant]
} | .["all-tenants"] - .filter1 | .[]'
Note: the bracket syntax .["all-tenants"] is required because the key contains the special character "-". See the entry under Object Identifier-Index in the manual.

how can I assign and use these two arrays into vars directly in a single command not involving the intermediate files?
Simply store the intermediate arrays as jq "$-variables":
[.[].Header.Tenant] as $x
| ([.[] | select (all(.Body.values[]; .name !="LoadTest") ) ] | [.[].Header.Tenant]) as $y
| $x - $y
If you want to itemize the contents of $x - $y, then simply add a final .[] to the pipeline.
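
For completeness, the variable-based filter can be run as one shell command. This is a minimal sketch, assuming bash and jq 1.5 or later; the source data is trimmed to the fields the filter reads, and the names source_json, $all, $without and with_loadtest are made up for illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Reduced version of source.json: only Tenant and the value names are kept.
source_json='[
 {"Header":{"Tenant":"tenant-1"},"Body":{"values":[{"name":"NumberOfSearchResults"},{"name":"LoadTest"}]}},
 {"Header":{"Tenant":"tenant-2"},"Body":{"values":[]}},
 {"Header":{"Tenant":"tenant-3"},"Body":{"values":[{"name":"LoadTest"}]}},
 {"Header":{"Tenant":"tenant-4"},"Body":{"values":[{"name":"NumberOfSearchResults"}]}}
]'

# Both intermediate arrays live in jq variables; no temporary files are needed.
with_loadtest=$(jq -r '
  [.[].Header.Tenant] as $all
  | [.[] | select(all(.Body.values[]; .name != "LoadTest")) | .Header.Tenant] as $without
  | ($all - $without)[]
' <<<"$source_json")

echo "$with_loadtest"
```

Array subtraction removes every element of the right-hand array from the left-hand one, so the result lists the tenants that do have a LoadTest entry.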

jq: error (at <stdin>:1): Cannot index array with string "name" SOLVED

I'm trying to make a JSON collection of country, region, city, org, ip of intrusion attempts.
My JSON test info:
[
{
"total": 0,
"country": [
{
"name": "CN",
"nr": 0,
"region": [
{
"name": "Beijing",
"nr": 0,
"City": [
{
"name": "Haidian",
"nr": 0,
"Organisation": [
{
"name": "AS45090 Shenzhen Tencent Computer Systems Company Limited",
"nr": 0,
"IPS": [
{
"192.144.207.22": 0
}
]
}
]
}
]
}
]
},
{
"name": "NL",
"nr": 0,
"region": [
{
"name": "Noord Holland",
"nr": 0,
"City": [
{
"name": "Amsterdam",
"nr": 0,
"Organisation": [
{
"name": "FEAS",
"nr": 0,
"IPS": [
{
"192.162.1.1": 0
}
]
}
]
}
]
}
]
}
]
}
]
I load the existing JSON (test) string into $geoInfo. Now I'm trying to change the nr value in the object where "name": "CN".
I have tested two solutions:
geoInfo="$( jq --arg country ${tmpGeo[0]} --arg count $count -r '.country | map( if .name == $country then . + { .nr=$count } else . end )'<<<"${geoInfo}" )"
And
geoInfo="$( jq --arg country ${tmpGeo[0]} --arg count $count -r '.country | select(.[].name == "CN") | .nr) = $count'<<<"${geoInfo}" )"
With both solutions I get:
jq: error (at <stdin>:1): Cannot index array with string "name"
I use jq version 1.6.
What is going wrong?

Would you please try the following:
geoInfo=$(jq "(.[].country[] | select(.name == \"CN\") | .nr) = 1" <<<"$geoInfo")

Just forget all of this! I'm terribly sorry. The errors I got came from a statement in the else section, while I was changing the query in the then section. I probably worked too long on the code yesterday. My first solution had a small mistake; after changing .nr= to "nr": it worked:
geoInfo="$( jq --arg country ${tmpGeo[0]} --arg count $count -r '.country | map( if .name == $country then . + { "nr": $count } else . end )'<<<"${geoInfo}" )"
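
An error of the form Cannot index array with string "name" generally means that .name was applied to an array rather than to one of its elements, so the path has to iterate with [] at every array level. Below is a minimal sketch of the path-assignment form with shell values passed in safely, assuming bash and jq 1.5 or later; the data is cut down to two countries, and the values of country and count are hypothetical stand-ins for ${tmpGeo[0]} and $count.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Reduced test data: the top level is an array, and so is .country.
geoInfo='[{"total":0,"country":[{"name":"CN","nr":0},{"name":"NL","nr":0}]}]'
country='CN'   # hypothetical stand-in for ${tmpGeo[0]}
count=5        # hypothetical counter value

# --arg passes a string, --argjson passes JSON (here a number), so nr
# stays numeric. The parenthesized path selects what "=" updates.
geoInfo=$(jq -c --arg country "$country" --argjson count "$count" '
  (.[].country[] | select(.name == $country) | .nr) = $count
' <<<"$geoInfo")

echo "$geoInfo"
```

Using --arg for count instead would store the string "5" rather than the number 5.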

subtracting one json file from another in jq

Is there a way to compare two json files in jq? Specifically, I'd like to be able to remove objects from one json file if they occur in another json file. Basically, subtract one file from another. It would be a bonus if I could generalize this so that I could define the equality criteria for the objects, but this is not strictly necessary, it can be based strictly on the objects being identical.
So the more general case would look like this. Let's say I have a file that looks like this:
[
{
"name": "Cynthia",
"surname": "Craig",
"isActive": true,
"balance": "$2,426.88"
},
{
"name": "Elise",
"surname": "Long",
"isActive": false,
"balance": "$1,892.72"
},
{
"name": "Hyde",
"surname": "Adkins",
"isActive": true,
"balance": "$1,769.34"
},
{
"name": "Matthews",
"surname": "Jefferson",
"isActive": true,
"balance": "$1,991.42"
},
{
"name": "Kris",
"surname": "Norris",
"isActive": false,
"balance": "$2,137.11"
}
]
And I have a second file that looks like this:
[
{
"name": "Cynthia",
"surname": "Craig"
},
{
"name": "Kris",
"surname": "Norris"
}
]
I'd like to remove any objects from the first file where the name and surname fields match an object of the second file, so that the results should look like this:
[
{
"name": "Elise",
"surname": "Long",
"isActive": false,
"balance": "$1,892.72"
},
{
"name": "Hyde",
"surname": "Adkins",
"isActive": true,
"balance": "$1,769.34"
},
{
"name": "Matthews",
"surname": "Jefferson",
"isActive": true,
"balance": "$1,991.42"
}
]

The following solution is intended to be generic, efficient and as simple as possible subject to the first two objectives.
Genericity
For genericity, let us suppose that $one and $two are two arrays of JSON entities, and that we wish to find those items, $x, in $one such that ($x|filter) does not appear in map($two | filter), where filter is an arbitrary filter. (In the present instance, it is {surname, name}.)
The solution uses INDEX/1, which was added to jq after the official 1.5 release, so we begin by reproducing its definition:
def INDEX(stream; idx_expr):
  reduce stream as $row ({};
    .[$row|idx_expr|
      if type != "string" then tojson
      else .
      end] |= $row);
def INDEX(idx_expr): INDEX(.[]; idx_expr);
Efficiency
For efficiency, we will need to use a JSON object as a dictionary; since keys must be strings, we will need to ensure that when converting an object to a string, the objects are normalized. For this, we define normalize as follows:
# Normalize the input with respect to the order of keys in objects
def normalize:
  . as $in
  | if type == "object" then reduce keys[] as $key
         ( {}; . + { ($key): ($in[$key] | normalize) } )
    elif type == "array" then map( normalize )
    else .
    end;
To construct the dictionary, we simply apply (normalize|tojson):
def todict(filter):
INDEX(filter| normalize | tojson);
The solution
The solution is now quite simple:
# select those items from the input stream for which
# (normalize|tojson) is NOT in dict:
def MINUS(filter; $dict):
  select( $dict[filter | normalize | tojson] | not);
def difference($one; $two; filter):
  ($two | todict(filter)) as $dict
  | $one[] | MINUS( filter; $dict );
difference( $one; $two; {surname, name} )
Invocation
$ jq -n --argfile one one.json --argfile two two.json -f difference.jq
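
On jq 1.6 or later, INDEX is a built-in, so the dictionary idea can be tried without the helper definitions. The following is a minimal sketch under that assumption, using --slurpfile rather than --argfile; the helper key, the shortened sample data and the temporary directory are made up for illustration, and the key is built as an array so that no normalization of object key order is needed.

```shell
#!/usr/bin/env bash
set -euo pipefail

tmpdir=$(mktemp -d)   # hypothetical scratch directory

cat > "$tmpdir/one.json" <<'EOF'
[{"name":"Cynthia","surname":"Craig","isActive":true},
 {"name":"Elise","surname":"Long","isActive":false},
 {"name":"Kris","surname":"Norris","isActive":false}]
EOF

cat > "$tmpdir/two.json" <<'EOF'
[{"name":"Cynthia","surname":"Craig"},
 {"name":"Kris","surname":"Norris"}]
EOF

remaining=$(jq -c --slurpfile two "$tmpdir/two.json" '
  # Equality criterion (an assumption): same name and surname.
  def key: [.name, .surname] | tojson;
  # --slurpfile wraps the file content in an array, hence $two[0].
  ($two[0] | INDEX(key)) as $dict
  | map(select($dict[key] | not))
' "$tmpdir/one.json")

echo "$remaining"
rm -rf "$tmpdir"
```

Changing the body of key is enough to change which fields decide whether two objects count as equal.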

Here is a solution which uses --argfile and project/1 from pull/1062
def project(q):
    . as $in
  | reduce (q | if type == "object" then keys[] else .[] end) as $k (
      {}
    ; . + { ($k) : ($in[$k]) }
    )
;
map(
    reduce $arg[] as $a (
      .
    ; select(project($a) != $a)
    )
  | values
)
If you place the "second" file in second.json, the data in data.json and the above filter in filter.jq you can run this with
jq -M --argfile arg second.json -f filter.jq data.json
to produce
[
{
"name": "Elise",
"surname": "Long",
"isActive": false,
"balance": "$1,892.72"
},
{
"name": "Hyde",
"surname": "Adkins",
"isActive": true,
"balance": "$1,769.34"
},
{
"name": "Matthews",
"surname": "Jefferson",
"isActive": true,
"balance": "$1,991.42"
}
]
You can replace the expression select(project($a) != $a) with something else if you want to revise the equality criteria for the objects.
Thinking about this a little more we can eliminate the need for project/1 by using contains. This should be more efficient as it eliminates construction of a temporary object.
map(
reduce $arg[] as $a (
.
; select(.!=null and contains($a)==false)
)
| values
)
This can be further simplified using any:
map(select(any(.; contains($arg[]))==false))
which is short enough to be used directly on the command line:
jq -M --argfile arg second.json 'map(select(any(.; contains($arg[]))==false))' data.json
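
One thing to keep in mind with contains is that it compares strings by substring, so a name that merely starts with a listed name also counts as a match. Here is a minimal sketch of the difference, assuming bash and jq 1.5 or later; the record with the name "Cynthiana" is invented purely to trigger the effect, and by_contains and by_equality are illustrative variable names.

```shell
#!/usr/bin/env bash
set -euo pipefail

# "Cynthiana" is a made-up record whose name contains "Cynthia".
data='[{"name":"Cynthiana","surname":"Craig"},{"name":"Elise","surname":"Long"}]'
second='[{"name":"Cynthia","surname":"Craig"}]'

# contains: "Cynthiana" contains "Cynthia", so the record is removed.
by_contains=$(jq -c --argjson arg "$second" \
  'map(select(any(.; contains($arg[])) == false))' <<<"$data")

# Exact comparison of the projected fields keeps the record.
by_equality=$(jq -c --argjson arg "$second" \
  'map(select({name, surname} as $k | any($arg[]; . == $k) | not))' <<<"$data")

echo "contains: $by_contains"
echo "equality: $by_equality"
```

Which behavior is wanted depends on the data; the equality variant assumes the second file lists exactly the fields name and surname.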

jq solution:
jq --slurpfile s f2.json '[ .[] | . as $o | if (reduce $s[0][] as $i
([]; . + [($o | contains($i))]) | any) then empty else $o end ]' f1.json
The output:
[
{
"name": "Elise",
"surname": "Long",
"isActive": false,
"balance": "$1,892.72"
},
{
"name": "Hyde",
"surname": "Adkins",
"isActive": true,
"balance": "$1,769.34"
},
{
"name": "Matthews",
"surname": "Jefferson",
"isActive": true,
"balance": "$1,991.42"
}
]

JQ: Remove object from multiple arrays

I want to use jq to remove all objects with a given name from all arrays in the input data. For example deleting "Name1" from this:
{
"Category1": [
{
"name": "Name1",
"desc": "Desc1"
},
{
"name": "Name2",
"desc": "Desc2"
}
],
"Category2": [
{
"name": "Name1",
"desc": "Desc1"
},
{
"name": "Name3",
"desc": "Desc3"
}
],
"Category3": [
{
"name": "Name4",
"desc": "Desc4"
}
]
}
Should yield this:
{
"Category1": [
{
"name": "Name2",
"desc": "Desc2"
}
],
"Category2": [
{
"name": "Name3",
"desc": "Desc3"
}
],
"Category3": [
{
"name": "Name4",
"desc": "Desc4"
}
]
}
I haven't worked with jq, or indeed JSON, much and after several hours of googling and experimenting I haven't been able to figure it out. How would I do this?
The closest I managed was this:
cat input | jq 'keys[] as $k | .[$k] |= map( select( .name != "Name1"))'
This does filter each of the arrays but returns the result as three separate objects and this is not what I want.

If the structure of your input JSON is always as shown in your example, try this:
map_values(map(select(.name != "Name1")))
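
The attempt in the question returns three objects because keys[] as $k emits one result per key, each with only that key's array filtered; map_values instead rewrites every value inside a single object. A minimal sketch of the full command follows, assuming bash and jq 1.5 or later; the input is reduced to the name fields, and target is a hypothetical variable for passing the name in with --arg.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Reduced input: desc fields omitted for brevity.
input='{"Category1":[{"name":"Name1"},{"name":"Name2"}],"Category2":[{"name":"Name1"},{"name":"Name3"}],"Category3":[{"name":"Name4"}]}'
target='Name1'   # hypothetical: the name to remove

# map_values applies the inner filter to each category array and
# returns ONE object with the same keys.
result=$(jq -c --arg target "$target" \
  'map_values(map(select(.name != $target)))' <<<"$input")

echo "$result"
```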

Here is a solution that will remove all objects with the specified name, wherever they occur. It uses the generic function walk/1, which is a built-in in versions of jq > 1.5; its definition can therefore be omitted if your jq includes it, but there is no harm in including it redundantly, e.g. in a jq script.
# Apply f to composite entities recursively, and to atoms
def walk(f):
  . as $in
  | if type == "object" then
      reduce keys[] as $key
        ( {}; . + { ($key): ($in[$key] | walk(f)) } ) | f
    elif type == "array" then map( walk(f) ) | f
    else f
    end;
walk(if type == "object" and .name == "Name1" then empty else . end)
If you really only want to remove objects from arrays, then you could use:
walk(if type == "array" then map(select( type != "object" or .name != "Name1")) else . end)

Here is a solution which uses reduce and del
reduce keys[] as $k (
.
; del(.[$k][] | select(.name == "Name1"))
)
