JSON: use jq to edit specific values in nested arrays

I'm trying to update values within an array inside an array using the utility jq. I've pasted the sample json below.
More specifically: within the sheets array, and then within the formulas array, I'd like to change each columnName with a value of "MONTH" to "YEAR". I'd like to do the same within the columnStyles array of each sheet: change each occurrence of "MONTH" to "YEAR".
This jq filter gets me the list of columnNames.
.sheets[1] | .formulas[] | .columnName
How can I edit the entire file in place by just updating the values I want? Do I use map with if?
And what if I wanted to edit a portion of a value? For example, in a formulaString property, just changing the part of the string that contains MONTH but leaving the rest intact?
{
"version": "6.1.1",
"className": "xyz",
"sheets": [
{
"name": "Pass1",
"sheetId": "95e6c2cd-abbe-46c1-8012-bdf37438b9b7",
"keep": true,
"formulas": [
{
"columnName": "SAMPLE_PROVIDER",
"columnId": "0",
"columnIndex": 0,
"formulaString": "\u003dGROUPBY(#Raw!SAMPLE_PROVIDER)"
},
{
"columnName": "MONTH",
"columnId": "1",
"columnIndex": 1,
"formulaString": "\u003dGROUPBY(#Raw!MONTH)"
}
],
"columnStyles": [
{
"columnId": "0",
"name": "SAMPLE_PROVIDER",
"width": 206,
"thousandSeparator": true
},
{
"columnId": "1",
"name": "MONTH",
"width": 100,
"thousandSeparator": true
}
],
"nextColumnId": 2
},
{
"name": "Transform1",
"sheetId": "49071c1c-fa84-4ae3-92c1-b63175a6b26c",
"keep": true,
"formulas": [
{
"columnName": "SAMPLE_PROVIDER",
"columnId": "0",
"columnIndex": 0,
"formulaString": "\u003d#Pass1!SAMPLE_PROVIDER"
},
{
"columnName": "MONTH",
"columnId": "1",
"columnIndex": 1,
"formulaString": "\u003d#Pass1!MONTH"
}
],
"columnStyles": [
{
"columnId": "0",
"name": "SAMPLE_PROVIDER",
"width": 179,
"thousandSeparator": true
},
{
"columnId": "1",
"name": "MONTH",
"width": 100,
"thousandSeparator": true
}
],
"nextColumnId": 3
}
],
"advancedSchedulingInUse": true,
"errorHandlingMode": "IGNORE"
}

To change the columnName field in the desired containers, you can use
jq '(.sheets[] | .formulas[]? | .columnName | select(.=="MONTH")) |= "YEAR"' tmp.json
(The ? avoids an error if a sheet has no formulas key.)
To replace MONTH with YEAR in formula strings, replace each formulaString value with the possibly modified string returned by sub. Note that sub replaces only the first match in each string; use gsub to replace every match.
jq '(.sheets[] | .formulas[]? | .formulaString) |= sub("MONTH"; "YEAR")' tmp.json
(sub requires jq 1.5 or later, compiled with the Oniguruma library.)
To combine these into a single jq filter? I'm not sure yet; I have only a tenuous understanding of why either one alone works.
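One possible way to combine the updates is to chain them with a pipe, since each update assignment returns the whole modified document. The sketch below also covers the name fields in columnStyles from the question. The trimmed-down inline sample and the result variable are only for illustration, and it assumes a jq build with regex support for sub.

```shell
# Trimmed-down sample of the document from the question (illustrative only).
input='{"sheets":[{"formulas":[{"columnName":"MONTH","formulaString":"=GROUPBY(#Raw!MONTH)"}],"columnStyles":[{"name":"MONTH"}]}]}'

# Each "(path) |= value" yields the whole updated document, so the
# three updates can be piped one after another.
result=$(printf '%s' "$input" | jq -c '
    (.sheets[] | .formulas[]?     | .columnName | select(. == "MONTH")) |= "YEAR"
  | (.sheets[] | .columnStyles[]? | .name       | select(. == "MONTH")) |= "YEAR"
  | (.sheets[] | .formulas[]?     | .formulaString) |= sub("MONTH"; "YEAR")
')
echo "$result"
```

As for editing the file in place: jq has no in-place option, so a common workaround is to write to a temporary file and move it over the original, for example jq '...' tmp.json > tmp.json.new && mv tmp.json.new tmp.json (tmp.json.new is just a placeholder name).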

It looks like you're updating more than just fields in the formulas arrays, but a little bit of everything.
If you want to indiscriminately change all occurrences of the string "MONTH" to "YEAR", you could do this:
(.. | strings) |= sub("MONTH"; "YEAR")
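A small sketch of what this filter touches, using a made-up input: only string values are rewritten, object keys are left alone, and sub changes just the first match per string whereas gsub changes all of them. The variable names are illustrative.

```shell
# Made-up input: a key named MONTH, a value MONTH, and a string
# that contains MONTH twice.
input='{"MONTH":"MONTH","formulas":["=#Pass1!MONTH+MONTH",100]}'

# sub replaces only the first match in each string value.
first_only=$(printf '%s' "$input" | jq -c '(.. | strings) |= sub("MONTH"; "YEAR")')

# gsub replaces every match in each string value.
every_match=$(printf '%s' "$input" | jq -c '(.. | strings) |= gsub("MONTH"; "YEAR")')

echo "$first_only"
echo "$every_match"
```

In both results the key "MONTH" is unchanged, because .. only visits values.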

This may be a task for walk/1.
(If your jq does not have walk/1, then you can copy its definition from https://github.com/stedolan/jq/blob/master/src/builtin.jq)
For example, if you want to change "MONTH" to "YEAR" whenever "MONTH" appears as the value of a key in an object, then the following would do the job:
jq 'walk(if type == "object"
then with_entries(.value |= (if . == "MONTH" then "YEAR" else . end))
else . end)' input.json
Equivalently:
jq 'walk(if type == "object"
then with_entries(if .value == "MONTH" then .value = "YEAR" else . end)
else . end)' input.json
These can easily be modified in accordance with similar requirements.

Related

How do I provide an incrementing counter in place of an existing JSON value using jq

I have a JSON file similar to this:
{
"version": "2.0",
"stage" : {
"objects" : [
{
"foo" : 1100,
"bar" : false,
"id" : "56a983f1-8111-4abc-a1eb-263d41cfb098"
},
{
"foo" : 1100,
"bar" : false,
"id" : "6369df4b-90c4-4695-8a9c-6bb2b8da5976"
}],
"bish" : "#FFFFFF"
},
"more": "abcd"
}
I would like the output to be exactly the same, with the exception of an incrementing integer in place of the "id" : "guid" - something like:
{
"version": "2.0",
"stage" : {
"objects" : [
{
"foo" : 1100,
"bar" : false,
"id" : 1
},
{
"foo" : 1100,
"bar" : false,
"id" : 2
}],
"bish" : "#FFFFFF"
},
"more": "abcd"
}
I'm new to jq. I can set every id to a fixed integer with .stage.objects[].id |= 1, which gives:
{
"version": "2.0",
"stage": {
"objects": [
{
"foo": 1100,
"bar": false,
"id": 1
},
{
"foo": 1100,
"bar": false,
"id": 1
}
],
"bish": "#FFFFFF"
},
"more": "abcd"
}
I can't figure out the syntax to make the assigned number iterate.
I tried various combinations of map, reduce, to_entries, foreach and other strategies mentioned in answers to similar questions but the data in those examples always consisted of something simple.
You can exploit the fact that to_entries on arrays uses the index as "key", then modify your value:
.stage.objects |= (to_entries | map(.value.id = .key + 1 | .value))
or
.stage.objects |= (to_entries | map(.value += {id: (.key + 1)} | .value))
Output:
{
"version": "2.0",
"stage": {
"objects": [
{
"foo": 1100,
"bar": false,
"id": 1
},
{
"foo": 1100,
"bar": false,
"id": 2
}
],
"bish": "#FFFFFF"
},
"more": "abcd"
}
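To make the to_entries trick above more concrete, here is a minimal sketch showing the intermediate step on a tiny array and then the first filter applied to a shortened version of the input. The sample data and variable names are illustrative.

```shell
# On an array, to_entries pairs each element with its 0-based index.
entries=$(jq -c -n '["a","b"] | to_entries')
echo "$entries"

# Inside map, .key is that index and .value is the original element,
# so .key + 1 gives a counter starting at 1.
input='{"stage":{"objects":[{"foo":1100,"id":"x"},{"foo":1100,"id":"y"}]}}'
numbered=$(printf '%s' "$input" | jq -c '.stage.objects |= (to_entries | map(.value.id = .key + 1 | .value))')
echo "$numbered"
```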
Here's a variant using reduce to iterate over the keys:
.stage.objects |= reduce keys[] as $i (.; .[$i].id = $i + 1)
{
"version": "2.0",
"stage": {
"objects": [
{
"foo": 1100,
"bar": false,
"id": 1
},
{
"foo": 1100,
"bar": false,
"id": 2
}
],
"bish": "#FFFFFF"
},
"more": "abcd"
}
Update:
Is there a way to make the search and replace go deep? If the items in the objects array had children arrays with id's, could they be replaced as well?
Of course. You could enhance the LHS of the update to also cover all .children arrays recursively using recurse(.[].children | arrays):
(.stage.objects | recurse(.[].children | arrays)) |=
reduce keys[] as $i (.; .[$i].id = $i + 1)
Note that in this case each .children array is treated independently, thus numbering starts from 1 in each of them. If you want a continuous numbering instead, it has to be done outside and brought down into the iteration. Here's a solution gathering the target paths using path, numbering them using to_entries, and setting them iteratively using setpath:
reduce (
[path(.stage.objects[] | recurse(.children | arrays[]).id)] | to_entries[]
) as $i (.; setpath($i.value; $i.key + 1))
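As a quick check of the continuous numbering, the sketch below runs the same filter on a made-up input with one level of children. The ids are expected to be numbered in the order the paths are visited: each parent first, then its children.

```shell
# Made-up input: the first object has two children, the second has none.
input='{"stage":{"objects":[{"id":"a","children":[{"id":"b"},{"id":"c"}]},{"id":"d"}]}}'

result=$(printf '%s' "$input" | jq -c '
  reduce (
    [path(.stage.objects[] | recurse(.children | arrays[]).id)] | to_entries[]
  ) as $i (.; setpath($i.value; $i.key + 1))
')
echo "$result"
```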

Use jq to extract some values from an array to top level, leaving the array intact

I have data in this format:
{
"searchResult": [
{
"key": "common1",
"value": "A string"
},
{
"key": "common2",
"value": "2149944"
},
{
"key": "varying1",
"value": "604516"
},
{
"key": "varying73",
"value": "58.92"
}
]
}
Within searchResult are some constantly present fields (timestamp, identifiers etc). The other keys are constantly changing and can be named anything. I need them transformed to the format below, with the predefined constant keys pulled out to the top level and the variable keys staying in the searchResult array.
{
"common1": "A string",
"common2": "2149944",
"searchResult": [
{
"key": "varying1",
"value": "604516"
},
{
"key": "varying73",
"value": "58.92"
}
]
}
Seeing as jq is already being used in the process, how can I do this transformation in jq please?
I have tried extracting the values using .name, but haven't managed to bring them to this top level.
Many thanks
Ben
You could use IN/1 as follows:
(.searchResult | (from_entries | {common1, common2})) + { searchResult }
| .searchResult |= map(select(.key | IN("common1", "common2") | not))
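For readers who want to see the filter above at work, here is a sketch that runs it on the sample data from the question, with comments on each step. It assumes a jq version that provides IN (1.6 or later, as far as I know); the variable names are illustrative.

```shell
input='{"searchResult":[{"key":"common1","value":"A string"},{"key":"common2","value":"2149944"},{"key":"varying1","value":"604516"},{"key":"varying73","value":"58.92"}]}'

result=$(printf '%s' "$input" | jq -c '
  # from_entries turns the key/value pairs into one object;
  # {common1, common2} keeps just the two fixed keys.
  (.searchResult | (from_entries | {common1, common2})) + { searchResult }
  # Then drop the promoted pairs from the array that stays behind.
  | .searchResult |= map(select(.key | IN("common1", "common2") | not))
')
echo "$result"
```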

How get index of array with jq

I want to search for the string tb1qpvtnfqqs3cp4ly4375km7n5sga8hkdkujkm854 in this structure:
{
"txid": "67bc5194442dc350312a7c0a5fc7ef912c31bf00b23349b4c3afdf177c91fb2f",
"hash": "8392ded0647e4166eda342cee409c7d0e1e3ffab24de41866d2e6a7bd0a245b3",
"version": 2,
"size": 245,
"vsize": 164,
"weight": 653,
"locktime": 1764124,
"vin": [
{
"txid": "69eed058cbd18b3bf133c8341582adcd76a4d837590d3ae8fa0ffee1d597a8c3",
"vout": 0,
"scriptSig": {
"asm": "0014759fc698313da549948940508df6db93a319096e",
"hex": "160014759fc698313da549948940508df6db93a319096e"
},
"txinwitness": [
"3044022014a8eb758063c52bc970d42013e653f5d3fb3c190b55f7cfa72680280cc5138602202a873b5cad4299b2f52d8cccb4dcfa66fa6ec256d533788f54440d4cdad7dd6501",
"02ec8ba22da03ed1870fe4b9f9071067a6a1fda6f582c5c858644e44bd401bfc0a"
],
"sequence": 4294967294
}
],
"vout": [
{
"value": 0.37841708,
"n": 0,
"scriptPubKey": {
"asm": "0 686bc8ce41505642c96f3eb99919fff63f4c0f11",
"hex": "0014686bc8ce41505642c96f3eb99919fff63f4c0f11",
"reqSigs": 1,
"type": "witness_v0_keyhash",
"addresses": [
"tb1qdp4u3njp2pty9jt086uejx0l7cl5crc3x3phwd"
]
}
},
{
"value": 0.00022000,
"n": 1,
"scriptPubKey": {
"asm": "0 0b173480108e035f92b1f52dbf4e90474f7b36dc",
"hex": "00140b173480108e035f92b1f52dbf4e90474f7b36dc",
"reqSigs": 1,
"type": "witness_v0_keyhash",
"addresses": [
"tb1qpvtnfqqs3cp4ly4375km7n5sga8hkdkujkm854"
]
}
}
],
"hex": "02000000000101c3a897d5e1fe0ffae83a0d5937d8a476cdad821534c833f13b8bd1cb58d0ee690000000017160014759fc698313da549948940508df6db93a319096efeffffff022c6b410200000000160014686bc8ce41505642c96f3eb99919fff63f4c0f11f0550000000000001600140b173480108e035f92b1f52dbf4e90474f7b36dc02473044022014a8eb758063c52bc970d42013e653f5d3fb3c190b55f7cfa72680280cc5138602202a873b5cad4299b2f52d8cccb4dcfa66fa6ec256d533788f54440d4cdad7dd65012102ec8ba22da03ed1870fe4b9f9071067a6a1fda6f582c5c858644e44bd401bfc0a1ceb1a00",
"blockhash": "000000009acb8b4f06a97beb23b3d9aeb3df71052dabec94465933b564c27f50",
"confirmations": 2,
"time": 1591687001,
"blocktime": 1591687001
}
I'd like to get the index of the matching element of vout, in this case 1. Is it possible with jq?
It's not clear what exactly you want.
I guess you want the n of the element of vout that contains the given address in its addresses list. That can be achieved with
jq '.vout[]
| select(.scriptPubKey.addresses[] == "tb1qpvtnfqqs3cp4ly4375km7n5sga8hkdkujkm854")
| .n
' file.json
You can also use
select((.scriptPubKey.addresses[]
| contains("tb1qpvtnfqqs3cp4ly4375km7n5sga8hkdkujkm854")))
to search for the address.
The following assumes that you want the index in .vout of the first object which has the given string as a leaf value, and that you have in mind using 0 as the index origin.
A simple and reasonably efficient jq program that finds all such indices is as follows:
.vout
| range(0;length) as $i
| if any(.[$i]|..;
. == "tb1qpvtnfqqs3cp4ly4375km7n5sga8hkdkujkm854")
then $i
else empty
end
With the given input, this in fact yields 1, which is in accordance with the problem description, so we seem to be on the right track.
To get the first index, you could wrap the above in first(...), but in that case the result would be the empty stream if there is no occurrence. So perhaps you would prefer to wrap the above in first(...) // null
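The first(...) // null wrapping described above could look like the following sketch. The shortened input, the placeholder addresses and the first_index helper are made up for illustration; the address is passed in with --arg rather than hard-coded.

```shell
# Shortened, made-up input with placeholder addresses.
input='{"vout":[{"n":0,"scriptPubKey":{"addresses":["addrA"]}},{"n":1,"scriptPubKey":{"addresses":["addrB"]}}]}'

# Hypothetical helper: prints the 0-based index of the first .vout
# element that has $1 as a leaf value, or null if none does.
first_index() {
  printf '%s' "$input" | jq --arg addr "$1" '
    first(.vout
          | range(0; length) as $i
          | if any(.[$i] | ..; . == $addr) then $i else empty end)
    // null'
}

found=$(first_index addrB)
missing=$(first_index nowhere)
echo "$found $missing"
```

Since only false and null count as falsy in jq, an index of 0 still passes through // null unchanged.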
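You could try something like this, passing the address to jq with --arg:
vout='{{ your json }}'
value="tb1qpvtnfqqs3cp4ly4375km7n5sga8hkdkujkm854"
result=$(echo "$vout" | jq -r --arg value "$value" '.vout[] | select(any(.scriptPubKey.addresses[]; . == $value)) | .n')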

Modify keys in multiple nested objects/arrays

I want to modify the value of all x keys in a json that looks like:
{
"a": {
"b": {
"c": [
{
"0": {
"x": 23,
"name": "AS"
}
},
{
"1": {
"x": 23,
"name": "AS"
}
},
{
"2": {
"x": 23,
"name": "Fe"
}
},
{
"3": {
"x": 23,
"name": "Pl"
}
}
]
}
}
}
I have tried multiple approaches, but I can't modify the value of x and obtain the full json as a result. All I managed to do is modify the value of x and obtain the last array as a result.
Here is the closest I have been to achieve the result: https://jqplay.org/s/Wx741btZOg
Using |= one can simply perform the update by writing:
.a.b.c |= [.[]|.[].x=97]
or perhaps more clearly:
.a.b.c |= map(.[].x=97)
If you really do want to "modify the value of all x keys", then you could use walk:
walk(if type == "object" and has("x") then .x=97 else . end)
(If your jq does not have walk, then you can snarf its def from the web, e.g. from builtin.jq )
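The difference between piping into the array and updating it with |= is what decides whether the full document comes back. A minimal sketch on a shortened, made-up version of the input (the extra keep key is only there to show that the rest of the document survives):

```shell
input='{"a":{"b":{"c":[{"0":{"x":23,"name":"AS"}},{"1":{"x":23,"name":"Fe"}}]}},"keep":true}'

# Plain pipe: the output is only the modified inner array.
piped=$(printf '%s' "$input" | jq -c '.a.b.c | map(.[].x = 97)')

# Update assignment: the output is the whole document with the array replaced.
updated=$(printf '%s' "$input" | jq -c '.a.b.c |= map(.[].x = 97)')

echo "$piped"
echo "$updated"
```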
To change all x values to 97, you can try this jq command:
<file jq '.a.b.c as $in | .a.b.c=[ $in[] | .[].x=97 ]'
The command stores the array in the variable $in, so that its elements can be modified and the resulting array assigned back to .a.b.c.

subtracting one json file from another in jq

Is there a way to compare two json files in jq? Specifically, I'd like to be able to remove objects from one json file if they occur in another json file. Basically, subtract one file from another. It would be a bonus if I could generalize this so that I could define the equality criteria for the objects, but this is not strictly necessary, it can be based strictly on the objects being identical.
So the more general case would look like this. Let's say I have a file that looks like this:
[
{
"name": "Cynthia",
"surname": "Craig",
"isActive": true,
"balance": "$2,426.88"
},
{
"name": "Elise",
"surname": "Long",
"isActive": false,
"balance": "$1,892.72"
},
{
"name": "Hyde",
"surname": "Adkins",
"isActive": true,
"balance": "$1,769.34"
},
{
"name": "Matthews",
"surname": "Jefferson",
"isActive": true,
"balance": "$1,991.42"
},
{
"name": "Kris",
"surname": "Norris",
"isActive": false,
"balance": "$2,137.11"
}
]
And I have a second file that looks like this:
[
{
"name": "Cynthia",
"surname": "Craig"
},
{
"name": "Kris",
"surname": "Norris"
}
]
I'd like to remove any objects from the first file where the name and surname fields match an object of the second file, so that the results should look like this:
[
{
"name": "Elise",
"surname": "Long",
"isActive": false,
"balance": "$1,892.72"
},
{
"name": "Hyde",
"surname": "Adkins",
"isActive": true,
"balance": "$1,769.34"
},
{
"name": "Matthews",
"surname": "Jefferson",
"isActive": true,
"balance": "$1,991.42"
}
]
The following solution is intended to be generic, efficient and as simple as possible subject to the first two objectives.
Genericity
For genericity, let us suppose that $one and $two are two arrays of JSON entities, and that we wish to find those items, $x, in $one such that ($x|filter) does not appear in ($two | map(filter)), where filter is an arbitrary filter. (In the present instance, it is {surname, name}.)
The solution uses INDEX/1, which was added to jq after the official 1.5 release, so we begin by reproducing its definition:
def INDEX(stream; idx_expr):
reduce stream as $row ({};
.[$row|idx_expr|
if type != "string" then tojson
else .
end] |= $row);
def INDEX(idx_expr): INDEX(.[]; idx_expr);
Efficiency
For efficiency, we will need to use a JSON object as a dictionary; since keys must be strings, we will need to ensure that when converting an object to a string, the objects are normalized. For this, we define normalize as follows:
# Normalize the input with respect to the order of keys in objects
def normalize:
. as $in
| if type == "object" then reduce keys[] as $key
( {}; . + { ($key): ($in[$key] | normalize) } )
elif type == "array" then map( normalize )
else .
end;
To construct the dictionary, we simply apply (normalize|tojson):
def todict(filter):
INDEX(filter| normalize | tojson);
The solution
The solution is now quite simple:
# select those items from the input stream for which
# (normalize|tojson) is NOT in dict:
def MINUS(filter; $dict):
select( $dict[filter | normalize | tojson] | not);
def difference($one; $two; filter):
($two | todict(filter)) as $dict
| $one[] | MINUS( filter; $dict );
difference( $one; $two; {surname, name} )
Invocation
$ jq -n --argfile one one.json --argfile two two.json -f difference.jq
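On a jq that already ships INDEX (1.6 or later, as far as I know), the same dictionary idea can be sketched more compactly. This is a simplified variant, not the answer's exact code: it uses --slurpfile, which newer jq manuals recommend over --argfile, it skips normalize on the assumption that {surname, name} always emits its keys in the same order, and it wraps the result in an array. The shortened sample data and the temporary files are made up for illustration.

```shell
# Shortened stand-ins for one.json and two.json.
one=$(mktemp)
two=$(mktemp)
printf '%s' '[{"name":"Cynthia","surname":"Craig","isActive":true},{"name":"Elise","surname":"Long","isActive":false}]' > "$one"
printf '%s' '[{"name":"Cynthia","surname":"Craig"}]' > "$two"

# --slurpfile wraps the contents of each file in an array, hence the [0].
result=$(jq -c -n --slurpfile one "$one" --slurpfile two "$two" '
  # Dictionary keyed by the JSON text of {surname, name}.
  ($two[0] | INDEX({surname, name} | tojson)) as $dict
  # Keep only the items whose key is absent from the dictionary.
  | [ $one[0][] | select($dict[{surname, name} | tojson] | not) ]
')
rm -f "$one" "$two"
echo "$result"
```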
Here is a solution which uses --argfile and project/1 from pull/1062
def project(q):
. as $in
| reduce (q | if type == "object" then keys[] else .[] end) as $k (
{}
; . + { ($k) : ($in[$k]) }
)
;
map(
reduce $arg[] as $a (
.
; select(project($a) != $a)
)
| values
)
If you place the "second" file in second.json, the data in data.json and the above filter in filter.jq you can run this with
jq -M --argfile arg second.json -f filter.jq data.json
to produce
[
{
"name": "Elise",
"surname": "Long",
"isActive": false,
"balance": "$1,892.72"
},
{
"name": "Hyde",
"surname": "Adkins",
"isActive": true,
"balance": "$1,769.34"
},
{
"name": "Matthews",
"surname": "Jefferson",
"isActive": true,
"balance": "$1,991.42"
}
]
You can replace the expression select(project($a) != $a) with something else if you want to revise the equality criteria for the objects.
Thinking about this a little more, we can eliminate the need for project/1 by using contains. This should be more efficient, as it avoids constructing a temporary object.
map(
reduce $arg[] as $a (
.
; select(.!=null and contains($a)==false)
)
| values
)
This can be further simplified using any:
map(select(any(.; contains($arg[]))==false))
which is short enough to be used directly on the command line:
jq -M --argfile arg second.json 'map(select(any(.; contains($arg[]))==false))' data.json
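One caveat with the contains-based variants: in jq, contains on strings is a substring test, so an entry such as Kris Norris would also remove a hypothetical Kristen Norrison. The sketch below contrasts that with an exact comparison on {name, surname}. The sample data, the temporary file and the variable names are made up, and --slurpfile is used in place of --argfile.

```shell
# Stand-in for second.json.
skip=$(mktemp)
printf '%s' '[{"name":"Kris","surname":"Norris"}]' > "$skip"

data='[{"name":"Kris","surname":"Norris","isActive":false},{"name":"Kristen","surname":"Norrison","isActive":true}]'

# contains: "Kristen" contains "Kris" and "Norrison" contains "Norris",
# so both objects are removed.
by_contains=$(printf '%s' "$data" | jq -c --slurpfile s "$skip" \
  'map(select(any(.; contains($s[0][])) | not))')

# Exact match on the two fields: only the real Kris Norris is removed.
by_equality=$(printf '%s' "$data" | jq -c --slurpfile s "$skip" \
  'map(select({name, surname} as $k | any($s[0][]; . == $k) | not))')

rm -f "$skip"
echo "$by_contains"
echo "$by_equality"
```

Changing the {name, surname} projection is then the place to adjust the equality criteria.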
jq solution:
jq --slurpfile s f2.json '[ .[] | . as $o | if (reduce $s[0][] as $i
([]; . + [($o | contains($i))]) | any) then empty else $o end ]' f1.json
The output:
[
{
"name": "Elise",
"surname": "Long",
"isActive": false,
"balance": "$1,892.72"
},
{
"name": "Hyde",
"surname": "Adkins",
"isActive": true,
"balance": "$1,769.34"
},
{
"name": "Matthews",
"surname": "Jefferson",
"isActive": true,
"balance": "$1,991.42"
}
]
