I'm describing the output I'd like to get using jq.
Here is my JSON file:
{
"partitions": [
{
"replicas": [
0,
1,
2
],
"log_dirs": [
"any",
"any",
"any"
]
},
{
"replicas": [
2,
0,
1
],
"log_dirs": [
"any",
"any",
"any"
]
},
[...]
I would like, for every object in partitions[], to replace the i-th string in log_dirs[] with its concatenation with the i-th number in replicas[], in order to get something like this:
{
"partitions": [
{
"replicas": [
0,
1,
2
],
"log_dirs": [
"any0",
"any1",
"any2"
]
},
{
"replicas": [
2,
0,
1
],
"log_dirs": [
"any2",
"any0",
"any1"
]
},
[...]
Use a reduce loop with a range
.partitions[] |= reduce range(0; ( .replicas | length ) ) as $r
( . ; .log_dirs[$r] += ( .replicas[$r] | tostring ) )
The reduce expression works by iterating over indices from 0 up to the length of the replicas array and modifying each entry .log_dirs[$r] (where $r runs from 0 to length - 1) by appending the corresponding value at .replicas[$r]. Since replicas contains numbers, each value needs to be converted to a string for the append operation.
jqplay - demo
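For completeness, here is a sketch of how that filter might be run from the shell; the filename in.json is just an assumption:
# filename in.json is an assumption; substitute your own input file
jq '.partitions[] |= reduce range(0; (.replicas | length)) as $r
    (.; .log_dirs[$r] += (.replicas[$r] | tostring))' in.json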
I'm not quite happy with this, but it does work:
jq '.partitions[] |= . + (
. as $p | {
log_dirs: [
range(.replicas | length) |
"\($p.log_dirs[.])\($p.replicas[.])"
]
}
)' in.json
Related
I have a JSON file similar to this:
{
"version": "2.0",
"stage" : {
"objects" : [
{
"foo" : 1100,
"bar" : false,
"id" : "56a983f1-8111-4abc-a1eb-263d41cfb098"
},
{
"foo" : 1100,
"bar" : false,
"id" : "6369df4b-90c4-4695-8a9c-6bb2b8da5976"
}],
"bish" : "#FFFFFF"
},
"more": "abcd"
}
I would like the output to be exactly the same, except with an incrementing integer in place of each "id" : "guid" pair, something like this:
{
"version": "2.0",
"stage" : {
"objects" : [
{
"foo" : 1100,
"bar" : false,
"id" : 1
},
{
"foo" : 1100,
"bar" : false,
"id" : 2
}],
"bish" : "#FFFFFF"
},
"more": "abcd"
}
I'm new to jq. I can set the ids to a fixed integer with .stage.objects[].id |= 1:
{
"version": "2.0",
"stage": {
"objects": [
{
"foo": 1100,
"bar": false,
"id": 1
},
{
"foo": 1100,
"bar": false,
"id": 1
}
],
"bish": "#FFFFFF"
},
"more": "abcd"
}
I can't figure out the syntax to make the assigned number increment.
I tried various combinations of map, reduce, to_entries, foreach, and other strategies mentioned in answers to similar questions, but the data in those examples was always much simpler.
You can exploit the fact that to_entries on arrays uses the index as "key", then modify your value:
.stage.objects |= (to_entries | map(.value.id = .key + 1 | .value))
or
.stage.objects |= (to_entries | map(.value += {id: (.key + 1)} | .value))
Output:
{
"version": "2.0",
"stage": {
"objects": [
{
"foo": 1100,
"bar": false,
"id": 1
},
{
"foo": 1100,
"bar": false,
"id": 2
}
],
"bish": "#FFFFFF"
},
"more": "abcd"
}
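If it helps to see the intermediate step, here is a sketch of what to_entries yields on the objects array (assuming the input is saved as in.json):
# each array element becomes {key: <index>, value: <element>},
# so .key + 1 gives the 1-based id assigned above
jq '.stage.objects | to_entries' in.json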
Here's a variant using reduce to iterate over the keys:
.stage.objects |= reduce keys[] as $i (.; .[$i].id = $i + 1)
{
"version": "2.0",
"stage": {
"objects": [
{
"foo": 1100,
"bar": false,
"id": 1
},
{
"foo": 1100,
"bar": false,
"id": 2
}
],
"bish": "#FFFFFF"
},
"more": "abcd"
}
Demo
Update:
Is there a way to make the search and replace go deep? If the items in the objects array had children arrays with id's, could they be replaced as well?
Of course. You could enhance the LHS of the update to also cover all .children arrays recursively using recurse(.[].children | arrays):
(.stage.objects | recurse(.[].children | arrays)) |=
reduce keys[] as $i (.; .[$i].id = $i + 1)
Demo
Note that in this case each .children array is treated independently, so numbering starts from 1 in each of them. If you want continuous numbering instead, it has to be computed outside and brought back into the iteration. Here's a solution that gathers the target paths using path, numbers them using to_entries, and sets them iteratively using setpath:
reduce (
[path(.stage.objects[] | recurse(.children | arrays[]).id)] | to_entries[]
) as $i (.; setpath($i.value; $i.key + 1))
Demo
I have a jCal JSON array which I'd like to filter with jq. JSON arrays are somewhat new to me and I have been banging my head against the wall on this for hours...
The file looks like this:
[
"vcalendar",
[
[
"calscale",
{},
"text",
"GREGORIAN"
],
[
"version",
{},
"text",
"2.0"
],
[
"prodid",
{},
"text",
"-//SabreDAV//SabreDAV//EN"
],
[
"x-wr-calname",
{},
"unknown",
"Call log private"
],
[
"x-apple-calendar-color",
{},
"unknown",
"#ffaa00"
],
[
"refresh-interval",
{},
"duration",
"PT4H"
],
[
"x-published-ttl",
{},
"unknown",
"PT4H"
]
],
[
[
"vevent",
[
[
"dtstamp",
{},
"date-time",
"2015-04-05T16:42:10Z"
],
[
"created",
{},
"date-time",
"2015-02-18T16:44:04Z"
],
[
"uid",
{},
"text",
"9b23142b-8d86-3e17-2f44-2bed65b2e471"
],
[
"last-modified",
{},
"date-time",
"2015-04-05T16:42:10Z"
],
[
"description",
{},
"text",
"Phone call to +49xxxxxxxxxx lasted for 0 seconds."
],
[
"summary",
{},
"text",
"Outgoing: +49xxxxxxx"
],
[
"dtstart",
{},
"date-time",
"2015-02-18T10:58:12Z"
],
[
"dtend",
{},
"date-time",
"2015-02-18T10:58:44Z"
],
[
"transp",
{},
"text",
"OPAQUE"
]
],
[]
],
[
"vevent",
[
[
"dtstamp",
{},
"date-time",
"2015-04-05T16:42:10Z"
],
[
"created",
{},
"date-time",
"2015-01-09T19:12:05Z"
],
[
"uid",
{},
"text",
"c337e092-a012-5f5a-497f-932fbc6159e5"
],
[
"last-modified",
{},
"date-time",
"2015-04-05T16:42:10Z"
],
[
"description",
{},
"text",
"Phone call to +1xxxxxxxxxx lasted for 39 seconds."
],
[
"summary",
{},
"text",
"Outgoing: +1xxxxxxxxxx"
],
[
"dtstart",
{},
"date-time",
"2015-01-09T17:23:16Z"
],
[
"dtend",
{},
"date-time",
"2015-01-09T17:24:19Z"
],
[
"transp",
{},
"text",
"OPAQUE"
]
],
[]
]
]
]
I would like to extract dtstart, dtend, the target phone number, and the call duration from the description for each vevent that was created in, e.g., January 2015 ("2015-01.*"), and output them as CSV.
This JSON is a bit strange because the information is stored by position in arrays instead of in keyed objects.
Using the first element of an array ("vevent") to identify its contents is not best practice.
But anyway, if this is the data source you are dealing with, this code should help you:
jq -r '..
| arrays
| select(.[0] == "vevent")[1]
| [
(.[] | select(.[0] == "dtstart") | .[3]),
(.[] | select(.[0] == "dtend") | .[3]),
(.[] | select(.[0] == "description") | .[3])
]
| @csv
'
Alternatively, the repeating code can be factored out into a function:
jq -r 'def getField($name; $idx): .[] | select(.[0] == $name) | .[$idx];
..
| arrays
| select(.[0] == "vevent")[1]
| [ getField("dtstart"; 3), getField("dtend"; 3), getField("description"; 3) ]
| @csv
'
Output
"2015-02-18T10:58:12Z","2015-02-18T10:58:44Z","Phone call to +49xxxxxxxxxx lasted for 0 seconds."
"2015-01-09T17:23:16Z","2015-01-09T17:24:19Z","Phone call to +1xxxxxxxxxx lasted for 39 seconds."
You can also extract phone number and duration with the help of regular expressions in jq:
jq -r 'def getField($name; $idx): .[] | select(.[0] == $name) | .[$idx];
..
| arrays
| select(.[0] == "vevent")[1]
| [
getField("dtstart"; 3),
getField("dtend"; 3),
(getField("description"; 3) | match("call to ([^ ]*)") | .captures[0].string),
(getField("description"; 3) | match("(\\d+) seconds") | .captures[0].string)
]
| @csv
'
Output
"2015-02-18T10:58:12Z","2015-02-18T10:58:44Z","+49xxxxxxxxxx","0"
"2015-01-09T17:23:16Z","2015-01-09T17:24:19Z","+1xxxxxxxxxx","39"
Not the most efficient solution, but quite understandable: first build an object out of the key-value pairs, then filter and transform those objects.
.[2][][1] is a stream of events encoded as arrays.
Which means that:
.[2][][1]
| map({key:.[0], value:.[3]})
| from_entries
the above gives you a stream of objects; one object per event:
{
"dtstamp": "2015-04-05T16:42:10Z",
"created": "2015-02-18T16:44:04Z",
"uid": "9b23142b-8d86-3e17-2f44-2bed65b2e471",
"last-modified": "2015-04-05T16:42:10Z",
"description": "Phone call to +49xxxxxxxxxx lasted for 0 seconds.",
"summary": "Outgoing: +49xxxxxxx",
"dtstart": "2015-02-18T10:58:12Z",
"dtend": "2015-02-18T10:58:44Z",
"transp": "OPAQUE"
}
{
"dtstamp": "2015-04-05T16:42:10Z",
"created": "2015-01-09T19:12:05Z",
"uid": "c337e092-a012-5f5a-497f-932fbc6159e5",
"last-modified": "2015-04-05T16:42:10Z",
"description": "Phone call to +1xxxxxxxxxx lasted for 39 seconds.",
"summary": "Outgoing: +1xxxxxxxxxx",
"dtstart": "2015-01-09T17:23:16Z",
"dtend": "2015-01-09T17:24:19Z",
"transp": "OPAQUE"
}
Now plug that into the final program: select the wanted objects, add CSV headers, build the rows and ultimately convert to CSV:
["start", "end", "description"],
(
.[2][][1]
| map({key:.[0], value:.[3]})
| from_entries
| select(.created | startswith("2015-01"))
| [.dtstart, .dtend, .description]
)
| @csv
Raw output (-r):
"start","end","description"
"2015-01-09T17:23:16Z","2015-01-09T17:24:19Z","Phone call to +1xxxxxxxxxx lasted for 39 seconds."
If you need to further transform .description, you can use split or capture. Or use a different property, such as .summary, in your CSV rows. Only a single line needs to be changed.
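For example, here is a sketch that swaps the description column for number and duration columns extracted with capture; the group names number and secs are purely illustrative:
# capture pulls named groups out of the description text
["start", "end", "number", "seconds"],
(
  .[2][][1]
  | map({key: .[0], value: .[3]})
  | from_entries
  | select(.created | startswith("2015-01"))
  | [.dtstart, .dtend,
     (.description | capture("call to (?<number>\\S+) lasted for (?<secs>\\d+) seconds")
      | .number, .secs)]
)
| @csv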
I want to search for the string tb1qpvtnfqqs3cp4ly4375km7n5sga8hkdkujkm854 in this structure:
{
"txid": "67bc5194442dc350312a7c0a5fc7ef912c31bf00b23349b4c3afdf177c91fb2f",
"hash": "8392ded0647e4166eda342cee409c7d0e1e3ffab24de41866d2e6a7bd0a245b3",
"version": 2,
"size": 245,
"vsize": 164,
"weight": 653,
"locktime": 1764124,
"vin": [
{
"txid": "69eed058cbd18b3bf133c8341582adcd76a4d837590d3ae8fa0ffee1d597a8c3",
"vout": 0,
"scriptSig": {
"asm": "0014759fc698313da549948940508df6db93a319096e",
"hex": "160014759fc698313da549948940508df6db93a319096e"
},
"txinwitness": [
"3044022014a8eb758063c52bc970d42013e653f5d3fb3c190b55f7cfa72680280cc5138602202a873b5cad4299b2f52d8cccb4dcfa66fa6ec256d533788f54440d4cdad7dd6501",
"02ec8ba22da03ed1870fe4b9f9071067a6a1fda6f582c5c858644e44bd401bfc0a"
],
"sequence": 4294967294
}
],
"vout": [
{
"value": 0.37841708,
"n": 0,
"scriptPubKey": {
"asm": "0 686bc8ce41505642c96f3eb99919fff63f4c0f11",
"hex": "0014686bc8ce41505642c96f3eb99919fff63f4c0f11",
"reqSigs": 1,
"type": "witness_v0_keyhash",
"addresses": [
"tb1qdp4u3njp2pty9jt086uejx0l7cl5crc3x3phwd"
]
}
},
{
"value": 0.00022000,
"n": 1,
"scriptPubKey": {
"asm": "0 0b173480108e035f92b1f52dbf4e90474f7b36dc",
"hex": "00140b173480108e035f92b1f52dbf4e90474f7b36dc",
"reqSigs": 1,
"type": "witness_v0_keyhash",
"addresses": [
"tb1qpvtnfqqs3cp4ly4375km7n5sga8hkdkujkm854"
]
}
}
],
"hex": "02000000000101c3a897d5e1fe0ffae83a0d5937d8a476cdad821534c833f13b8bd1cb58d0ee690000000017160014759fc698313da549948940508df6db93a319096efeffffff022c6b410200000000160014686bc8ce41505642c96f3eb99919fff63f4c0f11f0550000000000001600140b173480108e035f92b1f52dbf4e90474f7b36dc02473044022014a8eb758063c52bc970d42013e653f5d3fb3c190b55f7cfa72680280cc5138602202a873b5cad4299b2f52d8cccb4dcfa66fa6ec256d533788f54440d4cdad7dd65012102ec8ba22da03ed1870fe4b9f9071067a6a1fda6f582c5c858644e44bd401bfc0a1ceb1a00",
"blockhash": "000000009acb8b4f06a97beb23b3d9aeb3df71052dabec94465933b564c27f50",
"confirmations": 2,
"time": 1591687001,
"blocktime": 1591687001
}
I'd like to get the index of the vout entry, in this case 1. Is it possible with jq?
It's not clear what exactly you want.
I guess you want the n of the element of vout whose addresses list contains the given address. That can be achieved with:
jq '.vout[]
| select(.scriptPubKey.addresses[] == "tb1qpvtnfqqs3cp4ly4375km7n5sga8hkdkujkm854")
| .n
' file.json
You can also use
select((.scriptPubKey.addresses[]
| contains("tb1qpvtnfqqs3cp4ly4375km7n5sga8hkdkujkm854")))
to search for the address.
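Put together, that contains-based variant might look something like this (a sketch only):
jq '.vout[]
    # contains matches substrings, so a partial address would also match
    | select(.scriptPubKey.addresses[] | contains("tb1qpvtnfqqs3cp4ly4375km7n5sga8hkdkujkm854"))
    | .n
   ' file.json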
The following assumes that you want the index in .vout of the first object which has the given string as a leaf value, and that you have in mind using 0 as the index origin.
A simple and reasonably efficient jq program that finds all such indices is as follows:
.vout
| range(0;length) as $i
| if any(.[$i]|..;
. == "tb1qpvtnfqqs3cp4ly4375km7n5sga8hkdkujkm854")
then $i
else empty
end
With the given input, this in fact yields 1, which is in accordance with the problem description, so we seem to be on the right track.
To get just the first such index, you could wrap the above in first(...), but in that case the result would be the empty stream if there is no occurrence. So perhaps you would prefer to wrap the above in first(...) // null.
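A sketch of that first(...) // null wrapping, under the same assumptions:
# emits the first matching index, or null if there is none
first(
  .vout
  | range(0; length) as $i
  | if any(.[$i] | ..; . == "tb1qpvtnfqqs3cp4ly4375km7n5sga8hkdkujkm854")
    then $i
    else empty
    end
) // null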
You could try something like this in a shell script:
vout='{{ your json }}'
value="tb1qpvtnfqqs3cp4ly4375km7n5sga8hkdkujkm854"
result=$(echo "$vout" | jq -r --arg value "$value" '.vout[] | select(.scriptPubKey.addresses[] == $value) | .n')
input
{
"apps": [
{
"name": "whatever1",
"id": "ID1"
},
{
"name": "whatever2",
"id": "ID2",
"dep": [
"a.jar"
]
},
{
"name": "whatever3",
"id": "ID3",
"dep": [
"a.jar",
"b.jar"
]
}
]
}
output
{
"apps": [
{
"name": "whatever1",
"id": "ID1",
"dep": [
"b.jar"
]
},
{
"name": "whatever2",
"id": "ID2",
"dep": [
"a.jar",
"b.jar"
]
},
{
"name": "whatever3",
"id": "ID3",
"dep": [
"a.jar",
"b.jar"
]
}
]
}
In the above example:
whatever1 does not have dep, so create one.
whatever2 has dep but it does not contain b.jar, so add b.jar.
whatever3 already has dep with b.jar in it, so it is left untouched.
What I have tried:
# add blindly, whatever3 is not right
cat dep.json | jq '.apps[].dep += ["b.jar"]'
# missed one level and whatever3 is gone.
cat dep.json | jq '.apps | map(select(.dep == null or (.dep | contains(["b.jar"]) | not)))[] | .dep += ["b.jar"]'
For the sake of clarity, let's define a helper function for performing the core task:
# It is assumed that the input is an object
# that either does not have the specified key or
# that it is array-valued
def ensure_has($key; $value):
if has($key) and (.[$key] | index($value)) then .
else .[$key] += [$value]
end ;
The task can now be accomplished in a straightforward way:
.apps |= map(ensure_has("dep"; "b.jar"))
Alternatively ...
.apps[] |= ensure_has("dep"; "b.jar")
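A sketch of how the helper and the update might be run together from the command line (assuming the input file is dep.json, as in the question):
jq '# same ensure_has definition as above
    def ensure_has($key; $value):
      if has($key) and (.[$key] | index($value)) then .
      else .[$key] += [$value]
      end;
    .apps |= map(ensure_has("dep"; "b.jar"))' dep.json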
After some trial and error, it looks like this is one way to do it:
cat dep.json | jq '.apps[].dep |= (. + ["b.jar"] | unique)'
Consider the following collection in MongoDB:
{a:[4,2,8,71,21]}
{a:[24,2,2,1]}
{a:[4,1]}
{a:[4,2,8,21]}
{a:[2,8,71,21]}
{a:[4,2,8]}
How can I most easily get the following results?
Getting the nth element of the array
{a:4}
{a:24}
{a:4}
{a:4}
{a:2}
{a:4}
Getting elements 2 to 4
{a:[8,71,21]}
{a:[2,1]}
{a:[]}
{a:[8,21]}
{a:[71,21]}
{a:[8]}
And other similar queries.
What you are looking for is the $slice projection.
Getting a number of elements from the beginning of an array
You can pass $slice a single number specifying how many values to return from the start of the array (e.g. 1):
> db.mycoll.find({}, {_id: 0, a: { $slice: 1}})
{ "a" : [ 4 ] }
{ "a" : [ 24 ] }
{ "a" : [ 4 ] }
{ "a" : [ 4 ] }
{ "a" : [ 2 ] }
{ "a" : [ 4 ] }
Getting a range of elements
You can pass an array of the form [ skip, limit ].
Note: to match your expected output you would have to return elements 3 to 5 (skip the first 2 elements, return the next 3):
> db.mycoll.find({}, {_id: 0, a: { $slice: [2,3]}})
{ "a" : [ 8, 71, 21 ] }
{ "a" : [ 2, 1 ] }
{ "a" : [ ] }
{ "a" : [ 8, 21 ] }
{ "a" : [ 71, 21 ] }
{ "a" : [ 8 ] }
Getting the nth element of the array
Pass the number of elements to skip and a limit of 1.
For example, to find the second element you need to skip 1 entry:
> db.mycoll.find({}, {_id: 0, a: { $slice: [1,1]}})
{ "a" : [ 2 ] }
{ "a" : [ 2 ] }
{ "a" : [ 1 ] }
{ "a" : [ 2 ] }
{ "a" : [ 8 ] }
{ "a" : [ 2 ] }
Note that the $slice operator:
always returns an array
will return an empty array for documents that match the find criteria but yield an empty result for the $slice selection (e.g. if you ask for the 5th element of an array that has only 2 elements)