Without try and catch in my jq filter, everything works fine.
The thing is, I need the try and catch because this example works, but not for all inputs:
> cat movies.json | jq '.[3]' |
jq '.release_date |= if . == null or . == ""
then .
else (. | strptime("%Y-%m-%d") | mktime) end'
{
"id": "166428",
"title": "How to Train Your Dragon: The Hidden World",
"poster": "https://image.tmdb.org/t/p/w1280/xvx4Yhf0DVH8G4LzNISpMfFBDy2.jpg",
"overview": "As Hiccup fulfills his dream of creating a peaceful dragon utopia, Toothless’ discovery of an untamed, elusive mate draws the Night Fury away. When danger mounts at home and Hiccup’s reign as village chief is tested, both dragon and rider must make impossible decisions to save their kind.",
"release_date": 1546473600
}
When I add try and catch I get this:
> cat movies.json | jq '.[3]' |
jq '.release_date |= try (if . == null or . == ""
then .
else (. | strptime("%Y-%m-%d") | mktime) end)
catch (.)'
{
"id": "166428",
"title": "How to Train Your Dragon: The Hidden World",
"poster": "https://image.tmdb.org/t/p/w1280/xvx4Yhf0DVH8G4LzNISpMfFBDy2.jpg",
"overview": "As Hiccup fulfills his dream of creating a peaceful dragon utopia, Toothless’ discovery of an untamed, elusive mate draws the Night Fury away. When danger mounts at home and Hiccup’s reign as village chief is tested, both dragon and rider must make impossible decisions to save their kind.",
"release_date": {
"__jq": 0
}
}
This is my version:
> jq --version
jq-1.6
In the end I want to get something like this working:
> cat movies.json |
jq 'map_values(try (.release_date |= if . == null or . == ""
then .
else (. | strptime("%Y-%m-%d") | mktime) end)
catch (.))'
Update: It appears you discovered a bug.
Some parts of your problem statement are unclear to me:
You special-case null and "", but if . is otherwise unparseable, you let strptime/1 error and presumably want to return the unparseable input (via catch (.), which has a subtlety to it). Why not just try to parse the date and fall back to the unparseable input regardless of special cases?
If you do intend to special-case, why not make the condition if type == "string"? This is, after all, the only meaningful type to feed to strptime/1 (even if it might not contain a parseable date). (A sketch follows these points.)
When defaulting to the unparseable input, the output type/schema becomes unpredictable, but perhaps this is okay for you.
The "In the end" part suggests, by inlining .release_date within try, that this field may be optional. I don't know if you intended to signify that this may be the case, so I'm choosing to go with the assumption that it's not going to be, since it isn't specified.
jq 1.5
Here's a simplified example, some object properties removed:
$ jq -c '.[]' release_date.json
{"id":"42","release_date":"2019-12-31"}
{"id":"42","release_date":null}
{"id":"42","release_date":"SOON!"}
$ jq 'map(.release_date |= . as $date | try (strptime("%Y-%m-%d") | mktime) catch $date)' release_date.json
[
{
"id": "42",
"release_date": 1577750400
},
{
"id": "42",
"release_date": null
},
{
"id": "42",
"release_date": "SOON!"
}
]
In the catch block, . refers to the stringified exception, so to default to the unparseable value, it is temporarily referred to as $date. Or, using a user-defined function with a $date argument:
$ jq 'def parsedate($date):
try ($date | strptime("%Y-%m-%d") | mktime)
catch $date;
map(.release_date |= parsedate(.))' release_date.json
jq 1.6
Interestingly, the solutions above don't work in jq 1.6.
I've tried to narrow the discrepancy down to a minimum.
https://jqplay.org/s/M_RpdNHvHF:
$ jq-1.6 '. |= try . catch .' <<< 1
{
"__jq": 0
}
Until this unintended behavior changes, avoiding |= and try-catch together is a viable option:
https://jqplay.org/s/ki8I1YnU56:
$ jq 'map(.release_date = (.release_date | . as $date
| try (strptime("%Y-%m-%d") | mktime) catch $date))' release_date.json
https://jqplay.org/s/i4FJPpXEG0:
$ jq 'def parsedate($date):
try ($date | strptime("%Y-%m-%d") | mktime)
catch $date;
map(.release_date = parsedate(.release_date))' release_date.json
I've reported it here.
In brief, jq 1.6 introduced a bug affecting the handling of catch/try in the context of |=.
The simple workaround in the present case is to avoid |= e.g. by:
(.release_date
| try (if . == null or . == ""
then .
else strptime("%Y-%m-%d") | mktime end)
catch .) as $r
| .release_date = $r
Notice there is no need for an initial . | in the else clause.
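For the whole array (the "In the end" goal), a sketch that combines this workaround with the $date-binding idiom shown earlier, so that the original value rather than the error string is kept when parsing fails:
jq 'map(.release_date = (.release_date as $date
      | if $date == null or $date == "" then $date
        else (try ($date | strptime("%Y-%m-%d") | mktime) catch $date)
        end))' movies.json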
Related
I've got JSON that, among other top-level content, includes the following:
{
"organizationStructure": [
{
"id": 212119,
"key": "level2"
},
{
"id": 212112,
"key": "level1"
}
]
}
How can I filter by the key to find only a given id (such as that for "level2")?
Or, to keep it simpler (or more complicated, depending on your point of view):
jq '.organizationStructure[] |
select(.key == "level1") | {id: .id}'
Often, it's nice to clean up the output:
jq -r '.organizationStructure[] |
select(.key == "level1") | {id: .id}.id'
As per PesaThe's suggestion in the comments, this can be simplified to:
jq -r '.organizationStructure[] |
select(.key == "level1").id'
and {id: .id} can be written simply as {id}
This can be done using select. Note that if there is more than one item with the key "level2", this will only return the first:
.organizationalStructure | map(select(.key == "level2") | .id)[0]
Struggling with formatting of data in jq. I have 2 issues.
I need to take the last array, .rental_methods, and concatenate its elements into one colon-separated string.
@csv doesn't seem to work with my query. I get the error: string ("5343") cannot be csv-formatted, only an array
My jq command is this (without the | @csv):
jq --arg LOC "$LOC" '.last_updated as $lu | .data[]|.[]| $lu, .station_id, .name, .region_id, .address, .rental_methods[]'
JSON:
{
"last_updated": 1539122087,
"ttl": 60,
"data": {
"stations": [{
"station_id": "5343",
"name": "Lot",
"region_id": "461",
"address": "Austin",
"rental_methods": [
"KEY",
"APPLEPAY",
"ANDROIDPAY",
"TRANSITCARD",
"ACCOUNTNUMBER",
"PHONE"
]
}
]
}
}
I'd like the output to end up as:
1539122087,5343,Lot,461,Austin,KEY:APPLEPAY:ANDROIDPAY:TRANSITCARD:ACCOUNTNUMBER:PHONE:,
Using @csv:
jq -r '.last_updated as $lu
| .data[][]
| [$lu, .station_id, .name, .region_id, .address, (.rental_methods | join(":")) ]
| @csv'
What you were probably missing with @csv before was an array constructor around the list of things you wanted in the CSV record.
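To see the difference in isolation (a minimal sketch, run with -n just for illustration):
$ jq -rn '["5343", "Lot"] | @csv'
"5343","Lot"
$ jq -rn '"5343" | @csv'      # fails with the error quoted in the question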
You could repair your jq filter as follows:
.last_updated as $lu
| .data[][]
| [$lu, .station_id, .name, .region_id, .address,
(.rental_methods | join(":"))]
| @csv
With your JSON, this would produce:
1539122087,"5343","Lot","461","Austin","KEY:APPLEPAY:ANDROIDPAY:TRANSITCARD:ACCOUNTNUMBER:PHONE"
... which is not quite what you've said you want. Changing the last line to:
map(tostring) | join(",")
results in:
1539122087,5343,Lot,461,Austin,KEY:APPLEPAY:ANDROIDPAY:TRANSITCARD:ACCOUNTNUMBER:PHONE
This is exactly what you've indicated you want except for the terminating punctuation, which you can easily add (e.g. by appending + "," to the program above) if so desired.
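Putting the pieces together as one filter (a sketch; the + "," supplies the trailing comma shown in your desired output):
.last_updated as $lu
| .data[][]
| [$lu, .station_id, .name, .region_id, .address, (.rental_methods | join(":"))]
| map(tostring) | join(",") + ","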
This is VERY similar to Update one value in array of dicts, using jq
I have a foo.json and I want to update AAA to AAA-MY-SUFFIX.
Basically, I want to get the current value (AAA), and then add a suffix to it.
[
{
"Key": "Name",
"Value": "awesome"
},
{
"Key": "role",
"Value": "AAA"
}
]
From the previous question, I can REPLACE the value of AAA using this:
cat foo.json | jq '(.[] | select(.Key == "role") | .Value) |= "-MY_SUFFIX"'
But I want to APPEND a suffix to the existing value, not completely replace it.
Something like this (but it doesn't work, of course):
cat tags.json | jq '(.[] | select(.Key == "role") | .Value) |= .Value + "-MY_SUFFIX"'
I feel I'm SO close, but I just can't figure it out :(
Close indeed. You could simply replace .Value + "-MY_SUFFIX" by
. + "-MY_SUFFIX"
Or better yet, use +=, as in: ... += "-MY_SUFFIX"
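Spelled out against foo.json, the += variant would be (a sketch):
$ jq '(.[] | select(.Key == "role") | .Value) += "-MY_SUFFIX"' foo.json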
Personally, I'd use the filter:
map(if .Key == "role" then .Value += "-MY_SUFFIX" else . end)
(Actually, the stated requirements would accord better with using the suffix "-MY-SUFFIX" :-)
After much fooling around, I think I got it:
cat tags.json | jq '(.[] | select(.Key == "role") | .Value) |= (. + "-MY_SUFFIX")'
I have a JSON array that looks like this (produced by i3-msg -t get_workspaces):
[
{
"name": "1",
"urgent": false
},
{
"name": "2",
"urgent": false
},
{
"name": "something",
"urgent": false
}
]
I am trying to use jq to figure out the index number of an item in the list, based on a select query. jq has something called index(), but it seems to support only strings?
Using something like i3-msg -t get_workspaces | jq '.[] | select(.name=="something")' gives me the object I want. But I want its index, in this case 2 (counting from 0).
Is this possible using jq alone?
I provided a strategy for a solution to the OP, which the OP quickly accepted. Subsequently, #peak and #Jeff Mercado offered better and more complete solutions, so I have turned this into a community wiki. Please improve this answer if you can.
A straightforward solution (pointed out by #peak) is to use the builtin function, index:
map(.name == "something") | index(true)
The jq documentation confusingly suggests that index operates on strings, but it operates on arrays as well. Thus index(true) returns the index of the first true in the array of booleans produced by the map. If there is no item satisfying the condition, the result is null.
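With the workspace array from the question saved as workspaces.json (an assumed filename), the two steps look like this (a sketch):
$ jq -c 'map(.name == "something")' workspaces.json
[false,false,true]
$ jq 'map(.name == "something") | index(true)' workspaces.json
2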
jq expressions are evaluated in a "lazy" manner, but map will traverse the entire input array. We can verify this by rewriting the above code and introducing some debug statements:
[ .[] | debug | .name == "something" ] | index(true)
As suggested by #peak, the key to doing better is to use the break statement introduced in jq 1.5:
label $out |
foreach .[] as $item (
-1;
.+1;
if $item.name == "something" then
.,
break $out
else
empty
end
) // null
Note that the // is not a comment; it is the alternative operator. If the name is not found, the foreach will return empty, which is converted to null by the alternative operator.
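For reference, the alternative operator on its own (a minimal sketch):
$ jq -n 'empty // "fallback"'
"fallback"
$ jq -n '42 // "fallback"'
42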
Another approach is to recursively process the array:
def get_index(name):
name as $name |
if (. == []) then
null
elif (.[0].name == $name) then
0
else
(.[1:] | get_index($name)) as $result |
if ($result == null) then null else $result+1 end
end;
get_index("something")
However, as pointed out by #Jeff Mercado, this recursive implementation will use stack space proportional to the length of the array in the worst case. Version 1.5 of jq introduced Tail Call Optimization (TCO), which allows us to optimize this away using a local helper function (note that this is a minor adaptation of a solution provided by #Jeff Mercado, so as to be consistent with the above examples):
def get_index(name):
name as $name |
def _get_index:
if (.i >= .len) then
null
elif (.array[.i].name == $name) then
.i
else
.i += 1 | _get_index
end;
{ array: ., i: 0, len: length } | _get_index;
get_index("something")
According to #peak obtaining the length of an array in jq is a constant time operation, and apparently indexing an array is inexpensive as well. I will try to find a citation for this.
Now let's try to actually measure. Here is an example of measuring the simple solution:
#!/bin/bash
jq -n '
def get_index(name):
name as $name |
map(.name == $name) | index(true)
;
def gen_input(n):
n as $n |
if ($n == 0) then
[]
else
gen_input($n-1) + [ { "name": $n, "urgent":false } ]
end
;
2000 as $n |
gen_input($n) as $i |
[(0 | while (.<$n; [ ($i | get_index(.)), .+1 ][1]))][$n-1]
'
When I run this on my machine, I get the following:
$ time ./simple
1999
real 0m10.024s
user 0m10.023s
sys 0m0.008s
If I replace this with the "fast" version of get_index:
def get_index(name):
name as $name |
label $out |
foreach .[] as $item (
-1;
.+1;
if $item.name == $name then
.,
break $out
else
empty
end
) // null;
Then I get:
$ time ./fast
1999
real 0m13.165s
user 0m13.173s
sys 0m0.000s
And if I replace it with the "fast" recursive version:
def get_index(name):
name as $name |
def _get_index:
if (.i >= .len) then
null
elif (.array[.i].name == $name) then
.i
else
.i += 1 | _get_index
end;
{ array: ., i: 0, len: length } | _get_index;
I get:
$ time ./fast-recursive
1999
real 0m52.628s
user 0m52.657s
sys 0m0.005s
Ouch! But we can do better. #peak mentioned an undocumented switch, --debug-dump-disasm, which lets you see how jq compiles your code. With it you can see that building a state object, passing it to the inner helper, and then extracting the array, length, and index from it on each call is expensive. Refactoring to pass just the index is a huge improvement, and a further refinement that avoids testing the index against the length makes it competitive with the iterative version:
def indexof($name):
(.+[{name: $name}]) as $a | # add a "sentinel"
length as $l | # note length sees original array
def _indexof:
if ($a[.].name == $name) then
if (. != $l) then . else null end
else
.+1 | _indexof
end
;
0 | _indexof
;
I get:
$ time ./fast-recursive2
null
real 0m13.238s
user 0m13.243s
sys 0m0.005s
So it appears that if each element is equally likely to be the one sought, and you want good average-case performance, you should stick with the simple implementation. (C-coded functions tend to be fast!)
The solution originally proposed by #Jim-D using foreach would only work as intended for arrays of JSON objects, and both the originally proposed solutions are very inefficient. Their behavior in the absence of an item satisfying the condition might also have been surprising.
Solution using index/1
If you just want a quick-and-easy solution, you can use the builtin function, index, as follows:
map(.name == "something") | index(true)
If there is no item satisfying the condition, then the result will be null.
Incidentally, if you wanted ALL indices for which the condition is true, then the above is easily transformed into a super-fast solution by simply changing index to indices:
map(.name == "something") | indices(true)
Efficient solution
Here is a generic and efficient function that returns the index (i.e. offset) of the first item in the input array for which (item|f) is truthy (neither null nor false), and null otherwise. (In jq, JavaScript, and many other languages, array indices are always 0-based.)
# 0-based index of item in input array such that f is truthy, else null
def which(f):
label $out
| foreach .[] as $x (-1; .+1; if ($x|f) then ., break $out else empty end)
// null ;
Example usage:
which(.name == "something")
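For example, against the workspace array saved as workspaces.json (an assumed filename), with the def repeated inline (a sketch):
$ jq 'def which(f):
        label $out
        | foreach .[] as $x (-1; .+1; if ($x|f) then ., break $out else empty end)
        // null;
      which(.name == "something")' workspaces.json
2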
Converting an array to entries gives you access to both the index and the value of each item in the array. You could use that to find the value you're looking for and get its index.
def indexof(predicate):
reduce to_entries[] as $i (null;
if (. == null) and ($i.value | predicate) then
$i.key
else
.
end
);
indexof(.name == "something")
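For reference, to_entries on the workspace array produces index/value pairs like these (a sketch, again assuming workspaces.json):
$ jq -c 'to_entries[]' workspaces.json
{"key":0,"value":{"name":"1","urgent":false}}
{"key":1,"value":{"name":"2","urgent":false}}
{"key":2,"value":{"name":"something","urgent":false}}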
This, however, does not short-circuit and will go through the entire array to find the index. You'll want to return as soon as the first index has been found, so a recursive approach might be more appropriate.
def indexof(predicate):
def _indexof:
if .i >= .len then
null
elif (.arr[.i] | predicate) then
.i
else
.i += 1 | _indexof
end;
{ arr: ., i: 0, len: length } | _indexof;
indexof(.name == "something")
Note that the arguments are passed to the inner function in this way to take advantage of some optimizations: for TCO to apply, the function must not accept any additional parameters.
A still faster version can be obtained by recognizing that the array and its length do not vary:
def indexof(predicate):
. as $in
| length as $len
| def _indexof:
if . >= $len then null
elif ($in[.] | predicate) then .
else . + 1 | _indexof
end;
0 | _indexof;
Here is another version which seems to be slightly faster than the optimized versions from #peak and #jeff-mercado:
label $out | . as $elements | range(length) |
select($elements[.].name == "something") | . , break $out
IMO it is easier to read although it still relies on the break (to get the first match only).
I was doing 100 iterations on a ~1,000,000-element array (with the last element being the one to match). I only counted the user and kernel times, not the wall-clock time. On average this solution took 3.4s, #peak's solution took 3.5s, and #jeff-mercado's took 3.6s. This matched what I was seeing in one-off runs, although to be fair I did have a run where this solution took 3.6s on average, so there is unlikely to be any statistically significant difference between the solutions.
I've been banging my head against the wall for several hours on this and just can't seem to find a way to do it. I have an array of keys and an array of values; how can I generate an object? Input:
[["key1", "key2"], ["val1", "val2"]]
Output:
{"key1": "val1", "key2": "val2"}
Resolved this on github:
.[0] as $keys |
.[1] as $values |
reduce range(0; $keys|length) as $i ( {}; . + { ($keys[$i]): $values[$i] })
The current version of jq has a transpose filter that can be used to pair up the keys and values. You could use it to build out the result object rather easily.
transpose | reduce .[] as $pair ({}; .[$pair[0]] = $pair[1])
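For reference, transpose pairs the keys with the values first (a sketch with the sample input):
$ jq -c 'transpose' <<< '[["key1","key2"],["val1","val2"]]'
[["key1","val1"],["key2","val2"]]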
Just to be clear:
(0) Abdullah Jibaly's solution is simple, direct, efficient and generic, and should work in all versions of jq;
(1) transpose/0 is a builtin in jq 1.5 and has been available in pre-releases since Oct 2014;
(2) using transpose/0 (or an equivalent zip/0), an even shorter but still simple, fast, and generic solution to the problem is:
transpose | map( {(.[0]): .[1]} ) | add
Example:
$ jq 'transpose | map( {(.[0]): .[1]} ) | add'
Input:
[["k1","k2","k3"], [1,2,3] ]
Output:
{
"k1": 1,
"k2": 2,
"k3": 3
}
Scratch this, it doesn't actually work for any array larger than size 2.
[map(.[0]) , map(.[1])] | map({(.[0]):.[1]}) | add
Welp, I thought this would be easy, having a little Prolog experience... oh man. I ended up banging my head against a wall too. Don't think I'll ever use jq ever again.
Here is a solution which uses reduce with a state object holding an iteration index and a result object. It iterates over the keys in .[0], setting the corresponding values in the result from .[1]:
.[1] as $v
| reduce .[0][] as $k (
{idx:0, result:{}}; .result[$k] = $v[.idx] | .idx += 1
)
| .result
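Applied to the sample input (a sketch):
$ jq '.[1] as $v
      | reduce .[0][] as $k (
          {idx:0, result:{}}; .result[$k] = $v[.idx] | .idx += 1
        )
      | .result' <<< '[["key1","key2"],["val1","val2"]]'
{
  "key1": "val1",
  "key2": "val2"
}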