jq condensing sub array permutation query - arrays

I want to extract a CSV with one row for each sub-array item.
Given a JSON array whose elements each contain a sub-array, e.g. like this one:
[
{
"foo": 108,
"bar": ["a","b"]
},
{
"foo": 201,
"bar": ["c","d"]
}
]
It is possible to fetch the data by utilizing an intermediate object.
.[] | { "y": .foo, "x": .bar[] } | [.y,.x] | @csv
https://jqplay.org/s/922RlkbFNA
But I'd like to express it in a less elaborate form.
However the following does not work :( :
.[] | [ (.foo, .bar[]) ] | @csv
PS: I struggle to find a fitting headline

In three lines:
.[]
| [.foo] + (.bar[]|[.])
| @csv
or maybe less obscurely:
.[]
| .bar[] as $bar
| [.foo, $bar]
| @csv
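To see why the attempt in the question behaves differently, here is a minimal sketch that runs both filters against the sample input. It assumes jq is on the PATH; the shell variable names are only for illustration. Inside a single array constructor, .bar[] expands in place, so each object yields one row; binding each item with "as $bar" makes the array constructor run once per item.

```shell
# Sample input copied from the question.
input='[{"foo":108,"bar":["a","b"]},{"foo":201,"bar":["c","d"]}]'

# Attempt from the question: .bar[] expands inside one array, so each
# object yields a single row holding foo and every bar item.
one_row_each=$(printf '%s' "$input" | jq -r '.[] | [ (.foo, .bar[]) ] | @csv')

# Binding each bar item to a variable emits one array per item instead.
row_per_item=$(printf '%s' "$input" | jq -r '.[] | .bar[] as $bar | [.foo, $bar] | @csv')

printf '%s\n' "$one_row_each" "---" "$row_per_item"
```

The first filter prints two rows (108,"a","b" and 201,"c","d"); the second prints the four rows the question asks for.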

Related

JQ adding count of nested array objects to CSV

Working with a nested arrays like this:
[
{
"value1": "Data1-0",
"value2": "Data2-0",
"nArray": [
{"nValue1": "nData1-0a","nValue2": "nData2-0a"},
{"nValue1": "nData1-0b","nValue2": "nData2-0a"},
{"nValue1": "nData1-0c","nValue2": "nData2-0a"}
],
"value3": "Data3-0"
},
{
"value1": "Data1-1",
"value2": "Data2-1",
"nArray": [
{"nValue1": "nData1-1a","nValue2": "nData2-1a"},
{"nValue1": "nData1-1b","nValue2": "nData2-1a"}
],
"value3": "Data3-1"
}
]
Desired output is CSV format like this:
Value1,Value2,Value3,nArrayCount
Data1-0,Data2-0,Data3-0,3
Data1-1,Data2-1,Data3-1,2
I was able to get the nested values but that produces multiple rows for each nArray value with this:
[.[] | [.value1,.value2,.value3] + (.nArray[]? | [.nValue1]) ] | .[] | @csv
All I need is a count.
If the data is like that in your post (no double quotes or commas in the values), then:
#!/usr/bin/env bash
jq -r '["Value1", "Value2", "Value3", "nArrayCount"],
(.[] | [.value1, .value2, .value3, (.nArray|length)])
| join(",")' input.json
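The caveat about quotes and commas can be illustrated with a small sketch. The record below is hypothetical (it is not from the question), and the sketch assumes a jq version in which join accepts numbers, which I believe is the case from 1.5 onward. join(",") pastes values together verbatim, while @csv quotes strings and doubles embedded quotes.

```shell
# Hypothetical record whose value1 contains a comma and double quotes.
input='[{"value1":"Data, \"one\"","nArray":[{},{}]}]'

# join(",") does no escaping, so the embedded comma splits the field.
joined=$(printf '%s' "$input" | jq -r '.[] | [.value1, (.nArray|length)] | join(",")')

# @csv wraps strings in quotes and doubles any embedded double quote.
quoted=$(printf '%s' "$input" | jq -r '.[] | [.value1, (.nArray|length)] | @csv')

printf '%s\n' "$joined" "$quoted"
```

With this input the join version yields a line that a CSV reader would split into three fields, whereas the @csv line still parses as two.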
Close enough. Use length to get the length of an array.
$ jq -r '
["Value1", "Value2", "count"],
(.[] | [.value1, .value2, (.nArray|length)])
| @csv' input.json
"Value1","Value2","count"
"Data1-0","Data2-0",3
"Data1-1","Data2-1",2
$ jq -r '
["Value1", "Value2", "count"],
(.[] | [.value1,.value2] + [.nArray|length])
| @csv' input.json
"Value1","Value2","count"
"Data1-0","Data2-0",3
"Data1-1","Data2-1",2

JQ get objects from array that has a field ending in string

I am trying to do what I think should be a fairly simple filter but I keep running into errors. I have this JSON:
{
"versions": [
{
"archived": true,
"description": "Cod version 3.3/Sprint 8",
"id": "11500",
"name": "v 3.3",
"projectId": 11500,
"releaseDate": "2016-03-15",
"released": true,
"self": "https://xxxxxxx.atlassian.net/rest/api/2/version/11500",
"startDate": "2016-02-17",
"userReleaseDate": "14/Mar/16",
"userStartDate": "16/Feb/16"
},
{
"archived": true,
"description": "Hot fix",
"id": "12000",
"name": "v3.3.1",
"projectId": 11500,
"releaseDate": "2016-03-15",
"released": true,
"self": "https://xxxxxxx.atlassian.net/rest/api/2/version/12000",
"startDate": "2016-03-15",
"userReleaseDate": "14/Mar/16",
"userStartDate": "14/Mar/16"
},
{
"archived": false,
"id": "29704",
"name": "Sync-diff v1.0.0",
"projectId": 11500,
"releaseDate": "2022-02-16",
"released": true,
"self": "https://xxxxxxx.atlassian.net/rest/api/2/version/29704",
"startDate": "2022-02-06",
"userReleaseDate": "15/Feb/22",
"userStartDate": "05/Feb/22"
}
]
}
I just want to return any userReleaseDate that ends with '22'
I can get the boolean result by:
jq '.versions[].userReleaseDate | endswith("22")'
prints out false, false, true
But I am not sure how to retrieve the objects. I tried variations of this:
[.versions[] as $keys | $keys select(endswith("22"))]
and each threw an error. Any help would be appreciated.
This was so close:
jq '.versions[].userReleaseDate | endswith("22")'
Rather than outputting whether they end with 22 or not, you want to select the values which end with 22. Fixed:
jq '.versions[].userReleaseDate | select( endswith("22") )'
Now, your question asks for the dates that end with 22, but the title suggests you want the objects. For that, you'd want something a little different. We want to select from the versions, not from the dates.
jq '.versions[] | select( .userReleaseDate | endswith("22") )' # As a stream
jq '[ .versions[] | select( .userReleaseDate | endswith("22") ) ]' # As an array
jq '.versions | map( select( .userReleaseDate | endswith("22") ) )' # As an array
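As a quick check of the stream and array forms, here is a sketch against a trimmed-down copy of the input (only name and userReleaseDate are kept, which is a simplification for brevity). It assumes jq is installed; -c is used only to keep the output on one line.

```shell
# Trimmed-down input: two of the versions, with only two fields each.
input='{"versions":[{"name":"v 3.3","userReleaseDate":"14/Mar/16"},{"name":"Sync-diff v1.0.0","userReleaseDate":"15/Feb/22"}]}'

# Stream form: each matching object is emitted on its own.
as_stream=$(printf '%s' "$input" | jq -c '.versions[] | select(.userReleaseDate | endswith("22"))')

# Array form: the matching objects are collected into one array.
as_array=$(printf '%s' "$input" | jq -c '.versions | map(select(.userReleaseDate | endswith("22")))')

printf '%s\n' "$as_stream" "$as_array"
```

In both cases select tests the date but passes the whole version object through, which is why the object rather than the date is returned.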
There are a number of issues with [ .versions[] as $keys | $keys select(endswith("22")) ].
The keys of array elements aren't usually called keys but indexes. $indexes would be a better name.
Except .versions[] gets the values of the array elements, not the keys/indexes. $values would be a better name.
Except the variable only takes on a single value at a time. $value would be a better name.
$version would be an even better name.
There's a | missing between $keys and select(endswith("22")).
There's no mention of userReleaseDate anywhere.
The result is placed in an array (because of the [ ]). There's no need or desire for this.
You could use
.versions[] as $version | $version.userReleaseDate | select(endswith("22"))
or
.versions[].userReleaseDate as $date | $date | select(endswith("22"))
But these are just overly-complicated versions of
jq '.versions[].userReleaseDate | select( endswith("22") )'
Use select directly on the list of objects, extract and check the release date inside its argument:
jq '.versions[] | select(.userReleaseDate | endswith("22"))'
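One thing to watch for, sketched here under the assumption that some versions might lack userReleaseDate (the sample data has every date present, but description is already missing from one entry): endswith raises an error on null input in the jq versions I have used. Supplying an empty-string fallback with the alternative operator // avoids that. The input below is hypothetical.

```shell
# Hypothetical input in which the first version has no userReleaseDate.
input='{"versions":[{"name":"no date"},{"name":"Sync-diff v1.0.0","userReleaseDate":"15/Feb/22"}]}'

# (.userReleaseDate // "") substitutes "" when the field is missing or
# null, so endswith always receives a string.
names=$(printf '%s' "$input" | jq -r '.versions[] | select((.userReleaseDate // "") | endswith("22")) | .name')

printf '%s\n' "$names"
```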

Use JQ to form new arrays from items in arrays by index

I have some JSON data which is pretty typical CSV-style data, however it's represented in JSON. I am struggling to figure out the correct jq expression to convert the following JSON into JSON that can generate the appropriate CSV with @csv.
There's a fixed number of 'columns', i.e. the "AAA" values, but the number of values in each 'column' is dynamic yet fixed across columns. That is, the length of the arrays "AAA", "BBB", "CCC", etc are all the same, but the length is dynamic and can change between data sets.
Input (note invalid numbers present, to illustrate example):
{
"AAA": [
111.1,
111.2,
111.3,
111..,
111.n
],
"BBB": [
222.1,
222.2,
222.3,
222..,
222.n
],
"CCC": [
333.1,
333.2,
333.3,
333..,
333.n
],
"DDD": [
444.1,
444.2,
444.3,
444..,
444.n
],
"EEE": [
555.1,
555.2,
555.3,
555..,
555.n
]
}
Desired output (note invalid numbers present, to illustrate example):
{
[
"AAA",
"BBB",
"CCC",
"DDD",
"EEE"
],
[
111.1,
222.1,
333.1,
444.1,
555.1
],
[
111.2,
222.2,
333.2,
444.2,
555.2
],
[
111.3,
222.3,
333.3,
444.3,
555.3
],
[
111..,
222..,
333..,
444..,
555..
],
[
111.n,
222.n,
333.n,
444.n,
555.n
]
}
Here is the desired CSV, for illustration purposes (as converting with @csv is pretty straightforward):
AAA,BBB,CCC,DDD,EEE
111.1,222.1,333.1,444.1,555.1
111.2,222.2,333.2,444.2,555.2
111.3,222.3,333.3,444.3,555.3
111..,222..,333..,444..,555..
111.n,222.n,333.n,444.n,555.n
If the required expression is far easier without the first array in the result object containing the "AAA" 'header' values then I can easily live without them.
Thank you.
You can use the transpose function in jq to do the transposing of arrays, formed from keys/values.
jq '[ to_entries[] | [.key, .value[]] ] | transpose'
The bulk of the magic is performed by the transpose built-in, but before that you just need to collect the values into an array of arrays. The CSV result can be generated with the @csv format string.
jq --raw-output '[ to_entries[] | [.key, .value[]] ] | transpose[] | @csv'
You could also use map() and avoid the redundant [..]:
jq 'to_entries | map([.key, .value[]]) | transpose'
jq --raw-output 'to_entries | map([.key, .value[]]) | transpose[] | @csv'
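A small runnable sketch of the transpose approach, using a hypothetical two-column input with valid integers in place of the illustrative numbers from the question. Each entry becomes a column array with its key as the first item; transpose then turns columns into rows, so the keys end up as the header row.

```shell
# Hypothetical input: two columns of three values each.
input='{"AAA":[11,12,13],"BBB":[21,22,23]}'

# Build [key, values...] per column, transpose into rows, format as CSV.
csv=$(printf '%s' "$input" | jq -r 'to_entries | map([.key, .value[]]) | transpose[] | @csv')

printf '%s\n' "$csv"
```

Note that @csv quotes the header strings, so the first line comes out as "AAA","BBB" rather than AAA,BBB.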

jq - How to concatenate an array in json

Struggling with formatting of data in jq. I have 2 issues.
Need to take the last array .rental_methods and concatenate them into 1 line, colon separated.
@csv doesn't seem to work with my query. I get the error string ("5343") cannot be csv-formatted, only array
jq command is this (without the | @csv)
jq --arg LOC "$LOC" '.last_updated as $lu | .data[]|.[]| $lu, .station_id, .name, .region_id, .address, .rental_methods[]'
JSON:
{
"last_updated": 1539122087,
"ttl": 60,
"data": {
"stations": [{
"station_id": "5343",
"name": "Lot",
"region_id": "461",
"address": "Austin",
"rental_methods": [
"KEY",
"APPLEPAY",
"ANDROIDPAY",
"TRANSITCARD",
"ACCOUNTNUMBER",
"PHONE"
]
}
]
}
}
I'd like the output to end up as:
1539122087,5343,Lot,461,Austin,KEY:APPLEPAY:ANDROIDPAY:TRANSITCARD:ACCOUNTNUMBER:PHONE:,
Using @csv:
jq -r '.last_updated as $lu
| .data[][]
| [$lu, .station_id, .name, .region_id, .address, (.rental_methods | join(":")) ]
| @csv'
What you were probably missing with @csv before was an array constructor around the list of things you wanted in the CSV record.
You could repair your jq filter as follows:
.last_updated as $lu
| .data[][]
| [$lu, .station_id, .name, .region_id, .address,
(.rental_methods | join(":"))]
| @csv
With your JSON, this would produce:
1539122087,"5343","Lot","461","Austin","KEY:APPLEPAY:ANDROIDPAY:TRANSITCARD:ACCOUNTNUMBER:PHONE"
... which is not quite what you've said you want. Changing the last line to:
map(tostring) | join(",")
results in:
1539122087,5343,Lot,461,Austin,KEY:APPLEPAY:ANDROIDPAY:TRANSITCARD:ACCOUNTNUMBER:PHONE
This is exactly what you've indicated you want except for the terminating punctuation, which you can easily add (e.g. by appending + "," to the program above) if so desired.
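Putting those pieces together, a sketch of the unquoted variant with the trailing comma might look like this. The input is a shortened copy of the JSON from the question (only two rental methods are kept for brevity), and jq is assumed to be installed.

```shell
# Shortened copy of the question's JSON with two rental methods.
input='{"last_updated":1539122087,"data":{"stations":[{"station_id":"5343","name":"Lot","region_id":"461","address":"Austin","rental_methods":["KEY","PHONE"]}]}}'

# Build one array per station, stringify every field, join with commas,
# then append the terminating comma.
line=$(printf '%s' "$input" | jq -r '
  .last_updated as $lu
  | .data[][]
  | [$lu, .station_id, .name, .region_id, .address, (.rental_methods | join(":"))]
  | map(tostring) | join(",") + ","')

printf '%s\n' "$line"
```

Because + binds more tightly than |, the last stage reads as map(tostring) | (join(",") + ","), which is what is wanted here. As with any hand-rolled join, fields containing commas are not escaped.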

Count the number of instances in an array using JMESPath

In the example JSON at the bottom of this question, how can I count the number of key/value pairs in the array "Tags" using JMESPath?
According to the JMESPath documentation, I can do this using the count() function -
For example, the following expression creates an array containing the total number of elements in the foo object followed by the value of foo["bar"].
However, it seems that the documentation is incorrect. Using the JMESPath website, the query Reservations[].Instances[].[count(@), Tags] yields the result [ [ null ] ]. I then tested via the AWS command line and an error was returned -
Unknown function: count()
Is there actually a way of doing this using JMESPath?
Example JSON -
{
"Reservations": [
{
"Instances": [
{
"InstanceId": "i-asdf1234",
"InstanceName": "My Instance",
"Tags": [
{
"Value": "Value1",
"Key": "Key1"
},
{
"Value": "Value2",
"Key": "Key2"
},
{
"Value": "Value3",
"Key": "Key3"
},
{
"Value": "Value4",
"Key": "Key4"
}
]
}
]
}
]
}
The answer here is that the JMESPath documentation is shocking, and for some reason I was seeing out-of-date documentation (check the bottom right corner of the screen to see what version you are viewing).
I can do what I need to do using the length() function -
Reservations[].Instances[].Tags[] | length(@)
I managed to incorporate this usage of length, length(Tags[*]), into a larger statement that I think is useful and wanted to share:
aws ec2 describe-instances --region us-west-2 --query 'Reservations[*].Instances[*].{id: InstanceId, ami_id: ImageId, type: InstanceType, tag_count: length(Tags[*])}' --profile prod --output table;
--------------------------------------------------------------------
| DescribeInstances |
+--------------+-----------------------+------------+--------------+
| ami_id | id | tag_count | type |
+--------------+-----------------------+------------+--------------+
| ami-abc123 | i-redacted1 | 1 | m3.medium |
| ami-abc456 | i-redacted2 | 7 | m3.xlarge |
| ami-abc789 | i-redacted3 | 12 | t2.2xlarge |
+--------------+-----------------------+------------+--------------+

Resources