How can I use jq to filter only certain fields of the original JSON into a CSV? - arrays

I saw this Stack Overflow answer with a simple jq command to convert JSON to a CSV file, but I need to improve it further.
Say I have the following JSON:
[
{
"name": "foo",
"env": "dev",
"version": "1.24"
},
{
"name": "bar",
"env": "staging",
"version": "1.21"
},
{
"name": "boo",
"env": "prod",
"version": "1.23"
},
{
"name": "far",
"env": "prod",
"version": "1.24"
}
]
How does one create the CSV with only the "name" and "version" fields?
My current command is:
jq -r '(map(keys) | add | unique) as $cols | map(.[] | {name, version} as $row | $cols | map($row[.])) as $rows | $cols, $rows[] | @csv'
This is not working. Can anyone provide some help?
Thanks!

If you know the two column names anyway, you could simply extract them directly using .name and .version:
<file jq -r '["name", "version"], (.[] | [.name, .version]) | @csv'
You can also use your $cols variable, so the names only appear once:
<file jq -r '["name", "version"] as $cols | $cols, (.[] | [.[$cols[]]]) | @csv'
Or import them dynamically, e.g. using --args:
<file jq -r '$ARGS.positional, (.[] | [.[$ARGS.positional[]]]) | @csv' \
--args name version
Output:
"name","version"
"foo","1.24"
"bar","1.21"
"boo","1.23"
"far","1.24"

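As for why the original command fails: inside map(...), the .[] iterates over each object's values, which are strings, so {name, version} ends up being applied to a string and jq reports an indexing error. One possible minimal repair, sketched here with the sample data held in a shell variable (the names json and csv are illustrative only, not part of the question), is to project each object onto the two wanted fields first and derive $cols from what is left:

```shell
# Sample data from the question, held in a variable for this sketch.
json='[
  {"name": "foo", "env": "dev",     "version": "1.24"},
  {"name": "bar", "env": "staging", "version": "1.21"},
  {"name": "boo", "env": "prod",    "version": "1.23"},
  {"name": "far", "env": "prod",    "version": "1.24"}
]'

# Keep only the wanted fields, then build the header row from the remaining keys.
csv=$(printf '%s\n' "$json" | jq -r '
  map({name, version})
  | (map(keys) | add | unique) as $cols
  | $cols, (.[] | [.[$cols[]]])
  | @csv
')

printf '%s\n' "$csv"
```

Because keys and unique both sort, the header comes out in alphabetical order; list the columns explicitly, as in the commands above, if a different order is needed.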
Related

JQ get objects from array that has a field ending in string

I am trying to do what I think should be a fairly simple filter but I keep running into errors. I have this JSON:
{
"versions": [
{
"archived": true,
"description": "Cod version 3.3/Sprint 8",
"id": "11500",
"name": "v 3.3",
"projectId": 11500,
"releaseDate": "2016-03-15",
"released": true,
"self": "https://xxxxxxx.atlassian.net/rest/api/2/version/11500",
"startDate": "2016-02-17",
"userReleaseDate": "14/Mar/16",
"userStartDate": "16/Feb/16"
},
{
"archived": true,
"description": "Hot fix",
"id": "12000",
"name": "v3.3.1",
"projectId": 11500,
"releaseDate": "2016-03-15",
"released": true,
"self": "https://xxxxxxx.atlassian.net/rest/api/2/version/12000",
"startDate": "2016-03-15",
"userReleaseDate": "14/Mar/16",
"userStartDate": "14/Mar/16"
},
{
"archived": false,
"id": "29704",
"name": "Sync-diff v1.0.0",
"projectId": 11500,
"releaseDate": "2022-02-16",
"released": true,
"self": "https://xxxxxxx.atlassian.net/rest/api/2/version/29704",
"startDate": "2022-02-06",
"userReleaseDate": "15/Feb/22",
"userStartDate": "05/Feb/22"
}
]
}
I just want to return any userReleaseDate that ends with '22'
I can get the boolean result by:
jq '.versions[].userReleaseDate | endswith("22")'
This prints false, false, true.
But I am not sure how to retrieve the objects. I tried variations of this:
[.versions[] as $keys | $keys select(endswith("22"))]
and each threw an error. Any help would be appreciated.
This was so close:
jq '.versions[].userReleaseDate | endswith("22")'
Rather than outputting whether they end with 22 or not, you want to select the values which end with 22. Fixed:
jq '.versions[].userReleaseDate | select( endswith("22") )'
Now, your question asks for the dates that end with 22, but the title suggests you want the objects. For that, you'd want something a little different. We want to select from the versions, not from the dates.
jq '.versions[] | select( .userReleaseDate | endswith("22") )' # As a stream
jq '[ .versions[] | select( .userReleaseDate | endswith("22") ) ]' # As an array
jq '.versions | map( select( .userReleaseDate | endswith("22") ) )' # As an array
There are a number of issues with [ .versions[] as $keys | $keys select(endswith("22")) ].
The keys of array elements aren't usually called keys but indexes. $indexes would be a better name.
Except .versions[] gets the values of the array elements, not the keys/indexes. $values would be a better name.
Except the variable only takes on a single value at a time. $value would be a better name.
$version would be an even better name.
There's a | missing between $keys and select(endswith("22")).
There's no mention of userReleaseDate anywhere.
The result is placed in an array (because of the [ ]). There's no need or desire for this.
You could use
.versions[] as $version | $version.userReleaseDate | select(endswith("22"))
or
.versions[].userReleaseDate as $date | $date | select(endswith("22"))
But these are just overly-complicated versions of
jq '.versions[].userReleaseDate | select( endswith("22") )'
Use select directly on the list of objects, extract and check the release date inside its argument:
jq '.versions[] | select(.userReleaseDate | endswith("22"))'
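One caveat the answers above do not cover: endswith raises an error when its input is not a string, so these filters stop with an error if some object has no userReleaseDate. The following sketch assumes such objects may exist (the trimmed sample data, the "unscheduled" entry, and the variable names json and names are made up for illustration) and substitutes an empty string using the alternative operator //:

```shell
# Trimmed, hypothetical data: the middle object has no userReleaseDate.
json='{"versions": [
  {"name": "v 3.3",            "userReleaseDate": "14/Mar/16"},
  {"name": "unscheduled"},
  {"name": "Sync-diff v1.0.0", "userReleaseDate": "15/Feb/22"}
]}'

# A missing date falls back to "", which never ends with "22", so the object is skipped.
names=$(printf '%s\n' "$json" | jq -r '
  .versions[]
  | select((.userReleaseDate // "") | endswith("22"))
  | .name
')

printf '%s\n' "$names"
```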

jq parse json problem cant get it to cycle loop

$ cat sax.json
{"sax": [{"name": "mex20", "links": {"self": "http://website/catalog/sax/e49887"}, "tags": null, "enabled": true, "id": "e49887", "description": null}, {"name": "mex15", "links": {"self": "http://website/catalog/sax/e6de26"}, "tags": null, "enabled": true, "id": "e6de26", "description": null}, {"name": "mex56", "links": {"self": "http://website/catalog/sax/6cc093"}, "tags": null, "enabled": true, "id": "6cc093", "description": null}, {"name": "mex82", "links": {"self": "http://website/catalog/sax/89e0fe"}, "tags": null, "enabled": true, "id": "89e0fe", "description": null}]}
$ cat sax.json | jq '.sax[] | select(.name | contains("mex"))' | jq .id
"e49887"
"e6de26"
"6cc093"
"89e0fe"
get_id.sh
#!/bin/bash
declare -a array=($(jq .sax[].name sax.json ))
for i in "${array[@]}"
do cat sax.json | jq '.sax[] | select(.name | contains($i))' | jq .id
done
The loop doesn't work.
Help, please.
In the first query, there is no need to call jq twice; you can also avoid the UUOC:
< sax.json jq '.sax[] | select(.name | contains("mex")) | .id'
To make a shell variable's value available to jq, it is often best to use the --arg or --argjson command-line option. In your case, you'd want to use --argjson as $i already contains the enclosing quotation marks: jq --argjson i "$i" ...
Alternatively, you could set the array contents using jq -r to strip away the quotation marks, and then use --arg i "$i".
The semantics of contains is rather complex; in general, to check if one string is a substring of another, it is better to use startswith, index, test, or similar, as appropriate.
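Putting these suggestions together, the loop could look roughly like the sketch below. It is only one possible arrangement: a shortened copy of the sample file is written to a scratch directory so the snippet is self-contained (an assumption, not part of the question), the names are read with jq -r, and an exact comparison with == stands in for contains.

```shell
# Write a shortened copy of the sample data to a scratch directory (for this sketch only).
tmp=$(mktemp -d)
cat > "$tmp/sax.json" <<'EOF'
{"sax": [{"name": "mex20", "id": "e49887"},
         {"name": "mex15", "id": "e6de26"},
         {"name": "mex56", "id": "6cc093"},
         {"name": "mex82", "id": "89e0fe"}]}
EOF

# -r prints the names without quotation marks, so --arg can pass each one as a plain string.
ids=$(jq -r '.sax[].name' "$tmp/sax.json" |
  while IFS= read -r name; do
    jq -r --arg i "$name" '.sax[] | select(.name == $i) | .id' "$tmp/sax.json"
  done)

printf '%s\n' "$ids"
rm -r "$tmp"
```

For this particular data a single call, jq -r '.sax[].id' sax.json, gives the same result; the loop only pays off when each name needs separate handling in the shell.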
jq --argjson i "$i"
So jq can't be used in loop?
foreach?

Use jq to replace txt file array string values from dictionary

I have a dictionary that looks like this:
{
"uid": "d6fc3e2b-0001a",
"name": "ABC Mgmt",
"type": "host"
}
{
"uid": "d6fc3e2b-0002a",
"name": "Server XYZ",
"type": "group"
}
{
"uid": "d6fc3e2b-0003a",
"name": "NTP Primary",
"type": "host"
}
{
"uid": "d6fc3e2b-0004a",
"name": "H-10.10.10.10",
"type": "host"
}
Then I have a txt file:
"d6fc3e2b-0001a"
"d6fc3e2b-0001a","d6fc3e2b-0002a","d6fc3e2b-0003a"
"d6fc3e2b-0004a"
Expected Output:
"ABC Mgmt"
"ABC Mgmt","Server XYZ","NTP Primary"
"H-10.10.10.10"
I'm having trouble getting jq to use an array that is not in JSON format. I tried various solutions that I found, but none of them worked. I am rather new to scripting and need some help.
input=file.txt
while IFS= read -r line
do
{
value=$(jq -r --arg line "$line" \
'from_entries | .[($line | split(","))[]]' \
dictionary.json)
echo $name
}
done < "$input"
In the following solution, the dictionary file is read using the --slurpfile command-line option, and the lines of "text" are read as raw strings using inputs in conjunction with the -n and -R command-line options. The -r command-line option is used in conjunction with the @csv filter to produce the desired output.
Invocation
jq -n -R -r --slurpfile dict stream.json -f program.jq stream.txt
program.jq
(INDEX($dict[]; .uid) | map_values(.name)) as $d
| inputs
| split(",")
| map(fromjson)
| map($d[.])
| @csv
Caveat
The above assumes that the quoted values in stream.txt do not themselves contain commas.
If the quoted values in stream.txt do contain commas, then it would be much easier if the values given on each line in stream.txt were given as JSON entities, e.g. as an array of strings, or as a sequence of JSON strings with no separator character.
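For reference, the invocation and program above can be tried end to end with something like the following sketch. The scratch directory is an assumption made for self-containment, the program is passed inline rather than with -f, and the file names follow the invocation above.

```shell
tmp=$(mktemp -d)

# Dictionary: a stream of JSON objects, as in the question.
cat > "$tmp/stream.json" <<'EOF'
{"uid": "d6fc3e2b-0001a", "name": "ABC Mgmt", "type": "host"}
{"uid": "d6fc3e2b-0002a", "name": "Server XYZ", "type": "group"}
{"uid": "d6fc3e2b-0003a", "name": "NTP Primary", "type": "host"}
{"uid": "d6fc3e2b-0004a", "name": "H-10.10.10.10", "type": "host"}
EOF

# Text file: comma-separated, double-quoted uids, one record per line.
cat > "$tmp/stream.txt" <<'EOF'
"d6fc3e2b-0001a"
"d6fc3e2b-0001a","d6fc3e2b-0002a","d6fc3e2b-0003a"
"d6fc3e2b-0004a"
EOF

result=$(jq -n -R -r --slurpfile dict "$tmp/stream.json" '
  (INDEX($dict[]; .uid) | map_values(.name)) as $d   # uid -> name lookup table
  | inputs                                           # one raw text line at a time
  | split(",")
  | map(fromjson)                                    # turn each quoted token into a plain string
  | map($d[.])
  | @csv
' "$tmp/stream.txt")

printf '%s\n' "$result"
rm -r "$tmp"
```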
Solution to problem described in a comment
Invocation
< original.json jq -r --slurpfile dict stream.json -f program.jq
program.jq
(INDEX($dict[]; .uid) | map_values(.name)) as $d
| .source
| map($d[.])
| @csv

jq - How to concatenate an array in json

Struggling with formatting of data in jq. I have 2 issues.
Need to take the last array .rental_methods and concatenate them into 1 line, colon separated.
@csv doesn't seem to work with my query. I get the error string ("5343") cannot be csv-formatted, only array
jq command is this (without the | @csv)
jq --arg LOC "$LOC" '.last_updated as $lu | .data[]|.[]| $lu, .station_id, .name, .region_id, .address, .rental_methods[]'
JSON:
{
"last_updated": 1539122087,
"ttl": 60,
"data": {
"stations": [{
"station_id": "5343",
"name": "Lot",
"region_id": "461",
"address": "Austin",
"rental_methods": [
"KEY",
"APPLEPAY",
"ANDROIDPAY",
"TRANSITCARD",
"ACCOUNTNUMBER",
"PHONE"
]
}
]
}
}
I'd like the output to end up as:
1539122087,5343,Lot,461,Austin,KEY:APPLEPAY:ANDROIDPAY:TRANSITCARD:ACCOUNTNUMBER:PHONE:,
Using @csv:
jq -r '.last_updated as $lu
| .data[][]
| [$lu, .station_id, .name, .region_id, .address, (.rental_methods | join(":")) ]
| @csv'
What you were probably missing with @csv before was an array constructor around the list of things you wanted in the CSV record.
You could repair your jq filter as follows:
.last_updated as $lu
| .data[][]
| [$lu, .station_id, .name, .region_id, .address,
(.rental_methods | join(":"))]
| @csv
With your JSON, this would produce:
1539122087,"5343","Lot","461","Austin","KEY:APPLEPAY:ANDROIDPAY:TRANSITCARD:ACCOUNTNUMBER:PHONE"
... which is not quite what you've said you want. Changing the last line to:
map(tostring) | join(",")
results in:
1539122087,5343,Lot,461,Austin,KEY:APPLEPAY:ANDROIDPAY:TRANSITCARD:ACCOUNTNUMBER:PHONE
This is exactly what you've indicated you want except for the terminating punctuation, which you can easily add (e.g. by appending + "," to the program above) if so desired.
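Assembled into a single command, the unquoted variant with the trailing comma might look like the sketch below. The JSON is inlined in a shell variable purely for illustration, and the variable names json and line are not from the question.

```shell
json='{
  "last_updated": 1539122087,
  "ttl": 60,
  "data": {"stations": [{
    "station_id": "5343", "name": "Lot", "region_id": "461", "address": "Austin",
    "rental_methods": ["KEY", "APPLEPAY", "ANDROIDPAY", "TRANSITCARD", "ACCOUNTNUMBER", "PHONE"]
  }]}
}'

line=$(printf '%s\n' "$json" | jq -r '
  .last_updated as $lu
  | .data[][]
  | [$lu, .station_id, .name, .region_id, .address,
     (.rental_methods | join(":"))]
  | map(tostring)        # make every element a string before joining
  | join(",") + ","      # plain comma-separated record plus the trailing comma
')

printf '%s\n' "$line"
```

Unlike @csv, this does no quoting or escaping, so it is only safe when no field contains a comma or a quotation mark.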

Count the number of instances in an array using JMESPath

In the example JSON at the bottom of this question, how can I count the number of key/value pairs in the array "Tags" using JMESPath?
According to the JMESPath documentation, I can do this using the count() function -
For example, the following expression creates an array containing the total number of elements in the foo object followed by the value of foo["bar"].
However, it seems that the documentation is incorrect. Using the JMESPath website, the query Reservations[].Instances[].[count(@), Tags] yields the result [ [ null ] ]. I then tested via the AWS command line and an error was returned -
Unknown function: count()
Is there actually a way of doing this using JMESPath?
Example JSON -
{
"Reservations": [
{
"Instances": [
{
"InstanceId": "i-asdf1234",
"InstanceName": "My Instance",
"Tags": [
{
"Value": "Value1",
"Key": "Key1"
},
{
"Value": "Value2",
"Key": "Key2"
},
{
"Value": "Value3",
"Key": "Key3"
},
{
"Value": "Value4",
"Key": "Key4"
}
]
}
]
}
]
}
The answer here is that the JMESPath documentation is shocking, and for some reason I was seeing out-of-date documentation (check the bottom right corner of the screen to see which version you are viewing).
I can do what I need to do using the length() function -
Reservations[].Instances[].Tags[] | length(@)
I managed to incorporate this use of length, length(Tags[*]), within a larger statement that I think is useful and wanted to share:
aws ec2 describe-instances --region us-west-2 --query 'Reservations[*].Instances[*].{id: InstanceId, ami_id: ImageId, type: InstanceType, tag_count: length(Tags[*])}' --profile prod --output table;
--------------------------------------------------------------------
|                        DescribeInstances                         |
+--------------+-----------------------+------------+--------------+
|    ami_id    |          id           | tag_count  |     type     |
+--------------+-----------------------+------------+--------------+
|  ami-abc123  |  i-redacted1          |  1         |  m3.medium   |
|  ami-abc456  |  i-redacted2          |  7         |  m3.xlarge   |
|  ami-abc789  |  i-redacted3          |  12        |  t2.2xlarge  |
+--------------+-----------------------+------------+--------------+

Resources