Converting tsv with arrays to JSON with jq


I found jq very helpful for converting a TSV file to JSON. However, I want to figure out how to do it with jq when the TSV contains an array column:
name age pets
Tim 15 cats,dogs
Joe 11 rabbits,birds
...
ideal JSON:
[
{
name: "Tim",
age: "15",
pet:["cats","dogs"]
},
{
name: "Joe",
age: "11",
pet:["rabbits","birds"]
}, ...
]
This is the command I tried:
cat file.tsv | jq --slurp --raw-input --raw-output 'split("\n") | .[1:-1] | map(split("\t")) |
map({"name": .[0],
"age": .[1],
"pet": .[2]})'
and the output of the above command is:
[
{
name: "Tim",
age: "15",
pet:"cats,dogs"
},
{
name: "Joe",
age: "11",
pet:"rabbits,birds"
}, ...
]

Like this:
jq -rRs 'split("\n")[1:-1] |
  map(split("\t") | {
      "name": .[0],
      "age": .[1],
      "pet": (.[2] | split(","))
    }
  )' input.tsv

In case the name includes any commas, I'd go with the following, which also avoids having to "slurp" the input:
inputs
| split("\t")
| {name: .[0], age: .[1], pet: .[2]}
| .pet |= split(",")
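To skip the header, invoke jq with the -R option but without -n, so that the header line is read as the first input and ignored while inputs yields the remaining lines, e.g. like this: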
jq -R -f program.jq input.tsv
If you want the result as an array, simply enclose the entire filter above in square brackets.
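As a minimal, self-contained sketch of the approach above (the sample rows come from the question; the shell variable names tsv and result are only illustrative), the filter can be wrapped in square brackets and fed a small TSV built with printf so that the tab characters are real:

```shell
# Sample TSV with real tab characters, produced by printf.
tsv=$(printf 'name\tage\tpets\nTim\t15\tcats,dogs\nJoe\t11\trabbits,birds')

# -R reads raw lines. Without -n, the header line becomes `.` and is ignored,
# while `inputs` yields every remaining line.
result=$(printf '%s\n' "$tsv" | jq -R -c '
  [ inputs
    | split("\t")
    | {name: .[0], age: .[1], pet: .[2]}
    | .pet |= split(",") ]
')

printf '%s\n' "$result"
# [{"name":"Tim","age":"15","pet":["cats","dogs"]},{"name":"Joe","age":"11","pet":["rabbits","birds"]}]
```

Only the pets column is split on commas here, so a comma inside a name would be left alone.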

Related

JQ - access nested square brackets with fields with no names

I'm trying to access a field in the list array via jq. The fields don't have names that I could use to access and extract them. Please assist.
I'm trying to extract John and Smith.
$ cat test.txt
{
"content": {
"list": [
[
[
"name",
"John",
123
],
[
"surname",
"Smith",
345
],
1
]
]
}
}
$ jq -r '.content | {name: ."list"}' test.txt
{
"name": [
[
[
"name",
"John",
123
],
[
"surname",
"Smith",
345
],
1
]
]
}

You could do something as naive as:
$ jq -r '.content.list[][][1]?' test.json
John
Smith
This extracts the second element from each of the third-level nested arrays and, thanks to the ?, ignores the numeric literal.
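A variant of the command above, as a rough sketch: instead of suppressing the error with ?, it keeps only array-typed elements using jq's built-in arrays filter. The JSON from the question is inlined in a shell variable (json and result are illustrative names) so that the snippet runs on its own.

```shell
# The document from the question, compacted onto one line.
json='{"content":{"list":[[["name","John",123],["surname","Smith",345],1]]}}'

# `arrays` passes through only array-typed items, so the trailing number 1
# is dropped before .[1] is applied.
result=$(printf '%s' "$json" | jq -r '.content.list[][] | arrays | .[1]')

printf '%s\n' "$result"
# John
# Smith
```

Filtering by type makes the intent explicit, whereas ? would also hide any other error raised by that expression.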
Alternatively, you could restructure the data beforehand to make it easier to query afterwards:
$ jq '.content.list | map(map({ (.[0]): .[1] }?) | add)' test.json
[
{
"name": "John",
"surname": "Smith"
}
]
Extracting the name(s) is then as simple as appending | .[].name:
$ jq '.content.list | map(map({ (.[0]): .[1] }?) | add) | .[].name' test.json
"John"

jq: map field array with different length

I'm working with those JSONs:
{
"extension": [
{
"url": "url1",
"system": "system1"
},
{
"url": "url2",
"system": "system2"
}
]
}
{
"extension": [
{
"url": "url3",
"system": "system3"
}
]
}
As you can see, the two JSON objects have .extension arrays of different lengths.
I'm using this command in order to map input JSONs:
jq --raw-output '[.extension[] | .url, .system] | @csv'
I'm getting that:
"url1","system1","url2","system2"
"url3","system3"
What I would like to get is:
"url1","system1","url2","system2"
"url3","system3",,
Any ideas about how I could map those "fields" "correctly"?

Flip the table twice using transpose | transpose to fill the slots missing from the ragged (non-rectangular) shape with null:
jq -rs 'map(.extension) | transpose | transpose[] | map(.url, .system) | @csv'
"url1","system1","url2","system2"
"url3","system3",,
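To see why the double transpose pads the shorter rows, here is a small sketch on a toy ragged array (the numbers and the variable names once and twice are made up for illustration):

```shell
# One transpose turns rows into columns, filling missing cells with null.
once=$(jq -n -c '[[1,2],[3]] | transpose')

# A second transpose restores the original orientation, now rectangular.
twice=$(jq -n -c '[[1,2],[3]] | transpose | transpose')

printf '%s\n%s\n' "$once" "$twice"
# [[1,3],[2,null]]
# [[1,2],[3,null]]
```

In the answer's command, those null cells are the entries for which .url and .system evaluate to null, which @csv renders as empty fields.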

A fairly efficient solution:
def pad:
(map(length)|max) as $mx
| map( . + [range(length;$mx)|null] );
[inputs | [.extension[] | (.url, .system)]]
| pad[]
| @csv
This of course should be used with the -n command-line option (and with -r to get raw CSV lines).
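For completeness, a sketch of one possible full invocation of the filter above, with the two sample objects from the question inlined in a shell variable (docs and result are illustrative names):

```shell
# The two input documents from the question, one per line.
docs='{"extension":[{"url":"url1","system":"system1"},{"url":"url2","system":"system2"}]}
{"extension":[{"url":"url3","system":"system3"}]}'

# -n stops jq from consuming the first document itself, so `inputs` sees both;
# -r prints the CSV rows as raw text rather than as JSON strings.
result=$(printf '%s\n' "$docs" | jq -n -r '
  def pad:
    (map(length) | max) as $mx
    | map(. + [range(length; $mx) | null]);
  [inputs | [.extension[] | (.url, .system)]]
  | pad[]
  | @csv
')

printf '%s\n' "$result"
# "url1","system1","url2","system2"
# "url3","system3",,
```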

Creating Array of Objects from Bash Array using jq

I am trying to create an array of objects in bash given an array in bash using jq.
Here is where I am stuck:
IDS=("baf3eca8-c4bd-4590-bf1f-9b1515d521ba" "ef2fa922-2038-445c-9d32-8c1f23511fe4")
echo "${IDS[@]}" | jq -R '[{id: ., names: ["bob", "sally"]}]'
Results in:
[
{
"id": "baf3eca8-c4bd-4590-bf1f-9b1515d521ba ef2fa922-2038-445c-9d32-8c1f23511fe4",
"names": [
"bob",
"sally"
]
}
]
My desired result:
[
{
"id": "baf3eca8-c4bd-4590-bf1f-9b1515d521ba",
"names": [
"bob",
"sally"
]
},
{
"id": "ef2fa922-2038-445c-9d32-8c1f23511fe4",
"names": [
"bob",
"sally"
]
}
]
Any help would be much appreciated.

Split your bash array into NUL-delimited items using printf '%s\0', then read the raw stream using -R or --raw-input and within your jq filter split them into an array using split and the delimiter "\u0000":
printf '%s\0' "${IDS[@]}" | jq -Rs '
split("\u0000") | map({id:., names: ["bob", "sally"]})
'

for id in "${IDS[@]}" ; do
echo "$id"
done | jq -nR '[ {id: inputs, names: ["bob", "sally"]} ]'
or as a one-liner:
printf "%s\n" "${IDS[@]}" | jq -nR '[{id: inputs, names: ["bob", "sally"]}]'
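Another option, sketched here on the assumption that jq 1.6 or later is available, is to skip the pipe and pass the values as positional arguments with --args, reading them back through $ARGS.positional. To keep the snippet runnable in any POSIX shell it uses the positional parameters ("$@"); with a Bash array you would pass "${IDS[@]}" in the same place.

```shell
# Stand-in for the Bash array: the shell's own positional parameters.
set -- "baf3eca8-c4bd-4590-bf1f-9b1515d521ba" "ef2fa922-2038-445c-9d32-8c1f23511fe4"

# Everything after --args is exposed to the program as $ARGS.positional.
result=$(jq -n -c '$ARGS.positional | map({id: ., names: ["bob", "sally"]})' --args "$@")

printf '%s\n' "$result"
# [{"id":"baf3eca8-c4bd-4590-bf1f-9b1515d521ba","names":["bob","sally"]},{"id":"ef2fa922-2038-445c-9d32-8c1f23511fe4","names":["bob","sally"]}]
```

Because no text stream is parsed, values containing spaces or newlines arrive intact.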

not sure about the most efficient way to filter a set of json arrays with jq when some keys are only present in a few arrays - //?

I just started using jq to try to consolidate a few keys from a much larger JSON file piped from curl.
The following command outputs the desired information from all 10 records, as expected:
curl http://localhost/test.json | jq -r '.array[] | {name: .firstname, job: .position, location: .sites[]? .officename}'
Two of the records have an additional key, contact, which is absent from the others. Adding that key to the command results in output only for those two records. Removing the ? from .contact[]? errors out with jq: error: Cannot iterate over null:
curl http://localhost/test.json | jq -r '.array[] | {name: .firstname, job: .position, location: .sites[]? .officename, phone: .contact[]? .number}'
I'm currently working around this by using the // operator as shown below. Is there a more efficient or recommended way of using jq in this manner?
curl http://localhost/test.json | jq -r '.array[] | {name: .firstname, job: .position, location: .sites[]? .officename, phone: .contact[]? .number} // {name: .firstname, job: .position, location: .sites[]? .officename}'
thanks
---edit: adding a short example of test.json below for reproducibility:
{
  "id": 1,
  "array": [{
      "firstname": "Nobody",
      "lastname": "Nothing",
      "sites": [{
        "officename": "Site1",
        "city": "City"
      }],
      "position": "Test1"
    },
    {
      "firstname": "Anybody",
      "lastname": "Anything",
      "sites": [{
        "officename": "Site2",
        "city": "City2"
      }],
      "position": "Test2",
      "contact": [{
        "number": "123-456-7890",
        "email": "test@test.com"
      }]
    }
  ]
}

Perhaps you are looking for a variant of your solution along the lines of:
.array[]
| {name: .firstname, job: .position, location: .sites[]? .officename}
+ ({phone: .contact[].number}? // null)
(The point being that adding null to a JSON object yields the same object.)
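This variant of course comes with the same caveats as the original. In particular, every array iterator after the first (.sites[]? and .contact[]) might be problematic: a record with several sites or contacts produces one output object per combination, and a record with no sites produces no output at all.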
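As a quick check of the variant above, here is a sketch that runs it against a trimmed-down copy of the sample test.json from the question (only the fields the filter touches are kept; json and result are illustrative shell variable names):

```shell
# Two records: only the second one has a "contact" key.
json='{"array":[{"firstname":"Nobody","position":"Test1","sites":[{"officename":"Site1"}]},{"firstname":"Anybody","position":"Test2","sites":[{"officename":"Site2"}],"contact":[{"number":"123-456-7890"}]}]}'

# When .contact is missing, the ? swallows the iteration error, // falls back
# to null, and adding null leaves the first object unchanged.
result=$(printf '%s' "$json" | jq -c '
  .array[]
  | {name: .firstname, job: .position, location: .sites[]?.officename}
    + ({phone: .contact[].number}? // null)
')

printf '%s\n' "$result"
# {"name":"Nobody","job":"Test1","location":"Site1"}
# {"name":"Anybody","job":"Test2","location":"Site2","phone":"123-456-7890"}
```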

Trying to filter an array output with jq

I have the given input as such:
[{
"ciAttributes": {
"entries": "{\"hostname-cdc1.website.com\":[\"127.0.0.1\"],\"hostname-cdc1-extension.website.com\":[\"127.0.0.1\"]}"
},
"ciAttributes": {
"entries": "{\"hostname-dfw1.website.com\":[\"127.0.0.1\"],\"hostname-dfw1-extension.website.com\":[\"127.0.0.1\"]}"
},
"ciAttributes": {
"entries": "{\"hostname-cdc2.website.com\":[\"127.0.0.1\"],\"hostname-cdc2-extension.website.com\":[\"127.0.0.1\"]}"
},
"ciAttributes": {
"entries": "{\"hostname-dfw2.website.com\":[\"127.0.0.1\"],\"hostname-dfw2-extension.website.com\":[\"127.0.0.1\"]}"
}
}]
...and when I execute my jq with the following command (manipulating existing json):
jq '.[].ciAttributes.entries | fromjson | keys | [ { hostname: .[0] }] | add' | jq -s '{ instances: . }'
...I get this output:
{
"instances": [
{
"hostname": "hostname-cdc1.website.com"
},
{
"hostname": "hostname-dfw1.website.com"
},
{
"hostname": "hostname-cdc2.website.com"
},
{
"hostname": "hostname-dfw2.website.com"
}
]
}
My end goal is to extract only the "hostnames" that contain "cdc". I've tried playing with the jq select expression, but I get a syntax error, so I'm sure I'm doing something wrong.

First, there is no need to call jq more than once.
Second, because the main object does not have distinct key names, you would have to use the --stream command-line option: the regular parser keeps only the last value for a repeated key.
Third, you could use test to select the hostnames of interest, especially if, as seems to be the case, the criterion can most easily be expressed as a regex.
So here in a nutshell is a solution:
Invocation
jq -n --stream -c -f program.jq input.json
program.jq
{instances:
[inputs
| select(length==2 and (.[0][-2:] == ["ciAttributes", "entries"]))
| .[-1]
| fromjson
| keys_unsorted[]
| select(test("cdc.[.]"))]}
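To illustrate why --stream matters here, the following sketch compares the regular parser with the streaming parser on a tiny made-up object that repeats a key (the keys a and k and the shell variable names are purely illustrative):

```shell
# An object in which the key "a" appears twice.
doc='{"a":{"k":"v1"},"a":{"k":"v2"}}'

# Regular parsing keeps only the last value stored under a repeated key.
plain=$(printf '%s' "$doc" | jq -c '.')

# Streaming emits one [path, leaf] event per value as it is read, so both
# values survive; length == 2 filters out the closing events.
streamed=$(printf '%s' "$doc" | jq -c --stream 'select(length == 2)')

printf '%s\n%s\n' "$plain" "$streamed"
# {"a":{"k":"v2"}}
# [["a","k"],"v1"]
# [["a","k"],"v2"]
```

The program above relies on the same events: it matches the paths ending in ciAttributes, entries and parses each leaf string with fromjson.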
