how to grep the values from mongodb - arrays

I am new to development and am trying to extract values from a JSON file. Can someone help me with this?
[{
"State": "New York",
"City": "Queens",
"Cars": {
"gas": {
"USAMade": {
"Ford": ["Fordcars", "Fordtrucks", "Fordsuv"]
},
"OutsideUS": {
"Toyota": ["Tcars", "Ttrucks", "TSUV"]
}
},
"electric": {
"USAMade": {
"Tesla": ["model3", "modelS", "modelX"]
},
"OutsideUS": {
"Nissan": ["Ncars", "Ntrucks", "NSUV"]
}
}
}
},
{
"State": "Atlanta",
"City": "Roswell",
"Cars": {
"gas": {
"USAMade": {
"Ford": ["Fordcars", "Fordtrucks", "Fordsuv"]
},
"OutsideUS": {
"Toyota": ["Tcars", "Ttrucks", "TSUV"]
}
},
"electric": {
"USAMade": {
"Tesla": ["model3", "modelS", "modelX"]
},
"OutsideUS": {
"Nissan": ["Ncars", "Ntrucks", "NSUV"]
}
}
}
}
]
How can I list the types of cars (gas/electric)?
Once I have the type, I want to list the respective country of manufacture (USAMade/OutsideUS).
After that, I want to list the models (Ford/Toyota).

Let's suppose you have the documents in the file test.json. Here is how to extract the keys using the Linux shell tools cat, jq, sort and uniq:
1) cat test.json | jq '.[] | .Cars | keys[] ' | sort | uniq
"electric"
"gas"
2) cat test.json | jq '.[] | .Cars[] | keys[] ' | sort | uniq
"OutsideUS"
"USAMade"
3) cat test.json | jq '.[] | .Cars[][] | keys[] ' | sort | uniq
"Ford"
"Nissan"
"Tesla"
"Toyota"
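For readers without jq at hand, the three pipelines above can be cross-checked with a minimal Python sketch. It assumes every level below "Cars" is an object, as in the question; the helper name keys_at_depth and the abbreviated DOCS sample are made up for illustration.

```python
import json

# Abbreviated copy of the question's data (one record, same nesting).
DOCS = json.loads("""
[{"State": "New York", "City": "Queens",
  "Cars": {"gas": {"USAMade": {"Ford": ["Fordcars"]},
                   "OutsideUS": {"Toyota": ["Tcars"]}},
           "electric": {"USAMade": {"Tesla": ["model3"]},
                        "OutsideUS": {"Nissan": ["Ncars"]}}}}]
""")


def keys_at_depth(docs, depth):
    """Return the sorted distinct keys found `depth` levels below "Cars"."""
    level = [doc["Cars"] for doc in docs]
    for _ in range(depth):
        # Descend one level: replace each object by its child objects.
        level = [child for node in level for child in node.values()]
    return sorted({key for node in level for key in node})


print(keys_at_depth(DOCS, 0))  # fuel types, like '.Cars | keys[]'
print(keys_at_depth(DOCS, 1))  # countries, like '.Cars[] | keys[]'
print(keys_at_depth(DOCS, 2))  # makes, like '.Cars[][] | keys[]'
```

The set comprehension plays the role of sort | uniq in the shell pipelines.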
If your data is in MongoDB, I suggest you keep these distinct values in a single document in a separate collection and populate the frontend page from that collection on load. The document can look something like this:
{
State:["Atlanta","Oregon"],
City:["New York" , "Tokyo" , "Moscow"],
Location:["OutsideUS" ,"USAMade"],
Model:["Ford","Toyota","Nissan"]
}
You don't need to extract the distinct values from the database every time your front page loads: that is not a scalable solution, and at some point it will become a performance bottleneck.
But if you still want to get only the distinct keys from MongoDB based on the selection, you can do it as follows:
mongos> db.test.aggregate([ {"$project":{"akv":{"$objectToArray":"$Cars"}}} ,{$unwind:"$akv"} ,{ $group:{_id:null , "allkeys":{$addToSet:"$akv.k"} } }] ).pretty()
{ "_id" : null, "allkeys" : [ "gas", "electric" ] }
mongos> db.test.aggregate([ {"$project":{"akv":{"$objectToArray":"$Cars.gas"}}} ,{$unwind:"$akv"} ,{ $group:{_id:null , "allkeys":{$addToSet:"$akv.k"} } }] ).pretty()
{ "_id" : null, "allkeys" : [ "USAMade", "OutsideUS" ] }
mongos> db.test.aggregate([ {"$project":{"akv":{"$objectToArray":"$Cars.gas.USAMade"}}} ,{$unwind:"$akv"} ,{ $group:{_id:null , "allkeys":{$addToSet:"$akv.k"} } }] ).pretty()
{ "_id" : null, "allkeys" : [ "Ford" ] }
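The selection-based drill-down above can also be sketched client-side in Python, for example to sanity-check the aggregation results against an exported sample. This is only an illustration: distinct_keys and the abbreviated DOCS list are hypothetical names, and the path list plays the role of the dotted field path such as "$Cars.gas".

```python
# Abbreviated stand-in for the collection (two documents, same nesting).
DOCS = [
    {"State": "New York",
     "Cars": {"gas": {"USAMade": {"Ford": ["Fordcars"]},
                      "OutsideUS": {"Toyota": ["Tcars"]}},
              "electric": {"USAMade": {"Tesla": ["model3"]},
                           "OutsideUS": {"Nissan": ["Ncars"]}}}},
    {"State": "Atlanta",
     "Cars": {"gas": {"USAMade": {"Ford": ["Fordtrucks"]},
                      "OutsideUS": {"Toyota": ["TSUV"]}}}},
]


def distinct_keys(docs, path):
    """Collect the distinct keys of the sub-document at `path` in every doc."""
    found = set()
    for doc in docs:
        node = doc
        for part in path:
            # Stop descending as soon as the path does not exist in this doc.
            node = node.get(part) if isinstance(node, dict) else None
            if node is None:
                break
        if isinstance(node, dict):
            found.update(node)
    return sorted(found)


print(distinct_keys(DOCS, ["Cars"]))
print(distinct_keys(DOCS, ["Cars", "gas"]))
print(distinct_keys(DOCS, ["Cars", "gas", "USAMade"]))
```

Unlike $addToSet, which makes no promise about element order, this sketch sorts the keys so the result is stable.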

Related

jq - subtracting one array from another using a single command

I have three operations with jq to get the right result. How can I do it within one command?
Here is a fragment from the source JSON file
[
{
"Header": {
"Tenant": "tenant-1",
"Rcode": 200
},
"Body": {
"values": [
{
"id": "0b0b-0c0c",
"name": "NumberOfSearchResults"
},
{
"id": "aaaa0001-0a0a",
"name": "LoadTest"
}
]
}
},
{
"Header": {
"Tenant": "tenant-2",
"Rcode": 200
},
"Body": {
"values": []
}
},
{
"Header": {
"Tenant": "tenant-3",
"Rcode": 200
},
"Body": {
"values": [
{
"id": "cccca0003-0b0b",
"name": "LoadTest"
}
]
}
},
{
"Header": {
"Tenant": "tenant-4",
"Rcode": 200
},
"Body": {
"values": [
{
"id": "0f0g-0e0a",
"name": "NumberOfSearchResults"
}
]
}
}
]
I apply two filters and create two intermediate JSON files. First I create the list of all tenants
jq -r '[.[].Header.Tenant]' source.json >all-tenants.json
And then I create an array of all tenants that do not have a particular name present in the Body.values[] array:
jq -r '[.[] | select (all(.Body.values[]; .name !="LoadTest") ) ] | [.[].Header.Tenant]' source.json >filter1.json
Results - all-tenants.json
["tenant-1",
"tenant-2",
"tenant-3",
"tenant-4"
]
filter1.json
["tenant-2",
"tenant-4"
]
And then I subtract filter1.json from all-tenants.json to get the difference:
jq -r -n --argfile filter filter1.json --argfile alltenants all-tenants.json '$alltenants - $filter|.[]'
Result:
tenant-1
tenant-3
Tenant names (the values of the "Tenant" key) are unique, and each of them occurs only once in the source.json file.
Just to clarify: I understand that I can write select condition(s) that would give me the same result as subtracting two arrays.
What I want to understand is how I can assign these two arrays to variables and use them directly in a single command, without the intermediate files.
Thanks
Use your filters to fill in the values of a new object and use the keys to refer to the arrays.
jq -r '{
"all-tenants": [.[].Header.Tenant],
"filter1": [.[]|select (all(.Body.values[]; .name !="LoadTest"))]|[.[].Header.Tenant]
} | .["all-tenants"] - .filter1 | .[]'
Note: .["all-tenants"] is required by the special character "-" in that key. See the entry under Object Identifier-Index in the manual.
how can I assign and use these two arrays into vars directly in a single command not involving the intermediate files?
Simply store the intermediate arrays as jq "$-variables":
[.[].Header.Tenant] as $x
| ([.[] | select (all(.Body.values[]; .name !="LoadTest") ) ] | [.[].Header.Tenant]) as $y
| $x - $y
If you want to itemize the contents of $x - $y, then simply add a final .[] to the pipeline.
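As a cross-check of the logic, here is a rough Python equivalent of the $x - $y pipeline. It is a sketch under the assumption that the source has the shape shown in the question; SOURCE is an abbreviated copy of that data and the variable names are illustrative.

```python
# Abbreviated copy of source.json from the question.
SOURCE = [
    {"Header": {"Tenant": "tenant-1"},
     "Body": {"values": [{"name": "NumberOfSearchResults"},
                         {"name": "LoadTest"}]}},
    {"Header": {"Tenant": "tenant-2"}, "Body": {"values": []}},
    {"Header": {"Tenant": "tenant-3"},
     "Body": {"values": [{"name": "LoadTest"}]}},
    {"Header": {"Tenant": "tenant-4"},
     "Body": {"values": [{"name": "NumberOfSearchResults"}]}},
]

# Counterpart of: [.[].Header.Tenant] as $x
all_tenants = [item["Header"]["Tenant"] for item in SOURCE]

# Counterpart of: [... select(all(.Body.values[]; .name != "LoadTest")) ...] as $y
# all() over an empty list is True, which is why tenant-2 is included.
without_load_test = [
    item["Header"]["Tenant"]
    for item in SOURCE
    if all(value["name"] != "LoadTest" for value in item["Body"]["values"])
]

# Counterpart of: $x - $y (drop every element that also occurs in $y)
difference = [t for t in all_tenants if t not in without_load_test]

for tenant in difference:
    print(tenant)
```

Both intermediate lists live only in memory, which is the same idea as binding them with "as $x" and "as $y" instead of writing files.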

Trying to filter an array output with jq

I have the given input as such:
[{
"ciAttributes": {
"entries": "{\"hostname-cdc1.website.com\":[\"127.0.0.1\"],\"hostname-cdc1-extension.website.com\":[\"127.0.0.1\"]}"
},
"ciAttributes": {
"entries": "{\"hostname-dfw1.website.com\":[\"127.0.0.1\"],\"hostname-dfw1-extension.website.com\":[\"127.0.0.1\"]}"
},
"ciAttributes": {
"entries": "{\"hostname-cdc2.website.com\":[\"127.0.0.1\"],\"hostname-cdc2-extension.website.com\":[\"127.0.0.1\"]}"
},
"ciAttributes": {
"entries": "{\"hostname-dfw2.website.com\":[\"127.0.0.1\"],\"hostname-dfw2-extension.website.com\":[\"127.0.0.1\"]}"
}
}]
...and when I execute my jq with the following command (manipulating existing json):
jq '.[].ciAttributes.entries | fromjson | keys | [ { hostname: .[0] }] | add' | jq -s '{ instances: . }'
...I get this output:
{
"instances": [
{
"hostname": "hostname-cdc1.website.com"
},
{
"hostname": "hostname-dfw1.website.com"
},
{
"hostname": "hostname-cdc2.website.com"
},
{
"hostname": "hostname-dfw2.website.com"
}
]
}
My end goal is to only extract "hostnames" that contain "cdc." I've tried playing with the json select expression but I get a syntax error so I'm sure I'm doing something wrong.
First, there is no need to call jq more than once.
Second, because the main object does not have distinct key names, you would have to use the --stream command-line option.
Third, you could use test to select the hostnames of interest, especially if, as seems to be the case, the criterion can most easily be expressed as a regex.
So here in a nutshell is a solution:
Invocation
jq -n --stream -c -f program.jq input.json
program.jq
{instances:
[inputs
| select(length==2 and (.[0][-2:] == ["ciAttributes", "entries"]))
| .[-1]
| fromjson
| keys_unsorted[]
| select(test("cdc.[.]"))]}
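The underlying difficulty is that an ordinary JSON parser keeps only the last of several identical keys. As a hedged illustration of the same idea outside jq, the Python sketch below uses the standard json module's object_pairs_hook to keep every key/value pair, then applies the same regex. RAW is an abbreviated copy of the input and cdc_hostnames is a made-up helper name.

```python
import json
import re

# Abbreviated input: one object with repeated "ciAttributes" keys.
RAW = r'''
[{
  "ciAttributes": {"entries": "{\"hostname-cdc1.website.com\":[\"127.0.0.1\"],\"hostname-cdc1-extension.website.com\":[\"127.0.0.1\"]}"},
  "ciAttributes": {"entries": "{\"hostname-dfw1.website.com\":[\"127.0.0.1\"]}"},
  "ciAttributes": {"entries": "{\"hostname-cdc2.website.com\":[\"127.0.0.1\"]}"}
}]
'''


def cdc_hostnames(raw):
    """Return {"instances": [...]} with hostnames matching the regex cdc.[.]"""
    hostnames = []
    # object_pairs_hook=list turns each object into a list of (key, value)
    # pairs, so duplicate keys are not collapsed.
    for record in json.loads(raw, object_pairs_hook=list):
        for key, value in record:
            if key != "ciAttributes":
                continue
            for inner_key, inner_value in value:
                if inner_key == "entries":
                    # "entries" is itself a JSON document stored as a string.
                    for host in json.loads(inner_value):
                        if re.search(r"cdc.[.]", host):
                            hostnames.append(host)
    return {"instances": hostnames}


print(json.dumps(cdc_hostnames(RAW)))
```

Note that the pattern requires exactly one character between "cdc" and the dot, so the "-extension" hostnames are not selected.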

How to filter for key pairs in an object array (json and jq)

This is a follow-up on
jq select error: "Cannot index string with string <object>"
Previously, I could filter the entries in a JSON file that contain the target objects with the following input and filter:
[{
"input": {
"obj1": {
"$link": "randomtext1"
},
"id": "a"
}
}]
jq -r '.[] | select( any(.input[]; type=="object" and has("$link") and (.["$link"]=="randomtext1")))|.id'
will give "a"
How can I filter if now the key "$link" and its value "randomtext1" belong to an array?
[{
"input": {
"obj1": [{
"$link": "randomtext1"
}],
"id": "a"
}
}]
(I still want to be able to find "a" as the result)
Example .json:
[
{
"input": {
"obj1": [{
"$link": "randomtext1"
}],
"obj2": [{
"$link": "randomtext2"
}],
"someotherobj": "123"
},
"id": "a"
},
{
"input": {
"obj3": {
"$link": "randomtext1"
},
"obj4": {
"$link": "randomtext2"
}
},
"id": "b"
}
]
I am hoping to find both a and b with the "randomtext1" keyword, but the filter from the previous case returns only b, because obj1 and obj2 are now wrapped in arrays in the example JSON file.
Simply add an "or" to cover the new possibility:
.[]
| select( any(.input[];
(type=="object" and (has("$link") and (.["$link"]=="randomtext1")))
or (type=="array" and any(.[];
type == "object" and (has("$link") and (.["$link"]=="randomtext1")))) ))
|.id
... or more readably:
def relevant($txt):
type == "object" and has("$link") and (.["$link"]==$txt);
.[]
| select( any(.input[];
relevant("randomtext1")
or (type=="array" and any(.[]; relevant("randomtext1"))) ))
|.id
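The "object, or array of objects" test can be spelled out in Python as a rough sketch of the same logic. RECORDS is a trimmed copy of the example file; relevant and matching_ids are illustrative names that mirror the jq definition above.

```python
# Trimmed copy of the example .json from the question.
RECORDS = [
    {"input": {"obj1": [{"$link": "randomtext1"}],
               "obj2": [{"$link": "randomtext2"}],
               "someotherobj": "123"},
     "id": "a"},
    {"input": {"obj3": {"$link": "randomtext1"},
               "obj4": {"$link": "randomtext2"}},
     "id": "b"},
]


def relevant(node, text):
    """True if node is an object whose "$link" value equals text."""
    return isinstance(node, dict) and node.get("$link") == text


def matching_ids(records, text):
    """Return ids of records where any input value, or any element of an
    array-valued input, is a relevant object."""
    ids = []
    for record in records:
        for value in record["input"].values():
            in_array = isinstance(value, list) and any(
                relevant(element, text) for element in value
            )
            if relevant(value, text) or in_array:
                ids.append(record["id"])
                break  # one match per record is enough, like any()
    return ids


print(matching_ids(RECORDS, "randomtext1"))
```

Plain string values such as "someotherobj" fail both type checks and are skipped, which is the purpose of the type tests in the jq filter.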

JQ - return one array for multiple nested JSON arrays

I have a JSON structure that has repeated keys per message. I would like to combine these into one array per message.
[
{
"id": 1,
"PolicyItems": [
{
"accesses": [
{
"isAllowed": true,
"type": "drop"
},
{
"isAllowed": true,
"type": "select"
}
],
"groups": [],
"users": ["admin"]
}
]
},
{
"id": 2,
"PolicyItems": [
{
"accesses": [
{
"isAllowed": true,
"type": "drop"
},
{
"isAllowed": true,
"type": "update"
}
],
"groups": [],
"users": [
"admin",
"admin2"
]
}
]
}]
I have this:
cat ranger_v2.json | jq -r '[.[] | {"id", "access_type":(.PolicyItems[].accesses[] | .type)}]'
But this outputs:
[
{
"id": 1,
"access_type": "drop"
},
{
"id": 1,
"access_type": "select"
},
{
"id": 2,
"access_type": "drop"
},
{
"id": 2,
"access_type": "update"
}
]
However, what I want is to output:
[{
"id": 1,
"access_type": ["drop|select"]
},
{
"id": 2,
"access_type": ["drop|update"]
}]
Any ideas how I could do this? I'm a bit stumped!
The values could be 'drop' and 'select', but equally could be anything, so I don't want to hard code these.
Let's start by observing that with your input, the filter:
.[]
| {id, access_type: [.PolicyItems[].accesses[].type]}
produces the two objects:
{
"id": 1,
"access_type": [
"drop",
"select"
]
}
{
"id": 2,
"access_type": [
"drop",
"update"
]
}
Now it's a simple matter to tweak the above filter so as to produce the desired format:
[.[]
| {id, access_type: [.PolicyItems[].accesses[].type]}
| .access_type |= [join("|")] ]
Or equivalently, the one-liner:
map({id, access_type: [[.PolicyItems[].accesses[].type] | join("|")]})
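To make the shape of the transformation explicit, here is a minimal Python sketch of the same collect-then-join step. It assumes the "PolicyItems" spelling used in the sample input; POLICIES is an abbreviated copy of that input and summarize is a hypothetical helper name.

```python
# Abbreviated copy of the sample input.
POLICIES = [
    {"id": 1,
     "PolicyItems": [{"accesses": [{"isAllowed": True, "type": "drop"},
                                   {"isAllowed": True, "type": "select"}],
                      "groups": [], "users": ["admin"]}]},
    {"id": 2,
     "PolicyItems": [{"accesses": [{"isAllowed": True, "type": "drop"},
                                   {"isAllowed": True, "type": "update"}],
                      "groups": [], "users": ["admin", "admin2"]}]},
]


def summarize(policies):
    """One object per policy; access types joined with "|" in a 1-element list."""
    result = []
    for policy in policies:
        # Counterpart of [.PolicyItems[].accesses[].type]
        types = [access["type"]
                 for item in policy["PolicyItems"]
                 for access in item["accesses"]]
        # Counterpart of [join("|")]
        result.append({"id": policy["id"], "access_type": ["|".join(types)]})
    return result


print(summarize(POLICIES))
```

Nothing about "drop" or "select" is hard-coded: whatever type values are present get joined in input order.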
I found something that I can work with.
If I wrap the query with []...
cat ranger_v2.json | jq -r '[.[] | {"id", "access_type":([.PolicyItems[].accesses[] | .type])}]'
... it produces this type of output:
[
{
"id": 1,
"access_type": ["drop","select"]
},
{
"id": 2,
"access_type": ["drop","update"]
}
]
I can then use the following:
[(if (."access_type" | length > 0 ) then . else ."access_type" = [""] end )]
and
(."access_type" | @tsv)
before converting to @csv and using sed to replace the tabs with a pipe:
@csv' | sed -e "s/[\t]\+/|/g"
It may not be the most economical way of getting what I need, but it works for me. (Please let me know if there's a better way of doing it.)
cat ranger_v2.json | jq -r '[.[] | {"id", "access_type":([.PolicyItems[].accesses[] | .type])}] | .[] | [(if (."access_type" | length > 0 ) then . else ."access_type" = [""] end )] | .[] | [.id, (."access_type" | @tsv)] | @csv' | sed -e "s/[\t]\+/|/g"

Delete on nested array with jq

this is my data structure:
[
{
"name": "name1",
"organizations": [
{
"name": "name2",
"spaces": [
{
"name": "name3",
"otherkey":"otherval"
},
{
"name": "name4",
"otherkey":"otherval"
}
]
}
]
},
{
"name": "name21",
"organizations": [
{
"name": "name22",
"spaces": [
{
"name": "name23",
"otherkey":"otherval"
},
{
"name": "name24",
"otherkey":"otherval"
}
]
}
]
}
]
I just want to keep the object with name=name1, remove the nested array object with name=name4, and keep the rest of the object intact. I tried map(select), but that just gives me the full object. Is it possible to use del on specific subarrays and keep the rest as it is?
The result should be the following. In addition, I want to avoid enumerating all the attributes to keep on the outer objects:
[
{
"name": "name1",
"organizations": [
{
"name": "name2",
"spaces": [
{
"name": "name3",
"otherkey":"otherval"
}
]
}
]
}
]
any idea? thanks!
A very targeted solution would be:
path(.[0].organizations[0].spaces) as $target
| (getpath($target) | map(select(.name != "name4"))) as $new
| setpath($target; $new)
If permissible, though, you might consider:
walk(if type == "object" and (.spaces|type == "array")
then .spaces |= map(select(.name != "name4"))
else . end)
or:
del(.. | .spaces? // empty | .[] | select(.name == "name4") )
(If your jq does not have walk/1 then its jq definition can easily be found by googling.)
You can use the command below; it removes only the array element with "name": "name4".
jq 'del(.[] | .organizations? | .[] | .spaces?|.[] | select(.name? == "name4"))' yourJsonFile.json
Here is a solution using select, reduce, tostream and delpaths
map(
select(.name == "name1")
| reduce (tostream|select(length==2)) as [$p,$v] (
.
; if [$p[-1],$v] == ["name","name4"] then delpaths([$p[:-1]]) else . end
)
)
I took a similar approach to @peak's but inverted it: instead of selecting what we want and setting that in the output, we select what we don't want and delete it.
[path(.organizations[0].spaces[] | select(.name == "name4"))] as $trash | delpaths($trash)
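For comparison, the overall goal (keep only name1, drop the name4 space, leave every other attribute alone) can be sketched in Python as follows. This is an illustration under the assumption that the data has the shape shown in the question; keep_and_prune is a made-up helper and DATA is an abbreviated copy of the input.

```python
import copy

# Abbreviated copy of the data structure from the question.
DATA = [
    {"name": "name1",
     "organizations": [{"name": "name2",
                        "spaces": [{"name": "name3", "otherkey": "otherval"},
                                   {"name": "name4", "otherkey": "otherval"}]}]},
    {"name": "name21",
     "organizations": [{"name": "name22",
                        "spaces": [{"name": "name23", "otherkey": "otherval"},
                                   {"name": "name24", "otherkey": "otherval"}]}]},
]


def keep_and_prune(data, keep_name, drop_space):
    """Keep top-level entries named keep_name and remove spaces named
    drop_space, without listing any other attributes."""
    result = []
    for entry in data:
        if entry["name"] != keep_name:
            continue
        entry = copy.deepcopy(entry)  # leave the caller's data untouched
        for org in entry.get("organizations", []):
            if "spaces" in org:
                org["spaces"] = [space for space in org["spaces"]
                                 if space.get("name") != drop_space]
        result.append(entry)
    return result


print(keep_and_prune(DATA, "name1", "name4"))
```

Filtering the list in place of deleting by index avoids the shifting-index problem that path-based deletion has to handle when more than one element matches.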

Resources