Construct unique arrays from nested array values with common parents - arrays

Likely a close question to JQ: Nested JSON transformation but I wasn't able to get my head around it.
Sample JSON:
"value": [
{
"FeatureStatus": [
{
"FeatureName": "Sway1",
"FeatureServiceStatus": "ServiceOperational"
},
{
"FeatureName": "Sway2",
"FeatureServiceStatus": "ServiceDegraded"
}
],
"Id": "SwayEnterprise",
},
{
"FeatureStatus": [
{
"FeatureName": "yammerfeatures",
"FeatureServiceStatus": "ServiceOperational"
}
],
"Id": "yammer"
}
]
What I want to do is create an output with jq which results in the following;
{"Sway":"Sway1":"ServiceOperational"},
{"Sway":"Sway2":"ServiceDegraded"},
{"Yammer":"yammerfeatures":"ServiceOperational"}
My various attempts either end up with thousands of non-unique (i.e Yammer with Sway status), or only one Id with x number of FeatureServiceStatus.
Any pointers would be greatly appreciated. I've gone through the tutorial and the cookbook. I am perhaps 2.5 days into using jq.

Assuming that the enclosing braces have been added to make the input valid JSON, the filter:
.value[]
| [.Id] + (.FeatureStatus[] | [ .FeatureName, .FeatureServiceStatus ])
produces:
["SwayEnterprise","Sway1","ServiceOperational"]
["SwayEnterprise","Sway2","ServiceDegraded"]
["yammer","yammerfeatures","ServiceOperational"]
You can then easily reformat this as desired.

Related

To fetch elements and compare elements from JSON object in Ruby

I have many json resources similar to the below one. But, I need to only fetch the json resource which satisfies the two conditions:
(1) component.code.text == Diastolic Blood Pressure
(2) valueQuantity.value < 90
This is the JSON object/resource
{
"fullUrl": "urn:uuid:edf9439b-0173-b4ab-6545 3b100165832e",
"resource": {
"resourceType": "Observation",
"id": "edf9439b-0173-b4ab-6545-3b100165832e",
"component": [ {
"code": {
"coding": [ {
"system": "http://loinc.org",
"code": "8462-4",
"display": "Diastolic Blood Pressure"
} ],
"text": "Diastolic Blood Pressure"
},
"valueQuantity": {
"value": 81,
"unit": "mm[Hg]",
"system": "http://unitsofmeasure.org",
"code": "mm[Hg]"
}
}, {
"code": {
"coding": [ {
"system": "http://loinc.org",
"code": "8480-6",
"display": "Systolic Blood Pressure"
} ],
"text": "Systolic Blood Pressure"
},
"valueQuantity": {
"value": 120,
"unit": "mm[Hg]",
"system": "http://unitsofmeasure.org",
"code": "mm[Hg]"
}
} ]
},
}
JSON file
I need to write a condition to fetch the resource with text: "Diastolic Blood Pressure" AND valueQuantity.value > 90
I have written the following code:
def self.hypertension_observation(bundle)
entries = bundle.entry.select {|entry| entry.resource.is_a?(FHIR::Observation)}
observations = entries.map {|entry| entry.resource}
hypertension_observation_statuses = ((observations.select {|observation| observation&.component&.at(0)&.code&.text.to_s == 'Diastolic Blood Pressure'}) && (observations.select {|observation| observation&.component&.at(0)&.valueQuantity&.value.to_i >= 90}))
end
I am getting the output without any error. But, the second condition is not being satisfied in the output. The output contains even values < 90.
Please anyone help in correcting this ruby code regarding fetching only, output which contains value<90
I will write out what I would do for a problem like this, based on the (edited) version of your json data. I'm inferring that the full json file is some list of records with medical data, and that we want to fetch only the records for which the individual's diastolic blood pressure reading is < 90.
If you want to do this in Ruby I recommend using the JSON parser which comes with your ruby distro. What this does is it takes some (hopefully valid) json data and returns a Ruby array of hashes, each with nested arrays and hashes. In my solution I saved the json you posted to a file and so I would do something like this:
require 'json'
require 'pp'
json_data = File.read("medical_info.json")
parsed_data = JSON.parse(json_data)
fetched_data = []
parsed_data.map do |record|
diastolic_text = record["resource"]["component"][0]["code"]["text"]
diastolic_value_quantity = record["resource"]["component"][0]["valueQuantity"]["value"]
if diastolic_value_quantity < 90
fetched_data << record
end
end
pp fetched_data
This will print a new array of hashes which contains only the results with the desired values for diastolic pressure. The 'pp' gem is for 'Pretty Print' which isn't perfect but makes the hierarchy a little easier to read.
I find that when faced with deeply nested JSON data that I want to parse in Ruby, I will save the JSON data to a file, as I did here, and then in the directory where the file is, I run IRB so I can just play with accessing the hash values and array elements that I'm looking for.

MongoDB filter a sub-array of Objects

The document structure has a round collection, which has an array of holes Objects embedded within it, with each hole played/scored entered.
The structure looks like this (there are more fields, but this summarises):
{
"_id": {
"$oid": "60701a691c071256e4f0d0d6"
},
"schema": {
"$numberDecimal": "1.0"
},
"playerName": "T Woods",
"comp": {
"id": {
"$oid": "607019361c071256e4f0d0d5"
},
"name": "US Open",
"tees": "Pro Tees",
"roundNo": {
"$numberInt": "1"
},
"scoringMethod": "Stableford"
},
"holes": [
{
"holeNo": {
"$numberInt": "1"
},
"holePar": {
"$numberInt": "4"
},
"holeSI": {
"$numberInt": "3"
},
"holeGross": {
"$numberInt": "4"
},
"holeStrokes": {
"$numberInt": "1"
},
"holeNett": {
"$numberInt": "3"
},
"holeGrossPoints": {
"$numberInt": "2"
},
"holeNettPoints": {
"$numberInt": "3"
}
}
]
}
In the Atlas web UI, it shows as (note there are 9 holes in this particular round of golf - limited to 3 for brevity):
I would like to find the players who have a holeGross of 2, or less, somewhere in their round of golf (i.e. a birdie on par 3 or better).
Being new to MongoDB, and NoSQL constructs, I am stuck with this. Reading around the aggregation pipeline framework, I have tried to break down the stages I will need as:
Filter by the comp.id and comp.roundNo
Filter this result with any hole within the holes array of Objects
Maybe I have approached this wrong, and should filter or structure this pipeline differently?
So far, using the Atlas web UI, I can apply these filters individually as:
{
"comp.id": ObjectId("607019361c071256e4f0d0d5"),
"comp.roundNo": 2
}
And:
{ "holes.0.holeGross": 2 }
But I have 2 problems:
The second filter query, I have hard-coded the array index to get this value. I would need to search across all the sub-elements of every document that matches this comp.id && comp.roundNo
How do I combine these? I presuming this is where the aggregation comes in, as well as enumerating across the whole array (as above).
I note in particular it is the extra ".0." part of the second query that I am not seeing from various other online postings trying to do the same thing. Is my data structure incorrect? Do I need the [0]...[17] Objects for an 18-hole round of golf?
I would like to find the players who have a holeGross of 2, or less, somewhere in their round of golf
if that is the goal, a simple $lte search inside the holes array like the following would do:
db.collection.find({ "holes.holeGross": { $lte: 2 } })
you simply have to not specify an array index such as 0 in the property path in order to search each element of the array.
https://mongoplayground.net/p/KhZLnj9mJe5

Filter & sort array of objects then output only one property of each object

I have JSON that dynamically comes to me in the format:
{
"world": 583,
"peace": [
{
"id": "Happy",
"valid": true,
"version": "9"
},
{
"id": "Old",
"valid": false,
"version": "2020"
},
{
"id": "New",
"valid": true,
"version": "2021"
},
{
"id": "Year",
"valid": true,
"version": "5"
}
]
}
I'm brand new to jq, and I've read the tutorial and the manual, and several questions here on Stack Overflow, including:
How to sort a json file by keys and values of those keys in jq
I want to use jq to output all id's that are valid ("valid"=true) and have the output sorted by id.
So in this example, I would like the output to be:
Happy
New
Year
So far, I have:
jq '..|.peace[]|select(.valid)|=sort_by(.id)'
But that issues a Cannot index string with string "id" error.
How can I make this work?
Thank you, and Happy New Year!
Keep the .peace array intact for filtering & sorting then break away from it:
jq --raw-output '.peace | map(select(.valid).id) | sort[]' <f.json>
^ ^ ^ ^
A B C D
Where:
A: Keep array intact
B: map + select = filter & keep id only
C: sort result (which is a list of strings (i.e. ids))
D: "spread" each item out of the array
--raw-output will output as "text" and not as string, e.g. Happy instead of "Happy"
jq play fiddle available here.
Acknowledgement: this answer has been vastly improved by #peak suggestion

How to use jq to selectively update JSON array, referring to parent properties

I'm trying to use jq to selectively change a single property of an object nested in an array, keeping the rest of the array intact. The generated property value needs to refer up to a property of the parent object.
As an example, take this array:
[
{
"name": "keep_me"
},
{
"name": "leave_unchanged",
"list": [
{
"id": "1",
"key": "keep_this"
}
]
},
{
"name": "a",
"list": [
{
"id": "2",
"key": "also_keep_this"
},
{
"id": "42",
"key": "replace_this"
}
]
}
]
I want to change the value of the last key (replace_this), using the name property of the parent object to generate a value like generated_value_for_a_42.
The key problem here seems to be leaving the rest of the array unmodified, while updating specific elements. But the need to refer 'up the tree' to a parent property complicates matters.
I tried wrapping changes in parentheses to keep untouched elements, but then had trouble with variable binding (using as) to the right scope, for accessing the parent property. So I either ended up discarding parts of the array or objects, or getting errors about the variable binding.
This answer helped me in the right direction. The important learnings were to use |= in the right place, and have the filters wrapped in if p then f else . to keep the unchanged elements.
The following jq script solves the task:
map(.name as $name |
if has("list")
then .list |= map(if .key | contains("keep") | not
then .key = "generated_value_for_" + $name + "_" + .id
else .
end)
else .
end)

Query nested arrays in ArangoDB

I'm looking for a way to query nested arrays in ArangoDB.
The JSON structure I have is:
{
"uid": "bykwwla4prqi",
"category": "party",
"notBefore": "2016-04-19T08:43:35.388+01:00",
"notAfter": "9999-12-31T23:59:59.999+01:00",
"version": 1.0,
"aspects": [
"participant"
],
"description": [
{ "value": "User Homer Simpson, main actor in 'The Simpsons'", "lang": "en"}
],
"properties": [
{
"property": [
"urn:project:domain:attribute:surname"
],
"values": [
"Simpson"
]
},
{
"property": [
"urn:project:domain:attribute:givennames"
],
"values": [
"Homer",
"Jay"
]
}
]
}
I tried to use a query like the following to find all parties having a given name 'Jay':
FOR r IN resource
FILTER "urn:project:domain:attribute:givennames" IN r.properties[*].targets[*]
AND "Jay" IN r.properties[*].values[*]
RETURN r
but unfortunately it does not work - it returns an empty array. If I use a '1' instead of '*' for the properties array it works. But the array of the properties has no fixed structure.
Does anybody have an idea how to solve this?
Thanks a lot!
You can inspect what the filter does using a simple trick: you RETURN the actual filter condition:
db._query(`FOR r IN resource RETURN r.properties[*].property[*]`).toArray()
[
[
[
"urn:project:domain:attribute:surname"
],
[
"urn:project:domain:attribute:givennames"
]
]
]
which makes it pretty clear whats going on. The IN operator can only work on one dimensional arrays. You could work around this by using FLATTEN() to remove the sub layers:
db._query(`FOR r IN resource RETURN FLATTEN(r.properties[*].property[*])`).toArray()
[
[
"urn:project:domain:attribute:surname",
"urn:project:domain:attribute:givennames"
]
]
However, while your documents are valid json (I guess its converted from xml?) you should alter the structure as one would do it in json:
"properties" : {
"urn:project:domain:attribute:surname":[
"Simpson"
],
"urn:project:domain:attribute:givennames": [
"Homer",
"Jay"
]
}
Since the FILTER combination you specify would also find any other Jay (not only those found in givennames) and the usage of FLATTEN() will prohibit using indices in your filter statement. You don't want to use queries that can't use indices on reasonably sized collections for performance reasons.
In Contrast you can use an array index on givennames with the above document layout:
db.resource.ensureIndex({type: "hash",
fields:
["properties.urn:project:domain:attribute:givennames[*]"]
})
Now doublecheck the explain for the query:
db._explain("FOR r IN resource FILTER 'Jay' IN " +
"r.properties.`urn:project:domain:attribute:givennames` RETURN r")
...
6 IndexNode 1 - FOR r IN resource /* hash index scan */
...
Indexes used:
By Type Collection Unique Sparse Selectivity Fields Ranges
6 hash resource false false 100.00 % \
[ `properties.urn:project:domain:attribute:givennames[*]` ] \
("Jay" in r.`properties`.`urn:project:domain:attribute:givennames`)
that its using the index.

Resources