How do I set the allowed values without repeating myself? - arrays

I want my JSON schema to accept a list but the values in the list are from a set list and can apparear in any order!
I.e. ["GOV","CRD", "CON"] is acceptable, but so is ["CRD", "GOV", "COM"].
My current thinking is something along these lines:
"sources":{"type": "array",
"uniqueItems": true,
"emum": ["CRD", "GOV", "COM", "CON", "OTH", "UTL", "PRO", "TEL", "POS", "INS", "CCJ", "POP", "VOT", "MVR", "PPS", "DRV", "PMC"]},
But I'm not entirely sure that's going to do what I want. I've read up on items where you can define the values in the list, but it looks like that would set the order and also the number of items in the list. Although both can be worked around using oneOf combined with definitions.
E.g. (shortened for space saving reasons) Please feel free to correct this if I'm wrong:
{
"definitions": {
"source":{"emum": ["CRD", "GOV", "COM", "CON", "OTH", "UTL", "PRO", "TEL", "POS", "INS", "CCJ", "POP", "VOT", "MVR", "PPS", "DRV", "PMC", ""]},
}
"sources":{"type": "array",
"uniqueItems": true,
"items": {
"source": {"$ref": "#/definitions/source"},
"source": {"$ref": "#/definitions/source"},
"source": {"$ref": "#/definitions/source"},
.
.
.
}
}
}
My question is: Is there a nicer way to do this?

You don't have to specify every possible order. When the array is made limited by enum, the items can come in any order. However, you have to specify the type of the enumerated values.
"sources":{
"type": "array",
"uniqueItems": true,
"items": {
"type": "string",
"emum": ["CRD", "GOV", "COM"]
}

Related

MongoDB filter a sub-array of Objects

The document structure has a round collection, which has an array of holes Objects embedded within it, with each hole played/scored entered.
The structure looks like this (there are more fields, but this summarises):
{
"_id": {
"$oid": "60701a691c071256e4f0d0d6"
},
"schema": {
"$numberDecimal": "1.0"
},
"playerName": "T Woods",
"comp": {
"id": {
"$oid": "607019361c071256e4f0d0d5"
},
"name": "US Open",
"tees": "Pro Tees",
"roundNo": {
"$numberInt": "1"
},
"scoringMethod": "Stableford"
},
"holes": [
{
"holeNo": {
"$numberInt": "1"
},
"holePar": {
"$numberInt": "4"
},
"holeSI": {
"$numberInt": "3"
},
"holeGross": {
"$numberInt": "4"
},
"holeStrokes": {
"$numberInt": "1"
},
"holeNett": {
"$numberInt": "3"
},
"holeGrossPoints": {
"$numberInt": "2"
},
"holeNettPoints": {
"$numberInt": "3"
}
}
]
}
In the Atlas web UI, it shows as (note there are 9 holes in this particular round of golf - limited to 3 for brevity):
I would like to find the players who have a holeGross of 2, or less, somewhere in their round of golf (i.e. a birdie on par 3 or better).
Being new to MongoDB, and NoSQL constructs, I am stuck with this. Reading around the aggregation pipeline framework, I have tried to break down the stages I will need as:
Filter by the comp.id and comp.roundNo
Filter this result with any hole within the holes array of Objects
Maybe I have approached this wrong, and should filter or structure this pipeline differently?
So far, using the Atlas web UI, I can apply these filters individually as:
{
"comp.id": ObjectId("607019361c071256e4f0d0d5"),
"comp.roundNo": 2
}
And:
{ "holes.0.holeGross": 2 }
But I have 2 problems:
The second filter query, I have hard-coded the array index to get this value. I would need to search across all the sub-elements of every document that matches this comp.id && comp.roundNo
How do I combine these? I presuming this is where the aggregation comes in, as well as enumerating across the whole array (as above).
I note in particular it is the extra ".0." part of the second query that I am not seeing from various other online postings trying to do the same thing. Is my data structure incorrect? Do I need the [0]...[17] Objects for an 18-hole round of golf?
I would like to find the players who have a holeGross of 2, or less, somewhere in their round of golf
if that is the goal, a simple $lte search inside the holes array like the following would do:
db.collection.find({ "holes.holeGross": { $lte: 2 } })
you simply have to not specify an array index such as 0 in the property path in order to search each element of the array.
https://mongoplayground.net/p/KhZLnj9mJe5

Replace array values if they exist with jq

While I use jq a lot, I do so mostly for simpler tasks. This one has tied my brain into a knot.
I have some JSON output from unit tests which I need to modify. Specifically, I need to remove (or replace) an error value because the test framework generates output that is hundreds of lines long.
The JSON looks like this:
{
"numFailedTestSuites": 1,
"numFailedTests": 1,
"numPassedTestSuites": 1,
"numPassedTests": 1,
...
"testResults": [
{
"assertionResults": [
{
"failureMessages": [
"Error: error message here"
],
"status": "failed",
"title": "matches snapshot"
},
{
"failureMessages": [
"Error: another error message here",
"Error: yet another error message here"
],
"status": "failed",
"title": "matches another snapshot"
}
],
"endTime": 1617720396223,
"startTime": 1617720393320,
"status": "failed",
"summary": ""
},
{
"assertionResults": [
{
"failureMessages": [],
"status": "passed",
},
{
"failureMessages": [],
"status": "passed",
}
]
}
]
}
I want to replace each element in failureMessages with either a generic failed message or with a truncated version (let's say 100 characters) of itself.
The tricky part (for me) is that failureMessages is an array and can have 0-n values and I would need to modify all of them.
I know I can find non-empty arrays with select(.. | .failureMessages | length > 0) but that's as far as I got, because I don't need to actually select items, I need to replace them and get the full JSON back.
The simplest solution is:
.testResults[].assertionResults[].failureMessages[] |= .[0:100]
Check it online!
The online example keeps only the first 10 characters of the failure messages to show the effect on the sample JSON you posted in the question (it contains short error messages).
Read about array/string slice (.[a:b]) and update assignment (|=) in the JQ documentation.

Patch request in SCIM with Azure AD

How should I handle the following PATCH request, for a user that when initially added didn't have any address (not even an empty addresses array)?
{
"schemas": [
"urn:ietf:params:scim:api:messages:2.0:PatchOp"
],
"Operations": [
{
"op": "Add",
"path": "addresses[type eq \"work\"].formatted",
"value": "Columbus"
}
]
}
Should I "proactively" create an addresses array, with a single value as following (what seems a very bad solutions)?
{"type": "work", formatted: "Columbus"}
I would expect a patch request that looks like:
{
"schemas": [
"urn:ietf:params:scim:api:messages:2.0:PatchOp"
],
"Operations":[{
"op":"add",
"value":{
"addresses":[
{
"formatted":"Columbus",
"type":"work"
}
]
}]
}
If no array exists yet, then you should create the array and then add the value to the array. You can set it ahead of time to be an empty array, or can leave the value as null until such a point where a value needs to be added to the array, and then at that time create the array and then add the value to it. Kindly check this link

Query nested arrays in ArangoDB

I'm looking for a way to query nested arrays in ArangoDB.
The JSON structure I have is:
{
"uid": "bykwwla4prqi",
"category": "party",
"notBefore": "2016-04-19T08:43:35.388+01:00",
"notAfter": "9999-12-31T23:59:59.999+01:00",
"version": 1.0,
"aspects": [
"participant"
],
"description": [
{ "value": "User Homer Simpson, main actor in 'The Simpsons'", "lang": "en"}
],
"properties": [
{
"property": [
"urn:project:domain:attribute:surname"
],
"values": [
"Simpson"
]
},
{
"property": [
"urn:project:domain:attribute:givennames"
],
"values": [
"Homer",
"Jay"
]
}
]
}
I tried to use a query like the following to find all parties having a given name 'Jay':
FOR r IN resource
FILTER "urn:project:domain:attribute:givennames" IN r.properties[*].targets[*]
AND "Jay" IN r.properties[*].values[*]
RETURN r
but unfortunately it does not work - it returns an empty array. If I use a '1' instead of '*' for the properties array it works. But the array of the properties has no fixed structure.
Does anybody have an idea how to solve this?
Thanks a lot!
You can inspect what the filter does using a simple trick: you RETURN the actual filter condition:
db._query(`FOR r IN resource RETURN r.properties[*].property[*]`).toArray()
[
[
[
"urn:project:domain:attribute:surname"
],
[
"urn:project:domain:attribute:givennames"
]
]
]
which makes it pretty clear whats going on. The IN operator can only work on one dimensional arrays. You could work around this by using FLATTEN() to remove the sub layers:
db._query(`FOR r IN resource RETURN FLATTEN(r.properties[*].property[*])`).toArray()
[
[
"urn:project:domain:attribute:surname",
"urn:project:domain:attribute:givennames"
]
]
However, while your documents are valid json (I guess its converted from xml?) you should alter the structure as one would do it in json:
"properties" : {
"urn:project:domain:attribute:surname":[
"Simpson"
],
"urn:project:domain:attribute:givennames": [
"Homer",
"Jay"
]
}
Since the FILTER combination you specify would also find any other Jay (not only those found in givennames) and the usage of FLATTEN() will prohibit using indices in your filter statement. You don't want to use queries that can't use indices on reasonably sized collections for performance reasons.
In Contrast you can use an array index on givennames with the above document layout:
db.resource.ensureIndex({type: "hash",
fields:
["properties.urn:project:domain:attribute:givennames[*]"]
})
Now doublecheck the explain for the query:
db._explain("FOR r IN resource FILTER 'Jay' IN " +
"r.properties.`urn:project:domain:attribute:givennames` RETURN r")
...
6 IndexNode 1 - FOR r IN resource /* hash index scan */
...
Indexes used:
By Type Collection Unique Sparse Selectivity Fields Ranges
6 hash resource false false 100.00 % \
[ `properties.urn:project:domain:attribute:givennames[*]` ] \
("Jay" in r.`properties`.`urn:project:domain:attribute:givennames`)
that its using the index.

JSON Schema: array with exactly n elements of given sub-schemas

I'm trying to figure out how can I write a JSON Schema for an array which has to contain exactly 2 elements, where each of those elements conform to its own subschema. I've got no idea at all, since none of anyOf, allOf, oneOf don't suit well here.
Let's say that I've got ss1 and ss2 subschemas that define elements of type t1 and t2, respectively. How can I write a schema which will accept arrays that one element of type t1 (conforming to ss1) and another element of type t2 (conforming to ss2)?
The items keyword has a special format just for this. Instead of the value being a schema, it can be an array of schemas. When items is used this way, the items in the array must conform to the corresponding schema in the items array of schemas. Here is a working example assuming t1 = string and t2 = integer.
{
"type": "array",
"items": [
{ "$ref": "#/definitions/ss1" },
{ "$ref": "#/definitions/ss2" }
],
"minItems": 2,
"maxItems": 2,
"definitions": {
"ss1": { "type": "string" },
"ss2": { "type": "integer" }
}
}

Resources