JSON Schema: array with exactly n elements of given sub-schemas

I'm trying to figure out how to write a JSON Schema for an array that must contain exactly 2 elements, where each element conforms to its own subschema. I have no idea how to do it, since none of anyOf, allOf, or oneOf seems to fit here.
Let's say that I have subschemas ss1 and ss2 that define elements of type t1 and t2, respectively. How can I write a schema that accepts arrays containing one element of type t1 (conforming to ss1) and another element of type t2 (conforming to ss2)?

The items keyword has a special form just for this. Instead of a single schema, its value can be an array of schemas. When items is used this way, each element of the array must conform to the schema at the same position in the items array of schemas. (In draft 2020-12 this positional form was renamed prefixItems.) Here is a working example, assuming t1 = string and t2 = integer.
{
  "type": "array",
  "items": [
    { "$ref": "#/definitions/ss1" },
    { "$ref": "#/definitions/ss2" }
  ],
  "minItems": 2,
  "maxItems": 2,
  "definitions": {
    "ss1": { "type": "string" },
    "ss2": { "type": "integer" }
  }
}
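To make the positional rule concrete, here is a minimal Python sketch of what the schema above enforces. It is a hand-rolled check, not a JSON Schema validator: the names is_string, is_integer, POSITIONAL_CHECKS and matches_tuple are made up for illustration, and the integer test is simplified.

```python
def is_string(value):
    return isinstance(value, str)


def is_integer(value):
    # bool is a subclass of int in Python, but JSON true/false are not integers.
    return isinstance(value, int) and not isinstance(value, bool)


# One checker per position, mirroring the array of schemas under "items".
POSITIONAL_CHECKS = [is_string, is_integer]


def matches_tuple(array):
    """Return True if array has exactly two elements: a string, then an integer."""
    if not isinstance(array, list):
        return False
    # Equivalent of minItems == maxItems == 2.
    if len(array) != len(POSITIONAL_CHECKS):
        return False
    return all(check(item) for check, item in zip(POSITIONAL_CHECKS, array))


print(matches_tuple(["a", 1]))     # True
print(matches_tuple([1, "a"]))     # False: order matters
print(matches_tuple(["a", 1, 2]))  # False: too many elements
```

Note that the check is order-sensitive, just like the schema: the string must come first and the integer second.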

Related

How do I set the allowed values without repeating myself?

I want my JSON schema to accept a list whose values come from a fixed set and can appear in any order.
I.e., ["GOV", "CRD", "CON"] is acceptable, but so is ["CRD", "GOV", "COM"].
My current thinking is something along these lines:
"sources": {
  "type": "array",
  "uniqueItems": true,
  "enum": ["CRD", "GOV", "COM", "CON", "OTH", "UTL", "PRO", "TEL", "POS", "INS", "CCJ", "POP", "VOT", "MVR", "PPS", "DRV", "PMC"]
},
But I'm not entirely sure that's going to do what I want. I've read up on items, where you can define the values in the list, but it looks like that would fix the order and also the number of items in the list, although both can be worked around using oneOf combined with definitions.
E.g. (shortened to save space; please feel free to correct this if I'm wrong):
{
"definitions": {
"source":{"enum": ["CRD", "GOV", "COM", "CON", "OTH", "UTL", "PRO", "TEL", "POS", "INS", "CCJ", "POP", "VOT", "MVR", "PPS", "DRV", "PMC", ""]},
}
"sources":{"type": "array",
"uniqueItems": true,
"items": {
"source": {"$ref": "#/definitions/source"},
"source": {"$ref": "#/definitions/source"},
"source": {"$ref": "#/definitions/source"},
.
.
.
}
}
}
My question is: Is there a nicer way to do this?
You don't have to specify every possible order. When items is a single schema, it applies to every element of the array, so restricting it with enum lets the allowed values appear in any order. Note that enum belongs inside items, next to the type of the enumerated values, not on the array itself.
"sources": {
  "type": "array",
  "uniqueItems": true,
  "items": {
    "type": "string",
    "enum": ["CRD", "GOV", "COM"]
  }
}
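As a rough illustration of what this schema accepts, the following Python sketch checks the same three conditions by hand: the value is an array, every element is one of the allowed strings, and no element repeats. ALLOWED_SOURCES and valid_sources are hypothetical names, and the set is shortened to three values as in the answer.

```python
ALLOWED_SOURCES = {"CRD", "GOV", "COM"}


def valid_sources(sources):
    """Return True if sources is a list of unique, allowed strings in any order."""
    if not isinstance(sources, list):
        return False
    # Equivalent of "items": {"type": "string", "enum": [...]}.
    if not all(isinstance(s, str) and s in ALLOWED_SOURCES for s in sources):
        return False
    # Equivalent of "uniqueItems": true.
    return len(set(sources)) == len(sources)


print(valid_sources(["GOV", "CRD"]))  # True
print(valid_sources(["CRD", "GOV"]))  # True: order does not matter
print(valid_sources(["GOV", "GOV"]))  # False: duplicate
print(valid_sources(["GOV", "XYZ"]))  # False: value not in the set
```

Because the per-element rule is applied to each position independently, neither the order nor the number of elements is fixed.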

How to transform a JSON array nested inside an object inside another array in Postgres?

I'm using Postgres 9.6 and have a JSON field called credits with the following structure: a list of credits, each with a position and multiple people who can be in that position.
[
  {
    "position": "Set Designers",
    "people": [
      "Joe Blow",
      "Tom Thumb"
    ]
  }
]
I need to transform the nested people array, whose entries are currently just strings representing names, into objects that have name and image_url fields, like this:
[
  {
    "position": "Set Designers",
    "people": [
      { "name": "Joe Blow", "image_url": "" },
      { "name": "Tom Thumb", "image_url": "" }
    ]
  }
]
So far I've only been able to find decent examples of doing this on either the parent JSON array or on an array field nested inside a single JSON object.
So far this is all I've been able to manage and even it is mangling the result.
UPDATE campaigns
SET credits = (
SELECT jsonb_build_array(el)
FROM jsonb_array_elements(credits::jsonb) AS el
)::jsonb
;
Create an auxiliary function to simplify the rather complex operation:
create or replace function transform_my_array(arr jsonb)
returns jsonb language sql as $$
select case when coalesce(arr, '[]') = '[]' then '[]'
else jsonb_agg(jsonb_build_object('name', value, 'image_url', '')) end
from jsonb_array_elements(arr)
$$;
With the function the update is not so horrible:
update campaigns
set credits = (
select jsonb_agg(jsonb_set(el, '{people}', transform_my_array(el->'people')))
from jsonb_array_elements(credits::jsonb) as el
)::jsonb
;
Working example in rextester.
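For readers who find the nested jsonb calls hard to follow, the same two-level transformation can be sketched in plain Python. This is only an analogy for the SQL above, not a replacement for it; transform_people and transform_credits are hypothetical names.

```python
import json


def transform_people(people):
    # Plays the role of transform_my_array: a missing or empty list becomes [].
    return [{"name": name, "image_url": ""} for name in (people or [])]


def transform_credits(credits):
    # Plays the role of jsonb_agg(jsonb_set(el, '{people}', ...)):
    # copy each credit and replace only its "people" value.
    return [
        {**credit, "people": transform_people(credit.get("people"))}
        for credit in credits
    ]


credits = [{"position": "Set Designers", "people": ["Joe Blow", "Tom Thumb"]}]
print(json.dumps(transform_credits(credits), indent=2))
```

The inner function rewrites one people array, and the outer one maps it over every element of the top-level array, which is the same split the SQL makes between the helper function and the update.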

Construct unique arrays from nested array values with common parents

Likely a close question to JQ: Nested JSON transformation but I wasn't able to get my head around it.
Sample JSON:
"value": [
{
"FeatureStatus": [
{
"FeatureName": "Sway1",
"FeatureServiceStatus": "ServiceOperational"
},
{
"FeatureName": "Sway2",
"FeatureServiceStatus": "ServiceDegraded"
}
],
"Id": "SwayEnterprise"
},
{
"FeatureStatus": [
{
"FeatureName": "yammerfeatures",
"FeatureServiceStatus": "ServiceOperational"
}
],
"Id": "yammer"
}
]
What I want to do is create an output with jq which results in the following;
{"Sway":"Sway1":"ServiceOperational"},
{"Sway":"Sway2":"ServiceDegraded"},
{"Yammer":"yammerfeatures":"ServiceOperational"}
My various attempts either end up with thousands of non-unique results (i.e., Yammer with Sway status), or only one Id with x number of FeatureServiceStatus.
Any pointers would be greatly appreciated. I've gone through the tutorial and the cookbook. I am perhaps 2.5 days into using jq.
Assuming that the enclosing braces have been added to make the input valid JSON, the filter:
.value[]
| [.Id] + (.FeatureStatus[] | [ .FeatureName, .FeatureServiceStatus ])
produces:
["SwayEnterprise","Sway1","ServiceOperational"]
["SwayEnterprise","Sway2","ServiceDegraded"]
["yammer","yammerfeatures","ServiceOperational"]
You can then easily reformat this as desired.
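The pairing that the filter performs (each Id combined only with its own features) can be sketched in Python as a nested loop. The function names are made up, and the nested object shape produced by rows_to_objects is just one assumed reading of the desired output, since the format shown in the question is not valid JSON.

```python
document = {
    "value": [
        {
            "Id": "SwayEnterprise",
            "FeatureStatus": [
                {"FeatureName": "Sway1", "FeatureServiceStatus": "ServiceOperational"},
                {"FeatureName": "Sway2", "FeatureServiceStatus": "ServiceDegraded"},
            ],
        },
        {
            "Id": "yammer",
            "FeatureStatus": [
                {"FeatureName": "yammerfeatures", "FeatureServiceStatus": "ServiceOperational"},
            ],
        },
    ]
}


def feature_rows(doc):
    # The inner loop runs over each service's own FeatureStatus only,
    # so a feature is never paired with another service's Id.
    return [
        [service["Id"], feature["FeatureName"], feature["FeatureServiceStatus"]]
        for service in doc["value"]
        for feature in service["FeatureStatus"]
    ]


def rows_to_objects(rows):
    # Assumed target shape: {Id: {FeatureName: FeatureServiceStatus}}.
    return [{service_id: {name: status}} for service_id, name, status in rows]


for row in feature_rows(document):
    print(row)
```

Iterating the features inside the loop over services is what avoids the cross product described in the question.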

Query nested arrays in ArangoDB

I'm looking for a way to query nested arrays in ArangoDB.
The JSON structure I have is:
{
"uid": "bykwwla4prqi",
"category": "party",
"notBefore": "2016-04-19T08:43:35.388+01:00",
"notAfter": "9999-12-31T23:59:59.999+01:00",
"version": 1.0,
"aspects": [
"participant"
],
"description": [
{ "value": "User Homer Simpson, main actor in 'The Simpsons'", "lang": "en"}
],
"properties": [
{
"property": [
"urn:project:domain:attribute:surname"
],
"values": [
"Simpson"
]
},
{
"property": [
"urn:project:domain:attribute:givennames"
],
"values": [
"Homer",
"Jay"
]
}
]
}
I tried to use a query like the following to find all parties having a given name 'Jay':
FOR r IN resource
FILTER "urn:project:domain:attribute:givennames" IN r.properties[*].targets[*]
AND "Jay" IN r.properties[*].values[*]
RETURN r
but unfortunately it does not work - it returns an empty array. If I use a '1' instead of '*' for the properties array it works. But the array of the properties has no fixed structure.
Does anybody have an idea how to solve this?
Thanks a lot!
You can inspect what the filter does using a simple trick: you RETURN the actual filter condition:
db._query(`FOR r IN resource RETURN r.properties[*].property[*]`).toArray()
[
[
[
"urn:project:domain:attribute:surname"
],
[
"urn:project:domain:attribute:givennames"
]
]
]
which makes it pretty clear what's going on. The IN operator can only work on one-dimensional arrays. You could work around this by using FLATTEN() to remove the sub-layers:
db._query(`FOR r IN resource RETURN FLATTEN(r.properties[*].property[*])`).toArray()
[
[
"urn:project:domain:attribute:surname",
"urn:project:domain:attribute:givennames"
]
]
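The difference between the nested and the flattened result can be reproduced in a few lines of Python. This is only an analogy for the AQL behaviour shown above: expand and flatten are hypothetical helpers standing in for the [*] expansion and for FLATTEN() removing one level of nesting.

```python
resource = {
    "properties": [
        {
            "property": ["urn:project:domain:attribute:surname"],
            "values": ["Simpson"],
        },
        {
            "property": ["urn:project:domain:attribute:givennames"],
            "values": ["Homer", "Jay"],
        },
    ]
}


def expand(doc, field):
    # Rough analog of doc.properties[*].<field>[*]: one sub-list per entry.
    return [list(entry[field]) for entry in doc["properties"]]


def flatten(nested):
    # Rough analog of FLATTEN(): remove one level of nesting.
    return [value for sub in nested for value in sub]


nested_values = expand(resource, "values")
print(nested_values)                    # [['Simpson'], ['Homer', 'Jay']]
print("Jay" in nested_values)           # False: the elements are lists, not strings
print("Jay" in flatten(nested_values))  # True
```

Membership is tested against the top-level elements only, which is why the lookup fails until the inner lists are merged.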
However, while your documents are valid JSON (I guess they were converted from XML?), you should restructure them the way one normally would in JSON:
"properties" : {
"urn:project:domain:attribute:surname":[
"Simpson"
],
"urn:project:domain:attribute:givennames": [
"Homer",
"Jay"
]
}
The reason is that the FILTER combination you specify would also match any other "Jay" (not only those found in givennames), and using FLATTEN() prevents the filter from using indices. For performance reasons, you don't want queries that can't use indices on reasonably sized collections.
In contrast, you can use an array index on givennames with the above document layout:
db.resource.ensureIndex({type: "hash",
fields:
["properties.urn:project:domain:attribute:givennames[*]"]
})
Now double-check the explain output for the query:
db._explain("FOR r IN resource FILTER 'Jay' IN " +
"r.properties.`urn:project:domain:attribute:givennames` RETURN r")
...
6 IndexNode 1 - FOR r IN resource /* hash index scan */
...
Indexes used:
By Type Collection Unique Sparse Selectivity Fields Ranges
6 hash resource false false 100.00 % \
[ `properties.urn:project:domain:attribute:givennames[*]` ] \
("Jay" in r.`properties`.`urn:project:domain:attribute:givennames`)
This confirms that it's using the index.

jq: select only an array which contains element A but not element B

My data is a series of JSON arrays. Each array has one or more elements with name and id keys:
[
{
"name": "first_source",
"id": "abcdef"
},
{
"name": "second_source",
"id": "ghijkl"
},
{
"name": "third_source",
"id": "opqrst"
}
]
How, using jq, do I select only the arrays which contain an element with "first_source" as the name value, but which don't contain "second_source" as the name value of any element?
This only returns an element for further processing:
jq '.[] | select (.name == "first_source")'
But I clearly need to return the entire array for my scenario to work.
You can use this filter:
select(
(map(.name == "first_source") | any) and
(map(.name != "second_source") | all)
)
You need to test all the elements of the array for the presence of the names. You can do that by mapping each object to your condition and then applying the any or all filter as appropriate.
Here, you want to see if any item is named "first_source" and all items are not named "second_source".
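The same any/all logic reads almost identically in Python, which may help when checking the condition against sample data. This is a sketch of the selection rule only; keep_array and select_arrays are hypothetical names, and the input is assumed to be already parsed into a list of arrays.

```python
def keep_array(items):
    """True if some element is named first_source and none is named second_source."""
    names = [item.get("name") for item in items]
    return any(name == "first_source" for name in names) and all(
        name != "second_source" for name in names
    )


def select_arrays(arrays):
    # Return whole arrays, not individual elements.
    return [items for items in arrays if keep_array(items)]


arrays = [
    [{"name": "first_source", "id": "abcdef"}, {"name": "second_source", "id": "ghijkl"}],
    [{"name": "first_source", "id": "abcdef"}, {"name": "third_source", "id": "opqrst"}],
    [{"name": "third_source", "id": "opqrst"}],
]

print(select_arrays(arrays))
```

Only the second array is kept: the first contains second_source and the third has no first_source.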
