JQ: Reduce array of objects to object, adding to array

I've got a more complex JQ expression that's handling an array of objects.
The input looks like this:
[
{ "key": "1", "value": "value 1"},
{ "key": "2", "value": "value 2"},
{ "key": "1", "value": "value 3"}
]
What I want to get is this:
{
"1": { "values": ["value 1", "value 3"] },
"2": { "values": ["value 2"] }
}
or, for my use case:
{
"1": [ "value 1", "value 3" ],
"2": [ "value 2" ]
}
would also be OK.
I've already tried to use … | { (.key): [.value] } but the result is (to no real surprise to me) that later occurrences of keys simply overwrite already existing ones. What I want to accomplish is something like "create a new key/value pair or add .value to an already existing one's 'values' array".

The drawback of a solution relying on group_by is that group_by requires a sort, which is unnecessary here. In this response, I'll show how to avoid any sorting by using a generic (and generally useful) jq function that "melds" an array of JSON objects, essentially by wrapping the value at each key in an array, and then concatenating corresponding arrays.
# input should be an array of objects
def meld:
reduce .[] as $o
({}; reduce ($o|keys)[] as $key (.; .[$key] += [$o[$key]] ));
Let's also define some data:
def data:
[
{ "key": "1", "value": "value 1"},
{ "key": "2", "value": "value 2"},
{ "key": "1", "value": "value 3"}
]
;
Then the filter:
data | map([.] | from_entries) | meld
produces:
{"1":["value 1","value 3"],"2":["value 2"]}
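For readers who want to check the idea outside jq, the same sort-free accumulation can be sketched in Python. The function name `meld_pairs` and the variable `data` are illustrative assumptions, not part of the jq answer; the sketch relies on Python dicts preserving insertion order.

```python
def meld_pairs(pairs):
    """Group values by key in a single pass, without sorting."""
    result = {}
    for item in pairs:
        # Create the list the first time a key is seen, then append to it.
        result.setdefault(item["key"], []).append(item["value"])
    return result


data = [
    {"key": "1", "value": "value 1"},
    {"key": "2", "value": "value 2"},
    {"key": "1", "value": "value 3"},
]

print(meld_pairs(data))
```

Like the reduce in meld, this visits each object once and keeps keys in order of first appearance.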

OK, after finally working out what I wanted, I also understand that my previous filters didn't keep the input array but instead output the objects one after another. That was basically the reason why none of the examples I found would work.
I wanted to group by key (hence the key/value requirement), which group_by already does, but it wouldn't work on objects that were no longer inside an array.
Once grouping works, it's only a small step to my solution (unique keys, values in arrays).
'… group_by(.key) | map({ "key": .[0].key, "values": map(.value) | unique })'
The output now looks like this, which is perfectly fine for my requirements:
[
{
"key": "1",
"values": [
"value 1",
"value 3"
]
},
{
"key": "2",
"values": [
"value 2"
]
}
]
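As a rough cross-check of what the group_by filter does, here is a minimal Python sketch of the same steps (sort, group adjacent items, deduplicate values). The name `group_values` is hypothetical, and `sorted(set(...))` is used as a stand-in for jq's `unique`, which also sorts its output.

```python
from itertools import groupby


def group_values(items):
    """Mimic group_by(.key) followed by a unique list of values per group."""
    # groupby only merges adjacent items, so sort by key first
    # (jq's group_by sorts as well).
    ordered = sorted(items, key=lambda item: item["key"])
    return [
        {"key": key, "values": sorted({item["value"] for item in group})}
        for key, group in groupby(ordered, key=lambda item: item["key"])
    ]


data = [
    {"key": "1", "value": "value 1"},
    {"key": "2", "value": "value 2"},
    {"key": "1", "value": "value 3"},
]

print(group_values(data))
```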

Related

Solr - Search parent having multiple child documents

I have a Solr index with structure as below.
{
"id": "1",
"bookName_s": "Core Concepts",
"bookDescription_s": "This is the description",
"isbn_s": "ABCD:123",
"reviews": [
{
"id": "2",
"Name_s": "review1",
"detail_s": "sample review"
}
],
"students": [
{
"id": "3",
"Name_s": "student1",
"student_s": "test student"
}
]
}
How do I search for a parent that has a review with Name_s 'review1' and a student with Name_s 'student'?
I tried block join parent queries like the ones below, but nothing seems to work:
q=({!parent which="*:* -_nest_path_:*"}(+_nest_path_:\/reviews +Name_s:*rev*)) AND ({!parent which="*:* -_nest_path_:*"}(+_nest_path_:\/students +Name_s:*stu*))
q=({!parent which="*:* -_nest_path_:*"}(+_nest_path_:\/reviews +Name_s:*rev*)(+_nest_path_:\/students +Name_s:*stu*))
Is there a way I can achieve this using the q parameter instead of the fq parameter? Thanks.
Based on EricLavault's suggestion, I modified the index to include the type of each object, like below:
{
"id": "1",
"bookName_s": "Core Concepts",
"bookDescription_s": "This is the description",
"isbn_s": "ABCD:123",
"type_s":"book",
"reviews": [
{
"id": "2",
"Name_s": "review1",
"detail_s": "sample review",
"type":"review"
}
],
"students": [
{
"id": "3",
"Name_s": "student1",
"type":"student",
"student_s": "test student"
}
]
}
and the query below worked.
{!parent which="type:book"}(type:review AND Name_s:review1) OR (type:student AND Name_s:student1)
It returns all books with review1 and student1.

Use jq to extract some values from an array to top level, leaving the array intact

I have data in this format:
{
"searchResult": [
{
"key": "common1",
"value": "A string"
},
{
"key": "common2",
"value": "2149944"
},
{
"key": "varying1",
"value": "604516"
},
{
"key": "varying73",
"value": "58.92"
}
]
}
Within searchResult are some constantly present fields (timestamp, identifiers etc). The other keys are constantly changing and can be named anything. I need them transformed to the format below, with the predefined constant keys pulled out to the top level and the variable keys staying in the searchResult array.
{
"common1": "A string",
"common2": "2149944",
"searchResult": [
{
"key": "varying1",
"value": "604516"
},
{
"key": "varying73",
"value": "58.92"
}
]
}
Seeing as jq is already being used in the process, how can I do this transformation in jq please?
I have tried extracting the values using .name, but haven't managed to bring them to this top level.
Many thanks
Ben
You could use IN/1 as follows:
(.searchResult | (from_entries | {common1, common2})) + { searchResult }
| .searchResult |= map(select(.key | IN("common1", "common2") | not))
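To make the logic of that filter easier to follow, here is a minimal Python sketch of the same transformation. The function name `hoist_common` and its `common_keys` parameter are assumptions for illustration. One difference worth noting: the jq filter emits null for a common key that is missing from the input, whereas this sketch simply omits it.

```python
def hoist_common(doc, common_keys):
    """Move selected key/value entries to the top level; keep the rest in the array."""
    entries = doc["searchResult"]
    # Entries whose key is in common_keys become top-level fields.
    hoisted = {e["key"]: e["value"] for e in entries if e["key"] in common_keys}
    # Everything else stays in searchResult, in its original order.
    remaining = [e for e in entries if e["key"] not in common_keys]
    return {**hoisted, "searchResult": remaining}


doc = {
    "searchResult": [
        {"key": "common1", "value": "A string"},
        {"key": "common2", "value": "2149944"},
        {"key": "varying1", "value": "604516"},
        {"key": "varying73", "value": "58.92"},
    ]
}

print(hoist_common(doc, {"common1", "common2"}))
```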

Use jq nested array values [duplicate]

This question already has an answer here:
Select multiple fields at different levels
(1 answer)
Closed 2 years ago.
I have a file, let's call it heroes.json, where part of the data is a nested array of objects, superpowers:
[
{
"hero": "Superman",
"id": "123",
"realName": "Clark Kent",
"age": "?",
"superpowers": [
{
"name": "speed",
"num": "1",
"des": "Faster than a speeding bullet.",
"value": "50"
},
{
"name": "strength",
"num": "2",
"des": "More powerful than a locomotive.",
"value": "100"
}
],
"weakness": "kryptonite"
},
{
"hero": "Batman",
"id": "456",
...
I want to select hero and superpowers, and keep only name and des keys within superpowers, like:
[
{
"hero": "Superman",
"superpowers": [
{
"name": "speed",
"des": "Faster than a speeding bullet."
},
{
"name": "strength",
"des": "More powerful than a locomotive."
}
]
},
{
"hero": "Batman",
"superpowers": [
...
It wouldn't be hard to write an iterator to do this, but I want to try jq as I'm new to this tool and it seems useful to learn.
So I experimented on jqplay until it delivered the needed format. I don't know if it's optimal, but this worked:
jq '[.[] | {hero, superpowers: [ .superpowers[] | {name, des} ] } ]'
(A graphQL-like filter syntax would make this easier.)
Note: The output required for my json differs from what is mentioned in this question and answer, and I have avoided using map (iterator) in the solution. In other words, I'm not asking the same question, or presenting the same answer.
It would be helpful to know if my solution is optimal.
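For comparison outside jq, the same projection can be sketched in Python as nested comprehensions, which mirrors the shape of the filter above (an outer collection over heroes, an inner one over superpowers). The name `project_heroes` is hypothetical, and the sample holds only the Superman record, since the rest of the file is elided in the question.

```python
def project_heroes(heroes):
    """Keep hero and superpowers; within superpowers keep only name and des."""
    return [
        {
            "hero": hero["hero"],
            "superpowers": [
                {"name": power["name"], "des": power["des"]}
                for power in hero["superpowers"]
            ],
        }
        for hero in heroes
    ]


heroes = [
    {
        "hero": "Superman",
        "id": "123",
        "realName": "Clark Kent",
        "age": "?",
        "superpowers": [
            {"name": "speed", "num": "1", "des": "Faster than a speeding bullet.", "value": "50"},
            {"name": "strength", "num": "2", "des": "More powerful than a locomotive.", "value": "100"},
        ],
        "weakness": "kryptonite",
    }
]

print(project_heroes(heroes))
```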

Flatten a hierarchical JSON array using JQ

Can anyone help me get the correct jq command to flatten the below example? I've seen a few other posts and I'm hacking away at it but can't seem to get it. I'd greatly appreciate any help.
Input:
[
{
"name": "level1",
"children": [
{
"name": "level2",
"children": [
{
"name": "level3-1",
"children": []
},
{
"name": "level3-2",
"children": []
}
]
}
]
}
]
Output:
[
{
"displayName": "level1",
"parent": ""
},
{
"displayName": "level2",
"parent": "level1"
},
{
"displayName": "level3-1",
"parent": "level2"
},
{
"displayName": "level3-2",
"parent": "level2"
}
]
Here's a straightforward solution that does not involve a helper function and actually solves a more general problem. It is based on the idea of beginning by adding a "parent" key to each child, and then using .. to collect all the name/parent pairs.
So first consider:
[ walk(if type=="object" and has("children")
then .name as $n | .children |= map(.parent = $n)
else . end)
| ..
| select(type=="object" and has("name"))
| {displayName: .name, parent}
]
This meets the requirements except that for the top-level (parentless) object, it produces a .parent value of null. That would generally be more JSON-esque than "", but if the empty string is really required, one has simply to replace the last non-trivial line above by:
| {displayName: .name, parent: (.parent // "")}
Alternatively, with a simple recursive function:
def f: .name as $parent | .children[] | {$parent, displayName: .name}, f;
[ {name: "", children: .} | f ]
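The recursion in f can also be sketched in Python, which may help when reasoning about the output order. The generator name `flatten` and its `parent` parameter are illustrative assumptions; like f, it emits a record for each node before descending into that node's children.

```python
def flatten(nodes, parent=""):
    """Yield one {displayName, parent} record per node, depth-first."""
    for node in nodes:
        yield {"displayName": node["name"], "parent": parent}
        # Recurse with the current node's name as the parent of its children.
        yield from flatten(node.get("children", []), node["name"])


tree = [
    {
        "name": "level1",
        "children": [
            {
                "name": "level2",
                "children": [
                    {"name": "level3-1", "children": []},
                    {"name": "level3-2", "children": []},
                ],
            }
        ],
    }
]

print(list(flatten(tree)))
```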

JSON schema different for first row and different for remaining rows

My problem statement is:
Consider a list of 15 rows. All rows should have 5 keys, except the 0th row, which will have only 4 keys.
I want to validate my response against this. Do the first and other keywords really exist?
I found this in "Correct JSON Schema for an array of items of different type".
Example schema
{
"type": "array",
"items": {
"oneOf": [
{
"first": [{
"type": "object",
"required": ["state"],
"properties":{
"state":{
"type":"string"
}
}
}]
},
{
"other": [{
"type": "object",
"required": ["state", "zip"],
"properties":{
"state":{
"type":"string"
},
"zip":{
"type":"string"
}
}
}]
}
]
}
}
First things first: what do you want to achieve with the following schema definition?
"first" : [ { ...schema... } ]
As to your problem statement, I am not sure which of these you want to achieve:
A schema that allows the first array item to be an object with 4 keys, while all other items should have 5 keys?
A schema that allows only array items that are objects with 5 keys, and rejects a JSON document that has 4 keys in its first item?
Could you please rephrase your question to make it clearer? I built a solution based on assumptions, but it would be good if you could confirm my understanding.
Required reading
Please read first through:
http://json-schema.org/latest/json-schema-validation.html#rfc.section.6.4.1
If "items" is an array of schemas, validation succeeds if each element
of the instance validates against the schema at the same position, if
any.
plus https://stackoverflow.com/a/52758108/2811843 on above topic
https://json-schema.org/understanding-json-schema/reference/array.html#length
https://json-schema.org/understanding-json-schema/reference/array.html#tuple-validation
and https://json-schema.org/understanding-json-schema/reference/array.html in general
as well as
https://json-schema.org/understanding-json-schema/reference/object.html#property-names
https://json-schema.org/understanding-json-schema/reference/object.html#size
and https://json-schema.org/understanding-json-schema/reference/object.html in general.
Possible solution
After looking at the sample schema, I will rephrase the problem statement, making some wild assumptions: you want a schema that allows an array of items, where each item is an object. The first item could have 4 keys, while all other items must have 5 keys.
I need a JSON schema that will describe an array of objects, where
first object always has 4 keys/properties, while all remaining objects
do have 5 keys/properties.
Additionally, there is always at least the first item in the array (containing 4 keys), and there can be up to X other
objects (containing 5 keys) in the array.
Go for tuple typing and an array of objects. That way you can check that the first item (an object) has at least 4 properties (add "maxProperties": 4 if it must have exactly 4) and define a separate schema for the rest of them.
First, the full working schema (with comments inside). The "examples" section contains example arrays to illustrate the logic; only the last 3 are valid against the schema.
{
"$schema": "http://json-schema.org/draft-07/schema#",
"type": "array",
"$comment" : "This is an array, where first item must be an object with at least 4 properties and one property named \"state\" and can contain minimum 1 and maximum of 3 items",
"minItems" : 1,
"maxItems" : 3,
"items": [
{
"type": "object",
"minProperties" : 4,
"required" : ["state"]
}
],
"additionalItems" : {
"$comment" : "Any additional item in this array must be an object with at least 5 keys and two of them must be \"state\" and \"zip\".",
"type" : "object",
"minProperties" : 5,
"required" : ["state", "zip"]
},
"examples" : [
[
{
"key1" : "1",
"key2" : "2",
"key3" : "3",
"state" : "some state"
},
{},
{}
],
[
{
"key1" : "1",
"key2" : "2",
"key3" : "3",
"state" : "some state"
},
{
"key1" : "1",
"key2" : "2",
"key3" : "3",
"state" : "some state",
"zip" : "12345"
},
{
"key1" : "1",
"key2" : "2",
"key3" : "3",
"state" : "some state"
}
],
[
{
"key1" : "1",
"key2" : "2",
"key3" : "3",
"state" : "some state"
},
{
"key1" : "1",
"key2" : "2",
"key3" : "3",
"state" : "some state",
"zip" : "12345"
},
{
"key1" : "1",
"key2" : "2",
"key3" : "3",
"state" : "some state",
"zip" : "54321"
},
{
"key1" : "1",
"key2" : "2",
"key3" : "3",
"state" : "some state",
"zip" : "54321"
}
],
[],
[
{
"key1" : "1",
"key2" : "2",
"key3" : "3",
"state" : "some state"
},
{
"key1" : "1",
"key2" : "2",
"key3" : "3",
"state" : "some state",
"zip" : "12345"
},
{
"key1" : "1",
"key2" : "2",
"key3" : "3",
"state" : "some state",
"zip" : "54321"
}
],
[
{
"key1" : "1",
"key2" : "2",
"key3" : "3",
"state" : "some state"
}
],
[
{
"key1" : "1",
"key2" : "2",
"key3" : "3",
"state" : "some state"
},
{
"key1" : "1",
"key2" : "2",
"key3" : "3",
"state" : "some state",
"zip" : "12345"
}
]
]
}
So, step by step:
"type": "array",
"minItems" : 1,
"maxItems" : 3,
A JSON document that is an array with a minimum of 1 item and a maximum of 3 items will be OK. If you don't define a "minItems" value, an empty array would pass validation against the schema.
"items": [
{
"type": "object",
"minProperties" : 4,
"required" : ["state"]
}
],
This is the tuple magic: a finite, ordered list of elements (a sequence). Yep, maths has its say. By using "items" : [ ... ] instead of { ... }, you fall under the section of the JSON Schema Validation spec quoted above (http://json-schema.org/latest/json-schema-validation.html#rfc.section.6.4.1 ).
The above basically says: this is an array where the first item must be an object with at least 4 keys, and one of those keys must be "state".
OK, last but not least:
"additionalItems" : {
"$comment" : "Any additional item in this array must be an object with at least 5 keys and two of them must be \"state\" and \"zip\".",
"type" : "object",
"minProperties" : 5,
"required" : ["state", "zip"]
}
By this I said:
In this array (whose first item must be an object with at least 4 keys, one of which is "state", and, by the way, the array must have at least 1 item and at most 3 items) you can have additional items on top of the ones already defined in the "items" section. Each such additional item must be an object with at least 5 keys, two of which must be "state" and "zip".
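To make the combined effect of these keywords concrete, here is a small hand-rolled Python check that mirrors them one by one. It is only a sketch of this particular schema's logic, not a general JSON Schema validator; the function name `validate` and the sample objects `first` and `other` are assumptions made for illustration.

```python
def validate(instance):
    """Mirror the schema above: tuple-typed first item, stricter additional items."""
    # "type": "array", "minItems": 1, "maxItems": 3
    if not isinstance(instance, list) or not 1 <= len(instance) <= 3:
        return False
    for index, item in enumerate(instance):
        # "type": "object"
        if not isinstance(item, dict):
            return False
        if index == 0:
            # Position 0 is covered by "items": [ {...} ]
            min_props, required = 4, {"state"}
        else:
            # Every later position falls under "additionalItems"
            min_props, required = 5, {"state", "zip"}
        if len(item) < min_props or not required.issubset(item):
            return False
    return True


first = {"key1": "1", "key2": "2", "key3": "3", "state": "some state"}
other = {**first, "zip": "12345"}

print(validate([first, other]))
print(validate([first, first]))
```

The first call passes because the second object carries "zip" and 5 keys; the second call fails because a 4-key object without "zip" appears in an "additionalItems" position.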
Does it solve your issue?

Resources