This is a little more sophisticated than the question mentioned below. I have learned to use arrays more, but they complicate things too.
Input:
{
  "a": [
    {
      "b": "c",
      "d": "e"
    },
    {
      "b": "f",
      "d": "g"
    }
  ],
  "h": [
    {
      "b": "c",
      "i": "j"
    },
    {
      "b": "f",
      "i": "k"
    }
  ]
}
Desired output:
{
  "l": [
    {
      "b": "c",
      "d": "e",
      "i": "j"
    },
    {
      "b": "f",
      "d": "g",
      "i": "k"
    }
  ]
}
Things I've tried, based on "JQ How to merge multiple objects into one":
{ x: [ inputs | .a[] | { (.h[]): .i } ] | add}
The key to a simple solution is transpose:
[.a, .h]
| transpose
| map(add)
| {l: .}
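Here transpose pairs the two arrays element by element, turning [.a, .h] into [[{"b":"c","d":"e"}, {"b":"c","i":"j"}], [{"b":"f","d":"g"}, {"b":"f","i":"k"}]], and map(add) then merges each pair of objects. On the command line (assuming the input is saved as input.json):

$ jq '[.a, .h] | transpose | map(add) | {l: .}' input.json

Note that this pairs objects strictly by position; if the two arrays could list their entries in different orders, you would have to join on the shared "b" key instead.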
I have 3 documents:
{
  "id": 1,
  "user": "Brian1",
  "configs": ["a", "b", "c", "d"]
}
----
{
  "id": 2,
  "user": "max_en",
  "configs": ["a", "h", "i", "j"]
}
----
{
  "id": 3,
  "user": "userX",
  "configs": ["t", "u", "s", "b"]
}
I want to merge all the "configs" arrays into one array without duplicates, like this:
{
  "configs": ["a", "b", "c", "d", "h", "i", "j", "t", "u", "s"]
}
I've tried the following:
Aggregation.group("").addToSet("configs").as("configs") and { _id: "", 'configs': { $addToSet: '$configs' } }
The first one gives an error because I've left the fieldname empty (I don't know what to put there).
The second one returns a merged array but with duplicates.
When you want to group all the documents together, you need to use {_id: null}; it puts every document into a single group. You probably need this:
db.collection.aggregate([
  {
    $unwind: "$configs"
  },
  {
    $group: {
      _id: null,
      configs: {
        $addToSet: "$configs"
      }
    }
  }
])
But be cautious about using this on a larger collection without a $match stage, since it unwinds and regroups every document. Note also that $addToSet makes no guarantee about the order of the resulting array.
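For example, assuming you only want the configs of certain users (the filter here is just for illustration), a $match stage can go in front so that only matching documents are unwound:

db.collection.aggregate([
  { $match: { user: { $in: ["Brian1", "max_en"] } } },
  { $unwind: "$configs" },
  { $group: { _id: null, configs: { $addToSet: "$configs" } } }
])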
How can I select certain values and/or values conditional on other columns in vega-lite? For example, below I only want to show values that have "red" in the column "c".
{
  "data": {
    "values": [
      {"a": "A", "b": 2, "c": "red"},
      {"a": "A", "b": 7, "c": "red"},
      {"a": "A", "b": 4, "c": "blue"},
      {"a": "B", "b": 1, "c": "blue"},
      {"a": "B", "b": 2, "c": "red"}
    ]
  },
  "mark": "bar",
  "encoding": {
    "x": {"field": "a", "type": "nominal"},
    "y": {"aggregate": "average", "field": "b", "type": "quantitative"}
  }
}
I have tried adding the following code before "mark": "bar", according to the Vega GitHub tutorial, and using the online Vega editor, but it doesn't filter column b. I thought I could use it to somehow filter on a string as well.
"transform": {
  "filter": "datum.b > 3"
},
Follow-up question regarding multiple filtering criteria.
You can do this with a filter transform. Note that "transform" takes an array of transform objects; a bare object like the one you tried is not valid and gets ignored:
{
  "data": {
    "values": [
      {"a": "A", "b": 2, "c": "red"},
      {"a": "A", "b": 7, "c": "red"},
      {"a": "A", "b": 4, "c": "blue"},
      {"a": "B", "b": 1, "c": "blue"},
      {"a": "B", "b": 2, "c": "red"}
    ]
  },
  "transform": [{"filter": "datum.c == 'red'"}],
  "mark": "bar",
  "encoding": {
    "x": {"field": "a", "type": "nominal"},
    "y": {"aggregate": "average", "field": "b", "type": "quantitative"}
  }
}
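As for the follow-up about multiple filtering criteria: the filter string is a Vega expression, so conditions can be combined with && and ||, for example:

"transform": [{"filter": "datum.c == 'red' && datum.b > 1"}]

Listing several filter transforms in the array also works and behaves like an AND.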
I have the following JSON file.
[
  {
    "name": "first",
    "Arrays": {
      "dddd0001": ["A", "A", "B", "B", "C", "C", "C", "C", "D", "E", "F"]
    }
  },
  {
    "name": "second",
    "Arrays": {
      "dddd0002": ["AA", "AA", "BA", "BB", "CC", "CC", "CC", "CC", "DD", "DD", "FF"]
    }
  },
  {
    "name": "third",
    "Arrays": {
      "dddd0003": ["1", "1", "2", "3", "3", "4", "4", "4", "0", "0", "0"]
    }
  }
]
I need to remove the duplicates inside every array in the JSON file, so the result should look like the following:
[
  {
    "name": "first",
    "Arrays": {
      "dddd0001": ["A", "B", "C", "D", "E", "F"]
    }
  },
  {
    "name": "second",
    "Arrays": {
      "dddd0002": ["AA", "BA", "BB", "CC", "DD", "FF"]
    }
  },
  {
    "name": "third",
    "Arrays": {
      "dddd0003": ["1", "2", "3", "4", "0"]
    }
  }
]
Array key names are not known in advance. There might be multiple arrays inside the Arrays object.
I tried to use unique_by but it requires the key name.
This algorithm (search for every array inside the Arrays object, apply the unique function to each such array, and re-assign the result back to the array) should be fairly easy to implement, but I am stuck.
Thanks.
walk( if type == "array" then unique else . end)
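walk is a jq builtin since version 1.5; it applies the given filter to every component of the input recursively, so every array anywhere in the document gets deduplicated in one pass:

$ jq 'walk(if type == "array" then unique else . end)' file.json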
If the original order should be respected, then you can easily use "def uniques" as defined at How do I get jq to return unique results when json has multiple identical entries?
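For reference, an order-preserving variant could look like this (a sketch, not necessarily the exact definition from that answer):

# keep only the first occurrence of each element; index($x) is safe here
# because the array elements are plain strings
def uniques: reduce .[] as $x ([]; if index($x) then . else . + [$x] end);
walk(if type == "array" then uniques else . end)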
You can use unique and the update operator |=:
$ jq '.[].Arrays[] |= unique' file.json
[
  {
    "name": "first",
    "Arrays": {
      "dddd0001": ["A", "B", "C", "D", "E", "F"]
    }
  },
  {
    "name": "second",
    "Arrays": {
      "dddd0002": ["AA", "BA", "BB", "CC", "DD", "FF"]
    }
  },
  {
    "name": "third",
    "Arrays": {
      "dddd0003": ["0", "1", "2", "3", "4"]
    }
  }
]
The only "problem" is that unique sorts the elements of the array, so for example the contents of the "dddd0003" array are not in the same order as in your expected result. I don't know whether that could be a problem for you.
If the "Arrays" object can also contain non-array values, extra care must be taken to filter those out so that unique doesn't fail.
select(type == "array") can be used (output omitted):
$ jq '(.[].Arrays[] | select(type == "array")) |= unique' file.json
...
or arrays:
$ jq '(.[].Arrays[] | arrays) |= unique' file.json
...
These last two solutions better reflect your algorithm.
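For completeness, the same deduplication can also be done in plain JavaScript by iterating over every key of each "Arrays" object and keeping only the first occurrence of each value: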
var jsonArr = [
  {
    "name": "first",
    "Arrays": {
      "dddd0001": ["A", "A", "B", "B", "C", "C", "C", "C", "D", "E", "F"]
    }
  },
  {
    "name": "second",
    "Arrays": {
      "dddd0002": ["AA", "AA", "BA", "BB", "CC", "CC", "CC", "CC", "DD", "DD", "FF"]
    }
  },
  {
    "name": "third",
    "Arrays": {
      "dddd0003": ["1", "1", "2", "3", "3", "4", "4", "4", "0", "0", "0"]
    }
  }
];
for (var i = 0; i < jsonArr.length; i++) {
  var arrays = jsonArr[i].Arrays;
  // deduplicate every array under "Arrays", whatever its key is called
  for (var key of Object.keys(arrays)) {
    var arr = arrays[key];
    // keep each value only at the index of its first occurrence
    arrays[key] = arr.filter((v, p) => arr.indexOf(v) == p);
  }
  console.log(jsonArr[i]);
}
Let's say I have three documents in a collection, like so:
[
  {"_id": "101", "parts": ["a", "b"]},
  {"_id": "102", "parts": ["a", "c"]},
  {"_id": "103", "parts": ["a", "z"]}
]
What query do I have to write so that, given the input ["a","b","c"], it returns exactly the documents whose parts values are all contained in ["a","b","c"]? That is, the output should be:
[
  {"_id": "101", "parts": ["a", "b"]},
  {"_id": "102", "parts": ["a", "c"]}
]
Is this even possible? Any ideas?
The solution below may not be the best, but it works. The idea is to find all documents that have no item in parts outside the input array, which can be done with a combination of $not, $elemMatch and $nin:
db.collection.find({
  parts: {
    $not: {
      $elemMatch: {
        $nin: ["a", "b", "c"]
      }
    }
  }
})
Mongo Playground
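Reading the query inside out: $nin matches an array element that is not in the input list, $elemMatch requires at least one such element, and $not inverts that, so only documents with no element outside the list are returned.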
Thanks to @prasad_. I have come up with a solution similar to what I wanted, using $setDifference:
db.collection.aggregate([
  {
    $project: {
      diff: {
        $setDifference: ["$parts", ["a", "b", "c"]]
      },
      document: "$$ROOT"
    }
  },
  {
    $match: {
      diff: { $eq: [] }
    }
  },
  {
    $project: { diff: 0 }
  }
])
Output:
[
  {
    "_id": "101",
    "document": {
      "_id": "101",
      "parts": ["a", "b"]
    }
  },
  {
    "_id": "102",
    "document": {
      "_id": "102",
      "parts": ["a", "c"]
    }
  }
]
Mongo Playground
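A shorter variant of the same set logic (a sketch; $expr with aggregation operators such as $setIsSubset requires MongoDB 3.6 or later) filters directly in find and avoids the extra projection:

db.collection.find({
  // true when every element of "parts" appears in the input list
  $expr: { $setIsSubset: ["$parts", ["a", "b", "c"]] }
})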
I'm using JOLT to transform data from:
[
  {
    "a": "a",
    "b": "b",
    "c": "c",
    ...
  },
  {
    "a": "a",
    "b": "b",
    "c": "c",
    ...
  }
]
To:
[
  {
    "a1": "a",
    "b1": "b",
    "c1": "c",
    ...
  },
  {
    "a1": "a",
    "b1": "b",
    "c1": "c",
    ...
  }
]
I'm trying to figure out a wildcard that would map all the attributes I don't need to change. Something like:
[{
  "operation": "shift",
  "spec": {
    "*": {
      "a": "[&1].a1",
      "b": "[&1].b1",
      "c": "[&1].c1",
      "*": {
        "#": "&"
      }
    }
  }
}]
Where:
"*": {
"#": "&"
}
Would work as a wildcard for all the fields I don't need to update.
Spec:
[{
  "operation": "shift",
  "spec": {
    "*": {
      "a": "[&1].a1",
      "b": "[&1].b1",
      "c": "[&1].c1",
      "*": "[&1].&"
    }
  }
}]
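Here "[&1]" places each value into the output array element at the index matched by the outer "*", and "&" on the right-hand side reuses the key matched by the inner "*", so every field without an explicit rule (anything other than a, b and c) is copied through under its original name.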