Store key-value pair in JSON array of Objects

I would like to store a key-value pair of metadata inside a JSON array that contains multiple JSON objects. The metadata would tell a generic parser what to do with the list of JSON objects while it processes the array. Below is a sample JSON document showing where I hope to place some sort of metadata field.
{
"Data": [
<< "metadata":"instructions" here >>
{
"foo": 1,
"bar": "barString"
},
{
"foo": 3,
"bar": "fooString"
}
]
}
What is the proper way to structure this mixed data JSON array?

I would add a Meta key as a peer of Data, as shown below. This separates your data from the metadata.
{
"Meta": {
"metadata":"instructions"
},
"Data": [
{
"foo": 1,
"bar": "barString"
},
{
"foo": 3,
"bar": "fooString"
}
]
}
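A minimal sketch of how a generic parser might read this layout, in Python. The function name `process_payload` is hypothetical, and what the parser does with the instruction string is left open; only the `Meta`, `metadata` and `Data` keys come from the example above.

```python
import json

payload = """
{
  "Meta": {"metadata": "instructions"},
  "Data": [
    {"foo": 1, "bar": "barString"},
    {"foo": 3, "bar": "fooString"}
  ]
}
"""

def process_payload(text):
    """Split a payload into its parser instructions and its records."""
    doc = json.loads(text)
    # Metadata sits next to the array, so every element of "Data" is a record.
    instructions = doc.get("Meta", {}).get("metadata")
    records = doc["Data"]
    return instructions, records

instructions, records = process_payload(payload)
print(instructions)
print(len(records))
```

Because the array stays homogeneous, the parser needs no special case for any index.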

If you can modify the structure of the data, why not add a metadata property holding your instructions (i.e. Data.metadata) and a second property, content (for want of a better word), holding the original array of objects (i.e. Data.content)?
That way it is still valid JSON, and other implementations can read the metadata field without much ado.
Edit: I just realized that you would also have to make Data an object rather than an array. Your JSON structure would then become this:
{
"Data": {
"metadata": "instructions here",
"content": [
{
"foo": 1,
"bar": "barString"
},
{
"foo": 3,
"bar": "fooString"
}
]
}
}
This will probably be the most stable, maintainable and portable solution.
For reference, something similar has been asked before.

After some additional discussion with another developer, we thought of one way to include the metadata instructions in the data JSON array.
{
"Data": [
{
"metadata": "Instructions"
},
{
"foo": 1,
"bar": "barString"
},
{
"foo": 3,
"bar": "fooString"
}
]
}
This approach does come with the limitation that index 0 of the data JSON array MUST contain a JSON Object containing the metadata and associated instructions for the generic parser. Failure to include this metadata object as index 0 would trigger an error case that the generic parser would need to handle. So it does have its trade-offs.
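The index-0 rule and its error case can be sketched in Python as follows. The helper name `split_metadata` and the choice of `ValueError` are assumptions made for illustration, not part of any existing parser.

```python
import json

def split_metadata(data):
    """Return (metadata, records) from an array whose first element is metadata."""
    # Index 0 must be an object carrying the "metadata" key; anything else is an error.
    if not data or not isinstance(data[0], dict) or "metadata" not in data[0]:
        raise ValueError("Data[0] must be an object with a 'metadata' key")
    return data[0]["metadata"], data[1:]

doc = json.loads("""
{
  "Data": [
    {"metadata": "Instructions"},
    {"foo": 1, "bar": "barString"},
    {"foo": 3, "bar": "fooString"}
  ]
}
""")

metadata, records = split_metadata(doc["Data"])
print(metadata)
print(records)
```

Any consumer that iterates over Data without knowing this convention will treat the metadata object as a record, which is the main cost of the approach.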

I will try to help you:
"metadata" : [
{
"foo": 1,
"bar": "barString"
},
{
"foo": 3,
"bar": "fooString"
}
]

Related

Encapsulate a JSON Array inside an object with JOLT?

I work on a project where the output of one of our APIs is a JSON array. I'd like to encapsulate this array inside an object.
I am trying to use a JOLT transformation to achieve this (this is my first time using the tool). I have already searched through many examples, but I still cannot figure out what my JOLT specification needs to be to perform the transformation.
For example, if my input is like this:
[
{
"id": 1,
"name": "foo"
},
{
"id": 2,
"name": "bar"
}
]
I'd like the output to be:
{
"list":
[
{
"id": 1,
"name": "foo"
},
{
"id": 2,
"name": "bar"
}
]
}
In short, I just want to put my array inside a field of another object.
You can use a shift transformation spec such as
[
{
"operation": "shift",
"spec": {
"*": "list[]"
}
}
]
where the "*" wildcard matches each index of the top-level array, so every element is appended to the "list" array.
You can try the spec in the demo at http://jolt-demo.appspot.com/
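For comparison, the same reshaping outside JOLT is a one-line wrap. This Python sketch is not JOLT and not part of the answer's spec; the helper name `wrap_in_object` is hypothetical and is only meant to show the input and output shapes the shift spec is expected to produce.

```python
import json

def wrap_in_object(items, key="list"):
    """Place an existing array under a single key of a new object."""
    return {key: items}

source = json.loads('[{"id": 1, "name": "foo"}, {"id": 2, "name": "bar"}]')
wrapped = wrap_in_object(source)
print(json.dumps(wrapped, indent=2))
```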

Selecting nested objects which satisfy a predicate

In Azure Search, is it possible to select only the objects in an array (a Collection(Edm.ComplexType) field) that satisfy a predicate?
Using the any operator specified at https://learn.microsoft.com/en-us/azure/search/search-query-understand-collection-filters#correlated-versus-uncorrelated-search returns the entire root object if any of the objects in the array satisfies the predicate.
For example, given the document below in Azure Search:
{
"arrayOfObjects": [
{
"id": 1,
"foo": "a"
},
{
"id": 2,
"foo": "b"
},
{
"id": 3,
"foo": "b"
}
]
}
Is it possible to select only the nested objects where foo equals “b”, so that the search response looks like this:
{
"arrayOfObjects": [
{
"id": 2,
"foo": "b"
},
{
"id": 3,
"foo": "b"
}
]
}
No, this is not possible. Queries in Azure Search operate at the granularity of documents, not objects within documents. A possible workaround would be to model your index such that the individual objects become top-level documents.
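A rough sketch of that workaround in Python: each nested object is turned into its own top-level document before indexing. The field names `key` and `parentKey`, the parent identifier `doc1`, and the helper `flatten_to_documents` are all assumptions made for illustration; the final list comprehension only imitates what a filter on `foo` would return against such an index and does not call the Azure Search API.

```python
def flatten_to_documents(parent, parent_key, array_field):
    """Turn each nested object into its own top-level document."""
    docs = []
    for item in parent[array_field]:
        doc = dict(item)
        # Keep a pointer back to the original parent document (hypothetical field).
        doc["parentKey"] = parent_key
        # Build a string key from the parent key and the child id (hypothetical scheme).
        doc["key"] = f"{parent_key}-{item['id']}"
        docs.append(doc)
    return docs

parent = {
    "arrayOfObjects": [
        {"id": 1, "foo": "a"},
        {"id": 2, "foo": "b"},
        {"id": 3, "foo": "b"},
    ]
}

documents = flatten_to_documents(parent, "doc1", "arrayOfObjects")
# With one document per object, a filter on foo matches objects individually.
matches = [d for d in documents if d["foo"] == "b"]
print([d["id"] for d in matches])
```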

Mongodb - Take only one element in nested array

I'm using MongoDB to store my data. My collection consists of a list of objects, each identified by a type and each holding a list of other objects.
An example of my collection is:
[
{
"type": "a",
"properties": [
{
"value": "value_a",
"date": "my_date_a"
},
{
"value": "value_b",
"date": "my_date_b"
},
...
]
},
...
]
Based on the above data structure, I want to retrieve all documents of a given type, taking only one element of the nested array for each of them (reducing the nested list to a list of only one element).
So, given a type "a", an example of the result may be:
[
{
"type": "a",
"properties": [
{
"value": "value_a",
"date": "my_date_a"
}
]
},
...
]
I started with the query { "type": "a" } to filter the documents. But how can I take only one "properties" element? I cannot use the "slice" operator.
Thanks a lot.
I'm assuming from your reference to slice that you're not interested in matching a particular nested element, but rather in getting the value at a fixed index (e.g. 0).
If you're willing to use the aggregation pipeline, you can use $arrayElemAt within a projection:
db.collection.aggregate([
// matches documents with type 'a'
{ $match: { type: 'a' } },
// creates a new document for each
{ $project: {
// that contains the original value for type
type: 1,
// and the first element from the original properties for properties
properties: { $arrayElemAt: [ "$properties", 0 ] }
} }
])
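To make the resulting shape concrete, here is a plain-Python imitation of the two stages run on in-memory dictionaries. It does not talk to MongoDB; the function name `first_property_only` and the sample values are assumptions. One point worth noticing is that `$arrayElemAt` yields the element itself, not a one-element array.

```python
def first_property_only(documents, wanted_type):
    """Imitate the $match and $project stages on a list of dicts."""
    results = []
    for doc in documents:
        if doc.get("type") != wanted_type:      # $match: { type: ... }
            continue
        # $project also keeps _id by default; it is omitted in this imitation.
        projected = {"type": doc["type"]}
        props = doc.get("properties") or []
        if props:
            # $arrayElemAt returns the element itself, not a list containing it.
            projected["properties"] = props[0]
        results.append(projected)
    return results

collection = [
    {"type": "a", "properties": [
        {"value": "value_a", "date": "my_date_a"},
        {"value": "value_b", "date": "my_date_b"},
    ]},
    {"type": "b", "properties": [{"value": "value_c", "date": "my_date_c"}]},
]

result = first_property_only(collection, "a")
print(result)
```

If the one-element list from the question is required, the projected value would need to be wrapped in an array again (in this imitation, `[props[0]]`).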

How to prevent Elasticsearch from flattening 2D arrays in "fields"-containing query

Nested arrays get flattened when they are returned through "fields". I expect values from the same path to be merged, but I do not expect the internal data structure to be modified.
Could someone explain whether I am doing something incorrectly, or whether this belongs as an Elasticsearch issue?
Steps to reproduce:
Create the 2D data
curl -XPOST localhost:9200/test/5 -d '{ "data": [ [100],[2,3],[6,7] ] }'
Query the data, specifying fields
curl -XGET localhost:9200/test/5/_search -d '{"query":{"query_string":{"query":"*"}}, "fields":["data"]}'
Result:
{"took":1,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":1,"max_score":1.0,"hits":[{"_index":"test","_type":"5","_id":"AVdsHJrepOClGTFyoGqo","_score":1.0,"fields":{"data":[100,2,3,6,7]}}]}}
Repeat without the use of "fields":
curl -XGET localhost:9200/test/5/_search -d '{"query":{"query_string":{"query":"*"}}}'
Result:
{"took":1,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":1,"max_score":1.0,"hits":[{"_index":"test","_type":"5","_id":"AVdsHJrepOClGTFyoGqo","_score":1.0,"_source":{ "data": [ [100],[2,3],[6,7] ] }}]}}
Notice that _source and fields differ, in that "fields" decomposes the 2D array into a 1D array.
When you specify nothing else in your request, what you get back for each hit is the "_source" object, that is, exactly the JSON you sent to ES during indexing (even including whitespace!).
When you use source filtering, as Andrey suggests, it's the same except you can include or exclude certain fields.
When you use the "fields" directive in your query, the return values are not taken from the _source, but read directly from the Lucene Index. (see docs) Now the key in your search response will switch from "_source" to "fields" to reflect this change.
As alkis said:
https://www.elastic.co/guide/en/elasticsearch/reference/current/array.html
These docs say up front that, yes, Elasticsearch does flatten arrays.
Instead of specifying "fields" I usually do source filtering
Your query would change to something like:
curl -XGET <IPADDRESS>:9200/test/5/_search -d '{"_source":{"include": ["data"]}, "query":{"query_string":{"query":"*"} }}'
From here https://www.elastic.co/guide/en/elasticsearch/reference/current/array.html
it seems that elasticsearch considers them the same.
In Elasticsearch, there is no dedicated array type. Any field can contain zero or more values by default, however, all values in the array must be of the same datatype. For instance:
an array of strings: [ "one", "two" ]
an array of integers: [ 1, 2 ]
an array of arrays: [ 1, [ 2, 3 ]] which is the equivalent of [ 1, 2, 3 ]
an array of objects: [ { "name": "Mary", "age": 12 }, { "name": "John", "age": 10 }]
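The equivalence described in that passage can be illustrated with a small recursive flatten in Python. This only shows the effect on the values and makes no claim about how Elasticsearch implements it internally; the function name `flatten` is an assumption.

```python
def flatten(values):
    """Recursively flatten nested lists into a single flat list."""
    flat = []
    for value in values:
        if isinstance(value, list):
            flat.extend(flatten(value))
        else:
            flat.append(value)
    return flat

print(flatten([1, [2, 3]]))               # [1, 2, 3]
print(flatten([[100], [2, 3], [6, 7]]))   # [100, 2, 3, 6, 7]
```

The second call reproduces the difference between the "_source" value and the "fields" value seen in the question.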
You could use an array of JSON objects with the nested data type and a nested query.
The nested data type may be helpful here:
PUT /my_index
PUT /my_index/_mapping/my_type
{
"properties" : {
"data" : {
"type" : "nested",
"properties": {
"value" : {"type": "long" }
}
}
}
}
POST /my_index/my_type
{
"data": [
{ "value": [1, 2] },
{ "value": [3, 4] }
]
}
POST /my_index/my_type
{
"data": [
{ "value": [1, 5] }
]
}
GET /my_index/my_type/_search
{
"query": {
"nested": {
"path": "data",
"query": {
"bool": {
"must": [
{
"match": {
"data.value": 1
}
},
{
"match": {
"data.value": 2
}
}
]
}
}
}
}
}

Mapping an array inside an array with a JSON Reader

My JSON looks as follows:
{
"records": [
{
"_id": "5106f97bdcb713b818d7f1f1",
"cn": "lsacco",
"favorites": [
{
"fullName": "Friend One",
"uid": "friend1"
},
{
"fullName": "Friend Two",
"uid": "friend2"
}
]
}
]
}
When I try to use records.favorites as the root for my JSON reader, no results are populated into my model. Is there a way to do this without resorting to an association? Note that in my case, records will only ever have one element, even though it is an array.
records.favorites isn't valid because that property doesn't exist.
You want:
records[0].favorites
records is an array, so records.favorites points to nothing in the JSON data.
Using an index into records should solve the problem.
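The same distinction can be checked outside the JSON reader. This short Python sketch loads a trimmed copy of the data from the question and shows that `favorites` is only reachable through an index into `records`; the variable names are illustrative.

```python
import json

doc = json.loads("""
{
  "records": [
    {
      "_id": "5106f97bdcb713b818d7f1f1",
      "cn": "lsacco",
      "favorites": [
        {"fullName": "Friend One", "uid": "friend1"},
        {"fullName": "Friend Two", "uid": "friend2"}
      ]
    }
  ]
}
""")

records = doc["records"]
# records is a list, so it has no "favorites" property of its own.
print(isinstance(records, list))
# The favorites live on the first (and only) element.
favorites = records[0]["favorites"]
print([f["uid"] for f in favorites])
```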
