PostgreSQL JSON path capabilities - arrays

In the documentation, some of the PostgreSQL JSON functions take a JSON path argument.
For example, the jsonb_set function:
jsonb_set(target jsonb, path text[], new_value jsonb[, create_missing boolean])
I can't find a specification for this type of argument.
Can it be used, for example, to retrieve an array element based on its attribute's value?

The path is akin to a path on a filesystem: each value drills further down the leaves of the tree (in the order you specified). Once you get a particular JSONB value from extracting it via a path, you can chain other JSONB operations if needed. Using functions/operators with JSONB paths is mostly useful for when there are nested JSONB objects, but can also handle simple JSONB arrays too.
For example:
SELECT '{"a": 42, "b": {"c": [1, 2, 3]}}'::JSONB #> '{b, c}' -> 1;
...should return 2.
The path {b, c} first gets b's value, which is {"c": [1, 2, 3]}.
Next, it drills down to get c's value, which is [1, 2, 3].
Then the -> operation is chained onto that, which gets the value at the specified index of that array (using base-zero notation, so that 0 is the first element, 1 is the second, etc.). If you use -> then it will return a value with a JSONB data type, whereas ->> will return a value with a TEXT data type.
But you could have also written it like this:
SELECT '{"a": 42, "b": {"c": [1, 2, 3]}}'::JSONB #> '{b, c, 1}';
...and simply included both keys and array indexes in the same path.
For arrays, the following two should be equivalent, except the first uses a path, and the second expects an array and gets the value at the specified index:
SELECT '[1, 2, 3]'::JSONB #> '{1}';
SELECT '[1, 2, 3]'::JSONB -> 1;
Notice a path is a text array, so it is written in PostgreSQL array literal syntax, where each successive value is the next leaf in the tree you want to drill down to. You supply keys if it is a JSONB object, and indexes if it is a JSONB array. If these were file paths, the JSONB keys are like folders, and array indexes are like files.
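To see the path semantics outside the database, here is a small Python sketch (not PostgreSQL code) that mimics what #> does on a parsed JSON value; the function name jsonb_path is made up for illustration:

```python
import json

def jsonb_path(value, path):
    """Drill down a parsed JSON value one path step at a time,
    roughly mirroring PostgreSQL's #> operator: keys for objects,
    integer indexes for arrays. Returns None where #> yields NULL."""
    for step in path:
        if isinstance(value, dict):
            value = value.get(step)
        elif isinstance(value, list):
            try:
                value = value[int(step)]
            except (ValueError, IndexError):
                return None
        else:
            return None
        if value is None:
            return None
    return value

doc = json.loads('{"a": 42, "b": {"c": [1, 2, 3]}}')
print(jsonb_path(doc, ["b", "c"]))       # [1, 2, 3]
print(jsonb_path(doc, ["b", "c", "1"]))  # 2
```

The same drilling rule applies at every step, which is why keys and indexes can be freely mixed in one path.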

Related

How do I iterate over an array in a nested json object in sqlite?

Assume I have a sqlite table features which has a column data that contains json objects.
CREATE TABLE features ( id INTEGER PRIMARY KEY, data json )
Now, an example data object may be:
{
  "A": {
    "B": {
      "coordinates": [
        {"x": 1, "y": 10},
        {"x": 10, "y": 2},
        {"x": 12, "y": 12}
      ]
    }
  }
}
Now the number of json objects in the coordinates array can vary from row to row. Some documents can have 3 coordinates (example above) while others may have 5 or more coordinates.
For each row or document, I want to be able to iterate over just the x values and find the minimum, same for y values. So the results for the example would be 1 for x and 2 for y.
I can get all the json objects inside the array using json_each but I can't extract x for just one single row. I tried:
select value from features, json_each(json_extract(features.data, '$.A.B.coordinates'));
However, this seems to return "all" coordinate json objects for all rows. How do I go about iterating the array for one document and extract values from it so I can then select a minimum or maximum for one document?
Use json_extract() again after json_each(json_extract()) to extract each x and y and aggregate:
SELECT f.id,
       MIN(json_extract(value, '$.x')) x,
       MIN(json_extract(value, '$.y')) y
FROM features f, json_each(json_extract(f.data, '$.A.B.coordinates'))
GROUP BY f.id;
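If you want to try this without setting up a database file, here is a runnable check using Python's bundled sqlite3 module (assuming your SQLite build includes the JSON functions, which is the default in recent versions):

```python
import sqlite3

# In-memory database with one row matching the question's example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE features (id INTEGER PRIMARY KEY, data TEXT)")
conn.execute(
    "INSERT INTO features (id, data) VALUES (1, ?)",
    ('{"A": {"B": {"coordinates": ['
     '{"x": 1, "y": 10}, {"x": 10, "y": 2}, {"x": 12, "y": 12}]}}}',),
)

# json_each() expands the coordinates array into rows; json_extract()
# then pulls x and y out of each element before aggregating.
rows = conn.execute("""
    SELECT f.id,
           MIN(json_extract(value, '$.x')) AS x,
           MIN(json_extract(value, '$.y')) AS y
    FROM features f, json_each(json_extract(f.data, '$.A.B.coordinates'))
    GROUP BY f.id
""").fetchall()
print(rows)  # [(1, 1, 2)]
```

The GROUP BY f.id is what keeps the aggregation scoped to one document per output row.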

mongoose: using array indices similar to data fields in the $in operator

If my schema was something like this,
index_array: [Number],
data_array: [{val: Number}]
with data similar to,
index_array: [1, 2],
data_array: [{val: 1231}, {val: 1092}, {val: 902}]
is there any way I can form a query to extract the elements in data_array that match the indices specified in the index array?
In the above case I should get
data_array: [{val: 1092}, {val: 902}]
I understand I can do this after I retrieve the document, but is there any way to do this on the server itself?
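No server-side answer is shown here, but for reference, the client-side selection the question mentions amounts to plain index lookups. This Python sketch uses a cleaned-up version of the sample data (val: 902 in place of the va: 0902 typo):

```python
# Hypothetical document shape mirroring the question's schema.
doc = {
    "index_array": [1, 2],
    "data_array": [{"val": 1231}, {"val": 1092}, {"val": 902}],
}

# Pick the data_array entries at the positions listed in index_array.
selected = [doc["data_array"][i] for i in doc["index_array"]]
print(selected)  # [{'val': 1092}, {'val': 902}]
```

Doing the same on the MongoDB server would require the aggregation framework rather than a plain find() query.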

Data Structure which allows efficent searching of objects

I have a very large database of objects (read: an array of key/value pairs, like [{}, {}, {}] in standard notation), and I need to be able to search for any value of any key within that set of pairs and find the object which contains it (I'll be using fuzzy searching or similar string-comparison algorithms). One approach I can think of would be to create an enormous master object with a key referencing the original object for each value inside the object:
DB = [
  {
    "a": 45,
    "b": "Hello World"
  },
  {
    "a": 32,
    "b": "Testing..."
  }
]
// ... Generation Code ... //
search = {
  45: {the 0th object},
  "Hello World": {the 0th object},
  32: {the 1st object},
  "Testing...": {the 1st object}
}
This solution at least reduces the problem to a large number of comparisons, but are there better approaches? Please note that I have very little formal Computer Science training, so I may be missing some major detail that would simplify this problem or prove it impossible.
P.S. Is this too broad? If so, I'll gladly delete it
Your combined index is more suitable for a full-text search, but doesn't indicate in which property of an object the value is found. An alternative technique that provides more context is to build an index per property.
This should be faster both in preparation and during lookup for property-specific searches (e.g. a == 32), since for n objects and p properties, a binary search (used in both inserts and lookups) would require log(np) comparisons on a combined index but only log(n) on a single-property index.
In either case, you need to watch out for multiple occurrences of the same value. You can store an array of offsets as the value of each index entry, rather than just a single value.
For example:
search = {
  "a": {
    45: [0],
    32: [1]
  },
  "b": {
    "Hello World": [0],
    "Testing...": [1]
  }
}
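A Python sketch of the per-property index described above, with a duplicate object added to the sample data to show why each entry holds a list of offsets rather than a single one; the variable names mirror the example:

```python
from collections import defaultdict

DB = [
    {"a": 45, "b": "Hello World"},
    {"a": 32, "b": "Testing..."},
    {"a": 45, "b": "Hello World"},  # duplicate values on purpose
]

# One index per property; each entry maps a value to the list of
# offsets in DB where that value occurs.
search = defaultdict(lambda: defaultdict(list))
for offset, obj in enumerate(DB):
    for prop, value in obj.items():
        search[prop][value].append(offset)

print(search["a"][45])            # [0, 2]
print(search["b"]["Testing..."])  # [1]
```

A lookup like a == 32 now only touches the "a" index, which is the log(n) versus log(np) point made above.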

How to get JavaScript/JSON array of array in Postgresql?

JSON object format is verbose:
'[{"id":1,"name":"John"}, {"id":2,"name":"Jack"}]'
Sometimes, repeating field names take more space than the actual data. To save bandwidth, and speed up page loading, I would like to generate a JavaScript array of arrays in string format instead and send it to the client. For example for this data:
create table temp (
id int,
name text
);
insert into temp values (1, 'John'), (2, 'Jack');
I would like to get '[[1, "John"], [2, "Jack"]]'. How can I do that?
I do not want to aggregate columns by typing them out, since that would be hard to maintain. I also know postgresql does not allow multiple types in an array like JavaScript, so one possibility is to use composite types, but then, stringified/aggregated result ends up having '()' in them.
select array_to_json(array_agg(json_build_array(id, name)))
from temp;
array_to_json
---------------------------
[[1, "John"],[2, "Jack"]]
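As a quick sanity check of the bandwidth claim, this Python snippet (independent of PostgreSQL) compares the two encodings of the same rows:

```python
import json

rows = [(1, "John"), (2, "Jack")]

# Object-per-row encoding repeats every field name...
as_objects = json.dumps([{"id": i, "name": n} for i, n in rows])
# ...while the array-of-arrays encoding carries only the values.
as_arrays = json.dumps([[i, n] for i, n in rows])

print(as_objects)  # [{"id": 1, "name": "John"}, {"id": 2, "name": "Jack"}]
print(as_arrays)   # [[1, "John"], [2, "Jack"]]
print(len(as_objects), len(as_arrays))
```

The saving grows with the row count, since the field names are repeated once per row in the object form.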

Convert JSON with different types of objects (strings, array) to Map

I am trying to convert a JSON object to a map object in Hive using the brickhouse UDF `brickhouse.udf.json.FromJsonUDF`.
The problem is that my JSON object contains different types of values: strings and one array of arrays.
My json looks like this:
'{"key1":"value1","key2":"value2","key3":"value3","key4":[["0","1","nnn"],["1","3","mmm"],["1","3","ggg"],["1","5","kkk"],["4","5","ppp"]]}'
I can correctly read the array-of-arrays element (key4) using the following:
select from_json('{"key1":"value1","key2":"value2","key3":"value3","key4":[["0","1","nnn"],["1","3","mmm"],["1","3","ggg"],["1","5","kkk"],["4","5","ppp"]]}', 'map<string,array<array<string>>>') from my_table limit 1;
Which gives me:
{"key1":[],"key3":[],"key2":[],"key4":[["0","1","nnn"],["1","3","mmm"],["1","3","ggg"],["1","5","kkk"],["4","5","ppp"]]}
As you can see, all the elements but key4 are empty.
Or I can read every element except key4 using:
select from_json('{"key1":"value1","key2":"value2","key3":"value3","key4":[["0","1","nnn"],["1","3","mmm"],["1","3","ggg"],["1","5","kkk"],["4","5","ppp"]]}', 'map<string,string>') from my_table limit 1;
Which gives me:
{"key1":"value1","key3":"value3","key2":"value2","key4":null}
But how can I convert all the elements correctly to key-value pairs on the resulting map object?
EDITED:
My actual data is an array of two components which are json objects:
[{"key1":"value1", "key2":"value2"}, {"key3":"value3","key4":"value4","key5":"value5","key6":[["0","1","nnn"],["1","3","mmm"],["1","3","ggg"],["1","5","kkk"],["4","5","ppp"]]}]
Is it possible to create a struct object which contains the two json objects as two map objects so that I can access the first or second struct element and then select the value of the corresponding map object using a key?
For example: assuming my desired end result is called struct_result, I would access value1 from the first component like:
struct_result.t1["key1"]
which would give me "value1".
Is it possible to achieve this with this lib?
This can be done using named_structs. You need to create a named_struct, and specify the types for each of the keys independently.
For example
select from_json('{"key1":"value1","key2":"value2","key3":"value3","key4":[["0","1","nnn"],["1","3","mmm"],["1","3","ggg"],["1","5","kkk"],["4","5","ppp"]]}',
       named_struct("key1", "", "key2", "", "key3", "",
                    "key4", array(array(""))))
from my_table;
This creates a template object using the 'named_struct' UDF, or you can use the equivalent string type definition.
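Outside Hive, you can see the shape each of the two type signatures in the question captures with a small Python sketch; the string_map and array_map names are just for illustration:

```python
import json

raw = ('{"key1":"value1","key2":"value2","key3":"value3",'
       '"key4":[["0","1","nnn"],["1","3","mmm"],["1","3","ggg"],'
       '["1","5","kkk"],["4","5","ppp"]]}')

parsed = json.loads(raw)

# Split into the two homogeneous maps that map<string,string> and
# map<string,array<array<string>>> would each capture in full.
string_map = {k: v for k, v in parsed.items() if isinstance(v, str)}
array_map = {k: v for k, v in parsed.items() if isinstance(v, list)}

print(string_map)            # the three string-valued keys
print(array_map["key4"][0])  # ['0', '1', 'nnn']
```

This mirrors why a single homogeneous map type cannot hold both kinds of values at once, and why a struct with per-key types is needed.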
