Assume I have an SQLite table features, which has a column data that contains JSON objects.
CREATE TABLE features ( id INTEGER PRIMARY KEY, data json );
Now, an example data object may be:
{"A":
{"B":
{"coordinates":[
{"x":1, "y":10},
{"x":10, "y":2},
{"x":12, "y":12}
]
}
}
Now the number of json objects in the coordinates array can vary from row to row. Some documents can have 3 coordinates (example above) while others may have 5 or more coordinates.
For each row or document, I want to be able to iterate over just the x values and find the minimum, same for y values. So the results for the example would be 1 for x and 2 for y.
I can get all the JSON objects inside the array using json_each, but I can't extract x for just a single row. I tried:
select value from features, json_each(json_extract(features.data, '$.A.B.coordinates'));
However, this returns the coordinate objects for all rows. How do I iterate the array for a single document and extract values from it, so that I can then select a minimum or maximum for that one document?
Use json_extract() on each value returned by json_each(json_extract()) to pull out x and y, then aggregate per row:
SELECT f.id,
MIN(json_extract(value, '$.x')) x,
MIN(json_extract(value, '$.y')) y
FROM features f, json_each(json_extract(f.data, '$.A.B.coordinates'))
GROUP BY f.id
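As a quick sanity check, the query above can be run against an in-memory database with Python's built-in sqlite3 module. The schema and sample row mirror the question; this assumes your SQLite build includes the JSON1 functions, as most modern builds do:

```python
import sqlite3

# In-memory database mirroring the question's schema
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE features (id INTEGER PRIMARY KEY, data json)")
conn.execute(
    "INSERT INTO features (id, data) VALUES (1, ?)",
    ('{"A":{"B":{"coordinates":[{"x":1,"y":10},{"x":10,"y":2},{"x":12,"y":12}]}}}',),
)

# Per-row minimum of x and y across the coordinates array
rows = conn.execute("""
    SELECT f.id,
           MIN(json_extract(value, '$.x')) AS min_x,
           MIN(json_extract(value, '$.y')) AS min_y
    FROM features f, json_each(json_extract(f.data, '$.A.B.coordinates'))
    GROUP BY f.id
""").fetchall()
print(rows)  # [(1, 1, 2)]
```

The GROUP BY f.id is what confines each aggregate to a single document, so adding more rows with different coordinate counts still yields one (min_x, min_y) pair per row.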
If my schema was something like this,
index_array: [Number],
data_array: [{val: Number}]
with data similar to,
index_array: [1,2],
data_array: [{val: 1231},{val: 1092},{val: 902}]
is there any way I can form a query to extract the elements in data_array that match the indices specified in index_array?
In the above case I should get
data_array: [{val: 1092},{val: 902}]
I understand I can do this after I retrieve the document, but is there any way to do this on the server itself?
I have a very large database of objects (that is, an array of key/value maps, like [{}, {}, {}] in JSON notation), and I need to be able to search for any value of any key within that set of pairs and find the object that contains it (I'll be using fuzzy searching or similar string-comparison algorithms). One approach I can think of is to create an enormous master object with, for each value inside an object, a key referencing the original object:
DB = [
{
"a": 45,
"b": "Hello World"
},
{
"a": 32,
"b": "Testing..."
}
]
// ... Generation Code ... //
search = {
45: {the 0th object},
"Hello World": {the 0th object},
32: {the 1st object},
"Testing...": {the 1st object}
}
This solution at least reduces the problem to a large number of comparisons, but are there better approaches? Please note that I have very little formal computer-science training, so I may be missing some major detail that simplifies this problem or proves it impossible.
Your combined index is more suitable for a full-text search, but it doesn't indicate in which property of an object a value was found. An alternative technique that provides more context is to build one index per property.
This should be faster both to build and to query for property-specific searches (e.g. a == 32): for n objects and p properties, a binary search (used for both inserts and lookups) requires log(np) comparisons on a combined index but only log(n) on a single-property index.
In either case, you need to watch out for multiple occurrences of the same value. You can store an array of offsets as the value of each index entry, rather than just a single value.
For example:
search = {
"a": {
45: [0],
32: [1]
},
"b": {
"Hello World": [0],
"Testing...": [1]
}
}
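A minimal sketch of building and querying the per-property index in Python (the DB array is the one from the question; setdefault handles both new properties and repeated values):

```python
DB = [
    {"a": 45, "b": "Hello World"},
    {"a": 32, "b": "Testing..."},
]

# Build one index per property; each value maps to a list of
# offsets into DB, so duplicate values are handled naturally.
search = {}
for offset, obj in enumerate(DB):
    for prop, value in obj.items():
        search.setdefault(prop, {}).setdefault(value, []).append(offset)

# Property-specific lookup: a == 32
hits = [DB[i] for i in search["a"][32]]
print(hits)  # [{'a': 32, 'b': 'Testing...'}]
```

Storing offsets rather than the objects themselves keeps the index compact and means a single object can appear under many index entries without being copied.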
JSON object format is verbose:
[{"id":1,"name":"John"}, {"id":2,"name":"Jack"}]
Sometimes the repeated field names take more space than the actual data. To save bandwidth and speed up page loading, I would like to generate a JavaScript array of arrays as a string instead and send that to the client. For example, for this data:
create table temp (
id int,
name text
);
insert into temp values (1, 'John'), (2, 'Jack');
I would like to get '[[1, "John"], [2, "Jack"]]'. How can I do that?
I do not want to aggregate columns by typing them out, since that would be hard to maintain. I also know PostgreSQL does not allow mixed types in an array the way JavaScript does, so one possibility is to use composite types, but then the stringified/aggregated result ends up with '()' in it.
select array_to_json(array_agg(json_build_array(id, name)))
from temp;
array_to_json
---------------------------
[[1, "John"],[2, "Jack"]]
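For comparison only, the same array-of-arrays aggregation can be sketched in SQLite via Python's sqlite3 module, where json_group_array() and json_array() play the roles of array_to_json(array_agg(...)) and json_build_array(). This illustrates the shape of the result, not a PostgreSQL substitute:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE temp (id int, name text)")
conn.executemany("INSERT INTO temp VALUES (?, ?)", [(1, "John"), (2, "Jack")])

# json_group_array(json_array(...)) nests each row as a JSON array
# inside one outer JSON array; the subquery fixes the row order.
result = conn.execute(
    "SELECT json_group_array(json_array(id, name)) "
    "FROM (SELECT id, name FROM temp ORDER BY id)"
).fetchone()[0]
print(result)  # [[1,"John"],[2,"Jack"]]
```

Either way, no column is named more than once in the query, so adding a column means touching a single expression rather than a list of repeated field names.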
I am trying to convert a JSON object to a map object in Hive using the brickhouse lib `brickhouse.udf.json.FromJsonUDF`.
The problem is that my JSON object contains values of different types: strings and one array of arrays.
My json looks like this:
'{"key1":"value1","key2":"value2","key3":"value3","key4":[["0","1","nnn"],["1","3","mmm"],["1","3","ggg"],["1","5","kkk"],["4","5","ppp"]]}'
I can correctly read the array-of-arrays element (key4) using the following:
select from_json('{"key1":"value1","key2":"value2","key3":"value3","key4":[["0","1","nnn"],["1","3","mmm"],["1","3","ggg"],["1","5","kkk"],["4","5","ppp"]]}', 'map<string,array<array<string>>>') from my_table limit 1;
Which gives me:
{"key1":[],"key3":[],"key2":[],"key4":[["0","1","nnn"],["1","3","mmm"],["1","3","ggg"],["1","5","kkk"],["4","5","ppp"]]}
As you can see, all the elements except key4 are empty.
Alternatively, I can read the other elements, but not key4, using:
select from_json('{"key1":"value1","key2":"value2","key3":"value3","key4":[["0","1","nnn"],["1","3","mmm"],["1","3","ggg"],["1","5","kkk"],["4","5","ppp"]]}', 'map<string,string>') from my_table limit 1;
Which gives me:
{"key1":"value1","key3":"value3","key2":"value2","key4":null}
But how can I convert all the elements correctly to key-value pairs on the resulting map object?
EDITED:
My actual data is an array of two components which are json objects:
[{"key1":"value1", "key2":"value2"},{"key3":"value3","key4":"value4","key5":"value5","key6":[["0","1","nnn"],["1","3","mmm"],["1","3","ggg"],["1","5","kkk"],["4","5","ppp"]]}]
Is it possible to create a struct object that contains the two JSON objects as two map objects, so that I can access the first or second struct element and then select the value of the corresponding map object using a key?
For example, assuming my desired end result is called struct_result, I would access value1 from the first component like:
struct_result.t1["key1"]
which would give me "value1".
Is it possible to achieve this with this lib?
This can be done using named_struct. You need to create a named_struct and specify the type of each key independently.
For example:
select from_json('{"key1":"value1","key2":"value2","key3":"value3","key4":[["0","1","nnn"],["1","3","mmm"],["1","3","ggg"],["1","5","kkk"],["4","5","ppp"]]}',
                 named_struct("key1", "", "key2", "", "key3", "",
                              "key4", array(array(""))))
from my_table;
This passes a template object built with the named_struct UDF; alternatively, you can use the equivalent string type definition.
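Outside Hive, the access pattern being asked for is straightforward to sketch in plain Python; struct_result and its fields t1/t2 are the hypothetical names from the question, modeled here as a namedtuple wrapping the two parsed map objects:

```python
import json
from collections import namedtuple

raw = ('[{"key1":"value1","key2":"value2"},'
       '{"key3":"value3","key4":"value4","key5":"value5",'
       '"key6":[["0","1","nnn"],["1","3","mmm"],["1","3","ggg"],'
       '["1","5","kkk"],["4","5","ppp"]]}]')

# Two-element array of objects -> struct of two maps, mirroring the
# struct<t1:map<...>, t2:map<...>> shape described in the question
Struct = namedtuple("Struct", ["t1", "t2"])
struct_result = Struct(*json.loads(raw))

print(struct_result.t1["key1"])     # value1
print(struct_result.t2["key6"][0])  # ['0', '1', 'nnn']
```

The same two-level access (positional struct field first, then map key) is what the Hive struct-of-maps type would give you.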