JMESPath query for nested array structures - arrays

I have the following data structure as a result of aws logs get-query-results:
{
"status": "Complete",
"statistics": {
"recordsMatched": 2.0,
"recordsScanned": 13281.0,
"bytesScanned": 7526096.0
},
"results": [
[
{
"field": "time",
"value": "2019-01-31T21:53:01.136Z"
},
{
"field": "requestId",
"value": "a9c233f7-0b1b-3326-9b0f-eba428e4572c"
},
{
"field": "logLevel",
"value": "INFO"
},
{
"field": "callerId",
"value": "a9b0f9c2-eb42-3986-33f7-8e450b1b72cf"
}
],
[
{
"field": "time",
"value": "2019-01-25T13:13:01.062Z"
},
{
"field": "requestId",
"value": "a4332628-1b9b-a9c2-0feb-0cd4a3f7cb63"
},
{
"field": "logLevel",
"value": "INFO"
},
{
"field": "callerId",
"value": "a9b0f9c2-eb42-3986-33f7-8e450b1b72cf"
}
],
]
}
The AWS CLI support JMESPath language for filtering output. I need to apply a query string, to filter among the returned "results" the objects that contain the "callerId" as a "field", retrieve the "value" property and obtain the following output:
[
{
callerId: "a9b0f9c2-eb42-3986-33f7-8e450b1b72cf"
},
{
callerId: "a9b0f9c2-eb42-3986-33f7-8e450b1b72cf"
}
]
The first step I do is flatter the results array with the query string: results[]
This will get read of the other root properties (status, statistics) and return only one big array with all of the {field: ..., value: ...} alike objects. But after this I can't manage to properly filter for those objects that match field=="callerId". I tried, among others, the following expressions without success:
'results[][?field=="callerId"]'
'results[][*][?field=="callerId"]'
'results[].{ callerId: #[?field=="callerId"].value }'
I'm not an expert in JMESPath and I was doing the tutorials of the jmespath.org site but couldn't manage to make it work.
Thanks!

Using jq is a good thing because it's more complete language, but if you want to do it with JMES Path here the solution:
results[*][?field=='callerId'].{callerId: value}[]
to get:
[
{
"callerId": "a9b0f9c2-eb42-3986-33f7-8e450b1b72cf"
},
{
"callerId": "a9b0f9c2-eb42-3986-33f7-8e450b1b72cf"
}
]

I'm not able to reproduce fully since I don't have the same logs in my log stream but I was able to do this using jq and putting the sample JSON object in a file
cat sample_output.json | jq '.results[][] | select(.field=="callerId") | .value'
OUTPUT:
"a9b0f9c2-eb42-3986-33f7-8e450b1b72cf"
"a9b0f9c2-eb42-3986-33f7-8e450b1b72cf"
you could pipe the output from the aws cli to jq.
I was able to get pretty close with the native JMESPath query and using the built in editor in this site
http://jmespath.org/examples.html#filtering-and-selecting-nested-data
results[*][?field==`callerId`][]
OUTPUT:
[
{
"field": "callerId",
"value": "a9b0f9c2-eb42-3986-33f7-8e450b1b72cf"
},
{
"field": "callerId",
"value": "a9b0f9c2-eb42-3986-33f7-8e450b1b72cf"
}
]
but I'm not sure how to get callerId to be the key and the value to be the value from another key.

Related

Is it possible to get key value pairs from snowflake api instead rowType?

I'm working with an API from snowflake and to deal with the json data, I would need to receive data as key-value paired instead of rowType.
I've been searching for results but haven't found any
e.g. A table user with name and email attributes
Name
Email
Kelly
kelly#email.com
Fisher
fisher#email.com
I would request this body:
{
"statement": "SELECT * FROM user",
"timeout": 60,
"database": "DEV",
"schema": "PLACE",
"warehouse": "WH",
"role": "DEV_READER",
"bindings": {
"1": {
"type": "FIXED",
"value": "123"
}
}
}
The results would come like:
{
"resultSetMetaData": {
...
"rowType": [
{ "name": "Name",
...},
{ "name": "Email",
...}
],
},
"data": [
[
"Kelly",
"kelly#email.com"
],
[
"Fisher",
"fisher#email.com"
]
]
}
And the results needed would be:
{
"resultSetMetaData": {
...
"data": [
[
"Name":"Kelly",
"Email":"kelly#email.com"
],
[
"Name":"Fisher",
"Email":"fisher#email.com"
]
]
}
Thank you for any inputs
The output is not valid JSON, but the return can arrive in a slightly different format:
{
"resultSetMetaData": {
...
"data":
[
{
"Name": "Kelly",
"Email": "kelly#email.com"
},
{
"Name": "Fisher",
"Email": "fisher#email.com"
}
]
}
}
To get the API to send it that way, you can change the SQL from select * to:
select object_construct(*) as KVP from "USER";
You can also specify the names of the keys using:
select object_construct('NAME', "NAME", 'EMAIL', EMAIL) from "USER";
The object_construct function takes an arbitrary number of parameters, as long as they're even, so:
object_construct('KEY1', VALUE1, 'KEY2', VALUE2, <'KEY_N'>, <VALUE_N>)

I have a datastudio error - basic json config

I've returned to try and make some datastudio custom javascript.
So I started off with a template type settings and basic js. Manifest is listing correctly - datastudio sees the custom item.
I took a long time for it to be authorised.
However, on adding the custome js, the console is reporting a load of erros.
first : data.0.type is not a valid config
second : data.0.elements.data.0.type is not a valid config.
Json:
{
"data": [
{
"id": "idtestviz",
"label": "Dimension Element Heading",
"type":"DIMENSION"
}
]
,
"style": [
{
"id": "idtestvizstyles",
"label": "Test Styles",
"elements":[
{
"id":"idtestvizfontcolor",
"label":"Font Colour",
"defaultValue":"#FFFF00"
}
]
}
]
}
It did have options in before, same error.
And appears to be the same as in https://developers.google.com/datastudio/visualization/define-config
Also it also is erroring on 'is already used in the config'
and that data.0.elements.style.0.elements.0.type required field that cannot be found
Seems like there are more checks that need to be done.
Is there a validator for json etc. before running, or has something updated on google side that their documentation hasn't been updated yet?
Or the more likely aspect, I'm missing some critical stuff...
Regards
Vince
Re checked my json config with a previous one that works, noted some errors in the objects. Corrected those and the json errors in the console have gone away.
JS errors remain - working on those... closing this question.
{
"data": [
{
"id":"test_viz_data",
"label":"Test Viz Data",
"elements":[
{
"id": "text_viz_dimensions",
"label": "Dimension Element Heading",
"type": "DIMENSION",
"options": {
"min": 1,
"max": 1
}
}
,
{
"id": "test_metrics",
"label": "Metric fields",
"type": "METRIC",
"options": {
"min": 1,
"max": 1
}
}
]
}
]
,
"style": [
{
"id": "idstyles",
"label": "Test Styles",
"elements":[
{
"id":"idfontcolor",
"label":"Font Colour",
"type":"FONT_COLOR",
"defaultValue":"#FFFF00"
}
]
}
]
,
"interactions": [
]
}

Apache Nifi: Parse data with UpdateRecord Processor

I'm trying to parse some data in Nifi (1.7.1) using UpdateRecord Processor.
Original data are json files, that I would like to convert to Avro, based on a schema.
The Avro conversion is ok, but in that convertion I also need to parse one array element from the json data to a different structure in Avro.
This is a sample data of the input json:
{ "geometry" : {
"coordinates" : [ [ 4.963087975800593, 45.76365595859971 ], [ 4.962874487781098, 45.76320922779652 ], [ 4.962815443439148, 45.763116079159374 ], [ 4.962744732112515, 45.763010484202866 ], [ 4.962096825239138, 45.762112721939246 ] ]} ...}
Being its schema (specified in RecordReader):
{ "type": "record",
"name": "features",
"fields": [
{
"name": "geometry",
"type": {
"type": "record",
"name": "geometry",
"fields": [
{
"name": "coordinatesJson",
"type": {
"type": "array",
"items": {
"type": "array",
"items": "double"
}
}
},
]
}
},
....
]
}
As you can see, coordinates is an array of arrays.
And I need to parse those data to Avro, based on this schema (specified in RecordWriter):
{
"name": "outputdata",
"type": "record",
"fields": [
{"name": "coordinatesAvro",
"type": {
"type": "array",
"items" : {
"type" : "record",
"name" : "coordinatesAvro",
"fields" : [ {
"name" : "X",
"type" : "double"
}, {
"name" : "Y",
"type" : "double"
} ]
}
}
},
.....
]
}
The problem here is that I'm not being able to parse from coordinatesJson to coordinatesAvro, using RecordPath functions
I tried several mappings, like:
Property: Value:
/coordinatesJson[0..-1]/X /geometry/coordinatesAvro[*][0]
/coordinatesJson[0..-1]/Y /geometry/coordinatesAvro[*][1]
It should be a pretty straighforward parsing step, but as I said, I've been going in circles to achive this for a while.
Any help would be really appreciated.
When I collide with something like that I do next:
1) Transofrm Json into Json with strcuture that I need (for example in your case: coordinatesAvro) by ExecuteScript Processor. I have used ECMAScript cause you can simple parse JSON and work with objects (transform them).
2) ConvertJsonToAvro with one common schema (coordinatesAvro in your case) for Reader and Writer.
It works very good and I have used it on BigData cases. This is one of possible resolutions for your problem.

How to construct JSON array with Parson?

I'm using Parson library to send sensor data from a MCU to a server. I want to generate the following JSON, but I can't figure out how to generate the arrays ("sensors" and "measurements").
{
"systemInfo:": {
"hubId": "1234",
"battery:": {
"value": 3.3,
"unit": "V"
}
},
"sensors": [
{
"name": "S1",
"measurements:": [
{
"measuredValue": "val",
"value": 123,
"unit": "unit"
}
]
},
{
"name": "S2",
"measurements": [
{
"measuredValue": "val1",
"value": 123,
"unit": "unit1"
},
{
"measuredValue": "val2",
"value": 123,
"unit": "unit2"
}
]
},
{
"name": "s3",
"measurements": [
{
"measuredValue": "val",
"value": 120,
"unit": "unit"
}
]
}
]
}
There is an example on the GitHub page (serialization_example), that generates an array by parsing a string:
json_object_dotset_value(root_object, "contact.emails",
json_parse_string("[\"email#example.com\", \"email2#example.com\"]"));
but I would like to generate it using the API functions and not by manually constructing the string like in the example above. E.g., by using
json_object_set_string()
json_object_dotset_string()
json_object_dotset_number() etc.
Is it possible? Or the API does not offer this functionality?
I was stuck at this same point but as I looked into parson.h and parson.c I found support for Json_Array. here is just a sample code to help.
//creating a Json_Array
JSON_Value *branch = json_value_init_array();
JSON_Array *leaves = json_value_get_array(branch);
//creating measurement Json
JSON_Value *leaf_value = json_value_init_object();
JSON_Object *leaf_object = json_value_get_object(leaf_value);
json_object_set_number(leaf_object,"name1",123);
json_object_set_number(leaf_object,"name2",456);
json_object_set_number(leaf_object,"name3",789);
json_array_append_value(leaves,leaf_value);
Hope this helps.
I didn't find a solution to my problem, but instead in found another library, cJSON, that can do what I need.

"There is no index available for this selector" despite the fact I made one

In my data, I have two fields that I want to use as an index together. They are sensorid (any string) and timestamp (yyyy-mm-dd hh:mm:ss).
So I made an index for these two using the Cloudant index generator. This was created successfully and it appears as a design document.
{
"index": {
"fields": [
{
"name": "sensorid",
"type": "string"
},
{
"name": "timestamp",
"type": "string"
}
]
},
"type": "text"
}
However, when I try to make the following query to find all documents with a timestamp newer than some value, I am told there is no index available for the selector:
{
"selector": {
"timestamp": {
"$gt": "2015-10-13 16:00:00"
}
},
"fields": [
"_id",
"_rev"
],
"sort": [
{
"_id": "asc"
}
]
}
What have I done wrong?
It seems to me like cloudant query only allows sorting on fields that are part of the selector.
Therefore your selector should include the _id field and look like:
"selector":{
"_id":{
"$gt":0
},
"timestamp":{
"$gt":"2015-10-13 16:00:00"
}
}
I hope this works for you!

Resources