BQ load JSON File with Array of Array - arrays

I'm trying to load a JSON file where some of the arrays are empty.
{"house_account_payable":"0.00","house_account_receivable":"0.00","gift_sales_payable":"0.00","gift_sales_receivable":"0.00","store_credit_sales_payable":"0.00","percentage_row":null,"sales_per_period":[["02:00AM - 02:59AM",{"amount":0,"qty":0}],["03:00AM - 03:59AM",{"amount":0,"qty":0}]],"revenue_centers":[],"tax_breakdowns":[]}
This is giving the error:
Error while reading table: test2, error message: Failed to parse JSON: No object found when new array is started.; BeginArray returned false; Parser terminated before end of string
Could somebody help me on this?

Are you trying to load data from your local machine or from GCS? Remember to export the data as JSONL (newline-delimited JSON), with one object per line:
{"open_orders_ids": []}
{"unpaid_orders_ids": []}
Take a look at the documentation on nested and repeated columns.
EDIT:
Your JSON schema should look like this:
{
  "items": [
    {
      "house_account_payable": "0.00",
      "house_account_receivable": "0.00",
      "gift_sales_payable": "0.00",
      "gift_sales_receivable": "0.00",
      "store_credit_sales_payable": "0.00",
      "percentage_row": "",
      "sales_per_period": [
        {
          "AM02_00_AM02_59": {
            "amount": "0",
            "qty": "0"
          }
        },
        {
          "AM03_00_AM03_59": {
            "amount": "0",
            "qty": "0"
          }
        }
      ]
    }
  ]
}
Following Felipe Hoffa's post, run the following commands (the jq filter is quoted so the shell does not try to expand the brackets):
jq -c '.items[]' <FILENAME>.json > <FILENAME>.jq.json
bq load --source_format NEWLINE_DELIMITED_JSON --autodetect <DATASET_ID>.<TABLENAME> <FILENAME>.jq.json
Let me know if this is what you are looking for.

There's no problem with the empty arrays.
The problem lies in this part of the JSON, shown here as a shorter example:
{"sales_per_period":[["02:00AM - 02:59AM",{"amount":0,"qty":0}],["03:00AM - 03:59AM",{"amount":0,"qty":0}]]}
The arrays there hold elements of different types, and to bring it into a structured table, a different schema is needed.
For example:
{"sales_per_period":[{"a":"02:00AM - 02:59AM","b":{"amount":0,"qty":0}},{"a":"03:00AM - 03:59AM","b":{"amount":0,"qty":0}}]}
Now this loads easily into BigQuery:
bq load --source_format=NEWLINE_DELIMITED_JSON --autodetect temp.short delete.short.json
Can you change this source JSON easily outside BigQuery? Otherwise load it raw into BigQuery, and parse it with a JS UDF inside BigQuery.
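If the source JSON can be changed outside BigQuery, the reshaping described above can be sketched in a few lines of Python. This is a minimal sketch, not a definitive implementation: the key names "a" and "b" follow the example above, while the function names reshape_record and to_ndjson are hypothetical.

```python
import json


def reshape_record(record):
    """Return a copy of record where each [label, stats] pair in
    sales_per_period becomes an object with the fixed keys "a" and "b"."""
    out = dict(record)
    out["sales_per_period"] = [
        {"a": label, "b": stats}
        for label, stats in record.get("sales_per_period", [])
    ]
    return out


def to_ndjson(records):
    """Serialize the reshaped records as newline-delimited JSON."""
    return "\n".join(json.dumps(reshape_record(r)) for r in records)


sample = {
    "sales_per_period": [
        ["02:00AM - 02:59AM", {"amount": 0, "qty": 0}],
        ["03:00AM - 03:59AM", {"amount": 0, "qty": 0}],
    ],
    "revenue_centers": [],
    "tax_breakdowns": [],
}

ndjson_text = to_ndjson([sample])
```

The resulting text can be written to a file and loaded with the bq load command shown above.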

Related

Iterating a JSON array in LogicApps

I'm using a LogicApp triggered by an HTTP call. The call posts a JSON message which is a single row array. I simply want to extract the single JSON object out of the array so that I can parse it but have spent several hours googling and trying various options to no avail. Here's an example of the array:
[{
"id": "866ef906-5bd8-44d8-af34-0c6906d2dfd7",
"subject": "Engagement-866ef906-5bd8-44d8-af34-0c6906d2dfd7",
"data": {
"$meta": {
"traceparent": "00-dccfde4923181d4196f870385d99cb84-52b8333f100b844c-00"
},
"timestamp": "2021-10-19T17:01:06.334Z",
"correlationId": "866ef906-5bd8-44d8-af34-0c6906d2dfd7",
"fileName": "show.xlsx"
},
"eventType": "File.Uploaded",
"eventTime": "2021-10-19T17:01:07.111Z",
"metadataVersion": "1",
"dataVersion": "1"
}]
Examples of what hasn't worked:
Parse JSON on the array, error: InvalidTemplate when specifying an array as the schema
For each directly against the http output, error: No dependent actions succeeded.
Any suggestions would be gratefully received.
You have to paste the example that you have provided to 'Use sample payload to generate schema' in the Parse JSON Connector and then you will be able to retrieve each individual object from the sample payload.
You can extract a single JSON object from your array by using its index in square brackets. For example, use triggerBody()?[0] instead of triggerBody(). 0 is the index of the first element in the array, 1 of the second, and so on.

Logic Apps - looping through a nested array in JSON

I need to loop through this optional array (it's the only section of the JSON I have trouble with).
As you can see from the code:
The optional bullseye has an array rings. rings has arrays of expansionCriteria and expansionCriteria may or may not have actions.
How do I iterate and get all type, threshold in expansionCriteria? I also need to access all skillsToRemove under actions, if available.
I am rather new to Logic Apps, so any help is appreciated.
"bullseye": {
"rings": [
{
"expansionCriteria": [
{
"type": "TIMEOUT_SECONDS",
"threshold": 180
}
],
"actions": {
"skillsToRemove": [
{
"name": "Claims Foundation",
"id": "60bd469a-ebab-4958-9ca9-3559636dd67d",
"selfUri": "/api/v2/routing/skills/60bd469a-ebab-4958-9ca9-3559636dd67d"
},
{
"name": "Claims Advanced",
"id": "bdc0d667-8389-4d1d-96e2-341e383476fc",
"selfUri": "/api/v2/routing/skills/bdc0d667-8389-4d1d-96e2-341e383476fc"
},
{
"name": "Claims Intermediate",
"id": "c790eac3-d894-4c00-b2d5-90cd8a69436c",
"selfUri": "/api/v2/routing/skills/c790eac3-d894-4c00-b2d5-90cd8a69436c"
}
]
}
},
{
"expansionCriteria": [
{
"type": "TIMEOUT_SECONDS",
"threshold": 5
}
]
}
]
}
Please let me know if you need more info.
To generate the schema, you can remove the name of the object at the top of the code: "bullseye":
Thank you pramodvalavala-msft for posting your answer to a similar thread in MS Q&A.
"As you are working with a JSON Object instead of an Array, unfortunately there is no built-in function to loop over the keys. There is a feature request to add a method to extract keys from an object for scenarios like this, which you could upvote to help it gain more traction.
You can use the inline code action to extract the keys from your object as an array (using Object.keys()). And then you can loop over this array using the foreach loop to extract the object that you need from the main object, which you could then use to create records in dynamics."
For more information, refer to the following threads:
How to loop and extract items from Nested Json Array in Logic Apps
Nested ForEach Loop in Workflow
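The traversal the question asks for needs one loop per nesting level, with the optional actions object handled as possibly absent. The Python sketch below only illustrates that loop structure. It is not Logic Apps syntax, and the function name collect_criteria_and_skills is hypothetical.

```python
def collect_criteria_and_skills(bullseye):
    """Walk rings -> expansionCriteria and rings -> actions -> skillsToRemove.

    Returns a list of (type, threshold) pairs and a list of skill names.
    Missing "rings", "actions" or "skillsToRemove" are treated as empty.
    """
    criteria = []
    skills = []
    for ring in bullseye.get("rings", []):
        for criterion in ring.get("expansionCriteria", []):
            criteria.append((criterion["type"], criterion["threshold"]))
        # "actions" is optional, so fall back to an empty object.
        for skill in ring.get("actions", {}).get("skillsToRemove", []):
            skills.append(skill["name"])
    return criteria, skills


bullseye = {
    "rings": [
        {
            "expansionCriteria": [{"type": "TIMEOUT_SECONDS", "threshold": 180}],
            "actions": {
                "skillsToRemove": [
                    {"name": "Claims Foundation"},
                    {"name": "Claims Advanced"},
                    {"name": "Claims Intermediate"},
                ]
            },
        },
        {"expansionCriteria": [{"type": "TIMEOUT_SECONDS", "threshold": 5}]},
    ]
}

criteria, skills = collect_criteria_and_skills(bullseye)
```

In a workflow, the same shape would presumably map to an outer For each over rings with inner For each loops over expansionCriteria and skillsToRemove, guarded by a check that actions exists.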

Parse JSON with wrong designed structure in Swift

I have to parse some really terribly designed JSON, and to be honest I have never faced anything like it. The following is a simplified excerpt from the entire JSON file:
{
"5ee70183-87fe-4799-802e-ef7f5e7323db":
{
"title": "Bank 1",
"logo": "655ee02d87cf4cdf912c3507233b0520.gif"
},
"332c7078-97ad-4bf7-b8ee-44d85a9c88d1":
{
"title": "Bank 2",
"logo": "655ee02d87cf4cdf912c3507233b0520.gif"
},
"8e9bd4c8-6f4a-4663-ae86-b8fbaf295030":
{
"title": "Bank 3",
"logo": "655ee02d87cf4cdf912c3507233b0520.gif"
}
}
As you can see, the "root" keys are UUIDs. Those keys and their values are supposed to be a list, but instead of the correct [] brackets for a list, {} is used. If I parse this using Codable, I have to create structs with UUID names, and what's worse, this "list" is not fixed and could in theory grow without limit. So my job is to parse this JSON and get an array of bank entities. At the moment I think I can't use Codable and need to parse this manually into a dictionary and read the properties from there, assigning each to the correct list item. If you have ever faced such an issue or know a better parsing option, it would greatly help me.
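You need to decode the top level as a dictionary keyed by the UUID strings:
import Foundation

struct Root: Codable {
    let title, logo: String
}

// Decode the top-level object as [String: Root]; the UUIDs become the dictionary keys.
// `data` is the JSON payload as Data.
let res = try! JSONDecoder().decode([String: Root].self, from: data)
// The dictionary values are the bank entities.
print(Array(res.values))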

Generate JSON schema

I'm trying to set up a Microsoft Flow. In short, I need to take JSON data retrieved from a device and parse it so that I can reference it in the steps that follow. In order to parse it, I need to provide the JSON schema to Flow. Microsoft Flow has an option to generate it from a sample payload (the results returned from the API call), but it's not generating it correctly. I'm hoping someone can help me. I need the correct JSON schema.
The data returned from the API:
[
null,
[
{
"user_id": 2003,
"user_label": "Test1"
},
{
"user_id": 2004,
"user_label": "Test2"
}
]
]
Schema generated in Flow from the above sample payload:
{
"type": "array",
"items": {}
}
I then tried to generate the schema from just the data. That seemed to work, but when the Flow runs, I get a JSON validation error.
Tried generating from just the data like this:
{
"user_id": 2003,
"user_label": "Test1"
}
This generated the schema like this:
{
"type": "object",
"properties": {
"user_id": {
"type": "number"
},
"user_label": {
"type": "string"
}
}
}
So you have 2 things going on, the nested object array, and the null.
You'll need another Parse JSON after the first Parse JSON. And you'll want to filter out the null before the second Parse JSON.
It took me a while to figure out, but I hope this helps.
Start by adding the Parse JSON step to whatever step is outputting the JSON.
Now filter the array. Make sure you use 'Expression' when comparing with null.
Add the second Parse JSON. You'll notice that you won't have the option to select the output "Item" of the Filter array step, so select 'Parse JSON' - Item for now (we will change this to use the output of the Filter array step in a moment).
The step should automatically change to an 'Apply to each'. In the Parse JSON 2, generate the schema with
[
{
"user_id": 2003,
"user_label": "Test1"
},
{
"user_id": 2004,
"user_label": "Test2"
}
]
Then, modify the 'Select an output from previous steps' field and change it from the Body of the Parse JSON step to the Body of the Filter array step.
Finally, add an action after Parse JSON 2 and select one of the fields in Parse JSON 2, this will automatically change that step to a nested Apply to each
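The logic of these steps (drop the null entry, then iterate over the nested array) can be sketched outside Flow in Python. This is only an illustration of the data handling under the sample payload from the question, not Flow syntax, and the function name extract_users is hypothetical.

```python
def extract_users(payload):
    """Drop null entries from the outer array, then flatten the inner arrays."""
    # Equivalent of the Filter array step: keep only non-null items.
    non_null = [item for item in payload if item is not None]
    # Equivalent of the nested Apply to each: visit every user object.
    return [user for group in non_null for user in group]


payload = [
    None,
    [
        {"user_id": 2003, "user_label": "Test1"},
        {"user_id": 2004, "user_label": "Test2"},
    ],
]

users = extract_users(payload)
labels = [user["user_label"] for user in users]
```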

mongoexport csv output last array values

Inspired by this question in Server Fault
https://serverfault.com/questions/459042/mongoexport-csv-output-array-values
I'm using mongoexport to export some collections into CSV files, however when I try to target fields which are the last members of an array I cannot get it to export correctly.
Command I'm using
mongoexport -d db -c collection --fieldFile fields.txt --csv > out.csv
One item of my collection:
{
"id": 1,
"name": "example",
"date": [
{"date": ""},
{"date": ""},
],
"status": [
"true",
"false",
],
}
I can access the first member of my array by writing the fields as follows:
name
id
date.0.date
status.0
Is there a way to access the last item of my array without knowing the length of the array?
Because the following doesn't work:
name
id
date.-1.date
status.-1
Any idea of the correct notation? Or if it's simply not possible?
It's not possible to reference the last element of the array without knowing the length of the array, since the notation is array_field.index where the index is in [0, length - 1]. You could use the aggregation framework to create the view of the data that you want to export, save it temporarily into a collection with $out, and then mongoexport that. For example, for your documents you could do
db.collection.aggregate([
{ "$unwind" : "$date" },
{ "$group" : { "_id" : "$_id", "date" : { "$last" : "$date" } } },
{ "$out" : "temp-for-csv" }
])
in order to get just the last date for each document and output it to the collection temp-for-csv.
You can return just the last elements of an array with the $slice projection operator, but at the time of writing this wasn't available in aggregation (an aggregation $slice operator was added later, in MongoDB 3.2), and mongoexport only takes a query specification, not a projection specification, since the --fields and --fieldFile options are supposed to suffice. Using a query with a projection in mongoexport might make a good feature request.
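Another option, if the documents can be read into a script first, is to pick the last array elements client-side and write the CSV yourself. The Python sketch below assumes the documents have already been fetched as dictionaries (for example through a driver) and have the shape shown in the question. The sample values, the column names last_date and last_status, and the function names are hypothetical.

```python
import csv
import io


def last_or_empty(values):
    """Return the last element of a list, or "" if the list is empty."""
    return values[-1] if values else ""


def rows_for_csv(docs):
    """Yield one flat row per document, keeping only the last array members."""
    for doc in docs:
        last_date = last_or_empty(doc.get("date", []))
        yield {
            "name": doc.get("name", ""),
            "id": doc.get("id", ""),
            "last_date": last_date.get("date", "") if last_date else "",
            "last_status": last_or_empty(doc.get("status", [])),
        }


def to_csv(docs):
    """Render the rows as CSV text with a header line."""
    buf = io.StringIO()
    fields = ["name", "id", "last_date", "last_status"]
    writer = csv.DictWriter(buf, fieldnames=fields)
    writer.writeheader()
    writer.writerows(rows_for_csv(docs))
    return buf.getvalue()


# Hypothetical sample values in the shape of the document from the question.
docs = [
    {
        "id": 1,
        "name": "example",
        "date": [{"date": "2021-01-01"}, {"date": "2021-02-01"}],
        "status": ["true", "false"],
    }
]

csv_text = to_csv(docs)
```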

Resources