Extracting name/value pairs into columns and values - Shopify data in BigQuery - arrays

We use Fivetran to extract our data from Shopify and store it in BigQuery. The "properties" field in the order_line table contains what looks like an array of key/value pairs (in this case, name/value). The field type is STRING. Here is an example of the contents:
order_line_id properties
9956058529877 [{"name":"_order_bump_rule_id","value":"4afx7cbw6"},{"name":"_order_bump_bump_id","value":"769d1996-b6fb-4bc3-8d41-c4d7125768c5"},{"name":"_source","value":"order-bump"}]
4467731660885 [{"name":"shipping_interval_unit_type","value":null},{"name":"charge_delay","value":null},{"name":"charge_on_day_of_week","value":null},{"name":"charge_interval_frequency","value":null},{"name":"charge_on_day_of_month","value":null},{"name":"shipping_interval_frequency","value":null},{"name":"number_charges_until_expiration","value":null}]
4467738738773 [{"name":"shipping_interval_unit_type","value":null},{"name":"charge_delay","value":null},{"name":"charge_on_day_of_week","value":null},{"name":"charge_interval_frequency","value":null},{"name":"charge_on_day_of_month","value":null},{"name":"shipping_interval_frequency","value":null},{"name":"number_charges_until_expiration","value":null}]
4578798600277 [{"name":"shipping_interval_unit_type","value":null},{"name":"charge_interval_frequency","value":null},{"name":"shipping_interval_frequency","value":null}]
I am trying to write a query that generates one row per record, with a column for each of these "name" values:
shipping_interval_unit_type
charge_on_day_of_week
charge_interval_frequency
charge_on_day_of_month
subscription_id
number_charges_until_expiration
shipping_interval_frequency
and the corresponding "value". The "properties" field can contain many different "name" values, and they can appear in a different order each time. The "name" values listed above are not always present in the "properties" field.
I've tried the JSON functions, but the field doesn't seem to be properly formatted as JSON. I've also tried unnesting it, but that fails because it is a string.

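Consider the approach below. It parses the string with json_extract_array, unnests the resulting array into one row per name/value pair, and then pivots the names into columns: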
select * from (
  select order_line_id,
    json_extract_scalar(property, '$.name') name,
    json_extract_scalar(property, '$.value') value
  from your_table, unnest(json_extract_array(properties)) property
)
pivot (min(value) for name in (
  'shipping_interval_unit_type',
  'charge_on_day_of_week',
  'charge_interval_frequency',
  'charge_on_day_of_month',
  'subscription_id',
  'number_charges_until_expiration',
  'shipping_interval_frequency'
))
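To check the logic outside BigQuery, the same parse-then-pivot idea can be sketched in Python. This is only an illustration, not part of the query above: the helper name pivot_properties and the second sample row are made up for the example, and unlike the SQL it keeps rows whose array is empty and takes the last value (rather than the minimum) if a name repeats.

```python
import json

# Names to turn into columns (taken from the question).
WANTED = [
    "shipping_interval_unit_type",
    "charge_on_day_of_week",
    "charge_interval_frequency",
    "charge_on_day_of_month",
    "subscription_id",
    "number_charges_until_expiration",
    "shipping_interval_frequency",
]

def pivot_properties(order_line_id, properties):
    """Hypothetical helper: one dict per order line, one key per wanted name."""
    row = {"order_line_id": order_line_id}
    row.update({name: None for name in WANTED})
    # The column is a JSON array stored as a string, so parse it first.
    for prop in json.loads(properties or "[]"):
        if prop.get("name") in WANTED:
            row[prop["name"]] = prop.get("value")
    return row

sample = [
    # Row copied from the question.
    (4578798600277,
     '[{"name":"shipping_interval_unit_type","value":null},'
     '{"name":"charge_interval_frequency","value":null},'
     '{"name":"shipping_interval_frequency","value":null}]'),
    # Hypothetical row with a non-null value, for illustration only.
    (1, '[{"name":"_source","value":"order-bump"},'
        '{"name":"subscription_id","value":"123"}]'),
]

rows = [pivot_properties(line_id, props) for line_id, props in sample]
for row in rows:
    print(row)
```

Names that are not in the WANTED list (such as _source) are ignored, which matches how the pivot clause only produces the columns it lists.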

Related

Google Sheets: VLOOKUP and Transcribing data based on multiple categories in the same column

I'm trying to use vlookup between two tabs in a sheet where the data that exists in a single column denoted by "type" (tab B) needs to be transcribed into separate columns by type (tab A).
Edit:
For example, tab B has the "raw" data listed in just three columns: the type, the ID, and the value.
I would want tab A to be able to pull from tab B but organize it by the ID, then by the value associated with the type following it.
So, tab A would vlookup according to the ID, and then pull the associated value according to the column name (type): aaaaaa, bbbbbbb, ccccccc
Column Headers would be: ID, aaaaaa, bbbbbbb, ccccccc
and then the associated data will be filled out according to the ID and type match
I've only done vlookup where the filter is by what exists on tab A to trigger what to pull in tab B
=if((F236="aaa",VLOOKUP(A236, "TabName!$A$4:$AC"),24,FALSE), if(F236="bbb",VLOOKUP(A236,"TabName!"),24,FALSE))
I've attached a sample of how the two tabs have been set up for reference! Thank you!
sample sheet
try:
=QUERY('sample database (tab b)'!A2:C,
"select B,max(C)
where B is not null
group by B
pivot A
label B'ID'")
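For readers who want to see what the pivot does step by step, here is a rough Python sketch of the same reshaping. The rows are invented (the sample sheet's contents are not reproduced here), and pivot_by_id is a hypothetical helper name, so treat this as an illustration of the idea rather than of QUERY itself.

```python
from collections import defaultdict

# Hypothetical raw rows in tab B's layout: (type, ID, value).
raw = [
    ("aaaaaa", 101, 5),
    ("bbbbbbb", 101, 7),
    ("aaaaaa", 102, 3),
    ("ccccccc", 102, 9),
    ("aaaaaa", 101, 8),
]

def pivot_by_id(rows):
    """Group by ID and keep the max value per type, one column per type."""
    rows = [r for r in rows if r[1] is not None]  # like "where B is not null"
    types = sorted({type_ for type_, _, _ in rows})
    table = defaultdict(dict)
    for type_, id_, value in rows:
        current = table[id_].get(type_)
        table[id_][type_] = value if current is None else max(current, value)
    header = ["ID"] + types
    body = [[id_] + [table[id_].get(t) for t in types] for id_ in sorted(table)]
    return [header] + body

result = pivot_by_id(raw)
for line in result:
    print(line)
```

Each distinct type becomes a column header, each ID becomes one row, and a cell stays empty (None) when that ID has no row of that type.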

How do I iterate over an array in a nested json object in sqlite?

Assume I have a sqlite table features which has a column data that contains json objects.
CREATE TABLE features ( id INTEGER PRIMARY KEY, data json )
Now, an example data object may be:
{"A":
  {"B":
    {"coordinates":[
      {"x":1, "y":10},
      {"x":10, "y":2},
      {"x":12, "y":12}
    ]
    }
  }
}
Now the number of json objects in the coordinates array can vary from row to row. Some documents can have 3 coordinates (example above) while others may have 5 or more coordinates.
For each row or document, I want to be able to iterate over just the x values and find the minimum, same for y values. So the results for the example would be 1 for x and 2 for y.
I can get all the json objects inside the array using json_each but I can't extract x for just one single row. I tried:
select value from features, json_each(json_extract(features.data, '$.A.B.coordinates'));
However, this seems to return "all" coordinate json objects for all rows. How do I go about iterating the array for one document and extract values from it so I can then select a minimum or maximum for one document?
Use json_extract() again after json_each(json_extract()) to extract each x and y and aggregate:
SELECT f.id,
MIN(json_extract(value, '$.x')) x,
MIN(json_extract(value, '$.y')) y
FROM features f, json_each(json_extract(f.data, '$.A.B.coordinates'))
GROUP BY f.id
See the demo.
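The query can also be tried locally with Python's built-in sqlite3 module. This is a minimal sketch that assumes the SQLite build bundled with your Python includes the JSON functions (most recent builds do); the second document is made up to show that the minimum is computed per row.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE features (id INTEGER PRIMARY KEY, data json)")

docs = [
    # Document from the question.
    {"A": {"B": {"coordinates": [
        {"x": 1, "y": 10}, {"x": 10, "y": 2}, {"x": 12, "y": 12}]}}},
    # Hypothetical second document with a different number of coordinates.
    {"A": {"B": {"coordinates": [{"x": 4, "y": 7}, {"x": 3, "y": 9}]}}},
]
conn.executemany(
    "INSERT INTO features (data) VALUES (?)",
    [(json.dumps(doc),) for doc in docs],
)

query = """
SELECT f.id,
       MIN(json_extract(value, '$.x')) AS x,
       MIN(json_extract(value, '$.y')) AS y
FROM features f, json_each(json_extract(f.data, '$.A.B.coordinates'))
GROUP BY f.id
ORDER BY f.id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The GROUP BY f.id is what keeps the aggregation scoped to one document: json_each still produces one row per coordinate across all documents, but MIN is then evaluated separately for each id.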

Sort by field from jsonb column

I'm using Npgsql to store data about shipments. Part of the table is a jsonb column that stores some details about the shipment, including details about the customer who made it.
The table that displays shipment data shows only the customer name, which I get for each record via
CustomerName = shipment.Metadata.RootElement.GetProperty("customer").GetProperty("customerName").ToString(),
The request is to make this column sortable, so I would need to sort by this property when querying the database.
Is it even possible to do this with Npgsql?
You can easily sort by a property inside a JSON document - just specify that property in your OrderBy clause:
_ = await ctx.Shipments
.OrderBy(s => s.Metadata.RootElement.GetProperty("customer").GetProperty("customerName").GetString())
.ToListAsync();
This produces the following:
SELECT s."Id", s."Metadata"
FROM "Shipments" AS s
ORDER BY s."Metadata"#>>'{customer,customerName}'
You should probably be able to make this use an index as well.

Snowflake Flatten Query for array

A Snowflake table has one VARIANT column and is loaded with 3 JSON records. The JSON records are as follows:
{"address":{"City":"Lexington","Address1":"316 Tarrar Springs Rd","Address2":null}}
{"address":{"City":"Hartford","Address1":"318 Springs Rd","Address2":"319 Springs Rd"}}
{"address":{"City":"Avon","Address1":"38 Springs Rd","Address2":[{"txtvalue":null},{"txtvalue":"Line 1"},{"Line1":"Line 1"}]}}
If you look at the Address2 field in the JSON, the first record holds NULL, the second a string, and the third an array.
When I execute the flatten query on Address2, only the 3rd record (the one that holds an array) is exploded. How do I get all 3 records, with the array values exploded, in a single query?
select data:address:City::string, data:address:Address1::string, value:txtvalue::string
from add1 ,lateral flatten( input => data:address:Address2 );
When I execute the flatten query for Address 2 as one records holds array, I get only the 3rd record exploded
By default, the FLATTEN table function in Snowflake skips any input rows that do not have a structure to expand; the OUTER argument controls this behaviour. Quoting the relevant portion from the documentation link above (emphasis mine):
OUTER => TRUE | FALSE
If FALSE, any input rows that cannot be expanded, either because they cannot be accessed in the path or because they have zero fields or entries, are completely omitted from the output.
If TRUE, exactly one row is generated for zero-row expansions (with NULL in the KEY, INDEX, and VALUE columns).
Default: FALSE
Since your VARIANT data is oddly formed, you'll need to leverage conditional expressions and data type predicates to check if the column in the expanded row is of an ARRAY type, a VARCHAR, or something else, and use the result to emit the right value.
A sample query illustrating the use of all above:
SELECT
t.v:address.City AS city
, t.v:address.Address1 AS address1
, CASE
WHEN IS_ARRAY(t.v:address.Address2) THEN f.value:txtvalue::string
ELSE t.v:address.Address2::string
END AS address2
FROM
add1 t
, LATERAL FLATTEN(INPUT => v:address.Address2, OUTER => TRUE) f;
P.s. Consider standardizing your input at ingest or source to reduce your query complexity.
Note: Your data example is inconsistent (the array of objects does not have homogenous keys), but going by your example query I've assumed that all keys of objects in the array will be named txtvalue.
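To make the effect of OUTER => TRUE combined with the CASE expression concrete, here is a small Python emulation over the three records from the question. It is only a sketch of the row-level behaviour, not Snowflake code: flatten_outer and explode are hypothetical helpers, and the emulation only handles arrays (the real FLATTEN also expands objects).

```python
# Records shaped like the question's VARIANT column.
records = [
    {"address": {"City": "Lexington", "Address1": "316 Tarrar Springs Rd",
                 "Address2": None}},
    {"address": {"City": "Hartford", "Address1": "318 Springs Rd",
                 "Address2": "319 Springs Rd"}},
    {"address": {"City": "Avon", "Address1": "38 Springs Rd",
                 "Address2": [{"txtvalue": None}, {"txtvalue": "Line 1"},
                              {"Line1": "Line 1"}]}},
]

def flatten_outer(value):
    """Yield each array element, or one None when there is nothing to expand."""
    if isinstance(value, list) and value:
        yield from value
    else:
        yield None  # mimics the single NULL row produced with OUTER => TRUE

def explode(records):
    rows = []
    for rec in records:
        addr = rec["address"]
        address2 = addr["Address2"]
        for element in flatten_outer(address2):
            if isinstance(address2, list):
                # Array case: read txtvalue from the expanded element.
                line = element.get("txtvalue") if isinstance(element, dict) else None
            else:
                # NULL or string case: fall back to the column itself.
                line = address2
            rows.append((addr["City"], addr["Address1"], line))
    return rows

rows = explode(records)
for row in rows:
    print(row)
```

The Lexington and Hartford records each produce one row, and the Avon record produces three. Without the outer behaviour, the first two would be dropped entirely, which is the symptom described in the question.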

Is it possible to nest values based on key column in AppSheet?

I have a table connected to AppSheet that has a column called "Names". Many rows share the same name but carry different information. Is there any way in AppSheet to have the user tap on one name and see all of the rows that have the same customer name? Essentially, grouping.
I know there is a community on Google Plus for AppSheet, but it doesn't seem very active; my question has been sitting there for weeks. If anyone needs more clarification, please ask.
Not very clear what you are trying to achieve, but you can correct me if I'm wrong.
You want in the Inline view of any given Customer, to have a list of referenced values.
You can do this with a SELECT() function. In Data > Columns > + "Virtual column"
In the "App formula" input add your function.
For example: SELECT( myTable[myColumn], [Name] = [_THISROW].[Name])
What this does:
List all values from the column named "myColumn" in table "myTable"
where "Name" has the same value as "Name" in this row.
https://help.appsheet.com/expressions/functions/select
If you want to list not only values but a list of referenced rows from another table, you should use REF_ROWS.
For example REF_ROWS("myOrdersTable", "orderCustomer")
What this does: list all rows from table named "myOrdersTable" where column named "orderCustomer" has the same value as the unique KEY of this row.
REF_ROWS virtual columns are generated automatically when you give a "REF" type to any column. In this example, if you go to Data > Columns > "myOrdersTable" and change the type of "orderCustomer" to "REF" with "ReferencedTableName" set to "myCustumerTable", a virtual column with the list of referenced rows will be generated in the "myCustumerTable" table after you save.
https://help.appsheet.com/data/references/references-between-tables
