Gql query for repeated StructuredProperty - google-app-engine

How do I write the following query in GQL? [1]
Contact.query(Contact.address == Address(city='San Francisco',
street='Spear St'))
[1] Filtering for Structured Property Values

Quoting https://cloud.google.com/appengine/docs/python/ndb/queries#gql , "To query models containing structured properties, you can use foo.bar in your GQL syntax to reference subproperties" -- so if I understand your task correctly,
'''SELECT * FROM Contact
WHERE address.city='San Francisco' AND
address.street='Spear St'
'''
should work. Doesn't it?
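One caveat for a repeated StructuredProperty, if I remember the "Filtering for Structured Property Values" section of the ndb docs correctly: filtering on address.city and address.street as two separate conditions can match a contact where the two values come from different addresses, whereas comparing against a whole Address(...) requires both on the same sub-entity. Here is a plain-Python sketch of that difference. It does not touch Datastore; the dicts and helper names are made up purely for illustration.

```python
# Plain-Python illustration only; these dicts stand in for Contact entities
# with a repeated `address` sub-property. Helper names are hypothetical.
contacts = [
    {"name": "Alice", "address": [
        {"city": "San Francisco", "street": "Spear St"},
    ]},
    {"name": "Bob", "address": [
        {"city": "San Francisco", "street": "Market St"},
        {"city": "Oakland", "street": "Spear St"},
    ]},
]


def match_per_subproperty(contact, city, street):
    """Each condition may be satisfied by a different address."""
    addresses = contact["address"]
    return (any(a["city"] == city for a in addresses)
            and any(a["street"] == street for a in addresses))


def match_same_address(contact, city, street):
    """Both conditions must hold on one and the same address."""
    return any(a["city"] == city and a["street"] == street
               for a in contact["address"])


loose = [c["name"] for c in contacts
         if match_per_subproperty(c, "San Francisco", "Spear St")]
strict = [c["name"] for c in contacts
          if match_same_address(c, "San Francisco", "Spear St")]
print(loose)   # ['Alice', 'Bob']
print(strict)  # ['Alice']
```

So if the GQL version returns more contacts than the Address(...) comparison, this is the likely reason, and it is worth testing against your own data.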

Related

Django model based on an SQL table-valued function using MyModel.objects.raw()

If it's relevant, I'm using Django with Django Rest Framework, django-mssql-backend and pyodbc.
I am building some read only models of a legacy database using fairly complex queries and Django's MyModel.objects.raw() functionality. Initially I was executing the query as a Select query which was working well, however I received a request to try and do the same thing but with a table-valued function from within the database.
Executing this:
MyModel.objects.raw('select * from dbo.f_mytablefunction')
Gives the error: Invalid object name 'myapp_mymodel'.
Looking deeper into the local variables at time of error it looks like this SQL is generated:
'SELECT [myapp_mymodel].[Field1], '
'[myapp_mymodel].[Field2] FROM '
'[myapp_mymodel] WHERE '
'[myapp_mymodel].[Field1] = %s'
The model itself is mapped properly to the query as executing the equivalent:
MyModel.objects.raw('select * from dbo.mytable')
Returns data as expected, and dbo.f_mytablefunction is defined as:
CREATE FUNCTION dbo.f_mytablefunction
(
#param1 = NULL etc etc
)
RETURNS TABLE
AS
RETURN
(
SELECT
field1, field2 etc etc
FROM
dbo.mytable
)
If anyone has any explanation as to why these two modes of operation are treated substantially differently then I would be very pleased to find out.
Guess you've figured this out by now (see docs):
MyModel.objects.raw('select * from dbo.f_mytablefunction(%s)', [1])
If you'd like to map your table valued function to a model, this gist has a quite thorough approach, though no license is mentioned.
Once you've pointed your model 'objects' to the new TableFunctionManager and added the 'function_args' OrderedDict (see tests in gist), you can query it as follows:
MyModel.objects.all().table_function(param1=1)
For anyone wondering about use cases for table valued functions, try searching for 'your_db_vendor tvf'.
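For a function that takes several arguments, the raw() call above needs one %s placeholder per argument. A minimal sketch of building that pair in plain Python follows; build_tvf_query is a hypothetical helper of mine, not part of Django or the gist, and it assumes the function name is a trusted identifier since it is formatted straight into the SQL.

```python
def build_tvf_query(function_name, args):
    """Return (sql, params) for calling a table-valued function.

    Hypothetical helper: function_name must be a trusted identifier
    (it is interpolated into the SQL); the argument values are passed
    separately as parameters, one %s placeholder each.
    """
    placeholders = ", ".join(["%s"] * len(args))
    sql = "select * from {}({})".format(function_name, placeholders)
    return sql, list(args)


sql, params = build_tvf_query("dbo.f_mytablefunction", [1])
print(sql)     # select * from dbo.f_mytablefunction(%s)
print(params)  # [1]
# With Django this pair would be passed as MyModel.objects.raw(sql, params).
```

Keeping the values in params rather than formatting them into the string lets the database driver handle the quoting.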

Create a Nested/Repeating field using SQL in BigQuery which can be queried with dot notation (without UNNEST)

I am trying to build a data structure in BigQuery using SQL which exactly reflects the data structure which I obtain when uploading JSON. This will enable me to query the view using SQL with dot notation instead of having to UNNEST, which I do understand but many of my clients find extremely confusing and unintuitive.
If I build a really simple dummy dataset with a couple of rows and then nest using the ARRAY_AGG(STRUCT([field list])) pattern:
WITH
flat_table AS (
SELECT "BigQuery" AS name, 23 AS user_count, "Data Warehouse" AS data_thing, 5 AS ease_of_use, "Awesome" AS description UNION ALL
SELECT "MySQL" AS name, 12 AS user_count, "Database" AS data_thing, 3 AS ease_of_use, "Solid" AS description
)
SELECT
name, user_count,
ARRAY_AGG(STRUCT(data_thing, ease_of_use, description)) AS attributes
FROM flat_table
GROUP BY name, user_count
Then saving and viewing the schema shows that the attributes field is Type = RECORD and Mode = REPEATED. Schema field names are:
name
user_count
attributes
attributes.data_thing
attributes.ease_of_use
attributes.description
If I look at the COLUMN information in the INFORMATION_SCHEMA.COLUMNS query I can see that the attributes field is_nullable = NO and data_type = ARRAY<STRUCT<data_thing STRING, ease_of_use INT64, description STRING>>
If I want to query this structure I need to use the UNNEST pattern as below:
SELECT
name,
user_count
FROM
nested_table,
UNNEST(attributes)
WHERE
ease_of_use > 3
However when I upload the following JSON representation of the same data to BigQuery with automatic schema detection:
{"attributes":{"description":"Awesome","ease_of_use":5,"data_thing":"Data Warehouse"},"user_count":23,"name":"BigQuery"}
{"attributes":{"description":"Solid","ease_of_use":3,"data_thing":"Database"},"user_count":12,"name":"MySQL"}
The schema looks nearly identical once loaded, except that the attributes field is now Mode = NULLABLE (it is still Type = RECORD). INFORMATION_SCHEMA.COLUMNS shows that the attributes field now has is_nullable = YES and data_type = STRUCT<data_thing STRING, ease_of_use INT64, description STRING>, i.e. it is nullable and no longer an array.
However the most interesting thing for me is that I can now query this table using dot notation instead of the UNNEST pattern, so the query above becomes:
SELECT
name,
user_count
FROM
nested_table_json
WHERE
attributes.ease_of_use > 3
Which is arguably easier to read, even in this trivial case. However once we get to more complex data structures with multiple nested fields and multi-level nesting, the UNNEST pattern becomes extremely difficult to write, QA and debug. The dot notation pattern appears to be much more intuitive and scalable.
So my question is: is it possible to build a data structure equivalent to the loaded JSON by writing queries in SQL, enabling us to build Standard SQL queries using dot notation and not requiring complex UNNEST patterns?
If you know that your array_agg will produce one row, you can drop the ARRAY notation like this:
SELECT
name, user_count,
ARRAY_AGG(STRUCT(data_thing, ease_of_use, description))[OFFSET(0)] AS attributes
FROM flat_table
GROUP BY name, user_count
Notice the use of OFFSET(0); this way the returned output will be:
[
{
"name": "BigQuery",
"user_count": "23",
"attributes": {
"data_thing": "Data Warehouse",
"ease_of_use": "5",
"description": "Awesome"
}
}
]
which can be queried using dot notation.
If you just want to group the result into a STRUCT, you don't need ARRAY_AGG.
WITH
flat_table AS (
SELECT "BigQuery" AS name, 23 AS user_count, struct("Data Warehouse" AS data_thing, 5 AS ease_of_use, "Awesome" AS description) as attributes UNION ALL
SELECT "MySQL" AS name, 12 AS user_count, struct("Database" AS data_thing, 3 AS ease_of_use, "Solid" AS description)
)
SELECT
*
FROM flat_table
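To see why dot notation works on the JSON-loaded table but not on the ARRAY_AGG one, it may help to look at the two row shapes side by side. The sketch below uses plain Python dicts and lists as stand-ins for BigQuery rows (nothing here talks to BigQuery; the variable names are mine).

```python
# Plain-Python stand-ins for the two row shapes; nothing here touches BigQuery.

# ARRAY_AGG(STRUCT(...)) -> attributes is a list of records (Mode = REPEATED)
repeated_row = {
    "name": "BigQuery",
    "user_count": 23,
    "attributes": [
        {"data_thing": "Data Warehouse", "ease_of_use": 5, "description": "Awesome"},
    ],
}

# STRUCT(...) on its own, or ARRAY_AGG(...)[OFFSET(0)] -> a single record
single_row = {
    "name": "BigQuery",
    "user_count": 23,
    "attributes": {"data_thing": "Data Warehouse", "ease_of_use": 5, "description": "Awesome"},
}

# Single record: one step down, the analogue of attributes.ease_of_use
direct = single_row["attributes"]["ease_of_use"]

# Repeated record: the elements have to be iterated first, the analogue of UNNEST
unnested = [a["ease_of_use"] for a in repeated_row["attributes"]]

print(direct)    # 5
print(unnested)  # [5]
```

The single-record shape is only safe when each row really has exactly one set of attributes; with several per row you are back to the repeated shape and UNNEST.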

CakePHP 3 DISTINCT has no effect on generated query

CakePHP 3.5.13
$query = $Substances->find()->select(['id']);
debug($query->sql());
Produces:
'SELECT Substances.id AS `Substances__id` FROM substances Substances'
Trying to do the MySQL equivalent of DISTINCT() by changing the query to:
$query = $Substances->find()->select(['id'])->distinct(['id']);
Results in exactly the same query string as without ->distinct():
'SELECT Substances.id AS `Substances__id` FROM substances Substances'
Why is this? According to the documentation, that's how you write a DISTINCT() query using Cake's ORM.

How do I execute this query in GQL [Google App engine]?

I have a table of projects, two of whose columns are 'language' and 'tag'.
After the user gives an input, I want to output all the projects whose language or tag matches that input.
The SQL query for this would be:
Select * from TableName where language='input' OR tag='input'
I tried to do the same in GQL, but in vain. What should the GQL query be to output the data this way?
GQL doesn't have OR, so basically you have to make two separate queries and union results:
Select * from TableName where language='input'
Select * from TableName where tag='input'
You have to merge the results on the app side; Cloud Console doesn't support this either.
See GQL reference: https://cloud.google.com/datastore/docs/apis/gql/gql_reference
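The merge on the app side has to drop entities that both queries return (those where language and tag both match). A rough sketch in plain Python, using dicts with a "key" field as stand-ins for the fetched entities; union_by_key is a hypothetical helper, and with real entities you would compare their Datastore keys instead.

```python
def union_by_key(*result_sets):
    """Merge result lists, keeping the first occurrence of each key."""
    seen = set()
    merged = []
    for results in result_sets:
        for entity in results:
            if entity["key"] not in seen:
                seen.add(entity["key"])
                merged.append(entity)
    return merged


# Stand-ins for the results of the two GQL queries above.
by_language = [
    {"key": 1, "language": "input", "tag": "web"},
    {"key": 2, "language": "input", "tag": "input"},
]
by_tag = [
    {"key": 2, "language": "input", "tag": "input"},
    {"key": 3, "language": "go", "tag": "input"},
]

projects = union_by_key(by_language, by_tag)
print([p["key"] for p in projects])  # [1, 2, 3]
```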
I don't know if it is mandatory for you to use GQL, but if you are able to avoid it, you can use the ndb filter instead.
from google.appengine.ext import ndb

# Match projects whose language OR tag equals the input.
results = TableName.query(ndb.OR(TableName.language == 'input',
                                 TableName.tag == 'input'))
for result in results:
    pass  # your code here
More information in: https://cloud.google.com/appengine/docs/python/ndb/queries

what is the GQL count query

In the interest of time, I do mean GQL, as in
SELECT * FROM Song WHERE composer = 'Lennon, John'
The following failed
SELECT COUNT(*) FROM myEntity
also the following
SELECT COUNT() FROM myEntity
As shown here, there is actually a way to count the results of your GQL query. The GQL documentation shows that you can't do a count in the query itself. Instead, take your SELECT *, put it in a GQL query object, then call the count() method on that object to get the count.
