How to dump BigQuery select statement results into Google Cloud SQL database. The only way I am aware of is dumping the results to Google Cloud Storage and then Cloud SQL can read from it.
Is there a better way to implement this? I want this to happen every day.
You can create a cron job that will use the BigQuery API to query the data and the MySQL API to post the data.
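A minimal sketch of that cron job in Python, assuming the google-cloud-bigquery and PyMySQL libraries; the project, dataset, table, and Cloud SQL connection details are placeholders:

```python
# Sketch: query BigQuery, then insert the rows into a Cloud SQL (MySQL) table.
# All names (project, dataset, host, table) are placeholders.
from google.cloud import bigquery
import pymysql

bq = bigquery.Client()
rows = bq.query("""
    SELECT name, SUM(amount) AS total
    FROM `my_project.my_dataset.sales`
    GROUP BY name
""").result()

conn = pymysql.connect(host="CLOUDSQL_HOST", user="app",
                       password="...", database="reporting")
try:
    with conn.cursor() as cur:
        cur.executemany(
            "INSERT INTO daily_summary (name, total) VALUES (%s, %s)",
            [(r["name"], r["total"]) for r in rows])
    conn.commit()
finally:
    conn.close()
```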
You can use Cloud Dataflow with a BigQuery query as input, but you will need to write (or find) a custom sink (Java or Python) that writes the results to MySQL.
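For illustration, with the Apache Beam Python SDK it could look roughly like the sketch below; the query, the connection details, and the pymysql-based DoFn sink are placeholders rather than an off-the-shelf connector, and on Dataflow you would also pass the usual pipeline options (project, region, temp_location):

```python
# Sketch: read rows from a BigQuery query and write them to MySQL via a custom DoFn.
import apache_beam as beam
import pymysql


class WriteToMySQL(beam.DoFn):
    """Toy sink that writes each BigQuery row into a MySQL table."""

    def setup(self):
        self.conn = pymysql.connect(host="CLOUDSQL_HOST", user="app",
                                    password="...", database="reporting")

    def process(self, row):
        with self.conn.cursor() as cur:
            cur.execute(
                "INSERT INTO daily_summary (name, total) VALUES (%s, %s)",
                (row["name"], row["total"]))
        self.conn.commit()

    def teardown(self):
        self.conn.close()


with beam.Pipeline() as p:
    (p
     | "Read" >> beam.io.ReadFromBigQuery(
           query="SELECT name, SUM(amount) AS total "
                 "FROM `my_project.my_dataset.sales` GROUP BY name",
           use_standard_sql=True)
     | "Write" >> beam.ParDo(WriteToMySQL()))
```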
I am using Google Sheets to create a database that is connected to Google Data Studio. But the database is growing fast and will soon outgrow Sheets' limits.
I am looking for a cloud service that is simple to use like Sheets, where I can manually add data, do calculations (like formulas in Sheets), and also use Python to update the data there. I also need it to connect to Google Data Studio for visualisation.
I have been recommended Firestore, Cloud SQL, and BigQuery, but I still do not understand the difference between them. I am looking for something cheap where I can do the things I mentioned above.
P.S. I am new to SQL, so I would prefer a visual database (like Sheets).
Thank you all!
Sheets is not a database, but you can use it as one. Google Cloud offers other types of databases, such as:
Firestore, a document-oriented database, not really similar to a tabular sheet
BigQuery, a very powerful data warehouse and the most similar to Sheets in its design, checks, and controls
Cloud SQL, which hosts relational database engines, similar to BigQuery but with, in addition, the capacity to create constraints (unique values, primary keys, and foreign keys that reference values in another table).
However, none of them offers the ease of Sheets in terms of graphical interface. The engines are powerful, but they are developer-oriented, not desktop-user-oriented.
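If you go with BigQuery, updating the data from Python is straightforward and the resulting table can then be connected to Data Studio. A minimal sketch, assuming the google-cloud-bigquery and pandas libraries and a placeholder table named my_dataset.daily_metrics:

```python
# Sketch: push a pandas DataFrame into a BigQuery table that Data Studio can read.
import pandas as pd
from google.cloud import bigquery

client = bigquery.Client()

df = pd.DataFrame({
    "day": pd.to_datetime(["2021-01-01", "2021-01-02"]),
    "visits": [120, 98],
})

job = client.load_table_from_dataframe(
    df, "my_dataset.daily_metrics",
    job_config=bigquery.LoadJobConfig(write_disposition="WRITE_APPEND"))
job.result()  # wait for the load job to finish
```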
How to update or delete data in an Azure SQL DB using Azure Stream Analytics
Currently, Azure Stream Analytics (ASA) only supports inserting (appending) rows to SQL outputs (Azure SQL Databases, and Azure Synapse Analytics).
You should consider using workarounds to enable UPDATE, UPSERT, or MERGE on SQL databases, with Azure Functions as the intermediary layer.
You can find more information about such workarounds in this MS article.
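As a rough illustration of that Azure Functions workaround (not taken from the article), an HTTP-triggered function in Python could receive the event batches that the Stream Analytics Azure Functions output posts, and upsert them with pyodbc. Table, column, and field names below are placeholders:

```python
# Sketch: Stream Analytics posts a JSON array of events to this HTTP-triggered
# function, which upserts them into a placeholder table dbo.DeviceState.
import json
import os
import pyodbc
import azure.functions as func

MERGE_SQL = """
MERGE dbo.DeviceState AS t
USING (SELECT ? AS DeviceId, ? AS Temperature) AS s
ON t.DeviceId = s.DeviceId
WHEN MATCHED THEN UPDATE SET t.Temperature = s.Temperature
WHEN NOT MATCHED THEN INSERT (DeviceId, Temperature)
    VALUES (s.DeviceId, s.Temperature);
"""

def main(req: func.HttpRequest) -> func.HttpResponse:
    events = json.loads(req.get_body())
    with pyodbc.connect(os.environ["SQL_CONNECTION_STRING"]) as conn:
        cur = conn.cursor()
        for e in events:
            cur.execute(MERGE_SQL, e["deviceId"], e["temperature"])
        conn.commit()
    return func.HttpResponse(status_code=200)
```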
Firstly, we need to know what Azure Stream Analytics is.
An Azure Stream Analytics job consists of an input, a query, and an output. Stream Analytics ingests data from Azure Event Hubs, Azure IoT Hub, or Azure Blob Storage. The query, which is based on the SQL query language, can be used to easily filter, sort, aggregate, and join streaming data over a period of time. You can also extend this SQL language with JavaScript and C# user-defined functions (UDFs). You can easily adjust the event ordering options and the duration of time windows when performing aggregation operations through simple language constructs and/or configurations.
Azure Stream Analytics now natively supports Azure SQL Database as a source of reference data input. Developers can author a query to extract the dataset from Azure SQL Database, and configure a refresh interval for scenarios that require slowly changing reference datasets.
That means that you cannot update or delete data in an Azure SQL DB using Azure Stream Analytics; it can only append (insert) rows.
Azure Stream Analytics is not a database management tool.
Hope this helps.
I'm new to building data pipelines where dumping files in the cloud is one or more steps in the data flow. Our goal is to store large, raw sets of data from various APIs in the cloud, then pull only what we need (summaries of this raw data) and store that in our on-premises SQL Server for reporting and analytics. We want to do this in the easiest, most logical and robust way. We have chosen AWS as our cloud provider, but since we're in the beginning phases we are not attached to any particular architecture/services. Because I'm no expert with the cloud or AWS, I thought I'd post my thoughts on how we can accomplish our goal and see if anyone has any advice for us. Does this architecture for our data pipeline make sense? Are there any alternative services or data flows we should look into? Thanks in advance.
1) Gather data from multiple sources (using APIs)
2) Dump responses from APIs into S3 buckets
3) Use Glue Crawlers to create a Data Catalog of data in S3 buckets
4) Use Athena to query summaries of the data in S3
5) Store data summaries obtained from Athena queries in on-premises SQL Server
Note: We will program the entire data pipeline using Python (which seems like a good call and easy no matter what AWS services we utilize as boto3 is pretty awesome from what I've seen thus far).
You may use Glue jobs (PySpark) for #4 and #5, and you can automate the flow using Glue triggers.
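A rough sketch of what such a Glue (PySpark) job could look like; the catalog database/table names, the summary query, and the JDBC connection details are placeholders, and the SQL Server JDBC driver has to be available to the job:

```python
# Sketch of a Glue job covering steps 4-5: summarise the cataloged S3 data
# and push the summary to on-prem SQL Server over JDBC. All names are placeholders.
import sys
from awsglue.utils import getResolvedOptions
from awsglue.context import GlueContext
from awsglue.job import Job
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext())
spark = glue_context.spark_session
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read the table that the Glue crawler created from the S3 dump
raw = glue_context.create_dynamic_frame.from_catalog(
    database="api_dumps", table_name="responses")

# Summarise with Spark SQL (stand-in for the Athena query)
raw.toDF().createOrReplaceTempView("responses")
summary = spark.sql(
    "SELECT source, COUNT(*) AS n, SUM(amount) AS total "
    "FROM responses GROUP BY source")

# Write the summary to the on-prem SQL Server (reachable e.g. over VPN/Direct Connect)
(summary.write.format("jdbc")
    .option("url", "jdbc:sqlserver://onprem-host:1433;databaseName=reporting")
    .option("dbtable", "dbo.api_summary")
    .option("user", "etl_user")
    .option("password", "...")
    .mode("append")
    .save())

job.commit()
```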
I'm new to Azure development, and I'm having trouble finding examples of what I want to do.
I have an XML file in Azure file storage and I want to use a Logic App to get that XML data into a SQL database.
I guess I will need to create a "SQL Database" in Azure, before the Logic App can be written (correct?).
Assuming that I have some destination SQL database, are there Logic App connectors/triggers/whatever that I can use to: 1) recognize that a file has been uploaded to Azure, and 2) process that XML to go into a database?
If so, can such connectors/triggers/whatevers be configured/written so that any business rules I have, for massaging the data between the XML and the database, can be specified?
Thanks!
Yes, you are right: you need to create the database first and then write the Logic App to perform the necessary functionality.
There are a lot of connectors with triggers, like the Blob Storage connector, the SQL connector, etc.
You can perform your processing with the help of Enterprise Connectors, or you can do custom processing using Azure Functions, which integrate with Logic Apps.
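If you go the Azure Functions route, a hedged sketch of what the custom processing could look like: an HTTP-triggered Python function that the Logic App calls with the file content, applies your massaging rules, and returns rows as JSON for the SQL connector (or a stored procedure) to insert. Element and column names below are made up:

```python
# Sketch: the Logic App posts the XML file content to this function; the function
# applies business rules and returns JSON rows for the SQL insert step.
import json
import xml.etree.ElementTree as ET
import azure.functions as func

def main(req: func.HttpRequest) -> func.HttpResponse:
    root = ET.fromstring(req.get_body())
    rows = []
    for order in root.findall("Order"):          # placeholder element name
        amount = float(order.findtext("Amount", default="0"))
        rows.append({
            "OrderId": order.get("id"),
            "Amount": round(amount * 1.2, 2),    # example "massaging" rule
        })
    return func.HttpResponse(json.dumps(rows),
                             mimetype="application/json")
```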
In order to perform CRUD operations on an Azure SQL Database, you can use the SQL Connector. Documentation on the connector can be found here:
Logic App SQL Connector
Adding SQL Connector to a Logic App
I've also written a blog post myself on how to use the SQL Connector to perform bulk operations using a stored procedure and OPENJSON: Bulk insert into SQL
This might help you in designing your Logic App if you choose to use a stored procedure.
I would like to build an application that serves a lot of users, so I decided to use Cloud Datastore because it is more scalable, but I also want an interface that helps me explore my data with some complex SQL queries.
So I decided to keep my data in two databases (Cloud Datastore and Cloud SQL): the users of my application will get the data from Datastore, while I will use Cloud SQL from my interface.
The users will only read data; they will not write to Datastore. From my interface I will read the data from Cloud SQL so I can use complex queries, and if I want to write or change the data, I will change it in both Cloud SQL and Datastore.
What do you think? Is there another suggestion? Thank you.
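Roughly, the dual write I have in mind would look like the sketch below (kind, table, and connection details are placeholders; I'm assuming the google-cloud-datastore client and PyMySQL):

```python
# Sketch of the dual write described above: every admin-side change goes to
# both Cloud SQL (for complex queries) and Datastore (what the app reads).
from google.cloud import datastore
import pymysql

ds = datastore.Client()
sql = pymysql.connect(host="CLOUDSQL_HOST", user="admin",
                      password="...", database="appdata")

def upsert_product(product_id, name, price):
    # 1) Cloud SQL, used by the reporting interface
    with sql.cursor() as cur:
        cur.execute(
            "REPLACE INTO products (id, name, price) VALUES (%s, %s, %s)",
            (product_id, name, price))
    sql.commit()

    # 2) Datastore, read by the application's users
    entity = datastore.Entity(key=ds.key("Product", product_id))
    entity.update({"name": name, "price": price})
    ds.put(entity)

upsert_product("p-1", "Widget", 9.99)
```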