I'm using Zapier with Redshift to fetch data and trigger a wide array of actions when new rows are detected in either a table or a custom query, including sending emails through Gmail or Mailchimp, exporting data to Google Sheets, and more. Zapier's UI enables our non-technical product stakeholders to take over these workflows and customize them as needed. Zapier has several integrations built for Postgres, and since Redshift supports the Postgres protocol, these custom workflows can be built easily in Zapier.
I'm switching our data warehouse from Redshift to Snowflake, and the final obstacle is moving these Zapier integrations. Snowflake doesn't support the Postgres protocol, so it cannot be used as a drop-in replacement for these workflows. No other data source has all the information we need for these workflows, so connecting to a data source upstream of Snowflake is not an option. I would appreciate guidance on alternatives I could pursue, including the following:
Moving these workflows into application code
Using a foreign data wrapper in Postgres for Snowflake to continue using the existing workflows from a dummy Postgres instance
Using custom-code blocks in Zapier instead of the Postgres integration
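For the third option, a Code by Zapier (Python) step could talk to Snowflake directly over HTTPS instead of going through the Postgres integration. A rough sketch, assuming Snowflake's SQL REST API with key-pair (JWT) auth; the account URL, warehouse, query, and the way the JWT reaches the step are all placeholders:
```
# Runs inside a "Code by Zapier" (Python) step; Zapier provides `input_data`
# and expects the step's result in `output`. Only `requests` is available as a
# third-party module there, so this calls Snowflake's SQL REST API directly.
import requests

ACCOUNT_URL = "https://myorg-myaccount.snowflakecomputing.com"  # placeholder account URL
jwt = input_data["snowflake_jwt"]  # key-pair JWT generated upstream (placeholder)

resp = requests.post(
    f"{ACCOUNT_URL}/api/v2/statements",
    headers={
        "Authorization": f"Bearer {jwt}",
        "X-Snowflake-Authorization-Token-Type": "KEYPAIR_JWT",
        "Content-Type": "application/json",
        "Accept": "application/json",
    },
    json={
        "statement": "SELECT id, email FROM analytics.public.new_signups",  # placeholder query
        "warehouse": "REPORTING_WH",  # placeholder warehouse
        "timeout": 60,
    },
)
resp.raise_for_status()
rows = resp.json().get("data", [])

# Hand the rows to the next Zap step.
output = {"row_count": len(rows), "rows": rows}
```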
I'm not sure whether Snowflake has an API that will let you do what you want, but you can create a private Zapier integration that has all the same features and permissions as a public integration while being customized for your team.
There's info about that process here: https://platform.zapier.com/
You might find it easier to use a vendor solution like Census to forward rows as events to Zapier. Their free plan is pretty sizeable for getting started. More info here https://www.getcensus.com/integrations/zapier
I am using Google Sheets to create a database that is connected to Google Data Studio. But the database is growing fast and will soon outgrow Sheets' limits.
I am looking for a cloud service that is simple to use like Sheets, where I can manually add data, do calculations (like formulas in Sheets) and also use Python to update the data there. I also need it to connect to Google Data Studio for visualisation.
I've been recommended Firestore, Cloud SQL, and BigQuery, but I still do not understand the difference between them. I am looking for something cheap where I can do the things I mentioned above.
P.S. I am new to SQL, so I would prefer a visual database (like Sheets).
Thank you all!
Sheets is not a database, but you can use it as one up to a point. Google Cloud offers other types of databases, such as:
Firestore, a document-oriented database, not really similar to a tabular Sheet
BigQuery, a very powerful data warehouse and the most similar to Sheets in its design, checks, and controls
Cloud SQL, which hosts relational database engines, similar to BigQuery but with the added ability to create constraints (unique values, primary keys, and foreign keys that reference values in other tables)
However, none of them offers the ease of Sheets in terms of graphical interface. The engines are powerful, but they are developer-oriented rather than desktop-user-oriented.
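If the Python side is the main concern, BigQuery is probably the closest fit. A minimal sketch with the google-cloud-bigquery client (project, dataset, and table names are placeholders) that appends rows from Python and then runs a SQL aggregation in place of a Sheets formula:
```
# Assumes `pip install google-cloud-bigquery`, default GCP credentials, and a
# table that already exists with matching columns (all names are placeholders).
from google.cloud import bigquery

client = bigquery.Client()
table_id = "my-project.my_dataset.sales"

# Append a couple of rows (streaming insert).
errors = client.insert_rows_json(
    table_id,
    [
        {"day": "2023-01-01", "region": "EU", "revenue": 120.5},
        {"day": "2023-01-01", "region": "US", "revenue": 310.0},
    ],
)
if errors:
    raise RuntimeError(errors)

# Run a "formula" as SQL instead of a Sheets formula.
query = f"SELECT region, SUM(revenue) AS total FROM `{table_id}` GROUP BY region"
for row in client.query(query).result():
    print(row.region, row.total)
```
Data Studio also has a native BigQuery connector, so the visualisation side keeps working.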
I made a Spotify app that analyzes user data and manages interactive features by writing the API responses to a PostgreSQL database. The developer rules state that basically I have to delete the data when the user is not actively using my app.
Is there a way to automate this on the server (I'm using AWS Lightsail/Ubuntu) to do it daily? Would I need to add a datetime column to all of my tables and follow one of these: https://www.the-art-of-web.com/sql/trigger-delete-old/? Or is there a better way?
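A minimal sketch of the scheduled-cleanup approach the question describes, assuming psycopg2 and a last_active timestamp column on each table (table and column names are placeholders; the in-database trigger route from the linked article is the alternative):
```
# Daily cleanup script; schedule it with cron on the Lightsail box, e.g.:
#   0 3 * * * /usr/bin/python3 /home/ubuntu/purge_inactive.py
import psycopg2

TABLES = ["user_top_tracks", "user_playlists"]  # placeholder table names

conn = psycopg2.connect("dbname=spotify_app user=app")
with conn, conn.cursor() as cur:
    for table in TABLES:
        # Delete rows for users who have not been active in the last 24 hours.
        cur.execute(
            f"DELETE FROM {table} WHERE last_active < NOW() - INTERVAL '24 hours'"
        )
        print(table, cur.rowcount, "rows deleted")
conn.close()
```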
I have data in Microsoft's Common Data Service (from Microsoft Dynamics for Talent). I can't use the Data Management Framework as the data in question is in entities that are not available through the DMF.
How do I replicate the data in the CDS back a SQL database?
What I've tried so far is to create a logic app (and flow, neither worked) that grabs data using the CDS connector and pushes it into an SQL database, but there are several problems with this:
It's a maintenance burden
It's extremely tedious to add new tables, etc. I have written a somewhat horrendous stored proc that tries to create a table based on the JSON-ified data passed to it from the flow, but this is very error prone.
It doesn't work at all, since the size of the data exceeds some kind of limitation in the SQL connector and I get spurious errors.
Rather than trying to push through with these issues, I'd rather ask whether there's a better way to achieve this. With the Data Management Framework in Dynamics it was simply a matter of scheduling these sync jobs, which worked pretty well. Is there something similar with CDS?
I've also tried looking at the Data Integration projects in Powerapps, but these only seem to allow me to get data into Powerapps/CDS, not back out...
Common Data Service for Apps provides access to the data using the user interfaces or API, there is no direct access to the underlying database. This architecture has certain limitations when it comes to processing large volumes of data, for example for the purposes of data warehousing, reporting, or using Azure machine learning and analytics tools. Replicating CDS data using Extract, Transform, Load (ETL) tools is possible but inherently complex to maintain.
Data Export Service is a service made available on Microsoft AppSource that adds the ability to replicate Dynamics 365 for Customer Engagement apps data to an Azure SQL Database store in a customer-owned Azure subscription.
Note: The Data Export Service requires a Dynamics 365 for Customer Engagement apps subscription; it is not available on Common Data Service for Apps plans.
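For completeness, a rough sketch of the generic ETL route mentioned above (not the Data Export Service), assuming the CDS/Dynamics Web API (OData) endpoint, an Azure AD access token acquired elsewhere, and placeholder entity, column, and table names:
```
# Pull one entity set from the CDS Web API and copy it into SQL Server.
import pyodbc
import requests

ORG_URL = "https://myorg.crm.dynamics.com"  # placeholder environment URL
TOKEN = "..."  # Azure AD access token (e.g. via an MSAL client-credentials flow)

resp = requests.get(
    f"{ORG_URL}/api/data/v9.1/cdm_workers",  # hypothetical Talent entity set
    headers={"Authorization": f"Bearer {TOKEN}", "Accept": "application/json"},
)
resp.raise_for_status()
records = resp.json()["value"]

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=.;DATABASE=staging;Trusted_Connection=yes;"
)
cur = conn.cursor()
for rec in records:
    # Placeholder columns; a real job would map/flatten the entity's attributes.
    cur.execute(
        "INSERT INTO dbo.cdm_workers (worker_id, full_name) VALUES (?, ?)",
        rec.get("cdm_workerid"),
        rec.get("cdm_name"),
    )
conn.commit()
```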
I'm new to building data pipelines where dumping files in the cloud is one or more steps in the data flow. Our goal is to store large, raw sets of data from various APIs in the cloud, then pull only what we need (summaries of this raw data) and store that in our on-premises SQL Server for reporting and analytics. We want to do this in the easiest, most logical, and most robust way. We have chosen AWS as our cloud provider, but since we're in the beginning phases we are not attached to any particular architecture/services. Because I'm no expert with the cloud or AWS, I thought I'd post my thoughts on how we can accomplish our goal and see if anyone has any advice for us. Does this architecture for our data pipeline make sense? Are there any alternative services or data flows we should look into? Thanks in advance.
1) Gather data from multiple sources (using APIs)
2) Dump responses from APIs into S3 buckets
3) Use Glue Crawlers to create a Data Catalog of data in S3 buckets
4) Use Athena to query summaries of the data in S3
5) Store data summaries obtained from Athena queries in on-premises SQL Server
Note: We will program the entire data pipeline using Python (which seems like a good call and easy no matter what AWS services we utilize as boto3 is pretty awesome from what I've seen thus far).
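A rough boto3 sketch of steps 2 and 4 under the assumptions above; bucket, database, table, and column names are placeholders:
```
# Step 2: dump a raw API response into S3. Step 4: query a summary via Athena.
import json
import time

import boto3

s3 = boto3.client("s3")
athena = boto3.client("athena")

# Step 2: land the raw API payload in the landing bucket.
api_response = {"orders": [{"id": 1, "total": 9.99}]}  # placeholder payload
s3.put_object(
    Bucket="raw-api-dumps",
    Key="orders/2023-01-01/response.json",
    Body=json.dumps(api_response),
)

# Step 4: run a summary query over the Glue-catalogued data.
run = athena.start_query_execution(
    QueryString="SELECT order_date, SUM(total) AS revenue FROM orders GROUP BY order_date",
    QueryExecutionContext={"Database": "raw_api_catalog"},
    ResultConfiguration={"OutputLocation": "s3://athena-results-bucket/"},
)
query_id = run["QueryExecutionId"]

# Poll until the query finishes.
while True:
    state = athena.get_query_execution(QueryExecutionId=query_id)["QueryExecution"]["Status"]["State"]
    if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
        break
    time.sleep(2)

rows = athena.get_query_results(QueryExecutionId=query_id)["ResultSet"]["Rows"]
# Step 5 would insert `rows` into the on-premises SQL Server (e.g. with pyodbc).
```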
You could use Glue jobs (PySpark) for #4 and #5, and you can automate the flow using Glue triggers.
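A minimal sketch of what such a Glue (PySpark) job might look like, assuming the crawler's catalog database/table and a Glue JDBC connection to the on-prem SQL Server (all names below are placeholders):
```
# Read the catalog table the crawler created, aggregate it, and write the
# summary out over JDBC to the on-premises SQL Server.
from awsglue.context import GlueContext
from awsglue.dynamicframe import DynamicFrame
from pyspark.context import SparkContext
from pyspark.sql import functions as F

glue_context = GlueContext(SparkContext.getOrCreate())

# Placeholder catalog database/table names from the crawler.
raw = glue_context.create_dynamic_frame.from_catalog(
    database="api_raw_db", table_name="api_responses"
).toDF()

summary = raw.groupBy("source", "event_date").agg(F.count("*").alias("row_count"))

# "sqlserver-onprem" would be a Glue connection defined for the on-prem SQL Server.
glue_context.write_dynamic_frame.from_jdbc_conf(
    frame=DynamicFrame.fromDF(summary, glue_context, "summary"),
    catalog_connection="sqlserver-onprem",
    connection_options={"dbtable": "dbo.api_summaries", "database": "reporting"},
)
```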
I am working on a project where I need to display performance metrics for MSSQL Server databases, for example memory consumed/free, free storage space, etc. While researching this, one thing that came up was DogStatsD.
Datadog provides a library for .NET projects to send custom metrics, but that isn't the solution for me because those metrics are only viewable on the Datadog website. I have to display all the data received from MSSQL Server myself (in graphs or whatever is suited), and there will be multiple servers/instances.
Is there a way to do that? Our web app is connected to multiple databases and we need to receive/display this information.
I cannot use already-available tools for these insights.
You can easily get all the data you need by querying DMVs and other resources inside SQL Server. A good place to start is here.
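For example, a minimal sketch (shown in Python for brevity; the same queries work from .NET) that pulls memory and disk metrics from the DMVs on each instance. Server names are placeholders and the login needs the VIEW SERVER STATE permission:
```
# Query memory and volume DMVs across a list of SQL Server instances.
import pyodbc

SERVERS = ["sql01.example.local", "sql02.example.local"]  # placeholder instances

MEMORY_SQL = """
SELECT total_physical_memory_kb, available_physical_memory_kb
FROM sys.dm_os_sys_memory;
"""

DISK_SQL = """
SELECT DISTINCT vs.volume_mount_point, vs.total_bytes, vs.available_bytes
FROM sys.master_files AS mf
CROSS APPLY sys.dm_os_volume_stats(mf.database_id, mf.file_id) AS vs;
"""

for server in SERVERS:
    conn = pyodbc.connect(
        f"DRIVER={{ODBC Driver 17 for SQL Server}};SERVER={server};Trusted_Connection=yes;"
    )
    cur = conn.cursor()
    total_kb, available_kb = cur.execute(MEMORY_SQL).fetchone()
    print(server, "memory free (KB):", available_kb, "of", total_kb)
    for mount, total_bytes, free_bytes in cur.execute(DISK_SQL).fetchall():
        print(server, mount, "disk free (bytes):", free_bytes, "of", total_bytes)
    conn.close()
```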