Orchestration tools to be integrated within Snowflake - snowflake-cloud-data-platform

I have a requirement to move away from "TASKS", which are currently used to orchestrate multiple stored procedures written within Snowflake.
I am evaluating "AIRFLOW" as a replacement for TASKS in this process, and I want to do a POC around it. If anyone can provide some reference docs or pointers for this, that would be great.
What is needed?
--> How Apache Airflow connects to Snowflake.
--> How to use Airflow to schedule the procedures sequentially, in parallel, etc.
--> Logging/error handling features.
--> Any best practices around governance.
Thanks in advance !!

Reference for your consideration:
https://community.snowflake.com/s/article/How-to-connect-Apache-Airflow-to-Snowflake-and-schedule-queries-jobs
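
To make the "sequentially, in parallel" point concrete, here is a minimal plain-Python sketch of the dependency model an orchestrator applies. It is not Airflow code: in Airflow each procedure would typically be its own task, with dependencies declared between tasks using the `>>` operator. The procedure names, the `DEPENDENCIES` map, and the `run_sql` callable (a stand-in for whatever executes SQL against Snowflake) are all assumptions made for illustration.

```python
import logging
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("orchestration_sketch")

# Hypothetical procedures: each key runs only after every procedure in its set.
DEPENDENCIES = {
    "LOAD_STAGING": set(),
    "BUILD_DIM_CUSTOMER": {"LOAD_STAGING"},
    "BUILD_DIM_PRODUCT": {"LOAD_STAGING"},
    "BUILD_FACT_SALES": {"BUILD_DIM_CUSTOMER", "BUILD_DIM_PRODUCT"},
}


def plan_batches(dependencies):
    """Group procedures into ordered batches; a batch's members do not depend on each other."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches


def run_plan(dependencies, run_sql, max_workers=4):
    """Run batches one after another and the procedures inside a batch in parallel."""
    completed = []
    for batch in plan_batches(dependencies):
        # Leaving the "with" block waits for every call in this batch to finish.
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            futures = {name: pool.submit(run_sql, f"CALL {name}()") for name in batch}
        failed = []
        for name, future in futures.items():
            error = future.exception()
            if error is None:
                log.info("procedure %s succeeded", name)
                completed.append(name)
            else:
                log.error("procedure %s failed: %s", name, error)
                failed.append(name)
        if failed:
            # Downstream procedures are skipped once anything upstream fails.
            raise RuntimeError(f"stopping: failed procedures {failed}")
    return completed
```

Passing a function that only records the statements (for example `calls.append` on a list) is enough to try the ordering without a Snowflake connection. The logging and the stop-on-failure rule are one possible policy; a real orchestrator would usually add retries and alerting on top.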

Related

Does anyone have a "Best Practice" to share for Unit/Integration/Regression testing with Snowflake?

We are embarking on a project using both Matillion and Snowflake and want to put in place some Unit/Integration/Regression testing.
Automated would be brilliant, but manual would be good too.
We could invent something (simple) ourselves... but it would be better to benefit from other people's experience.
Indeed, Matillion lacks a way to unit/integration test your models.
You need some external tools to implement these. We were using a Spring Boot microservice with the following steps:
Set up the test data with some plain SQL scripts via a JDBC connection to the underlying database
Run the corresponding job via the Matillion REST API
Use JUnit to make assertions and verify the outcome
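
The three steps above can be sketched in a few lines. This is only an illustration in Python's `unittest` rather than the Spring Boot/JUnit setup described: an in-memory SQLite database stands in for the real database, and `run_job` is a hypothetical placeholder for triggering the job (in the setup above, a call to the Matillion REST API). The table names and data are invented for the example.

```python
import sqlite3
import unittest


def run_job(conn):
    """Hypothetical stand-in for triggering the job under test."""
    conn.execute(
        "INSERT INTO customer_totals (customer_id, total) "
        "SELECT customer_id, SUM(amount) FROM orders GROUP BY customer_id"
    )


class CustomerTotalsJobTest(unittest.TestCase):
    def setUp(self):
        # Step 1: set up the test data with plain SQL.
        self.conn = sqlite3.connect(":memory:")
        self.conn.executescript(
            """
            CREATE TABLE orders (customer_id INTEGER, amount REAL);
            CREATE TABLE customer_totals (customer_id INTEGER, total REAL);
            INSERT INTO orders VALUES (1, 10.0), (1, 5.0), (2, 7.5);
            """
        )

    def tearDown(self):
        self.conn.close()

    def test_totals_are_aggregated_per_customer(self):
        # Step 2: run the job.
        run_job(self.conn)
        # Step 3: assert on the outcome.
        rows = self.conn.execute(
            "SELECT customer_id, total FROM customer_totals ORDER BY customer_id"
        ).fetchall()
        self.assertEqual(rows, [(1, 15.0), (2, 7.5)])
```

Keeping the setup, trigger, and assertion as separate steps means the same test shape can be reused for each job, with only the seed data and the expected rows changing.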
Looks like Matillion just released Object Validation in version 1.46. I haven't tried it yet but I imagine this would allow users to set up unit tests, which could then be scheduled or run after the orchestration job it's testing.

How to manage multiple database schemas from a simple Docker container?

For my application, I am using multiple databases. I want to run/upgrade the schema for all those databases from one place (for management purposes). It is a cumbersome process (especially in the production/integration phase) to go to every database and run/upgrade the schema after every release or whenever the schema changes. We thought of using a simple Docker container for this purpose.
Does anyone know whether this is a good idea or not? If possible, please suggest how it can be done.
I would also welcome any other suggestions.
As suggested by #markc, it is only a matter of scripting: connect to all the databases and run the schema on them. I used Golang as the language and built a Docker image for that.
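
A minimal sketch of that "connect to every database and run the schema" script follows, written in Python rather than the Golang used above. It assumes a hypothetical ordered `MIGRATIONS` list and a `schema_version` bookkeeping table, neither of which comes from the original answer, and uses SQLite connections as stand-ins for the real databases.

```python
import sqlite3

# Hypothetical ordered schema changes: (version, SQL statement).
MIGRATIONS = [
    (1, "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)"),
    (2, "ALTER TABLE users ADD COLUMN email TEXT"),
]


def current_version(conn):
    """Return the highest applied version, or 0 for a fresh database."""
    conn.execute("CREATE TABLE IF NOT EXISTS schema_version (version INTEGER NOT NULL)")
    row = conn.execute("SELECT MAX(version) FROM schema_version").fetchone()
    return row[0] or 0


def upgrade(conn, migrations=MIGRATIONS):
    """Apply every migration newer than the database's current version."""
    applied = []
    start = current_version(conn)
    for version, sql in migrations:
        if version <= start:
            continue
        conn.execute(sql)
        # Record the version only after the change itself succeeded.
        conn.execute("INSERT INTO schema_version (version) VALUES (?)", (version,))
        conn.commit()
        applied.append(version)
    return applied


def upgrade_all(connections):
    """Upgrade each named database and report which versions were applied where."""
    return {name: upgrade(conn) for name, conn in connections.items()}
```

Because each database records its own version, running the script a second time applies nothing, which is what makes it safe to run the same container against every environment after each release.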

Sync Fox-pro database with MS-SQL server

I have an application running on a Fox-pro database. I have added a module to the same application that runs on an MS-SQL database. I need to sync both databases in real-time at different intervals. I will eventually move the application to use MS-SQL, but until the code is changed, I need to keep the databases in sync.
Any script or tool is appreciated. Thanks.
I have found a tool, and it is working absolutely fine.
The link to the tool is: Data Loader.
Thanks to all for giving your time and effort.
I don't think there is any magic bullet for this. I would guess that anyone who has done it had a particular requirement and coded it themselves.

How to list the namespaces of a Citrusleaf/AeroSpike host?

I want to list the namespaces on a host remotely using the C# Client SDK, and the documentation is very scarce about it.
I am aware of a server tool to do this but I need to query that from a maintenance tool that I am writing, so using the server console is not an option.
Does anybody know if this is possible and if so how to do it?
You can make an info call with the string "namespaces" and parse the returned value.
Doc on the C# Info API: http://www.aerospike.com/apidocs/csharp/html/Methods_T_Aerospike_Client_Info.htm
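
The parsing half of that answer is small. The sketch below is in Python rather than C# and assumes the "namespaces" info response is a semicolon-separated string such as `test;bar`; check that format against your server version before relying on it. The function name is hypothetical, and the info call itself is not shown.

```python
def parse_namespaces(info_response: str) -> list[str]:
    """Split a semicolon-separated info response into namespace names."""
    return [name.strip() for name in info_response.split(";") if name.strip()]
```

Empty segments and surrounding whitespace are dropped, so a trailing separator or newline in the response does not produce a blank namespace.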
You can get that information by emulating the logic that clmonitor utilizes to communicate with the Aerospike cluster. Clmonitor is written in Python; executing the 'info' command in clmonitor provides a wealth of information, a subset of which is the list of namespaces. I suggest that you emulate the logic used by clmonitor in your C# code to communicate with the cluster and then parse out the information that you require. In the future, I suggest that you take advantage of the Aerospike forums to ask questions about Aerospike. Thank you for your interest in Aerospike.

Any suggestions on how a database versioning tool should be designed?

I am trying to make a tool which can help in maintaining database versions (like maintaining source code versions). The technology I am thinking of using is Spring and Hibernate, so that the tool can be web based and used by multiple projects. The idea is that any database change can only be triggered through this tool, so that the database version information can be maintained and the database kept consistent. Operations like commit, rollback, branching, and merging should be possible. Can you suggest how I should approach this problem?
I have found an open-source tool called Liquibase which already provides some sort of solution for maintaining database versions. Here is a short preview of what this tool can do. But this tool has some limitations: it does not handle stored procedures and triggers, and it works on the basis of an XML file. Still, I think I can integrate this tool with my requirement and speed up development. If you know of any other tool which could be better than this, then please let me know.
If possible, tell me how the tool should be organized so that different projects can easily maintain their database versions. What problems should the tool try to address, and what minimum support should it provide? What should the UI look like so that users can easily use it?

Resources