When the source tables in Azure Blob Storage are modified or new records are added, I need to read only those modifications or latest records using external tables in Snowflake that are created on top of the source files.
I need a query or the syntax to read them.
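A minimal sketch of how this could look, assuming CSV files under an existing Azure stage called my_azure_stage, a last-modified timestamp in the second column, and a warehouse table final_orders holding what has already been loaded (all of these names are assumptions, not taken from the question):

    -- External table over the Azure Blob location. Auto-refresh via Event Grid
    -- is also possible; here the metadata is refreshed manually.
    CREATE OR REPLACE EXTERNAL TABLE ext_orders (
        order_id   INT           AS (VALUE:c1::INT),
        updated_at TIMESTAMP_NTZ AS (VALUE:c2::TIMESTAMP_NTZ)
    )
    LOCATION    = @my_azure_stage/orders/
    FILE_FORMAT = (TYPE = CSV)
    AUTO_REFRESH = FALSE;

    -- Pick up new or changed files, then read only the latest records.
    ALTER EXTERNAL TABLE ext_orders REFRESH;

    SELECT order_id, updated_at, METADATA$FILENAME AS source_file
    FROM ext_orders
    WHERE updated_at > (SELECT COALESCE(MAX(updated_at), '1900-01-01') FROM final_orders);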
I have staging tables in my SQL Server database, views that transform and combine those tables, and final tables that I create from the result data of the views.
I could automate the process by creating a stored procedure that would truncate the final table and insert the data from the view.
I want to know if it's possible to do this operation with an Azure Data Factory copy activity using the view as source and the table as sink.
Thank you for your help!
ADF does support SQL Server as a source as well as a sink.
So there are two ways:
You can use a Copy activity with the view as your source and the table as the destination.
You can use a Stored Procedure activity, where all the data ingestion/transformation logic lives inside a stored procedure and you simply call that stored procedure.
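For the second option, a minimal sketch of such a procedure based on the truncate-and-insert pattern you described, assuming a final table dbo.FinalTable and a view dbo.vw_TransformedData (both names are placeholders):

    CREATE OR ALTER PROCEDURE dbo.LoadFinalTable
    AS
    BEGIN
        SET NOCOUNT ON;
        -- Empty the final table before each load
        TRUNCATE TABLE dbo.FinalTable;
        -- Reload it from the transforming view
        INSERT INTO dbo.FinalTable (Col1, Col2)
        SELECT Col1, Col2
        FROM dbo.vw_TransformedData;
    END;

The Stored Procedure activity would then just call dbo.LoadFinalTable against the SQL Server linked service.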
When processing a fact table, do people commonly use a Data Flow Task (in SSIS) as their ETL (Extract, Transform, Load), or do they use stored procedures to process data for the fact table?
Is it common to use a Data Flow Task with a Lookup transformation to filter data and insert it into the fact table, or to write a stored procedure with SQL queries to process the data into the fact table?
When processing with a Data Flow Task, SSIS seems to load the data into server memory, run its Lookup and filter transformations there, and write the data back to the SQL Server database in batches.
On the other hand, when processing with a stored procedure, the data stays in SQL Server. Your feedback is appreciated. Thank you.
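For context, a minimal sketch of what the stored-procedure approach often looks like, with the dimension "lookups" done as set-based joins inside SQL Server; every table and column name here is an assumption:

    CREATE OR ALTER PROCEDURE dbo.LoadFactSales
    AS
    BEGIN
        SET NOCOUNT ON;
        -- Resolve surrogate keys with joins instead of per-row lookups
        INSERT INTO dbo.FactSales (CustomerKey, DateKey, SalesAmount)
        SELECT d_cust.CustomerKey,
               d_date.DateKey,
               s.SalesAmount
        FROM dbo.StagingSales AS s
        JOIN dbo.DimCustomer  AS d_cust ON d_cust.CustomerId   = s.CustomerId
        JOIN dbo.DimDate      AS d_date ON d_date.CalendarDate = s.SaleDate
        WHERE s.SalesAmount IS NOT NULL;   -- filtering step done in SQL
    END;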
I created a pipeline in ADF to perform a Copy activity. My source database is Azure SQL Database and the sink is Azure Blob Storage. I want to execute a SQL query in ADF to delete the data from the source once it has been copied to the blob. I am not allowed to use a Copy or Lookup activity to execute the query. Is there any custom way to do this? I need to create a view and have to do some activity. Please help.
You can also use the built-in stored procedure sp_executesql, which allows you to pass an arbitrary SQL statement as a parameter.
That way you don't have to implement your own stored procedure.
See more information about this stored procedure on sp_executesql (Transact-SQL).
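For example, a sketch of how the delete could be issued through sp_executesql; the table and column names are assumptions, and the parameter values could be supplied from the calling ADF activity:

    EXEC sp_executesql
        N'DELETE FROM dbo.SourceTable WHERE LoadDate <= @cutoff;',  -- statement to run
        N'@cutoff datetime2',                                       -- parameter definition
        @cutoff = '2024-01-01';                                     -- parameter value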
If you are using mapping data flows, there is a new option to execute custom SQL scripts:
Azure Data Factory mapping data flows adds SQL scripts to sink transformation
In a regular pipeline, you probably have to resort to using the Stored Procedure activity:
Transform data by using the SQL Server Stored Procedure activity in Azure Data Factory
You would have to write the delete logic in the SP, and then invoke the SP from Data Factory.
You can write a stored procedure that deletes the data from the source table and call it in a Stored Procedure activity after the Copy activity.
Your pipeline will look like:
COPY ACTIVITY -----> STORED PROCEDURE ACTIVITY
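A minimal sketch of such a delete procedure, assuming the source table is dbo.SourceTable and a LoadDate column marks the rows that were copied (both names are assumptions):

    CREATE OR ALTER PROCEDURE dbo.PurgeCopiedRows
        @CutoffDate datetime2
    AS
    BEGIN
        SET NOCOUNT ON;
        -- Remove the rows the preceding Copy activity already moved to Blob Storage
        DELETE FROM dbo.SourceTable
        WHERE LoadDate <= @CutoffDate;
    END;

The Stored Procedure activity can pass @CutoffDate as a parameter, for example the pipeline's window start time.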
They have rolled out the Script activity.
The Script activity can be used for the following purposes:
Truncate a table or view in preparation for inserting data.
Create, alter, and drop database objects such as tables and views.
Re-create fact and dimension tables before loading data into them.
Run stored procedures. If the SQL statement invokes a stored procedure that returns results from a temporary table, use the WITH RESULT SETS option to define metadata for the result set.
Save the rowset returned from a query as activity output for downstream consumption.
The Script activity is found under the General tab of Activities.
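As an illustration, the kind of statements one might paste into a Script activity for this scenario; the object names and result-set columns are hypothetical:

    -- Clear the target before reloading it
    TRUNCATE TABLE dbo.StagingOrders;

    -- Run a stored procedure; WITH RESULT SETS describes the shape of the
    -- rowset it returns from a temporary table
    EXEC dbo.LoadOrders
    WITH RESULT SETS
    (
        (OrderId int, LoadStatus varchar(20))
    );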
Ref 1: https://learn.microsoft.com/en-us/azure/data-factory/transform-data-using-script
Ref 2: https://techcommunity.microsoft.com/t5/azure-data-factory-blog/execute-sql-statements-using-the-new-script-activity-in-azure/ba-p/3239969
I am using Talend for the first time and have a question: I am loading data into a staging table and then running an ETL job from the staging table to the target database table.
The staging table has a status column and an error column that indicate the outcome of the database operation (DONE or FAIL) and, if the record FAILED, what the error was (for log generation). These need to be updated via a stored procedure.
After my database operation is done, how can I access the staging table's primary key? This field is not mapped from the staging table to the actual table.
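For reference, a sketch of the status-update procedure the job would call once the staging primary key is available, assuming a SQL Server-style database; every name here is a placeholder:

    CREATE OR ALTER PROCEDURE dbo.UpdateStagingStatus
        @StagingId int,
        @Status    varchar(10),          -- 'DONE' or 'FAIL'
        @Error     varchar(4000) = NULL  -- error text when the record failed
    AS
    BEGIN
        SET NOCOUNT ON;
        -- Mark the staging row identified by its primary key
        UPDATE dbo.StagingTable
        SET    Status       = @Status,
               ErrorMessage = @Error
        WHERE  StagingId = @StagingId;
    END;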