SSIS package - Automate monthly refreshed data - sql-server

Can someone help me automate an SSIS package with the following criteria?
Pull data from a website URL. The website uploads new Excel sheets with data every month.
The data in the Excel sheets needs to be updated monthly in our SQL Server table.
We only need the latest data, so every month we can truncate & refill or do a delta.
Background info:
There is already a table in our DEV environment that is populated monthly, but the data in it is incorrect and inconsistent. Our company does not manage the DB, so we have no control over this. That's why I need to create our own table within the DB so we can efficiently load the correct data.
So far these are the things that I have done to test it out:
Created the package with a Data Flow Task in VS 2019: Excel Source -> Data Conversion -> OLE DB Destination.
The package ran perfectly fine, with the exception of one column whose data type did not convert correctly. (I will work on this issue.)
These are the things I need to do to successfully execute this requirement:
Automate the execution of the SSIS package by creating a SQL Server Agent job that will run monthly.
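For reference, a minimal sketch of such an Agent job, assuming the package is run from a file via dtexec; the job name, schedule, and package path below are placeholders:

USE msdb;
GO
-- Create the job and a single step that runs the package with dtexec.
EXEC dbo.sp_add_job @job_name = N'MonthlyExcelLoad';
EXEC dbo.sp_add_jobstep
    @job_name  = N'MonthlyExcelLoad',
    @step_name = N'Run SSIS package',
    @subsystem = N'CmdExec',
    @command   = N'dtexec /FILE "C:\SSIS\MonthlyExcelLoad.dtsx"';
-- Run at 02:00 on the 1st of every month.
EXEC dbo.sp_add_schedule
    @schedule_name          = N'Monthly - 1st at 2 AM',
    @freq_type              = 16,   -- monthly
    @freq_interval          = 1,    -- day 1 of the month
    @freq_recurrence_factor = 1,    -- every month
    @active_start_time      = 020000;
EXEC dbo.sp_attach_schedule @job_name = N'MonthlyExcelLoad', @schedule_name = N'Monthly - 1st at 2 AM';
EXEC dbo.sp_add_jobserver @job_name = N'MonthlyExcelLoad';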
My questions:
How can I automate the process of pulling the data from the official source (Excel sheets) to our DEV environment on a monthly basis? Or do I have to do this manually?
How can I track the changes that are made to the table each month? I know I have these options but am not sure which one is better (a rough sketch of one option follows the list):
Configure CDC
Configure change tracking
DDL Triggers
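For reference, a rough sketch of what enabling change tracking (the second option) might look like; the database, table, and key column names are placeholders, and change tracking requires a primary key on the table:

-- Enable change tracking at the database level, then on the table.
ALTER DATABASE StagingDB
SET CHANGE_TRACKING = ON
(CHANGE_RETENTION = 35 DAYS, AUTO_CLEANUP = ON);

ALTER TABLE dbo.MonthlyData
ENABLE CHANGE_TRACKING
WITH (TRACK_COLUMNS_UPDATED = ON);

-- After each monthly load, read what changed since the previously saved version.
DECLARE @last_sync_version bigint = 0;  -- normally persisted from the prior run
SELECT ct.SYS_CHANGE_OPERATION, ct.SYS_CHANGE_VERSION, d.*
FROM CHANGETABLE(CHANGES dbo.MonthlyData, @last_sync_version) AS ct
LEFT JOIN dbo.MonthlyData AS d
    ON d.Id = ct.Id;  -- Id is the assumed primary key column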
Any help and best practices will be much appreciated.

How can I automate the process of pulling the data from the official source (Excel sheets) to our DEV environment on a monthly basis? Or do I have to do this manually?
Use a Script Task. It takes very little code:
// Download the monthly Excel file to a local path (System.Net.WebClient is available in a Script Task).
System.Net.WebClient wc = new System.Net.WebClient();
wc.DownloadFile("URL of file", "where you want the file to go");
This assumes you can get out of your dev environment.

Related

How to implement this specific action in my SSIS package?

I have created an SSIS package for my database running on SQL Server 2014.
This SSIS package extracts data from an Excel workbook (residing in a shared folder on my Windows Server) and feeds a table in my database. I have tested the package (through a SQL Job) and it works fine.
The purpose of this package is to pull data from that Excel workbook on a daily basis (the workbook itself is updated daily). The workbook has a date column, so updated data is tagged with the relevant date.
Since this SQL job will be executed on a daily basis, the table will be filled with duplicates (that is, data from Day 1 will be repeated on Day 2 and so on).
How can I deal with this issue? I was thinking of adding a T-SQL step to my SSIS package to delete the contents of the table before it pulls the data from the Excel workbook. I guess this would work, but I am on the lookout for a more elegant solution.
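For reference, the delete-first step I had in mind would be run in an Execute SQL Task placed before the Data Flow Task; dbo.DailyImport and LoadDate are placeholder names:

-- Clear the table before the Data Flow re-imports the workbook.
TRUNCATE TABLE dbo.DailyImport;
-- Or, since the workbook has a date column, delete only today's rows to avoid
-- duplicates while keeping history (assumes the column is named LoadDate):
DELETE FROM dbo.DailyImport WHERE LoadDate = CAST(GETDATE() AS date);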
Any help would be appreciated.

SSIS SQL SAP HANA DB (ODBC)

Hi, I am using SSIS (MSSQL) to copy data between multiple tables. This had been working fine until recently; now the SAP team keeps updating the schema of the tables without telling me.
I have multiple tables that they continue to add columns to; this in turn makes my SSIS job of copying the data across fail.
Is there a way in SSIS that I can look at the source table and adjust my destination table to reflect the changes on the fly?
I'm quite new to SSIS and don't mind running a script outside the GUI, but wondered if this was an option within the GUI I'm already familiar with.
So, in short: can I allow in SSIS for new columns being added to source tables, and have my destination tables updated automatically so my jobs stop failing
(and map source to destination tables automatically)?
You'll have to include the new columns in the data flow yourself, i.e. in both the source and the destination (include and map them). So basically you CANNOT automate what you're looking for in SSIS. Hope this helps.
Look into Biml (BimlScript), which lets you create and execute SSIS packages dynamically based on the metadata available at run time.

Create reusable controls/packages in one .dtsx package in SSIS?

We use SSIS for our data automation. The caveat is we don't use the normal way mentioned online. For our environment, we save the Package.dtsx file on a server that has a windows job that will execute it using dtexec.exe.
I have multiple SSIS packages to pull data from various sources (Oracle, MySQL, SQL Server), and the general flow for them is the same. The table names differ, but I will use data as the table name for one of the sources/SSIS packages.
backup the table data into bak_data on the destination DB
import new data from the source into data
compare data quality (row counts) between data and bak_data (a sketch of steps 3-5 follows this list)
if the data quality meets our threshold, send a success e-mail (an Execute SQL Task against our destination DB using sp_send_dbmail)
if the data quality does not meet our threshold, back up data to bad_data, then restore from bak_data to data and send a failure e-mail
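For illustration, steps 3-5 look roughly like this in T-SQL; the threshold, Database Mail profile, and recipients are placeholders:

-- Step 3: compare row counts between the fresh load and the backup.
DECLARE @new_rows int = (SELECT COUNT(*) FROM dbo.data);
DECLARE @old_rows int = (SELECT COUNT(*) FROM dbo.bak_data);

IF @new_rows >= @old_rows * 0.9   -- illustrative threshold: allow at most a 10% drop
BEGIN
    -- Step 4: success e-mail.
    EXEC msdb.dbo.sp_send_dbmail
        @profile_name = N'Automation',
        @recipients   = N'team@example.com',
        @subject      = N'data load succeeded',
        @body         = N'Row counts are within the expected threshold.';
END
ELSE
BEGIN
    -- Step 5: keep the bad load for inspection, restore the backup, and report the failure.
    IF OBJECT_ID(N'dbo.bad_data') IS NOT NULL DROP TABLE dbo.bad_data;
    SELECT * INTO dbo.bad_data FROM dbo.data;
    TRUNCATE TABLE dbo.data;
    INSERT INTO dbo.data SELECT * FROM dbo.bak_data;
    EXEC msdb.dbo.sp_send_dbmail
        @profile_name = N'Automation',
        @recipients   = N'team@example.com',
        @subject      = N'data load failed',
        @body         = N'Row counts fell below the threshold; previous data restored.';
END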
Since the steps are always the same I thought I could use Control Flow Package Parts and then just use variables for the table names and what not.
But upon further investigation I realized I may not be able to do that, because the Control Flow Package Part (.dtsxp) is a separate file referenced from the Package.dtsx file.
I can copy it to our automation server but not sure if that will be enough to work when Package.dtsx is executed using dtexec.
Is there any way I can create reusable controls/packages given my constraint/situation?

Update SQL Server (2014) table from an Excel email attachment

We track IPs that attack our site. On the first attack, we temporarily block them. If they ever attack again, we permanently blacklist them. Information for each attack by each IP is stored in perpetuity. Twice daily, reports with an Excel spreadsheet containing all pertinent information are emailed to various people, and the information is then manually added to a massive spreadsheet. We've recently spun up a new box with SQL Server, and I've added all of the existing information to a table in the new database.
As I'm new to this, I would like to know if there is a way to send the daily spreadsheets to this new SQL Server and have it parse the Excel attachment and update our master tracking table. The spreadsheet will always have the same structure (15 columns plus header and footer rows) with varying row counts, and of course it matches the existing table structure.
I've been googling it and am only able to find queries (ba dum tish) on how to make SQL export to Excel and send an email with Database Mail. I can't find anything on sending an email TO SQL Server and having it process an attachment.
You can make use of SQL Server Integration Services (SSIS). Write an SSIS package that imports the data from the given Excel spreadsheet into a staging table, and
then write insert or update statements from that table to your production table. Use a Data Flow Task to import the data from the Excel file, then an Execute SQL Task to update the values in the production table. Remember that you will have to keep the Excel file in the same folder at all times (or use SSIS variables to resolve the file name dynamically). Once you have completed the package, you can schedule it as a SQL Server Agent job that runs periodically, so the data is updated automatically.
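As a rough illustration of that Execute SQL Task (all table and column names here are placeholders; adjust to the real 15-column structure):

-- Upsert from the staging table loaded by the Data Flow into the master tracking table.
MERGE dbo.Attack_Master AS tgt
USING dbo.Attack_Staging AS src
    ON  tgt.IPAddress  = src.IPAddress
    AND tgt.AttackDate = src.AttackDate
WHEN MATCHED THEN
    UPDATE SET tgt.AttackCount = src.AttackCount   -- plus any other tracked columns
WHEN NOT MATCHED BY TARGET THEN
    INSERT (IPAddress, AttackDate, AttackCount)
    VALUES (src.IPAddress, src.AttackDate, src.AttackCount);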
Please refer to this video for a basic idea about SSIS:
Import Data From Excel to SQL Server Using SSIS
Twice daily, reports with an Excel spreadsheet with all pertinent information is emailed to various people,
Try saving the file to a location and then use the SSMS Import/Export Wizard. The resulting package can be saved and set to run daily.
Here is a step-by-step tutorial covering the same:
https://www.mssqltips.com/sqlservertutorial/203/simple-way-to-import-data-into-sql-server/

Convert or output SSIS package/job to SQL script?

I understand this may be a little far-fetched, but is there a way to take an existing SSIS package and get the job it's doing output as T-SQL? I mean, that's basically what it is, right? Transferring data from one database to another can be done with T-SQL as well.
I'm wondering this because I'm trying to get away from using SSIS packages for data transfer and instead use EF/LINQ to do this on the fly in my application. My thought process is that currently I have an SSIS package that transfers and formats data from one database to another in preparation for export to Excel. This SSIS package runs nightly and helps speed up the generation of the Excel file, since once the data is transferred to the second DB it's already formatted correctly.
However, if I could leverage EF and maybe some LINQ to SQL to format the data from the first database on the fly and export it to Excel quickly, without having to use this second DB, that would be great. So, can my original question be done: can I extract the T-SQL representation of an SSIS package somehow?
SSIS packages are not exclusively T-SQL. They can consist of custom back-end code, file system changes, Office document creation steps, etc., to name only a few. As a result, generating the entirety of an SSIS package's work as T-SQL isn't possible, because the full breadth of its work isn't limited to SQL Server.
