Passing filename from SSIS script task - sql-server

I'm attempting to create an SSIS package that loads a flat file into a SQL Server table.
I've been able to piece the loading functionality together. I'm currently stuck on passing the filename, if one is found, from the script task back to a variable that I'd like to use in the flat file connection string.
Public Sub Main()
'
Dim di As DirectoryInfo = New DirectoryInfo("\\winshare\iFile\Cors2\AAA\AAA Employee Incentive Source Data\")
Dim fi As FileInfo() = di.GetFiles("AAA Full PreReg Report*.csv")
If fi.Length > 0 Then
Dts.Variables("User::fileExists").Value = True
Dts.Variables("User::FileName").Value = fi.name
Else
Dts.Variables("User::fileExists").Value = False
End If
' Add your code here
'
Dts.TaskResult = ScriptResults.Success
End Sub
I'm seeking help with
Dts.Variables("User::FileName").Value = fi.name
Why won't this work?
Thanks

GetFiles returns an array of FileInfo objects, and the array itself has no Name property, so you have to index into it. If you are looking to get the first file in the directory, you can use the following line of code:
Dts.Variables("User::FileName").Value = fi(0).Name
But if you are looking to loop over files, I recommend using the Foreach Loop Container to loop over the files and store each file name in a variable:
SSIS - How to loop through files in folder and get path+file names and finally execute stored Procedure with parameter as Path + Filename
FAQ - How to loop through files in a specified folder, load one by one and move to archive folder using SSIS
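The same "pick one element before reading its name" logic can be sketched outside SSIS. The example below is Python rather than the VB.NET of the Script Task, and find_first_match is a hypothetical helper, not part of any SSIS API. It returns the two values the Script Task writes to User::fileExists and User::FileName.

```python
from pathlib import Path
from typing import Optional, Tuple


def find_first_match(folder: str, pattern: str) -> Tuple[bool, Optional[str]]:
    """Return (file_exists, file_name) for the first file matching pattern.

    Mirrors the Script Task logic: the file search returns a collection,
    so a single element must be picked before its name can be read.
    """
    # sorted() makes the "first" file deterministic (alphabetical order).
    matches = sorted(p for p in Path(folder).glob(pattern) if p.is_file())
    if matches:
        return True, matches[0].name  # equivalent of fi(0).Name
    return False, None
```

Sorting is a design choice of this sketch; the order in which GetFiles returns files is not something the original answer relies on.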

Related

Import 2 Excel Files via SSIS with different sheet names

So as the title suggests, I need to import 2 Excel (.xlsx) files from my local machine (c:\temp) into one SQL Server table. Each of the files contains only one sheet, but the sheet names will differ. The column names and the number of columns are identical in both files.
If I select one specific excel file through SSIS via Excel Connection Manager, it extracts the data perfectly and inserts it into my destination SQL table.
The problem comes in when I add a ForEach Loop Container and want to loop through the c:\temp directory to read the 2 files. Somewhere I am missing a setting and keep getting various "connect to Excel" errors.
Please assist with the following:
I am unsure how to specify the Excel file path. Is the below correct? I used to select the exact file here when loading only 1 file:
Then it seems I need to create variables, so I did below:
Then I am not sure if I should add an expression to my ForEach loop and which mappings would be correct?
And lastly, I am not sure whether to put the filename or sheetname as variable below. I tried the filepath, but get the following error:
Please help as I am totally lost with this.
UPDATE
OK, I have now done the following:
Added a SheetName variable (which I think the Value is maybe incorrect). I am trying to tell it to only read the first sheet.
Then my Excel connection string looks like this:
Provider=Microsoft.ACE.OLEDB.12.0;Data Source=;Extended Properties="EXCEL 12.0 XML;HDR=NO";
My ForEach loop:
And my Excel source:
I get the following error:
[Book 2] Error: Opening a rowset for "Sheet1$" failed. Check that the object exists in the database.
It seems like your biggest issue is getting the sheet name, which can vary, and the only way I know how to do this is with a script task.
Your Foreach loop should store the file path of each Excel file in a variable. Inside the loop, add a script task before you enter the data flow.
First of all, start by knowing your connection string (I use this site for help: https://www.connectionstrings.com/excel/)
Set SheetName as the script task's read/write variable and FilePath as its read-only variable.
Code:
// Connection string for the OLE DB connection; FilePath is filled by the Foreach loop
var cstr = string.Format(@"Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};Extended Properties=""Excel 12.0 Xml;HDR=YES"";"
    , Dts.Variables["FilePath"].Value.ToString());
using (var conn = new System.Data.OleDb.OleDbConnection(cstr))
{
    conn.Open();
    // One row per table (worksheet or named range) in the workbook
    var sheets = conn.GetOleDbSchemaTable(System.Data.OleDb.OleDbSchemaGuid.Tables, null);
    // Since there is only 1 for sure.
    Dts.Variables["SheetName"].Value = sheets.Rows[0]["TABLE_NAME"].ToString();
}
Now you have the sheet name in a variable (it already includes the trailing $ that you need), so set up another variable called SQL and define it with the expression "Select * from [" + @[User::SheetName] + "]".
Now use the variable SQL in your Data Flow source by setting the Excel Source data access mode to "SQL command from variable".

Export password protected xlsx file and e-mail it

Currently I'm working with an SSIS package that executes a stored procedure and generates an .xlsx file with the results of the query.
What I need to do is encrypt the .xlsx file. It could be done either by encrypting the file after it has been populated by the SSIS package, or by putting a password on the .xlsx file beforehand, opening it (reading the password-protected file) and exporting data to it.
*I know that password-protected files are not super safe, but for this case I only need it to be password protected for compliance.
I was investigating SSIS and I believe I can do it with a PowerShell script run from an "Execute Process Task" in SSIS; please correct me if I'm wrong on this.
Update: I'm executing with an "Execute Process Task" a PowerShell script (script.ps1):
Set objExcel = CreateObject("Excel.Application")
objExcel.Visible = True
objExcel.DisplayAlerts = False
Set objWorkbook = objExcel.Workbooks.Add
Set objWorksheet = objWorkbook.Worksheets(1)
objWorksheet.Cells(1, 1).Value = Now
objWorkbook.SaveAs "C:\Scripts\Test.xlsx",,"Password123"
objExcel.Quit
However, here I don't know how to point to the Excel file I created with the package in order to password protect it. Am I missing something?
This is what my package design looks like in SSIS:
And this is the detail of the "Execute Process Task" called "Lock excel file generated":
*This comes from source: https://techcommunity.microsoft.com/t5/SQL-Server-Integration-Services/Run-PowerShell-scripts-in-SSIS/ba-p/388340
You can create encrypted Excel spreadsheets in PowerShell.
You'd create a scheduled task to fire off a PowerShell script, along the lines of
$excel = New-Object -ComObject Excel.Application
$excel.Visible = $true
$excel.DisplayAlerts = $false
$workbook = $excel.Workbooks.Add()
$worksheet = $workbook.Worksheets.Item(1)
# Whatever you do to populate the workbook
$filename = [System.IO.Path]::GetRandomFileName()
# 51 = xlOpenXMLWorkbook (.xlsx); the third argument is the password to open
$workbook.SaveAs($filename, 51, "%reTG54w")
$excel.Quit()

Filepicker VBA to select file and store the file name not working

I am trying to run the following in order to get the name of the file that the user selects. The file is an .mdf file that was previously attached to a SQL Server instance. But when I run it, a dialog appears saying I don't have permission to open the file. I know it is because the file is in use by SQL Server, because if I don't attach it to the server the code runs without a problem.
The thing is that I need the .mdf attached in SQL Server before running the VBA code, and I just need the file name. Is there a way to store the file name without "opening" it?
Function GetDB() As String
Dim db As Office.FileDialog
Dim fileName As String
Set db = Application.FileDialog(msoFileDialogFilePicker)
With db
.Title = "Select a Database"
.AllowMultiSelect = False
.InitialFileName = Application.DefaultFilePath
Application.DisplayAlerts = False
If .Show = True Then
fileName = Mid(.SelectedItems(1), InStrRev(.SelectedItems(1), "\") + 1)
End If
End With
End Function
Replace
sItem = .SelectedItems(1)
with:
GetDB = .SelectedItems(1)
I ended up setting up an ADODB connection to get the databases directly from the server, which avoids the "The file is in use" issue.

SSIS: import MAX(filename) from folder

I need to pick the one .csv file from \\\Share\Folder\ with the max filename for further import to SQL. The file name is alphanumeric, e.g. ABC_DE_FGHIJKL_MNO_PQRST_U-1234567.csv, where the numeric part will vary, but I need only the max one each time the package runs.
Constraints: no write access on that SQL server, I use ##Temp table for import and this is the least desirable method for filename processing (no for each loops on this server).
Ideally it will be function/expr-based variable (combined with script task if needed) to pass into connection manager. Any ideas much appreciated.
Use a Script Task:
Add a variable of type String named User::CsvFile.
Add a script task to your project and add the variable you created as a ReadWriteVariable.
In your script task, write the following code (VB.NET).
You have to import the System.Linq namespace (Imports System.Linq).
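Public Sub Main()
    Dim strDirectory As String = "C:\New Folder" ' Enter the directory
    Dim strFile As String = String.Empty
    ' With a fixed prefix, a longer name means more digits and so a larger number;
    ' ThenBy breaks ties between names of equal length alphabetically, and Last() takes the max
    strFile = IO.Directory.GetFiles(strDirectory, "*.csv", IO.SearchOption.TopDirectoryOnly).OrderBy(Function(x) x.Length).ThenBy(Function(x) x).Last()
    Dts.Variables.Item("CsvFile").Value = strFile
    Dts.TaskResult = ScriptResults.Success
End Sub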
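Then use this variable for the Flat File Source by setting the ConnectionString property of the Flat File Connection Manager to it with an expression.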
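As a cross-check of the selection rule, here is a minimal sketch in Python (not the VB.NET of the Script Task) that compares the numeric part directly instead of relying on name length. It assumes the number to compare is the run of digits immediately before ".csv"; pick_max_csv is a hypothetical helper name.

```python
import re
from pathlib import Path
from typing import Optional

# Assumption: the number to compare is the run of digits right before ".csv".
_SUFFIX = re.compile(r"(\d+)\.csv$", re.IGNORECASE)


def pick_max_csv(folder: str) -> Optional[str]:
    """Return the full path of the .csv file with the largest numeric suffix."""
    best_path, best_number = None, -1
    for path in Path(folder).glob("*.csv"):
        match = _SUFFIX.search(path.name)
        # Files without a numeric suffix are ignored.
        if match and int(match.group(1)) > best_number:
            best_path, best_number = str(path), int(match.group(1))
    return best_path
```

Comparing parsed integers avoids the case where a plain string comparison would rank "-999" above "-1000".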

SSIS Excel Import - Worksheet variable OR wildcard?

I have a SSIS data import package that uses a source Excel spreadsheet and then imports data into a SQL Server database table. I have been unsuccessful in automating this process because the Excel file's worksheet name is changed every day. So, I have had to manually change the worksheet name before running the import each day. As a caveat, there will never be any other worksheets.
Can I make a variable for the worksheet name?
Can I use a wildcard character rather than the worksheet name?
Would I be better off creating an Excel macro or similar to change the worksheet name before launching the import job?
I use the following script task (C#):
System.Data.OleDb.OleDbConnection objConn;
DataTable dt;
string connStr = ""; //Use the same connection string that you have in your package
objConn = new System.Data.OleDb.OleDbConnection(connStr);
objConn.Open();
dt = objConn.GetOleDbSchemaTable(System.Data.OleDb.OleDbSchemaGuid.Tables, null);
objConn.Close();
foreach (DataRow r in dt.Rows)
{
    //for some reason there is always a duplicate sheet with underscore.
    string t = r["TABLE_NAME"].ToString();
    //Note if more than one sheet exist this will only capture the last one
    if (t.Substring(t.Length - 1) != "_")
    {
        Dts.Variables["YourVariable"].Value = t;
    }
}
And then in SSIS, I add another variable to build my SQL.
The new variable uses the expression "Select * from [" + @[User::YourVariable] + "]"
Finally set your datasource to that SQL variable in Excel Source.
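The filtering step can be sketched in isolation. The Python below is only an illustration of the logic, not SSIS code. It assumes worksheets are reported as Name$ (or 'Name with spaces$' when quoted) and that anything else, such as the underscore duplicate mentioned above, should be skipped. Both function names are hypothetical.

```python
from typing import List


def worksheet_names(table_names: List[str]) -> List[str]:
    """Keep only entries that look like worksheets.

    Assumption: worksheets appear as `Name$`, or `'Name with spaces$'`
    when the name is quoted; other entries (named ranges, duplicates
    ending in `_`) are dropped.
    """
    return [n for n in table_names if n.endswith("$") or n.endswith("$'")]


def build_select(sheet_name: str) -> str:
    """Wrap the sheet name in brackets, as the SQL variable expression does."""
    return "SELECT * FROM [" + sheet_name + "]"
```

For example, worksheet_names(["Report$", "Report$_"]) keeps only "Report$", which build_select turns into the statement used by the Excel Source.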
This works perfectly for me with the same scenario, in case it helps you or someone else:
Two package-level string variables are required:
varDirectoryList - You will use this inside SSIS for each loop variable mapping
varWorkSheet - This will hold your changing worksheet name. Since you only have 1, it's perfect.
Set up:
a. Add an SSIS For Each Loop.
b. Excel Connection Manager (connect to the first workbook while you test; at the end, go to its properties and, under Expressions, set "Excel File Path" to your varDirectoryList. Set DelayValidation to True on it as well as on your Excel Source task. This will help it go through each workbook in your folder.)
c. Inside your For Each Loop add a C# Script Task; title it "Get changing worksheet name into variable" or whatever you prefer.
d. Add a Data Flow Task with your Excel Source to SQL Table Destination.
In your Script Task add this code:
using System;
using System.Data;
using System.Data.OleDb;
using System.Windows.Forms;
public void Main()
{
    // store file name passed into Script Task
    string WorkbookFileName = Dts.Variables["User::varDirectoryList"].Value.ToString();
    // setup connection string
    string connStr = String.Format("Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};Extended Properties=\"EXCEL 12.0;HDR=Yes;IMEX=1;\"", WorkbookFileName);
    // setup connection to Workbook
    using (var conn = new OleDbConnection(connStr))
    {
        try
        {
            // connect to Workbook
            conn.Open();
            // get Workbook schema
            using (DataTable worksheets = conn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null))
            {
                // in ReadWrite variable passed into Script Task, store third column in the first
                // row of the DataTable which contains the name of the first Worksheet
                Dts.Variables["User::varWorkSheet"].Value = worksheets.Rows[0][2].ToString();
                // Shows the first worksheet name of the Excel file. For testing purposes; comment out afterwards.
                MessageBox.Show(Dts.Variables["User::varWorkSheet"].Value.ToString());
            }
        }
        catch (Exception)
        {
            // rethrow so the task fails visibly
            throw;
        }
    }
    // report success to the SSIS runtime
    Dts.TaskResult = (int)ScriptResults.Success;
}
After you have this set up and run it, a message box will display the worksheet name of each workbook.
If you are using the Excel Source SQL command, you will need a third string variable, e.g. varExcelSQL, with an expression like "SELECT columns FROM [" + @[User::varWorkSheet] + "]", which will dynamically change to match each workbook. The name returned by the script already ends in $, so do not append another one. You may or may not need single quotes around the name; change as needed in varExcelSQL.
If you are not using an Excel Source SQL command and are loading straight from the table, go into Excel Source Properties --> AccessMode --> OpenRowSet from Variable --> select varWorkSheet.
That should take care of it, as long as the column structures remain the same.
If you happen to get files with mixed data types in one column, you can use IMEX=1 in your connection string, which forces those columns to be imported as text (DT_WSTR).
Hope this helps :-)
If you are using SSIS to import the sheet, you could use a script task to find the name of the sheet and then change the name, or do whatever else you need in order to make it fit the rest of your import. Here is an example of finding the sheet, which I found here
Dim excel As New Microsoft.Office.Interop.Excel.ApplicationClass
Dim wBook As Microsoft.Office.Interop.Excel.Workbook
Dim wSheet As Microsoft.Office.Interop.Excel.Worksheet
wBook = excel.Workbooks.Open
wSheet = wBook.ActiveSheet()
For Each wSheet In wBook.Sheets
MsgBox(wSheet.Name)
Next
The MsgBox line is where you could change the name or report it back for another process.
