Apply VBA code to Excel file from SSIS - sql-server

Good evening everyone.
I have to build an SSIS package that does the following:
1) Run VBA code against an XLS file (transpose one range into another range)
2) Save the XLS (in the same file or as a new file)
3) Import the modified XLS from the transposed range.
Basically, I have to transpose the data inside an XLS file that I must import, and I haven't found a good way to do that in SSIS (since the column range can change between files).
With this simple VBA script I can do that and make SSIS read the data in a very straightforward way. However, I can't find a way to apply the code without first opening the Excel file manually to add and run the script. I want to automate this so that the package prepares the XLS, extracts the new data, and saves it to a table.
Can anyone share some ideas on how to apply this code, or other ways to do this? The most important point, I think, is that I want to transpose a very specific range.
Sub myTranspose()
    With Range("a18:ZZ27", Range("a18:ZZ27").End(xlDown))
        .Copy
        Range("a30").PasteSpecial Transpose:=True
    End With
End Sub

Create a Script Task that is piped into a Data Flow task
Edit the Script Task by double clicking the Script Task and clicking the Edit Script button.
Add references to Excel and CSharp as seen in this answer
Add some code similar to the following:
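public void Main()
{
    string filepath = @"c:\temp\transpose.xlsx";
    Excel.Application xlApp = null;
    Excel.Workbook oWB = null;
    try
    {
        xlApp = new Excel.Application();
        xlApp.Visible = false;
        oWB = xlApp.Workbooks.Open(filepath);
        // Source range to transpose (on the active sheet)
        Excel.Range fromrng = xlApp.get_Range("B4", "F5");
        Object[,] transposedRange = (Object[,])xlApp.WorksheetFunction.Transpose(fromrng);
        // Size the destination to match the transposed array, anchored at A8
        Excel.Range to_rng = xlApp.get_Range("A8", "A8");
        to_rng = to_rng.Resize[transposedRange.GetLength(0), transposedRange.GetLength(1)];
        to_rng.Value = transposedRange;
        oWB.Save();
        Dts.TaskResult = (int)ScriptResults.Success;
    }
    catch (Exception)
    {
        // Fail the task so the Data Flow does not run against an unmodified file
        Dts.TaskResult = (int)ScriptResults.Failure;
    }
    finally
    {
        // Close the workbook (already saved) and quit Excel so no EXCEL.EXE process is left behind
        if (oWB != null) oWB.Close(false);
        if (xlApp != null) xlApp.Quit();
    }
}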
This gives the following result in the sample transpose.xlsx I created.
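To make the resize step in the code above concrete: transposing swaps the row and column counts, so the 2-row by 5-column source range B4:F5 needs a 5-row by 2-column destination starting at A8. The Python sketch below only illustrates that shape change. It uses plain lists in place of a worksheet, with cell addresses as placeholder values, and does not touch Excel.

```python
def transpose(rows):
    """Swap rows and columns of a rectangular list of lists."""
    return [list(column) for column in zip(*rows)]


# Placeholder values: each entry is simply the address of the cell it came from.
source = [
    ["B4", "C4", "D4", "E4", "F4"],
    ["B5", "C5", "D5", "E5", "F5"],
]

result = transpose(source)
print(len(result), len(result[0]))  # 5 2
print(result[0])                    # ['B4', 'B5']
```

The first transposed row holds what used to be the first column, which is why the destination range has to be resized before the values are written.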

Related

SSIS Extract links from Excel cells to load into SQL

The problem:
I have an SSIS package that loops through 100+ Excel files and reads the data, then copies the contents over to a SQL Server Table. In these Excel files, this one column has hyperlinks. The column text itself says something like DSH-LN-4, but clicking on it in Excel opens up a folder that contains some images. How do I copy the underlying link in this column rather than the actual text in the cells?
What have I tried so far:
I haven't really tried anything because I found absolutely no resources on how to do this in SSIS. Manually adding a column to the Excel files is NOT possible, since there are 100's of files. The only resource I found was in this SO Question, but this does not indicate the process of doing this without manually manipulating the Excel files.
What I would like:
In my ForEach loop container, I have a data flow task that gets the Excel contents and shoves it into the SQL Table. The column that contains hyperlinks is called PhotoReference (since these hyperlinks open the folder that has the photos). I would like this PhotoReference column to copy over the underlying hyperlink of the cell and add that to the SQL column.
For instance, I want the PhotoReference column to contain this:
www.companyname.box.com/asjdfbgkjb134kjbsdafo2bm21n4bk
If I can manage to do this, my Power BI report running off of this underlying data could contain a clickable text that would open the image directly.
Any help would be appreciated.
UPDATE:
I was able to try two different methods to extract the hyperlinks from my column, but each has its own issues:
Method 1: I added a Script Task to my ForEach container and, as I loop through each Excel file, used the Hyperlinks collection from the Microsoft.Office.Interop.Excel assembly to get the hyperlink from my Excel column. BUT I don't know what to do with it afterwards. I figured the only thing to do is to overwrite the Excel column's content with the extracted hyperlink, but I would really rather not change my Excel files in any way.
Method 2: I added a Script Component inside my data flow task, between my Excel source and SQL destination. With this method I could not get nearly as far, because the auto-generated Input0_ProcessInputRow method takes an argument Row of type Input0Buffer, and I cannot apply any Microsoft.Office.Interop.Excel properties to an Input0Buffer object. So I am stuck.
If you have the right to alter the Excel files, you can simply add a Script Task before the data flow task to replace the URL column value with the hyperlink.
In this answer, I will provide a step-by-step solution to solve this problem:
Creating Excel samples
First of all, I created some Excel files with the following columns:
First name (text)
Last name (text)
Age (number)
Photo (hyperlink)
The file content looks like the following:
Creating the SSIS package
First of all, you must add an Excel connection manager that links to one of the Excel files you need to import, and an OLE DB connection manager to connect to the SQL Server instance.
You must add an SSIS variable of type String to store the Excel file path when using the Foreach enumerator.
Add a Foreach loop container and configure it to loop over the Excel files as mentioned in the images below:
Within the Foreach Loop container add a Script Task and a Data flow task as mentioned in the image below:
Now, Open the data flow task and add an Excel source and an OLE DB destination and configure the columns mapping between them.
Open the Script Task configuration, and select the ExcelFilePath variable (created in step 2) as a readonly variable as mentioned in the image below:
Now, open the Script editor and in the solution explorer window, right-click on the references icon and click on "Add Reference..."
When the Add reference catalog appears, click on the COM tab, and search for Excel, then you should select the Excel Object Library from the results as shown in the following image:
Also, make sure to add Microsoft.CSharp.dll reference.
At the top of the script, add the following lines:
using Excel = Microsoft.Office.Interop.Excel;
using System.Runtime.InteropServices;
In the Main() function add the following lines:
Excel.Application excel = new Excel.Application();
excel.Visible = false;
excel.DisplayAlerts = false;
string originalPath = Dts.Variables["User::ExcelFilePath"].Value.ToString();
Excel.Workbook workbook = excel.Workbooks.Open(originalPath);
Excel.Worksheet worksheet = (Excel.Worksheet)workbook.Worksheets[1];
Excel.Range usedRange = worksheet.UsedRange;
int intURLColidx = 0;
// Find the index of the column whose header (row 1) is "Photo"
for (int i = 1; i <= usedRange.Columns.Count; i++)
{
    if ((worksheet.Cells[1, i] as Excel.Range).Value != null &&
        (string)(worksheet.Cells[1, i] as Excel.Range).Value == "Photo")
    {
        intURLColidx = i;
        break;
    }
}
// Replace each cell's display text with its hyperlink address.
// The loop is skipped when no "Photo" column was found (index 0 is not a valid column).
for (int i = 2; intURLColidx > 0 && i <= usedRange.Rows.Count; i++)
{
    Excel.Range cell = worksheet.Cells[i, intURLColidx] as Excel.Range;
    if (cell.Hyperlinks.Count > 0 && !string.IsNullOrEmpty(cell.Hyperlinks[1].Address))
    {
        cell.Value2 = cell.Hyperlinks[1].Address;
    }
}
workbook.Save();
Marshal.FinalReleaseComObject(worksheet);
workbook.Close(Type.Missing, Type.Missing, Type.Missing);
Marshal.FinalReleaseComObject(workbook);
excel.Quit();
Marshal.FinalReleaseComObject(excel);
Dts.TaskResult = (int)ScriptResults.Success;
In the lines above, we first search for the index of the column that contains the hyperlinks (in this example the column name is "Photo"). Then, for each row, if the hyperlink address is not empty, we replace the cell value with that address.
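The same two-pass logic can be sketched without Excel. The Python below is only an illustration on in-memory data: `rows` stands in for the worksheet values, and `links` is a hypothetical mapping from zero-based (row, column) positions to hyperlink addresses. Neither name comes from the Interop API, and the sample URL is made up.

```python
def replace_text_with_links(rows, links, header="Photo"):
    """Return a copy of rows where cells in the `header` column show their link address."""
    if not rows or header not in rows[0]:
        return [list(row) for row in rows]  # no such column: nothing to replace
    column = rows[0].index(header)          # pass 1: locate the column by its header
    result = [list(row) for row in rows]
    for row_index in range(1, len(result)):  # pass 2: skip the header row
        address = links.get((row_index, column))
        if address:                          # keep the text when there is no address
            result[row_index][column] = address
    return result


rows = [
    ["First name", "Photo"],
    ["Ann", "DSH-LN-4"],
    ["Bob", "no link here"],
]
links = {(1, 1): "https://example.com/photos/ann"}  # hypothetical address

print(replace_text_with_links(rows, links))
```

Rows without a hyperlink keep their original text, which mirrors the `Hyperlinks.Count > 0` check in the script.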
Finally, make sure to configure the Excel connection manager to read the file path from the created variable value (Step 2) using expressions:
Experiments
After running the package, if we open an Excel file we will see that the Cell value is replaced with the URL:
And as shown in the image below, data are imported successfully to SQL Server:
References
Missing compiler required member 'microsoft.csharp.runtimebinder.binder.convert'
Extracting a URL from hyperlinked text in Excel cell
Excel interop prevent showing password dialog
What you will probably need is some hackery involving the Excel COM API or macros, although in general you should stay away from using the Office COM API in SSIS.
You could pre-process the Excel file to extract that value with non-standard operations in SSIS, such as a Script Component.
These are the steps you need to follow to import that data using the Script component:
Drag and drop a script component and select "source" as the script option type.
By default the script language is Microsoft Visual C# 2008 and I have done this sample with Microsoft Visual Basic 2008. Change this if you need to.
Define your output columns with the correct data type in "data type properties"
Edit the script. In the IDE you should add reference:
Microsoft Excel 11.0 Object Library
(if that reference doesn't work, try Microsoft Excel 5.0 Object Library)
Finally, write some code:
Imports Microsoft.Office.Interop.Excel

' In a source Script Component, rows are produced in CreateNewOutputRows
Public Overrides Sub CreateNewOutputRows()
    Dim oExcel As Object = CreateObject("Excel.Application")
    Dim FileName As String = Variables.FileName
    Dim oBook As Object = oExcel.Workbooks.Open(FileName)
    Dim oSheet As Object = oBook.Worksheets(1)
    Output0Buffer.AddRow()
    ' Change A1 to the cell that holds your hyperlink
    Output0Buffer.Address = oSheet.Range("A1").Hyperlinks(1).Address & "#" & oSheet.Range("A1").Hyperlinks(1).SubAddress
    ' Close the workbook without saving and shut Excel down
    oBook.Close(False)
    oExcel.Quit()
End Sub
(Keep in mind that this code is only an illustration and may not run as is.)
You could see code in C# here:
C# Script in SSIS Script Task to convert Excel Column in "Text" Format to "General"
The only issue with the script method is that you need to have the Excel runtime installed.
More about script component here:
https://www.tutorialgateway.org/ssis-script-component-as-transformation/
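If the files are .xlsx rather than legacy .xls, the hyperlink targets can in principle be read without Excel installed and without modifying the files, because an .xlsx file is a ZIP archive of XML parts. The following is a minimal Python sketch of that idea using only the standard library; it is a different technique from the Interop approach above. It assumes the first worksheet is stored as `xl/worksheets/sheet1.xml`, which is the usual layout but is not guaranteed, and `read_hyperlinks` is a name chosen for this illustration.

```python
import zipfile
import xml.etree.ElementTree as ET

MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
DOC_REL_NS = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
PKG_REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"


def read_hyperlinks(xlsx_file, sheet_part="xl/worksheets/sheet1.xml"):
    """Return {cell reference: external link target} for one worksheet part."""
    folder, _, name = sheet_part.rpartition("/")
    rels_part = f"{folder}/_rels/{name}.rels"
    with zipfile.ZipFile(xlsx_file) as archive:
        if rels_part not in archive.namelist():
            return {}  # the sheet has no relationships, hence no external links
        rels_root = ET.fromstring(archive.read(rels_part))
        sheet_root = ET.fromstring(archive.read(sheet_part))
    # The relationships part maps ids such as "rId1" to the actual targets.
    targets = {
        rel.get("Id"): rel.get("Target")
        for rel in rels_root.iter(f"{{{PKG_REL_NS}}}Relationship")
    }
    links = {}
    for link in sheet_root.iter(f"{{{MAIN_NS}}}hyperlink"):
        rel_id = link.get(f"{{{DOC_REL_NS}}}id")
        if rel_id in targets:  # in-workbook links carry no relationship id
            links[link.get("ref")] = targets[rel_id]
    return links
```

Calling `read_hyperlinks(path)` on a workbook would give a dictionary keyed by cell reference (for example `"D2"`), which could then be joined to the rows read by the data flow.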

Database/Excel import and export format

We have a database program that can export and import Excel sheets. The Excel sheets it exports are formatted as text; I've noticed that you cannot change the formatting in these exports without first performing Text to Columns on the data.
When I edit an export for import or create a new import file I format the cells as text but it doesn't upload nicely. It rounds numbers and skips some data.
I would like to know how to format the import file the same as the export file but I am not familiar with what kind of format behaves the way I described. I've written a bit of VBA to create the import file from data in a workbook so applying the formatting in VBA would be ideal. Any insight would be very appreciated.
What is the 'database program'? If this is an old legacy tool, you may not have many, or any options. If it is Access, you can write some VBA to format the exported Excel report.
'This deals with Excel already being open or not
On Error Resume Next
Set xl = GetObject(, "Excel.Application")
On Error GoTo 0
If xl Is Nothing Then
    Set xl = CreateObject("Excel.Application")
End If
'filename is the string with the link to the file ("C:/....blahblah.xls")
Set XlBook = GetObject(filename)
'Make sure excel is visible on the screen
xl.Visible = True
XlBook.Windows(1).Visible = True
'xl.ActiveWindow.Zoom = 75
'Define the sheet in the Workbook as XlSheet
Set xlsheet1 = XlBook.Worksheets(1)
'Then have some fun!
With xlsheet1
    .Range("A1") = "some data here"
    .Columns("A:A").HorizontalAlignment = xlRight
    .Rows("1:1").Font.Bold = True
End With
'And so on...
You can't format data imported into an Access table, but you can add some formatting to an Access report or Form (I doubt you want to do this).

Is there a way to import an image from excel to a PictureBox?

I am writing an application that works with Excel files. So far I have been using GemBox.Spreadsheet to work with them. However, I discovered that with GemBox.Spreadsheet I can save pictures to Excel files but not retrieve them. Can anyone recommend how to retrieve a picture from an Excel file? Thank you
Here is how you can retrieve an image from an Excel file with GemBox.Spreadsheet:
ExcelFile workbook = ExcelFile.Load("Sample.xlsx");
ExcelWorksheet worksheet = workbook.Worksheets.ActiveWorksheet;
// Select Picture element.
ExcelPicture picture = worksheet.Pictures[0];
// Import to PictureBox control.
this.pictureBox1.Image = Image.FromStream(picture.PictureStream);
// Or write to file.
File.WriteAllBytes("Sample.png", picture.PictureStream.ToArray());

Excel - VBA Question. Need to access data from all excel files in a directory without opening the files

So I have a "master" excel file that I need to populate with data from excel files in a directory. I just need to access each file and copy one line from the second sheet in each workbook and paste that into my master file without opening the excel files.
I'm not an expert at this but I can handle some intermediate macros. The most important thing I need is to be able to access each file one by one without opening them. I really need this so any help is appreciated! Thanks!
Edit...
So I've been trying to use the dir function to run through the directory with a loop, but I don't know how to move on from the first file. I saw this on a site, but for me the loop won't stop and it only accesses the first file in the directory.
Folder = "\\Drcs8570168\shasad\Test"
wbname = Dir(Folder & "\" & "*.xls")
Do While wbname <> ""
i = i + 1
ReDim Preserve wblist(1 To i)
wblist(i) = wbname
wbname = Dir(FolderName & "\" & "*.xls")
How does wbname move down the list of files?
You don't have to open the files (ADO may be an option, as is creating links with code, or using ExecuteExcel4Macro), but typically opening files with code is the most flexible and easiest approach.
Copy a range from a closed workbook (ADO)
ExecuteExcel4Macro
Links method
But why don't you want to open the files - is this really a hard constraint?
My code in "Macro to loop through all sheets that are placed between two named sheets and copy their data to a consolidated file" pulls all data from all sheets in each workbook in a folder together (by opening the files in the background).
It could easily be tailored to just row X of sheet 2 if you are happy with this process
I just want to point out: You don't strictly need VBA to get values from a closed workbook. You can use a formula such as:
='C:\MyPath\[MyBook.xls]Sheet1'!$A$3
You can implement this approach in VBA as well:
Dim rngDestinationCell As Range
Dim rngSourceCell As Range
Dim xlsPath As String
Dim xlsFilename As String
Dim sourceSheetName As String
Set rngDestinationCell = Cells(3,1) ' or Range("A3")
Set rngSourceCell = Cells(3,1)
xlsPath = "C:\MyPath"
xlsFilename = "MyBook.xls"
sourceSheetName = "Sheet1"
rngDestinationCell.Formula = "=" _
& "'" & xlsPath & "\[" & xlsFilename & "]" & sourceSheetName & "'!" _
& rngSourceCell.Address
The other answers present fine solutions as well, perhaps more elegant than this.
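As a language-neutral illustration of how the external-reference string in the VBA above is assembled, here is a small Python sketch. The function name `external_reference` is made up for this example; the path, workbook, sheet, and address values are the ones used in the answer.

```python
def external_reference(folder, workbook, sheet, address):
    """Build an Excel formula that reads one cell from a closed workbook."""
    folder = folder.rstrip("\\")  # tolerate a trailing backslash on the folder
    return f"='{folder}\\[{workbook}]{sheet}'!{address}"


formula = external_reference("C:\\MyPath", "MyBook.xls", "Sheet1", "$A$3")
print(formula)  # ='C:\MyPath\[MyBook.xls]Sheet1'!$A$3
```

The workbook name goes inside square brackets, and the single quotes wrap the path, workbook, and sheet name together, which is what lets Excel accept paths containing spaces.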
The answers from brettdj and paulsm4 give a lot of information, but I still wanted to add my 2 cents.
As iDevlop answered in this thread ( Copy data from another Workbook through VBA ), you can also use GetInfoFromClosedFile().
Some bits from my class-wrapper for Excel:
Dim wb As Excel.Workbook
Dim xlApp As Excel.Application
Set xlApp = New Excel.Application
xlApp.DisplayAlerts = False ' prevents dialog boxes
xlApp.ScreenUpdating = False ' prevents showing up
xlApp.EnableEvents = False ' prevents all internal events even being fired
' start your "reading from the files"-loop here
    Set wb = xlApp.Workbooks.Add(sFilename) ' better than Open, because it can read from files that are in use
    ' read the cells you need...
    ' [....]
    wb.Close SaveChanges:=False ' clean up workbook
' end your "reading from the files"-loop here
' after you're done with all files, properly clean up:
xlApp.Quit
Set xlApp = Nothing
Good luck!
At the start of your macro add
Application.ScreenUpdating = False
then at the end
Application.ScreenUpdating = True
and you won't see any files open as the macro performs its function.

Any way to import multiple (csv) files to an Access db

I have multiple csv files with the same scheme, and I want to import them in one step. A solution could be to use the "import wizard", but I can only import one file with it. Oh, and it would be the best to work in msaccess2003. THX
The simplest solution is to start a dos-prompt, change to the directory where you have your files, and type:
type *.csv > allfiles.txt
If you do this often, you can create a batch-file that you can double-click from your desktop.
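One thing to keep in mind with plain concatenation: if each CSV file starts with a header row, that row is repeated once per file in the combined output. As a hedged alternative, the Python sketch below merges the files while writing the header only once. It assumes every file begins with the same header row and is UTF-8 encoded; `merge_csv_files` is a name chosen for this illustration, and the output name mirrors the `allfiles.txt` used above.

```python
import csv
from pathlib import Path


def merge_csv_files(folder, output_name="allfiles.txt"):
    """Concatenate every *.csv in `folder`, writing the header row only once.

    Returns the number of data rows written.
    """
    folder = Path(folder)
    data_rows = 0
    header_written = False
    with (folder / output_name).open("w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        for source in sorted(folder.glob("*.csv")):
            with source.open(newline="", encoding="utf-8") as handle:
                reader = csv.reader(handle)
                header = next(reader, None)
                if header is None:
                    continue  # empty file, nothing to copy
                if not header_written:
                    writer.writerow(header)
                    header_written = True
                for row in reader:
                    writer.writerow(row)
                    data_rows += 1
    return data_rows
```

Because the output file has a .txt extension, it is not picked up by the `*.csv` pattern when the function is run again in the same folder.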
You can write a small program to do the import. See http://www.javaworld.com/javaworld/javaqa/2000-09/03-qa-0922-access.html for a Java JDBC connector to MS Access; since the import file is CSV, you can do this in no time.
There are other importing options for other languages.
If all you want to do is drive the import with a list of files, you don't need a batch file. You can get the list of files using Dir():
Dim strCSVFileName As String
strCSVFileName = Dir("*.csv")
Do Until strCSVFileName = vbNullString
    ' [import strCSVFileName]
    strCSVFileName = Dir()
Loop
Of course, this assumes you're doing the import from within Access, but given your tags, that's the logical inference of your question.
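The key to the loop above is how Dir() behaves: calling it with a pattern starts a new search and returns the first match, while calling it with no argument returns the next match, and an empty string once there are none left. The Python sketch below is a toy model of that behaviour for illustration only. `DirEmulator` and the file names are invented for this example, and it works on an in-memory list rather than a real folder.

```python
import fnmatch


class DirEmulator:
    """Toy model of VBA's Dir(): a pattern starts a new search, no argument continues it."""

    def __init__(self, file_names):
        self._file_names = list(file_names)
        self._pending = iter(())

    def __call__(self, pattern=None):
        if pattern is not None:
            # Passing a pattern always restarts the enumeration from the first match.
            self._pending = iter(
                [name for name in self._file_names if fnmatch.fnmatch(name, pattern)]
            )
        # Like Dir(), return an empty string once the matches are exhausted.
        return next(self._pending, "")


Dir = DirEmulator(["jan.csv", "feb.csv", "notes.txt"])  # hypothetical folder contents

imported = []
name = Dir("*.csv")
while name != "":
    imported.append(name)  # placeholder for the real import step
    name = Dir()           # no argument: advance to the next match

print(imported)  # ['jan.csv', 'feb.csv']
```

Calling `Dir("*.csv")` inside the loop instead of `Dir()` would return the first match every time, so the loop would never reach the empty string.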
This is an old thread, but it turned up when I searched for the issue. Hopefully this code helps someone address the same challenge. Builds / expands on the example David-W-Fenton offers, above.
I imported a file first, using the Wizard. Imported into a table named "bestTranscripts" and saved the import template as "BestImport" -- then used those values in the TransferText command.
Function ImportFiles()
    ' Keep going with the next file if one import fails
    On Error Resume Next
    Dim sourceDirectoryName As String
    Dim sourceFileName As String
    sourceDirectoryName = "<path containing files>"
    sourceFileName = Dir(sourceDirectoryName & "\*.txt")
    Do Until sourceFileName = vbNullString
        ' Dir() returns only the file name, so prepend the folder for TransferText
        DoCmd.TransferText acImportDelim, "BestImport", "bestTranscripts", sourceDirectoryName & "\" & sourceFileName
        sourceFileName = Dir()
    Loop
End Function
