Dynamically assign filename to excel connection string - sql-server

This is my first time using SSIS in SQL Server 2012. I can successfully read an Excel file and load its content into a table in SQL Server 2012. The task is a simple direct read of the Excel file followed by a copy to SQL Server, with no validation or transformation for now, and it succeeded. But when I tried to make the package read the file name from a variable instead of the original hard-coded one, it generated the error "DTS_E_OLEDBERROR. An OLE DB error has occurred. Error code: 0x80040E4D".
All I did was replace the hard-coded connection string in the Excel connection manager with an expression that takes the value of a variable, which is itself assigned by an expression.
The variable was assigned its value before the data flow task started. I checked the variable and it did hold the correct value.
But the error was generated when the data flow task started.
It would be highly appreciated if someone could point out what I did incorrectly and advise me how to solve the issue.

Option A
The ConnectionString property of an Excel Connection Manager is not where I go to change the current file, in contrast to an ordinary Flat File Connection Manager.
Instead, put an expression on the Excel Connection Manager's ExcelFilePath property.
In theory, there should be no difference between ConnectionString and ExcelFilePath except that you will have more "stuff" to build out to get the connection string just right.
Also, be sure you're executing the package in 32 bit mode.
Option B
An alternative that you might be running into is that the design-time value for the connection string isn't valid once the package is running. When the package begins, it verifies that all of the expected resources are available, and if they aren't, it fails fast rather than dying mid-load. You can delay this validation until SSIS actually has to access the resource by setting the DelayValidation property to True. This property exists on everything in SSIS, but I would start by setting it on the Excel Connection Manager. If that still throws the package validation error, try setting the Data Flow Task's DelayValidation to True as well.

I had a heck of a time trying to get this to work, even after following all the instructions, so I just kept it with a static excel name and added a “File System Task” to copy the file and create a new file with whatever name I need.

We can define our connection string in an expression like this:
"Provider=Microsoft.ACE.OLEDB.12.0;Data Source="
+ @[User::InputFolder] + "\\" + @[User::FileName]
+ ";Extended Properties=\"EXCEL 12.0 XML;HDR=YES\";"
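The same connection string can also be assembled in C#, for example in a Script Task that fills a string variable before the data flow runs. What follows is only a minimal sketch under that assumption: the class name, method names, folder, and file name are hypothetical, and Path.Combine stands in for the "\\" concatenation used in the expression.

```csharp
using System;
using System.IO;

public static class ExcelConnectionSketch
{
    // Joins the folder and the file name with a directory separator.
    public static string BuildExcelPath(string inputFolder, string fileName)
    {
        return Path.Combine(inputFolder, fileName);
    }

    // Builds an ACE OLE DB connection string with the same provider and
    // extended properties as the expression in the answer.
    public static string BuildConnectionString(string excelPath)
    {
        return string.Format(
            "Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};" +
            "Extended Properties=\"EXCEL 12.0 XML;HDR=YES\";",
            excelPath);
    }

    public static void Main()
    {
        // Hypothetical folder and file name, for illustration only.
        string path = BuildExcelPath(@"C:\Import", "Sales.xlsx");
        Console.WriteLine(path);
        Console.WriteLine(BuildConnectionString(path));
    }
}
```

In a Script Task the result would typically be written to a package variable; which variable and which connection manager property consume it depends on your package.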

Related

SSIS Package- Retain Same Connection Property in Excel Connection

I am using SSDT 2017. There are 3 Data Flow Tasks (connected using precedence constraints) that load data from Excel into database tables. When I run each task individually it shows as successful, but when I run the entire package it shows as completed without any green tick on the tasks, which means the tasks are not being executed. After I changed the RetainSameConnection property of the Excel connection manager to True, all the tasks completed successfully.
I have not seen this behavior in earlier versions. Is this a new change in the 2017 version, or am I missing something?
Trying to figure out the issue
Based on the question and comments, you are using 3 Data Flow Tasks containing 3 Excel Source components that all use the same connection manager.
At the start of package execution, in the validation phase, each of these components must acquire a connection from the connection manager to read the metadata, and it tries to keep that connection open until the component is executed. Since several components try to open the same connection at once, this causes a problem.
With RetainSameConnection set to True, the package calls the AcquireConnection method once and reuses that connection multiple times, which resolves the issue.
Something to try
Try changing the DelayValidation property to True on all Data Flow Tasks and the ValidateExternalMetadata property to False on each Excel Source. This may solve the issue if the connection is only acquired for validation purposes. If it doesn't work, it means the AcquireConnection method is called to lock the file for reading even before the Data Flow Task has started executing.

SQL Command from Variable for MDX OLEDB source SSIS

I'm having some issues with an MDX query source in an SSIS data flow.
If I configure an OLEDB source properly, and have the data access mode as SQL Command, the MDX query works.
I need this source to be parameterized though, so I'm trying to pass in a variable that is populated at runtime as the MDX source query.
The problem is, when I set this up, it will try to use the variable (which is not correct until runtime) and throw this error.
What is the trick to getting an MDX Source to work from a variable?
I built all of the downstream transformations after first configuring the source with a hardcoded query (SQL Command). Then I went back to change the source to use the variable and it broke.
Thanks for any input.
TITLE: Microsoft Visual Studio
------------------------------
The component reported the following warnings:
Error at DFT SSAS to SQL [SRC SSASPRP01 Cube [2]]: No column information was returned by the SQL command.
Choose OK if you want to continue with the operation.
Choose Cancel if you want to stop the operation.
------------------------------
BUTTONS:
OK
Cancel
------------------------------
You want a parameterized query and would like to build it in a String variable. Either way, your package needs to be validated before it runs, so you have two options:
First, if your query variable is populated at runtime and has no expression, you can give the variable a valid MDX query as its default value. The package and your Data Flow Task will then be validated before the run (the regular process) against this default query and pass, and at runtime the correct MDX query is used.
Second, you can set the DelayValidation property of your Data Flow Task to True. The task will then be validated immediately before it runs, when your variable contains a valid MDX query.
I would prefer the second method as more generic.
Set DelayValidation = True. DelayValidation is a property available on all SSIS components, and it basically holds validation back until execution. It mostly helps when connections or other components are set from variables, since the variables don't hold their real values until run time.

SSIS Connection Manager not using the ConnectionString value

I have a child package where the ConnectionString property of a Connection Manager is set by a Parent Package Variable Configuration. I set up a script task that brings up a message box with the value of the ConnectionString property right before the dataflow task.
`MessageBox.Show(Dts.Connections["CPU_*"].ConnectionString.ToString());`
When I run the parent package, the message box shows that the connection string is changing with every iteration, but in the dataflow it always draws the data from the same source.
I'm using SQL Server 2008 R2, the connection manager is an ADO.Net type, RetainSameConnection is set to False, and I've been researching this for days. Anybody have any ideas?
Update (2/23/2015): To make this stranger, when I look at the diagnostic logs, they tell me that when the new connections are being opened they are using the new connection strings.
I found an answer here: "Passing SSIS Connection String in parent variable works but package still validates against child design value".
I'm not sure about later versions yet, but certainly up to 2008R2 there is a bug when you pass a connection string into a child package. As you correctly point out the connection string IS passed, but either the connection is evaluated prior to the parent configuration or the connection string is updated from the design object after the parent configuration has been passed.
Either way it just doesn't work.
If, like me, you don't want to add an additional Script Task to all your packages you will need to do the following:
Parent Package
Let's assume we are using an OLE DB connection called MyDBConnection
Add a string variable to hold the connection string (MyConnection)
Add a C# Script Task (before the Package Task) with the line:
Dts.Variables["User::MyConnection"].Value = Dts.Connections["MyDBConnection"].ConnectionString;
Child Package(s)
Add a string variable to hold the connection string (MyConnection)
Add a parent package configuration to pass the value of MyConnection
In the properties for the OLE DB connection add an expression to update the Connection String property from MyConnection

Exception handling in SSIS Package Dynamic Connection

SSIS Scenario
I have an SSIS variable of type Object. It contains all the connections to different servers/databases. I want to connect to those databases one by one and run a query.
Expected Exception Handling
If the SSIS Connection manager(Dynamic connection manger) is unable to find the connection to the server (probably the server is down) I want to Skip that connection (Database/server) and log that into the table and move onto the next Connection (Database/Server). The SSIS package should not crash.
My Implementation
I have successfully configured the SSIS package to use a connection manager (dynamic connection manager) and a Foreach Loop to loop through the SSIS variable of type Object, but I am not able to skip a connection if the server/database is not found. It generates an error saying that the server/database was not found or that there is a problem with the connection, and the SSIS package fails.
My experience with SSIS is one week old.
Any help will be appreciated.
How about setting the ForceExecutionResult property of the task to Success?
I was looking for the same fix too. It seems the normal OnError event handling does not work for issues that arise when connecting to a source DB.
There is another workaround I wanted to mention. You can handle the error in the Data Flow Task (add an OnError event handler and set the system variable "Propagate" in that event handler to False). I think this is still required, but I am not sure. I also use it to log the exception.
Afterwards you can set MaximumErrorCount on the Foreach Loop to "0" (which means unlimited). I'm not exactly sure why it works, but I found this while trying to find a way to handle the scenario you described.
==
Just as an interesting observation: For debugging purposes I added an OnError Eventhandler to the ForEachLoop and set a breakpoint there in a dummy script. It was never reached. Nonetheless the ForEachLoop failed all the time until I set MaximumErrorCount to 0.
I don't think it's possible to continue the package execution once it encounters an error. You need to control this behavior through a SQL Server table (or any other table for that matter).
Once the package fails, you can set a flag in the table saying that the package failed. The next time the package runs, you can start from this point and continue the execution. But automatically skipping the down server is kind of like pulling a rabbit out of a hat.
Another way you can do this is to ping each server beforehand in a separate package and store the ping results in a table. Only pick those records (servers) whose ping results were positive. Otherwise just skip the server.
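The pre-check idea can be sketched in C# roughly as follows. Instead of an ICMP ping it attempts to open each connection with a short timeout, which is a swapped-in technique rather than what the answer literally describes. It assumes ADO.NET-style SQL Server connection strings and the System.Data.SqlClient provider; the class and method names are hypothetical.

```csharp
using System;
using System.Data.SqlClient;

public static class ConnectionProbe
{
    // Returns true if a connection can be opened within timeoutSeconds.
    // On failure, error receives the message so the caller can log it.
    public static bool CanConnect(string connectionString, int timeoutSeconds, out string error)
    {
        error = null;
        try
        {
            // Override the timeout so an unreachable server fails quickly.
            var builder = new SqlConnectionStringBuilder(connectionString);
            builder.ConnectTimeout = timeoutSeconds;

            using (var connection = new SqlConnection(builder.ConnectionString))
            {
                connection.Open();
            }
            return true;
        }
        catch (SqlException ex)
        {
            error = ex.Message; // server down, login failed, database missing, ...
            return false;
        }
        catch (ArgumentException ex)
        {
            error = ex.Message; // malformed connection string
            return false;
        }
        catch (InvalidOperationException ex)
        {
            error = ex.Message; // e.g. no data source specified
            return false;
        }
    }

    public static void Main(string[] args)
    {
        // Each command-line argument is treated as one connection string.
        for (int i = 0; i < args.Length; i++)
        {
            string error;
            bool ok = CanConnect(args[i], 5, out error);
            Console.WriteLine(ok
                ? "Connection " + i + ": reachable"
                : "Connection " + i + ": skipped (" + error + ")");
        }
    }
}
```

The results could be written to the table mentioned above, so that the loop only iterates over servers that answered.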

Using a scalar as a condition in an OLE DB data flow

I am having a problem with a data flow task in an SSIS package I am trying to build. The objective of the package is to update tables on our local server using a connection, through a VPN, to a distant server that contains the source data.
There are no problems for tables which are re-downloaded entirely.
But some of the tables must be updated incrementally, meaning they are not re-downloaded in full. For each of those tables, I have to check the maximum value of the date column on our local server (int YYYYMMDD type) and ask the package to download only the data added after that date.
I thought about using a scalar variable (@MAXDATE, for example), but the issue is that I have to declare this scalar in a session with our local server, and I cannot use it as a condition in an OLE DB Source, because the latter implies a new session, this time with the distant server.
I can only view the database on the distant server and import it. So no way to create a table on it.
I hope it is clear enough. Would you have any tips to solve this problem?
Thank you in advance.
Ozgur
You can do this easily by using an Execute SQL Task, a Data Flow Task and one variable. I would probably add some error checking just in case no value is found on the local system, but that depends very much on what might go wrong.
Assuming VS2008
Declare a package level variable of type datetime. Give it an appropriate default value.
Create an Execute SQL Task with a query that returns the appropriate date value. On the first page of the properties window, make sure the Result Set is set to "Single Row." On the Result Set page, map the date column to the package variable.
Create a Data Flow Task. In the OLE DB Source, write your query to include a question mark for the incoming date value, for example "and MaxDate>?". Now when you click on the Parameters button, you should get a pop-up that allows you to map "Parameter0" to your package-level variable.
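Outside the designer, the same two-step flow can be sketched in C# with System.Data.OleDb, which uses the same positional "?" placeholder as the OLE DB Source. This is only an illustration of the steps above, not the package itself: the table names, the DateKey column, and the fallback value are hypothetical, and the date is assumed to be the integer YYYYMMDD key described in the question rather than a datetime.

```csharp
using System;
using System.Data.OleDb;

public static class IncrementalLoadSketch
{
    // Step 1 (the Execute SQL Task): read the newest date already loaded locally.
    public static int GetLocalMaxDate(string localConnectionString, int defaultValue)
    {
        using (var connection = new OleDbConnection(localConnectionString))
        using (var command = new OleDbCommand(
            "SELECT MAX(DateKey) FROM dbo.LocalTable", connection)) // hypothetical table
        {
            connection.Open();
            object result = command.ExecuteScalar();
            // MAX over an empty table yields NULL, so fall back to the default.
            return (result == null || result == DBNull.Value)
                ? defaultValue
                : Convert.ToInt32(result);
        }
    }

    // Step 2 (the OLE DB Source): query the remote server for newer rows only.
    // A real data flow would select the columns; COUNT(*) keeps the sketch short.
    public static int CountNewRows(string remoteConnectionString, int maxDate)
    {
        using (var connection = new OleDbConnection(remoteConnectionString))
        using (var command = new OleDbCommand(
            "SELECT COUNT(*) FROM dbo.RemoteTable WHERE DateKey > ?", connection)) // hypothetical table
        {
            // OLE DB parameters are positional; the name is only a label.
            command.Parameters.AddWithValue("@p0", maxDate);
            connection.Open();
            return Convert.ToInt32(command.ExecuteScalar());
        }
    }

    public static void Main(string[] args)
    {
        if (args.Length < 2)
        {
            Console.WriteLine("usage: <local OLE DB connection string> <remote OLE DB connection string>");
            return;
        }
        int maxDate = GetLocalMaxDate(args[0], 19000101);
        Console.WriteLine("Rows newer than {0}: {1}", maxDate, CountNewRows(args[1], maxDate));
    }
}
```

The fallback value plays the role of the variable's default in the package: it is what the remote query receives when the local table is still empty.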

Resources