Using a scalar as a condition in an OLE DB data flow - sql-server

I am having a problem with a Data Flow task in an SSIS package I am trying to build. The objective of the package is to update tables on our local server from a distant server that holds the source data, reached over a VPN connection.
There are no problems for tables which are re-downloaded entirely.
But some of the tables must be updated incrementally, meaning they are not re-downloaded. For each of those tables, I have to check the maximum value of the date column on our local server (an int in YYYYMMDD format) and have the package download only the data added after that date.
I thought about using a scalar variable (@MAXDATE, for example), but the issue is that I have to declare it in a session with our local server, so I cannot use it as a condition in an OLE DB Source, because the latter opens a new session, this time with the distant server.
I can only read and import from the database on the distant server, so there is no way to create a table on it.
I hope it is clear enough. Would you have any tips to solve this problem?
Thank you in advance.
Ozgur

You can do this easily with an Execute SQL Task, a Data Flow task and one variable. I would probably add some error checking in case no value is found on the local system, but that depends very much on what might go wrong.
Assuming VS2008:
Declare a package-level variable of type DateTime (or Int32 if, as in the question, the date is stored as a YYYYMMDD integer). Give it an appropriate default value.
Create an Execute SQL Task with a query that returns the appropriate date value. On the first page of the properties window, make sure the Result Set is set to "Single row". On the Result Set page, map the date column to the package variable.
Create a Data Flow task. In the OLE DB Source, write your query with a question mark in place of the incoming date value, for example "and MaxDate > ?". Now when you click the Parameters button, you should get a pop-up that lets you map "Parameter0" to your package-level variable.
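
The same pattern can be sketched outside SSIS. This is only an illustration of the idea, not SSIS code: two in-memory SQLite databases stand in for the local and distant servers, and the table name `sales`, the column `load_date` and the default value 19000101 are assumptions made up for the example. The first function plays the role of the Execute SQL Task (one scalar, with a fallback when the local table is empty), the second plays the role of the OLE DB Source with its `?` parameter.

```python
import sqlite3

def make_demo_db(rows):
    # Hypothetical stand-in for a server: an in-memory SQLite database.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, load_date INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    return conn

def fetch_local_max_date(local_conn, default=19000101):
    # Role of the Execute SQL Task: return a single scalar,
    # falling back to a default when the local table is empty.
    row = local_conn.execute("SELECT MAX(load_date) FROM sales").fetchone()
    return row[0] if row[0] is not None else default

def fetch_new_remote_rows(remote_conn, max_date):
    # Role of the OLE DB Source: the '?' is bound to the variable,
    # so the value crosses from one session to the other.
    cur = remote_conn.execute(
        "SELECT id, load_date FROM sales WHERE load_date > ? ORDER BY id",
        (max_date,),
    )
    return cur.fetchall()

def incremental_load(local_conn, remote_conn):
    # Read the scalar locally, filter remotely, insert locally.
    max_date = fetch_local_max_date(local_conn)
    rows = fetch_new_remote_rows(remote_conn, max_date)
    local_conn.executemany("INSERT INTO sales (id, load_date) VALUES (?, ?)", rows)
    local_conn.commit()
    return len(rows)
```

The value is read on one connection and passed as a bound parameter on the other, so neither session needs to see a variable declared in the other.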

SQL Command from Variable for MDX OLEDB source SSIS

I'm having some issues with an MDX query source in an SSIS data flow.
If I configure an OLEDB source properly, and have the data access mode as SQL Command, the MDX query works.
I need this source to be parameterized though, so I'm trying to pass in a variable that is populated at runtime as the MDX source query.
The problem is, when I set this up, it will try to use the variable (which is not correct until runtime) and throw this error.
What is the trick to getting an MDX Source to work from a variable?
I built all of the downstream transformations after first configuring the source with a hardcoded query (SQL Command). Then I went back to change the source to use the variable and it broke.
Thanks for any input.
TITLE: Microsoft Visual Studio
------------------------------
The component reported the following warnings:
Error at DFT SSAS to SQL [SRC SSASPRP01 Cube [2]]: No column information was returned by the SQL command.
Choose OK if you want to continue with the operation.
Choose Cancel if you want to stop the operation.
------------------------------
BUTTONS:
OK
Cancel
------------------------------
You want a parameterized query and would like to build it in a String variable. Either way, your package needs to be validated before it runs, so you have two options:
If your query variable is populated at runtime and has no expression, you can give the variable a valid MDX query as its default value. The package and your Data Flow task are then validated before the run (the regular process) against this default query and pass; at runtime the correct MDX query is used.
You can set the DelayValidation property of your Data Flow task to True. The task is then validated immediately before it runs, when your variable already contains a valid MDX query.
I would prefer the second method as it is more generic.
Set DelayValidation = True. DelayValidation is a property available on SSIS tasks and connection managers, and it basically holds validation back until execution. It mostly helps when connections or other components are set from variables, because the variables do not hold their real values until run time.

Saving large files (over 20 MB) in Access database with SQL Server back end fails

I have an application with Microsoft Access front end and SQL Server back end. The link is implemented via ODBC data source using SQL Server Native Client 11.0.
There is a table with a column Attachments of OLE Object data type. The back end is a table with VARBINARY(MAX) data type for the Attachments column.
I save files in the Attachments field using a Bound Object Frame. Everything works fine until the file size exceeds about 20 MB. The statement BoundOLEFrame.Action = acOLECreateEmbed takes about 2.5 minutes to complete. It does not throw any exception, but when the following MoveNext or any other recordset re-positioning statement is performed, it fails with run-time error "3426":
"The action was cancelled by an associated object."
As the result the file does not appear to be stored in the database. An attempt to open the file with Access UI by double-clicking on the field causes the error:
"A problem occurred while Microsoft Access was communicating with the
OLE server or ActiveX Control. Close the OLE server and restart it
outside of Microsoft Access. Then try the original operation again in
Microsoft Access."
Suspecting that the issue could be related to the ODBC timeout, which is set to 60 seconds by default, I tried setting the current database's QueryTimeout to 600 seconds. But this did not help...
Inserting these large files directly into the table (in Access table datasheet view, right-clicking the field and selecting Insert Object... in the pop-up menu) at first appears successful, because the file looks like it was inserted and can be opened by double-clicking the field. But when I try to close the table, I am asked whether I wish to save it. Answering in the affirmative leads to the following error:
"You can't save this record at this time. Microsoft Access may have
encountered an error while trying to save a record. If you close this
object now, the data changes you made will be lost. Do you want to
close the database object anyway?"
According to Access specifications the size of an OLE Object field is 1 GB, which is well above the size of my files.
Any suggestions would be appreciated. I am looking for a way to resolve this particular problem. I don’t think that alternative design for file storage is pertinent to the topic.
I do not have an option to store files any other way.
Well, the maximum OLE Object field size may be 1 GB, but the overall file size limit for Access is 2 GB.
So if you have more than 100 fields of 20 MB each, there is going to be a problem.
It's unclear how many records are actually being transferred to the front-end file. I would sanity-test a very limited set of records to see whether it is the field size or the overall file size that is coming into play. Perhaps that will give some insight.
I can tell from your post that you DON'T want me to suggest putting those attached files in their own folder and just storing the link, but hey, it is the better design.

Passing SQL Server Name of SSIS Job to package variable

I am trying to figure out a way to pass the name of the SQL Server that an SSIS job is running on to a variable within a package.
Basically, whichever server the job is running on will be passed to the ServerName property of an OLE DB connection in the package, so that the data is loaded into that server.
I have been looking at documentation on package configuration, and I feel like it would be in the environment variable section. However, I do not think any of the listed "environment variables" are the server name. I have Googled this issue, and searched on StackOverflow for problems, but I cannot seem to find this problem.
There are probably a few different approaches, but I think you can accomplish this with a simple Execute SQL Task.
Create a new Execute SQL Task
Under General -> SQL Statement, enter the query SELECT @@SERVERNAME AS ServerName in the "SQLStatement" field
Under General -> Result Set, choose "Single row"
Under Parameter Mapping, Enter your variable (create one if you don't have one) User::ServerName in Variable Name, "Input" as direction, 0 as Parameter Name, and -1 as Parameter Size
Under Result Set, enter "ServerName" for your Result Name and type the variable in Variable Name
Click OK
Give that a go and see if it accomplishes what you want.
I was able to accomplish this by using a script task, and C# code to grab the Machine Name, or IP address of the machine running it.
Dts.Variables["User::ServerName"].Value = System.Net.Dns.GetHostName();
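
As a rough illustration of what the package then does with that value, here is a minimal sketch in Python (not SSIS code). The function names and the database name `MyDatabase` are hypothetical. Note one assumption that matters: a host name lookup returns the machine name only, whereas `@@SERVERNAME` also includes the instance name for a named instance, so the two approaches above are not interchangeable in that case.

```python
import socket

def build_connection_string(server_name, database="MyDatabase"):
    # Hypothetical helper: assemble a connection string around
    # whatever server name was discovered at run time.
    return (
        f"Data Source={server_name};"
        f"Initial Catalog={database};"
        "Integrated Security=SSPI;"
    )

def local_server_connection_string(database="MyDatabase"):
    # Same idea as System.Net.Dns.GetHostName() in the Script Task:
    # use the name of the machine the code is running on.
    return build_connection_string(socket.gethostname(), database)
```

In the package itself, the equivalent step would be a property expression on the connection manager's ServerName that reads User::ServerName.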

Exception handling in SSIS Package Dynamic Connection

SSIS Scenario
I have an SSIS variable of type Object. It contains all the connections to different servers/databases. I want to connect to those databases one by one and run a query.
Expected Exception Handling
If the SSIS connection manager (dynamic connection manager) is unable to connect to a server (probably because the server is down), I want to skip that connection (database/server), log that into a table and move on to the next connection (database/server). The SSIS package should not crash.
My Implementation
I have successfully configured the SSIS package to use a dynamic connection manager and a Foreach Loop to loop through the SSIS variable of type Object, but I am not able to skip a connection if the server/database is not found. It generates an error that the server/database was not found or that there is a problem with the connection, and the SSIS package fails.
My experience in SSIS is one week old.
Any help will be appreciated.
How about setting the ForceExecutionResult property of the task to Success?
I was looking for the same fix too. It seems the normal OnError event handling does not work for issues that arise when connecting to a source DB.
There is another workaround I wanted to mention. You can handle the error in the Data Flow Task (in its OnError event handler, set the system variable "Propagate" to False). I think this is still required, but I am not sure. I also use it to log the exception.
Afterwards you can set MaximumErrorCount on the Foreach Loop to "0" (which means unlimited). I'm not exactly sure why it works, but I found this while trying to handle the scenario you described.
==
Just as an interesting observation: For debugging purposes I added an OnError Eventhandler to the ForEachLoop and set a breakpoint there in a dummy script. It was never reached. Nonetheless the ForEachLoop failed all the time until I set MaximumErrorCount to 0.
I don't think it's possible to continue the package execution once it encounters an error. You need to control this behavior through a SQL Server table (or any other table, for that matter).
Once the package fails, you can set a flag in the table saying that the package failed. The next time the package runs, you can start from this point and continue the execution. But automatically skipping the down server is kinda pulling a rabbit out of a hat.
Another way to do this is to ping each server beforehand in a separate package and store the ping results in a table. Only pick those records (servers) whose ping results were positive. Otherwise just skip the server.
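
The behavior the question asks for (try each connection, log the ones that fail, carry on with the rest) looks like this when written as ordinary code. It is a sketch of the control flow only, not SSIS: SQLite URIs stand in for the server list, and the function name `run_on_all` and the shape of the log are assumptions for the example. In the package, the same effect comes from the settings discussed above rather than from a try/except.

```python
import sqlite3

def run_on_all(targets, query, log):
    # targets: list of (name, uri) pairs, like the rows of the
    # Object variable that the Foreach Loop iterates over.
    results = {}
    for name, uri in targets:
        conn = None
        try:
            conn = sqlite3.connect(uri, uri=True)
            results[name] = conn.execute(query).fetchall()
        except sqlite3.Error as exc:
            # Record the failure and move on to the next target
            # instead of letting one bad connection stop the loop.
            log.append((name, str(exc)))
        finally:
            if conn is not None:
                conn.close()
    return results
```

The `log` list would be a table in practice, which also covers the suggestion of recording which servers were unreachable.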

What is the point of "Initial Catalog" in a SQL Server connection string?

Every SQL Server connection string I ever see looks something like this:
Data Source=MyLocalSqlServerInstance;Initial Catalog=My Nifty Database;
Integrated Security=SSPI;
Do I need the Initial Catalog setting? (Apparently not, since the app I'm working on appears to work without it.)
Well, then, what's it for?
If the user name in the connection string has access to more than one database, you have to specify the database you want the connection to use. If your user has only one database available, then you are correct that it doesn't matter. But it is good practice to put this in your connection string.
This is the initial database of the data source when you connect.
Edited for clarity:
If you have multiple databases in your SQL Server instance and you don't want to use the default database, you need some way to specify which one you are going to use.
Setting an Initial Catalog allows you to set the database that queries run on that connection will use by default. If you do not set this for a connection to a server in which multiple databases are present, in many cases you will be required to have a USE statement in every query in order to explicitly declare which database you are trying to run the query on. The Initial Catalog setting is a good way of explicitly declaring a default database.
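
To make the setting concrete, here is a small sketch that reads it out of a connection string like the one in the question. The parser is deliberately naive (it is an assumption of this example that no value contains a quoted semicolon) and the function names are hypothetical. When the key is absent, the function returns None; on a real connection that means the server falls back to the login's default database.

```python
def parse_connection_string(conn_str):
    # Naive split on ';' and '='; keys are case-insensitive.
    parts = {}
    for item in conn_str.split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            parts[key.strip().lower()] = value.strip()
    return parts

def initial_catalog(conn_str):
    # "Database" is accepted as a synonym for "Initial Catalog".
    parts = parse_connection_string(conn_str)
    return parts.get("initial catalog", parts.get("database"))
```

For the string in the question this returns "My Nifty Database", and for the same string without the Initial Catalog part it returns None.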
