Lock an SSIS package from multiple simultaneous executions

I have an SSIS package (Package1.dtsx) that has been deployed to SSISDB. Currently I have the package scheduled with some parameters in a SQL Server Agent job.
How do I lock the package (Package1.dtsx) so that an attempt to run it in another SQL Server Agent job with different parameters is blocked?

You can do this yourself by adding a flag and having your package check the flag before processing: either quit out, loop until the flag is clear, or apply some other logic.
I personally have only ever had one Agent job per package, and the Agent handles the multiple-execution scenarios.
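A minimal sketch of that flag, assuming a hypothetical control table dbo.PackageLock with one row per package. Run it in an Execute SQL Task at the start of the package, map CanRun to a package variable, and gate the rest of the package with an expression on the precedence constraint; a final task should reset IsRunning to 0:

-- Claim the flag only if no other execution holds it; the single
-- conditional UPDATE makes the check-and-set atomic.
UPDATE dbo.PackageLock
SET IsRunning = 1, StartedAt = GETDATE()
WHERE PackageName = N'Package1.dtsx'
  AND IsRunning = 0;

-- 1 = flag acquired, proceed; 0 = already running, quit out.
SELECT @@ROWCOUNT AS CanRun;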

Locking a package to prevent multiple executions is not possible. Think of it as a file: there is no way to lock a file from a user who has the rights to use it.
You can, however, create user groups/roles on SQL Server to segregate execution depending on your needs/usage factors. To me, there is no straightforward way of locking a file from multiple executions. Sorry!
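That said, since the package is deployed to SSISDB, a first job step can at least detect an already-running instance through the catalog views before starting a new execution. A rough sketch (status 2 means running):

SELECT COUNT(*) AS running_count
FROM SSISDB.catalog.executions
WHERE package_name = N'Package1.dtsx'
  AND status = 2;  -- 2 = running

If running_count is greater than 0, fail the step or exit instead of starting another execution.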

Related

How to configure Oracle Data Integrator to restart when a job errors?

My company is using Oracle Data Integrator for ETL jobs. Recently, there's an issue with a source database that leads to the extract job sometimes failing (very randomly, once or twice per 10 extract jobs). When we restart the job, most of the time it runs successfully.
So while we are trying to fix the connection to the source database, is there any way to automatically restart that particular job 1 or 2 times if it fails? How can I configure that?
Thanks!
You can enclose the scenario in a package. Then, for the scenario step, set the Processing after failure options (number of attempts and time between attempts) on the step's Advanced tab.

SSIS Parallelism - Microsoft HPC Cluster?

I am new to SSIS, and am trying to use its parallelism features to import data from a database.
My job is to do this: import a multi-terabyte database into a set of flat files as quickly as possible.
I was thinking of this:
I have a Microsoft Server 2008 HPC cluster (of 3 nodes) at my disposal. I was thinking of writing an HPC SOA job so that all three compute nodes can make independent connections to the SQL Server and import a portion of the data in parallel. Of course this would have nothing to do with SSIS and would be an independent utility.
Then I came across SSIS and its parallel import features. My SSIS server is not very high end - only a 4 GB machine. I am somehow inclined to use SSIS because that's the ideal Microsoft way of doing data import - and I won't have to rewrite a lot of stuff and can possibly use existing transformations etc.
What is the best way to use Custom Tasks (or available ones) and do this import in parallel?
Gitmo, I may have misunderstood your question but will give it a shot. You need to move data from a SQL Server instance to multiple files, correct? You want to leverage the parallelised data movement functionality provided by SSIS. That means multiple simultaneously running Data Flow Tasks (DFTs). For each target file you can have only one DFT because of problems with concurrent writes.
To get multiple simultaneously running Data Flow Tasks where your source is a SQL Server database and your target is a set of files, you can possibly try the following approaches (please note there are upper limits on the parallelization you can get out of SSIS based upon many factors, including your CPU core count, whether you are running in BIDS/Visual Studio or not, and various settings in your packages, your server(s), your SQL Server instance, and many other considerations):
1. The Multiple Simultaneous DFT Solution: A single SSIS package with one Connection Manager pointed at the source SQL Server database and many Connection Managers each pointed at a separate target file, plus one DFT per target file. The DFTs are all disconnected from one another (no precedence constraints or green/red/blue lines/arrows). If there are pre- or post-ETL steps that need to run, a great way to parallelize these DFTs is to drop them all in a Sequence Container that is connected to the earlier and later tasks through precedence constraints/arrows. These disconnected DFTs in their own Sequence Container will all try to run simultaneously.
2. The Multiple Simultaneous DTEXEC Solution: Multiple SSIS packages, each with its own target-file-specific DFT. You manually run separate DTEXEC processes, either through separate CMD windows or through the GUI. #3 below is a variation on this solution and possibly a better one.
3. The Parent Master Package Running Multiple Children Packages Solution: Wrap the per-target-file packages developed in #2 above in a single Parent Master package. In the Parent package, have multiple simultaneously running Execute Package Tasks. Again these Execute Package Tasks would be disconnected from other tasks. A good way to do this is to drop the multiple Execute Package Tasks in their own Sequence Container. As before, if the Execute Package Tasks are disconnected (no precedence constraints/arrows), they will all try to run simultaneously.
Take a look at this excellent article from the Microsoft SQLCAT Team for some more ideas/insight: Top 10 SQL Server Integration Services Best Practices
There are likely variations on these same ideas and possibly other solutions available both inside and outside of SSIS. Good luck!
Please look at this post on using multithreading outside SSIS to achieve parallelism without modifying the package much: Multithreaded serial execution
http://sqljunkieshare.com/2011/12/21/parallelism-in-etl-process-ssis-2008-and-ssis-2012/

Scheduled jobs in SQL Agent

I have created SSIS packages to move data from AS400 to SQL Server, and they are scheduled daily. Some of the packages in SQL Agent are taking more than 9 hours to complete. If I run the same package manually in Business Intelligence Development Studio, it completes in less than 4 hours. Because of this, my scheduled packages are not completing on time. I am unable to understand why there is a difference in completion time between manual execution and scheduled jobs.
My environment is Windows Server 2003 with SQL Server 2005 SP3. Please help me sort out this issue.
The best way to get to the bottom of this problem is to watch the scheduled task by using some debug statements and messages. For example, put some insert statements in the stored procedures the SSIS package is invoking. This way you will get to know which control is taking more time than expected. First try to isolate the control that is making the difference.
Also, you can invoke the package from the command prompt using:
dtexec /f filename.dtsx
This will print out all the messages in the console at each step as well.
Use SSIS logging in the package to log to a database table. Set logging to record the start and end of tasks. By running the package in BIDS and comparing it to the logging when it is run on the server, you will see which tasks are taking too long. See http://msdn.microsoft.com/en-us/library/ms138020.aspx for more info on SSIS logging (in SQL 2008).
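Since the question's environment is SQL Server 2005, the default SQL logging table is dbo.sysdtslog90 (sysssislog in 2008). A rough sketch of pulling per-task durations from it, assuming the package logs the OnPreExecute and OnPostExecute events:

-- Duration per task per execution, from the SSIS log table.
SELECT source AS task_name,
       executionid,
       MIN(starttime) AS task_start,
       MAX(endtime) AS task_end,
       DATEDIFF(second, MIN(starttime), MAX(endtime)) AS duration_seconds
FROM dbo.sysdtslog90
WHERE event IN ('OnPreExecute', 'OnPostExecute')
GROUP BY source, executionid
ORDER BY duration_seconds DESC;

Run the package once from BIDS and once from the Agent job, then compare the two executions' rows side by side.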
Might it be that the SQL Server machine is less powerful than your client, or has more load when you execute the package?
In Business Intelligence Development Studio the package is executed on your local client, with its CPU and RAM (I think).
Check which version of DTEXEC you are using. Maybe you are using the 32-bit version in one place and the 64-bit version in the other.

Scheduled execution of code to conduct database operations in SQL Server

If I want to conduct some database operations on a scheduled basis, I could:
Use SQL Server Agent (if SQL Server) to periodically call the stored procedure and/or execute the T-SQL
Run some external process (scheduled by the operating system's task scheduler for example) which executes the database operation
etc.
Two questions:
What are some other means of accomplishing this?
What decision criteria should one use to decide the best approach?
Thank you.
Another possibility is to have a queue of tasks somewhere, and when applications that otherwise use the database perform some operation, they also do some tasks out of the queue. Wikipedia does something like this with its job queue. The scheduling isn't as certain as with the other methods, but you can e.g. put off doing housekeeping work when your server happens to be heavily loaded.
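A minimal sketch of such a queue as a plain table (all names here are hypothetical). Applications claim one pending task whenever they are touching the database anyway:

CREATE TABLE dbo.JobQueue (
    JobId INT IDENTITY PRIMARY KEY,
    JobType NVARCHAR(50) NOT NULL,
    QueuedAt DATETIME NOT NULL DEFAULT GETDATE(),
    ClaimedAt DATETIME NULL
);

-- Claim one pending job atomically; READPAST lets concurrent
-- workers skip rows another worker has already locked.
UPDATE TOP (1) q
SET ClaimedAt = GETDATE()
OUTPUT inserted.JobId, inserted.JobType
FROM dbo.JobQueue q WITH (ROWLOCK, READPAST)
WHERE ClaimedAt IS NULL;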
Edit:
It's not necessarily better or worse than the other techniques. It's suitable for tasks that do not have to be performed by any specific deadline, but should be done "every now and then", or "soon, but not necessarily right now".
Advantages
You don't need to write a separate application or set up SQL Server Agent.
You can use any criteria you can program to decide whether to run a task or not: immediately, once a certain time has passed, or only if the server is not under heavy load.
If the scheduled tasks are ones like optimising indices, then you can do them less frequently when they are less necessary (e.g. when updates are rare), and more frequently when updates are common.
Disadvantages
You might need to modify multiple applications to cooperate correctly.
You need to ensure that the queue doesn't build up too much.
You can't reliably ensure that a task runs before a certain time.
You might have long periods with no requests (e.g. at night) during which deferred/scheduled tasks could get done, but don't. You could combine it with one of the other ideas, having a special program that just does the jobs in the queue; but then you could also just not bother with the queue at all.
You can't really rely on external processes. All 'OS'-based solutions I've seen failed to deliver in the real world: a database is way more than just the data, primarily because of the backup/restore strategy, the high availability strategy, the disaster recoverability strategy and all the other 'ities' you pay for in your SQL Server license. An OS-scheduler-based solution will be an external component completely unaware of and unintegrated with any of them. I.e. you cannot back up/restore your schedule with your data, it will not fail over with your database, and you cannot ship it to a remote disaster recovery site through your SQL data shipping channel.
If you have Agent (i.e. not Express Edition) then use Agent. It has a long history of use and the know-how around it is significant. The only problem with Agent is its dependence on msdb, which disconnects it from the application database and thus does not play well with mirroring-based availability and recoverability solutions.
For Express editions (i.e. no Agent) the best option is to roll your own scheduler based on conversation timers (at least in SQL 2005 and forward). You use conversations to schedule yourself messages at the desired moment and rely on activated procedures to run the tasks. They are transactional and integrated with your database, so you can rely on them being there after a restore and after a mirroring or clustering failover. Unfortunately the know-how around how to use them is fairly slim; I have several articles about the subject on my site rusanu.com. I've seen systems replicate a fair amount of the Agent API on Express relying entirely on conversation timers.
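A minimal sketch of arming such a timer, assuming a Service Broker service named SchedulerService whose queue has an activated procedure that runs the task; the service name and that activation setup are assumptions, not a complete solution:

DECLARE @handle UNIQUEIDENTIFIER;

-- Open a dialog with ourselves; the DEFAULT contract suffices since
-- only timer messages will flow on this conversation.
BEGIN DIALOG CONVERSATION @handle
    FROM SERVICE [SchedulerService]
    TO SERVICE N'SchedulerService', N'CURRENT DATABASE'
    WITH ENCRYPTION = OFF;

-- Fire a DialogTimer message in one hour; the queue's activated
-- procedure receives it, runs the scheduled work, and begins a new
-- timer on the same conversation to reschedule itself.
BEGIN CONVERSATION TIMER (@handle) TIMEOUT = 3600;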
I generally go with the operating system's scheduling method (Task Scheduler for Windows, cron for Unix).
I deal with multiple database platforms (SQL Server, Oracle, Informix) and want to keep the task scheduling as generic as possible.
Also, in our production environment we have to get a DBA involved for any troubleshooting / restarting of jobs that are running in the database. We have better access to the application servers with the scheduled tasks on them.
I think the main decision criterion is what the job is. If it's a completely internal SQL Server task or set of tasks that does not relate to the outside world, I would say a SQL job is the best bet. If, on the other hand, you are retrieving data and then doing something with it that is inherently outside SQL Server, very difficult to do in T-SQL, or time consuming, perhaps the external service is the best bet.
I'd go with SQL Server Agent. It's well integrated with SQL Server, and various SQL Server features use Agent (Log Shipping, for instance). You can create an Agent job to run one or more SSIS packages, for example.
It's also integrated with operator notification, and jobs can be scripted in T-SQL or managed through SMO.
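For instance, a rough sketch of scripting a job with the msdb stored procedures (the job, step, schedule, and procedure names here are just examples):

USE msdb;

EXEC dbo.sp_add_job @job_name = N'NightlyCleanup';

EXEC dbo.sp_add_jobstep
    @job_name = N'NightlyCleanup',
    @step_name = N'Run cleanup proc',
    @subsystem = N'TSQL',
    @database_name = N'MyDatabase',
    @command = N'EXEC dbo.CleanupOldRows;';  -- hypothetical procedure

EXEC dbo.sp_add_schedule
    @schedule_name = N'Daily2am',
    @freq_type = 4,               -- daily
    @freq_interval = 1,           -- every day
    @active_start_time = 020000;  -- 02:00:00

EXEC dbo.sp_attach_schedule @job_name = N'NightlyCleanup', @schedule_name = N'Daily2am';
EXEC dbo.sp_add_jobserver @job_name = N'NightlyCleanup';  -- run on the local server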

Determine if a specific Windows application is running using SQL Server 2005

I need to determine whether a specific application is running, from a SQL Server 2005 job. The issue is that one of the applications we use to send data will hang, causing problems with any subsequent jobs that invoke it. If I can also obtain the CPU time, I can determine whether it's likely a hung process.
A list of running applications would be good, but being able to look up a specific executable name with the CPU time would be fantastic!
Any application launched by a job step will show as being run by the same logon account as the SQL Server Agent. Use a specific service account for the SQL Server Agent that won't be used for any other services. This will allow you to monitor the applications launched by a job using Task Manager, Performance Monitor, etc.
Try opening the SQL Server Activity Monitor. You can also get some of the information from the stored proc sp_who2.
Have the job run an external script (batch file, KSH script) instead of a TSQL script.
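Along those lines, a hedged sketch that shells out to tasklist from T-SQL instead; xp_cmdshell must be enabled, and MyApp.exe is a placeholder for your executable. The /V switch includes the CPU Time column the question asks for:

DECLARE @out TABLE (line NVARCHAR(4000));

-- tasklist filters on the image name; /V adds CPU time, /FO CSV
-- makes the rows easy to parse further if needed.
INSERT INTO @out
EXEC master..xp_cmdshell 'tasklist /FI "IMAGENAME eq MyApp.exe" /V /FO CSV';

-- One CSV row per matching process; no matching rows (only an
-- informational message) when the application is not running.
SELECT line FROM @out WHERE line LIKE '%MyApp.exe%';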
I think the best approach is to run SQL Server Profiler as well as Performance Monitor and wait for the specified job to run. Then import the perfmon stats into Profiler. You can do this from SQL Server Profiler by going to File -> Import Performance Data... and pointing it to your Performance Monitor logs.
You should be able to choose the Process(all) counter to give you a list of all running processes, as well as CPU time for each process. You can then correlate this with the application name and/or hostname from the Profiler logs to see what's going on.
I use the (free) replacement for Task Manager, Process Explorer, to get a better look at exes and their dependencies.
It might be worth monitoring your issue with this.
http://technet.microsoft.com/en-us/sysinternals/bb896653.aspx
