How do you run SQL Server Merge Replication Jobs sequentially?

I work with an environment that uses Merge Replication to publish a dozen publications to half a dozen subscribers every 10 minutes. When certain jobs are running simultaneously, deadlocks and blocking occur and the replication process becomes inefficient.
I want to create a SQL Server Agent Job that runs a group of Merge Replication Jobs in a particular order, waiting for each one to finish before the next starts.
I created an SSIS package that starts the jobs in sequence, but it uses sp_start_job, which returns immediately, so all the jobs end up running together again.
A side goal is to be able to disable replication to a particular server without individually disabling a dozen jobs, and to be able to pause replication entirely without 70+ individual disable operations.
Right now, if I disable a Merge Replication job, the SSIS package will still start and run it anyway.
I have now tried creating an SSIS package for each Replication Job and then creating a SQL Server Agent job that calls these packages in sequence. That job takes 8 seconds to finish, while each of the individual packages it calls (each starting a replication job) takes at least a minute to finish. In other words, that doesn't work either.
The SQL Server Agent knows when a Replication job finishes! Why doesn't an SSIS package or job step know? What is the point of having a control flow if it doesn't work?
Inserting fixed waits is useless: the individual jobs can take anywhere from 1 second to an hour, depending on what needs replicating.
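As noted above, sp_start_job returns as soon as the Agent accepts the request, which is why every approach tried so far finishes in seconds. A job step that genuinely waits has to poll msdb itself. A rough sketch of such a wrapper, with a placeholder job name (one of these per replication job, in the desired order):

    -- A "start a job and wait for it" wrapper. The job name is a placeholder.
    DECLARE @job sysname = N'Merge-Pub1-SubA',    -- hypothetical job name
            @running int = 1;

    EXEC msdb.dbo.sp_start_job @job_name = @job;  -- returns immediately

    WHILE @running > 0
    BEGIN
        WAITFOR DELAY '00:00:10';                 -- poll every 10 seconds
        SELECT @running = COUNT(*)
        FROM msdb.dbo.sysjobactivity ja
        JOIN msdb.dbo.sysjobs j ON j.job_id = ja.job_id
        WHERE j.name = @job
          AND ja.session_id = (SELECT MAX(session_id) FROM msdb.dbo.syssessions)
          AND ja.start_execution_date IS NOT NULL
          AND ja.stop_execution_date IS NULL;     -- still executing
    END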

Maybe I'm missing the real problem, but you naturally need a synchronization point, and there are many ways to create one.
For example, you could still run the jobs simultaneously but let the first job lock a resource that the second job needs, so the second waits until the resource is unlocked (see the sketch below). Or the second job could poll a log table in a loop (waiting a minute between checks and cancelling itself after an hour).
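For the locking idea, sp_getapplock gives you a named lock without needing a real table. A minimal sketch, assuming each job wraps its work like this (the resource name is an arbitrary placeholder):

    -- Each job wraps its real work in the same exclusive app lock, so jobs
    -- started at the same time still execute one at a time.
    DECLARE @rc int;
    EXEC @rc = sp_getapplock @Resource    = 'ReplicationSerializer',
                             @LockMode    = 'Exclusive',
                             @LockOwner   = 'Session',
                             @LockTimeout = 3600000;   -- give up after an hour
    IF @rc >= 0   -- 0 = granted immediately, 1 = granted after waiting
    BEGIN
        -- ... the job's real work goes here ...
        EXEC sp_releaseapplock @Resource  = 'ReplicationSerializer',
                               @LockOwner = 'Session';
    END
    ELSE
        RAISERROR('Timed out waiting for the replication lock.', 16, 1);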

Related

SSIS Job run in loop

I have an SSIS job which pulls data from one database and pushes into another. Currently the actions are triggered when a record is inserted into a table.
My understanding is that using a SQL Server trigger to launch an SSIS job is not advised, which suggests to me that the preferred route for this use case is a recurring schedule.
If I schedule every 10 seconds, will the ETL job launch again if the previous run has not finished? (Is there a better word to describe this behavior in the computing space?) If the job relaunches, is there a preferred way to accomplish this behavior?
If I schedule every 10 seconds, will the ETL job launch again if the previous run has not finished?
No. The next run time is computed once the job finishes, based on the "Starting at" time and the next interval that fits the recurrence cycle.
While the job is running, the "Start Job at Step" option in the SQL Server Management Studio interface will be grayed out.
If you try to kick the job off again forcefully using sp_start_job, you'll get an error message saying it's already running.
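If you want to verify this programmatically rather than through the UI, the Agent's activity tables expose the same information. A rough sketch, with the job name as a placeholder:

    -- Returns 1 if the named job currently has a run in flight, 0 otherwise.
    -- 'MyEtlJob' is a placeholder; filtering on the latest Agent session
    -- avoids stale rows left over from earlier Agent restarts.
    SELECT CASE WHEN EXISTS (
        SELECT 1
        FROM msdb.dbo.sysjobactivity ja
        JOIN msdb.dbo.sysjobs j ON j.job_id = ja.job_id
        WHERE j.name = N'MyEtlJob'
          AND ja.session_id = (SELECT MAX(session_id) FROM msdb.dbo.syssessions)
          AND ja.start_execution_date IS NOT NULL
          AND ja.stop_execution_date IS NULL
    ) THEN 1 ELSE 0 END AS is_running;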

Pausing Transactional Replication

Scenario:
I'm working with a customer that has a live database. On a separate server, they have a copy of this database with transactional replication set up, which runs constantly. I have an SSIS package that runs against the copy of the database for up to an hour to export data to a reporting database.
When I've tested the package with replication enabled, it occasionally fails as it reads from various tables at different points of the execution. The problem is that if some data is read at an early stage, which subsequently gets deleted/inserted, other related records that are read later on effectively become orphaned and cause lookup failures. Whilst I have various safeguards to combat this, it's difficult to cater for every case as not all records have dates that I can use to limit data.
Plan:
I have been looking at pausing the replication job, so that the package can run with static data and then re-enable it once the package has run. Once the replication is enabled again, all of the transactions from the live database that were generated during the package execution should then be applied to the copy.
Problem:
I've done some reading around the various Replication Agents used for transactional replication, but I'm not entirely sure what the minimum requirement is for pausing the replication.
At the moment I'm looking at pausing the Distribution Agent and the Log Reader Agent to achieve what I want to do. The question is, do I need to pause both agent jobs or can I pause one or the other so that the transactions build up and are applied once the agent is enabled?
I'm not sure whether some of this depends on specific configuration or setup, so please comment if more information is required.
but I'm not entirely sure what the minimum requirement is for pausing the replication.
Replication works like this:
The Log Reader Agent reads the transaction log on the publisher, inserts those records into the distribution database, and marks the log records as replicated (so that transaction log space can be reused).
The Distribution Agent then reads those records from the distribution database and applies them to the subscriber database.
When you want to stop or pause replication, you can stop:
1. the Log Reader Agent (right-click its job and stop it), or
2. the Distribution Agent (right-click its job and stop it), or
3. both.
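Stopping and starting the agents can also be scripted instead of using the right-click menu. A sketch using the standard msdb procedures (the job name below is a placeholder; real agent job names follow an auto-generated pattern):

    -- Pause: stop the Distribution Agent job (placeholder name).
    EXEC msdb.dbo.sp_stop_job @job_name = N'MYPUB-SalesDB-SalesPub-MYSUB-1';

    -- ... run the long SSIS package against the static copy here ...

    -- Resume: start the agent again; the commands queued in the distribution
    -- database are then applied to the subscriber.
    EXEC msdb.dbo.sp_start_job @job_name = N'MYPUB-SalesDB-SalesPub-MYSUB-1';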
The question is, do I need to pause both agent jobs or can I pause one or the other so that the transactions build up and are applied once the agent is enabled?
If you pause only the Distribution Agent (which is what I would do), the Log Reader Agent will keep doing its job, and there will be no impact on log reusability at the publisher.
There is one caveat: if replication latency crosses the maximum retention limit, you will need to reinitialize replication, though that limit is large, on the order of 24 hours.
You can also use the link below to monitor replication after it has been re-enabled:
https://www.brentozar.com/archive/2014/07/monitoring-sql-server-transactional-replication/

Running SQL Agent job for concurrent databases

I have a job that creates a job for each database on the SQL instance. I don't want the jobs to run sequentially; I need multiple databases to be processed at once, but I also want to make sure I don't have so many jobs running at the same time that they hinder performance on the server.
Is there a way to specify the number of concurrent jobs that can run at the same time or manage the jobs in a way that new jobs won't get started until the number of active jobs is less than what I specify?
You could create a stored procedure that accepts the SQL command(s) and the number n of jobs you want to run in parallel, and have it create and start n jobs, then go into a loop that polls the msdb tables to see whether those jobs are still running. Each time it notices there are fewer than n jobs active (or left), it starts a new job, until the entire set of databases has been handled. It should then wait for the last one to finish so the calling process knows everything has been processed. A rough sketch follows the tips below.
Tips:
- Use the @job_id values returned by sp_add_job to check whether a job is still running.
- Use WITH (NOLOCK) on the system tables to avoid unnecessary locking; it doesn't really matter if you use "dirty reads" when looking up the state of the jobs.
- Use WAITFOR DELAY so you only check the tables every 5 seconds or so, otherwise your loop will eat way too many resources!
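A rough sketch of that throttling loop; the N'AdHoc-' job-name prefix, the per-database procedure, and the cap of 4 parallel jobs are all assumptions:

    -- Keep at most @maxParallel ad-hoc jobs running at once.
    DECLARE @maxParallel int = 4,
            @dbName sysname, @jobName sysname, @jobId uniqueidentifier;

    DECLARE dbs CURSOR LOCAL FAST_FORWARD FOR
        SELECT name FROM sys.databases WHERE database_id > 4;  -- skip system DBs

    OPEN dbs;
    FETCH NEXT FROM dbs INTO @dbName;
    WHILE @@FETCH_STATUS = 0
    BEGIN
        -- Block until fewer than @maxParallel of our jobs are still active.
        WHILE (SELECT COUNT(*)
               FROM msdb.dbo.sysjobactivity ja WITH (NOLOCK)
               JOIN msdb.dbo.sysjobs j WITH (NOLOCK) ON j.job_id = ja.job_id
               WHERE j.name LIKE N'AdHoc-%'
                 AND ja.session_id = (SELECT MAX(session_id)
                                      FROM msdb.dbo.syssessions WITH (NOLOCK))
                 AND ja.start_execution_date IS NOT NULL
                 AND ja.stop_execution_date IS NULL) >= @maxParallel
            WAITFOR DELAY '00:00:05';                 -- poll every 5 seconds

        SELECT @jobName = N'AdHoc-' + @dbName;
        EXEC msdb.dbo.sp_add_job @job_name = @jobName,
             @delete_level = 1,                       -- job deletes itself on success
             @job_id = @jobId OUTPUT;
        EXEC msdb.dbo.sp_add_jobstep @job_id = @jobId, @step_name = N'run',
             @database_name = @dbName,
             @command = N'EXEC dbo.MaintenanceProc;'; -- hypothetical per-DB proc
        EXEC msdb.dbo.sp_add_jobserver @job_id = @jobId; -- target the local server
        EXEC msdb.dbo.sp_start_job @job_id = @jobId;

        FETCH NEXT FROM dbs INTO @dbName;
    END
    CLOSE dbs; DEALLOCATE dbs;
    -- Finally, loop (same COUNT query) until zero jobs remain, then return.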

Execute long running jobs in SQL Server

I am working with SQL Server 2008. Using the Agent, I have created a job and scheduled it to execute every minute.
The job executes a stored procedure that moves data from table XXX, to a temp table, and then eventually into table YYY.
The execution of the job may take more than one minute, since the data is rather large.
Will a second instance of the job be started even though the first instance is still running?
If so, should I mark records in temp table (status = 1) to indicate that those records are being processed by a previous instance of the job?
Is there a way for me to check that an instance of the job is currently running, so that I don't initiate a second instance of the job?
Is there another solution for this that I am unaware of? (throughput is important)
Only one instance of a particular job can run at any one time.
So there is no need to take any particular precautions against another execution of the same job beginning before the first one has stopped.
Check this post:
How to Prevent Sql Server Jobs to Run simultaneously
as well as "Running Jobs" here:
http://technet.microsoft.com/en-us/library/aa213815(v=sql.80).aspx
If a job has started according to its schedule, you cannot start another instance of that job on the same server until the scheduled job has completed. In multiserver environments, every target server can run one instance of the same job simultaneously.
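If you do need to attempt a start defensively, the refusal surfaces as an ordinary T-SQL error, so it can be caught. A small sketch, with a placeholder job name:

    -- Try to start the job; if an instance is already running, the Agent
    -- refuses the request and raises an error we can catch.
    BEGIN TRY
        EXEC msdb.dbo.sp_start_job @job_name = N'MyEtlJob';  -- placeholder name
    END TRY
    BEGIN CATCH
        -- Typically "...refused because the job is already running...".
        PRINT ERROR_MESSAGE();
    END CATCH;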

Sequential Scheduling of Jobs

We have scheduled a number of jobs in SQL Server 2000. We want these jobs to execute in a sequential order, i.e. the failure of one job should prevent the next job from running. Can someone help me with doing this, or with creating dependencies between scheduled jobs?
You could define your jobs as steps of one single job. That way you can specify, for every step, whether the next step should be executed in case of error.
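Scripted out, that looks roughly like the following sketch; the job name and step commands are placeholders, and @on_fail_action = 2 ("quit reporting failure") is what stops the chain when a step fails:

    -- Two dependent steps in one job: step 2 runs only if step 1 succeeds.
    DECLARE @jobId uniqueidentifier;
    EXEC msdb.dbo.sp_add_job @job_name = N'NightlyChain', @job_id = @jobId OUTPUT;

    EXEC msdb.dbo.sp_add_jobstep @job_id = @jobId, @step_name = N'Step 1',
         @command = N'EXEC dbo.Step1;',      -- hypothetical step body
         @on_success_action = 3,             -- 3 = go to the next step
         @on_fail_action    = 2;             -- 2 = quit, reporting failure

    EXEC msdb.dbo.sp_add_jobstep @job_id = @jobId, @step_name = N'Step 2',
         @command = N'EXEC dbo.Step2;',      -- hypothetical step body
         @on_success_action = 1,             -- 1 = quit, reporting success
         @on_fail_action    = 2;

    EXEC msdb.dbo.sp_add_jobserver @job_id = @jobId;  -- register on the local server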
Rather than combining the jobs into one single block, it is better to divide the process into pieces to simplify error detection and make management easier. That lets you control your process step by step. If your SQL jobs can be executed via batch files, you can use the Windows Task Scheduler and define dependencies there. But if the subject is a more complex ETL process, it is better to manage it with a dedicated job scheduler.
I've done this in a queue system to cache data where there were 4 or 5 steps involved and had to allow delays for replication between the steps.
It was rather time-consuming to implement, as there were parent tasks which spawned 1 to n child steps that sometimes needed to be executed in order and sometimes in no particular order.
If you go down this path, you then need to create a location for error messages and process logs.
I highly recommend that, if it can in any way be created as one job with multiple steps, you use the existing jobs agent. Each individual step can be configured to exit on failure, continue on failure, email on failure, etc. It's rather flexible.
