If I want to conduct some database operations on a scheduled basis, I could:
Use SQL Server Agent (if SQL Server) to periodically call the stored procedure and/or execute the T-SQL
Run some external process (scheduled by the operating system's task scheduler for example) which executes the database operation
etc.
Two questions:
What are some other means of accomplishing this
What decision criteria should one use to decide the best approach?
Thank you.
Another possibility is to have a queue of tasks somewhere, and when applications that otherwise use the database perform some operation, they also do some tasks out of the queue. Wikipedia does something like this with its job queue. The scheduling isn't as certain as with the other methods, but you can e.g. put off doing housekeeping work when your server happens to be heavily loaded.
Edit:
It's not necessarily better or worse than the other techniques. It's suitable for tasks that do not have to be performed by any specific deadline, but should be done "every now and then", or "soon, but not necessarily right now".
Advantages
You don't need to write a separate application or set up SQL Server Agent.
You can use any criteria you can program to decide whether to run a task or not: immediately, once a certain time has passed, or only if the server is not under heavy load.
If the scheduled tasks are ones like optimising indices, then you can do them less frequently when they are less necessary (e.g. when updates are rare), and more frequently when updates are common.
Disadvantages
You might need to modify multiple applications to cooperate correctly.
You need to ensure that the queue doesn't build up too much.
You can't reliably ensure that a task runs before a certain time.
You might have long periods where you get no requests (e.g. at night) where deferred/scheduled tasks could get done, but don't. You could combine it with one of the other ideas, having a special program that just does the jobs in the queue, but you could just not bother with the queue at all.
You can't really rely on external processes. All 'OS' based solutions I've seen failed to deliver in the real world: a database is way more than just the data, primarily because of the backup/restore strategy, the high availability strategy, the disaster recoverability strategy and all the other 'ities' you pay for in your SQL Server license. An OS scheduler based will be an external component completely unaware and unintegrated with any of them. Ie. you cannot back/restore your schedule with your data, it will not fail over with your database, you cannot ship it to a remote disaster recovery site through your SQL data shipping channel.
If you have Agent (ie. not Express edition) then use Agent. Has a long history of use and the know how around it is significant. The only problem with Agent is its dependence on msdb that makes it disconnect from the application database and thus does not play well with mirroring based availability and recoverability solutions.
For Express editions (ie. no Agent) the best option is to roll your own scheduler based on conversation timers (at least in SQL 2k5 and forward). You use conversations to chedule yourself messages at the desired moment and rely on activated procedures to run the tasks. They are transactional and integrated with your database, so you can rely on them being there after a restore and after a mirroring or clustering fail over. Unfortunately the know how around how to use them is fairly skim, I have several articles about the subject on my site rusanu.com. I've seen systems replicating a fair amount of Agent API on Express relying entirely on conversation timers.
I generally go with the operating systems scheduling method (task scheduler for Windows, cron for Unix).
I deal with multiple database platforms (SQL Server, Oracle, Informix) and want to keep the task scheduling as generic as possible.
Also, in our production environment we have to get a DBA involved for any troubleshooting / restarting of jobs that are running in the database. We have better access to the application servers with the scheduled tasks on them.
I think the best approach for the decision criteria is what the job is. If it's a completely internal SQL Server task or set of tasks that does not relate to the outside world, I would say a SQL Job is the best bet. If on the other hand, you are retrieving data and then doing something with it that is inherently outside SQL Server, very difficult to do in T-SQL or time consuming, perhaps the external service is the best bet.
I'd go with SQL Server Agent. It's well integrated with SQL Server; various SQL Server features use Agent (Log Shipping, for instance). You can create an Agent job to run one or more SSIS packages, for instance.
It's also integrated with operator notification, and can be scripted, or else executed through SMO.
Related
Need some sanity check.
Imagine having 1 SQL Server instance, a beefy system (i.e 48GB of RAM and tons of storage). Obviously there comes a point where it gets hammered in a situation where there are lots of jobs running.
These jobs/DB are part of an external piece of software and cannot be controlled or modified by us directly.
Now, when these jobs run, besides the queries probably being inefficient, do bring the DB down - they become very slow so any "regular" users are having slow responses.
The immediate thing I can think of is replication of some kind where maybe, the "secondary" DB would be the one where these jobs point to and do their hammering, still leaving the primary available and active but would receive any updates from secondary for data consistency/integrity.
Would this be the right thing to do? Ultimately I want the load to be elsewhere but have the primary be aware of updates and update itself without bringing it down or being very slow.
What is this called in MS SQL Server? Does such a thing exist? The jobs will be doing a read-write FYI.
There are numerous approaches to this, all of which are native to SQL Server, but I think you should look into Transactional Replication:
https://learn.microsoft.com/en-us/sql/relational-databases/replication/transactional/transactional-replication?view=sql-server-ver16
It effectively creates a read-only replica based on log shipping that, for reporting purposes, is practically real time.
From the documentation:
"By default, Subscribers to transactional publications should be treated as read-only, because changes are not propagated back to the Publisher. However, transactional replication does offer options that allow updates at the Subscriber."
Your scenario likely has nuances I don't know about, but you can use various flavors of SQL Replication, custom triggers, linked servers, 3-part queries, etc. to fill in the holes.
Simple question: Is there any good reason to add MSMQ to an existing messaging framework which already has multiple BizTalk and SQL Server nodes?
Here's the background: We have a messaging framework to process bills, the load is rather low right now (at most 10,000 a day), but it's ramping up. We use BizTalk and SQL Server for all the processing, and we started noticing a few timeouts when inserting (synchronously) into one of the databases (NOT the BizTalk message box). One of our senior programmers suggested we use MSMQ to save (asynchronously) the data that causes the timeout and process it later; the solution he designed works and it's about to be deployed, but I'm still wondering if that was the right decision, considering that we could have used BizTalk itself or SQL Server Service Broker (SSSB). There's a lot of discussions about those three technologies, but they're usually about having to choose one of them over the others, I haven't seen any case of anyone who already had BizTalk and SSSB and decided to add MSMQ to the mix. In our case I think it's an unnecessary addition to our technology stack, but that may be my own bias (and ignorance too), since I know SSSB better and never did anything big with MSMQ. What do you think?
It sounds like you should figure out why your inserts are taking so long, and fix that instead. 10,000 / day is nothing for a decent box running SQL Server.
EDIT:
Adding any sort of asynchronous processing is a form of kicking the can down the road. Assume your inserts take one minute (I realize they probably don't, but for argument's sake). If you make your inserts asynchronous, you can still only handle 1440 inserts per day until you start falling behind. You are always going to need to speed up your inserts eventually.
Now with that said, I don't think that there is any compelling benefit in this case of using MSMQ over SSSB (or vice-versa). It could be argued that with MSMQ you need to hand-code a listener daemon that does your inserts, whereas with SSSB you have that automatically within the database. On the other hand, with MSMQ you are offloading the storage of the messages to another server, potentially offloading some of the immediate stress from your SQL Server.
I would argue that if you just wanted to take the database calls off-line then you could do that with BizTalk (for example, by creating an "offline" host - thereby creating a new host queue).
Where msmq really excels is on the inbound side of BizTalk. Systems can call to BizTalk not caring about the availability of BizTalk itself. The messages will just hang around until BizTalk is available again.
I'm with Hugh - we've used MSMQ (and IBM MQ Series) successfully with BizTalk for asynchronous, transactional traffic (mostly financial transactions, where the need for traceable, reliable, ACID type message delivery outweighs any need for transaction latency).
We've found the benefits of MSMQ to be:
Transactional delivery - messages can be pulled off by the destination system and inserted into SQL under a 2 phase UOW.
Hugh's point about delivery decoupled from system availability (and you still have the Dead Letter Queue if the target system is down for an unreasonable period of time)
Load balancing / throttling - a destination system can protect against overzealous message delivery by pulling messages off the queue at a more even pace.
Auditing - using the journalling on MSMQ allows an additional layer of tracing.
Also note that there is a WCF adapter for MSMQ - no requirement for custom listeners.
We generally stay away from calling SQL directly from BizTalk:
For reading this equates to polling the database in the hope that there are messages ready to be sent (this can create issues relating to frequency of calling, i.e. redundancy, induced latency, and load on SQL, and contention - e.g. polling while data is being added by an app to the tables. We would rather have each app decide when to submit messages to BizTalk / ESB.
for write operations, unless data is offloaded into a staging area for processing by destination apps, it can lead to much of the 'business' processing moving into BizTalk (i.e. validation, applying business rules etc) - IMHO this is too fine-grained for BizTalk. And as you've found, it can be hard to control the rate of message delivery into SQL (e.g. unless you start using Singleton Orhcestrations etc), which again causes locking / contention issues.
Problem at hand
Need to delete some few thousand records every 10 minutes from a SQL Server database table.This is part of cleanup for older records.
Solutions under consideration
There's .Net Service running for some other functionality. Same service can be used with a timer to execute SQL delete command on db.
SQL server job
Trigger
Key consideration for providing solution
Ours is a web product which gets deployed at different client locations. we want minimal operational overhead as resources doing deployment are very limited technical skill and we also want to make sure that there's less to none configuration requirement for our Product.
Performance is very important, as it on live transactional database.
This sounds like exactly the sort of work that a SQL Server job was intended to provide; database maintenance.
A scheduled job can execute a basic T-SQL statement that will delete the records you don't want any more, on whatever schedule you want it to run on. The job creation can be scripted to be part of your standard deployment scripts, which should negate the deployment costs.
Additionally, by utilizing an established part of SQL Server, you capitalize on the knowledge of other database administrators that will understand SQL jobs and be able to manage them.
I would not use a trigger...and stick with SQL Server DTS or SSIS. Obviously you will need some kind of identifier so I would use a timestamp column with an index...if that's not required just fire off a TRUNCATE once nightly.
The efficiency of the delete comes from indexes, has nothing to do how the timer is triggered. It is very important that the 'old' records be easily identifiable by a range scan. If the DELETE has to scan the whole table to find these 'old' records, it will block all other activity. Usually in such cases the table is clustered by the datetime value first, and unique primary keys are delegated to a non-clustered index, if needed.
Now how to pop the timer, you really have three alternatives:
SQL Agent job
Conversation Timers
Application timer
SQL Agent job is the best option for 10 minute intervals. Only drawback is that it does not work on SQL Express deployments. If that is a concern, then conversation timers and activated procedures are a viable alternative.
Last option has the disadvantage that the application must be running for the timer to trigger deletion. If this is not a concern (ie. if the application is not running, it doesn't matter that the records are not deleted) then is OK. Note that ASP.Net applications are very bad host for such timers, because of the way IIS and ASP may choose to recycle and put to sleep app pools.
I am creating a system where users can setup mailings to go out at specific times. Before I being I wanted to get some advice. First, is there already a .Net component that will handle scheduling jobs (either running another application or calling a URL) that will do what I am suggesting (Open Source would be cool)? If there isn’t, is it better to schedule a job in SQL and run some sort of script, create a .Net service that will look at an xml file or db for schedules, or have an application create scheduled tasks? There could be a ton of tasks, so I am thinking creating scheduled tasks or SQL jobs might not be a good idea.
Here may be a typical scenario; a user wants to send a newsletter to their clients. The user creates the newsletter on a Saturday, but doesn’t want it to go out until Monday. The user wants that same e-mail to go out every Monday for a month.
Thanks for looking!
Check out Quartz.NET
Quartz.NET is a full-featured, open
source job scheduling system that can
be used from smallest apps to large
scale enterprise systems.
If you want to use the readily available services in Windows itself, check out this article A New Task Scheduler Task Library on CodeProject on how to create scheuled tasks in Windows from your C# application.
You probably have more flexibility and power if you use C# and scheduled tasks in Windows, rather than limiting yourself to what can be done in SQL Server. SQL Server Agent Jobs are great - for database specific stuff, mostly - maintenance plans and so forth.
You can build your own windows service that schedules and executes jobs. Be sure to make good abstractions. In a similar project, I have used an abstraction where scheduling items are abstracted as Jobs composed of tasks. For example, sending newsletter may be a job whereas sending newsletter to each subscriber can be considered as a task. Then you need to run the job and tasks in defined threading models preferably using Threadpool threads or Task Parallel Library. Be sure to use asynchronous API for IO whenever possible. Also separate your scheduling logic from the abstractions. so that the scheduling logic can execute arbitrary types of jobs and its inclusive tasks.
The backup and restore process of a large database or collection of databases on sql server is very important for disaster & recovery purposes. However, I have not found a robust solution that will guarantee the whole process is as efficient as possible, 100% reliable and easily maintainable and configurable accross multiple servers.
Microsft's Maintenance Plans doesn't seem to be sufficient. The best solution I have used is one that I created manually using many jobs with many steps per database running on the source server (backup) and destination server (restore). The jobs use stored procedures to do the backup, copying & restoring. This runs once a day (full backup/restore) and intraday every 5 mins (transaction log shipping).
Although my current process works and reports any job failures via email, I know the whole process isn't very reliable and cannot be easily maintained/configured on all our servers by a non-DBA without having in-depth knowledge of the process.
I would like to know if others have this same backup/restore process and how others overcome this issue.
I've used a similar step to keep dev/test/QA databases 'zero-stepped' on a nightly basis for developers and QA folks to use.
Documentation is the key - if you want to remove what Scott Hanselman calls 'bus factor' (i.e. the danger that the creator of the system will get hit by a bus and everything starts to suck).
That said, for normal database backups and disaster recovery plans, I've found that SQL Server Maintenance Plans work out pretty well. As long as you include:
1) Decent documentation
2) Routine testing.
I've outlined some of the ways to go about doing that (for anyone drawn to this question looking for an example of how to go about creating a disaster recovery plan):
SQL Server Backup Best Practices (Free Tutorial/Video)
The key part of your question is the ability for the backup solution to be managed by a non-DBA. Any native SQL Server answer like backup scripts isn't going to meet that need, because backup scripts require T-SQL knowledge.
Because of that, you want to look toward third-party solutions like the ones Mitch Wheat mentioned. I work for Quest (the makers of LiteSpeed) so of course I'm partial to that one - it's easy to show to non-DBAs. Before I left my last company, I had a ten minute session to show the sysadmins and developers how the LiteSpeed console worked, and that was that. They haven't called since.
Another approach is using the same backup software that the rest of your shop uses. TSM, Veritas, Backup Exec and Microsoft DPM all have SQL Server agents that let your Windows admins manage the backup process with varying degrees of ease-of-use. If you really want a non-DBA to manage it, this is probably the most dead-easy way to do it, although you sacrifice a lot of performance that the SQL-specific backup tools give you.
I am doing precisely the same thing and have various issues semi regularly even with this process.
How do you handle the spacing between copying the file from Server A to Server B and restoring the transactional backup on Server B.
Every once in a while the transaction backup is larger than normal and takes a longer time to copy. The restore job then gets an operating system error that the file is in use.
This is not such a big deal since the file is automatically applied the next time around however it would be nicer to have a more elegant solution in general and one that specifically fixes this issue.