How many SQL jobs a sql server can handle? - sql-server

I am creating a database medical system and then I came to a point where I am trying to create a notification feature and i will use SQL jobs in it, where the SQL job responsibility is to check some tables and the entities that will find it need to be notified for a change in certain data will put their ids in an entity called Notification and a trigger will be called for the app to check that table and send the notificiation.
what I want to ask is how many SQL jobs can a sql server handle ?
Does the number of running SQL jobs in background affect the performance of my application or the database performance in a way or another ?
NOTE: the SQL job will run every 10 seconds
I couldn't find any useful information online.
thanks in advance.

This question really doesn't have enough background to get a definitive answer. What are the considerations?
Do the queries in your ten-second job actually complete in ten seconds, even when your DBMS is under its peak transactional workload? Obviously, if the job routinely doesn't complete in ten seconds, you'll get jobs piling up.
Do the queries in your job lock up tables and/or indexes so the transactional load can't run efficiently? (You should use SET ISOLATION LEVEL READ UNCOMMITTED; as much as you can so database reads won't lock things unnecessarily.)
Do the queries in your job do a lot of rows' worth of inserts and updates, and so swamp the SQL Server transaction logs?
How big is your server? (CPU cores? RAM? IO capacity?) How big is your database?
If your project succeeds and you get many users, will your answers to the above questions remain the same? (Hint: no.)
You should spend some time on the execution plans for the queries in your job, and try to make them as efficient as possible. Add the necessary indexes. If necessary refactor the queries to make them more efficient. SSMS will show you the execution plans and suggest appropriate indexes.
If your job is doing things like deleting expired rows, you may want to build the expiration in your data model. For example, suppose your job does
DELETE FROM readings WHERE expiration_date >= GETDATE()
and your application does this, relying on your job to avoid getting expired readings.
SELECT something FROM readings
You can refactor your application query to say
SELECT something FROM readings WHERE expiration_date < GETDATE()
and then run your job overnight, at a quiet time, rather than every ten seconds.
A ten-second job is not the greatest idea in the world. If you can rework your application so it will function correctly with a ten-second, ten-minute, or twelve-hour job, you'll have a more resilient production system. At any rate if something goes wrong with the job when your system is very busy you'll have more than ten seconds to fix it.

Related

Efficient way to delete records every 10 mins

Problem at hand
Need to delete some few thousand records every 10 minutes from a SQL Server database table.This is part of cleanup for older records.
Solutions under consideration
There's .Net Service running for some other functionality. Same service can be used with a timer to execute SQL delete command on db.
SQL server job
Trigger
Key consideration for providing solution
Ours is a web product which gets deployed at different client locations. we want minimal operational overhead as resources doing deployment are very limited technical skill and we also want to make sure that there's less to none configuration requirement for our Product.
Performance is very important, as it on live transactional database.
This sounds like exactly the sort of work that a SQL Server job was intended to provide; database maintenance.
A scheduled job can execute a basic T-SQL statement that will delete the records you don't want any more, on whatever schedule you want it to run on. The job creation can be scripted to be part of your standard deployment scripts, which should negate the deployment costs.
Additionally, by utilizing an established part of SQL Server, you capitalize on the knowledge of other database administrators that will understand SQL jobs and be able to manage them.
I would not use a trigger...and stick with SQL Server DTS or SSIS. Obviously you will need some kind of identifier so I would use a timestamp column with an index...if that's not required just fire off a TRUNCATE once nightly.
The efficiency of the delete comes from indexes, has nothing to do how the timer is triggered. It is very important that the 'old' records be easily identifiable by a range scan. If the DELETE has to scan the whole table to find these 'old' records, it will block all other activity. Usually in such cases the table is clustered by the datetime value first, and unique primary keys are delegated to a non-clustered index, if needed.
Now how to pop the timer, you really have three alternatives:
SQL Agent job
Conversation Timers
Application timer
SQL Agent job is the best option for 10 minute intervals. Only drawback is that it does not work on SQL Express deployments. If that is a concern, then conversation timers and activated procedures are a viable alternative.
Last option has the disadvantage that the application must be running for the timer to trigger deletion. If this is not a concern (ie. if the application is not running, it doesn't matter that the records are not deleted) then is OK. Note that ASP.Net applications are very bad host for such timers, because of the way IIS and ASP may choose to recycle and put to sleep app pools.

Horizontally partitioning data into an "archive" in SQL Server taking months to execute?

There is a project in flight at my organization to move customer data and all the associated records (billing transactions, etc) from one database to another, if the customer has not had account activity within a certain timeframe.
The total number of rows in all the tables is in the millions. Perhaps 100 million rows, with all the various tables combined. The schema is more-or-less normalized. The project's designers have decided on SSIS to execute this and initial analysis is showing 5 months of execution time.
Basically, the process:
Fills an "archive" database that has the same schema as the database of origin
Delete the original rows from the source database
I can provide more detail if necessary. What I'm wondering is, is SSIS the correct approach? Is there some sort of canonical way to move very large quantities of data around? Are there common performance pitfalls to avoid?
I just can't believe that this is going to take months to run and I'd like to know if there's something else that we should be looking into.
SSIS is just a tool. You can write a 100M rows transfer in SSIS to take 24h, you can write it to take 5 mo. The problem is what you write (ie. the workflow in SSIS case), not SSIS.
There isn't anything specific to SSID that would dictate 'the transfer cannot be done faster than 5 mo'.
The guiding principles for such a task (logically partition the data, process each logical partition in parallel, eliminate access and update contention between processing, batch commit changes, don't transfer more data that is necessary on the wire, use set based processing as much as possible, be able to suspend and resume etc etc) can be implemented on SSIS just as well as any other technology (if not better).
For the record, the ETL world speed record stands at about 2TB per hour. Using SSIS. And just as a matter of fact, I just finished a transfer of 130M rows, ~200Gb of data, took some 24h (I'm lazy and not shooting for ETL record).
I would understand 5mo for development, testing and deployment, but not 5mo for actual processing. That is like 7 rows a second, and is realy realy lame.
SSIS is probably not the right choice if you are simply deleting records.
This might be of interest: Performing fast SQL Server delete operations
UPDATE: as Remus correctly points out, SSIS can perform well or badly depending on how the flows are written, and there have been some huge benchmarks (on high end systems). But for just deletes there are simply ways, such as a SQL Agent job running a TSQL delete in batches.

How do I ensure SQL Server replication is running?

I have two SQL Server 2005 instances that are geographically separated. Important databases are replicated from the primary location to the secondary using transactional replication.
I'm looking for a way that I can monitor this replication and be alerted immediately if it fails.
We've had occasions in the past where the network connection between the two instances has gone down for a period of time. Because replication couldn't occur and we didn't know, the transaction log blew out and filled the disk causing an outage on the primary database as well.
My google searching some time ago led to us monitoring the MSrepl_errors table and alerting when there were any entries but this simply doesn't work. The last time replication failed (last night hence the question), errors only hit that table when it was restarted.
Does anyone else monitor replication and how do you do it?
Just a little bit of extra information:
It seems that last night the problem was that the Log Reader Agent died and didn't start up again. I believe this agent is responsible for reading the transaction log and putting records in the distribution database so they can be replicated on the secondary site.
As this agent runs inside SQL Server, we can't simply make sure a process is running in Windows.
We have emails sent to us for Merge Replication failures. I have not used Transactional Replication but I imagine you can set up similar alerts.
The easiest way is to set it up through Replication Monitor.
Go to Replication Monitor and select a particular publication. Then select the Warnings and Agents tab and then configure the particular alert you want to use. In our case it is Replication: Agent Failure.
For this alert, we have the Response set up to Execute a Job that sends an email. The job can also do some work to include details of what failed, etc.
This works well enough for alerting us to the problem so that we can fix it right away.
You could run a regular check that data changes are taking place, though this could be complex depending on your application.
If you have some form of audit train table that is very regularly updated (i.e. our main product has a base audit table that lists all actions that result in data being updated or deleted) then you could query that table on both servers and make sure the result you get back is the same. Something like:
SELECT CHECKSUM_AGG(*)
FROM audit_base
WHERE action_timestamp BETWEEN <time1> AND BETWEEN <time2>
where and are round values to allow for different delays in contacting the databases. For instance, if you are checking at ten past the hour you might check items from the start the last hour to the start of this hour. You now have two small values that you can transmit somewhere and compare. If they are different then something has most likely gone wrong in the replication process - have what-ever pocess does the check/comparison send you a mail and an SMS so you know to check and fix any problem that needs attention.
By using SELECT CHECKSUM_AGG(*) the amount of data for each table is very very small so the bandwidth use of the checks will be insignificant. You just need to make sure your checks are not too expensive in the load that apply to the servers, and that you don't check data that might be part of open replication transactions so might be expected to be different at that moment (hence checking the audit trail a few minutes back in time instead of now in my example) otherwise you'll get too many false alarms.
Depending on your database structure the above might be impractical. For tables that are not insert-only (no updates or deletes) within the timeframe of your check (like an audit-trail as above), working out what can safely be compared while avoiding false alarms is likely to be both complex and expensive if not actually impossible to do reliably.
You could manufacture a rolling insert-only table if you do not already have one, by having a small table (containing just an indexed timestamp column) to which you add one row regularly - this data serves no purpose other than to exist so you can check updates to the table are getting replicated. You can delete data older than your checking window, so the table shouldn't grow large. Only testing one table does not prove that all the other tables are replicating (or any other tables for that matter), but finding an error in this one table would be a good "canery" check (if this table isn't updating in the replica, then the others probably aren't either).
This sort of check has the advantage of being independent of the replication process - you are not waiting for the replication process to record exceptions in logs, you are instead proactively testing some of the actual data.

SQL Server Maintenance Suggestions?

I run an online photography community and it seems that the site draws to a crawl on database access, sometimes hitting timeouts.
I consider myself to be fairly compentent writing SQL queries and designing tables, but am by no means a DBA... hence the problem.
Some background:
My site and SQL server are running on a remote host. I update the ASP.NET code from Visual Studio and the SQL via SQL Server Mgmt. Studio Express. I do not have physical access to the server.
All my stored procs (I think I got them all) are wrapped in transactions.
The main table is only 9400 records at this time. I add 12 new records to this table nightly.
There is a view on this main table that brings together data from several other tables into a single view.
secondary tables are smaller records, but more of them. 70,000 in one, 115,000 in another. These are comments and ratings records for the items in #3.
Indexes are on the most needed fields. And I set them to Auto Recompute Statistics on the big tables.
When the site grinds to a halt, if I run code to clear the transaction log, update statistics, rebuild the main view, as well as rebuild the stored procedure to get the comments, the speed returns. I have to do this manually however.
Sadly, my users get frustrated at these issues and their participation dwindles.
So my question is... in a remote environment, what is the best way to setup and schedule a maintenance plan to keep my SQL db running at its peak???
My gut says you are doing something wrong. It sounds a bit like those stories you hear where some system cannot stay up unless you reboot the server nightly :-)
Something is wrong with your queries, the number of rows you have is almost always irrelevant to performance and your database is very small anyway. I'm not too familiar with SQL server, but I imagine it has some pretty sweet query analysis tools. I also imagine it has a way of logging slow queries.
I really sounds like you have a missing index. Sure you might think you've added the right indexes, but until you verify the are being used, it doesn't matter. Maybe you think you have the right ones, but your queries suggest otherwise.
First, figure out how to log your queries. Odds are very good you've got a killer in there doing some sequential scan that an index would fix.
Second, you might have a bunch of small queries that are killing it instead. For example, you might have some "User" object that hits the database every time you look up a username from a user_id. Look for spots where you are querying the database a hundred times and replace it with a cache--even if that "cache" is nothing more then a private variable that gets wiped at the end of a request.
Bottom line is, I really doubt it is something mis-configured in SQL Server. I mean, if you had to reboot your server every night because the system ground to a halt, would you blame the system or your code? Same deal here... learn the tools provided by SQL Server, I bet they are pretty slick :-)
That all said, once you accept you are doing something wrong, enjoy the process. Nothing, to me, is funner then optimizing slow database queries. It is simply amazing you can take a query with a 10 second runtime and turn it into one with a 50ms runtime with a single, well-placed index.
You do not need to set up your maintenance tasks as a maintenance plan.
Simply create a stored procedure that carries out the maintenance tasks you wish to perform, index rebuilds, statistics updates etc.
Then create a job that calls your stored procedure/s. The job can be configured to run on your desired schedule.
To create a job, use the procedure sp_add_job.
To create a schedule use the procedure sp_add_schedule.
I hope what I have detailed is clear and understandable but feel free to drop me a line if you need further assistance.
Cheers, John

SQL Server Merge Replication Schedule

We're replicating a database between London and Hong Kong using SQL Server 2005 Merge replication. The replication is set to synchronise every one minute and it works just fine. There is however the option to set the synchronisation to be "Continuous". Is there any real difference between replication every one minute and continuously?
The only reason for us doing every one minute rather than continuous in the first place was that it recovered better if the line went down for a few minutes, but this experience was all from SQL Server 2000 so it might not be applicable any more...
We have been trying the continuous replication solution on SQL SERVER 2005 and it appeared to be less efficient than a scheduled solution: as your process is continuous, you will not get all the info related to your passed replications (how many replications failed, how long did the process take, why was the process stopped, how many records were updated, how many database structure modifications were replicated to suscribers, and so on), making the replication follow-up a lot more difficult.
We have also been experiencing troubles while modifying database structure (ALTER TABLE instructions) and/or making bulk updates on one of the databases with continuous replication going on.
Keep you "every minute" synchro as it is and just forget about this "continuous" option.

Resources