WAITFOR command - sql-server

Given the problem: a stored procedure on SQL Server 2005, which loops through a cursor, must be run once an hour. It takes about 5 minutes to run, but it takes up a large chunk of processor time while it does.
Edit: I'd remove the cursor if I could; unfortunately, I have to do a bunch of processing and run other stored procs/queries based on each row.
Can I use
WAITFOR DELAY '0:0:0.1'
before each fetch to act as SQL's version of .NET's Thread.Sleep, thus allowing the other processes to complete faster at the cost of this procedure's execution time?
Or is there another solution I'm not seeing?
Thanks

Putting the WAITFOR inside the loop would indeed slow it down and allow other things to go faster. You might also consider a WHILE loop instead of a cursor - in my experience it runs faster. You might also consider moving your cursor to a fast-forward, read-only cursor - that can limit how much memory it takes up.
declare @minid int, @maxid int, @somevalue int
select @minid = 1, @maxid = 5
while @minid <= @maxid
begin
    set @somevalue = null
    select @somevalue = somefield from sometable where id = @minid
    print @somevalue
    set @minid = @minid + 1
    waitfor delay '00:00:00.1'
end

I'm not sure that would solve the problem. IMHO the performance problem with cursors comes from the memory used to keep the dataset resident while you loop through it; if you then add a WAITFOR inside the loop, you're hogging those resources for longer.
But I may be wrong here. What I would suggest is to use perfmon to check the server's performance under both conditions, and then decide whether adding the wait is worth it.
Looking at the tag, I'm assuming you're using MS SQL Server, and not any of the other flavours.

You could delay the procedure, but that might or might not help you. It depends on how the procedure works: is it in a transaction, why a cursor (horribly inefficient in SQL Server), where is the slowdown, etc.? Perhaps reworking the procedure would make more sense.

Ever since SQL 2005 included windowing functions and other neat features, I've been able to eliminate cursors in almost all instances. Perhaps your problem would best be served by eliminating the cursor itself?
Definitely check out Ranking functions http://msdn.microsoft.com/en-us/library/ms189798.aspx and Aggregate window functions http://msdn.microsoft.com/en-us/library/ms189461.aspx
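For instance, here is a minimal, hypothetical sketch (table and column names are made up for illustration) of replacing a "latest row per group" cursor with ROW_NUMBER, which works from SQL Server 2005 onward:

WITH Ranked AS
(
    SELECT  GroupId,
            Amount,
            CreatedDate,
            ROW_NUMBER() OVER (PARTITION BY GroupId
                               ORDER BY CreatedDate DESC) AS rn
    FROM    dbo.SomeTable            -- hypothetical table
)
SELECT GroupId, Amount, CreatedDate
FROM   Ranked
WHERE  rn = 1;                       -- one pass, no cursor, no per-row fetches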

I'm guessing that whatever code you have means that the other processes can't access the table your cursor is derived from.
Provided that you make the cursor READ_ONLY FAST_FORWARD, you should not lock the tables the cursor is derived from.
If, however, you need to write, then WAITFOR wouldn't help. Once you've locked the table, it's locked.
An option would be to snapshot the tables into a temp table, then cursor/loop through that instead. You would then not be locking the underlying tables, but equally the tables could change while you're processing the snapshot...
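A rough sketch of that approach, reusing the loop idea from above (the #work table and column names are just placeholders):

-- Snapshot the rows to process so the base tables aren't held while looping
SELECT id, somefield
INTO   #work
FROM   sometable;

DECLARE @id int, @somevalue int;

SELECT @id = MIN(id) FROM #work;
WHILE @id IS NOT NULL
BEGIN
    SELECT @somevalue = somefield FROM #work WHERE id = @id;

    -- per-row processing / calls to other procs go here

    WAITFOR DELAY '00:00:00.1';      -- optional throttle, as discussed above
    SELECT @id = MIN(id) FROM #work WHERE id > @id;
END

DROP TABLE #work;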
Dems

Related

Execution Plan - Hash vs. Merge in sql vs. Stored Procedure

I've got a problem with a terribly performing stored procedure. The odd part is that if I run the procedure it takes hours. If I run the contents of the procedure as a batch in ssms, it runs in a reasonable amount of time. I have narrowed the problem to a single statement within the proc.
My first thought was a bad query plan cache. However adding WITH RECOMPILE to either the proc, or OPTION(RECOMPILE) to the offending statement within the proc made no difference.
So I captured the (actual) execution plan from both exec-ing the procedure and running the statements directly and found this difference:
The slow stored procedure version has a <Merge ManyToMany="True"> element in the xml whereas the plain sql version has a <Hash> element.
I don't think I know enough about execution plans to determine why it would choose one or another.
Both versions were run on the same data, e.g.:
BEGIN TRANSACTION;
exec myproc; --capture plan
ROLLBACK TRANSACTION;
BEGIN TRANSACTION;
-- SQL statements from the procedure go here; capture 2nd plan
ROLLBACK TRANSACTION;
What sorts of things can influence the plan within a procedure that would be different when executing directly from ssms? Does anyone have any suggestions on how to narrow this down further?
I don't know how much help the particular query is here, but it's a MERGE statement:
MERGE schema.UpdatableView FORUPDATE
USING
(
    -- large select statement that's not part of the problem
) DATA
ON DATA.field = FORUPDATE.field
WHEN MATCHED THEN      -- 50% of the cost is here
    UPDATE SET
        -- LOTS of field updates
WHEN NOT MATCHED THEN  -- other 50% is here
    INSERT (FIELDS)
    VALUES (FIELDS)
OPTION (RECOMPILE);
The updatable views may be part of the problem, but SQL Profiler doesn't seem to think so. The underlying INSERT and UPDATE triggers on the view don't start until after the statement has been running for a few hours, and they complete in a reasonable amount of time.
This is usually due to different runtime settings like ANSI_NULLS and QUOTED_IDENTIFIER. I suggest you recreate the stored procedure and the views in the same SSMS tab (same session) that you use to test the query. This will make sure that both use the same settings. I think you will notice that both use the same plan.
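If you want to verify this, a hedged way of comparing the SET options recorded with each cached plan (the set_options attribute is a bitmask; 'myproc' stands in for the procedure from the example above):

SELECT  t.text,
        pa.attribute,
        pa.value
FROM    sys.dm_exec_cached_plans p
CROSS APPLY sys.dm_exec_sql_text(p.plan_handle) t
CROSS APPLY sys.dm_exec_plan_attributes(p.plan_handle) pa
WHERE   pa.attribute = 'set_options'
AND     t.text LIKE '%myproc%';   -- filter to the proc/statement you are comparing

If the two plans show different set_options values, that confirms the settings mismatch.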

What can I do to improve performance of my pure User Defined Function in SQL Server?

I have made a simple, but relatively computationally complex, UDF that queries a rarely changing table. In typical usage this function is called many, many times from WHERE clauses over a very small domain of parameters.
What can I do to make my usage of the UDF faster? My thought is that there should be some way to tell SQL Server that my function returns the same result for the same parameters and thus should be memoized. There doesn't seem to be a way to do it within the UDF, because UDFs are required to be pure and thus can't write to a temp table.
For completeness my UDF is below, though I am seeking a general answer on how to make calling UDFs on small domains faster, and not how to optimize this particular UDF.
CREATE function [dbo].[WorkDay] (
    @inputDate datetime,
    @offset int)
returns datetime as begin
    declare @result datetime
    set @result = @inputDate
    while @offset != 0
    begin
        set @result = dateadd( day, sign(@offset), @result )
        while ( DATEPART(weekday, @result ) not between 2 and 6 )
            or @result in (select date from myDB.dbo.holidays
                           where calendar = 'US' and date = @result)
        begin
            set @result = dateadd( day, sign(@offset), @result )
        end
        set @offset = @offset - sign(@offset)
    end
    return @result
END
My first thought here is -- what's the performance problem? Sure, you have a loop (once per row to apply the WHERE) within which another loop runs a query. But are you getting poor execution plans? Are your result sets huge? But let's turn to the generic question: how does one solve this problem? SQL doesn't really do memoization (as the illustrious @Martin_Smith points out). So what's a boy to do?
Option 1 - New Design
Create an entirely new design. In this specific case @Aaron_Bertrand points out that a calendar table may meet your needs. Quite right. This doesn't really help you with non-calendar situations, but as is often the case in SQL, you need to think a bit differently.
Option 2 - Call the UDF Less
Narrow the set of items that call this function. This reminds me a lot of how to do successful paging/row counting. Generate a small result set that has the distinct values required and then call your UDF so it is only called a few times. This may or may not be an option, but can work in many scenarios.
Option 3 - Dynamic UDF
I'll probably get booed out of the room for this suggestion, but here goes. What makes this UDF slow is the select statement inside the loop. If your Holiday table really changes infrequently, you could put a trigger on the table. The trigger would write out an updated UDF. The new UDF could brute-force all the holiday decisions. Would it be a bit like cannibalism, with SQL writing SQL? Sure. But it would get rid of the sub-query and speed the UDF up. Let the heckling begin.
Option 4 - Memoize It!
While SQL can't directly memoize, we do have SQL CLR. Convert the UDF to a SQL CLR UDF. In CLR you get to use static variables. You could easily grab the Holidays table at some regular interval and store it in a hashtable. Then just rewrite your loop in the CLR. You could even go further and memoize the entire answer if that's appropriate logic.
Update:
Option 1 - I was really trying to focus on the general here, not the example function you used above. However, the current design of your UDF allows for multiple calls to the Holiday table if you happen to hit a few holidays in a row. Using some sort of calendar-style table that contains a list of 'bad days' and the corresponding 'next business day' will allow you to remove the potential for multiple hits & queries.
Option 3 - While the domain is unknown ahead of time you could very well modify your holiday table. For a given holiday day it would contain the next corresponding work day. From this data you could spit out a UDF with a long case statement (when '5/5/2012' then '5/14/2012' or something similar) at the bottom. This strategy may not work for every type of problem, but could work well for some types of problems.
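A rough, purely illustrative sketch of what such a generated function might look like (the name, dates, and shape are all hypothetical):

-- Hypothetical output of the trigger: every holiday lookup becomes a CASE branch
CREATE FUNCTION dbo.NextWorkDay (@inputDate datetime)
RETURNS datetime AS
BEGIN
    RETURN CASE @inputDate
               WHEN '20120505' THEN '20120514'   -- branches generated from the holiday table
               WHEN '20120704' THEN '20120705'
               ELSE @inputDate
           END;
END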
Option 4 - There are implications to every technology. CLR needs to be deployed, the SQL Server configuration modified and SQL CLR is limited to the 3.5 framework. Personally, I've found these adjustments easy enough, but your situation may be different (say a recalcitrant DBA, or restrictions on modifications to production servers).
Using static variables requires the assemblies be granted FULL TRUST. You'll have to make sure you get your locking correct.
There is some evidence that at very high transaction levels CLR doesn't perform as well as direct SQL. In your scenario, however, this observation might not be applicable, because there isn't a direct SQL equivalent for what you're trying to do (memoization).
You could write to a real table keyed off your params, select from that first, and if that comes up empty, calculate the value and insert it into the table, doing your own caching.
It might make more sense to pre-fill a table with all possible values for the date range you are interested in and then just join to that. You are then only doing the calculation once for each combination of params and letting SQL handle the join.
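For instance, a hedged sketch of that pre-fill approach using the WorkDay function above (dbo.DateRange and dbo.WorkDayLookup are made-up names):

-- Pre-compute WorkDay(date, offset) once for the dates and offsets you actually use
CREATE TABLE dbo.WorkDayLookup
(
    InputDate  datetime NOT NULL,
    Offset     int      NOT NULL,
    ResultDate datetime NOT NULL,
    PRIMARY KEY (InputDate, Offset)
);

INSERT INTO dbo.WorkDayLookup (InputDate, Offset, ResultDate)
SELECT d.InputDate, o.Offset, dbo.WorkDay(d.InputDate, o.Offset)
FROM   dbo.DateRange d                           -- hypothetical table of candidate dates
CROSS JOIN (SELECT 1 AS Offset UNION ALL
            SELECT 2 UNION ALL
            SELECT 3) AS o;                      -- the offsets you care about

-- Queries then join to the lookup instead of calling the UDF per row:
-- SELECT ... FROM SomeTable s
-- JOIN dbo.WorkDayLookup w ON w.InputDate = s.SomeDate AND w.Offset = 3;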

condition for creating a prepared statement using cfqueryparam?

Does a cfquery become a prepared statement as long as there's at least one cfqueryparam? Or are there other conditions?
What happens when the ORDER BY clause or FROM clause is dynamic? Does every unique combination become a prepared statement?
And what happens when we're doing a cfloop with INSERT, with every value cfqueryparam'ed, and invoke the cfquery with a different number of iterations?
Any potential problems with too many prepared statements?
How does the DB handle prepared statements? Will they be converted into something similar to stored procedures?
Under what circumstances should we not use prepared statements?
Thank you!
I can answer some parts of your question:
A query will become a preparedStatement as long as there is at least one <cfqueryparam>. I have in the past added a
where 1 = <cfqueryparam value="1" />
to queries which didn't have any dynamic parameters, in order to get them to run as preparedStatements.
Most DBs handle preparedStatements similarly to stored procedures, just held temporarily rather than long-term; however, the details are likely to be DB-specific.
Assuming you are using the drivers supplied with ColdFusion, if you turn on the 'Log Activity' checkbox in the advanced panel of the DataSource setup, you'll get very detailed information about how CF is interacting with the DB and when it is creating a new preparedStatement and when it is re-using them. I'd recommend trying this out for yourself, as so many factors are involved (DB setup, driver, CF version, etc.). If you do use the DB logging, re-start CF before running your test code, so you can see it creating the prepared statements; otherwise you'll just see it re-using statements by ID, without seeing what those statements are.
In addition, if you are asking about execution plans then there is more involved than just the number of PreparedStatements generated. It is a huge topic and very database dependent. I do not have a DBA's grasp on it, but I can answer a few of the questions about MS SQL.
What happen when the ORDER BY clause or FROM clause is dynamic? Would
every unique combination becomes a prepared statement?
The base sql is different. So you will end up with separate execution plans for each unique ORDER BY clause.
And what happen when we're doing cfloop with INSERT, with every value
cfqueryparam'ed, and invoke the cfquery with different number of
iterations?
MS SQL should reuse the same plan for all iterations because only the parameters change.
The sys.dm_exec_cached_plans view is very useful for seeing what plans are cached and how often they are reused.
SELECT p.usecounts, p.cacheobjtype, p.objtype, t.text
FROM sys.dm_exec_cached_plans p
CROSS APPLY sys.dm_exec_sql_text( p.plan_handle) t
ORDER BY p.usecounts DESC
To clear the cache first, use DBCC FLUSHPROCINDB. Obviously do not use it on a production server.
DECLARE @ID int
SET @ID = DB_ID(N'YourTestDatabaseName')
DBCC FLUSHPROCINDB( @ID )

SQL Server lock/hang issue

I'm using SQL Server 2008 on Windows Server 2008 R2, all sp'd up.
I'm getting occasional issues with SQL Server hanging with the CPU usage on 100% on our live server. It seems all the wait time on SQL Server when this happens is given to SOS_SCHEDULER_YIELD.
Here is the Stored Proc that causes the hang. I've added the "WITH (NOLOCK)" in an attempt to fix what seems to be a locking issue.
ALTER PROCEDURE [dbo].[MostPopularRead]
AS
BEGIN
SET NOCOUNT ON;
SELECT
c.ForeignId , ct.ContentSource as ContentSource
, sum(ch.HitCount * hw.Weight) as Popularity
, (sum(ch.HitCount * hw.Weight) * 100) / @Total as Percent
, @Total as TotalHits
from
ContentHit ch WITH (NOLOCK)
join [Content] c WITH (NOLOCK) on ch.ContentId = c.ContentId
join HitWeight hw WITH (NOLOCK) on ch.HitWeightId = hw.HitWeightId
join ContentType ct WITH (NOLOCK) on c.ContentTypeId = ct.ContentTypeId
where
ch.CreatedDate between @Then and @Now
group by
c.ForeignId , ct.ContentSource
order by
sum(ch.HitCount * hw.HitWeightMultiplier) desc
END
The stored proc reads from the table "ContentHit", which is a table that tracks when content on the site is clicked (it gets hit quite frequently - anything from 4 to 20 hits a minute). So its pretty clear that this table is the source of the problem. There is a stored proc that is called to add hit tracks to the ContentHit table, its pretty trivial, it just builds up a string from the params passed in, which involves a few selects from some lookup tables, followed by the main insert:
BEGIN TRAN
insert into [ContentHit]
(ContentId, HitCount, HitWeightId, ContentHitComment)
values
(@ContentId, isnull(@HitCount,1), isnull(@HitWeightId,1), @ContentHitComment)
COMMIT TRAN
The ContentHit table has a clustered index on its ID column, and I've added another index on CreatedDate since that is used in the select.
When I profile the issue, I see the Stored proc executes for exactly 30 seconds, then the SQL timeout exception occurs. If it makes a difference the web application using it is ASP.NET, and I'm using Subsonic (3) to execute these stored procs.
Can someone please advise how best I can solve this problem? I don't care about reading dirty data...
EDIT:
The MostPopularRead stored proc is called very infrequently - it's called on the home page of the site, but the results are cached for a day. The pattern of events I am seeing is: when I clear the cache, multiple requests come in for the home page, and they all hit the stored proc because it hasn't yet been cached. SQL Server then maxes out, and the situation can only be resolved by restarting the SQL Server process. When I do this, the proc will usually execute OK (in about 200 ms) and put the data back in the cache.
EDIT 2:
I've checked the execution plan, and the query looks quite sound. As I said earlier when it does run it only takes around 200ms to execute. I've added MAXDOP 1 to the select statement to force it to use only one CPU core, but I still see the issue. When I look at the wait times I see that XE_DISPATCHER_WAIT, ONDEMAND_TASK_QUEUE, BROKER_TRANSMITTER, KSOURCE_WAKEUP and BROKER_EVENTHANDLER are taking up a massive amount of wait time.
EDIT 3:
I previously thought that this was related to Subsonic, our ORM, but having switched to ADO.NET, the error still occurs.
The issue is likely concurrency, not locking. SOS_SCHEDULER_YIELD occurs when a task voluntarily yields the scheduler for other tasks to execute. During this wait the task is waiting for its quantum to be renewed.
How often is [MostPopularRead] SP called and how long does it take to execute?
The aggregation in your query might be rather CPU-intensive, especially if there is a lot of data and/or ineffective indexes. So, you might end up with high CPU pressure - basically, the demand for CPU time is too high.
I'd consider the following:
Check what other queries are executing while the CPU is 100% busy. Look at sys.dm_os_waiting_tasks, sys.dm_os_tasks, sys.dm_exec_requests.
Look at the query plan of [MostPopularRead], try to optimize the query. Quite often an ineffective query is the root cause of a performance problem, and query optimization is much more straightforward than other performance improvement techniques.
If the query plan is parallel and the query is often called by multiple clients simultaneously, forcing a single-thread plan with MAXDOP=1 hint might help (abundant use of parallel plans is usually indicated by SOS_SCHEDULER_YIELD and CXPACKET waits).
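For reference, a minimal sketch of what that looks like on the statement from the question (query abbreviated; @Then and @Now are the proc's existing variables):

SELECT   c.ForeignId, ct.ContentSource,
         sum(ch.HitCount * hw.Weight) as Popularity
FROM     ContentHit ch
JOIN     [Content] c    ON ch.ContentId = c.ContentId
JOIN     HitWeight hw   ON ch.HitWeightId = hw.HitWeightId
JOIN     ContentType ct ON c.ContentTypeId = ct.ContentTypeId
WHERE    ch.CreatedDate between @Then and @Now
GROUP BY c.ForeignId, ct.ContentSource
OPTION   (MAXDOP 1);    -- force a serial plan for this statement only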
Also, have a look at this paper: Performance tuning with wait statistics. It gives a pretty good summary of different wait types and their impact on performance.
P.S. It is easier to use SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED before a query instead of adding (nolock) to each table.
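For example, a minimal sketch (the SELECT is just a stand-in for the real query):

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

SELECT count(*) FROM ContentHit;   -- dirty reads for the whole session, no per-table (NOLOCK) needed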
Remove the NOLOCK hint.
Open a query window in SSMS, run SET STATISTICS IO ON and run the query from the procedure. Let it finish and post the IO stats messages here. Then post the table definitions and all indexes defined on them. Then somebody will be able to reply with the proper indexes you need.
As with all SQL performance problem, the text of the query is largely irrelevant without complete schema definition.
A guesstimate covering index would be:
create index ContentHitCreatedDate
on ContentHit (CreatedDate)
include (HitCount, ContentId, HitWeightId);
Update
XE_DISPATCHER_WAIT, ONDEMAND_TASK_QUEUE, BROKER_TRANSMITTER, KSOURCE_WAKEUP and BROKER_EVENTHANDLER: you can safely ignore all these waits. They show up because they represent threads parked and waiting to dispatch XEvents, Service Broker or internal SQL thread pool work items. As they spend most of their time parked and waiting, they get accounted for unrealistic wait times. Ignore them.
If you believe ContentHit to be the source of your problem, you could add a Covering Index
CREATE INDEX IX_CONTENTHIT_CONTENTID_HITWEIGHTID_HITCOUNT
ON dbo.ContentHit (ContentID, HitWeightID, HitCount)
Take a look at the Query Plan if you want to be certain about the bottleneck in your query.
By default, SQL Server uses all cores/CPUs for all queries (the 'max degree of parallelism' setting under the advanced server properties; DoP = Degree of Parallelism), which can lead to 100% CPU even if only one core is actually waiting for some I/O.
If you search the net or this site you will find resources explaining it better than I can (such as monitoring your I/O even though you appear to have a CPU-bound problem).
On one server we couldn't change the application that had a bad query locking down all resources (CPU), but by setting DoP to half the number of cores we managed to keep the server from grinding to a halt. The effect of the queries being less parallel was negligible in our case.
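A hedged sketch of that server-level change (the value 4 is just an example for an 8-core box; requires 'show advanced options'):

EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max degree of parallelism', 4;   -- half the cores in this example
RECONFIGURE;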
--
Dom
Thanks to all who posted, I got some great SQL Server perf tuning tips.
In the end we ran out of time to resolve this mystery - we found a more efficient way to collect this information and cache it in the database, so that solved the problem for us.

Why is it considered bad practice to use cursors in SQL Server?

I knew of some performance reasons back in the SQL 7 days, but do the same issues still exist in SQL Server 2005? If I have a resultset in a stored procedure that I want to act upon individually, are cursors still a bad choice? If so, why?
Because cursors take up memory and create locks.
What you are really doing is attempting to force set-based technology into non-set based functionality. And, in all fairness, I should point out that cursors do have a use, but they are frowned upon because many folks who are not used to using set-based solutions use cursors instead of figuring out the set-based solution.
But, when you open a cursor, you are basically loading those rows into memory and locking them, creating potential blocks. Then, as you cycle through the cursor, you are making changes to other tables and still keeping all of the memory and locks of the cursor open.
All of which has the potential to cause performance issues for other users.
So, as a general rule, cursors are frowned upon. Especially if that's the first solution arrived at in solving a problem.
The above comments about SQL being a set-based environment are all true. However, there are times when row-by-row operations are useful. Consider a combination of metadata and dynamic SQL.
As a very simple example, say I have 100+ records in a table that define the names of tables that I want to copy/truncate/whatever. Which is best? Hardcoding the SQL to do what I need? Or iterating through this resultset and using dynamic SQL (sp_executesql) to perform the operations?
There is no way to achieve the above objective using set-based SQL.
So, to use cursors or a while loop (pseudo-cursors)?
SQL Cursors are fine as long as you use the correct options:
INSENSITIVE will make a temporary copy of your result set (saving you from having to do this yourself for your pseudo-cursor).
READ_ONLY will make sure no locks are held on the underlying result set. Changes in the underlying result set will be reflected in subsequent fetches (same as if getting TOP 1 from your pseudo-cursor).
FAST_FORWARD will create an optimised forward-only, read-only cursor.
Read about the available options before ruling all cursors as evil.
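For what it's worth, a minimal, hypothetical sketch of a lightweight cursor loop (FAST_FORWARD implies forward-only, read-only; table and column names are made up):

DECLARE @id int;

DECLARE cur CURSOR LOCAL FAST_FORWARD FOR
    SELECT id FROM dbo.SomeTable;

OPEN cur;
FETCH NEXT FROM cur INTO @id;
WHILE @@FETCH_STATUS = 0
BEGIN
    -- per-row work goes here
    FETCH NEXT FROM cur INTO @id;
END
CLOSE cur;
DEALLOCATE cur;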
There is a workaround for cursors that I use every time I need one.
I create a table variable with an identity column in it.
Insert all the data I need to work with into it.
Then make a WHILE block with a counter variable and select the data I want from the table variable with a select statement where the identity column matches the counter.
This way I don't lock anything and use a lot less memory, and it's safe: I will not lose anything to a memory corruption or something like that.
And the block of code is easy to read and maintain.
This is a simple example:
DECLARE @TAB TABLE(ID INT IDENTITY, COLUMN1 VARCHAR(10), COLUMN2 VARCHAR(10))
DECLARE @COUNT INT,
        @MAX INT,
        @CONCAT VARCHAR(MAX),
        @COLUMN1 VARCHAR(10),
        @COLUMN2 VARCHAR(10)
SET @COUNT = 1
INSERT INTO @TAB VALUES('TE1S', 'TE21')
INSERT INTO @TAB VALUES('TE1S', 'TE22')
INSERT INTO @TAB VALUES('TE1S', 'TE23')
INSERT INTO @TAB VALUES('TE1S', 'TE24')
INSERT INTO @TAB VALUES('TE1S', 'TE25')
SELECT @MAX = @@IDENTITY
WHILE @COUNT <= @MAX BEGIN
    SELECT @COLUMN1 = COLUMN1, @COLUMN2 = COLUMN2 FROM @TAB WHERE ID = @COUNT
    IF @CONCAT IS NULL BEGIN
        SET @CONCAT = ''
    END ELSE BEGIN
        SET @CONCAT = @CONCAT + ','
    END
    SET @CONCAT = @CONCAT + @COLUMN1 + @COLUMN2
    SET @COUNT = @COUNT + 1
END
SELECT @CONCAT
I think cursors get a bad name because SQL newbies discover them and think "Hey a for loop! I know how to use those!" and then they continue to use them for everything.
If you use them for what they're designed for, I can't find fault with that.
SQL is a set based language--that's what it does best.
I think cursors are still a bad choice unless you understand enough about them to justify their use in limited circumstances.
Another reason I don't like cursors is clarity. The cursor block is so ugly that it's difficult to use in a clear and effective way.
All that having been said, there are some cases where a cursor really is best--they just aren't usually the cases that beginners want to use them for.
Cursors are usually not the disease, but a symptom of it: not using the set-based approach (as mentioned in the other answers).
Not understanding this problem, and simply believing that avoiding the "evil" cursor will solve it, can make things worse.
For example, replacing cursor iteration by other iterative code, such as moving data to temporary tables or table variables, to loop over the rows in a way like:
SELECT * FROM #temptable WHERE Id=@counter
or
SELECT TOP 1 * FROM #temptable WHERE Id>@lastId
Such an approach, as shown in the code of another answer, makes things much worse and doesn't fix the original problem. It's an anti-pattern called cargo cult programming: not knowing WHY something is bad and thus implementing something worse to avoid it! I recently changed such code (using a #temptable and no index on the identity/PK) back to a cursor, and updating slightly more than 10000 rows took only 1 second instead of almost 3 minutes. Still lacking a set-based approach (the cursor being the lesser evil), but it was the best I could do at that moment.
Another symptom of this lack of understanding can be what I sometimes call "one object disease": database applications which handle single objects through data access layers or object-relational mappers. Typically code like:
var items = new List<Item>();
foreach(int oneId in itemIds)
{
items.Add(dataAccess.GetItemById(oneId));
}
instead of
var items = dataAccess.GetItemsByIds(itemIds);
The first will usually flood the database with tons of SELECTs, one round trip for each, especially when object trees/graphs come into play and the infamous SELECT N+1 problem strikes.
This is the application side of not understanding relational databases and set based approach, just the same way cursors are when using procedural database code, like T-SQL or PL/SQL!
Sometimes the nature of the processing you need to perform requires cursors, though for performance reasons it's always better to write the operation(s) using set-based logic if possible.
I wouldn't call it "bad practice" to use cursors, but they do consume more resources on the server (than an equivalent set-based approach) and more often than not they aren't necessary. Given that, my advice would be to consider other options before resorting to a cursor.
There are several types of cursors (forward-only, static, keyset, dynamic). Each one has different performance characteristics and associated overhead. Make sure you use the correct cursor type for your operation. Forward-only is the default.
One argument for using a cursor is when you need to process and update individual rows, especially for a dataset that doesn't have a good unique key. In that case you can use the FOR UPDATE clause when declaring the cursor and process updates with UPDATE ... WHERE CURRENT OF.
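A minimal sketch of that pattern (the table and column names here are hypothetical):

DECLARE @amount decimal(10,2);

DECLARE cur CURSOR LOCAL FOR
    SELECT Amount FROM dbo.SomeQueue
    FOR UPDATE OF Amount;

OPEN cur;
FETCH NEXT FROM cur INTO @amount;
WHILE @@FETCH_STATUS = 0
BEGIN
    UPDATE dbo.SomeQueue
    SET    Amount = @amount * 1.1     -- whatever per-row processing is required
    WHERE CURRENT OF cur;             -- positioned update on the current row

    FETCH NEXT FROM cur INTO @amount;
END
CLOSE cur;
DEALLOCATE cur;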
Note that "server-side" cursors used to be popular (from ODBC and OLE DB), but ADO.NET does not support them, and AFAIK never will.
There are very, very few cases where the use of a cursor is justified. There are almost no cases where it will outperform a relational, set-based query. Sometimes it is easier for a programmer to think in terms of loops, but the use of set logic, for example to update a large number of rows in a table, will result in a solution that is not only many less lines of SQL code, but that runs much faster, often several orders of magnitude faster.
Even the fast-forward cursor in SQL Server 2005 can't compete with set-based queries. The graph of performance degradation often starts to look like an n^2 operation compared to set-based, which tends to stay closer to linear as the data set grows very large.
@Daniel P -> you don't need to use a cursor to do it. You can easily use set-based logic to do it. E.g., with SQL 2008:
DECLARE @commandname NVARCHAR(1000) = '';

SELECT @commandname += 'truncate table ' + tablename + '; '
FROM   tableNames;

EXEC sp_executesql @commandname;
will simply do what you have described above. And you can do the same with SQL 2000, but the syntax of the query would be different.
However, my advice is to avoid cursors as much as possible.
Gayam
Cursors do have their place; however, I think they mainly get a bad name because they are often used when a single select statement would suffice to provide aggregation and filtering of results.
Avoiding cursors allows SQL Server to more fully optimize the performance of the query, very important in larger systems.
The basic issue, I think, is that databases are designed and tuned for set-based operations -- selects, updates, and deletes of large amounts of data in a single quick step based on relations in the data.
In-memory software, on the other hand, is designed for individual operations, so looping over a set of data and potentially performing different operations on each item serially is what it is best at.
Looping is not what the database or storage architecture is designed for, and even in SQL Server 2005, you are not going to get performance anywhere close to what you get if you pull the basic data set out into a custom program and do the looping in memory, using data objects/structures that are as lightweight as possible.
