Is it possible to process a for loop in parallel within the same stored procedure? - sql-server

I have a SQL Server stored procedure where I want to loop through a set of options that I read from a table. Say the table has 100 options. My stored procedure loops through these options, and for each option I need to do some checks: query a few specific tables based on the option and flag a status related to it.
Is it possible for me to split the for loop so that rows 1-50 are processed in one loop and rows 51-100 in another, and to run both of these in parallel? I see ways to run multiple stored procedures in parallel through a SQL job or other means, but I can't see how to get a single for loop to execute in parallel by splitting it.

Treating your question as academic, and not considering whether a set-based solution might exist, since there isn't nearly enough information to do that.
No, you can't do this in a single loop (or in two separate loops, for that matter) using standard T-SQL, because T-SQL is synchronous. Even if you "split" the loop, the second procedure call could not start until the first call finished. They would not run in parallel.
To run two loops in parallel, you would have to introduce some mechanism outside the procedure itself. A search on the topic turns up quite a few ideas, but the first few I looked at came with lots of warnings about pitfalls and unexpected results. It's up to you whether you want to experiment with any of them.
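If you do want to stay inside SQL Server, one common workaround (a sketch only; dbo.ProcessOptions and YourDb are hypothetical names) is to wrap each half of the loop in its own stored procedure and fire each one as a self-deleting SQL Agent job, since sp_start_job returns immediately:

EXEC msdb.dbo.sp_add_job @job_name = N'ProcessOptions_1', @delete_level = 3;  -- 3 = delete job when finished
EXEC msdb.dbo.sp_add_jobstep @job_name = N'ProcessOptions_1', @step_name = N'run',
    @subsystem = N'TSQL', @database_name = N'YourDb',
    @command = N'EXEC dbo.ProcessOptions @FromRow = 1, @ToRow = 50;';  -- hypothetical helper proc
EXEC msdb.dbo.sp_add_jobserver @job_name = N'ProcessOptions_1';  -- register on the local server
EXEC msdb.dbo.sp_start_job @job_name = N'ProcessOptions_1';      -- async: returns immediately

Repeat the same four calls with a second job for rows 51-100 and the two halves run concurrently. The caveat is that you now have to poll for completion and handle errors yourself.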

Related

How to run multiple stored procedures in parallel using ssis?

We have a list of stored procedures (more than 1000) in a table which need to be executed every morning.
The stored procedures do not have any dependency with each other.
We have tried a WHILE loop and a cursor, but they took a lot of time to execute.
We thought of creating a job for each stored procedure and calling them using sp_start_job (sp_start_job is called in an async manner), and that gave us a level of parallelism.
Problems arose as new stored procedures were added and the list became huge:
Sometimes people missed creating the job for a new stored procedure.
The DB got bombarded with a large number of jobs (a manageability issue for the DBA).
Note: the list may be altered any day (stored procedures can be added or removed from the list).
If the SPs run for long, I would categorize the 1000 SPs into 5-10 groups, then create one SSIS package for each group and an Agent Job for each package. Then schedule those jobs at the same time.
There are many ways to achieve this (loops, scripting) and multiple factors that affect it. You can test the different approaches and go with the best one.
Note: performance of the SSIS execution depends on your memory, processor and other hardware.
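Since the procedure list already lives in a table, one way to do the grouping (a sketch, assuming a hypothetical dbo.ProcList table with a ProcName column) is to let NTILE deal the procedures into evenly sized buckets:

SELECT ProcName,
       NTILE(10) OVER (ORDER BY ProcName) AS PackageGroup  -- 10 buckets, one per SSIS package
FROM dbo.ProcList;

Each package then executes only the procedures in its own PackageGroup, so a procedure added to the table automatically lands in a bucket without anyone creating a new job.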
Adding to @Nick.MacDermaid's answer: you can utilize the MaxConcurrentExecutables property of the package to implement custom parallelism. Of course, you would need multiple containers and corresponding stored proc groups.
Parallel Execution in SSIS
MaxConcurrentExecutables, a property of the package. It defines how many tasks (executables) can run simultaneously. It defaults to -1, which is translated to the number of processors plus 2. Please note that if your box has hyperthreading turned on, it is the logical processor rather than the physically present processor that is counted.
Hi, you can use the following piece of code to generate a script that runs all your stored procedures; if you add a new procedure it will automatically be added to the list:
SELECT 'EXEC ' + SPECIFIC_NAME + ';' AS [Command]
FROM INFORMATION_SCHEMA.ROUTINES
WHERE ROUTINE_TYPE = 'PROCEDURE';
After this, you take the result set, put it into a tab-delimited text file, and save the file in a location.
Use this link to import the text into an Execute SQL Task; the first answer works well:
SSIS: How do I pull a SQL statement from a file into a string variable?
Execute the task and it should work. If you need to narrow the list of procedures, you can give them a specific prefix in the name and use that in the WHERE clause.
It will run in serial; sorry, I don't have enough rep to comment yet.
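For example, if the morning procedures all share a naming prefix (usp_Morning here is just a placeholder), the generating query becomes:

SELECT 'EXEC ' + SPECIFIC_NAME + ';' AS [Command]
FROM INFORMATION_SCHEMA.ROUTINES
WHERE ROUTINE_TYPE = 'PROCEDURE'
  AND SPECIFIC_NAME LIKE 'usp_Morning%';  -- only procedures with the agreed prefix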

SSRS - How to execute multiple instances of the same report simultaneously

I've got an SSIS package - the primary function of which is to precalculate some data and invoke a parameterized SSRS report. The SSRS report has multiple datasets that it retrieves through stored procedures. It takes around 2-2.5 seconds to generate.
When I loop through the report within the package, the loop obviously executes one report at a time. To speed up this process, I split up the dataset into two and tried passing each dataset into its own loop container. The problem is that although the loops process simultaneously, the step at which the report is generated (script task) halts the process for the other loop - that is, while one report is generating, the other waits.
Given this, it seems that SSRS locks and only allows one execution at a time. The profiler showed "sp_WriteLockSession" being invoked, and according to this it appears that this is by design. I've also read up on the "NOLOCK" hint, but I'm not sure that's the route I want to go down either.
I'm not sure if I'm approaching this in the right way. Am I missing something? The only other thing I can think of is to create a second report and invoke that instead, but if it's locking due to the underlying datasets, then I'm really not sure what else to do. The datasets are primarily just SELECT statements, with a couple of them inserting one row into a single table at the very end.
I'd appreciate any advice, thanks in advance!

T-SQL GO in UPDATE statements

I have a single derived field that is populated by a series of UPDATE statements, each statement joining to a different table and different fields. It is important that the series of updates executes in a specific order, i.e. a join to table A may produce result X, then a join to table B produces result Y, in which case I want result Y. Normally I just create a series of UPDATE statements in the appropriate order and store them either in a single SSIS SQL container or in a single stored procedure. Is there a best practice regarding using or not using a GO command or BEGIN...END between these update statements?
Why do you think consecutive statements would be executed out of order? Do you have specific locking hints on any of the statements (e.g. UPDLOCK, HOLDLOCK, etc.)? Otherwise if you have two consecutive statements, A and B, and A changes something, B will see that change. How that works in SSIS may be different if you have some branching or multi-threading capabilities, but this is not possible in a stored procedure.
Also GO is not a T-SQL command, it is a batch separator recognized by certain client tools like Management Studio. If you try to put a GO between two statements in a stored procedure, one of two things will happen:
the procedure will fail to compile (if the opening BEGIN doesn't have a matching END right before the GO).
the procedure will compile (if there is no BEGIN/END wrapper), but it will be shorter than you thought, ending at the first GO rather than where you intended.
Statements are executed in exactly the order that you write them in. You don't need GO or BEGIN...END to ensure ordering. For that reason using either of these has no effect. They also have nothing to do with transactions.
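As a quick illustration (table and column names here are hypothetical), a procedure that applies two ordered updates needs neither GO nor BEGIN...END between them; the second statement simply sees, and here overwrites, the first one's result:

CREATE PROCEDURE dbo.RefreshDerivedField
AS
BEGIN
    -- Runs first: the derived field gets result X from table A.
    UPDATE t SET t.Derived = a.X
    FROM dbo.Target AS t JOIN dbo.TableA AS a ON a.Id = t.Id;

    -- Runs second: rows that also match table B end up with result Y, as desired.
    UPDATE t SET t.Derived = b.Y
    FROM dbo.Target AS t JOIN dbo.TableB AS b ON b.Id = t.Id;
END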

SQL Server While Loop return results as they are found

I have a SQL script that I've written that helps me search through a DB schema to find whether certain columns are used (i.e. not null, not a zero-length string, etc.) and what their most popular values are.
I'd really like to be able to return results as they're found in the loop since it could take a while to complete the entire search. Is there a way to return results in such a way that on the VB.NET side it will see the results as they are found when it tries to do SqlDataReader.Read?
Because right now, I'm storing the results in a temp table and returning the temp table at the end.
Thanks!
Not when it's a single SQL script, no - the caller will wait for the full result set before moving on.
However, you could break it into a few steps, like this:
Make the initial lookup, where you list all the columns you might be interested in
Return these results to VB.NET
For each result in this set, run the rest of your process to get the values you're actually interested in, looping through the results from the first step one at a time.
As you receive each collection of data, you can choose to do something with it if you want. Spin off a new thread, for example, to do some additional processing.
So if it's a single T-SQL script, you're stuck running it and returning the results at one time - but if you're able to break it up and execute the loop inside .NET instead of inside SQL, you'll have access to the results at each step.
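A minimal sketch of that first lookup step, assuming you are scanning string columns (the DATA_TYPE filter is just an example) via the standard INFORMATION_SCHEMA views:

SELECT TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE DATA_TYPE IN ('char', 'nchar', 'varchar', 'nvarchar')
ORDER BY TABLE_SCHEMA, TABLE_NAME, ORDINAL_POSITION;

Return this list to VB.NET once, then issue the per-column popularity queries from there, reading each small result set as soon as it arrives.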

SQL Cursor w/Stored Procedure versus Query with UDF

I'm trying to optimize a stored procedure I'm maintaining, and am wondering if anyone can clue me in to the performance benefits/penalties of the options below. For my solution, I basically need to run a conversion program on an image stored in an IMAGE column in a table. The conversion process lives in an external .EXE file. Here are my options:
Pull the results of the target table into a temporary table, and then use a cursor to go over each row in the table and run a stored procedure on the IMAGE column. The stored proc calls out to the .EXE.
Create a UDF that calls the .EXE file, and run a SQL query similar to "select UDFNAME(Image_Col) from TargetTable".
I guess what I'm looking for is an idea of how much overhead would be added by the creation of the cursor, instead of doing it as a set?
Some additional info:
The size of the set in this case is max. 1000
As an answer mentions below, if done as a set with a UDF, will that mean that the external program is opened 1000 times all at once? Or are there optimizations in place for that? Obviously, on a multi-processor system, it may not be a bad thing to have multiple instances of the process running, but 1000 might be a bit much.
Define "set-based" in this context?
If you have 100 rows, will this open up the app 100 times in one shot? I would say test it. And just because you can call an extended proc from a UDF, I would still use a cursor for this, because set-based doesn't matter in this case since you are not manipulating data in the tables directly.
I did a little testing and experimenting, and when done in a UDF, it does indeed process one row at a time - SQL Server doesn't run 100 processes for the 100 rows (I didn't think it would).
However, I still believe that doing this as a UDF instead of as a cursor would be better, because my research tends to show that the extra overhead of having to pull the data out in the cursor would slow things down. It may not make a huge difference, but it might save time versus pulling all of the data out into a temporary table first.
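For reference, a minimal sketch of the cursor approach from option 1 (dbo.ConvertImage is a hypothetical wrapper proc that shells out to the .EXE; #TargetImages is the temp table mentioned above):

DECLARE @Id INT;
DECLARE img_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT Id FROM #TargetImages;  -- temp table pulled from the target table

OPEN img_cursor;
FETCH NEXT FROM img_cursor INTO @Id;
WHILE @@FETCH_STATUS = 0
BEGIN
    EXEC dbo.ConvertImage @Id;  -- hypothetical proc that invokes the external .EXE
    FETCH NEXT FROM img_cursor INTO @Id;
END
CLOSE img_cursor;
DEALLOCATE img_cursor;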
