T-SQL GO in UPDATE statements - sql-server

I have a single derived field that is populated by a series of update statements, each statement joining to a different table and different fields. It is important that the series of updates execute in a specific order, i.e. a join to table A may produce result X, then a join to table B produces result Y, in which case I want result Y. Normally I just create a series of UPDATE statements in the appropriate order and store them either in a single SSIS SQL container or in a single stored procedure. Is there a best practice regarding using or not using a GO command or BEGIN END between these update statements?

Why do you think consecutive statements would be executed out of order? Do you have specific locking hints on any of the statements (e.g. UPDLOCK, HOLDLOCK, etc.)? Otherwise if you have two consecutive statements, A and B, and A changes something, B will see that change. How that works in SSIS may be different if you have some branching or multi-threading capabilities, but this is not possible in a stored procedure.
Also, GO is not a T-SQL command; it is a batch separator recognized by certain client tools like Management Studio. If you try to put a GO between two statements in a stored procedure, one of two things will happen:
1) The procedure will fail to compile (if the opening BEGIN doesn't have a matching END right before the GO).
2) The procedure will compile (if there is no BEGIN/END wrapper), but it will be shorter than you thought, ending at the first GO rather than where you intended.

Statements are executed in exactly the order that you write them in. You don't need GO or BEGIN...END to ensure ordering, so adding either one does not change the order of execution. They also have nothing to do with transactions.
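As a minimal sketch of the point above, the two updates can simply be written one after the other inside the procedure. All object and column names here (dbo.Target, dbo.TableA, dbo.TableB, DerivedField, ResultValue, KeyA, KeyB, dbo.PopulateDerivedField) are hypothetical and stand in for the asker's real schema.

```sql
-- Hypothetical schema: dbo.Target holds the derived field; dbo.TableA and
-- dbo.TableB are the two lookup sources. None of these names come from the question.
CREATE PROCEDURE dbo.PopulateDerivedField
AS
BEGIN
    SET NOCOUNT ON;

    -- Step 1: the join to table A produces result X.
    UPDATE t
    SET    t.DerivedField = a.ResultValue
    FROM   dbo.Target AS t
    JOIN   dbo.TableA AS a ON a.KeyA = t.KeyA;

    -- Step 2: starts only after step 1 has finished, so rows that also
    -- match table B are overwritten with result Y.
    UPDATE t
    SET    t.DerivedField = b.ResultValue
    FROM   dbo.Target AS t
    JOIN   dbo.TableB AS b ON b.KeyB = t.KeyB;
END;
-- The GO below is read by the client tool (SSMS, sqlcmd), not by SQL Server.
-- It ends the CREATE PROCEDURE batch and must not appear inside the body.
GO
```

The single BEGIN...END pair here only delimits the procedure body; it plays no part in the ordering of the two statements.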

Related

Is it possible to process a for loop in parallel within the same stored procedure?

I have a SQL Server stored procedure where I want to loop through a set of options that I read from a table. Say the table has 100 options. My stored procedure will loop through these options, and for each option I need to do some checks, by querying a few specific tables based on the option, and flag a status related to it.
Is it possible for me to split the for loop such that rows 1-50 are processed in one loop and rows 51-100 in another loop, and run both of these in parallel? I see a way to run multiple stored procedures in parallel through a SQL job or other means, but I am not able to see whether I can get a for loop to execute in parallel by splitting it.
Treating your question as academic, and not considering whether a set-based solution might exist, since there isn't nearly enough information to do that.
No, you can't do this in a single loop (or in two separate loops for that matter) using standard T-SQL, because statements in a T-SQL batch execute synchronously, one after another. Even if you "split" the loop, the second procedure call could not start until the first call finished. They would not run in parallel.
To run two loops in parallel, you would have to introduce some other language. The results of this search turned up quite a few ideas but the first few I looked at had lots of warnings of pitfalls and unexpected results. Up to you if you want to experiment with any of them.

Call stored procedure from SSIS Dataflow

The question in short:
Can I call a stored procedure that has an output parameter in a data flow?
In long:
I have many tables to extract, transform, and load from one db to another one.
Almost all of the tables require one transformation which is fixing the country codes (from 3 letters to two). So my idea is as follows:
for each row: call the stored procedure, pass the wrong country code, replace the wrong code with the correct one (the output of the stored procedure)
There are at least two solutions for this:
1) Lookup component: configure it in advanced mode and make sure the last statement of the stored procedure is the SELECT that returns the good country code (e.g. SELECT @good_country_code).
2) OLE DB Command.
The latter (OLE DB Command) is actually quite simple; you need to configure it with:
EXEC ? = dbo.StoredProc @param1 = ?, @param2 = ?
As a consequence, a @RETURN_VALUE will appear in the Available Destination Columns, which you can then map to an existing column in the pipeline. Remember to create a new pipeline field/column (e.g. Good_Country_Code) using a Derived Column component before the OLE DB Command, and you'll be able to keep both values, or replace the wrong one using another Derived Column component after the OLE DB Command.
No, natively there isn't a component that is going to handle that. You can accomplish it with a Script Component but you don't want to.
What you're describing is a Lookup. The Data Flow Task has a Lookup Component, but you'll be better served, especially for a finite list of values like countries, by pushing your query into the component.
SELECT T.Country3, T.Country2 FROM dbo.Table T;
Then you drag your SourceCountry column and match to Country3. Check Country2 and for all the rows that match, you'll get the 2 letter abbreviation.
A big disadvantage of trying to use your stored procedure is efficiency. The default Lookup is going to cache all those values. With the Script Version, say you have 10k rows come through, all with CAN. That's 10k invocations of your stored procedure where the results never change.
You do pay a startup cost, as the default Lookup mode is Full Cache, which means it's going to run your query and keep all those values local. This is great with your data set: 1000 countries max, 5 or 10 bytes per row. That's nothing.
Yes, you can. You'll want to use a couple of Execute SQL Tasks to do this.
1) Use an Execute SQL Task to gather a Result Set of Wrong_Country_Codes.
2) Add a ForEach Container as a successor to the previous Execute SQL Task. Pass the Result Set to this Container.
3) Inside that ForEach container, you will have another Execute SQL Task that will call your sproc, using each row (e.g. Wrong_Country_Code) as a variable parameter.
That should work. Only select the columns necessary to pass to your stored procedure.
Edit
In acknowledgement of the other answer, performance is going to be an issue. Perhaps rather than have the stored procedure produce an output, alter the sproc to do the updates for you.
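One way the "let the sproc do the updates" idea could look is a single set-based UPDATE instead of one call per row. This is only a sketch under assumptions: it presumes the rows have already been loaded into a staging table in the destination database, and every name in it (dbo.StagingCustomers, dbo.CountryCodes, CountryCode, Country3, Country2) is hypothetical.

```sql
-- Hypothetical tables: dbo.StagingCustomers holds the loaded rows with the
-- 3-letter code in CountryCode; dbo.CountryCodes maps Country3 to Country2.
UPDATE s
SET    s.CountryCode = c.Country2          -- replace the 3-letter code with the 2-letter one
FROM   dbo.StagingCustomers AS s
JOIN   dbo.CountryCodes     AS c
       ON c.Country3 = s.CountryCode;      -- rows without a match are left unchanged
```

Because the join resolves every row in one statement, the mapping is read once rather than once per row, which is the same benefit the cached Lookup gives inside the data flow.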

Export the "functionality" of many stored procedures to script

I have a large number of stored procedures (200+) that all collect clinical data and insert the result into a common table. Each stored procedure accepts the same single parameter, ClientID, and then compiles a list of diagnostic results and inserts them into a master table.
I have each clinical test separated into individual stored procedures. However, as I described in a previous SO question, the execution of the batch of these stored procedures pegs the CPU at 100% and continues on for hours before eventually failing. This leads me to want to create a single script that contains all the functionality of the stored procedures. Why, you ask? Well, because it works. I would prefer to keep the logic in the stored procedures, but until I can figure out why they are so slow, and failing, I need to proceed with the "script" method.
So, what I am looking to do is to take all the stored procedures and find a way to "script" their functionality out to a single SQL script. I can use the "Tasks => Generate Scripts" wizard but the result contains all the Create Procedure and Begin and End functionality that I don't need.
In the versions of Management Studio, etc. that I use, there are options to control whether to script out the "if exists" statements.
If you just want to capture the procs without the create statements, you should be able to roll your own pretty easily using sp_helptext.
For example, I created this proc
create proc dummy (
@var1 int
, @var2 varchar(10)
) as
begin
return 0
end
When I ran sp_helptext dummy, I got pretty much the exact same thing as the output. Comments would also be included.
I don't know of any tool that is going to return the "contents" without the create, as the formal parameters are part of the create or alter statement. Which probably leaves you using perl, python, whatever to copy out the create statement -- you lose the parameters -- though I suppose you could change those into comments.
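As a rough sketch of the "roll your own" approach, the definitions can also be read from the catalog views in one query rather than calling sp_helptext once per procedure. The procedure name dummy is the example from this answer; the column aliases are illustrative.

```sql
-- Show the source text of a single procedure (the example proc from above).
EXEC sp_helptext N'dummy';

-- Return the full definition of every stored procedure in the current database.
-- Each definition still begins with its CREATE PROCEDURE header and parameter
-- list, so that part has to be stripped or commented out afterwards.
SELECT  SCHEMA_NAME(p.schema_id) AS schema_name,
        p.name                   AS procedure_name,
        m.definition
FROM    sys.procedures  AS p
JOIN    sys.sql_modules AS m ON m.object_id = p.object_id
ORDER BY schema_name, procedure_name;
```

The result set can then be passed to whatever scripting language is used to cut out the header of each definition.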

Use transactions for select statements?

I don't use Stored procedures very often and was wondering if it made sense to wrap my select queries in a transaction.
My procedure has three simple select queries, two of which use the returned value of the first.
In a highly concurrent application it could (theoretically) happen that data you've read in the first select is modified before the other selects are executed.
If that is a situation that could occur in your application you should use a transaction to wrap your selects. Make sure you pick the correct isolation level though, not all transaction types guarantee consistent reads.
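A minimal sketch of what that could look like in T-SQL, assuming three hypothetical tables (dbo.Orders, dbo.Customers, dbo.OrderLines) and an illustrative @OrderId value. REPEATABLE READ is used here only as an example of a level that keeps the rows already read from being changed until the transaction ends; it does not block newly inserted rows, for which SERIALIZABLE or SNAPSHOT would be the levels to look at.

```sql
-- Hypothetical tables and columns; @OrderId = 42 is an illustrative value.
DECLARE @OrderId    int = 42;
DECLARE @CustomerId int;

SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;

BEGIN TRANSACTION;

    -- First select: read the value the other two selects depend on.
    -- The shared lock on this row is held until COMMIT.
    SELECT @CustomerId = o.CustomerId
    FROM   dbo.Orders AS o
    WHERE  o.OrderId = @OrderId;

    -- Second and third selects use the value read above.
    SELECT c.* FROM dbo.Customers  AS c WHERE c.CustomerId = @CustomerId;
    SELECT l.* FROM dbo.OrderLines AS l WHERE l.OrderId    = @OrderId;

COMMIT TRANSACTION;
```

Holding locks for the length of the transaction has a concurrency cost, so the transaction should stay as short as the three selects allow.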
Update :
You may also find this article on concurrent update/insert solutions (aka upsert) interesting. It puts several common methods of upsert to the test to see what method actually guarantees data is not modified between a select and the next statement. The results are, well, shocking I'd say.
Transactions are usually used when you have INSERT, UPDATE or DELETE statements and you want atomic behavior, that is, either commit everything or commit nothing.
However, you could also use a transaction around SELECT statements to make sure nobody else can update the table of interest while your group of select queries is executing.
Have a look at this msdn post.
Most databases run every single query in a transaction; even if none is specified, the query is implicitly wrapped in one. This includes select statements.
PostgreSQL actually treats every SQL statement as being executed within a transaction. If you do not issue a BEGIN command, then each individual statement has an implicit BEGIN and (if successful) COMMIT wrapped around it. A group of statements surrounded by BEGIN and COMMIT is sometimes called a transaction block.
https://www.postgresql.org/docs/current/tutorial-transactions.html

grabbing first result set from a stored proc called from another stored proc

I have a SQL Server 2005 stored proc which returns two result sets which are different in schema.
Another stored proc executes it as an Insert-Exec. However I need to insert the first result set, not the last one. What's a way to do this?
I can create a new stored proc which is a copy of the first one which returns just the result set I want but I wanted to know if I can use the existing one which returns two.
Actually, INSERT..EXEC will try to insert BOTH datasets into the table. If the column counts match and the datatype can be implicitly converted, then you will actually get both.
Otherwise, it will always fail because there is no way to only get one of the resultsets.
The solution to this problem is to extract the functionality that you want from the called procedure and incorporate it into the (formerly) calling procedure. And remind yourself while doing it that "SQL is not like client code: redundant code is more acceptable than redundant data".
In case this was not clear above, let me delineate the facts and options available to anyone in this situation:
1) If the two result sets returned are compatible, then you can get both in the same table with the INSERT and try to remove the ones that you do not want.
2) If the two result sets are incompatible then INSERT..EXEC cannot be made to work.
3) You can copy the code out of the called procedure and re-use it in the caller, and deal with the cost of dual-editing maintenance.
4) You can change the called procedure to work more compatibly with your other procedures.
That's it. Those are your choices in T-SQL for this situation. There are some additional tricks that you can play with SQLCLR or client code, but they will involve going about this a little bit differently.
Is there a compelling reason why you can't just have that first sproc return only one result set? As a rule, you should probably avoid having one sproc do both an INSERT and a SELECT (the exception is if the SELECT is to get the newly created row's identity).
To prevent code from getting out of sync between the two processes, why not write a proc that does what you want for the insert, call that in your process, and have the original proc call it to get the first recordset and then do whatever else it needs to do?
Depending on how you get to this select, it is possible it might be refactored to a table-valued function instead of a proc that both processes would call.
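A sketch of that refactoring, assuming the first result set comes from a single SELECT. The names dbo.ClientResults, dbo.GetClientResults, ResultID, ResultValue and #FirstResultSet are hypothetical; the syntax is kept to what SQL Server 2005 accepts.

```sql
-- Hypothetical inline table-valued function holding the SELECT that
-- produces the first result set.
CREATE FUNCTION dbo.GetClientResults (@ClientID int)
RETURNS TABLE
AS
RETURN
(
    SELECT r.ResultID, r.ResultValue
    FROM   dbo.ClientResults AS r
    WHERE  r.ClientID = @ClientID
);
GO

-- The calling procedure can now insert from the function directly,
-- with no INSERT..EXEC and no second result set to work around.
CREATE TABLE #FirstResultSet (ResultID int, ResultValue varchar(50));

INSERT INTO #FirstResultSet (ResultID, ResultValue)
SELECT f.ResultID, f.ResultValue
FROM   dbo.GetClientResults(1) AS f;
```

The original procedure would select from the same function for its first result set, so the query exists in only one place.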

Resources