Execute multiple stored procedures with a single trip to the database - sql-server

I have a lot of legacy data access code, mainly SqlCommand with stored procedure calls, that we use to execute a lot of insert statements into a database.
As long as the SQL Server was on the same machine as the application, performance was acceptable, but now we are trying to move some of the data to SQL Azure.
The problem is that our code calls an SP for every record to insert, which results in quite a few round trips to the database, and when the database is not on the same server this takes some time.
var conn = new SqlConnection("connString");
var cmd = new SqlCommand("spMyStoreProc", conn);
cmd.CommandType = CommandType.StoredProcedure; // required for stored procedure calls
cmd.Parameters.Add("@a", SqlDbType.VarChar, 10);
cmd.Parameters.Add("@b", SqlDbType.Int);
using (conn)
{
    conn.Open();
    foreach (var rec in recordsToInsert)
    {
        cmd.Parameters["@a"].Value = rec.A;
        cmd.Parameters["@b"].Value = rec.B;
        cmd.ExecuteNonQuery();
    }
    conn.Close();
}
I have tried the code above with and without Transactions.
I have also tried to use a "batch" SQL statement to execute several SPs in every trip to the server.
Like this:
var cmd = new SqlCommand { Connection = conn };
cmd.CommandText = "EXEC spMyStoreProc @a='a', @b=2; EXEC spMyStoreProc @a='b', @b=4;";
It greatly improves the performance of the operation, but since I have quite a few SPs, where every SP has about 20-50 params, it gets quite tedious to write this code for all the insert commands in this data access component.
Is this the best way to achieve this, or can I somehow tell ADO.NET that I want to execute my calls as a batch (I haven't found anything suggesting it's possible, but I feel I should at least ask) to avoid the network latency etc. between every single SP call?
If not, does anybody know a good way to achieve this without having to write it "by hand"? Since it's a legacy application, I cannot change the data layer completely.
Are there any applications that can take SqlCommands with parameters and generate the T-SQL they would execute?
Thanks in advance

You should probably have one stored procedure that calls all the other stored procedures - it will probably be the least amount of work. That way, from the code you only call a stored procedure once... so given that they are the same parameters you are passing every time (because your code seems to imply that), you would basically do something like this:
CREATE PROCEDURE sp_RunBatch(@param1, @param2, etc [all the parameters you need])
AS
exec spMyStoreProc @a = 'a'
exec spMyStoreProc2 @b = 'b'
The advantages of this are many, some of which being that it's all centralized, and you can even wrap all of the calls within a transaction, so as not to do dirty inserts (given that they all depend on each other).
Also, if you don't feel like passing 20/30 parameters to each SP, you may want to create a user-defined table type for each set of parameters that you can pass. Then each SP gets 1 or 2 parameters, and the code becomes much simpler and more readable.
EDIT:
This is a good reference for user-defined table types, and for how to pass table-valued parameters to SQL Server: http://msdn.microsoft.com/en-us/library/bb675163.aspx
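To make that concrete, here is a minimal, hypothetical sketch of passing all the records as one table-valued parameter; the type dbo.MyRecordList and the procedure spMyStoreProcBulk are made-up names for illustration:
using System.Data;
using System.Data.SqlClient;

// Assumed to exist on the server (illustrative names):
//   CREATE TYPE dbo.MyRecordList AS TABLE (a varchar(10), b int);
//   CREATE PROCEDURE spMyStoreProcBulk @records dbo.MyRecordList READONLY AS ...
var table = new DataTable();
table.Columns.Add("a", typeof(string));
table.Columns.Add("b", typeof(int));
foreach (var rec in recordsToInsert)
    table.Rows.Add(rec.A, rec.B);

using (var conn = new SqlConnection("connString"))
using (var cmd = new SqlCommand("spMyStoreProcBulk", conn))
{
    cmd.CommandType = CommandType.StoredProcedure;
    var p = cmd.Parameters.AddWithValue("@records", table);
    p.SqlDbType = SqlDbType.Structured;
    p.TypeName = "dbo.MyRecordList"; // must match the server-side table type
    conn.Open();
    cmd.ExecuteNonQuery();           // all records travel in a single round trip
}
The per-record loop then lives inside the procedure body, so the per-insert network hop disappears entirely.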

An alternative to M.R.'s approach would be to send all your parameters as an XML document, then parse the XML document to extract your parameters. This may simplify the interface a bit.
However, I think you were on to something when you discussed the possibility of chaining all the commands in a single string. But instead of building them manually, consider writing an extension method on the SqlCommand object that returns a single string for execution, leveraging the sp_executesql syntax, and execute the entire string in a single pass.
So you would have a loop that looks like this, calling the new ToInlineSql extension method:
var sqlCommand = new StringBuilder();
foreach (var rec in recordsToInsert)
{
    cmd.Parameters["@a"].Value = rec.A;
    cmd.Parameters["@b"].Value = rec.B;
    sqlCommand.Append(cmd.ToInlineSql());
}
// execute sqlCommand.ToString()
The ToInlineSql extension method could look like this (pseudo-code fleshed out a little; you will still have to handle all the data types, parameter lengths, NULLs, and so on):
public static class SqlCmdExt
{
    public static string ToInlineSql(this SqlCommand cmd)
    {
        // Builds one call per command, e.g.:
        // EXEC sp_executesql N'EXEC spMyStoreProc @a = @a, @b = @b',
        //      N'@a VarChar, @b Int', @a = N'a', @b = 2;
        var pass = new StringBuilder();
        var decl = new StringBuilder();
        var vals = new StringBuilder();
        foreach (SqlParameter p in cmd.Parameters)
        {
            if (pass.Length > 0) { pass.Append(", "); decl.Append(", "); vals.Append(", "); }
            pass.Append(p.ParameterName + " = " + p.ParameterName);
            decl.Append(p.ParameterName + " " + p.SqlDbType);  // add lengths (e.g. varchar(10)) as needed
            vals.Append(p.ParameterName + " = " + (p.Value is string s
                ? "N'" + s.Replace("'", "''") + "'"            // quote and escape strings
                : p.Value));                                   // numbers etc. as-is
        }
        return "EXEC sp_executesql N'EXEC " + cmd.CommandText + " " + pass
             + "', N'" + decl + "', " + vals + ";";
    }
}
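Then, hypothetically, the accumulated batch goes to the server in one round trip (assuming conn is the open connection from the question):
using (var batch = new SqlCommand(sqlCommand.ToString(), conn))
{
    batch.ExecuteNonQuery(); // one network round trip for all the calls
}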

Related

I am trying to run multiple query statements created when using the python connector with the same query id

I have created a Python function which creates multiple query statements.
Once it creates the SQL statement, it executes it (one at a time).
Is there any way to bulk-run all the statements at once (assuming I was able to create all the SQL statements and wanted to execute them once all the statements were generated)? I know there is an execute_stream in the Python Connector, but I think this requires a file to be created first. It also appears to me that it runs a single query statement at a time.
Since this question is missing an example of the file, here is some file content that we can work from.
# connection test file for python multiple queries
import snowflake.connector

conn = snowflake.connector.connect(
    user='xxx',
    password='',
    account='xxx',
    warehouse='xxx',
    database='TEST_xxx',
    session_parameters={
        'QUERY_TAG': 'Rachel_test',
    }
)
print(conn.sfqid)
try:
    conn.cursor().execute("CREATE WAREHOUSE IF NOT EXISTS tiny_warehouse_mg")
    conn.cursor().execute("CREATE DATABASE IF NOT EXISTS testdb_mg")
    conn.cursor().execute("USE DATABASE testdb_mg")
    conn.cursor().execute(
        "CREATE OR REPLACE TABLE "
        "test_table(col1 integer, col2 string)")
    conn.cursor().execute(
        "INSERT INTO test_table(col1, col2) VALUES "
        "(123, 'test string1'), "
        "(456, 'test string2')")
except Exception as e:
    conn.rollback()
    raise e
conn.close()
The reference for this question describes a method that works with a file call; the example in the documentation is as follows:
from codecs import open

with open(sqlfile, 'r', encoding='utf-8') as f:
    for cur in con.execute_stream(f):
        for ret in cur:
            print(ret)
Reference to the guide I used
Now, when I ran these they were not perfect, but in practice I was able to execute multiple SQL statements in one connection, just not many at once. Each statement had its own query id. Is it possible to have a .sql file associated with one query id?
Is it possible to have a .sql file associated with one query id?
You can achieve that effect with the QUERY_TAG session parameter. Set the QUERY_TAG to the name of your .SQL file before executing its queries. Access the .SQL file's QUERY_IDs later using the QUERY_TAG field in QUERY_HISTORY().
I believe that even though you generated the .sql file, each statement will still have a unique query id when executed in Snowflake.
If you want to run one SQL statement independently of another, you may try the multiprocessing/multithreading concepts in Python.
The Python and Node.js libraries do not allow multiple statement executions.
I'm not sure about Python, but for Node.js there is this library that extends the original one and adds a method called "ExecutionAll" to it:
snowflake-multisql
You just need to wrap the multiple statements between BEGIN and END.
BEGIN
<statement_1>;
<statement_2>;
END;
With these operators, I was able to execute multiple statements in Node.js.

System.NotSupportedException: Commands with multiple queries cannot have out parameters

I ran into another issue when using a data reader around a sproc with multiple ref cursors coming out: I am getting a NotSupportedException. Unfortunately, while I can see where it is coming from in the Npgsql source code, I am not sure I agree with throwing that exception. The code we have written works with Oracle (both fully managed and managed flavors) and SQL Server. Any help is appreciated in keeping this consistent for an API across some of those key flavors of DBMS out there.
sproc body
CREATE OR REPLACE FUNCTION public.getmultipleresultsets (
v_organizationid integer)
RETURNS Setof refcursor
LANGUAGE 'plpgsql'
AS $BODY$
declare
cv_1 refcursor;
cv_2 refcursor;
BEGIN
open cv_1 for
SELECT a.errorCategoryId, a.name, a.bitFlag
FROM ErrorCategories a
ORDER BY name;
RETURN next cv_1;
open cv_2 for
SELECT *
FROM StgNetworkStats ;
RETURN next cv_2;
END;
$BODY$;
Key reader code that wraps PostgreSQL (the Enterprise Library implementation over Npgsql):
private IDataReader DoExecuteReader(DbCommand command, CommandBehavior cmdBehavior)
{
try
{
var sql = new StringBuilder();
using (var reader = command.ExecuteReader(CommandBehavior.SequentialAccess))
{
while (reader.Read())
{
sql.AppendLine($"FETCH ALL IN \"{ reader.GetString(0) }\";");
}
}
command.CommandText = sql.ToString();
command.CommandType = CommandType.Text;
IDataReader reader2 = command.ExecuteReader(cmdBehavior);
return reader2;
}
catch (Exception)
{
throw;
}
}
The command building code is shown below
Helper.InitializeCommand(cmd, 300, "getmultipleresultsets");
db.AddReturnValueParameter(cmd);
db.AddInParameter(cmd, "organizationId", DbType.Int32, ORGANIZATIONID);
db.AddCursorOutParameter(cmd, "CV_1");
db.AddCursorOutParameter(cmd, "CV_2");
The code that adds the refcursor parameter goes something like this:
public override void AddCursorOutParameter(DbCommand command, string RefCursorName)
{
    NpgsqlParameter parameter = (NpgsqlParameter)CreateParameter(RefCursorName, false);
    parameter.NpgsqlDbType = NpgsqlDbType.Refcursor;
    parameter.NpgsqlValue = DBNull.Value;
    parameter.Direction = ParameterDirection.Output;
    command.Parameters.Add(parameter);
}
Your code above seems to garble the PostgreSQL function together with the .NET client code attempting to read its result.
Regardless, your function is declared to return a set of refcursors - this is not the same as two output parameters; you seem to be confusing the name of the cursor (cursors have names, unlike, say, ints) with the name of the parameter (int parameters do have names).
Please note that PostgreSQL does not actually have output parameters - a function always returns a single table, and that's it. PostgreSQL does have a function syntax with output parameters, but that is only a way to construct the schema of the output table. This is unlike SQL Server, which apparently can return both a table and a set of named output parameters. To facilitate portability, when reading results, if Npgsql sees any NpgsqlParameter with direction out, it will attempt to find a resultset with the name of the parameter and will simply populate the NpgsqlParameter's Value with the first row's value for that column. This practice has zero added value over simply reading the resultset yourself - it's just there for compatibility.
To sum it up, I'd suggest you read the refcursors with your reader and then fetch their results as appropriate.
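For example, a minimal sketch of that reading pattern (the connection string and exact Npgsql version details are assumptions here; refcursors only live inside a transaction, hence the explicit BeginTransaction):
using System.Collections.Generic;
using Npgsql;

using (var con = new NpgsqlConnection("connString"))
{
    con.Open();
    using (var tx = con.BeginTransaction())
    {
        // each row of the SETOF refcursor result is a cursor name
        var cursorNames = new List<string>();
        using (var cmd = new NpgsqlCommand("SELECT public.getmultipleresultsets(@org)", con))
        {
            cmd.Parameters.AddWithValue("org", 1);
            using (var rdr = cmd.ExecuteReader())
                while (rdr.Read())
                    cursorNames.Add(rdr.GetString(0));
        }
        // then fetch each cursor's rows as an ordinary result set
        foreach (var name in cursorNames)
            using (var fetch = new NpgsqlCommand("FETCH ALL IN \"" + name + "\"", con))
            using (var rdr = fetch.ExecuteReader())
                while (rdr.Read())
                {
                    // process this result set's rows
                }
        tx.Commit();
    }
}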

SQL CLR Trigger - get source table

I am creating a DB synchronization engine using SQL CLR Triggers in Microsoft SQL Server 2012. These triggers do not call a stored procedure or function (and thereby have access to the INSERTED and DELETED pseudo-tables but do not have access to @@PROCID).
Differences here, for reference.
This "sync engine" uses mapping tables to determine what the table and field maps are for this sync job. In order to determine the target table and fields (from my mapping table) I need to get the source table name from the trigger itself. I have come across many answers on Stack Overflow and other sites that say that this isn't possible. But, I've found one website that provides a clue:
Potential Solution:
using (SqlConnection lConnection = new SqlConnection(@"context connection=true"))
{
    lConnection.Open(); // the context connection still has to be opened
    SqlCommand cmd = new SqlCommand("SELECT object_name(resource_associated_entity_id) FROM sys.dm_tran_locks WHERE request_session_id = @@spid AND resource_type = 'OBJECT'", lConnection);
    cmd.CommandType = CommandType.Text;
    var obj = cmd.ExecuteScalar();
}
This does in fact return the correct table name.
Question:
My question is, how reliable is this potential solution? Is the @@spid actually limited to this single trigger execution? Or is it possible that other simultaneous triggers will overlap within this process id? Will it stand up to multiple executions of the same and/or different triggers within the database?
From these sites, it seems the process Id is in fact limited to the open connection, which doesn't overlap: here, here, and here.
Will this be a safe method to get my source table?
Why?
As I've noticed similar questions, but all without a valid answer for my specific situation (except that one). Most of the comments on those sites ask "Why?", and in order to preempt that, here is why:
This synchronization engine operates on a single DB and can push changes to target tables, transforming the data with user-defined transformations, automatic source-to-target type casting and parsing and can even use the CSharpCodeProvider to execute methods also stored in those mapping tables for transforming data. It is already built, quite robust and has good performance metrics for what we are doing. I'm now trying to build it out to allow for 1:n table changes (including extension tables requiring the same Id as the 'master' table) and am trying to "genericise" the code. Previously each trigger had a "target table" definition hard coded in it and I was using my mapping tables to determine the source. Now I'd like to get the source table and use my mapping tables to determine all the target tables. This is used in a medium-load environment and pushes changes to a "Change Order Book" which a separate server process picks up to finish the CRUD operation.
Edit
As mentioned in the comments, the query listed above is quite "iffy". It will often (after a SQL Server restart, for example) return system objects like syscolpars or sysidxstats. But it seems that in dm_tran_locks there is always an associated resource_type of 'RID' (Row ID) with the same object_name. My current query, which has worked reliably so far, is the following (I will update this if that changes or it doesn't hold up under high-load testing):
SELECT t1.ObjectName
FROM (
    SELECT object_name(resource_associated_entity_id) AS ObjectName
    FROM sys.dm_tran_locks
    WHERE resource_type = 'OBJECT' AND request_session_id = @@spid
) t1
INNER JOIN (
    SELECT OBJECT_NAME(partitions.OBJECT_ID) AS ObjectName
    FROM sys.dm_tran_locks
    INNER JOIN sys.partitions ON partitions.hobt_id = dm_tran_locks.resource_associated_entity_id
    WHERE resource_type = 'RID'
) t2 ON t1.ObjectName = t2.ObjectName
If this is always the case, I'll have to find that out during testing.
How reliable is this potential solution?
While I do not have time to set up a test case to show it not working, I find this approach (even taking into account the query in the Edit section) "iffy" (i.e. not guaranteed to always be reliable).
The main concerns are:
cascading (whether recursive or not) Trigger executions
User (i.e. Explicit / Implicit) transactions
Sub-processes (i.e. EXEC and sp_executesql)
These scenarios allow for multiple objects to be locked, all at the same time.
Is the @@SPID actually limited to this single trigger execution? Or is it possible that other simultaneous triggers will overlap within this process id?
and (from a comment on the question):
I think I can join my query up with the sys.partitions and get a dm_trans_lock that has a type of 'RID' with an object name that will match up to the one in my original query.
And here is why it shouldn't be considered entirely reliable: the Session ID (i.e. @@SPID) is constant for all of the requests on that Connection. So all sub-processes (i.e. EXEC calls, sp_executesql, Triggers, etc.) will be on the same @@SPID / session_id. So, between sub-processes and User Transactions, you can very easily get locks on multiple resources, all on the same Session ID.
The reason I say "resources" instead of "OBJECT" or even "RID" is that locks can occur on: rows, pages, keys, tables, schemas, stored procedures, the database itself, etc. More than one thing can be considered an "OBJECT", and it is possible that you will have page locks instead of row locks.
Will it stand up to multiple executions of the same and/or different triggers within the database?
As long as these executions occur in different Sessions, then they are a non-issue.
ALL THAT BEING SAID, I can see where simple testing would show that your current method is reliable. However, it should also be easy enough to add more detailed tests that include an explicit transaction that first does some DML on another table, or have a trigger on one table do some DML on one of these tables, etc.
Unfortunately, there is no built-in mechanism that provides the same functionality that @@PROCID does for T-SQL Triggers. I have come up with a scheme that should allow for getting the parent table for a SQLCLR Trigger (that takes into account these various issues), but haven't had a chance to test it out. It requires using a T-SQL trigger, set as the "first" trigger, to set info that can be discovered by the SQLCLR Trigger.
A simpler form can be constructed using CONTEXT_INFO, if you are not already using it for something else (and if you don't already have a "first" Trigger set). In this approach you would still create a T-SQL Trigger, and then set it as the "first" Trigger using sp_settriggerorder. In this Trigger you SET CONTEXT_INFO to the table name that is the parent of @@PROCID. You can then read CONTEXT_INFO() on a Context Connection in a SQLCLR Trigger. If there are multiple levels of Triggers then the value of CONTEXT_INFO will get overwritten, so reading that value must be the first thing you do in each SQLCLR Trigger.
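A rough sketch of the reading side, under the assumption that the "first" T-SQL trigger has already run SET CONTEXT_INFO with the table name encoded as NVARCHAR bytes (all names here are illustrative):
// inside the SQLCLR trigger body
using (var con = new SqlConnection("context connection=true"))
{
    con.Open();
    using (var cmd = new SqlCommand("SELECT CONVERT(NVARCHAR(64), CONTEXT_INFO())", con))
    {
        // CONTEXT_INFO is varbinary(128) and zero-padded, hence the trim
        object raw = cmd.ExecuteScalar();
        string sourceTable = (raw == null || raw == DBNull.Value)
            ? null
            : ((string)raw).TrimEnd('\0');
        // look up the mapping rows for sourceTable here
    }
}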
This is an old thread, but it is an FAQ and I think I have a better solution. Essentially it uses the schema of the inserted or deleted table to find the base table by doing a hash of the column names and comparing the hash with the hashes of tables with a CLR trigger on them.
Code snippet below - at some point I will probably put the whole solution on Git (it sends a message to Azure Service Bus when the trigger fires).
private const string colqry = "select top 1 * from inserted union all select top 1 * from deleted";
private const string hashqry = "WITH cols as ( "+
"select top 100000 c.object_id, column_id, c.[name] "+
"from sys.columns c "+
"JOIN sys.objects ot on (c.object_id= ot.parent_object_id and ot.type= 'TA') " +
"order by c.object_id, column_id ) "+
"SELECT s.[name] + '.' + o.[name] as 'TableName', CONVERT(NCHAR(32), HASHBYTES('MD5',STRING_AGG(CONVERT(NCHAR(32), HASHBYTES('MD5', cols.[name]), 2), '|')),2) as 'MD5Hash' " +
"FROM cols "+
"JOIN sys.objects o on (cols.object_id= o.object_id) "+
"JOIN sys.schemas s on (o.schema_id= s.schema_id) "+
"WHERE o.is_ms_shipped = 0 "+
"GROUP BY s.[name], o.[name]";
public static void trgSendSBMsg()
{
    string table = "";
    SqlCommand cmd;
    SqlDataReader rdr;
    SqlTriggerContext trigContxt = SqlContext.TriggerContext;
    SqlPipe p = SqlContext.Pipe;
    using (SqlConnection con = new SqlConnection("context connection=true"))
    {
        try
        {
            con.Open();
            string tblhash = "";
            using (cmd = new SqlCommand(colqry, con))
            {
                using (rdr = cmd.ExecuteReader(CommandBehavior.SingleResult))
                {
                    if (rdr.Read())
                    {
                        MD5 hash = MD5.Create();
                        StringBuilder hashstr = new StringBuilder(250);
                        for (int i = 0; i < rdr.FieldCount; i++)
                        {
                            if (i > 0) hashstr.Append("|");
                            hashstr.Append(GetMD5Hash(hash, rdr.GetName(i)));
                        }
                        tblhash = GetMD5Hash(hash, hashstr.ToString().ToUpper()).ToUpper();
                    }
                    rdr.Close();
                }
            }
            using (cmd = new SqlCommand(hashqry, con))
            {
                using (rdr = cmd.ExecuteReader(CommandBehavior.SingleResult))
                {
                    while (rdr.Read())
                    {
                        string hash = rdr.GetString(1).ToUpper();
                        if (hash == tblhash)
                        {
                            table = rdr.GetString(0);
                            break;
                        }
                    }
                    rdr.Close();
                }
            }
            if (table.Length == 0)
            {
                p.Send("Error: Unable to find table that CLR trigger is on. Message not sent!");
                return;
            }
            ….
HTH

How to add a parameter to an existing stored procedure in SQL Server

I want to add a new parameter to an existing stored procedure. The body of this procedure may already have been customized by users, so I can't drop and recreate it. I don't need to modify the body, just the signature.
So I thought I would replace the last existing parameter with itself plus the new parameter:
replace(OBJECT_DEFINITION(OBJECT_ID(id)), '@last_param varchar(max)=null', '@last_param varchar(max)=null, @new_param varchar(max)=null')
It works fine if the following string is found
@last_param varchar(max)=null
but doesn't work if there are spaces in the string.
I would like to use a regex to be sure it works in all cases, but I'm not sure that's possible in SQL Server.
Can you help me please?
Thanks
SQL Server does not natively support regular expressions. You'll have to look at more manual string-analyzing with the available string functions. Something like this:
declare @obDef nvarchar(max), @startLastParam int, @endLastParam int, @newDef nvarchar(max)

set @obDef = OBJECT_DEFINITION(OBJECT_ID(id))
set @startLastParam = PATINDEX('%@last_param%varchar%(%max%)%=%null%', @obDef)
if @startLastParam = 0 begin
    -- handle lastParam not found
end else begin
    set @endLastParam = CHARINDEX('null', @obDef, @startLastParam) + 4 -- 4 = len('null')
    set @newDef = STUFF(@obDef, @endLastParam, 0, ', @new_param varchar(max)=null')
end
This isn't very fool-proof/safe though. PATINDEX() only gives you the same % wildcard you know from LIKE; it may match no characters at all, or it may match half the stored proc just to find the word max somewhere entirely outside the signature.
So don't just run this in your customer's production ;) but if you are certain about the current stored proc signature, this might just do the trick for you.
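Since real regular expressions are available client-side, a hedged alternative is to patch the definition in C# and run it back in as ALTER PROCEDURE. Everything here is illustrative: GetObjectDefinition is a hypothetical helper that just selects OBJECT_DEFINITION(OBJECT_ID(@name)), and con is an open SqlConnection.
using System.Text.RegularExpressions;

string def = GetObjectDefinition(con, "dbo.MyProc"); // hypothetical helper
string patched = Regex.Replace(def,
    @"@last_param\s+varchar\s*\(\s*max\s*\)\s*=\s*null",
    "@last_param varchar(max)=null, @new_param varchar(max)=null",
    RegexOptions.IgnoreCase);
// turn CREATE into ALTER so the customized body is preserved, not recreated
patched = Regex.Replace(patched, @"\bCREATE\s+PROC(EDURE)?\b", "ALTER PROCEDURE",
    RegexOptions.IgnoreCase);
using (var cmd = new SqlCommand(patched, con))
    cmd.ExecuteNonQuery();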

Is there a gain in efficiency if I put a lengthy SQL select into a stored procedure?

I have the following SQL Server 2012 query:
var sql = @"SELECT question.QuestionUId
            FROM Objective,
                 ObjectiveDetail,
                 ObjectiveTopic,
                 Problem,
                 Question
            WHERE objective.examId = 1
              AND objective.objectiveId = objectiveDetail.objectiveId
              AND objectiveDetail.ObjectiveDetailId = ObjectiveTopic.ObjectiveDetailId
              AND objectiveTopic.SubTopicId = Problem.SubTopicId
              AND problem.ProblemId = question.ProblemId";
var a = db.Database.SqlQuery<string>(sql).ToList();
Can someone help explain to me if it would be a good idea to put this into a stored procedure, and if so, how could I do that and then call it from my C# code? It was suggested to me that if it is in a stored procedure then it would run more efficiently, as it would not be recompiled often. Is that the case?
Yes, there is. For starters, a stored procedure is precompiled and stored within your database. Being precompiled, the database engine can execute it more efficiently, since no on-the-fly compilation is necessary. Also, database optimizations can be added to support a precompiled procedure. A stored procedure also allows business logic to be encapsulated within the database.
If you decide to go the stored procedure route, then consider the following:
First of all, you will need to create a stored procedure that encapsulates your existing SQL query.
CREATE PROCEDURE ListQuestionIds
@ExamId int
AS
BEGIN
SELECT Question.QuestionUId
FROM Objective
INNER JOIN ObjectiveDetail
ON ( Objective.objectiveId = ObjectiveDetail.objectiveId )
INNER JOIN ObjectiveTopic
ON ( ObjectiveDetail.ObjectiveDetailId = ObjectiveTopic.ObjectiveDetailId )
INNER JOIN Problem
ON ( ObjectiveTopic.SubTopicId = Problem.SubTopicId )
INNER JOIN Question
ON ( Problem.ProblemId = Question.ProblemId )
WHERE Objective.examId = @ExamId;
END;
Please make sure that the tables called by your procedure (Objective, Problem, etc,) have all of the relevant primary keys and indexes in place to enhance the performance of your query.
Next, you will need to call that stored procedure from within your C# code. One way--but by no means the only way--is to create a connection to your database using the SqlConnection object and then executing your procedure via the SqlCommand object.
I would recommend that you take a look at How to execute a stored procedure within C# program for some on-topic examples. But a simple example of such might look like:
string connectionString = "your_connection_string";
using (var con = new SqlConnection(connectionString))
{
    using (var cmd = new SqlCommand("ListQuestionIds", con))
    {
        cmd.CommandType = CommandType.StoredProcedure;
        cmd.Parameters.Add(new SqlParameter("@ExamId", examId));
        con.Open();
        using (SqlDataReader rdr = cmd.ExecuteReader())
        {
            while (rdr.Read())
            {
                // Loop through the returned SqlDataReader object (aka. rdr) and
                // then evaluate & process the returned question id value(s) here
            }
        }
    }
}
Please note that this sample code does not (intentionally) include any error handling. I leave that up to you to integrate into your application.
Finally, just as an FYI... many of the more modern ORMs (e.g., Entity Framework, NHibernate, etc.) allow you to execute stored procedure-like queries from your C# code without requiring an explicit stored procedure. If you are already using an ORM in your application, then you may want to forgo the stored procedure altogether. Whatever you decide to do, a little research on your end will help you make an informed decision.
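For instance, sticking with the db context from the question, something like this should work under Entity Framework 6 (a sketch; variable names are illustrative):
var questionIds = db.Database
    .SqlQuery<string>("EXEC ListQuestionIds @ExamId",
        new SqlParameter("@ExamId", examId))
    .ToList();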
I hope this helps you get started. Good luck.
