Does anyone know of a way to verify the correctness of the queries in all stored procedures in a database?
I'm thinking of the scenario where, if you modify something in a code file, simply doing a rebuild will show you compilation errors that point you to the places you need to fix. In a database scenario, if you modify a table and remove a column that is used in a stored procedure, you won't know anything about the problem until the first time that procedure runs.
What you describe is what unit testing is for. Stored procedures and functions often require parameters to be set, and if the stored procedure or function encapsulates dynamic SQL, there's a chance that a corner case is missed.
Also, all you mention is checking for basic errors, nothing about validating the data returned. For example, I can change the precision on a numeric column...
This also gets into the basic testing that should occur for the immediate issue, and regression testing to ensure there aren't unforeseen issues.
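For instance, a sketch using the tSQLt framework (assuming tSQLt is installed in the database; the procedure dbo.GetOrderCount, table dbo.Orders, and test names are hypothetical):

EXEC tSQLt.NewTestClass 'ProcTests';
GO
CREATE PROCEDURE ProcTests.[test GetOrderCount counts all orders]
AS
BEGIN
    -- Arrange: replace dbo.Orders with an empty fake the test fully controls
    EXEC tSQLt.FakeTable 'dbo.Orders';
    INSERT INTO dbo.Orders (OrderID) VALUES (1), (2);

    -- Act: call the procedure under test
    DECLARE @actual int;
    EXEC dbo.GetOrderCount @Count = @actual OUTPUT;

    -- Assert
    EXEC tSQLt.AssertEquals @Expected = 2, @Actual = @actual;
END;
GO
EXEC tSQLt.Run 'ProcTests';

A test like this fails as soon as a schema change breaks the procedure, which is the closest SQL gets to a "rebuild" step.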
You could create all of your objects with SCHEMABINDING, which would prevent you from changing any underlying tables without dropping and recreating the views and procedures built on top of them.
Depending on your development process, this could be pretty cumbersome. I offer it as a solution though, because if you want to ensure the correctness of all procedures in the db, this would do it.
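For example, a minimal sketch (dbo.Orders and its columns are hypothetical; schemabound objects must use two-part names and explicit column lists):

CREATE VIEW dbo.ActiveOrders
WITH SCHEMABINDING
AS
SELECT OrderID, Status
FROM dbo.Orders
WHERE Status = 'Active';
GO
-- This now fails with an error instead of silently breaking the view:
ALTER TABLE dbo.Orders DROP COLUMN Status;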
I found this example on MSDN (SQL Server 2012). I guess it can be used in some scenarios:
USE AdventureWorks2012;
GO
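-- Describe the first result set of every procedure; the error_* columns
-- flag procedures that no longer parse (e.g. a referenced column was dropped).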
SELECT p.name, r.*
FROM sys.procedures AS p
CROSS APPLY sys.dm_exec_describe_first_result_set_for_object(p.object_id, 0) AS r;
Source: sys.dm_exec_describe_first_result_set_for_object
While debugging, I am unable to watch a temp table's values in SQL Server 2012. I can see all of my variables' values and can even print them, but I'm struggling with the temp tables. Is there any way to watch a temp table's values?
SQL Server provides temporary tables, which help the developer greatly. These tables can be created at runtime and support all the operations a normal table can do, but their scope is limited depending on the table type. They are created inside the tempdb database.
While debugging, you can pause the SP at some point; if you add a SELECT statement to your SP before the DROP TABLE statement, the # table is available for querying:
SELECT * FROM #temp
I placed this code inside my stored procedure and I am able to see the temp table contents inside the "Locals" window.
INSERT INTO #temptable (columns) SELECT columns FROM sometable; -- populate your temp table
-- for debugging, comment in production
DECLARE @temptable XML = (SELECT * FROM #temptable FOR XML AUTO); -- now view @temptable in the Locals window
This works on older SQL Server 2008; newer versions (2016 and later) also support the friendlier FOR JSON AUTO. Credit: https://stackoverflow.com/a/6748570/1129926
I know this is old. I've been trying to make this work too, so that I can view temp table data as I debug my stored procedure. So far nothing works.
I've seen many links to methods on how to do this, but ultimately they don't work the way a developer would want them to. For example: suppose one has several processes in the stored procedure that update and modify data in the same temp table; there is no way to watch those updates happen on the fly for each process in the SP.
This is a VERY common request, yet no one seems to have a solution other than: don't use stored procedures for complex processing, due to how difficult they are to debug. If you're a .NET Core/EF 6 developer and have the correct PKs and FKs set for the database, you shouldn't really need stored procedures at all, as it can all be handled by EF6, and you can debug the code to view data results in your entities/models directly (usually in a web API using models/entities).
Retrieving the data from tempdb is not possible, even with the same connection (as has been suggested).
What is sometimes used is:
PRINT '#temptablename'
SELECT * FROM #temptablename
Dotted throughout the code, and gated by a debug flag you add to the SP, these let you selectively debug the output. NOT ideal at all, but it works for many situations (see the sketch below).
But this MUST already be in the stored procedure before execution (not added during). And you must remember to remove the code prior to deployment to a production environment.
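A minimal sketch of that debug-flag pattern (the procedure and table names are hypothetical):

CREATE PROCEDURE dbo.ProcessOrders
    @Debug bit = 0
AS
BEGIN
    SELECT OrderID, Amount
    INTO #work
    FROM dbo.Orders;

    -- ... several steps that modify #work ...

    IF @Debug = 1
    BEGIN
        PRINT '#work after processing';
        SELECT * FROM #work;
    END
END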
I'm surprised that in 2022 we still have no solution for this other than avoiding complex stored procedures or using .NET Core/EF 6, which in my humble opinion is the best approach for 2022, since SSMS and other tools like dbForge and RedGate can't accomplish this either.
I am new to Postgresql and I am trying to figure out some details about stored procedures (which I think are actually called functions in pgsql) when used in a multiple schema environment.
The application I have in mind involves a multi-tenant DB design where one schema is used for each tenant, and all schemata, which have the same table structure and names, are part of the same database. As far as I know from DBs in general, stored procedures/functions are pre-compiled and therefore faster, so I would like to use them for performing operations on each schema's tables by sending the required parameters from the application server instead of sending a list of SQL commands. In addition, I would like to have a SINGLE set of functions that implement all the SELECT (including JOIN type), INSERT, UPDATE, etc. operations on the tables of each schema. This will allow me to easily make changes in each function and avoid SQL code replication and redundancy. As I found out, it is possible to create a set of functions in a schema s0 and then create s1, s2, ... schemata (all having the same tables) that use these functions.
For example, I can create a template schema named s0 (identical to all others) and create an SQL or PL/pgSQL function that belongs to this schema and contains operations on the schema's tables. In this function, the table names are written without the schema prefix, i.e.
first_table and not s0.first_table
An example function could be:
CREATE FUNCTION sel() RETURNS BIGINT
AS 'SELECT count(a) from first_table;'
LANGUAGE SQL;
As I have tested, this function works well. If I move to schema s1 by entering:
set search_path to s1;
and then call the function again, the function acts upon s1 schema's identically named table first_table.
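For example (keeping s0 on the search_path so the function itself is still found, as one answer below suggests):

set search_path to s1, s0;  -- tables resolve in s1; sel() is still found in s0
SELECT sel();               -- now counts rows in s1.first_table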
The function could also take a path parameter, so that it can be called with a schema name, together with a command that changes the search_path, similar to this:
CREATE OR REPLACE FUNCTION doboth(path TEXT, is_local BOOLEAN DEFAULT false) RETURNS BIGINT AS $$
SELECT set_config('search_path', regexp_replace(path, '[^\w ,]', '', 'g'), is_local);
SELECT count(a) from first_table;
$$ LANGUAGE sql;
as shown in the proposed solution in PostgreSQL: how do I set the search_path from inside a function?
However, when I tried this and I called the function for a schema, I noticed that the second SELECT of the function was executed before the first, which led to executing the second SELECT on the wrong schema! This was really unexpected. Does anybody know the explanation to this behavior?
In order to bypass this issue, I created a plpgsql function that does the same thing and it worked without any execution order issues:
CREATE OR REPLACE FUNCTION doboth(path TEXT, is_local BOOLEAN DEFAULT false) RETURNS BIGINT AS $$
DECLARE result BIGINT;
BEGIN
PERFORM set_config('search_path', regexp_replace(path, '[^\w ,]', '', 'g'), is_local);
SELECT count(a) from first_table INTO result;
RETURN result;
END
$$ LANGUAGE plpgsql;
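It can then be called like this (schema names as in the question):

SELECT doboth('s1');      -- counts rows in s1.first_table
SELECT doboth('s2, s0');  -- resolve names in s2 first, fall back to s0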
So, now some questions about performance this time:
1) Apart from a) having the selection of the schema to operate on and the specified operation on that schema in one transaction, which is necessary for my multi-tenant implementation, and b) bundling SQL commands together and avoiding some extra data exchange between the application server and the DB server, which is beneficial, do PostgreSQL functions have any performance benefits over executing the same code as separate SQL commands?
2) In the described multi-tenant scenario with many schemata and one DB, does a function that is defined once and called against any schema identical to the one where it was defined lose any of its performance benefits (if any)?
3) Is there any difference in performance between an SQL function and a PL/pgSQL function that contains the same operations?
Before I answer your questions, a remark about your SQL function.
It does not fail because the statements are executed in the wrong order, but because both queries are parsed before the first one is executed. The error message you get is something like
ERROR: relation "first_table" does not exist
[...]
CONTEXT: SQL function "doboth" during startup
Note the “during startup”.
Answers
You may experience a slight performance boost, particularly if the SQL statements are complicated, because the plans of SQL statements in a PL/pgSQL function are cached for the duration of a database session or until they are invalidated.
If the plan for the query is cached by the PL/pgSQL function, but the SQL statement calling the function has to be planned every time, you might actually be worse off from a performance angle because of the overhead of executing the function.
Whenever you call the function with a different schema name, the query plan will be invalidated and has to be created anew. So if you change the schema name for every invocation, you won't gain anything.
SQL functions don't cache query plans, so they don't perform better than the plain SQL query.
Note, however, that the gains from caching simple SQL statements in functions are not tremendous.
Use functions that just act as containers for SQL statements only if it makes life simpler for you, otherwise use plain SQL.
Do not focus only on performance during design, but on a good architecture and a simple design.
If the same statements keep repeating over and over, you might gain more performance using prepared statements than using functions.
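For example, a sketch using first_table from the question; the statement is parsed and planned once, then only executed for the rest of the session:

PREPARE cnt AS SELECT count(a) FROM first_table;
EXECUTE cnt;     -- parsing (and eventually planning) is skipped on each EXECUTE
DEALLOCATE cnt;  -- discard the prepared statement when done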
Firstly, I do not really believe there can be any issues with line execution order in functions. If you have any issues, it's your code not working, not Postgres.
Secondly, multi-tenant behavior is well implemented with set search_path to s1, s0;. There is usually no need for switching anything inside procedures.
Thirdly, there are no performance benefits in using stored procedures except for minimizing data flows between DB and the application. If you consider a query like SELECT count(*) FROM mytable WHERE somecolumn = $1 there is absolutely nothing you can optimize before you know the value of $1.
And finally, no, there is no significant difference between functions in SQL and PL/pgSQL. The most time is still consumed by reading through tables, so focus on perfecting that.
Hope that clarifies the situation. Also, you may want to consider the security benefits of stored procedures. Just a hint.
I am trying to hunt down a certain stored procedure that writes to a certain table (it needs to be changed); however, going through every single stored procedure is not a route I really want to take. So I was hoping there might be a way to find out which stored procedures INSERT into or UPDATE a certain table.
I have tried using this method (pinal_daves_blog), but it is not giving me any results.
NOTICE: The stored procedure might not be in the same DB!
Is there another way, or can I somehow check which procedure/function made the last insert or update to the table?
One brute-force method would be to download an add-in from RedGate called SQL Search (free), then do a stored procedure search for the table name. I'm not affiliated at all with RedGate or anything, this is just a method that I have used to find similar things and has served me well.
http://www.red-gate.com/products/sql-development/sql-search/
If you go this route, you just type in the table name, change the 'object types' ddl selection to 'Procedures' and select 'All databases' in the DB ddl.
Hope this helps! I know it isn't the most technical solution, but it should work.
There is no built-in way to tell what function, procedure, or executed batch has made the last change to a table. There just isn't. Some databases have this as part of their transaction logging but SQL Server isn't one of them.
I have wondered in the past whether transactional replication might provide that information, if you already have that set up, but I don't know whether that's true.
If you know the change has to be taking place in a stored procedure (as opposed to someone using SSMS or executing lines of SQL via ADO.NET), then @koppinjo's suggestion is a good one, as is this one from Pinal Dave's blog:
USE AdventureWorks
GO
--Searching for Employee table
SELECT Name
FROM sys.procedures
WHERE OBJECT_DEFINITION(OBJECT_ID) LIKE '%Employee%'
There are also dependency functions, though they can be outdated or incomplete:
select * from sys.dm_sql_referencing_entities( 'dbo.Employee', 'object' )
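If those prove incomplete, the sys.sql_expression_dependencies catalog view offers a similar lookup (using the Employee table from the example above):

SELECT OBJECT_SCHEMA_NAME(referencing_id) AS referencing_schema,
       OBJECT_NAME(referencing_id) AS referencing_object
FROM sys.sql_expression_dependencies
WHERE referenced_entity_name = 'Employee';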
You could run a trace in Profiler. The procedure would have to write to the table while the trace is running for you to catch it.
I have a simple SELECT statement with a couple of columns referenced in the WHERE clause. Normally I do these simple ones in the VB code (set up a Command object, set CommandType to Text, set CommandText to the SELECT statement). However, I'm seeing timeout problems. We've optimized just about everything we can with our tables, etc.
I'm wondering if there'd be a big performance hit just because I'm doing the query this way, versus creating a simple stored procedure with a couple params. I'm thinking maybe the inline code forces SQL to do extra work compiling, creating query plan, etc. which wouldn't occur if I used a stored procedure.
An example of the actual SQL being run:
SELECT TOP 1 * FROM MyTable WHERE Field1 = @Field1 ORDER BY ID DESC
A well formed "inline" or "ad-hoc" SQL query - if properly used with parameters - is just as good as a stored procedure.
But this is absolutely crucial: you must use properly parametrized queries! If you don't - if you concatenate together your SQL for each request - then you don't benefit from these points...
Just like with a stored procedure, upon first execution a query execution plan must be found; that execution plan is then cached in the plan cache.
That query plan is reused over and over again, if you call your inline parametrized SQL statement multiple times - and the "inline" SQL query plan is subject to the same cache eviction policies as the execution plan of a stored procedure.
Just from that point of view - if you really use properly parametrized queries - there's no performance benefit for a stored procedure.
Stored procedures have other benefits (like being a "security boundary" etc.), but just raw performance isn't one of their major plus points.
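As a sketch, here is the question's query sent as properly parameterized SQL (assuming Field1 is an int; substitute the real column type):

EXEC sp_executesql
    N'SELECT TOP 1 * FROM MyTable WHERE Field1 = @Field1 ORDER BY ID DESC',
    N'@Field1 int',
    @Field1 = 42;
-- One plan is cached and reused for every value of @Field1.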
It is true that the db has to do the extra work you mention, but that should not result in a big performance hit (unless you are running the query very, very frequently..)
Use SQL Profiler to see what is actually getting sent to the server. Use Activity Monitor to see if there are other queries blocking yours.
Your query couldn't be simpler. Is Field1 indexed? As others have said, there is no performance hit associated with "ad-hoc" queries.
For where to put your queries, this is one of the oldest debates in tech. I would argue that your requests "belong" to your application. They will be versioned with your app, tested with your app, and should disappear when your app disappears. Putting them anywhere other than in your app is walking into a world of pain. But for goodness' sake, use .sql files, compiled as embedded resources.
Inline query:
A SELECT statement that appears in the FROM clause of another statement is called an inline query (see the example below).
Cannot take parameters.
Not a database object.
Procedure:
Can take parameters.
Is a database object.
Can be reused globally if the same action needs to be performed.
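For example, the derived table in this FROM clause is an inline query (table and column names are hypothetical):

SELECT t.Status, t.OrderCount
FROM (SELECT Status, COUNT(*) AS OrderCount
      FROM Orders
      GROUP BY Status) AS t  -- the inline query: no parameters, not a database object
WHERE t.OrderCount > 10;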
I have a trigger that will be dynamically created, as it will be attached to a view that is also dynamically generated. I don't want the stored procedure that generates the trigger to contain the entire trigger body, so I would like to move most of it to a separate stored procedure, but I won't know the fields in the inserted and deleted tables.
The trigger is about 90 lines long, and the only part I really need to be different between triggers is:
DECLARE @DEBUG bit = 1
DECLARE @EntityName nvarchar(128) = 'Lot'
SELECT * INTO #MYINSERTED FROM INSERTED
SELECT * INTO #MYDELETED FROM DELETED
If I could move the rest of it to a stored procedure then that would be great.
The problem with just passing in @DEBUG and @EntityName and using #MYINSERTED and #MYDELETED in the stored procedure is that I would have a problem if two people were inserting or updating the same view at the same time.
The best bet would be to pass a table variable to remove any concurrency issues but I am not certain the best way to do that.
Thank you.
This would actually be a bad idea. SQL is not like your run-of-the-mill procedural language. SQL "compilation" binds to a physical access path plan, meaning that statements are compiled into plans that say "open the rowset with ID 1234, seek a record, and retrieve its content", and that '1234' is determined by the optimizer during compilation of the batch. This means that moving common code into a procedure, as you plan to, more often hurts than benefits. The procedure cannot be bound to a "generic" access path; it needs to know the actual tables and objects it should look into for its selects, updates, and the like. You either end up doing dynamic SQL in the procedure, or you move only the non-data-bound, generic parts (e.g. calculations) into it, which creates very convoluted code that can still hurt performance while decreasing the procedure's readability.
Much more advisable is to have a template and generate your triggers from these template via various code generation techniques, like XML and XSLT.
I suspect the metadata/schema about inserted and deleted are the core problem here (which is why you are using SELECT * INTO).
If you are code generating the trigger and view dynamically, I would say it probably doesn't make much difference. After all, all the triggers and views are code generated and can be regenerated as your system gets new capabilities or the core SP is improved.
Only if the triggers and views are customized and never regenerated would there be a benefit in sharing a core SP, which can then be modified and upgraded instead of regenerating the views and triggers.
The overhead of regenerating is probably outweighed by generated code which will have a solid execution plan and better binding.
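That said, if the shared-procedure route is taken anyway, a table-valued parameter sidesteps the concurrency problem of shared temp tables that the question raises. A sketch (the type and column list are hypothetical, since the real views are generated dynamically):

CREATE TYPE dbo.ChangedRows AS TABLE (ID int, Name nvarchar(128));
GO
CREATE PROCEDURE dbo.HandleChanges
    @EntityName nvarchar(128),
    @Inserted dbo.ChangedRows READONLY,
    @Deleted dbo.ChangedRows READONLY
AS
BEGIN
    SET NOCOUNT ON;
    -- generic processing over @Inserted / @Deleted goes here
    SELECT @EntityName AS EntityName, COUNT(*) AS InsertedRows FROM @Inserted;
END
GO
-- Inside each generated trigger; table variables are per-execution,
-- so concurrent sessions never share state:
DECLARE @ins dbo.ChangedRows, @del dbo.ChangedRows;
INSERT INTO @ins (ID, Name) SELECT ID, Name FROM inserted;
INSERT INTO @del (ID, Name) SELECT ID, Name FROM deleted;
EXEC dbo.HandleChanges @EntityName = 'Lot', @Inserted = @ins, @Deleted = @del;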