I'm trying to create a wrapper in T-SQL for a procedure where I'm not sure what the data types are. I can run the wrapper without an INSERT INTO statement and I get the data just fine, but I need to have it in a table.
Whenever I use the INSERT INTO I get an error:
Column name or number of supplied values does not match table definition
I've parsed back through my code and can't see where any column names don't match up, so I'm thinking that it has to be a data type. I've looked through the procedure I'm wrapping to see if I can find what the data types are, but some aren't defined there; I've referenced the tables they pull some data from to find the definitions; I've run SQL_VARIANT_PROPERTY on all of the data to see what data type it is (although some of them come up null).
Is there some better way for me to track down exactly where the error is?
I think you can find out your stored procedure result schema, using sp_describe_first_result_set (available from SQL2012) and FMTONLY. Something like this:
EXEC sp_describe_first_result_set
@tsql = N'SET FMTONLY OFF; EXEC yourProcedure <params are embedded here>'
More details can be found here.
However, if I remember correctly, this works only if your procedure used deterministic schemas (no SELECT INTO #tempTable or similar things).
One trick to find out the schema of your result is to actually materialize the result into a table created ad hoc. However, this is not easy, since SELECT INTO does not work with EXEC procedure. One workaround is this:
1) Define a linked-server to the instance itself. E.g. loopback
2) Execute your procedure like this (for SQL 2008R2):
SELECT * INTO tempTableToHoldDataAndStructure
FROM OPENQUERY(' + @LoopBackServerName + ', ''set fmtonly off exec ' + @ProcedureFullName + ' ' + @ParamsStr
where
@LoopBackServerName = 'loopback'
@ProcedureFullName = loopback.database.schema.procedure_name
@ParamsStr = embedded parameters
For SQL2012 I think the execution might fail if RESULT SETS are not provided (i.e. schema definition of the expected result, which is kind of a chicken-egg problem in this case):
' WITH RESULT SETS (( ' + @ResultSetStr + '))'');
Okay, I have a solution to my problem. It's tedious, but tedious I can do. Randomly guessing is what drives me crazy. The procedure I'm wrapping dumps 51 columns. I already know I can get it to work without putting anything into a table. So I decided to comment out part of the select statement in the procedure I'm wrapping so it's only selecting 1 column. (First I made a copy of that procedure so I don't screw up the original; then I referenced the copy from my wrapper). Saved both, ran it, and it worked. So far so good. I could have done it line by line, but I'm more of a binary kind of guy, so I went about halfway down--now I'm including about 25 columns in both the select statement and my table--and it's still working. Repeat procedure until it doesn't work any more, then backtrack until it does again. My error was in identifying one of the data types followed by "IDENTITY". I'm not sure what will happen when I leave that out, but at least my wrapper works.
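For anyone who wants to script the "go halfway, then backtrack" routine described above, here is a minimal sketch in Python. It assumes you can wrap "run the wrapper with only the first n columns" in a function; works() and simulated_works() are hypothetical stand-ins for that, and the bad column number is made up purely for the demo.

```python
from typing import Callable, List


def first_failing_column(total: int, works: Callable[[int], bool]) -> int:
    """Return the 1-based position of the first column that breaks the insert.

    Assumes works(n) reports whether the wrapper succeeds when only the
    first n columns are selected, that zero columns trivially "works",
    and that the full column list fails.
    """
    lo, hi = 0, total  # invariant: first lo columns work, first hi columns fail
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if works(mid):
            lo = mid
        else:
            hi = mid
    return hi


# Simulated check (hypothetical): pretend column 37 of 51 is the bad one.
BAD_COLUMN = 37
calls: List[int] = []


def simulated_works(n: int) -> bool:
    calls.append(n)  # record each probe so the number of runs can be inspected
    return n < BAD_COLUMN


culprit = first_failing_column(51, simulated_works)
```

With 51 columns this needs only a handful of runs instead of up to 51, which is the whole appeal of halving over going line by line.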
SQL Server has Deferred Name Resolution feature, read here for details:
https://msdn.microsoft.com/en-us/library/ms190686(v=sql.105).aspx
That page only talks about stored procedures, so it seems Deferred Name Resolution works only for stored procedures and not for functions. I did some testing:
create or alter function f2(@i int)
returns table
as
return (select fff from xxx)
go
Note the table xxx does not exist. When I execute the above CREATE statement, I got the following message:
Msg 208, Level 16, State 1, Procedure f2, Line 4 [Batch Start Line 22]
Invalid object name 'xxx'.
It seems that SQL Server instantly found the non-existent table xxx and it proved Deferred Name Resolution doesn't work for functions. However when I slightly change it as follows:
create or alter function f1(@i int)
returns int
as
begin
declare @x int;
select @x = fff from xxx;
return @x
end
go
I can successfully execute it:
Commands completed successfully.
When executing the following statement:
select dbo.f1(3)
I got this error:
Msg 208, Level 16, State 1, Line 34
Invalid object name 'xxx'.
So here it seems the resolution of the table xxx was deferred. The most important difference between these two cases is the return type. However, I can't explain when Deferred Name Resolution will work for functions and when it won't. Can anyone help me understand this? Thanks in advance.
It feels like you were looking for understanding of why your particular example didn't work. Quassnoi's answer was correct but didn't offer a reason so I went searching and found this MSDN Social answer by Erland Sommarskog. The interesting part:
However, it does not extend to views and inline-table functions. For
stored procedures and scalar functions, all SQL Server stores in the
database is the text of the module. But for views and inline-table
functions (which are parameterised view by another name) SQL Server
stores metadata about the columns etc. And that is not possible if the
table is missing.
Hope that helps with understanding why :-)
EDIT:
I did take some time to confirm Quassnoi's comment that sys.columns as well as several other tables did contain some metadata about the inline function so I am unsure if there is other metadata not written. However I thought I would add a few other notes I was able to find that may help explain in conjunction.
First a quote from Wayne Sheffield's blog:
In the MTVF, you see only an operation called “Table Valued Function”. Everything that it is doing is essentially a black box – something is happening, and data gets returned. For MTVFs, SQL can’t “see” what it is that the MTVF is doing since it is being run in a separate context. What this means is that SQL has to run the MTVF as it is written, without being able to make any optimizations in the query plan to optimize it.
Then from the SQL Server 2016 Exam 70-761 by Itzik Ben-Gan (Skill 3.1):
The reason that it's called an inline function is because SQL Server inlines, or expands, the inner query definition, and constructs an internal query directly against the underlying tables.
So it seems the inline function essentially returns a query and is able to optimize it with the outer query, not allowing the black-box approach and thus not allowing deferred name resolution.
What you have in your first example is an inline function (it does not have BEGIN/END).
Inline functions can only be table-valued.
If you used a multi-statement table-valued function for your first example, like this:
CREATE OR ALTER FUNCTION
fn_test(@a INT)
RETURNS @ret TABLE
(
a INT
)
AS
BEGIN
INSERT
INTO @ret
SELECT a
FROM xxx
RETURN
END
it would compile fine and fail at runtime (if xxx did not exist), same as a stored procedure or a scalar UDF would.
So yes, DNR does work for all multi-statement functions (those with BEGIN/END), regardless of their return type.
I am trying to write an SQL statement in Python which passes a table name as a variable. However, I get the following error: Must declare the table variable "@P1".
pypyodbc.ProgrammingError: ('42000', '[42000] [Microsoft][SQL Server Native Client 10.0][SQL Server]Must declare the table variable "@P1".')
The code yielding the ERROR is:
query = cursor.execute('''SELECT * FROM ?''', (table_variable,))
I have other code where I pass variables to the SQL statement using the same syntax which works fine (code below works as intended).
query = cursor.execute('''SELECT column_name FROM information_schema.columns WHERE table_name = ?''', (table_variable,))
The error seems to occur when I am using a variable to pass a table name.
Any help resolving this error would be much appreciated.
With new comments from the OP this has changed rather significantly. If all you are trying to do is get a few sample rows from each table, you can easily leverage the sys.tables catalog view. This will create a select statement for every table in your database. If you have multiple schemas, you could extend this to add the schema name too.
select 'select top 10 * from ' + QUOTENAME(t.name)
from sys.tables t
What you're trying to do is impossible. You can only pass values into queries as parameters - so
SELECT * FROM @Table
is banned but
SELECT * FROM TableName WHERE Column=@Value
is perfectly legal.
Now, as to why it's banned. From a logical point of view the database layer can't cache a query plan for what you're trying to do at all - the parameter will completely and utterly change where it goes and what returns - and can't guarantee in advance what it can or can't do. It's like trying to load an abstract source file at runtime and execute it - messy, unpredictable, unreliable and a potential security hole.
From a reliability point of view, please don't do
SELECT * FROM Table
either. It makes your code less readable because you can't see what's coming back where, but also less reliable because it could change without warning and break your application.
I know it can seem a long way round at first, but honestly - writing individual SELECT statements which specify the fields they actually want to bring back is a better way to do it. It'll also make your application run faster :-)
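If the table name really has to vary, one common way around the restriction is to check the name against a list of known tables and only then splice it into the SQL text, keeping ? placeholders for actual values. Below is a rough Python sketch of that idea. The helper names quote_table_name and build_sample_query are made up for illustration, and the allow-list is assumed to come from something like the information_schema query in the question.

```python
from typing import Iterable


def quote_table_name(name: str, allowed: Iterable[str]) -> str:
    """Return a bracket-quoted identifier, refusing names not on the allow-list."""
    if name not in set(allowed):
        raise ValueError(f"unknown table: {name!r}")
    # Same escaping rule as T-SQL QUOTENAME with brackets: double any ']'.
    return "[" + name.replace("]", "]]") + "]"


def build_sample_query(name: str, allowed: Iterable[str]) -> str:
    """Build the SQL text for a small sample of rows from a vetted table."""
    return "SELECT TOP 10 * FROM " + quote_table_name(name, allowed)


# Hypothetical allow-list; in practice it would be read from the database.
known_tables = {"Orders", "Customers"}
sql = build_sample_query("Orders", known_tables)
# cursor.execute(sql) would then run it; values in a WHERE clause
# should still be passed as ? parameters.
```

The point of the allow-list is that only names the database itself reported can ever reach the query string, so user input never becomes SQL text.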
You can define a string variable:
table_var_str = 'Table_name'
st = 'SELECT * FROM ' + table_var_str
query = cursor.execute(st)
It will solve the problem.
You can also set the table_var_str as a list:
table_var_str = []
st = []
for i in range(N):  # N = number of tables
    table_var_str.append('Table_name' + str(i))
    st.append('SELECT * FROM ' + table_var_str[i])
for j in range(J):  # J = number of queries to run (J <= N)
    query = cursor.execute(st[j])
If the query is very long, write it on a single line rather than across multiple lines.
Note: I'm running under SQL Server 2008 R2...
I've taken the time to read dozens of posts on this site and other sites on how to execute dynamic SQL where the query is more than 4000 characters. I've tried more than a dozen solutions proposed. The consensus seems to be to split the query into 4000-character variables and then do:
EXEC (@SQLQuery1 + @SQLQuery2)
This doesn't work for me - the query is truncated at the end of @SQLQuery1.
Now, I've seen samples of how people "force" a long query by using REPLICATE with a bunch of spaces, etc., but this is a real query, and it gets a little more sophisticated than that.
I have a SQL view named "Company_A_ItemView".
I have 10 companies that I want to create the same exact view, with different names, e.g.
"Company_B_ItemView"
"Company_C_ItemView"
..etc.
If you offer help, please don't ask why there are multiple views - just accept that I need to do it this way, OK?
Each company has its own set of tables, and the CREATE VIEW statement references several tables by name. Here's a BRIEF sample, but remember, the total length of the query is around 6000 characters:
CREATE view [dbo].[Company_A_ItemView] as
select
WE.[Item No_],
WE.[Location Code],
LOC.[Bin Number],
[..more fields, etc.]
from
[Company_A_Warehouse_Entry] WE
left join
[Company_A_Location] LOC
...you get the idea
So, what I am currently doing is:
a. Pulling the contents of the CREATE VIEW statement into 2 Declared Variables, e.g.
Set @SQLQuery1 = (select text
from syscomments
where ID = 1382894081 and colid = 1)
Set @SQLQuery2 = (select text
from syscomments
where ID = 1382894081 and colid = 2)
Note that this is how SQL stores long definitions - when you create the view, it stores the text into multiple syscomments records. In my case, the view is split into a text chunk of 3591 characters into the first syscomment record and the rest of the text is in the second record. I have no idea why SQL doesn't use all 4000 characters in the syscomment field. And the statement is broken in the middle of a word.
Please note that in all my examples, all @SQLQueryxxx variables are declared as varchar(max). I've also tried declaring them as nvarchar(max), varchar(8000) and nvarchar(8000), with the same results.
b. I then do a "Search and Replace" for "Company_A" and replace it with "Company_B". In the code below, the variable "@CompanyID" is first set to "Company_B":
SET @SQLQueryNew1 = @SQLQuery1
SET @SQLQueryNew1 = REPLACE(@SQLQueryNew1, 'Company_A', @CompanyID)
SET @SQLQueryNew2 = @SQLQuery2
SET @SQLQueryNew2 = REPLACE(@SQLQueryNew2, 'Company_A', @CompanyID)
c. I then try:
EXEC (@SQLQueryNew1 + @SQLQueryNew2)
The message returned indicates that it's trying to execute the statement truncated at the end of @SQLQueryNew1, i.e. approximately 80% of the query's text.
I've tried CAST'ing the final result into a new varchar(max) and nvarchar(max) - no luck
I've tried CAST'ing the original query to a new varchar(max) and nvarchar(max) - no luck
I've looked at the result of retrieving the original CREATE VIEW statement, and it's fine.
I've tried various other ways of retrieving the original CREATE VIEW statement, such as:
Set @SQLQuery1 = (select VIEW_DEFINITION
FROM [MY_DATABASE].[INFORMATION_SCHEMA].[VIEWS]
where TABLE_NAME = 'Company_A_ItemView')
This one returns only the first 4000 characters of the CREATE VIEW
Set @SQLQuery1 = (SELECT OBJECT_DEFINITION(@ObjectID))
If I do a
SELECT LEN(OBJECT_DEFINITION(@ObjectID))
it returns the correct length of the query (e.g. 5191), but if I look at @SQLQuery1, or try to
EXEC(@SQLQuery1), the statement is still truncated.
d. There are some references that state that since I'm manipulating the text of the query after retrieving it, the resulting variables are then truncated to 4000 characters. I've tried CAST'ing the result as I do the REPLACE, e.g.
SET @SQLQueryNew1 = (SELECT CAST(REPLACE(@SQLQueryNew1,
'Company_A',
@CompanyID) AS varchar(max)))
Same result.
Same result.
I know there are other methods, such as creating stored procedures for creating the views. But the views are being developed and are somewhat "in flux", so placing the text of the CREATE VIEW inside a stored proc is cumbersome. My goal is to be able to take Company_A's view and replicate it exactly - multiple times, except reference Company_B's view name and table names, Company_C's view name and table names, etc.
I'm wondering if there is anyone out there who has done this type of manipulation of a long SQL "CREATE VIEW" statement and try to execute it.
Just use VARCHAR(MAX) or NVARCHAR(MAX). They work fine for EXEC(string).
FYI,
Note that this is how SQL stores long definitions - when you create
the view, it stores the text into multiple syscomments records.
This is not correct. This is how it used to be done in SQL Server 2000. In SQL Server 2005 and later, definitions are saved as NVARCHAR(MAX) in a single entry in sys.sql_modules.
syscomments is still around, but it is retained read-only solely for compatibility.
So all you should need to do is to change your @SQLQuery1,2,etc. variables to a single NVARCHAR(MAX) variable, and pull your View code from the [definition] column of the sys.sql_modules table instead.
Note that you should be careful with your string manipulations as there are certain functions that will revert to (N)VARCHAR(4000) output if all of their input arguments are not (N)VARCHAR(MAX). (Sorry, I do not know which ones, but REPLACE() may be one). In fact, this may be what has been causing so much confusion in your tests.
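If the string functions keep biting, another option (a different technique from the pure T-SQL one above, so treat it as an alternative rather than the fix) is to do the search and replace on the client, where strings have no 4000-character ceiling, and send each finished script to the server. A minimal Python sketch follows; retarget_definition is a made-up helper name, and the padded definition with its [Field N] columns is a hypothetical stand-in for the text you would read from sys.sql_modules.

```python
def retarget_definition(definition: str, source_prefix: str, target_prefix: str) -> str:
    """Return the view definition with every source prefix swapped for the target."""
    return definition.replace(source_prefix, target_prefix)


# Stand-in for the real definition text, padded well past 4000 characters
# to mimic a long view. The column names are invented for the demo.
columns = ",\n".join(f"WE.[Field {i}]" for i in range(400))
definition = (
    "CREATE view [dbo].[Company_A_ItemView] as\nselect\n"
    + columns
    + "\nfrom [Company_A_Warehouse_Entry] WE"
)

# One complete CREATE VIEW script per target company.
scripts = {
    company: retarget_definition(definition, "Company_A", company)
    for company in ("Company_B", "Company_C")
}
# Each script could then be executed as a single statement,
# e.g. cursor.execute(scripts["Company_B"]).
```

Because the whole definition stays in one string from start to finish, there is no chunk boundary in the middle of a word to worry about.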
Declare your SQL variables (@SQLQuery1...) as nvarchar(4000).
Make sure each SQL part doesn't exceed 4000 bytes (copy each part to a text file and check the file size in bytes).
I remember reading a while back that randomly SQL Server can slow down and / or take a stupidly long time to execute a stored procedure when it is written like:
CREATE PROCEDURE spMyExampleProc
(
@myParameter INT
)
AS
BEGIN
SELECT something FROM myTable WHERE myColumn = @myParameter
END
The way to fix this error is to do this:
CREATE PROCEDURE spMyExampleProc
(
@myParameter INT
)
AS
BEGIN
DECLARE @newParameter INT
SET @newParameter = @myParameter
SELECT something FROM myTable WHERE myColumn = @newParameter
END
Now my question is firstly is it bad practice to follow the second example for all my stored procedures? This seems like a bug that could be easily prevented with little work, but would there be any drawbacks to doing this and if so why?
When I read about this, the problem was that the same proc would take varying times to execute depending on the value in the parameter. If anyone can tell me what this problem is called and why it occurs, I would be really grateful. I can't seem to find the link to the post anywhere, and it seems like a problem that could occur for our company.
The problem is "parameter sniffing" (SO Search)
The pattern with @newParameter is called "parameter masking" (also SO Search)
You could always use this masking pattern, but it isn't always needed. For example, a simple select by unique key, with no child tables or other filters, should behave as expected every time.
Since SQL Server 2008, you can also use the OPTIMIZE FOR UNKNOWN hint (SO). Also see Alternative to using local variables in a where clause and Experience with when to use OPTIMIZE FOR UNKNOWN
I modified a procedure and it now takes a greater number of parameters. How can I find every place that the procedure is called so I can update the number of arguments the proc is passed?
I tried this:
select * from syscomments where text like '%MODIFIED-PROCEDURE-NAME%'
but I'm still finding other places the proc is called that this query did not return.
Use sys.sql_modules:
SELECT
OBJECT_SCHEMA_NAME(m.object_id) + '.' + OBJECT_NAME(m.object_id)
FROM sys.sql_modules m
WHERE m.definition like '%whatever%'
sys.sql_modules.definition is nvarchar(max). Other similar views have nvarchar(4000) columns, where the text is split over multiple rows.
Get yourself Red-Gate SQL Search - it's great, it's FREE and it just works. It can be used to do exactly what you're looking for! Go grab it - it's worth its weight in gold!
If this is all inside of SQL server you could just recompile it.
Just create a single script containing all stored procedures and functions. Run the script. It'll bomb where the problems are.
Optionally, you could just search the script you created as well.