Strange Issue in SSIS with WITH RESULT SETS returning wrong number of columns - sql-server

So I have a stored procedure in SQL Server. I've simplified its code (for this question) to just this:
CREATE PROCEDURE dbo.DimensionLookup as
BEGIN
select DimensionID, DimensionField from DimensionTable
inner join Reference on Reference.ID = DimensionTable.ReferenceID
END
In SSIS on SQL Server 2012, I have a Lookup component with the following source command:
EXECUTE dbo.DimensionLookup WITH RESULT SETS (
(DimensionID int, DimensionField nvarchar(700) )
)
When I run this procedure in Preview mode in BIDS, it returns the two columns correctly. When I run the package in BIDS, it runs correctly.
But when I deploy it out to the SSIS catalog (the same server the database is on), point it to the same data sources, etc. - it fails with the message:
EXECUTE statement failed because its WITH RESULT SETS clause specified 2 column(s) for result set number 1, but the statement sent
3 column(s) at run time.
Steps Tried So Far:
Adding a third column to the result set - I get a different error, VS_NEEDSNEWMETADATA - which makes sense, and is more or less proof there's no actual third column.
SQL Profiler - I see this:
exec sp_prepare @p1 output,NULL,N'EXECUTE dbo.DimensionLookup WITH RESULT SETS ((
DimensionID int, DimensionField nvarchar(700)))',1
SET FMTONLY ON exec sp_execute 1 SET FMTONLY OFF
So it's trying to use FMTONLY to get the result set metadata... needless to say, running SET FMTONLY ON and then running the command in SSMS myself yields just the two columns (a sketch of that check follows this list).
SET NOCOUNT ON - Nothing changed.
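For reference, the manual check in SSMS looks roughly like this (FMTONLY is deprecated in 2012 but still honored; it returns only the column metadata, no rows):
-- Ask for metadata only, the way the old SSIS code path does
SET FMTONLY ON
EXECUTE dbo.DimensionLookup
SET FMTONLY OFF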
So, two other interesting things:
I deployed it out to my local SQL 2012 install and it worked fine, same connections, etc. So it may be a server / database configuration issue. I'm not sure what it could be; I didn't install the dev server, and my own install was pretty much a click-through vanilla setup.
Perhaps the most interesting thing. If I remove the join from the procedure's statement so it just becomes
select DimensionID, DimensionField from DimensionTable
It goes back to just sending 2 columns in the result set! So adding a join, without adding any additional output columns, ups the result set to 3 columns. Even if I add 6 more joins, it's still just 3 columns. So one guess is it's some sort of metadata column that only gets activated when there's a join.
Anyway, as you can imagine, it's driving me kind of mad. I have a workaround to load the data into a temp table and just return that, but why won't this work? What extra column is being sent back? Why only when I add a join?
Gah!

So all credit to billinkc: the root cause is a bug that has since been patched.
In Version 11.0.2100.60, SSIS Lookup SQL command metadata is gathered using the old SET FMTONLY method. Unfortunately, this doesn't work in 2012, as the Books Online entry on SET FMTONLY helpfully notes:
Do not use this feature. This feature has been replaced by sp_describe_first_result_set.
Too bad they didn't follow their own advice!
This has been patched as of version 11.0.2218.0. Metadata is correctly gathered using the sp_describe_first_result_set system stored procedure.
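If you want to see what that newer code path reports for your procedure, you can call sp_describe_first_result_set yourself in SSMS (this is just a diagnostic check, not part of the fix):
-- Ask SQL Server 2012+ to describe the first result set of the batch
EXEC sp_describe_first_result_set @tsql = N'EXECUTE dbo.DimensionLookup'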

This can also happen if the WITH RESULT SETS clause in SSIS declares more columns than the stored procedure actually returns. Check your stored proc and make sure its output columns match the WITH RESULT SETS definition exactly.

Related

VS 2012 - Comparing dates between tables to control progression of package steps

I've created an SSIS package that pulls data from various sources and aggregates it as needed for the business. The goal of this processing is to create a single table, for example "Data_Tableau". This table is the datasource for connected Tableau dashboards.
The Tableau dashboards need to be available during the processing, so I don't truncate "Data_Tableau" and re-populate with the SSIS package. Instead, the SSIS package steps create "Data_Stage". Then the final step of the package is a drop/rename, wherein I drop "Data_Tableau" and sp_rename "Data_Stage" to "Data_Tableau".
USE dbname
DROP TABLE Data_Tableau
EXEC sp_rename Data_Stage, Data_Tableau
Before this final step, I expect max(buydate) from "Data_Stage" to be greater than max(buydate) from "Data_Tableau", since "Data_Stage" would have additional records since the last time the process ran.
However, sometimes there are issues with upstream data and I end up with max(buydate) from "Data_Stage" = max(buydate) from "Data_Tableau". In such cases, I would not want the final drop/rename process to run. Instead, I want the job to fail and I'll send an alert to the appropriate upstream data team when I get the failure notification.
That's the long-winded background. My question is...how do I check the dates and cause a failure within the SSIS package. I'm using VS 2012.
I was thinking of creating a constraint before the final drop/rename step, but I haven't created variables or expressions before and am unsure how to achieve this.
I was also considering creating a 2-row table as follows:
SELECT MAX(buydate) 'MaxDate', 'Tableau' 'FieldType' FROM dbname.dbo.Data_Tableau
UNION ALL
SELECT MAX(buydate) 'MaxDate', 'Stage' 'FieldType' FROM dbname.dbo.Data_Stage
and then using a query against that table as some sort of constraint, but not sure if that makes any sense and/or is better than the option of creating variables/expressions.
Goal: If MAX(buydate) from "Data_Stage" > MAX(buydate) from "Data_Tableau", then I'd want the drop/rename step to run, otherwise it should fail and "Data_Tableau" will contain the same data as before the package ran.
Suggestions? Step-by-step instructions would be greatly appreciated.
I would do this by putting this:
Then the final step of the package is a drop/rename, wherein I drop
"Data_Tableau" and sp_rename "Data_Stage" to "Data_Tableau".
into a stored procedure that gets called by the SSIS package.
Then it's simply a matter of using an IF block before that part of the code:
-- pseudocode
IF (SELECT MAX(buydate) FROM Data_Stage) > (SELECT MAX(buydate) FROM Data_Tableau)
BEGIN
DROP TABLE Data_Tableau
EXEC sp_rename Data_Stage, Data_Tableau
END
ELSE
--do something else (or nothing at all)
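Fleshed out, the procedure might look something like this (the procedure name is made up, the table and column names come from the question, and THROW - available in SQL Server 2012+ - is used so the calling Execute SQL Task, and therefore the job, fails when the staged data isn't newer):
-- Illustrative name; call this from the final Execute SQL Task in the package
CREATE PROCEDURE dbo.SwapTableauTable
AS
BEGIN
    SET NOCOUNT ON;

    -- Only swap if the staged data is actually newer
    IF (SELECT MAX(buydate) FROM dbo.Data_Stage) > (SELECT MAX(buydate) FROM dbo.Data_Tableau)
    BEGIN
        DROP TABLE dbo.Data_Tableau;
        EXEC sp_rename 'Data_Stage', 'Data_Tableau';
    END
    ELSE
    BEGIN
        -- Raising an error here fails the Execute SQL Task so the package stops
        THROW 50000, 'Data_Stage has no new data; Data_Tableau left unchanged.', 1;
    END
END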

How to see result of select during Sql Server debugging

I am debugging a stored procedure in Sql Server, I can see the local variables in "Locals", I can add other variables in "Watches" (I have embedded a picture here with Sql Server in debug mode and different debug windows).
My question is: where can I see the result of the select statements during debugging? It is really helpful to see them as they are executed, more so when they read from temporary tables sometimes, which are local to that procedure.
Later edit:
I have followed the advice given below and am having this problem with XML viewer (please see attachment): "The XML page cannot be displayed"
From View contents of table variables and temp tables in SSMS debugger:
This won't be in immediately, but we are considering a feature similar
to this for a future release.
And the workaround (you need to add an additional XML variable for each table variable or temp table you want to see):
Declare @TableVar_xml xml
Set @TableVar_xml = (Select * from @TableVar for XML Auto, Elements xsinil);
Then I can look at the table variable contents using the XML viewer.
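In context, that line goes inside the procedure you are debugging, right after the table variable is populated; the table and column names below are purely illustrative:
-- Illustrative table variable being debugged
Declare @TableVar table (ID int, Name nvarchar(50))

Insert into @TableVar (ID, Name)
Select CustomerID, CustomerName from dbo.Customers   -- whatever populates it in your proc

-- Snapshot the contents into an XML variable you can open in the debugger's XML viewer
Declare @TableVar_xml xml
Set @TableVar_xml = (Select * from @TableVar for XML Auto, Elements xsinil)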

Does SQL Server Handle Views Differently Than Tables?

I ran into an interesting problem with a client. I sent over a .dll file that had a SQL statement within it. When the client ran this SQL query within Management Studio, it ran without issue and returned the proper results; but when he left it to SQL Server to run things with the sp_executesql command, the following error was received:
The conversion of the varchar value '201101456914' overflowed an int column. Maximum integer value exceeded.
(The number in the error above comes from a varchar column; why it's being converted to an integer is a mystery.) When he set up the data in a table and ran the query the same way (with sp_executesql), it worked perfectly fine.
Here is the query that ends up being fired with the sp_executesql:
exec sp_executesql N'SELECT OffBookAccountDescription, Value,
OffBookAccountId, DataAsOf FROM OffBook
WHERE OffBookCode = @code AND MemberNumber = @num',
N'@num int,@code char(1)',@num=5555,@code='U'
This causes the error to appear in the view but not when fired on a table with exactly the same data. The number in the above error message corresponds to the OffBookAccountId field in the query.
Is there something that SQL Server does in the back end to handle views differently than tables?

Stored procedure problem in MS Access caused by ReturnsRecords (accdb)

I have a relatively simple stored procedure that runs an insert and then attempts to return the last inserted ID. This is done so I can get the ID via SCOPE_IDENTITY(). This was working great for me. But then I got reports that, on some machines, the stored proc would cause duplicate records.
After investigating, I found that the cause was the use of the ReturnsRecords property. When true, it will run the query twice! For a SELECT, who cares; in this case, though, it is causing duplicates in my database.
Setting ReturnsRecords to false gets rid of the problem, but then it defeats the purpose of the stored proc (I absolutely must get the proper last inserted ID for the record)!
My question is simply this: How would I go about inserting this record and getting the ID of the new record, while getting around this problem?
Additional Info:
I am currently using DAO
I have tried the ADO.Command method, but it is very error prone and doesn't seem to work with output parameters for me.
I am using the stored proc solely for the purpose of retaining scope. I do not have my heart set on using a stored proc. I simply need a reliable way to get the id of the last inserted row.
This is an ACCDB
This is happening in Access 2007
my DB backend is MSSQL Server 2008
Any help or insight is appreciated.
One of your parameters in the procedure can be set to output. Still don't return any rows, but set the value of that parameter to Scope_Identity()
create proc ReturnTheNewID
@NewValue int
, @ReturnNewID int output
as
set nocount on
insert ....
set @ReturnNewID = Scope_identity()
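A quick way to sanity-check the procedure from SSMS (the parameter value here is made up) is to call it with a local variable bound to the output parameter:
-- Test call: run the insert and read back the new ID
Declare @NewID int
Exec dbo.ReturnTheNewID @NewValue = 42, @ReturnNewID = @NewID output
Select @NewID as NewRecordID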

SqlDataAdapter.Fill method slow

Why would a stored procedure that returns a table with 9 columns and 89 rows take 60 seconds to execute with this code (.NET 1.1), when it takes < 1 second to run in SQL Server Management Studio? It's being run on the local machine, so there's little/no network latency, and it's a fast dev machine.
Dim command As SqlCommand = New SqlCommand(procName, CreateConnection())
command.CommandType = CommandType.StoredProcedure
command.CommandTimeout = _commandTimeOut
Try
    Dim adapter As New SqlDataAdapter(command)
    Dim i As Integer
    For i = 0 To parameters.Length - 1
        command.Parameters.Add(parameters(i))
    Next
    adapter.Fill(tableToFill)
    adapter.Dispose()
Finally
    command.Dispose()
End Try
my parameter array is typed (for this SQL it's only a single parameter)
parameters(0) = New SqlParameter("@UserID", SqlDbType.BigInt, 0, ParameterDirection.Input, True, 19, 0, "", DataRowVersion.Current, userID)
The Stored procedure is only a select statement like so:
ALTER PROC [dbo].[web_GetMyStuffFool]
(@UserID BIGINT)
AS
SELECT Col1, Col2, Col3, Col3, Col3, Col3, Col3, Col3, Col3
FROM [Table]
First, make sure you are profiling the performance properly. For example, run the query twice from ADO.NET and see if the second time is much faster than the first time. This removes the overhead of waiting for the app to compile and the debugging infrastructure to ramp up.
Next, check the default settings in ADO.NET and SSMS. For example, if you run SET ARITHABORT OFF in SSMS, you might find that it now runs as slow as when using ADO.NET.
What I found once was that SET ARITHABORT OFF in SSMS caused the stored proc to be recompiled and/or different statistics to be used. And suddenly both SSMS and ADO.NET were reporting roughly the same execution time. Note that ARITHABORT is not itself the cause of the slowdown, it's that it causes a recompilation, and you are ending up with two different plans due to parameter sniffing. It is likely that parameter sniffing is the actual problem needing to be solved.
To check this, look at the execution plans for each run, specifically via the sys.dm_exec_cached_plans dynamic management view. They will probably be different.
Running 'sp_recompile' on a specific stored procedure will drop the associated execution plan from the cache, which then gives SQL Server a chance to create a possibly more appropriate plan at the next execution of the procedure.
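A rough way to do both from SSMS (the LIKE filter on the procedure name is just one way to narrow the list):
-- Find cached plans whose text mentions the procedure
SELECT cp.plan_handle, cp.usecounts, cp.objtype, st.text
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE st.text LIKE '%web_GetMyStuffFool%'

-- Evict just this procedure's plan so the next call compiles a fresh one
EXEC sp_recompile 'dbo.web_GetMyStuffFool'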
Finally, you can try the "nuke it from orbit" approach of cleaning out the entire procedure cache and memory buffers using SSMS:
DBCC DROPCLEANBUFFERS
DBCC FREEPROCCACHE
Doing so before you test your query prevents usage of cached execution plans and previous results cache.
Here is what I ended up doing:
I executed the following SQL statement to rebuild the indexes on all tables in the database:
EXEC <databasename>..sp_MSforeachtable @command1='DBCC DBREINDEX (''*'')', @replacechar='*'
-- Replace <databasename> with the name of your database
If I wanted to see the same behavior in SSMS, I ran the proc like this:
SET ARITHABORT OFF
EXEC [dbo].[web_GetMyStuffFool] @UserID=1
SET ARITHABORT ON
Another way to bypass this is to add this to your code:
MyConnection.Execute "SET ARITHABORT ON"
I ran into the same issue, but once I rebuilt the indexes on the SQL table it worked fine, so you might want to consider rebuilding the indexes on the SQL Server side.
Why not make it a DataReader instead of a DataAdapter? It looks like you have a single result set, and if you aren't going to push changes back to the DB and don't need constraints applied in .NET code, you shouldn't use the Adapter.
EDIT:
If you need it to be a DataTable, you can still pull the data from the DB via a DataReader and then, in .NET code, use the DataReader to populate a DataTable. That should still be faster than relying on the DataSet and DataAdapter.
I don't know "Why" it's so slow per se - but as Marcus is pointing out - comparing Mgmt Studio to filling a dataset is apples to oranges. Datasets contain a LOT of overhead. I hate them and NEVER use them if I can help it.
You may be having issues with mismatches of old versions of the SQL stack or some such (especially given you are obviously stuck on .NET 1.1 as well). The Framework is likely trying to do the database equivalent of "Reflection" to infer schema and so on.
One thing to try, given your unfortunate constraint, is to access the database with a DataReader and build your own DataSet in code. You should be able to find samples easily via Google.
