Need to speed up SQL Server SP that uses system metadata - sql-server

Let me apologize in advance for the length of this question. I don't see how to ask it without giving all the definitions.
I've inherited a SQL Server 2005 database that includes a homegrown implementation of change tracking. Through triggers, changes to virtually every field in the database are stored in a set of three tables. In the application for this database, the user can request the history of various items, and what's returned is not just changes to the item itself, but also changes in related tables. The problem is that in some cases, it's painfully slow, and in some cases, the request eventually crashes the application. The client has also reported other users having problems when someone requests history.
The tables that store the change data are as follows:
CREATE TABLE [dbo].[tblSYSChangeHistory](
[id] [bigint] IDENTITY(1,1) NOT NULL,
[date] [datetime] NULL,
[obj_id] [int] NULL,
[uid] [varchar](50) NULL
)
This table tracks the tables that have been changed. Obj_id is the value that Object_ID() returns.
CREATE TABLE [dbo].[tblSYSChangeHistory_Items](
[id] [bigint] IDENTITY(1,1) NOT NULL,
[h_id] [bigint] NOT NULL,
[item_id] [int] NULL,
[action] [tinyint] NULL
)
This table tracks the items that have been changed. h_id is a foreign key to tblSYSChangeHistory. item_id is the PK of the changed item in the specified table. action indicates insert, delete or change.
CREATE TABLE [dbo].[tblSYSChangeHistory_Details](
[id] [bigint] IDENTITY(1,1) NOT NULL,
[i_id] [bigint] NOT NULL,
[col_id] [int] NOT NULL,
[prev_val] [varchar](max) NULL,
[new_val] [varchar](max) NULL
)
This table tracks the individual changes. i_id is a foreign key to tblSYSChangeHistory_Items. col_id indicates which column was changed, and prev_val and new_val indicate the original and new values for that field.
There's actually a fourth table that supports this architecture. tblSYSChangeHistory_Objects maps plain English descriptions of operations to particular tables in the database.
The code to look up the history for an item is incredibly convoluted. It's one branch of a very long SP. Relevant parameters are as follows:
@action varchar(50),
@obj_id bigint = 0,
@uid varchar(50) = '',
@prev_val varchar(MAX) = '',
@new_val varchar(MAX) = '',
@start_date datetime = '',
@end_date datetime = ''
I'm storing them to local variables right away (because I was able to significantly speed up another SP by doing so):
declare @iObj_id bigint,
    @cUID varchar(50),
    @cPrev_val varchar(max),
    @cNew_val varchar(max),
    @tStart_date datetime,
    @tEnd_date datetime
set @iObj_id = @obj_id
set @cUID = @uid
set @cPrev_val = @prev_val
set @cNew_val = @new_val
set @tStart_date = @start_date
set @tEnd_date = @end_date
And here's the code from that branch of the SP:
create table #r (obj_id int, item_id int, l tinyint)
create clustered index #ri on #r (obj_id, item_id)
insert into #r
select object_id(obj_name), @iObj_id, 0
from dbo.tblSYSChangeHistory_Objects
where obj_type = 'U' and descr = cast(@cPrev_val AS varchar(150))
declare @i tinyint, @cnt int
set @i = 1
while @i <= 4
begin
    insert into #r
    select obj_id, item_id, @i
    from dbo.vSYSChangeHistoryFK a with (nolock)
    where exists (select null from #r where obj_id = a.rel_obj_id and item_id = a.rel_item_id and l = @i - 1)
    and not exists (select null from #r where obj_id = a.obj_id and item_id = a.item_id)
    set @cnt = @@rowcount
    insert into #r
    select rel_obj_id, rel_item_id, @i
    from dbo.vSYSChangeHistoryFK a with (nolock)
    where object_name(obj_id) not in (<this is a list of particular tables in the database>)
    and exists (select null from #r where obj_id = a.obj_id and item_id = a.item_id and l between @i - 1 and @i)
    and not exists (select null from #r where obj_id = a.rel_obj_id and item_id = a.rel_item_id)
    set @i = case @cnt + @@rowcount when 0 then 100 else @i + 1 end
end
select date, obj_name, item, [uid], [action],
    pkey, item_id, id, key_obj_id into #tCH_R
from dbo.vSYSChangeHistory a with (nolock)
where exists (select null from #r where obj_id = a.obj_id and item_id = a.item_id)
and (@cUID = '' or uid = @cUID)
and (@cNew_val = '' or [action] = @cNew_val)
declare ch_item_cursor cursor for
    select distinct pkey, key_obj_id, item_id
    from #tCH_R
    where item = '' and pkey <> ''
open ch_item_cursor
fetch next from ch_item_cursor
    into @cPrev_val, @iObj_id, @iCol_id
while @@fetch_status = 0
begin
    set @SQLStr = 'select @val = ' + @cPrev_val +
        ' from ' + object_name(@iObj_id) + ' with (nolock)' +
        ' where id = @id'
    exec sp_executesql @SQLStr,
        N'@val varchar(max) output, @id int',
        @cNew_val output, @iCol_id
    update #tCH_R
    set item = @cNew_val
    where key_obj_id = @iObj_id
    and item_id = @iCol_id
    fetch next from ch_item_cursor
        into @cPrev_val, @iObj_id, @iCol_id
end
close ch_item_cursor
deallocate ch_item_cursor
select date, obj_name,
    cast(item AS varchar(254)) AS item,
    uid, [action],
    cast(id AS int) AS id
from #tCH_R
order by id
return
As you can see, the code uses a view. Here's that definition:
ALTER VIEW [dbo].[vSYSChangeHistoryFK]
AS
SELECT i.obj_id, i.item_id, c1.parent_object_id AS rel_obj_id, i2.item_id AS rel_item_id
FROM dbo.vSYSChangeHistoryItemsD AS i INNER JOIN
sys.foreign_key_columns AS c1 ON c1.referenced_object_id = i.obj_id AND c1.constraint_column_id = 1 INNER JOIN
dbo.vSYSChangeHistoryItemsD AS i2 ON c1.parent_object_id = i2.obj_id INNER JOIN
dbo.tblSYSChangeHistory_Details AS d1 ON d1.i_id = i.min_id AND d1.col_id = c1.referenced_column_id INNER JOIN
dbo.tblSYSChangeHistory_Details AS d1k ON d1k.i_id = i2.min_id AND d1k.col_id = c1.parent_column_id AND ISNULL(d1.new_val,
ISNULL(d1.prev_val, '')) = ISNULL(d1k.new_val, ISNULL(d1k.prev_val, '')) --LEFT OUTER JOIN
UNION ALL
SELECT i0.obj_id, i0.item_id, c01.parent_object_id AS rel_obj_id, i02.item_id AS rel_item_id
FROM dbo.vSYSChangeHistoryItemsD AS i0 INNER JOIN
sys.foreign_key_columns AS c01 ON c01.referenced_object_id = i0.obj_id AND c01.constraint_column_id = 1 AND col_name(c01.referenced_object_id,
c01.referenced_column_id) = 'ID' INNER JOIN
dbo.vSYSChangeHistoryItemsD AS i02 ON c01.parent_object_id = i02.obj_id INNER JOIN
dbo.tblSYSChangeHistory_Details AS d01k ON i02.min_id = d01k.i_id AND d01k.col_id = c01.parent_column_id AND ISNULL(d01k.new_val,
d01k.prev_val) = CAST(i0.item_id AS varchar(max))
And finally, that view uses one more view:
ALTER VIEW [dbo].[vSYSChangeHistoryItemsD]
AS
SELECT h.obj_id, m.item_id, MIN(m.id) AS min_id
FROM dbo.tblSYSChangeHistory AS h INNER JOIN
dbo.tblSYSChangeHistory_Items AS m ON h.id = m.h_id
GROUP BY h.obj_id, m.item_id
Working with the Profiler, it appears that view vSYSChangeHistoryFK is the big culprit, and my testing suggests that the particular problem is in the join between the two copies of vSYSChangeHistoryItemsD and the foreign_key_columns table.
I'm looking for any ideas on how to give acceptable performance here. The client reports sometimes waiting as much as 15 minutes without getting results. I've tested up to nearly 10 minutes with no result in at least one case.
If there were new language elements in 2008 or later that would solve this, I think the client would be willing to upgrade.
Thanks.

Wow, that's a mess. Your big gain should come from removing the cursor. I see 'where exists', which is nice and efficient because it stops as soon as it finds one match. And I see 'where not exists', which by definition has to scan everything. Is it finding the top 4? You can do better with ROW_NUMBER() OVER (PARTITION BY [whatever makes it unique] ORDER BY [whatever your id is]). It's hard to tell. select object_id(obj_name), @iObj_id, 0 makes it seem like only the @i = 1 loop actually does anything (?)
If that is what it's doing, you could write it as
SELECT * from
(
select ROW_NUMBER() OVER (PARTITION BY obj_id ORDER BY item_id desc) as Row,
obj_id, item_id
FROM dbo.vSYSChangeHistoryFK a with (nolock)
where obj_type = 'U' and descr = cast(@cPrev_val AS varchar(150))
) paged
where Row between 1 and 4
ORDER BY Row
A DBA level change that could help would be to set up a partitioning scheme based on date. Roll over to a new partition every so often. Put the old partitions on different disks. Most queries may only need to hit the recent partition, which will be say 1/5th the size that it used to be, making it much faster without changing anything else.
Not a full answer, sorry. That mess would take hours to parse
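One thing that may be worth trying, given the profiler finding in the question: every reference to vSYSChangeHistoryFK inside the loop re-runs the GROUP BY in vSYSChangeHistoryItemsD twice and joins it to sys.foreign_key_columns. A minimal sketch, assuming it is acceptable to snapshot both into indexed temp tables once at the top of the branch (the names #items and #fk are made up for illustration, and only the first half of the view's UNION ALL is shown):

```sql
-- Snapshot the aggregated view once instead of re-aggregating per loop pass.
SELECT obj_id, item_id, min_id
INTO #items
FROM dbo.vSYSChangeHistoryItemsD;

CREATE CLUSTERED INDEX ix_items ON #items (obj_id, item_id);

-- Snapshot the single-column foreign key metadata the view filters on.
SELECT referenced_object_id, referenced_column_id,
       parent_object_id, parent_column_id
INTO #fk
FROM sys.foreign_key_columns
WHERE constraint_column_id = 1;

-- First branch of vSYSChangeHistoryFK, rewritten against the snapshots.
SELECT i.obj_id, i.item_id,
       fk.parent_object_id AS rel_obj_id,
       i2.item_id AS rel_item_id
FROM #items AS i
INNER JOIN #fk AS fk
    ON fk.referenced_object_id = i.obj_id
INNER JOIN #items AS i2
    ON i2.obj_id = fk.parent_object_id
INNER JOIN dbo.tblSYSChangeHistory_Details AS d1
    ON d1.i_id = i.min_id AND d1.col_id = fk.referenced_column_id
INNER JOIN dbo.tblSYSChangeHistory_Details AS d1k
    ON d1k.i_id = i2.min_id AND d1k.col_id = fk.parent_column_id
   AND ISNULL(d1.new_val, ISNULL(d1.prev_val, ''))
     = ISNULL(d1k.new_val, ISNULL(d1k.prev_val, ''));
```

Whether this helps depends on the data volumes and on the indexes on tblSYSChangeHistory_Details, so compare the execution plans before and after. Nothing here needs a version newer than SQL Server 2005.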

Related

How do I loop through a table, search with that data, and then return search criteria and result to new table?

I have a set of records that need to be validated (searched) in a SQL table. I will call these ValData and SearchTable respectively. A colleague created a SQL query in which a record from the ValData can be copied and pasted in to a string variable, and then it is searched in the SearchTable. The best result from the SearchTable is returned. This works very well.
I want to automate this process. I loaded the ValData to SQL in a table like so:
RowID INT, FirstName, LastName, DOB, Date1, Date2, TextDescription.
I want to loop through this set of data, by RowID, and then create a result table that is the ValData joined with the best match from the SearchTable. Again, I already have a query that does that portion. I just need the loop portion, and my SQL skills are virtually non-existent.
Pseudocode would be:
DECLARE @SearchID INT = 1
DECLARE @MaxSearchID INT = 15000
DECLARE @FName VARCHAR(50) = ''
DECLARE @LName VARCHAR(50) = ''
-- etc...
WHILE @SearchID <= @MaxSearchID
BEGIN
    SET @FName = (SELECT [Fname] FROM ValData WHERE [RowID] = @SearchID)
    SET @LName = (SELECT [Lname] FROM ValData WHERE [RowID] = @SearchID)
    -- etc...
    -- Do colleague's query, and then insert(?) search criteria joined with the result from the SearchTable into a temporary result table.
    SET @SearchID = @SearchID + 1
END
SELECT * FROM FinalResultTable;
My biggest lack of knowledge comes in how do I create a temporary result table that is ValData's fields + SearchTable's fields, and during the loop iterations how do I add one row at a time to this temporary result table that includes the ValData joined with the result from the SearchTable?
If it helps, I'm using/wanting to join all fields from ValData and all fields from SearchTable.
Wouldn't this be far easier with a query like this..?
SELECT FNAME,
LNAME
FROM ValData
WHERE (FName = @Fname
OR LName = @Lname)
AND RowID <= @MaxSearchID
ORDER BY RowID ASC;
There is literally no reason to use a WHILE other than to destroy performance of the query.
With a bit more trial and error, I was able to answer what I was looking for (which, at its core, was creating a temp table and then inserting rows in to it).
CREATE TABLE #RESULTTABLE(
[feedname] VARCHAR(100),
...
[SCORE] INT,
[Max Score] INT,
[% Score] FLOAT(4),
[RowID] SMALLINT
)
SET @SearchID = 1
SET @MaxSearchID = (SELECT MAX([RowID]) FROM ValidationData)
WHILE @SearchID <= @MaxSearchID
BEGIN
    SET @FNAME = (SELECT [Fname] FROM ValidationData WHERE [RowID] = @SearchID)
    ...
    --BEST MATCH QUERY HERE
    --Select the "top" best match (order not guaranteed) into the RESULTTABLE.
    INSERT INTO #RESULTTABLE
    SELECT TOP 1 *, @SearchID AS RowID
    --INTO #RESULTTABLE
    FROM #TABLE3
    WHERE [% Score] IN (SELECT MAX([% Score]) FROM #TABLE3)
    --Drop temp tables that were created/used during best match query.
    DROP TABLE #TABLE1
    DROP TABLE #TABLE2
    DROP TABLE #TABLE3
    SET @SearchID = @SearchID + 1
END;
--Join the data that was validated (searched) to the results that were found.
SELECT *
FROM ValidationData vd
LEFT JOIN #RESULTTABLE rt ON rt.[RowID] = vd.[RowID]
ORDER BY vd.[RowID]
DROP TABLE #RESULTTABLE
I know this could be improved by doing a join, probably with the "BEST MATCH QUERY" as an inner query. I am just not that skilled yet. This takes a manual process which took hours upon hours and shortens it to just an hour or so.
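The join-based version mentioned above could look roughly like the following sketch. The real best-match query is not shown in the thread, so the scoring here is purely hypothetical: it assumes SearchTable has Fname, Lname and DOB columns, that ValidationData carries the same three, and that the score is just the number of fields that match. OUTER APPLY runs the inner query once per ValidationData row and keeps the unmatched rows, much like the LEFT JOIN above.

```sql
-- Assumption: SearchTable(Fname, Lname, DOB) and a count-of-matching-fields
-- score stand in for the real best-match logic, which is not shown.
SELECT vd.RowID,
       vd.Fname,
       vd.Lname,
       best.Fname AS MatchFname,
       best.Lname AS MatchLname,
       best.Score
FROM ValidationData AS vd
OUTER APPLY (
    -- Best-scoring SearchTable row for the current ValidationData row.
    SELECT TOP (1) st.Fname, st.Lname, s.Score
    FROM SearchTable AS st
    CROSS APPLY (
        SELECT CASE WHEN st.Fname = vd.Fname THEN 1 ELSE 0 END
             + CASE WHEN st.Lname = vd.Lname THEN 1 ELSE 0 END
             + CASE WHEN st.DOB   = vd.DOB   THEN 1 ELSE 0 END AS Score
    ) AS s
    ORDER BY s.Score DESC
) AS best
ORDER BY vd.RowID;
```

As with the TOP 1 in the loop, ties on Score are broken arbitrarily unless more columns are added to the ORDER BY.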

How to insert data into a temporary table using an existing table and new columns

I am trying to insert data into a temporary table within my stored procedure. The data is selected from an existing table and creating new columns with concatenated data. I'm getting an error that the column name or number of supplied values does not match table definition. I'm pretty certain that the code in my application is correct so I believe the issue is with the way I'm storing the data in a temporary table.
Here is my proc:
AS
BEGIN
CREATE TABLE #TempTable
(
[ID] [varchar](10),
[FIRST_NAME] varchar(50),
[LAST_NAME] varchar(50),
[WEBSITE_LINK] varchar(200)
)
INSERT INTO #TempTable
SELECT USER.ID,USER.FIRSTNAME AS [FIRST_NAME], USER.LASTNAME AS
[LAST_NAME]
FROM USER
WHERE USER.Registered = 'Yes'
DECLARE @Link1 NVARCHAR(100)
DECLARE @Link2 VARCHAR(10)
DECLARE @Link3 NVARCHAR(4)
SET @Link1 = 'http://www.mywebsite.com/user/'
SET @Link2 = (SELECT USER.ID FROM USER WHERE USER.Registered =
'Yes')
SET @Link3 ='/document.doc'
SET @WEBSITE_LINK = (SELECT concat(@Link1,@Link2,@Link3 )AS
[WEBSITE_LINK])
DROP TABLE #TempTable
END
I think this is your problem:
SET @Link2 = (SELECT USER.ID FROM USER WHERE USER.Registered = 'Yes')
What if there are six of them? A single variable can't hold all of them. You can do:
SELECT TOP(1) @Link2 = [USER].ID FROM [USER] WHERE [USER].Registered = 'Yes' ORDER BY [SOMETHING];
If the goal is to create a temp table with a full [WEBSITE_LINK], you can do that without all those variables:
BEGIN
CREATE TABLE #TempTable
(
[ID] [varchar](10),
[FIRST_NAME] varchar(50),
[LAST_NAME] varchar(50),
[WEBSITE_LINK] varchar(200)
)
INSERT INTO #TempTable
SELECT DISTINCT u.ID
, [FIRST_NAME] = u.FIRSTNAME
, [LAST_NAME] = u.LASTNAME
, [WEBSITE_LINK] = 'http://www.mywebsite.com/user/' +
CAST(u.ID AS VARCHAR(10)) +
'/document.doc'
FROM [USER] u
WHERE u.Registered = 'Yes'
-- Do something with these values...
DROP TABLE #TempTable
END
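The error quoted in the question ("column name or number of supplied values does not match table definition") is what SQL Server raises when an INSERT with no column list supplies a different number of values than the table has columns. The original INSERT supplied three values for a four-column table. A minimal sketch of the alternative to supplying all four columns, which is to name the target columns explicitly and fill the link in afterwards (table and column names are taken from the question):

```sql
CREATE TABLE #TempTable
(
    [ID] varchar(10),
    [FIRST_NAME] varchar(50),
    [LAST_NAME] varchar(50),
    [WEBSITE_LINK] varchar(200)
);

-- Naming the target columns lets the INSERT supply only three values;
-- WEBSITE_LINK is left NULL for now.
INSERT INTO #TempTable ([ID], [FIRST_NAME], [LAST_NAME])
SELECT u.ID, u.FIRSTNAME, u.LASTNAME
FROM [USER] AS u
WHERE u.Registered = 'Yes';

-- Build the link per row from the row's own ID.
UPDATE #TempTable
SET [WEBSITE_LINK] = 'http://www.mywebsite.com/user/' + [ID] + '/document.doc';

DROP TABLE #TempTable;
```

USER is a reserved word in T-SQL, which is why the table name is bracketed here.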

Select on a table with 2 possible structures

I'm trying to write a query that will select data from a table. Due to different versions of the database, there are 2 possible structures for the source table, where the newer version has 2 more fields than the old one.
I've tried identifying the older structure and replacing the columns with NULL, and I've also tried writing 2 separate queries with an IF statement directing to the correct one. Neither of these solutions works, and in both cases it seems that the SQL engine fails when validating these 2 columns.
Examples of my attempted solutions:
IF NOT EXISTS (SELECT *
FROM sys.objects
WHERE object_id = Object_id(N'[dbo].[Test2]')
AND type IN ( N'U' ))
BEGIN
CREATE TABLE [dbo].[test2]
(
[id] [INT] IDENTITY(1, 1) NOT NULL,
[statusid] [INT] NULL
)
END
go
DECLARE @Flag INT = 0
IF EXISTS(SELECT 1
FROM sys.columns
WHERE NAME = N'TestId'
AND object_id = Object_id(N'dbo.Test2'))
SET @Flag = 1
--Solution #1
IF @Flag = 1
SELECT id,
statusid,
testid
FROM dbo.test2
ELSE
SELECT id,
statusid
FROM dbo.test2
--Solution #2
SELECT id,
statusid,
CASE
WHEN @Flag = 1 THEN testid
ELSE NULL
END AS TestId
FROM dbo.test2
You can use dynamic SQL and generate the query according to the value of @flag:
declare @sql nvarchar(max)
select @sql = N'select id, statusid, '
+ case when @flag = 1 then 'testid' else 'NULL' end + ' as testid'
+ ' from dbo.test2'
print @sql
exec sp_executesql @sql
But dynamic SQL is harder to write and maintain when the query is complex.
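The reason both attempts in the question fail is that SQL Server compiles the whole batch before running any of it, and a reference to a column that does not exist on an existing table is a compile-time error even inside a branch that would never execute. Dynamic SQL sidesteps this because the string is only compiled when it is executed. A minimal variant of the approach above, assuming COL_LENGTH is an acceptable existence check (it returns NULL when the column is missing) and that testid is an int:

```sql
DECLARE @sql nvarchar(max);

-- COL_LENGTH returns NULL if dbo.test2 has no testid column.
IF COL_LENGTH(N'dbo.test2', N'testid') IS NOT NULL
    SET @sql = N'SELECT id, statusid, testid FROM dbo.test2;';
ELSE
    -- Assumption: testid is an int; the cast keeps the result shape stable.
    SET @sql = N'SELECT id, statusid, CAST(NULL AS int) AS testid FROM dbo.test2;';

EXEC sys.sp_executesql @sql;
```

Keeping each variant as one whole statement, rather than splicing fragments together, makes the two shapes easier to read and test separately.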

Sybase ASE identify columns of keys of multiple tables

I'm trying to identify the columns making up keys in ASE.
Sybase has the solution listed here: http://infocenter.sybase.com/help/index.jsp?topic=/com.sybase.infocenter.help.ase.15.5/title.htm
I have a slightly modified version below, however it only works (just as sybase's solution) if I look up for a single table, but I want to use the 'in' keyword and look up all the tables in one shot.
Could I get some help, as to why the solution below does not work? It only generates the list of columns for 't5' table.
declare @keycnt integer
declare @objname varchar(256)
select @keycnt = keycnt, @objname = sysobjects.name from sysindexes, sysobjects
where
--sysobjects.id = object_id("t5")
--sysobjects.id = object_id("t4")
sysobjects.id in (object_id("t5"), object_id("t4"))
and sysobjects.id = sysindexes.id
and indid = 1
while @keycnt > 0
begin
    select index_col(@objname, 1, @keycnt)
    select @keycnt = @keycnt - 1
end
These are the tables I'm using for testing:
CREATE TABLE t4(
[value] [varchar] (500) not NULL ,
CONSTRAINT pk_g4 PRIMARY KEY CLUSTERED (
[value]
)
)
CREATE TABLE t5(
[myvalue] [varchar] (500) not NULL ,
CONSTRAINT pk_g4 PRIMARY KEY CLUSTERED (
[myvalue]
)
)
You have two solutions:
Using OR
declare @keycnt integer
declare @objname varchar(256)
select @keycnt = keycnt, @objname = sysobjects.name from sysindexes, sysobjects
where
--sysobjects.id = object_id("t5")
--sysobjects.id = object_id("t4")
(sysobjects.id = object_id("t5") OR sysobjects.id = object_id("t4"))
and sysobjects.id = sysindexes.id
and indid = 1
while @keycnt > 0
begin
    select index_col(@objname, 1, @keycnt)
    select @keycnt = @keycnt - 1
end
Or Using Dynamic SQL to properly use the IN.

SQL-Server Trigger on update for Audit

I can't find an easy/generic way to register to an audit table the columns changed on some tables.
I tried to do it using a Trigger on after update in this way:
First of all the Audit Table definition:
CREATE TABLE [Audit](
[Id] [int] IDENTITY(1,1) NOT NULL,
[Date] [datetime] NOT NULL default GETDATE(),
[IdTypeAudit] [int] NOT NULL, --2 for Modify
[UserName] [varchar](50) NULL,
[TableName] [varchar](50) NOT NULL,
[ColumnName] [varchar](50) NULL,
[OldData] [varchar](50) NULL,
[NewData] [varchar](50) NULL )
Next a trigger on AFTER UPDATE in any table:
DECLARE
@sql varchar(8000),
@col int,
@colcount int
select @colcount = count(*) from INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = 'MyTable'
set @col = 1
while(@col < @colcount )
begin
set @sql=
'INSERT INTO Audit
SELECT 2, UserNameLastModif, ''MyTable'', COL_NAME(Object_id(''MyTable''), '+ convert(varchar,@col) +'), Deleted.'
+ COL_NAME(Object_id('MyTable'), @col) + ', Inserted.' + COL_NAME(Object_id('MyTable'), @col) + '
FROM Inserted LEFT JOIN Deleted ON Inserted.[MyTableId] = Deleted.[MyTableId]
WHERE COALESCE(Deleted.' + COL_NAME(Object_id('MyTable'), @col) + ', '''') <> COALESCE(Inserted.' + COL_NAME(Object_id('MyTable'), @col) + ', '''')'
--UserNameLastModif is an optional column on MyTable
exec(@sql)
set @col = @col + 1
end
The problems:
Inserted and Deleted lose their context when I use the exec function.
The column number isn't always sequential: if you create a table with 20 columns, then drop one and add another, the last one gets a number > @colcount.
I have looked all over the net for a solution but couldn't figure one out.
Any ideas?
Thanks!
This highlights a greater problem with structural choice. Try to write a set-based solution. Remove the loop and dynamic SQL and write a single statement that inserts the Audit rows. It is possible but to make it easier consider a different table layout, like keeping all columns on 1 row instead of splitting them.
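A set-based version of the trigger, keeping the existing Audit layout, might look like the sketch below. It assumes SQL Server 2008 or later (for the VALUES row constructor inside CROSS APPLY), and the audited columns Name and Email are hypothetical stand-ins for the real columns of MyTable. SUSER_SNAME() is used for the user name here instead of the optional UserNameLastModif column.

```sql
CREATE TRIGGER trg_MyTable_Audit ON dbo.MyTable
AFTER UPDATE
AS
BEGIN
    SET NOCOUNT ON;

    -- One INSERT covers every updated row and every audited column.
    INSERT INTO dbo.Audit
        (IdTypeAudit, UserName, TableName, ColumnName, OldData, NewData)
    SELECT 2, SUSER_SNAME(), 'MyTable', c.ColumnName, c.OldData, c.NewData
    FROM inserted AS i
    INNER JOIN deleted AS d
        ON d.MyTableId = i.MyTableId
    -- Turn each audited column into its own (name, old, new) row.
    -- Name and Email are hypothetical; list the real columns here.
    CROSS APPLY (VALUES
        ('Name',  CAST(d.Name  AS varchar(50)), CAST(i.Name  AS varchar(50))),
        ('Email', CAST(d.Email AS varchar(50)), CAST(i.Email AS varchar(50)))
    ) AS c (ColumnName, OldData, NewData)
    -- Keep only columns whose value changed, treating NULL as a value.
    WHERE c.OldData <> c.NewData
       OR (c.OldData IS NULL AND c.NewData IS NOT NULL)
       OR (c.OldData IS NOT NULL AND c.NewData IS NULL);
END
```

Because nothing runs through exec, inserted and deleted stay in scope, and no column numbering is involved. The column list has to be maintained by hand or generated by a script when the table changes.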
In SQL 2000 use syscolumns. In SQL 2005+ use sys.columns. i.e.
SELECT column_id FROM sys.columns WHERE object_id = OBJECT_ID(DB_NAME()+'.dbo.Table');
@Santiago: If you still want to write it in dynamic SQL, you should prepare all of the statements first and then execute them.
8000 characters may not be enough for all the statements. A good solution is to use a table to store them.
IF NOT OBJECT_ID('tempdb..#stmt') IS NULL
DROP TABLE #stmt;
CREATE TABLE #stmt (ID int NOT NULL IDENTITY(1,1), SQL varchar(8000) NOT NULL);
Then replace the line exec(@sql) with INSERT INTO #stmt (SQL) VALUES (@sql);
Then exec each row.
WHILE EXISTS (SELECT TOP 1 * FROM #stmt)
BEGIN
    BEGIN TRANSACTION;
    -- EXEC cannot take a subquery, so read the statement into @sql first.
    SELECT TOP 1 @sql = SQL FROM #stmt ORDER BY ID;
    EXEC (@sql);
    DELETE FROM #stmt WHERE ID = (SELECT MIN(ID) FROM #stmt);
    COMMIT TRANSACTION;
END
Remember to use sys.columns for the column loop (I shall assume you use SQL 2005/2008).
SET @col = 0;
WHILE EXISTS (SELECT TOP 1 * FROM sys.columns WHERE object_id = OBJECT_ID('MyTable') AND column_id > @col)
BEGIN
    SELECT TOP 1 @col = column_id FROM sys.columns
    WHERE object_id = OBJECT_ID('MyTable') AND column_id > @col ORDER BY column_id ASC;
    SET @sql ....
    INSERT INTO #stmt ....
END
Remove line 4 (@colcount int) and the preceding comma. Remove the INFORMATION_SCHEMA select.
Do not ever use any kind of looping in a trigger. Do not use dynamic SQL, call a stored proc, or send an email. All of these things are extremely inappropriate in a trigger.
If you want to use dynamic SQL, use it to create the script that creates the trigger. And create an audit table for every table you want audited (we actually have two for every table), or you will have performance problems due to locking on the "one table to rule them all".
