SQL Server trigger inserting duplicates - sql-server

I'm debugging a data pipeline that consists of several tables and triggers.
We have a table called Dimension_Date defined as follows:
CREATE TABLE [dbo].[Dimension_Date]
(
[id_date] [bigint] IDENTITY(1,1) NOT NULL,
[a_date] [datetime] NOT NULL,
[yoy_date] [date] NOT NULL
)
We also have three different tables into which other processes insert data (with transactions, although I don't have access to these processes). Each table contains a datetime column (x_date) whose values must be inserted into the Dimension table only if that datetime isn't already there. If it already exists, it shouldn't be inserted again.
On each of these tables there is a trigger that, among other things, checks whether the datetime exists in the Dimension table and, if it doesn't, inserts the new date. Once all the actions are performed, the contents of TABLE_1, 2 and 3 are deleted. The triggers (on tables TABLE_1, 2 and 3) contain the following query:
CREATE TRIGGER [dbo].[insert_Date_Trigger]
ON [dbo].[TABLE_1]
AFTER INSERT
AS
BEGIN
    INSERT INTO Dimension_Date (a_date, yoy_date)
    SELECT DISTINCT x_date, DateADD(yy, -100, CONVERT(date, x_date))
    FROM TABLE_1
    WHERE NOT EXISTS (SELECT id_date FROM Dimension_Date
                      WHERE a_date = TABLE_1.a_date);
    (...)
    DELETE FROM TABLE_1
END
The problem is that these triggers are inserting duplicates into the Dimension table (two different id_date values for the same a_date), and I can't figure out where the problem is. Could it be that the processes might not be using transactions? Is there anything wrong with the query?
Any help would be greatly appreciated.
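For what it's worth, the check-then-insert pattern above is not atomic: two concurrent transactions can both pass the NOT EXISTS test before either commits, which produces exactly this kind of duplicate. The trigger also scans all of TABLE_1 rather than just the freshly inserted rows. Below is a hedged sketch of a more defensive shape, reading from the inserted pseudo-table; the lock hints are my addition, and a UNIQUE constraint on Dimension_Date(a_date) would be the definitive safeguard:

-- Sketch only, covering just the insert step (the elided actions and the
-- DELETE are omitted). Assumes the dedupe key really is x_date.
CREATE TRIGGER [dbo].[insert_Date_Trigger]
ON [dbo].[TABLE_1]
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;

    INSERT INTO Dimension_Date (a_date, yoy_date)
    SELECT DISTINCT i.x_date, DATEADD(yy, -100, CONVERT(date, i.x_date))
    FROM inserted AS i
    WHERE NOT EXISTS (SELECT 1
                      FROM Dimension_Date WITH (UPDLOCK, HOLDLOCK) -- serialize the check
                      WHERE a_date = i.x_date);
END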

Related

Using TSQLT FakeTable to test a table created by a Stored Procedure

I am learning to write unit tests for work. I was advised to use TSQLT FakeTable to test some aspects of a table created by a stored procedure.
In other unit tests, we create a temp table for the stored procedure and then test the temp table. I'm not sure how to work the FakeTable into the test.
EXEC tSQLt.NewTestClass 'TestThing';
GO
CREATE OR ALTER PROCEDURE TestThing.[test API_StoredProc to make sure parameters work]
AS
BEGIN
DROP TABLE IF EXISTS #Actual;
CREATE TABLE #Actual ----Do I need to create the temp table and the Fake table? I thought I might need to because I'm testing a table created by a stored procedure.
(
ISO_3166_Alpha2 NVARCHAR(5),
ISO_3166_Alpha3 NVARCHAR(5),
CountryName NVARCHAR(100),
OfficialStateName NVARCHAR(300),
sovereigny NVARCHAR(75),
icon NVARCHAR(100)
);
INSERT #Actual
(
ISO_3166_Alpha2,
ISO_3166_Alpha3,
CountryName,
OfficialStateName,
sovereigny,
icon
)
EXEC Marketing.API_StoredProc @Username = 'AnyValue', -- varchar(100)
     @FundId = 0, -- int
     @IncludeSalesForceInvestorCountry = NULL, -- bit
     @IncludeRegisteredSalesJurisdictions = NULL, -- bit
     @IncludeALLCountryForSSRS = NULL, -- bit
     @WHATIF = NULL, -- bit
     @OUTPUT_DEBUG = NULL -- bit
EXEC tSQLt.FakeTable @TableName = N'#Actual', -- nvarchar(max) -- How do I differentiate between the faketable and the temp table now?
     @SchemaName = N'', -- nvarchar(max)
     @Identity = NULL, -- bit
     @ComputedColumns = NULL, -- bit
     @Defaults = NULL -- bit
INSERT INTO #Actual
(
ISO_3166_Alpha2,
ISO_3166_Alpha3,
CountryName,
OfficialStateName,
sovereigny,
icon
)
VALUES
('AF', 'AFG', 'Afghanistan', 'The Islamic Republic of Afghanistan', 'UN MEMBER STATE', 'test')
SELECT * FROM #actual
END;
GO
EXEC tSQLt.Run 'TestThing';
What I'm trying to do with the code above is basically just to get FakeTable working. I get an error: "FakeTable could not resolve the object name #Actual".
What I ultimately want to test is the parameters in the stored procedure. Only certain entries should be returned if, say, IncludeSalesForceInvestorCountry is set to 1. What should be returned may change over time, so that's why I was advised to use FakeTable.
In your scenario, you don’t need to fake any temp tables, just fake the table that is referenced by Marketing.API_StoredProc and populate it with values that you expect to be returned, and some you don’t. Add what you expect to see in an #expected table, call Marketing.API_StoredProc dumping the results into an #actual table and compare the results with tSQLt.AssertEqualsTable.
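In sketch form, that advice might look like the following. The source-table name (dbo.CountryReference) and its flag column are stand-ins for whatever Marketing.API_StoredProc actually reads, since that isn't shown; the column list and parameters are trimmed for brevity and would need to match the procedure's real result shape.

-- Fake the table the procedure reads from, not the temp tables
EXEC tSQLt.FakeTable @TableName = N'dbo.CountryReference';

-- Seed the fake with a row you expect back and one you don't
INSERT INTO dbo.CountryReference (ISO_3166_Alpha2, CountryName, IsSalesForceInvestorCountry)
VALUES ('AF', 'Afghanistan', 1),
       ('ZZ', 'Nowhere', 0);

CREATE TABLE #Expected (ISO_3166_Alpha2 NVARCHAR(5), CountryName NVARCHAR(100));
INSERT INTO #Expected VALUES ('AF', 'Afghanistan');

CREATE TABLE #Actual (ISO_3166_Alpha2 NVARCHAR(5), CountryName NVARCHAR(100));
INSERT INTO #Actual
EXEC Marketing.API_StoredProc @Username = 'AnyValue',
                              @FundId = 0,
                              @IncludeSalesForceInvestorCountry = 1;

EXEC tSQLt.AssertEqualsTable @Expected = N'#Expected', @Actual = N'#Actual';

Note that #Expected and #Actual are ordinary temp tables living only for the duration of the test transaction; no FakeTable call is needed for them.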
A good starting point might be to review how tSQLt.FakeTable works and a real-world use case.
As you know, each unit test runs within its own transaction, started and rolled back by the tSQLt framework. When you call tSQLt.FakeTable within a unit test, it temporarily renames the specified table and then creates an identically named facsimile of it. The temporary copy allows NULL in every column and has no primary or foreign keys, identity column, or check, default or unique constraints (although some of those can be included in the facsimile depending on the parameters passed to tSQLt.FakeTable). For the duration of the test transaction, any object that references the named table will use the fake rather than the real table. At the end of the test, tSQLt rolls back the transaction, the fake table is dropped and the original table is returned to its former state (this all happens automatically). You might ask, what is the point of that?
Imagine you have an [OrderDetail] table which has columns including OrderId and ProductId as the primary key, an OrderStatusId column plus a bunch of other NOT NULL columns. The DDL for this table might look something like this.
CREATE TABLE [dbo].[OrderDetail]
(
OrderDetailId int IDENTITY(1,1) NOT NULL
, OrderId int NOT NULL
, ProductId int NOT NULL
, OrderStatusId int NOT NULL
, Quantity int NOT NULL
, CostPrice decimal(18,4) NOT NULL
, Discount decimal(6,4) NOT NULL
, DeliveryPreferenceId int NOT NULL
, PromisedDeliveryDate datetime NOT NULL
, DespatchDate datetime NULL
, ActualDeliveryDate datetime NULL
, DeliveryDelayReason varchar(500) NOT NULL
/* ... other NULL and NOT NULL columns */
, CONSTRAINT PK_OrderDetail PRIMARY KEY CLUSTERED (OrderId, ProductId)
, CONSTRAINT AK_OrderDetail_AutoIncrementingId UNIQUE NONCLUSTERED (OrderDetailId)
, CONSTRAINT FK_OrderDetail_Order FOREIGN KEY (OrderId) REFERENCES [dbo].[Orders] (OrderId)
, CONSTRAINT FK_OrderDetail_Product FOREIGN KEY (ProductId) REFERENCES [dbo].[Product] (ProductId)
, CONSTRAINT FK_OrderDetail_OrderStatus FOREIGN KEY (OrderStatusId) REFERENCES [dbo].[OrderStatus] (OrderStatusId)
, CONSTRAINT FK_OrderDetail_DeliveryPreference FOREIGN KEY (DeliveryPreferenceId) REFERENCES [dbo].[DeliveryPreference] (DeliveryPreferenceId)
);
As you can see, this table has foreign key dependencies on the Orders, Product, DeliveryPreference and OrderStatus table. Product may in turn have foreign keys that reference ProductType, BrandCategory, Supplier among others. The Orders table has foreign key references to Customer, Address and SalesPerson among others. All of the tables in this chain have numerous columns defined as NOT NULL and/or are constrained by CHECK and other constraints. Some of these tables themselves have more foreign keys.
Now imagine you want to write a stored procedure (OrderDetailStatusUpdate) whose job it is to update the order status for a single row on the OrderDetail table. It has three input parameters @OrderId, @ProductId and @OrderStatusId. Think about what you would need to do to set up a test for this procedure. You would need to add at least two rows to the OrderDetail table including all the NOT NULL columns. You would also need to add parent records to all the FK-referenced tables, and also to any tables above that in the hierarchy, ensuring that all your inserts comply with all the nullability and other constraints on those tables too. By my count that is at least 11 tables that need to be populated, all for one simple test. And even if you bite the bullet and do all that set-up, at some point in the future someone may (probably will) come along and add a new NOT NULL column to one of those tables or change a constraint that will cause your test to fail - and that failure actually has nothing to do with your test or the stored procedure you are testing. One of the basic tenets of test-driven development is that a test should have only one reason to fail; by that measure this one has dozens.
tSQLt.FakeTable to the rescue.
What is the minimum you actually need to do in order to set up a test for that procedure? You need two rows in the OrderDetail table (one that gets updated, one that doesn't), and the only columns you actually "need" to consider are OrderId and ProductId (the identifying key) plus OrderStatusId - the column being updated. The rest of the columns, whilst important to the overall design, have no relevance to the object under test. In your test for OrderDetailStatusUpdate, you would follow these steps:
1. Call tSQLt.FakeTable 'dbo.OrderDetail'.
2. Create an #expected table (with OrderId, ProductId and OrderStatusId columns) and populate it with the two rows you expect to end up with (one will have the expected OrderStatusId, the other can be NULL).
3. Add two rows to the now-mocked OrderDetail table (OrderId and ProductId only).
4. Call the procedure under test, OrderDetailStatusUpdate, passing the OrderId and ProductId for one of the rows inserted plus the OrderStatusId you are changing to.
5. Use tSQLt.AssertEqualsTable to compare the #expected table with the OrderDetail table. This assertion will only compare the columns on the #expected table; the other columns on OrderDetail will be ignored (see the sketch after this list).
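A sketch of that test, using the names from the example above (the test-class name and status values are arbitrary, and the comparison behaviour follows the description in step 5):

CREATE OR ALTER PROCEDURE OrderDetailTests.[test only the targeted row gets the new status]
AS
BEGIN
    -- Step 1: mock the table; every column on the fake becomes NULLable
    EXEC tSQLt.FakeTable @TableName = N'dbo.OrderDetail';

    -- Step 2: what the table should contain afterwards
    CREATE TABLE #expected (OrderId int, ProductId int, OrderStatusId int);
    INSERT INTO #expected VALUES (1, 10, 5), (2, 20, NULL);

    -- Step 3: two rows, pertinent columns only
    INSERT INTO dbo.OrderDetail (OrderId, ProductId) VALUES (1, 10), (2, 20);

    -- Step 4: exercise the procedure under test
    EXEC dbo.OrderDetailStatusUpdate @OrderId = 1, @ProductId = 10, @OrderStatusId = 5;

    -- Step 5: compare on the columns of #expected
    EXEC tSQLt.AssertEqualsTable @Expected = N'#expected', @Actual = N'dbo.OrderDetail';
END;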
Creating this test is really quick and the only reason it is ever likely to fail is because something pertinent to the code under test has changed in the underlying schema. Changes to any other columns on the OrderDetail table or any of the parent/grand-parent tables will not cause this test to break.
So the reason for using tSQLt.FakeTable (or any other kind of mock object) is to provide really robust test isolation and simple test data preparation.

SQL Server: Capturing All the columns that have changed in a separate table

In my SQL Server I have a table of around 40 attributes/columns. There is a daily load which might update any of these columns. I want to capture the changes in these columns in a separate table, with a reason-code column telling which column value changed. There might be instances where more than one column value gets changed in a single daily load; in that case the change log table should capture each of these changes in a separate row, with each row depicting an individual change.
For Example:
TableA(column1(pk),column2,column3,column4)
values(1,100,ABC,999)
After update:
TableA(column1(pk),column2,column3,column4)
values(1,100,ACD,901)
The corresponding change log table should have two entries:
TabChangeLog(column1,before,after,reason);
values(1,ABC,ACD,'column3 changed')
values(1,999,901,'column4 changed')
I tried implementing this through triggers but am not able to figure out a way to split the changes into separate rows when there is more than one change. Please help.
You need to create a trigger like this:
create trigger trigger_name
on TableA
after update as
if update(column3)
begin
    insert into TabChangeLog (column1, [before], [after], reason)
    select i.column1, d.column3, i.column3, 'column3 changed'
    from inserted i inner join deleted d
        on i.column1 = d.column1
end
if update(column4)
begin
    insert into TabChangeLog (column1, [before], [after], reason)
    select i.column1, d.column4, i.column4, 'column4 changed'
    from inserted i inner join deleted d
        on i.column1 = d.column1
end
...
https://www.tutorialgateway.org/instead-of-update-triggers-in-sql-server/
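With ~40 columns, one IF UPDATE() block per column gets long. An alternative shape, my suggestion rather than anything from the linked tutorial, is to unpivot old/new pairs with CROSS APPLY (VALUES ...) so each changed column becomes its own log row; the EXISTS/EXCEPT predicate is a NULL-safe "value differs" test:

-- Hedged sketch for the TableA/TabChangeLog shapes in the question; values
-- are cast to varchar so unlike columns can share the before/after columns.
CREATE TRIGGER trg_TableA_ChangeLog
ON TableA
AFTER UPDATE
AS
BEGIN
    SET NOCOUNT ON;

    INSERT INTO TabChangeLog (column1, [before], [after], reason)
    SELECT i.column1, v.oldval, v.newval, v.reason
    FROM inserted AS i
    JOIN deleted  AS d ON d.column1 = i.column1
    CROSS APPLY (VALUES
        (CONVERT(varchar(100), d.column2), CONVERT(varchar(100), i.column2), 'column2 changed'),
        (CONVERT(varchar(100), d.column3), CONVERT(varchar(100), i.column3), 'column3 changed'),
        (CONVERT(varchar(100), d.column4), CONVERT(varchar(100), i.column4), 'column4 changed')
    ) AS v(oldval, newval, reason)
    WHERE EXISTS (SELECT v.oldval EXCEPT SELECT v.newval); -- changed, NULL-safe
END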
Microsoft SQL Server 2016 has a thing called Temporal Tables which would probably simplify your job a lot. It lets you rewind a dataset through time to see the changes:
https://learn.microsoft.com/en-us/sql/relational-databases/tables/temporal-tables?view=sql-server-2017
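For reference, a system-versioned rendition of the question's TableA might look like this (the history-table name and the period column names are mine):

CREATE TABLE dbo.TableA
(
    column1 int NOT NULL PRIMARY KEY,
    column2 int NULL,
    column3 varchar(10) NULL,
    column4 int NULL,
    ValidFrom datetime2 GENERATED ALWAYS AS ROW START NOT NULL,
    ValidTo   datetime2 GENERATED ALWAYS AS ROW END NOT NULL,
    PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo)
)
WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.TableA_History));

-- Every UPDATE now archives the prior row image automatically:
SELECT * FROM dbo.TableA FOR SYSTEM_TIME ALL WHERE column1 = 1;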
If you don't want to go that route, you can use triggers instead. UPDATE triggers have two pseudo-tables, inserted and deleted, that let you know what the row state was before and after the change.
Edit: these are tables, so you have to interact with them using set-based SELECT statements; you can't apply row-by-row conditional logic (if/else).
CREATE TABLE [dbo].[Table1](
[Id] [int] NOT NULL,
[Tail] [int] NOT NULL,
CONSTRAINT [PK_Table1_1] PRIMARY KEY CLUSTERED
(
[Id] ASC
)
)
CREATE TABLE Table1_Audit
(
Audit varchar(100)
)
--drop trigger Table1_OnUPDATE
CREATE TRIGGER Table1_OnUPDATE
ON dbo.Table1
AFTER UPDATE
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
-- Insert statements for trigger here
INSERT INTO Table1_Audit ([Audit])
SELECT CONCAT('Tail changed to ', inserted.Tail, ' for pk Id=', inserted.Id)
FROM inserted
INNER JOIN deleted ON inserted.Id = deleted.Id -- pk must be the same
WHERE inserted.Tail <> deleted.Tail -- field x must be different
END
GO
--truncate table Table1_Audit
--update Table1 set Tail = 5
select * from Table1_Audit

Recursive stored procedure - insert sql

I am trying to write a recursive stored procedure for the insert into a table of some hierarchical data.
I have three tables with the following design
Table A
    Id        uniqueidentifier
    Name
    Acronym
Table B
    Id        uniqueidentifier
    ParentId  uniqueidentifier (references TableA)
    ChildId   uniqueidentifier (references TableA)
    Datestart datetime
    Dateend   datetime
Table C
    Id        uniqueidentifier
    Datestart datetime
    Dateend   datetime
    AId       uniqueidentifier (references TableA)
    Left      int
    Right     int
I am trying to do an insert into TableC with the values from the other two tables. TableA has the master data, and TableB has the parent-child associations for each date range.
The parent-child relationships can change each quarter, hence the date-start and date-end columns.
I want to write a recursive stored procedure with a common table expression to do the inserts by combining tables A and B. I have tried searching for help online, but most of the links describe a parent-child relationship held within a single table, not split across tables as in my scenario.
Please let me know if somebody can help me.
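Without the actual data it is hard to give more than a shape, but here is a hedged sketch of a procedure to start from. It walks TableB's parent-child edges for one date range with a recursive CTE, derives a depth-first preorder from an accumulated path, and fills the nested-set [Left]/[Right] columns using the identities lft = 2*preorder - depth and rgt = lft + 2*descendants + 1, which hold for any preorder numbering. The procedure name, parameters, and the assumption that every TableA row belongs to the hierarchy for the range are mine.

CREATE PROCEDURE dbo.LoadTableC -- hypothetical name
    @DateStart datetime,
    @DateEnd   datetime
AS
BEGIN
    SET NOCOUNT ON;

    WITH Tree AS
    (
        -- Roots: TableA rows that are nobody's child in this date range
        SELECT a.Id AS AId,
               CONVERT(varchar(max), a.Id) AS NodePath,
               1 AS Depth
        FROM TableA AS a
        WHERE NOT EXISTS (SELECT 1 FROM TableB AS b
                          WHERE b.ChildId = a.Id
                            AND b.Datestart <= @DateEnd AND b.Dateend >= @DateStart)

        UNION ALL

        -- Follow parent -> child edges valid in the range
        SELECT b.ChildId,
               t.NodePath + '/' + CONVERT(varchar(36), b.ChildId),
               t.Depth + 1
        FROM Tree AS t
        JOIN TableB AS b
          ON b.ParentId = t.AId
         AND b.Datestart <= @DateEnd AND b.Dateend >= @DateStart
    ),
    Ordered AS
    (
        -- Sorting by the accumulated path yields a depth-first preorder
        SELECT AId, NodePath, Depth,
               ROW_NUMBER() OVER (ORDER BY NodePath) AS Preorder
        FROM Tree
    )
    INSERT INTO TableC (Id, Datestart, Dateend, AId, [Left], [Right])
    SELECT NEWID(), @DateStart, @DateEnd, o.AId,
           2 * o.Preorder - o.Depth,                          -- lft
           2 * o.Preorder - o.Depth                           -- rgt = lft
             + 2 * (SELECT COUNT(*) FROM Ordered AS d         --   + 2*descendants
                    WHERE d.NodePath LIKE o.NodePath + '/%')
             + 1                                              --   + 1
    FROM Ordered AS o
    OPTION (MAXRECURSION 0); -- hierarchy depth is unknown
END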

Recursive OR statement SQL Server 2016

I am looking at creating a relatively simple insert statement that inserts a new record if there are any changes to a table. The issue I have is that there are over 600 columns that would need to be checked.
More details: the main reporting table is updated every 15 minutes from the front-end application by a SQL process that pushes the changes, but it overwrites the data and doesn't maintain a change log. I have no control over any of this.
The second table (my table) is a DWH table which will hold an audit of changes. So I use an inner join on t1.AccountNo = t2.AccountNo with t1.Field1 <> t2.Field1, then add an OR for the next field (t1.AccountNo = t2.AccountNo and t1.Field2 <> t2.Field2), and so on.
Is there a better way to get the desired result given the number of columns?
You could try a different approach.
Create a trigger on the main table for update and delete.
This trigger copies the rows as they exist in the table to your DWH table before the data is changed.
create Trigger [nameupdate] on [yourtable] after update
as
insert into [dwh]
select
getdate() as [ChangeDate]
,'update' as [Action]
,SYSTEM_USER as [User]
,d.[ID]
,d.[...]
from deleted d
GO
same for delete
create Trigger [namedelete] on [yourtable] after delete
[...]
My DWH table has three additional columns for tracking and contains all the columns from the main table.
CREATE TABLE [dwh](
[ID] [int] IDENTITY(1,1) NOT NULL Primary key,
[ChangeDate] [datetime] NOT NULL,
[Action] [varchar](50) NOT NULL,
[User] [nvarchar](128) NOT NULL,
[...]
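A set-based alternative to hand-written column comparisons, my suggestion rather than part of the answer above, is to diff whole rows with EXCEPT, which compares every selected column and treats NULLs as equal, so the 600-clause OR chain is never written. A hedged sketch with three stand-in columns for the real ~600 and placeholder table names (dwh.AccountSnapshot holds the image from the previous load):

-- Capture rows whose column image differs from the previous cycle
INSERT INTO dwh.AccountAudit (AccountNo, Field1, Field2, Field3, CapturedAt)
SELECT x.AccountNo, x.Field1, x.Field2, x.Field3, GETDATE()
FROM (
    SELECT AccountNo, Field1, Field2, Field3 FROM main.Account -- refreshed data
    EXCEPT
    SELECT AccountNo, Field1, Field2, Field3 FROM dwh.AccountSnapshot
) AS x;

-- Refresh the snapshot for the next 15-minute cycle
TRUNCATE TABLE dwh.AccountSnapshot;
INSERT INTO dwh.AccountSnapshot
SELECT AccountNo, Field1, Field2, Field3 FROM main.Account;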

SQL Server 2005 How can I set up an audit table that records the column name updated?

given this table definition
create table herb.app (appId int identity primary key
, application varchar(15) unique
, customerName varchar(35),LoanProtectionInsurance bit
, State varchar(3),Address varchar(50),LoanAmt money
,addedBy varchar(7) not null,AddedDt smalldatetime default getdate())
I believe changes will be minimal, usually only a single field, and very sparse.
So I created this table:
create table herb.appAudit(appAuditId int primary key
, field varchar(20), oldValue varchar(50),ChangedBy varchar(7) not null,AddedDt smalldatetime default getdate())
How, in a trigger, can I get the name of the column whose value was changed so that I can store it? I know how to get the value itself by joining against the deleted table.
Use the inserted and deleted tables. Nigel Rivett wrote a great generic audit trail trigger using these tables. It is fairly complex SQL code, but it highlights some pretty cool ways of pulling together the information and once you understand them you can create a custom solution using his ideas as inspiration, or you could just use his script.
Here are the important ideas about the tables:
On an insert, inserted holds the inserted values and deleted is empty.
On an update, inserted holds the new values and deleted holds the old values.
On a delete, deleted holds the deleted values and inserted is empty.
The structure of the inserted and deleted tables (if not empty) is identical to the target table.
You can determine the column names from system tables and iterate on them as illustrated in Nigel's code.
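For that column-name step, something along these lines works (the table name comes from the question's definition; sys.columns is available on SQL Server 2005):

-- List the audited table's columns in ordinal order; a generic trigger can
-- iterate this set and build per-column dynamic SQL, as Nigel's script does.
SELECT c.name, c.column_id
FROM sys.columns AS c
WHERE c.object_id = OBJECT_ID(N'herb.app')
ORDER BY c.column_id;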
if exists (select * from inserted)
if exists (select * from deleted)
-- this is an update
...
else
-- this is an insert
...
else
-- this is a delete
...
-- For updates to a specific field
SELECT d.[MyField] AS OldValue, i.[MyField] AS NewValue, system_user AS [User]
FROM inserted i
INNER JOIN deleted d ON i.[MyPrimaryKeyField] = d.[MyPrimaryKeyField]
-- For your table
SELECT d.CustomerName AS OldValue, i.CustomerName AS NewValue, system_user AS [User]
FROM inserted i
INNER JOIN deleted d ON i.appId = d.appId
If you really need this kind of auditing in a way that's critical to your business look at SQL Server 2008's Change Data Capture feature. That feature alone could justify the cost of an upgrade.
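For reference, enabling CDC is two calls on SQL Server 2008+ (applied here to the question's table; @role_name = NULL means access is not gated by a database role):

EXEC sys.sp_cdc_enable_db;

EXEC sys.sp_cdc_enable_table
     @source_schema = N'herb',
     @source_name   = N'app',
     @role_name     = NULL;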
something like this for each field you want to track
if UPDATE(Track_ID)
begin
insert into [log].DataChanges
(
dcColumnName,
dcID,
dcDataBefore,
dcDataAfter,
dcDateChanged,
dcUser,
dcTableName
)
select
'Track_ID',
d.Data_ID,
coalesce(d.Track_ID,-666),
coalesce(i.Track_ID,-666),
getdate(),
@user,
@table
from inserted i
join deleted d on i.Data_ID=d.Data_ID
and coalesce(d.Track_ID,-666)<>coalesce(i.Track_ID,-666)
end
'Track_ID' is the name of the field, and d.Data_ID is the primary key of the table you're tracking. @user is the user making the change, and @table is the table you're keeping track of changes in, in case you're tracking more than one table in the same log table.
Here's my quick and dirty audit table solution. (from http://freachable.net/2010/09/29/QuickAndDirtySQLAuditTable.aspx)
CREATE TABLE audit(
[on] datetime not null default getutcdate(),
[by] varchar(255) not null default system_user+','+AppName(),
was xml null,
[is] xml null
)
CREATE TRIGGER mytable_audit ON mytable for insert, update, delete as
INSERT audit(was,[is]) values(
(select * from deleted as [mytable] for xml auto,type),
(select * from inserted as [mytable] for xml auto,type)
)
