Recursive OR statement SQL Server 2016 - sql-server

I am looking at creating a relatively simple insert statement that inserts a new record if there are any changes to a table. The issue I have is that there are over 600 columns that would need to be checked.
More details: the main reporting table is updated every 15 minutes from the front-end application by a SQL process that pushes the changes; however, it overwrites the data and doesn't maintain a change log. I have no control over any of this.
The second table (my table) is a DWH table, which will hold an audit of changes. So I use an inner join where t1.AccountNo = t2.AccountNo and t1.Field1 <> t2.Field1, then add an OR and the next field: t1.AccountNo = t2.AccountNo and t1.Field2 <> t2.Field2, and so on for every column (sketched below).
Is there a better way to get the desired result given the number of columns?
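For reference, with hypothetical field names, the comparison described above ends up looking like this, with one OR predicate per column:
SELECT t1.*
FROM MainTable t1
INNER JOIN DwhTable t2
    ON t1.AccountNo = t2.AccountNo
WHERE t1.Field1 <> t2.Field1
   OR t1.Field2 <> t2.Field2
   -- ... repeated for 600+ columns
Note that <> never matches when either side is NULL, so each predicate would also need IS NULL handling, making the hand-written version longer still.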

You could try a different approach.
Create a trigger on the main table for update and delete.
This trigger copies the rows as they currently exist in the table (via the deleted pseudo-table) to your DWH table before the data is changed.
create trigger [nameupdate] on [yourtable] after update
as
insert into [dwh]
select
    getdate() as [ChangeDate]
    ,'update' as [Action]
    ,SYSTEM_USER as [User]
    ,d.[ID]
    ,d.[...]
from deleted d
GO
The same applies for delete:
create trigger [namedelete] on [yourtable] after delete
[...]
My DWH table has three additional columns for tracking and otherwise contains all the columns from the main table.
CREATE TABLE [dwh](
    [ID] [int] IDENTITY(1,1) NOT NULL PRIMARY KEY,
    [ChangeDate] [datetime] NOT NULL,
    [Action] [varchar](50) NOT NULL,
    [User] [nvarchar](128) NOT NULL,
    [...]
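As an aside, not part of the trigger approach above: if you do still want a query-side comparison, EXCEPT avoids hand-writing 600 OR predicates and treats NULLs as equal. A minimal sketch, assuming you list the same columns from both tables (names hypothetical):
SELECT t1.*
FROM MainTable t1
INNER JOIN DwhTable t2
    ON t1.AccountNo = t2.AccountNo
WHERE EXISTS (SELECT t1.Field1, t1.Field2 -- ...all compared columns from t1
              EXCEPT
              SELECT t2.Field1, t2.Field2); -- ...same columns from t2
The column lists are still long, but each column is written once per side and the NULL handling comes for free.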

Related

SQL Server trigger inserting duplicates

I'm debugging a data pipeline that consists of several tables and triggers.
We have a table called Dimension_Date, defined as follows:
CREATE TABLE [dbo].[Dimension_Date]
(
    [id_date] [bigint] IDENTITY(1,1) NOT NULL,
    [a_date] [datetime] NOT NULL,
    [yoy_date] [date] NOT NULL
)
We also have three different tables into which other processes insert data (with transactions, although I don't have access to these processes). Each table contains a datetime column (x_date) that needs to be inserted into the Dimension table, but only if that datetime doesn't already exist there.
On each of these tables there is a trigger that, among other things, checks whether the datetime exists in the Dimension table and, if it doesn't, inserts the new date. Once all the actions are performed, the contents of TABLE_1, 2 and 3 are deleted. The triggers (on tables TABLE_1, 2 and 3) contain the following query:
CREATE TRIGGER [dbo].[insert_Date_Trigger]
ON [dbo].[TABLE_1]
AFTER INSERT
AS
BEGIN
    INSERT INTO Dimension_Date (a_date, yoy_date)
    SELECT DISTINCT x_date, DATEADD(yy, -100, CONVERT(date, x_date))
    FROM TABLE_1
    WHERE NOT EXISTS (SELECT id_date FROM Dimension_Date
                      WHERE a_date = TABLE_1.x_date);
    (...)
    DELETE FROM TABLE_1
END
The problem is that these triggers are inserting duplicates into the Dimension table (two different id_date values for the same a_date), and I can't figure out where the problem is. Could it be that the processes might not be using transactions? Is there anything wrong with the query?
Any help would be greatly appreciated.
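One point worth checking: under the default READ COMMITTED isolation level, the NOT EXISTS test and the INSERT are not atomic, so two overlapping inserts can both pass the existence check before either commits, which produces exactly this symptom. A sketch of a race-resistant version of the insert, using lock hints to serialize the check (names as in the question):
INSERT INTO Dimension_Date (a_date, yoy_date)
SELECT DISTINCT x_date, DATEADD(yy, -100, CONVERT(date, x_date))
FROM TABLE_1
WHERE NOT EXISTS (SELECT 1 FROM Dimension_Date WITH (UPDLOCK, HOLDLOCK)
                  WHERE a_date = TABLE_1.x_date);
A unique constraint on Dimension_Date(a_date) would also turn the race into a visible error instead of silent duplicates.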

How to delete documents from Filetable?

I am trying to delete some documents from SQL Server's FileTable.
I have one table in which I store all my attachments' details, and the documents themselves live in a SQL Server FileTable named Attachments.
The AttachmentDetails table has the schema below:
CREATE TABLE [dbo].[AttachmentDetails](
    [Id] [int] IDENTITY(1,1) NOT NULL PRIMARY KEY,
    [DocumentName] [nvarchar](max) NULL,
    [DocumentType] [nvarchar](max) NULL,
    [ModifiedDateTime] [datetime] NOT NULL,
    [CreatedDateTime] [datetime] NOT NULL,
    [CreatedBy] [nvarchar](254) NULL,
    [ModifiedBy] [nvarchar](254) NULL,
    [IsDeleted] [bit] NULL
)
Whenever I upload a document to the FileTable, I insert that document's details into the AttachmentDetails table as per the schema above.
I have tried the solution below:
CREATE PROCEDURE [dbo].[DeleteFiles]
AS
BEGIN
    DELETE Attachments
    FROM AttachmentDetails a
    WHERE DocumentType = 'video/mp4'
      AND DATEDIFF(day, a.CreatedDateTime, GETDATE()) < 11
END
This procedure is supposed to delete only video/mp4 files that are more than 10 days old, but it deletes documents of any type from the FileTable.
SQL is a set-based language. For every cursor/loop-based script there's a far simpler and faster set-based solution. In any case, the way this query is written would result in random deletions, since there's no guarantee what all those TOP 1 queries will return without an ORDER BY clause.
It looks like you're trying to delete all video attachments older than 30 days. It also looks like the date is stored in a separate table called table1. You can write a DELETE statement whose rows come from a JOIN if you use the FROM clause, e.g.:
DELETE Attachments
FROM Attachments
INNER JOIN table1 a ON a.ID = Attachments.ID
WHERE DocumentType = 'video/mp4'
  AND CreatedDateTime < DATEADD(day, -30, GETDATE())
EDIT: The original query contained DATEADD(day,30,getdate()) when it should have been DATEADD(day,-30,getdate()).
Example
Assuming we have these two tables:
create table attachments (ID int primary key,DocumentType nvarchar(100))
insert into attachments (ID,DocumentType)
values
(1,'video/mp4'),
(2,'audio/mp3'),
(3,'application/octet-stream'),
(4,'video/mp4')
and
create table table1 (ID int primary key, CreatedDateTime datetime)
insert into table1 (ID,CreatedDateTime)
values
(1,dateadd(day,-40,getdate())),
(2,dateadd(day,-40,getdate())),
(3,getdate()),
(4,getdate())
Executing the DELETE query will delete only the attachment with ID=1. The query
select *
from Attachments
will return:
```
ID DocumentType
2  audio/mp3
3  application/octet-stream
4  video/mp4
```

SQL Server: Capturing All the columns that have changed in a separate table

In my SQL Server database I have a table of around 40 attributes/columns. There is a daily load which might update any of these columns. I want to capture the changes in these columns in a separate table, with a reason-code column telling which column's value changed. More than one column value might change in a single daily load; in that case the change-log table should capture each change separately, with one row per individual change.
For Example:
TableA(column1(pk),column2,column3,column4)
values(1,100,ABC,999)
After update:
TableA(column1(pk),column2,column3,column4)
values(1,100,ACD,901)
The corresponding change log table should have two entries:
TabChangeLog(column1,before,after,reason);
values(1,ABC,ACD,'column3 changed')
values(1,999,901,'column4 changed')
I tried implementing this through triggers, but I am not able to figure out a way to separate the changes into individual rows when there is more than one change. Please help.
You need to create a trigger like this:
create trigger trigger_name
on TableA
after update as
if update(column3)
begin
    insert into TabChangeLog
    select i.column1, d.column3, i.column3, 'column3 changed'
    from inserted i inner join deleted d
        on i.column1 = d.column1
    where i.column3 <> d.column3 -- update() only says the column was assigned, not that it changed
end
if update(column4)
begin
    insert into TabChangeLog
    select i.column1, d.column4, i.column4, 'column4 changed'
    from inserted i inner join deleted d
        on i.column1 = d.column1
    where i.column4 <> d.column4
end
...
https://www.tutorialgateway.org/instead-of-update-triggers-in-sql-server/
Microsoft SQL Server 2016 has a feature called Temporal Tables, which would probably simplify your job a lot. It lets you rewind a dataset through time to see the changes:
https://learn.microsoft.com/en-us/sql/relational-databases/tables/temporal-tables?view=sql-server-2017
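A minimal sketch of what system-versioning looks like for the question's TableA (the period column and history table names are assumptions):
CREATE TABLE dbo.TableA
(
    column1 int NOT NULL PRIMARY KEY,
    column2 int NULL,
    column3 varchar(10) NULL,
    column4 int NULL,
    -- period columns maintained automatically by SQL Server
    ValidFrom datetime2 GENERATED ALWAYS AS ROW START NOT NULL,
    ValidTo datetime2 GENERATED ALWAYS AS ROW END NOT NULL,
    PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo)
)
WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.TableA_History));

-- Every UPDATE/DELETE archives the old row version into dbo.TableA_History,
-- and you can rewind with FOR SYSTEM_TIME:
SELECT * FROM dbo.TableA FOR SYSTEM_TIME AS OF '2019-01-01' WHERE column1 = 1;
Note that it versions whole rows, so producing the per-column "reason" rows from the question would still need a comparison query over the history table.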
If you don't want to go that route, you can use triggers instead. UPDATE triggers have two pseudo-tables, inserted and deleted, that let you know what the row state was before and after the change.
Edit: these are tables, so you have to use SELECT etc. to interact with them; you can't apply row-by-row conditional logic (if/else).
CREATE TABLE [dbo].[Table1](
    [Id] [int] NOT NULL,
    [Tail] [int] NOT NULL,
    CONSTRAINT [PK_Table1_1] PRIMARY KEY CLUSTERED
    (
        [Id] ASC
    )
)

CREATE TABLE Table1_Audit
(
    Audit varchar(100)
)

--drop trigger Table1_OnUPDATE
CREATE TRIGGER Table1_OnUPDATE
ON dbo.Table1
AFTER UPDATE
AS
BEGIN
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;

    -- Insert statements for trigger here
    INSERT INTO Table1_Audit ([Audit])
    SELECT CONCAT('Tail changed to ', inserted.Tail, ' for pk Id=', inserted.Id)
    FROM inserted
    INNER JOIN deleted ON inserted.Id = deleted.Id -- pk must be the same
    WHERE inserted.Tail <> deleted.Tail -- field x must be different
END
GO
--truncate table Table1_Audit
--update Table1 set Tail = 5
select * from Table1_Audit

Dynamic SQL to execute large number of rows from a table

I have a table with a very large number of rows of generated SQL which I wish to execute via dynamic SQL. They are basically existence checks and insert statements; I want to migrate data from one production database to another, as we are merging transactional data. I am trying to find the optimal way to execute the rows.
I've found the COALESCE method of appending all the rows to one another to be inefficient for this, particularly when the number of rows executed at a time is greater than ~100.
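For context, this is the concatenation pattern being referred to, as a sketch against a hypothetical holding table dbo.MyGeneratedStatements(SqlText):
DECLARE @sql NVARCHAR(MAX);

-- stitch every generated statement into one big batch, separated by newlines
SELECT @sql = COALESCE(@sql + CHAR(10), '') + SqlText
FROM dbo.MyGeneratedStatements;

EXEC sp_executesql @sql;
The whole batch is compiled and executed as one unit, which is what degrades as the row count grows.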
Assume the structure of the source table is something arbitrary like this:
CREATE TABLE [dbo].[MyTable]
(
    [ID] [int] IDENTITY(1,1) NOT NULL,
    [DataField1] [int] NOT NULL,
    [FK_ID1] [int] NOT NULL,
    [LotsMoreFields] [NVARCHAR](MAX),
    CONSTRAINT [PK_MyTable] PRIMARY KEY CLUSTERED ([ID] ASC)
)

CREATE TABLE [dbo].[FK1]
(
    [ID] [int] IDENTITY(1,1) NOT NULL,
    [Name] [int] NOT NULL, -- Unique constrained value
    CONSTRAINT [PK_FK1] PRIMARY KEY CLUSTERED ([ID] ASC)
)
The other requirement is that I am tracking the source table PK vs the target PK, and whether an insert occurred or whether I have already migrated that row to the target. To do this, I'm tracking migrated rows in another table like so:
CREATE TABLE [dbo].[ChangeTracking]
(
    [ReferenceID] BIGINT IDENTITY(1,1),
    [Src_ID] BIGINT,
    [Dest_ID] BIGINT,
    [TableName] NVARCHAR(255),
    CONSTRAINT [PK_ChangeTracking] PRIMARY KEY CLUSTERED ([ReferenceID] ASC)
)
My existing method executes dynamic SQL generated by a stored procedure. The stored proc does PK lookups, as the source system has different PK values for table [dbo].[FK1].
E.g.
IF NOT EXISTS (<ignore this existence check for now>)
BEGIN
    INSERT INTO [Dest].[dbo].[MyTable] ([DataField1],[FK_ID1],[LotsMoreFields])
    VALUES (333, (SELECT [ID] FROM [Dest].[dbo].[FK1] WHERE [Name] = N'ValueFoundInSource'), N'LotsMoreValues');

    INSERT INTO [Dest].[dbo].[ChangeTracking] ([Src_ID],[Dest_ID],[TableName])
    VALUES (666, SCOPE_IDENTITY(), N'MyTable'); -- 666 is the PK in [Src].[dbo].[MyTable] for this inserted row
END
So when you have a million of these, it isn't quick.
Is there a recommended performant way of doing this?
As mentioned, the MERGE statement works well when you're looking at a complex JOIN condition (if any of these fields are different, update the record to match). You can also look into creating a HASHBYTES hash of the entire record to quickly find differences between source and target tables, though that can also be time-consuming on very large data sets.
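A sketch of the HASHBYTES idea against the tables above, assuming rows can be matched on a shared key; the '|' separator and SHA2_256 algorithm are arbitrary choices here, and CONCAT is used so NULLs collapse to empty strings:
SELECT s.[ID]
FROM [Src].[dbo].[MyTable] s
INNER JOIN [Dest].[dbo].[MyTable] d ON d.[ID] = s.[ID]
WHERE HASHBYTES('SHA2_256', CONCAT(s.[DataField1], '|', s.[FK_ID1], '|', s.[LotsMoreFields]))
   <> HASHBYTES('SHA2_256', CONCAT(d.[DataField1], '|', d.[FK_ID1], '|', d.[LotsMoreFields]));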
It sounds like you're making these updates like a front-end developer, by checking each row for a match and then doing the insert. It will be far more efficient to do the inserts with a single query. Below is an example that looks for names that are in the tblNewClient table, but not in the tblClient table:
INSERT INTO tblClient
    ([Name], TypeID, ParentID)
SELECT nc.[Name],
       nc.TypeID,
       nc.ParentID
FROM tblNewClient nc
LEFT JOIN tblClient cl
    ON nc.[Name] = cl.[Name]
WHERE cl.ID IS NULL;
This will be way more efficient than doing it RBAR (row by agonizing row).
Taking the two answers from #RusselFox and putting them together, I reached this tentative solution (which looks a LOT more efficient):
MERGE INTO [Dest].[dbo].[MyTable] [MT_D]
USING (SELECT [MT_S].[ID] AS [SrcID], [MT_S].[DataField1], [FK_1_D].[ID] AS [FK_ID1], [MT_S].[LotsMoreFields]
       FROM [Src].[dbo].[MyTable] [MT_S]
       JOIN [Src].[dbo].[FK_1] ON [MT_S].[FK_ID1] = [FK_1].[ID]
       JOIN [Dest].[dbo].[FK_1] [FK_1_D] ON [FK_1].[Name] = [FK_1_D].[Name]
      ) [SRC] ON 1 = 0 -- never matches, so every source row goes to WHEN NOT MATCHED
WHEN NOT MATCHED THEN
    INSERT ([DataField1],[FK_ID1],[LotsMoreFields])
    VALUES ([SRC].[DataField1],[SRC].[FK_ID1],[SRC].[LotsMoreFields])
OUTPUT [SRC].[SrcID], INSERTED.[ID], 0, N'MyTable'
    INTO [Dest].[dbo].[ChangeTracking] ([Src_ID],[Dest_ID],[AlreadyExists],[TableName]); -- assumes ChangeTracking has gained an [AlreadyExists] column

Cannot insert explicit value for identity column

I am migrating my application from one database to another, keeping the table structure as it is. I am creating the same tables in the new database and inserting the values using a DB link.
I am getting an error message like "Cannot insert explicit value for identity column in table 'XYZ' when IDENTITY_INSERT is set to OFF." because table XYZ has ScreenConfigSettingAccessId as an identity column.
Below is the script I am using for creating the table and inserting the values:
CREATE TABLE [dbo].[XYZ](
    [ScreenConfigSettingAccessId] [int] IDENTITY(1,1) NOT NULL,
    [APP_ID] [int] NOT NULL,
    [ScreenConfigSettingId] [int] NOT NULL,
    [RSRC_ID] [char](20) NOT NULL
)
INSERT INTO [dbo].[XYZ]
(
    [ScreenConfigSettingAccessId],
    [APP_ID],
    [ScreenConfigSettingId],
    [RSRC_ID]
)
SELECT
    [ScreenConfigSettingAccessId],
    [APP_ID],
    [ScreenConfigSettingId],
    [RSRC_ID]
FROM [olddatabase].[database name].[dbo].[XYZ]
In the old table the values of ScreenConfigSettingAccessId are 3 and 4.
I want to insert the same data the old table has, so I set IDENTITY_INSERT to ON and tried, but it still doesn't allow the insert.
Looking for your suggestions.
You need to specify the table. Check out the command syntax in SQL Books Online: SQL 2000 or SQL 2012 (the syntax hasn't changed).
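In other words, SET IDENTITY_INSERT takes the table name and must be ON in the same session that performs the insert. Applied to the script above (names as in the question):
SET IDENTITY_INSERT [dbo].[XYZ] ON;

INSERT INTO [dbo].[XYZ]
    ([ScreenConfigSettingAccessId], [APP_ID], [ScreenConfigSettingId], [RSRC_ID])
SELECT [ScreenConfigSettingAccessId], [APP_ID], [ScreenConfigSettingId], [RSRC_ID]
FROM [olddatabase].[database name].[dbo].[XYZ];

SET IDENTITY_INSERT [dbo].[XYZ] OFF;
Note that an explicit column list is required on the INSERT while IDENTITY_INSERT is ON.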
