How to delete documents from Filetable? - sql-server

I am trying to delete some documents from SQL Server's FileTable.
I have one table in which I store all my attachments' details, and the documents themselves are stored in a SQL Server FileTable named Attachments.
The AttachmentDetails table has the schema below:
CREATE TABLE [dbo].[AttachmentDetails](
[Id] [int] IDENTITY(1,1) NOT NULL PRIMARY KEY,
[DocumentName] [nvarchar](max) NULL,
[DocumentType] [nvarchar](max) NULL,
[ModifiedDateTime] [datetime] NOT NULL,
[CreatedDateTime] [datetime] NOT NULL,
[CreatedBy] [nvarchar](254) NULL,
[ModifiedBy] [nvarchar](254) NULL,
[IsDeleted] [bit] NULL,
)
Whenever I upload a document to the FileTable, I insert that document's details into the AttachmentDetails table as per the schema above.
I have tried the solution below:
CREATE PROCEDURE [dbo].[DeleteFiles]
AS
BEGIN
DELETE Attachments
FROM AttachmentDetails a
WHERE
DocumentType = 'video/mp4' AND DATEDIFF(day, a.CreatedDateTime, GETDATE())<11
end
This procedure is supposed to delete only video/mp4 files that are more than 10 days old, but it deletes every type of document from the FileTable.

SQL is a set-based language. For every cursor/loop based script there's a far simpler and faster set based solution. In any case, the way this query is written would result in random deletions since there's no guarantee what all those TOP 1 queries will return without an ORDER BY clause.
It looks like you're trying to delete all video attachments older than 30 days. It also looks like the date is stored in a separate table called table1. You can write a DELETE statement whose rows come from a JOIN if you use the FROM clause, eg:
DELETE Attachments
FROM Attachments inner join table1 a on a.ID=Attachments.ID
WHERE
DocumentType = 'video/mp4' AND
CreatedDateTime < DATEADD(day,-30,getdate())
EDIT
The original query contained DATEADD(day,30,getdate()) when it should be DATEADD(day,-30,getdate())
Example
Assuming we have these two tables:
create table attachments (ID int primary key,DocumentType nvarchar(100))
insert into attachments (ID,DocumentType)
values
(1,'video/mp4'),
(2,'audio/mp3'),
(3,'application/octet-stream'),
(4,'video/mp4')
and
create table table1 (ID int primary key, CreatedDateTime datetime)
insert into table1 (ID,CreatedDateTime)
values
(1,dateadd(day,-40,getdate())),
(2,dateadd(day,-40,getdate())),
(3,getdate()),
(4,getdate())
Executing the DELETE query will only delete the attachment with ID=1. The query
```
select *
from Attachments
```
will return:
```
ID DocumentType
2 audio/mp3
3 application/octet-stream
4 video/mp4
```
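Applied to the tables from the question, the same join pattern might look like the sketch below. The original procedure never relates Attachments to AttachmentDetails, so its WHERE clause only filters the details rows, and every FileTable row qualifies for deletion as soon as one details row matches. The join column here is an assumption: the sketch supposes that DocumentName holds the same value as the FileTable's built-in name column, and it uses the 10-day cutoff from the question rather than 30 days.

```sql
-- Sketch only. Assumption: AttachmentDetails.DocumentName matches the
-- FileTable's built-in [name] column; change the join if the two tables
-- are linked differently (for example through a stored stream_id).
DELETE f
FROM dbo.Attachments AS f
INNER JOIN dbo.AttachmentDetails AS a
    ON a.DocumentName = f.[name]
WHERE a.DocumentType = N'video/mp4'
  -- keep only rows created more than 10 days ago
  AND a.CreatedDateTime < DATEADD(day, -10, GETDATE());
```

Comparing CreatedDateTime directly against a computed cutoff, instead of wrapping the column in DATEDIFF, also lets SQL Server use an index on that column if one exists.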

Related

Dynamic SQL to execute large number of rows from a table

I have a table with a very large number of rows which I wish to execute via dynamic SQL. They are basically existence checks and insert statements and I want to migrate data from one production database to another - we are merging transactional data. I am trying to find the optimal way to execute the rows.
I've found the COALESCE method of appending all the rows to one another inefficient for this, particularly when the number of rows executed at a time is greater than ~100.
Assume the structure of the source table is something arbitrary like this:
CREATE TABLE [dbo].[MyTable]
(
[ID] [int] IDENTITY(1,1) NOT NULL,
[DataField1] [int] NOT NULL,
[FK_ID1] [int] NOT NULL,
[LotsMoreFields] [NVARCHAR] (MAX),
CONSTRAINT [PK_MyTable] PRIMARY KEY CLUSTERED ([ID] ASC)
)
CREATE TABLE [dbo].[FK1]
(
[ID] [int] IDENTITY(1,1) NOT NULL,
[Name] [int] NOT NULL, -- Unique constrained value
CONSTRAINT [PK_FK1] PRIMARY KEY CLUSTERED ([ID] ASC)
)
The other requirement is I am tracking the source table PK vs the target PK and whether an insert occurred or whether I have already migrated that row to the target. To do this, I'm tracking migrated rows in another table like so:
CREATE TABLE [dbo].[ChangeTracking]
(
[ReferenceID] BIGINT IDENTITY(1,1),
[Src_ID] BIGINT,
[Dest_ID] BIGINT,
[TableName] NVARCHAR(255),
CONSTRAINT [PK_ChangeTracking] PRIMARY KEY CLUSTERED ([ReferenceID] ASC)
)
My existing method is executing some dynamic sql generated by a stored procedure. The stored proc does PK lookups as the source system has different PK values for table [dbo].[FK1].
E.g.
IF NOT EXISTS (<ignore this existence check for now>)
BEGIN
INSERT INTO [Dest].[dbo].[MyTable] ([DataField1],[FK_ID1],[LotsMoreFields]) VALUES (333,(SELECT [ID] FROM [Dest].[dbo].[FK1] WHERE [Name]=N'ValueFoundInSource'),N'LotsMoreValues');
INSERT INTO [Dest].[dbo].[ChangeTracking] ([Src_ID],[Dest_ID],[TableName]) VALUES (666,SCOPE_IDENTITY(),N'MyTable'); --666 is the PK in [Src].[dbo].[MyTable] for this inserted row
END
So when you have a million of these, it isn't quick.
Is there a recommended performant way of doing this?
As mentioned, the MERGE statement works well when you're looking at a complex JOIN condition (if any of these fields are different, update the record to match). You can also look into creating a HASHBYTES hash of the entire record to quickly find differences between source and target tables, though that can also be time-consuming on very large data sets.
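A rough sketch of the HASHBYTES idea, using the tables from the question, could look like this. It assumes the ChangeTracking table is what maps source rows to target rows, and it leaves FK_ID1 out of the hash because the question says those key values differ between the two systems. The '|' separator is an arbitrary choice.

```sql
-- Sketch only: list migrated rows whose content differs between source and target.
-- Assumption: ChangeTracking maps Src_ID to Dest_ID for rows already migrated.
SELECT ct.Src_ID, ct.Dest_ID
FROM [Dest].[dbo].[ChangeTracking] AS ct
JOIN [Src].[dbo].[MyTable]  AS s ON s.ID = ct.Src_ID
JOIN [Dest].[dbo].[MyTable] AS d ON d.ID = ct.Dest_ID
WHERE ct.TableName = N'MyTable'
  -- hash the compared columns of each row and keep the pairs that differ
  AND HASHBYTES('SHA2_256', CONCAT(s.DataField1, N'|', s.LotsMoreFields))
   <> HASHBYTES('SHA2_256', CONCAT(d.DataField1, N'|', d.LotsMoreFields));
```

Before SQL Server 2016, HASHBYTES accepts at most 8000 bytes of input, so a wide NVARCHAR(MAX) column like LotsMoreFields would need a different approach on older versions.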
It sounds like you're making these updates like a front-end developer, by checking each row for a match and then doing the insert. It will be far more efficient to do the inserts with a single query. Below is an example that looks for names that are in the tblNewClient table, but not in the tblClient table:
INSERT INTO tblClient
( [Name] ,
TypeID ,
ParentID
)
SELECT nc.[Name] ,
nc.TypeID ,
nc.ParentID
FROM tblNewClient nc
LEFT JOIN tblClient cl
ON nc.[Name] = cl.[Name]
WHERE cl.ID IS NULL;
This will be far more efficient than doing it RBAR (row by agonizing row).
Taking the two answers from @RusselFox and putting them together, I reached this tentative solution (which already looks a LOT more efficient):
MERGE INTO [Dest].[dbo].[MyTable] [MT_D]
USING (SELECT [MT_S].[ID] as [SrcID],[MT_S].[DataField1],[FK_1_D].[ID] as [FK_ID1],[MT_S].[LotsMoreFields]
FROM [Src].[dbo].[MyTable] [MT_S]
JOIN [Src].[dbo].[FK1] [FK_1_S] ON [MT_S].[FK_ID1] = [FK_1_S].[ID]
JOIN [Dest].[dbo].[FK1] [FK_1_D] ON [FK_1_S].[Name] = [FK_1_D].[Name]
) [SRC] ON 1 = 0
WHEN NOT MATCHED THEN
INSERT([DataField1],[FK_ID1],[LotsMoreFields])
VALUES ([SRC].[DataField1],[SRC].[FK_ID1],[SRC].[LotsMoreFields])
OUTPUT [SRC].[SrcID],INSERTED.[ID],0,N'MyTable' INTO [Dest].[dbo].[ChangeTracking]([Src_ID],[Dest_ID],[AlreadyExists],[TableName]);

SQL Pivot with multiple joins

The SQL Pivot command seems difficult at least. I've read a lot about it, and been tinkering with this query for a while, but all I get are really obscure error messages that don't help, like "The column name 'Id' was specified multiple times.." or "The multi-part identifier X could not be bound."
Our database collects client answers to questions. I'd like to create a table which contains a row for each client, and columns for each question (ID) they've answered and the AVG ResponseTime across all times that user has logged in. This is made more difficult as the UserId isn't directly stored in the UserSessionData table, it's stored in the UserSession table, so I have to do a join first, which seems to complicate the issue.
The tables I'm trying to pivot are roughly of the following form:
CREATE TABLE [dbo].[UserSessionData](
[Id] [int] IDENTITY(1,1) NOT NULL,
[UserSessionId] [int] NOT NULL,
[UserWasCorrect] [bit] NULL,
[ResponseTime] [float] NULL,
[QuestionId] [int] NULL)
--This table contains user answers to a number of questions.
CREATE TABLE [dbo].[UserSession](
[Id] [int] IDENTITY(1,1) NOT NULL,
[UserId] [int] NOT NULL,
[SessionCode] [nvarchar](50) NOT NULL)
--This table contains details of the user's login session.
CREATE TABLE [dbo].[Question](
[Id] [int] IDENTITY(1,1) NOT NULL,
[QuestionText] [nvarchar](max) NOT NULL,
[GameId] [int] NOT NULL,
[Description] [nvarchar](max) NULL)
--This table contains question details
I'll continue trying to mangle a solution, but if anyone can shed any light (or suggest an easier method than PIVOT to achieve the desired result), then that would be great.
Cheers
It's because you've got the same column names in multiple tables, so after the join the PIVOT sees several columns with the same name. Have a look at my example below:
SELECT
*
FROM (
SELECT
usd.Id AS usdId
,UserSessionId
,UserWasCorrect
,ResponseTime
,QuestionId
,us.Id AS usId
,SessionCode
,UserId
,Description
,GameId
,qu.Id AS quId
,QuestionText
FROM #UserSessionData usd
LEFT JOIN #UserSession us
ON usd.UserSessionId = us.Id
LEFT JOIN #Question qu
ON usd.QuestionId = qu.Id
) AS tbl PIVOT (
-- No example data was provided, so 'quest' and 'voyage' are just placeholder values. Change the PIVOT clause to match your data.
MIN(ResponseTime) FOR SessionCode IN (quest, voyage)
) AS pvt
'quest' and 'voyage' are examples of values found in the SessionCode column; change them to match the contents of your column. In PIVOT and UNPIVOT you cannot use a query to supply these values; they have to be listed statically. You could use dynamic SQL to generate the list, but this is usually strongly advised against.
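For the result described in the question (one row per user, one column per question, average response time in each cell), a minimal sketch might select only the three columns the PIVOT needs, which also avoids the duplicate Id names. The question IDs 1, 2 and 3 are hypothetical; replace them with the IDs that exist in the Question table.

```sql
-- Sketch only: average ResponseTime per user, one column per QuestionId.
-- The IDs [1], [2], [3] are placeholders, not values from the original post.
SELECT UserId, [1], [2], [3]
FROM (
    SELECT us.UserId, usd.QuestionId, usd.ResponseTime
    FROM dbo.UserSessionData AS usd
    INNER JOIN dbo.UserSession AS us
        ON usd.UserSessionId = us.Id
) AS src
PIVOT (
    AVG(ResponseTime) FOR QuestionId IN ([1], [2], [3])
) AS pvt;
```

Every column of the inner query that is not aggregated or pivoted becomes a grouping column, so keeping the inner SELECT down to UserId, QuestionId and ResponseTime is what yields exactly one row per user.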

Recursive OR statement SQL Server 2016

I am looking at creating a relatively simple insert statement that inserts a new record if there are any changes to a table. The issue I have is that there are over 600 columns that would need to be checked.
More details: the main reporting table is updated every 15 minutes from the front-end application using a SQL process to push the changes; however, it overwrites the data and doesn't maintain a change log. I have no control over any of this.
The second table (my table) is a DWH table, which will hold an audit of changes. So I use an inner join where t1.AccountNo = t2.AccountNo and t1.Field1 <> t2.Field1, then add an OR and the next field: t1.AccountNo = t2.AccountNo and t1.Field2 <> t2.Field2, and so on.
Is there a better way to get the desired result given the number of columns?
You could try a different approach.
Create a trigger on the main table for update and delete.
This trigger copies the data which is already in the table to your dwh table before the data has changed.
create Trigger [nameupdate] on [yourtable] after update
as
insert into [dwh]
select
getdate() as [ChangeDate]
,'update' as [Action]
,SYSTEM_USER as [User]
,d.[ID]
,d.[...]
from deleted d
GO
The same applies for delete:
create Trigger [namedelete] on [yourtable] after delete
[...]
My dwh table has 3 additional columns for tracking and contains all the columns from the main table.
CREATE TABLE [dwh](
[ID] [int] IDENTITY(1,1) NOT NULL Primary key,
[ChangeDate] [datetime] NOT NULL,
[Action] [varchar](50) NOT NULL,
[User] [nvarchar](128) NOT NULL,
[...]

SQL Server - Order Identity Fields in Table

I have a table with this structure:
CREATE TABLE [dbo].[cl](
[ID] [int] IDENTITY(1,1) NOT NULL,
[NIF] [numeric](9, 0) NOT NULL,
[Name] [varchar](80) NOT NULL,
[Address] [varchar](100) NULL,
[City] [varchar](40) NULL,
[State] [varchar](30) NULL,
[Country] [varchar](25) NULL,
Primary Key([ID],[NIF])
);
Imagine that this table has 3 records. Record 1, 2, 3...
Whenever I delete record number 2, the IDENTITY field leaves a gap. The table then has record 1 and record 3. It's not correct!
Even if I use:
DBCC CHECKIDENT('cl', RESEED, 0)
it does not solve my problem, because it will set the ID of the next inserted record to 1. And that's not correct either, because the table will then have a duplicate ID.
Does anyone have a clue about this?
No database is going to reseed or recalculate an auto-incremented field/identity to use values in between ids as in your example. This is impractical on many levels, but some examples may be:
Integrity - since a re-used id could mean records in other systems are referring to an old value when the new value is saved
Performance - trying to find the lowest gap for each value inserted
In MySQL, this is not really happening either (at least in InnoDB or MyISAM - are you using something different?). In InnoDB, the behavior is identical to SQL Server where the counter is managed outside of the table, so deleted values or rolled back transactions leave gaps between last value and next insert. In MyISAM, the value is calculated at time of insertion instead of managed through an external counter. This calculation is what gives the impression of the value being recalculated - it's just never calculated until actually needed (MAX(Id) + 1). Even this won't insert inside gaps (like the id = 2 in your example).
Many people will argue if you need to use these gaps, then there is something that could be improved in your data model. You shouldn't ever need to worry about these gaps.
If you insist on using those gaps, your fastest method would be to log deletes in a separate table, then use an INSTEAD OF INSERT trigger to perform the inserts with your intended keys by first looking for records in these deletions table to re-use (then deleting them to prevent re-use) and then using the MAX(Id) + 1 for any additional rows to insert.
I guess what you want is something like this:
create table dbo.cl
(
SurrogateKey int identity(1, 1)
primary key
not null,
ID int not null,
NIF numeric(9, 0) not null,
Name varchar(80) not null,
Address varchar(100) null,
City varchar(40) null,
State varchar(30) null,
Country varchar(25) null,
unique (ID, NIF)
)
go
I added a surrogate key so you'll have the best of both worlds. Now you just need a trigger on the table to "adjust" the ID whenever some prior ID gets deleted:
create trigger tr_on_cl_for_auto_increment on dbo.cl
after delete, update
as
begin
update dbo.cl
set ID = d.New_ID
from dbo.cl as c
inner join (
select c2.SurrogateKey,
row_number() over (order by c2.SurrogateKey asc) as New_ID
from dbo.cl as c2
) as d
on c.SurrogateKey = d.SurrogateKey
end
go
Of course this solution also implies that you'll have to ensure (whenever you insert a new record) that you check for yourself which ID to insert next.
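One way to pick the next ID at insert time, shown here only as a sketch against the dbo.cl table defined above, is to compute MAX(ID) + 1 inside the INSERT itself. The NIF and Name values are made up for illustration, and the locking hints are an assumption about how to keep two concurrent inserts from computing the same ID.

```sql
-- Sketch only: insert one row using the next consecutive ID.
-- The NIF and Name values are hypothetical sample data.
INSERT INTO dbo.cl (ID, NIF, Name)
SELECT ISNULL(MAX(ID), 0) + 1,   -- yields 1 when the table is empty
       123456789,
       'Example client'
FROM dbo.cl WITH (UPDLOCK, HOLDLOCK);  -- hold the range until the insert completes
```

The unique constraint on (ID, NIF) does not by itself stop two rows from sharing an ID, so the locking (or a separate unique constraint on ID alone) is what protects the numbering.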

Cannot insert explicit value for identity column

I am migrating my application from one database to another while keeping the table structure as it is. I am creating the same tables in the new database and inserting the values over a database link.
I am getting an error message like "Cannot insert explicit value for identity column in table 'XYZ' when IDENTITY_INSERT is set to OFF." because table XYZ has ScreenConfigSettingAccessId as an identity column.
Below is the script I am using for creating the table and inserting the values:
CREATE TABLE [dbo].[XYZ](
[ScreenConfigSettingAccessId] [int] IDENTITY(1,1) NOT NULL,
[APP_ID] [int] NOT NULL,
[ScreenConfigSettingId] [int] NOT NULL,
[RSRC_ID] [char](20) NOT NULL
)
INSERT INTO [dbo].[XYZ]
(
[ScreenConfigSettingAccessId] ,
[APP_ID] ,
[ScreenConfigSettingId] ,
[RSRC_ID]
)
SELECT
[ScreenConfigSettingAccessId] ,
[APP_ID] ,
[ScreenConfigSettingId] ,
[RSRC_ID]
FROM [olddatabase].[database name].[dbo].[XYZ]
In the old table, the values of ScreenConfigSettingAccessId are 3 and 4.
I want to insert the same data that the old table has, so I set IDENTITY_INSERT to ON and tried again, but it still does not allow the insert.
Looking for your suggestions.
You need to specify the table. Check out the command syntax in SQL Books Online: SQL 2000 or SQL 2012 (the syntax hasn't changed).
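As a sketch of what that looks like for the script in the question: SET IDENTITY_INSERT takes the table name, must run in the same session as the INSERT, and the INSERT must list its columns explicitly. The table name XYZ is taken from the CREATE TABLE statement and the error message; the four-part source name is copied from the question.

```sql
-- Sketch only: allow explicit identity values for this one table, in this session.
SET IDENTITY_INSERT [dbo].[XYZ] ON;

INSERT INTO [dbo].[XYZ]
    ([ScreenConfigSettingAccessId], [APP_ID], [ScreenConfigSettingId], [RSRC_ID])
SELECT
    [ScreenConfigSettingAccessId], [APP_ID], [ScreenConfigSettingId], [RSRC_ID]
FROM [olddatabase].[database name].[dbo].[XYZ];

-- Only one table per session can have IDENTITY_INSERT ON, so switch it off afterwards.
SET IDENTITY_INSERT [dbo].[XYZ] OFF;
```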
