I have a script that automatically overwrites SQL Server objects every 2 days.
In order to check whether the script has run successfully, I would like to be able to check two things:
Find out the freshness of the objects by retrieving the creation date of each object (table, view, ...). If it is older than 2 days, the script has not overwritten the objects. These objects have to be listed.
Find out the completeness of the objects by ensuring all objects are present based on a predefined list, i.e. check whether all tables/views exist. The object names are already stored in another table in the database, so this can be used as input.
How should I go about this? What would be the right approach, and which scripting language would be used to realize it? Could you please point me to any good online resources?
Many thanks.
If you use the system tables then an unrelated release could throw you off. Use a log table to keep track of what is going on. On successful completion of your process have it insert an entry into the table that says it was completed. Then query the log table to see when you should refresh again.
It could be something as simple as the table below, where activityTypeId = 1 identifies this process and activityValue is 0 for started and 1 for completed.
CREATE TABLE [dbo].[ActivityLog](
[id] [int] IDENTITY(1,1) NOT NULL,
[activityTypeId] [int] NOT NULL,
[activityTime] [datetime] NOT NULL,
[activityValue] [int] NOT NULL,
CONSTRAINT [PK_ActivityLog] PRIMARY KEY CLUSTERED
(
[id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
ON [PRIMARY]
) ON [PRIMARY]
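As a rough sketch (not from the original answer) of how that could be wired up: the refresh script logs a start row and a completion row, and the freshness check just looks for a recent completion.
-- At the start of the refresh script:
INSERT INTO dbo.ActivityLog (activityTypeId, activityTime, activityValue)
VALUES (1, GETDATE(), 0)
-- At the very end, after all objects have been recreated:
INSERT INTO dbo.ActivityLog (activityTypeId, activityTime, activityValue)
VALUES (1, GETDATE(), 1)
-- Freshness check: complain if there is no successful completion in the last 2 days.
IF NOT EXISTS (
    SELECT 1
    FROM dbo.ActivityLog
    WHERE activityTypeId = 1
      AND activityValue = 1
      AND activityTime >= DATEADD(DAY, -2, GETDATE())
)
    RAISERROR('The refresh script has not completed successfully in the last 2 days', 16, 1)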
This should do it:
SELECT
mo.Name,
CASE
WHEN so.name IS NULL
THEN 'Does Not Exist'
WHEN DATEDIFF(dd, so.create_date, getdate()) > 2
THEN 'More than two days old'
ELSE 'Exists' END AS existCheck
FROM dbo.MyObjects AS mo
LEFT JOIN sys.objects AS so ON so.name = mo.Name
Here is a possible solution:
IF EXISTS(
    SELECT 1
    FROM YourTable T
    LEFT OUTER JOIN sys.objects O
        ON O.name = T.name
        AND O.type IN ('U','V')
    WHERE O.name IS NULL                              -- object from the list is missing
       OR O.modify_date < DATEADD(DAY, -2, GETDATE()) -- or has not been touched in 2 days
)
RAISERROR('Some objects are missing or have not been updated in the last 2 days', 16, 1)
I have a table with the following definition:
CREATE TABLE [dbo].[Transactions]
(
[ID] [varchar](18) NOT NULL,
[TIME_STAMP] [datetime] NOT NULL,
[AMT] [decimal](18, 4) NOT NULL,
[CID] [varchar](90) NOT NULL,
[DEPARTMENT] [varchar](4) NULL,
[SOURCE] [varchar](14) NULL,
PRIMARY KEY NONCLUSTERED
(
[ID] ASC
)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
The table has 75 million rows in it. Somehow, it takes up 20 GB of disk space!
The following 2 queries...
SELECT
SUM(AMT)
FROM
Transactions
WHERE
TIME_STAMP >= '2017-11-11 00:00:00' AND
TIME_STAMP < '2017-11-12 00:00:00' AND
DEPARTMENT = 'Shoes' AND
SOURCE = 'Website'
SELECT
COUNT(DISTINCT(CID))
FROM
Transactions
WHERE
TIME_STAMP >= '2017-11-11 00:00:00' AND
TIME_STAMP < '2017-11-12 00:00:00' AND
DEPARTMENT = 'Accessories' AND
SOURCE = 'Mobile'
...each take about 2 minutes to run!
The DEPARTMENT and SOURCE fields are of low cardinality; they contain only a few distinct values.
Please advise on what I need to do: which indexes should I create, and with which settings, to optimize the performance of these queries?
Thank you!
The best way to solve this specific query would be a composite index (one index with multiple columns) in this order:
Department
Source
Timestamp
Try to put the most selective column first: if SOURCE has more distinct values than DEPARTMENT, put it first. The date goes last because it is a range predicate, and key columns after a range column cannot be used for seeking.
CREATE INDEX IX_Transactions ON Transactions (DEPARTMENT, SOURCE, TIME_STAMP) INCLUDE (AMT, CID)
I would create an index using Timestamp, Department and Source. I would also add AMT and CID as included columns. This means both your queries could be satisfied by reading the index and not having to hit the parent table at all.
CREATE INDEX IX_Transactions ON Transactions(TIME_STAMP,DEPARTMENT,SOURCE) INCLUDE(AMT,CID)
One additional option to consider is to run the query with the Execution Plan and see whether it recommends an index. I do this a lot when considering indexes, because I have seen better results from Execution Plan recommended indexes than from indexes I thought were good; the recommendations are not always the intuitive ones.
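One way to see those suggestions outside of a single plan is the missing-index DMVs; a rough sketch (my addition, assuming you have VIEW SERVER STATE permission) that lists the suggestions SQL Server has accumulated since the last restart, most impactful first:
SELECT TOP (20)
    mid.statement AS table_name,
    mid.equality_columns,
    mid.inequality_columns,
    mid.included_columns,
    migs.user_seeks,
    migs.avg_user_impact
FROM sys.dm_db_missing_index_details AS mid
JOIN sys.dm_db_missing_index_groups AS mig ON mig.index_handle = mid.index_handle
JOIN sys.dm_db_missing_index_group_stats AS migs ON migs.group_handle = mig.index_group_handle
ORDER BY migs.user_seeks * migs.avg_user_impact DESC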
I have a table in my database which is accessible to many users. Some data is now missing from that table. How can I find out who deleted those rows?
You can use ApexSQL Log to fully investigate operations executed against your table. The database needs to be in the full recovery model, so that the information on past operations is available inside the transaction log file for ApexSQL Log to read. Once the tool analyzes your t-log, you will be able to see the time the operation began and ended, the operation type, the schema and object name of the object affected, the name of the user who executed the operation, and more. For UPDATEs, you'll even be able to see the old and the new value of the updated fields.
There are several guides on this here: https://solutioncenter.apexsql.com/apexsql-log-solutions-table-of-contents/
Furthermore, you can even use ApexSQL Log to roll back those transactions if you need to. It will simply 'Undo' them and roll the changes back to their original state.
You can find the user name behind the deletes with the following little snippet:
DECLARE @TableName sysname
SET @TableName = 'dbo.t1_new' -- INPUT TABLE NAME
SELECT
    u.[name] AS UserName,
    l.[Begin Time] AS TransactionStartTime
FROM fn_dblog(NULL, NULL) l
INNER JOIN
(
    SELECT [Transaction ID]
    FROM fn_dblog(NULL, NULL)
    WHERE AllocUnitName LIKE @TableName + '%'
      AND Operation = 'LOP_DELETE_ROWS'
) deletes
    ON deletes.[Transaction ID] = l.[Transaction ID]
INNER JOIN sysusers u
    ON u.[sid] = l.[Transaction SID]
Source: dba.stackexchange (I don't recall who posted it).
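A possible variation (my addition, not from the original post): if the delete was executed by a server login that isn't mapped in sysusers, SUSER_SNAME() may resolve the SID directly; the begin time and the SID are recorded on the LOP_BEGIN_XACT row of the transaction.
SELECT
    SUSER_SNAME(l.[Transaction SID]) AS LoginName,
    l.[Begin Time] AS TransactionStartTime
FROM fn_dblog(NULL, NULL) l
WHERE l.Operation = 'LOP_BEGIN_XACT'
  AND l.[Transaction ID] IN
  (
      SELECT [Transaction ID]
      FROM fn_dblog(NULL, NULL)
      WHERE AllocUnitName LIKE 'dbo.t1_new' + '%'
        AND Operation = 'LOP_DELETE_ROWS'
  )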
Unfortunately, you can't see deleted records if you don't keep them somewhere yourself.
If you want to track this type of intervention, you should not actually delete your records.
Instead, you should add some more fields to your table.
Here is an example:
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_PADDING ON
GO
CREATE TABLE [dbo].[Person](
[Pers_ID] [int] IDENTITY(1,1) NOT NULL,
[Pers_CompanyID] [int] NULL,
[Pers_FirstName] [nvarchar](50) NULL,
[Pers_LastName] [nvarchar](50) NULL,
[Pers_CreatedBy] [int] NULL,
[Pers_CreatedDate] [datetime] NULL,
[Pers_UpdatedBy] [int] NULL,
[Pers_UpdatedDate] [datetime] NULL,
[Pers_Deleted] [bit] NULL,
CONSTRAINT [PK_Person] PRIMARY KEY CLUSTERED
(
[Pers_ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
SET ANSI_PADDING OFF
GO
When the user creates a record, set Pers_CreatedBy = UserID and Pers_CreatedDate = the current date.
When updating a record, set Pers_UpdatedBy = UserID and Pers_UpdatedDate = the current date.
When deleting, set Pers_Deleted = 1, Pers_UpdatedBy = UserID and Pers_UpdatedDate = the current date.
In your code, every query should then include the condition Pers_Deleted IS NULL (or Pers_Deleted = 0, if you default the column to 0), so that soft-deleted rows are excluded.
This way, you can track who created, updated or deleted a record.
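A minimal sketch of that pattern (the @UserID and @PersID parameters are hypothetical values supplied by the application):
-- Soft delete: flag the row instead of removing it.
UPDATE dbo.Person
SET Pers_Deleted = 1,
    Pers_UpdatedBy = @UserID,
    Pers_UpdatedDate = GETDATE()
WHERE Pers_ID = @PersID
-- Everyday reads then exclude soft-deleted rows:
SELECT Pers_ID, Pers_FirstName, Pers_LastName
FROM dbo.Person
WHERE Pers_Deleted IS NULL OR Pers_Deleted = 0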
Running PHP 5.3 and SQL Server 2008 R2 using sqlsrv driver to connect.
Codeigniter 2.2
So I have been using CodeIgniter for a couple of years now and this is the first time I have run across this problem. I have a table in my database named 'update_times', which is a log table for data updates I load daily; it has around 2000 records in it and is indexed on the columns I query against. My database has 60 or so tables and 'update_times' is the only table I am unable to select anything from with CodeIgniter.
I have done a bunch of tests:
I ran a record count for every table in the database and every table was correct except 'update_times', which returned 0 records.
I can query (select) from the table in Management Studio with no problem.
I can also select from the update_times table using the sqlsrv PHP function sqlsrv_query; records are returned using this method.
I am unable to select using the active record select or query methods from CodeIgniter (I tried it in multiple controllers/models).
Here's the weird part: I can insert, update and delete using the active record functions. It is only the select that fails, and only on this table.
I tried rebuilding the indexes and rebuilding the entire table as well but nothing helped. So I am left stumped. I was going to just create a new table with a new name for update_times but I really want to find the problem in CI so that I know what to do if it happens again. It's almost like CI is blocking the select for some reason.
Update: I have now created a table with a different name but the same structure, and I am unable to query it either, just like the update_times table. I am still stumped.
Here is the table structure of update_times:
CREATE TABLE [dbo].[update_times](
[ut_id] [int] IDENTITY(1,1) NOT NULL,
[table_name] [varchar](50) NOT NULL,
[start_time] [datetime] NOT NULL,
[end_time] [datetime] NOT NULL,
[records] [int] NOT NULL,
[emp_id] [int] NULL,
[dates_requested] [varchar](50) NULL,
PRIMARY KEY CLUSTERED
(
[ut_id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
Any help or suggestions on how to narrow down the error would be great.
I am supposed to remove whole rows, and parts of XML documents, from a table with an XML column, based on a specific value in the XML column. However, the table contains millions of rows and gets locked while I perform the operation. Currently it would take almost a week to clean it up, and the system is too critical to be taken offline for that long.
Are there any ways to optimize the XPath expressions in this script?
declare @slutdato datetime = '2012-03-01 00:00:00.000'
declare @startdato datetime = '2000-02-01 00:00:00.000'
declare @lev varchar(20) = 'suppliername'
declare @todelete varchar(10) = '~~~~~~~~~~'
CREATE TABLE #ids (selId int NOT NULL PRIMARY KEY)
INSERT into #ids
select id from dbo.proevesvar
WHERE leverandoer = @lev
and proevedato <= @slutdato
and proevedato >= @startdato
begin transaction /* delete whole rows */
delete from dbo.proevesvar
where id in (select selId from #ids)
and ProeveSvarXml.exist('/LaboratoryReport/LaboratoryResults/Result[Value=sql:variable(''@todelete'')]') = 1
and Proevesvarxml.exist('/LaboratoryReport/LaboratoryResults/Result[Value!=sql:variable(''@todelete'')]') = 0
commit
begin transaction /* delete single results */
UPDATE dbo.proevesvar SET ProeveSvarXml.modify('delete /LaboratoryReport/LaboratoryResults/Result[Value=sql:variable(''@todelete'')]')
where id in (select selId from #ids)
commit
The table definition is:
CREATE TABLE [dbo].[ProeveSvar](
[ID] [int] IDENTITY(1,1) NOT NULL,
[CPRnr] [nchar](10) NOT NULL,
[ProeveDato] [datetime] NOT NULL,
[ProeveSvarXml] [xml] NOT NULL,
[Leverandoer] [nvarchar](50) NOT NULL,
[Proevenr] [nvarchar](50) NOT NULL,
[Lokationsnr] [nchar](13) NOT NULL,
[Modtaget] [datetime] NOT NULL,
CONSTRAINT [PK_ProeveSvar] PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY],
CONSTRAINT [IX_ProeveSvar_1] UNIQUE NONCLUSTERED
(
[CPRnr] ASC,
[Lokationsnr] ASC,
[Proevenr] ASC,
[ProeveDato] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
The first insert statement is very fast. I believe I can handle the locking by committing 50 rows at a time, so other requests can be handled in between my transactions.
The total number of rows for this supplier is about 5.5 million and the total rowcount in the table is around 13 million.
I've not really used XPath within SQL Server before, but something which stands out is that you're doing lots of reads and writes in the same command (in the second statement). If possible, change your queries to something like this:
CREATE TABLE #ids (selId int NOT NULL PRIMARY KEY)
INSERT into #ids
select id from dbo.proevesvar
WHERE leverandoer = @lev
and proevedato <= @slutdato
and proevedato >= @startdato
and ProeveSvarXml.exist('/LaboratoryReport/LaboratoryResults/Result[Value=sql:variable(''@todelete'')]') = 1
and Proevesvarxml.exist('/LaboratoryReport/LaboratoryResults/Result[Value!=sql:variable(''@todelete'')]') = 0
begin transaction /* delete whole rows */
delete from dbo.proevesvar
where id in (select selId from #ids)
This means that the first query will only create the new temporary table, and not write anything back, which will take slightly longer than your original, but the key thing is that your second query will ONLY be deleting records based on what's in your temporary table.
What you'll probably find is that, because it's deleting records, it's constantly maintaining its indexes, which causes the reads to be slower as well.
I'd also delete/disable any indexes/constraints that don't actually help your query run.
Also, you're creating your clustered primary key on the ID, which isn't always the best thing to do, especially if you're doing lots of date scans.
Can you also view the estimated execution plan for the top query? It would be interesting to see the order in which it checks the conditions. If it's doing the date first, then that's fine, but if it's doing the XPath before it checks the date, you might have to separate it into 3 queries, or add a new clustered index on 'proevedato, id'. This should force the query to only run the XPath for records which actually match the date.
Hope this helps.
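For reference, a rough sketch (not from either post) of the "commit 50 rows at a time" idea mentioned in the question, reusing the pre-filtered #ids table so that locks are released between batches:
DECLARE @batch int = 50
DECLARE @rows int = 1
WHILE @rows > 0
BEGIN
    BEGIN TRANSACTION
    DELETE TOP (@batch) p
    FROM dbo.ProeveSvar AS p
    INNER JOIN #ids AS i ON i.selId = p.ID
    SET @rows = @@ROWCOUNT -- capture before COMMIT resets it
    COMMIT -- release locks so other requests can get in between batches
END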
I've got a table that contains some buy/sell data, with around 8M records in it:
CREATE TABLE [dbo].[Transactions](
[id] [int] IDENTITY(1,1) NOT NULL,
[itemId] [bigint] NOT NULL,
[dt] [datetime] NOT NULL,
[count] [int] NOT NULL,
[price] [float] NOT NULL,
[platform] [char](1) NOT NULL
) ON [PRIMARY]
Every X minutes my program gets new transactions for each itemId and I need to update the table. My first solution is a two-step DELETE+INSERT:
delete from Transactions where platform=@platform and itemid=@itemid
insert into Transactions (platform,itemid,dt,count,price) values (@platform,@itemid,@dt,@count,@price)
[...]
insert into Transactions (platform,itemid,dt,count,price) values (@platform,@itemid,@dt,@count,@price)
The problem is that this DELETE statement takes 5 seconds on average, which is much too long.
The second solution I found is to use MERGE. I've created a stored procedure which takes a table-valued parameter:
CREATE PROCEDURE [dbo].[sp_updateTransactions]
@Table dbo.tp_Transactions readonly,
@itemId bigint,
@platform char(1)
AS
BEGIN
MERGE Transactions AS TARGET
USING @Table AS SOURCE
ON (
TARGET.[itemId] = SOURCE.[itemId] AND
TARGET.[platform] = SOURCE.[platform] AND
TARGET.[dt] = SOURCE.[dt] AND
TARGET.[count] = SOURCE.[count] AND
TARGET.[price] = SOURCE.[price] )
WHEN NOT MATCHED BY TARGET THEN
INSERT ([itemId], [dt], [count], [price], [platform])
VALUES (SOURCE.[itemId],
SOURCE.[dt],
SOURCE.[count],
SOURCE.[price],
SOURCE.[platform])
WHEN NOT MATCHED BY SOURCE AND TARGET.[itemId] = @itemId AND TARGET.[platform] = @platform THEN
DELETE;
END
This procedure takes around 7 seconds with a table of 70k records, so with 8M it would probably take a few minutes. The bottleneck is the "WHEN NOT MATCHED" clause: when I commented out that line, the procedure ran in 0.01 seconds on average.
So the question is: how do I improve the performance of the delete?
The delete is needed to make sure that the table doesn't contain transactions that have been removed in the application. In a real scenario this happens very rarely; the true need for deleting records is less than 1 in 10,000 transaction updates.
My theoretical workaround is to create an additional column like "transactionDeleted bit", use UPDATE instead of DELETE, and then clean the table up with a batch job every X minutes or hours that executes
delete from transactions where transactionDeleted=1
It should be faster, but I would need to update all SELECT statements in other parts of the application to use only transactionDeleted=0 records, so it may also affect application performance.
Do you know any better solution?
UPDATE: Current indexes:
CREATE NONCLUSTERED INDEX [IX1] ON [dbo].[Transactions]
(
[platform] ASC,
[ItemId] ASC
) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 50) ON [PRIMARY]
ALTER TABLE [dbo].[Transactions] ADD CONSTRAINT [IX2] UNIQUE NONCLUSTERED
(
[ItemId] DESC,
[count] ASC,
[dt] DESC,
[platform] ASC,
[price] ASC
) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
OK, here is another approach as well. For a similar problem (a large scan on WHEN NOT MATCHED BY SOURCE then DELETE) I reduced the MERGE execution time from 806ms to 6ms!
One issue with the problem above is that the "WHEN NOT MATCHED BY SOURCE" clause is scanning the whole TARGET table.
It is not that obvious but Microsoft allows the TARGET table to be filtered (by using a CTE) BEFORE doing the merge. So in my case the TARGET rows were reduced from 250K to less than 10 rows. BIG difference.
Assuming that the problem above works with the TARGET being filtered by @itemId and @platform, the MERGE code would look like this. The index changes suggested above would help this logic too.
WITH Transactions_CTE (itemId
,dt
,count
,price
,platform
)
AS
-- Define the CTE query that will reduce the size of the TARGET table.
(
SELECT itemId
,dt
,count
,price
,platform
FROM Transactions
WHERE itemId = @itemId
AND platform = @platform
)
MERGE Transactions_CTE AS TARGET
USING @Table AS SOURCE
ON (
TARGET.[itemId] = SOURCE.[itemId]
AND TARGET.[platform] = SOURCE.[platform]
AND TARGET.[dt] = SOURCE.[dt]
AND TARGET.[count] = SOURCE.[count]
AND TARGET.[price] = SOURCE.[price]
)
WHEN NOT MATCHED BY TARGET THEN
INSERT
VALUES (
SOURCE.[itemId]
,SOURCE.[dt]
,SOURCE.[count]
,SOURCE.[price]
,SOURCE.[platform]
)
WHEN NOT MATCHED BY SOURCE THEN
DELETE;
Using a BIT field for IsDeleted (or IsActive as many people do) is valid but it does require modifying all code plus creating a separate SQL Job to periodically come through and remove the "deleted" records. This might be the way to go but there is something less intrusive to try first.
I noticed in your set of 2 indexes that neither is CLUSTERED. Can I assume that the IDENTITY field is? You might consider making the [IX2] UNIQUE index the CLUSTERED one and changing the PK (again, I assume the IDENTITY field is a CLUSTERED PK) to be NONCLUSTERED. I would also reorder the IX2 fields to put [Platform] and [ItemID] first. Since your main operation is looking for [Platform] and [ItemID] as a set, physically ordering them this way might help. And since this index is unique, that is a good candidate for being CLUSTERED. It is certainly worth testing as this will impact all queries against the table.
Also, if changing the indexes as I have suggested helps, it still might be worth trying both ideas and hence doing the IsDeleted field as well to see if that increases performance even more.
EDIT:
I forgot to mention: by making the IX2 index CLUSTERED and moving the [platform] field up, you should be able to get rid of the IX1 index.
EDIT2:
Just to be very clear, I am suggesting something like:
CREATE UNIQUE CLUSTERED INDEX [IX2] ON [dbo].[Transactions]
(
[ItemId] DESC,
[platform] ASC,
[count] ASC,
[dt] DESC,
[price] ASC
)
And to be fair, changing which index is CLUSTERED could also negatively impact queries where JOINs are done on the [id] field, which is why you need to test thoroughly. In the end you need to tune the system for your most frequent and/or expensive queries, and you might have to accept that some queries will be slower as a result; that can be worth it if this operation becomes much faster.
See this https://stackoverflow.com/questions/3685141/how-to-....
would the update be the same cost as a delete? No. The update would be
a much lighter operation, especially if you had an index on the PK
(errrr, that's a guid, not an int). The point being that an update to
a bit field is much less expensive. A (mass) delete would force a
reshuffle of the data.
In light of this information, your idea to use a bit field is very valid.
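A minimal sketch of what that bit-field approach could look like here (the column name and batch size are illustrative, not from the original posts):
-- One-time change: add the flag column.
ALTER TABLE dbo.Transactions ADD transactionDeleted bit NOT NULL DEFAULT 0
-- In the refresh: flag the old rows instead of deleting them, then insert the new ones.
UPDATE t
SET t.transactionDeleted = 1
FROM dbo.Transactions AS t
WHERE t.platform = @platform
  AND t.itemId = @itemId
-- Scheduled cleanup job, run every X minutes/hours, deleting in small batches:
WHILE 1 = 1
BEGIN
    DELETE TOP (5000) FROM dbo.Transactions WHERE transactionDeleted = 1
    IF @@ROWCOUNT = 0 BREAK
END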