Creating log files in SQL Server 2012 - sql-server

I'm sure this is probably SQL 101 but Google searches keep finding 'creating logins' entries. Let me give a quick overview.
I have drifted into SQL reporting from general IT Support as a result of the need for more detailed reports than our systems can provide. My company runs leisure centres and we very helpfully use 3 different leisure management systems across 70+ sites. All 3 are SQL based but do the same job in very different ways. I have produced loads of reports in SSRS but the 2 or 3 I have done that access all systems are very, very time consuming and just one link down means the whole report is inaccessible.
A request to send data to a third party for marketing purposes has forced us to finally look at centralising data from all of the systems to make reporting much easier. There will essentially be only 2 tables - one for membership details and one for activities. I have done the hard part of creating a view that produces the same information from each of the 3 systems and have set up a central database to bring the data back to. I will have a stored procedure running on each system that will populate a table with records from the previous day. There will then be a job on the central server that will copy data from these tables and remove it once transferred. So far so (relatively) simple.
The problem is that the central server will be trying to retrieve data from over 60 servers - all with their own network links. Some sites are remote with poor DSL connections so there will be times when some of the data can't be copied by the scheduled job. I am happy that a SQL agent job can have these as steps and one failed connection won't stop the whole process but my concern is that troubleshooting when something goes wrong will be tricky if I don't get some kind of logging in place.
The stored procedures, although complicated SQL-wise, are just update/insert record jobs. What would be helpful is for the update job to write to a log somehow, reporting that it affected 20 rows, and the insert job that it affected 100 rows. Basic stuff, but I have no idea how to go about it. What would also be useful is some kind of warning when one of the steps fails. SQL Agent will help, but I want to build in as much resilience as possible while I am at the 3-server stage, before rolling out to the 60+ server stage.
Any pointers in the right direction would be much appreciated. My SQL skills are self taught (With a lot of Stack Overflow help!) and although I have learnt a lot about producing complicated views and queries in the last couple of weeks, most of my SQL has just been queries for SSRS so this is all new to me.
Many thanks.

The output clause will get you what you want for logging. What it allows you to do is capture what your statement is doing. Below is an example of how to perform an update and capture the changes in a logging table.
As for error handling and resilience I would take a look into using SSIS to perform your ETL. SSIS gives you a much more robust feature set for error handling.
-- Create Temp Tables
CREATE TABLE #myLog
(
    id int,
    oldVal int,
    newVal int
);
CREATE TABLE #myTable
(
    id int,
    val int
);

-- Add Values to #myTable
INSERT INTO #myTable VALUES
    (1, 1234),
    (2, 1234);

-- Output Contents of #myTable
SELECT * FROM #myTable;

-- Update #myTable & Capture Changes
UPDATE #myTable
SET val = 12345
OUTPUT
    inserted.id,
    deleted.val,
    inserted.val
INTO #myLog
WHERE id = 2;

-- Output Contents of #myTable and #myLog
SELECT * FROM #myTable;
SELECT * FROM #myLog;

-- Drop Temp Tables
DROP TABLE #myLog;
DROP TABLE #myTable;
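If what you mainly want is the row count per step written to a central log, an even simpler pattern is to capture @@ROWCOUNT right after each statement and insert it into a permanent logging table, wrapping each step in TRY/CATCH so a failed site records an error instead of killing the whole run. A minimal sketch, assuming a hypothetical dbo.EtlRunLog table and made-up membership tables (none of these names are from your systems):
-- Hypothetical central logging table.
CREATE TABLE dbo.EtlRunLog
(
    LogId        int IDENTITY(1,1) PRIMARY KEY,
    SiteName     varchar(50)    NOT NULL,
    StepName     varchar(100)   NOT NULL,
    RowsAffected int            NULL,
    ErrorMessage nvarchar(4000) NULL,
    LoggedAt     datetime       NOT NULL DEFAULT GETDATE()
);
GO
-- Inside each stored procedure: capture @@ROWCOUNT immediately after the DML
-- statement, then log it; failures are logged rather than aborting the run.
DECLARE @rows int;
BEGIN TRY
    UPDATE m
    SET    m.SomeColumn = s.SomeColumn                 -- hypothetical columns
    FROM   dbo.Membership         AS m
    JOIN   dbo.Membership_Staging AS s ON s.MemberId = m.MemberId;
    SET @rows = @@ROWCOUNT;

    INSERT INTO dbo.EtlRunLog (SiteName, StepName, RowsAffected)
    VALUES ('Site01', 'Update membership', @rows);
END TRY
BEGIN CATCH
    INSERT INTO dbo.EtlRunLog (SiteName, StepName, ErrorMessage)
    VALUES ('Site01', 'Update membership', ERROR_MESSAGE());
END CATCH;
With something like this in place, the SQL Agent job's own notifications cover outright step failures, and a quick query of the log table tells you which sites transferred how many rows on any given night.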

Related

MS SQL Trigger for ETL vs Performance

I need to know what the impact on a production DB might be of creating triggers on ~30 production tables that capture every UPDATE, DELETE and INSERT statement and write the following information to a separate table: "PK", "Table Name", "Time of modification".
I have limited ability to test it, as I have read-only permissions to both the Prod and Test environments (and I can get one work day for 10 end users to test it).
I have estimated that the number of records inserted by those triggers will be around 150-200k daily.
Background:
I have a project to deploy a Data Warehouse for a database that is heavily customized, plus there are jobs running every day that manipulate the data. The "Updated On" date column is not being maintained (a customization) and there are hard deletes occurring on tables. We decided to ask the DEV team to add triggers like:
CREATE TRIGGER [dbo].[triggerName] ON [dbo].[ProductionTable]
FOR INSERT, UPDATE, DELETE
AS
INSERT INTO For_ETL_Warehouse (Table_Name, Regular_PK, Insert_Date)
SELECT 'ProductionTable', PK_ID, GETDATE() FROM inserted
INSERT INTO For_ETL_Warehouse (Table_Name, Regular_PK, Insert_Date)
SELECT 'ProductionTable', PK_ID, GETDATE() FROM deleted
on core ~30 production tables.
Based on this table we will pull delta from last 24 hours and push it to Data Warehouse staging tables.
If anyone has had a similar issue and can help me estimate how it might impact performance on the production database, I would really appreciate it. (If it works, I am saved; if not, I need to propose another solution. Currently mirroring or replication might be hard to get approved as the local DEVs have no idea how to set them up...)
Other ideas on how to handle this situation or perform tests are welcome (my deadline is Friday 26-01).
First of all, I would suggest you encode the table name as a smaller data type rather than a character one (30 tables => tinyint).
Second, you need to understand how big the payload you are going to write is, and how it is written:
If you choose a correct clustered index (the date column), then the server just needs to write the data row by row in sequence. That is an easy job even if you put in all 200k rows at once.
If you encode the table name as a tinyint, then basically it has to write:
1 byte (table code) + PK size (hopefully numeric, so <= 8 bytes) + 8 bytes datetime - so approx 17 bytes per row on the data page, plus indexes if any, plus the log file. This is very lightweight and again will put no "real" pressure on SQL Server.
The trigger itself will add a small overhead, but with the number of rows you are talking about, it is negligible.
I have seen systems that do similar things on a much larger scale with close to zero effect on the workload, so I would say it's a safe bet. The only problem with this approach is that it will not work in some cases (e.g. OUTPUT to temp tables from DML statements). But if you do not have those kinds of blockers, go for it.
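To make the tinyint suggestion concrete, here is a minimal sketch of the narrower layout; every name here is illustrative, not from the question:
-- Hypothetical lookup of table codes, plus a narrow audit table.
CREATE TABLE dbo.TableCode
(
    TableCode tinyint NOT NULL PRIMARY KEY,
    TableName sysname NOT NULL UNIQUE
);

CREATE TABLE dbo.For_ETL_Warehouse_Narrow
(
    TableCode   tinyint  NOT NULL,
    Regular_PK  bigint   NOT NULL,
    Insert_Date datetime NOT NULL DEFAULT GETDATE()
);

-- Clustered on the date column so new rows append sequentially.
CREATE CLUSTERED INDEX IX_Warehouse_Narrow_Date
    ON dbo.For_ETL_Warehouse_Narrow (Insert_Date);
GO
-- The trigger then writes the 1-byte code instead of the table name
-- (7 is an arbitrary code assigned to ProductionTable here).
CREATE TRIGGER dbo.trg_ProductionTable_Audit ON dbo.ProductionTable
FOR INSERT, UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;
    INSERT INTO dbo.For_ETL_Warehouse_Narrow (TableCode, Regular_PK)
    SELECT 7, PK_ID FROM inserted
    UNION ALL
    SELECT 7, PK_ID FROM deleted;
END;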
I hope it helps.

Getting data of temp table while debugging

While debugging, I am unable to watch a temp table's values in SQL Server 2012. I can see all of my variables' values and can even print them, but I am struggling with the temp tables. Is there any way to watch a temp table's values?
SQL Server provides the concept of the temporary table, which helps the developer in a great way. These tables can be created at runtime and support all the operations that a normal table does, but, depending on the table type, the scope is limited. These tables are created inside the tempdb database.
While debugging, you can pause the SP at some point; if you write a SELECT statement in your SP before the DROP TABLE statement, the # table is available for querying.
select * from #temp
I placed this code inside my stored procedure and I am able to see the temp table contents inside the "Locals" window.
INSERT INTO #temptable (columns) SELECT columns FROM sometable; -- populate your temp table
-- for debugging, comment in production
DECLARE @temptable XML = (SELECT * FROM #temptable FOR XML AUTO); -- now view @temptable in Locals window
This works on older SQL Server 2008 but newer versions would probably support a friendlier FOR JSON object. Credit: https://stackoverflow.com/a/6748570/1129926
I know this is old; I've been trying to make this work as well, so that I can view temp table data as I debug my stored procedure. So far nothing works.
I've seen many links to methods on how to do this, but ultimately they don't work the way a developer would want them to. For example, suppose several processes in the stored procedure update and modify data in the same temp table; there is no way to see the updates on the fly for each process in the SP.
This is a VERY common request, yet no one seems to have a solution other than not using stored procedures for complex processing because of how difficult they are to debug. If you're a .NET Core/EF 6 developer and have the correct PKs and FKs set up for the database, you shouldn't really need stored procedures at all, as it can all be handled by EF 6, and you can debug code to view the data results in your entities/models directly (usually in a web API using models/entities).
Trying to retrieve the data from the tempdb is not possible even with the same connection (as has been suggested).
What is sometimes used is:
PRINT '#temptablename'
SELECT * FROM #temptablename
Dotted throughout the code; you can add a debug flag to the SP and selectively debug the output. NOT ideal at all, but it works for many situations.
But this MUST already be in the Stored Procedure before execution (not during). And you must remember to remove the code prior to deployment to a production environment.
I'm surprised that in 2022 we still have no solution to this other than avoiding complex stored procedures or using .NET Core/EF 6, which in my humble opinion is the best approach for 2022, since SSMS and other tools like dbForge and RedGate can't accomplish this either.
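To illustrate the debug-flag approach mentioned above, here is a minimal sketch with an invented procedure and temp table (not from the question):
-- Pass @Debug = 1 to dump the temp table at each stage; 0 in production.
CREATE PROCEDURE dbo.usp_BuildReport
    @Debug bit = 0
AS
BEGIN
    SET NOCOUNT ON;

    CREATE TABLE #work (id int, val int);
    INSERT INTO #work (id, val) VALUES (1, 10), (2, 20);

    IF @Debug = 1
        SELECT '#work after initial load' AS DebugStep, * FROM #work;

    UPDATE #work SET val = val * 2;

    IF @Debug = 1
        SELECT '#work after update' AS DebugStep, * FROM #work;
END;
GO
EXEC dbo.usp_BuildReport @Debug = 1;  -- shows the intermediate contents as extra result sets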

Primavera P6 database has grown to a very large size

I'm not a P6 admin, nor am I a (SQL Server) DBA. I'm just a Winforms developer (with T-SQL) who has agreed to do a little research for the scheduling group.
I believe the version they're running is 8.2, desktop (non-Citrix). Backend is SQL Server. The backend has grown to 36gb and nightly backups are periodically filling drives to their limits.
REFRDEL holds 135 million records, dating back to some time in 2012.
UDFVALUE holds 26 million records
All other tables have reasonable numbers of records.
Can someone clue us in as to which of the several cleanup-oriented stored procedures to run (if any), or offer some sane advice so that we can get the backend down to a manageable size, please? Something that would not violate best practices and is considered very safe, please.
When you look at the data in the database there is a column named "delete_session_id". Do you see any rows with a value of -99? If so, then there is an unresolved issue here. If not, then proceed with the following to get the clean-up jobs running again...
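A quick way to check is a query along the lines of the soft-delete query later in this thread, for example against the task table:
select count(*) from task where delete_session_id = -99;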
If you are using SQL Server (Full Editions), perform the following steps to resolve the issue:
Verify that the SQL Server Agent service is started on the server and has a startup type of automatic.
Logs for this service can be found (by default) at:
C:\Program Files\Microsoft SQL Server\\LOG\SQLAGENT.x
This log includes information on when the service was stopped/started
If the SQL Agent is started, you can then check what jobs exist on the SQL Server database by issuing the following command as SA through SQL Query Analyzer (2000) or through Microsoft SQL Server Management Studio:
select * from msdb.dbo.sysjobs
If the Primavera background processes (SYMON and DAMON) are not listed, or the SQL Agent was not started, then these background processes can be reinitialized by running the following commands as SA user against the Project Management database:
exec initialize_background_procs
exec system_monitor
exec data_monitor
A bit late coming to this, but thought the following may be useful to some.
We noticed REFRDEL had grown to a large size and after some investigation discovered the following ...
DAMON runs the following procedures to perform clean-up:
BGPLOG_CLEANUP
REFRDEL_CLEANUP
REFRDEL Bypass
CLEANUP_PRMQUEUE
USESSION_CLEAR_LOGICAL_DELETES
CLEANUP_LOGICAL_DELETES
PRMAUDIT_CLEANUP
CLEANUP_USESSAUD
USER_DEFINED_BACKGROUND
DAMON was configured to run every Saturday around 4pm but we noticed that it had been continuously failing. This was due to an offline backup process which started at 10pm. We first assumed that this was preventing the REFRDEL_CLEANUP from running.
However, after monitoring REFRDEL for a couple of weeks, we found that REFRDEL_CLEANUP was actually running and removing data from the table. You can check your table by running the following query in week 1 and then again in week 2 to verify that the oldest records are being deleted.
select min(delete_date), max(delete_date), count(*) from admuser.refrdel;
The problem is to do with the default parameters used by the REFRDEL_CLEANUP procedure. These are described here, but in summary the procedure is set to retain the 5 most recent days' worth of records and delete just 1 day's worth. This is what's causing the issue: DAMON runs just once a week, and when it runs the cleanup job it deletes only 1 day's worth of data but has accumulated a week's worth, so the amount of data just gets bigger and bigger.
The default parameters can be overridden in the SETTINGS table.
Here are the steps I took to correct the issue:
First, clean up the table:
-- 1. create backup table
CREATE TABLE ADMUSER.REFRDEL_BACKUP TABLESPACE PMDB_DAT1 NOLOGGING AS
Select * from admuser.refrdel where delete_date >= (sysdate - 5);
-- CHECK DATA HAS BEEN COPIED
-- 2. disable indexes on REFRDEL
alter index NDX_REFRDEL_DELETE_DATE unusable;
alter index NDX_REFRDEL_TABLE_PK unusable;
-- 3. truncate REFRDEL table
truncate table admuser.refrdel;
-- 4. restore backed up data
ALTER TABLE ADMUSER.REFRDEL NOLOGGING;
insert /*+ append */ into admuser.refrdel select * from admuser.refrdel_backup;
--verify number of rows copied
ALTER TABLE ADMUSER.REFRDEL LOGGING;
commit;
-- 5. rebuild indexes on REFRDEL
alter index NDX_REFRDEL_DELETE_DATE rebuild;
alter index NDX_REFRDEL_TABLE_PK rebuild;
-- 6. gather table stats
exec dbms_stats.gather_table_stats(ownname => 'ADMUSER', tabname => 'REFRDEL', cascade => TRUE);
-- 7. drop backup table
drop table admuser.refrdel_backup purge;
Next, override the parameters so we try to delete at least 10 days' worth of data. The retention period will always keep 5 days' worth of data.
exec settings_write_string('10','database.cleanup.Refrdel','DaysToDelete'); -- delete the oldest 10 days of data
exec settings_write_string('15','database.cleanup.Refrdel','IntervalStep'); -- commit after deleting every 15 minutes of data
exec settings_write_string('5d','database.cleanup.Refrdel','KeepInterval'); -- only keep the 5 most recent days of data
This final step is only relevant to my environment and will not apply to you unless you have similar issues. This is to alter the start time for DAMON to allow it complete before our offline backup process kicks in. So in this instance I have changed the start time from 4pm to midnight.
BEGIN
DBMS_SCHEDULER.SET_ATTRIBUTE (
name => 'BGJOBUSER.DAMON',
attribute => 'start_date',
value => TO_TIMESTAMP_TZ('2016/08/13 00:00:00.000000 +00:00','yyyy/mm/dd hh24:mi:ss.ff tzr'));
END;
/
It is normal for UDFVALUE to hold a large number of records. Each value for any user-defined field attached to any object in P6 will be represented as a record in this table.
REFRDEL on the other hand should be automatically cleaned up during normal operation in a healthy system. In P6 8.x, they should be cleaned up by the data_monitor process, which by default is configured to run once a week (on Saturdays).
You should be able to execute it manually, but be forewarned: it could take a long time to complete if it hasn't executed properly since 2012.
36gb is still a very, very large database. For some clients a database of that magnitude might not be unreasonable, depending on the total number of activities and, especially, the kinds of data that are stored. For example, notepads take a comparatively large amount of space.
In your case though, since you already know data_monitor hasn't executed properly for a while, it's more likely that the tables are full of records that have been soft-deleted but haven't yet been purged. You can see such records by running a query such as:
select count(*) from task where delete_session_id is not null;
Note that you must select from the task table, not the view, as the view automatically filters these soft-deleted records out.
You shouldn't delete such records manually. They should be cleaned up, along with the records in REFRDEL, as a result of running data_monitor.

SQL 2008 All records in column in table updated to NULL

About 5 times a year one of our most critical tables has a specific column where all the values are replaced with NULL. We have run log explorers against this and we cannot see any login/hostname populated with the update, we can just see that the records were changed. We have searched all of our sprocs, functions, etc. for any update statement that touches this table on all databases on our server. The table does have a foreign key constraint on this column. It is an integer value that is established during an update, but the update is identity key specific. There is also an index on this field. Any suggestions on what could be causing this outside of a t-sql update statement?
I would start by denying any client-side dynamic SQL if at all possible. It is much easier to audit stored procedures to make sure they execute the correct SQL, including a proper WHERE clause. Unless your SQL Server is terribly broken, the only way data is updated is because of the SQL you are running against it.
All stored procs, scripts, etc. should be audited before being allowed to run.
If you don't have the mojo to enforce no dynamic client SQL, add application logging that captures each client SQL statement before it is executed. Personally, I would have the logging routine throw an exception (after logging it) when a WHERE clause is missing, but at a minimum you should be able to figure out where data gets blown out next time by reviewing the log. Make sure your log captures enough information that you can trace it back to the exact source. Assign a unique "name" to each possible dynamic SQL statement executed, e.g., assign a 3-character code to each program and then number each possible call 1..nn within the program, so you can tell which call blew up your data at "abc123" as well as the exact SQL that was defective.
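On the database side, a simple way to support that logging is a table plus a small procedure that the application calls before executing any dynamic SQL; all names below are made up for illustration:
-- Hypothetical log of client-issued SQL, keyed by the per-call code described above.
CREATE TABLE dbo.ClientSqlLog
(
    LogId     int IDENTITY(1,1) PRIMARY KEY,
    CallName  char(6)        NOT NULL,   -- e.g. 'abc123' program/call code
    SqlText   nvarchar(max)  NOT NULL,
    HostName  sysname        NOT NULL DEFAULT HOST_NAME(),
    LoginName sysname        NOT NULL DEFAULT SUSER_SNAME(),
    LoggedAt  datetime       NOT NULL DEFAULT GETDATE()
);
GO
CREATE PROCEDURE dbo.usp_LogClientSql
    @CallName char(6),
    @SqlText  nvarchar(max)
AS
BEGIN
    SET NOCOUNT ON;
    INSERT INTO dbo.ClientSqlLog (CallName, SqlText) VALUES (@CallName, @SqlText);
END;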
ADDED COMMENT
Thought of this later. You might be able to add or modify the update trigger on the table to look at the number of rows updated and prevent the update if that number exceeds a threshold that makes sense for you. I did a little searching and found that someone has already written an article on this, as in this snippet:
CREATE TRIGGER [Purchasing].[uPreventWholeUpdate]
ON [Purchasing].[VendorContact]
FOR UPDATE AS
BEGIN
DECLARE @Count int
SET @Count = @@ROWCOUNT;
IF @Count >= (SELECT SUM(row_count)
FROM sys.dm_db_partition_stats
WHERE OBJECT_ID = OBJECT_ID('Purchasing.VendorContact' )
AND index_id = 1)
BEGIN
RAISERROR('Cannot update all rows',16,1)
ROLLBACK TRANSACTION
RETURN;
END
END
Though this is not really the right fix, if you log this appropriately, I bet you can figure out what tried to screw up your data and fix it.
Best of luck
A transaction log explorer should be able to show you who executed the command, when, and what exactly the command looked like.
Which log explorer do you use? If you are using ApexSQL Log you need to enable connection monitor feature in order to capture additional login details.
This might be like using a sledgehammer to drive in a thumb tack, but have you considered using SQL Server Auditing (provided you are using SQL Server Enterprise 2008 or greater)?
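For reference, a minimal sketch of an audit that captures UPDATEs against the suspect table; the audit names, file path and table name are placeholders, not from your environment:
USE master;
GO
CREATE SERVER AUDIT Audit_CriticalTable
    TO FILE (FILEPATH = 'D:\Audits\');
GO
ALTER SERVER AUDIT Audit_CriticalTable WITH (STATE = ON);
GO
USE YourDatabase;
GO
-- Capture every UPDATE against the table, by any principal.
CREATE DATABASE AUDIT SPECIFICATION AuditSpec_CriticalTable
    FOR SERVER AUDIT Audit_CriticalTable
    ADD (UPDATE ON dbo.YourCriticalTable BY public)
    WITH (STATE = ON);
GO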

Is deleting all records in a table a bad practice in SQL Server?

I am moving a system from a VB/Access app to SQL server. One common thing in the access database is the use of tables to hold data that is being calculated and then using that data for a report.
eg.
delete from treporttable
insert into treporttable (.... this thing and that thing)
Update treporttable set x = x * price where (...etc)
and then report runs from treporttable
I have heard that SQL Server does not like it when all records are deleted from a table, as it creates huge logs etc. I tried temp SQL tables but they don't persist long enough for the report, which runs in a different process, to report off of.
There are a number of places where this is done to different report tables in the application. The reports can be run many times a day and have a large number of records created in the report tables.
Can anyone tell me if there is a best practise for this or if my information about the logs is incorrect and this code will be fine in SQL server.
If you do not need to log the deletion activity you can use the truncate table command.
From books online:
TRUNCATE TABLE is functionally identical to a DELETE statement with no WHERE clause: both remove all rows in the table. But TRUNCATE TABLE is faster and uses fewer system and transaction log resources than DELETE.
http://msdn.microsoft.com/en-us/library/aa260621(SQL.80).aspx
delete from sometable
Is going to allow you to roll back the change. So if your table is very large, this can cause a lot of memory usage and take a lot of time.
However, if you have no fear of failure then:
truncate sometable
Will perform nearly instantly, and with minimal memory requirements. There is no rollback though.
To Nathan Feger:
You can rollback from TRUNCATE. See for yourself:
CREATE TABLE dbo.Test(i INT);
GO
INSERT dbo.Test(i) SELECT 1;
GO
BEGIN TRAN
TRUNCATE TABLE dbo.Test;
SELECT i FROM dbo.Test;
ROLLBACK
GO
SELECT i FROM dbo.Test;
GO
i
(0 row(s) affected)
i
1
(1 row(s) affected)
You could also DROP the table, and recreate it...if there are no relationships.
The [DROP table] statement is transactionally safe whereas [TRUNCATE] is not.
So it depends on your schema which direction you want to go!!
Also, use SQL Profiler to analyze your execution times. Test it out and see which is best!!
The answer depends on the recovery model of your database. If you are in full recovery mode, then you have transaction logs that could become very large when you delete a lot of data. However, if you're backing up transaction logs on a regular basis to free the space, this might not be a concern for you.
Generally speaking, if the transaction logging doesn't matter to you at all, you should TRUNCATE the table instead. Be mindful, though, of any key seeds, because TRUNCATE will reseed the table.
EDIT: Note that even if the recovery model is set to Simple, your transaction logs will grow during a mass delete. The transaction logs will just be cleared afterward (without releasing the space). The point is that DELETE still writes every row to the transaction log, even if only temporarily.
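A quick way to see the reseed behaviour mentioned above, using a throwaway table:
CREATE TABLE dbo.SeedTest (id int IDENTITY(1,1), val int);
INSERT dbo.SeedTest (val) VALUES (1), (2), (3);

DELETE FROM dbo.SeedTest;                -- DELETE does not reset the identity seed
INSERT dbo.SeedTest (val) VALUES (4);
SELECT id FROM dbo.SeedTest;             -- returns 4

TRUNCATE TABLE dbo.SeedTest;             -- TRUNCATE reseeds back to 1
INSERT dbo.SeedTest (val) VALUES (5);
SELECT id FROM dbo.SeedTest;             -- returns 1

DROP TABLE dbo.SeedTest;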
Consider using temporary tables. Their names start with # and they are deleted when nobody refers to them. Example:
create table #myreport (
    id int identity(1,1),
    col1 int
    -- ...other columns as needed
);
Temporary tables are made to be thrown away, and that happens very efficiently.
Another option is using TRUNCATE TABLE instead of DELETE. The truncate will not grow the log file.
I think your example has a possible concurrency issue. What if multiple processes are using the table at the same time? Adding a JOB_ID column or something like that would allow you to clear the relevant entries in this table without clobbering the data being used by another process.
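A minimal sketch of that idea, with a made-up version of the report table:
CREATE TABLE dbo.treporttable_jobs
(
    job_id uniqueidentifier NOT NULL,
    x      money            NOT NULL
);
GO
DECLARE @job uniqueidentifier = NEWID();

-- each run tags its own rows
INSERT INTO dbo.treporttable_jobs (job_id, x) VALUES (@job, 10), (@job, 20);
UPDATE dbo.treporttable_jobs SET x = x * 1.2 WHERE job_id = @job;

-- the report selects WHERE job_id = @job; afterwards only this run's rows are removed
DELETE FROM dbo.treporttable_jobs WHERE job_id = @job;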
Actually, tables such as treporttable do not need to be recovered to a point in time. As such, they can live in a separate database with the simple recovery model. That eases the burden of logging.
There are a number of ways to handle this. First, you can move the creation of the data into the running of the report itself. This, I feel, is the best way to handle it; you can then use temp tables to temporarily stage your data, and no one will have concurrency issues if multiple people try to run the report at the same time. Depending on how many reports we are talking about, it could take some time to do this, so you may need another short-term solution as well.
Second, you could move all your reporting tables to a different db that is set to simple recovery mode and truncate them before running your queries to populate them. This is closest to your current process, but could be an issue if multiple users try to run the same report at the same time.
Third, you could set up a job to populate the tables (still in a separate db set to simple recovery) once a day (truncating at that time). Then anyone running a report that day will see the same data and there will be no concurrency issues. However, the data will not be up-to-the-minute. You could also set up a reporting data warehouse, but that is probably overkill in your case.
