Distribution Cleanup in Transactional Replication - distributed-transactions

I have set up transactional replication (pull subscriptions) between SQL Servers.
However, my Distribution clean up job is not removing any data from the MSrepl_commands and MSrepl_transactions tables.
I have set immediate_sync and allow_anonymous to 0.
Distribution Job Detail:
Query:
EXEC dbo.sp_MSdistribution_cleanup @min_distretention = 0, @max_distretention = 72
JOB result:
Executed as user: NT SERVICE\SQLSERVERAGENT. Removed 0 replicated transactions consisting of 0 statements in 0 seconds (0 rows/sec). [SQLSTATE 01000] (Message 21010). The step succeeded.
Note: when I set immediate_sync to 1 and tried again, it worked. But why not with 0? On another server I have it set to 0 and it's working.
Please help me.

Strange - the expected behaviour is that if immediate_sync is "true", the distribution database holds transaction data for the full maximum retention period, so that the current and any new subscribers can get the baseline snapshot plus the transactions necessary to "catch up". You'd expect the distribution database to hold data for the max retention period (72 hours in your case).
If it's set to "false", any new subscriber will need a new snapshot, but distributed commands are cleared from the distribution database by the cleanup job once they have been delivered to all existing subscribers.
Double-check that all your subscribers are receiving transactions, and do you have anonymous subscriptions enabled?
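If it helps, here is a minimal sketch of those checks, assuming the first query runs in the publication database and the second on the distributor:
-- Publication sync options (run in the publication database)
SELECT name, immediate_sync, allow_anonymous
FROM dbo.syspublications;
-- Undelivered vs. delivered commands per distribution agent (run on the distributor)
USE distribution;
SELECT agent_id,
       SUM(UndelivCmdsInDistDB) AS undelivered_commands,
       SUM(DelivCmdsInDistDB)   AS delivered_commands
FROM dbo.MSdistribution_status
GROUP BY agent_id;
If the undelivered count keeps growing for an agent, the cleanup job cannot purge those commands yet.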


Release unused memory - SQL Server 2014 memory-optimized table

My system
Microsoft SQL Server 2014 (SP1-CU4) (KB3106660) - 12.0.4436.0 (X64)
Dec 2 2015 16:09:44 Copyright (c) Microsoft Corporation Enterprise
Edition: Core-based Licensing (64-bit) on Windows NT 6.3 (Build
9600: ) (Hypervisor)
I use two memory-optimized tables, table1 and table2 (each about 27 GB in size).
I drop table1:
IF OBJECT_ID('table1') IS NOT NULL
BEGIN
DROP TABLE [dbo].[table1]
END
Afterwards, the SQL Server "Memory Usage By Memory Optimized Objects" report shows:
Table Name = table2, Table Used Memory = 26582.50 MB, Table Unused Memory = 26792.69 MB
How can I run the SQL Server garbage collector manually? Is this possible or not?
I need the "Table Unused Memory" to be released, because another process keeps failing with this error:
"There is insufficient system memory in resource pool 'Pool' to run this query."
Thank you
Data for memory optimized tables is held in data & delta files.
A delete statement will not remove the data from the data file but insert a delete record into the delta file, hence your storage continuing to be large.
The data & delta files are maintained in pairs known as checkpoint file pairs (CFPs). Over time, closed CFPs are merged, based upon a merge policy, from multiple CFPs into one merged target CFP.
A background thread evaluates all closed CFPs using a merge policy and then initiates one or more merge requests for the qualifying CFPs. These merge requests are processed by the offline checkpoint thread. The evaluation of merge policy is done periodically and also when a checkpoint is closed.
You can force merge the files using stored procedure sys.sp_xtp_merge_checkpoint_files following a checkpoint.
EDIT
Run statement:
SELECT
container_id,
internal_storage_slot,
file_type_desc,
state_desc,
inserted_row_count,
deleted_row_count,
lower_bound_tsn,
upper_bound_tsn
FROM
sys.dm_db_xtp_checkpoint_files
ORDER BY
file_type_desc,
state_desc
Then find the rows with status UNDER CONSTRUCTION and make a note of the lower and upper transaction IDs.
Now execute:
EXEC sys.sp_xtp_merge_checkpoint_files 'myDB', 1003, 1004;
where 1003 and 1004 are the lower and upper transaction IDs.
To completely remove the files, you will have to:
1. Run the SELECT statement from above
2. Run EXEC sys.sp_xtp_merge_checkpoint_files from above
3. Perform a full backup
4. CHECKPOINT
5. Back up the log
6. EXEC sp_xtp_checkpoint_force_garbage_collection;
7. CHECKPOINT
8. EXEC sp_filestream_force_garbage_collection 'MyDb' to remove files marked as TOMBSTONE
You may need to run steps 3 - 7 twice to completely get rid of the files (a combined sketch of steps 3-8 follows below).
See The DBA who came to tea article
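For convenience, here is a minimal sketch stringing steps 3-8 together; the database name myDB and the backup paths are placeholders, so adjust them for your environment:
BACKUP DATABASE myDB TO DISK = N'D:\Backups\myDB_full.bak';    -- 3. full backup (placeholder path)
CHECKPOINT;                                                    -- 4. checkpoint
BACKUP LOG myDB TO DISK = N'D:\Backups\myDB_log.trn';          -- 5. log backup (placeholder path)
EXEC sys.sp_xtp_checkpoint_force_garbage_collection;           -- 6. force checkpoint-file garbage collection
CHECKPOINT;                                                    -- 7. checkpoint again
EXEC sp_filestream_force_garbage_collection @dbname = N'myDB'; -- 8. remove files marked TOMBSTONE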
CFPs go through the following stages:
•PRECREATED – A small set of CFPs are kept pre-allocated to minimize or eliminate any waits to allocate new files as transactions are being executed. These are full sized, with a data file size of 128 MB and a delta file size of 8 MB, but contain no data. The number of CFPs is computed as the number of logical processors or schedulers, with a minimum of 8. This is a fixed storage overhead in databases with memory-optimized tables.
•UNDER CONSTRUCTION – Set of CFPs that store newly inserted and possibly deleted data rows since the last checkpoint.
•ACTIVE – These contain the inserted/deleted rows from previous closed checkpoints. These CFPs contain all the inserted/deleted rows required before applying the active part of the transaction log at database restart. We expect the size of these CFPs to be approximately 2x the in-memory size of the memory-optimized tables, assuming the merge operation is keeping up with the transactional workload.
•MERGE TARGET – CFP stores the consolidated data rows from the CFP(s) that were identified by the merge policy. Once the merge is installed, the MERGE TARGET transitions into the ACTIVE state.
•MERGED SOURCE – Once the merge operation is installed, the source CFPs are marked as MERGED SOURCE. Note that the merge policy evaluator may identify multiple merges, but a CFP can only participate in one merge operation.
•REQUIRED FOR BACKUP/HA – Once the merge has been installed and the MERGE TARGET CFP is part of a durable checkpoint, the merge source CFPs transition into this state. CFPs in this state are needed for operational correctness of a database with memory-optimized tables, for example, to recover from a durable checkpoint to go back in time. A CFP can be marked for garbage collection once the log truncation point moves beyond its transaction range.
•IN TRANSITION TO TOMBSTONE – These CFPs are not needed by the In-Memory OLTP engine and can be garbage collected. This state indicates that these CFPs are waiting for the background thread to transition them to the next state, TOMBSTONE.
•TOMBSTONE – These CFPs are waiting to be garbage collected by the filestream garbage collector. Please refer to FS Garbage Collection for details.
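If it's useful, here is a small sketch against the same SQL Server 2014 DMV used above that summarises how much checkpoint file storage sits in each state:
SELECT state_desc,
       file_type_desc,
       COUNT(*) AS file_count,
       SUM(file_size_in_bytes) / 1024 / 1024 AS size_mb
FROM sys.dm_db_xtp_checkpoint_files
GROUP BY state_desc, file_type_desc
ORDER BY state_desc, file_type_desc;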

Primavera P6 database has grown to a very large size

I'm not a P6 admin, nor am I a (SQL Server) DBA. I'm just a Winforms developer (with T-SQL) who has agreed to do a little research for the scheduling group.
I believe the version they're running is 8.2, desktop (non-Citrix). The backend is SQL Server. It has grown to 36 GB, and nightly backups are periodically filling drives to their limits.
REFRDEL holds 135 million records, dating back to some time in 2012.
UDFVALUE holds 26 million records
All other tables have reasonable numbers of records.
Can someone clue us in as to which of the several cleanup-oriented stored procedures to run (if any), or offer some sane advice so that we can get the backend down to a manageable size, please? Something that would not violate best practices and is considered very safe, please.
When you look at the data in the database, there is a column named "delete_session_id". Do you see any rows with the value -99? If so, there is an unresolved issue on this. If not, then proceed with the following to get the clean-up jobs running again...
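First, a minimal sketch of that -99 check, assuming the P6 schema owner is admuser (as used later in this thread) and using the TASK table as an example; repeat it for other large tables:
SELECT COUNT(*) AS unresolved_deletes
FROM admuser.task
WHERE delete_session_id = -99;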
If you are using SQL Server (Full Editions), perform the following steps to resolve the issue:
Verify that the SQL Server Agent service is started on the server and has a startup type of automatic.
Logs for this service can be found (by default) at:
C:\Program Files\Microsoft SQL Server\\LOG\SQLAGENT.x
This log includes information on when the service was stopped/started
If the SQL Agent is started, you can then check what jobs exist on the SQL Server database by issuing the following command as SA through SQL Query Analyzer (2000) or through Microsoft SQL Server Management Studio:
select * from msdb.dbo.sysjobs
If the Primavera background processes (SYMON and DAMON) are not listed, or the SQL Agent was not started, then these background processes can be reinitialized by running the following commands as SA user against the Project Management database:
exec initialize_background_procs
exec system_monitor
exec data_monitor
A bit late coming to this, but thought the following may be useful to some.
We noticed REFRDEL had grown to a large size and after some investigation discovered the following ...
DAMON runs the following procedures to perform clean-up:
BGPLOG_CLEANUP
REFRDEL_CLEANUP
REFRDEL Bypass
CLEANUP_PRMQUEUE
USESSION_CLEAR_LOGICAL_DELETES
CLEANUP_LOGICAL_DELETES
PRMAUDIT_CLEANUP
CLEANUP_USESSAUD
USER_DEFINED_BACKGROUND
DAMON was configured to run every Saturday around 4pm but we noticed that it had been continuously failing. This was due to an offline backup process which started at 10pm. We first assumed that this was preventing the REFRDEL_CLEANUP from running.
However after monitoring REFRDEL for a couple of weeks, we found that REFRDEL_CLEANUP was actually running and removing data from the table. You can check your table by running the following query on week 1 and then again in week 2 to verify the oldest records are being deleted.
select min(delete_date), max(delete_date), count(*) from admuser.refrdel;
The problem is to do with the default parameters used by the REFRDEL_CLEANUP procedure. These are described here, but in summary the procedure is set to retain the 5 most recent days' worth of records and to delete just 1 day's worth. This is what's causing the issue: DAMON runs just once a week, and when it runs the cleanup job it deletes only 1 day's worth of data even though a week's worth has accumulated, so the amount of data just gets bigger and bigger.
The default parameters can be overridden in the SETTINGS table.
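As a sanity check before changing anything, a sketch like the following should show any current overrides. I'm assuming the SETTINGS table lives in the admuser schema with NAMESPACE / SETTING_NAME / SETTING_VALUE columns; adjust if your schema differs:
SELECT namespace, setting_name, setting_value
FROM   admuser.settings
WHERE  namespace = 'database.cleanup.Refrdel';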
Here are the steps I took to correct the issue:
First, clean up the table..
-- 1. create backup table
CREATE TABLE ADMUSER.REFRDEL_BACKUP TABLESPACE PMDB_DAT1 NOLOGGING AS
Select * from admuser.refrdel where delete_date >= (sysdate - 5);
-- CHECK DATA HAS BEEN COPIED
-- 2. disable indexes on REFRDEL
alter index NDX_REFRDEL_DELETE_DATE unusable;
alter index NDX_REFRDEL_TABLE_PK unusable;
-- 3. truncate REFRDEL table
truncate table admuser.refrdel;
-- 4. restore backed up data
ALTER TABLE ADMUSER.REFRDEL NOLOGGING;
insert /*+ append */ into admuser.refrdel select * from admuser.refrdel_backup;
--verify number of rows copied
ALTER TABLE ADMUSER.REFRDEL LOGGING;
commit;
-- 5. rebuild indexes on REFRDEL
alter index NDX_REFRDEL_DELETE_DATE rebuild;
alter index NDX_REFRDEL_TABLE_PK rebuild;
-- 6. gather table stats
exec dbms_stats.gather_table_stats(ownname => 'ADMUSER', tabname => 'REFRDEL', cascade => TRUE);
-- 7. drop backup table
drop table admuser.refrdel_backup purge;
Next, override the parameters so we try to delete at least 10 days' worth of data. The retention period will always keep 5 days' worth of data.
exec settings_write_string('10','database.cleanup.Refrdel','DaysToDelete');  -- delete the oldest 10 days of data
exec settings_write_string('15','database.cleanup.Refrdel','IntervalStep');  -- commit after deleting every 15 minutes of data
exec settings_write_string('5d','database.cleanup.Refrdel','KeepInterval');  -- only keep 5 most recent days of data
This final step is only relevant to my environment and will not apply to you unless you have similar issues. It alters the start time for DAMON to allow it to complete before our offline backup process kicks in. So in this instance I have changed the start time from 4pm to midnight.
BEGIN
DBMS_SCHEDULER.SET_ATTRIBUTE (
name => 'BGJOBUSER.DAMON',
attribute => 'start_date',
value => TO_TIMESTAMP_TZ('2016/08/13 00:00:00.000000 +00:00','yyyy/mm/dd hh24:mi:ss.ff tzr'));
END;
/
It is normal for UDFVALUE to hold a large number of records. Each value for any user-defined field attached to any object in P6 will be represented as a record in this table.
REFRDEL on the other hand should be automatically cleaned up during normal operation in a healthy system. In P6 8.x, they should be cleaned up by the data_monitor process, which by default is configured to run once a week (on Saturdays).
You should be able to execute it manually, but be forewarned: it could take a long time to complete if it hasn't executed properly since 2012.
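For reference, running it manually is the same call shown earlier in this thread:
exec data_monitor;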
36 GB is still a very, very large database. For some clients a database of that magnitude might not be unreasonable, depending on the total number of activities and, especially, the kinds of data that are stored. For example, notepads take up a comparatively large amount of space.
In your case though, since you already know data_monitor hasn't executed properly for a while, it's more likely that the tables are full of records that have been soft-deleted but haven't yet been purged. You can see such records by running a query such as:
select count(*) from task where delete_session_id is not null;
Note that you must select from the task table, not the view, as the view automatically filters these soft-deleted records out.
You shouldn't delete such records manually. They should be cleaned up, along with the records in REFRDEL, as a result of running data_monitor.

DB2 Logfile Limitation, SQLCODE: -964

I have tried a huge insert query in DB2.
INSERT INTO MY_TABLE_COPY ( SELECT * FROM MY_TABLE);
Before that, I set the following:
UPDATE DATABASE CONFIGURATION FOR MY_DB USING LOGFILSIZ 70000;
UPDATE DATABASE CONFIGURATION FOR MY_DB USING LOGPRIMARY 50;
UPDATE DATABASE CONFIGURATION FOR MY_DB USING LOGSECOND 2;
db2stop force;
db2start;
and I got this error:
DB21034E The command was processed as an SQL statement because it was not a
valid Command Line Processor command. During SQL processing it returned:
SQL0964C The transaction log for the database is full. SQLSTATE=57011
SQL0964C The transaction log for the database is full.
Explanation:
All space in the transaction log is being used.
If a circular log with secondary log files is being used, an
attempt has been made to allocate and use them. When the file
system has no more space, secondary logs cannot be used.
If an archive log is used, then the file system has not provided
space to contain a new log file.
The statement cannot be processed.
User Response:
Execute a COMMIT or ROLLBACK on receipt of this message (SQLCODE)
or retry the operation.
If the database is being updated by concurrent applications,
retry the operation. Log space may be freed up when another
application finishes a transaction.
Issue more frequent commit operations. If your transactions are
not committed, log space may be freed up when the transactions
are committed. When designing an application, consider when to
commit the update transactions to prevent a log full condition.
If deadlocks are occurring, check for them more frequently.
This can be done by decreasing the database configuration
parameter DLCHKTIME. This will cause deadlocks to be detected
and resolved sooner (by ROLLBACK) which will then free log
space.
If the condition occurs often, increase the database
configuration parameter to allow a larger log file. A larger log
file requires more space but reduces the need for applications to
retry the operation.
If installing the sample database, drop it and install the
sample database again.
sqlcode : -964
sqlstate : 57011
any suggestions?
I used the maximum values for LOGFILSIZ, LOGPRIMARY, and LOGSECOND.
The max value for LOGFILSIZ may be different for Windows, Linux, etc., but you can try a very big number and the DB will tell you what the max is. In my case it was 262144.
Also, LOGPRIMARY + LOGSECOND <= 256. I tried 128 for each and it worked for my huge query.
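For reference, a sketch of those settings using the same commands as in the question (these are the values that worked here; your platform's maximums may differ):
UPDATE DATABASE CONFIGURATION FOR MY_DB USING LOGFILSIZ 262144;
UPDATE DATABASE CONFIGURATION FOR MY_DB USING LOGPRIMARY 128;
UPDATE DATABASE CONFIGURATION FOR MY_DB USING LOGSECOND 128;
db2stop force;
db2start;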
Instead of performing trial and error with the DB CFG parameters, you can put these INSERT statements in a stored procedure that commits at intervals.
Refer to the following post for details; it might help:
https://prasadspande.wordpress.com/2014/06/06/plsql-ways-updatingdeleting-the-bulk-data-from-the-table/
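A minimal sketch of that idea in DB2 SQL PL, assuming MY_TABLE has an integer key column MY_ID (a hypothetical name) that identifies rows already copied; run it from the CLP with an alternative statement terminator such as @:
CREATE OR REPLACE PROCEDURE COPY_MY_TABLE ()
LANGUAGE SQL
BEGIN
  DECLARE v_rows INTEGER DEFAULT 1;
  WHILE v_rows > 0 DO
    -- copy the next chunk of rows not yet present in the target table
    -- (MY_ID is a placeholder for your table's key column)
    INSERT INTO MY_TABLE_COPY
      SELECT * FROM (
        SELECT s.* FROM MY_TABLE s
        WHERE NOT EXISTS (SELECT 1 FROM MY_TABLE_COPY c WHERE c.MY_ID = s.MY_ID)
        FETCH FIRST 10000 ROWS ONLY
      ) AS src;
    GET DIAGNOSTICS v_rows = ROW_COUNT;
    COMMIT;   -- release log space after each chunk
  END WHILE;
END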
Thanks

How can the CDC retention value be changed for the cleanup job?

I'm implementing a logging feature in an ASP.NET MVC 2 application that uses SQL Server 2008 as its database and Entity Framework as its data model.
I enabled the CDC feature of SQL Server and it's logging changes well, but I just noticed that some of the old logging data has been erased.
Does anyone know what the default period is for which CDC keeps records, and how I could set it to an indefinite value?
I just discovered that the default retention value is 4320 minutes = 72 hours = 3 days.
It should be configurable by using:
EXEC sys.sp_cdc_change_job @job_type = 'cleanup', @retention = minutes
The maximum value is 52494800 (100 years). If specified, the value must be a positive integer. Retention is valid only for cleanup jobs.
Here's the link to a more detailed explanation of the sp_cdc_change_job procedure.
Hope this will help someone else, too :D.
If you want to retain the CDC data indefinitely, you can simply disable the CDC cleanup job:
Open SQL Server Management Studio and connect to your database server
In the object explorer, expand “<instance> | SQL Server Agent | Jobs”
Find the job for cleanup named “cdc.<database name>_cleanup”.
Right-click the job and select "disable"
After you have disabled the cleanup job, the CDC data will no longer be removed after a certain time interval.
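If you prefer T-SQL over the SSMS UI, a sketch like this should do the same thing; the job name assumes your database is called MyDatabase:
EXEC msdb.dbo.sp_update_job
     @job_name = N'cdc.MyDatabase_cleanup',
     @enabled  = 0;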
By default it deletes anything older than 3 days. To change the default value to 14 days, use the following script:
use <Your database>
go
SELECT ([retention])/((60*24)) AS Default_Retention_days ,*
FROM msdb.dbo.cdc_jobs
go
EXEC <Your database>.sys.sp_cdc_change_job
@job_type = N'Cleanup'
,@retention = 20160 -- <60 min * 24 hrs * 14 days>
go
SELECT ([retention])/((60*24)) AS Default_Retention_days ,*
FROM msdb.dbo.cdc_jobs
Go

SQL Replication "Row Not Found" Error

I have transactional replication running between two databases. I fear they have fallen slightly out of sync, but I don't know which records are affected. If I knew, I could fix it manually on the subscriber side.
SQL Server is giving me this message:
The row was not found at the Subscriber when applying the replicated command. (Source: MSSQLServer, Error number: 20598)
I've looked around to try to find out what table, or even better what record is causing the issue, but I can't find that information anywhere.
The most detailed data I've found so far is:
Transaction sequence number: 0x0003BB0E000001DF000600000000, Command ID: 1
But how do I find the table and row from that? Any ideas?
This gives you the table the error is against
use distribution
go
select * from dbo.MSarticles
where article_id in (
select article_id from MSrepl_commands
where xact_seqno = 0x0003BB0E000001DF000600000000)
And this will give you the command (and the primary key (ie the row) the command was executing against)
exec sp_browsereplcmds
@xact_seqno_start = '0x0003BB0E000001DF000600000000',
@xact_seqno_end = '0x0003BB0E000001DF000600000000'
I'll answer my own question with a workaround I ended up using.
Unfortunately, I could not figure out which table was causing the issue through the SQL Server replication interface (or the Event Log for that matter). It just didn't say.
So the next thing I thought of was, "What if I could get replication to continue even though there is an error?" And lo and behold, there is a way. In fact, it's easy. There is a special Distribution Agent profile called "Continue on data consistency errors." If you enable that, then these types of errors will just be logged and passed on by. Once it is through applying the transactions and potentially logging the errors (I only encountered two), then you can go back and use RedGate SQL Data Compare (or some other tool) to compare your two databases, make any corrections to the subscriber and then start replication running again.
Keep in mind, for this to work, your publication database will need to be "quiet" during the part of the process where you diff and fix the subscriber database. Luckily, I had that luxury in this case.
If your database is not prohibitively large, I would stop replication, re-snapshot and then re-start replication. This technet article describes the steps.
If it got out of sync due to a user accidently changing data on the replica, I would set the necessary permissions to prevent this.
This replication article is worth reading.
Use this query to find out the article that is out of sync:
USE [distribution]
select * from dbo.MSarticles
where article_id IN (SELECT article_id from MSrepl_commands
where xact_seqno = 0x0003BB0E000001DF000600000000)
Of course, if you check the error when the replication fails, it also tells you which record is at fault, and you could extract that data from the core system and just insert it on the subscriber.
This is better than skipping errors, as SQL Data Compare will lock the table for the comparison, and if you have millions of rows this can take a long time to run.
Tris
Changing the profile to "Continue on data consistency errors" won't always work. Obviously it reduces or suppresses the error, but you won't get complete, correct data: the rows that raise errors are simply skipped, so the subscriber ends up inaccurate.
The following checks resolved my problem:
Check that all the replication SQL Server Agent jobs are working fine, and if they are not, start them.
In my case a job had been stopped because a DBA killed its session a few hours earlier due to a blocking issue.
After a very short time, all data in the subscription was updated and there were no other errors in Replication Monitor.
In my case, all the above queries returned nothing.
This error usually occurs when a particular record does not exist on the subscriber, and an update or delete command for that same record was executed on the primary server and replicated to the subscriber.
As the record does not exist on the subscriber, replication throws a "Row Not Found" error.
To get replication back to its normal running state:
We can check with the following query whether the request at the publisher was an update or a delete statement:
USE [distribution]
SELECT *
FROM msrepl_commands
WHERE publisher_database_id = 1
AND command_id = 1
AND xact_seqno = 0x00099979000038D6000100000000
We can get the article id from the above query, which can be passed to the proc below:
EXEC sp_browsereplcmds
@article_id = 813,
@command_id = 1,
@xact_seqno_start = '0x00099979000038D60001',
@xact_seqno_end = '0x00099979000038D60001',
@publisher_database_id = 1
The above query will tell you whether it was an update statement or a delete statement.
In case of a delete statement:
The record can be deleted directly from the msrepl_commands table so that replication won't retry the command:
DELETE FROM msrepl_commands
WHERE publisher_database_id = 1
AND command_id =1
AND xact_seqno = 0x00099979000038D6000100000000
In case of an update statement:
You need to insert that record manually from the publisher DB into the subscriber DB, for example as sketched below.
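A minimal sketch of that manual fix, assuming the affected table is dbo.MyTable with key column MyId (hypothetical names) and a linked server to the publisher called PUBLISHER:
-- Run on the subscriber; the table, columns, key value and linked server name are placeholders
INSERT INTO dbo.MyTable (MyId, Col1, Col2)
SELECT MyId, Col1, Col2
FROM   PUBLISHER.PublisherDB.dbo.MyTable
WHERE  MyId = 42;   -- the key reported by sp_browsereplcmds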
