Release unused memory from a SQL Server 2014 memory-optimized table - sql-server

My system
Microsoft SQL Server 2014 (SP1-CU4) (KB3106660) - 12.0.4436.0 (X64)
Dec 2 2015 16:09:44 Copyright (c) Microsoft Corporation Enterprise
Edition: Core-based Licensing (64-bit) on Windows NT 6.3 (Build
9600: ) (Hypervisor)
I use two memory-optimized tables, table1 and table2 (each about 27 GB in size).
I drop table1:
IF OBJECT_ID('table1') IS NOT NULL
BEGIN
DROP TABLE [dbo].[table1]
END
Afterwards, the SQL Server "Memory Usage By Memory Optimized Objects" report shows:
Table Name = table2, Table Used Memory = 26582.50, Table Unused Memory = 26792.69
How can I run the SQL Server garbage collector manually? Is this possible or not?
I need the "Table Unused Memory" to be released, because another process keeps failing with this error:
"There is insufficient system memory in resource pool 'Pool' to run this query."
Thank you

Data for memory-optimized tables is held in data & delta files.
A delete statement does not remove the data from the data file; it inserts a delete record into the delta file, which is why your storage remains large.
The data & delta files are maintained in pairs known as checkpoint file pairs (CFPs). Over time, closed CFPs are merged, based upon a merge policy, from multiple CFPs into one merged target CFP.
A background thread evaluates all closed CFPs using a merge policy and then initiates one or more merge requests for the qualifying CFPs. These merge requests are processed by the offline checkpoint thread. The evaluation of the merge policy is done periodically and also when a checkpoint is closed.
You can force-merge the files using the stored procedure sys.sp_xtp_merge_checkpoint_files following a checkpoint.
EDIT
Run statement:
SELECT
    container_id,
    internal_storage_slot,
    file_type_desc,
    state_desc,
    inserted_row_count,
    deleted_row_count,
    lower_bound_tsn,
    upper_bound_tsn
FROM
    sys.dm_db_xtp_checkpoint_files
ORDER BY
    file_type_desc,
    state_desc;
Then find the rows with status UNDER CONSTRUCTION and make a note of
the lower and upper transaction id.
Now execute:
EXEC sys.sp_xtp_merge_checkpoint_files 'myDB',1003,1004;
where 1003 and 1004 are the lower and upper transaction ids.
To completely remove the files, you will need to do the following (a consolidated script sketch follows the list):
Run Select statement from above
Run EXEC sys.sp_xtp_merge_checkpoint_files from above
Perform a Full Backup
CHECKPOINT
Backup the Log
EXEC sp_xtp_checkpoint_force_garbage_collection;
Checkpoint
Exec sp_filestream_force_garbage_collection 'MyDb' to remove files marked as Tombstone
You may need to run steps 3 - 7 twice to completely get rid of the files.
See The DBA who came to tea article
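Pulled together, steps 2 through 8 above might look like the sketch below. This is only a sketch: the database name MyDb and the backup paths are placeholders, and the 1003/1004 transaction ids must come from your own sys.dm_db_xtp_checkpoint_files output.
-- Hedged sketch of steps 2-8; 'MyDb' and the backup paths are placeholders.
USE MyDb;
EXEC sys.sp_xtp_merge_checkpoint_files 'MyDb', 1003, 1004;            -- step 2: request the merge
BACKUP DATABASE MyDb TO DISK = N'D:\Backup\MyDb_full.bak' WITH INIT;  -- step 3: full backup
CHECKPOINT;                                                           -- step 4
BACKUP LOG MyDb TO DISK = N'D:\Backup\MyDb_log.trn';                  -- step 5: log backup
EXEC sys.sp_xtp_checkpoint_force_garbage_collection;                  -- step 6: release merged-source CFPs
CHECKPOINT;                                                           -- step 7
EXEC sp_filestream_force_garbage_collection 'MyDb';                   -- step 8: remove TOMBSTONE files
As the note above says, steps 3-7 may need to be run twice before the files disappear.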
CFPs go through the following stages (a query for checking how your checkpoint files are distributed across these states follows the list):
•PRECREATED – A small set of CFPs is kept pre-allocated to minimize or eliminate any waits to allocate new files as transactions are being executed. These are full sized, with a data file size of 128 MB and a delta file size of 8 MB, but contain no data. The number of CFPs is computed as the number of logical processors or schedulers, with a minimum of 8. This is a fixed storage overhead in databases with memory-optimized tables.
•UNDER CONSTRUCTION – Set of CFPs that store newly inserted and possibly deleted data rows since the last checkpoint.
•ACTIVE - These contain the inserted/deleted rows from previous closed checkpoints. These CFPs contain all required inserted/deleted rows required before applying the active part of the transaction log at the database restart. We expect that size of these CFPs to be approximately 2x of the in-memory size of memory-optimized tables assuming merge operation is keeping up with the transactional workload.
•MERGE TARGET – CFP stores the consolidated data rows from the CFP(s) that were identified by the merge policy. Once the merge is installed, the MERGE TARGET transitions into ACTIVE state
•MERGED SOURCE – Once the merge operation is installed, the source CFPs are marked as MERGED SOURCE. Note that the merge policy evaluator may identify multiple merges, but a CFP can only participate in one merge operation.
•REQUIRED FOR BACKUP/HA – Once the merge has been installed and the MERGE TARGET CFP is part of durable checkpoint, the merge source CFPs transition into this state. CFPs in this state are needed for operational correctness of the database with memory-optimized table. For example, to recover from a durable checkpoint to go back in time. A CFP can be marked for garbage collection once the log truncation point moves beyond its transaction range.
•IN TRANSITION TO TOMBSTONE – These CFPs are not needed by the in-memory OLTP engine and can be garbage collected. This state indicates that these CFPs are waiting for the background thread to transition them to the next state, TOMBSTONE.
•TOMBSTONE – These CFPs are waiting to be garbage collected by the filestream garbage collector. Please refer to FS Garbage Collection for details
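To see how many checkpoint files are currently sitting in each of these states, and how much row activity they hold, a grouping query over the same DMV used above can help. This is just a sketch built from columns already shown in the earlier SELECT:
-- Count checkpoint files per state to see how far merge and garbage collection have progressed
SELECT
    state_desc,
    file_type_desc,
    COUNT(*)                AS file_count,
    SUM(inserted_row_count) AS total_inserted_rows,
    SUM(deleted_row_count)  AS total_deleted_rows
FROM sys.dm_db_xtp_checkpoint_files
GROUP BY state_desc, file_type_desc
ORDER BY state_desc, file_type_desc;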

Related

SQL Server performance issue with multiple threads

First I will describe the details of my scenario and then my question.
I am running an instance of SQL Server 2017 with the following properties (from SSMS):
product: SQL Server Express Edition (64-bit)
platform: NT x64
version: 10.0.2531.0
memory: 3980MB
processors: 4 (logical) 2 (physical)
collation: SQL_Latin_General_CP1_CI_AS
The instance is configured as follows:
recovery model is set to simple and I truncate the log file
via dbcc shrinkfile before I run my queries
Max server memory is set to 2147483647 and min to default
All connections to the instance are set to have an isolation level of read uncommitted
There are three tables whose total size is ~3.2GB upon which I perform selects and a single table, which is initialized to be empty, to which I perform inserts.
All the tables have clustered indices on their relevant columns.
I execute a single stored procedure which performs all the operations on the above tables, but the access pattern of the selects is random, i.e. each time the sp is called it accesses random rows in the big tables (all in all it accesses a total of 5 rows across all tables per invocation) and adds a new row to the small table, so there is no collision between the inserts.
I measure the execution time of the stored procedure via DATEDIFF inside the body of the stored procedure (a minimal timing sketch follows this list).
I run two optional scenarios:
a single thread from my app calls the sp N times and receives an average time of 12 sec
4 threads from my app concurrently call the sp N/4 times each and receive average times of: 0.8 sec, 3 sec, 5 sec, 90 sec
All threads in all scenarios perform all their queries successfully
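For reference, timing a procedure from inside its own body with DATEDIFF usually looks something like the sketch below; the procedure name and the body are hypothetical, not taken from the question:
-- Hedged sketch: measure elapsed milliseconds inside the stored procedure body.
-- dbo.MyWorkProc is a made-up name standing in for the real procedure.
CREATE PROCEDURE dbo.MyWorkProc
AS
BEGIN
    DECLARE @start datetime2(7) = SYSUTCDATETIME();

    -- ... the random selects against the big tables and the insert into the small table ...

    DECLARE @elapsed_ms int = DATEDIFF(millisecond, @start, SYSUTCDATETIME());
    SELECT @elapsed_ms AS elapsed_ms;  -- or insert the value into a logging table
END;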
My hypothesis & question(s):
It seems logical that the single thread outperforms threads that run concurrently, due to increased memory pollution caused by multiple threads accessing such big tables, but how come there are concurrent threads that outperform the single thread even though they are running with a more polluted memory? (I would expect the single thread to outperform ALL the concurrent threads.)
Is there a locking issue here although the isolation level is read uncommitted?

How do you reload incremental data using SQL Server CDC?

I haven't been able to find documentation/an explanation on how you would reload incremental data using Change Data Capture (CDC) in SQL Server 2014 with SSIS.
Basically, on a given day, your SSIS incremental processing fails and you need to start again. How do you stage the recently changed records again?
I suppose it depends on what you're doing with the data, eh? :) In the general case, though, you can break it down into three cases (a hedged T-SQL sketch follows the list):
Insert - check if the row is there. If it is, skip it. If not, insert it.
Delete - assuming that you don't reuse primary keys, just run the delete again. It will either find a row to delete or it won't, but the net result is that the row with that PK won't exist after the delete.
Update - kind of like the delete scenario. If you reprocess an update, it's not really a big deal (assuming that your CDC process is the only thing keeping things up to date at the destination and there's no danger of overwriting someone/something else's changes).
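As a hedged illustration of the three cases above (the table, key, and value names below are placeholders I made up, not anything from the question), reprocessing a single changed row might look like:
-- Sketch: idempotent re-application of one CDC row; in practice only the branch
-- matching the captured operation runs. dbo.DestTable and the variables are hypothetical.
DECLARE @ID int = 42, @SomeValue nvarchar(50) = N'example';

-- Insert: check if the row is there; if not, insert it
IF NOT EXISTS (SELECT 1 FROM dbo.DestTable WHERE ID = @ID)
    INSERT INTO dbo.DestTable (ID, SomeValue) VALUES (@ID, @SomeValue);

-- Delete: running it again simply affects zero rows the second time
DELETE FROM dbo.DestTable WHERE ID = @ID;

-- Update: reapplying the same values is harmless if CDC is the only writer at the destination
UPDATE dbo.DestTable SET SomeValue = @SomeValue WHERE ID = @ID;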
Assuming you are using the new CDC SSIS 2012 components, specifically the CDC Control Task at the beginning and end of the package: if the package fails for any reason before it runs the CDC Control Task at the end of the package, those LSNs (Log Sequence Numbers) will NOT be marked as processed, so you can just restart the SSIS package from the top after fixing the issue and it will reprocess those records again. You MUST use the CDC Control Task to make this work, though, or keep track of the LSNs yourself (before SSIS 2012 that was the only way to do it).
Matt Masson (Sr. Program Manager on MSFT SQL Server team) has a great post on this with a step-by-step walkthrough: CDC in SSIS for SQL Server 2012
Also, see Bradley Schacht's post: Understanding the CDC state Value
So I did figure out how to do this in SSIS.
I record the min and max LSN numbers every time my SSIS package runs, in a table in my data warehouse (a sketch of that kind of logging follows below).
If I want to reload a set of data from the CDC source to staging, in the SSIS package I need to use the CDC Control Task, set it to "Mark CDC Start", and in the text box labelled "SQL Server LSN to start...." put the LSN value I want to use as a starting point.
I haven't figured out how to set the end point, but I can go into my staging table and delete any data with an LSN value greater than my endpoint.
You can only do this for CDC changes that have not been 'cleaned up' - so only for data that has been changed within the last 3 days.
As a side point, I also bring across the lsn_time_mapping table to my data warehouse since I find this information historically useful and it gets 'cleaned up' every 4 days in the source database.
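A minimal sketch of that LSN bookkeeping is below. The log table name and the capture-instance name ('dbo_MyTable') are placeholders; sys.fn_cdc_get_min_lsn and sys.fn_cdc_get_max_lsn are the standard CDC helper functions:
-- Hedged sketch: record the available LSN range at each package run.
CREATE TABLE dbo.CdcRunLog (
    run_id  int identity(1,1) primary key,
    run_utc datetime2(7) not null default sysutcdatetime(),
    min_lsn binary(10) not null,
    max_lsn binary(10) not null
);

INSERT INTO dbo.CdcRunLog (min_lsn, max_lsn)
SELECT
    sys.fn_cdc_get_min_lsn('dbo_MyTable'),  -- oldest LSN still retained for this capture instance
    sys.fn_cdc_get_max_lsn();               -- newest LSN currently available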
To reload the same changes you can use the following methods.
Method #1: Store the TFEND marker from the [cdc_states] table in another table or variable. Load the marker back into [cdc_states] from the "saved" value to process the same range again. This method lets you start processing from the same LSN, but if in the meantime your change table received more changes, those changes will be captured as well. So you can potentially get more changes than the ones you captured the first time.
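A sketch of that save-and-restore approach might look like the following. It assumes the state table created by the CDC Control Task has its default name [dbo].[cdc_states] with [name] and [state] columns; check your own package configuration before using it.
-- Hedged sketch: keep a copy of the CDC state string so the same range can be re-run later.
SELECT name, state
INTO dbo.cdc_states_backup       -- one-off saved copy
FROM dbo.cdc_states;

-- ... later, to reprocess the same range, put the saved marker back ...
UPDATE s
SET s.state = b.state
FROM dbo.cdc_states AS s
JOIN dbo.cdc_states_backup AS b ON b.name = s.name;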
Method #2: In order to capture the specified range, record the TFEND markers before and after the range is processed. Now you can use an OLE DB Source (SSIS) with the following CDC functions, then use the CDC Splitter as usual to direct Inserts, Updates, and Deletes:
DECLARE @start_lsn binary(10);
DECLARE @end_lsn binary(10);
SET @start_lsn = 0x004EE38E921A01000001; -- TFEND (1) -- if NULL, use sys.fn_cdc_get_min_lsn('YourCapture') to start from the beginning of the _CT table
SET @end_lsn   = 0x004EE38EE3BB01000001; -- TFEND (2)
SELECT * FROM [cdc].[fn_cdc_get_net_changes_YOURTABLECAPTURE](
    @start_lsn
    ,@end_lsn
    ,N'all' -- { all | all with mask | all with merge }
    --,N'all with mask'  -- shows values in the "__$update_mask" column
    --,N'all with merge' -- merges inserts and updates together; meant for processing the results with a T-SQL MERGE statement
)
ORDER BY __$start_lsn;

Primavera P6 database has grown to a very large size

I'm not a P6 admin, nor am I a (SQL Server) DBA. I'm just a Winforms developer (with T-SQL) who has agreed to do a little research for the scheduling group.
I believe the version they're running is 8.2, desktop (non-Citrix). Backend is SQL Server. The backend has grown to 36gb and nightly backups are periodically filling drives to their limits.
REFRDEL holds 135 million records, dating back to some time in 2012.
UDFVALUE holds 26 million records
All other tables have reasonable numbers of records.
Can someone clue us in as to which of the several cleanup-oriented stored procedures to run (if any), or offer some sane advice so that we can get the backend down to a manageable size, please? Something that would not violate best practices and is considered very safe, please.
When you look at the data in the database there is a column named "delete_session_id". Do you see any rows with the value -99? If so, then there is an unresolved issue on this. If not, then proceed with the following to get the clean-up jobs running again...
If you are using SQL Server (Full Editions), perform the following steps to resolve the issue:
Verify that the SQL Server Agent service is started on the server and has a startup type of automatic.
Logs for this service can be found (by default) at:
C:\Program Files\Microsoft SQL Server\\LOG\SQLAGENT.x
This log includes information on when the service was stopped/started
If the SQL Agent is started, you can then check what jobs exist on the SQL Server database by issuing the following command as SA through SQL Query Analyzer (2000) or through Microsoft SQL Server Management Studio:
select * from msdb.dbo.sysjobs
If the Primavera background processes (SYMON and DAMON) are not listed, or the SQL Agent was not started, then these background processes can be reinitialized by running the following commands as SA user against the Project Management database:
exec initialize_background_procs
exec system_monitor
exec data_monitor
A bit late coming to this, but thought the following may be useful to some.
We noticed REFRDEL had grown to a large size and after some investigation discovered the following ...
DAMON runs the following procedures to perform clean-up:
BGPLOG_CLEANUP
REFRDEL_CLEANUP
REFRDEL Bypass
CLEANUP_PRMQUEUE
USESSION_CLEAR_LOGICAL_DELETES
CLEANUP_LOGICAL_DELETES
PRMAUDIT_CLEANUP
CLEANUP_USESSAUD
USER_DEFINED_BACKGROUND
DAMON was configured to run every Saturday around 4pm but we noticed that it had been continuously failing. This was due to an offline backup process which started at 10pm. We first assumed that this was preventing the REFRDEL_CLEANUP from running.
However after monitoring REFRDEL for a couple of weeks, we found that REFRDEL_CLEANUP was actually running and removing data from the table. You can check your table by running the following query on week 1 and then again in week 2 to verify the oldest records are being deleted.
select min(delete_date), max(delete_date), count(*) from admuser.refrdel;
The problem is to do with the default parameters used by the REFRDEL_CLEANUP procedure. These are described here, but in summary the procedure is set to retain the 5 most recent days' worth of records and delete just one day's worth. This is what's causing the issue: DAMON runs just once a week, and when it runs the cleanup job it deletes only one day's worth of data but has accumulated a week's worth, so the amount of data just gets bigger and bigger.
The default parameters can be overridden in the SETTINGS table.
Here are the steps I took to correct the issue:
First, clean up the table..
-- 1. create backup table
CREATE TABLE ADMUSER.REFRDEL_BACKUP TABLESPACE PMDB_DAT1 NOLOGGING AS
Select * from admuser.refrdel where delete_date >= (sysdate - 5);
-- CHECK DATA HAS BEEN COPIED
-- 2. disable indexes on REFRDEL
alter index NDX_REFRDEL_DELETE_DATE unusable;
alter index NDX_REFRDEL_TABLE_PK unusable;
-- 3. truncate REFRDEL table
truncate table admuser.refrdel;
-- 4. restore backed up data
ALTER TABLE ADMUSER.REFRDEL NOLOGGING;
insert /*+ append */ into admuser.refrdel select * from admuser.refrdel_backup;
--verify number of rows copied
ALTER TABLE ADMUSER.REFRDEL LOGGING;
commit;
-- 5. rebuild indexes on REFRDEL
alter index NDX_REFRDEL_DELETE_DATE rebuild;
alter index NDX_REFRDEL_TABLE_PK rebuild;
-- 6. gather table stats
exec dbms_stats.gather_table_stats(ownname => 'ADMUSER', tabname => 'REFRDEL', cascade => TRUE);
-- 7. drop backup table
drop table admuser.refrdel_backup purge;
Next, override the parameters so we try to delete at least 10 days' worth of data. The retention period will always keep 5 days' worth of data.
exec settings_write_string('10','database.cleanup.Refrdel','DaysToDelete');  -- delete the oldest 10 days of data
exec settings_write_string('15','database.cleanup.Refrdel','IntervalStep');  -- commit after deleting every 15 minutes of data
exec settings_write_string('5d','database.cleanup.Refrdel','KeepInterval');  -- only keep the 5 most recent days of data
This final step is only relevant to my environment and will not apply to you unless you have similar issues. It alters the start time for DAMON to allow it to complete before our offline backup process kicks in; in this instance I have changed the start time from 4pm to midnight.
BEGIN
DBMS_SCHEDULER.SET_ATTRIBUTE (
name => 'BGJOBUSER.DAMON',
attribute => 'start_date',
value => TO_TIMESTAMP_TZ('2016/08/13 00:00:00.000000 +00:00','yyyy/mm/dd hh24:mi:ss.ff tzr'));
END;
/
It is normal for UDFVALUE to hold a large number of records. Each value for any user-defined field attached to any object in P6 will be represented as a record in this table.
REFRDEL on the other hand should be automatically cleaned up during normal operation in a healthy system. In P6 8.x, they should be cleaned up by the data_monitor process, which by default is configured to run once a week (on Saturdays).
You should be able to execute it manually, but be forewarned: it could take a long time to complete if it hasn't executed properly since 2012.
36 GB is still a very, very large database. For some clients a database of that magnitude might not be unreasonable, depending on the total number of activities and, especially, the kinds of data that are stored. For example, notepads take up a comparatively large amount of space.
In your case though, since you already know data_monitor hasn't executed properly for a while, it's more likely that the tables are full of records that have been soft-deleted but haven't yet been purged. You can see such records by running a query such as:
select count(*) from task where delete_session_id is not null;
Note that you must select from the task table, not the view, as the view automatically filters these soft-deleted records out.
You shouldn't delete such records manually. They should be cleaned up, along with the records in REFRDEL, as a result of running data_monitor.

How to free up Vertica raw space after deleting records

We are using Vertica Community Edition, which has a raw data limit of 1 TB.
We recently reached the 1 TB raw data limit, so we decided to delete some records from all tables. After deleting the old records, Vertica still shows Utilization: 104%
dbadmin=> SELECT GET_COMPLIANCE_STATUS();
GET_COMPLIANCE_STATUS
----------------------------------------------------------------------------------------
Raw Data Size: 1.04TB +/- 0.10TB
License Size : 1.00TB
Utilization : 104%
Audit Time : 2014-09-04 13:05:24.020979-04
Compliance Status : The database is in compliance with respect to raw data size.
No expiration date for a Perpetual license
NOTICE: Recent audits suggests a change in compliance status. We are awaiting additional data points to confirm.
(1 row)
Any idea how to free up that space?
Rows that were deleted using DELETE are marked for deletion and not immediately removed from physical storage. You need to wait for a mergeout to occur, advance the epoch, or run a PURGE. More information about purging deleted data is available in the documentation.
Besides Kermit's answer...
Since deletes are expensive I used to put data on partitions, based around date, and after archiving the data somewhere else I would drop the partition.
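As a sketch of that approach (the table name, partition expression, and key value below are placeholders, and it assumes the table was created with a date-based PARTITION BY clause), dropping a whole month becomes a partition operation instead of a mass delete:
-- Sketch: drop one month's partition; the value must match the table's partition expression.
-- public.events is hypothetical, partitioned by date_trunc('month', event_ts).
SELECT DROP_PARTITION('public.events', '2014-01-01 00:00:00');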
The records marked for deletion are tracked by delete vectors.
You can find the number of delete vectors with the following query:
select count(1) from delete_vectors;
To remove the delete vectors, you can write a script which contains:
select make_ahm_now();
select purge();
and then schedule this script to run.
Deleted records are excluded when computing database size. So there is no urgency in immediately purging the delete vectors.
To immediately see the effect of your deletion on your license compliance, you can trigger an immediate audit with
select audit_license_size();
As others have said
select make_ahm_now(); select purge();
will remove the deleted records permanently but these operations can be very time consuming and will block operations like truncate().
You can reduce the disruption by using
select purge_table('<your table name>');
instead of
select purge()

SQL Server Delete Lock issue

I have a SQL Server database where I am deleting rows from three tables A, B, C in batches, with some conditions, through a SQL script scheduled in a SQL job. The job runs for 2 hours as the tables have a large amount of data. While the job is running, my front end application is not accessible (it gives a timeout error), since the application inserts and updates data in these same tables A, B, C.
Is it possible for the front end application to run in parallel without any issues while the SQL script is running? I have checked the locks on the tables and SQL Server is acquiring page locks. Can Read Committed Snapshot or Snapshot isolation levels, or converting page locks to row locks, help here? I need advice.
Split the operation in two phases. In the first phase, collect the primary keys of rows to delete:
create table #TempList (ID int);

insert #TempList
select ID
from YourTable;  -- add the same WHERE conditions your delete script uses
In the second phase, use a loop to delete those rows in small batches:
while 1 = 1
begin
    delete top (1000)
    from YourTable
    where ID in (select ID from #TempList);

    if @@rowcount = 0
        break;
end
The smaller batches will allow your front end applications to continue in between them.
I suspect that SQL Server at some point escalates to table lock, and this means that the table is inaccessible, both for reading and updating.
To optimize locking and concurrency when dealing with large deletes, use batches. Start with 5000 rows at a time (to prevent lock escalation) and monitor how it behaves and whether it needs further tuning up or down. 5000 is a "magic number", but it's a low enough number that the lock manager doesn't consider escalating to a table lock, and large enough for performance.
Whether timeouts will happen or not depends on other factors as well, but this will surely reduce them, if not eliminate them altogether. If the timeouts happen on read operations, you should be able to get rid of them. Another approach, of course, is to increase the command timeout value on the client.
Snapshot (optimistic) isolation is an option as well, READ COMMITTED SNAPSHOT more precisely, but it won't help with updates from other sessions. Also, beware of version store (in tempdb) growth. It's best to combine it with the proposed batch approach to keep the transactions small.
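For reference, the database-level switch is a single statement. This is a sketch: 'YourDb' is a placeholder, and WITH ROLLBACK IMMEDIATE will abort open transactions, so schedule it accordingly.
-- Let readers see the last committed version instead of blocking behind the delete batches
ALTER DATABASE YourDb
SET READ_COMMITTED_SNAPSHOT ON
WITH ROLLBACK IMMEDIATE;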
Also, switch to bulk-logged recovery for the duration of the delete if the database is normally in full recovery. But switch back as soon as it finishes, and take a backup.
Almost forgot -- if it's the Enterprise edition of SQL Server, partition your table; then you can just switch the partition out, which is almost instantaneous, and the clients will never notice it.
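A sketch of the switch-out step, assuming the table is already partitioned and an empty staging table with an identical structure exists on the same filegroup (all names below are placeholders):
-- Metadata-only operation: moves partition 3 of dbo.A into the empty staging table almost instantly
ALTER TABLE dbo.A SWITCH PARTITION 3 TO dbo.A_Staging;

-- The old rows can then be removed without ever locking dbo.A
TRUNCATE TABLE dbo.A_Staging;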
