Hash indexes on replicated TimesTen DB - timesten

I have a replicated TimesTen database and I need to reset the page size of a number of hash indexes. When I update the page sizes it breaks replication, and it takes about 10 hours to rebuild the replicated databases. There has to be some way to update a hash index size that does not break replication. Oracle is telling my DBA, "if you update index sizes you have to rebuild replication." It seems like a bad idea to go 8+ hours without a failover. We are currently doing the following (we tried setting the replication level to 2, which had no effect); it resets the index size for the primary key to 23244 pages:
ALTER SESSION SET ddl_replication_level = 1;
ALTER TABLE MYSCHEMA.employee SET PAGES = 23244;
......
ALTER SESSION SET ddl_replication_level = 3;
Has anyone updated hash index sizes without rebuilding replication?

According to Oracle, the only solution other than rebuilding replication is:
Disable replication.
Run the DDL on the primary node.
Run the same ALTER statements on the secondary nodes.
Restart replication.
Not sure if we still need the ddl_replication_level settings; we will try next week. A rough sketch of the sequence is below.
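A minimal sketch of that sequence, assuming the TimesTen ttRepStop/ttRepStart built-in procedures are used to stop and start the replication agent (the table name is the one from the example above; verify the exact steps against Oracle's instructions for your TimesTen version):
-- On the active (primary) database: stop the replication agent
call ttRepStop;
-- Keep the DDL local so it is not propagated through replication
ALTER SESSION SET ddl_replication_level = 1;
ALTER TABLE MYSCHEMA.employee SET PAGES = 23244;
ALTER SESSION SET ddl_replication_level = 3;
-- Repeat the same statements on each standby (secondary) database,
-- then restart the replication agent on every node
call ttRepStart;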

Related

Update COLUMNSTORE index in DB transaction

Is it possible to update a COLUMNSTORE index in a DB transaction? I would like to use the following SQL command inside a transaction:
ALTER INDEX [IX_Name] ON [dbo].[TableName] REORGANIZE WITH (COMPRESS_ALL_ROW_GROUPS = ON)
The transaction can take a long time. Will other SQL clients be able to use the index during the transaction?
Note that everything in SQL runs in its own implicit transaction if you don't specify one, so if you're just running the REORGANIZE there's no difference between running it on its own or wrapping it in a BEGIN/COMMIT.
COMPRESSED row groups are immutable, so for your scenario it is more accurate to talk about defragmentation than about updates. In the columnstore world an update translates into a delete + insert, and the delete is "deferred". More specifically, deletes are recorded in the deleted bitmap, which the engine joins with the data so that only the rows visible to your transaction are returned. The per-row-group state of the deleted bitmap can be seen in the sys.dm_db_column_store_row_group_physical_stats DMV as the deleted_rows column. Also note that deleting or updating an OPEN or CLOSED row group happens in place: for deletes you'll see the row count decrement (updates won't change the row count), but you will never see any deleted_rows in these two types of row groups.
So what does REORGANIZE do? It reads small and/or fragmented row groups and combines them, but not in place: it writes them out as new row groups, and the old row groups' state changes to TOMBSTONE. The old row groups stay around while they have active readers, while transactions started after the REORGANIZE will always read the data from the new row groups.
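To watch the effect, you can inspect the row-group DMV before and after the REORGANIZE, along these lines (a sketch; [dbo].[TableName] and [IX_Name] are the placeholders from the question):
SELECT OBJECT_NAME(object_id) AS table_name,
       row_group_id,
       state_desc,      -- OPEN, CLOSED, COMPRESSED, TOMBSTONE
       total_rows,
       deleted_rows     -- rows marked in the deleted bitmap
FROM sys.dm_db_column_store_row_group_physical_stats
WHERE object_id = OBJECT_ID('dbo.TableName')
ORDER BY row_group_id;

ALTER INDEX [IX_Name] ON [dbo].[TableName]
REORGANIZE WITH (COMPRESS_ALL_ROW_GROUPS = ON);
Running the query again after the REORGANIZE should show the combined row groups, with the old ones in the TOMBSTONE state.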

MS SQL Trigger for ETL vs Performance

I need to know what the impact on the production DB might be of creating triggers on ~30 production tables that capture every UPDATE, DELETE and INSERT statement and write the following information to a separate table: "PK", "Table Name", "Time of modification".
I have limited ability to test it, as I have read-only permissions to both the Prod and Test environments (and I can get one work day with 10 end users to test it).
I have estimated that the number of records inserted by those triggers will be around 150-200k daily.
Background:
I have a project to deploy a Data Warehouse for a database that is heavily customized, and there are jobs running every day that manipulate the data. The "Updated On" date column is not being maintained (a customization) and there are hard deletes occurring on the tables. We decided to ask the DEV team to add triggers like:
CREATE TRIGGER [dbo].[triggerName] ON [dbo].[ProductionTable]
FOR INSERT, UPDATE, DELETE
AS
-- rows added or changed (fires for INSERT and UPDATE)
INSERT INTO For_ETL_Warehouse (Table_Name, Regular_PK, Insert_Date)
SELECT 'ProductionTable', PK_ID, GETDATE() FROM inserted;
-- rows removed or changed (fires for DELETE and UPDATE)
INSERT INTO For_ETL_Warehouse (Table_Name, Regular_PK, Insert_Date)
SELECT 'ProductionTable', PK_ID, GETDATE() FROM deleted;
on core ~30 production tables.
Based on this table we will pull the delta from the last 24 hours and push it to the Data Warehouse staging tables.
If anyone has had a similar issue and can help me estimate how it might impact performance on the production database, I would really appreciate it. (If it works I am saved; if not, I need to propose another solution. Mirroring or replication might currently be hard to get, as the local DEVs have no idea how to set it up...)
Other ideas on how to handle this situation or how to perform the tests are welcome (my deadline is Friday 26-01).
First of all, I would suggest you encode the table name as a smaller, non-character data type (30 tables => tinyint).
Second, you need to understand how big the payload you are going to write is, and how it gets written:
If you choose the right clustered index (the date column), then the server just needs to write the data row by row in sequence. That is a trivially easy job, even if you insert all 200k rows at once.
If you encode the table name as a tinyint, then basically it has to write:
1 byte (table code) + the PK size (hopefully numeric, so <= 8 bytes) + 8 bytes for the datetime - so approximately 17 bytes on the data page, plus indexes if any, plus the log file. This is very lightweight and again puts no "real" pressure on SQL Server.
The trigger itself adds a small overhead, but with the number of rows you are talking about, it is negligible. A sketch of what that could look like is below.
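For illustration only, here is a minimal sketch of the suggestion above; the For_ETL_Warehouse layout and the lookup table are assumptions, not your actual schema:
-- Hypothetical audit table: roughly 17 bytes per row, appended in date order
CREATE TABLE dbo.For_ETL_Warehouse
(
    Table_Code  tinyint  NOT NULL,  -- 1 byte instead of a character table name
    Regular_PK  bigint   NOT NULL,  -- <= 8 bytes, assuming numeric PKs
    Insert_Date datetime NOT NULL   -- 8 bytes
);

-- Clustered on the date column so new rows are written sequentially
CREATE CLUSTERED INDEX CIX_For_ETL_Warehouse_Insert_Date
    ON dbo.For_ETL_Warehouse (Insert_Date);

-- Small lookup table to translate the code back to a table name during ETL
CREATE TABLE dbo.ETL_Table_Lookup
(
    Table_Code tinyint NOT NULL PRIMARY KEY,
    Table_Name sysname NOT NULL
);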
I have seen systems that do similar things on a much larger scale with close to zero effect on normal work, so I would say it's a safe bet. The only problem with this approach is that it does not work in some cases (for example, outputs to temp tables from DML statements). But if you do not have those kinds of blockers, then go for it.
I hope it helps.

How can I block users while I truncate a SQL Table

We have a SQL Server 2008 R2 table that tracks incremented unique key values, like transaction numbers, etc. It's like a bad version of sequence objects. There are about 300 of these unique keys in the table.
My problem is that the table grows by several hundred thousand rows every day because it keeps the previously used numbers.
My issue is that we have to clean out the table once a week or performance suffers. I want to use a TRUNCATE and then kick off the SP that generates the next incremented value for each of the 300 keys. This works with a run time of about 5 minutes, but during this time the system is trying to use the table and throwing errors because there is no data.
Is there any way to lock the table to prevent user access, truncate it, and then lift the lock?
TRUNCATE will automatically lock the whole table. A DELETE statement uses row locking, which will not interfere with your user queries. You might want to think about purging old records during off hours instead.
This will require cooperation by the readers. If you want to avoid using a highly blocking isolation level like serializable, you can use sp_getapplock and sp_releaseapplock to protect the table during the regeneration process. https://msdn.microsoft.com/en-us/library/ms189823.aspx
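A minimal sketch of that approach, assuming every reader and the maintenance job agree on the same lock resource name (the 'KeyTableMaintenance' name and the table and procedure names are made up for illustration):
-- Maintenance job: take an exclusive application lock, then truncate and repopulate
BEGIN TRAN;
EXEC sp_getapplock @Resource = 'KeyTableMaintenance',
                   @LockMode = 'Exclusive',
                   @LockOwner = 'Transaction',
                   @LockTimeout = 60000;   -- wait up to 60 seconds for readers to drain
TRUNCATE TABLE dbo.KeyTable;
EXEC dbo.RegenerateNextKeyValues;          -- hypothetical SP that reseeds the 300 keys
COMMIT;  -- releases the app lock taken with @LockOwner = 'Transaction'

-- Readers: take a shared application lock around their access to the table
BEGIN TRAN;
EXEC sp_getapplock @Resource = 'KeyTableMaintenance',
                   @LockMode = 'Shared',
                   @LockOwner = 'Transaction',
                   @LockTimeout = 60000;
SELECT NextValue FROM dbo.KeyTable WHERE KeyName = 'TransactionNumber';
COMMIT;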
An alternative might be to build your new set in another table and then use sp_rename to swap them out.
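Roughly, that swap could look like this (a sketch with made-up table names; build and populate the replacement table first, then rename):
-- Build the replacement table and populate it with the regenerated key rows
SELECT KeyName, NextValue
INTO dbo.KeyTable_New
FROM dbo.KeyTable
WHERE 1 = 0;   -- copies the structure only; run the regeneration against KeyTable_New

-- Swap the names once KeyTable_New is fully populated
BEGIN TRAN;
EXEC sp_rename 'dbo.KeyTable',     'KeyTable_Old';
EXEC sp_rename 'dbo.KeyTable_New', 'KeyTable';
COMMIT;

DROP TABLE dbo.KeyTable_Old;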

Should I specify a fill factor on my tables?

I am working on a new system with a SQL Server 2005 database which will soon be going into production. A colleague recently mentioned to me that I should always be specifying the fill factor on my tables. Currently I don't specify fill factor on any of my tables.
My application is OLTP with a mix of reads and writes. A couple of my tables are "reference" tables i.e. read-only but most are read-write. The read-only tables are low volume ( < 50000 rows ).
From what I've read in the SQL Server documentation I should be sticking with the default fill-factor unless the table is read only.
Can anyone comment on this, both for read-only and read-write tables?
No, you shouldn't specify a fill factor on your tables. The fill factor is ignored by the engine except for one and only one operation: an index build (which includes the initial build of an index on a populated table and any rebuild of an index). So it only makes sense to specify a fill factor in ALTER TABLE ... REBUILD and ALTER INDEX ... REBUILD operations.
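For example, if you did decide you wanted a non-default fill factor on a particular read-write index, it would only take effect at rebuild time, along the lines of this sketch (the index and table names are placeholders):
-- The fill factor is only honored when the index is (re)built
ALTER INDEX IX_Orders_CustomerId ON dbo.Orders
REBUILD WITH (FILLFACTOR = 90);   -- leave 10% free space on each leaf page

-- Check what is currently stored for the table's indexes
SELECT name, fill_factor
FROM sys.indexes
WHERE object_id = OBJECT_ID('dbo.Orders');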
See also A SQL Server DBA myth a day: (25/30) fill factor.

How do I speed up deletes from a large database table?

Here's the problem I am trying to solve: I have recently completed a data layer re-design that allows me to load-balance my database across multiple shards. In order to keep shards balanced, I need to be able to migrate data from one shard to another, which involves copying from shard A to shard B, and then deleting the records from shard A. But I have several tables that are very big, and have many foreign keys pointed to them, so deleting a single record from the table can take more than one second.
In some cases I need to delete millions of records from the tables, and it just takes too long to be practical.
Disabling foreign keys is not an option. Deleting large batches of rows is also not an option, because this is a production application and large deletes lock too many resources, causing failures. I'm using SQL Server, and I know about partitioned tables, but the restrictions on partitioning (and the license fees for Enterprise Edition) make it a non-starter.
When I began working on this problem I thought the hard part would be writing the algorithm that figures out how to delete rows from the leaf level up to the top of the data model, so that no foreign key constraints get violated along the way. But solving that problem did me no good since it takes weeks to delete records that need to disappear overnight.
I already built in a way to mark data as virtually deleted, so as far as the application is concerned, the data is gone, but I'm still dealing with large data files, large backups, and slower queries because of the sheer size of the tables.
Any ideas? I have already read older related posts here and found nothing that would help.
Please see: Optimizing Delete on SQL Server
This MS support article might be of interest: How to resolve blocking problems that are caused by lock escalation in SQL Server:
Break up large batch operations into several smaller operations. For example, suppose you ran the following query to remove several hundred thousand old records from an audit table, and then you found that it caused a lock escalation that blocked other users:
DELETE FROM LogMessages WHERE LogDate < '2/1/2002'
By removing these records a few hundred at a time, you can dramatically reduce the number of locks that accumulate per transaction and prevent lock escalation. For example:
SET ROWCOUNT 500
delete_more:
DELETE FROM LogMessages WHERE LogDate < '2/1/2002'
IF @@ROWCOUNT > 0 GOTO delete_more
SET ROWCOUNT 0
Reduce the query's lock footprint by making the query as efficient as possible. Large scans or large numbers of bookmark lookups may increase the chance of lock escalation; additionally, they increase the chance of deadlocks, and generally adversely affect concurrency and performance.
delete_more:
DELETE TOP(500) FROM LogMessages WHERE LogDate < '2/1/2002'
IF @@ROWCOUNT > 0 GOTO delete_more
You could achieve the same result using SET ROWCOUNT as suggested by Mitch, but according to MSDN it won't be supported for DELETE and some other operations in future versions of SQL Server:
Using SET ROWCOUNT will not affect DELETE, INSERT, and UPDATE statements in a future release of SQL Server. Avoid using SET ROWCOUNT with DELETE, INSERT, and UPDATE statements in new development work, and plan to modify applications that currently use it. For a similar behavior, use the TOP syntax. For more information, see TOP (Transact-SQL).
You could create a new copy of each table containing all but the "deleted" rows, then swap the table names. Finally, drop the old tables. If you're deleting a large percentage of the records, this may actually be faster.
Another suggestion is to rename the table and add a status column. When status = 1 (deleted), you won't want the row to show, so you create a view with the same name as the original table which selects from the table where status is NULL or = 0 (depending on how you implement it). The deletion appears immediate to the user, and a background job can run every fifteen minutes deleting the flagged records without anyone other than the DBAs being aware of it.
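A rough sketch of that pattern, using made-up table and column names (BigTable is renamed to BigTable_Data and hidden behind a view with the original name):
-- One-off setup
EXEC sp_rename 'dbo.BigTable', 'BigTable_Data';
ALTER TABLE dbo.BigTable_Data ADD IsDeleted bit NOT NULL DEFAULT 0;
GO
CREATE VIEW dbo.BigTable
AS
SELECT Id, Col1, Col2        -- list the real columns; exclude IsDeleted
FROM dbo.BigTable_Data
WHERE IsDeleted = 0;
GO

-- Application "delete": instant from the user's point of view
UPDATE dbo.BigTable_Data SET IsDeleted = 1 WHERE Id = 42;

-- Background job, every fifteen minutes: physically remove flagged rows in small batches
DELETE TOP (5000) FROM dbo.BigTable_Data WHERE IsDeleted = 1;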
If you're using SQL 2005 or 2008, perhaps using "snapshot isolation" would help you. It allows the data to remain visible to users while an underlying data update operation is processing, and then reveals the new data as soon as it's committed. Even if your delete takes 30 minutes to run, your applications would stay online during this time.
Here's a quick primer on snapshot isolation:
http://www.mssqltips.com/tip.asp?tip=1081
Though you should still try to speed up your delete so it's as quick as possible, this may alleviate some of the burden.
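Enabling it is a database-level setting, roughly like this (a sketch; the database and table names are placeholders, and switching READ_COMMITTED_SNAPSHOT on needs a moment with no other active connections in the database):
-- Allow explicit SNAPSHOT transactions
ALTER DATABASE MyShardDb SET ALLOW_SNAPSHOT_ISOLATION ON;

-- Optionally make ordinary READ COMMITTED reads use row versions too,
-- so readers are not blocked by the long-running delete
ALTER DATABASE MyShardDb SET READ_COMMITTED_SNAPSHOT ON;

-- A reader that wants a consistent view while the delete runs
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRAN;
SELECT COUNT(*) FROM dbo.BigTable;   -- sees the data as of the start of the transaction
COMMIT;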
You can delete small batches using a while loop, something like this:
DELETE TOP (10000) FROM LogMessages WHERE LogDate < '2/1/2002'
WHILE @@ROWCOUNT > 0
BEGIN
DELETE TOP (10000) FROM LogMessages WHERE LogDate < '2/1/2002'
END
If a sizeable percentage of the table is going to match the deletion criteria (near or over 50%), then it is "cheaper" to create a temporary table with the records that are not going to be deleted (reverse the WHERE criteria), truncate the original table and then repopulate it with the records that were intended to be kept.
-- Instead of this:
DELETE FROM MyTable WHERE ROW_TO_DELETE = 'OK';
GO
-- Do this:
SELECT * INTO #MyTable FROM MyTable WHERE ROW_TO_DELETE <> 'OK';  -- keep only the survivors
TRUNCATE TABLE MyTable;
INSERT INTO MyTable SELECT * FROM #MyTable;
GO
Here is the solution to your problem:
DECLARE @RC AS INT
SET @RC = -1
WHILE @RC <> 0
BEGIN
    DELETE TOP(1000000) FROM [Archive_CBO_ODS].[CBO].[AckItem] WHERE [AckItemId] >= 300
    SET @RC = @@ROWCOUNT
    --SET @RC = 0
END
