Shrinking pg_toast on RDS instance

I have a Postgres 9.6 RDS instance that is growing by 1 GB a day. We have made some optimizations to the relation associated with the pg_toast table, but the pg_toast size is not changing.
Autovacuum is on, but since autovacuum/VACUUM FREEZE do not reclaim space and VACUUM FULL takes an exclusive lock, I am no longer sure what the best approach is.
The data in the table is core to our user experience, and although following this approach makes sense, it would take away the data our users expect to see for the duration of the VACUUM FULL.
What are the other options here to shrink the pg_toast table?
Here is some data about table sizes. You can see in the first two images that scoring_responsescore is the relation associated with the pg_toast table.
Autovacuum settings
Results from the currently running autovacuum process for that specific pg_toast table, in case it helps.

VACUUM (FULL) is the only method PostgreSQL provides to reduce the size of a table.
Is the bloated TOAST table such a problem for you? TOAST tables are always accessed via the TOAST index, so the bloat shouldn't be a performance problem.
I know of two projects that provide table reorganization with only a short ACCESS EXCLUSIVE lock, namely pg_squeeze and pg_repack, but you probably won't be able to use those in an Amazon RDS database.
To keep the problem from getting worse, you should first try to raise autovacuum_vacuum_cost_limit to 2000 for the affected table, and if that doesn't do the trick, lower autovacuum_vacuum_cost_delay to 0. You can use ALTER TABLE to change the settings for a single table.
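For example, a sketch of the per-table settings (using the scoring_responsescore relation from the question; the toast.* variants apply the same limits to its TOAST table):

ALTER TABLE scoring_responsescore SET (
    autovacuum_vacuum_cost_limit = 2000,
    toast.autovacuum_vacuum_cost_limit = 2000
);

-- if that is not enough, remove the throttling delay as well
ALTER TABLE scoring_responsescore SET (
    autovacuum_vacuum_cost_delay = 0,
    toast.autovacuum_vacuum_cost_delay = 0
);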

pg_repack still does not allow reducing the size of TOAST segments in RDS.
And in RDS we cannot run pg_repack with superuser privileges; we have to use the "--no-superuser-check" option. With that, it will not be able to access the pg_toast.* tables.

Changing tempdb "in-query"

Good day,
Is it possible to change the tempdb my current session is using?
I have a very heavy query that is meant for HD usage.
Ideally, I'd like the query to be run using a tempdb we have specifically for such heavy things.
(Main issue is the query creates a very large temp table)
I'd like something along the lines of:
use tempdb <tempdbname>
<query>
use tempdb <normaltempdb>
If this is at all possible, even if by other means, please let me know.
Right now, the only way I know of to do this is to bind a user to a different tempdb, and then have HD login using that user, instead of the normal user.
Thanks in advance,
ziv.
In Sybase ASE you cannot change your tempdb in-flight; your tempdb is automagically assigned at login.
You have a few options:
1 (recommended) - have the DBA create a login specifically for this process and bind said login to the desired tempdb (eg, sp_tempdb 'bind', ...; see the sketch after this list); have your process use this new login
2 (not recommended) - instead of creating #temp tables, create permanent tables with a 'desired_tempdb_name..' prefix; you'll likely piss off your DBA if you forget to manually drop said tables when you're done with it
3 (ok, if you've got the disk space) - as Rich has suggested, make sure all tempdb's are sized large enough to support your process
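For option 1, a minimal sketch of the binding (login and tempdb names are made up; your DBA runs this on the dataserver):

sp_tempdb 'bind', 'lg', 'batch_login', 'DB', 'big_tempdb'
go
-- verify the bindings
sp_tempdb 'show'
go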
NOTE: If you're using Sybase's SQLAnywhere, IQ or Advantage RDBMSs... sorry, I don't know how temporary databases are assigned for these products.
If your main concern is the impact on tempdb and other users, you could consider creating multiple default tempdbs of the same size and structure. Add these to the default group and sessions are assigned to a tempdb on connection, thus lessening the risk of one large query impacting the whole dataserver.
http://infocenter.sybase.com/help/index.jsp?topic=/com.sybase.infocenter.dc00841.1502/html/phys_tune/phys_tune213.htm
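A rough sketch of what that looks like (device, size and database names are made up):

create temporary database user_tempdb_2 on tempdb_dev2 = 2048 log on tempdb_logdev2 = 512
go
sp_tempdb 'add', 'user_tempdb_2', 'default'
go
-- new sessions are then spread across the tempdbs in the default group at connect time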
You could also consider the use of a login trigger for specific logins and check the program name which is connecting to decide upon which tempdb to use (e.g. Business Objects could go to a much larger DSS tempdb or similar).
There is no way to change your session tempdb in-flight that I'm aware of though as tempdb bindings are set on connection.
It sounds like you do have at least one other tempdb created by the DBA. You can bind to this by application name as well as by login id. Set the application name in your client session (how you do this depends on what the client is). Use sp_tempdb (DBA only) to bind that application name to the alternative tempdb, and your # table will be created in that tempdb. Any session with that application name will use that tempdb.
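A hypothetical sketch of the application-name binding (application and tempdb names are made up):

sp_tempdb 'bind', 'ap', 'BusinessObjects', 'DB', 'dss_tempdb'
go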
tempdbs do not have to be the same size or structure, and you can have separate log and data (a good idea), with more log and less data depending on what you are doing.
markp mentions permanent tables in tempdbs, and says "not recommended". This can be a good technique though. You do need to be careful about how big they get and when they are dropped. You might not need or want to drop them straightaway, for example if you need to bcp from them and/or have them visible for Support purposes, but you do need to be clear about space usage, when to drop and how.

Is it ok to use Audit columns on all the tables in SQL Server(or any other DB)

In my present team we have made it a practice to add 5-6 audit columns to all our tables, irrespective of whether they are required. I am concerned this will increase the number of pages occupied by the tables and the size of the database. Once live, the application may have 50k users hitting it concurrently.
How will it impact the performance of the application? What should I tell my boss to convince them this is a bad policy?
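For context, the columns we add look roughly like this (names here are illustrative, not our real schema):

CREATE TABLE dbo.SomeBusinessTable (
    Id           INT IDENTITY(1,1) PRIMARY KEY,
    -- ... business columns ...
    CreatedBy    NVARCHAR(128) NOT NULL,
    CreatedDate  DATETIME      NOT NULL,
    ModifiedBy   NVARCHAR(128) NULL,
    ModifiedDate DATETIME      NULL,
    SourceSystem NVARCHAR(50)  NULL,
    RowVersionNo INT           NOT NULL
);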
You need to test it out and have some data to show. Test the impact with the expected workload of 50k users, with and without the audit columns, and document the differences in:
1. CPU usage
2. Memory usage
3. IO load
If you are seeing any slowness, then you can present your boss with the testing you have done.
Here is a whitepaper from Microsoft which describes the impact of auditing on various OLTP workloads.
If size is the concern, you can maintain a separate database or table for the log. Log your changes/operations on tables through triggers or equivalent stored procedures.
Then you can periodically delete the old data.
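A hypothetical sketch of that approach (table and column names are made up):

CREATE TABLE dbo.OrderAudit (
    AuditId   INT IDENTITY(1,1) PRIMARY KEY,
    OrderId   INT      NOT NULL,
    Operation CHAR(1)  NOT NULL,  -- 'I', 'U' or 'D'
    ChangedBy SYSNAME  NOT NULL DEFAULT SUSER_SNAME(),
    ChangedAt DATETIME NOT NULL DEFAULT GETDATE()
);
GO

CREATE TRIGGER trg_Orders_Audit ON dbo.Orders
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;
    -- rows in "inserted" cover INSERTs and the new side of UPDATEs
    INSERT INTO dbo.OrderAudit (OrderId, Operation)
    SELECT i.OrderId,
           CASE WHEN EXISTS (SELECT 1 FROM deleted) THEN 'U' ELSE 'I' END
    FROM inserted AS i;

    -- rows only in "deleted" are DELETEs
    INSERT INTO dbo.OrderAudit (OrderId, Operation)
    SELECT d.OrderId, 'D'
    FROM deleted AS d
    WHERE NOT EXISTS (SELECT 1 FROM inserted);
END;
GO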
Why are you convinced this is a bad policy?
They don't take a lot of room. If you get some performance issues and they are not used, you can just remove them and reindex the table.
Is it really a position you want to take to your boss without some hard evidence?

Major performance difference between two Oracle database instances

I am working with two instances of an Oracle database, call them one and two. two is running on better hardware (hard disk, memory, CPU) than one, and two is one minor version behind one in terms of Oracle version (both are 11g). Both have the exact same table table_name with exactly the same indexes defined. I load 500,000 identical rows into table_name on both instances. I then run, on both instances:
delete from table_name;
This command takes 30 seconds to complete on one and 40 minutes to complete on two. Doing INSERTs and UPDATEs on the two tables has similar performance differences. Does anyone have any suggestions on what could have such a drastic impact on performance between the two databases?
I'd first compare the instance configurations - SELECT NAME, VALUE from V$PARAMETER ORDER BY NAME and spool the results into text files for both instances and use some file comparison tool to highlight differences. Anything other than differences due to database name and file locations should be investigated. An extreme case might be no archive logging on one database and 5 archive destinations defined on the other.
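A minimal SQL*Plus sketch, run on each instance and then diff the two spool files (file names are made up):

SET PAGESIZE 0
SET LINESIZE 200
SET TRIMSPOOL ON
-- use params_two.txt on the other instance
SPOOL params_one.txt
SELECT name || '=' || value FROM v$parameter ORDER BY name;
SPOOL OFF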
If you don't have access to the filesystem on the database host find someone who does and have them obtain the trace files and tkprof results from when you start a session, ALTER SESSION SET sql_trace=true, and then do your deletes. This will expose any recursive SQL due to triggers on the table (that you may not own), auditing, etc.
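A minimal sketch of that step (the table name comes from the question; the trace file identifier is made up):

ALTER SESSION SET tracefile_identifier = 'delete_test';  -- tag the trace file so it is easy to find
ALTER SESSION SET sql_trace = TRUE;
DELETE FROM table_name;
ALTER SESSION SET sql_trace = FALSE;
-- then run tkprof against the generated trace file on the database host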
If you can monitor the wait_class and event columns in v$session for the deleting session, you'll get a clue as to the cause of the delay. Generally I'd expect a full-table DELETE to be disk bound (a wait class indicating I/O, or maybe Configuration). It has to read the data from the table (so it knows what to delete) and update the data blocks and index blocks to remove the entries, which generates a lot of entries in the UNDO tablespace and the redo log.
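Something along these lines, substituting the SID of the deleting session:

SELECT sid, wait_class, event, state, seconds_in_wait
FROM   v$session
WHERE  sid = 1234;  -- the deleting session's SID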
In a production environment, the underlying files may be spread over multiple disks (even SSD). Dev/test environments may have them all stuck on one device and have a lot of head movement on the disk slowing things down. I could see that slowing a statement down maybe tenfold. Yours is worse than that.
If there is concurrent activity on the table [wait_class of 'Concurrency'] (eg other sessions inserting) you may get locking contention or the sessions are both trying to hammer the index.
Something is obviously wrong in instance two. I suggest you take a look at these SO questions and their answers:
Oracle: delete suddenly taking a long time
oracle delete query taking too much time
In particular:
Do you have unindexed foreign key references (reason #1 for a delete taking a looong time -- look at this script from AskTom; a rough check is sketched after this list),
Do you have any ON DELETE TRIGGER on the table?
Do you have any activity on instance two (if this table is continuously updated, you may be blocked by other sessions)?
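For the first point, a rough check for unindexed foreign keys in the current schema (this is a simplified sketch, not the AskTom script itself):

SELECT cc.table_name, cc.constraint_name, cc.column_name
FROM   user_cons_columns cc
JOIN   user_constraints c ON c.constraint_name = cc.constraint_name
WHERE  c.constraint_type = 'R'
AND    cc.position = 1
AND    NOT EXISTS (SELECT 1
                   FROM   user_ind_columns ic
                   WHERE  ic.table_name      = cc.table_name
                   AND    ic.column_name     = cc.column_name
                   AND    ic.column_position = 1);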
Please note: I am not a DBA...
I have the following written on my office window:
In case of emergency ask the on call dba to:
Check Plan
Run Stats
Flush Shared Buffer Pool
Number 2 and/or 3 normally fix queries which work in one database but not the other or which worked yesterday but not today....

Using a duplicate SQL Server database for queries

I have a very large (100+ gigs) SQL Server 2005 database that receives a large number of inserts and updates, with less frequent selects. The selects require a lot of indexes to keep them functioning well, but it appears the number of indexes is affecting the efficiency of the inserts and updates.
Question: Is there a method for keeping two copies of a database where one is used for the inserts and updates while the second is used for the selects? The second copy wouldn't need to be real-time updated, but shouldn't be more than an hour old. Is it possible to do this kind of replication while keeping different indexes on each database copy? Perhaps you have other solutions?
You're looking to set up a master/child database topology using replication. With SQL Server you'll need to set up replication between two databases (preferably on separate hardware). The master DB is the one you should use for inserts and updates. The child will service all your select queries. You'll also want to optimize both databases' configuration settings for the type of work they will be performing. If you have heavy select queries on the child database, you may also want to set up views that will make the queries perform better than complex joins on tables.
Some reference material on replication:
http://technet.microsoft.com/en-us/library/ms151198.aspx
Just google it and you'll find plenty of information on how to setup and configure:
http://search.aim.com/search/search?&query=sql+server+2005+replication&invocationType=tb50fftrab
Transactional replication can do this, as the subscriber can have a number of additional indexes compared with the publisher. But you have to bear in mind a simple fact: all inserts/updates/deletes are going to be replicated to the reporting copy (the subscriber), and the additional indexes will... slow down replication. It is actually possible to slow the replication down to a rate at which it is unable to keep up, causing the distribution DB to swell. But this only happens when you have a constantly high rate of updates. If the problems only occur during spikes, then the distribution DB will act as a queue that absorbs the spikes and levels them off during off-peak hours.
I would not undertake this endeavour without absolute, 100% proof that it is the additional indexes that are slowing down the inserts/updates/deletes, and without testing that the inserts/updates/deletes actually perform significantly better without the extra indexes. Specifically, ensure that the culprit is not the other usual suspect: lock contention.
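A quick way to check for blocking while the workload is running (SQL Server 2005 dynamic management views):

SELECT session_id, blocking_session_id, wait_type, wait_time, command
FROM   sys.dm_exec_requests
WHERE  blocking_session_id <> 0;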
Generally, all set-based operations (including updating indexes) are faster than non set-based ones
1,000 inserts will most probably be slower than one insert of 1,000 records.
You can batch the updates to the second database. This will, first, make the index updating faster and, second, smooth out the peaks.
You could task schedule a bcp script to copy the data to the other DB.
You could also try transaction log shipping to update the read only db.
Don't forget to adjust the fill factor when you create your two databases. It should be low(er) on the database with frequent updates, and 100 on your "data warehouse"/read only database.
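A hypothetical sketch (index and table names are made up):

-- on the write-heavy copy: leave free space in each page for inserts and updates
ALTER INDEX IX_Orders_CustomerId ON dbo.Orders REBUILD WITH (FILLFACTOR = 80);

-- on the read-only reporting copy: pack pages fully
ALTER INDEX IX_Orders_CustomerId ON dbo.Orders REBUILD WITH (FILLFACTOR = 100);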

SQLServer tempDB growing infinitely

We have several "production environments" (three servers each, with the same version of our system; each one has a SQL Server database as the production database).
In one of these environments the tempdb transaction log starts to grow fast and indefinitely, and we can't find out why. Same version of the OS, SQL Server, and application. No changes in the environment.
Does someone know how to figure out what's happening or how to fix this?
You might be in Full recovery model mode - if you are doing regular backups you can change this to simple and it will reduce the size of the log after the backup.
Here is some more info.
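A minimal sketch for checking and changing it (the database name is made up; this applies to your user databases, since tempdb itself always uses the simple recovery model):

SELECT name, recovery_model_desc FROM sys.databases;

ALTER DATABASE YourAppDb SET RECOVERY SIMPLE;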
Have you tried running Profiler? This will allow you to view all of the running queries on the server. This may give you some insight into what is creating items in tempdb.
Your best bet is to fire up SQL Server Profiler and see what's going on. Look for high values in the "Writes" column or Spool operators, these are both likely to cause high temp usage.
If it is only the transaction log growing, then try this: open transactions prevent the log from being shrunk down as it goes. This should be run in tempdb:
DBCC OPENTRAN
OK, I think this question is the same as mine.
The tempdb grows fast. A common reason is that programmers create procedures that use temporary tables.
When we create these tables, or run other operations like triggers or DBCC commands, they all use tempdb.
When temporary tables are created, SQL Server allocates space for them through pages like GAM, SGAM and IAM, but SQL Server must ensure physical consistency, so only one session can do this allocation at a time and the others must wait. That causes tempdb to grow fast.
I found the solution from MS, roughly like this, and I hope it can help you:
1. Create data files for tempdb, with the number of files equal to the number of CPUs; e.g. if your host has 16 CPUs, you need to create 16 data files for tempdb, and every file must have the same size (see the sketch after this list).
2. You need to monitor these files and make sure they are not full.
3. If the files are not big enough they will auto-grow, and you need to keep them all at the same size.
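A hypothetical sketch of adding the extra files (path, file names and sizes are made up):

ALTER DATABASE tempdb
ADD FILE (NAME = tempdev2,
          FILENAME = 'D:\tempdb\tempdev2.ndf',
          SIZE = 4GB,
          FILEGROWTH = 512MB);
-- repeat for tempdev3 ... tempdevN to match the CPU count, keeping all sizes identical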
My English is not good. If you can't solve it, run the procedure sp_helpfile, check the output, and paste the results here.
I ran into this situation when I was in Singapore.
Good luck.
