I have an AWS RDS Aurora (PostgreSQL-compatible) instance, which recently triggered an alert because of increased swap usage caused by running some unoptimized queries (big temporary tables and sequential scans). Some basic AWS metrics look like this:
Blue line: freeable memory
Purple line: swap usage
Yellow line: freeable - swap
I have a few questions I could not find an answer to anywhere, neither in the AWS docs, forums, nor on SO:
Why did the DB start allocating swap while it still had a lot of freeable memory?
Why is it not releasing the swap if it's no longer used? How can I reduce the amount of used swap?
Why does swapping also add to the freeable memory?
You can find more details about RDS swap memory in the AWS Knowledge Center: https://aws.amazon.com/premiumsupport/knowledge-center/troubleshoot-rds-swap-memory/.
Swap is an essential part of the OS; it extends the available memory by storing additional data on disk. When more memory is allocated, old contents of RAM are written to the swap area on disk and the new contents are placed in RAM. In this case it indicates that a new query (or set of queries) was executed which fetched/scanned more records, so more RAM was required, and the OS made room by moving some old data to swap.
As per the KB article, this is the reason the swap usage does not go down:
Linux swap usage isn't cleared frequently, because clearing the swap usage requires extra overhead to reallocate swap when it's needed and when reloading pages. As a result, if swap space is used on your RDS DB instance, even if swap space was used only one time, the SwapUsage metrics don't return to zero.
Postgres caches data from previous executions in RAM so that it can avoid disk reads the next time. You can improve database performance by allocating a sufficient buffer cache. This is expected behavior, and the size of this cache is configurable. Please refer to: https://redfin.engineering/how-to-boost-postgresql-cache-performance-8db383dc2d8f
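For reference, here is a minimal SQL sketch (standard PostgreSQL views and parameters, nothing Aurora-specific; on RDS these settings are changed through the DB parameter group rather than in postgresql.conf) that shows the relevant settings and whether queries are spilling to temporary files:

    -- Current buffer cache and per-query work memory settings
    SHOW shared_buffers;
    SHOW work_mem;

    -- Temporary-file activity per database; large temp_bytes suggests
    -- queries (sorts, hashes, big temporary tables) spilling to disk
    SELECT datname,
           temp_files,
           pg_size_pretty(temp_bytes) AS temp_spilled
    FROM pg_stat_database
    ORDER BY temp_bytes DESC;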
Also, as mentioned in the KB article, this could be due to queries that return a huge number of records, or to load on the database. You can enable Performance Insights to get more details about the queries that were running during that time.
BTW, Performance Insights may not be available on smaller RDS instances. In that case, you can look into the binary logs to see which queries were executed. Enabling slow query logs will also help you.
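If the pg_stat_statements extension is enabled (an assumption; on RDS it has to be turned on via the parameter group and CREATE EXTENSION), a query along these lines can point to the statements doing the heavy temporary-file work:

    -- Statements that wrote the most temporary data (likely the ones that
    -- built big temp tables or spilled large sorts/hashes to disk)
    SELECT calls,
           temp_blks_written,
           shared_blks_read,
           left(query, 80) AS query_start
    FROM pg_stat_statements
    ORDER BY temp_blks_written DESC
    LIMIT 10;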
Related
I have a simple SSIS ETL moving data from one server to another; a couple of the tables are 150 GB+. For these larger tables, how do I optimize the MaximumInsertCommitSize? On the server being loaded, I see that RAM utilization is near 100% (64 GB) for the duration of the load of these tables. I am also getting PAGEIOLATCH_EX suspensions on the server being loaded.
This leads me to believe that the MaximumInsertCommitSize needs to be brought down.
My first question is whether you agree with this.
My other questions are more open-ended:
How do I optimize it by trial and error when it takes an hour to load the table (and table size matters for this operation, so I would have to load all of it)?
Would network speed ever play into this optimization, due to the increased bandwidth with multiple "partitions" of data?
Would the hard drive speed affect it, as this is the server's cheapest component (facepalm)? My thinking here is that PAGEIOLATCH memory waits indicate the number of disk operations is being minimized, but since the RAM is over-utilized, sending tasks to the drive instead of waiting would be better (see the wait-stats sketch after these questions).
Finally, is there a calculator for this? I feel that even a vague approximation of MaximumInsertCommitSize would be great (just based on network, disk, RAM, and file size, with the assumption that the destination is partitioned and has ample disk space).
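For context, the PAGEIOLATCH_EX pressure mentioned above can be measured on the destination server with the standard wait-stats DMV; this is only a generic sketch, not anything SSIS-specific:

    -- Time this instance has spent on PAGEIOLATCH waits since the last
    -- restart (or since the wait stats were last cleared)
    SELECT wait_type,
           waiting_tasks_count,
           wait_time_ms,
           signal_wait_time_ms
    FROM sys.dm_os_wait_stats
    WHERE wait_type LIKE 'PAGEIOLATCH%'
    ORDER BY wait_time_ms DESC;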
Suppose I have an Oracle database whose data files are 256 GB in size. Is it a good idea to use a server with, say, 384 GB RAM in order to host the entire database in RAM?
Is there any difference if you only have, say, 128 GB RAM?
I'm talking about caching and Oracle's inner workings, not a memory-based filesystem. Assume OLTP and a 100 GB working set.
Assuming you are talking about Oracle using the memory for caching and other processes and not a memory based filesystem (which is an awful idea)... more memory is almost always better than less memory.
The real world answer is it depends. If your working set of data is a few GB or less then the extra memory wouldn't help as much.
How much memory you need and when extra memory stops helping depend on your application and the style of database (OLTP, DSS); there is no simple yes/no answer.
Use the views V$SGA_TARGET_ADVICE and V$PGA_TARGET_ADVICE to predict the performance improvement of additional memory.
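As a rough sketch, those advice views can be queried directly; the columns below are the standard ones, and the factor/size columns are interpreted relative to the current SGA and PGA targets:

    -- Predicted effect of a larger or smaller SGA (buffer cache, shared pool, ...)
    SELECT sga_size, sga_size_factor, estd_db_time, estd_physical_reads
    FROM   v$sga_target_advice
    ORDER  BY sga_size;

    -- Same idea for work-area (sort/hash) memory
    SELECT pga_target_for_estimate, pga_target_factor,
           estd_pga_cache_hit_percentage, estd_overalloc_count
    FROM   v$pga_target_advice
    ORDER  BY pga_target_for_estimate;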
Oracle records many statistics about physical (disk) and logical (total) I/O requests. People used to obsess over the buffer cache hit ratio. It can be helpful but that number doesn't tell the whole story. If the ratio is 99% then your cache is probably sufficient and adding more memory won't help. If it's low then you might benefit from more memory, or perhaps the processes that use disk aren't time critical.
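For reference, the classic way to compute that ratio is from v$sysstat; treat the result as one signal among many rather than a target in itself:

    -- Buffer cache hit ratio: fraction of block requests served from memory
    SELECT ROUND(1 - (phy.value / (cur.value + con.value)), 4) AS buffer_cache_hit_ratio
    FROM   v$sysstat cur,
           v$sysstat con,
           v$sysstat phy
    WHERE  cur.name = 'db block gets'
    AND    con.name = 'consistent gets'
    AND    phy.name = 'physical reads';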
Be careful before you request more memory. I've seen a lot of memory wasted because some people assume more memory will solve everything. Oracle has many I/O features to help reduce memory requirements. The "in-memory database" fad is mostly hype.
I have 300,000 documents stored in a Solr index and used 4 GB of RAM for the Solr server, but it consumes more than 90% of physical memory. So I moved my data to a new server with 16 GB of RAM; again, Solr consumes more than 90% of memory. I don't know how to resolve this issue. I used the default MMapDirectory and Solr version 4.2.0. Please explain if you have a solution or know the reason for this.
MMapDirectory tries to use the OS memory (the OS page cache) as much as possible; this is normal behaviour. It will try to map the entire index into memory if memory is available. In fact, it is a good thing: since that memory is available, it will try to use it, and if another application on the same machine demands more, the OS will release it. This is one of the reasons Solr/Lucene queries are an order of magnitude faster, as most calls to the server are answered from memory (depending on the amount of memory) rather than from disk.
JVM memory is a different thing and can be controlled; only in-flight query response objects and certain cache entries use JVM memory. So the JVM heap size can be configured based on the number of requests and cache entries.
What -Xmx value are you using when invoking the JVM? If you are not setting an explicit value, the JVM will choose one based on the machine's characteristics.
Once you give Solr a maximum amount of heap, it will potentially use all of it if it needs to; that is how it works. If you want to limit it to, say, 2 GB, use -Xmx2000m when you invoke the JVM. Not sure how large your docs are, but 300k docs would be considered a smallish index.
I have an app that every so often hits an ESE database quite hard and then stops for a long time. After hitting the database, memory usage goes way up (over 150 MB) and stays high. I'm assuming ESE has lots of cached data.
Is there a way to cap the memory usage by ESE? I'm happy to suffer any perf hit.
The only way I've seen to drop the memory usage is to close the DB.
You can control the database cache size by setting the database cache size system parameter (JET_paramCacheSize). That number can be changed on-the-fly.
You might not need to set it, though: by default ESENT will manage its cache size automatically by looking at available system memory, system paging, and database I/O load. If you have hundreds of MB of free memory, then ESENT won't see any reason to reduce the cache size. On the other hand, if you start using the memory on your system, you should find that ESENT automatically reduces the size of the database cache in your application. You can set the limits for automatic cache sizing with the JET_paramCacheSizeMin and JET_paramCacheSizeMax parameters.
Documentation link for the system parameters: http://msdn.microsoft.com/en-us/library/ms683044.aspx
Will the performance of a SQL server drastically degrade if the database is bigger than the RAM? Or does only the index have to fit in the memory? I know this is complex, but as a rule of thumb?
Only the working set or common data or currently used data needs to fit into the buffer cache (aka data cache). This includes indexes too.
There is also the plan cache, network buffers, and other stuff too. MS have put a lot of work into memory management in SQL Server and it works well, IMHO.
Generally, more RAM will help but it's not essential.
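As a rough illustration of what "the working set fits in the buffer cache" means, this sketch (using the standard sys.dm_os_buffer_descriptors DMV, 8 KB per page) shows how much of each database is currently cached:

    -- Approximate amount of each database currently held in the buffer cache
    SELECT DB_NAME(database_id) AS database_name,
           COUNT(*) * 8 / 1024  AS cached_mb
    FROM sys.dm_os_buffer_descriptors
    GROUP BY database_id
    ORDER BY cached_mb DESC;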
Yes, when indexes can't fit in memory or when doing full table scans. Doing aggregate functions over data not in memory will also require many (and possibly random) disk reads.
For some benchmarks:
Query time will depend significantly on whether the affected data currently resides in memory or disk access is required. For disk-intensive operations, the characteristics of the disk's sequential and random I/O performance are also important.
http://www.sql-server-performance.com/articles/per/large_data_operations_p7.aspx
Therefore, don't expect the same performance if your DB size > RAM size.
Edit:
http://highscalability.com/ is full of examples like:
Once the database doesn't fit in RAM you hit a wall.
http://highscalability.com/blog/2010/5/3/mocospace-architecture-3-billion-mobile-page-views-a-month.html
Or here:
Even if the DB size is just 10% bigger than RAM size this test shows a 2.6 times drop in performance.
http://www.mysqlperformanceblog.com/2010/04/08/fast-ssd-or-more-memory/
Although, remember that this is for hot data: data that you want to query over and can't cache. If you can cache it, you can easily live with significantly less memory.
All DB operations will have to be backed by writes to disk; having more RAM is helpful, but not essential.
Loading the whole database into RAM is not practical. Databases can be up to terabytes in size these days, and there is little chance that anyone would buy that much RAM. I think performance can still be good even if the available RAM is one tenth of the size of the database.