How could a database have worse benchmark results on a faster disk?

I'm benchmarking comparable (2 vCPU, 2 GB RAM) servers (Ubuntu 18.04) from DigitalOcean (DO) and AWS EC2 (t3a.small).
The disk benchmark (fio) is in line with the results of https://dzone.com/articles/iops-benchmarking-disk-io-aws-vs-digitalocean
In summary:
DO --
READ: bw=218MiB/s (229MB/s), 218MiB/s-218MiB/s (229MB/s-229MB/s), io=3070MiB (3219MB), run=14060-14060msec
WRITE: bw=72.0MiB/s (76.5MB/s), 72.0MiB/s-72.0MiB/s (76.5MB/s-76.5MB/s), io=1026MiB (1076MB), run=14060-14060msec
EC2 --
READ: bw=9015KiB/s (9232kB/s), 9015KiB/s-9015KiB/s (9232kB/s-9232kB/s), io=3070MiB (3219MB), run=348703-348703msec
WRITE: bw=3013KiB/s (3085kB/s), 3013KiB/s-3013KiB/s (3085kB/s-3085kB/s), io=1026MiB (1076MB), run=348703-348703msec
which shows the DO disk to be more than 10 times faster than the EC2 EBS volume.
However, sysbench (following https://severalnines.com/database-blog/how-benchmark-postgresql-performance-using-sysbench) shows DO slower than EC2 (Postgres 11 default configuration, read-write test on oltp_legacy/oltp.lua):
DO --
transactions: 14704 (243.87 per sec.)
Latency (ms):
min: 9.06
avg: 261.77
max: 2114.04
95th percentile: 383.33
EC2 --
transactions: 20298 (336.91 per sec.)
Latency (ms):
min: 5.85
avg: 189.47
max: 961.27
95th percentile: 215.44
What could be the explanation?

Sequential read/write throughput matters for large sequential scans: data warehousing, loading a large backup, and so on.
Your benchmark is OLTP, which does lots of small, quick queries. For that, sequential throughput is irrelevant.
For reads (SELECTs) the most important factor is having enough RAM to keep your working set in cache so you do no actual IO at all. Failing that, it is random read access time.
For writes (UPDATE, INSERT), the most important factor is fsync latency: the time required to commit data to stable storage, since the database can only finish a COMMIT once the data has been durably written.
Most likely the EC2 instance has better random access and fsync performance; perhaps it uses SSDs or a battery-backed cache.
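As an illustrative (not authoritative) way to compare the two machines on this axis, you can probe fsync latency directly. This sketch mimics the write-then-fsync pattern a WAL COMMIT performs; the record size and iteration count are arbitrary choices:

```python
import os
import tempfile
import time

# Rough fsync latency probe: time many small write+fsync cycles,
# the same pattern a database performs on its WAL at COMMIT.
N = 100
fd, path = tempfile.mkstemp()
try:
    start = time.perf_counter()
    for _ in range(N):
        os.write(fd, b"x" * 512)   # small WAL-record-sized write
        os.fsync(fd)               # force it to stable storage
    elapsed = time.perf_counter() - start
    print(f"avg fsync latency: {elapsed / N * 1000:.2f} ms")
finally:
    os.close(fd)
    os.remove(path)
```

Running this on both servers would show whether EBS really does beat the DO disk on commit latency, independent of the sequential numbers fio reported.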

Sequential bandwidth and latency/IOPS are independent parameters.
Some workloads (like databases) depend on latency for lots of small IOs, or on throughput for lots of small IO operations, i.e. IOPS (IOs per second).

In addition to the IOPS-vs-throughput point others have made, I also wanted to point out that the two results are pretty similar: 240 tps vs 330 tps. You could add or subtract almost that much just by running VACUUM or ANALYZE, or by letting the system sit idle for a while.
There could be other factors too: different CPU speeds, burst performance vs throttling of a heavy user, presence or absence of huge_pages, different cache timings, memory speeds, or different NVMe drivers. The point is that 240 is not as much less than 330 as you might think.
Update: something else to point out is that OLTP read/write transactions aren't necessarily bottlenecked by disk performance. If you have synchronous commit off, then they really aren't.
I don't know exactly what the sysbench legacy OLTP read-write test is doing, but I suspect it's more like a bank transaction touching multiple records and using indexes. It's probably not some sort of raw maximum insertion rate or maximum CRUD operation rate benchmark.
I get 1000 tps on my desktop in the write-heavy benchmark against pg13, but I can insert something like 50k records per second (each ~100 bytes) from a single-process Python client during bulk loads, and nearly 100k with synchronous commit off.
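The gap between transaction rate and raw insertion rate comes largely from how often you pay for a durable commit. A minimal sketch of that effect, using SQLite as a neutral stand-in (an assumption for illustration only; the one-sync-per-COMMIT pattern is the same idea as in Postgres, though the absolute numbers differ):

```python
import os
import sqlite3
import tempfile
import time

# Compare per-row COMMITs (one journal sync per row) against a single
# big transaction (one sync for the whole batch).
path = os.path.join(tempfile.mkdtemp(), "demo.db")
conn = sqlite3.connect(path)
conn.execute("PRAGMA synchronous=FULL")
conn.execute("CREATE TABLE t (id INTEGER, payload TEXT)")

def insert(n, per_row_commit):
    start = time.perf_counter()
    for i in range(n):
        conn.execute("INSERT INTO t VALUES (?, ?)", (i, "x" * 100))
        if per_row_commit:
            conn.commit()
    conn.commit()
    return time.perf_counter() - start

slow = insert(200, per_row_commit=True)
fast = insert(200, per_row_commit=False)
total = conn.execute("SELECT count(*) FROM t").fetchone()[0]
print(f"per-row commits: {slow:.3f}s, single transaction: {fast:.3f}s")
conn.close()
```

On real spinning disks the per-row variant is typically slower by an order of magnitude or more, which is exactly why bulk loads batch their commits.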

Related

SQL Server: Calculating Page Life Expectancy

I want to calculate the page life expectancy of my SQL Server.
If I query the PLE with the following query I get the value 46.000:
SELECT [object_name],
       [counter_name],
       [cntr_value]
FROM sys.dm_os_performance_counters
WHERE [object_name] LIKE '%Manager%'
  AND [counter_name] = 'Page life expectancy'
I think this value isn't the final value because of the high amount. Do I have to calculate this value with a specific formula?
Thanks
Although some counters reported by sys.dm_os_performance_counters are cumulative, PLE reflects the current value, so no calculation is necessary.
Whether a value of 46 seconds is a cause for concern depends much on the workload and storage system. It would be a concern on a high-volume OLTP system with local spinning-disk media, due to the multi-millisecond latency incurred for each physical IO and roughly 200 IOPS per spindle. Conversely, the same workload with high-performance local SSD may be fine, because that storage is capable of well over 100K IOPS.
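To make the "is 46 seconds bad?" question concrete: PLE approximates how long a page survives in the buffer pool, so the implied read churn is roughly buffer pool size divided by PLE. A back-of-the-envelope sketch (the 64 GB buffer pool figure is an illustrative assumption, not from the question):

```python
# Implied buffer-pool churn for a given PLE:
# churn ≈ buffer_pool_size / page_life_expectancy
buffer_pool_gb = 64        # assumed buffer pool size
ple_seconds = 46           # reported PLE
churn_mb_per_sec = buffer_pool_gb * 1024 / ple_seconds
print(f"~{churn_mb_per_sec:.0f} MB/s of page reads to sustain this PLE")
```

A sustained churn in the gigabyte-per-second range is trivial for fast SSDs but far beyond what a few spindles at ~200 IOPS each can deliver, which is the point of the answer above.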

What does CPU utilisation in databases actually mean?

There are two kinds of queries that I ran:
1. A purposely introduced query that sorts (ORDER BY) on about 10 columns. This uses CPU, since sorting is a CPU-intensive operation.
The scenario involved running this query, which took 30 seconds, about 100 times over simultaneous connections against 100 different tables. CPU usage on a 32-core machine was about 85% on all 32 cores, and all 100 queries ran in parallel.
2. Inserting a million rows into a table.
I don't understand why this would consume CPU, since it seems like purely disk I/O. I inserted 1 million rows into a single table using 100 simultaneous connections/threads, with no indexes on those tables. Now, INSERT is not the fastest way to load data, but the point is that it consumed about 32% CPU on about 10 cores. This is far less than the above, but I am still curious.
I could be wrong because WAL archiving and query logging were on; do these contribute to CPU? I am assuming not, since those are also disk IO.
There was no other process/application running/installed on this machine other than postgres.
Many different things:
CPU time for query planning and the logic in the executor for query execution
Transforming text representations of tuples into their on-disk format. Parsing dates, and so on.
Log output
Processing the transaction logs
Writing to shared_buffers when inserting pages, and scanning shared_buffers for pages to write out
Interprocess communication for lock management
Scanning through in-memory cached copies of indexes when checking uniqueness, inserting new keys in an index, etc
....
If you really want to know the juicy details, fire up perf with stack traces enabled to see where CPU time is spent.
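To get a feel for the second item in the list above (turning text values into their binary form is pure CPU work with no disk involved), here is an illustrative standalone sketch; the row format and counts are made up for the demonstration:

```python
import time
from datetime import datetime

# Simulate the parse work a server does for every inserted row:
# split a text tuple and convert each field to a typed value.
rows = [
    f"2021-03-{d:02d},{i},{i * 1.5}"
    for d in range(1, 29)
    for i in range(500)
]
start = time.perf_counter()
parsed = []
for line in rows:
    date_s, int_s, float_s = line.split(",")
    parsed.append((datetime.strptime(date_s, "%Y-%m-%d"), int(int_s), float(float_s)))
elapsed = time.perf_counter() - start
print(f"parsed {len(rows)} rows in {elapsed:.3f}s of pure CPU time")
```

Multiply this by a million rows and a hundred connections and the CPU usage the question observed stops being surprising.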
If your table had a primary key, then it has an implicit index.
It may also be true that if the table had a primary key, then it would be stored as a b-tree and not a simple flat table; I'm not clear on this point since my postgres-fu has weakened over the years, but many DBMSes use the primary key as a default clustering key for a b-tree and just store everything in the b-tree. Managing that b-tree requires plenty of CPU.
Additionally, if you're inserting from 100 threads and connections, then postgres has to perform locking in order to keep internal data structures consistent. Fighting for locks can consume a ton of CPU, and is especially difficult to do efficiently on machines with many CPUs - acquiring a single mutex requires the cooperation of every CPU in the system ala cache coherency protocol.
You may want to experiment with different numbers of threads, while measuring overall runtime and cpu usage - you may find that with, say, 8 threads, the total CPU utilized is 1/10th of your current usage, but still gets the job done within 110-150% of the original time. This would be a sure sign that lock contention is killing your CPU usage.

SQL Server high logical reads vs scan count

From a performance tuning perspective which one is more important?
Say a query reports 30 scans and 148 logical reads on a table with about 2 million records.
A modified version of the same query reports 1 scan with 1400 logical reads, and takes about 40 ms less CPU time to execute. Is the second query better?
I think so and this is my thesis:
In the first case, we have a high number of scans on a very large table. This is costly in CPU and server memory, since all the rows in the table have to be loaded into memory. Executing such a query thousands of times will be taxing on server resources.
In the second case, we have fewer scans even though we accumulate a higher number of logical reads. Since logical reads effectively correspond to the number of pages read from cache, the bottleneck here will be network bandwidth in getting the results back to the client. The actual work SQL Server has to do in this case is less.
What are your thoughts?
The logical read metrics are mostly irrelevant. You care about elapsed time, CPU time spent, and disk resources used. Why would you care about logical reads? They are already accounted for when you look at CPU time.
If you want your query to go faster, measure wall-clock time. If you want it to use fewer resources, measure CPU and physical IO.

Which NoSQL Database for Mostly Writing

I'm working on a system that will generate and store large amounts of data to disk. A previously developed system at the company used ordinary files to store its data, but for several reasons it became very hard to manage.
I believe NoSQL databases are good solutions for us. What we are going to store is generally documents (usually around 100K, but occasionally much larger or smaller) annotated with some metadata. Query performance is not the top priority; the priority is writing in a way that makes I/O as small a hassle as possible. The rate of data generation is about 1 Gbps, but we might move to 10 Gbps (or even more) in the future.
My other requirement is the availability of a (preferably well documented) C API. I'm currently testing MongoDB. Is this a good choice? If not, what other database system can I use?
The rate of data generation is about 1Gbps,... I'm currently testing MongoDB. Is this a good choice?
OK, so just to clarify: your data rate is ~1 gigabyte per second? So you are filling a 1 TB hard drive every 20 minutes or so?
MongoDB has pretty solid write rates, but it is ideally used in situations with a reasonably low data-to-RAM ratio. You want to keep at least the primary indexes in memory, along with some data.
In my experience, you want about 1GB of RAM for every 5-10GB of data. Beyond that ratio, read performance drops off dramatically. Once you get to 1GB of RAM per 100GB of data, even adding new data can be slow, as the index stops fitting in RAM.
The big key here is:
What queries are you planning to run and how does MongoDB make running these queries easier?
Your data is very quickly going to occupy enough space that basically every query will just be going to disk. Unless you have a very specific indexing and sharding strategy, you end up just doing disk scans.
Additionally, MongoDB does not support compression. So you will be using lots of disk space.
If not, what other database system can I use?
Have you considered compressed flat files? Or possibly a big-data Map/Reduce system like Hadoop? (I know Hadoop is written in Java.)
If C is a key requirement, maybe you want to look at Tokyo/Kyoto Cabinet?
EDIT: more details
MongoDB does not support full-text search. You will have to look to other tools (Sphinx/Solr) for such things.
Large indices defeat the purpose of using an index.
According to your numbers, you are writing 10M documents / 20 minutes, or about 30M / hour. Each document needs about 16+ bytes for an index entry: 12 bytes for the ObjectID + 4 bytes for a pointer into the 2GB file + 1 byte for a pointer to the file + some amount of padding.
Let's say that every index entry needs about 20 bytes; then your index is growing at 600MB / hour, or 14.4GB / day. And that's just the default _id index.
After 4 days, your main index will no longer fit into RAM and your performance will start to drop off dramatically. (This is well documented for MongoDB.)
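The index-growth arithmetic above can be checked directly (the 20 bytes/entry and 10M docs per 20 minutes are the answer's own assumed figures):

```python
# Verify the _id index growth estimate:
# 10M docs / 20 min -> 30M docs / hour, ~20 bytes per index entry.
docs_per_hour = 10_000_000 * 3
bytes_per_entry = 20
growth_per_hour_mb = docs_per_hour * bytes_per_entry / 1e6
growth_per_day_gb = growth_per_hour_mb * 24 / 1e3
print(f"{growth_per_hour_mb:.0f} MB/hour, {growth_per_day_gb:.1f} GB/day")
```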
So it's going to be really important to figure out which queries you want to run.
Have a look at Cassandra. It executes writes much faster than reads. That's probably what you're looking for.

SQL Server 2008 Activity Monitor Resource Wait Category: Does Latch include CPU or just disk IO?

In SQL Server 2008 Activity Monitor, I see Wait Time on Wait Category "Latch" (not Buffer Latch) spike above 10,000ms/sec at times. Average Waiter Count is under 10, but this is by far the highest area of waits in a very busy system. Disk IO is almost zero and page life expectancy is over 80,000, so I know it's not slowed down by disk hardware and assume it's not even touching SAN cache. Does this mean SQL Server is waiting on CPU (i.e. resolving a bajillion locks) or waiting to transfer data from the local server's cache memory for processing?
Background: System is a 48-core running SQL Server 2008 Enterprise w/ 64GB of RAM. Queries are under 100ms in response time - for now - but I'm trying to understand the bottlenecks before they get to 100x that level.
Class                          Count      Sum Time    Max Time
ACCESS_METHODS_DATASET_PARENT  649629086  3683117221  45600
BUFFER                         20280535   23445826    8860
NESTING_TRANSACTION_READONLY   22309954   102483312   187
NESTING_TRANSACTION_FULL       7447169    123234478   265
Some latches are IO, some are CPU, and some are other resources. It really depends on which particular latch type you're seeing. sys.dm_os_latch_stats will show which latches are hot in your deployment.
I wouldn't worry about the last three items. The two NESTING_TRANSACTION ones look very healthy (low average, low max). BUFFER is also OK, more or less, although the 8s max time is a bit high.
The ACCESS_METHODS_DATASET_PARENT latch is related to parallel queries/parallel scans. Its average is OK, but the max of 45s is rather high. Without going into too much detail, I can tell you that long wait times on this latch type indicate that your IO subsystem can encounter spikes (and the max 8s BUFFER latch waits corroborate this).
