How to debug high CPU on AWS RDS Postgres?

Hello, I'm looking at Performance Insights in AWS RDS (Postgres 10).
I slice by "Waits".
When I look at Top databases, Top applications, Top session types, and Top users, they are all actually higher than the SQL queries themselves.
From these metrics, how do you tell what is bottlenecking the CPU?

Top waits, Top SQL, etc. are all different dimensions that you can use to understand what's contributing to database load. Dimensions are not comparable with each other.
It sounds like you want to diagnose what's contributing to the PostgreSQL "CPU" wait event. You can find more information on this topic in the official RDS documentation on tuning with wait events.
If the issue turns out to be suboptimal queries, then you can find the worst performers in the Top SQL tab (dimension) of Performance Insights.
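If you also want to confirm the CPU-heavy statements from inside the database, the query below is a minimal sketch. It assumes the pg_stat_statements extension is installed and enabled, which is common on RDS but not guaranteed; the column names are as of Postgres 10, where timings are reported in milliseconds.

SELECT left(query, 60)               AS short_query,
       calls,
       round(total_time::numeric, 2) AS total_ms,
       round(mean_time::numeric, 2)  AS mean_ms
FROM pg_stat_statements
ORDER BY total_time DESC
LIMIT 10;

Statements at the top of this list are usually the same ones Performance Insights shows under Top SQL when CPU is the dominant wait.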

Here are some common cases that can cause high CPU usage in Postgres.
Missing or incorrect indexes for the query
Debug method: Check the query plan - with EXPLAIN we can inspect the query plan; if an index is being used, an Index Scan node will appear in the plan output (a short sketch follows below).
Solution: add the appropriate index to reduce CPU usage.
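A minimal sketch of the EXPLAIN check described above; the table, column, and index names are hypothetical and only illustrate how to read the plan.

EXPLAIN SELECT * FROM orders WHERE customer_id = 42;
-- "Index Scan using orders_customer_id_idx on orders" means the index is used;
-- "Seq Scan on orders" followed by a Filter line means it is not.
-- If the index is missing, creating it may help:
-- CREATE INDEX orders_customer_id_idx ON orders (customer_id);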
Queries with sort operations
Debug method: Check EXPLAIN (ANALYZE, BUFFERS) - if there is not enough memory for the sort, temporary files on disk are used to do the sorting, and CPU (and I/O) usage climbs (a sketch follows below).
Note: DO NOT run EXPLAIN (ANALYZE) on a busy production system: it actually executes the query behind the scenes to provide accurate runtime information, and its impact can be significant.
Solution: tune work_mem and the sorting operations.
Sample: Tune sorting operations in PostgreSQL with work_mem
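A rough sketch of what the sort check looks like, using a hypothetical table and an illustrative work_mem value; per the note above, run the ANALYZE form against a test copy rather than a busy production system.

EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM orders ORDER BY created_at;
-- "Sort Method: external merge  Disk: ...kB" in the output means the sort spilled to disk.
SET work_mem = '64MB';  -- session-level change; pick a value that fits your memory budget
-- Re-running the EXPLAIN should now report "Sort Method: quicksort  Memory: ...kB".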
Long-running transactions
Debug method: SELECT * FROM pg_stat_activity WHERE state = 'idle in transaction' AND xact_start IS NOT NULL;
Solution: terminate the long-running transaction with SELECT pg_terminate_backend(pid), as sketched below.
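A sketch of both steps, with an illustrative age threshold and a placeholder pid; terminating a backend rolls back whatever its open transaction has done.

SELECT pid, usename, state, xact_start, now() - xact_start AS xact_age, query
FROM pg_stat_activity
WHERE state = 'idle in transaction'
  AND xact_start < now() - interval '10 minutes'
ORDER BY xact_start;

SELECT pg_terminate_backend(12345);  -- replace 12345 with the offending pid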

Related

How to find the top CPU-utilizing queries in an MS Access database?

I want to find the queries in MS Access that use the most CPU and put them in a table in descending order.
I have checked the system tables of the MS Access database, but can't find any clue for this.
I am new to MS Access; please help.
For the last 10 or even 15 years, Access has rarely been CPU bound. In other words, network speed, disk drive speed, etc. are the main factors.
In the vast majority of cases, throwing more CPU at a problem will not improve performance. If 99% of the time goes to the network or other factors, then doubling the CPU speed improves the total by at most about half a percent.
However, I will accept that if a query is using lots of CPU, it stands somewhat to reason that the query is pulling a lot of data. There is no CPU logger for the Access database engine. However, you can look at row counts and the query plan, and that can be done with showplan. How this works and how it can be used is outlined here:
How to get query plans (showplan.out) from Access 2010?
And here is an older article on showplan and how to use it:
https://www.techrepublic.com/article/use-microsoft-jets-showplan-to-write-more-efficient-queries/#
So, showplan is somewhat similar to looking at the query plan in SQL Server. It will tell you things like whether a full table scan is being used to fetch one row, or whether an index can be (or was) used. So looking at the query plan, be it in SQL Server or, in this case, Access, is certainly possible. Details on CPU usage are slim, but how much data is read and whether the plan does full table scans is available, much like the query plans one would see in server-based systems such as SQL Server.

SQL Server CXPACKET timeout

We've got SQL Server 2016 (v13.0.4206.0). By default there are no restrictions on parallelism - SQL Server can use as many threads per query as it wants. And it never caused any problems... until now.
For a new feature, a query was written that unexpectedly raised a timeout exception in our application. I was deeply surprised when it executed successfully after setting the maximum threads per query to 1. Yes, 6 seconds for the query is not great, even accounting for the fact that most of the time was spent fetching, but it's a far cry from the 3-minute timeout!
By the way, executing this query from SQL Server Management Studio works every time, regardless of the parallelism settings. It seems that something is wrong with the connection to the database, but all other queries work fine, even ones that are much heavier than this one.
Our application is built on ASP.NET Core 3.0 (I don't know if it matters), and the database connection is made using System.Data.SqlClient v4.8.0. All I could determine is that a great many tasks are created for this query.
I've tried to watch the execution in sys.dm_os_waiting_tasks (thanks, Google). I'm not sure I got it right, but it seems that tasks with context_id 0-8 are blocked by those with context_id 9-16 and vice versa. An obvious example of a deadlock, isn't it? But how can SQL Server schedule threads that way without my "help"? Or what am I doing wrong?
Just to head off some answers that won't fit:
I won't turn parallelism off (set maximum threads per query to 1) as a solution, because of some heavy queries in our application;
I don't want to raise the Cost Threshold for Parallelism setting because I'm afraid the same problem will show up with another (presumably heavier) query. I just want to determine the real cause;
Optimizing the query isn't considered (anymore), since according to the actual execution plan I can't make it faster - there are enough indexes for it. But I'm ready to reconsider after some really weighty arguments.
So, my question is: why does parallelism that I didn't ask for spoil the query execution? And how can I avoid that?
It's true that sometimes the engine chooses to use parallel execution (or not to use it), which leads to worse performance.
You do not want to change the server-wide option or the cost threshold because you are not sure how this will affect other queries, which is understandable.
If you are sure your query will execute better without being handled in parallel, you can specify the option just for it using a query hint - MAXDOP - like this:
SELECT ...
FROM ...
OPTION (MAXDOP 1);
It's easy and you can roll it back if needed. Also, you are not affecting other queries.
You are saying that:
Optimizing query isn't considered (anymore), as according to actual execution plan...
The execution plan is sometimes misleading. As a start, you can save your execution plan and open it with SentryOne Plan Explorer - it's free and can give you a better look at what's going on.
Also, if a query executes in either 3 seconds or 6 minutes, there must be something wrong with it, or perhaps with the activity on your database. If it always runs fast in SSMS, maybe the engine is using the right cached plan. I think it's better to share the query itself, attach the two plans (serial and parallel), and spend more time tuning it.

SQL server - high buffer IO and network IO

I have a performance tuning question about SQL Server.
I have a program that needs to run every month, and it takes more than 24 hours to finish. I need to tune this program in the hope that I can decrease the running time to 12 hours or less.
As this program wasn't developed by us, I can't check its content or modify it. All I can do is open SQL Server Profiler and Activity Monitor to trace and analyze the SQL it runs. I have disabled unused triggers and did some housekeeping, but the running time only decreased by 1 hour.
I found that the network I/O and buffer I/O are high, but I don't know the cause or the meaning of this.
Can anyone tell me the cause of these two issues (network I/O and buffer I/O)? Are there any suggestions for optimizing this program?
Thank you!
According to your description, I think your I/O is normal; your real problem is that one procedure is too slow. The solution:
1. Open SSMS.
2. Find the procedure.
3. Click the button named "Display Estimated Execution Plan".
4. Fix the procedure.
To me it seems like your application reads a lot of data into the application, which would explain the figures. Still, I would check out the following:
Is there blocking? That can easily be a huge waste of time if the process is just waiting for something else to complete. It doesn't look like that based on your statistics, but it's still important to check.
Are the tables indexed properly? Good indexes to match search criteria / joins. If there's huge key lookups, covering indexes might make a big difference. Too many indexes / unnecessary indexes can slow down updates.
You should look into the plan cache to see the statements responsible for the most I/O or CPU usage (a sketch of that query follows this list).
Are the query plans sensible for the most costly operations? You might have outdated statistics or other optimization issues.
If the application transfers a lot of data to and/or from the database, is the network latency & bandwidth good enough or could it be causing slowness? Is the server where the application is running a bottleneck?
If these don't help, you should probably post a new question with detailed information: The SQL statements that are causing the issues, table & indexing structure of the involved tables with row counts and query plans.
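As a starting point for the plan-cache suggestion above, here is a minimal sketch against sys.dm_exec_query_stats, ordered by logical reads; swap the ORDER BY to total_worker_time to rank by CPU instead. Note that it only covers plans still in cache since the last restart.

SELECT TOP (20)
       qs.execution_count,
       qs.total_logical_reads,
       qs.total_worker_time,  -- CPU time in microseconds
       SUBSTRING(st.text, (qs.statement_start_offset / 2) + 1,
                 ((CASE qs.statement_end_offset
                       WHEN -1 THEN DATALENGTH(st.text)
                       ELSE qs.statement_end_offset
                   END - qs.statement_start_offset) / 2) + 1) AS statement_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_logical_reads DESC;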

What can cause bad SQL server performance?

Every time I find that data retrieval from my database is slow, I try to figure out which part of my SQL query has the problem, optimize it, and add some indexes to the table. But this does not always solve the problem.
My question is :
Are there any other tricks to make SQL Server performance better?
What are the other reasons that can make SQL Server performance worse?
Inefficient query design
Auto-growing files
Too many indexes to be maintained on a table
Too few indexes on a table
Not properly choosing your clustered index
Index fragmentation due to poor maintenance (a sketch for checking this follows the list)
Heap fragmentation due to no clustered index
Too high FILLFACTORs used on indexes, causing excessive page splitting
Too low of a FILLFACTOR used on indexes, causing excessive space usage and increased scanning time
Not using covered indexes where appropriate
Non-selective indexes being used
Improper maintenance of statistics (out of date statistics)
Databases not normalized properly
Transaction logs and data sharing the same drive spindles
The wrong memory configuration
Too little memory
Too little CPU
Slow hard drives
Failing hard drives or other hardware
A 3D screensaver on your database server chewing up your CPU
Sharing the database server with other processes which compete for CPU and memory
Lock contention between queries
Queries which scan entire large tables
Front end code which searches data in an inefficient manner (nested loops, row by row)
CURSORS which are not necessary and/or are not FAST_FORWARD
Not setting NOCOUNT when you have large tables being cursored through.
Using a transaction isolation level which is too high (such as using SERIALIZABLE when it's not necessary)
Too many round trips between the client and the SQL Server (a chatty interface)
An unnecessary linked server query
A linked server query which targets a table on a remote server with no primary or candidate key defined
Selecting too much data
Excessive query recompilations
oh and there might be some others, too.
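For the index-fragmentation item above, here is a rough sketch of how to measure it in the current database; the thresholds are rules of thumb, not hard rules.

SELECT OBJECT_NAME(ips.object_id)        AS table_name,
       i.name                            AS index_name,
       ips.avg_fragmentation_in_percent,
       ips.page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
JOIN sys.indexes AS i
  ON i.object_id = ips.object_id
 AND i.index_id  = ips.index_id
WHERE ips.avg_fragmentation_in_percent > 30
  AND ips.page_count > 1000   -- ignore tiny indexes, they skew the picture
ORDER BY ips.avg_fragmentation_in_percent DESC;

Common guidance is to REORGANIZE at moderate fragmentation and REBUILD above roughly 30%, but measure on your own workload before committing to a maintenance policy.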
When I talk to new developers that have this problem I usually find that it is because of one of two problems. Both of them are fixed if you follow these 2 rules.
First, don’t retrieve any data that you don’t need. For example, if you are doing paging then don’t bring back 100 rows and then calculate which ones belong on the page. Have the stored proc figure it out and only retrieve the 10 you need.
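A minimal sketch of that kind of server-side paging, with hypothetical table and column names; the OFFSET/FETCH syntax needs SQL Server 2012 or later (on older versions the same idea can be expressed with ROW_NUMBER()).

DECLARE @PageNumber int = 3, @PageSize int = 10;  -- illustrative values

SELECT OrderID, CustomerID, OrderDate
FROM dbo.Orders
ORDER BY OrderDate DESC, OrderID
OFFSET (@PageNumber - 1) * @PageSize ROWS
FETCH NEXT @PageSize ROWS ONLY;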
Second, nothing is faster than work you don't do. For example, I worked on a system where the full roles and rights for a user were retrieved with every page requested - this was hundreds of rows for some users. Even just saving this to session state on the first request and then using it for subsequent requests took a meaningful load off the database.
Suggest you get a good book on Performance tuning for the database you use (this is very much database specific). This is an extremely complex subject and cannot really be answered other than in generalities on the web.
For instance, Dave Markle tells you inefficient queries can cause the problem, and there are many, many ways to write inefficient queries and many more ways to fix them.
If you're new to the database and you have access to the database engine tuning advisor, you can heuristically tune your database.
You basically capture the SQL queries being run against your DB in the SQL Profiler, then feed those to DETA. DETA effectively runs the queries (without altering your data) and then works out what information your database is missing (views, indexes, partitions, statistics etc.) to do the queries better.
It can then apply them for you and monitor them in the future. I'm not saying to assume that DETA is always right or to do things without understanding, but I've found that it's definitely a good way to see what your queries are doing, how long they take, and how you can index the DB appropriately.
PS: With all that said, it's much better to invest in a good DBA at the start of a project so that you have good structures and indexing to start with. But that's not the position you're in right now...
This is a very wide question, and there is a ton of answers already. Still, I would like to add one important factor: page splits. The problem is that there are good splits and bad splits. The following are good articles explaining how to use the transaction_log extended event for identifying bad/nasty page splits:
Tracking Problematic Pages Splits in SQL Server 2012 Extended Events - Jonathan Kehayias
Tracking page splits using the transaction log - Paul Randal
You mentioned:
I try to optimize it and also add some indexes
But sometimes removing unused non-clustered indexes may help improve performance, as it helps reduce transaction log activity (a sketch for finding them follows below). Read Top Reasons for Log Performance Problems
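A rough sketch for spotting candidate unused non-clustered indexes in the current database; the usage counters reset on instance restart, so check uptime before trusting them, and keep indexes that back constraints or rare-but-critical reports.

SELECT OBJECT_NAME(i.object_id) AS table_name,
       i.name                   AS index_name,
       ius.user_seeks, ius.user_scans, ius.user_lookups, ius.user_updates
FROM sys.indexes AS i
LEFT JOIN sys.dm_db_index_usage_stats AS ius
  ON ius.object_id   = i.object_id
 AND ius.index_id    = i.index_id
 AND ius.database_id = DB_ID()
WHERE i.type_desc = 'NONCLUSTERED'
  AND i.is_primary_key = 0
  AND i.is_unique_constraint = 0
  AND ISNULL(ius.user_seeks, 0) + ISNULL(ius.user_scans, 0) + ISNULL(ius.user_lookups, 0) = 0
ORDER BY table_name, index_name;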
Wait statistics, or please tell me where it hurts gives an idea of how to use wait statistics for performance analysis; a minimal version of that query is sketched below.
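A minimal sketch of that wait-statistics approach; the exclusion list here is just a small sample of benign waits, and the article above maintains a much longer one.

SELECT TOP (15)
       wait_type,
       waiting_tasks_count,
       wait_time_ms / 1000.0        AS wait_time_s,
       signal_wait_time_ms / 1000.0 AS signal_wait_s
FROM sys.dm_os_wait_stats
WHERE wait_type NOT IN ('SLEEP_TASK', 'LAZYWRITER_SLEEP', 'BROKER_TASK_STOP',
                        'SQLTRACE_BUFFER_FLUSH', 'REQUEST_FOR_DEADLOCK_SEARCH', 'CLR_AUTO_EVENT')
ORDER BY wait_time_ms DESC;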
To see some fresh ideas for performance, take a look at
Performance Considerations - sqlmag.com
Separate tables in joins to different disks (for parallel disk I/O - filegroups).
Avoid joins on columns with few unique values.
To understand JOIN, read Advanced JOIN Techniques

What's the best way to profile a SQL Server 2005 database for performance?

What techniques do you use? How do you find out which jobs take the longest to run? Is there a way to find out the offending applications?
Step 1:
Install the SQL Server Performance Dashboard.
Step 2:
Profit.
Seriously, you do want to start with a look at that dashboard. More about installing and using it can be found here and/or here
To identify problematic queries start the Profiler, select following Events:
TSQL:BatchCompleted
TSQL:StmtCompleted
SP:Completed
SP:StmtCompleted
filter output for example by
Duration > x ms (for example 100ms, depends mainly on your needs and type of system)
CPU > y ms
Reads > r
Writes > w
Depending on what you want to optimize.
Be sure to filter the output enough that you don't have thousands of data rows scrolling through your window, because that would itself impact your server's performance!
It's helpful to log the output to a database table so you can analyse it afterwards (a sample query over such a table is sketched below).
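Assuming the trace was saved to a table named dbo.ProfilerTrace (the name is hypothetical), the worst statements can then be pulled out with something like this; note that when a 2005+ trace is saved to a table, Duration is stored in microseconds even though the Profiler GUI displays milliseconds.

SELECT TOP (50)
       TextData, Duration, CPU, Reads, Writes, StartTime
FROM dbo.ProfilerTrace
WHERE Duration IS NOT NULL
ORDER BY Duration DESC;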
It's also helpful to run Windows System Monitor in parallel to view CPU load, disk I/O, and some SQL Server performance counters. Configure sysmon to save the data to a file.
Then you need a production-typical query load and data volume on your database to see meaningful values in Profiler.
After getting some output from Profiler, you can stop profiling.
Then load the stored data from the profiling table back into Profiler, and use the import menu to bring in the output from System Monitor; Profiler will correlate the sysmon output with your SQL Profiler data. That's a very nice feature.
In that view you can immediately identify bottlenecks in your memory, disk, or CPU.
When you have identified some queries you want to optimize, go to Query Analyzer, look at the execution plan, and try to optimize index usage and query design.
I have had good success with the Database Tuning tools provided inside SSMS or SQL Profiler when working on SQL Server 2000.
The key is to work with a GOOD sample set: track a portion of TRUE production workload for analysis; that will get the best overall bang for the buck.
I use the SQL Profiler that comes with SQL Server. Most of the poorly performing queries I've found are not using a lot of CPU but are generating a ton of disk IO.
I tend to put in filters on disk reads and look for queries that tend to do more than 20,000 or so reads. Then I look at the execution plan for those queries which usually gives you the information you need to optimize either the query or the indexes on the tables involved.
I use a few different techniques.
If you're trying to optimize a specific query, use Query Analyzer. Use the tools in there like displaying the execution plan, etc.
For your situation where you're not sure WHICH query is running slowly, one of the most powerful tools you can use is SQL Profiler.
Just pick the database you want to profile, and let it do its thing.
You need to let it run for a decent amount of time (this varies on traffic to your application) and then you can dump the results in a table and start analyzing them.
You are going to want to look at queries that have a lot of reads, or take up a lot of CPU time, etc.
Optimization is a bear, but keep going at it, and most importantly, don't assume you know where the bottleneck is, find proof of where it is and fix it.
