Let's say I have a simple ASP.NET MVC web application and a local SQL Server instance. As the ORM, I am using Entity Framework 4.3.1.
To figure out how long things take on the ORM side, I've prepared a simple select query and printed out timestamps, like:
...
using (var context = new Entities())
{
    // (1) timestamp1
    var list = context.Database.SqlQuery<Entity>("select * from entities").ToList();
    // (2) timestamp2
}
...
At the same time, I watched SQL Server Profiler to see the query start/end times.
The results are as follows (only the millisecond components are shown, since the whole round trip takes less than a second):
timestamp1: 149 msec
query start time: 197 msec
query end time: 198 msec
timestamp2: 199 msec
Question: why is so much time (48 msec, 197 − 149) spent before the query even starts? Is there any way to reduce it?
Thanks!
ADO.NET/EF needs to initialize itself on first use; you'll find that subsequent queries execute with much less lag. Also remember that SQL Server Profiler only logs the execution time of the query itself, not the time spent on network transport and other overhead.
There might also be slowdowns associated with waking up SQL Server: you say it's local on your machine, so it's possible that SQL Server's components were paged out to disk. What is performance like when you run against a dedicated SQL Server box?
Related
Objective
I have an Apache NiFi Docker container on an Azure VM with an attached premium, very high-throughput SSD disk. I have an MS SQL Server 2012 database on AWS. NiFi-to-database communication happens through the MSSQL JDBC driver v6.2, over a high-throughput AWS Direct Connect MPLS network.
Within the NiFi flow only one processor is executed: ExecuteSQLRecord. It uses only one thread/CPU and has 4 GB of JVM heap space available. ExecuteSQLRecord executes a query that returns 1 million rows, which equals a 60 MB flow file. The query uses table indexes, so there is nothing to optimize on the DB side. It looks like: SELECT * FROM table WHERE id BETWEEN x AND y.
The issue
ExecuteSQLRecord with 1 thread/CPU and 1 query retrieves 1M rows (60 MB) in 40 seconds.
Meanwhile, the same query run from SSMS on the database's internal network takes 18 seconds.
The query is already optimized on the DB side (with indexes), and throughput scales linearly with the number of threads/CPUs, so the network is not a bottleneck.
Questions
Is this performance okay for NiFi with 1 CPU? Is it okay that NiFi spends 22 seconds (out of 40) retrieving the results and storing them in the Content Repository?
How does NiFi pull the data from MS SQL Server? Is this a pull approach? If yes, maybe we have too many round trips?
How can I check how much time NiFi spends converting the result set to CSV, and how much time writing into the Content Repository?
Are you using the latest Docker image (1.11.4)? If so, you should be able to set the fetch size on the ExecuteSQLRecord processor (https://issues.apache.org/jira/browse/NIFI-6865).
I got a couple of different results when I searched for the default fetch size of the MSSQL driver; one site said 1 and another said 32. In your case, for that many records, I'd imagine you'd want it way higher (see https://learn.microsoft.com/en-us/previous-versions/sql/legacy/aa342344(v=sql.90)?redirectedfrom=MSDN#use-the-appropriate-fetch-size for setting an appropriate fetch size).
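To get a feel for why the fetch size matters at this scale, here is a stdlib-only sketch of the round-trip arithmetic: each fetch of `fetchSize` rows costs roughly one network round trip. The latency figure is an assumed, illustrative value, not a measurement of your link:

```java
public class FetchSizeMath {
    public static void main(String[] args) {
        long rows = 1_000_000L;  // rows returned by the query
        double latencyMs = 0.5;  // assumed per-round-trip network latency (illustrative)

        for (int fetchSize : new int[] {1, 32, 10_000}) {
            // Ceiling division: number of fetches needed to drain the result set.
            long roundTrips = (rows + fetchSize - 1) / fetchSize;
            double overheadSec = roundTrips * latencyMs / 1000.0;
            System.out.printf("fetchSize=%d -> %d round trips, ~%.1f s latency overhead%n",
                    fetchSize, roundTrips, overheadSec);
        }
    }
}
```

With these assumed numbers, a fetch size of 1 would spend minutes on latency alone, while 10,000 makes the round-trip cost negligible, which is why raising it can close much of the gap to SSMS.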
To add to Matt's answer, you can examine the provenance data for each flowfile and see the lineage duration (amount of time) it spent in each segment of the flow. You can also see the status history for every processor, so you can examine the data in/out by size and number of flowfiles, CPU usage, etc. for each processor.
I have a C# application that communicates frequently with an MS SQL Server on a remote server. The application runs almost 24/7. I have noticed that in one month the data usage is 20 GB, which I find too much for SQL queries.
How can I calculate how much data would be used by reading, e.g., only an Int32 column from the DB? I guess the minimum payload per query would be 4 bytes, but there is likely some overhead for establishing communication with the remote server? It is hard for me to imagine how SQL queries could use around 800 MB per day.
How can I calculate how much data would be used by reading, e.g., only an Int32 column from the DB?
Enable .NET's SqlConnection statistics (SqlConnection.StatisticsEnabled = true), and then examine the results with SqlConnection.RetrieveStatistics.
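To see how 4-byte payloads can add up to hundreds of MB per day, remember that every query also carries protocol packet headers, column metadata, and TCP/IP framing in both directions. A back-of-the-envelope sketch (both the query rate and the per-query overhead are assumed, illustrative values; SqlConnection statistics will give you the real numbers):

```java
public class TrafficEstimate {
    public static void main(String[] args) {
        long queriesPerDay = 1_000_000L; // assumed query rate (illustrative)
        long payloadBytes = 4L;          // a single Int32 value
        long overheadBytes = 800L;       // assumed protocol + TCP overhead per round trip (illustrative)

        long bytesPerDay = queriesPerDay * (payloadBytes + overheadBytes);
        System.out.printf("~%.0f MB/day%n", bytesPerDay / 1_000_000.0);
        // The Int32 payload alone is only 4 MB/day here; the rest is per-query overhead,
        // which is why batching queries reduces traffic far more than shrinking columns.
    }
}
```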
I have a performance issue with a method that calls org.hibernate.Query#list. The duration of the method call varies over time: it usually lasts about one second, but some days, for maybe half a day, it takes about 20 seconds.
How can this issue be resolved? How can the cause for this issue be determined?
More elements in the analysis of this issue:
Performance issues have been observed in production environment, but the described issue is in a test environment.
The issue has been observed for at least several weeks but the date of its origin is unknown.
The underlying query is a view (select) in MS SQL Server (2008 R2).
Database reads/writes in this test environment come from only a few users at a time: the database server should not be overly solicited, and the data changes only slowly over time.
Executing the exact query directly from a MS SQL Server client always takes less than a second.
Duplicating the database (using the MS SQL Server client to back up the database and restore the backup as a new database) does not reproduce the problem: the method call is consistently fast on the duplicate.
The application uses Hibernate (4.2.X) and Java 6.
Upgrading from Hibernate 3.5 to 4.2 has not changed anything about the problem.
The method call is always with the same arguments: there is a test method that does the operation.
Profiling the method call (using hprof) shows that when it is long, most of the time is spent on "Object.wait" and "ref.ReferenceQueue.remove".
Using log4jdbc to log the underlying query duration during the method call shows the following results:
query < 1s => method ~ 1s
query ~ 3s => method ~ 20s
The query generates POJO as described in the most up-voted answer from this issue.
I have not tried using a Constructor with all attributes as described in the most up-voted answer from this other similar issue because I do not understand what effect that would have.
A possible cause of apparently random slowness with a Hibernate query is the flushing of the session. If some statements (inserts, updates, deletes) in the same transaction are unflushed, the list method of Query might trigger an autoflush (depending on the current flush mode). If that is the case, the performance issue might not even be caused by the query on which list() is called.
It seems the issue is with MS SQL Server and the updating of the query plan: after DBCC FREEPROCCACHE and DBCC DROPCLEANBUFFERS, the query and method times are consistent.
A solution to the issue may be to upgrade MS SQL Server: upgrading to MS SQL Server 2008 R2 SP2 resulted in the issue not appearing anymore.
It seems the difference between the duration of the query and that of the method is a multiplicative factor related to the number of objects being returned: most of the time is spent in a socket read of the result set.
After a large SQL query built through my ASPX pages is run, I see the following two items in SQL Profiler:

Event Class         TextData                                        ApplicationName          CPU   Reads  Writes
SQL:BatchCompleted  Select N'Testing Connection...'                 SQLAgent - Alert Engine  1609  0      0
SQL:BatchCompleted  EXECUTE msdb.dbo.sp_sqlagent_get_perf_counters  SQLAgent - Alert Engine  1609  96     0

The CPU value is the same as my query's, so does that query actually take 1609 × 3 = 4827?
The same thing happens with Audit Logout events.
Can I limit this? I am using SQL Server 2005.
First of all, some of what you see in SQL Profiler is cumulative, so you can't always just add the numbers up. For example, an SP:Completed event shows the total time of all the SP:StmtCompleted events that make it up. Not sure if that's your issue here.
The only way to improve the CPU figure is to actually improve your query. Make sure it's using indexes, minimize the number of rows read, and so on. Work with an experienced DBA on some of these techniques, or read a book.
The only other mitigation I can think of is to limit the number of CPUs the query runs on (this is called the degree of parallelism, or DOP). You can set this at the server level, or specify it per query with the MAXDOP hint. If you have a multi-processor server, this ensures that a single long-running query doesn't take over all the processors on the box; it will leave one or more free for other queries to run.
No, it takes 1609 milliseconds of CPU in total. What is the duration?
I bet the same or slightly more, because I doubt SQL Agent queries use parallelism.
Are you trying to reduce the CPU used by background processes? If so, you can only do it by reducing functionality: disable SQL Agent (no scheduled backups then, for example) and restart SQL Server with the -x switch.
You also cannot stop "Audit Logout" events; that is simply what happens when you disconnect or close a connection.
However, are you maxing the processors? If so, you'll need to differentiate between "user" memory for queries and "system" memory used for paging or (god forbid) generating your parity on RAID 5 disks.
High CPU can often be solved by more RAM and a better disk config.
SQL Server 2008 has a new "Resource Governor" that may help. I don't know whether you're using SQL Server 2008, but you may want to take a look here.
This can be a connection-string issue. If Audit Logout takes too much of your CPU, try experimenting with different connection-string settings.
I want to tune a production SQL server. After making adjustments (such as changing the degree of parallelism) I want to know if it helped or hurt query execution times.
This seems like an obvious performance counter, but for the last half hour I've been searching Google and the counter list in perfmon, and I have not been able to find a performance counter that gives the average execution time for all queries hitting a server: the SQL Server equivalent of ASP.NET's Request Execution Time.
Does one exist that I'm missing? Is there another effective way of monitoring the average query times for a server?
I don't believe there is a PerfMon counter, but there is a report within SQL Server Management Studio:
Right click on the database, select Reports > Standard Reports > Object Execution Statistics. This will give you several very good statistics about what's running within the database, how long it's taking, how much memory/io processing it takes, etc.
You can also run this on the server level across all databases.
You can use Query Analyzer (one of the tools that ships with SQL Server) to see how queries are executed internally, so you can optimize indexing and so on. That won't tell you about the average, or about the round trip back to the client. To get that, you'd have to log it on the client and analyze the data yourself.
I managed to do it by saving the trace to a SQL table. With the trace open:
File > Save As > Trace Table
Select the SQL Server instance, and once it's imported run:
select avg(duration) from dbo.[YourTableImportName]
You can just as easily compute other stats (max, min, counts, etc.). It's a much better way of interrogating the trace results.
Another solution is to run the query multiple times and compute the average query time (this example uses PostgreSQL's PL/pgSQL):
DO $proc$
DECLARE
    StartTime timestamptz;
    EndTime timestamptz;
    Delta double precision;
BEGIN
    StartTime := clock_timestamp();
    FOR i IN 1..100 LOOP
        PERFORM * FROM table_name;
    END LOOP;
    EndTime := clock_timestamp();
    Delta := 1000 * (extract(epoch FROM EndTime) - extract(epoch FROM StartTime)) / 100;
    RAISE NOTICE 'Average duration in ms = %', Delta;
END;
$proc$;
Here the query is run 100 times:
PERFORM * FROM table_name;
Just replace SELECT with PERFORM.
Average over what time window and for which queries? You need to define more precisely what you mean by "average", or it has no meaning, which is probably why it's not a simple performance counter.
You could capture this information by running a trace and saving it to a table; then you can slice and dice the execution times in any number of ways.
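Once the trace is in a table (or exported), the slicing and dicing is plain aggregation. A stdlib-only sketch of the kind of statistics you'd compute over the captured durations (the sample values are made up for illustration):

```java
import java.util.Arrays;

public class TraceStats {
    public static void main(String[] args) {
        // Made-up query durations in ms, as you might read them from a trace table.
        long[] durations = {12, 7, 450, 31, 9, 1203, 18};
        Arrays.sort(durations);

        double avg = Arrays.stream(durations).average().orElse(0);
        long max = durations[durations.length - 1];
        // 95th percentile by nearest-rank (one simple convention among several).
        long p95 = durations[(int) Math.ceil(0.95 * durations.length) - 1];

        System.out.printf("avg=%.1f ms, max=%d ms, p95=%d ms%n", avg, max, p95);
    }
}
```

The same aggregates are, of course, one line of SQL each against the trace table; the point is that once the raw durations are captured, "average" becomes whatever definition you choose.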
It doesn't give exactly what you need, but I'd highly recommend trying the SQL Server 2005 Performance Dashboard Reports, which can be downloaded here. It includes a report of the top 20 queries and their average execution time and a lot of other useful ones as well (top queries by IO, wait stats etc). If you do install it be sure to take note of where it installs and follow the instructions in the Additional Info section.
The profiler will give you statistics on query execution times and activities on the server. Overall query times may or may not mean very much without tying them to specific jobs and query plans.
Other indicators of performance bottlenecks are resource contention counters (general statistics, latches, locks). You can see these through performance counters. Also looking for large number of table-scan or other operations that do not make use of indexes can give you an indication that indexing may be necessary.
On a loaded server increasing parallelism is unlikely to materially affect performance as there are already many queries active at any given time. Where parallelism gets you a win is on large infrequently run batch jobs such as ETL processes. If you need to reduce the run-time of such a process then parallelism might be a good place to look. On a busy server doing a transactional workload with many users the system resources will be busy from the workload so parallelism is unlikely to be a big win.
You can use Activity Monitor. It's built into SSMS. It will give you real-time tracking of all current expensive queries on the server.
To open Activity Monitor:
In SQL Server Management Studio (SSMS), right-click on the server and select Activity Monitor.
Open Recent Expensive Queries to see CPU Usage, Average Query Time, etc.
Hope that helps.
There are counters in the 'SQL Server:Batch Resp Statistics' group that track SQL batch response times. The counters are divided by response-time interval, for example from 0 ms to 1 ms, ..., from 10 ms to 20 ms, ..., from 1000 ms to 2000 ms, and so on, so the appropriate counters can be selected for the desired interval.
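Conceptually, each completed batch increments the counter for the interval its response time falls into. A stdlib-only sketch of that bucketing (the bucket boundaries below are illustrative and only mirror the examples above; the counter group's real, full list of intervals is longer):

```java
import java.util.Arrays;

public class RespTimeBuckets {
    // Illustrative bucket upper bounds in ms (an assumption, not the exact counter list).
    static final long[] UPPER_MS = {1, 2, 5, 10, 20, 50, 100, 200, 500, 1000, 2000};

    // Returns the index of the first bucket whose upper bound exceeds the elapsed time.
    static int bucketOf(long elapsedMs) {
        for (int i = 0; i < UPPER_MS.length; i++) {
            if (elapsedMs < UPPER_MS[i]) return i;
        }
        return UPPER_MS.length; // everything slower than the last bound
    }

    public static void main(String[] args) {
        long[] samples = {0, 3, 15, 180, 1500}; // made-up batch response times in ms
        int[] counts = new int[UPPER_MS.length + 1];
        for (long s : samples) counts[bucketOf(s)]++;
        System.out.println(Arrays.toString(counts));
    }
}
```

Reading the counters as a histogram like this lets you watch the distribution shift (for example, batches migrating from the 0–1 ms bucket into the 1000–2000 ms bucket) rather than tracking a single average.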
Hope it helps.