We are running Dynamics GP 2010 on 2 load balanced citrix servers. For the past 3 weeks we have had severe performance hits when users are running Fixed Assets reporting.
The database is large in size, but when I run the reports locally on the SQL server, they run great. The SQL server seems to be performing adequately even when users are seeing slow performance.
Any ideas?
Just because your DB seems un-stressed, it does not mean that it is fine. It could contain other bottlenecks. Typically, if a DB server is not maxing-out its CPUs occasionally, it means there is a much bigger problem.
Standard process for troubleshooting performance problems on a data driven app go like this:
Tune DB indexes. The Tuning Wizard in SSMS is a great starting point. If you haven't tried this yet, it is a great starting point.
Check resource utilization: CPU, RAM. If your CPU is maxed-out, then consider adding/upgrading CPU or optimize code or split your tiers. If your RAM is maxed-out, then consider adding RAM or split your tiers.
Check HDD usage: if your queue length goes above 1 very often (more than once per 10 seconds), upgrade disk bandwidth or scale-out your disk (RAID, multiple MDF/LDFs, DB partitioning).
Check network bandwidth
Check for problems on your app (Dynamics) server
Shared report dictionaries are the bane of reporting in GP. they do tend to slow things down. also, modifying reports becomes impossible as somebody has it open all the time.
use local report dictionaries and have a system to keep them synced with a "master" reports.dic
Related
SQL Server 2008 Enterprise SP4 0.0.6547.0 x64
Running on Windows 2012R2 patched current.
A VM running on Cisco UCM blades and 6.0 Update 3 plus patches.
A Nimble CS700 SAN for the storage.
This is a large OLTP server with 12 vCPU. Normal CPU usage hovers around 6-11%
What happens is that, without warning, the IO Stall times will go through the roof (2000-1000ms) and most queries will stop returning results. Adam Machanic's sp_whoisactive will show dozens of active queries. CPU is at 90+%.
SAN shows almost zero activity and all other VMs on the same SAN are operating optimally.
We see massive blocking as the stalled processes hold blocks, with some timing out and sleeping with blocks hanging on the SPID. Killing the SPIDs in question provides temporary relief, but seconds later we are right back where we started.
The only thing that provides relief is a reboot of the server.
Management is rightly demanding an actual root cause. When this happened last summer, with visibility to the CEO level, we engaged Microsoft support, who were dumbfounded and offered no actual root cause.
What I can't do is upgrade the SQL server. The machine hosts a packaged application and the package publisher refuses to support their software if we implement any newer SQL Server version. I desperately want to go to 2014/2016/2017, and would feel that it would solve this problem and others.
In any event, I searched the bug reports and did not see anything that matched.
Has anyone run into this issue? If so did you suss out a root cause? I have a gut feel that there is a bug in either SQL 2008, Windows 2012R2 or how they interact. But I don't want to write that into the RCA without having some corroboration.
Would appreciate any pointers.
Here is my approach
1.) Try eliminate storage issues.We once had a storage issue(SAN) and root cause seemed to be some HBA.You can further check if your storage is performing with in acceptable limits
You should start with below counters and see if they are less than 15ms
Avg. Disk sec/Read - is the average time, in seconds, of a read of data from the disk.
Avg. Disk sec/Write - is the average time, in seconds, of a write of data to the disk.
There is more info here :https://www.mssqltips.com/sqlservertip/2460/perfmon-counters-to-identify-sql-server-disk-bottlenecks/
2.) Once you have eliminated storage issues, you can further check if SQLSERVER is the only causing IO spikes or if there are any other applications causing IO.You can use resource monitor to find this
3.) If you have reached here, SQLSERVER may be culprit..Go with below steps and try following same sequence and see if problem persists after each step.
Remember HIGH IO can be caused due to
Stale stats and missing indexes:You might not be updating stats regularly or some type of queries might need more frequent index rebuilds/stats update
gather queries causing HIGH IO and try tuning them,you can observe number of reads done and try adding indexes to minimize number of reads
Further Check memory pressure,some times high memory usage can cause Buffer pool flush and there by queries will go to disk..You can look for a counter called PLE and see what is good for your environment
Further research pointed to VMWare. The machine was allocated 304GB of RAM, 264GB of which was assigned to SQL Server. However the underlying host was overcommitted on RAM by a large amount. We suspect thrashing as page life drops, and as other VMs also need real RAM.
Thanks
John.
I'm looking into using Windows Performance Monitor to analyse server performance. I'm testing on the adventure works 2014 database on SQL Server 2014.
I want to try to Max out the CPU, Memory, disk usage (I/O) and possibly put a high amount of User activity. Then I can train myself in using windows performance monitor to take logs for performances around this area.
I know for CPU I can just run a heavy query within a while loop (maybe infinite) and that will go towards maxing it out.
I'm less sure about the other ones. I've tried queries which select from large tables (30,000 records), and are in a while loop to try and use some of the memory up. But it doesn't seem to drop the Available Mbytes left counter on performance monitor. Is this because the tables not big enough?
As for the disk usage, I imagine I may have to do some updates or inserts so that the disk is being written to. But I can't seem to get it to effect disk usage.
As for network, I can only think of opening up multiple queries and running them concurrently, but that seems a bit messy.
As a side note, I want to script it all myself. Rather than using any extra tools or pre canned apps that do it for you.
I'm running some stored procedures in SQL Server 2012 under Windows Server 2012 in a dedicated server with 32 GB of RAM and 8 CPU cores. The CPU usage is always below 10% and the RAM usage is at 80% because SQL Server has 20 GB (of 32 GB) assigned.
There are some stored procedures that are taking 4 hours some days and other days, with almost the same data, are taking 7 or 8 hours.
I'm using the least restrictive isolation level so I think this should not be a locking problem. The database size is around 100 GB and the biggest table has around 5 million records.
The processes have bulk inserts, updates and deletes (in some cases I can use truncate to avoid generating logs and save some time). I'm making some full-text-search queries in one table.
I have full control of the server so I can change any configuration parameter.
I have a few questions:
Is it possible to improve the performance of the queries using
parallelism?
Why is the CPU usage so low?
What are the best practises for configuring SQL Server?
What are the best free tools for auditing the server? I tried one
from Microsoft called SQL Server 2012 BPA but the report is always
empty with no warnings.
EDIT:
I checked the log and I found this:
03/18/2015 11:09:25,spid26s,Unknown,SQL Server has encountered 82 occurrence(s) of I/O requests taking longer than 15 seconds to complete on file [C:\Program Files\Microsoft SQL Server\MSSQL11.HLSQLSERVER\MSSQL\DATA\templog.ldf] in database [tempdb] (2). The OS file handle is 0x0000000000000BF8. The offset of the latest long I/O is: 0x00000001fe4000
Bump up max memory to 24 gb.
Move tempdb off the c drive and consider mult tempdb files, with auto grow at least 128 Mbps or 256 Mbps.
Install performance dashboard and run performance dashboard report to see what queries are running and check waits.
If you are using auto grow on user data log and log files of 10%, change that to something similar to tempdb growth above.
Using performance dashboard check for obvious missing indexes that predict 95% or higher improvement impact.
Disregard all the nay Sayers who say not to do what I'm suggesting. If you do these 5 things and you're still having trouble post some of the results from performance dashboard, which by the way is free.
One more thing that may be helpful, download and install the sp_whoisactive stored proc, run it and see what processes are running. Research the queries that you find after running sp_whoisactive.
query taking hours but using low CPU
You say that as if CPU would matter for most db operations. HINT: They do not.
Databases need IO. RAM sin some cases helps mitigate this, but at the end it runs down to IO.
And you know what I see in your question? CPU, Memory (somehow assuming 32gb is impressive) but NO WORD ON DISC LAYOUT.
And that is what matters. Discs, distribution of files to spread the load.
If you look into performance counters then you will see latency being super high on discs - because whatever "pathetic" (in sql server terms) disc layout you have there, it simply is not up to the task.
Time to start buying. SSD are a LOT cheaper than discs. You may say "Oh, how are they cheaper". Well, you do not buy GB - you buy IO. And last time I checked SSD did not cost 100 times the price of discs - but they have 100 times or more the IO. and we talk always of random IO.
Then isolate Tempdb on separate SSD - tempdb either does no a lot or a TON and you want to see this.
Then isolate the log file.
Make multiple data files, for database and tempdb (particularly tempdb - as many as you have cores).
And yes, this will cost money. But at the end - you need IO and like most developers you got CPU. Bad for a database.
Not sure if this is a SO or a ServerFault question, so please feel free to move if it's not in the right place:
I have a client with a large database containing a table with around 30-35 million rows running on a SQL2008R2 server (the server is pretty high spec, 16 cores, 92 gig ram, RAID etc). There are other tables this table may join on, but it is the main driver of a several reports.
Their SSRS instance/database and query source database are both running on the same box/sql instance
They regularly run ad-hoc reports from this database (which have undergone extensive optimisation), many of which may end up touching a lot of the data in the table. After looking at the report server stats it appears that the data fetch doesn't actually take that long, but a lot of data is returned and report processing takes a fair while: it can take up to 20-30 minutes to process some of the larger reports, which can have tens of thousands of pages (the data fetch in these cases is less than 10 seconds).
(Note: I realise that there is never really a need to run 25,000 pages off but the client insists and won't listen to reason...something about Excel spreadsheets *FACEPALM!*)
At the moment they are concerned about a couple of performance issues that crop up sporadically and the culprit may be the ad-hoc reporting.
We are looking at offloading the report processing anyway, so thought that this would be an ideal opportunity - but before doing so I'm wondering how much relief this will give the SQL server.
If I move the SSRS app and database onto another SQL host and remotely query the data (network conditions should be ideal as this is datacentre based), will I see any performance gains?
This is mainly based on guesswork at this stage but I see the following being the factors that could affect performance:
I/O for moving a shedload of rows from the query source to RS temp DB
CPU load when the report server is crunching all the data
In moving to another host I see these factors being reduced for the SQL server. The new server will be solely responsible for report processing (and should also be high spec), so hopefully there will be no contention when processing reports.
Do I sound like I am on the right track in my assumptions? Is there anything else that I may have missed which could adversely affect performance or improve performance?
Thanks in advance
You should look at transactional replication to send data from the main server to a database on the reporting server. Querying the tables directly over the network will only slow things down even more.
I've read that it's unwise to install SQL Server and IIS on the same machine, but I haven't seen any evidence for that. Has anybody tried this, and if so, what were the results? At what point is it necessary to separate them? Is any tuning necessary? I'm concerned specifically with IIS7 and SQL Server 2008.
If somebody can provide numbers showing when it makes more sense to go to two machines, that would be most helpful.
It is unwise to run SQL Server with any other product, including another instance of SQL Server. The reason for this recommendation is the nature of of how SQL Server uses the OS resources. SQL Server runs on a user mode memory management and processor scheduling infrastructure called SQLOS. SQL Server is designed to run at peak performance and assumes that is the only server on the OS. As such the SQL OS reserves all RAM on the machine for SQL process and creates a scheduler for each CPU core and allocates tasks for all schedulers to run, utilizing all CPU it can get, when it needs it. Because SQL reserves all memory, other processes that need memory will cause SQL to see memory pressure, and the response to memory pressure will evict pages from buffer pool and compiled plans from the plan cache. And since SQL is the only server that actually leverages the memory notification API (there are rumors that the next Exchange will too), SQL is the only process that actually shrinks to give room to other processes (like leaky buggy ASP pools). This behavior is also explained in BOL: Dynamic Memory Management.
A similar pattern happens with CPU scheduling where other processes steal CPU time from the SQL schedulers. On high end systems and on Opteron machines things get worse because SQL uses NUMA locality to full advantage, but no other processes are usually not aware of NUMA and, as much as the OS can try to preserve locality of allocations, they end up allocating all over the physical RAM and reduce the overall throughput of the system as the CPUs are idling on waiting for cross-numa boundary page access. There are other things to consider too like TLB and L2 miss increase due to other processes taking up CPU cycles.
So to sum up, you can run other servers with SQL Server, but is not recommended. If you must, then make sure you isolate the two server to your best ability. Use CPU affinity masks for both SQL and IIS/ASP to isolate the two on separate cores, configure SQL to reserve less RAM so that it leaves free memory for IIS/ASP, configure your app pools to recycle aggressively to prevent application pool growth.
Yes, it is possible and many do it.
It tends to be a question of security and/or performance.
Security is questioned as your attack surface is increased on a box that has both. Perhaps not an issue for you.
Performance is questioned as now your server is serving web and DB requests. Again, perhaps not an issue in your case.
Test vs. Production....
Many may feel fine in test environments but not production....
Again, your team's call. I like my test and production environments being as similar as possible if possible but that's my preference.
It's possible, yes.
A good idea for a production environment, no.
The problem that you're going to run in to is that a SQL Server database under substantial load is, more than likely, going to be doing heavy disk I/O and have a large memory footprint. That combination is going to tie up the machine, and you're going to see a performance hit in IIS as it tries to serve up the pages.
It's unwise in certain contexts... totally wise in others.
If your machine is underutilized and won't experience heavy loads, then there is an advantage to installing the database on the same machine, because you simply won't have to transfer anything across the network.
On the other hand, if one or both of IIS or the database will be under heavy load, they will likely start to interfere, and the performance gain of dedicated hardware for each will probably outstrip the loss of having to go over the network.
Don't forget the maintenance issue...you can't reboot/patch one without nuking the other. If they are on two boxes, you could give your users a better experience, than no response from the webserver if you are maintaining the SQL box.
Not highest on the list, but should be noted.
You certainly can. You will run into performance issues if, for example, you have large user base or if there are a lot of heavy query's being run against the DB. I have worked on several sites, usually hosted at 1and1, that run IIS and SQL Server (Express!) on the same box with thousands of users (hundreds concurrent) and millions of records in poorly designed tables, accessed via poorly written stored procedures and the user experience was certainly tolerable. It all comes down to how hard you plan on hitting the server.