Long loading time after creating Availability Groups and migrating in SQL Server - sql-server

So I have this issue. Our client uses MS SQL databases. Two months ago they migrated their databases to SQL Server 2019 Enterprise from an earlier version and the Standard edition.
The major reason was to secure high availability through the Availability Groups feature in MS SQL.
After that, our application got really slow. Put simply, the customer starts the app, selects a workspace, and then it takes about 15 seconds to load the data.
The first step just sends a request to the database to select data - no inserts, deletes, or other heavy operations.
The app works with geographical and geometry data; every geo object is saved in the database as the geometry data type. The first huge, major select is what causes the slowness.
When I looked at Activity Monitor, the only thing under wait categories that looked suspicious to me was the type Other.
In the database I don't see any high-cost queries, and the availability group mode is set to synchronous.
If I'm getting this right, synchronous mode should not be the cause of this problem, because this database is clearly used for reading data, not, as I mentioned, for modifying it.
I changed some instance parameters: I set Optimize for Ad hoc Workloads to True and raised the cost threshold for parallelism from 5 to 20.
Another thing I tried was creating a new app source database and a database containing the geo data on that SQL instance, without adding them to the availability groups.
For testing purposes, the application connects to that one instance with the new test databases.
Neither of these changes helped. So if you have any idea or any experience with this, please help me.
Here is a screenshot of the top 10 waits from the sys DMVs.

1 - Stats recompute...
When you go from one SQL Server version to a higher one, you must first change the compatibility level (to get the performance benefits) and then recompute all statistics in the database with a FULLSCAN. Why? Because each version of SQL Server comes with a new optimizer that has new operators, new algorithms, and many improvements. To keep up with this new version of the optimizer, the method of computing statistics and the form of the results of those calculations is rethought with each revision of the engine, so much so that using old statistics with a new engine is like taking the census of the population in 1930 to plan the construction of roads, schools, and hospitals for today's actual population.
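A minimal sketch of those two steps, assuming a database named AppDb and SQL Server 2019 (compatibility level 150); sp_MSforeachtable is an undocumented but widely used helper:
ALTER DATABASE AppDb SET COMPATIBILITY_LEVEL = 150;  -- match the new engine version
GO
USE AppDb;
GO
-- Recompute every table's statistics with a full scan after the upgrade.
EXEC sp_MSforeachtable 'UPDATE STATISTICS ? WITH FULLSCAN';
GO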
2 - SQL Server Editions...
When upscaling SQL Server from Standard to Enterprise, you need to increase the "hardware" (even if it is a VM), because many of the features that run under the Enterprise edition, and do not exist in Standard, need more computational resources. As an example, enabling AUTO_UPDATE_STATISTICS_ASYNC will automatically use one more thread to the detriment of other processes. By comparison, a Rolls Royce or a Hummer is arguably more comfortable and faster than a Volkswagen, but it needs more fuel and more expensive insurance!
3 - Synchronous AVG...
Synchronous AlwaysOn availability groups need a very fast and faultless network. If that is not the case, replicating the update requests can drag performance down, especially if you are using pessimistic locking (the default mode).
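If you want to check whether synchronous commit is actually the bottleneck, a rough sketch is to look at the HADR_SYNC_COMMIT wait and the per-replica queue sizes in the standard DMVs:
-- Cumulative time spent waiting for the synchronous secondary to harden the log.
SELECT wait_type, waiting_tasks_count, wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type = 'HADR_SYNC_COMMIT';
-- Log send and redo queues per replica; large values point at network or redo lag.
SELECT ag.name AS ag_name, ar.replica_server_name,
       drs.log_send_queue_size, drs.redo_queue_size
FROM sys.dm_hadr_database_replica_states AS drs
JOIN sys.availability_replicas AS ar ON ar.replica_id = drs.replica_id
JOIN sys.availability_groups AS ag ON ag.group_id = drs.group_id;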
4 - Transaction logs...
One common global performance problem is latency when writing to the transaction log.
5 - Tempdb files...
Another common global performance problem is latency when accessing the tempdb files.
For those two file problems, use Glenn Berry's file latency query, which will give you an indication. Good values are under 7 ms for reads and 15 ms for writes.
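A sketch in the spirit of Glenn Berry's query, reading sys.dm_io_virtual_file_stats to get the average latency per file since the last restart:
SELECT DB_NAME(vfs.database_id) AS database_name,
       mf.physical_name,
       CASE WHEN vfs.num_of_reads = 0 THEN 0
            ELSE vfs.io_stall_read_ms / vfs.num_of_reads END AS avg_read_latency_ms,
       CASE WHEN vfs.num_of_writes = 0 THEN 0
            ELSE vfs.io_stall_write_ms / vfs.num_of_writes END AS avg_write_latency_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN sys.master_files AS mf
  ON mf.database_id = vfs.database_id AND mf.file_id = vfs.file_id
ORDER BY avg_write_latency_ms DESC;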
CONCLUSION
Many other factors can contribute to slowing down your system, but without more information we cannot help you...

Related

SQL Server Table > MS Access Local Copy?

I'm looking for a little advice.
I have some SQL Server tables I need to move to local Access databases for some local production tasks - set up once per "job", with 400 jobs this quarter, across a dozen users...
A little background:
I am currently using a DSN-less approach to avoid distribution issues
I can create temporary LINKS to the remote tables and run "make table" queries to populate the local tables, then drop the remote tables. Works as expected.
Performance here in the US is decent - 10-15 seconds for ~40K records. Our India teams are seeing >5-10 minutes for the same datasets. Their internet connection is decent, not great, and a variable I cannot control.
I am wondering if MS Access is adding some overhead here that can be avoided by a more direct approach: i.e., letting the server do all/most of the heavy lifting rather than Access.
I've tinkered with various combinations, with no clear improvement or success:
Parameterized stored procedures from Access
SQL Passthru queries from Access
ADO vs DAO
Any suggestions, or an overall approach to suggest? How about moving data as XML?
Note: I have Access 2007, 2010, and 2013 users.
Thanks!
It's not entirely clear but if the MSAccess database performing the dump is local and the SQL Server database is remote, across the internet, you are bound to bump into the physical limitations of the connection.
ODBC drivers are not meant to be used for data access beyond a LAN, there is too much latency.
When Access queries data, it doesn't open a stream; it fetches a block, waits for the data to be downloaded, then requests another batch. This is OK on a LAN but quickly degrades over long distances, especially when you consider that communication between the US and India probably has around 200 ms of latency and there isn't much you can do about it. That adds up very quickly when the communication protocol is chatty, all on top of a connection whose bandwidth is very likely way below what you would get on a LAN.
The better solution would be to perform the dump locally and then transmit the resulting Access file after it has been compacted and maybe zipped (using 7z for instance for better compression). This would most likely result in very small files that would be easy to move around in a few seconds.
The process could easily be automated. The easiest approach is perhaps to perform this dump automatically every day and make it available on an FTP server or an internal website, ready for download.
You could also make it available on demand, maybe through an app running on a server and exposed via RemoteApp using RDP services on a Windows 2008 server, or simply through a website or a shell.
You could also have a simple Windows service on your SQL Server that listens for requests from a remote client installed on each local machine, processes the dump, and sends it to the client, which would then unpack it and replace the previously downloaded database.
Plenty of solutions for this, even though they would probably require some amount of work to automate reliably.
One final note: if you automate the data dump from SQL Server to Access, avoid using Access in an automated way. It's hard to debug and quite easy to break. Use an export tool instead that doesn't rely on having Access installed.
Renaud and all, thanks for taking the time to provide your responses. As you note, performance across the internet is the bottleneck. The fetching of blocks (versus a contiguous download) of data is exactly what I was hoping to avoid via an alternate approach.
Our workflow is evolving to better leverage both sides of the clock: User1 in the US completes their day's efforts in the local DB and then sends JUST their updates back to the server (based on timestamps). User2 in India, who also has a local copy of the same DB, grabs just the updated records off the server at the start of his day. So, pretty efficient for day-to-day stuff.
The primary issue is the initial download of the local DB tables from the server (a huge multi-year DB) for the current "job" - it should happen just once at the start of the effort (a ~1-week-long process). This is the piece that takes 5-10 minutes for India to accomplish.
We currently do move the DB back and forth via FTP - DAILY. It is used as a SINGLE shared DB and is a bit LARGE due to temp tables. I was hoping my new timestamp-based push-pull of just the changes daily would be an overall plus. It seems to be, but the initial download hurdle remains.
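For what it's worth, a minimal sketch of that timestamp-based pull; the table, column, and variable names here are placeholders rather than the real schema:
-- Pull only the rows that changed since the last successful sync.
DECLARE @LastSyncTime datetime = '2000-01-01';    -- stored client-side after each sync
SELECT *
FROM dbo.JobData                                  -- hypothetical table
WHERE LastModified > @LastSyncTime
ORDER BY LastModified;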

Why use AppFabric when denormalized SQL Server data seems to perform as well?

I am working on an eCommerce website designed to present a large number of SKUs. The SQL Server schema describing these products is normalized to the extent that, a few years ago, it became unreasonably slow to retrieve the necessary information to present to customers, so we changed our infrastructure such that we would bear the cost of loading the data for each product once and then store that data in an AppFabric cache (previously Velocity).
Over time, the complexity of requirements placed on our AppFabric infrastructure has grown (imagine that), forcing us to spend a considerable amount of time writing code for handling data retrieval from our cache, data updates including incremental updates, etc.
We happen to have much of our product data stored in a denormalized form in a side database, so for experimentation's sake I wrote a console app to randomly select one of our ~150K SKUs at a time, and then retrieve the record for that product from our denormalized table.
I was surprised to find that I was able to select these records in about the same average time that I could select a record from our AppFabric cache, about 2.5 ms average in both cases. I'm sure in both cases the data is coming from an in-memory cache of one sort or another, be it AppFabric or disk cache, and the 2.5 ms is bumping against a bare minimum amount of time for a network round trip.
This makes me think we might be better off just using denormalized data in SQL Server for our high load/high performance needs. The management tools for SQL Server-based data are so much better. All of the devs on our team are adept at using Management Studio, whereas with AppFabric we have one dev who can use PowerShell to a) Give us a count of records stored in the cache and b) dump the cache. Any other management functionality we have to create ourselves.
This makes me ask why anyone would want to use AppFabric at all. We are not concerned with cost, because the cost of the development effort we have to apply to an AppFabric-related solution vastly outweighs even the cost of SQL Server licensing.
Thank you for whatever feedback you can provide to help our team decide the best direction to move forward.
Deciding to use a caching mechanism should be a carefully thought-out process -- and it isn't always the right choice. However, the primary reason for using caching over a durable persistence model is to manage an extremely high transaction load.
With AppFabric Cache I can set up a distributed set of servers to work off of one logical repository, with built-in load balancing. So, unlike Microsoft SQL Server, which has no way of providing clustered instances for the purpose of load balancing, if I'm reading and writing 50 to 100 million times a day the cache is a more viable solution for sharing those resources. Those writes can then be queued to the durable persistence model over time, ensuring that there are no real peaks in usage because the load is spread across both the caching fabric and the durable store.
Using AppFabric rather than a dedicated cache-aside database containing a denormalized schema also gives you fine-grained control over cache key expiry, eviction, and tuned region policies. You would have to roll this yourself if you used SQL Server. I also agree with @mperrenoud03's comments about load balancing and high transaction rate support. Also, if you use a good ORM tool like NHibernate, it can be configured to use AppFabric (or another distributed cache platform) as a second-level cache. We are leveraging this in our project and getting good results.
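To make the "roll it yourself" point concrete, here is a hypothetical sketch of a hand-rolled cache-aside table in SQL Server; the table, columns, and key format are invented for illustration, and expiry and eviction become your problem:
-- Hypothetical cache table: with SQL Server you own expiry and eviction yourself.
CREATE TABLE dbo.ProductCache
(
    CacheKey  nvarchar(200) NOT NULL PRIMARY KEY,
    Payload   nvarchar(max) NOT NULL,   -- e.g. serialized product data
    ExpiresAt datetime2     NOT NULL
);
-- Read path: only unexpired entries count as cache hits.
DECLARE @CacheKey nvarchar(200) = N'sku:12345';
SELECT Payload
FROM dbo.ProductCache
WHERE CacheKey = @CacheKey AND ExpiresAt > SYSUTCDATETIME();
-- Eviction: something has to run this periodically; AppFabric handles it for you.
DELETE FROM dbo.ProductCache WHERE ExpiresAt <= SYSUTCDATETIME();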

SQL server 2005 replication to many slave servers - hardware replication or change the strategy

We have a 500 GB database that performs about 10,000 writes per minute.
This database has a requirement for real-time reporting. To service this need we have 10 reporting databases hanging off the main server.
The 10 reporting databases are all fed from the 1 master database using transactional replication.
The issue is that the server and replication is starting to fail with PAGEIOLATCH_SH errors - these seem to be caused by the master database being overworked. We are upgrading the server to a quad proc / quad core machine.
As this database and the need for reporting will only grow (20% growth per month), I wanted to know whether we should start looking at hardware (or another third-party application) to manage the replication (and if so, what should we use), or whether we should change the topology from the master replicating to each of the reporting databases to the master replicating to reporting server 1, reporting server 1 replicating to reporting server 2, and so on.
Ideally the solution will cover us up to a 1.5 TB database with 100,000 writes per minute.
Any help greatly appreciated
One common model is to have your main database replicate to 1 other node, then have that other node deal with replicating the data out from there. It takes the load off your main server and also has the benefit that if, heaven forbid, your reporting system's replication does max out it won't affect your live database at all.
I haven't gone much further than a handful of replicated hosts, but if you add enough nodes that your distribution node can't replicate it all, it's probably sensible to expand the hierarchy so that your distributor is actually replicated to other distributors, which then replicate to the nodes you report from.
How many databases you can have replicated off a single node will depend on how up-to-date your reporting data needs to be (e.g. whether it's fine to replicate only once a day or whether you need it up to the second) and how much data you're replicating at a time. It might be worth some experimentation to find out exactly how many nodes one distributor could power if it didn't have the overhead of actually running your main services.
Depending on what you're inserting, a load of 100,000 writes/min is pretty light for SQL Server. In my book, I show an example that generates 40,000 writes/sec (2.4M/min) on a machine with simple hardware. So one approach might be to see what you can do to improve the write performance of your primary DB, using techniques such as batch updates, multiple writes per transaction, table valued parameters, optimized disk configuration for your log drive, etc.
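As a sketch of one of those techniques - table-valued parameters, available from SQL Server 2008 onward - with the type, procedure, and table names invented for illustration:
-- Hypothetical table type matching the rows batched up in the application.
CREATE TYPE dbo.EventRowList AS TABLE
(
    EventID   int           NOT NULL,
    CreatedAt datetime      NOT NULL,
    Payload   nvarchar(400) NOT NULL
);
GO
CREATE PROCEDURE dbo.InsertEvents
    @Rows dbo.EventRowList READONLY
AS
BEGIN
    -- One round trip and one transaction for the whole batch instead of one per row.
    INSERT INTO dbo.Events (EventID, CreatedAt, Payload)
    SELECT EventID, CreatedAt, Payload FROM @Rows;
END;
GO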
If you've already done as much as you can on that front, the next question I have is what kind of queries are you doing that require 10 reporting servers? Seems unusual, even for pretty large sites. There may be a bunch you can do to optimize on that front, too, such as offloading aggregation queries to Analysis Services, or improving disk throughput. While you can, scaling-up is usually a better way to go than scaling-out.
I tend to view replication as a "solution of last resort." Once you've done as much optimization as you can, I would look into horizontal or vertical partitioning for your reporting requirements. One reason is that partitioning tends to result in better cache utilization, and therefore higher total throughput.
If you finally get to the point where you can't escape replication, then the hierarchical approach suggested by fyjham is definitely a reasonable one.
In case it helps, I cover most of these issues in depth in my book:
Ultra-Fast ASP.NET.
Check that your publisher and distributor's transaction log files don't have too many VLFs (Virtual Log Files) as detailed here (step 8):
http://www.sqlskills.com/BLOGS/KIMBERLY/post/8-Steps-to-better-Transaction-Log-throughput.aspx
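A quick sketch of how to check the VLF count on that era of SQL Server (each row returned by DBCC LOGINFO is one VLF):
-- Run inside the publisher or distribution database and count the rows returned.
DBCC LOGINFO;
-- On SQL Server 2016 SP2 and later, a DMV gives the count directly:
-- SELECT COUNT(*) AS vlf_count FROM sys.dm_db_log_info(DB_ID());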
If your distribution database is co-located with your publisher database, consider moving it to its own dedicated server.

Would it ever be wise to have a SQL server per web server?

I'm wondering if, under the circumstances that
You get lots more reads than writes
Your SQL server of choice is cheap/free and offers a fast mirroring/replication service
Your database isn't insanely large
rather than having a separate SQL Server machine, it would be better to have an instance of SQL Server on each web server getting instant updates from the master. This way there would be no network latency when doing all the read queries, but there would be a per-box performance hit as the SQL instance has to execute there. Would this be better overall for performance? Are there any other pros/cons that might come up?
Your SQL Server should always be on a different box to the webserver, of that there is no question.
How many DB servers and webservers you have, and how they mirror (or otherwise) is up to how you scale your application.
You have SQL Server on a different machine because it needs (and deserves) a lot of RAM.
It's quite a common architectural pattern to have read-only replicas of a database. We accept some degree of staleness in them; perhaps they are even only updated once a day.
The general rule is that multiple copies introduce complexity in terms of operations and management and tend to introduce the possibility of data inconsistency - almost inevitably the copies will not be perfectly in step (or the cost of making them so will be too high).
An example: what happens if your replication processing breaks a bit, so that some, but not all, copies become stale? Now your users start to see radically different views of the world. How much might that matter to you? If it's a site with low-value data (e.g. celebrity sightings in London suburbs) then perhaps that's fine. If it's on-hand inventory, and being out of date means that your customers can't place orders, then maybe you care rather more.
My advice: things that sound simple at a boxes-on-paper sort of level don't always work out that way when you're sitting in an operations room at 3 AM. Be very sure that you can easily operate your solution.
How would your SQL Server be cheap/free? I should have said the licensing costs for this setup would be crippling; at retail prices you're looking at $6000 per server. See also Jeff's comments about costs. Scale out the web servers by all means, but not your SQL Server until it's pretty much on its knees.
You might instead want to think about a distributed cache like Velocity or NCache.
Either way, run your site first with one SQL server and see how it copes with the load, then think about mirroring/replication across servers, otherwise you're just optimising prematurely. Measure first!
An immediate con is that there is no distributed lock co-ordinator in SQL Server so you can get merge conflicts as updates can change the same row on two different servers at the same time.
Depending on the size of the database and the disks in the web servers, you will find your network latency is smaller than the disk latency you will start suffering as the web server disks will not usually be as performant as the disk array you give to the database. If you wanted that kind of performance, you would be buying it per web server.
Replication performance is not without latency either; the distribution of the transactions isn't 'free', and careful maintenance of the transaction log would have to be planned to ensure you do not get log fragmentation (too many VLFs within the transaction log), which kills replication performance.

Transaction level, nolock/readpast and concurrency

We have a system that concurrently inserts a large amount of data from multiple stations while also exposing a data-querying interface. The schema looks something like this (sorry about the poor formatting):
[SyncTable]
SyncID
StationID
MeasuringTime
[DataTypeTable]
TypeID
TypeName
[DataTable]
SyncID
TypeID
DataColumns...
Data insertion is done in a "Synchronization" and goes like this (we only insert data into the system, we never update):
INSERT INTO SyncTable(StationID, MeasuringTime) VALUES (X,Y); SELECT @@IDENTITY
INSERT INTO DataTable(SyncID, TypeID, DataColumns) VALUES
(SyncIDJustInserted, InMemoryCachedTypeID, Data)
... lots (500) similar inserts into DataTable ...
And queries go like this (for a given station, measuring time and data type):
SELECT SyncID FROM SyncTable WHERE StationID = @StationID
AND MeasuringTime = @MeasuringTime
SELECT DataColumns FROM DataTable WHERE SyncID = @SyncIDJustSelected
AND DataTypeID = @TypeID
My question is how can we combine the transaction level on the inserts and NOLOCK/READPAST hints on the queries so that:
We maximize the concurrency in our system while favoring the inserts (we need to store a lot of data, something as high as 2000+ records a second)
Queries only return data from "committed" synchronizations (we don't want a result set with a half-inserted synchronization or a synchronization with some entries skipped due to lock-skipping)
We don't care if the "newest" data is included in the query; we care more about consistency and responsiveness than about "live", up-to-date data
These may be very conflicting goals and may require a high transaction isolation level, but I am interested in all tricks and optimizations to achieve high responsiveness on both inserts and selects. I'll be happy to elaborate if more details are needed to flesh out more tweaks and tricks.
UPDATE: Just adding a bit more information for future replies. We are running SQL Server 2005 (probably 2008 within six months) on a SAN with 5+ TB of storage initially. I'm not sure what kind of RAID the SAN is set up with or precisely how many disks we have available.
If you are running SQL Server 2005 or above, look into implementing snapshot isolation. You will not be able to get consistent results with NOLOCK.
Solving this on SQL 2000 is much harder.
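A minimal sketch of enabling snapshot isolation, reusing the question's placeholder parameters (the database name is made up, and switching READ_COMMITTED_SNAPSHOT on needs a moment with no other active connections):
-- Let readers use row versions instead of shared locks.
ALTER DATABASE StationDb SET ALLOW_SNAPSHOT_ISOLATION ON;
ALTER DATABASE StationDb SET READ_COMMITTED_SNAPSHOT ON;
-- A reader then sees only committed synchronizations, without blocking the inserts
-- and without the skipped or half-read rows you get from NOLOCK or READPAST.
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRAN;
SELECT SyncID FROM SyncTable
WHERE StationID = @StationID AND MeasuringTime = @MeasuringTime;
COMMIT;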
This is a great scenario for SQL Server 2005/2008 Enterprise's partitioning feature. You can create a partition for each StationID, and each StationID's data can go into its own filegroup (if you want; it may not be necessary depending on your load).
This buys you some advantages with concurrency:
If you partition by StationID, then users can run select queries for StationIDs that aren't currently loading, and they won't run into any concurrency issues at all
If you partition by StationID, then multiple stations can insert data simultaneously without concurrency issues (as long as they're on different filegroups)
If you partition by SyncID range, then you can put the older data on slower storage.
If you partition by SyncID range, AND if your ranges are small enough (meaning not a range with thousands of SyncIDs), then you can do loads at the same time your users are querying without running into concurrency issues
The scenario you're describing has a lot in common with data warehouse nightly loads. Microsoft did a technical reference project called Project Real that you might find interesting. They published it as a standard, and you can read through the design docs and the implementation code in order to see how they pulled off really fast loads:
http://www.microsoft.com/technet/prodtechnol/sql/2005/projreal.mspx
Partitioning is even better in SQL Server 2008, especially around concurrency. It's still not a silver bullet - it requires manual design and maintenance by a skilled DBA. It's not a set-it-and-forget-it feature, and it does require Enterprise Edition, which costs more than Standard Edition. I love it, though - I've used it several times and it's solved specific problems for me.
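As a rough sketch of the SyncID-range flavour of this (boundary values, object names, and the single-filegroup mapping are made up; a real design needs sizing by a DBA and Enterprise Edition):
-- Hypothetical partition function and scheme splitting the data by SyncID range.
CREATE PARTITION FUNCTION pf_SyncRange (int)
AS RANGE RIGHT FOR VALUES (100000, 200000, 300000);
CREATE PARTITION SCHEME ps_SyncRange
AS PARTITION pf_SyncRange ALL TO ([PRIMARY]);   -- or map each range to its own filegroup
-- The data table is then created on the scheme, partitioned by SyncID.
CREATE TABLE dbo.DataTablePartitioned
(
    SyncID int NOT NULL,
    TypeID int NOT NULL,
    DataColumns nvarchar(max) NULL              -- stand-in for the real data columns
) ON ps_SyncRange (SyncID);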
What type of disk system will you be using? If you have a large striped RAID array, writes should perform well. If you can estimate your required reads and writes per second, you can plug those numbers into a formula and see if your disk subsystem will keep up. Maybe you have no control over hardware...
Wouldn't you wrap the inserts in a transaction, which would make them unavailable to the reads until the insert is finished?
This should follow if your hardware is configured correctly and you're paying attention to your SQL coding - which it seems you are.
Look into SQLIO.exe and SQL Stress tools:
SQLIOStress.exe
SQLIOStress.exe simulates various patterns of SQL Server 2000 I/O behavior to ensure rudimentary I/O safety.
The SQLIOStress utility can be downloaded from the Microsoft Web site. See the following article.
• How to Use the SQLIOStress Utility to Stress a Disk Subsystem such as SQL Server
http://support.microsoft.com/default.aspx?scid=kb;en-us;231619
Important The download contains a complete white paper with extended details about the utility.
SQLIO.exe
SQLIO.exe is a SQL Server 2000 I/O utility used to establish basic benchmark testing results.
The SQLIO utility can be downloaded from the Microsoft Web site. See the following:
• SQLIO Performance Testing Tool (SQL Development) – Customer Available
http://download.microsoft.com/download/f/3/f/f3f92f8b-b24e-4c2e-9e86-d66df1f6f83b/SQLIO.msi
