How to debug slowdown on a SQL Azure server? - sql-server

We've had a SQL Azure cloudapp/database in production for a long time and while its performance has been a little volatile, over the last few days it has suddenly dropped drastically. Our application is unresponsive because SQL queries and stored procedures that used to take 5-10 seconds are now taking 90 seconds or more.
What should I check, given that we already do regular index rebuilds/reorgs, clear down large tables when we're finished, etc.?
We're still on the "Web" service tier and are planning to move soon to the newer S2 perhaps but we need to tackle this issue.

1) How many active connections does your SQL Azure DB have during slow times? Things get weird once you get into the 150+ range on a shared plan.
If you have a ton of connections open, your app is probably not closing them properly somewhere.
2) Does your DB have any blocking queries? DBs with a lot of blocking (or deadlocking) queries can run much slower, because other queries have to wait for the locked resources.
3) You should really consider switching to a dedicated SQL Azure plan. It is very quick to do and no action is required on the app-dev side. http://azure.microsoft.com/blog/2014/07/08/azure-update-sql-database-easy-upgrade-to-new-service-tiers-performance-improvements-pitr-for-basic-and-automated-export-for-all-service-tiers/
4) If none of the above helps, contact support. This could be an issue on their end.
5) Once the immediate problems are resolved, consider active monitoring of your SQL Azure DBs.
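Checks 1 and 2 above can be approximated with the dynamic management views. This is only a rough sketch: it assumes your login is allowed to view database state, and which views are exposed may depend on the service tier and server version.

```sql
-- 1) Count user sessions currently connected, grouped by client program and host.
--    A large count for one program suggests connections are not being closed.
SELECT program_name,
       host_name,
       COUNT(*) AS session_count
FROM sys.dm_exec_sessions
WHERE is_user_process = 1
GROUP BY program_name, host_name
ORDER BY session_count DESC;

-- 2) List requests that are blocked right now, together with the session
--    that is blocking them and the text of the waiting statement.
SELECT r.session_id,
       r.blocking_session_id,
       r.wait_type,
       r.wait_time AS wait_time_ms,
       t.text      AS sql_text
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.blocking_session_id <> 0;
```

Run both during a slow period. An empty result from the second query suggests blocking is not the cause.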

http://www.developer.com/services/how-to-identify-performance-bottlenecks-on-azure-sql-database.html
You could also have a device in your network that is slowing down the performance. You might want to run some network tests to see if the problem is internal or external. For instance, someone might have changed some firewall or security settings on a rollout and messed it up a bit or a device might be ready to fail.

Related

Azure SQL connection timeouts (non-transient)

I know there are posts around Azure SQL connection timeouts, but have not found the following case.
I've been using Azure SQL (S3 plan). Normally the DTU is very low, and there are no timeouts when apps connect to this DB.
The problem starts when we run batch jobs against the DB, such as updating a certain column value for millions of rows. It may take hours to complete these batch jobs. During this period, the DTU value reaches the max and other apps fail with timeouts.
Are there guidelines on what should be done? Here are options I thought of.
1) Upgrade to a higher tier. This option likely works, but is not attractive as the DTU is usually very low.
2) Increase timeouts for the apps that connect to the DB. Not sure if this works, because the timeout would have to be very long.
3) Allocate a certain portion of DTU to the batch job (say 70%) and always keep some DTU left for others. That'd be ideal, but I don't think it's possible.
Any suggestion would be appreciated!
Increasing timeouts is rarely a true solution, and can often make things worse.
First see if the operations being performed by the batch job can be made more efficient, by reviewing the query execution plans for missing or insufficient indexes, inefficient query logic, opportunities for caching, etc.
You could add a configuration setting and self-throttling logic to your batch job that allows you to control how many operations it may perform in a given timeframe, and then use that to determine what works best in your situation.
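As a minimal sketch of such self-throttling, the update can be done in small batches with a pause between them. The table dbo.Orders, the Status column, the batch size and the delay are all hypothetical placeholders to tune for your workload.

```sql
DECLARE @BatchSize    int = 5000;   -- rows per batch (assumed value, tune it)
DECLARE @RowsAffected int = 1;

WHILE @RowsAffected > 0
BEGIN
    -- Update at most @BatchSize rows. Rows that were updated no longer
    -- match the WHERE clause, so the loop ends once nothing is left.
    UPDATE TOP (@BatchSize) dbo.Orders
    SET Status = 'Archived'
    WHERE Status = 'Closed';

    SET @RowsAffected = @@ROWCOUNT;

    -- Pause so other workloads get a share of the DTU (assumed 2 seconds).
    WAITFOR DELAY '00:00:02';
END;
```

Smaller batches also keep each transaction short, which reduces how long other sessions wait on locks.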
Maybe an easier option would be to just add a step to the beginning of your batch job that temporarily scales up the database to a higher pricing tier when it starts, and then scales it back down when it finishes.
ALTER DATABASE MyDatabaseName MODIFY (SERVICE_OBJECTIVE = 'P4')
You can upgrade to a higher tier before running the batch job and then downgrade back to S3 afterwards.
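A sketch of the full scale-up and scale-down cycle, using the same placeholder database name as above. As far as I know the tier change completes asynchronously after the statement returns, so the sketch checks the current objective before starting the job.

```sql
-- Scale up before the batch job.
ALTER DATABASE MyDatabaseName MODIFY (SERVICE_OBJECTIVE = 'P4');

-- Re-run this until it reports the new objective before starting the job.
SELECT DATABASEPROPERTYEX('MyDatabaseName', 'ServiceObjective') AS current_objective;

-- ... run the batch job here ...

-- Scale back down once the job has finished.
ALTER DATABASE MyDatabaseName MODIFY (SERVICE_OBJECTIVE = 'S3');
```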

SQL Server Table > MS Access Local Copy?

I'm looking for a little advice.
I have some SQL Server tables I need to move to local Access databases for some local production tasks - once per "job" setup, w/400 jobs this qtr, across a dozen users...
A little background:
I am currently using a DSN-less approach to avoid distribution issues
I can create temporary LINKS to the remote tables and run "make table" queries to populate the local tables, then drop the links. Works as expected.
Performance here in US is decent - 10-15 seconds for ~40K records. Our India teams are seeing >5-10 minutes for the same datasets. Their internet connection is decent, not great and a variable I cannot control.
I am wondering if MS Access is adding some overhead here that can be avoided by a more direct approach, i.e., letting the server do all/most of the heavy lifting instead of Access?
I've tinkered with various combinations, with no clear improvement or success:
Parameterized stored procedures from Access
SQL Passthru queries from Access
ADO vs DAO
Any suggestions, or an overall approach to suggest? How about moving data as XML?
Note: I have Access 7, 10, 13 users.
Thanks!
It's not entirely clear but if the MSAccess database performing the dump is local and the SQL Server database is remote, across the internet, you are bound to bump into the physical limitations of the connection.
ODBC drivers are not meant to be used for data access beyond a LAN; there is too much latency.
When Access queries data, it doesn't open a stream. It fetches a block, waits for the data to be downloaded, then requests another batch. This is OK on a LAN but quickly degrades over long distances: communication between the US and India probably has around 200ms of latency, which you can't do much about, and it adds up very quickly if the communication protocol is chatty. All this comes on top of the connection's bandwidth, which is very likely way below what you would get on a LAN.
The better solution would be to perform the dump locally and then transmit the resulting Access file after it has been compacted and maybe zipped (using 7z for instance for better compression). This would most likely result in very small files that would be easy to move around in a few seconds.
The process could easily be automated. The easiest is maybe to perform this dump automatically every day and make it available on an FTP server or an internal website, ready for download.
You can also make it available on demand, maybe through an app running on a server and made available through RemoteApp using RDP services on a Windows 2008 server, or simply through a website or a shell.
You could also have a simple Windows service on your SQL Server that listens for requests from a remote client installed on the local machines everywhere, processes the dump and sends it to the client, which would then unpack it and replace the previously downloaded database.
Plenty of solutions for this, even though they would probably require some amount of work to automate reliably.
One final note: if you automate the data dump from SQL Server to Access, avoid using Access in an automated way. It's hard to debug and quite easy to break. Use an export tool instead that doesn't rely on having Access installed.
Renaud and all, thanks for taking time to provide your responses. As you note, performance across the internet is the bottleneck. The fetching of blocks (vs a contiguous DL) of data is exactly what I was hoping to avoid via an alternate approach.
Our workflow is evolving to better leverage both sides of the clock, where User1 in the US completes their day's efforts in the local DB and then sends JUST their updates back to the server (based on timestamps). User2 in India also has a local copy of the same DB and grabs just the updated records off the server at the start of his day. So, pretty efficient for day-to-day stuff.
The primary issue is the initial DL of the local DB tables from the server (huge multi-year DB) for the current "job". It should happen just once at the start of the effort (~1 wk long process). This is the piece that takes 5-10 minutes for India to accomplish.
We currently do move the DB back and forth via FTP - DAILY. It is used as a SINGLE shared DB and is a bit LARGE due to temp tables. I was hoping my new timestamped-based push-pull of just the changes daily would have been an overall plus. Seems to be, but the initial DL hurdle remains.
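For illustration, the timestamp-based pull described above might look like the following when sent as a pass-through query, so the server does the filtering and only the changed rows cross the wire. The table dbo.JobRecords and its columns are hypothetical; the client would supply the job and the time of its last successful pull.

```sql
-- Hypothetical schema: dbo.JobRecords(RecordId, JobId, Payload, LastModified).
DECLARE @JobId    int      = 42;                     -- current job (example value)
DECLARE @LastSync datetime = '2015-01-01T00:00:00';  -- time of the last successful pull

SELECT RecordId, JobId, Payload, LastModified
FROM dbo.JobRecords
WHERE JobId = @JobId
  AND LastModified > @LastSync
ORDER BY LastModified;
```

This only helps the daily sync. The initial download of a job's rows still has to transfer everything once.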

Using Offline Indexing in SQL Server

I've written a .Net application which has an SQL Server 2008 R2 database with a relatively small number of tables, but some tables might hold some 100,000,000 records! To improve the performance of SELECTs, I've created the necessary indexes and it works well. But, as everyone knows, indexes need to be rebuilt when they are fragmented.
We have installed SQL Server 2008 R2 Express on one of the customer's PCs, plus my Winforms application. Three more PCs connect to this database over a regular LAN, and everything seems fine.
Now, the problem is that I want to rebuild indexes, for example every time a user starts using my program on ANY of the machines. Well, I can execute several ALTER INDEXes, but as stated in the MS docs, OFFLINE indexing will lock the tables for the duration of the rebuild. This means other users will lose access to the tables when a user starts the program! I know there is an ONLINE option, but it doesn't work in the Express edition of SQL Server.
In other environments with a real server running all the time, I would create an Agent Job which rebuilt indexes over night.
How can I solve this problem?
Without a normal 24/7 server running, it's difficult to do such maintenance automatically without disturbing users. I don't think putting that job at application startup is a good idea: it can run many times close together for no real reason, it slows down startup significantly if the tables are big, and it keeps everyone else out, as you say.
I would opt for 2 choices:
Set up a job on the "server" to do the rebuild on either SQL Server startup or computer startup. It will slow down the initialization of that PC when the user first powers it on, but once done, it should work OK, and most likely with similar results to the nightly job.
Add an option in the application to launch the reindexing job manually when the user wants to do it, warning that it will take some time and that nobody else can use the application during the process. While this provides maximum flexibility, it relies on the user doing so when they start noticing delays.
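With either choice, it may help to check first whether a rebuild is needed at all. The sketch below lists fragmented indexes in the current database; the 30% and 1000-page thresholds are assumptions, not fixed rules. The commented-out statement uses REORGANIZE, a different and lighter operation than the rebuild discussed above, which always runs online and so is not limited to the non-Express editions. The index and table names in it are hypothetical.

```sql
-- List indexes in the current database that look fragmented enough to matter.
SELECT OBJECT_NAME(ps.object_id)        AS table_name,
       i.name                           AS index_name,
       ps.avg_fragmentation_in_percent,
       ps.page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ps
JOIN sys.indexes AS i
  ON i.object_id = ps.object_id
 AND i.index_id  = ps.index_id
WHERE ps.index_id > 0                        -- skip heaps
  AND ps.page_count > 1000                   -- ignore small indexes (assumed threshold)
  AND ps.avg_fragmentation_in_percent > 30   -- assumed threshold
ORDER BY ps.avg_fragmentation_in_percent DESC;

-- Lighter alternative to a rebuild for an index found above (hypothetical names):
-- ALTER INDEX IX_Hypothetical ON dbo.HypotheticalTable REORGANIZE;
```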

Failover strategy for database application

I've got a writing and reading database application holding a local cache. In case of an application server fault a backup server shall start working.
The primary and backup applications can only run exclusively because of the local cache and a low isolation level on the database.
As far as my communication knowledge goes it is impossible to let both servers always figure out who is allowed to run exclusively.
Can I somehow solve this communication conflict by using the database as a third entity? I think this is quite a typical problem and there might not be a 100% safe method, but I would be happy to know how other people recommend solving such issues, or whether there is some best practice for this.
It's okay if both applications are down for 30 minutes or so, but there is not enough time to get people out of bed and let them figure out what the problem is.
Can you set up a third server which is monitoring both application servers for health? This server could then decide appropriately in case one of the servers appears to be gone: Instruct the hot standby to start processing.
If I get the picture right, your backup server constantly polls the primary server for data updates. It wouldn't be hard to check whether the poll fails, schedule it again 30s later up to 3 times, and on the third failure dynamically update the DNS entry for the database server to reflect the change in active server. Both Windows DNS and BIND accept dynamic updates, signed and unsigned.
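One possible way to use the database itself as the arbiter the question asks about, assuming the database is SQL Server, is a session-owned application lock. This is not taken from the answers above, only a sketch; the lock name is hypothetical. Each application server runs this on a dedicated connection at startup and keeps that connection open for as long as it acts as the active server.

```sql
DECLARE @result int;

-- Try to take an exclusive, session-owned application lock without waiting.
EXEC @result = sp_getapplock
     @Resource    = 'AppServerActiveRole',  -- hypothetical lock name
     @LockMode    = 'Exclusive',
     @LockOwner   = 'Session',
     @LockTimeout = 0;                      -- fail immediately if already held

-- 0 or 1 means the lock was granted; negative values mean it was not.
IF @result >= 0
    SELECT 'active' AS server_role;   -- this server may start processing
ELSE
    SELECT 'standby' AS server_role;  -- another server holds the lock; retry later
```

The lock is released when the holding session ends, so a crashed server frees it once the database notices the dropped connection. The active server must also stop working as soon as its lock connection breaks, otherwise both could run at once.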

How do the servers for Fogbugz handle load balancing?

I remember hearing Joel say he has 2 different locations where the servers are located, each location has 2 front end servers and 1 back end server.
If one of the hosting facilities goes down, how can he switch over to the other one? (Or is it just going to be a DNS change that will take 24-72 hours to propagate?).
How can a single SQL Server instance have so many databases on it? FB has a completely separate database per account. I can't see a single SQL Server instance having more than say 200-250 databases on it! And I'm sure they have more customers than that.
They talked about this in one of the Stack Overflow podcasts, but I can't find it in the transcripts.
1) Each of the two centers handles approximately 1/2 of the users. Fairly often (hourly, I think Joel said) they ship transaction logs to the other site. If site A goes down, they bring up the db backups on site B, and do the DNS switchover. It won't be instantaneous or automated, nor do they want it to be, because they'll be coming up with slightly stale data, and want to avoid that if it's at all possible to bring the broken site back up.
I'm not sure how they handle the DNS situation, but you can set the TTL on DNS records to mere seconds to limit caching, and have failover occur very quickly.
2) Why not? I'm not sure of the hard limit of databases per instance, but there's also nothing keeping you from running multiple instances of SQL Server on your box. I would imagine you're more limited by hardware than software. (You can also run Fogbugz with a MySQL database backend).
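For reference, shipping transaction logs as described in 1) boils down to statements like the following. This is a generic sketch with a hypothetical database name and file paths, not a description of how the FogBugz servers are actually configured. It assumes the standby copy was first restored from a full backup WITH NORECOVERY.

```sql
-- On site A (primary): back up the transaction log on a schedule, e.g. hourly.
BACKUP LOG CustomerDb
TO DISK = N'\\siteB\logship\CustomerDb_1300.trn';   -- hypothetical path

-- On site B (standby): apply each log backup, leaving the database restoring.
RESTORE LOG CustomerDb
FROM DISK = N'D:\logship\CustomerDb_1300.trn'       -- hypothetical path
WITH NORECOVERY;

-- At failover: bring the standby copy online.
RESTORE DATABASE CustomerDb WITH RECOVERY;
```

Anything written at site A after the last shipped log is missing from site B, which is the "slightly stale data" mentioned above.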
Got a few questions in here so I'll break these out:
If one of the hosting facilities goes down, how can he switch over to the other one?
There are several ways to do this, including database mirroring (new in SQL Server 2005), log shipping, and replication. I've recorded a podcast on SQL Server high availability and disaster recovery options at SQLServerPedia.
(Or is it just going to be a DNS change that will take 24-72 hours to propagate?)
As the other post mentioned, you can set your DNS time-to-live values very low, but the cooler method uses database mirroring. With mirroring, you can set both the primary and secondary server names in your connection string, and your application will automatically try the second server when the first one doesn't respond.
How can a single SQL Server instance have so many databases on it? FB has a completely separate database per account. I can't see a single SQL Server instance having more than say 200-250 databases on it! And I'm sure they have more customers than that.
The largest SQL Server I've worked with had over a thousand databases, and I've talked to a couple of other DBAs who have worked on systems with more than 2,000 databases on a server. It certainly makes management much more challenging, that's for sure.

Resources