Azure SQL GeoReplication - Queries on secondary DB are slower - sql-server

I've set up two SQL DBs on Azure with geo-replication. The primary is in Brazil and the secondary is in West Europe.
Similarly, I have two web apps running the same web API: a Brazilian web app that reads and writes on the Brazilian DB, and a European web app that reads on the European DB and writes to the Brazilian DB.
When I test response times on read-only queries with Postman from Europe, I first notice that on a first "cold" call the European web app is twice as fast as the Brazilian one. However, on the calls that immediately follow, response times on the Brazilian web app drop to 10% of the initial "cold" call, whereas response times on the European web app remain the same. I also notice that after a few minutes of inactivity, results are back to the "cold" case.
So:
1. Why do query response times drop in Brazil?
2. Whatever the answer to 1 is, why doesn't it happen in Europe?
3. Why doesn't the response time optimization occurring in 1 last after a few minutes of inactivity?
Note that both web apps and DB are created as copy/paste (except geo-replication) from each other in an Azure ARM json file.
Both web apps are alwaysOn.
Thank you.
UPDATE
Actually, there are several parts in play in what I see as an end user: the web apps and the DBs. I wrote this question thinking the issue was around the DBs and geo-replication. However, after trying Alberto's script (see below) I couldn't see any difference in wait times when querying Brazil or Europe, so the problem may be in the web apps. I don't know how to analyse or test that further.
UPDATE 2
This may (or may not) be related to Query Store. I asked a new, more specific question on that subject.
UPDATE 3
Queries on the secondary database are not slower. My question was based on false conclusions. I won't delete it, as others took time to answer it and I thank them.
I was comparing query response times through REST calls to a web API running EF queries on a SQL Server DB. As REST calls to the web API located in the region querying the DB replica are slower than REST calls to the same web API deployed in another region targeting the primary DB, I concluded the problem was on the DB side. However, when I run the queries in SSMS directly, bypassing the web API, I observe almost no difference in response times between the primary and the replica DB.
I still have a problem but it's not the one raised in that question.

On Azure SQL Database, your database's memory allocation may be dynamically reduced after some minutes of inactivity; in this respect Azure SQL differs from on-premises SQL Server. If you run a query two or three times, it then starts to execute faster again.
If you examine the query execution plan and its wait stats, you may find a wait named MEMORY_ALLOCATION_EXT for queries executed after the memory allocation has been shrunk by the Azure SQL Database service. Databases with a lot of activity and query execution may not see their memory allocation reduced. For more detail from me on this, please read this StackOverflow thread.
Also take into consideration that both databases should have the same service tier assigned.
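One way to confirm that both databases are on the same tier is to run the same check against each of them. The following is a minimal sketch; it assumes you are connected to the user database itself (not master) on each server, and the column aliases are only illustrative.

```sql
-- Run once on the primary and once on the geo-replicated secondary,
-- then compare the two result rows.
SELECT
    DB_NAME()                                         AS database_name,
    DATABASEPROPERTYEX(DB_NAME(), 'Edition')          AS edition,            -- for example Standard or Premium
    DATABASEPROPERTYEX(DB_NAME(), 'ServiceObjective') AS service_objective;  -- for example S3 or P1
```

If the edition or service objective differs between the two regions, a difference in response times would not be surprising on its own.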
Use the script below to measure query waits and see how the waits differ between the two regions.
DROP TABLE IF EXISTS #before;
SELECT [wait_type], [waiting_tasks_count], [wait_time_ms], [max_wait_time_ms],
[signal_wait_time_ms]
INTO #before
FROM sys.[dm_db_wait_stats];
-- Execute test query here
SELECT *
FROM [dbo].[YourTestQuery]
-- Finish test query
DROP TABLE IF EXISTS #after;
SELECT [wait_type], [waiting_tasks_count], [wait_time_ms], [max_wait_time_ms],
[signal_wait_time_ms]
INTO #after
FROM sys.[dm_db_wait_stats];
-- Show accumulated wait time
SELECT [a].[wait_type], ([a].[wait_time_ms] - [b].[wait_time_ms]) AS [wait_time]
FROM [#after] AS [a]
INNER JOIN [#before] AS [b] ON
[a].[wait_type] = [b].[wait_type]
ORDER BY ([a].[wait_time_ms] - [b].[wait_time_ms]) DESC;

Related

Long loading time after creating Availability Groups and migrating in SQL

I have this issue. Our client uses MS SQL databases. Two months ago they migrated their databases to SQL Server 2019 Enterprise from an earlier version on Standard edition.
Their main reason was to get high availability through the Availability Groups feature of MS SQL.
After that, our application became really slow. Put simply: the customer starts the app and selects a workspace, and then it takes about 15 seconds to load the data.
The first step just sends a request to the database to select data: no inserts, deletes or other heavy processing.
The app works with geographical and geometry data; every geo object is saved in the database as the geometry data type. The first large select is what causes the slowness.
When I looked at Activity Monitor, only one thing under the wait categories looked suspicious to me: the type Other.
In the database I don't see any high-cost queries, and the availability group mode is set to synchronous.
If I understand this right, synchronous mode should not be the cause of this problem, because this database is used purely for reading data, not, as I mentioned, for modifying it.
I changed some instance parameters: I set Optimize for Ad hoc Workloads to True and the cost threshold for parallelism from 5 to 20.
Another thing I tried was creating a new app source database and a new database containing the geo data inside that SQL instance, without adding them to availability groups.
For testing purposes, the application connects to the one instance with the new test databases.
Neither of these changes worked. So if you have any idea or any experience with this, please help me.
Here is a screen of top 10 waits from sys dmv.
1 - Stats recompute...
When you go from one SQL Server version to a higher one, you must first change the compatibility level (to get some of the performance benefits) and then recompute all statistics in the database with a FULLSCAN. Why? Because each version of SQL Server comes with a new optimizer that has new operators, new algorithms and many improvements. To fit the new optimizer, the way statistics are computed, and the form of the results, is rethought with each modification of the engine. So much so that using old statistics with a new engine is like taking the 1930 population census to plan the construction of roads, schools and hospitals for today's population.
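The two steps above could be sketched roughly as follows. This assumes the target is SQL Server 2019 (compatibility level 150), that STRING_AGG is available (SQL Server 2017 or later), and that a full scan of every user table is affordable in a maintenance window. It is an illustration, not a tuned maintenance job.

```sql
-- Step 1: move the current database to the SQL Server 2019 compatibility level.
ALTER DATABASE CURRENT SET COMPATIBILITY_LEVEL = 150;
GO

-- Step 2: build one UPDATE STATISTICS ... WITH FULLSCAN statement per user table.
DECLARE @sql nvarchar(max);

SELECT @sql = STRING_AGG(
           CAST(N'UPDATE STATISTICS '
                + QUOTENAME(SCHEMA_NAME(t.schema_id)) + N'.' + QUOTENAME(t.name)
                + N' WITH FULLSCAN;' AS nvarchar(max)),
           NCHAR(10))
FROM sys.tables AS t
WHERE t.is_ms_shipped = 0;   -- skip system-shipped tables

-- Run the generated batch (nothing to do if the database has no user tables).
IF @sql IS NOT NULL
    EXEC sys.sp_executesql @sql;
```

On large databases the full scan can take a long time and generate heavy I/O, so it is usually worth running it outside business hours.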
2 - SQL Server Editions...
When upgrading SQL Server from Standard to Enterprise, you need to increase the "hardware" (even if it is a VM), because many of the features that run under Enterprise edition, and do not exist in Standard, need more computational resources. As an example, using AUTO_UPDATE_STATISTICS_ASYNC will automatically use one more thread, to the detriment of other processes. By comparison, driving a Rolls-Royce or a Hummer instead of a Volkswagen is arguably more comfortable and faster, but it requires more oil and more expensive insurance!
3 - Synchronous AG...
Synchronous Always On availability groups need a very fast and reliable network. If this is not the case, the replication of update requests can drag performance down, especially if you are using pessimistic locking (the default mode).
4 - Transaction logs...
One common cause of globally poor performance is latency when writing to the transaction log.
5 - Tempdb files...
Another common cause of globally poor performance is latency when accessing the tempdb files.
For those two file problems, use Glenn Berry's file latency query, which will give you an indication. Good values are under 7 ms for reads and 15 ms for writes.
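For readers who do not have that query at hand, the following is a simplified sketch in the same spirit. It is not Glenn Berry's actual script; it only divides the cumulative stall times by the I/O counts reported by sys.dm_io_virtual_file_stats, and the column aliases are illustrative.

```sql
-- Average read/write latency per database file since the instance last started.
SELECT
    DB_NAME(vfs.database_id)                            AS database_name,
    mf.physical_name,
    mf.type_desc,                                       -- ROWS (data) or LOG
    vfs.io_stall_read_ms  / NULLIF(vfs.num_of_reads, 0)  AS avg_read_latency_ms,
    vfs.io_stall_write_ms / NULLIF(vfs.num_of_writes, 0) AS avg_write_latency_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN sys.master_files AS mf
    ON  mf.database_id = vfs.database_id
    AND mf.file_id     = vfs.file_id
ORDER BY avg_write_latency_ms DESC;
```

The rows to look at first are the LOG files of the affected databases and the tempdb files, since those are the two cases named above.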
CONCLUSION
Many other factors can contribute to slowing down your system. But without more information, we cannot help you.

Sudden drop in SQL Azure query performance after moving web app to Azure

What could explain this big drop in performance in an Azure SQL DB after moving the app from a hosted VPS to an Azure App Service?
Here's a typical chart from Query Store's High Variation chart over the past two weeks. The red arrow indicates when I moved the production app from another hosting provider to an Azure App. Prior to moving the app, I experienced zero timeouts. Now, using the same Azure SQL DB, timeouts are triggering frequently for longish queries (but by no means too arduous).
The only other change I made was to change the user principal in the connection string. This user only has SELECT, INSERT, UPDATE, DELETE and EXECUTE permissions.
My theories are:
- something to do with networking between the app and the db. Resiliency? But I have a SQL exec plan specified
- something wrong with the user I set up?
- bad plan regression (I have now enabled auto FORCE PLAN tuning)
- a problem caused by Hangfire running on two servers simultaneously (now mitigated by moving HF tables to a new DB)
- something is triggering some kind of throttling that I cannot figure out.
Here is a chart of timeouts from Log Analytics:
All help appreciated. Note: this site has had almost identical traffic over the past 30 days.
In fact, take a look at this from the SQL DB metrics over the past week:
And here is some Wait info - last 6 hours:
Blue = PARALLELISM
Orange = BUFFERIO

How can I find out why azure SQL Database is restarting/resetting periodically?

Over the past week or two, we've seen four cases where our Azure SQL Database DTU graph ends up looking like this:
That is, it seems to "restart" (note that the graph consistently shows 0 DTUs before the spike, which was definitely not the case because we have constant traffic on this server). This seems to indicate that the DTU measurements restarted. The large spike, followed by the subsequent decaying and stabilizing DTU values seems to indicate to us that the database is "warming up" (presumably doing things like populating caches and organizing indexes perhaps?). The traffic to the web app that accesses this database showed nothing abnormal over the same time period, so we don't have any reason to think that this is a result of "high load".
The "Activity Log" item in Azure doesn't show any information. Looking at the "Resource Health" of our database, however, we saw the following:
Note the A problem with your SQL database has been resolved. The timestamp however doesn't exactly correspond to the time of the spike above (the graph is showing UTC+1 time, and presumably the resource-health timestamp is in UTC, so it's about 1.15hrs difference).
Clicking on "View History" gives us all such events for the past couple of weeks:
In each case the database is "available" again within the refresh-granularity (2 minutes), again suggesting restarts. Interestingly, the restarts are around 4 days apart in each case.
Of course I expect and understand that the database be moved around and restarted from time to time. Our web app is Asp.Net Core 2.0 and uses connection resiliency, so we don't have any failing requests.
That said, considering that this has been happening relatively frequently in the last few weeks, I'm of course wondering if this is something that needs action from our side. We did, for example, upgrade to Entity Framework Core 2.0 around 5 weeks ago, so I'm slightly concerned that that might have something to do with it.
My questions:
1. Is there any way to know for sure that the database server restarted? Is this information stored in the database itself anywhere, or perhaps on the master database?
2. Is there any way to know the reason for such restarts, and whether or not it's "our fault" or simply a result of hosting-environment changes? Does the Azure team make such information publicly available anywhere?
The database is on S3 Standard level (100 DTUs) and is hosted in South-East Asia. It's around 3.5GB in size.
Please enable Query Store to identify the queries and statements involved in the spikes you see on the DTU consumption graph.
ALTER DATABASE [DB1] SET QUERY_STORE = ON;
Then use a query like the one below to identify long-running queries and the tables they involve. The table names may give you an idea of what is creating those spikes.
SELECT TOP 10 rs.avg_duration, qt.query_sql_text, q.query_id,
qt.query_text_id, p.plan_id, GETUTCDATE() AS CurrentUTCTime,
rs.last_execution_time
FROM sys.query_store_query_text AS qt
JOIN sys.query_store_query AS q
ON qt.query_text_id = q.query_text_id
JOIN sys.query_store_plan AS p
ON q.query_id = p.query_id
JOIN sys.query_store_runtime_stats AS rs
ON p.plan_id = rs.plan_id
WHERE rs.last_execution_time > DATEADD(hour, -1, GETUTCDATE())
ORDER BY rs.avg_duration DESC;
As for the downtimes logged in Resource Health, they seem to be related to maintenance tasks because they occur every 4 days, but I will report this to the SQL Azure team and try to get feedback.

Access local front-end connected to Azure SQL Server back-end very slow

I've been using Access to rapid-prototype a DB. Now I'd like to do a small group online test so I split the DB and placed the back-end on Azure SQL Server, then re-linked. It's incredibly slow and I've been researching solutions for days without positive results. My local environment is Win10, Office2016 64bit and internet connection is fast and stable.
I have tried different ODBC drivers, including the SQL Native Client v11.
I've disabled auto-tuning level on the NIC.
I've recreated all queries from access on the server.
I've made sure that Tracing in ODBC is off.
But I enabled tracing temporarily to see what was happening. If I opened the front-end, logged in (Small user table), and did something on the first form (Add 1 record with 3 sub-records...and really...nothing fancy or heavy at all and this only takes 1 minute) then closed the DB, I see that the Tracing log file is 1.5MB.
So I created an empty Access file and an ODBC link to only the User table (12 columns, 20 records), and then monitored the tracing log file again. Opening access doesn't add anything to the log file, but opening this one, linked table made the log file grow to 255kb. Opening this table in access took 5 seconds.
Access sent about 800 requests to the server just to open this one small table. If I paste all the User table data into a text file, it's only 2kb. So why is it so slow?
Any ideas on this, and specifically other suggestions to get this working faster?
Kind regards,
Well, the reason why using Azure is slower than running Access connected to a local instance of SQL Server is because, well, slow is slow!
I mean, if you are going to travel 30 miles, you have a choice to walk or to take a car.
So here is the question you need to know:
Why is walking slower than driving a car?
Answer: Because you are travelling at a slower speed!
So why is using Azure slower than using an instance of SQL Server running on your local computer or local network?
Answer:
Because the connection speed to Azure is about 100 times slower!
Not taking the DIFFERENCE in connection speed into account is the issue here. It is a disservice to the reading public, who may conclude that such a setup (Access front end on a PC to an Azure instance of SQL Server) is not viable.
So the first issue here is to make a note of your connection speed to the back end database.
A typical office local area network has a speed of 100 Mbit/s, or today mostly 1 Gbit/s; even the el-cheapo routers you purchase at Best Buy are now rated at 1 gig (1000 Mbit/s).
However, typical high-speed internet is rated at about 5 or 10 Mbit/s. So that is 100 times slower. (Actually 1000 / 5 = 200 times slower!)
That means if something NOW takes 3 seconds on your office network with Access and SQL Server, then over a WAN (the internet) you need to multiply that time by the change in your connection speed (this is so simple, yet it seems to escape everyone!). So, if you are lucky, you might have a 5 Mbit/s speed rating for your internet. That means:
1000 / 5 = 200
You now take the 200 and multiply it by the existing delay of, say, 3 seconds, and you get 600 seconds (that is 10 minutes, if you are wondering!). So you are going from 3 seconds to 10 minutes!
This kind of comparison in speed would be like walking into a sports shop to purchase a rubber boat to cross the Atlantic. So not taking into account the change in internet speed and wondering why things are slow is the issue here.
You can most certainly use Access with Azure, but you have to realize two simple concepts:
- A test over a connection that is 50-200 times slower than your LAN is a test that is going to run 50 to 200 times slower! Failing to take into consideration the MASSIVE DIFFERENCE in connection speed between your LAN and a WAN is the simple issue here.
- Opening a form bound to a large table of data is going to cause performance issues.
I was sitting at the bus stop talking to a 90 year old granny lady. I asked her the following:
Have you ever used an instant teller?
She said, why yes, I use them all the time.
I then asked her: don't you think it would be bad to have the teller machine download everyone's accounts while you wait and THEN ask you for your account number?
The old lady stated, of course, that would be silly. I type in my account PIN and the machine ONLY downloads my account information; this is practical and obvious.
In other words, that old lady realised that downloading a bunch of data BEFORE the user even types in or does anything is a waste of bandwidth.
So you never want to launch a form bound to a table and THEN ask the user what record to work on. Why have Access download large numbers of records into a form and THEN ask the user or allow the user to navigate to the required record?
Even when using Google, it does not download the whole internet into your web browser page and you then go ctrl+f to search the contents of that web page.
The same concepts should be applied to Access applications. A design that asks for what to work on and then launches a form bound to a table with a "where" clause will thus fix this issue.
So if you have a form (and even a sub form) that displays a customer invoice, you would FIRST ASK FOR the invoice number, and then simply launch that form using a where clause that restricts the form to the ONE invoice!
Keep in mind that you can STILL use that invoice form bound to a table of 1 million rows, and ONLY THE ONE record will be pulled down the network connection if you use the where clause.
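In SQL terms, the where clause means the server is asked for a single row rather than the whole table. A hedged illustration follows; dbo.Invoices, its columns, and the value 12345 are hypothetical names that do not come from the original post.

```sql
-- Hypothetical table and columns, for illustration only.
DECLARE @InvoiceNumber int = 12345;  -- the value the user typed into the search form

-- Only the matching invoice crosses the network, however large the table is.
SELECT InvoiceNumber, CustomerID, InvoiceDate, TotalAmount
FROM dbo.Invoices
WHERE InvoiceNumber = @InvoiceNumber;
```

For this to stay cheap on the server as well, the filtered column would normally be indexed (for example as the primary key).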
So a typical internet connection has adequate speed to run a browser, and also has MORE than adequate bandwidth speed to pull down a few records. Access often gets a bad rap for poor performance, but that is ONLY DUE to Access developers IGNORING the obvious advice that downloading tons of things that you don’t yet need into a form will run slow.
So web based applications, or even desktop applications written in vb.net perform well with SQL Azure running in the cloud over that MUCH slower internet connection because those applications don’t launch forms bound to large datasets WITHOUT FIRST simply allowing the user to request what they need to see and view.
As for Access with SharePoint? That setup can be VERY fast, and in fact MUCH faster than SQL Azure, MySQL or any traditional database system, because when you use SharePoint tables with Access, Access automatically syncs a local copy of the data. This means your application will continue to run WITHOUT ANY internet connection. The instant the connection is restored, the data sync can resume.
This means that if you have a table with 15,000 rows and run a report on that data, the report can launch in an instant with a SharePoint back end, since a local copy of the data exists in the front end at ALL TIMES! So this setup is VERY well suited to an offline mode, or to cases where you have a poor and slow internet connection, since as noted you always have a local copy of the data. A sync occurs only when a record is changed, and that sync can occur independently of Access. So you change one record, and it starts syncing with SharePoint.
However, for larger data sets that have to be updated, SQL Server is far better, since you can execute a SQL update on 10,000 rows (as a pass-through query) and ZERO row data need transfer over the network. With SharePoint, the 10,000 rows WILL transfer over the network, since the local copy requires the rows to be updated. So that massive advantage of using SharePoint for the database back end does not exist for applications that have to update lots of rows or do row-update-heavy data processing.
So the key concepts and takeaways here:
- The high-speed internet connection you have is often 10-200 times slower than your typical cheap office (local) network. So a 2-second operation will now take 10-200 times longer.
- The Access application needs to be optimized to avoid things like loading too many records into a form. Building search forms etc. that FIRST ASK the user what they need to work on is a basic and simple requirement for all good developers, and that includes Access developers.
- Access with SharePoint can be the BEST option, and such a setup allows the application to run EVEN WHEN there is no internet connection at all. If table sizes are below, say, 10,000 rows, this setup can often be ideal. However, for applications that have to update lots of rows, and for data-processing-heavy applications, this setup is poor, since updates to any rows will cause data syncing to occur over the network. This setup is also the cheapest, since a single Office 365 account with SharePoint support for Access can be had for $6 per month, and that $6 account allows up to 500 free users, and those 500 users can even use their Gmail or other non-Microsoft accounts. And Access applications that do fit within the bounds of SharePoint tables tend to need far fewer changes and far less optimizing than when using SQL Server over the internet.
- With SQL Server, the use of views, pass-through queries and, in some cases, stored procedures allows updates and code to run WITHOUT using ANY bandwidth. So one can send a single update query to the server that updates 10,000 rows of data; the only network cost will be the "tiny" amount of bandwidth needed to send that SQL statement.
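The last point can be illustrated with a single set-based statement of the kind one might send as an Access pass-through query. This is only a sketch; dbo.Customers and its columns are hypothetical and do not come from the original post.

```sql
-- Hypothetical table and columns, for illustration only.
-- Sent as a pass-through query, this one statement runs entirely on the server:
-- the text of the statement goes up the wire, and no rows come back down.
UPDATE dbo.Customers
SET    Region  = N'West'
WHERE  Country = N'Canada';
```

The same change made row by row through a bound form or a linked-table recordset would instead send each affected row across the connection.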
So while bound forms can be used with SQL Azure running in the cloud, one needs to build software the way web or vb.net developers do: FIRST ask the user what account or customer to work on and THEN launch the UI to display that given data.
So in access, you build a search form say like this:
So at the end of the day, it is important to ignore posts here that suggest Access to SQL in the cloud is not viable. Access with proper designs will work rather well over typical internet connections to SQL Server running on Azure.
In fact, I have seen people use Access with SQL Server over a 56k modem!
One has to adopt sensible designs in which the data pulled for a given task is restricted; this is a hallmark of all good developers. The only issue is that Access does NOT enforce this approach, while most other developer tools don't let you hang yourself with things like forms bound to large tables! It is not that Access is slow; Access is slow when you make poor design decisions.
Access with SharePoint can be a real winner, especially with poor or spotty bandwidth: even when the connection is lost, the application will continue to run, and in 99% of cases it will run faster than the same application with a SQL back end. There is a BIG caveat here, since only certain types of applications work well with SharePoint tables. Explaining the why, how, and when such applications are better is beyond a simple post here, but one simply needs to be aware that SharePoint can be an incredible solution, though not for all applications, and in other cases SQL Server can and will be the better choice. Which choice is "better" can only be determined by a case-by-case evaluation of the type of application in question.
The problem is simply that Azure SQL Database is not very fast running with small DTUs (Database Transaction Units) compared to, say, an in-house instance of SQL Server hosted on even a moderate modern server.
I've checked it out too, and it requires extremely careful design of queries and filtering, far beyond what you can normally get away with, to obtain acceptable overall speed. On the other hand, this is a rewarding experience that will bring focus to potential bottlenecks you otherwise wouldn't encounter until it might be too late.
OK, so after almost a week of trying to get this to work (Access front-end to SQL Server back-end on Azure), I've come to the conclusion that it's not a viable solution.
I've tried SQL Server, and setup a Sharepoint 2016 server on Azure, which also failed.
What has worked is using a product from Bullzip called MS Access to MySQL to convert the Access tables, then adding a MySQL DB on the server and importing the file generated by Bullzip. The only thing to note here is that Bullzip doesn't like the newer Access formats (it wants an MDB file), so go to Access and create a new, empty file, making sure you set its file type to MDB. Then import your tables across and run Bullzip.
It's now working a hell of a lot faster than the SQL Server, but I am getting some write conflicts in Access, so I just need to go through the code and do whatever I need to so I can avoid those messages.
Using Access as a front end to Azure SQL tables is the worst solution. But sometimes you have to do it. I have a client who is adamant that she wants to keep her Access database. When she hired her very first employee, it became clear she needed SQL tables behind the scenes.
This was a bit of a nightmare. However, after redesigning some terrible table structures, creating views and many procs, I've been able to do it. I use local tables in some cases, and refill by pulling from a stored proc and inserting into the local table. I use linked tables for basic data edits, and do explicit save records almost constantly.
I also have a first-load module that opens all forms, goes to the last record, back to the first record, and then hides the form until needed. The load limps along for about 3
My only remaining issue is now that Azure will close connections after idle time of (I think) 30 or more minutes -- or maybe it's when the laptop sleeps? That kills the app and it has to be closed and re-opened.

Tracing slow user login-in sessions in SQL Server

Background:
We have a number of databases of similar size and identical schema. All of them have identical settings and are placed on the same instance. Everyone uses an application to access and query the databases. Within the application, all connection strings are identical (except login and password) for all databases. Many users experience significant slowness when logging into and querying one of our databases, but not the other ones.
Problem:
One of the databases gradually became slower and slower to access. Query execution time is also affected, but not as significantly as the time it takes for the user to log in. Now it takes around 50 seconds to login. For all other databases log-in time is only about 4-5 seconds.
Question:
I would like to compare normal log-in sessions on "healthy databases" to the log-in session on the problematic database. Could you please suggest a way to monitor what exactly happens within the log-in session? I know how to trace queries run against specific database, but I don't know what to look for to find what makes logging in slow. Would either profiler or extended events show such information? Is there any other way to analyse what happens during the time user waits to log in?
You can use SQL Server Profiler to trace every query sent to the database, with the ability to filter based on user name, database name, etc.
See https://msdn.microsoft.com/en-us/en-en/library/ms175047.aspx
I would take a look at the database's indexes and statistics, as these are the areas that can slow your database down if they are not well maintained.
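Since the question also asks about Extended Events, here is a minimal sketch of a session that records logins and completed statements for one database, so that a slow login can be compared with a normal one. The session name TraceSlowLogins and the database name ProblemDatabase are hypothetical placeholders, and the choice of events and actions is an assumption about what is useful here, not a definitive recipe.

```sql
-- Hypothetical session and database names; adjust before use.
CREATE EVENT SESSION [TraceSlowLogins] ON SERVER
ADD EVENT sqlserver.login (
    ACTION (sqlserver.client_app_name, sqlserver.client_hostname, sqlserver.username)
    WHERE (sqlserver.database_name = N'ProblemDatabase')
),
ADD EVENT sqlserver.sql_batch_completed (
    ACTION (sqlserver.client_app_name, sqlserver.client_hostname, sqlserver.username)
    WHERE (sqlserver.database_name = N'ProblemDatabase')
),
ADD EVENT sqlserver.rpc_completed (
    ACTION (sqlserver.client_app_name, sqlserver.client_hostname, sqlserver.username)
    WHERE (sqlserver.database_name = N'ProblemDatabase')
)
ADD TARGET package0.ring_buffer          -- in-memory target, no file path needed
WITH (STARTUP_STATE = OFF);
GO

-- Start capturing, reproduce one slow login from the application, then stop.
ALTER EVENT SESSION [TraceSlowLogins] ON SERVER STATE = START;
GO
ALTER EVENT SESSION [TraceSlowLogins] ON SERVER STATE = STOP;
GO
```

Running the same session with the filter pointed at a "healthy" database gives a baseline: the statements the application issues right after login, and their durations, can then be compared side by side. Drop the session with DROP EVENT SESSION once the comparison is done.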
