How to set Azure SQL to rebuild indexes automatically? - sql-server

In on premise SQL databases, it is normal to have a maintenance plan for rebuilding the indexes once in a while, when it is not being used that much.
How can I set it up in Azure SQL DB?
P.S: I tried it before, but since I couldn't find any options for that, I thought maybe they are doing it automatically until I've read this post and tried:
SELECT
DB_NAME() AS DBName
,OBJECT_NAME(ps.object_id) AS TableName
,i.name AS IndexName
,ips.index_type_desc
,ips.avg_fragmentation_in_percent
FROM sys.dm_db_partition_stats ps
INNER JOIN sys.indexes i
ON ps.object_id = i.object_id
AND ps.index_id = i.index_id
CROSS APPLY sys.dm_db_index_physical_stats(DB_ID(), ps.object_id, ps.index_id, null, 'LIMITED') ips
ORDER BY ps.object_id, ps.index_id
And found out that I have indexes that need maintaining

Update: Note that the engineering team has published updated guidance to better codify some of the suggestions in this answer in a more "official" from Microsoft place as some customers asked for that. SQL Server/DB Index Guidance. Thanks, Conor
original answer:
I'll point out that most people don't need to consider rebuilding indexes in SQL Azure at all. Yes, B+ Tree indexes can become fragmented, and yes this can cause some space overhead and some CPU overhead compared to having perfectly tuned indexes. So, there are some scenarios where we do work with customers to rebuild indexes. (The primary scenario is when the customer may run out of space, currently, as disk space is somewhat limited in SQL Azure due to the current architecture). So, I will encourage you to step back and consider that using the SQL Server model for managing databases is not "wrong" but it may or may not be worth your effort.
(If you do end up needing to rebuild an index, you are welcome to use the models posted here by the other posters - they are generally fine models to script tasks. Note that SQL Azure Managed Instance also supports SQL Agent which you can also use to create jobs to script maintenance operations if you so choose).
Here are some details that may help you decide if you may be a candidate for index rebuilds:
The link you referenced is from a post in 2013. The architecture for SQL Azure was completely redone after that post. Specifically, the hardware architecture moved from a model that was based on local spinning disks to one based on local SSDs (in most cases). So, the guidance in the original post is out of date.
You can have cases in the current architecture where you can run out of space with a fragmented index. You have options to rebuild the index or to move to a larger reservation size for awhile (which will cost more money) that supports a larger disk space allocation. [Since the local SSD space on the machines is limited, reservation sizes are roughly linked to proportions of the machine. As we get newer hardware with larger/more drives, you have more scale-up options].
SSD fragmentation impact is relatively low compared to rotating disks since the cost of a random IO is not really any higher than a sequential one. The CPU overhead of walking a few more B+ Tree intermediate pages is modest. I've usually seen an overhead of perhaps 5-20% max in the average case (which may or may not justify regular rebuilds which have a much bigger workload impact when rebuilding)
If you are using query store (which is on by default in SQL Azure), you can evaluate whether a specific index rebuild helps your performance visibly or not. You can do this as a test to see if your workload improves before bothering to take the time to build and manage index rebuild operations yourself.
Please note that there is currently no intra-database resource governance within SQL Azure for user workloads. So, if you start an index rebuild, you may end up consuming lots of resources and impacting your main workload. You can try to time things to be done off-hours, of course, but for applications with lots of customers around the world this may not be possible.
Additionally, I will note that many customers have index rebuild jobs "because they want stats to be updated". It is not necessary to rebuild an index just to rebuild the stats. In recent SQL Server and SQL Azure, the algorithm for stats update was made more aggressive on larger tables and the model for how we estimate cardinality in cases where customers are querying recently inserted data (since the last stats update) have been changed in later compatibility levels. So, it is often the case that the customer doesn't even need to do any manual stats update at all.
Finally, I will note that the impact of stats being out of date was historically that you'd get plan choice regressions. For repeated queries, a lot of the impact of this was mitigated by the introduction of the automatic tuning feature over query store (which forces prior plans if it notices a large regression in query performance compared to the prior plan).
The official recommendation that I give customers is to not bother with index rebuilds unless they have a tier-1 app where they've demonstrated real need (benefits outweigh the costs) or where they are a SaaS ISV where they are trying to tune a workload over many databases/customers in elastic pools or in a multi-tenant database design so they can reduce their COGS or avoid running out of disk space (as mentioned earlier) on a very big database. In the largest customers we have on the platform, we sometimes see value in doing index operations manually with the customer, but we often do not need to have a regular job where we do this kind of operation "just in case". The intent from the SQL team is that you don't need to bother with this at all and you can just focus on your app instead. There are always things that we can add or improve into our automatic mechanisms, of course, so I completely allow for the possibility that an individual customer database may have a need for such actions. I've not seen any myself beyond the cases I mentioned, and even those are rarely an issue.
I hope this gives you some context to understand why this isn't being done in the platform yet - it just hasn't been an issue for the vast majority of customer databases we have today in our service compared to other pressing needs. We revisit the list of things we need to build each planning cycle, of course, and we do look at opportunities like this regularly.
Good luck - whatever your outcome here, I hope this helps you make the right choice.
Sincerely,
Conor Cunningham
Architect, SQL

You can use Azure Automation to schedule index maintenance tasks as explained here :Rebuilding SQL Database indexes using Azure Automation
Below are steps :
1) Provision an Automation Account if you don’t have any, by going to https://portal.azure.com and select New > Management > Automation Account
2) After creating the Automation Account, open the details and now click on Runbooks > Browse Gallery
Type on the search box the word “indexes” and the runbook “Indexes tables in an Azure database if they have a high fragmentation” appears:
4) Note that the author of the runbook is the SC Automation Product Team at Microsoft. Click on Import:
5) After importing the runbook, now let’s add the database credentials to the assets. Click on Assets > Credentials and then on “Add a credential…” button.
6) Set a Credential name (that will be used later on the runbook), the database user name and password:
7) Now click again on Runbooks and then select the “Update-SQLIndexRunbook” from the list, and click on the “Edit…” button. You will be able to see the PowerShell script that will be executed:
8) If you want to test the script, just click on the “Test Pane” button, and the test window opens. Introduce the required parameters and click on Start to execute the index rebuild. If any error occurs, the error is logged on the results window. Note that depending on the database and the other parameters, this can take a long time to complete:
9) Now go back to the editor, and click on the “Publish” button enable the runbook. If we click on “Start”, a window appears asking for the parameters. But as we want to schedule this task, we will click on the “Schedule” button instead:
10) Click on the Schedule link to create a new Schedule for the runbook. I have specified once a week, but that will depend on your workload and how your indexes increase their fragmentation over time. You will need to tweak the schedule based on your needs and by executing the initial queries between executions:
11) Now introduce the parameters and run settings:
NOTE: you can play with having different schedules with different settings, i.e. having a specific schedule for a specific table.
With that, you have finished. Remember to change the Logging settings as desired:

Azure Automation is good and pricing is also negligible..
Some other options you have are
1.Create a execute sql task and schedule it through sql agent .The execute sql task should contain the index rebuild code along with stats rebuild
2.You also can create a linked server to SQLAZURE and create a sql agent job.To create a linked server to azure, you can see this SO link:I need to add a linked server to a MS Azure SQL Server

As #TheGamiswar suggested, add a linked server, then create a stored procedure like this:
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE PROCEDURE [LinkedServerName].[RemoteDB].[dbo].[sp_RebuildReorganizIndexes]
AS
BEGIN
ALTER INDEX PK_MyTable ON MyTable REBUILD WITH (STATISTICS_NORECOMPUTE = ON, ONLINE=ON);
ALTER INDEX IX_MyTable ON MyTable REBUILD WITH (STATISTICS_NORECOMPUTE = ON, ONLINE=ON); --Nonclustered index
ALTER INDEX PK_MyTable ON MyTable REORGANIZE;
ALTER INDEX IX_MyTable ON MyTable REORGANIZE;
END
Then on your linked server use "SQL Server Agent" to create a new job and a schedule:
For details please see https://learn.microsoft.com/en-us/sql/ssms/agent/create-a-job?view=sql-server-2017

Related

Long loading time after creating Availability Groups and migrating in SQL

so I have this issue. Our client using MS SQL databases. Two months ago they migrated their databases to the SQL Enterprise 2019 from earlier version and Standard edition.
They major reason was to secure high availability through feature in MS SQL - Availability groups.
After that our application get really slowed. In the simply way to tell, customer startup an app select workspace and then its takes like 15 seconds to load data.
First step is just sending request to database to select data - no inserts, deletes or any high performance processes.
App is using and working with geographical and geometry data, every geo objects is saved in database as geometry data type. The first huge, major select is causing the slow issue.
When I was looking at activity mon under wait categories is only one thing suspicious to me and its type Other.
In database I dont see any high cost queries and availability group mode is set to synchronous.
If Im getting this right, the synchronous mode should not be the cause of this problem because this database is clearly for reading a data not as I mentioned modifying.
I made changes to some instance parameters and set Optimize for Ad hoc workloads to True and and threshold for parallelism from 5 to 20.
Other thing which I tried was create a new app source database and database which contains geo data inside of that SQL instance and didnt add them to availability groups.
From application we are using, for test causes, a connection to the one instance with new test databases.
Neither of this settings work. So guys if you have any idea or any experience with this please help me.
Here is a screen of top 10 waits from sys dmv.
1 - Stats recompute...
When you are going from a SQL version to a higher one, you must first change the compatibility level (to have some performance benefits) and then recompute all statistics in the database with a FULLSCAN. Why ? Because each version of SQL Server come with a new optimizer that have new operators, new algorithms and many improvements... To stick to this new version of the optimizer the method of computing statistics and the form of the results of these calculations, is rethought with each modification of the engine ... so much so that if we use the old statistics with a new engine, it is like taking the census of the population in 1930, to plan the construction of roads, schools and hospitals for the current actual population ....
2 - SQL Server Editions...
When upscaling SQL Server from Standard to Enterprise, you need to increase the "hardware" (even if it is a VM) because many of the features that runs under Enterprise version, and does not exists in Standard, needs some more computationnal resources. As an example, using the AUTO_UPDATE_STATISTICS_ASYNC will use automatically one more thread to the detriment of other processes... In comparison, using a Rolls Royce or a Hummer, instead of a VolksWagen is arguably more comfortable, faster ... but requires more oil and more expensive insurance!
3 - Synchronous AVG...
Synchronous AlwaysOn availability groups must have a very fast and faultless network .... If this is not the case, the replication of update requests can drag performance down, especially if you are in pessimistic lockdown (default mode).
4 - Transaction logs...
One common global lack of performances can be the latency to write the transaction log.
5 - Tempdb files...
Another current global lack of performances can be the latency to access tempdb files.
For those two file problems, use the Glenn Berry latency file query that will give you a indice... Good values are under 7 ms for reads and 15 ms for writes...
CONCLUSION
Many other factors can contribute to slow down you system. But without no more information, we cannot help you...

What are the best practices for auto index recommendations in SQL

I am reviewing a SQL Server 2008 R2 instance with 30+ databases with the goal of moving to SQL Server 2014. In reviewing this I found a SQL job that a previous employee implemented. The job utilizes a set of scripts from this article https://www.sqlservercentral.com/forums/topic/indexing-views-1, to automatically create and drop all recommended indexes every half hour 24/7. When this was implemented the databases were roughly 40gb, but since have grown to over 1TB as we are a highly transactional company. With one of the databases running our primary ERP/ordering system. From everything I understand about indexing, this seems like a terrible idea as it could be creating and dropping indexes on very large tables. Is this a good practice, am I missing something?
Found this in a related post
"How to use it? Run AutoIndex.sql to install the SPs and sql agent
job. Upon every 30 minutes, the sql agent job will run the auto create
index and auto drop index scripts to make recommendations. Same
recommendation will not be stored multiple times, instead we just bump
up the count and change the latest recommendation time. You can view
the recommendations using the simple commands in the
viewrecommendations.sql. Look for the recommendations in the
recommendation table that have high counts, which means they have been
repetitively recommended thus are more valuable. You can also look at
the initial recommendation time and the last recommendation time to
get a sense of the freshness and the time range this recommendation is
valid for. After you made a decision to implement a recommendation,
simply run execute_recommendation with the recommendation id and the
recommendation will be implemented automatically." Thank U Snehal
Link Here
According to that user the script you're proc you have in your system should just aggregate index recommendations over time and allow you too see what indexes are constantly being recommended.
I believe the important distinction here is SQL doesn't log how many times it suggest a particular index so you may get a suggested index based on a one off query, which probably isn't something you want to implement. Instead you run this for a period and see what's being hit frequently and create those indexes.

Automatic database indexing

I have a database which is used by a multi-tenant application. In this database workloads are dynamic and change continuously. Therefore I have to allocate a DA to continuously manage the database. But I thought to use an automated service for this task such as Azure SQL Database Advisor - Automatic index management (platform is not important - I am OK with using MS sql server or oracle or other RDBMS).
I want to know how these automated indexes are actually working.Can I replace database administrator with these automatic indexers. I read that whenever a query execution plan is generated it will find out all the useful indexes to execute that query. Then it uses the indexes which really exist and cache some data about indexes which don't exist. If an index data is cached again and again the sql adviser will show that as a recommended index. But I want to know can we relay on this, what about update and insert queries? If I have a table where records are frequently updated, these automated indexing systems will consider that?
Note that Index Advisor is only available in SQL Database (Azure).
In the background Index Advisor is a machine learning algorithm, a relatively simple and quite effective one. It will analyze your workload, see if you would benefit from indexes. If he thinks you would it will show you as a recommendation - if you turn automatic index creation/dropping on it will actually create the index. To understand better how it works take a look at Channel 9. Note that before you apply a recommendation you can have an estimated impact.
Now the algorithm can make mistakes, right? So once the recommendation is applied it can automatically be reverted based on its performance.
Also note that next to Index Advisor you can check the Query Performance Insights that will show the performance of you queries. So this can help your DBA diagnose other, non-index related problems.
But note that Index Advisor will not drop and create for you new indexes every hour, it takes for him a day or two. So if your database's workload is changing very fast then I am not sure any automatic management tool or DBA will react quickly enough for your workload.

Efficient way to delete records every 10 mins

Problem at hand
Need to delete some few thousand records every 10 minutes from a SQL Server database table.This is part of cleanup for older records.
Solutions under consideration
There's .Net Service running for some other functionality. Same service can be used with a timer to execute SQL delete command on db.
SQL server job
Trigger
Key consideration for providing solution
Ours is a web product which gets deployed at different client locations. we want minimal operational overhead as resources doing deployment are very limited technical skill and we also want to make sure that there's less to none configuration requirement for our Product.
Performance is very important, as it on live transactional database.
This sounds like exactly the sort of work that a SQL Server job was intended to provide; database maintenance.
A scheduled job can execute a basic T-SQL statement that will delete the records you don't want any more, on whatever schedule you want it to run on. The job creation can be scripted to be part of your standard deployment scripts, which should negate the deployment costs.
Additionally, by utilizing an established part of SQL Server, you capitalize on the knowledge of other database administrators that will understand SQL jobs and be able to manage them.
I would not use a trigger...and stick with SQL Server DTS or SSIS. Obviously you will need some kind of identifier so I would use a timestamp column with an index...if that's not required just fire off a TRUNCATE once nightly.
The efficiency of the delete comes from indexes, has nothing to do how the timer is triggered. It is very important that the 'old' records be easily identifiable by a range scan. If the DELETE has to scan the whole table to find these 'old' records, it will block all other activity. Usually in such cases the table is clustered by the datetime value first, and unique primary keys are delegated to a non-clustered index, if needed.
Now how to pop the timer, you really have three alternatives:
SQL Agent job
Conversation Timers
Application timer
SQL Agent job is the best option for 10 minute intervals. Only drawback is that it does not work on SQL Express deployments. If that is a concern, then conversation timers and activated procedures are a viable alternative.
Last option has the disadvantage that the application must be running for the timer to trigger deletion. If this is not a concern (ie. if the application is not running, it doesn't matter that the records are not deleted) then is OK. Note that ASP.Net applications are very bad host for such timers, because of the way IIS and ASP may choose to recycle and put to sleep app pools.

SQL Server Maintenance Suggestions?

I run an online photography community and it seems that the site draws to a crawl on database access, sometimes hitting timeouts.
I consider myself to be fairly compentent writing SQL queries and designing tables, but am by no means a DBA... hence the problem.
Some background:
My site and SQL server are running on a remote host. I update the ASP.NET code from Visual Studio and the SQL via SQL Server Mgmt. Studio Express. I do not have physical access to the server.
All my stored procs (I think I got them all) are wrapped in transactions.
The main table is only 9400 records at this time. I add 12 new records to this table nightly.
There is a view on this main table that brings together data from several other tables into a single view.
secondary tables are smaller records, but more of them. 70,000 in one, 115,000 in another. These are comments and ratings records for the items in #3.
Indexes are on the most needed fields. And I set them to Auto Recompute Statistics on the big tables.
When the site grinds to a halt, if I run code to clear the transaction log, update statistics, rebuild the main view, as well as rebuild the stored procedure to get the comments, the speed returns. I have to do this manually however.
Sadly, my users get frustrated at these issues and their participation dwindles.
So my question is... in a remote environment, what is the best way to setup and schedule a maintenance plan to keep my SQL db running at its peak???
My gut says you are doing something wrong. It sounds a bit like those stories you hear where some system cannot stay up unless you reboot the server nightly :-)
Something is wrong with your queries, the number of rows you have is almost always irrelevant to performance and your database is very small anyway. I'm not too familiar with SQL server, but I imagine it has some pretty sweet query analysis tools. I also imagine it has a way of logging slow queries.
I really sounds like you have a missing index. Sure you might think you've added the right indexes, but until you verify the are being used, it doesn't matter. Maybe you think you have the right ones, but your queries suggest otherwise.
First, figure out how to log your queries. Odds are very good you've got a killer in there doing some sequential scan that an index would fix.
Second, you might have a bunch of small queries that are killing it instead. For example, you might have some "User" object that hits the database every time you look up a username from a user_id. Look for spots where you are querying the database a hundred times and replace it with a cache--even if that "cache" is nothing more then a private variable that gets wiped at the end of a request.
Bottom line is, I really doubt it is something mis-configured in SQL Server. I mean, if you had to reboot your server every night because the system ground to a halt, would you blame the system or your code? Same deal here... learn the tools provided by SQL Server, I bet they are pretty slick :-)
That all said, once you accept you are doing something wrong, enjoy the process. Nothing, to me, is funner then optimizing slow database queries. It is simply amazing you can take a query with a 10 second runtime and turn it into one with a 50ms runtime with a single, well-placed index.
You do not need to set up your maintenance tasks as a maintenance plan.
Simply create a stored procedure that carries out the maintenance tasks you wish to perform, index rebuilds, statistics updates etc.
Then create a job that calls your stored procedure/s. The job can be configured to run on your desired schedule.
To create a job, use the procedure sp_add_job.
To create a schedule use the procedure sp_add_schedule.
I hope what I have detailed is clear and understandable but feel free to drop me a line if you need further assistance.
Cheers, John

Resources