What are the best practices for auto index recommendations in SQL - sql-server

I am reviewing a SQL Server 2008 R2 instance with 30+ databases with the goal of moving to SQL Server 2014. During the review I found a SQL Agent job that a previous employee implemented. The job uses a set of scripts from this article, https://www.sqlservercentral.com/forums/topic/indexing-views-1, to automatically create and drop all recommended indexes every half hour, 24/7. When this was implemented the databases were roughly 40 GB, but they have since grown to over 1 TB, as we are a highly transactional company, and one of the databases runs our primary ERP/ordering system. From everything I understand about indexing, this seems like a terrible idea, as it could be creating and dropping indexes on very large tables. Is this a good practice, or am I missing something?

Found this in a related post
"How to use it? Run AutoIndex.sql to install the SPs and sql agent
job. Upon every 30 minutes, the sql agent job will run the auto create
index and auto drop index scripts to make recommendations. Same
recommendation will not be stored multiple times, instead we just bump
up the count and change the latest recommendation time. You can view
the recommendations using the simple commands in the
viewrecommendations.sql. Look for the recommendations in the
recommendation table that have high counts, which means they have been
repetitively recommended thus are more valuable. You can also look at
the initial recommendation time and the last recommendation time to
get a sense of the freshness and the time range this recommendation is
valid for. After you made a decision to implement a recommendation,
simply run execute_recommendation with the recommendation id and the
recommendation will be implemented automatically." Thank U Snehal
According to that user, the procedure you have in your system should just aggregate index recommendations over time and allow you to see which indexes are constantly being recommended.
I believe the important distinction here is that SQL Server doesn't log how many times it suggests a particular index, so you may get a suggested index based on a one-off query, which probably isn't something you want to implement. Instead, you run this for a period, see which suggestions come up frequently, and create those indexes.
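The aggregation described in the quote (store each recommendation once, then bump a count and the latest recommendation time on repeats) can be sketched roughly as follows. This is only an illustration, not the AutoIndex scripts themselves: it uses Python's built-in sqlite3 as a stand-in for SQL Server, and the table, column, and function names, as well as the sample suggestions, are assumptions made up for the example.

```python
import sqlite3

# Hypothetical stand-in for the recommendation table described in the quote.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE recommendation (
        id         INTEGER PRIMARY KEY,
        index_ddl  TEXT NOT NULL UNIQUE,
        hit_count  INTEGER NOT NULL,
        first_seen TEXT NOT NULL,
        last_seen  TEXT NOT NULL
    )
    """
)


def record_recommendation(conn, index_ddl, seen_at):
    """Store a suggestion once; on repeats, bump the count and last-seen time."""
    cur = conn.execute(
        "UPDATE recommendation SET hit_count = hit_count + 1, last_seen = ? "
        "WHERE index_ddl = ?",
        (seen_at, index_ddl),
    )
    if cur.rowcount == 0:  # not seen before, so insert it with a count of 1
        conn.execute(
            "INSERT INTO recommendation (index_ddl, hit_count, first_seen, last_seen) "
            "VALUES (?, 1, ?, ?)",
            (index_ddl, seen_at, seen_at),
        )


def frequent_recommendations(conn, min_count):
    """Return suggestions seen at least min_count times, most frequent first."""
    return conn.execute(
        "SELECT index_ddl, hit_count, first_seen, last_seen FROM recommendation "
        "WHERE hit_count >= ? ORDER BY hit_count DESC",
        (min_count,),
    ).fetchall()


# Made-up sample: one suggestion repeats every half hour, one appears only once.
samples = [
    ("CREATE INDEX ix_orders_customer ON orders (customer_id)", "2024-01-01T00:00"),
    ("CREATE INDEX ix_orders_customer ON orders (customer_id)", "2024-01-01T00:30"),
    ("CREATE INDEX ix_audit_note ON audit (note)", "2024-01-01T00:30"),
    ("CREATE INDEX ix_orders_customer ON orders (customer_id)", "2024-01-01T01:00"),
]
for ddl, seen_at in samples:
    record_recommendation(conn, ddl, seen_at)

print(frequent_recommendations(conn, 2))
```

With a threshold of 2, only the repeated suggestion is returned, which is the point of collecting recommendations over time rather than acting on each one as it appears.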

Related

Automatic database indexing

I have a database which is used by a multi-tenant application. In this database, workloads are dynamic and change continuously, so I have to allocate a DBA to continuously manage the database. I thought about using an automated service for this task instead, such as Azure SQL Database Advisor - Automatic index management (the platform is not important - I am OK with using MS SQL Server, Oracle, or another RDBMS).
I want to know how these automated indexers actually work. Can I replace a database administrator with them? I read that whenever a query execution plan is generated, the optimizer finds all the indexes that would be useful for executing that query. It then uses the indexes which really exist and caches some data about the indexes which don't exist. If the same missing index is cached again and again, the SQL advisor will show it as a recommended index. But I want to know whether we can rely on this. What about update and insert queries? If I have a table where records are frequently updated, will these automated indexing systems take that into account?
Note that Index Advisor is only available in SQL Database (Azure).
Behind the scenes, Index Advisor is a machine learning algorithm, a relatively simple and quite effective one. It analyzes your workload to see whether you would benefit from additional indexes. If it thinks you would, it shows that as a recommendation; if you turn automatic index creation/dropping on, it will actually create the index. To understand better how it works, take a look at Channel 9. Note that before you apply a recommendation you can see its estimated impact.
Now the algorithm can make mistakes, right? So once the recommendation is applied it can automatically be reverted based on its performance.
Also note that alongside Index Advisor you can check Query Performance Insight, which shows the performance of your queries. This can help your DBA diagnose other, non-index-related problems.
But note that Index Advisor will not drop and create new indexes for you every hour; it takes a day or two. So if your database's workload is changing very fast, I am not sure any automatic management tool or DBA will react quickly enough.

SQL Server 2014 - what is the most efficient way to move data from one table to another across databases same instance

I've found various opinions on this one, including a reference to an article that indicates that 'select into' operators can run in parallel from 2014+ and may or may not be more efficient than 'insert' as a result.
My use case is moving data from one table to another identical table across databases on the same SQL Server 2014 instance. The inserts will be roughly 5-10M rows, and I don't care about logging, just efficiency. I need a general recommendation, not a case-by-case analysis.
I realize that there are other factors (row length, etc) that might affect the answer, but I'm looking for the best place to start. I can always try other methods if necessary.
So what's the most efficient way to load a table in one database from an identical table in another?
Thanks in advance!
I would suggest an SSIS (SQL Server Integration Services) package that performs BULK operations, although 5M rows isn't significant in our current world.
Since "it depends" you'll have to help us understand what you're trying to save. INSERT INTO is nice only in that it is self contained and "easy." If this is a one time deal you might do it this way and stop thinking about it.
If however you're going to be shoveling 10M records daily - you might consider a scheduled SSIS script. There is overhead to maintaining the script but it is generally faster. If you are reloading data for testing purposes (reset to baseline) then the SSIS package is a good way to go.
You might also look at this article: https://dba.stackexchange.com/questions/99367/insert-into-table-select-from-table-vs-bulk-insert
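For the self-contained INSERT INTO ... SELECT option mentioned above, a minimal sketch might look like the following. It is only an illustration under stated assumptions: it uses Python's built-in sqlite3 with an attached in-memory database as a stand-in for a second database on the same instance, and the orders table and its columns are hypothetical. In SQL Server the same statement would reference the tables with three-part names such as TargetDb.dbo.Orders, and it says nothing about how this compares with an SSIS package for speed.

```python
import sqlite3

# One connection, two databases: "main" plays the source, "target_db" the destination.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS target_db")

# Identical (hypothetical) table definitions on both sides.
conn.execute("CREATE TABLE main.orders (id INTEGER PRIMARY KEY, amount REAL)")
conn.execute("CREATE TABLE target_db.orders (id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany(
    "INSERT INTO main.orders (id, amount) VALUES (?, ?)",
    [(i, i * 1.5) for i in range(1, 1001)],
)


def copy_all(conn):
    """Copy every row in one set-based statement and return the target row count."""
    with conn:  # commits on success, rolls back if the insert fails
        conn.execute(
            "INSERT INTO target_db.orders (id, amount) "
            "SELECT id, amount FROM main.orders"
        )
    return conn.execute("SELECT COUNT(*) FROM target_db.orders").fetchone()[0]


copied = copy_all(conn)
print(copied)
```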

SQL Server Using TableDiff on large tables

We have a process which uses SQL Server's amazing tablediff utility via:
Microsoft SQL Server\100\COM\Tablediff.exe
It's SQL Server 2008 R2. It connects from one instance to another identical instance. It works very well!
I have a situation where a job for a table that now has 10,767,594 records takes 2.5 hours to complete, and that table is the only one in the job. How can I improve this?
The process is triggered by a Windows Scheduled Task, this calls a .bat file, the .bat file contains the recommended code which has no issue. We have a couple of these in place and have had for some time. It's just the one job that deals with the big table from instance to instance that is taking too long.
I have realised that the source table does have an index but the destination table does not. I will put an index on this table, what else can I do?
Does table diff run better with indexes?
Is there a way to use table diff more effectively?
E.g. if I capture the lastProcessedID can I run tableDiff next time for all records where id > lastProcessedID?
Any advice would be great. Thank you in advance
EDITED:
MY SOLUTION: This was a very big surprise. As I mentioned above, the 10 million+ record table was identical on the source and destination except for 2 indexes that existed only on the source. After waiting until out of hours, since this is an internal production server, I applied the indexes to the destination. I then ran the tableDiff job, which had not been changed at all, and it completed in under 2 minutes. 2.5 hours down to 2 minutes!
I have accepted the answer below because it was very helpful. I did go down the Merge Replication path; however, after setting up replication and publishing, I found out that the production instance could not be a subscriber because the replication option was not ticked at install. As Jason says, it's a reasonable amount of research, learning, and setting up. Since I am not a DBA and had not looked at this before, it was a worthwhile experience.
The performance issue is because the remote queries pull every record from each place to do the comparison to generate the output. Indexes can help slightly to make the pull a little faster from each location, but it's not likely to be significant.
An incremental approach is definitely better. I don't believe tablediff directly supports comparing 2 queries. If it did, you could do something like EXCEPT or INTERSECT to do the comparisons. If you're trying to keep these databases in sync, why not consider other solutions, like log shipping, mirroring, SSIS, replication, clustering, etc.?
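The incremental idea (compare only rows with id > lastProcessedID, using EXCEPT) could be sketched outside tablediff roughly like this. It is a hedged illustration only: Python's built-in sqlite3 stands in for the two SQL Server instances, and the items table, its columns, and the function name are assumptions made up for the example.

```python
import sqlite3

# "main" plays the source instance, "dest" the destination instance.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS dest")
conn.execute("CREATE TABLE main.items (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE dest.items (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO main.items VALUES (?, ?)",
    [(1, "a"), (2, "b"), (3, "c"), (4, "d")],
)
conn.executemany(
    "INSERT INTO dest.items VALUES (?, ?)",
    [(1, "a"), (2, "b"), (3, "changed")],
)


def rows_missing_or_different(conn, last_processed_id):
    """Source rows above the watermark that have no identical row in the destination."""
    return conn.execute(
        """
        SELECT id, name FROM main.items WHERE id > ?
        EXCEPT
        SELECT id, name FROM dest.items WHERE id > ?
        ORDER BY id
        """,
        (last_processed_id, last_processed_id),
    ).fetchall()


print(rows_missing_or_different(conn, 2))
```

One design caveat: a watermark on id only catches rows added after the last run. Changes to rows at or below lastProcessedID would go unnoticed, so an occasional full comparison may still be needed.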

Database tuning advice

Some of you may not know about these features, in which case you will learn something from this post; others probably use them on a daily basis and can help me and other less experienced users optimize better.
I'm using SQL-Server 2005 Standard
I run SQL Server Profiler a lot. Each time, I look for ad hoc queries or stored procedures whose execution time exceeds my limits of under 100ms for complex queries and 30ms for short ones (the exact numbers are not important, they are just to give a sense of scale). After I find possibly problematic queries, I write them down so I can feed them to Database Engine Tuning Advisor, which runs them against the tables and tells me which indexes I need to build in order to improve performance. Each night I execute an index rebuild task from Maintenance Plans.
Now question time!!!
1. If Database Engine Tuning Advisor gives me 10 indexes to create and the improvement percentage is about 40%, should I use its advice or not? A better question is: what ratio of number of indexes to improvement percentage should I follow? Indexes take space and time to rebuild.
2. If I create about 5-7 indexes for each problematic query, I can end up with 500 indexes per DB. How many indexes can I build so that the DB will still perform normally? Are there any limitations?
3. Is there any other way to optimize (not redesign) your DB other than using my method or going stored procedure by stored procedure by hand and eye?
There's no right answer to this question as it depends heavily on your workload.
For workloads with a heavy ratio of reads (e.g. a data warehouse) it might make sense to create an index that would be positively counterproductive in an environment with a greater proportion of writes.
The DTA can help in this regard by assessing the impact on the overall workload, but you would need to capture a representative sample (not just the poorly performing queries). SQL Profiler is quite resource intensive, so to do this with the least possible impact on your server you would need to use a server-side SQL trace with appropriate filters to log only events related to the database of interest.
To identify the poorest performing queries in isolation: if you have at least the SQL2005 SP1 client tools installed, you should be able to right-click the database node in Management Studio and use the Reports -> Standard Reports menu to see the plans in the cache with the highest CPU/IO.
If you are interested in this area I recommend the book SQL Server 2008 Query Performance Tuning Distilled (most of it is applicable to SQL2005 as well).
You can get SQL Profiler to log to a table, so it will write the queries to a table you specify. If you can, leave it running for a few hours, or however long it takes to cover as many queries/events as possible.
Next, use Database Engine Tuning Advisor and get it to use this table of queries as its source input. You will find it looks at the whole pattern, and it will recommend that you create some indexes and remove others.
This is better than looking at queries one by one in isolation, although that still has its place.
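To illustrate why the whole pattern matters, here is a rough sketch of summarising a logged trace table by total cost instead of by individual slow executions. It is not how the Tuning Advisor works internally: Python's built-in sqlite3 stands in for SQL Server, and the trace_log table, its columns, and the sample rows are assumptions invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified shape of a table that a trace was logged to.
conn.execute("CREATE TABLE trace_log (text_data TEXT, duration_ms INTEGER)")

# Made-up workload: a fast query that runs constantly and a slow one that is rare.
fast_query = "SELECT * FROM comments WHERE photo_id = @id"
slow_query = "SELECT * FROM photos ORDER BY rating DESC"
conn.executemany(
    "INSERT INTO trace_log VALUES (?, ?)",
    [(fast_query, 20)] * 500 + [(slow_query, 900)] * 3,
)


def workload_summary(conn):
    """Group the trace by query text and rank by total time spent."""
    return conn.execute(
        """
        SELECT text_data, COUNT(*) AS executions, SUM(duration_ms) AS total_ms
        FROM trace_log
        GROUP BY text_data
        ORDER BY total_ms DESC
        """
    ).fetchall()


for row in workload_summary(conn):
    print(row)
```

In this made-up sample the 20ms query accounts for far more total time than the 900ms one, which is the kind of thing a query-by-query review with a fixed duration threshold would miss.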

SQL Server Maintenance Suggestions?

I run an online photography community and it seems that the site slows to a crawl on database access, sometimes hitting timeouts.
I consider myself to be fairly competent at writing SQL queries and designing tables, but I am by no means a DBA... hence the problem.
Some background:
1. My site and SQL server are running on a remote host. I update the ASP.NET code from Visual Studio and the SQL via SQL Server Mgmt. Studio Express. I do not have physical access to the server.
2. All my stored procs (I think I got them all) are wrapped in transactions.
3. The main table is only 9400 records at this time. I add 12 new records to this table nightly.
4. There is a view on this main table that brings together data from several other tables into a single view.
5. Secondary tables are smaller records, but more of them. 70,000 in one, 115,000 in another. These are comments and ratings records for the items in #3.
6. Indexes are on the most needed fields. And I set them to Auto Recompute Statistics on the big tables.
When the site grinds to a halt, if I run code to clear the transaction log, update statistics, rebuild the main view, as well as rebuild the stored procedure to get the comments, the speed returns. I have to do this manually however.
Sadly, my users get frustrated at these issues and their participation dwindles.
So my question is... in a remote environment, what is the best way to set up and schedule a maintenance plan to keep my SQL db running at its peak?
My gut says you are doing something wrong. It sounds a bit like those stories you hear where some system cannot stay up unless you reboot the server nightly :-)
Something is wrong with your queries. The number of rows you have is almost always irrelevant to performance, and your database is very small anyway. I'm not too familiar with SQL Server, but I imagine it has some pretty sweet query analysis tools. I also imagine it has a way of logging slow queries.
It really sounds like you have a missing index. Sure, you might think you've added the right indexes, but until you verify they are being used, it doesn't matter. Maybe you think you have the right ones, but your queries suggest otherwise.
First, figure out how to log your queries. Odds are very good you've got a killer in there doing some sequential scan that an index would fix.
Second, you might have a bunch of small queries that are killing it instead. For example, you might have some "User" object that hits the database every time you look up a username from a user_id. Look for spots where you are querying the database a hundred times and replace them with a cache, even if that "cache" is nothing more than a private variable that gets wiped at the end of a request.
Bottom line is, I really doubt it is something mis-configured in SQL Server. I mean, if you had to reboot your server every night because the system ground to a halt, would you blame the system or your code? Same deal here... learn the tools provided by SQL Server, I bet they are pretty slick :-)
That all said, once you accept you are doing something wrong, enjoy the process. Nothing, to me, is more fun than optimizing slow database queries. It is simply amazing that you can take a query with a 10 second runtime and turn it into one with a 50ms runtime with a single, well-placed index.
You do not need to set up your maintenance tasks as a maintenance plan.
Simply create a stored procedure that carries out the maintenance tasks you wish to perform: index rebuilds, statistics updates, etc.
Then create a job that calls your stored procedure/s. The job can be configured to run on your desired schedule.
To create a job, use the procedure sp_add_job.
To create a schedule use the procedure sp_add_schedule.
I hope what I have detailed is clear and understandable but feel free to drop me a line if you need further assistance.
Cheers, John

Resources