The given answer for the following question is C, to use the FORCESEEK hint. But to use a hint, we have to review the execution plan first, right? The question doesn't mention anything about the execution plan. The problem seems to be "readers block writers", so wouldn't SNAPSHOT ISOLATION help in this kind of situation?
Question:
A database application runs slowly because of a query against a frequently updated table that has a clustered index. The query returns four columns: three are in its WHERE clause and are contained in a non-clustered index, plus one additional column. What should you do to optimize the statement?
A. Add a HASH hint to the query
B. Add a LOOP hint to the query
C. Add a FORCESEEK hint to the query
D. Add an INCLUDE clause to the index
E. Add a FORCESCAN hint to the query
F. Add a columnstore index to cover the query
G. Enable the optimize for ad hoc workloads option.
H. Convert the unique clustered index to a columnstore index.
I. Include a SET FORCEPLAN ON statement before you run the query
J. Include a SET STATISTICS PROFILE ON statement before you run the query
K. Include a SET STATISTICS SHOWPLAN_XML ON statement before you run the query
L. Include a SET TRANSACTION ISOLATION LEVEL REPEATABLE READ statement before you run the query
M. Include a SET TRANSACTION ISOLATION LEVEL SNAPSHOT statement before you run the query
N. Include a SET TRANSACTION ISOLATION LEVEL SERIALIZABLE statement before you run the query
I will go for option D, because adding an INCLUDE clause to the existing non-clustered index makes it cover the query, so the extra column no longer forces a lookup.
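For illustration, option D would look something like this; the table, index, and column names are hypothetical, since the question doesn't give them:

-- Rebuild the existing non-clustered index so it also carries the fourth
-- column the query returns; the index then covers the query.
CREATE NONCLUSTERED INDEX IX_SomeTable_ColA_ColB_ColC
ON dbo.SomeTable (ColA, ColB, ColC)
INCLUDE (ColD)
WITH (DROP_EXISTING = ON);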
Related
I have 2 questions about SQL Server statistics, please help me. I am using SQL Server 2016.
My table TBL1 has only one column, COL1. When I use COL1 in joins with other tables, statistics are automatically created on COL1.
Next I create a non-clustered index on COL1 of TBL1, and another set of statistics is created on COL1. Now I have two sets of statistics on COL1.
Out of the above two sets of statistics, which does SQL Server use for further queries? I am assuming the statistics created by the non-clustered index will be used; am I right?
If I run the UPDATE STATISTICS TBL1 command, all the statistics for TBL1 are updated. In the MSDN documentation, I see that updating statistics causes queries to recompile. What do they mean by recompiling queries? The MSDN link is
https://learn.microsoft.com/en-us/sql/relational-databases/statistics/update-statistics?view=sql-server-ver15
Please explain.
If there's only 1 column in your table, there's no reason to have a non-clustered index. This creates a separate copy of that data. Just create the clustered index on that column.
Yes - Since your table only has the one column and an index was created on that column, it's almost certain that SQL Server will use that index whenever joining to that table and thus the statistics for that index will be used.
In this context, it means that the cached execution plan will be invalidated because its statistics are stale; the next time the query runs, the optimizer builds a fresh execution plan. In other words, SQL Server assumes there may now be a better set of steps to execute the query, and recompilation gives the optimizer a chance to assemble that better plan.
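You can see both statistics objects side by side through the catalog views; this sketch uses the TBL1/COL1 names from the question:

-- Auto-created statistics show up with _WA_Sys_... names and auto_created = 1;
-- the statistics backing the non-clustered index carry the index name.
SELECT s.name, s.auto_created, c.name AS column_name
FROM sys.stats AS s
JOIN sys.stats_columns AS sc
    ON s.object_id = sc.object_id AND s.stats_id = sc.stats_id
JOIN sys.columns AS c
    ON sc.object_id = c.object_id AND sc.column_id = c.column_id
WHERE s.object_id = OBJECT_ID('dbo.TBL1');

-- Refresh every statistics object on the table (this is what triggers the recompile).
UPDATE STATISTICS TBL1;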
Recommended Reading:
SQL Server Statistics
Understanding Execution Plans
Execution Plan Caching & Reuse
I was trying to improve two nearly identical queries with indexing. I saw a Table Scan in the first query and created an index to turn it into an Index Seek. When I looked at the second query, SQL Server recommended creating an index identical to the one I had just created, only with the column order changed; yet in the execution plan the engine was already doing an Index Seek on the table.
My question is:
If the execution plan already shows an index seek, should I create another index for this query, should I delete the index I created and replace it with the recommended one, or should I ignore SQL Server's advice?
One cannot answer without specific details. This is not a guessing game. Please post the exact table structure, table sizes, the indexes you added and the execution plans you have.
The fact that you added an index does not mean you added the best index. Nor does the fact that the execution plan uses an index seek imply the plan is optimal. A wrong index column order with only a partial predicate match can manifest as a 'seek' on the leading column(s) that is still suboptimal, and SQL Server will keep recommending a better index (i.e. exactly the symptoms you describe).
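To illustrate (with invented table and column names):

-- Index whose leading column matches only part of the predicate.
CREATE INDEX IX_Orders_Customer_Date ON dbo.Orders (CustomerID, OrderDate);

-- This query 'seeks' on CustomerID but must then look up and filter Status
-- row by row, so the plan shows an Index Seek while SQL Server still
-- recommends an index on (CustomerID, Status).
SELECT OrderID
FROM dbo.Orders
WHERE CustomerID = 42 AND Status = 'Open';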
Please read Understanding how SQL Server executes a query and How to analyse SQL Server performance.
I saw a Table Scan in the first query and created an index to make that an Index Seek
Not all seeks are good, and not all scans are bad.
Imagine you have a customers table with 10 customers, each having 1,000 orders, so the orders table holds 10,000 rows in total.
To get the top 1 order for each customer, a query that scans the orders table may be bad, since seeking will only cost you 10 seeks (see the sketch below).
You have to understand the data and see why the optimizer chose this plan, and how to steer it toward the plan you need. Itzik Ben-Gan gives amazing examples in this tutorial, and there is a video on SQL Bits.
Further, Craig Freedman covers seeks and scans and goes into detail on why the optimizer may choose a scan over a seek, due to random reads and data density.
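Here is a sketch of the per-customer pattern that wins in this example, assuming hypothetical Customers/Orders tables with an index on Orders (CustomerID, OrderDate):

-- CROSS APPLY runs the inner TOP (1) once per customer, which the optimizer
-- can satisfy with 10 index seeks instead of one 10,000-row scan.
SELECT c.CustomerID, o.OrderID, o.OrderDate
FROM dbo.Customers AS c
CROSS APPLY (
    SELECT TOP (1) OrderID, OrderDate
    FROM dbo.Orders
    WHERE CustomerID = c.CustomerID
    ORDER BY OrderDate DESC
) AS o;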
I detected a bookmark lookup deadlock in my application, and I can't decide which solution to use. None of them seem to be optimal.
Here are the queries:
UPDATE TEST SET DATA = @data WHERE CATEGORY = @cat
SELECT DATA, EXTRA_COLUMN FROM TEST WHERE CATEGORY = @cat
The problem is that there is a nonclustered index on CATEGORY and DATA that both queries use, touching it and the clustered index in opposite orders.
I.e.: the update locks the clustered index and updates the table, while the select locks the nonclustered index to perform the bookmark lookup, and then each wants the lock the other holds (deadlock).
Here are the options that I found:
1 - Create an index that includes all the columns from the select query.
- It worked, but I don't think it is a good idea; I would have to include every column that is used in any select query and can be updated anywhere in the application.
2 - Turn on the READ_COMMITTED_SNAPSHOT database option
3 - Add NOLOCK hint to the select
4 - Drop the index
5 - Force one of the transactions to block at an earlier point, before it has had the opportunity to acquire the lock that ends up blocking the other transaction. (Did not work)
I think the second option is the best choice, but I know it can create other issues. Shouldn't READ_COMMITTED_SNAPSHOT be the default isolation level in SQL Server?
It seems to me that there isn't any error in either the application or the database logic; it's one simple table with a nonclustered index and two queries that access the same table, one to update and the other to select.
Which is the best way to solve this problem? Is there any other solution?
I really expected SQL Server to be able to solve this by itself.
Snapshot isolation is a very robust solution for removing readers from the equation. Many RDBMSes have it always on, and it doesn't cause a lot of problems in practice. Prefer it to a brittle manual solution such as very specific indexes or hints.
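If you go that route, enabling it looks like this; the database name here is a placeholder, and the ALTER needs a moment with no other active sessions in the database:

-- Readers use row versions instead of shared locks under the default
-- READ COMMITTED level once this option is on.
ALTER DATABASE MyDb SET READ_COMMITTED_SNAPSHOT ON WITH ROLLBACK IMMEDIATE;

-- Alternatively, allow full snapshot isolation and opt in per session:
ALTER DATABASE MyDb SET ALLOW_SNAPSHOT_ISOLATION ON;
-- SET TRANSACTION ISOLATION LEVEL SNAPSHOT;  -- in the reading session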
Please try adding a nonclustered index on CATEGORY (including DATA and EXTRA_COLUMN) and adding the following hints to your queries:
CREATE NONCLUSTERED INDEX ix_Cat ON TEST (CATEGORY) INCLUDE (DATA, EXTRA_COLUMN);
UPDATE t SET t.DATA = @data FROM TEST AS t WITH (INDEX(ix_Cat)) WHERE t.CATEGORY = @cat;
SELECT DATA, EXTRA_COLUMN FROM TEST WITH (INDEX(ix_Cat)) WHERE CATEGORY = @cat;
This will ensure that both queries update/select the data in the same order, and will prevent them from deadlocking each other.
Coming from a MySQL background, I'm having difficulty understanding what's wrong with the following setup.
I have two tables, variable and dimension_instance.
Both have a primary key; variable furthermore has a foreign key to dimension_instance named dimension_instance_1_uid, on which an index was created.
When I execute a query like this
SELECT
    this_.name, dimensioni4_.name
FROM dbo.variable this_
INNER JOIN dbo.dimension_instance dimensioni4_
    -- even with an index hint nothing changes...
    -- WITH (INDEX(PK_dimension_instance))
    ON this_.dimension_instance_1_uid = dimensioni4_.UID
it seems as if the index isn't used for a seek, and a scan is executed instead, according to the execution plan. It shows two index scans instead of one index scan and one index seek.
I would expect an index seek because, in my case, only 10 of the 15k records in dimension_instance match entries in the variable table.
Can anybody shed some light in my misunderstanding of how MS SQL indexes work.
The Query Optimizer estimates which operation is cheaper given the data in the database and other variables: in your case it may have decided that the query will be less costly doing an index scan instead of a seek, which can be caused by low row counts.
it seems as if the index isn't used at all when I look at the execution plan.
Am I blind, are you blind or did you post the wrong execution plan?
The plan has two source tables and both use a Clustered Index Scan. That is 100% usage of an index for source-table access.
Now, why a scan and not a seek? Well, because you don't have any limitations (no WHERE clause), and that may be the fastest way. If the engine assumes both tables must be fully read anyway, why do a seek instead of a scan?
Can anybody shed some light in my misunderstanding of how MS SQL indexes work.
It's not the indexes you misunderstand, but the Hash Join. Hash Join just doesn't have a use for indexes on the join predicates (unlike nested loops join).
http://use-the-index-luke.com/sql/join/hash-join-partial-objects
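One way to see the difference yourself is to force a nested loops join with a join hint and compare the two plans; this sketch reuses the tables from the question:

-- With nested loops, the inner side can seek the index on the join key;
-- a hash join reads both inputs fully, so it scans instead.
SELECT this_.name, dimensioni4_.name
FROM dbo.variable AS this_
INNER LOOP JOIN dbo.dimension_instance AS dimensioni4_
    ON this_.dimension_instance_1_uid = dimensioni4_.UID;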
I'm wondering what the correct solution to the problem below is.
I have an UPDATE statement in T-SQL that needs to be run as a daily task. The procedure will update one bit column in one table. Rows affected is around 30,000.
A pseudo version of the T-SQL
UPDATE TABLE_NAME
SET BIT_FIELD = [dbo].[FUNCTION](TABLE_NAME.ID)
WHERE -- THIS ISN'T RELEVANT
The function that determines true or false basically runs a few checks and hits around three other tables. Currently the procedure takes about 30 minutes to run and updates 30,000 rows in our development environment. I was expecting this to double in production.
The problem I'm having is that intermittently the TABLE_NAME table locks up. If I run the update in batches of 1,000 it seems OK, but if I increase the batch size it appears to run fine and then eventually the table locks up. The only resolution is to cancel the query, which results in no rows being updated.
Please note that the procedure is not wrapped in a TRANSACTION.
If I ran each update as a separate UPDATE statement, would that fix it? What would be a good solution for updating quite a large number of records in a live environment?
Any help would be much appreciated.
Thanks!
In your case, the SQL Server optimizer has probably determined that a table lock is needed to perform the update (lock escalation kicks in once a single statement accumulates enough row locks). You should rework your query so that this table lock does not occur, or at least has a smaller impact on your users. In practical terms this means: (a) speed up your query and (b) make sure the table will not lock.
Personally I would consider the following:
1. Create clustered and non-clustered indexes on your tables in order to improve the performance of your query.
2. See if it is possible to replace the function with joins; they are typically a lot faster.
3. Break the update into multiple parts and perform them separately (see the sketch below). An 'or' in your 'where' clause is a good splitting point, but you can also consider looping through the table and performing the update in batches or one record at a time.
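A sketch of that batching approach, which keeps each statement under the lock-escalation threshold. The table, column, and function names come from the question's pseudocode; the BIT_FIELD IS NULL marker for not-yet-processed rows is a hypothetical stand-in for the original (omitted) WHERE clause:

DECLARE @rows INT = 1;
WHILE @rows > 0
BEGIN
    -- Each batch takes at most 1,000 row locks, well below the roughly
    -- 5,000-lock point at which SQL Server escalates to a table lock.
    UPDATE TOP (1000) t
    SET BIT_FIELD = [dbo].[FUNCTION](t.ID)
    FROM TABLE_NAME AS t
    WHERE t.BIT_FIELD IS NULL;  -- hypothetical marker for unprocessed rows

    SET @rows = @@ROWCOUNT;
END;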