I have a table that will grow to several million rows over some years. As part of my web application, I have to query the count on a subset of this table whenever a user accesses a particular page. Someone with an architecty hat has said that they have a performance concern with that. Assuming they are correct, will adding an indexed view address this issue?
SQL that I want to be fast:
SELECT COUNT(*) FROM [dbo].[Txxx] WHERE SomeName = 'ZZZZ'
OR
SELECT COUNT_BIG(*) FROM [dbo].[Txxx] WHERE SomeName = 'ZZZZ'
Table:
CREATE TABLE [dbo].[Txxx](
[Id] [uniqueidentifier] ROWGUIDCOL NOT NULL,
[SomeName] [nvarchar](50) NOT NULL,
[SomeGuid] [uniqueidentifier] NOT NULL,
CONSTRAINT [PK_Txxx] PRIMARY KEY CLUSTERED
(
[Id] ASC
)
)
View:
CREATE view dbo.Vxxx
WITH SCHEMABINDING
AS
SELECT SomeName, COUNT_BIG(*) AS UsedCount
FROM dbo.Txxx
GROUP BY SomeName
Index:
CREATE UNIQUE CLUSTERED INDEX [IV_COUNT] ON [dbo].[Vxxx]
(
[SomeName] ASC
)
Yes, but only Enterprise Edition will consider the indexed view during query compilation. To leverage the index on non-EE you need to select directly from the view and use the NOEXPAND hint:
NOEXPAND applies only to indexed views. An indexed view is a view with
a unique clustered index created on it. If a query contains references
to columns that are present both in an indexed view and base tables,
and the query optimizer determines that using the indexed view
provides the best method for executing the query, the query optimizer
uses the index on the view. This function is called indexed view
matching. Automatic use of indexed view by query optimizer is
supported only in specific editions of SQL Server.
Be warned that an indexed view like this will create write contention, because any update will lock an entire SomeName scope: only one transaction at a time will be able to insert, delete or update any row with SomeName = 'ZZZZ'.
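For reference, a minimal sketch of that NOEXPAND query against the view from the question (the literal 'ZZZZ' comes from the question; treat this as an illustration rather than a tuned solution):

```sql
-- Read the pre-aggregated count straight from the indexed view.
-- NOEXPAND makes the optimizer use the view's index instead of
-- expanding the view back into a query on dbo.Txxx.
SELECT UsedCount
FROM dbo.Vxxx WITH (NOEXPAND)
WHERE SomeName = N'ZZZZ';
```

If no row has that SomeName, this returns no rows rather than 0, so the caller may need to treat an empty result as a count of zero.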
Yes, that indexed view will definitely improve the performance of that particular query (assuming Enterprise Edition - Remus explains how to utilize it if you're not on Enterprise).
However, it isn't "free" - the index will need to be maintained for all DML operations to dbo.Txxx, will occupy space (though considerably less than the base table, in comparison), and will be subject to issues that also affect normal tables - such as fragmentation and (likely to a lesser extent in this case) page splits.
Related
Wrote a great simple function in SQL that apparently is not usable (or advisable) in my SELECT statement.
We have some logic for combining Company Name and Contact Name in our select, and I find it's repeated across several views. Being a programmer, of course the right thing is to encapsulate that functionality for reuse across all the views I've created. But alas, from my searching it does not appear possible or recommended, at least not with UDFs.
The question: Is there any way to select the return value of a method/function/chunk of reusable code where I pass it the value of columns for each row... Or do I truly have to copy/paste the logic into each select statement?
SELECT formatName(company, contact, ' - ') as Name FROM company join contacts...
I know I can do this on the client (eventually), but client changes are not in scope for this phase of the project.
I guess I typed more in this question than I would have by just cutting and pasting a CASE statement into each view, but reuse is ingrained in me of course. :)
A better performing and DRY method to accomplish this is with a computed column.
A computed column is a virtual column that is not physically stored in
the table, unless the column is marked PERSISTED. A computed column
expression can use data from other columns to calculate a value for
the column to which it belongs. You can specify an expression for a
computed column in SQL Server 2017 by using SQL Server Management
Studio or Transact-SQL.
You can make this column persisted as well
PERSISTED Specifies that the Database Engine will physically store the
computed values in the table, and update the values when any other
columns on which the computed column depends are updated. Marking a
computed column as PERSISTED allows an index to be created on a
computed column that is deterministic, but not precise. For more
information, see Indexes on Computed Columns. Any computed columns
used as partitioning columns of a partitioned table must be explicitly
marked PERSISTED. computed_column_expression must be deterministic
when PERSISTED is specified.
alter table company add FullName as (FirstName + '-' + LastName) persisted;
Then, you could just add this column to your SELECT and can even query against it, if it's persisted.
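A short sketch of what that could look like, assuming the FullName column from the ALTER TABLE above already exists and that FirstName and LastName are short enough for the concatenation to fit in an index key. The index name and the search value are made up for illustration:

```sql
-- Hypothetical index name; indexes the computed column so that
-- searches on the combined name can seek instead of scan.
CREATE NONCLUSTERED INDEX IX_company_FullName
ON company (FullName);

-- The computed column is selected and filtered like any other column.
SELECT FullName
FROM company
WHERE FullName = 'Jane-Doe';  -- hypothetical value
```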
What you can do is create a view that behaves like a table, meaning it would have the performance of a table, can have indexes added, etc. This view can have any of the columns of the underlying base table, plus you can add calculated columns, such as [Name]. This is accomplished by adding WITH SCHEMABINDING when creating the view and then creating a unique clustered index on it. This view can then be used in lieu of the base table in all of your queries.
Here is an example.
The underlying base table with data:
CREATE TABLE dbo.company (
companyid int IDENTITY(1,1) NOT NULL,
company varchar(50) NULL,
contact varchar(50) NULL,
CONSTRAINT PK_company PRIMARY KEY CLUSTERED (companyid ASC)
) ON FG1
The view containing WITH SCHEMABINDING:
CREATE view dbo.VW_company WITH SCHEMABINDING AS
SELECT companyid,
CASE WHEN RTRIM(ISNULL(company,'')) <> '' AND RTRIM(ISNULL(contact,'')) <> '' THEN company +' - '+ contact
WHEN RTRIM(ISNULL(company,'')) <> '' THEN company
WHEN RTRIM(ISNULL(contact,'')) <> '' THEN contact
ELSE '' END as [Name]
FROM dbo.company
This view can now be used everywhere the table is used, without a performance hit. Furthermore, the calculated column [Name] can actually have an index added to it! That's something you cannot do with a function.
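Note that the indexes are what make this work: a schema-bound view is only materialized once it gets a unique clustered index. A rough sketch, assuming the view above; the index names and the search value are made up, and on editions other than Enterprise the NOEXPAND hint is needed for the view's indexes to be used:

```sql
-- The first index on a view must be unique and clustered;
-- companyid is the base table's primary key, so it is unique here.
CREATE UNIQUE CLUSTERED INDEX IX_VW_company_companyid  -- hypothetical name
ON dbo.VW_company (companyid);

-- Once the view is materialized, the calculated column can be indexed too.
CREATE NONCLUSTERED INDEX IX_VW_company_Name  -- hypothetical name
ON dbo.VW_company ([Name]);

-- Query the view directly; NOEXPAND asks for the view's own indexes.
SELECT companyid, [Name]
FROM dbo.VW_company WITH (NOEXPAND)
WHERE [Name] = 'Acme - Jane Doe';  -- hypothetical value
```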
After running a query, the SQL Server 2014 Actual Query Plan shows a missing index like below:
CREATE NONCLUSTERED INDEX IX_1 ON Table1 (Column1) INCLUDE
(PK_Column,SomeOtherColumn)
The missing index suggestion says to include the primary key column in the index. The table has a clustered index on PK_Column.
I am confused and it seems that I don’t get the concept of a clustered index primary key right.
My assumption was: when a table has a clustered PK, all of the non-clustered indexes point to the PK value. Am I correct? If I am, why does the missing index suggestion ask me to include the PK column in the index?
Summary:
The advised index is not valid, but it doesn't make any difference. See the tests section below for details.
After researching for some time, I found an answer here, and the statement below explains the missing index feature convincingly:
they only look at a single query, or a single operation within a single query. They don't take into account what already exists or your other query patterns.
You still need a thinking human being to analyze the overall indexing strategy and make sure that you index structure is efficient and cohesive.
So, coming to your question: the advised index may be valid, but it should not be taken for granted. The advised index is useful to SQL Server for the particular query executed, to reduce its cost.
This is the index that was advised:
CREATE NONCLUSTERED INDEX IX_1 ON Table1 (Column1)
INCLUDE (PK_Column, SomeOtherColumn)
Assume you have a query like the one below:
select pk_column, someothercolumn
from Table1
where column1 = 'somevalue'
SQL Server prefers to read a narrow index if one is available, so in this case the advised index will be helpful.
Further, you didn't share the schema of the table. If you have an index like the one below
create index nci_test on Table1(column1)
then a query of the form below will again produce the same index suggestion as in the question:
select pk_column, someothercolumn
from Table1
where column1 = 'somevalue'
Update:
I have an orders table with the schema below:
[orderid] [int] NOT NULL Primary key,
[custid] [char](11) NOT NULL,
[empid] [int] NOT NULL,
[shipperid] [varchar](5) NOT NULL,
[orderdate] [date] NOT NULL,
[filler] [char](160) NOT NULL
Then I created one more index with the structure below:
create index onlyempid on orderstest(empid)
Now when I run a query of the form below
select empid,orderid,orderdate --6.3 units
from orderstest
where empid=5
the index advisor suggests the missing index below:
CREATE NONCLUSTERED INDEX empidalongwithorderiddate
ON [dbo].[orderstest] ([empid])
INCLUDE ([orderid],[orderdate])--you can drop orderid too ,it doesnt make any difference
As you can see, orderid is also included in the suggestion above.
Now let's create it and observe both structures:
---Root level-------
For index onlyempid..
for index empidalongwithorderiddate
----leaf level-------
For index onlyempid..
for index empidalongwithorderiddate
As you can see, creating the index as suggested makes no difference, even though the suggestion is invalid.
I assume the suggestion was made by the index advisor based on the query that was run; it is specific to that query and has no knowledge of the other indexes involved.
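One way to check this yourself is sketched below, assuming the orderstest table above with orderid as its clustered primary key and the suggested index already created. The third index and its name are hypothetical and exist only for the comparison. If the clustering key is carried in every nonclustered index anyway, the two indexes should report the same record sizes:

```sql
-- Hypothetical index: same as the suggestion, but without orderid in INCLUDE.
CREATE NONCLUSTERED INDEX empidwithorderdateonly
ON dbo.orderstest (empid)
INCLUDE (orderdate);

-- Compare page counts and record sizes per index level for both indexes.
SELECT i.name AS index_name,
       ps.index_level,
       ps.page_count,
       ps.record_count,
       ps.avg_record_size_in_bytes
FROM sys.dm_db_index_physical_stats(
         DB_ID(), OBJECT_ID(N'dbo.orderstest'), NULL, NULL, 'DETAILED') AS ps
JOIN sys.indexes AS i
  ON i.object_id = ps.object_id
 AND i.index_id = ps.index_id
WHERE i.name IN (N'empidalongwithorderiddate', N'empidwithorderdateonly')
ORDER BY i.name, ps.index_level;
```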
I don't know your schema, nor your queries. Just guessing.
Please correct me if this theory is incorrect.
You are right that non-clustered indexes point to the PK value. Imagine you have a large database (for example, gigabytes of files) stored on an ordinary platter hard drive. Let's suppose that the disk is fragmented and the PK index is stored physically far from your Table1 index.
Imagine that your query needs to evaluate Column1 and PK_Column as well. The query execution reads a Column1 value, then a PK value, then a Column1 value, then a PK value...
The hard drive has to seek from one physical place to another, and this can take time.
Having all you need in one index is more efficient, because it means reading one file sequentially.
I tried to examine a RID (formerly bookmark) lookup by creating a heap table:
CREATE TABLE [dbo].[CustomerAddress]
(
[CustomerID] [int],
[AddressID] [int],
[ModifiedDate] [datetime]
);
GO
CREATE NONCLUSTERED INDEX x
ON dbo.CustomerAddress(CustomerID, AddressID);
Then, I tried the following query to investigate the execution plan:
SELECT CustomerID, AddressID, ModifiedDate
FROM dbo.CustomerAddress
WHERE CustomerID = 29485;
But, using SSMS, I cannot see a RID lookup in the execution plan:
I'm using SQL Server 2008R2 (version 10.50.4000.0) service pack 2.
PS: This question is based on Aaron Bertrand's article.
A table scan means SQL Server does not use your index. It reads from the "heap". A "heap" is the data storage for tables without a clustered index.
Since it does not touch the index at all, SQL Server does not need a RID lookup to go from the index to the heap.
The reason is probably that SQL Server estimates there might be more than roughly 100 rows for one customer. The optimizer will try to avoid a large number of lookups.
You could try again with an index on just (CustomerID), or by adding an AddressID to your WHERE clause.
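Two ways to try this are sketched below, using the table and the index x from the question. The AddressID value is made up, and whether the optimizer actually picks the lookup still depends on the data and statistics:

```sql
-- Variant 1: narrow the search so that few rows qualify.
-- ModifiedDate is not in index x, so using x requires a RID lookup.
SELECT CustomerID, AddressID, ModifiedDate
FROM dbo.CustomerAddress
WHERE CustomerID = 29485
  AND AddressID = 1086;  -- hypothetical value

-- Variant 2: force index x with a table hint, for investigation only.
SELECT CustomerID, AddressID, ModifiedDate
FROM dbo.CustomerAddress WITH (INDEX(x))
WHERE CustomerID = 29485;
```

The hint is only meant for looking at the plan; in real queries it is usually better to let the optimizer choose.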
I created a view on a table and I'd like to create an index on this view. When I create the index only on the view, my query doesn't use it, but when I create an index on the table and then the same index on the view, the query uses the index.
My view adds a new column computed from another column of the table the view is based on, and I can't create the index on the table because I can't modify that table. Is it possible to create a non-clustered index on the view that improves my query? When I created a non-clustered index, my query still did a table scan instead of using the non-clustered index.
If your query mentions the base tables, the optimizer can use the index on the view only in Enterprise Edition. If your query mentions the view, you can force the index in the query:
SELECT *
FROM MyView WITH (INDEX = MyIndex)
But beware: it could lead to worse performance if the optimizer was right in the first place.
If you first created a unique clustered index and then a regular index on the schema-bound view, you can use WITH (NOEXPAND) when selecting from the view; then it "might" use your new index, e.g.:
select *
from dbo.vwIndexedView WITH (NOEXPAND)
WHERE indexedColumn = 1
I know that a SQL Server full text index can not index more than one table. But, I have relationships in tables that I would like to implement full text indexes on.
Take the 3 tables below...
Vehicle
Veh_ID - int (Primary Key)
FK_Atr_VehicleColor - int
Veh_Make - nvarchar(20)
Veh_Model - nvarchar(50)
Veh_LicensePlate - nvarchar(10)
Attributes
Atr_ID - int (Primary Key)
FK_Aty_ID - int
Atr_Name - nvarchar(50)
AttributeTypes
Aty_ID - int (Primary key)
Aty_Name - nvarchar(50)
The Attributes and AttributeTypes tables hold values that can be used in drop down lists throughout the application being built. For example, Attribute Type of "Vehicle Color" with Attributes of "Black", "Blue", "Red", etc...
Ok, so the problem comes when a user is trying to search for a "Blue Ford Mustang". So what is the best solution considering that tables like Vehicle will get rather large?
Do I create another field in the "Vehicle" table that is "Veh Color" that holds the text value of what is selected in the drop down in addition to "FK Atr VehicleColor"?
Or, do I drop "FK Atr VehicleColor" altogether and add "Veh Color"? I can use text value of "Veh Color" to match against "Atr Name" when the drop down is populated in an update form. With this approach I will have to handle if Attributes are dropped from the database.
-- Note: could not use underscore outside of code view as everything between two underscores is italicized.
I believe it's a common practice to have separate denormalized table specifically for full-text indexing. This table is then updated by triggers or, as it was in our case, by SQL Server's scheduled task.
This was SQL Server 2000. In SQL Server 2005 and later you can have an indexed view with a full-text index: http://msdn.microsoft.com/en-us/library/ms187317.aspx. But note that there are many restrictions on indexed views; for instance, you can't index a view that uses an OUTER JOIN.
You can create a view that pulls in whatever data you need, then apply the full-text index to the view. The view needs to be created with the 'WITH SCHEMABINDING' option, and needs to have a UNIQUE index.
CREATE VIEW VehicleSearch
WITH SCHEMABINDING
AS
SELECT
v.Veh_ID,
v.Veh_Make,
v.Veh_Model,
v.Veh_LicensePlate,
a.Atr_Name as Veh_Color
FROM
-- schema-bound views require two-part names (the dbo schema is assumed here)
dbo.Vehicle v
INNER JOIN
dbo.Attributes a on a.Atr_ID = v.FK_Atr_VehicleColor
GO
CREATE UNIQUE CLUSTERED INDEX IX_VehicleSearch_Veh_ID ON VehicleSearch (
Veh_ID ASC
) ON [PRIMARY]
GO
CREATE FULLTEXT INDEX ON VehicleSearch (
Veh_Make LANGUAGE [English],
Veh_Model LANGUAGE [English],
Veh_Color LANGUAGE [English]
)
KEY INDEX IX_VehicleSearch_Veh_ID ON [YourFullTextCatalog]
WITH CHANGE_TRACKING AUTO
GO
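Once the full-text index is populated, a search for "Blue Ford Mustang" could look roughly like the sketch below. It assumes the application has already worked out which word belongs to which column, which is a simplification:

```sql
-- Each CONTAINS predicate searches one full-text indexed column of the view.
SELECT Veh_ID, Veh_Make, Veh_Model, Veh_Color
FROM VehicleSearch
WHERE CONTAINS(Veh_Color, N'Blue')
  AND CONTAINS(Veh_Make, N'Ford')
  AND CONTAINS(Veh_Model, N'Mustang');
```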
As I understand it (I've used SQL Server a lot but never full-text indexing) SQL Server 2005 allows you to create full text indexes against a view. So you could create a view on
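SELECT
Vehicle.Veh_ID, ..., Color.Atr_Name AS ColorName
FROM
Vehicle
INNER JOIN Attributes AS Color ON (Vehicle.FK_Atr_VehicleColor = Color.Atr_ID)
and then create your full-text index across this view, including 'ColorName' in the index. (An inner join is used here because a full-text index on a view requires a unique index on it, and indexed views cannot contain outer joins.)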