Database Tuning Advisor recommends to create an existing index - sql-server

When I run the SQL Server 2005 Database Tuning Advisor, it recommends creating an index on a column that already has an index on it. Why does it recommend creating the same index again?
Here is my SQL:
SELECT t.name AS 'affected_table'
, 'Create NonClustered Index IX_' + t.name + '_'
+ CAST(ddmid.index_handle AS VARCHAR(10))
+ ' On ' + ddmid.STATEMENT
+ ' (' + IsNull(ddmid.equality_columns,'')
+ CASE
WHEN ddmid.equality_columns IS NOT NULL
AND ddmid.inequality_columns IS NOT NULL
THEN ','
ELSE ''
END
+ ISNULL(ddmid.inequality_columns, '')
+ ')'
+ ISNULL(' Include (' + ddmid.included_columns + ');', ';')
AS sql_statement
, ddmigs.user_seeks
, ddmigs.user_scans
, CAST((ddmigs.user_seeks + ddmigs.user_scans)
* ddmigs.avg_user_impact AS INT) AS 'est_impact'
, ddmigs.last_user_seek
FROM
sys.dm_db_missing_index_groups AS ddmig
INNER JOIN sys.dm_db_missing_index_group_stats AS ddmigs
ON ddmigs.group_handle = ddmig.index_group_handle
INNER JOIN sys.dm_db_missing_index_details AS ddmid
ON ddmig.index_handle = ddmid.index_handle
INNER Join sys.tables AS t
ON ddmid.OBJECT_ID = t.OBJECT_ID
WHERE
ddmid.database_id = DB_ID()
AND CAST((ddmigs.user_seeks + ddmigs.user_scans)
* ddmigs.avg_user_impact AS INT) > 100
ORDER BY
CAST((ddmigs.user_seeks + ddmigs.user_scans)
* ddmigs.avg_user_impact AS INT) DESC;

Perhaps try "DESC" to order a different way?
This worked in another similar SO question... Why does SQL Server 2005 Dynamic Management View report a missing index when it is not?

You may need to run your queries with a hint that forces the index that is already there:
SELECT * FROM [table] WITH (INDEX(IX_INDEX_SHOULD_BE_USED)) WHERE x = y
SQL Server might not consider the existing index useful. Run the query that triggers the recommendation, look at its execution plan in SQL Server, and then build whatever other indexes are needed.

Can you please post the full missing-index warning message? Generally, it is asking you to create an index on the table that returns only certain fields, rather than an index on the table that returns all fields by default.
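To illustrate the point about an index that returns only certain fields, here is a sketch of how two indexes on the same key column can still differ. The table, column, and index names are hypothetical and are not taken from the question:

```sql
-- Hypothetical existing index: keyed on CustomerID only.
CREATE NONCLUSTERED INDEX IX_Orders_CustomerID
    ON dbo.Orders (CustomerID);

-- Hypothetical recommendation: same key column, plus INCLUDE columns so
-- a query selecting OrderDate and TotalDue can be answered from the
-- index alone, without extra lookups into the base table.
CREATE NONCLUSTERED INDEX IX_Orders_CustomerID_Covering
    ON dbo.Orders (CustomerID)
    INCLUDE (OrderDate, TotalDue);
```

Both statements index CustomerID, so at a glance the recommendation can look like a duplicate of the index that already exists.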

Go ahead and script out the details of your current index structure, then compare this to the recommendations made by the DTA.
I suspect that you will find there are structural differences in the results.
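One way to do that comparison is to list the key and included columns of every index on the table from the catalog views. This is a minimal sketch; dbo.YourTable is a placeholder for the table named in the recommendation:

```sql
-- Placeholder name: replace with the table from the recommendation.
DECLARE @table_name sysname;
SET @table_name = N'dbo.YourTable';

SELECT i.name               AS index_name,
       i.type_desc,
       i.is_unique,
       c.name               AS column_name,
       ic.key_ordinal,          -- 0 for INCLUDE-only columns
       ic.is_descending_key,
       ic.is_included_column
FROM sys.indexes AS i
INNER JOIN sys.index_columns AS ic
    ON ic.object_id = i.object_id
   AND ic.index_id  = i.index_id
INNER JOIN sys.columns AS c
    ON c.object_id = ic.object_id
   AND c.column_id = ic.column_id
WHERE i.object_id = OBJECT_ID(@table_name)
ORDER BY i.name, ic.is_included_column, ic.key_ordinal;
```

Differences in key column order, sort direction, or included columns are the kind of structural differences that would make the advisor treat its suggestion as a new index.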

Related

Select unique values from every column in every table

Is there a way to get a count of distinct values from every table and column in SQL Server.
I've tried using a cursor for this, but that seems inefficient.
I've got to agree with Sean and say that this is going to be horrifically slow, but if you really want to do it, then I'm not going to stop you.
Something like this could be used as a starting point if you specifically don't want to use a cursor. This took just under a minute to look at a small database I've got with 10 tables in it. The largest table has just a few million rows in it. No matter what, you're going to be doing some sort of iteration, whether that's a cursor or explicitly reading against the table for each column.
Also, if you want to do something like this, you'll likely need to accommodate for things... like you're not going to be able to use COUNT on xml columns. Like I said, it's a starting point.
DECLARE @cmd VARCHAR(MAX)
SELECT @cmd =
STUFF (
(
SELECT
' union SELECT ''['+ SCHEMA_NAME(st.schema_id) + '].[' + st.name +']'' as [Object], ''[' + sc.name + ']'' as [Column], COUNT(distinct [' + sc.name + ']) as [Count] FROM [' + SCHEMA_NAME(st.schema_id) + '].[' + st.name + ']'
FROM sys.tables st
JOIN sys.columns sc
ON sc.object_id = st.object_id
JOIN sys.dm_db_partition_stats ddps
ON ddps.object_id = sc.object_id
WHERE
ddps.row_count > 0
FOR XML PATH('')
),1,6,''
)
EXECUTE (@cmd)
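The xml caveat mentioned above can be handled by filtering columns by data type before the COUNT(DISTINCT ...) statements are generated. The following is only a sketch: the list of type names is my assumption of the types that cannot be compared and therefore cannot be used with DISTINCT, so adjust it for your database.

```sql
-- Lists the columns that should be safe to feed into COUNT(DISTINCT ...).
-- The excluded type list is an assumption, not an exhaustive rule.
SELECT SCHEMA_NAME(st.schema_id)    AS SchemaName,
       st.name                      AS TableName,
       sc.name                      AS ColumnName,
       TYPE_NAME(sc.user_type_id)   AS TypeName
FROM sys.tables AS st
INNER JOIN sys.columns AS sc
    ON sc.object_id = st.object_id
WHERE TYPE_NAME(sc.user_type_id)
      NOT IN ('xml', 'text', 'ntext', 'image', 'geometry', 'geography')
ORDER BY SchemaName, TableName, sc.column_id;
```

The same NOT IN predicate could be added to the WHERE clause of the generator query so that the unsupported columns never reach the dynamic SQL.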

Implementing geometry_columns view in MS SQL Server

(We're using MSSQL Server 2014 as far as I know)
I have never seen a good solution for maintaining a geometry_columns table in MSSQL Server. https://gis.stackexchange.com/questions/71558 never got figured out, and even if it did, the PostGIS approach of using a view (rather than a table) is a much better solution.
With that said, I can't seem to figure out how to implement the basics of how this might work.
The basic schema of the geometry_columns view - from PostGIS is:
(the DDL is a bit more complicated, but can be provided if need be)
MS SQL Server will allow you to query your information_schema table to show tables with a 'geometry' data type:
select *
FROM information_schema.columns
where data_type = 'geometry'
I'm imagining the geometry_columns view could be defined with something similar to the following, but I can't figure out how to get the information about the geometry columns to populate in the query:
SELECT
TABLE_CATALOG as f_table_catalog
, TABLE_SCHEMA as f_table_schema
, table_name as f_table_name
, COLUMN_NAME as f_geometry_column
/*how to deal with these in view?
, geometry_column.STDimension() as coord_dimension
, geometry_column.STSrid as srid
, geometry_column.STGeometryType() as type
*/
FROM information_schema.columns where data_type = 'geometry'
I'm hung up as to how the three ST operators can dynamically report the dimension, srid, and geometry type in the view when trying to query from the information_schema table. Perhaps this is a SQL problem more than anything, but I can't wrap my head around it for some reason.
Here's what the PostGIS geometry columns table looks like:
Also please let me know if this question a) could be asked differently because it is a general SQL question and/or b) it belongs on another forum (GIS.SE didn't have an answer, as I believe this is more on the database side than spatial/GIS)
Based on a little reading, it seems that PostGIS - as befits a dedicated GIS system - is a little more clever than SQL Server, when it comes to geometry columns. It looks like in PostGIS you can say that a particular geometry column will only ever contain, say, a POINT, or a LINESTRING. This is how the geometry_columns view can then be more specific about the columns it is describing.
I don't believe it is possible to readily constrain a SQL Server geometry in this way (triggers or constraints might allow, but would be messy). PostGIS can have a general geometry column with no further restriction. Let's suppose you're happy for your SQL Server geometry_columns view to return the dimension, SRID, and type based on an arbitrary row of data.
We can get the column metadata out of the catalog views, but I think the only way to do the necessary querying to also get the geometry metadata is with dynamic SQL. This rules out views and functions. I can do you a stored procedure though:
CREATE PROCEDURE GetGeometryColumns
AS
BEGIN
DECLARE @sql nvarchar(max);
SET @sql = ( SELECT
STUFF((
SELECT ' UNION ALL ' + Query
FROM
( SELECT
'SELECT ''' + s.name + ''' SchemaName'
+ ', ''' + t.name + ''' TableName'
+ ', ''' + c.name + ''' ColumnName'
+ ', ( SELECT TOP (1) ' + c.name + '.STDimension() FROM ' + s.name + '.' + t.name + ') Dimension'
+ ', ( SELECT TOP (1) ' + c.name + '.STSrid FROM ' + s.name + '.' + t.name + ') SRID'
+ ', ( SELECT TOP (1) ' + c.name + '.STGeometryType() FROM ' + s.name + '.' + t.name + ') GeometryType'
AS Query
FROM
sys.schemas s
INNER JOIN sys.tables t ON s.schema_id = t.schema_id
INNER JOIN sys.columns c on t.object_id = c.object_id
WHERE
-- system_type_id 240 is shared by all CLR types (geometry, geography, hierarchyid),
-- so filter on the type name instead
TYPE_NAME(c.user_type_id) = 'geometry'
) GeometryColumn
FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 10, '')
);
EXEC ( @sql );
END
This builds a SQL statement which is a UNION of SELECTs, one for each geometry column defined in the database. Note that I'm using the sys. catalog views, which for SQL Server are better than using INFORMATION_SCHEMA.
Each of the individual SELECTs that this builds will return the name of the column, plus metadata from the value in the first row (arbitrarily picked).
The sproc then executes the statement it has built, and returns.
To use:
CREATE TABLE T1 (
Id int NOT NULL PRIMARY KEY
, Region geometry
)
;
CREATE TABLE T2 (
Id int NOT NULL PRIMARY KEY
, Source geometry
, Destination geometry
)
;
INSERT T1 VALUES ( 1, geometry::STGeomFromText('POLYGON((1 1, 3 3, 3 1, 1 1))', 4236)) ;
INSERT T2 VALUES ( 10
, geometry::STGeomFromText('POINT(1.3 2.4)', 4236)
, geometry::STGeomFromText('POINT(2.6 2.5)', 4236)) ;
then simply
EXEC GetGeometryColumns;
to get
SchemaName TableName ColumnName Dimension SRID GeometryType
---------- --------- ----------- ----------- ----------- ----------------------
dbo T1 Region 2 4236 Polygon
dbo T2 Source 0 4236 Point
dbo T2 Destination 0 4236 Point
If you want the results in a table, you can for example:
DECLARE @geometryColumn TABLE
(
SchemaName sysname
, TableName sysname
, ColumnName sysname
, Dimension int
, SRID int
, GeometryType nvarchar(100)
);
INSERT @geometryColumn EXEC GetGeometryColumns
SELECT * FROM @geometryColumn
I'd be interested to see if anyone can get the necessary logic into an actual VIEW...
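If the set of geometry columns is small and known in advance, one workaround is a hand-maintained view with one SELECT per column. This is only a sketch using the T1 and T2 example tables above. The view name and column aliases follow the PostGIS naming from the question, and the view has to be edited by hand whenever a geometry column is added or dropped:

```sql
-- Static sketch: one branch per known geometry column.
-- Metadata still comes from an arbitrary row, as in the stored procedure.
CREATE VIEW dbo.geometry_columns
AS
SELECT DB_NAME() AS f_table_catalog,
       'dbo'     AS f_table_schema,
       'T1'      AS f_table_name,
       'Region'  AS f_geometry_column,
       (SELECT TOP (1) Region.STDimension()    FROM dbo.T1) AS coord_dimension,
       (SELECT TOP (1) Region.STSrid           FROM dbo.T1) AS srid,
       (SELECT TOP (1) Region.STGeometryType() FROM dbo.T1) AS [type]
UNION ALL
SELECT DB_NAME(), 'dbo', 'T2', 'Source',
       (SELECT TOP (1) Source.STDimension()    FROM dbo.T2),
       (SELECT TOP (1) Source.STSrid           FROM dbo.T2),
       (SELECT TOP (1) Source.STGeometryType() FROM dbo.T2)
UNION ALL
SELECT DB_NAME(), 'dbo', 'T2', 'Destination',
       (SELECT TOP (1) Destination.STDimension()    FROM dbo.T2),
       (SELECT TOP (1) Destination.STSrid           FROM dbo.T2),
       (SELECT TOP (1) Destination.STGeometryType() FROM dbo.T2);
```

This does not solve the general problem, because the column list is hard-coded rather than discovered from the catalog, but it can be queried like any other view.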

SQL Query, column SHOULD be present, but results states, it is not

I am having a hard time grasping why this query is telling me the TaxPayerID is NOT found, when in the beginning, I am clearly checking for it and only using the databases, which should contain the TaxPayerID column in the nTrucks table.
sp_MSforeachdb
'
IF EXISTS (SELECT * FROM [?].INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = ''nTrucks'' AND COLUMN_NAME = ''TaxPayerID'')
BEGIN
SELECT "?", nTrucks.UnitNumber, ntrucks.Companyid, nCompanyData.CompanyName, nTrucks.Owner, nTrucks.TaxPayerID
FROM nTrucks
INNER JOIN nCompanyData ON nTrucks.CompanyID = nCompanyData.CompanyID
WHERE nTrucks.Owner like ''%Trucker%''
END
'
I am getting multiple 'Invalid column name 'TaxPayerID'.' errors, I assume it is from the databases NOT containing this column.
If anyone here can throw me a bone, a simple "you're a dummy, do it this way!", I would be very appreciative.
JF
You're a dummy! (you asked for it) :)
How to debug this error:
Locate the database that throws an error and try executing an actual SQL query on it directly to see if it will compile:
IF EXISTS (SELECT * FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = 'nTrucks' AND COLUMN_NAME = 'TaxPayerID')
BEGIN
SELECT nTrucks.UnitNumber, ntrucks.Companyid, nCompanyData.CompanyName, nTrucks.Owner, nTrucks.TaxPayerID
FROM nTrucks
INNER JOIN nCompanyData ON nTrucks.CompanyID = nCompanyData.CompanyID
WHERE nTrucks.Owner like '%Trucker%'
END
It will fail.
Now you know that SQL Server checks column names when it compiles the batch rather than at run time, so the SELECT is validated even when the IF condition is false.
Then you follow @GordonLinoff's suggestion and convert the SELECT query into dynamic SQL as follows:
sp_MSforeachdb
'
IF EXISTS (SELECT * FROM [?].INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = ''nTrucks'' AND COLUMN_NAME = ''TaxPayerID'')
BEGIN
EXEC(
''SELECT "?", nTrucks.UnitNumber, ntrucks.Companyid, nCompanyData.CompanyName, nTrucks.Owner, nTrucks.TaxPayerID
FROM [?]..nTrucks
INNER JOIN [?]..nCompanyData ON nTrucks.CompanyID = nCompanyData.CompanyID
WHERE nTrucks.Owner like ''''%Trucker%''''
'' )
END
'
(I hope I got my quotes right)
If your query is supposed to reference a central nCompanyData table, then remove [?].. before nCompanyData.
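The behavior described in this answer can be reproduced in a single database. The sketch below uses a hypothetical demo table, dbo.DemoTrucks, which deliberately lacks the TaxPayerID column:

```sql
-- Hypothetical demo table without the TaxPayerID column.
CREATE TABLE dbo.DemoTrucks (UnitNumber int);
GO

-- Batch 1: fails with "Invalid column name 'TaxPayerID'" even though the
-- IF condition is false, because the batch is compiled before it runs.
IF EXISTS (SELECT * FROM INFORMATION_SCHEMA.COLUMNS
           WHERE TABLE_NAME = 'DemoTrucks' AND COLUMN_NAME = 'TaxPayerID')
BEGIN
    SELECT TaxPayerID FROM dbo.DemoTrucks;
END
GO

-- Batch 2: compiles and runs, because the inner statement is just a
-- string until EXEC is reached, and here it is never reached.
IF EXISTS (SELECT * FROM INFORMATION_SCHEMA.COLUMNS
           WHERE TABLE_NAME = 'DemoTrucks' AND COLUMN_NAME = 'TaxPayerID')
BEGIN
    EXEC('SELECT TaxPayerID FROM dbo.DemoTrucks;');
END
GO

DROP TABLE dbo.DemoTrucks;
GO
```

The dynamic SQL version only defers the column check; it does not make a query against a missing column valid.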

sqlserver Query time taking

I am executing the query below. It takes 80 seconds to return just 17 records.
Can anybody tell me the reason? I have already tried adding indexes.
SELECT DISTINCT t.i_UserID,
u.vch_LoginName,
t.vch_PreviousEmailAddress AS 'vch_EmailAddress',
u.vch_DisplayName,
t.d_TransactionDate AS 'd_DateAdded',
'Old' AS 'vch_RecordStatus'
FROM tblEmailTransaction t
INNER JOIN tblUser u
ON t.i_UserID = u.i_UserID
WHERE t.vch_PreviousEmailAddress LIKE '%kala%'
Change the collation of the vch_PreviousEmailAddress column to Latin1_General_100_BIN2.
Create a covering index:
CREATE NONCLUSTERED INDEX ix
ON dbo.tblEmailTransaction (vch_PreviousEmailAddress)
INCLUDE (i_UserID, d_TransactionDate)
GO
And have fun with this query:
SELECT t.i_UserID,
u.vch_LoginName,
t.vch_PreviousEmailAddress AS vch_EmailAddress,
u.vch_DisplayName,
t.d_TransactionDate AS d_DateAdded,
'Old' AS vch_RecordStatus
FROM (
SELECT DISTINCT i_UserID,
vch_PreviousEmailAddress,
d_TransactionDate
FROM dbo.tblEmailTransaction
WHERE vch_PreviousEmailAddress LIKE '%kala%' COLLATE Latin1_General_100_BIN2
) t
JOIN dbo.tblUser u ON t.i_UserID = u.i_UserID
One other thing, which I find useful in solving problems like this:
Try running the following script. It will tell you which indexes you could add to your SQL Server database to make the biggest (positive) improvement.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
SELECT TOP 100
ROUND(s.avg_total_user_cost * s.avg_user_impact * (s.user_seeks + s.user_scans),0) AS 'Total Cost',
s.avg_user_impact,
d.statement AS 'Table name',
d.equality_columns,
d.inequality_columns,
d.included_columns,
'CREATE INDEX [IndexName] ON ' + d.statement + ' ( '
+ case when (d.equality_columns IS NULL OR d.inequality_columns IS NULL)
then ISNULL(d.equality_columns, '') + ISNULL(d.inequality_columns, '')
else ISNULL(d.equality_columns, '') + ', ' + ISNULL(d.inequality_columns, '')
end + ' ) '
+ CASE WHEN d.included_columns IS NULL THEN '' ELSE 'INCLUDE ( ' + d.included_columns + ' )' end AS 'CREATE INDEX command'
FROM sys.dm_db_missing_index_groups g,
sys.dm_db_missing_index_group_stats s,
sys.dm_db_missing_index_details d
WHERE d.database_id = DB_ID()
AND s.group_handle = g.index_group_handle
AND d.index_handle = g.index_handle
ORDER BY [Total Cost] DESC
The right-hand column displays the CREATE INDEX command which you'd need to run, to create that index.
This is one of those lifesaver scripts, which I run on our in-house databases every so often.
But yes, in your example, this is just likely to tell you that you need an index on the vch_PreviousEmailAddress field in your tblEmailTransaction table.
There are three probable bottlenecks:
Missing index on tblEmailTransaction.i_UserID: check whether the table has this index.
Missing index on tblUser.i_UserID: check whether the table has this index.
LIKE predicate: LIKE with a leading wildcard is known to perform poorly. As Devart suggested, try specifying the collation this way:
WHERE vch_PreviousEmailAddress LIKE '%kala%' COLLATE Latin1_General_100_BIN2
To get a better view of your query, run this command before your query:
SET STATISTICS IO ON
It reports all the I/O the query performs, so we can see what is happening.
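As a sketch of how that looks in practice, the statistics options are switched on for the session, the query from the question is run, and the options are switched off again. SET STATISTICS TIME is an optional addition of mine and was not part of the suggestion above:

```sql
-- Note the option order: SET STATISTICS IO, not SET IO STATISTICS.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;   -- optional: also reports CPU and elapsed time

SELECT DISTINCT t.i_UserID,
       u.vch_LoginName,
       t.vch_PreviousEmailAddress AS vch_EmailAddress,
       u.vch_DisplayName,
       t.d_TransactionDate AS d_DateAdded,
       'Old' AS vch_RecordStatus
FROM tblEmailTransaction t
INNER JOIN tblUser u
    ON t.i_UserID = u.i_UserID
WHERE t.vch_PreviousEmailAddress LIKE '%kala%';

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```

The per-table scan counts and logical reads appear on the Messages tab in Management Studio, which shows which of the two tables accounts for most of the work.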
Just a final question:
How many rows do the two tables contain?
Ciao

SQL Server WHERE with wildcard

Is it possible to use a wildcard for the where in statement in SQL Server 2008?
For example, I currently have:
SELECT something
FROM myTable
WHERE (ORG + '-' + ORGSUB like '5015001-________' or
ORG + '-' + ORGSUB like '5015018-________' or
ORG + '-' + ORGSUB like '_______-________')
I need to do it this way:
SELECT something
FROM myTable
WHERE
(ORG + '-' + ORGSUB) in( '5015001-________','5015018-________','_______-________')
I'm going to be passing those values to a stored procedure as a comma-delimited list. Is there another way to get it done?
Take your comma delimited list, split it, and insert it into a temp table...
You can then use a LIKE statement in a JOIN to this temp table:
SELECT something
FROM myTable mt
JOIN #tempTable tt
ON mt.ORG + '-' + mt.ORGSUB LIKE tt.SearchValue
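The split step is not shown above. One common way to do it on SQL Server 2008, which has no built-in split function, is to convert the list to XML and shred it. This is only a sketch: the variable name @list is my own, the column length is a guess, and the approach assumes the values contain no XML special characters such as & or <.

```sql
-- Assumed input: the comma-delimited list passed to the procedure.
DECLARE @list varchar(max);
SET @list = '5015001-________,5015018-________,_______-________';

CREATE TABLE #tempTable (SearchValue varchar(32) NOT NULL);

-- Wrap each value in <i>...</i>, cast to xml, and return one row per node.
INSERT INTO #tempTable (SearchValue)
SELECT x.n.value('.', 'varchar(32)')
FROM (SELECT CAST('<i>' + REPLACE(@list, ',', '</i><i>') + '</i>' AS xml) AS doc) AS d
CROSS APPLY d.doc.nodes('/i') AS x(n);

SELECT SearchValue FROM #tempTable;
```

After this runs, #tempTable holds one pattern per row and can be used in the LIKE join shown above.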
Why do you even care about ORGSUB in your query (as provided in the example)?
Seems to me you should rewrite your WHERE clause to look for the components separately, e.g.:
SELECT something
FROM myTable
WHERE ORG IN (5015001, 5015018, ...)
[add other criteria as appropriate]
Why a comma-separated list?
CREATE TYPE dbo.OrgSub AS TABLE(s VARCHAR(32));
GO
CREATE PROCEDURE dbo.SearchOrgSubs
@OrgSub dbo.OrgSub READONLY
AS
BEGIN
SET NOCOUNT ON;
SELECT t.something
FROM dbo.mytable AS t
INNER JOIN @OrgSub AS o
-- LIKE rather than = so that the wildcard patterns from the question are honored
ON t.ORG + '-' + t.ORGSUB LIKE o.s;
END
GO
Now you can pass the set into the stored procedure from C# or wherever, without first having to form it into a comma-separated list.
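For completeness, the procedure can also be called from T-SQL by declaring a variable of the table type and filling it first. This is a minimal sketch; the variable name @patterns is my own, and the values are the ones from the question:

```sql
-- Declare a variable of the user-defined table type and fill it.
DECLARE @patterns dbo.OrgSub;

INSERT INTO @patterns (s)
VALUES ('5015001-________'),
       ('5015018-________'),
       ('_______-________');

-- Pass the whole set as the procedure's single table-valued parameter.
EXEC dbo.SearchOrgSubs @patterns;
```

Table-valued parameters must be declared READONLY in the procedure, which is why the set is populated before the call rather than inside it.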
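You can join to the result of a split function and treat it like a temp table (this assumes dbo.Split is a user-defined table-valued function that takes the list and a delimiter and returns a column named items):
SELECT something
FROM myTable
JOIN dbo.Split('5015001-________,5015018-________,_______-________', ',') AS Splits
ON (ORG + '-' + ORGSUB) LIKE Splits.items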

Resources