Outer join on two tables with sequential guid stalls - sql-server

I'm attempting to perform a full outer join on two tables that are not related. Each table has a location_id which will eventually form the primary/foreign key relationship (once I figure out this performance issue). When executing the outer join, it just clocks away. Queries and triggers performed against each table on its own complete in less than a second.
This table has 21000 records:
CREATE TABLE [dbo].[TBL_LOCATIONS](
[OBJECTID] [int] NOT NULL,
[Loc_Name] [nvarchar](100) NULL,
[Location_ID] [uniqueidentifier] NULL,
[SHAPE] [geometry] NULL,
CONSTRAINT [R33_pk] PRIMARY KEY CLUSTERED
(
[OBJECTID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 75) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[TBL_LOCATIONS] WITH CHECK ADD CONSTRAINT [g17_ck] CHECK (([SHAPE].[STSrid]=(26917)))
GO
ALTER TABLE [dbo].[TBL_LOCATIONS] ADD CONSTRAINT [DF_TBL_LOCATIONS_Location_ID] DEFAULT (newsequentialid()) FOR [Location_ID]
GO
CREATE SPATIAL INDEX [S17_idx] ON [dbo].[TBL_LOCATIONS]
(
[SHAPE]
)USING GEOMETRY_GRID
WITH (
BOUNDING_BOX =(224827, 3923750, 323464, 3967780), GRIDS =(LEVEL_1 = HIGH,LEVEL_2 = HIGH,LEVEL_3 = HIGH,LEVEL_4 = HIGH),
CELLS_PER_OBJECT = 16, PAD_INDEX = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
CREATE UNIQUE NONCLUSTERED INDEX [UUID_OID_33] ON [dbo].[TBL_LOCATIONS]
(
[Location_ID] ASC,
[OBJECTID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 75) ON [PRIMARY]
GO
This table has 53000 records:
CREATE TABLE [dbo].[TBL_EVENTS](
[OBJECTID] [int] NOT NULL,
[Event_ID] [uniqueidentifier] NULL,
[Location_ID] [uniqueidentifier] NULL,
CONSTRAINT [PK_TBL_EVENTS] PRIMARY KEY CLUSTERED
(
[OBJECTID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[TBL_EVENTS] ADD CONSTRAINT [DF_TBL_EVENTS_Event_ID] DEFAULT (newsequentialid()) FOR [Event_ID]
GO
CREATE UNIQUE NONCLUSTERED INDEX [R36_SDE_ROWID_UK] ON [dbo].[TBL_EVENTS]
(
[OBJECTID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 75) ON [PRIMARY]
GO
And here is the query that is running....and running...1 hour and no results.
SELECT
TBL_LOCATIONS.Loc_Name,
TBL_LOCATIONS.Location_ID,
TBL_LOCATIONS.SHAPE,
TBL_EVENTS.Event_ID
FROM
TBL_EVENTS
FULL OUTER JOIN
TBL_LOCATIONS ON TBL_EVENTS.Location_ID = TBL_LOCATIONS.Location_ID
I've tried every permutation of attribute indexes on both tables, rebuilding and reorganizing them, nothing affects the performance. The use of ObjectID as PK is mandated by the application, as is the sequentialGUID. I don't think those are factors here, as both these tables perform splendidly outside of this query. SQL Server 2008 SP1 64BIT on RAID 10/48 GB RAM.

FULL JOIN works well when the values in the columns used to link the tables are unique.
For values that are duplicated on both sides, FULL JOIN behaves like a CROSS JOIN within each group of matching rows, and that can cause performance issues.
So the bottleneck probably comes from duplicates in the Location_ID column.
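One way to test this theory is to count how often each Location_ID occurs in each table. The following is a minimal sketch that uses only the table and column names from the question; the TOP (20) cutoff is an arbitrary choice.

```sql
-- Most frequently repeated Location_ID values in TBL_LOCATIONS.
-- The unique index UUID_OID_33 is on (Location_ID, OBJECTID), so it does
-- not stop the same Location_ID from appearing on many rows.
SELECT TOP (20) Location_ID, COUNT(*) AS row_count
FROM dbo.TBL_LOCATIONS
GROUP BY Location_ID
HAVING COUNT(*) > 1
ORDER BY row_count DESC;

-- Same check for TBL_EVENTS.
-- A large NULL group may show up here; NULL keys never match in the join,
-- so they add unmatched rows rather than multiplying the result.
SELECT TOP (20) Location_ID, COUNT(*) AS row_count
FROM dbo.TBL_EVENTS
GROUP BY Location_ID
HAVING COUNT(*) > 1
ORDER BY row_count DESC;
```

If both queries return nothing, or only small counts, duplicates are unlikely to be the cause and the execution plan is the next thing to look at.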

Maybe you need to consider turning off Transaction Logging whilst doing all that.

If the values in the linked field (Location_ID) are not all that unique, the result set could become very large.
In an extreme example, if Location_ID only had the value "1" in both tables, the total number of rows would equal the cross join size, about 1,113,000,000 rows (21,000 * 53,000). A query of this size (over a billion rows) will take a long time to run.
EDIT - updating incorrect statement as pointed out in comments
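The size of the join can be estimated before running it by multiplying the per-key row counts from both tables. This is a sketch under the assumption that both tables are in the dbo schema as in the question; it counts only the matched rows, so the FULL OUTER JOIN would return this number plus the unmatched rows from each side.

```sql
-- For each Location_ID present in both tables, the join produces
-- (rows in TBL_EVENTS) x (rows in TBL_LOCATIONS) output rows.
-- NULL keys are excluded because NULL never equals NULL in a join predicate.
SELECT SUM(CAST(e.row_count AS bigint) * l.row_count) AS matched_rows
FROM (
    SELECT Location_ID, COUNT(*) AS row_count
    FROM dbo.TBL_EVENTS
    WHERE Location_ID IS NOT NULL
    GROUP BY Location_ID
) AS e
INNER JOIN (
    SELECT Location_ID, COUNT(*) AS row_count
    FROM dbo.TBL_LOCATIONS
    WHERE Location_ID IS NOT NULL
    GROUP BY Location_ID
) AS l
    ON l.Location_ID = e.Location_ID;
```

A result in the tens of thousands would point away from row multiplication; a result in the hundreds of millions would explain the runtime on its own.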

Related

In Microsoft SQL Server, will 2 non clustered indices improve performance when there is already a non clustered composite index?

I have a table called Files that contains these columns
[Id] [int] IDENTITY(1,1) NOT NULL,
[RowVersion] [timestamp] NULL,
[CreatedAt] [datetime2](7) NOT NULL,
[UpdatedAt] [datetime2](7) NOT NULL,
[FileName] [nvarchar](450) NOT NULL,
[FileContent] [nvarchar](max) NULL,
[MessengerId] [int] NOT NULL,
[FileTypeId] [int] NOT NULL,
[Ref] [int] NOT NULL,
The table has a clustered index on the primary key Id.
Also there is a non clustered composite index on MessengerId and FileName
CREATE UNIQUE NONCLUSTERED INDEX [IX_Files_CourierId_FileName] ON [dbo].[Files]
(
[MessengerId] ASC,
[FileName] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
This query is slow
SELECT COUNT(*) FROM Files WHERE MessengerId = 1 AND Filename = 'myfilename.xml'
I'm battling to test because the timeouts happen on a production server. On my developer laptop I have no issues.
Will adding 2 new indices on MessengerId and Filename improve performance?
The 2 new indices look like so
CREATE NONCLUSTERED INDEX [Index_on_FileName] ON [dbo].[Files]
(
[FileName] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [Index_on_MessengerId] ON [dbo].[Files]
(
[MessengerId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
GO
This query
SELECT COUNT(*) FROM Files WHERE MessengerId = 1 AND Filename = 'myfilename.xml'
requires an index equality lookup on MessengerId and FileName and no other columns, so your current index will suffice.
You could equally switch the column order for the same result (you may find it a tiny bit faster in the event there is no match)
CREATE UNIQUE NONCLUSTERED INDEX [IX_Files_CourierId_FileName] ON [dbo].[Files]
(
[FileName] ASC,
[MessengerId] ASC
)
However, the two new indexes you propose do not cover the query, so the server will most likely not even bother using them, especially if your existing index is still in place. Certainly they will be slower if used, as they will require key lookups to the clustered index.
Your actual timeout issue is more likely associated with locking because of other queries doing updates. I suggest you use a query to find any possible blockers.
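One possible form of such a blocker query, run on the production server while the timeouts are happening, is sketched below. It relies on the sys.dm_exec_requests DMV and the sys.dm_exec_sql_text function, which require the VIEW SERVER STATE permission; the choice of columns is mine and can be adjusted.

```sql
-- List requests that are currently waiting on another session.
-- blocking_session_id is 0 when a request is not blocked.
SELECT
    r.session_id,
    r.blocking_session_id,
    r.wait_type,
    r.wait_time,        -- milliseconds spent in the current wait
    r.wait_resource,
    t.text AS running_sql
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.blocking_session_id <> 0;
```

If the slow COUNT(*) shows up here with a lock wait type, the session in blocking_session_id is the one to investigate.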

SQL Server unique constraint failing to discard duplicate record

I am adding a constraint on multiple columns in one of the tables of my SQL Server database.
Below is the query to add the constraint:
ALTER TABLE [db].[tablename]
ADD CONSTRAINT [contact_uniq_constraint]
UNIQUE NONCLUSTERED ([account_id_fk] ASC, [contact_type] ASC,
[contact] ASC, [country_code] ASC)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF,
ONLINE = OFF, ALLOW_ROW_LOCKS = ON,
ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
But I am still able to add duplicates to the table. Can someone help me identify the reason?

How to improve performance in SQL Server table with varbinary column?

I have a table in my database that we use as a filestore; the file itself is stored in a varbinary(MAX) column.
CREATE TABLE [dbo].[FILE_TABLE](
[FILE_ID] [int] IDENTITY(1,1) NOT NULL,
[HTML_FILE] [nvarchar](max) NULL,
[XML_FILE] [varbinary](max) NULL,
[FILE_EXTENSION] [nvarchar](50) NULL
CONSTRAINT [PK_FILE_TABLE] PRIMARY KEY CLUSTERED
(
[FILE_ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 80) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
We also created the following non-clustered index:
CREATE NONCLUSTERED INDEX [NonClusteredIndex-20180220-000116] ON [dbo].[FILE_TABLE]
(
[FILE_ID] ASC
)
INCLUDE ( [XML_FILE]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
This table has semantics and freetext enabled to run a keyword search against the documents. When we run a simple freetext query, the performance is very slow. The table has only 810 records, yet it takes 30 seconds to return 713 rows. How could we improve the performance of this table or query?
SELECT * FROM FILE_TABLE WHERE freetext(XML_FILE,'lights')

Is the primary key name a required field?

Is there any difference between the below 2 CREATE TABLE statements in SQL Server 200x/2012? I generated this script from two different tables, one had a Key name defined (PK_Table1) whereas the other had some kind of randomly generated number associated to it (PK_Table1_1084F446).
CREATE TABLE [dbo].[Table1](
[ID] [uniqueidentifier] NOT NULL,
<<Other Column declaration here>>
PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
Few more non-clustered indexes declaration here
CREATE TABLE [dbo].[Table1](
[ID] [uniqueidentifier] NOT NULL,
<<Other Column declaration here>>
CONSTRAINT [PK_Table1] PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
Few more non-clustered indexes declaration here
Both work in the same way, but explicit names are more convenient:
1) when altering a constraint, you can easily refer to it (if you gave it a sensible name);
2) when a query fails because of a constraint, the name of that constraint is shown, so you can easily tell what caused the error (if you gave it a sensible name).
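To illustrate the first point, here is a short sketch using the Table1 and PK_Table1 names from the question. The catalog view sys.key_constraints exposes an is_system_named flag, which shows whether SQL Server generated the name itself.

```sql
-- Look up the primary key constraint on dbo.Table1 and check whether
-- its name was generated by SQL Server (is_system_named = 1).
SELECT kc.name, kc.is_system_named
FROM sys.key_constraints AS kc
WHERE kc.parent_object_id = OBJECT_ID(N'dbo.Table1')
  AND kc.type = 'PK';

-- With an explicit name, the constraint can be referenced directly in
-- scripts that run against every environment.
ALTER TABLE dbo.Table1 DROP CONSTRAINT PK_Table1;
```

With a system-generated name such as PK_Table1_1084F446, the suffix can differ between databases, so a deployment script would first have to look the name up as in the first query.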

aspnet_Users table with huge indexsize

We have an aspnet_Users table from ASP.NET membership that shows up with almost 18 GB index size.
rows 251172
datasize 56472 KB
indexsize 17800536 KB
This is just the standard aspnet membership table, but we do have an other table with a foreign key to this table (userid column).
Anyone seen this problem before?
How can I reduce the index size?
The aspnet_Users table is defined as:
CREATE TABLE [dbo].[aspnet_Users](
[ApplicationId] [uniqueidentifier] NOT NULL,
[UserId] [uniqueidentifier] NOT NULL,
[UserName] [nvarchar](256) NOT NULL,
[LoweredUserName] [nvarchar](256) NOT NULL,
[MobileAlias] [nvarchar](16) NULL,
[IsAnonymous] [bit] NOT NULL,
[LastActivityDate] [datetime] NOT NULL,
PRIMARY KEY NONCLUSTERED
(
[UserId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
CREATE UNIQUE CLUSTERED INDEX [aspnet_Users_Index] ON [dbo].[aspnet_Users]
(
[ApplicationId] ASC,
[LoweredUserName] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [aspnet_Users_Index2] ON [dbo].[aspnet_Users]
(
[ApplicationId] ASC,
[LastActivityDate] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
ALTER TABLE [dbo].[aspnet_Users] ADD DEFAULT (newid()) FOR [UserId]
GO
ALTER TABLE [dbo].[aspnet_Users] ADD DEFAULT (NULL) FOR [MobileAlias]
GO
ALTER TABLE [dbo].[aspnet_Users] ADD DEFAULT ((0)) FOR [IsAnonymous]
GO
ALTER TABLE [dbo].[aspnet_Users] WITH NOCHECK ADD FOREIGN KEY([ApplicationId])
REFERENCES [dbo].[aspnet_Applications] ([ApplicationId])
GO
This works out at about 70 KB per row though (17,800,536 KB / 251,172 rows), which is more than a single extent (8 pages, 64 KB) and implies massive fragmentation.
Have you ever run index maintenance on it? Run this and see what happens
ALTER INDEX ALL ON aspnet_Users REBUILD
Alternatively, has the table ever been extended with a LOB (pictures, XML etc.) column? Or has someone added dozens of indexes? So please add the actual in-use table definition.
Edit: remove the index on LastActivityDate, or change it to smalldatetime to keep minute accuracy, or only update it if it has changed by more than xx seconds/minutes.
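To see which index accounts for the space, and how fragmented it is before and after the rebuild, something like the following could be run. It is a sketch that assumes the table lives in the dbo schema of the current database; the 'SAMPLED' mode is my choice, picked so that avg_page_space_used_in_percent is populated (it is NULL in the default LIMITED mode).

```sql
-- Per-index size, fragmentation and page fullness for dbo.aspnet_Users.
SELECT
    i.name AS index_name,
    ps.index_type_desc,
    ps.page_count,                      -- 8 KB pages in this index level
    ps.avg_fragmentation_in_percent,
    ps.avg_page_space_used_in_percent
FROM sys.dm_db_index_physical_stats(
         DB_ID(), OBJECT_ID(N'dbo.aspnet_Users'), NULL, NULL, 'SAMPLED') AS ps
INNER JOIN sys.indexes AS i
    ON i.object_id = ps.object_id
   AND i.index_id = ps.index_id
ORDER BY ps.page_count DESC;
```

A very high page_count combined with a low avg_page_space_used_in_percent would be consistent with the mostly empty pages that the 70 KB per row figure suggests.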
