I've been going through the boards and have tried lots of different things without luck, so I thought I'd reach out to the community directly.
The problem:
I have an Azure SQL Server DB that has 2 tables:
DATA_IMPORT (source table I import data into via Data Factory; it gets truncated on each load, approx. 20m rows)
DATA_SOURCE (table I insert the 20m rows from DATA_IMPORT into, with some simple transformation; this is expected to reach about 0.5b rows)
I'm a little new to SQL Server and have now resorted to having no indexes on DATA_SOURCE to see if that helps, but the insert still takes 60+ minutes.
No indexes are needed on DATA_IMPORT, since it's just a holding table.
Table Structures
CREATE TABLE [dbo].[DATA_IMPORT](
[field1] [nvarchar](255) NOT NULL,
[field2] [nvarchar](255) NOT NULL,
[field3] [nvarchar](255) NOT NULL,
[field4] [nvarchar](255) NOT NULL,
[field5] [nvarchar](255) NOT NULL,
[field6] [nvarchar](255) NOT NULL,
[field7] [nvarchar](255) NOT NULL,
[field8] [nvarchar](255) NOT NULL,
[field9] [nvarchar](255) NOT NULL,
[field10] [nvarchar](255) NOT NULL,
[measure1] int NULL,
[measure2] decimal(10,2) NULL,
[measure3] decimal(10,5) NULL,
[measure4] decimal(7,2) NULL,
[measure5] decimal(10,5) NULL
)
CREATE TABLE [dbo].[DATA_SOURCE](
[EFF_DATE] [datetime] NOT NULL,
[EFF_STATUS] [nvarchar](255) NOT NULL,
[DATA_SOURCE] [nvarchar](255) NOT NULL,
[PERIOD] [date] NOT NULL,
[field1] [nvarchar](255) NOT NULL,
[field2] [nvarchar](255) NOT NULL,
[field3] [nvarchar](255) NOT NULL,
[field4] [nvarchar](255) NOT NULL,
[field5] [nvarchar](255) NOT NULL,
[field6] [nvarchar](255) NOT NULL,
[field7] [nvarchar](255) NOT NULL,
[field8] [nvarchar](255) NOT NULL,
[field9] [nvarchar](255) NOT NULL,
[field10] [nvarchar](255) NOT NULL,
[measure1] int NULL,
[measure2] decimal(10,2) NULL,
[measure3] decimal(10,5) NULL,
[measure4] decimal(7,2) NULL,
[measure5] decimal(10,5) NULL,
[measure6] decimal(11,3) NULL,
[REC_CREATEDBY] [nvarchar](50) NOT NULL,
[REC_CREATEDON] [datetime] NOT NULL,
[REC_LASTUPDATEDBY] [nvarchar](50) NULL,
[REC_LASTUPDATEDON] [datetime] NULL
)
INSERT SQL
--YYYY-MM-DD
Declare @varPeriod varchar(30) = '2020-01-01'
Declare @varDataSource varchar(255) = 'https://blah.com'
INSERT INTO [DATA_SOURCE] (
[EFF_DATE],[EFF_STATUS],[DATA_SOURCE],[PERIOD],
[field1],[field2],[field3],[field4],[field5],
[field6],[field7],[field8],[field9],[field10],
[measure1],[measure2],[measure3],[measure4],[measure5],
[measure6],
[REC_CREATEDBY],[REC_CREATEDON], [REC_LASTUPDATEDBY], [REC_LASTUPDATEDON])
SELECT
SYSDATETIME() AS [EFF_DATE]
,'A' AS [EFF_STATUS]
,@varDataSource AS [DATA_SOURCE],
CONVERT(varchar, @varPeriod, 100) AS [PERIOD],
[field1],[field2],[field3],[field4],[field5],
[field6],[field7],[field8],[field9],[field10],
[measure1],[measure2],[measure3],[measure4],[measure5]
,CAST([measure1]*[measure2] AS numeric(11,3)) as [measure6]
,'DATA_LOADER' AS [REC_CREATEDBY]
,SYSDATETIME() AS [REC_CREATEDON]
,'DATA_LOADER' AS [REC_LASTUPDATEDBY]
,SYSDATETIME() AS [REC_LASTUPDATEDON]
FROM [dbo].[DATA_IMPORT];
GO
What performance recommendations do you have so I can insert these 20m rows quickly?
I will also need to add 3 or 4 indexes once I join to my dimensional data.
Thanks for your help all
Jay
EDIT: Use BULK INSERT for better performance when inserting data.
PS: Another important thing to look at is the number of DTUs / vCores you have assigned to your database.
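To make the two points above a bit more concrete, here is a rough sketch rather than a tested recipe. BULK INSERT itself reads from a data file, so for a table-to-table copy I have swapped in the closest equivalent I know of: INSERT ... SELECT with a TABLOCK hint on the target. That hint is my assumption, not part of the original post. On a target with no indexes it can let SQL Server run the insert in parallel, but whether it helps on your service tier is something you need to measure. The second query reads sys.dm_db_resource_stats, which Azure SQL Database exposes, to see whether the load is pinned at the CPU, data IO, or log write limit of the current DTU / vCore size.

```sql
-- Sketch only: same load as in the question, with a table lock hint on the target.
-- The hint and the typed @varPeriod variable are assumptions made for illustration.
DECLARE @varPeriod     date         = '2020-01-01';
DECLARE @varDataSource varchar(255) = 'https://blah.com';

INSERT INTO dbo.DATA_SOURCE WITH (TABLOCK) (
    [EFF_DATE], [EFF_STATUS], [DATA_SOURCE], [PERIOD],
    [field1], [field2], [field3], [field4], [field5],
    [field6], [field7], [field8], [field9], [field10],
    [measure1], [measure2], [measure3], [measure4], [measure5],
    [measure6],
    [REC_CREATEDBY], [REC_CREATEDON], [REC_LASTUPDATEDBY], [REC_LASTUPDATEDON])
SELECT
    SYSDATETIME(), 'A', @varDataSource, @varPeriod,
    [field1], [field2], [field3], [field4], [field5],
    [field6], [field7], [field8], [field9], [field10],
    [measure1], [measure2], [measure3], [measure4], [measure5],
    CAST([measure1] * [measure2] AS numeric(11,3)),
    'DATA_LOADER', SYSDATETIME(), 'DATA_LOADER', SYSDATETIME()
FROM dbo.DATA_IMPORT;

-- While (or right after) the load runs: recent resource usage for this database.
-- Values close to 100 point at the resource that is throttling the load.
SELECT TOP (20)
    end_time,
    avg_cpu_percent,
    avg_data_io_percent,
    avg_log_write_percent
FROM sys.dm_db_resource_stats
ORDER BY end_time DESC;
```

If avg_log_write_percent sits near 100 for the whole load, scaling the database up for the duration of the load is likely to matter more than any change to the statement itself.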
I have 2 tables, each containing 400-500k records.
CREATE TABLE [dbo].[User](
[UserId] [int] IDENTITY(1,1) NOT NULL,
[Password] [nvarchar](max) NULL,
[RoleId] [int] NOT NULL,
[Name] [nvarchar](max) NULL,
[Address] [nvarchar](max) NULL,
[Email] [nvarchar](max) NULL,
[Landline] [nvarchar](max) NULL,
[MobileNumberCode] [int] NULL,
[MobileNumber] [nvarchar](max) NULL,
[DateOfBirth] [datetime] NULL,
[MarriageDate] [datetime] NULL,
[CreatedDate] [datetime] NOT NULL,
[UpdatedDate] [datetime] NOT NULL,
[Status] [nvarchar](max) NOT NULL,
[BranchId] [int] NULL,
[UserTitle] [nvarchar](50) NULL,
[MiddleName] [nvarchar](50) NULL,
[LastName] [nvarchar](50) NULL,
[HouseNumber] [nvarchar](50) NULL,
[BuildingNumber] [nvarchar](50) NULL,
[RoadNumber] [nvarchar](50) NULL,
[BlockNumber] [nvarchar](50) NULL,
[City] [nvarchar](50) NULL,
[NearBranchId] [int] NULL,
[MobileIsValid] [bit] NULL,
[EmailIsValid] [bit] NULL,
[Gender] [nvarchar](50) NULL,
[SourceId] [int] NULL)
CREATE TABLE [dbo].[PurchaseOrder](
[PurchaseOrderId] [int] NOT NULL,
[BranchId] [int] NOT NULL,
[PurchaseDate] [datetime] NOT NULL,
[Amount] [decimal](18, 3) NOT NULL,
[UserId] [int] NOT NULL,
[Status] [nvarchar](max) NULL,
[sbs_no] [int] NOT NULL)
I also have a stored procedure that gets data from these tables using a join.
CREATE PROC Sp_SearchCustomer (@FromDate datetime = null,
@ToDate datetime = null,
@RegFromDate datetime = null,
@RegToDate datetime = null)
AS
BEGIN
select a.UserId,a.Name,b.PurchaseOrderId,b.Amount from dbo.[User] a left join PurchaseOrder b on a.UserId=b.UserId
where
((a.CreatedDate >= ''' + cast(@RegFromDate as varchar) + ''')
AND (a.CreatedDate <= ''' + cast(@RegToDate as varchar) + '''))
and ((b.PurchaseDate >= ''' + cast(@FromDate as varchar) + ''')
AND (b.PurchaseDate <= ''' + cast(@ToDate as varchar) + '''))
END
When I execute this procedure with dates, it fails with a "The wait operation timed out" exception. Please help me solve this issue.
The dates in your tables and the parameters in your procedure are both datetime. That is fine as it is, so there is no need to convert them to varchar.
Besides, the cast is surrounded by quotes, so it is never executed. It simply becomes a string literal:
where ((a.CreatedDate >= 'cast(@RegFromDate as varchar)')...
There are also far too many unnecessary parentheses, since all the conditions are combined with AND.
Try this instead:
CREATE PROC Sp_SearchCustomer (
@FromDate datetime = null,
@ToDate datetime = null,
@RegFromDate datetime = null,
@RegToDate datetime = null
)
AS
BEGIN
SELECT a.UserId
,a.Name
,b.PurchaseOrderId
,b.Amount
FROM dbo.[User] a
LEFT JOIN PurchaseOrder b
ON a.UserId = b.UserId
WHERE
a.CreatedDate >= @RegFromDate
AND a.CreatedDate <= @RegToDate
AND b.PurchaseDate >= @FromDate
AND b.PurchaseDate <= @ToDate
END
Once the query has been improved, you can test it again.
You should also look at Statistics and Indexes and make sure that Statistics are up-to-date and Indexes are not fragmented.
For Statistics, you can use: exec sp_updatestats
For Indexes on these 2 tables, look at the Fragmentation % and choose to REBUILD or REORGANIZE them.
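As a rough sketch of the fragmentation check, something like the query below lists the fragmentation percentage per index for the two tables. It uses the sys.dm_db_index_physical_stats function in LIMITED mode. The tables as posted show no indexes, so until you create some you will only see HEAP rows with a NULL index name. The 5% / 30% cut-offs in the comments are a commonly cited rule of thumb, not something taken from your workload.

```sql
-- Fragmentation per index (or heap) for the two tables in the question.
SELECT
    OBJECT_NAME(ips.object_id)        AS table_name,
    i.name                            AS index_name,   -- NULL for a heap
    ips.index_type_desc,
    ips.avg_fragmentation_in_percent,
    ips.page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
INNER JOIN sys.indexes AS i
    ON  i.object_id = ips.object_id
    AND i.index_id  = ips.index_id
WHERE ips.object_id IN (OBJECT_ID(N'dbo.[User]'), OBJECT_ID(N'dbo.PurchaseOrder'))
ORDER BY ips.avg_fragmentation_in_percent DESC;

-- Rule of thumb (assumption): roughly 5-30% fragmentation -> REORGANIZE,
-- above roughly 30% -> REBUILD. Uncomment whichever applies.
-- ALTER INDEX ALL ON dbo.[User] REORGANIZE;
-- ALTER INDEX ALL ON dbo.PurchaseOrder REBUILD;
```

On very small indexes (a few hundred pages or fewer) the fragmentation figure is mostly noise, so page_count is included to help you ignore those.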
select
a.COUNTY_FIPS
,COUNT(e.PROPERTY_ID) as house_count
,AVG(cast(e.AVM_FINAL_VALUE as bigint)) as avg_avm
,max(cast(e.AVM_FINAL_VALUE as bigint)) as max_avm
,min(cast(e.AVM_FINAL_VALUE as bigint)) as min_avm
from
RAW_Equity e
left join
(SELECT
SA_PROPERTY_ID, MM_FIPS_STATE_CODE, MM_FIPS_MUNI_CODE,
CASE
WHEN MM_FIPS_STATE_CODE < 10
THEN '0' + CAST(MM_FIPS_STATE_CODE as VARCHAR)
ELSE CAST(MM_FIPS_STATE_CODE as VARCHAR)
END
+ CASE
WHEN MM_FIPS_MUNI_CODE < 10
THEN '00' + CAST(MM_FIPS_MUNI_CODE as VARchar)
WHEN MM_FIPS_MUNI_CODE < 100
THEN '0' + CAST(MM_FIPS_MUNI_CODE as VARchar)
ELSE CAST(MM_FIPS_MUNI_CODE as VARchar)
END AS COUNTY_FIPS
FROM
RAW_Address) a ON a.SA_PROPERTY_ID = e.PROPERTY_ID
where
AVM_CONFIDENCE_SCORE >= 70
group by
a.COUNTY_FIPS
Is there any way I can improve the performance of this query? The schemas for both tables are shown below. I was thinking about creating a nonclustered index on AVM_CONFIDENCE_SCORE, but I think it will only increase the query time. Any help will be greatly appreciated.
RAW_Address table:
CREATE TABLE [dbo].[RAW_Address]
(
[SA_PROPERTY_ID] [int] NOT NULL,
[SA_SCM_ID] [int] NOT NULL,
[MM_STATE_CODE] [varchar](2) NOT NULL,
[MM_MUNI_NAME] [varchar](24) NOT NULL,
[MM_FIPS_STATE_CODE] [tinyint] NOT NULL,
[MM_FIPS_MUNI_CODE] [smallint] NOT NULL,
[MM_FIPS_COUNTY_NAME] [varchar](35) NOT NULL,
[SA_SITE_HOUSE_NBR] [varchar](20) NULL,
[SA_SITE_FRACTION] [varchar](10) NULL,
[SA_SITE_DIR] [varchar](2) NULL,
[SA_SITE_STREET_NAME] [varchar](40) NULL,
[SA_SITE_SUF] [varchar](4) NULL,
[SA_SITE_POST_DIR] [varchar](2) NULL,
[SA_SITE_UNIT_PRE] [varchar](10) NULL,
[SA_SITE_UNIT_VAL] [varchar](6) NULL,
[SA_SITE_CITY] [varchar](30) NULL,
[SA_SITE_STATE] [varchar](2) NOT NULL,
[SA_SITE_ZIP] [int] NULL,
[SA_SITE_PLUS_4] [smallint] NULL,
[SA_SITE_CRRT] [varchar](4) NULL,
[SA_MAIL_HOUSE_NBR] [varchar](20) NULL,
[SA_MAIL_FRACTION] [varchar](10) NULL,
[SA_MAIL_DIR] [varchar](2) NULL,
[SA_MAIL_STREET_NAME] [varchar](50) NULL,
[SA_MAIL_SUF] [varchar](4) NULL,
[SA_MAIL_POST_DIR] [varchar](2) NULL,
[SA_MAIL_UNIT_PRE] [varchar](10) NULL,
[SA_MAIL_UNIT_VAL] [varchar](6) NULL,
[SA_MAIL_CITY] [varchar](50) NULL,
[SA_MAIL_STATE] [varchar](2) NULL,
[SA_MAIL_ZIP] [int] NULL,
[SA_MAIL_PLUS_4] [smallint] NULL,
[SA_MAIL_CRRT] [varchar](4) NULL,
[SA_SITE_MAIL_SAME] [varchar](1) NULL
) ON [PRIMARY]
RAW_Equity table:
CREATE TABLE [dbo].[RAW_Equity]
(
[PROPERTY_ID] [int] NOT NULL,
[SCM_ID] [int] NOT NULL,
[MM_STATE_CODE] [varchar](2) NOT NULL,
[MM_MUNI_NAME] [varchar](24) NOT NULL,
[MM_FIPS_STATE_CODE] [int] NOT NULL,
[MM_FIPS_MUNI_CODE] [int] NOT NULL,
[MM_FIPS_COUNTY_NAME] [varchar](35) NOT NULL,
[AVM_FINAL_VALUE] [int] NULL,
[AVM_LOW_VALUE] [int] NULL,
[AVM_HIGH_VALUE] [int] NULL,
[AVM_CONFIDENCE_SCORE] [int] NULL,
[FINAL_VALUE] [float] NULL,
[FIRST_POSITION_SR_UNIQUE_ID] [int] NULL,
[FIRST_POSITION_LOAN_DATE] [int] NULL,
[FIRST_POSITION_DOC_NBR] [varchar](20) NULL,
[FIRST_POSITION_LOAN_VAL] [int] NULL,
[FIRST_POSITION_LENDER_CODE] [int] NULL,
[FIRST_POSITION_LNDR_LAST_NAME] [varchar](50) NULL,
[FIRST_POSITION_LNDR_FIRST_NAME] [varchar](50) NULL,
[FIRST_POSITION_LENDER_TYPE] [varchar](1) NULL,
[FIRST_POSITION_LOAN_TYPE] [varchar](1) NULL,
[FIRST_POSITION_INTEREST_RATE_TYPE] [varchar](1) NULL,
[FIRST_POSITION_ESTIMATED_INTEREST_RATE] [float] NULL,
[FIRST_POSITION_LNDR_CREDIT_LINE] [varchar](1) NULL,
[FIRST_POSITION_MODELED_MORTGAGE_TYPE] [varchar](1) NULL,
[SECOND_POSITION_SR_UNIQUE_ID] [int] NULL,
[SECOND_POSITION_LOAN_DATE] [int] NULL,
[SECOND_POSITION_DOC_NBR] [varchar](20) NULL,
[SECOND_POSITION_LOAN_VAL] [int] NULL,
[SECOND_POSITION_LENDER_CODE] [int] NULL,
[SECOND_POSITION_LNDR_LAST_NAME] [varchar](50) NULL,
[SECOND_POSITION_LNDR_FIRST_NAME] [varchar](50) NULL,
[SECOND_POSITION_LENDER_TYPE] [varchar](1) NULL,
[SECOND_POSITION_LOAN_TYPE] [varchar](1) NULL,
[SECOND_POSITION_INTEREST_RATE_TYPE] [varchar](1) NULL,
[SECOND_POSITION_ESTIMATED_INTEREST_RATE] [float] NULL,
[SECOND_POSITION_LNDR_CREDIT_LINE] [varchar](1) NULL,
[SECOND_POSITION_MODELED_MORTGAGE_TYPE] [varchar](1) NULL,
[THIRD_POSITION_SR_UNIQUE_ID] [int] NULL,
[THIRD_POSITION_LOAN_DATE] [int] NULL,
[THIRD_POSITION_DOC_NBR] [varchar](20) NULL,
[THIRD_POSITION_LOAN_VAL] [int] NULL,
[THIRD_POSITION_LENDER_CODE] [int] NULL,
[THIRD_POSITION_LNDR_LAST_NAME] [varchar](50) NULL,
[THIRD_POSITION_LNDR_FIRST_NAME] [varchar](50) NULL,
[THIRD_POSITION_LENDER_TYPE] [varchar](1) NULL,
[THIRD_POSITION_LOAN_TYPE] [varchar](1) NULL,
[THIRD_POSITION_INTEREST_RATE_TYPE] [varchar](1) NULL,
[THIRD_POSITION_ESTIMATED_INTEREST_RATE] [float] NULL,
[THIRD_POSITION_LNDR_CREDIT_LINE] [varchar](1) NULL,
[THIRD_POSITION_MODELED_MORTGAGE_TYPE] [varchar](1) NULL,
[TOTAL_OUTSTANDING_LOANS] [bigint] NULL,
[LTV] [int] NULL,
[AVAILABLE_EQUITY] [int] NULL,
[LENDABLE_EQUITY] [int] NULL,
[PROCESS_ID] [int] NOT NULL,
[FILLER] [varchar](4) NULL,
CONSTRAINT [PK_RAW_Equity] PRIMARY KEY CLUSTERED
(
[PROPERTY_ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
You could try it this way:
(I'd like to know how much this helps you.)
Declare @result Table
(
RowId Int Identity(1,1) Primary Key
,COUNTY_FIPS Varchar(100)
,MM_FIPS_STATE_CODE Int
,MM_FIPS_MUNI_CODE Int
,house_count Int
,avg_avm Int
,max_avm Int
,min_avm Int
)
Insert Into @result(MM_FIPS_STATE_CODE,MM_FIPS_MUNI_CODE,house_count,avg_avm,max_avm,min_avm)
Select a.MM_FIPS_STATE_CODE
,a.MM_FIPS_MUNI_CODE
,Count(e.PROPERTY_ID) as house_count
,Avg(Cast(e.AVM_FINAL_VALUE as bigint)) as avg_avm
,Max(Cast(e.AVM_FINAL_VALUE as bigint)) as max_avm
,Min(Cast(e.AVM_FINAL_VALUE as bigint)) as min_avm
From RAW_Equity As e With (Nolock)
Left Join RAW_Address As a With (Nolock) On e.PROPERTY_ID = a.SA_PROPERTY_ID
Where e.AVM_CONFIDENCE_SCORE >= 70
Group by a.MM_FIPS_STATE_CODE
,a.MM_FIPS_MUNI_CODE
Update r
Set r.COUNTY_FIPS = REPLICATE('0',2-LEN(RTRIM(r.MM_FIPS_STATE_CODE))) + RTRIM(r.MM_FIPS_STATE_CODE) + REPLICATE('0',3-LEN(RTRIM(r.MM_FIPS_MUNI_CODE))) + RTRIM(r.MM_FIPS_MUNI_CODE)
From @result As r
Select r.COUNTY_FIPS
,r.house_count
,r.avg_avm
,r.max_avm
,r.min_avm
From @result As r
First try it without any index. After that, create the index below and run the same query again to compare:
CREATE INDEX IX_RAW_Address_SA_PROPERTY_ID ON RAW_Address(SA_PROPERTY_ID)
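One way to make that before/after comparison measurable is to switch on SET STATISTICS IO and SET STATISTICS TIME around the query, run it once without the index and once with it, and compare the logical reads and elapsed time shown in the Messages tab. The query in the middle is only a stand-in built from the join and filter in the question; replace it with whichever version you are testing.

```sql
-- Report logical reads and CPU/elapsed time for each statement in this session.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Stand-in for the query under test: same join and filter as in the question.
SELECT
    a.MM_FIPS_STATE_CODE,
    a.MM_FIPS_MUNI_CODE,
    COUNT(e.PROPERTY_ID) AS house_count
FROM dbo.RAW_Equity AS e
LEFT JOIN dbo.RAW_Address AS a
    ON a.SA_PROPERTY_ID = e.PROPERTY_ID
WHERE e.AVM_CONFIDENCE_SCORE >= 70
GROUP BY a.MM_FIPS_STATE_CODE, a.MM_FIPS_MUNI_CODE;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

Run each variant more than once, since the first execution also pays for reading pages from disk into the buffer cache.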
I would put an index on:
Table: RAW_Equity
Columns: PROPERTY_ID, AVM_CONFIDENCE_SCORE
and
Table: RAW_Address
Columns: SA_PROPERTY_ID
Include: MM_FIPS_STATE_CODE,MM_FIPS_MUNI_CODE
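Written out as DDL, that suggestion would look roughly like this. The index names are made up for illustration, and both are assumed to be nonclustered, since RAW_Equity already has its clustered primary key on PROPERTY_ID.

```sql
-- Hypothetical index names; key and included columns are the ones listed above.
CREATE NONCLUSTERED INDEX IX_RAW_Equity_PropertyId_Confidence
    ON dbo.RAW_Equity (PROPERTY_ID, AVM_CONFIDENCE_SCORE);

CREATE NONCLUSTERED INDEX IX_RAW_Address_PropertyId_Fips
    ON dbo.RAW_Address (SA_PROPERTY_ID)
    INCLUDE (MM_FIPS_STATE_CODE, MM_FIPS_MUNI_CODE);
```

The INCLUDE columns let the join read the FIPS codes straight from the index without going back to the RAW_Address heap for each row.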
Your query can be simplified, which will likely make it faster:
select
REPLICATE('0',2-LEN(RTRIM(a.MM_FIPS_STATE_CODE)))
+ RTRIM(a.MM_FIPS_STATE_CODE)
+ REPLICATE('0',3-LEN(RTRIM(a.MM_FIPS_MUNI_CODE)))
+ RTRIM(a.MM_FIPS_MUNI_CODE)
AS COUNTY_FIPS
,COUNT(e.PROPERTY_ID) as house_count
,AVG(cast(e.AVM_FINAL_VALUE as bigint)) as avg_avm
,max(cast(e.AVM_FINAL_VALUE as bigint)) as max_avm
,min(cast(e.AVM_FINAL_VALUE as bigint)) as min_avm
from
RAW_Equity e
left join RAW_Address a
ON a.SA_PROPERTY_ID = e.PROPERTY_ID
where
e.AVM_CONFIDENCE_SCORE >= 70
group by
a.MM_FIPS_STATE_CODE, a.MM_FIPS_MUNI_CODE
If RAW_Address doesn't already have a CLUSTERED INDEX with SA_PROPERTY_ID as the first key of the index, then this may help:
CREATE INDEX IX_RAW_Address_SA_PROPERTY_ID ON RAW_Address(SA_PROPERTY_ID)
INCLUDE (MM_FIPS_STATE_CODE, MM_FIPS_MUNI_CODE)
I'm trying to replay a SQL Server 2014 Profiler trace that I saved to a DB table. When I open it, I get a "Failed to open a table" error message. There is nothing in the Windows logs.
I googled it, and this error used to happen when upgrading a SQL Server 2000 system to a 64-bit system. That doesn't apply here: I'm running Windows Server 2012 with a fresh install of SQL Server 2014.
The trace used the TSQL_Replay template. I saved it to a table using the following code, which produced a table with the definition shown.
SELECT *
INTO myTrace
FROM ::fn_trace_gettable(N'c:\Logs\sql_trace_events.trc', default)
CREATE TABLE [dbo].[myTrace]
(
[TextData] [ntext] NULL,
[BinaryData] [image] NULL,
[DatabaseID] [int] NULL,
[TransactionID] [bigint] NULL,
[LineNumber] [int] NULL,
[NTUserName] [nvarchar](256) NULL,
[NTDomainName] [nvarchar](256) NULL,
[HostName] [nvarchar](256) NULL,
[ClientProcessID] [int] NULL,
[ApplicationName] [nvarchar](256) NULL,
[LoginName] [nvarchar](256) NULL,
[SPID] [int] NULL,
[Duration] [bigint] NULL,
[StartTime] [datetime] NULL,
[EndTime] [datetime] NULL,
[Reads] [bigint] NULL,
[Writes] [bigint] NULL,
[CPU] [int] NULL,
[Permissions] [bigint] NULL,
[Severity] [int] NULL,
[EventSubClass] [int] NULL,
[ObjectID] [int] NULL,
[Success] [int] NULL,
[IndexID] [int] NULL,
[IntegerData] [int] NULL,
[ServerName] [nvarchar](256) NULL,
[EventClass] [int] NULL,
[ObjectType] [int] NULL,
[NestLevel] [int] NULL,
[State] [int] NULL,
[Error] [int] NULL,
[Mode] [int] NULL,
[Handle] [int] NULL,
[ObjectName] [nvarchar](256) NULL,
[DatabaseName] [nvarchar](256) NULL,
[FileName] [nvarchar](256) NULL,
[OwnerName] [nvarchar](256) NULL,
[RoleName] [nvarchar](256) NULL,
[TargetUserName] [nvarchar](256) NULL,
[DBUserName] [nvarchar](256) NULL,
[LoginSid] [image] NULL,
[TargetLoginName] [nvarchar](256) NULL,
[TargetLoginSid] [image] NULL,
[ColumnPermissions] [int] NULL,
[LinkedServerName] [nvarchar](256) NULL,
[ProviderName] [nvarchar](256) NULL,
[MethodName] [nvarchar](256) NULL,
[RowCounts] [bigint] NULL,
[RequestID] [int] NULL,
[XactSequence] [bigint] NULL,
[EventSequence] [bigint] NULL,
[BigintData1] [bigint] NULL,
[BigintData2] [bigint] NULL,
[GUID] [uniqueidentifier] NULL,
[IntegerData2] [int] NULL,
[ObjectID2] [bigint] NULL,
[Type] [int] NULL,
[OwnerID] [int] NULL,
[ParentName] [nvarchar](256) NULL,
[IsSystem] [int] NULL,
[Offset] [int] NULL,
[SourceDatabaseID] [int] NULL,
[SqlHandle] [image] NULL,
[SessionLoginName] [nvarchar](256) NULL,
[PlanHandle] [image] NULL,
[GroupID] [int] NULL
)
I tried the same thing and did not run into any issues. Have you tried with a new trace, saved to a table with a different name?
You have to wait: the 'Replay' option is grayed out for about 1 minute until the script has fully loaded.
I had the same issue, and it turned out I was trying to open a trace recorded in Profiler 2014 with Profiler 2008 on a different SQL instance in order to replay the trace. Upgrading Profiler to 2014 on the replay instance solved the problem.
You have to create a table with a specific structure first. Try exporting the trace to a table from Profiler and look at what it created. Then just insert the matching subset of columns into that table. Here is what I used for SQL 2012-2017:
------- Trace created with Replay template
USE [testdb]
GO
/****** Object: Table [dbo].[TraceTable] Script Date: 29-Oct-18 17:37:07 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[TraceTableSQL1]
(
[RowNumber] [int] IDENTITY ( 0 , 1 ) NOT NULL ,
[EventClass] [int] NULL ,
[BinaryData] [image] NULL ,
[DatabaseID] [int] NULL ,
[NTUserName] [nvarchar] ( 128 ) NULL ,
[NTDomainName] [nvarchar] ( 128 ) NULL ,
[HostName] [nvarchar] ( 128 ) NULL ,
[ClientProcessID] [int] NULL ,
[ApplicationName] [nvarchar] ( 128 ) NULL ,
[LoginName] [nvarchar] ( 128 ) NULL ,
[SPID] [int] NULL ,
[StartTime] [datetime] NULL ,
[EndTime] [datetime] NULL ,
[Error] [int] NULL ,
[DatabaseName] [nvarchar] ( 128 ) NULL ,
[RowCounts] [bigint] NULL ,
[RequestID] [int] NULL ,
[EventSequence] [bigint] NULL ,
[IsSystem] [int] NULL ,
[ServerName] [nvarchar] ( 128 ) NULL ,
[TextData] [ntext] NULL ,
[EventSubClass] [int] NULL ,
[Handle] [int] NULL ,
PRIMARY KEY CLUSTERED
(
[RowNumber] ASC
)
WITH ( PAD_INDEX = OFF , STATISTICS_NORECOMPUTE = OFF , IGNORE_DUP_KEY = OFF , ALLOW_ROW_LOCKS = ON , ALLOW_PAGE_LOCKS = ON ) ON [PRIMARY]
)
ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
INSERT [TraceTableSQL1]
SELECT
[EventClass] ,
[BinaryData] ,
[DatabaseID] ,
[NTUserName] ,
[NTDomainName] ,
[HostName] ,
[ClientProcessID] ,
[ApplicationName] ,
[LoginName] ,
[SPID] ,
[StartTime] ,
[EndTime] ,
[Error] ,
[DatabaseName] ,
[RowCounts] ,
[RequestID] ,
[EventSequence] ,
[IsSystem] ,
[ServerName] ,
[TextData] ,
[EventSubClass] ,
[Handle]
FROM sys.fn_trace_gettable ( N'd:\temp\profiler.trc' , DEFAULT )
I have a real-world SQL optimization problem that I think is a typical case and will help a lot of people.
The environment is SQL Server 2005.
Firstly, create the main table. This is a person info table.
CREATE TABLE [dbo].[OLAPAgentDim](
[RoleID] [varchar](50) NULL CONSTRAINT [DF_OLAPAgentDim_RoleID] DEFAULT ((1)),
[OLAPKey] [bigint] IDENTITY(1,1) NOT NULL,
[FatherKey] [bigint] NULL,
[FatherKeyValue] [nvarchar](100) NULL,
[System] [varchar](6) NULL,
[Level] [int] NULL,
[IfLeaf] [real] NULL,
[IfDel] [real] NULL CONSTRAINT [DF_OLAPAgentDim_IfDel] DEFAULT ((0)),
[SourceKey] [varchar](50) NULL,
[MainDemoName] [nvarchar](100) NULL,
[FastCode] [varchar](50) NULL,
[TagValue] [varchar](50) NULL,
[Script] [nvarchar](max) NULL,
[Birthday] [datetime] NULL,
[EarlyStartTime] [datetime] NULL,
[StartTime] [datetime] NULL,
[EndTime] [datetime] NULL,
[EditTime] [datetime] NULL,
[BecomesTime] [datetime] NULL,
[ContractTime] [datetime] NULL,
[ContractEndTime] [datetime] NULL,
[XMLIcon] [nvarchar](max) NULL,
[PassKey] [varchar](50) NULL CONSTRAINT [DF_OLAPAgentDim_PassKey] DEFAULT ('N3pkY3RHaeZXA9mGJdfm8A=='),
[Address] [nvarchar](100) NULL,
[HomeTel] [varchar](50) NULL,
[Mobile] [varchar](50) NULL,
[Email] [varchar](100) NULL,
[IDCard] [varchar](50) NULL,
[IDSecu] [varchar](50) NULL,
[IDEndowment] [varchar](50) NULL,
[IDAccumulation] [varchar](50) NULL,
[ContactPerson] [nvarchar](100) NULL,
[ContactPersonTel] [varchar](50) NULL,
[Others1] [varchar](50) NULL,
[SexKey] [varchar](2) NULL CONSTRAINT [DF_OLAPAgentDim_SexKey] DEFAULT ((1)),
[SexKeyValue] [nvarchar](100) NULL,
[MarrageKey] [varchar](2) NULL CONSTRAINT [DF_OLAPAgentDim_MarrageKey] DEFAULT ((1)),
[MarrageKeyValue] [nvarchar](100) NULL,
[Nation] [nvarchar](50) NULL,
[Race] [nvarchar](50) NULL,
[PartyMemberKey] [varchar](2) NULL CONSTRAINT [DF_OLAPAgentDim_PartyMemberKey] DEFAULT ((1)),
[PartyMemberKeyValue] [nvarchar](100) NULL,
[RegionKey] [bigint] NULL CONSTRAINT [DF_OLAPAgentDim_RegionKey] DEFAULT ((1)),
[RegionKeyValue] [nvarchar](100) NULL,
[LeaveResonKey] [bigint] NULL CONSTRAINT [DF_OLAPAgentDim_LeaveResonKey] DEFAULT ((1)),
[LeaveResonKeyValue] [nvarchar](100) NULL,
[RoleStr] [varchar](max) NULL,
[RoleStrValue] [nvarchar](max) NULL,
[LeaderKey] [bigint] NULL CONSTRAINT [DF_OLAPAgentDim_LeaderKey] DEFAULT ((1)),
[LeaderKeyValue] [nvarchar](100) NULL,
[FastCode2] [varchar](50) NULL,
[FastCode3] [varchar](50) NULL,
[FastCode4] [varchar](50) NULL,
[FastCode5] [varchar](50) NULL,
[OtherAddress] [nvarchar](100) NULL,
[ShowOrder] [int] NULL,
[RaceKey] [bigint] NULL DEFAULT ((1)),
[RaceKeyValue] [nvarchar](100) NULL,
[DepartLevelKey] [bigint] NULL DEFAULT ((1)),
[DepartLevelKeyValue] [nvarchar](100) NULL,
[forumname] [nvarchar](100) NULL,
[IfCloseKey] [bigint] NULL DEFAULT ((1)),
[IfCloseKeyValue] [nvarchar](100) NULL,
[InsureStartTime] [datetime] NULL,
[AccumulationStartTime] [datetime] NULL,
[Rate] [varchar](50) NULL,
[DirectLeaderKey] [bigint] NULL CONSTRAINT [DF_OLAPAgentDim_DirectLeaderKey] DEFAULT ((1)),
[DirectLeaderAttriKey] [bigint] NULL CONSTRAINT [DF_OLAPAgentDim_DirectLeaderAttriKey] DEFAULT ((1)),
[DirectLeaderKeyValue] [nvarchar](100) NULL,
[DirectLeaderSourceKey] [varchar](50) NULL,
[DirectLeaderPartName] [nvarchar](100) NULL,
[DirectLeaderPositionName] [nvarchar](100) NULL,
[NOTSync] [int] NULL,
[FatherPath] [nvarchar](max) NULL,
[SaleDiscount] [real] NULL,
CONSTRAINT [PK_OLAPAgent Dim] PRIMARY KEY CLUSTERED
(
[OLAPKey] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
Secondly, insert about 10,000 records into the table. I don't think 10,000 records is a very big number for SQL Server. As you can see, this is in fact a parent-child dimension table. Records with IfLeaf=0 are department structure nodes, and records with IfLeaf=1 are persons. The parent-child relationship is defined by the FatherKey column. For example:
OLAPKey  IfLeaf  FatherKey  DepartLevelKey  MainDemoName
2        0       0          1               IBM Company
3        0       2          2               Sales Depart
4        0       2          2               Service Depart
5        0       3          3               Sales Team1
6        1       5          NULL            John Smith
7        1       4          NULL            Mary
......
The DepartLevelKey column holds the department node's level.
So this table can store the whole HR tree.
Thirdly, here is the problem SQL:
create table #t
(
TableID int IDENTITY(1,1),
OLAPKey bigint,
MainDemoName nvarchar(max)
)
declare @t4 table
(
TableID int IDENTITY(1,1),
MainDemoName nvarchar(max),
OLAPKeystr varchar(100)
)
declare @agentkey bigint
set @agentkey = 2
--Part A
--Filter on DepartLevelKey to get all departments at that level below the @agentkey node
;WITH Result AS(
SELECT OLAPKey,DepartLevelKey,maindemoname FROM OLAPAgentDim WHERE OLAPKey =@agentkey
UNION ALL
SELECT a.OLAPKey,a.DepartLevelKey,a.maindemoname FROM OLAPAgentDim AS a,Result AS b WHERE a.FatherKey = b.OLAPKey
)
insert #t select OLAPKey,maindemoname from Result where DepartLevelKey=4
--Part B
;with One as
(
select *,convert(varchar(50),OLAPKey) as Re from #t
)
insert @t4 select maindemoname,stuff((select ','+Re from One where One.maindemoname=#t.maindemoname for xml path('')),1,1,'') as Two
from #t
group by maindemoname
drop table #t
The SQL above is divided into Part A and Part B.
Part A gets all the children below a root node, filtered to those that belong to the specified DepartLevelKey. For example, to get all persons in the Sales Department's child departments with level=3.
Part B pivots the rows into a single delimited column. For example, it changes:
TableID  OLAPKey  MainDemoName
1        6        Sales Team1
2        10       Sales Team1
3        12       Sales Team1
to:
TableID  MainDemoName  OLAPKeystr
1        Sales Team1   6,10,12
This gives us the persons of each target department for further processing (omitted here).
The problem:
Part A is very slow: it takes about 5 minutes. Part B is slow too.
I wonder how to optimize it given the existing table structure.
Yours,
Ivan
Try:
(i) Adding this index to OLAPAgentDim:
create index IX_OLAPAgentDim_FatherKey on OLAPAgentDim (FatherKey) include (DepartLevelKey, MainDemoName)
(ii) Changing MainDemoName in #t from nvarchar(max) to nvarchar(100). This matches the column definition in OLAPAgentDim.
(iii) Between Part A and Part B, i.e. after Part A and before Part B, adding this index to #t:
create clustered index IX on #t (MainDemoName)
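Put together, steps (i) to (iii) would slot into the original script roughly as follows. This is a sketch under a few assumptions: it keeps the DepartLevelKey=4 filter from the posted code, writes the recursive join with an explicit INNER JOIN (same logic as the comma join in the question), and assigns @agentkey with a separate SET so it also runs on SQL Server 2005. Part B then runs unchanged against the indexed #t.

```sql
-- (i) Index that supports the recursive step (a.FatherKey = b.OLAPKey)
--     and covers the two other columns the CTE reads.
CREATE INDEX IX_OLAPAgentDim_FatherKey
    ON dbo.OLAPAgentDim (FatherKey)
    INCLUDE (DepartLevelKey, MainDemoName);

-- (ii) MainDemoName narrowed to nvarchar(100) to match OLAPAgentDim;
--      an nvarchar(max) column could not be used as an index key in (iii).
CREATE TABLE #t
(
    TableID      int IDENTITY(1,1),
    OLAPKey      bigint,
    MainDemoName nvarchar(100)
);

DECLARE @agentkey bigint;
SET @agentkey = 2;

-- Part A: walk the tree below @agentkey, keep one department level.
WITH Result AS
(
    SELECT OLAPKey, DepartLevelKey, MainDemoName
    FROM dbo.OLAPAgentDim
    WHERE OLAPKey = @agentkey
    UNION ALL
    SELECT a.OLAPKey, a.DepartLevelKey, a.MainDemoName
    FROM dbo.OLAPAgentDim AS a
    INNER JOIN Result AS b
        ON a.FatherKey = b.OLAPKey
)
INSERT INTO #t (OLAPKey, MainDemoName)
SELECT OLAPKey, MainDemoName
FROM Result
WHERE DepartLevelKey = 4;

-- (iii) Cluster #t on the column Part B correlates and groups on.
CREATE CLUSTERED INDEX IX ON #t (MainDemoName);

-- Part B from the question goes here, followed by: DROP TABLE #t;
```

Creating the clustered index after the insert rather than before means the rows are sorted once, instead of being maintained in order during the load.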