I'm trying to select some values where if a certain date falls on a day in the weekend, then it selects either one or more days before, and I made that work. Now I want it to select a value in another column, because I want to display the CalendarID from my calendar dimension.
The code I have that works is the following:
SELECT
i.Item
,CASE
WHEN c.DayOfWeek = 6 THEN DATEADD(day,-1,c.Date)
WHEN c.DayOfWeek = 7 THEN DATEADD(day,-2,c.Date)
ELSE c.Date END AS TransactionDate
FROM [dbo].[Items] i
LEFT JOIN [dbo].[Dim_Calendar] c ON
c.Date = q.TransactionDate
I tried the code below, but it raises an error at the 'equals' signs.
SELECT
i.Item
,CASE c.CalendarID
WHEN c.DayOfWeek = 6 THEN DATEADD(day,-1,c.Date)
WHEN c.DayOfWeek = 7 THEN DATEADD(day,-2,c.Date)
ELSE c.Date END AS TransactionDate
FROM [dbo].[Items] i
LEFT JOIN [dbo].[Dim_Calendar] c ON
c.Date = q.TransactionDate
As I thought that I would then get the value of the CalendarID on those particular values. Is there a way to do what I'm asking, and could the solution perhaps be to do the CASE WHEN in the left join statement?
It's not completely clear what you want to do.
The aliases you give in your SELECT clause aren't available to the rest of your query.
You can say something like
LEFT JOIN dbo.Dim_Calendar c
ON c.date = CASE
WHEN c.DayOfWeek = 6 THEN DATEADD(day,-1,c.Date)
WHEN c.DayOfWeek = 7 THEN DATEADD(day,-2,c.Date)
ELSE c.Date END
The query planner is smart enough to optimize this correctly, so it's just verbose, not slow.
Or you can nest your queries something like this, to make your aliases visible.
SELECT i.Item, i.TransactionDate
FROM (SELECT
i.Item
,CASE c.CalendarID
WHEN c.DayOfWeek = 6 THEN DATEADD(day,-1,c.Date)
WHEN c.DayOfWeek = 7 THEN DATEADD(day,-2,c.Date)
ELSE c.Date END AS TransactionDate
FROM [dbo].[Items]
) i
LEFT JOIN [dbo].[Dim_Calendar] c ON c.Date = i.TransactionDate
But none of this is debugged. I'm not sure of the ON condition in your left join.
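One way to get the CalendarID of the adjusted date is to join the calendar dimension twice: once on the original transaction date, and once on the date the CASE expression produces. The following is a minimal sketch of that idea, not a tested answer for the asker's schema. It runs on SQLite through Python, so `date(x, '-1 day')` stands in for `DATEADD(day,-1,x)`. The sample rows, the `adj` alias, and the assumption that `Items` carries a `TransactionDate` column are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Dim_Calendar (CalendarID INTEGER, Date TEXT, DayOfWeek INTEGER);
CREATE TABLE Items (Item TEXT, TransactionDate TEXT);
-- Hypothetical data: 2024-01-06 is a Saturday (6), 2024-01-07 a Sunday (7).
INSERT INTO Dim_Calendar VALUES
    (101, '2024-01-04', 4),
    (102, '2024-01-05', 5),
    (103, '2024-01-06', 6),
    (104, '2024-01-07', 7);
INSERT INTO Items VALUES
    ('A', '2024-01-04'),
    ('B', '2024-01-06'),
    ('C', '2024-01-07');
""")

rows = conn.execute("""
SELECT i.Item, adj.Date AS TransactionDate, adj.CalendarID
FROM Items i
LEFT JOIN Dim_Calendar c            -- calendar row of the original date
       ON c.Date = i.TransactionDate
LEFT JOIN Dim_Calendar adj          -- calendar row of the adjusted date
       ON adj.Date = CASE
              WHEN c.DayOfWeek = 6 THEN date(c.Date, '-1 day')
              WHEN c.DayOfWeek = 7 THEN date(c.Date, '-2 day')
              ELSE c.Date
          END
ORDER BY i.Item
""").fetchall()

for row in rows:
    print(row)
```

Items B and C fall on the weekend, so both pick up the Friday row of the calendar, including its CalendarID.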
Okay, so I have 2 queries, but I'm not sure how to merge them, here's the first one:
SELECT
e.EmplName,
CAST(SUM(t.ManHrs) AS REAL) AS [Hrs Logged]
FROM EmplCode e
LEFT JOIN TimeTicketDet t ON e.EmplCode = t.EmplCode
WHERE CAST(t.TicketDate AS DATE) = CAST(GETDATE() AS DATE)
AND t.WorkCntr <> 50
AND e.DeptNum LIKE 'PROD %'
AND e.Active = 'Y'
GROUP BY e.EmplName
HAVING CAST(SUM(t.ManHrs) AS REAL) < 6
So basically, what I'm trying to accomplish is compile a list of employees who log in under 6 hours a day. The problem is, I'm unable to capture the employees who do not log in at all. A LEFT JOIN to the EmplCode table doesn't work because
WHERE CAST(t.TicketDate AS DATE) = CAST(GETDATE() AS DATE)
Essentially turns the LEFT JOIN into an INNER JOIN. My query that lists all employees is this one:
SELECT
e.EmplName
FROM EmplCode e
WHERE e.DeptNum LIKE 'PROD %'
AND e.Active = 'Y'
GROUP BY e.EmplName
But having that ticketdate argument is what I'm having a hard time getting around. How can I get a list of all employees and their log in time for today, while also including those who have no time tickets at all for today? I tried doing a subquery, but I just can't wrap my head around it when I filter for today's tickets only, without eliminating the nulls entirely
Move this: CAST(t.TicketDate AS DATE) = CAST(GETDATE() AS DATE) to the ON clause instead of the WHERE clause.
This too: AND t.WorkCntr <> 50
The HAVING clause might also need to be modified, to include the employees who have no ManHrs. Adding OR SUM(t.ManHrs) IS NULL to the end might do it, but I haven't tested this.
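The conditions on t in the WHERE clause are what kill the LEFT JOIN. Keep the join key and the ticket filters in the ON clause, and leave only the employee filters in WHERE:
SELECT
e.EmplName,
CAST(SUM(t.ManHrs) AS REAL) AS [Hrs Logged]
FROM EmplCode e
LEFT JOIN TimeTicketDet t
ON e.EmplCode = t.EmplCode
AND CAST(t.TicketDate AS DATE) = CAST(GETDATE() AS DATE)
AND t.WorkCntr <> 50
WHERE e.DeptNum LIKE 'PROD %'
AND e.Active = 'Y'
GROUP BY e.EmplName
-- COALESCE keeps employees with no tickets today (their SUM is NULL)
HAVING COALESCE(SUM(t.ManHrs), 0) < 6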
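The difference between filtering in WHERE and filtering in ON can be seen on a tiny data set. This is a sketch on SQLite via Python, not the asker's SQL Server schema: the three employees, their tickets, and the fixed `TODAY` value (standing in for `CAST(GETDATE() AS DATE)`) are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EmplCode (EmplCode INTEGER, EmplName TEXT);
CREATE TABLE TimeTicketDet (EmplCode INTEGER, TicketDate TEXT, ManHrs REAL);
INSERT INTO EmplCode VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO TimeTicketDet VALUES
    (1, '2024-03-01', 8.0),   -- Ann: full day today
    (2, '2024-03-01', 2.5),   -- Bob: short day today
    (3, '2024-02-29', 7.0);   -- Cy: only has a ticket from yesterday
""")

TODAY = "2024-03-01"  # hypothetical stand-in for CAST(GETDATE() AS DATE)

QUERY = """
SELECT e.EmplName, SUM(t.ManHrs) AS HrsLogged
FROM EmplCode e
LEFT JOIN TimeTicketDet t
       ON t.EmplCode = e.EmplCode {on_filter}
{where_filter}
GROUP BY e.EmplName
HAVING COALESCE(SUM(t.ManHrs), 0) < 6
ORDER BY e.EmplName
"""

# Date filter in WHERE: rows with no ticket today are discarded after the join.
filter_in_where = conn.execute(
    QUERY.format(on_filter="", where_filter="WHERE t.TicketDate = ?"), (TODAY,)
).fetchall()

# Date filter in ON: unmatched employees survive with NULL ticket columns.
filter_in_on = conn.execute(
    QUERY.format(on_filter="AND t.TicketDate = ?", where_filter=""), (TODAY,)
).fetchall()

print(filter_in_where)  # [('Bob', 2.5)]
print(filter_in_on)     # [('Bob', 2.5), ('Cy', None)]
```

Cy has no ticket for today, so he only appears when the date filter sits in the ON clause.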
I am using an Azure SQL Server database. I have written a SQL query to generate a report. Here it is:
;WITH cte AS
(
SELECT ProjectID, CreatedDateUTC, ProductID, LicenseID, BackgroundID from Project p
WHERE CAST(p.CreatedDateUTC AS DATE) >= @StartDate and CAST(p.CreatedDateUTC AS DATE) <= @EndDate
and IsBackgroundUsed = 1
and s7ImageGenerated = 1 and p.SiteCode in ('b2c' )
)
SELECT ProjectID , CreatedDateUTC,
(SELECT BackgroundName from Background b WHERE b.BackgroundID = cte.BackgroundID) AS BackgroundName,
(SELECT Name FROM Product pr WHERE pr.ProductID = cte.ProductID) AS ProductName,
Case WHEN LicenseID is null THEN 'Standard' ELSE (SELECT LicenseName from License l WHERE l.LicenseID = cte.LicenseID) END AS CLA,
(SELECT PurchaseFG from Product_background pb WHERE pb.BackgroundID = cte.BackgroundID and pb.ProductId = cte.productID) AS PurchaseFG,
(SELECT FGcode from Product pr WHERE pr.ProductID = cte.ProductID) AS ProductFGCode,
--(Select dbo.[getProjectFGCodeByBackground](cte.ProductID, cte.BackgroundID)) AS FGCode,
'' AS ERPOrderNumber,
0 AS DesignQuanity
from cte
WHERE (SELECT count(*) from Approval.OrderDetail od WHERE od.ProjectID = cte.ProjectID) = 0
Is there any way to optimize this query? It is timing out. I have put this query in a stored procedure, which I call through LINQ / Entity Framework.
Earlier I used joins, but that was even slower, so I tried subqueries. That worked for more than a year, but now it no longer does.
This should improve the performance, especially if the table Approval.OrderDetail is large:
...WHERE not exists
(SELECT 1 from Approval.OrderDetail od WHERE od.ProjectID = cte.ProjectID)
Writing a sub-select for every single field is a terrible way to retrieve data, as you'll likely end up with a lot of Loop Joins which have terrible performance over large data sets.
Your original JOIN method is the way to go, but you need to ensure you have appropriate indexes on your joining columns.
You can also replace the WHERE clause, with a LEFT JOIN and IS NULL combination
LEFT JOIN Approval.OrderDetail od
ON od.ProjectID = p.ProjectID
...
AND od.ProjectID IS NULL;
or a NOT EXISTS (although that is more likely to have to SCAN a wider range of rows for each row returned by the main query).
WHERE NOT EXISTS
(SELECT 1 FROM Approval.OrderDetail od WHERE od.ProjectID = cte.ProjectID)
In either case, make sure your Project table is appropriately indexed on (IsBackgroundUsed, s7ImageGenerated, SiteCode, CreatedDateUTC) and that all joins are appropriately indexed.
I'd also question whether you actually need to cast your CreatedDateUTC fields to DATE types?
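The three ways of expressing "projects with no order detail" discussed here return the same rows; they differ only in how the engine may execute them. A small sketch on SQLite via Python illustrates the equivalence. The table contents are invented, and `OrderDetail` stands in for `Approval.OrderDetail` because SQLite has no schemas. It says nothing about which form is fastest on SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Project (ProjectID INTEGER PRIMARY KEY);
CREATE TABLE OrderDetail (OrderDetailID INTEGER PRIMARY KEY, ProjectID INTEGER);
INSERT INTO Project VALUES (1), (2), (3);
-- Project 1 has two order lines, project 3 has one, project 2 has none.
INSERT INTO OrderDetail (ProjectID) VALUES (1), (1), (3);
""")

# Original form: count every matching row, then compare with zero.
count_zero = conn.execute("""
SELECT p.ProjectID FROM Project p
WHERE (SELECT COUNT(*) FROM OrderDetail od
       WHERE od.ProjectID = p.ProjectID) = 0
ORDER BY p.ProjectID
""").fetchall()

# NOT EXISTS: can stop at the first matching row.
not_exists = conn.execute("""
SELECT p.ProjectID FROM Project p
WHERE NOT EXISTS (SELECT 1 FROM OrderDetail od
                  WHERE od.ProjectID = p.ProjectID)
ORDER BY p.ProjectID
""").fetchall()

# LEFT JOIN ... IS NULL: keep only the rows that found no partner.
left_join_null = conn.execute("""
SELECT p.ProjectID FROM Project p
LEFT JOIN OrderDetail od ON od.ProjectID = p.ProjectID
WHERE od.ProjectID IS NULL
ORDER BY p.ProjectID
""").fetchall()

print(count_zero, not_exists, left_join_null)
```

All three queries return only project 2.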
A possible simplification could be:
SELECT
p.ProjectID,
p.CreatedDateUTC,
b.BackgroundName,
pr.Name,
IIF(p.LicenseID IS NULL, 'Standard', l.LicenseName) AS CLA,
pb.PurchaseFG,
pr.FGCode AS ProductFGCode,
'' AS ERPOrderNumber,
0 AS DesignQuantity
FROM Project p
LEFT JOIN Approval.OrderDetail od
ON od.ProjectID = p.ProjectID
LEFT JOIN Background b
ON b.BackgroundID = p.BackgroundID
LEFT JOIN Product pr
ON pr.ProductID = p.ProductID
LEFT JOIN License l
ON l.LicenseID = p.LicenseID
LEFT JOIN Product_Background pb
ON pb.BackgroundID = p.BackgroundID
AND pb.ProductID = p.ProductID
WHERE p.CreatedDateUTC >= @StartDate AND p.CreatedDateUTC <= @EndDate
AND p.IsBackgroundUsed = 1
AND p.s7ImageGenerated = 1
AND p.SiteCode = 'b2c'
AND od.ProjectID IS NULL;
WHERE CAST(p.CreatedDateUTC AS DATE) >= @StartDate and CAST(p.CreatedDateUTC AS DATE) <= @EndDate
Make this SARGable, and create a nonclustered index on CreatedDateUTC.
Suppose these are the parameters:
declare @StartDate datetime='2018-02-01'
declare @EndDate datetime='2018-02-28'
Then,
set @EndDate=dateadd(second,-1,dateadd(day,1,@EndDate))
Now you can safely do this:
WHERE p.CreatedDateUTC >= @StartDate and p.CreatedDateUTC <= @EndDate
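One caveat with moving the end date to 23:59:59 is that rows stamped in the last fraction of a second of the final day fall outside the range. A half-open range (`>= start` and `< day after end`) avoids that and still leaves the column bare, so an index on it can be used. The sketch below uses SQLite via Python with ISO-8601 text timestamps, so `date()` and `datetime()` stand in for the T-SQL `CAST` and `DATEADD` calls. The five rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Project (ProjectID INTEGER PRIMARY KEY, CreatedDateUTC TEXT);
INSERT INTO Project VALUES
    (1, '2018-01-31 23:59:59'),      -- before the range
    (2, '2018-02-01 00:00:00'),      -- first instant of the range
    (3, '2018-02-28 23:59:59'),      -- last whole second of the range
    (4, '2018-02-28 23:59:59.500'),  -- sub-second time on the last day
    (5, '2018-03-01 00:00:00');      -- after the range
""")
params = ("2018-02-01", "2018-02-28")

def ids(sql):
    """Run a query with the start/end parameters and return the ProjectIDs."""
    return [row[0] for row in conn.execute(sql, params)]

# Function on the column: correct, but not SARGable.
cast_both_sides = ids("""
SELECT ProjectID FROM Project
WHERE date(CreatedDateUTC) >= ? AND date(CreatedDateUTC) <= ?
ORDER BY ProjectID""")

# End date moved to 23:59:59: bare column, but misses row 4.
minus_one_second = ids("""
SELECT ProjectID FROM Project
WHERE CreatedDateUTC >= ?
  AND CreatedDateUTC <= datetime(?, '+1 day', '-1 second')
ORDER BY ProjectID""")

# Half-open range: bare column and the same rows as the cast version.
half_open = ids("""
SELECT ProjectID FROM Project
WHERE CreatedDateUTC >= ? AND CreatedDateUTC < date(?, '+1 day')
ORDER BY ProjectID""")

print(cast_both_sides)   # [2, 3, 4]
print(minus_one_second)  # [2, 3]
print(half_open)         # [2, 3, 4]
```

In T-SQL the half-open form would be `p.CreatedDateUTC >= @StartDate AND p.CreatedDateUTC < DATEADD(day, 1, @EndDate)`.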
I think @Mark Sinkinson's query will work better than the subqueries. (I would also try the NOT EXISTS clause once.)
Use INNER JOIN if possible.
I hope you are using a stored procedure and calling it.
Create indexes on all join columns.
Since your subqueries return correct output without TOP 1, it appears that all the tables have a one-to-one relation with Project.
CREATE NONCLUSTERED INDEX IX_Project ON project (
CreatedDateUTC
,IsBackgroundUsed
,s7ImageGenerated
,SiteCode
) include (ProductID,LicenseID,BackgroundID);
Hopefully ProjectID is already the clustered index.
Might not be much faster but easier to read for me.
You should be able to adjust @StartDate and @EndDate and not have to cast to date.
Have an index on all join and where conditions.
If those are FK you should be able to use an inner join (and should).
SELECT P.ProjectID , P.CreatedDateUTC,
b.BackgroundName,
pr.Name AS ProductName,
isnull(l.LicenseName, 'Standard') as CLA,
pb.PurchaseFG,
pr.FGcode AS ProductFGCode,
'' AS ERPOrderNumber,
0 AS DesignQuanity
from Project p
left join Background b
on b.BackgroundID = p.BackgroundID
left join Product pr
on pr.ProductID = p.ProductID
left join License l
on l.LicenseID = p.LicenseID
left join Product_background pb
on pb.BackgroundID = p.BackgroundID
and pb.ProductId = p.productID
WHERE CAST(p.CreatedDateUTC AS DATE) >= @StartDate
and CAST(p.CreatedDateUTC AS DATE) <= @EndDate
and p.IsBackgroundUsed = 1
and p.s7ImageGenerated = 1
and p.SiteCode = 'b2c'
and not exists (SELECT 1
from Approval.OrderDetail od
WHERE od.ProjectID = p.ProjectID)
I have this query that is used in a SSRS report that someone else created. The left join is the cause of the problem. If I change it to an inner join I get results (not the correct results) in about 15 seconds. With the Left Join I end up canceling the query after 20 minutes. I added an index to both Budgets.Professionals and Transactions.Professionals with no change in performance. Is there a way to rewrite the query and not use the Left Join?
SELECT
profs.ProfName as orig
,profs.Initials
,DATEPART(year, TransDate) as [Year]
,SUM(CASE WHEN IsFlatFee = 'Y' OR COALESCE(MT.Admin, 'N') = 'Y'
THEN 0.0
ELSE Units * (aph.assignedpercent/100) * isnull(B.rate, 0.0)
END) AS ctp
,SUM(CASE WHEN IsFlatFee = 'Y' OR COALESCE(MT.Admin, 'N') = 'Y'
THEN 0
ELSE Units
END * (aph.assignedpercent/100)) AS worked_hours
,SUM(Value * (aph.assignedpercent/100)) AS worked_value
, 0 AS billed_hours
,0 AS billed_value
,0 AS billed_netamt
, 0.0 as paid
, 0.0 as wo
FROM Transactions Trans
INNER JOIN Matters Matts ON Trans.matters = Matts.matters
INNER JOIN MatterTypes MT ON Matts.mattertype = MT.mattertypesdesc
and MT.Admin <> 'Y'
INNER JOIN Components Comps ON Comps.components = Trans.components
and Comps.CompType = 'F'
INNER JOIN AssignedProfsHistory APH on APH.Matters = Trans.Matters
and APH.AssignedType = 'Originating'
and Trans.TransDate between APH.EffectiveDate and
ISNULL(EndDate,'12/31/2099')
INNER JOIN Professionals profs on profs.Professionals = APH.Professionals
and profs.ProfType = 'Member'
and profs.IsActive = 'Y'
and profs.IsBillable = 'Y'
LEFT JOIN (SELECT Budgets.Professionals as timekeeper, Budgets.Amount as
rate, Budgets.PeriodDate
FROM Matters Matts
INNER JOIN Budgets ON Matts.matters = Budgets.matters
and cast(Budgets.PeriodDate as Date) <= '2017-12-31'
AND MONTH('2017-12-31') = MONTH(Budgets.PeriodDate)
WHERE Matts.MatterID = '99999-99.003') as B
on B.timekeeper = Trans.Professionals
and YEAR(B.PeriodDate) = DATEPART(year, TransDate)
WHERE cast(transdate as DATE) between dateadd(day, 1, DATEADD(year, -3,
'2017-12-31')) and '2017-12-31'
GROUP BY profs.ProfName, profs.Initials, DATEPART(year, TransDate)
As Sean and Aaron said, there are too many things that are potentially an issue.
You seem (I'm guessing from the column names) to be joining on text columns, mattertypesdesc for one. In fact, most of the work is done against text columns. Even Matts.MatterID is textual. This may not be possible in your scenario, but it would perform better if the tables had integer primary keys and you joined on those.
Anyway, guessing aside.... You might get a quick win if you replace your sub query in the left join with a temp table.
So before your existing query, just do ...
SELECT Budgets.Professionals as timekeeper, Budgets.Amount as rate, Budgets.PeriodDate
INTO #t
FROM Matters Matts
INNER JOIN Budgets ON Matts.matters = Budgets.matters
and cast(Budgets.PeriodDate as Date) <= '2017-12-31'
AND MONTH('2017-12-31') = MONTH(Budgets.PeriodDate)
WHERE Matts.MatterID = '99999-99.003'
Then, in your existing query, replace the subquery with
SELECT ...
...
...
LEFT JOIN #t as B
ON B.timekeeper = Trans.Professionals
....
You can also try the APPLY operator: remove the LEFT JOIN and its ON condition, use OUTER APPLY instead, and move the join conditions inside the applied subquery, like
AND budgets.professionals = trans.professionals
AND year(budgets.perioddate) = datepart(year, transdate)
Sample
OUTER APPLY
(
SELECT budgets.professionals AS timekeeper,
budgets.amount AS rate,
budgets.perioddate
FROM matters matts
INNER JOIN budgets
ON matts.matters = budgets.matters
AND cast(budgets.perioddate AS date) <= '2017-12-31'
AND month('2017-12-31') = month(budgets.perioddate)
-- the timekeeper alias is not visible here, so use the underlying column
AND budgets.professionals = trans.professionals
AND year(budgets.perioddate) = datepart(year, transdate)
WHERE matts.matterid = '99999-99.003'
) AS b
Thanks everyone who responded. I took your suggestions and I was able to come up with a solution. The query that I had to kill after running for 2 hrs now finishes in about 14 seconds.
I ended up creating a cte at the beginning of the script.
;with cte as
(SELECT Transactions FROM Transactions t
WHERE cast(t.TransDate as DATE) between dateadd(day, 1, DATEADD(year, -3,
@EndDate)) and @EndDate)
Then I linked the CTE to Transactions.
FROM Transactions Trans
INNER JOIN cte ON cte.Transactions = Trans.Transactions
I then was able to remove the 'where' clause that was causing the issue.
WHERE cast(transdate as DATE) between dateadd(day, 1, DATEADD(year, -3,
@EndDate)) and @EndDate
Here's a quandary I'm facing in SSRS that I'm a bit stumped on. Here's the business logic I'm trying to create.
In determining the correct # of days in lab, use the following logic:
If a case has multiple detail items with the same BacklogGroup, Daysinlab = Max(DaysinlabGDL)
If the items are from different BackLogGroups Sum the DaysInLabGDL from each of the BackLogGroups to get the DaysInLab amount.
So for example:
Case ID   Back Log Group   Days In Lab   Calc Days
4595549   EMAX             5             7
4595550   EMAX             5             2
4595551   CLINICAL ZIRC    5             3
4595552   BruxZir H        5             3
4595559   Implant SS       5             4
4595559   IMPLANTCA        8             8
The Expression I'm using for Calc days is this:
=iif(Fields!CaseID.Value = Previous(Fields!CaseID.Value) and Fields!BackLogGroup.Value <> Previous(Fields!BackLogGroup.Value),Fields!ActualDaysInLab.Value + Previous(Fields!ActualDaysInLab.Value),Max(Fields!ActualDaysInLab.Value))
In essence what I'm trying to do is compare detail records within a case and if the backlog group is different for each of the detail records (there can be more than 2 detail recs/case) sum the days in lab column. If the backlog groups are the same for the detail recs then I want to take the max() of the days in lab.
If there is a case where there are say 3 detail recs and two have the same backlog group take the max of those and add them to the other.
So in the case above Calc days for caseID 4595559 should be 13 (5+8) for both detail recs. But for some reason I'm not getting that. I wound up with one being 4 and one being 8.
In case it makes a difference here's the SQL query that creates the dataset:
Declare @StartDate Datetime
Declare @EndDate Datetime
Set @StartDate = '12/01/2013'
Set @EndDate = GetDate()
SELECT
cp.CaseID
,c.DateIn
,c.DateInvoiced
,cp.ProductID
,p.BackLogGroup
,sra.SourceCategory
,sra.DaysInLabGDL
,DATEDIFF(DAY,c.DateIn,c.DateInvoiced) AS ActualDaysInLab
,dbo.GL_GetBusinessDayCount(c.DateIn,c.DateInvoiced) AS WorkingDays
FROM dbo.CaseProducts cp WITH (NOLOCK)
INNER JOIN dbo.Cases c WITH (NOLOCK)
ON cp.CaseID = c.CaseID
LEFT OUTER JOIN dbo.Products p WITH (NOLOCK)
ON cp.ProductID = p.ProductID
LEFT OUTER JOIN dbo.SalesReAllocation sra WITH (NOLOCK)
ON p.ProductID = sra.ProductID
WHERE
p.BackLogGroup IS NOT NULL
AND
c.DateInvoiced IS NOT NULL
AND
c.DateIn between @StartDate and @EndDate
Order by
cp.CaseID
I hope this is clear. If not let me know and I'll try and clarify.
Thanks in advance.
I am calling your first result set t (for convenience).
I think the solution to your problem is a double aggregation:
select CaseId, sum(DaysInLab) as DaysInLab
from (select CaseID, BackLogGroup, max(DaysInLabGDL) as DaysInLab
from t
group by CaseId, BackLogGroup
) blg
group by CaseId;
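To check that the double aggregation matches the stated business rule (max within a backlog group, sum across groups), here is a small sketch on SQLite via Python. The first case reuses the two 4595559 rows from the question; case 4595560 and its rows are hypothetical, added to exercise the "two rows in the same group" branch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (CaseID INTEGER, BackLogGroup TEXT, DaysInLabGDL INTEGER);
INSERT INTO t VALUES
    (4595559, 'Implant SS', 5),   -- from the question
    (4595559, 'IMPLANTCA',  8),   -- from the question
    (4595560, 'EMAX',       5),   -- hypothetical: same group twice
    (4595560, 'EMAX',       3),
    (4595560, 'BruxZir H',  4);   -- hypothetical: a second group
""")

rows = conn.execute("""
SELECT CaseID, SUM(DaysInLab) AS DaysInLab
FROM (SELECT CaseID, BackLogGroup, MAX(DaysInLabGDL) AS DaysInLab
      FROM t
      GROUP BY CaseID, BackLogGroup   -- inner step: max per backlog group
     ) blg
GROUP BY CaseID                       -- outer step: sum across groups
ORDER BY CaseID
""").fetchall()

print(rows)  # [(4595559, 13), (4595560, 9)]
```

Case 4595559 comes out as 13 (5 + 8), the value the question expects, and the hypothetical case as 9 (max(5, 3) + 4).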
So here is the final query. Thanks for the help, @Gordon Linoff. It put me on the right path.
Declare @StartDate Datetime
Declare @EndDate Datetime
Set @StartDate = '12/01/2013'
Set @EndDate = GetDate()
;With t as
(
SELECT
--count(cp.caseID) as CaseCount
cp.CaseID
,c.DateIn
,c.DateInvoiced
,cp.ProductID
,p.BackLogGroup
,sra.SourceCategory
,sra.DaysInLabGDL
,DATEDIFF(DAY,c.DateIn,c.DateInvoiced) AS ActualDaysInLab
,dbo.GL_GetBusinessDayCount(c.DateIn,c.DateInvoiced) AS WorkingDays
FROM dbo.CaseProducts cp WITH (NOLOCK)
INNER JOIN dbo.Cases c WITH (NOLOCK)
ON cp.CaseID = c.CaseID
LEFT OUTER JOIN dbo.Products p WITH (NOLOCK)
ON cp.ProductID = p.ProductID
LEFT OUTER JOIN dbo.SalesReAllocation sra WITH (NOLOCK)
ON p.ProductID = sra.ProductID
WHERE
p.BackLogGroup IS NOT NULL
AND
c.DateInvoiced IS NOT NULL
AND
--cp.CaseID = 4595187
c.DateIn between @StartDate and @EndDate
)
select blg.CaseID, DateIn, DateInvoiced, sum(DaysInLab) as DaysInLab, blg2.BackLogGroup, blg2.Workingdays, blg2.Workingdays - sum(Daysinlab) as DaysOver
from (select CaseID, BackLogGroup, max(DaysInLabGDL) as DaysInLab, WorkingDays
from t
group by CaseId, BackLogGroup, WorkingDays
) blg
Inner Join (Select CaseID, DateIn, DateInvoiced, BackLogGroup, WorkingDays
from t
group by CaseID, DateIn, DateInvoiced, BackLogGroup, WorkingDays
) blg2 on blg.CaseID = blg2.CaseId
group by blg.CaseId, DateIn, DateInvoiced, blg2.BackLogGroup, blg2.Workingdays
having blg2.workingdays > sum(Daysinlab)
I am seeing some strange query speed results when using a view with an outer apply, I am doing a distinct count on 2 different columns in the view, 1 is done in less than 0.1 seconds, the other takes 4-6 seconds, is the second count query returned slower because it is part of the outer apply? If so - how could I speed this query up?
The fast distinct count is -
SELECT DISTINCT ISNULL([ItemType], 'N/A') AS Items FROM vwCustomerItemDetailsFull
The slow distinct count is -
SELECT DISTINCT ISNULL([CustomerName], 'N/A') AS Items FROM vwCustomerItemDetailsFull
The view is -
SELECT I.ItemID,
IT.Name AS ItemType,
CASE
WHEN CustomerItemEndDate IS NULL
OR CustomerItemEndDate > GETDATE() THEN CustomerItems.CustomerName
ELSE NULL
END AS CustomerName,
CASE
WHEN CustomerItemEndDate IS NULL
OR CustomerItemEndDate > GETDATE() THEN CustomerItems.CustomerNumber
ELSE NULL
END AS CustomerNumber,
CASE
WHEN CustomerItemEndDate IS NULL
OR CustomerItemEndDate > GETDATE() THEN CustomerItems.CustomerItemStartDate
ELSE NULL
END AS CustomerItemStartDate
FROM tblItems I
INNER JOIN tblItemTypes IT
ON I.ItemTypeID = IT.ItemTypeID
OUTER APPLY (SELECT TOP 1 CustomerName,
CustomerNumber,
StartDate AS CustomerItemStartDate,
EndDate AS CustomerItemEndDate
FROM tblCustomerItems CI
INNER JOIN tblCustomers C
ON C.CustomerID = CI.CustomerID
WHERE CI.ItemID = I.ItemID
ORDER BY EndDate DESC) AS CustomerItems
Check the execution plan; this speed difference is not strange at all. Because it is an OUTER APPLY rather than a CROSS APPLY, and because it is limited to TOP 1, the apply cannot change the number of rows the query returns or the value of the ItemType column.
Therefore, when you select from the view without using any columns from the OUTER APPLY, the optimiser is smart enough to know it doesn't need to execute it. So in essence your first query is:
SELECT DISTINCT ISNULL([ItemType], 'N/A') AS Items
FROM ( SELECT IT.Name AS ItemType
FROM tblItems I
INNER JOIN tblItemTypes IT
ON I.ItemTypeID = IT.ItemTypeID
) vw
Whereas your second query has to execute the outer apply.
I have previously posted a longer answer which could also be helpful.
EDIT
If you wanted to change your query to a JOIN it could be rewritten as so:
SELECT I.ItemID,
IT.Name AS ItemType,
CustomerName,
CustomerNumber,
CustomerItemStartDate
FROM tblItems I
INNER JOIN tblItemTypes IT
ON I.ItemTypeID = IT.ItemTypeID
LEFT JOIN
( SELECT ci.ItemID,
CustomerName,
CustomerNumber,
StartDate AS CustomerItemStartDate,
EndDate AS CustomerItemEndDate,
RN = ROW_NUMBER() OVER (PARTITION BY ci.ItemID ORDER BY EndDate DESC)
FROM tblCustomerItems CI
INNER JOIN tblCustomers C
ON C.CustomerID = CI.CustomerID
) AS CustomerItems
ON CustomerItems.ItemID = I.ItemID
AND CustomerItems.rn = 1
AND (CustomerItems.CustomerItemEndDate IS NULL
OR CustomerItems.CustomerItemEndDate > GETDATE());
However I don't think this will improve performance much since you said the most costly part is the sort on EndDate, and for your first query it will negatively impact performance because the optimiser will no longer optimise out the outer apply.
I expect the best way to improve the performance will be adding indexes. Without knowing your data size or distribution I can't accurately guess the exact index you need, but if you run the query on its own with the actual execution plan shown, SSMS may suggest an index for you, which would be better than my best guess.