Performance issue with DATEADD function - sql-server

This is my actual select query:
SELECT b.CaseNumber as CaseNumber,b.DebtorNr ,b.ActionDate,DATEADD(MONTH,-12,b.ActionDate) one_month,a.Registerdate --COALESCE(count(A.historynr),0) as DebtorActivity
from rr..r_basic_info b
join rr..activities_VW as A on b.DebtorNr=a.Debtornr
where
B.Debtornr = A.Debtornr
--and a.Registerdate<=b.ActionDate --this condition works
and a.registerdate >= DATEADD(month,-12,getdate()) --this condition is the problem and causes huge time consumption
I have a view defined; here is activities_VW:
select H.NR as historynr,o.debtornr as Debtornr, O.NR as ordernr, h.Actmenunr as Actmenunr,h.AGREEMENT as AgreementCode, h.Registerdate as Registerdate
from abc..history h join abc..orders o on o.NR=h.ORDERNR
and my execution plan looks like this:
One more piece of information: for all rows, the b.ActionDate column has an identical value, e.g. '2015-04-11 08:37:44.037'.
I have checked all the date formats and found nothing wrong.
For another case, I have different values in different rows of the b.ActionDate column, and the query works fine in that case.
Thanks

I may be wrong in my understanding, so take this only as a possibility - when a function is used within a join criterion and/or WHERE clause, then in order to determine whether data meets the criteria, the function may have to be evaluated against every row in the table.
Think about your first part, WHERE e.DATE <= a.joining_date - the engine can simply look directly at rows whose e.DATE is no later than a.joining_date.
For your second part, AND e.DATE >= DATEADD(MONTH, - 6, a.joining_date) - there is no column that is "joining date minus 6 months", so to determine whether e.DATE is greater than it, you would need to perform that calculation on every instance of a.joining_date in the table.
Remember that WHERE clause conditions are not necessarily evaluated in the order they are written down in the query - so the rows you would think are eliminated by the first part of your WHERE are not necessarily eliminated by it. So, as one of the comments suggested, using a computed/persisted column on the DATEADD(MONTH, - 6, a.joining_date) would probably work well.
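One small thing worth trying - a sketch only, reusing the table and column names from the question, and assuming history.Registerdate is (or can be) indexed - is to pre-compute the cutoff date into a variable so the filter is a plain column-to-value comparison:
DECLARE @cutoff datetime = DATEADD(MONTH, -12, GETDATE());  -- calculated once, not per row

SELECT b.CaseNumber,
       b.DebtorNr,
       b.ActionDate,
       DATEADD(MONTH, -12, b.ActionDate) AS one_month,
       a.Registerdate
FROM rr..r_basic_info AS b
JOIN rr..activities_VW AS a
    ON b.DebtorNr = a.Debtornr
WHERE a.Registerdate >= @cutoff;  -- simple comparison on the view's base column
If that alone does not help, the plan may be dominated by the join into the view rather than by the date filter itself.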

Related

Group by an evaluated field (sql server) [duplicate]

Why are column ordinals legal for ORDER BY but not for GROUP BY? That is, can anyone tell me why this query
SELECT OrgUnitID, COUNT(*) FROM Employee AS e GROUP BY OrgUnitID
cannot be written as
SELECT OrgUnitID, COUNT(*) FROM Employee AS e GROUP BY 1
When it's perfectly legal to write a query like
SELECT OrgUnitID FROM Employee AS e ORDER BY 1
?
I'm really wondering if there's something subtle about the relational calculus, or something, that would prevent the grouping from working right.
The thing is, my example is pretty trivial. It's common that the column that I want to group by is actually a calculation, and having to repeat the exact same calculation in the GROUP BY is (a) annoying and (b) makes errors during maintenance much more likely. Here's a simple example:
SELECT DATEPART(YEAR,LastSeenOn), COUNT(*)
FROM Employee AS e
GROUP BY DATEPART(YEAR,LastSeenOn)
I would think that SQL's normalization rule of representing each piece of data only once in the database ought to extend to code as well. I'd want to write that calculation expression only once (in the SELECT column list) and be able to refer to it by ordinal in the GROUP BY.
Clarification: I'm specifically working on SQL Server 2008, but I wonder about an overall answer nonetheless.
One of the reasons is that ORDER BY is the last thing that runs in a SQL query; here is the logical order of operations:
FROM clause
WHERE clause
GROUP BY clause
HAVING clause
SELECT clause
ORDER BY clause
so only once you have the columns produced by the SELECT clause can you use ordinal positioning.
EDIT: added this based on the comment.
Take this for example:
create table test (a int, b int)
insert test values(1,2)
go
The query below will parse without a problem, but it won't run:
select a as b, b as a
from test
order by 6
here is the error
Msg 108, Level 16, State 1, Line 3
The ORDER BY position number 6 is out of range of the number of items in the select list.
This also parses fine
select a as b, b as a
from test
group by 1
But it blows up with this error
Msg 164, Level 15, State 1, Line 3
Each GROUP BY expression must contain at least one column that is not an outer reference.
There are a lot of elementary inconsistencies in SQL, and the use of scalars is one of them. For example, anyone might expect
select * from countries
order by 1
and
select * from countries
order by 1.00001
to be similar queries (the difference between the two can be made infinitesimally small, after all), but they are not.
I'm not sure if the standard specifies if it is valid, but I believe it is implementation-dependent. I just tried your first example with one SQL engine, and it worked fine.
Use aliases:
SELECT DATEPART(YEAR,LastSeenOn) as seen_year, COUNT(*) as [count]
FROM Employee AS e
GROUP BY seen_year
** EDIT **
if GROUP BY alias is not allowed for you, here's a solution / workaround:
SELECT seen_year
, COUNT(*) AS Total
FROM (
SELECT DATEPART(YEAR,LastSeenOn) as seen_year, *
FROM Employee AS e
) AS inline_view
GROUP
BY seen_year
Databases that don't support this are basically choosing not to. I understand the order of processing of the various steps, but it is very easy (as many databases have shown) to parse the SQL, understand it, and apply the translation for you. Where it's really a pain is when a column is a long CASE statement; having to repeat that in the GROUP BY clause is super annoying. Yes, you can do the nested-query workaround as someone demonstrated above, but at this point it is just a lack of care for your users not to support GROUP BY column numbers.
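Another way to avoid repeating a long CASE - a sketch only, using a hypothetical category expression over the same Employee table - is CROSS APPLY, which gives the expression a name that the GROUP BY can reference:
SELECT x.SeenCategory, COUNT(*) AS Total
FROM Employee AS e
CROSS APPLY (SELECT CASE
                      WHEN e.LastSeenOn >= DATEADD(YEAR, -1, GETDATE()) THEN 'Recent'
                      ELSE 'Stale'
                    END AS SeenCategory) AS x   -- hypothetical CASE; named once here
GROUP BY x.SeenCategory;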

SQL Get Second Record

I am looking to retrieve only the second (duplicate) record from a data set. For example in the following picture:
Inside the UnitID column there are two separate records for 105. I only want the returned data set to include the second 105 record. Additionally, I want this query to return the second record for all duplicates, not just 105.
I have tried everything I can think of, although I am not that experienced, and I cannot figure it out. Any help would be greatly appreciated.
You need to use GROUP BY for this.
Here's an example (I can't read your first column name, so I'm calling it JobUnitK):
SELECT MAX(JobUnitK), Unit
FROM JobUnits
WHERE DispatchDate = 'oct 4, 2015'
GROUP BY Unit
HAVING COUNT(*) > 1
I'm assuming JobUnitK is your ordering/id field. If it's not, just replace MAX(JobUnitK) with MAX(FieldIOrderWith).
Use the RANK function. Rank the rows OVER (PARTITION BY UnitID ...) and pick the rows with rank 2.
For reference -
https://msdn.microsoft.com/en-IN/library/ms176102.aspx
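A minimal sketch of that idea, assuming the JobUnits table and JobUnitKeyID ordering column used elsewhere in this thread:
SELECT *
FROM (
    SELECT ju.*,
           RANK() OVER (PARTITION BY ju.UnitID ORDER BY ju.JobUnitKeyID) AS rnk
    FROM JobUnits AS ju
) AS ranked
WHERE rnk = 2;  -- keeps only the second record per UnitID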
Assuming SQL Server 2005 and up, you can use the Row_Number windowing function:
WITH DupeCalc AS (
  SELECT
    DupID = Row_Number() OVER (PARTITION BY UnitID ORDER BY JobUnitKeyID),
    *
  FROM JobUnits
  WHERE DispatchDate = '20151004'
)
SELECT *
FROM DupeCalc
WHERE DupID >= 2
ORDER BY UnitID DESC;
This is better than a solution that uses Max(JobUnitKeyID) for multiple reasons:
There could be more than one duplicate, in which case you have to use Min(JobUnitKeyID) in conjunction with UnitID and join back on UnitID where JobUnitKeyID <> MinJobUnitKeyID (see the sketch after this list).
Moreover, using Min or Max requires you to join back to the same data, which will be inherently slower.
If the ordering key you use turns out to be non-unique, you won't be able to pull the right number of rows with either one.
If the ordering key consists of multiple columns, the query using Min or Max explodes in complexity.
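For contrast, here is a sketch of that Min-based self-join approach (assuming JobUnitKeyID is unique per row), which illustrates the extra join back to the same data:
SELECT j.*
FROM JobUnits AS j
JOIN (
    SELECT UnitID, MIN(JobUnitKeyID) AS MinJobUnitKeyID
    FROM JobUnits
    WHERE DispatchDate = '20151004'
    GROUP BY UnitID
    HAVING COUNT(*) > 1                           -- only UnitIDs that actually have duplicates
) AS firsts
    ON  j.UnitID = firsts.UnitID
    AND j.JobUnitKeyID <> firsts.MinJobUnitKeyID  -- everything except the first row per UnitID
WHERE j.DispatchDate = '20151004';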

SQL Server - Count the number of times the contents of a specified field repeat in a table

What's the best way to 'SELECT' a 'DISTINCT' list of a field from a table / view (with 'WHERE' criteria) and alongside that count the number of times that that field content repeats in the table / view?
In other words, I have an initial view that looks a bit like this:
I'd like a single SQL query to filter it (SELECT...WHERE...) so that we are only considering records where [ORDER COMPLETE] = False and [PERSONAL] = Null...
...and then create a distinct list of names with counts of the number of times each name appears in the previous table:
*Displaying the [ORDER COMPLETE] and [PERSONAL] fields is redundant by this point and could be dropped to simplify.
I can do the steps individually as above, but I'm struggling to get a single query to do it all... any help appreciated!
Thanks in advance,
-Tim
This should just be the following
SELECT dbo.tblPerson.Person,
COUNT(dbo.tblPerson.Person) AS Count
FROM dbo.tblPerson
INNER JOIN dbo.tblNotifications ON dbo.tblPerson.PersonID = dbo.tblNotifications.AddresseeID
WHERE dbo.tblNotifications.Complete = 'False'
AND dbo.tblNotifications.Personal IS NULL
GROUP BY dbo.tblPerson.Person
ORDER BY COUNT(dbo.tblPerson.Person) DESC
You don't need your DISTINCT or TOP 100 PERCENT.
Here is a simplified fiddle.
Well I got downvoted into oblivion (probably for displaying the full extent of my own ignorance!), but just in case someone from the future experiences the same problem as me and stumbles across this question while Googling (or whatever verb you use for "searching all digitised human knowledge" in the distant future), here's some sanitised code of the query I managed to get to work in the end - thanks to Mark Sinkinson's snippet for helping me realise the obvious...
SELECT DISTINCT TOP (100) PERCENT dbo.tblPerson.Person, COUNT(dbo.tblPerson.Person) AS CountPerson
FROM dbo.tblPerson INNER JOIN
dbo.tblNotifications ON dbo.tblPerson.PersonID = dbo.tblNotifications.AddresseeID
WHERE (dbo.tblNotifications.Complete = 'False') AND (dbo.tblNotifications.Personal IS NULL)
GROUP BY dbo.tblPerson.Person
ORDER BY CountPerson DESC

MS Access : Average and Total Calculation in Single Query

INTRODUCTION TO DATABASE TABLE BEING USED -
I am working on a “Stock Market Prices” based Database Table. My table has got the data for the following FIELDS –
ID
SYMBOL
OPEN
HIGH
LOW
CLOSE
VOLUME
VOLUME CHANGE
VOLUME CHANGE %
OPEN_INT
SECTOR
TIMESTAMP
New data gets added to the table daily “Monday to Friday”, based on the stock market price changes for that day. The current requirement is based on the VOLUME field, which shows the volume traded for a particular stock on a daily basis.
REQUIREMENT –
To get the Average and Total Volume for the last 10, 15 and 30 days respectively.
METHOD USED CURRENTLY -
I created these 9 SEPARATE QUERIES in order to get my desired results –
First I have created these 3 queries to take out the most recent 10, 15 and 30 dates from the current table:
qryLast10DaysStored
qryLast15DaysStored
qryLast30DaysStored
Then I have created these 3 queries for getting the respective AVERAGES:
qrySymbolAvgVolume10Days
qrySymbolAvgVolume15Days
qrySymbolAvgVolume30Days
And then I have created these 3 queries for getting the respective TOTALS:
qrySymbolTotalVolume10Days
qrySymbolTotalVolume15Days
qrySymbolTotalVolume30Days
PROBLEM BEING FACED WITH CURRENT METHOD -
Now, my problem is that I have ended up with so many different queries, whereas I wanted to get the output in one single query, as shown in this snapshot of the Excel sheet:
http://i49.tinypic.com/256tgcp.png
SOLUTION NEEDED -
Is there some way by which I can get these required fields into ONE SINGLE QUERY, so that I do not have to look into multiple places for the required fields? Can someone please tell me how to get all these separate queries into one -
A) Either by taking out or moving the results from these separate individual queries to one.
B) Or by making a new query which calculates all these fields within itself, so that these separate individual queries are no longer needed. This would be a better solution I think.
One Clarification about Dates –
Some of you might wonder why I used the method of taking the TOP 10, 15 and 30 rows for getting the last 10, 15 and 30 date values. Why didn't I just use the PC date for getting these values, or use something like -
("VOLUME","tbl-B", "TimeStamp BETWEEN Date() - 10 AND Date()")
The answer is that I require my query to "read" the date from the "TIMESTAMP" field, and then perform its calculations accordingly for the LAST / MOST RECENT "10 days, 15 days, 30 days" FOR WHICH DATA IS AVAILABLE IN THE TABLE, REGARDLESS OF WHAT THE CURRENT DATE IS. It should not depend upon the current date in any way.
If there is any better method or more efficient way to create these queries, please enlighten me.
You have separate queries to compute 10DayTotalVolume and 10DayAvgVolume. I suspect you can compute both in one query, qry10DayVolumes.
SELECT
b.SYMBOL,
Sum(b.VOLUME) AS 10DayTotalVolume,
Avg(b.VOLUME) AS 10DayAvgVolume
FROM
[tbl-B] AS b INNER JOIN
qryLast10DaysStored AS q
ON b.TIMESTAMP = q.TIMESTAMP
GROUP BY b.SYMBOL;
However, that makes me wonder whether 10DayAvgVolume can ever be anything other than 10DayTotalVolume / 10.
Similar considerations apply to the 15 and 30 day values.
Ultimately, I think you want something based on a starting point like this:
SELECT
q10.SYMBOL,
q10.[10DayTotalVolume],
q10.[10DayAvgVolume],
q15.[15DayTotalVolume],
q15.[15DayAvgVolume],
q30.[30DayTotalVolume],
q30.[30DayAvgVolume]
FROM
(qry10DayVolumes AS q10
INNER JOIN qry15DayVolumes AS q15
ON q10.SYMBOL = q15.SYMBOL)
INNER JOIN qry30DayVolumes AS q30
ON q10.SYMBOL = q30.SYMBOL;
That assumes you have created qry15DayVolumes and qry30DayVolumes following the approach I suggested for qry10DayVolumes.
If you want to cut down the number of queries, you could use subqueries for each of the qry??DayVolumes saved queries, but try it this way first to make sure the logic is correct.
In that second query above, there can be a problem due to field names which start with digits. Enclose those names in square brackets or re-alias them in qry10DayVolumes, qry15DayVolumes, and qry30DayVolumes using alias names which begin with letters instead of digits.
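For instance - just a sketch that combines the two ideas above, with the aliases renamed so they no longer start with digits - qryLast10DaysStored could be inlined into qry10DayVolumes like this:
SELECT
    b.SYMBOL,
    Sum(b.VOLUME) AS TotalVolume10Day,
    Avg(b.VOLUME) AS AvgVolume10Day
FROM
    [tbl-B] AS b INNER JOIN
    (SELECT DISTINCT TOP 10 t.TIMESTAMP
     FROM [tbl-B] AS t
     ORDER BY t.TIMESTAMP DESC) AS q
    ON b.TIMESTAMP = q.TIMESTAMP
GROUP BY b.SYMBOL;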
I tested the query as written above with the "2nd Upload.mdb" you uploaded, and it ran without error from Access 2007. Here is the first row of the result set from that query:
SYMBOL 10DayTotalVolume 10DayAvgVolume 15DayTotalVolume 15DayAvgVolume 30DayTotalVolume 30DayAvgVolume
ACC-1 42909 4290.9 54892 3659.46666666667 89669 2988.96666666667
Access doesn't support most advanced SQL syntax and clauses, so this is a bit of a hack, but it works, and it is fast on your small sample. You're basically running 3 queries, but the UNION clauses allow you to combine them into one:
select
Symbol,
sum([10DayTotalVol]) as 10DayTotalV,
sum([10DayAvgVol]) as 10DayAvgV,
sum([15DayTotalVol]) as 15DayTotalV,
sum([15DayAvgVol]) as 15DayAvgV,
sum([30DayTotalVol]) as 30DayTotalV,
sum([30DayAvgVol]) as 30DayAvgV
from (
select
Symbol,
sum(volume) as 10DayTotalVol, avg(volume) as 10DayAvgVol,
0 as 15DayTotalVol, 0 as 15DayAvgVol,
0 as 30DayTotalVol, 0 as 30DayAvgVol
from
[tbl-b]
where
timestamp >= (select min(ts) from (select distinct top 10 timestamp as ts from [tbl-b] order by timestamp desc ))
group by
Symbol
UNION
select
Symbol,
0, 0,
sum(volume), avg(volume),
0, 0
from
[tbl-b]
where
timestamp >= (select min(ts) from (select distinct top 15 timestamp as ts from [tbl-b] order by timestamp desc ))
group by
Symbol
UNION
select
Symbol,
0, 0,
0, 0,
sum(volume), avg(volume)
from
[tbl-b]
where
timestamp >= (select min(ts) from (select distinct top 30 timestamp as ts from [tbl-b] order by timestamp desc ))
group by
Symbol
) s
group by
Symbol

SQL Server Reference a Calculated Column

I have a select statement with calculated columns and I would like to use the value of one calculated column in another. Is this possible? Here is a contrived example to show what I am trying to do.
SELECT [calcval1] = CASE Statement, [calcval2] = [calcval1] * .25
No.
All the results of a single row from a select are atomic. That is, you can view them all as if they occur in parallel and cannot depend on each other.
If you're referring to computed columns, then you need to update the formula's input for the result to change during a select.
Think of computed columns as macros or mini-views which inject a little calculation whenever you call them.
For example, these columns will be identical, always:
-- assume that 'Calc' is a computed column equal to Salary*.25
SELECT Calc, Salary*.25 Calc2 FROM YourTable
Also keep in mind that the persisted option doesn't change any of this. It keeps the value around which is nice for indexing, but the atomicity doesn't change.
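For completeness, a sketch of the DDL such a computed column might have (the table and column names here are assumed, not taken from the question):
-- Hypothetical table: Calc is defined once, from Salary, at the table level.
CREATE TABLE YourTable
(
    Salary money NOT NULL,
    Calc AS Salary * .25,                      -- computed column, evaluated when queried
    CalcPersisted AS Salary * .25 PERSISTED    -- stored on disk and indexable, same semantics
);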
Unfortunately not really, but a workaround that is sometimes worth it is
SELECT [calcval1], [calcval1] * .25 AS [calcval2]
FROM (SELECT [calcval1] = CASE Statement FROM whatever WHERE whatever) AS x
Yes, it's possible.
Use the WITH statement (a CTE) for nested selects:
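A minimal sketch of that approach (the table name and CASE expression here are hypothetical, standing in for the ones in your query):
-- Compute calcval1 once in a CTE, then reference it by name in the outer select.
WITH calc AS
(
    SELECT t.*,
           CASE WHEN t.SomeColumn > 0 THEN 1 ELSE 0 END AS calcval1   -- hypothetical CASE
    FROM dbo.SomeTable AS t                                           -- hypothetical table
)
SELECT calcval1,
       calcval1 * .25 AS calcval2
FROM calc;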
There are two ways I can think of to do that. First, understand that the calcval1 column does not exist as far as SQL Server is concerned until the statement has run; therefore it cannot be used directly as shown in your example. So you can put the calculation in there twice: once for calcval1 and once as a substitute for calcval1 in the calcval2 calculation.
The other way is to make a derived table with calcval1 in it and then calculate calcval2 outside the derived table, something like:
select calcval1*.25 as calcval2, calcval1, field1, field2
from (select CASE Statement as calcval1, field1, field2 from mytable) a
You'll need to test both for performance.
You should use an outer apply instead of a subselect:
select V.calc,V.calc*0.25 from FOO outer apply (select case Statement as calc) V
You can't "reset" the value of a calculated column in a Select clause, if that's what you're trying to do... The value of a calculated column is based on the calculated column formulae. Which CAN include the value of another calculated column.... but you canlt reset the formulae in a Select clause... if all you want to do is "output" the value based on two calculated columns, (as the syntax in your question reads" Then the
"[calcval2]"
in
SELECT [calcval1] = CASE Statement, [calcval2] = [calcval1] * .25
would just become a column alias in the output of the Select Clause.
Or are you asking how to define the formula for one calculated column to be based on another?
