I have two queries that each return the mean income of customers in an age group, and I am trying to combine their results as columns of one table. The problem is that when I combine them with UNION, I get only one column, named after the first query.
SELECT AVG(C.income) AS "<50"
FROM Customer C
WHERE DATEDIFF(YEAR, C.birthDate, GETDATE()) < 50
UNION
SELECT AVG(C1.income) AS ">50"
FROM Customer C1
WHERE DATEDIFF(YEAR, C1.birthDate, GETDATE()) > 50
This returns only one column, with both averages stacked under it:
<50
But I want both columns from the two queries side by side:
<50 | >50
What you want is conditional aggregation:
SELECT AVG(CASE
WHEN DATEDIFF(YEAR, C.birthDate, GETDATE()) < 50 THEN C.income
END) AS "<50",
AVG(CASE
WHEN DATEDIFF(YEAR, C.birthDate, GETDATE()) > 50 THEN C.income
END) AS ">50"
FROM Customer C
This will return a single row with two columns: one for <50 and another one for >50 customers.
Note: ELSE is omitted from the CASE expressions. If the WHEN predicate is not true, CASE returns NULL, and AVG ignores NULLs, so the corresponding row does not participate in the AVG calculation (credit goes to @Samrat Alamgir).
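To see why omitting ELSE matters, here is a minimal runnable sketch. It uses Python's sqlite3 module as a stand-in for SQL Server so it can run anywhere, and it assumes a precomputed age column in place of DATEDIFF(YEAR, birthDate, GETDATE()); the sample rows are made up for illustration.

```python
import sqlite3

def average_income_by_age_band(rows):
    """rows: iterable of (age, income) pairs. Returns (avg_under_50, avg_over_50)."""
    conn = sqlite3.connect(":memory:")
    # Assumption: age is stored directly instead of being derived from birthDate.
    conn.execute("CREATE TABLE Customer (age INTEGER, income REAL)")
    conn.executemany("INSERT INTO Customer VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT AVG(CASE WHEN age < 50 THEN income END) AS under_50,
               AVG(CASE WHEN age > 50 THEN income END) AS over_50
        FROM Customer
        """
    ).fetchone()
    conn.close()
    return result

# Hypothetical sample data: two customers under 50, one over, one exactly 50.
sample = [(30, 1000.0), (40, 3000.0), (60, 5000.0), (50, 9999.0)]
under_50, over_50 = average_income_by_age_band(sample)
```

With these sample rows the function returns (2000.0, 5000.0). The customer aged exactly 50 matches neither predicate, just as in the original query, so a >= on one side may be what is actually wanted.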
I think this will produce the output you wanted:
SELECT a.*, b.*
FROM
(
SELECT AVG(C.income) AS "<50"
FROM Customer C
WHERE DATEDIFF(YEAR, C.birthDate, GETDATE()) < 50
) AS a,
(
SELECT AVG(C1.income) AS ">50"
FROM Customer C1
WHERE DATEDIFF(YEAR, C1.birthDate, GETDATE()) > 50
) AS b
I'm trying to establish, for any given datetime, a tag that depends purely on the time part.
However, because the time part is cyclic, I can't make it work with simple greater-than/less-than conditions.
I tried a lot of casting, and shifting one time to the 24-hour mark to break the cycle, but it just gets more and more complicated and still doesn't work.
I'm using SQL Server. Here is the situation:
DECLARE @tagtable TABLE (tag varchar(10), [start] time, [end] time);
DECLARE @datetimestable TABLE ([timestamp] datetime);
INSERT INTO @tagtable (tag, [start], [end])
VALUES ('tag1','04:00:00.0000000','11:59:59.9999999'),
       ('tag2','12:00:00.0000000','19:59:59.9999999'),
       ('tag3','20:00:00.0000000','03:59:59.9999999');
INSERT INTO @datetimestable ([timestamp])
VALUES ('2022-07-24T23:05:23.120'),
       ('2022-07-27T13:24:40.650'),
       ('2022-07-26T09:00:00.000');
tagtable:
tag   start             end
tag1  04:00:00.0000000  11:59:59.9999999
tag2  12:00:00.0000000  19:59:59.9999999
tag3  20:00:00.0000000  03:59:59.9999999
For the given datetimes, e.g. 2022-07-24 23:05:23.120, 2022-07-27 13:24:40.650, 2022-07-26 09:00:00.000,
the desired result would be:
date        tag
2022-07-25  tag3
2022-07-27  tag2
2022-07-26  tag1
As I wrote, I tried to make this work with casts, DATEADD and DATEDIFF:
SELECT
    IIF(DATEPART(HOUR, a.[timestamp]) > 19,
        CAST(DATEADD(DAY, 1, a.[timestamp]) AS DATE),
        CAST(a.[timestamp] AS DATE)
    ) AS [date],
    b.[tag]
FROM @datetimestable a
INNER JOIN @tagtable b
    -- SomethingWith(...) stands for the part I can't work out
    ON SomethingWith(a.[timestamp])
    BETWEEN SomethingWith(b.[start]) AND SomethingWith(b.[end])
The only tricky bit here is that your tag time ranges can go over midnight, so you need to check that your time is either between start and end or, if the range spans midnight, between start and 23:59:59 or between 00:00:00 and end.
The only other piece is splitting your timestamp column into date and time using a CTE, to save having to repeat the cast.
;WITH splitTimes AS
(
    SELECT CAST([timestamp] AS DATE) AS D,
           CAST([timestamp] AS TIME) AS T
    FROM @datetimestable
)
SELECT
    DATEADD(
        day,
        -- add a day only for the pre-midnight part of a range that spans midnight
        CASE WHEN b.[end] < b.[start] AND a.T >= b.[start] THEN 1 ELSE 0 END,
        a.D) AS [timestamp],
    b.[tag]
FROM splitTimes a
INNER JOIN @tagtable b
    ON a.T BETWEEN b.[start] AND b.[end]
    OR (b.[end] < b.[start] AND (a.T BETWEEN b.[start] AND '23:59:59.9999999'
                                 OR a.T BETWEEN '00:00:00' AND b.[end]))
Live example: https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=506aef05b5a761afaf1f67a6d729446c
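The midnight check described above can also be sketched outside SQL. The following Python version is only an illustration: the window values are copied from the question, the helper names in_window and tag_for are made up, and it assumes that an evening timestamp is reported on the next day while a timestamp after midnight keeps its own date.

```python
from datetime import datetime, time, timedelta

# Tag windows copied from the question: (tag, start, end), both ends inclusive.
TAGS = [
    ("tag1", time(4, 0), time(11, 59, 59, 999999)),
    ("tag2", time(12, 0), time(19, 59, 59, 999999)),
    ("tag3", time(20, 0), time(3, 59, 59, 999999)),
]

def in_window(t, start, end):
    """True if time t falls in [start, end], allowing the window to wrap past midnight."""
    if start <= end:
        return start <= t <= end
    # Wrapping window: either the late-evening part or the early-morning part.
    return t >= start or t <= end

def tag_for(ts):
    """Return (shift_date, tag) for a datetime, or None if no window matches."""
    t = ts.time()
    for tag, start, end in TAGS:
        if in_window(t, start, end):
            # Assumption: only the evening part of a wrapping window moves to the next day.
            evening_part = start > end and t >= start
            shift_date = ts.date() + timedelta(days=1 if evening_part else 0)
            return shift_date, tag
    return None
```

For the three timestamps in the question this gives 2022-07-25/tag3, 2022-07-27/tag2 and 2022-07-26/tag1.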
Since they're all 8-hour shifts, we can essentially ignore the end. More generally, defining an end time as the last representable instant at some specific precision will lead to a bad time if you ever use a different data type (see the first section here). So if the shift length may change, store the beginning of the next shift as the end and use >= start AND < end instead of BETWEEN.
;WITH d AS
(
SELECT datetime = [timestamp],
date = CONVERT(datetime, CONVERT(date, [timestamp]))
FROM dbo.datetimestable
)
SELECT date = DATEADD(DAY,
CASE WHEN t.start > t.[end] THEN 1 ELSE 0 END,
CONVERT(date, date)),
t.tag
FROM d
INNER JOIN dbo.tagtable AS t
ON d.datetime >= DATEADD(HOUR, DATEPART(HOUR, t.start), d.date)
AND d.datetime < DATEADD(HOUR, 8, DATEADD(HOUR,
DATEPART(HOUR, t.start), d.date));
Example db<>fiddle
Here's a completely different approach that defines the intervals in terms of starts and durations rather than starts and ends.
This allows the creation of tags that can span multiple days, which might seem like an odd capability to have here, but there might be a use for it if we add some more conditions down the line. For example, say we want to be able to say "anything from 6pm Friday to 9am Monday gets the 'out of hours' tag". Then we could add a day-of-week predicate to the tag definition and still use the duration-based interval.
I have defined the duration granularity in terms of hours, but of course this can easily be changed.
create table #tags
(
tag varchar(10),
startTimeInclusive time,
durationHours int
);
insert #tags
values ('tag1','04:00:00', 8),
('tag2','12:00:00', 8),
('tag3','20:00:00', 8);
create table #dateTimes (dt datetime)
insert #dateTimes
values ('2022-07-24T23:05:23.120'),
('2022-07-27T13:24:40.650'),
('2022-07-26T09:00:00.000');
select dt.dt,
t.tag
from #datetimes dt
join #tags t on cast(dt.dt as time) >= t.startTimeInclusive
and dt.dt < dateadd
(
hour,
t.durationHours,
cast(cast(dt.dt as date) as datetime) -- strip the time from dt
+ cast(t.startTimeInclusive as datetime) -- add back the time from t
);
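The start-plus-duration idea can be sketched in Python as well. This is a hedged illustration, not a translation of the query above: the SHIFTS values mirror the question, shift_for is a made-up name, and the loop over the previous calendar day is an addition so that a time after midnight still matches a window that began the evening before. It assumes durations of at most 24 hours.

```python
from datetime import datetime, time, timedelta

# (tag, start time, duration in hours); values mirror the question's shifts.
SHIFTS = [("tag1", time(4, 0), 8), ("tag2", time(12, 0), 8), ("tag3", time(20, 0), 8)]

def shift_for(ts):
    """Return (tag, window_start) for the half-open window [start, start + duration) holding ts."""
    for days_back in (0, 1):  # a window may have started on the previous calendar day
        day = ts.date() - timedelta(days=days_back)
        for tag, start, hours in SHIFTS:
            begins = datetime.combine(day, start)
            if begins <= ts < begins + timedelta(hours=hours):
                return tag, begins
    return None
```

Because each window is half-open, a timestamp of exactly 12:00:00 belongs to tag2 only, and no end time with a hard-coded precision is needed.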
Maybe I am looking at this too simply, but can't you just take the tag with the latest start hour that is not after the hour of your timestamp in datetimestable?
Ordering by start time descending and taking the top row should give you the correct tag.
This will work well as long as you have no gaps in your tagtable. Note that with the question's data a timestamp between 00:00 and 03:59 has no tag with an earlier start hour, so it gets no tag from this query.
select case when datepart(hour, tag.tagStart) > 19 then dateadd(day, 1, convert(date, dt.timestamp))
            else convert(date, dt.timestamp)
       end as [date],
       tag.tag
from   datetimestable dt
  outer apply ( select top 1
                       tt.tag,
                       tt.tagStart
                from   tagtable tt
                where  datepart(hour, dt.timestamp) >= datepart(hour, tt.tagStart)
                order by tt.tagStart desc
              ) tag
It returns the correct result in this DBFiddle
The result is
date        tag
2022-07-25  tag3
2022-07-27  tag2
2022-07-26  tag1
EDIT
If it is possible that there are gaps in the table, then I think the easiest and most robust solution would be to split the row that passes midnight into two rows; then your query can be very simple.
See this DBFiddle
select case when datepart(hour, tag.tagStart) > 19 then dateadd(day, 1, convert(date, dt.timestamp))
else convert(date, dt.timestamp)
end as [date],
tag.tag
from datetimestable dt
outer apply ( select tt.tag,
tt.tagStart
from tagtable tt
where datepart(Hour, dt.timestamp) >= datepart(hour, tt.tagStart)
and datepart(Hour, dt.timestamp) <= datepart(hour, tt.tagEnd)
) tag
I want to see all units sold, like this:
select [UnitsSold] from MyTable
But I also want to add another column showing only the UnitsSold from the last 30 days.
How can I apply this condition:
MyTable.CreatedOn >= DATEADD(MONTH, -1, GETDATE())
but only for one column?
So basically I want to see, on the same row, all units sold and the units sold in the past 30 days.
You can use a CASE expression inside your aggregate function, something like:
Select SUM([UnitsSold]) TotalSold
, SUM(CASE WHEN CreatedOn >= DATEADD(MONTH, -1, GETDATE())
THEN [UnitsSold] ELSE 0 END) SoldInLastMonth
FROM MyTable
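Here is a small runnable sketch of the same pattern, using Python's sqlite3 module as a stand-in for SQL Server. The function name, the today parameter and the sample rows are assumptions made for the example. It filters on exactly 30 days, whereas DATEADD(MONTH, -1, ...) goes back one calendar month, which can be 28 to 31 days.

```python
import sqlite3

def units_sold_summary(rows, today):
    """rows: (created_on 'YYYY-MM-DD', units_sold) pairs; today: 'YYYY-MM-DD'."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE MyTable (CreatedOn TEXT, UnitsSold INTEGER)")
    conn.executemany("INSERT INTO MyTable VALUES (?, ?)", rows)
    row = conn.execute(
        """
        SELECT SUM(UnitsSold) AS TotalSold,
               SUM(CASE WHEN CreatedOn >= date(?, '-30 days')
                        THEN UnitsSold ELSE 0 END) AS SoldInLast30Days
        FROM MyTable
        """,
        (today,),  # a fixed reference date keeps the example deterministic
    ).fetchone()
    conn.close()
    return row

# Hypothetical sample data: one old sale and two recent ones.
sales = [("2024-01-01", 10), ("2024-03-01", 5), ("2024-03-20", 7)]
total_sold, sold_last_30_days = units_sold_summary(sales, "2024-03-25")
```

With this data the total is 22 and the last-30-days figure is 12, both on the same row. In T-SQL the equivalent 30-day cutoff would be DATEADD(DAY, -30, GETDATE()).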
This query:
SELECT CID, count(*) as NumOccurences
FROM Violations
WHERE DateOfViolation BETWEEN (dateadd(day, -30, getdate())) AND getdate()
GROUP BY CID
ORDER BY count(*) DESC;
gives the following result:
CID NumOccurences
1921 5
1042 5
1472 5
1543 5
2084 5
2422 5
NumOccurences is verified to be correct. Since CID also exists in another table, Placement[CID,Intersection,...], I want to map each CID to its Intersection and display that instead.
My desired output is:
Intersection NumOccurences
Elston and Charles 5
Diservey and Pkwy 5
Grand and Chicago 5
...
...
I tried this:
SELECT Intersection, count(DateOfViolation) as NumOccurences
FROM Violations
inner join Placement on Violations.CID = Placement.CID
WHERE DateOfViolation BETWEEN (dateadd(day, -30, getdate())) AND getdate()
GROUP BY Intersection
ORDER BY count(*) DESC;
but get this result (not correct):
Intersection NumOccurences
CALIFORNIA AND DIVERSEY 90
BELMONT AND KEDZIE 83
KOSTNER AND NORTH 82
STONEY ISLAND AND 79TH 78
RIDGE AND CLARK 60
ROOSEVELT AND HALSTED 60
ROOSEVELT AND KOSTNER 60
In fact, I've got no idea what my attempt query is even returning or where it's coming from.
EDIT
Running the query
SELECT CID, count(*) as num
from Placement
where Intersection = 'BELMONT AND KEDZIE'
group by Intersection, Address, CID
order by Intersection, Address, CID
yields
CID num
1372 1
1371 1
1373 1
I think you could do something like this:
SELECT
MIN(Placement.Intersection) AS Intersection,
COUNT(DISTINCT Violations.VID /* Violation ID? */) AS NumOccurences
FROM Violations INNER JOIN Placement ON Violations.CID = Placement.CID
WHERE DateOfViolation
BETWEEN cast(dateadd(day, -30, getdate()) as date) AND cast(getdate() as date)
GROUP BY Violations.CID
ORDER BY NumOccurences DESC;
Also be careful with that date range. I'm not sure whether you're dealing with date or datetime.
You might also try:
SELECT
(
SELECT MIN(Intersection) FROM Placement
WHERE Placement.CID = Violations.CID
) AS Intersection,
COUNT(*) AS NumOccurences
FROM Violations
WHERE DateOfViolation
BETWEEN cast(dateadd(day, -30, getdate()) as date) AND cast(getdate() as date)
GROUP BY CID
ORDER BY NumOccurences DESC;
You may not even need the MIN() in that second one.
There would have to be a one-to-one relationship between CIDs and Intersections for you to get the result you are after.
83 is actually a prime number, which would suggest that not only are there multiple entries for the BELMONT AND KEDZIE intersection in the Placement table, but also that there is more than one CID corresponding to that intersection. The same may be true for other intersections.
Try this:
SELECT Intersection, CID, count(*) as num
from Placement
-- where Intersection = 'BELMONT AND KEDZIE'
group by Intersection, CID
order by Intersection, CID
That will show you how many rows exist for each (Intersection, CID) combination in your Placement table (uncomment the WHERE clause to look at 'BELMONT AND KEDZIE' specifically). Then ask yourself again what you're trying to do.
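The effect of several CIDs sharing one intersection can be reproduced with a tiny made-up data set. The sketch below uses Python's sqlite3 module as a stand-in for SQL Server; the table contents and the function name compare_groupings are hypothetical and only the table and column names come from the question.

```python
import sqlite3

def compare_groupings():
    """Return (counts per CID, counts per Intersection) for a tiny made-up data set."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Placement (CID INTEGER, Intersection TEXT)")
    conn.execute("CREATE TABLE Violations (CID INTEGER)")
    # CIDs 1 and 2 share one intersection; CID 3 has its own.
    conn.executemany(
        "INSERT INTO Placement VALUES (?, ?)",
        [(1, "A AND B"), (2, "A AND B"), (3, "C AND D")],
    )
    # CID 1 has 2 violations, CID 2 has 3, CID 3 has 1.
    conn.executemany(
        "INSERT INTO Violations VALUES (?)",
        [(1,), (1,), (2,), (2,), (2,), (3,)],
    )
    per_cid = conn.execute(
        "SELECT CID, COUNT(*) FROM Violations GROUP BY CID ORDER BY CID"
    ).fetchall()
    per_intersection = conn.execute(
        """
        SELECT p.Intersection, COUNT(*)
        FROM Violations v
        JOIN Placement p ON v.CID = p.CID
        GROUP BY p.Intersection
        ORDER BY p.Intersection
        """
    ).fetchall()
    conn.close()
    return per_cid, per_intersection
```

Grouping by CID gives 2, 3 and 1, while grouping by Intersection gives 5 for "A AND B" because the counts of both CIDs at that intersection are added together. Duplicate Placement rows for a single CID would inflate the joined counts further.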
I have two tables (SalesforceTasks and SalesforceContacts) that I am using for a scoring system project. A simple SELECT statement with a ROW_NUMBER() calculation is taking a very long time to run and actually stops querying once it hits a certain number of rows. The query doesn't stop executing, but it stops returning data.
It is a very vanilla process, where I need to get the newest date in the SalesforceTasks table and link it to the contact ID in the SalesforceContacts table. The SalesforceTasks table has 2,091,946 rows and the SalesforceContacts table has 446,772 rows.
Here is the query in question:
SELECT SC.ID
,CASE
WHEN DATEDIFF(DD, ST.CREATEDDATE, GETDATE()) BETWEEN 360 AND 1500
THEN 15
WHEN DATEDIFF(DD, ST.CREATEDDATE, GETDATE()) BETWEEN 181 AND 360
THEN 10
WHEN DATEDIFF(DD, ST.CREATEDDATE, GETDATE()) BETWEEN 60 AND 180
THEN 5
ELSE 0
END AS Score
,ROW_NUMBER() OVER (PARTITION BY ST.ACCOUNTID ORDER BY ACTIVITYDATE) AS LastCall
FROM Salesforce.dbo.SalesforceTasks AS ST
JOIN Salesforce.dbo.SalesforceContacts AS SC
ON ST.ACCOUNTID = SC.ACCOUNTID
WHERE STATUS = 'Completed'
AND TYPE LIKE 'Call%'
What is the best plan of attack here? As stated, the query is taking a very, very long time to run. Is there a better way to get the newest date from the SalesforceTasks table?
You could try breaking the statement down into a two-step process.
First filter records into #temp table and get the datediff without the CASE:
SELECT SC.ID
,DATEDIFF(DD, ST.CREATEDDATE, GETDATE()) AS ScoreDiff
,ROW_NUMBER() OVER (PARTITION BY ST.ACCOUNTID ORDER BY ACTIVITYDATE) AS LastCall
INTO #TEMP
FROM Salesforce.dbo.SalesforceTasks AS ST
JOIN Salesforce.dbo.SalesforceContacts AS SC
ON ST.ACCOUNTID = SC.ACCOUNTID
WHERE STATUS = 'Completed'
AND TYPE LIKE 'Call%'
AND DATEDIFF(DD, ST.CREATEDDATE, GETDATE()) BETWEEN 60 AND 1500
With the reduced dataset, you then perform the Scoring operation:
SELECT Id,
       -- searched CASE: a simple "CASE ScoreDiff WHEN ..." cannot take BETWEEN
       CASE
           WHEN ScoreDiff BETWEEN 360 AND 1500
               THEN 15
           WHEN ScoreDiff BETWEEN 181 AND 360
               THEN 10
           WHEN ScoreDiff BETWEEN 60 AND 180
               THEN 5
           ELSE 0
       END AS Score,
       LastCall
FROM #temp
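The scoring bands can also be written as a plain function, which makes the evaluation order explicit. This is only an illustrative sketch with a made-up function name: note that 360 falls inside two of the bands as written, and the first matching band wins, exactly as in a CASE expression.

```python
def score_from_days(days_since_created):
    """Map a task's age in days to a score, checking bands in the same order as the CASE."""
    if 360 <= days_since_created <= 1500:
        return 15
    if 181 <= days_since_created <= 360:  # 360 never reaches this branch
        return 10
    if 60 <= days_since_created <= 180:
        return 5
    return 0
```

Ages below 60 days or above 1500 days score 0, which is why the first step can safely filter them out before scoring if rows with a score of 0 are not needed.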
If the purpose is just to get the latest one, then you can try this; otherwise you will need to find another way.
SELECT SC.ID,
       CASE
           WHEN DATEDIFF(DD, ST.CREATEDDATE, GETDATE()) BETWEEN 360 AND 1500
               THEN 15
           WHEN DATEDIFF(DD, ST.CREATEDDATE, GETDATE()) BETWEEN 181 AND 360
               THEN 10
           WHEN DATEDIFF(DD, ST.CREATEDDATE, GETDATE()) BETWEEN 60 AND 180
               THEN 5
           ELSE 0
       END AS Score,
       LATEST.ACTIVITYDATE
FROM Salesforce.dbo.SalesforceTasks AS ST
JOIN Salesforce.dbo.SalesforceContacts AS SC
    ON ST.ACCOUNTID = SC.ACCOUNTID
CROSS APPLY
(
    SELECT MAX(SFC.ID) AS SCID, MAX(SFC.ACTIVITYDATE) AS ACTIVITYDATE
    FROM Salesforce.dbo.SalesforceContacts SFC
    WHERE SFC.ACCOUNTID = SC.ACCOUNTID
    GROUP BY SFC.ACCOUNTID
    HAVING MAX(SFC.ID) = SC.ID
) AS LATEST
WHERE STATUS = 'Completed'
  AND TYPE LIKE 'Call%'
  AND DATEDIFF(DD, ST.CREATEDDATE, GETDATE()) BETWEEN 60 AND 1500
I'm trying to write a query that includes a daily sum of the amount, from the first instance the database starts collecting data on a given date to the last available instance of that date (the database collects data every hour). I have done this, but now I have to make it show month-to-date and year-to-date sums as well. I have tried various ways to come up with this but have had no luck. Below is the code that I believe is the closest I have gotten. Can someone help me make my code work or suggest another way to do this?
Select * from
(
SELECT Devices.DeviceDesc,
SUM(DeviceSummaryData.Amount) AS MTD,
Devices.Area,
MIN(DeviceSummaryData.StartDate) AS FirstOfStartDate,
MAX(DeviceSummaryData.EndDate) AS LastOfStartDate
FROM Devices INNER JOIN DeviceSummaryData ON Devices.DeviceID = DeviceSummaryData.DeviceID
WHERE (DeviceSummaryData.StartDate = MONTH(getdate())) AND (DeviceSummaryData.EndDate <= CAST(DATEADD(DAY, 1, GETDATE())
AS date))
GROUP BY Devices.DeviceDesc, Devices.Area, DATEPART(day, DeviceSummaryData.StartDate)
--
) q2
UNION ALL
SELECT * FROM (
SELECT Devices.DeviceDesc,
Sum(Amount) as Daily,
Devices.Area,
MIN(StartDate) as FirstDate,
MAX(DeviceSummaryData.EndDate) AS LastOfStartDate
FROM Devices INNER JOIN DeviceSummaryData ON Devices.DeviceID = DeviceSummaryData.DeviceID
WHERE (DeviceSummaryData.StartDate >= CAST(DATEADD(DAY, 0, GETDATE()) AS date)) AND (DeviceSummaryData.EndDate <= CAST(DATEADD(DAY, 1, getdate()) AS date))
GROUP BY Devices.Area,
Devices.DeviceDesc,
DATEPART(day, DeviceSummaryData.StartDate)
ORDER BY Devices.DeviceDesc
) q2
Another type of attempt I have tried would be this:
SELECT Devices.DeviceDesc,
Sum(case
when DeviceSummaryData.StartDate >= CAST(DATEADD(DAY, 0, getdate()) AS date)
THEN Amount
else 0
end) as Daily,
Sum(case
when Month(StartDate) = MONTH(getdate())
THEN Amount
else 0
end) as MTD,
Devices.Area,
MIN(StartDate) as FirstDate,
MAX(DeviceSummaryData.EndDate) AS LastOfStartDate
FROM Devices INNER JOIN DeviceSummaryData ON Devices.DeviceID = DeviceSummaryData.DeviceID
WHERE (DeviceSummaryData.StartDate >= CAST(DATEADD(DAY, 0, GETDATE()) AS date)) AND (DeviceSummaryData.EndDate <= CAST(DATEADD(DAY, 1, getdate()) AS date))
GROUP BY Devices.Area,
Devices.DeviceDesc,
DATEPART(day, DeviceSummaryData.StartDate)
ORDER BY Devices.DeviceDesc
I'm not the best with CASE WHEN expressions, but I saw somewhere that this is a possible way to do this. I'm not too concerned with speed or efficiency; I just need the query to return the data. Any help and suggestions are greatly appreciated!
The second attempt is on the right track but a bit confused. In the CASE expressions you are trying to compare months and so on, but your WHERE clause restricts the data you're looking at to a single day. Also, your GROUP BY should not include the day anymore. If you say in English what you want, it's "For each device area and type, I want to see a daily total, an MTD total and a YTD total". It's that "For each" bit that should define what appears in your GROUP BY.
Just remove the WHERE clause entirely and get rid of DATEPART(day, DeviceSummaryData.StartDate) from your GROUP BY and you should get the results you want. (Well, a daily and monthly total, anyway. Yearly is achieved much the same way).
Also note that DATEADD(DAY, 0, GETDATE()) is identical to just GETDATE().
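The resulting shape of the query can be tried out with a small runnable sketch. It uses Python's sqlite3 module as a stand-in for SQL Server, and it assumes a single hypothetical Readings table in place of the Devices/DeviceSummaryData join; device_totals, the today parameter and the sample rows are made up. One deliberate difference from the question's MONTH(StartDate) = MONTH(getdate()) test: the sketch compares year and month together, because comparing the month number alone would also count the same month of earlier years once the WHERE clause is gone.

```python
import sqlite3

def device_totals(rows, today):
    """rows: (device, area, start_date 'YYYY-MM-DD', amount); today: 'YYYY-MM-DD'."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Readings (DeviceDesc TEXT, Area TEXT, StartDate TEXT, Amount REAL)"
    )
    conn.executemany("INSERT INTO Readings VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT DeviceDesc, Area,
               SUM(CASE WHEN date(StartDate) = date(:today)
                        THEN Amount ELSE 0 END) AS Daily,
               SUM(CASE WHEN strftime('%Y-%m', StartDate) = strftime('%Y-%m', :today)
                        THEN Amount ELSE 0 END) AS MTD,
               SUM(CASE WHEN strftime('%Y', StartDate) = strftime('%Y', :today)
                        THEN Amount ELSE 0 END) AS YTD
        FROM Readings
        GROUP BY DeviceDesc, Area      -- no day in the GROUP BY, no WHERE on the day
        ORDER BY DeviceDesc
        """,
        {"today": today},
    ).fetchall()
    conn.close()
    return result

# Hypothetical readings: today, earlier this month, earlier this year, last year.
readings = [
    ("Pump", "North", "2024-03-25", 1.0),
    ("Pump", "North", "2024-03-10", 2.0),
    ("Pump", "North", "2024-01-05", 4.0),
    ("Pump", "North", "2023-03-25", 8.0),
]
totals = device_totals(readings, "2024-03-25")
```

For this data the single result row holds a daily total of 1.0, an MTD total of 3.0 and a YTD total of 7.0; the 2023 reading is excluded from all three.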