(mssql) How can I add extra columns to extracted data?

I have the code below (running on SQL Server 2017):
WITH selection AS (
SELECT servertimestamp
FROM eventlog
WHERE servertimestamp BETWEEN '5/29/2018' AND DATEADD(dd, +1, '6/29/2019')
AND (attributes LIKE '%N<=>PeopleIn%'))
(SELECT DATEADD(HOUR, DATEDIFF(HOUR, 0, servertimestamp) - (DATEDIFF(HOUR, 0, servertimestamp) % 2), 0) as timestamp , COUNT(servertimestamp) AS GONE_OUT
FROM selection
WHERE DATEPART(hh, servertimestamp) BETWEEN 8 AND 20
GROUP BY DATEADD(HOUR, DATEDIFF(HOUR, 0, servertimestamp) - (DATEDIFF(HOUR, 0, servertimestamp) % 2), 0))
ORDER BY timestamp
Also, the screenshot below shows the result of the executed code:
This code shows how many people came into a building each day, with the data grouped into 2-hour slots.
What I want to do is add a column that shows how many people have gone out of the building during the same time slots that I'm already using.
Below is an example of what I want to achieve:
Notice that on the 6th line I'm using the LIKE operator (attributes LIKE '%N<=>PeopleIn%'). This means that for the additional column I'll have to make a similar selection, but using attributes LIKE '%N<=>PeopleOut%' instead.
Can I do this with the UNION operator? Is there a more obvious or easier way to do it?
Your help will be appreciated,
thank you.

You could do it by labeling each row in your CTE according to the activity (1 if it matches, 0 otherwise), then summing up the labels.
WITH selection
AS (
SELECT
servertimestamp
,CASE WHEN attributes LIKE '%N<=>PeopleIn%' THEN 1 ELSE 0 END AS PPL_IN
,CASE WHEN attributes LIKE '%N<=>PeopleOut%' THEN 1 ELSE 0 END AS PPL_OUT
FROM eventlog
WHERE
servertimestamp BETWEEN '5/29/2018' AND DATEADD(dd, + 1, '6/29/2019')
AND
(attributes LIKE '%N<=>PeopleIn%'
OR
attributes LIKE '%N<=>PeopleOut%')
)
(
SELECT
DATEADD(HOUR, DATEDIFF(HOUR, 0, servertimestamp) - (DATEDIFF(HOUR, 0, servertimestamp) % 2), 0) AS TIMESTAMP
,SUM(PPL_OUT) AS GONE_OUT
,SUM(PPL_IN) AS CAME_IN
FROM selection
WHERE DATEPART(hh, servertimestamp) BETWEEN 8
AND 20
GROUP BY DATEADD(HOUR, DATEDIFF(HOUR, 0, servertimestamp) - (DATEDIFF(HOUR, 0, servertimestamp) % 2), 0)
)
ORDER BY TIMESTAMP
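To make the labeling idea concrete outside the database, here is a minimal Python sketch of the same conditional aggregation. It is only an illustration: the input shape (a list of timestamp/attributes pairs), the function names and the sample rows are assumptions, not part of the original query.

```python
from collections import defaultdict
from datetime import datetime


def two_hour_bucket(ts: datetime) -> datetime:
    """Round a timestamp down to the start of its 2-hour slot (00, 02, ..., 22)."""
    return ts.replace(hour=ts.hour - ts.hour % 2, minute=0, second=0, microsecond=0)


def count_in_out(events):
    """Count PeopleIn / PeopleOut events per 2-hour slot.

    events: iterable of (servertimestamp, attributes) pairs (hypothetical input shape).
    """
    totals = defaultdict(lambda: {"CAME_IN": 0, "GONE_OUT": 0})
    for ts, attributes in events:
        if not 8 <= ts.hour <= 20:                 # DATEPART(hh, ...) BETWEEN 8 AND 20
            continue
        came_in = "N<=>PeopleIn" in attributes     # CASE WHEN ... THEN 1 ELSE 0 END
        gone_out = "N<=>PeopleOut" in attributes
        if not (came_in or gone_out):              # the OR filter in the CTE's WHERE
            continue
        slot = totals[two_hour_bucket(ts)]
        slot["CAME_IN"] += int(came_in)            # SUM(PPL_IN)
        slot["GONE_OUT"] += int(gone_out)          # SUM(PPL_OUT)
    return dict(sorted(totals.items()))            # ORDER BY TIMESTAMP


# Made-up sample rows, for illustration only.
sample_events = [
    (datetime(2018, 5, 29, 8, 15), "N<=>PeopleIn"),
    (datetime(2018, 5, 29, 9, 40), "N<=>PeopleOut"),
    (datetime(2018, 5, 29, 9, 59), "N<=>PeopleIn"),
    (datetime(2018, 5, 29, 10, 0), "N<=>PeopleOut"),
    (datetime(2018, 5, 29, 21, 0), "N<=>PeopleIn"),  # outside 8..20, ignored
]
print(count_in_out(sample_events))
```

Both counters are filled in a single pass over the rows, which is why one query with two SUM(CASE ...) columns is enough and no UNION or join is needed.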

UNION appends the rows of one result set to another; it does not add columns.
UNION would not be my first choice for solving this problem.
What I see in your screenshot is that GONE_OUT and CAME_IN are both aggregated per datetime slot, and each slot is unique.
You could write two (sub)queries, one for GONE_OUT and one for CAME_IN, and then join them on that slot.
SELECT *
FROM GONE_OUT AS go
LEFT JOIN CAME_IN AS ci ON go.timestamp = ci.timestamp

Related

SQL Server: If the beginning of the week is Monday, why is the previous Sunday included in that week?

I'm working on a stored procedure that outputs a grid for charting. Users can input an interval to display the chart in days, weeks, months, etc. The data that drives this grid has a bunch of timestamps in a column called moddate. In order to group things, I'm using:
DATEADD(<interval>, datediff(<interval>, 0, moddate), 0)
Days, Months, Quarters, and Years all work fine. Weeks, however, include the previous Sunday. For example, the following SELECT statement gives me an answer of 2017-04-03:
select "start_of_week" = dateadd(week, datediff(week, 0, '2017-04-02 00:00:00.000'), 0);
start_of_week:
'2017-04-03 00:00:00.000'
Changing @@DATEFIRST does not affect the results.
I guess my question is twofold:
Why is April 2nd considered part of the week that starts with April 3rd?
Is there a better way to get around this than to first check for an interval of weeks, and then check every date to see if it's a Sunday and if it is, put -7 instead of 0 at the end?
Thanks in advance for any help.

I ended up using a CASE in both the select and group by:
SELECT CASE
WHEN @interval = 'week'
THEN DATEADD(<interval>, DATEDIFF(day, 0, moddate)/7, 0)
ELSE DATEADD(<interval>, DATEDIFF(<interval>, 0, moddate), 0)
END as moddate,
fullname,
isnull(totals, 0) as totals
FROM #results
GROUP BY CASE
WHEN @interval = 'week'
THEN DATEADD(<interval>, DATEDIFF(day, 0, moddate)/7, 0)
ELSE DATEADD(<interval>, DATEDIFF(<interval>, 0, moddate), 0)
END, fullname
The crucial bit was using the case again in the group by.
My thanks to ZLK for the /7 bit, as subtracting 7 was taking a week off of every answer.
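For anyone wondering why the Sunday ends up in the following week: DATEDIFF(week, ...) counts week boundaries crossed and always treats Sunday as the first day of the week, regardless of SET DATEFIRST, while date 0 (1900-01-01) is a Monday. The Python sketch below mimics both calculations; the helper names are made up for illustration and this is a model of the arithmetic, not SQL Server's implementation.

```python
from datetime import date, timedelta

BASE = date(1900, 1, 1)        # T-SQL date 0, a Monday
A_SUNDAY = date(1899, 12, 31)  # the Sunday just before date 0


def datediff_week(start: date, end: date) -> int:
    """Mimic DATEDIFF(week, start, end): count Saturday-to-Sunday boundaries crossed."""
    def sunday_week_index(d: date) -> int:
        return (d - A_SUNDAY).days // 7
    return sunday_week_index(end) - sunday_week_index(start)


def week_start_by_week_diff(d: date) -> date:
    """DATEADD(week, DATEDIFF(week, 0, d), 0)"""
    return BASE + timedelta(weeks=datediff_week(BASE, d))


def week_start_by_day_diff(d: date) -> date:
    """DATEADD(week, DATEDIFF(day, 0, d) / 7, 0), using integer division"""
    return BASE + timedelta(weeks=(d - BASE).days // 7)


sunday = date(2017, 4, 2)
print(week_start_by_week_diff(sunday))  # the Monday after the Sunday
print(week_start_by_day_diff(sunday))   # the Monday before the Sunday
```

Because the Sunday already sits past the Saturday-to-Sunday boundary, the week count is one higher than the number of whole Monday-based weeks, and adding it to a Monday lands on the next Monday. Dividing the day count by 7 counts whole weeks since a Monday, so the result is the Monday on or before the date.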

Why is using hardcoded minutes in DateAdd way faster than using a field's value?

Dealing with a SQL issue and I'm not a SQL person, so need some guidance.
Given the SQL statement below, note that the first one uses a hardcoded value of "-360" in the DateAdd function, whereas the second uses a field value (OFFSET) that exists on every record (which has the value of either "-360" or "-300" depending on DST time of year).
Running the first query is extremely fast, while the second takes about 40 seconds longer.
Can someone tell me what the difference is that takes the second so much longer to execute, and because I HAVE to use that record's value and not hard code it, how can I speed up that query?
Query 1 (FAST):
SELECT 0 AS 'TempIndex', COUNT(*) AS 'TotalLY'
FROM CLOGS15 h
WHERE h.EVTYPE = 1
AND DateAdd(minute, -360, h.EVDATE) BETWEEN '2015-01-01 00:00:00.000' AND '2015-01-24 00:00:00.000'
Query 2 (SLOW):
SELECT 0 AS 'TempIndex', COUNT(*) AS 'TotalLY'
FROM CLOGS15 h
WHERE h.EVTYPE = 1
AND DateAdd(minute, OFFSET, h.EVDATE) BETWEEN '2015-01-01 00:00:00.000' AND '2015-01-24 00:00:00.000'

I can only imagine that the issue is sargability (the use of an index). However, I would have thought that dateadd() would prevent the use of an index in both cases. If you want to fix this, though, perhaps this will work:
SELECT 0 AS TempIndex, COUNT(*) AS TotalLY
FROM CLOGS15 h
WHERE h.EVTYPE = 1 AND offset = -360 AND
DateAdd(minute, -360, h.EVDATE) BETWEEN '2015-01-01' AND '2015-01-24'
UNION ALL
SELECT 0 AS TempIndex, COUNT(*) AS TotalLY
FROM CLOGS15 h
WHERE h.EVTYPE = 1 AND offset = -300 AND
DateAdd(minute, -300, h.EVDATE) BETWEEN '2015-01-01' AND '2015-01-24';
EDIT:
Oops, the above returns two rows and you want one. So, use a subquery:
SELECT TempIndex, SUM(TotalLY) as TotalLY
FROM (SELECT 0 AS TempIndex, COUNT(*) AS TotalLY
FROM CLOGS15 h
WHERE h.EVTYPE = 1 AND offset = -360 AND
DateAdd(minute, -360, h.EVDATE) BETWEEN '2015-01-01' AND '2015-01-24'
UNION ALL
SELECT 0 AS TempIndex, COUNT(*) AS TotalLY
FROM CLOGS15 h
WHERE h.EVTYPE = 1 AND offset = -300 AND
DateAdd(minute, -300, h.EVDATE) BETWEEN '2015-01-01' AND '2015-01-24'
) h
GROUP BY TempIndex;

In your case, I think that the difference between the fast and slow queries lies in index usage.
In your fast query, SQL Server might be rewriting your DATEADD so that it can use an index on EVDATE. Because you add a constant to the date and check whether the result lies between two constant dates, it is probably moving the DATEADD from the left side (before the BETWEEN) to the right side (the dates after the BETWEEN), reversing the sign of the constant in the DATEADD.
Your original is:
DateAdd(minute, -360, h.EVDATE)
BETWEEN '2015-01-01 00:00:00.000'
AND '2015-01-24 00:00:00.000'
It might be turning to:
h.EVDATE
BETWEEN DateAdd(minute, 360, '2015-01-01 00:00:00.000')
AND DateAdd(minute, 360, '2015-01-24 00:00:00.000')
This is equivalent to your original filter, but it allows the index to be used and reduces the DATEADD work, because only two dates need to be shifted instead of every date in your table.
In your slow query, the DATEADD uses a per-row offset, so SQL Server can't apply the rule described above; it has to run DATEADD on every date in your table, which is much slower.
I think you should try a filter shaped like the fast query, maybe something like this:
SELECT 0 AS 'TempIndex', COUNT(*) AS 'TotalLY'
FROM CLOGS15 h
WHERE 1=1
AND h.EVTYPE = 1
AND
(0=1
OR
(1=1
AND OFFSET = -360
AND h.EVDATE >= DateAdd(minute, 360, '2015-01-01 00:00:00.000')
AND h.EVDATE <= DateAdd(minute, 360, '2015-01-24 00:00:00.000')
)
OR
(1=1
AND OFFSET = -300
AND h.EVDATE >= DateAdd(minute, 300, '2015-01-01 00:00:00.000')
AND h.EVDATE <= DateAdd(minute, 300, '2015-01-24 00:00:00.000')
)
)
This query would benefit from an index on the columns EVTYPE, OFFSET, EVDATE, in that order.
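The rewrite relies on the fact that shifting every row's date by its offset and comparing against fixed bounds selects the same rows as shifting the two bounds by the opposite amount and comparing the raw date. A small Python sketch of that equivalence follows; the row dictionaries and function names are invented for illustration, and it says nothing about what the SQL Server optimizer actually does.

```python
from datetime import datetime, timedelta

LOW = datetime(2015, 1, 1)
HIGH = datetime(2015, 1, 24)


def shift_every_row(rows):
    """Shift each row's EVDATE by its own OFFSET, then compare (the slow query's shape)."""
    return [r for r in rows
            if LOW <= r["EVDATE"] + timedelta(minutes=r["OFFSET"]) <= HIGH]


def shift_the_bounds(rows):
    """Shift the two constant bounds once per known offset and compare the raw EVDATE."""
    kept = []
    for offset in (-360, -300):
        low = LOW - timedelta(minutes=offset)    # DateAdd(minute, 360, '2015-01-01')
        high = HIGH - timedelta(minutes=offset)  # DateAdd(minute, 360, '2015-01-24')
        kept += [r for r in rows
                 if r["OFFSET"] == offset and low <= r["EVDATE"] <= high]
    return kept


# Made-up rows sitting right on the edges of the range.
rows = [
    {"EVDATE": datetime(2015, 1, 1, 5, 59), "OFFSET": -360},  # just before the range
    {"EVDATE": datetime(2015, 1, 1, 6, 0), "OFFSET": -360},   # first instant in range
    {"EVDATE": datetime(2015, 1, 24, 5, 0), "OFFSET": -300},  # last instant in range
    {"EVDATE": datetime(2015, 1, 24, 5, 1), "OFFSET": -300},  # just after the range
]
print(len(shift_every_row(rows)), len(shift_the_bounds(rows)))
```

In the second form the comparison is on the bare EVDATE column, which is what lets an index on (EVTYPE, OFFSET, EVDATE) narrow the rows down to a range instead of evaluating a function for every row.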

Adding Month-to-date and Year-to-date in a query using SQL Server 2012?

I'm trying to write a query that includes a daily sum of the amount, from the first instance the database starts collecting data on a given day to the last available instance of that day (the database collects data every hour). I have done this, but now I have to make it show month-to-date and year-to-date sums as well. I have tried various ways to come up with this but have had no luck. Below is the code that I believe is the closest I have gotten. Can someone help me make my code work or suggest another way around this?
Select * from
(
SELECT Devices.DeviceDesc,
SUM(DeviceSummaryData.Amount) AS MTD,
Devices.Area,
MIN(DeviceSummaryData.StartDate) AS FirstOfStartDate,
MAX(DeviceSummaryData.EndDate) AS LastOfStartDate
FROM Devices INNER JOIN DeviceSummaryData ON Devices.DeviceID = DeviceSummaryData.DeviceID
WHERE (DeviceSummaryData.StartDate = MONTH(getdate())) AND (DeviceSummaryData.EndDate <= CAST(DATEADD(DAY, 1, GETDATE())
AS date))
GROUP BY Devices.DeviceDesc, Devices.Area, DATEPART(day, DeviceSummaryData.StartDate)
--
) q2
UNION ALL
SELECT * FROM (
SELECT Devices.DeviceDesc,
Sum(Amount) as Daily,
Devices.Area,
MIN(StartDate) as FirstDate,
MAX(DeviceSummaryData.EndDate) AS LastOfStartDate
FROM Devices INNER JOIN DeviceSummaryData ON Devices.DeviceID = DeviceSummaryData.DeviceID
WHERE (DeviceSummaryData.StartDate >= CAST(DATEADD(DAY, 0, GETDATE()) AS date)) AND (DeviceSummaryData.EndDate <= CAST(DATEADD(DAY, 1, getdate()) AS date))
GROUP BY Devices.Area,
Devices.DeviceDesc,
DATEPART(day, DeviceSummaryData.StartDate)
ORDER BY Devices.DeviceDesc
) q2
Another type of attempt I have tried would be this:
SELECT Devices.DeviceDesc,
Sum(case
when DeviceSummaryData.StartDate >= CAST(DATEADD(DAY, 0, getdate()) AS date)
THEN Amount
else 0
end) as Daily,
Sum(case
when Month(StartDate) = MONTH(getdate())
THEN Amount
else 0
end) as MTD,
Devices.Area,
MIN(StartDate) as FirstDate,
MAX(DeviceSummaryData.EndDate) AS LastOfStartDate
FROM Devices INNER JOIN DeviceSummaryData ON Devices.DeviceID = DeviceSummaryData.DeviceID
WHERE (DeviceSummaryData.StartDate >= CAST(DATEADD(DAY, 0, GETDATE()) AS date)) AND (DeviceSummaryData.EndDate <= CAST(DATEADD(DAY, 1, getdate()) AS date))
GROUP BY Devices.Area,
Devices.DeviceDesc,
DATEPART(day, DeviceSummaryData.StartDate)
ORDER BY Devices.DeviceDesc
I'm not the best with Case When's, but I saw somewhere that this is a possible way to do this. I'm not too concerned with the speed or efficiency, I just need it to generate the query to be able to get the data. Any help and Suggestions are greatly appreciated!

The second attempt is on the right track but a bit confused. In the CASE statements you are trying to compare months etc, but your WHERE clause restricts the data you're looking at to a single day. Also, your GROUP BY should not include the day anymore. If you say in English what you want, it's "For each device area and type, I want to see a total, a MTD total and a YTD total". It's that "For each" bit that should define what appears in your GROUP BY.
Just remove the WHERE clause entirely and get rid of DATEPART(day, DeviceSummaryData.StartDate) from your GROUP BY and you should get the results you want. (Well, a daily and monthly total, anyway. Yearly is achieved much the same way).
Also note that DATEADD(DAY, 0, GETDATE()) is identical to just GETDATE().
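As a rough illustration of the "for each device area and type" idea, here is a Python sketch that computes daily, MTD and YTD totals in one pass with no day-level grouping. The tuple layout, function name and sample rows are assumptions. Note also that the MTD CASE in the question compares only the month number, which would also match the same month of earlier years once the WHERE clause is removed; the sketch checks the year as well, which is an assumption about what is intended.

```python
from collections import defaultdict
from datetime import date, datetime


def summarize(rows, today: date):
    """Daily, MTD and YTD totals per (Area, DeviceDesc).

    rows: iterable of (device_desc, area, start_date, amount) tuples
    (a hypothetical flattening of the Devices / DeviceSummaryData join).
    """
    totals = defaultdict(lambda: {"Daily": 0, "MTD": 0, "YTD": 0})
    for device_desc, area, start, amount in rows:
        day = start.date()
        if day.year != today.year or day > today:
            continue                           # outside the year-to-date window
        bucket = totals[(area, device_desc)]   # the "for each" part, i.e. the GROUP BY
        bucket["YTD"] += amount
        if day.month == today.month:
            bucket["MTD"] += amount            # SUM(CASE WHEN same month THEN Amount ...)
            if day == today:
                bucket["Daily"] += amount      # SUM(CASE WHEN same day THEN Amount ...)
    return dict(totals)


# Made-up sample rows, for illustration only.
sample_rows = [
    ("Pump", "North", datetime(2024, 3, 15, 9), 5),
    ("Pump", "North", datetime(2024, 3, 15, 10), 7),
    ("Pump", "North", datetime(2024, 3, 2, 9), 10),
    ("Pump", "North", datetime(2024, 1, 20, 9), 100),
    ("Pump", "North", datetime(2023, 3, 15, 9), 1000),  # same month, previous year
    ("Fan", "South", datetime(2024, 3, 14, 8), 3),
]
print(summarize(sample_rows, date(2024, 3, 15)))
```

The only grouping key is (area, device); the day, month and year comparisons live inside the sums, exactly where the CASE expressions sit in the query.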

TSQL retrieve all records in current month/year

I have a datetime field called DateFinished. I need to be able to retrieve all records in which DateFinished is within the current month/year.

If you've only got a small number of rows, this will do to get all rows where DateFinished is in this month of this year.
SELECT *
FROM MyTable
WHERE Year(DateFinished) = Year(CURRENT_TIMESTAMP)
AND Month(DateFinished) = Month(CURRENT_TIMESTAMP)
This could get quite slow over a large number of rows though - in which case using DateAdd, DatePart and BETWEEN is probably more appropriate, and can take advantage of indexes (I don't have time to write an answer involving those right now!)

Just as an alternative - this should use an index on DateFinished.
SELECT *
FROM MyTable
WHERE DateFinished BETWEEN
DATEADD(MONTH, DATEDIFF(MONTH, 0, GETDATE()), 0)
AND
DATEADD(MONTH, DATEDIFF(MONTH, 0, GETDATE()) + 1, 0)

So the problem with @Bridge's method is that it cannot use an index.
@Moose's and @PCurd's method has a problem, depending on how the data is stored.
@PCurd's method would work fine if all data collected on a day is rounded down to that day, e.g. an event at 5pm is recorded as 2021-11-30 00:00:00. But if the time is kept (which I assume, since it is a datetime field in the OP's situation), then this data will be lost.
So you need to use the >= and < operators.
SELECT *
FROM MyTable
WHERE DateFinished >=
DATEADD(MONTH, DATEDIFF(MONTH, 0, GETDATE()), 0)
AND DateFinished <
DATEADD(MONTH, DATEDIFF(MONTH, 0, GETDATE()) + 1, 0)
For the method using datefromparts: SQL select records with current month
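To show the difference between the inclusive BETWEEN range and the half-open range, here is a small Python sketch. The function names are invented for illustration; the logic mirrors the >= / < filter above.

```python
from datetime import datetime


def month_bounds(now: datetime):
    """Return (first instant of this month, first instant of next month)."""
    start = datetime(now.year, now.month, 1)
    if now.month == 12:
        next_start = datetime(now.year + 1, 1, 1)
    else:
        next_start = datetime(now.year, now.month + 1, 1)
    return start, next_start


def in_month_half_open(ts: datetime, now: datetime) -> bool:
    """DateFinished >= start AND DateFinished < next_start"""
    start, next_start = month_bounds(now)
    return start <= ts < next_start


def in_month_between(ts: datetime, now: datetime) -> bool:
    """DateFinished BETWEEN start AND next_start (both ends inclusive)"""
    start, next_start = month_bounds(now)
    return start <= ts <= next_start


now = datetime(2021, 11, 30, 17, 0)
midnight_next_month = datetime(2021, 12, 1)
print(in_month_half_open(midnight_next_month, now))  # half-open range excludes it
print(in_month_between(midnight_next_month, now))    # inclusive range lets it through
```

A row stamped exactly at midnight on the first of next month passes the inclusive check but not the half-open one, while every other instant of the current month is treated the same by both.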

How to calculate overlapping subscription days from orders with sql-server

I have an order table with orders. I want to calculate the number of subscription days for each user (preferably in a set-based way) for a specific day.
create table #orders (orderid int, userid int, subscriptiondays int, orderdate date)
insert into #orders
select 1, 2, 10, '2011-01-01'
union
select 2, 1, 10, '2011-01-10'
union
select 3, 1, 10, '2011-01-15'
union
select 4, 2, 10, '2011-01-15'
declare @currentdate date = '2011-01-20'
--userid 1 is expected to have 10 subscriptiondays left
--(since there are 5 left when the second order is placed)
--userid 2 is expected to have 5 subscriptiondays left
I'm sure this has been done before, I just don't know what to search for.
Pretty much like a running total?
So when I set @currentdate to '2011-01-20' I want this result:
userid subscriptiondays
1 10
2 5
When I set @currentdate to '2011-01-25'
userid subscriptiondays
1 5
2 0
When I set @currentdate to '2011-01-11'
userid subscriptiondays
1 9
2 0
Thanks!

I think you would need to use a recursive common table expression.
EDIT: I've also added a procedural implementation further below instead of using a recursive common table expression. I recommend using that procedural approach, as I think there may be a number of data scenarios that the recursive CTE query that I've included probably doesn't handle.
The query below gives the correct answers for the scenarios that you've provided, but you would probably want to think up some additional complex scenarios and see whether there are any bugs.
For instance, I have a feeling that this query may break down if you have multiple previous orders overlapping with a later order.
with CurrentOrders (UserId, SubscriptionDays, StartDate, EndDate) as
(
select
userid,
sum(subscriptiondays),
min(orderdate),
dateadd(day, sum(subscriptiondays), min(orderdate))
from #orders
where
#orders.orderdate <= @currentdate
-- start with the latest order(s)
and not exists (
select 1
from #orders o2
where
o2.userid = #orders.userid
and o2.orderdate <= @currentdate
and o2.orderdate > #orders.orderdate
)
group by
userid
union all
select
#orders.userid,
#orders.subscriptiondays,
#orders.orderdate,
dateadd(day, #orders.subscriptiondays, #orders.orderdate)
from #orders
-- join any overlapping orders
inner join CurrentOrders on
#orders.userid = CurrentOrders.UserId
and #orders.orderdate < CurrentOrders.StartDate
and dateadd(day, #orders.subscriptiondays, #orders.orderdate) > CurrentOrders.StartDate
)
select
UserId,
sum(SubscriptionDays) as TotalSubscriptionDays,
min(StartDate),
sum(SubscriptionDays) - datediff(day, min(StartDate), @currentdate) as RemainingSubscriptionDays
from CurrentOrders
group by
UserId
;
Philip mentioned a concern about the recursion limit on common table expressions. Below is a procedural alternative using a table variable and a while loop, which I believe accomplishes the same thing.
While I've verified that this alternative code does work, at least for the sample data provided, I'd be glad to hear anyone's comments on this approach. Good idea? Bad idea? Any concerns to be aware of?
declare @ModifiedRows int
declare @CurrentOrders table
(
UserId int not null,
SubscriptionDays int not null,
StartDate date not null,
EndDate date not null
)
insert into @CurrentOrders
select
userid,
sum(subscriptiondays),
min(orderdate),
min(dateadd(day, subscriptiondays, orderdate))
from #orders
where
#orders.orderdate <= @currentdate
-- start with the latest order(s)
and not exists (
select 1
from #orders o2
where
o2.userid = #orders.userid
and o2.orderdate <= @currentdate
-- there does not exist any other order that surpasses it
and dateadd(day, o2.subscriptiondays, o2.orderdate) > dateadd(day, #orders.subscriptiondays, #orders.orderdate)
)
group by
userid
set @ModifiedRows = @@ROWCOUNT
-- perform an extra update here in case there are any additional orders that were made after the start date but before the specified @currentdate
update co set
co.SubscriptionDays = co.SubscriptionDays + #orders.subscriptiondays
from @CurrentOrders co
inner join #orders on
#orders.userid = co.UserId
and #orders.orderdate <= @currentdate
and #orders.orderdate >= co.StartDate
and dateadd(day, #orders.subscriptiondays, #orders.orderdate) < co.EndDate
-- Keep attempting to update rows as long as rows were updated on the previous attempt
while (@ModifiedRows > 0)
begin
update co set
SubscriptionDays = co.SubscriptionDays + overlap.subscriptiondays,
StartDate = overlap.orderdate
from @CurrentOrders co
-- join any overlapping orders
inner join (
select
#orders.userid,
sum(#orders.subscriptiondays) as subscriptiondays,
min(orderdate) as orderdate
from #orders
inner join @CurrentOrders co2 on
#orders.userid = co2.UserId
and #orders.orderdate < co2.StartDate
and dateadd(day, #orders.subscriptiondays, #orders.orderdate) > co2.StartDate
group by
#orders.userid
) overlap on
overlap.userid = co.UserId
set @ModifiedRows = @@ROWCOUNT
end
select
UserId,
sum(SubscriptionDays) as TotalSubscriptionDays,
min(StartDate),
sum(SubscriptionDays) - datediff(day, min(StartDate), @currentdate) as RemainingSubscriptionDays
from @CurrentOrders
group by
UserId
EDIT2: I've made some adjustments to the code above to address various special cases, such as if there just happen to be two orders for a user that both end on the same date.
For instance, changing the setup data to the following caused issues with the original code, which I've now corrected:
insert into #orders
select 1, 2, 10, '2011-01-01'
union
select 2, 1, 10, '2011-01-10'
union
select 3, 1, 10, '2011-01-15'
union
select 4, 2, 6, '2011-01-15'
union
select 5, 2, 4, '2011-01-17'
EDIT3: I've made some additional adjustments to address other special cases. In particular, the previous code ran into issues with the following setup data, which I've now corrected:
insert into #orders
select 1, 2, 10, '2011-01-01'
union
select 2, 1, 6, '2011-01-10'
union
select 3, 1, 10, '2011-01-15'
union
select 4, 2, 10, '2011-01-15'
union
select 5, 1, 4, '2011-01-12'

If my clarifying comment/question is correct, then you want to use DATEDIFF:
DATEDIFF(dd, orderdate, @currentdate)

My interpretation of the problem:
On day X, customer buys a “span” of subscription days (i.e. good for N days)
The span starts on the day of purchase and is good for X through day X + (N - 1)... but see below
If customer purchases a second span after the first expires (or any new span after all existing spans expire), repeat process. (A single 10-day purchase 30 days ago has no impact on a second purchase made today.)
If customer purchases a span while existing span(s) are still in effect, the new span applies to day immediately after end of current span(s) through that date + (N – 1)
This is iterative. If customer buys 10-day spans on Jan 1st, Jan 2nd, and Jan 3rd, it would look something like:
As of 1st: Jan 1 – Jan 10
As of 2nd: Jan 1 – Jan 10, Jan 11 – Jan 20 (in effect, Jan 1 to Jan 20)
As of 3rd: Jan 1 – Jan 10, Jan 11 – Jan 20, Jan 21 – Jan 30 (in effect, Jan 1 to Jan 30)
If this is indeed the problem, then it is a horrible problem to solve in T-SQL. To determine the “effective span” of a given purchase, you have to calculate the effective span of all prior purchases in the order that they were purchased, because of that overall cumulative effect. This is a trivial problem with 1 user and 3 rows, but non-trivial with thousands of users with dozens of purchases (which, presumably, is what you want).
I would solve it like so:
Add column EffectiveDate of datatype date to the table
Build a one-time process to walk through every row user-by-user and orderdate by orderdate, and calculate the EffectiveDate as discussed above
Modify the process used to insert the data to calculate the EffectiveDate at the time a new entry is made. Done this way, you’d only ever have to reference the most recent purchase made by that user.
Wrangle out subsequent issues regarding deleting (cancelled?) or updating (mis-set?) orders
I may be wrong, but I don't see any way to address this using set-based tactics. (Recursive CTEs and the like would work, but they can only recurse to so many levels, and we don't know the limit for this problem -- let alone how often you'll need to run it, or how well it must perform.) I'll watch and upvote anyone who solves this without recursion!
And of course this only applies if my understanding of the problem is correct. If not, please disregard.
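The interpretation above (each new span starts either on its purchase date or right after the user's existing coverage ends, whichever is later) can be sketched procedurally. The Python below is only an illustration of that walk over the orders in purchase order, not a T-SQL solution; the function name and tuple layout are assumptions, and the sample rows are the ones from the question.

```python
from datetime import date, timedelta


def remaining_days(orders, current: date):
    """Remaining subscription days per user as of `current`.

    orders: iterable of (orderid, userid, subscriptiondays, orderdate) tuples.
    """
    covered_until = {}  # userid -> first day that is NOT covered any more
    for _, userid, days, orderdate in sorted(orders, key=lambda o: (o[3], o[0])):
        if orderdate > current:
            continue  # order not placed yet as of `current`
        # A new span starts on the purchase date, or right after existing coverage.
        start = max(orderdate, covered_until.get(userid, orderdate))
        covered_until[userid] = start + timedelta(days=days)
    return {u: max(0, (end - current).days) for u, end in covered_until.items()}


# Sample data from the question.
orders = [
    (1, 2, 10, date(2011, 1, 1)),
    (2, 1, 10, date(2011, 1, 10)),
    (3, 1, 10, date(2011, 1, 15)),
    (4, 2, 10, date(2011, 1, 15)),
]
print(remaining_days(orders, date(2011, 1, 20)))
```

Only the running "covered until" date per user has to be carried from one order to the next, which is the same observation behind storing an EffectiveDate on each row.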

In fact, we need to calculate the sum of subscriptiondays minus the days between the first subscription date and @currentdate, like:
select o.userid,
sum(o.subscriptiondays) -
DATEDIFF(dd,
(select min(a.orderdate)
from #orders as a
where a.userid = o.userid), @currentdate)
from #orders as o
where o.orderdate <= @currentdate
group by o.userid
