I have a situation where I am summing up several columns from a table and inserting the results into another table. This is being grouped by county and district. One of the columns is also taking the smallest total sales from a retailer in that district. The problem I have is that there may be some that have less than zero total sales. I only want to write the smallest value that is greater than zero to that column.
declare @WeekEnd datetime
set @WeekEnd = (select top(1) date from sales order by date desc)
select date
,county
,district
,sum(prod1)
,sum(prod2)
,sum(prod3)
,sum(prod4)
,sum(prod1+prod2+prod3+prod4) --Total Sales
,Case when min(prod1+prod2+prod3+prod4) > 0 then min(prod1+prod2+prod3+prod4)
--this works well except for when a total is less than zero, then it is null. I want to avoid the null and have it write the smallest value greater than zero.
end
from sales
where date = @WeekEnd
group by date,county,district
order by county, district
If I am reading your question correctly, you need to get the MIN TotalSales with a subquery:
declare @WeekEnd datetime
set @WeekEnd = (select top(1) date from sales order by date desc)
select date
,county
,district
,sum(prod1)
,sum(prod2)
,sum(prod3)
,sum(prod4)
,sum(prod1+prod2+prod3+prod4) --Total Sales
,(SELECT min(prod1+prod2+prod3+prod4)
FROM sales s2
WHERE s1.date=s2.date
AND s1.county=s2.county
AND s1.district=s2.district
AND (prod1+prod2+prod3+prod4)>0
)
from sales s1
where date = @WeekEnd
group by date,county,district
order by county, district
Haven't tried, but I would assume this works:
min(Case when prod1+prod2+prod3+prod4 <= 0
then null else prod1+prod2+prod3+prod4 end)
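To see the conditional aggregate in context, here is a minimal sketch run against a few made-up rows. The table variable @sales and its data are hypothetical; only the column names come from the question. MIN ignores NULLs, so rows whose total is zero or negative never become the minimum.

```sql
-- Hypothetical sample data; column names follow the question.
DECLARE @sales TABLE (
    [date]   date,
    county   varchar(20),
    district varchar(20),
    prod1 int, prod2 int, prod3 int, prod4 int
);

INSERT INTO @sales VALUES
    ('2020-01-05', 'A', 'D1',  5, 0, 0, 0),  -- total  5
    ('2020-01-05', 'A', 'D1', -3, 1, 0, 0),  -- total -2, ignored by the MIN
    ('2020-01-05', 'A', 'D1',  2, 1, 0, 0),  -- total  3
    ('2020-01-05', 'A', 'D2', -1, 0, 0, 0);  -- total -1, ignored by the MIN

SELECT [date], county, district,
       SUM(prod1 + prod2 + prod3 + prod4) AS TotalSales,
       -- CASE without ELSE yields NULL for non-positive totals,
       -- and MIN skips NULLs
       MIN(CASE WHEN prod1 + prod2 + prod3 + prod4 > 0
                THEN prod1 + prod2 + prod3 + prod4 END) AS MinPositiveSales
FROM @sales
GROUP BY [date], county, district
ORDER BY county, district;
```

With these rows, district D1 reports 3 as its smallest positive total. District D2 has no positive total at all, so the column is still NULL there; wrap the expression in ISNULL or COALESCE if you need a default for that case.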
I have two tables Agency_DailyPrices and Agency_DailyDiscounts.
Here is my query:
DECLARE @checkIn DATE = CAST(GETDATE() AS DATE);
DECLARE @checkOut DATE = DATEADD(DAY, 6, @checkIn);
DECLARE @currency_id INT = 3;
SELECT AP.date_, AP.property_id, price, discountPercent,
CASE WHEN discountPercent > 0 THEN (price * discountPercent/100) ELSE 0 END AS discountAmount
FROM Agency_DailyPrices AP
LEFT JOIN Agency_DailyDiscounts AD ON AD.date_= AP.date_ AND AD.property_id = AP.property_id
WHERE (AP.date_ BETWEEN @checkIn AND @checkOut) AND
currency_id = @currency_id
ORDER BY AP.property_id;
And here is the output.
I want to filter out the records with property_id = 62, as there is no price for 2021-06-01.
In other words, how can I retrieve only the properties that have a price for every date between the given dates?
Thanks in advance
As a temporary solution, I counted the number of days between the given dates (in this case 7) and returned the records only if the total number of rows per property was equal to 7.
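The counting approach described above could be sketched as follows. This is only an illustration under some assumptions: that price and currency_id live on Agency_DailyPrices, and that date_ holds one row per day with no time part. The name FullyPriced is made up for the example.

```sql
DECLARE @checkIn DATE = CAST(GETDATE() AS DATE);
DECLARE @checkOut DATE = DATEADD(DAY, 6, @checkIn);
DECLARE @currency_id INT = 3;

WITH FullyPriced AS (
    -- Keep only properties with a price on every day of the range.
    -- BETWEEN is inclusive, so the expected day count is DATEDIFF + 1.
    SELECT property_id
    FROM Agency_DailyPrices
    WHERE date_ BETWEEN @checkIn AND @checkOut
      AND currency_id = @currency_id
    GROUP BY property_id
    HAVING COUNT(DISTINCT date_) = DATEDIFF(DAY, @checkIn, @checkOut) + 1
)
SELECT AP.date_, AP.property_id, AP.price
FROM Agency_DailyPrices AS AP
JOIN FullyPriced AS FP ON FP.property_id = AP.property_id
WHERE AP.date_ BETWEEN @checkIn AND @checkOut
  AND AP.currency_id = @currency_id
ORDER BY AP.property_id, AP.date_;
```

COUNT(DISTINCT date_) is used rather than COUNT(*) so that duplicate price rows for the same day cannot make an incomplete property look complete.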
I have a table with columns for ID, StartDate, EndDate, and whether or not there was a gap between the end date of that row and the next start date. If there were only one continuous set of rows per ID, I know that I could just do
SELECT min(startdate),max(enddate)
FROM table
GROUP BY ID
However, I have multiple instances of these IDs in several non-connected timespans. So if I were to do that, I would get the very first start date and the last end date from a different set of time for that ID. How would I go about making sure I get the min and max dates for the specific blocks of time?
I thought about potentially creating a new column that would hold a number for each set of time. So the first set of time that has no gaps would have 1, and then when the next row has a gap it would add +1, corresponding to a new set of time. But I am not really sure how to go about that. Here is some sample data to illustrate what I am working with:
ID   StartDate  EndDate    NextDate   Gap_ind
001  1/1/2018   1/31/2018  2/1/2018   N
001  2/1/2018   2/30/2018  3/1/2018   N
001  3/1/2018   3/31/2018  5/1/2018   Y
001  5/1/2018   5/31/2018  6/1/2018   N
001  6/1/2018   6/30/2018  6/30/2018  N
This is a classic "gaps and islands" problem, where you are trying to define the boundaries of your islands, and which you can solve by using some windowing functions.
Your initial effort is on track. Rather than getting the next start date, though, I used the previous end date to calculate the groupings.
The innermost subquery below gets the previous end date for each of your date ranges, and also assigns a row number that we use later to keep our groupings in order.
The next subquery out uses the previous end date to identify which groups of date ranges go together (overlap, or nearly so).
The outermost query is the end result you're looking for.
SELECT
Grp.ID,
MIN(Grp.StartDate) AS GroupingStartDate,
MAX(Grp.EndDate) AS GroupingEndDate
FROM
(
SELECT
PrevDt.ID,
PrevDt.StartDate,
PrevDt.EndDate,
SUM(CASE WHEN DATEADD(DAY,1,PrevDt.PreviousEndDate) >= PrevDt.StartDate THEN 0 ELSE 1 END)
OVER (PARTITION BY PrevDt.ID ORDER BY PrevDt.RN) AS GrpNum
FROM
(
SELECT
ROW_NUMBER() OVER (PARTITION BY ID ORDER BY StartDate, EndDate) as RN,
ID,
StartDate,
EndDate,
LAG(EndDate,1) OVER (PARTITION BY ID ORDER BY StartDate) AS PreviousEndDate
FROM
tbl
) AS PrevDt
) AS Grp
GROUP BY
Grp.ID,
Grp.GrpNum;
Results:
+-----+------------------+--------------+
| ID | InitialStartDate | FinalEndDate |
+-----+------------------+--------------+
| 001 | 2018-01-01 | 2018-03-01 |
| 001 | 2018-05-01 | 2018-06-01 |
+-----+------------------+--------------+
SQL Fiddle demo.
Further reading:
The SQL of Gaps and Islands in Sequences
Gaps and Islands Across Date Ranges
This is an example of a gaps-and-islands problem. A simple solution is to use lag() to determine if there are overlaps. When there is none, you have the start of a group. A cumulative sum defines the group -- and you aggregate on that.
select t.id, min(startdate), max(enddate)
from (select t.*,
sum(case when prev_enddate >= dateadd(day, -1, startdate)
then 0 else 1
end) over (partition by id order by startdate) as grp
from (select t.*, lag(enddate) over (partition by id order by startdate) as prev_enddate
from t
) t
) t
group by id, grp;
I have a table with job schedules :
job_id [unique ID]
pref_start [date]
spec_duration [time in seconds]
I can calculate the end date from the preferred start and duration. The pref_start is not fixed, and can be changed at whim by the engineers.
I need to report activity in any given week, so if I have data similar to:
jid  start       end
J1   01/01/yyyy  15/02/yyyy
J2   07/01/yyyy  08/02/yyyy
J3   09/02/yyyy  21/03/yyyy
How would I query "tell me the job IDs that occur on each day of the week 07/02/yyyy to 12/02/yyyy"?
First find the overlapping interval between each job and your filter interval. A job covers every day of the filter interval only if the number of days in the overlap matches the number of days in the filter interval:
DECLARE @Jobs TABLE (
ID INT IDENTITY,
StartDate DATE,
EndDate DATE)
INSERT INTO @Jobs (
StartDate,
EndDate)
VALUES
('2019-01-01', '2019-02-15'),
('2019-01-07', '2019-02-08'),
('2019-02-09', '2019-03-21')
DECLARE @FilterStartDate DATE = '2019-02-07'
DECLARE @FilterEndDate DATE = '2019-02-12'
;WITH AtLeast1DayOverlappingJobs AS
(
SELECT
J.ID,
J.StartDate,
J.EndDate,
OverlappingStartDate = CASE
WHEN J.StartDate > @FilterStartDate THEN J.StartDate ELSE @FilterStartDate END, -- Highest of 2
OverlappingEndDate = CASE
WHEN J.EndDate < @FilterEndDate THEN J.EndDate ELSE @FilterEndDate END -- Lowest of 2
FROM
@Jobs AS J
WHERE
-- They share at least 1 day
@FilterStartDate <= J.EndDate AND @FilterEndDate >= J.StartDate
)
SELECT
T.*
FROM
AtLeast1DayOverlappingJobs AS T
WHERE
-- Number of days must match between filter and overlapping periods
DATEDIFF(DAY, @FilterStartDate, @FilterEndDate) = DATEDIFF(DAY, T.OverlappingStartDate, T.OverlappingEndDate)
Results:
ID  StartDate   EndDate     OverlappingStartDate  OverlappingEndDate
1   2019-01-01  2019-02-15  2019-02-07            2019-02-12
I'm trying to sum all the credits that occur before a debit, then sum all the debits after a credit, within a 4 day time period.
Table
ACCT |Date | Amount | Credit or debit
-----+----------+---------+----------------
152 |8/14/2017 | 48 | C
152 |8/12/2017 | 22.5 | D
152 |8/12/2017 | 40 | D
152 |8/11/2017 | 226.03 | C
152 |8/10/2017 | 143 | D
152 |8/10/2017 | 107.23 | C
152 |8/10/2017 | 20 | D
152 |8/10/2017 | 49.41 | C
My query should only sum if there is a credit before the debit. The results will have 3 rows with the data above.
Output needed:
acct DateRange credit_amount debit_amount
--------------------------------------------------------------------------
152 2017-10-14 to 2017-10-18 49.41 20
152 2017-10-14 to 2017-10-18 107.23 143
152 2017-10-14 to 2017-10-18 226.03 62.5
The last one is summing the two debits until there is a credit.
First find the first credit.
Sum the credits if there is more than 1 before a debit.
Then find the debit and sum the debits together until the next credit.
I only need the case where the credit date is before the debit date. The 48 on 8/14 is ignored because there is no debit after it.
The logic is to see if the account was credited then debited after it.
My attempt
DECLARE @StartDate DATE
DECLARE @EndDate DATE
DECLARE @OverallEndDate DATE
SET @OverallEndDate = '2017-08-14'
SET @StartDate = '2017-08-10'
SET @EndDate = dateadd(dd, 4, @startDate);
WITH Dates
AS (
SELECT @StartDate AS sd, @EndDate AS ed, @OverallEndDate AS od
UNION ALL
SELECT dateadd(dd, 1, sd), DATEADD(dd, 1, ed), od
FROM Dates
WHERE od > sd
), credits
AS (
SELECT DISTINCT A.Acct, LEFT(CONVERT(VARCHAR, @StartDate, 120), 10) + 'to' + LEFT(CONVERT(VARCHAR, @EndDate, 120), 10) AS DateRange, credit_amount, debit_amount
FROM (
SELECT t1.acct, sum(amount) AS credit_amount, MAX(t1.datestart) AS c_datestart
FROM [Transactions] T1
WHERE Credit_or_debit = 'C' AND T1.Datestart BETWEEN @StartDate AND @EndDate AND T1.[acct] = '152' AND T1.Datestart <= (
SELECT MIN(D1.Datestart)
FROM [Transactions] D1
WHERE T1.acct = D1.acct AND D1.Credit_or_debit = 'D' AND D1.Datestart BETWEEN @StartDate AND @EndDate
)
GROUP BY T1.acct
) AS A
CROSS JOIN (
SELECT t2.acct, sum(amount) AS debit_amount, MAX(t2.datestart) AS c_datestart
FROM [Transactions] T2 AND T2.DBCR = 'D' AND T2.Datestart BETWEEN @StartDate AND @EndDate AND T2.[acct] = '152' AND T2.Datestart <= (
SELECT MAX(D1.Datestart)
FROM [Transactions] D1
WHERE T2.acct = D1.acct AND D1.Credit_or_debit = 'D' AND D1.Datestart BETWEEN @StartDate AND @EndDate
)
GROUP BY T2.acct
) AS B
WHERE A.acct = B.acct AND A.c_datestart <= B.d_datestart
)
SELECT *
FROM credits
OPTION (MAXRECURSION 0)
Update:
The date stored is actually date timestamped. That is how I verify whether the debit is > credit.
It should be clear now that you definitely need a column that specifies the sequential order of transactions, because otherwise you can't decide whether a debit is placed before or after a credit when they both have the same datestart. Assuming that you have such a column (in my query I named it ID), a solution could be as follows, without recursion and also without a self-join. The problem can be solved using some of the window functions available since SQL Server 2008.
My solution processes the data in several steps that I implemented as a sequence of 2 CTEs and a final PIVOT query:
DECLARE @StartDate DATE = '20170810';
DECLARE @EndDate DATE = dateadd(dd, 4, @StartDate);
DECLARE @DateRange nvarchar(24);
SET @DateRange =
CONVERT(nvarchar(10), @StartDate, 120) + ' to '
+ CONVERT(nvarchar(10), @EndDate, 120);
WITH
blocks (acct, CD, amount, blockno, r_blockno) AS (
SELECT acct, Credit_or_debit, amount
, ROW_NUMBER() OVER (PARTITION BY acct ORDER BY ID ASC)
- ROW_NUMBER() OVER (PARTITION BY acct, Credit_or_debit ORDER BY ID ASC)
, ROW_NUMBER() OVER (PARTITION BY acct ORDER BY ID DESC)
- ROW_NUMBER() OVER (PARTITION BY acct, Credit_or_debit ORDER BY ID DESC)
FROM Transactions
WHERE datestart BETWEEN @StartDate AND @EndDate
AND Credit_or_debit IN ('C','D') -- not needed, if always true
),
blockpairs (acct, CD, amount, pairno) AS (
SELECT acct, CD, amount
, DENSE_RANK() OVER (PARTITION BY acct, CD ORDER BY blockno)
FROM blocks
WHERE (blockno > 0 OR CD = 'C') -- remove leading debits
AND (r_blockno > 0 OR CD = 'D') -- remove trailing credits
)
SELECT acct, @DateRange AS DateRange
, amt.C AS credit_amount, amt.D AS debit_amount
FROM blockpairs PIVOT (SUM(amount) FOR CD IN (C, D)) amt
ORDER BY acct, pairno;
And this is how it works:
blocks
Here, the relevant data is retrieved from the table, meaning that the date range filter is applied, and another filter on the Credit_or_debit column makes sure that only the values C and D are contained in the result (if this is the case by design in your table, then that part of the WHERE clause can be omitted). The essential part of this CTE is the difference of two row numbers (blockno). Credits and debits are numbered separately, and their respective row number is subtracted from the overall row number. Within a consecutive block of debits or credits, these numbers will be the same for each record, and they will be different (higher) in later blocks of the same type. The main use of this numbering is to identify the very first block (number 0) in order to be able to exclude it from further processing in the next step in case it's a debit block. To be able to also identify the very last block (and filter it away in the next step if it's a credit block), a similar block numbering is made in the reverse order (r_blockno). The result (which I ordered just for visualization with your sample data) will look like this:
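As a sketch of this step in isolation, the query below computes blockno and r_blockno for the question's eight transactions. The table variable @t and the ID values 1 to 8 are assumptions: the rows are placed in the chronological order implied by the question's expected output, since the original table has no ordering column.

```sql
-- Hypothetical reconstruction of the question's data, oldest first.
DECLARE @t TABLE (ID int, acct int, Credit_or_debit char(1), amount decimal(10,2));

INSERT INTO @t VALUES
    (1, 152, 'C',  49.41),
    (2, 152, 'D',  20),
    (3, 152, 'C', 107.23),
    (4, 152, 'D', 143),
    (5, 152, 'C', 226.03),
    (6, 152, 'D',  40),
    (7, 152, 'D',  22.5),
    (8, 152, 'C',  48);

SELECT ID, Credit_or_debit AS CD, amount,
       -- overall position minus position within the same type:
       -- constant inside a consecutive run of credits or debits
       ROW_NUMBER() OVER (PARTITION BY acct ORDER BY ID)
     - ROW_NUMBER() OVER (PARTITION BY acct, Credit_or_debit ORDER BY ID) AS blockno,
       -- the same calculation, counted from the end
       ROW_NUMBER() OVER (PARTITION BY acct ORDER BY ID DESC)
     - ROW_NUMBER() OVER (PARTITION BY acct, Credit_or_debit ORDER BY ID DESC) AS r_blockno
FROM @t
ORDER BY ID;
```

With this data, the two consecutive debits (IDs 6 and 7) share blockno 3, the leading credit (ID 1) gets blockno 0, and the trailing credit (ID 8) gets r_blockno 0, which is what lets the next step drop it.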
blockpairs
In this CTE, as described before, the very first block is filtered away if it's a debit block, and the very last block is filtered away if it's a credit block. After this, the number of remaining blocks must be even, and the logical order of blocks must be a sequence of pairs of credit and debit blocks, each pair starting with a credit block and followed by its associated debit block. Each pair of credit/debit blocks will result in a single row in the end. To associate the credit and debit blocks correctly in the query, I give them the same number by using separate numberings per type (the n-th credit block and the n-th debit block are associated by giving them the same number n). For this numbering, I use the DENSE_RANK function, so that all records in a block obtain the same number (pairno) and the numbering is gapless. For numbering the blocks of the same type, I reuse the blockno field described above for ordering. The result in your example (again sorted for visualization):
The final PIVOT query
Finally, the credit_amount and debit_amount are aggregated over the respective blocks, grouping by acct and pairno, and then displayed side by side using a PIVOT query.
Although the column pairno isn't visible, it is used for sorting the resulting records.
I have a table in MSSQL with the following structure:
PersonId
StartDate
EndDate
I need to be able to show the number of distinct people in the table within a date range or at a given date.
As an example, I need to show the totals on a daily basis, e.g. if we have 2 entries on the 1st June, 3 on the 2nd June and 1 on the 3rd June, the system should show the following result:
1st June: 2
2nd June: 5
3rd June: 6
If, however, one of the entries on the 2nd June also has an end date of 2nd June, then the 3rd June result would show just 5.
Would someone be able to assist with this.
Thanks
UPDATE
This is what I have so far, which seems to work. Is there a better solution, though? My solution only gets me the employed figures. I also need the unemployed figures in another column. Unemployed would mean either no entry in the table, or the date not falling between the start and end dates and no other entry as employed.
CREATE TABLE #Temp(CountTotal int NOT NULL, CountDate datetime NOT NULL);
DECLARE @StartDT DATETIME
SET @StartDT = '2015-01-01 00:00:00'
WHILE @StartDT < '2015-08-31 00:00:00'
BEGIN
INSERT INTO #Temp(CountTotal, CountDate)
SELECT COUNT(DISTINCT PERSON.Id) AS CountTotal, @StartDT AS CountDate FROM PERSON
INNER JOIN DATA_INPUT_CHANGE_LOG ON PERSON.DataInputTypeId = DATA_INPUT_CHANGE_LOG.DataInputTypeId AND PERSON.Id = DATA_INPUT_CHANGE_LOG.DataItemId
LEFT OUTER JOIN PERSON_EMPLOYMENT ON PERSON.Id = PERSON_EMPLOYMENT.PersonId
WHERE PERSON.Id > 0 AND DATA_INPUT_CHANGE_LOG.Hidden = '0' AND DATA_INPUT_CHANGE_LOG.Approved = '1'
AND ((PERSON_EMPLOYMENT.StartDate <= DATEADD(MONTH,1,@StartDT) AND PERSON_EMPLOYMENT.EndDate IS NULL)
OR (@StartDT BETWEEN PERSON_EMPLOYMENT.StartDate AND PERSON_EMPLOYMENT.EndDate) AND PERSON_EMPLOYMENT.EndDate IS NOT NULL)
SET @StartDT = DATEADD(MONTH,1,@StartDT)
END
select * from #Temp
drop TABLE #Temp
You can use the following query. The cte part is to generate a set of serial dates between the start date and end date.
DECLARE @ViewStartDate DATETIME
DECLARE @ViewEndDate DATETIME
SET @ViewStartDate = '2015-01-01 00:00:00.000';
SET @ViewEndDate = '2015-02-25 00:00:00.000';
;WITH Dates([Date])
AS
(
SELECT @ViewStartDate
UNION ALL
SELECT DATEADD(DAY, 1,Date)
FROM Dates
WHERE DATEADD(DAY, 1,Date) <= @ViewEndDate
)
SELECT [Date], COUNT(PersonData.PersonId) -- counts matched rows only; COUNT(*) would report 1 for dates with no match
FROM Dates
LEFT JOIN PersonData ON Dates.Date >= PersonData.StartDate
AND Dates.Date <= PersonData.EndDate
GROUP By [Date]
Replace PersonData with your table name.
If the StartDate and EndDate columns can be NULL, then you need to add additional conditions to the join.
The query assumes each person has only one record in the same date range.
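The caveats above could be handled along these lines. This is a sketch, not a tested drop-in: the table variable @PersonData and its three rows are made up for illustration, a NULL EndDate is assumed to mean "still active", and COUNT(DISTINCT ...) is used so that a person with several overlapping records is counted once per day.

```sql
-- Hypothetical sample data; person 3 has no end date.
DECLARE @PersonData TABLE (PersonId int, StartDate date, EndDate date NULL);

INSERT INTO @PersonData VALUES
    (1, '2015-01-01', '2015-02-01'),
    (2, '2015-01-02', '2015-01-15'),
    (3, '2015-01-01', NULL);

DECLARE @ViewStartDate date = '2015-01-01';
DECLARE @ViewEndDate   date = '2015-02-25';

WITH Dates ([Date]) AS (
    SELECT @ViewStartDate
    UNION ALL
    SELECT DATEADD(DAY, 1, [Date])
    FROM Dates
    WHERE DATEADD(DAY, 1, [Date]) <= @ViewEndDate
)
SELECT d.[Date],
       COUNT(DISTINCT p.PersonId) AS PeopleCount  -- 0 for days with no match
FROM Dates AS d
LEFT JOIN @PersonData AS p
       ON p.StartDate <= d.[Date]
      AND (p.EndDate IS NULL OR p.EndDate >= d.[Date])  -- open-ended rows stay active
GROUP BY d.[Date]
ORDER BY d.[Date]
OPTION (MAXRECURSION 0);  -- lift the default limit of 100 recursion levels
```

The MAXRECURSION hint only matters once the date range exceeds about 100 days, but without it a longer range makes the recursive CTE fail.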
You could do this by creating data where every start date is a +1 event and end date is -1 and then calculate a running total on top of that.
For example if your data is something like this
PersonId  StartDate  EndDate
1         20150101   20150201
2         20150102   20150115
3         20150101
You first create a data set that looks like this:
EventDate ChangeValue
20150101 +2
20150102 +1
20150115 -1
20150201 -1
And if you use running total, you'll get this:
EventDate Total
2015-01-01 2
2015-01-02 3
2015-01-15 2
2015-02-01 1
You can get it with something like this:
select
p.eventdate,
sum(p.changevalue) over (order by p.eventdate asc) as total
from
(
select startdate as eventdate, sum(1) as changevalue from personnel group by startdate
union all
select enddate, sum(-1) from personnel where enddate is not null group by enddate
) p
order by p.eventdate asc
Using sum() as a window function with ORDER BY requires SQL Server 2012 or later. If you're using an older version, you can check other options for running totals.
My example in SQL Fiddle
If you have dates that don't have any events and you need to show those too, then the best option is probably to create a separate table of dates for the whole range you'll ever need, for example 1.1.2000 - 31.12.2099.
-- Edit --
To get the count for a specific day, it's possible to use the same logic, but just sum everything up to that day:
declare @eventdate date
set @eventdate = '20150117'
select
sum(p.changevalue)
from
(
select startdate as eventdate, 1 as changevalue from personnel
where startdate <= @eventdate
union all
select enddate, -1 from personnel
where enddate < @eventdate
) p
Hopefully this is ok, can't test since SQL Fiddle seems to be unavailable.