I currently have two tables, tbl_Invoices
InvoiceNumber NextBillingDate
------------------------------
100 3/15/21
200 3/31/21
300 4/15/21
400 5/15/21
and tbl_GLPeriods:
GLPeriod PeriodStartDate PeriodEndDate
----------------------------------------------
250 3/3/21 4/3/21
251 4/4/21 5/2/21
252 5/3/21 6/3/21
I need a view that returns a column where the GL period for the next billing date is provided, ie:
InvoiceNumber NextBillingPeriod
---------------------------------
100 250
200 250
300 251
400 252
How do I query to find if one column is between the two columns in another table? I'm blanking on how to do this, thinking something with a CASE.
Edit: where I'm currently at, structurally won't work, but it shows what I'm currently trying to get going:
SELECT
*,
CASE
WHEN tbl_Invoices.NextBillingDate BETWEEN (SELECT PeriodStartDate FROM tbl_GLPeriods) AS stdt
AND (SELECT PeriodEndDate FROM tbl_GLPeriods) AS endt
THEN endt.GLPeriod
END AS NextBillingPeriod
FROM
tbl_Invoices
Solved with this thanks to #Charlieface:
select tbl_Invoices.InvoiceNumber, tbl_GLPeriods.GLPeriod
from tbl_Invoices
left join tbl_GLPeriods on tbl_Invoices.NextBillingDate between tbl_GLPeriods.PeriodStartDate AND tbl_GLPeriods.PeriodEndDate
You can use AND to connect multiple predicates to check for a range with <= and > (or equivalent). Like that you can use a correlated subquery similar to what you've tried, provided the periods cannot overlap.
SELECT i.invoicenumber,
(SELECT p.glperiod
FROM tbl_glperiods p
WHERE p.periodstartdate <= i.nextbillingdate
AND dateadd(DAY, 1, p.periodenddate) > i.nextbillingdate) nextbillingperiod
FROM tbl_invoices i;
You can also use a left join. Then the periods can overlap, you'll get multiple rows, if a date falls in two or more periods. A join might also perform better.
SELECT i.invoicenumber,
p.glperiod nextbillingperiod
FROM tbl_invoices i
LEFT JOIN tbl_glperiods p
ON p.periodstartdate <= i.nextbillingdate
AND dateadd(DAY, 1, p.periodenddate) > i.nextbillingdate;
Note that you can shorten dateadd(DAY, 1, p.periodenddate) to just p.periodenddate if tbl_glperiods.periodenddate is meant to be and exclusive upper bound or if it's inclusive but tbl_invoices.nextbillingdate is guaranteed not to be more precise than a day, i.e. it cannot have an hour, minute, second and so on portion. Otherwise you might miss timestamps on the last day past midnight.
select InvoiceNumber, (select GLPeriod from tbl_GLPeriods where NextBillingDate between PeriodStartDate and PeriodEndDate) 'NextBillingPeriod' from tbl_Invoices
Related
I have the following table:
ID | DATES
---+-----------
1 02-09-2010
2 03-08-2011
1 08-01-2011
3 04-03-2010
I am looking for IDs who had at least one date before 05-01-2010 AND at least one date after 05-02-2010
I tried the following:
WHERE tb1.DATES < '05-01-2010' AND tb1.DATES > '05-02-2010'
I don't think it's correct because I wasn't getting the right IDs when I did that and there's something wrong with that logic.
Can someone explain what I am doing wrong here?
The SQL command SELECT * FROM tb1 WHERE tb1.DATES < '05-01-2010' AND tb1.DATES > '05-02-2010' is asking "find all the rows where the 'dates' field is before 1 May and after 2 May" which - when put in English - is obviously none of them.
Instead, the command should be asking "find all the IDs which have a record that is before 1 May, and another record after 2 May" - creating the need to look at multiple records for each ID.
As #Martheen suggested, you could do this with two (sub)queries e.g.,
SELECT A.ID
FROM
(SELECT DISTINCT tb1.ID
FROM mytable tb1
WHERE tb1.[dates] < '20100501'
) AS A
INNER JOIN
(SELECT DISTINCT tb1.ID
FROM mytable tb1
WHERE tb1.[dates] > '20100502'
) AS B
ON A.ID = B.ID;
or using INTERSECT
SELECT DISTINCT tb1.ID
FROM mytable tb1
WHERE tb1.[dates] < '20100501'
INTERSECT
SELECT mt2.ID
FROM mytable mt2
WHERE mt2.[dates] > '20100502';
The use of DISTINCT in the above is so that you only get one row per ID, no matter how many rows they have before/after the relevant dates.
You could also do it via GROUP BY and HAVING - which in this particular case is easy as if any dates are before 1 May, then their earliest date must be before 1 May (and correspondingly for their max data and 2 May) e.g.,
SELECT mt1.ID
FROM mytable mt1
GROUP BY mt1.ID
HAVING MIN(mt1.[dates]) < '20100501' AND MAX(mt1.[dates]) > '20100502';
Here is a db<>fiddle with all 3 of these; all provide the same answer (one row, with ID = 1).
Finally, you should use an unambiguous format for your dates. My preferred one of these is 'yyyyMMdd' with no dashes/slashes/etc (as these make them ambiguous).
Different countries/servers/etc will convert the dates you have there differently e.g., SQL Server UTC string comparison not working
This is one solution to use between to specify range.
SELECT * from Table_name where
From_date BETWEEN '2013-01-03'AND '2013-01-09'
Other solution is to what you mentioned but see that the logic is correct
SELECT * from Table_name where
From_date > '2010-01-05'AND From_date <'2010-02-05'
I have 2 tables:
Query1: contains 3 columns, Due_Date, Received_Date, Diff
where Diff is the difference in the two dates in days
QueryHol with 2 columns, Date, Count
This has a list of dates and the count is set to 1 for everything. All these dates represent public holidays.
I want to be able to get the sum of QueryHol["Count"] if QueryHol["Date"] is between Query1["Due_Date"] and Query1["Received_Date"]
Result Wanted: a column joined onto Query1 to state how many public holidays fell into the date range so they can be subtracted from the Query1["Diff"] column to give a reflection of working days.
Because the 01-01-19 is a bank holiday i would want to minus that from the Diff to end up with results like below
Let me know if you require any more info.
Here's an option:
SELECT query1.due_date
, query1.received_date
, query1.diff
, queryhol.count
, COALESCE(query1.diff - queryhol.count, query1.diff) as DiffCount
FROM Query1
OUTER APPLY(
SELECT COUNT(*) AS count
FROM QueryHol
WHERE QueryHol.Date <= Query1.Received_Date
AND QueryHol.Date >= Query1.Due_Date
) AS queryhol
You may need to play around with the join condition - as it is assumes that the Received_Date is always later than the Due_Date which there is not enough data to know all of the use cases.
If I understand your problem, I think this is a possible solution:
select due_date,
receive_date,
diff,
(select sum(table2.count)
from table2
where table2.due_date between table1.due_date and table1.due_date) sum_holi,
table1.diff - (select sum(table2.count)
from table2
where table2.date between table1.due_date and table2.due_date) diff_holi
from table1
where [...] --here your conditions over table1.
I have a lot of explaining to do for the context of this question so bear with me.
At my company, we have a SQL Server database and I'm working in the Management studio 2014.
We have a table that's called Jobstatistics, which displays how many Jobs are done during Intervals of one hour each.
The table looks like this
The station field is basically different areas jobs can be done at.
As you can see, some rows are missing for certain intervals and this is because of the way this table gets filled with data. To fill this table we have a script running that looks at another table and aggregates the amount of jobs for all dates between this interval. In other words, if there aren't any jobs, there won't be a row inserted because there will be nothing to insert (no rows from the other table to aggregate any jobs on).
What I want to do here is fill in these extra intervals with 0 as the amount of Jobs. So there will always be the 24 intervals (hours) for each day and for each station. On top of that we have set targets which we would like to achieve and I declared these in another table, called JobstatisticsTargets, which you could call a calendar table to join the Jobstatistics table on.
The calender table looks like this
I have tried doing a left or right join so the missing intervals would get filled in and the Jobs would at least get NULL values, but the join clause doesn't do what I expect it to.
This is my tried attempt
SELECT a.[Station], a.[Interval], a.[Jobs], b.[28JPH], b.[35JPH]
FROM [JobStatistics] a
RIGHT JOIN [JobStatisticsTargets] b
ON CONVERT(VARCHAR(10),a.Interval,108) = b.Interval
WHERE DATEDIFF(DAY, a.Interval, GETDATE()) < 12
AND Station LIKE '138010'
ORDER BY a.Station, a.Interval
The LEFT JOIN does exactly the same as I would expect a normal join to do and it doesn't append any intervals with NULL values. (the query is just for one station and a few days so I could test easily)
Any help is much appreciated. I will check this topic regularly so be sure to ask any questions regarding the context if you have any and I will try to explain it as good as I can!
EDIT
With some help the query now looks like this
SELECT a.[Station], b.[Interval], a.[Jobs], b.[28JPH], b.[35JPH]
FROM [JobStatistics] a
RIGHT JOIN [JobStatisticsTargets] b
ON CONVERT(VARCHAR(10),a.Interval,108) = b.Interval
AND CONVERT(VARCHAR(10),a.Interval,110) = CONVERT(VARCHAR(10),GETDATE(),110)
AND Station LIKE '138010'
ORDER BY b.Interval
I filter on today's date now because otherwise the extra rows aren't what I want them to be at all. The problem is that I don't know an easy way of filling in my stations. I suppose I need a subquery for those or is there another way?
The problem now as well is that I can't do this query for different stations. I would expect 24 rows for each station representing all the intervals, but I get this as a result:
Station Interval Jobs 28JPH 35JPH
NULL 00:30:00 NULL 0 0
NULL 01:30:00 NULL 0 0
NULL 02:30:00 NULL 0 0
NULL 03:30:00 NULL 0 0
134040 04:30:00 2 0 0
136060 04:30:00 2 0 0
131080 04:30:00 2 0 0
138010 05:30:00 2 0 0
NULL 06:30:00 NULL 0 0
NULL 07:30:00 NULL 28 35
NULL 08:30:00 NULL 28 35
...
You filter on a field from the table which rows may not be presented in the join result: >>>AND Station LIKE '138010'
You should change your query and put this condition in ON CLAUSE, not in WHERE
check this script and let me know,
declare #t table(interval datetime,jobs int)
insert into #t VALUES('2017-04-28 05:30',1),('2017-04-28 06:30',5),('2017-04-29 06:30',5)
--select * from #t
;With CTE as
(
select cast('00:00' as time) as IntervalTime
union ALL
select DATEADD(MINUTE,30,IntervalTime)
from cte
where IntervalTime<'23:30'
)
,CTE1 AS(
select interval,jobs
,dense_rank()over( order by cast(interval as date))rn
from #t
)
select * FROM
(
select distinct case when t.interval is null then
DATEADD(day, DATEDIFF(day, 0,
(select top 1 interval from cte1 where rn=n.number)), cast(c.IntervalTime as datetime))
else t.interval end Newinterval,isnull(t.jobs,0) Jobs
from CTE c
left join cte1 t
on c.IntervalTime=cast(t.interval as time)
cross apply(select number from master.dbo.spt_values
where name is null and number<=(select max(rn) from cte1))n
)t4
where Newinterval is not null
I have a 'Service' table with millions of rows. Each row corresponds to a service provided by a staff in a given date and time interval (Each row has a unique ID). There are cases where a staff might provide services in overlapping time frames. I need to write a query that merges overlapping time intervals and returns the data in the format shown below.
I tried grouping by StaffID and Date fields and getting the Min of BeginTime and Max of EndTime but that does not account for the non-overlapping time frames. How can I accomplish this? Again, the table contains several million records so a recursive CTE approach might have performance issues. Thanks in advance.
Service Table
ID StaffID Date BeginTime EndTime
1 101 2014-01-01 08:00 09:00
2 101 2014-01-01 08:30 09:30
3 101 2014-01-01 18:00 20:30
4 101 2014-01-01 19:00 21:00
Output
StaffID Date BeginTime EndTime
101 2014-01-01 08:00 09:30
101 2014-01-01 18:00 21:00
Here is another sample data set with a query proposed by a contributor.
http://sqlfiddle.com/#!6/bfbdc/3
The first two rows in the results set should be merged into one row (06:00-08:45) but it generates two rows (06:00-08:30 & 06:00-08:45)
I only came up with a CTE query as the problem is there may be a chain of overlapping times, e.g. record 1 overlaps with record 2, record 2 with record 3 and so on. This is hard to resolve without CTE or some other kind of loops, etc. Please give it a go anyway.
The first part of the CTE query gets the services that start a new group and are do not have the same starting time as some other service (I need to have just one record that starts a group). The second part gets those that start a group but there's more then one with the same start time - again, I need just one of them. The last part recursively builds up on the starting group, taking all overlapping services.
Here is SQLFiddle with more records added to demonstrate different kinds of overlapping and duplicate times.
I couldn't use ServiceID as it would have to be ordered in the same way as BeginTime.
;with flat as
(
select StaffID, ServiceDate, BeginTime, EndTime, BeginTime as groupid
from services S1
where not exists (select * from services S2
where S1.StaffID = S2.StaffID
and S1.ServiceDate = S2.ServiceDate
and S2.BeginTime <= S1.BeginTime and S2.EndTime <> S1.EndTime
and S2.EndTime > S1.BeginTime)
union all
select StaffID, ServiceDate, BeginTime, EndTime, BeginTime as groupid
from services S1
where exists (select * from services S2
where S1.StaffID = S2.StaffID
and S1.ServiceDate = S2.ServiceDate
and S2.BeginTime = S1.BeginTime and S2.EndTime > S1.EndTime)
and not exists (select * from services S2
where S1.StaffID = S2.StaffID
and S1.ServiceDate = S2.ServiceDate
and S2.BeginTime < S1.BeginTime
and S2.EndTime > S1.BeginTime)
union all
select S.StaffID, S.ServiceDate, S.BeginTime, S.EndTime, flat.groupid
from flat
inner join services S
on flat.StaffID = S.StaffID
and flat.ServiceDate = S.ServiceDate
and flat.EndTime > S.BeginTime
and flat.BeginTime < S.BeginTime and flat.EndTime < S.EndTime
)
select StaffID, ServiceDate, MIN(BeginTime) as begintime, MAX(EndTime) as endtime
from flat
group by StaffID, ServiceDate, groupid
order by StaffID, ServiceDate, begintime, endtime
Elsewhere I've answered a similar Date Packing question with
a geometric strategy. Namely, I interperet the date ranges
as a line, and utilize geometry::UnionAggregate to merge
the ranges.
Your question has two peculiarities though. First, it calls
for sql-server-2008. geometry::UnionAggregate is not then
avialable. However, download the microsoft library at
https://github.com/microsoft/SQLServerSpatialTools and load
it in as a clr assembly to your instance and you have it
available as dbo.GeometryUnionAggregate.
But the real peculiarity that has my interest is the concern
that you have several million rows to work with. So I thought
I'd repeat the strategy here but with an added technique to
improve it's performance. This technique will work well if
you have a lot of your StaffID/date subsets that are the same.
First, let's build a numbers table. Swap this out with your favorite
way to do it.
select i = row_number() over (order by (select null))
into #numbers
from #services; -- where i put your data
Then convert the dates to floats and use those floats to create
geometrical points.
These points can then be turned into lines via STUnion and STEnvelope.
With your ranges now represented as geometric lines, merge them via
UnionAggregate. The resulting geometry object 'lines' might contain
multiple lines. But any overlapping lines turn into one line.
select s.StaffID,
s.Date,
linesWKT = geometry::UnionAggregate(line).ToString()
-- If you have SQLSpatialTools installed then:
-- linesWKT = dbo.GeometryUnionAggregate(line).ToString()
into #aggregateRangesToGeo
from #services s
cross apply (select
beginTimeF = convert(float, convert(datetime,beginTime)),
endTimeF = convert(float, convert(datetime,endTime))
) prepare
cross apply (select
beginPt = geometry::Point(beginTimeF, 0, 0),
endPt = geometry::Point(endTimeF, 0, 0)
) pointify
cross apply (select
line = beginPt.STUnion(endPt).STEnvelope()
) lineify
group by s.StaffID,
s.Date;
You have one 'lines' object for each staffId/date combo. But depending
on your dataset, there may be many 'lines' objects that are the same
between these combos. This may very well be true if staff are expected
to follow a routine and data is recorded to the nearest whatever.
So get a distinct lising of 'lines' objects. This should improve
performance.
From this, extract the individual lines inside 'lines'. Envelope the lines,
which ensures that the lines are stored only as their endpoints. Read the
endpoint x values and convert them back to their time representations.
Keep the WKT representation to join it back to the combos later on.
select lns.linesWKT,
beginTime = convert(time, convert(datetime, ap.beginTime)),
endTime = convert(time, convert(datetime, ap.endTime))
into #parsedLines
from (select distinct linesWKT from #aggregateRangesToGeo) lns
cross apply (select
lines = geometry::STGeomFromText(linesWKT, 0)
) geo
join #numbers n on n.i between 1 and geo.lines.STNumGeometries()
cross apply (select
line = geo.lines.STGeometryN(n.i).STEnvelope()
) ln
cross apply (select
beginTime = ln.line.STPointN(1).STX,
endTime = ln.line.STPointN(3).STX
) ap;
Now just join your parsed data back to the StaffId/Date combos.
select ar.StaffID,
ar.Date,
pl.beginTime,
pl.endTime
from #aggregateRangesToGeo ar
join #parsedLines pl on ar.linesWKT = pl.linesWKT
order by ar.StaffID,
ar.Date,
pl.beginTime;
We have a table with following data
Id,ItemId,SeqNumber;DateTimeTrx
1,100,254,2011-12-01 09:00:00
2,100,1,2011-12-01 09:10:00
3,200,7,2011-12-02 11:00:00
4,200,5,2011-12-02 10:00:00
5,100,255,2011-12-01 09:05:00
6,200,3,2011-12-02 09:00:00
7,300,0,2011-12-03 10:00:00
8,300,255,2011-12-03 11:00:00
9,300,1,2011-12-03 10:30:00
Id is an identity column.
The sequence for an ItemId starts from 0 and goes till 255 and then resets to 0. All this information is stored in a table called Item. The order of sequence number is determined by the DateTimeTrx but such data can enter any time into the system. The expected output is as shown below-
ItemId,PrevorNext,SeqNumber,DateTimeTrx,MissingNumber
100,Previous,255,2011-12-01 09:05:00,0
100,Next,1,2011-12-01 09:10:00,0
200,Previous,3,2011-12-02 09:00:00,4
200,Next,5,2011-12-02 10:00:00,4
200,Previous,5,2011-12-02 10:00:00,6
200,Next,7,2011-12-02 11:00:00,6
300,Previous,1,2011-12-03 10:30:00,2
300,Next,255,2011-12-03 16:30:00,2
We need to get those rows one before and one after the missing sequence. In the above example for ItemId 300 - the record with sequence 1 has entered first (2011-12-03 10:30:00) and then 255(2011-12-03 16:30:00), hence the missing number here is 2. So 1 is previous and 255 is next and 2 is the first missing number. Coming to ItemId 100, the record with sequence 255 has entered first (2011-12-02 09:05:00) and then 1 (2011-12-02 09:10:00), hence 255 is previous and then 1, hence 0 is the first missing number.
In the above expected result, MissingNumber column is the first occuring missing number just to illustrate the example.
We will not have a case where we would have a complete series reset at one time i.e. it can be either a series rundown from 255 to 0 as in for itemid 100 or 0 to 255 as in ItemId 300. Hence we need to identify sequence missing when in ascending order (0,1,...255) or either in descending order (254,254,0,2) etc.
How can we accomplish this in a t-sql?
Could work like this:
;WITH b AS (
SELECT *
,row_number() OVER (ORDER BY ItemId, DateTimeTrx, SeqNumber) AS rn
FROM tbl
), x AS (
SELECT
b.Id
,b.ItemId AS prev_Itm
,b.SeqNumber AS prev_Seq
,c.ItemId AS next_Itm
,c.SeqNumber AS next_Seq
FROM b
JOIN b c ON c.rn = b.rn + 1 -- next row
WHERE c.ItemId = b.ItemId -- only with same ItemId
AND c.SeqNumber <> (b.SeqNumber + 1)%256 -- Seq cycles modulo 256
)
SELECT Id, prev_Itm, 'Previous' AS PrevNext, prev_Seq
FROM x
UNION ALL
SELECT Id, next_Itm ,'Next', next_Seq
FROM x
ORDER BY Id, PrevNext DESC
Produces exactly the requested result.
See a complete working demo on data.SE.
This solution takes gaps in the Id column into consideration, as there is no mention of a gapless sequence of Ids in the question.
Edit2: Answer to updated question:
I updated the CTE in the query above to match your latest verstion - or so I think.
Use those columns that define the sequence of rows. Add as many columns to your ORDER BY clause as necessary to break ties.
The explanation to your latest update is not entirely clear to me, but I think you only need to squeeze in DateTimeTrx to achieve what you want. I have SeqNumber in the ORDER BY additionally to break ties left by identical DateTimeTrx. I edited the query above.