SQL GROUP BY AND VALUE - sql-server

I have a table
AvailbilityDate | Resort | AccomName | Price | Min Occupancy
24 June 2012 | Resort1 | Accom1 | 999 | 8
24 June 2012 | Resort1 | Accom2 | 888 | 6
24 June 2012 | Resort2 | Accom1a | 243 | 10
24 June 2012 | Resort2 | Accom2a | 563 | 7
What I currently have is
SELECT AvailbilityDate, Resort, MIN(Price) AS Lowest
FROM mytable
GROUP BY AvailbilityDate, Resort
I want to be able to get the AccomName and the Min Occupancy
Many thanks in advance

With standard ANSI SQL, the solution would be this:
SELECT *
FROM (
SELECT AvailbilityDate,
resort,
accomName,
price,
min_occupancy,
min(price) over (partition by AvailbilityDate, Resort) as min_price
FROM deals_panel_view
) t
WHERE min_price = price;
This should work on PostgreSQL, Oracle, DB2, SQL Server, Sybase and Teradata.
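The windowed-minimum idea above can be tried without a SQL Server instance. The following is a minimal sketch in Python, using the standard library's sqlite3 module (SQLite 3.25 or later is assumed for window functions). The table name mytable and the column name MinOccupancy are illustrative stand-ins for the question's table. Note that if two accommodations tie for the lowest price in a group, this technique returns both rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE mytable (
        AvailbilityDate TEXT, Resort TEXT, AccomName TEXT,
        Price INTEGER, MinOccupancy INTEGER)
""")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?, ?)",
    [
        ("2012-06-24", "Resort1", "Accom1", 999, 8),
        ("2012-06-24", "Resort1", "Accom2", 888, 6),
        ("2012-06-24", "Resort2", "Accom1a", 243, 10),
        ("2012-06-24", "Resort2", "Accom2a", 563, 7),
    ],
)

# Attach the group minimum to every row, then keep the rows that match it.
cheapest = conn.execute("""
    SELECT AvailbilityDate, Resort, AccomName, Price, MinOccupancy
    FROM (
        SELECT m.*,
               MIN(Price) OVER (PARTITION BY AvailbilityDate, Resort) AS min_price
        FROM mytable AS m
    ) AS t
    WHERE Price = min_price
    ORDER BY Resort
""").fetchall()

for row in cheapest:
    print(row)
```

With the sample rows from the question, this keeps Accom2 for Resort1 and Accom1a for Resort2.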

Using a common table expression and Ranking functions you can do this
WITH cte as (
SELECT
ROW_NUMBER() over (PARTITION BY AvailbilityDate,Resort ORDER BY price) as row,
AvailbilityDate,
Resort,
AccomName,
Price,
[Min Occupancy]
FROM mytable
)
SELECT AvailbilityDate,Resort,Price,AccomName,[Min Occupancy] from cte where row=1

SELECT AvailbilityDate, Resort, AccomName, "Min Occupancy", MIN(Price) AS Lowest
FROM mytable
GROUP BY AvailbilityDate, Resort

You can do this in several ways.
1) Ranking - per a_horse_with_no_name's solution
2) Group by and Cross Apply. Essentially plug the columns you want to group up in the first subquery and then all other columns go into the second subquery. Use an Order By in the second subquery to deal with any column you want to apply MIN (or MAX) to.
SELECT a.AvailbilityDate,
a.resort,
b.AccomName,
b.min_occupancy,
b.Lowest
FROM
(
SELECT t1.AvailbilityDate, t1.resort
FROM myTable t1
GROUP BY t1.AvailbilityDate, t1.resort
) a
CROSS APPLY
(
SELECT TOP 1 t2.AccomName, t2.min_occupancy, t2.price as Lowest
FROM mytable t2
WHERE t2.AvailbilityDate = a.AvailbilityDate
AND t2.resort = a.resort
ORDER BY t2.price ASC
) b
3) Use subqueries in select statement (not very elegant but works) with Group By. This is assuming that there is only one accomName with the given minimum price for each combination of AvailbilityDate and resort.
SELECT a.AvailbilityDate,
a.resort,
(SELECT accomName FROM myTable t1
WHERE t1.AvailbilityDate = a.AvailbilityDate
AND t1.resort = a.resort
AND t1.price = MIN(a.price)
) as accomName,
(SELECT min_occupancy FROM myTable t1
WHERE t1.AvailbilityDate = a.AvailbilityDate
AND t1.resort = a.resort
AND t1.price = MIN(a.price)
) as min_occupancy,
MIN(a.price) as Lowest
FROM myTable a
GROUP BY a.AvailbilityDate, a.resort

SELECT AccomName, MIN([Min Occupancy]) AS Lowest FROM mytable GROUP BY AccomName
That would give you only the requested information. If you want the requested fields in addition to what you already have, it would look like this:
SELECT AvailbilityDate, AccomName, Resort, MIN(Price) AS Lowest, Min([Min Occupancy]) As LowestMinOcc FROM mytable GROUP BY AvailbilityDate , AccomName , Resort

Related

using all values from one column in another query

I am trying to find a solution for the following issue that I have in sql-server:
I have one table t1 of which I want to use each date for each agency and loop it through the query to find out the avg_rate. Here is my table t1:
Table T1:
+--------+-------------+
| agency | end_date |
+--------+-------------+
| 1 | 2017-10-01 |
| 2 | 2018-01-01 |
| 3 | 2018-05-01 |
| 4 | 2012-01-01 |
| 5 | 2018-04-01 |
| 6 | 2017-12-01 |
+--------+-------------+
I literally want to use all values in the column end_date and plug it into the query here (I marked it with ** **):
with averages as (
select a.id as agency
,c.rate
, avg(c.rate) over (partition by a.id order by a.id ) as avg_cost
from table_a as a
join rates c on a.rate_id = c.id
and c.end_date = **here I use all values from t1.end_date**
and c.Start_date = **here I use all values from above minus half a year** = dateadd(month,-6,end_date)
group by a.id
,c.rate
)
select distinct agency, avg_cost from averages
order by 1
The reason why I need two dynamic dates is that the avg_rates vary if you change the timeframe between these dates.
My problem and my question is now:
How can you take the end_date from table t1, plug it into the query where c.end_date is, and loop it through all values in t1.end_date?
I appreciate your help!
Do you really need a windowed average? Try this out.
;with timeRanges AS
(
SELECT
T.end_date,
start_date = dateadd(month,-6, T.end_date)
FROM
T1 AS T
)
select
a.id as agency,
c.rate,
T.end_date,
T.start_date,
avg_cost = avg(c.rate)
from
table_a as a
join rates c on a.rate_id = c.id
join timeRanges AS T ON A.DateColumn BETWEEN T.start_date AND T.end_date
group by
a.id ,
c.rate,
T.end_date,
T.start_date
You need a date column to join your data against T1 (I called it DateColumn in this example), otherwise all time ranges would return the same averages.
I can think of several ways to do this - Cursor, StoredProcedure, Joins ...
Given the simplicity of your query, a cartesian product (Cross Join) of Table T1 against the averages CTE should do the magic.
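The join-instead-of-loop idea from these answers can be sketched as follows. This is a minimal illustration in Python with sqlite3, not the asker's real schema: the rates table with agency, rate_date and rate columns is hypothetical, and the sketch assumes each agency should be averaged over the six months ending at its own end_date. SQLite's date(..., '-6 months') stands in for SQL Server's dateadd(month, -6, ...).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (agency INTEGER, end_date TEXT);
    -- Hypothetical layout: one dated rate observation per row.
    CREATE TABLE rates (agency INTEGER, rate_date TEXT, rate REAL);

    INSERT INTO t1 VALUES (1, '2017-10-01'), (2, '2018-01-01');
    INSERT INTO rates VALUES
        (1, '2017-03-15', 100.0),  -- before agency 1's window, ignored
        (1, '2017-05-01', 10.0),
        (1, '2017-09-01', 20.0),
        (2, '2017-08-01', 30.0),
        (2, '2017-12-01', 50.0),
        (2, '2018-02-01', 500.0);  -- after agency 2's window, ignored
""")

# Each t1 row defines its own [end_date - 6 months, end_date] window;
# the join condition does the "looping" over all end dates.
averages = conn.execute("""
    SELECT t.agency, t.end_date, AVG(r.rate) AS avg_rate
    FROM t1 AS t
    JOIN rates AS r
      ON r.agency = t.agency
     AND r.rate_date BETWEEN date(t.end_date, '-6 months') AND t.end_date
    GROUP BY t.agency, t.end_date
    ORDER BY t.agency
""").fetchall()

for row in averages:
    print(row)
```

The point of the sketch is that no cursor or loop is needed: the range condition in the join evaluates every end_date in t1 in one set-based query.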

How to calculate running total over specific date or better?

I would like to calculate which orders can be completed and, after completing as many orders as possible at the moment, which dates are short of stock (the diff). Stock is picked in FEFO (first expired, first out) order.
When thinking about the problem I think that some kind of a running sum based on both the dates of the stock and the orders would be one way to go. Based on Calculate running total / running balance and other similar threads it seems like a good fit for the problem - but I'm open to other solutions.
Example code
DECLARE @stockTable TABLE (
BATCH_NUM nvarchar(16),
QUANTITY int,
DATE_OUTGO DATE
)
DECLARE @orderTable TABLE (
ORDER_ID int,
QUANTITY int,
DATE_OUTGO DATE
)
INSERT INTO @stockTable (BATCH_NUM, QUANTITY, DATE_OUTGO)
VALUES
('1000', 10, '2017-08-25'),
('1001', 20, '2017-08-26'),
('1002', 10, '2017-08-27')
INSERT INTO @orderTable (ORDER_ID, QUANTITY, DATE_OUTGO)
VALUES
(1, 10, '2017-08-25'),
(1, 12, '2017-08-25'),
(2, 10, '2017-08-26'),
(3, 10, '2017-08-26'),
(4, 16, '2017-08-26')
SELECT
DATE_OUTGO,
SUM(RunningTotal) AS DIFF
FROM (
SELECT
orderTable.DATE_OUTGO AS DATE_OUTGO,
RunningTotal = SUM(stockTable.QUANTITY - orderTable.QUANTITY ) OVER
(ORDER BY stockTable.DATE_OUTGO ROWS UNBOUNDED PRECEDING)
FROM
@orderTable orderTable
INNER JOIN @stockTable stockTable
ON stockTable.DATE_OUTGO >= orderTable.DATE_OUTGO
GROUP BY
orderTable.DATE_OUTGO,
stockTable.DATE_OUTGO,
stockTable.QUANTITY,
orderTable.QUANTITY
) A
GROUP BY DATE_OUTGO
Results
The correct result would look like this.
-------------------------
| OT_DATE_OUTGO | DIFF |
-------------------------
| 2017-08-25 | 0 |
-------------------------
| 2017-08-26 | -18 |
-------------------------
My result currently looks like this.
-------------------------
| OT_DATE_OUTGO | DIFF |
-------------------------
| 2017-08-25 | 80 |
-------------------------
| 2017-08-26 | 106 |
-------------------------
I've taken out complexities like item numbers, different demands simultaneously (using the exact date only and date or better) etc. to simplify the core issue as much as possible.
Edit 1:
Updated rows in both tables and results (correct and with original query).
First answer gave a diff of -12 on 2017-08-25 instead of 0. But 2017-08-26 was correct.
You can use the following query:
;WITH ORDER_RUN AS (
SELECT SUM(SUM(QUANTITY)) OVER (ORDER BY DATE_OUTGO) AS ORDER_RUNTOTAL,
DATE_OUTGO
FROM @orderTable
GROUP BY DATE_OUTGO
), STOCK_RUN AS (
SELECT SUM(SUM(QUANTITY)) OVER (ORDER BY DATE_OUTGO) AS STOCK_RUNTOTAL,
DATE_OUTGO
FROM @stockTable
GROUP BY DATE_OUTGO
)
SELECT ORR.DATE_OUTGO AS OT_DATE_OUTGO,
X.STOCK_RUNTOTAL - ORDER_RUNTOTAL AS DIFF
FROM ORDER_RUN AS ORR
OUTER APPLY (
SELECT TOP 1 STOCK_RUNTOTAL
FROM STOCK_RUN AS SR
WHERE SR.DATE_OUTGO <= ORR.DATE_OUTGO
ORDER BY SR.DATE_OUTGO DESC) AS X
The first CTE calculates the order running total, whereas the second CTE calculates the stock running total. The query uses OUTER APPLY to get the stock running total up to the date the current order has been made.
Edit:
If you want to consume the stock of dates that come in the future with respect to the order date, then simply replace:
WHERE SR.DATE_OUTGO <= ORR.DATE_OUTGO
with
WHERE STOCK_RUNTOTAL <= ORDER_RUNTOTAL
in the OUTER APPLY operation.
Edit 2:
The following improved query should, at last, solve the problem:
;WITH ORDER_RUN AS (
SELECT SUM(SUM(QUANTITY)) OVER (ORDER BY DATE_OUTGO) AS ORDER_RUNTOTAL,
DATE_OUTGO
FROM @orderTable
GROUP BY DATE_OUTGO
), STOCK_RUN AS (
SELECT SUM(SUM(QUANTITY)) OVER (ORDER BY DATE_OUTGO) AS STOCK_RUNTOTAL,
SUM(SUM(QUANTITY)) OVER () AS TOTAL_STOCK,
DATE_OUTGO
FROM @stockTable
GROUP BY DATE_OUTGO
)
SELECT ORR.DATE_OUTGO AS OT_DATE_OUTGO,
CASE
WHEN X.STOCK_RUNTOTAL - ORDER_RUNTOTAL >= 0 THEN 0
ELSE X.STOCK_RUNTOTAL - ORDER_RUNTOTAL
END AS DIFF
FROM ORDER_RUN AS ORR
OUTER APPLY (
SELECT TOP 1 STOCK_RUNTOTAL
FROM STOCK_RUN AS SR
WHERE STOCK_RUNTOTAL >= ORDER_RUNTOTAL -- Stop if stock quantity has exceeded order quantity
OR
STOCK_RUNTOTAL = TOTAL_STOCK -- Stop if the end of stock has been reached
ORDER BY SR.DATE_OUTGO) AS X
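The logic of the query from Edit 2 can be checked against the sample data with a minimal sketch in Python and sqlite3 (SQLite 3.25 or later assumed). This is a port for illustration only, not the answer's exact T-SQL: the table names stock and orders are stand-ins for the table variables, the nested SUM(SUM(...)) OVER is written as an aggregate subquery followed by a window sum, and OUTER APPLY with TOP 1 is replaced by a correlated subquery with LIMIT 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stock (batch_num TEXT, quantity INTEGER, date_outgo TEXT);
    CREATE TABLE orders (order_id INTEGER, quantity INTEGER, date_outgo TEXT);
    INSERT INTO stock VALUES
        ('1000', 10, '2017-08-25'),
        ('1001', 20, '2017-08-26'),
        ('1002', 10, '2017-08-27');
    INSERT INTO orders VALUES
        (1, 10, '2017-08-25'),
        (1, 12, '2017-08-25'),
        (2, 10, '2017-08-26'),
        (3, 10, '2017-08-26'),
        (4, 16, '2017-08-26');
""")

diffs = conn.execute("""
    WITH order_run AS (
        -- Running total of ordered quantity per date.
        SELECT date_outgo,
               SUM(qty) OVER (ORDER BY date_outgo) AS order_runtotal
        FROM (SELECT date_outgo, SUM(quantity) AS qty
              FROM orders GROUP BY date_outgo)
    ),
    stock_run AS (
        -- Running total of stock per date, plus the grand total.
        SELECT date_outgo,
               SUM(qty) OVER (ORDER BY date_outgo) AS stock_runtotal,
               SUM(qty) OVER () AS total_stock
        FROM (SELECT date_outgo, SUM(quantity) AS qty
              FROM stock GROUP BY date_outgo)
    )
    SELECT o.date_outgo,
           -- Two-argument MIN is SQLite's scalar minimum: cap surpluses at 0.
           MIN(0,
               (SELECT s.stock_runtotal
                FROM stock_run AS s
                WHERE s.stock_runtotal >= o.order_runtotal  -- enough stock
                   OR s.stock_runtotal = s.total_stock      -- stock exhausted
                ORDER BY s.date_outgo
                LIMIT 1) - o.order_runtotal) AS diff
    FROM order_run AS o
    ORDER BY o.date_outgo
""").fetchall()

for row in diffs:
    print(row)
```

For the sample data this gives a diff of 0 on 2017-08-25 and -18 on 2017-08-26, which matches the result the question marks as correct.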

Finding max date difference on a single column

In the example below (Table A), there are entries for four different IDs (1, 2, 3, 4), each with a status and its timestamp. I want to find the ID that took the longest to change its Status from Started to Completed; in this example it is ID = 4. The table currently holds approximately a million records, so it would be great if someone could suggest an efficient way to retrieve this data.
Table A
ID Status Date(YYYY-DD-MM HH:MM:SS)
1. Started 2017-01-01 01:00:00
1. Completed 2017-01-01 02:00:00
2. Started 2017-10-02 03:00:00
2. Completed 2017-10-02 05:00:00
3. Started 2017-15-03 06:00:00
3. Completed 2017-15-03 09:00:00
4. Started 2017-22-04 10:00:00
4. Completed 2017-22-04 15:00:00
Thanks!
Bruce
You can query as below:
Select top 1 with ties y1.Id from #yourDate y1
join #yourDate y2
On y1.Id = y2.Id
and y1.[STatus] = 'Started'
and y2.[STatus] = 'Completed'
order by Row_number() over(order by datediff(mi,y1.[Date], y2.[date]) desc)
SELECT
started.ID, timediff(completed.date, started.date) as elapsed_time
FROM TABLE_A as started
INNER JOIN TABLE_A as completed ON (completed.ID=started.ID AND completed.status='Completed')
WHERE started.status='Started'
ORDER BY elapsed_time desc
Be sure there is an index on TABLE_A covering the columns ID and date.
I haven't run this sql but it may solve your problem.
select a.id, max(DATEDIFF(SECOND, a.date, b.date)) as elapsed_seconds from TableA as a
join TableA as b on a.id = b.id
where a.status = 'Started' and b.status = 'Completed'
group by a.id
Here's a way with a correlated sub-query. Just uncomment the TOP 1 to get ID 4 in this case. This is based off your comments that there is only 1 "started" record, but could be multiple "completed" records for each ID.
declare @TableA table (ID int, Status varchar(64), Date datetime)
insert into @TableA
values
(1,'Started','2017-01-01 01:00:00'),
(1,'Completed','2017-01-01 02:00:00'),
(2,'Started','2017-02-10 03:00:00'),
(2,'Completed','2017-02-10 05:00:00'),
(3,'Started','2017-03-15 06:00:00'),
(3,'Completed','2017-03-15 09:00:00'),
(4,'Started','2017-04-22 10:00:00'),
(4,'Completed','2017-04-22 15:00:00')
select --top 1
s.ID
,datediff(minute,s.Date,e.EndDate) as TimeDifference
from @TableA s
inner join(
select
ID
,max(Date) as EndDate
from @TableA
where Status = 'Completed'
group by ID) e on e.ID = s.ID
where
s.Status = 'Started'
order by
datediff(minute,s.Date,e.EndDate) desc
RETURNS
+----+----------------+
| ID | TimeDifference |
+----+----------------+
| 4 | 300 |
| 3 | 180 |
| 2 | 120 |
| 1 | 60 |
+----+----------------+
If you know that 'started' will always be the earliest point in time for each ID and the last 'completed' record you are considering will always be the latest point in time for each ID, the following should have good performance for a large number of records:
SELECT TOP 1
id
, DATEDIFF(s, MIN([Date]), MAX([date])) AS Elapsed
FROM @TableA
GROUP BY ID
ORDER BY DATEDIFF(s, MIN([Date]), MAX([date])) DESC
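The approaches above can be compared on the sample data with a minimal sketch in Python and sqlite3. It is an illustration under stated assumptions, not SQL Server code: SQLite has no DATEDIFF, so the sketch converts timestamps to epoch seconds with strftime('%s', ...), and it assumes one Started row per ID and takes the latest Completed row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableA (ID INTEGER, Status TEXT, Date TEXT)")
conn.executemany(
    "INSERT INTO TableA VALUES (?, ?, ?)",
    [
        (1, "Started", "2017-01-01 01:00:00"),
        (1, "Completed", "2017-01-01 02:00:00"),
        (2, "Started", "2017-02-10 03:00:00"),
        (2, "Completed", "2017-02-10 05:00:00"),
        (3, "Started", "2017-03-15 06:00:00"),
        (3, "Completed", "2017-03-15 09:00:00"),
        (4, "Started", "2017-04-22 10:00:00"),
        (4, "Completed", "2017-04-22 15:00:00"),
    ],
)

# Pair each Started row with the latest Completed row of the same ID
# and order by the elapsed time in seconds, longest first.
elapsed = conn.execute("""
    SELECT s.ID,
           CAST(strftime('%s', MAX(c.Date)) AS INTEGER)
             - CAST(strftime('%s', s.Date) AS INTEGER) AS elapsed_seconds
    FROM TableA AS s
    JOIN TableA AS c
      ON c.ID = s.ID AND c.Status = 'Completed'
    WHERE s.Status = 'Started'
    GROUP BY s.ID, s.Date
    ORDER BY elapsed_seconds DESC
""").fetchall()

slowest_id = elapsed[0][0]
print(slowest_id, elapsed)
```

The first row is ID 4 with 18000 seconds (five hours), in line with the 300 minutes shown in the result table above.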

How can we take the sum of each columns in SQL Server without using ;with cte?

How can I compute a running sum in a third column, where each row's value is added to the sum of all previous rows?
For example, for id 1 the sum is 10, for id 2 it is 10 + 50 = 60,
for id 3 it is 60 + 100 = 160, and so on.
It works fine for me with a CTE, but I need to compute the sum without using ;WITH cte.
My current example is shown below:
DECLARE @t TABLE(ColumnA INT, ColumnB VARCHAR(50));
INSERT INTO @t
VALUES (10,'1'), (50,'2'), (100,'3'), (5,'4'), (45,'5');
;WITH cte AS
(
SELECT ColumnB, SUM(ColumnA) asum
FROM @t
GROUP BY ColumnB
), cteRanked AS
(
SELECT asum, ColumnB, ROW_NUMBER() OVER(ORDER BY ColumnB) rownum
FROM cte
)
SELECT
(SELECT SUM(asum)
FROM cteRanked c2
WHERE c2.rownum <= c1.rownum) AS ColumnA,
ColumnB
FROM
cteRanked c1;
One option, which doesn't require explicit analytic functions, would be to use a correlated subquery to calculate the running total:
SELECT
t1.ID,
t1.Currency,
(SELECT SUM(t2.Currency) FROM yourTable t2 WHERE t2.ID <= t1.ID) AS Sum
FROM yourTable t1
It looks like you need a simple running total.
There is an easy and efficient way to calculate a running total in SQL Server 2012 and later. You can use SUM(...) OVER (ORDER BY ...), as in the example below:
Sample data
DECLARE @t TABLE(ColumnA INT, ColumnB VARCHAR(50));
INSERT INTO @t
VALUES (10,'1'), (50,'2'), (100,'3'), (5,'4'), (45,'5');
Query
SELECT
ColumnB
,ColumnA
,SUM(ColumnA) OVER (ORDER BY ColumnB
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS SumColumnA
FROM @t
ORDER BY ColumnB;
Result
+---------+---------+------------+
| ColumnB | ColumnA | SumColumnA |
+---------+---------+------------+
| 1 | 10 | 10 |
| 2 | 50 | 60 |
| 3 | 100 | 160 |
| 4 | 5 | 165 |
| 5 | 45 | 210 |
+---------+---------+------------+
For SQL Server 2008 and below you need to use either correlated sub-queries as you do already or a simple cursor, which may be faster if the table is large.
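To see that the windowed sum and the correlated subquery produce the same running total, here is a minimal sketch in Python with sqlite3 (SQLite 3.25 or later assumed for the window function). The table name t is a stand-in for the table variable in the answers; the SQL inside is otherwise the same shape as the queries above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (ColumnA INTEGER, ColumnB TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(10, "1"), (50, "2"), (100, "3"), (5, "4"), (45, "5")],
)

# Running total with a window function (SQL Server 2012+ style).
windowed = [row[0] for row in conn.execute("""
    SELECT SUM(ColumnA) OVER (ORDER BY ColumnB
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
    FROM t
    ORDER BY ColumnB
""")]

# Running total with a correlated subquery (no CTE, no window function).
correlated = [row[0] for row in conn.execute("""
    SELECT (SELECT SUM(t2.ColumnA) FROM t AS t2
            WHERE t2.ColumnB <= t1.ColumnB)
    FROM t AS t1
    ORDER BY t1.ColumnB
""")]

print(windowed)
print(correlated)
```

Both lists come out as 10, 60, 160, 165, 210. The correlated version rescans the table for every row, which is why the window function is usually preferred where it is available.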

Count number of days in a year with a record

I have a SQL Server table named AgentLog in which I store for each agent his daily number of sales.
+-----------+------------+-------------+
| AgentName | Date | SalesNumber |
+-----------+------------+-------------+
| John | 01.01.2014 | 45 |
| Terry | 01.01.2014 | 30 |
| John | 02.01.2014 | 20 |
| Terry | 02.01.2014 | 15 |
| Terry | 03.01.2014 | 52 |
| Terry | 04.01.2014 | 24 |
| Terry | 05.01.2014 | 12 |
| Terry | 06.01.2014 | 10 |
| Terry | 07.01.2014 | 23 |
| John | 08.01.2014 | 48 |
| Terry | 08.01.2014 | 35 |
| John | 09.01.2014 | 37 |
| Terry | 10.01.2014 | 35 |
+-----------+------------+-------------+
If an agent doesn't work on one particular day, there is no record of his sales on that date.
I want to generate a report(query) on a given date interval (ex: 01.01.2014 - 10.01.2014) that counts on how many days an agent wasn't present for work (ex: John - 6 days), was at work (John - 4 days) and also returns the date interval it wasn't present (ex: John 03.01.2014 - 07.01.2014, 10.01.2014) (there can be multiple intervals).
You need to create a custom table and populate it with a record for each date you want in your range (Feel free to go as far back in the past and forward into the future as you feel you may need.). You could do this in Excel very easily and import it.
Select *
from Custom.DateListTable dlt
left outer join agentlog ag
on dlt.Date = ag.Date
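The calendar-table idea can be sketched without importing dates from Excel. The following minimal example in Python with sqlite3 builds the date list with a recursive CTE (an assumption of this sketch, in place of the Custom.DateListTable above), cross joins it with the distinct agents, and left joins the log to count worked and missed days per agent. Dates are stored as ISO strings (YYYY-MM-DD) rather than the dd.mm.yyyy display format of the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE AgentLog (AgentName TEXT, Date TEXT, SalesNumber INTEGER)"
)
conn.executemany(
    "INSERT INTO AgentLog VALUES (?, ?, ?)",
    [
        ("John", "2014-01-01", 45), ("Terry", "2014-01-01", 30),
        ("John", "2014-01-02", 20), ("Terry", "2014-01-02", 15),
        ("Terry", "2014-01-03", 52), ("Terry", "2014-01-04", 24),
        ("Terry", "2014-01-05", 12), ("Terry", "2014-01-06", 10),
        ("Terry", "2014-01-07", 23), ("John", "2014-01-08", 48),
        ("Terry", "2014-01-08", 35), ("John", "2014-01-09", 37),
        ("Terry", "2014-01-10", 35),
    ],
)

attendance = conn.execute("""
    WITH RECURSIVE calendar(d) AS (
        -- One row per day of the reporting interval, both ends included.
        SELECT '2014-01-01'
        UNION ALL
        SELECT date(d, '+1 day') FROM calendar WHERE d < '2014-01-10'
    ),
    agents AS (SELECT DISTINCT AgentName FROM AgentLog)
    SELECT a.AgentName,
           COUNT(l.Date) AS days_worked,              -- matched days only
           COUNT(*) - COUNT(l.Date) AS days_missed    -- calendar days with no log row
    FROM agents AS a
    CROSS JOIN calendar AS c
    LEFT JOIN AgentLog AS l
      ON l.AgentName = a.AgentName AND l.Date = c.d
    GROUP BY a.AgentName
    ORDER BY a.AgentName
""").fetchall()

for row in attendance:
    print(row)
```

For the sample data this reports John with 4 days worked and 6 missed, matching the figures in the question.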
I would approach this by getting the number of dates in the interval, as well as the number of dates the agent was at work; you then have everything you need.
To get the number of days in the interval you can use DATEDIFF, adding 1 so that both the first and the last day are counted:
SELECT DATEDIFF(day, '2014-01-01', '2014-01-10') + 1 AS totalDays;
To get the number of days an agent worked, you can use the COUNT(*) aggregate function (this assumes one row per agent per day):
SELECT agentName, COUNT(*) AS daysWorked
FROM myTable
GROUP BY agentName;
Then, you can just add to that query to get the days not worked by subtracting totalDays - daysWorked:
SELECT agentName, COUNT(*) AS daysWorked, (DATEDIFF(day, '2014-01-01', '2014-01-10') + 1 - COUNT(*)) AS daysMissed
FROM myTable
GROUP BY agentName;
Here is an SQL Fiddle example.
The only way I can think of to resolve this is to create a temporary table with only one column (datetime) and store in it all the dates from the selected range. You can create a stored procedure that fills that temporary table, using a cursor, with all the dates from the interval. Then do a LEFT JOIN between the temporary table and your table to look for null values in your table (the days when that person didn't come to work).
Try this...
SET DATEFIRST 1; --Monday
DECLARE @StartDate DATETIME = '2014-01-01',
@EndDate DATETIME = '2014-01-10';
WITH data as (
select 0 as i, DATEADD(DAY, 0, @StartDate) as TheDate
union all
select i + 1, DATEADD(DAY, i + 1, @StartDate) as TheDate
from data
where i < DATEDIFF(DAY, @StartDate, @EndDate)
)
SELECT a.AgentName,
SUM(CASE WHEN c.Date IS NULL THEN 1 ELSE 0 END) AS Missing,
SUM(CASE WHEN c.Date IS NOT NULL THEN 1 ELSE 0 END) AS Working
FROM Agent a
JOIN data b ON NOT EXISTS(SELECT NULL FROM SpecialDate s WHERE s.date = b.TheDate)
LEFT JOIN AgentLog c ON
c.AgentName = a.AgentName
AND c.Date = b.TheDate
WHERE DATEPART(weekday, b.TheDate) <= 5
GROUP BY a.AgentName
OPTION (MAXRECURSION 10000);
It includes a check for weekends, as well as a reference to "SpecialDate" where a list of non working days can be maintained, and excluded from the check.
Reading your question again, I realise that this will only solve half your problem.
NOTE: The following answer mainly addresses the trickiest part of the question, which is how to obtain "absence from work" intervals.
Given these values as Interval Start - End dates:
DECLARE @IntervalStart DATE = '2013-12-30'
DECLARE @IntervalEnd DATE = '2014-01-10'
the following query gives you the "absence from work" intervals:
SELECT AgentName,
DATEADD(d, 1, t.[Date]) As OffWorkStart,
DATEADD(d, -1, t.NextDate) As OffWorkEnd
FROM (
SELECT AgentName, [Date], LEAD([Date]) OVER (PARTITION BY AgentName ORDER BY [Date] ASC) As NextDate,
DATEDIFF(DAY, [Date], LEAD([Date]) OVER (PARTITION BY AgentName ORDER BY [Date] ASC)) As NextMinusCurrent
FROM #AgentLog) t
WHERE t.NextMinusCurrent > 1
-- Get marginal beginning interval (in case such an interval exists)
UNION ALL
SELECT AgentName, @IntervalStart AS OffWorkStart, DATEADD(DAY, -1, MIN([Date])) AS OffWorkEnd
FROM #AgentLog
GROUP BY AgentName
HAVING MIN([Date]) > @IntervalStart
-- Get marginal ending interval (in case such an interval exists)
UNION ALL
SELECT AgentName, DATEADD(DAY, 1, MAX([Date])) AS OffWorkStart, @IntervalEnd
FROM #AgentLog
GROUP BY AgentName
HAVING MAX([Date]) < @IntervalEnd
ORDER By AgentName, OffWorkStart
With the input data you supplied, the above query gives you the following output:
AgentName OffWorkStart OffWorkEnd
---------------------------------------
John 2013-12-30 2013-12-31
John 2014-01-03 2014-01-07
John 2014-01-10 2014-01-10
Terry 2013-12-30 2013-12-31
Terry 2014-01-09 2014-01-09
The idea behind the basic part of the query is to employ the following nested query:
SELECT AgentName,
[Date],
LEAD([Date]) OVER (PARTITION BY AgentName ORDER BY [Date] ASC) As NextDate,
DATEDIFF(DAY, [Date], LEAD([Date]) OVER (PARTITION BY AgentName ORDER BY [Date] ASC)) As NextMinusCurrent
FROM #AgentLog
in order to get any existing gaps between the days a certain agent is present for work. A value of NextMinusCurrent > 1 indicates such a gap.
Counting days is trivial once you have the above query in place. E.g., placing the above query in a CTE, you can count the total number of absence days with something like:
;WITH cte AS (
... query goes here
)
SELECT AgentName, SUM(DATEDIFF(DAY, OffWorkStart, OffWorkEnd) + 1) AS AbsenceDays
FROM cte
GROUP By AgentName
P.S. The above query makes use of SQL Server LEAD function, which is available from SQL SERVER 2012 onwards.
SQL Fiddle here
EDIT:
CTEs together with ROW_NUMBER() can be used to simulate LEAD function. The first part of the query becomes:
;WITH cte1 AS (
SELECT AgentName,
[Date],
ROW_NUMBER() OVER (PARTITION BY AgentName ORDER BY [Date] ASC) As rn
FROM #AgentLog
),
cte2 AS (
SELECT cte1.AgentName, cte1.[Date],
cteLead.[Date] AS NextDate,
DATEDIFF(DAY, cte1.[Date], cteLead.[Date]) As NextMinusCurrent
FROM cte1
LEFT OUTER JOIN cte1 AS cteLead
ON (cte1.rn = cteLead.rn - 1) AND (cte1.AgentName = cteLead.AgentName)
)
SELECT AgentName,
DATEADD(d, 1, cte2.[Date]) As OffWorkStart,
DATEADD(d, -1, cte2.NextDate) As OffWorkEnd
FROM cte2
WHERE NextMinusCurrent > 1
SQL Fiddle for SQL Server 2008 here. I hope it executes in SQL Server 2005 also!
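The core gap-detection step of this answer (interior gaps only, without the marginal intervals at the start and end of the reporting range) can be reproduced with a minimal sketch in Python and sqlite3. It assumes SQLite 3.25 or later for LEAD, stores dates as ISO strings, and uses julianday() and date(..., '+1 day') in place of SQL Server's DATEDIFF and DATEADD.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE AgentLog (AgentName TEXT, Date TEXT)")
john_days = ["01", "02", "08", "09"]
terry_days = ["01", "02", "03", "04", "05", "06", "07", "08", "10"]
conn.executemany(
    "INSERT INTO AgentLog VALUES (?, ?)",
    [("John", f"2014-01-{d}") for d in john_days]
    + [("Terry", f"2014-01-{d}") for d in terry_days],
)

# For each logged day, look at the agent's next logged day; a difference of
# more than one day means the days in between were days off.
gaps = conn.execute("""
    SELECT AgentName,
           date(Date, '+1 day')     AS OffWorkStart,
           date(NextDate, '-1 day') AS OffWorkEnd
    FROM (
        SELECT AgentName, Date,
               LEAD(Date) OVER (PARTITION BY AgentName ORDER BY Date) AS NextDate
        FROM AgentLog
    )
    WHERE julianday(NextDate) - julianday(Date) > 1
    ORDER BY AgentName, OffWorkStart
""").fetchall()

for row in gaps:
    print(row)
```

This yields John's absence from 2014-01-03 to 2014-01-07 and Terry's single day off on 2014-01-09, the same interior intervals as in the output table of the answer. The last logged day of each agent has a NULL NextDate and drops out of the WHERE clause, which is why the answer adds the marginal intervals with UNION ALL.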

Resources