I have people that do many multi-day assignments (date x to date Y). I would like to find the date that they completed a milestone e.g. 50 days work completed.
Data is stored as a single row per Assignment
AssignmentId
StartDate
EndDate
I can sum up the total days they have completed up to a date, but am struggling to see how I would find out the date that a milestone was hit. e.g. How many people completed 50 days in October 2020 showing the date within the month that this occurred?
Thanks in advance
PS. Our database is SQL Server.
As mentioned in previous comments, it would be much easier to help if you provided example data and your table structure.
However, guessing a simple DB structure with a table for your people, your tasks and the work each user completed, you can get the required sum of days by using a date table (or CTE) that contains an entry for each day, together with the window function SUM with UNBOUNDED PRECEDING. An example follows:
DECLARE @people TABLE(
id int
,name nvarchar(50)
)
DECLARE @tasks TABLE(
id int
,name nvarchar(50)
)
DECLARE @work TABLE(
people_id int
,task_id int
,task_StartDate date
,task_EndDate date
)
INSERT INTO @people VALUES (1, 'Peter'), (2, 'Paul'), (3, 'Mary');
INSERT INTO @tasks VALUES (1, 'Development'), (2, 'QA'), (3, 'Sales');
INSERT INTO @work VALUES
(1, 1, '2019-04-05', '2019-04-08')
,(1, 1, '2019-05-05', '2019-06-08')
,(1, 1, '2019-07-05', '2019-09-08')
,(2, 2, '2019-04-08', '2019-06-08')
,(2, 2, '2019-09-08', '2019-10-03')
,(3, 1, '2019-11-01', '2019-12-01')
-- one row per calendar day in 2019
;WITH cte AS(
SELECT CAST('2019-01-01' AS DATE) AS dateday
UNION ALL
SELECT DATEADD(d, 1, dateday)
FROM cte
WHERE DATEADD(d, 1, dateday) < '2020-01-01'
),
-- one row per person per worked day
cteWorkDays AS(
SELECT people_id, task_id, dateday, 1 AS cnt
FROM @work w
INNER JOIN cte c ON c.dateday BETWEEN w.task_StartDate AND w.task_EndDate
),
-- running count of worked days per person
ctePeopleWorkdays AS(
SELECT *, SUM(cnt) OVER (PARTITION BY people_id ORDER BY dateday ROWS UNBOUNDED PRECEDING) dayCnt
FROM cteWorkDays
)
SELECT *
FROM ctePeopleWorkdays
WHERE dayCnt = 50
OPTION (MAXRECURSION 0)
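If expanding every assignment into one row per day is too heavy, the milestone date can also be derived arithmetically from a running total of assignment lengths. The following is only a sketch under stated assumptions: the table variable @Assignment and its PersonId column are hypothetical (the question lists only AssignmentId, StartDate and EndDate), both dates are counted as worked days, and a person's assignments never overlap.

```sql
-- Hypothetical stand-in for the question's assignment table; PersonId is an assumed column.
DECLARE @Assignment TABLE (AssignmentId int, PersonId int, StartDate date, EndDate date);
INSERT INTO @Assignment VALUES
 (1, 1, '2020-08-01', '2020-08-30'),   -- 30 days, inclusive
 (2, 1, '2020-10-01', '2020-10-31');   -- 31 days, inclusive

DECLARE @Milestone int = 50;

WITH totals AS (
    SELECT PersonId,
           StartDate,
           DATEDIFF(day, StartDate, EndDate) + 1 AS Days,
           -- days completed up to and including this assignment
           SUM(DATEDIFF(day, StartDate, EndDate) + 1)
               OVER (PARTITION BY PersonId ORDER BY StartDate
                     ROWS UNBOUNDED PRECEDING) AS RunningDays
    FROM @Assignment
)
SELECT PersonId,
       -- step into the assignment by the number of days still missing before it began
       DATEADD(day, @Milestone - (RunningDays - Days) - 1, StartDate) AS MilestoneDate
FROM totals
WHERE RunningDays >= @Milestone           -- milestone reached by the end of this assignment
  AND RunningDays - Days < @Milestone;    -- but not before it started
```

With the sample rows, the first assignment contributes 30 days, so day 50 is the 20th day of the second assignment, 2020-10-20. To restrict the result to a given month, wrap the final SELECT in another CTE and filter on MilestoneDate.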
The solution depends on how you store your data. The solution below assumes that each worked day exists as a single row in your data model.
The approach below uses a common table expression (cte) to generate a running total (Total) for each person (PersonId) and then filters on the milestone target (I set it to 5 to reduce the sample data size) and target month.
Sample data
create table WorkedDays
(
PersonId int,
TaskDate date
);
insert into WorkedDays (PersonId, TaskDate) values
(100, '2020-09-01'),
(100, '2020-09-02'),
(100, '2020-09-03'),
(100, '2020-09-04'),
(100, '2020-09-05'), -- person 100 worked 5 days by 2020-09-05 = milestone (in september)
(200, '2020-09-29'),
(200, '2020-09-30'),
(200, '2020-10-01'),
(200, '2020-10-02'),
(200, '2020-10-03'), -- person 200 worked 5 days by 2020-10-03 = milestone (in october)
(200, '2020-10-04'),
(200, '2020-10-05'),
(200, '2020-10-06'),
(300, '2020-10-10'),
(300, '2020-10-11'),
(300, '2020-10-12'),
(300, '2020-10-13'),
(300, '2020-10-14'), -- person 300 worked 5 days by 2020-10-14 = milestone (in october)
(300, '2020-10-15'),
(400, '2020-10-20'),
(400, '2020-10-21'); -- person 400 did not reach the milestone yet
Solution
with cte as
(
select wd.PersonId,
wd.TaskDate,
count(1) over(partition by wd.PersonId
order by wd.TaskDate
rows between unbounded preceding and current row) as Total
from WorkedDays wd
)
select cte.PersonId,
cte.TaskDate as MilestoneDate
from cte
where cte.Total = 5 -- milestone reached
and year(cte.TaskDate) = 2020
and month(cte.TaskDate) = 10; -- in october
Result
PersonId MilestoneDate
-------- -------------
200 2020-10-03
300 2020-10-14
Fiddle (also shows the common table expression output).
In Snowflake, I have a datetime stamp: '2022-07-18 08:00:00'.
How do I separate the day from the time? I want to group by the day, but can't because of the time.
Thank you.
In Snowflake you can use the DAY or DATE functions, e.g.
create a test table
create or replace table table_with_dates (ID number, DATE timestamp);
insert values
insert into table_with_dates values (1, '2022-07-18 08:00:00'),
(2, '2022-07-18 08:00:00'),
(3, '2022-07-18 08:00:00'),
(1, '2022-07-19 08:00:00'),
(2, '2022-07-19 08:00:00'),
(1, '2022-07-20 08:00:00'),
(2, '2022-07-20 08:00:00'),
(1, '2022-07-21 08:00:00');
select the data grouping by the DATE part
select date(DATE), count(*) from table_with_dates
group by date(DATE);
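select the data grouping by the DAY part (note that DAY returns only the day of the month, so the same day number in different months lands in the same group)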
select DAY(DATE), count(*) from table_with_dates
group by DAY(DATE);
So the simplest way is to cast to DATE.
Some data in a CTE to work against:
with data(timestamp) as (
select column1::timestamp
from values
('2022-07-18 08:00:00'),
('2022-07-18 08:00:00'),
('2022-07-18 08:00:00'),
('2022-07-19 08:00:00'),
('2022-07-19 08:00:00'),
('2022-07-20 08:00:00'),
('2022-07-20 08:00:00'),
('2022-07-21 08:00:00')
)
select
d.timestamp::date as date
,count(*) as count
from data as d
group by 1
order by 1;
gives:
DATE       COUNT
---------- -----
2022-07-18 3
2022-07-19 2
2022-07-20 2
2022-07-21 1
DATE_TRUNC with DAY gives the same results, but is a little more verbose.
select
date_trunc('day', d.timestamp::date) as date
,count(*) as count
from data as d
group by 1
order by 1;
DATE       COUNT
---------- -----
2022-07-18 3
2022-07-19 2
2022-07-20 2
2022-07-21 1
My query gathers a technician's daily sales data.
select
SUM(O.SUB_TOTAL) AS TOTALSALES,
COUNT(DISTINCT O.ORDER_NO) AS BILLABLEORDERS
FROM ordhdr o
INNER JOIN schedule s ON s.ID_VAL = o.ORDER_NO
WHERE
s.DATE = Convert(varchar(10), GETDATE()-1,121)
AND O.[TYPE] = 'SVC'
However, I also want to get weekly cumulative sales to know whether he is on track for his weekly numbers, but I am struggling to transform the query.
This has to reset each Sunday or Monday, so I cannot use a CurrentDate-7 function.
I don't know how to look only at the current week's data in SQL Server.
Look into the DATEPART function. You can use it to identify which week of the year a given date falls in. With the default (US English) DATEFIRST setting, the week turns over on Sunday. For instance:
datepart(week, '2019-07-06') -- Saturday, returns 27
datepart(week, '2019-07-07') -- Sunday, returns 28
That alone should get you going. However, you can throw in a few more techniques to get all the information in one resultset.
Consider the following ordhdr table:
declare @ordhdr table (
order_no int,
sub_total decimal(8,2),
type varchar(15)
);
insert @ordhdr values
(1, 23.25, 'svc'),
(2, 324.23, 'svc'),
(3, 423.89, 'svc'),
(4, 324.80, 'svc'),
(5, 234.23, 'svc'),
(6, 923.23, 'svc');
... and the following schedule table:
declare @schedule table (id_val int, date date);
insert @schedule values
(1, '2019-07-04'),
(2, '2019-07-04'),
(3, '2019-07-08'),
(4, '2019-07-09'),
(5, '2019-07-09'),
(6, '2019-07-10');
Well, using datepart, datename, cross apply, and grouping sets, you can do this:
select ap.year,
ap.weekOfYear,
dayOfWeek =
case
when ap.weekOfYear is null then '<entire year>'
when ap.dayOfWeek is null then '<entire week>'
else ap.dayOfWeek
end,
s.date,
totalsales = sum(o.sub_total),
billableorders = count(distinct o.order_no)
from @ordhdr o
join @schedule s on s.id_val = o.order_no
cross apply (select
year = datepart(year, s.date),
weekOfYear = datepart(week, s.date),
dayOfWeek = datename(weekday, s.date)
) ap
where o.type = 'svc'
group by grouping sets (
(ap.year, ap.weekOfYear, ap.dayOfWeek, s.date),
(ap.year, ap.weekOfYear),
(ap.year)
)
order by weekOfYear, date
Which will give you daily, weekly, and yearly totals.
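For just the week-to-date figure the question asks about, one option is to compute the start of the current week and filter on it. This is a minimal sketch, assuming weeks start on Monday and that "week to date" means everything up to and including yesterday; the fixed @today value stands in for CAST(GETDATE() AS date) so the sample rows match. It relies on 1900-01-01 having been a Monday, which keeps the result independent of the DATEFIRST setting.

```sql
-- Trimmed copies of the sample tables above.
DECLARE @ordhdr TABLE (order_no int, sub_total decimal(8,2), type varchar(15));
DECLARE @schedule TABLE (id_val int, date date);
INSERT @ordhdr VALUES (1, 23.25, 'svc'), (2, 324.23, 'svc'), (3, 423.89, 'svc'), (4, 324.80, 'svc');
INSERT @schedule VALUES (1, '2019-07-04'), (2, '2019-07-04'), (3, '2019-07-08'), (4, '2019-07-09');

DECLARE @today date = '2019-07-10';   -- assumption: replace with CAST(GETDATE() AS date)

-- 1900-01-01 was a Monday, so the day difference modulo 7 is the offset from Monday.
DECLARE @weekStart date = DATEADD(day, -(DATEDIFF(day, '19000101', @today) % 7), @today);

SELECT SUM(o.sub_total)           AS weekToDateSales,
       COUNT(DISTINCT o.order_no) AS billableOrders
FROM @ordhdr o
JOIN @schedule s ON s.id_val = o.order_no
WHERE o.type = 'svc'
  AND s.date >= @weekStart   -- Monday of the current week (2019-07-08 here)
  AND s.date <  @today;      -- up to and including yesterday
```

For a Sunday week start, shift the anchor date by one day ('18991231' was a Sunday).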
This is not a homework question.
I'm trying to take the count of t-shirts in an order and see which price range the shirts fall into, depending on how many have been ordered.
My initial thought (I am brand new at this) was to ask another table if count > 1st price range's maximum, and if so, keep looking until it's not.
printing_range_max printing_price_by_range
15 4
24 3
33 2
So for example here, if the order count is 30 shirts they would be $2 each.
When I'm looking into how to do that, it looks like most people are using BETWEEN or IF and hard-coding the ranges instead of looking in another table. I imagine in a business setting it's best to be able to leave the range in its own table so it can be changed more easily. Is there a good/built-in way to do this or should I just write it in with a BETWEEN command or IF statements?
EDIT:
SQL Server 2014
Let's say we have this table:
DECLARE @priceRanges TABLE(printing_range_max tinyint, printing_price_by_range tinyint);
INSERT @priceRanges VALUES (15, 4), (24, 3), (33, 2);
You can create a table with ranges that represent the correct price. Below is how you would do this in pre-2012 and post-2012 systems:
DECLARE @priceRanges TABLE(printing_range_max tinyint, printing_price_by_range tinyint);
INSERT @priceRanges VALUES (15, 4), (24, 3), (33, 2);
-- post-2012 using LAG
WITH pricerange AS
(
SELECT
printing_range_min = LAG(printing_range_max, 1, 0) OVER (ORDER BY printing_range_max),
printing_range_max,
printing_price_by_range
FROM @priceRanges
)
SELECT * FROM pricerange;
-- pre-2012 using ROW_NUMBER and a self-join
WITH prices AS
(
SELECT
rn = ROW_NUMBER() OVER (ORDER BY printing_range_max),
printing_range_max,
printing_price_by_range
FROM @priceRanges
),
pricerange AS
(
SELECT
printing_range_min = ISNULL(p2.printing_range_max, 0),
printing_range_max = p1.printing_range_max,
p1.printing_price_by_range
FROM prices p1
LEFT JOIN prices p2 ON p1.rn = p2.rn+1
)
SELECT * FROM pricerange;
Both queries return:
printing_range_min printing_range_max printing_price_by_range
------------------ ------------------ -----------------------
0 15 4
15 24 3
24 33 2
Now that you have that you can use BETWEEN for your join. Here's the full solution:
-- Sample data
DECLARE @priceRanges TABLE
(
printing_range_max tinyint,
printing_price_by_range tinyint
-- if you're on 2014+
,INDEX ix_xxx NONCLUSTERED(printing_range_max, printing_price_by_range)
-- note: second column should be an INCLUDE but not supported in table variables
);
DECLARE @orders TABLE
(
orderid int identity,
ordercount int
-- if you're on 2014+
,INDEX ix_xxy NONCLUSTERED(orderid, ordercount)
-- note: second column should be an INCLUDE but not supported in table variables
);
INSERT @priceRanges VALUES (15, 4), (24, 3), (33, 2);
INSERT @orders(ordercount) VALUES (10), (20), (25), (30);
-- Solution:
WITH pricerange AS
(
SELECT
printing_range_min = LAG(printing_range_max, 1, 0) OVER (ORDER BY printing_range_max),
printing_range_max,
printing_price_by_range
FROM @priceRanges
)
SELECT
o.orderid,
o.ordercount,
--p.printing_range_min,
--p.printing_range_max
p.printing_price_by_range
FROM pricerange p
JOIN @orders o ON o.ordercount BETWEEN printing_range_min AND printing_range_max
Results:
orderid ordercount printing_price_by_range
----------- ----------- -----------------------
1 10 4
2 20 3
3 25 2
4 30 2
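One caveat with the BETWEEN join: because each range's minimum equals the previous range's maximum, an order count that sits exactly on a boundary (15 or 24) matches two ranges and comes back twice. A possible variation, sketched here with the same table and column names as above, treats each range as "greater than min, up to and including max":

```sql
DECLARE @priceRanges TABLE (printing_range_max tinyint, printing_price_by_range tinyint);
DECLARE @orders TABLE (orderid int identity, ordercount int);
INSERT @priceRanges VALUES (15, 4), (24, 3), (33, 2);
INSERT @orders(ordercount) VALUES (15), (16), (24);   -- boundary values on purpose

WITH pricerange AS
(
    SELECT
        printing_range_min = LAG(printing_range_max, 1, 0) OVER (ORDER BY printing_range_max),
        printing_range_max,
        printing_price_by_range
    FROM @priceRanges
)
SELECT o.orderid, o.ordercount, p.printing_price_by_range
FROM @orders o
JOIN pricerange p
  ON o.ordercount >  p.printing_range_min    -- exclusive lower bound
 AND o.ordercount <= p.printing_range_max;   -- inclusive upper bound
```

Here 15 shirts price at 4, while 16 and 24 shirts price at 3, with exactly one row per order. An order count of 0 or above 33 matches no range; use a LEFT JOIN if such orders should still appear.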
Now that we have that we can
I have a list of data :
Id StartAge EndAge Amount
1 0 2 50
2 2 5 100
3 5 10 150
4 6 9 160
I have to set Amount for various age group.
The age group >0 and <=2 need to pay 50
The age group >2 and <=5 need to pay 100
The age group >5 and <=10 need to pay 150
But
The age group >6 and <=9 paying 160 is invalid input, because that range already falls within the 150 amount range (>5 and <=10).
I have to validate this kind of invalid input before bulk inserting my data. Once the 5-10 range is inserted, anything within this range should not be accepted by the system. For example, in the list above, the user should be allowed to insert a 10-15 age group, but any of the following should be flagged as invalid:
6-9
6-11
3-5
5-7
If invalid input exists in my list, I don't need to insert the list.
You could try inserting your data into a temporary table (here, a table variable) first.
DECLARE @TempData TABLE
(
[Id] TINYINT
,[StartAge] TINYINT
,[EndAge] TINYINT
,[Amount] TINYINT
);
INSERT INTO @TempData ([Id], [StartAge], [EndAge], [Amount])
VALUES (1, 0, 2, 50)
,(2, 2, 5, 100)
,(3, 5, 10, 150)
,(4, 6, 9, 160);
Then, this data will be transferred to your target table using an INSERT INTO ... SELECT ... statement, skipping any row that overlaps an earlier row.
INSERT INTO <your target table>
SELECT * FROM @TempData s
WHERE
NOT EXISTS (
SELECT 1
FROM @TempData t
WHERE
t.[Id] < s.[Id]
AND s.[StartAge] < t.[EndAge]
AND s.[EndAge] > t.[StartAge]
);
I've created a demo here
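The query above inserts the valid rows and silently skips the overlapping ones. Since the question says the whole list should be rejected when any row is invalid, the same overlap test can be used as an all-or-nothing guard. This is a sketch only; the PRINT statements are placeholders for whatever insert or error handling fits the target system.

```sql
DECLARE @TempData TABLE ([Id] tinyint, [StartAge] tinyint, [EndAge] tinyint, [Amount] tinyint);
INSERT INTO @TempData VALUES (1, 0, 2, 50), (2, 2, 5, 100), (3, 5, 10, 150), (4, 6, 9, 160);

-- Two ranges overlap when each one starts before the other ends.
IF EXISTS (
    SELECT 1
    FROM @TempData a
    JOIN @TempData b
      ON a.[Id] < b.[Id]                -- compare each pair once
     AND a.[StartAge] < b.[EndAge]
     AND b.[StartAge] < a.[EndAge]
)
    PRINT 'Overlapping age ranges found; nothing inserted.';
ELSE
    PRINT 'No overlaps; safe to run the INSERT INTO ... SELECT here.';
```

With the sample list, row 4 (6-9) overlaps row 3 (5-10), so the first branch runs.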
We can use a recursive CTE to find how records are chained by end age and start age pairs:
DECLARE @DataSource TABLE
(
[Id] TINYINT
,[StartAge] TINYINT
,[EndAge] TINYINT
,[Amount] TINYINT
);
INSERT INTO @DataSource ([Id], [StartAge], [EndAge], [Amount])
VALUES (1, 0, 2, 50)
,(2, 2, 5, 100)
,(3, 5, 10, 150)
,(4, 6, 9, 160)
,(5, 6, 11, 160)
,(6, 3, 5, 160)
,(7, 5, 7, 160)
,(9, 10, 15, 20)
,(8, 7, 15, 20);
WITH PreDataSource AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY [StartAge] ORDER BY [id]) as [pos]
FROM @DataSource
), DataSource AS
(
SELECT [Id], [StartAge], [EndAge], [Amount], [pos]
FROM PreDataSource
WHERE [id] = 1
UNION ALL
SELECT R.[Id], R.[StartAge], R.[EndAge], R.[Amount], R.[pos]
FROM DataSource A
INNER JOIN PreDataSource R
ON A.[Id] < R.[Id]
AND A.[EndAge] = R.[StartAge]
AND R.[pos] =1
)
SELECT [Id], [StartAge], [EndAge], [Amount]
FROM DataSource;
This gives us the following output:
Note that before this, we use the following statement to prepare the data:
SELECT *, ROW_NUMBER() OVER (PARTITION BY [StartAge] ORDER BY [id]) as [pos]
FROM @DataSource;
The idea is to find records with the same start age and to calculate which one was inserted first. Then, in the CTE, we take only the first.
Assuming you are bulk inserting the mentioned data into a temp table (#tmp) or a table variable (@tmp).
If you are working on SQL Server 2012 or later, try the query below.
select *
from(select *,lag(endage,1,0)over(order by endage) as [col1]
from #tmp)tmp
where startage>=col1 and endage>col1
The result of this query should be inserted into your main table.
I have a database that stores data from the stock market.
There is a table with 3 columns: stockId, date, and volume
New data will be inserted into the table every trading day.
How can I get a result like 'Average volume for each stock over the last 10 trading days'?
SELECT AVG(volume) FROM mytable WHERE date >= (CURDATE() - INTERVAL 10 DAY)
EDIT:
Last 10 day groups, and their averages.
SELECT AVG(volume) FROM mytable GROUP BY date ORDER BY date DESC LIMIT 10
http://sqlfiddle.com/#!6/c8dbb/4
CREATE TABLE Stocks
([StockId] int, [Date] DateTime, [Volume] int)
;
INSERT INTO Stocks
([StockId], [Date], [Volume])
VALUES
(1, GetDate(), 1000),
(1, GetDate()+1, 2000),
(1, GetDate()+2, 4000),
(2, GetDate(), 1000),
(2, GetDate()+1, 1000),
(2, GetDate()+2, 500)
;
Select StockId, AVG(Volume) [AverageVolume]
FROM Stocks
WHERE [Date] >= DATEADD(dd, 0, DATEDIFF(dd, 0, GetDate())) - 10
Group by StockId
Order by StockId
SELECT SUM(volume)/10 FROM table_name
WHERE date BETWEEN CAST('7/08/13 12:01:01' AS DateTime) AND CAST('7/18/13 12:01:01' AS DateTime) -- BETWEEN needs the earlier bound first
I'm basing this on Dodecapus's answer and on the comments you've left on other answers. I'm including just the query here, but check out the sqlfiddle for a working example with data.
http://sqlfiddle.com/#!6/91599/2
SELECT
StockId
,AVG(Volume) [AverageVolume]
FROM Stocks
WHERE [Date] IN
(
SELECT DISTINCT TOP 10 [Date] FROM Stocks ORDER BY [Date] DESC
)
GROUP BY StockId
ORDER BY StockId
This will only work if there is a record of at least one stock with volume per trading day.
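If each stock should be averaged over its own last 10 recorded days (for example, when some stocks have no row on a given trading day), ranking the rows per stock is one alternative. This is a sketch under assumptions: the table variable @Stocks and its fixed sample dates are illustrative stand-ins for the Stocks table above, with at most one row per stock per day.

```sql
DECLARE @Stocks TABLE (StockId int, [Date] date, Volume int);
INSERT INTO @Stocks VALUES
 (1, '2013-07-15', 1000), (1, '2013-07-16', 2000), (1, '2013-07-17', 4000),
 (2, '2013-07-15', 1000), (2, '2013-07-17', 500);

WITH ranked AS (
    SELECT StockId,
           Volume,
           -- 1 = most recent row for this stock
           ROW_NUMBER() OVER (PARTITION BY StockId ORDER BY [Date] DESC) AS rn
    FROM @Stocks
)
SELECT StockId,
       -- cast first: AVG over an int column returns a truncated int in SQL Server
       AVG(CAST(Volume AS decimal(18,2))) AS AverageVolume
FROM ranked
WHERE rn <= 10
GROUP BY StockId
ORDER BY StockId;
```

Stocks with fewer than 10 rows are simply averaged over the rows they have.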
Fun little query to write. Here it is:
SELECT AVG(x.Volume) FROM (SELECT Volume FROM StockTable WHERE Date BETWEEN
DATE_ADD(NOW(), INTERVAL -10 DAY) AND NOW())x
This is what I used to build a sample table to work off of in SQLFiddle:
CREATE TABLE StockTable (ID INT PRIMARY KEY AUTO_INCREMENT NOT NULL, Date DATETIME, Volume INT);
INSERT INTO StockTable (Date, Volume) VALUES (DATE_ADD(NOW(), INTERVAL 12 DAY), 1000), (DATE_ADD(NOW(), INTERVAL 1 DAY), 5000),
(DATE_ADD(NOW(), INTERVAL 0 DAY), 3000), (DATE_ADD(NOW(), INTERVAL -11 DAY), 6000), (DATE_ADD(NOW(), INTERVAL -5 DAY), 4000), (DATE_ADD(NOW(), INTERVAL 7 DAY), 9000);
Here is a link to the SQLFiddle of the query in action.
Idea behind the query: I create a derived table x which contains just the volumes within the past 10 days. Then I calculate the average of the volumes contained in that table. Voilà!
EDIT:
I realized specifically what you are looking for after reading through the other answers and comments. You are looking to get the average for each stock in the stock market over the last 10 days.
I built the sample table off of this:
CREATE TABLE StockTable (StockId INT NOT NULL, Date DATETIME, Volume INT);
INSERT INTO StockTable (StockId, Date, Volume) VALUES (1, DATE_ADD(NOW(), INTERVAL 6
DAY), 1000), (2, DATE_ADD(NOW(), INTERVAL 1 DAY), 5000),
(2, DATE_ADD(NOW(), INTERVAL 0 DAY), 3000), (3, DATE_ADD(NOW(), INTERVAL -8 DAY),
6000), (1, DATE_ADD(NOW(), INTERVAL -5 DAY), 4000),
(2, DATE_ADD(NOW(), INTERVAL 7 DAY), 9000);
And the query to get your results is:
SELECT StockId, AVG(Volume) FROM StockTable WHERE Date BETWEEN DATE_ADD(NOW(),
INTERVAL -10 DAY) AND NOW() GROUP BY StockId
Here is a link to the SQLFiddle of the query in action.