SQL Server query for calculating daily census

I'm having a problem with a query that populates the daily census (number of current inpatients) for a hospital unit. I found the query in a previous post:
SELECT
[date], COUNT (DISTINCT
CASE WHEN admit_date <= [date] AND discharge_date >= [date] THEN id END) AS census
FROM
dbo.patients, dbo.census
GROUP BY
[date]
ORDER BY
[date]
There are 2 tables:
dbo.patients with id, admit_date, and discharge_date columns
dbo.census has a date column with every date since 2017, and a census column, which is blank
The query populates the census column, but toward the most recent dates the census count drops to smaller numbers than it should. For example, 65 patients have a NULL discharge_date, so the census for today's date should be 65, but the query produces a count of 8.

You probably need to account for NULL discharge dates:
SELECT [date], COUNT (DISTINCT
CASE WHEN admit_date <= [date] AND COALESCE(discharge_date, GETDATE()) >= [date] THEN id END)
AS census
FROM dbo.patients
CROSS JOIN dbo.census
GROUP BY [date]
ORDER BY [date]
This treats a patient with no discharge date as still admitted today, and it assumes [date] is a date or datetime value that can be compared with GETDATE(). Also, as per Sean Lange's comment, if you really want a CROSS JOIN, you should write it explicitly in the query.
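To see the effect of the COALESCE on a small data set, here is a minimal sketch that runs the same query shape through Python's built-in sqlite3 module. SQLite only stands in for SQL Server here; the sample patients, the daily_census helper, and the today parameter (used in place of GETDATE()) are assumptions made for illustration.

```python
import sqlite3


def daily_census(patients, dates, today):
    """Return {date: census}, treating a NULL discharge_date as 'still admitted'."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE patients (id INTEGER, admit_date TEXT, discharge_date TEXT)")
    con.execute("CREATE TABLE census ([date] TEXT)")
    con.executemany("INSERT INTO patients VALUES (?, ?, ?)", patients)
    con.executemany("INSERT INTO census VALUES (?)", [(d,) for d in dates])
    rows = con.execute(
        """
        SELECT c.[date],
               COUNT(DISTINCT CASE
                   WHEN p.admit_date <= c.[date]
                    AND COALESCE(p.discharge_date, ?) >= c.[date]
                   THEN p.id END) AS census
        FROM census AS c
        CROSS JOIN patients AS p
        GROUP BY c.[date]
        ORDER BY c.[date]
        """,
        (today,),  # stands in for GETDATE() in the T-SQL version
    ).fetchall()
    con.close()
    return dict(rows)


# Hypothetical sample data: ISO date strings compare correctly as text.
patients = [
    (1, "2024-01-01", "2024-01-01"),  # admitted and discharged the same day
    (2, "2024-01-01", None),          # still admitted
    (3, "2024-01-03", None),          # still admitted
]
result = daily_census(
    patients, ["2024-01-01", "2024-01-02", "2024-01-03"], today="2024-01-03"
)
print(result)
```

Without the COALESCE, patients 2 and 3 would never satisfy the discharge_date comparison, which is the kind of undercount described in the question.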

Related

Split a date column, and calculate an amount based on the result

In the sales table I have a column that contains the full date of each sale. I need to split it and add two separate columns, month and year, to the sales table, plus a column that contains the sum of all sales for each month.
This is the table I have:
Sales table
customer_id date Quantity
----------------------------
123 01/01/2020 6
124 01/02/2020 7
123 01/03/2021 5
123 15/01/2020 4
Here's what I wrote -
ALTER TABLE SALES ADD SELECT DATEPART (year, date) as year FROM SALES;
ALTER TABLE SALES ADD SELECT DATEPART (month, date) as month FROM SALES;
ALTER TABLE SALES ADD SUM_Quantity AS SUM() (Here I was stuck in a query...)
Is this a valid query, and how to write it correctly? Thanks!
One of the problems you're going to have here is that the outcome of the computed columns you have isn't compatible with the data being stored.
For example you can't build a sum of the quantity for all of the rows in January for each row in January. You need to group by the date and aggregate the quantity.
As such I think this might be an ideal candidate for an indexed view. This will allow you to store the calculated data, whilst preserving the data in the original table.
create table SalesTable (customer_id int, date date, quantity smallint not null);
insert SalesTable (customer_id, date, quantity)
select 123, '2020-01-01', 6
union
select 124, '2020-02-01', 7
union
select 123, '2021-03-01', 5
union
select 123, '2020-01-15', 4;
select * from SalesTable order by date; /*original data set*/
GO
CREATE VIEW dbo.vwSalesTable /*indexed view*/
WITH SCHEMABINDING
AS
SELECT customer_id, DATEPART(year, date) as year, DATEPART(month, date) as month,
SUM(quantity) as sum_quantity, /*SUM needs a non-nullable column in an indexed view*/
COUNT_BIG(*) as row_count /*required in an indexed view that uses GROUP BY*/
from dbo.SalesTable
group by customer_id, DATEPART(year, date), DATEPART(month, date)
GO
/*the unique clustered index is what actually materializes the view*/
CREATE UNIQUE CLUSTERED INDEX IX_vwSalesTable ON dbo.vwSalesTable (customer_id, year, month);
GO
select * from vwSalesTable; /*data from your indexed view*/
drop view vwSalesTable;
drop table SalesTable;
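The grouping idea itself can be tried without SQL Server. The following is a rough sketch using Python's sqlite3 module, where strftime takes the place of DATEPART. It groups by year and month only, as the question asks, rather than by customer as the view above does. The monthly_totals helper and the sale_date column name are illustrative assumptions.

```python
import sqlite3


def monthly_totals(sales):
    """Sum quantity per (year, month); sales is a list of (customer_id, iso_date, quantity)."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE SalesTable (customer_id INTEGER, sale_date TEXT, quantity INTEGER)")
    con.executemany("INSERT INTO SalesTable VALUES (?, ?, ?)", sales)
    rows = con.execute(
        """
        SELECT CAST(strftime('%Y', sale_date) AS INTEGER) AS sale_year,
               CAST(strftime('%m', sale_date) AS INTEGER) AS sale_month,
               SUM(quantity) AS sum_quantity
        FROM SalesTable
        GROUP BY strftime('%Y', sale_date), strftime('%m', sale_date)
        ORDER BY sale_year, sale_month
        """
    ).fetchall()
    con.close()
    return rows


# The four rows from the question, with dates written in ISO format.
sales = [
    (123, "2020-01-01", 6),
    (124, "2020-02-01", 7),
    (123, "2021-03-01", 5),
    (123, "2020-01-15", 4),
]
totals = monthly_totals(sales)
print(totals)
```

The result has one row per month rather than one per sale, which is why the monthly sum cannot simply be added as a column on the original table.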

SQL Count number of records that appear more than once a month and aggregate by month

I am going through a ton of records and want to count the number of times an ID is updated more than once a month.
SELECT COUNT(*) AS Frequency, MONTH(Date) AS MM, YEAR(Date) AS YYYY, Id
FROM data
WHERE -- [some filtering]
AND Date <= -- end date
AND Date >= -- start date
GROUP BY Date, Id
HAVING COUNT(*) > 1
ORDER BY MM DESC
What happens now is that I only find the Ids that are updated more than once per day. What I want to do is group the Ids by month.
I have tried using the column alias MM in my GROUP BY, but then I get errors stating that these are invalid column selections.
I tried using the following GROUP BY:
GROUP BY DATEPART(MONTH, Date), Id
All I get is continuous 8120 errors and I cannot figure out how I should put this together. Any help would be greatly appreciated.
First let's fix your query so it does proper aggregation:
SELECT COUNT(*) AS Frequency, MONTH(Date) AS MM, YEAR(Date) AS YYYY, Id
FROM data with(NOLOCK)
WHERE -- [some filtering]
GROUP BY MONTH(Date), YEAR(Date), Id
HAVING COUNT(*) > 1
ORDER BY YYYY, MM DESC
This gives you a list of Ids that were updated more than once each month.
Now, if you want to know how many Ids were updated more than once each month, you can add another level of aggregation:
SELECT MM, YYYY, COUNT(*) AS IdCount
FROM (
SELECT COUNT(*) AS Frequency, MONTH(Date) AS MM, YEAR(Date) AS YYYY, Id
FROM data with(NOLOCK)
WHERE -- [some filtering]
GROUP BY MONTH(Date), YEAR(Date), Id
HAVING COUNT(*) > 1
) x
GROUP BY MM, YYYY
ORDER BY YYYY, MM DESC
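As a quick check of the two-level aggregation, here is a small sketch run through Python's sqlite3 module. SQLite has no MONTH() or YEAR(), so strftime is used instead and the values come back as text. The sample rows and the ids_updated_repeatedly helper are made up for illustration.

```python
import sqlite3


def ids_updated_repeatedly(updates):
    """Count, per (year, month), the ids that appear more than once in that month."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE data (Id INTEGER, [Date] TEXT)")
    con.executemany("INSERT INTO data VALUES (?, ?)", updates)
    rows = con.execute(
        """
        SELECT YYYY, MM, COUNT(*) AS IdCount
        FROM (
            -- inner level: one row per (month, Id) that was updated more than once
            SELECT strftime('%Y', [Date]) AS YYYY,
                   strftime('%m', [Date]) AS MM,
                   Id
            FROM data
            GROUP BY strftime('%Y', [Date]), strftime('%m', [Date]), Id
            HAVING COUNT(*) > 1
        ) AS x
        -- outer level: count those ids per month
        GROUP BY YYYY, MM
        ORDER BY YYYY, MM
        """
    ).fetchall()
    con.close()
    return rows


# Hypothetical updates: ids 1 and 2 repeat in January, id 3 repeats in February.
updates = [
    (1, "2023-01-05"), (1, "2023-01-20"),
    (2, "2023-01-07"), (2, "2023-01-09"),
    (3, "2023-01-11"),
    (1, "2023-02-01"),
    (3, "2023-02-02"), (3, "2023-02-03"),
]
monthly_counts = ids_updated_repeatedly(updates)
print(monthly_counts)
```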

SQL Server return records where the most recent matches the criteria

This is a question using SQL Server. Is there a more elegant way of doing this?
Considering a table mytable (I'm using Unix timestamps, I've converted these to readable dates for ease of reading)
ID Foreign_ID Date
-------------------------
1 1 01-Jul-15
2 2 01-Sep-16
3 3 05-Aug-16
4 2 01-Sep-15
I would like to extract the Foreign_ID values where the most recent record's (highest ID) date is in a range, which in this example is 1 January 2016 to 31 December 2016. The following works if the dates are substituted with timestamps:
select distinct
Foreign_ID
from
mytable l1
where
(select top 1 Date
from mytable l2
where l2.Foreign_ID = l1.Foreign_ID
order by ID desc) >= **1 Jan 2016**
and
(select top 1 Date
from mytable l2
where l2.Foreign_ID = l1.Foreign_ID
order by ID desc) <= **31 Dec 2016**
That should only return Foreign_ID = 3. A simpler query would also return Foreign_ID 2, which would be wrong, as it has a more recent record dated outside the above range.
This will all form part of a bigger query.
Assuming SQL Server 2005+, you can use ROW_NUMBER. Number the rows first and apply the date filter afterwards, so the filter only sees the most recent row for each Foreign_ID:
WITH CTE AS
(
SELECT *,
RN = ROW_NUMBER() OVER( PARTITION BY Foreign_ID
ORDER BY ID DESC)
FROM dbo.YourTable
)
SELECT Foreign_ID
FROM CTE
WHERE RN = 1
AND [Date] >= '01-Jan-2016' -- you need to use the right
AND [Date] <= '31-Dec-2016'; -- date format here
If it's SQL Server 2008+, you can use this:
select foreign_id
from (
select foreign_id, Date, row_number() over (partition by foreign_id order by id desc) as rseq
from myTable
) as x
where rseq = 1 and Date >= value1 and Date <= value2
Just fill in the date values, and you might have to put brackets or quotes around the column named "Date", since it is also a keyword.
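The order of operations matters here: the rows have to be numbered before the date range is applied, otherwise an older in-range row can be mistaken for the most recent one. A minimal sketch with Python's sqlite3 module (window functions need SQLite 3.25 or later) using the four sample rows from the question. The helper name and the ISO date strings are assumptions for illustration.

```python
import sqlite3


def foreign_ids_latest_in_range(rows, start, end):
    """Return Foreign_IDs whose most recent row (highest ID) falls within [start, end]."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE mytable (ID INTEGER, Foreign_ID INTEGER, [Date] TEXT)")
    con.executemany("INSERT INTO mytable VALUES (?, ?, ?)", rows)
    result = con.execute(
        """
        WITH ranked AS (
            -- number every row per Foreign_ID, newest first, with no date filter yet
            SELECT Foreign_ID, [Date],
                   ROW_NUMBER() OVER (PARTITION BY Foreign_ID ORDER BY ID DESC) AS rn
            FROM mytable
        )
        SELECT Foreign_ID
        FROM ranked
        WHERE rn = 1 AND [Date] >= ? AND [Date] <= ?
        ORDER BY Foreign_ID
        """,
        (start, end),
    ).fetchall()
    con.close()
    return [r[0] for r in result]


# Sample rows from the question, with dates written as ISO strings.
rows = [
    (1, 1, "2015-07-01"),
    (2, 2, "2016-09-01"),
    (3, 3, "2016-08-05"),
    (4, 2, "2015-09-01"),
]
matches = foreign_ids_latest_in_range(rows, "2016-01-01", "2016-12-31")
print(matches)  # [3]
```

Foreign_ID 2 is excluded because its newest row (ID 4) is dated 2015, even though an older row falls inside the range.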

SQL Sum over not bounded by where clauses

I am using SUM ... OVER for the first time and currently have the following query:
SELECT Id, Amount, Date, TotalAmount = SUM(Amount) OVER (order by Amount)
FROM Account
WHERE Date >= '2016-03-01' AND Date <= '2016-03-10' AND UserId = 'xyz'
ORDER BY ValutaDate
The TotalAmount should be a running total over the whole table for a specific user (so the SUM ... OVER clause should respect the WHERE clause for the user). On the other hand, I only need a few records and not the whole table, which is why I added the WHERE clause specifying the date range. But now, of course, my sum is calculated only for the range I specified.
What should I do to get just the few records in the date range, but still have the sum calculated over the whole table? Is there a performant way to accomplish this?
Thanks in advance for helping me out.
Break the running total into its own query:
; WITH all_rows_one_user as (SELECT *
, TotalAmount = SUM(Amount) OVER (order by ValutaDate)
FROM Account
WHERE UserId = 'xyz')
SELECT Id, Amount, Date, TotalAmount
FROM all_rows_one_user
WHERE Date >= '2016-03-01' AND Date <= '2016-03-10'
ORDER BY ValutaDate
Same query, different syntax:
SELECT Id, Amount, Date, TotalAmount
FROM (SELECT *
, TotalAmount = SUM(Amount) OVER (order by ValutaDate)
FROM Account
WHERE UserId = 'xyz') AS all_rows_one_user
WHERE Date >= '2016-03-01' AND Date <= '2016-03-10'
ORDER BY ValutaDate
The WHERE clause is applied first so the SUM can't access rows that don't match that.
You can use apply though. Note it will be reading the entire table for that user so might not perform too well without a decent index.
SELECT a.Id, a.Amount, a.Date, ta.TotalAmount
FROM Account a
OUTER APPLY (SELECT SUM(CASE WHEN a2.Date <= a.Date THEN a2.Amount ELSE 0 END) AS TotalAmount FROM Account a2 WHERE a2.UserId = a.UserId) ta
WHERE a.Date >= '2016-03-01' AND a.Date <= '2016-03-10' AND a.UserId = 'xyz'
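The "compute the running total first, filter afterwards" approach can be checked on a few rows. This is a sketch using Python's sqlite3 module (SQLite 3.25 or later for window functions). It orders by a single Date column plus Id as a tie-breaker, which is an assumption since the original query mixes Date and ValutaDate. The sample amounts and the helper name are invented.

```python
import sqlite3


def running_totals_in_range(entries, user_id, start, end):
    """Running total over all of a user's rows, returned only for rows within [start, end]."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE Account (Id INTEGER, UserId TEXT, Amount INTEGER, [Date] TEXT)")
    con.executemany("INSERT INTO Account VALUES (?, ?, ?, ?)", entries)
    rows = con.execute(
        """
        WITH all_rows_one_user AS (
            -- the window sees every row of the user, before any date filter
            SELECT Id, Amount, [Date],
                   SUM(Amount) OVER (ORDER BY [Date], Id) AS TotalAmount
            FROM Account
            WHERE UserId = ?
        )
        SELECT Id, Amount, [Date], TotalAmount
        FROM all_rows_one_user
        WHERE [Date] >= ? AND [Date] <= ?
        ORDER BY [Date], Id
        """,
        (user_id, start, end),
    ).fetchall()
    con.close()
    return rows


# Hypothetical ledger: one row before the range, one for another user, one after the range.
entries = [
    (1, "xyz", 100, "2016-02-20"),
    (2, "xyz", 50, "2016-03-02"),
    (3, "abc", 999, "2016-03-03"),
    (4, "xyz", -30, "2016-03-05"),
    (5, "xyz", 10, "2016-03-20"),
]
totals = running_totals_in_range(entries, "xyz", "2016-03-01", "2016-03-10")
print(totals)
```

The first returned row already carries the 100 from February, which shows that the total was computed over the user's full history.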

SQL Server: Top 10 salespeople per week - and previous rankings

select *
from
(
select year,
week,
salesperson,
count(*) as transactions,
rank() over(partition by week order by count(*) desc) as ranking
from sales
where year = '2010'
group by year,
week,
salesperson
) temp
where ranking <= 10
The query returns a list of the top 10 salespeople (in terms of # of transactions) for each week of the year.
How can I go about adding columns to my results for:
Previous week's ranking for that salesperson
Total weeks in the Top 10 this year
Consecutive weeks in the Top 10 (starting at week 1)
Consecutive weeks in the Top 10 (starting in previous year, if possible)
Can you give any general advice on how to go about these sorts of problems?
PS: Using SQL server 2008
Actually, I'm not convinced that views are the best way to go. You can do this sort of logic in CTEs and combine the entire thing into a single query. For example, here is what I have for everything except the consecutive logic:
;With
SalesDateParts As
(
Select DatePart(wk, SaleDate) As WeekNum, DatePart(yy, SaleDate) As [Year], SalesPersonId
From #Sales
)
, SalesByWeek As
(
Select [Year], WeekNum, SalesPersonId, Count(*) As SaleCount
, RANK() OVER( PARTITION BY [Year], [WeekNum] ORDER BY Count(*) DESC ) As SaleRank
From SalesDateParts
Group By [Year], WeekNum, SalesPersonId
)
, PrevWeekTopSales As
(
Select [Year], [WeekNum], SalesPersonId, SaleCount
From SalesByWeek
Where [Year] = DatePart(yyyy, DateAdd(d, -7, CURRENT_TIMESTAMP))
And WeekNum = DatePart(wk, DateAdd(d, -7, CURRENT_TIMESTAMP))
)
, WeeksInTop10 As
(
Select SalesPersonId, Count(*) As Top10Count
From SalesByWeek
Where SaleRank <= 10
Group By SalesPersonId
)
Select *
From Salespersons
Left Join WeeksInTop10
On WeeksInTop10.SalesPersonId = SalesPersons.SalesPersonId
Left Join PrevWeekTopSales
On PrevWeekTopSales.SalesPersonId = SalesPersons.SalesPersonId
The logic for "consecutive" is probably going to require a calendar table which contains a value for every day along with columns for the given date's year and week.
My advice is to do the other queries separately in views and then join them in by salesperson (which I assume is the key).
The logic in this query is nice and clean and easy to follow. Otherwise, I think the way to attack this would be to start writing T-SQL functions to calculate the other values, but I think those functions would contain the same queries anyway.
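For the first two requested columns (previous week's ranking and total weeks in the top N), one possible shape is a weekly ranking CTE that is joined back to itself on the preceding week. The sketch below runs it through Python's sqlite3 module (SQLite 3.25 or later). The simplified sales table with only week and salesperson columns, the weekly_rankings helper, and the sample names are assumptions, and the consecutive-weeks logic is not covered.

```python
import sqlite3


def weekly_rankings(sales, top_n):
    """Rows of (week, salesperson, ranking, prev_ranking, weeks_in_top) for the weekly top N."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE sales (week INTEGER, salesperson TEXT)")
    con.executemany("INSERT INTO sales VALUES (?, ?)", sales)
    rows = con.execute(
        """
        WITH counts AS (
            SELECT week, salesperson, COUNT(*) AS transactions
            FROM sales
            GROUP BY week, salesperson
        ),
        by_week AS (
            SELECT week, salesperson,
                   RANK() OVER (PARTITION BY week ORDER BY transactions DESC) AS ranking
            FROM counts
        ),
        weeks_in_top AS (
            SELECT salesperson, COUNT(*) AS weeks_in_top
            FROM by_week
            WHERE ranking <= :n
            GROUP BY salesperson
        )
        SELECT cur.week, cur.salesperson, cur.ranking,
               prev.ranking AS prev_ranking, t.weeks_in_top
        FROM by_week AS cur
        LEFT JOIN by_week AS prev
               ON prev.salesperson = cur.salesperson
              AND prev.week = cur.week - 1   -- NULL when there is no previous week
        JOIN weeks_in_top AS t ON t.salesperson = cur.salesperson
        WHERE cur.ranking <= :n
        ORDER BY cur.week, cur.ranking, cur.salesperson
        """,
        {"n": top_n},
    ).fetchall()
    con.close()
    return rows


# Hypothetical data: one row per transaction.
sales = (
    [(1, "Ann")] * 3 + [(1, "Bob")] * 2 + [(1, "Cy")]
    + [(2, "Cy")] * 3 + [(2, "Ann")] * 2 + [(2, "Bob")]
)
ranked = weekly_rankings(sales, top_n=2)
for row in ranked:
    print(row)
```

Joining on week - 1 only works within a single year; across a year boundary a calendar table, as suggested above, would be a safer way to find the preceding week.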

Resources