I have a complicated problem I am trying to solve. Please bear with me and feel free to ask any questions. I am quite new to SQL and having difficulty with this...
I need to compute the median of a group of values. The values are not stored directly in a table; they are derived from a table by counting hourly occurrences grouped by date.
Here's the sample table the data is pulled from.
CREATE TABLE Table22(
Request_Number BIGINT NOT NULL
,Request_Received_Date DATETIME NOT NULL
);
INSERT INTO Table22(Request_Number,Request_Received_Date) VALUES (2016311446,'8/9/16 9:56');
INSERT INTO Table22(Request_Number,Request_Received_Date) VALUES (20163612157,'9/6/16 9:17');
INSERT INTO Table22(Request_Number,Request_Received_Date) VALUES (2016384250,'9/12/16 14:52');
INSERT INTO Table22(Request_Number,Request_Received_Date) VALUES (20162920101,'4/19/16 8:11');
INSERT INTO Table22(Request_Number,Request_Received_Date) VALUES (2016418170,'10/6/16 12:28');
INSERT INTO Table22(Request_Number,Request_Received_Date) VALUES (2016392953,'9/6/16 12:39');
INSERT INTO Table22(Request_Number,Request_Received_Date) VALUES (20164123416,'10/6/16 15:05');
INSERT INTO Table22(Request_Number,Request_Received_Date) VALUES (2016335972,'8/9/16 7:49');
INSERT INTO Table22(Request_Number,Request_Received_Date) VALUES (20162622951,'9/6/16 9:57');
INSERT INTO Table22(Request_Number,Request_Received_Date) VALUES (20163913504,'9/6/16 9:47');
INSERT INTO Table22(Request_Number,Request_Received_Date) VALUES (20163211326,'9/6/16 12:38');
INSERT INTO Table22(Request_Number,Request_Received_Date) VALUES (20163610132,'8/30/16 16:34');
INSERT INTO Table22(Request_Number,Request_Received_Date) VALUES (20164119560,'10/6/16 15:53');
INSERT INTO Table22(Request_Number,Request_Received_Date) VALUES (2016334416,'8/10/16 11:06');
INSERT INTO Table22(Request_Number,Request_Received_Date) VALUES (20164320028,'10/6/16 15:27');
INSERT INTO Table22(Request_Number,Request_Received_Date) VALUES (20163515193,'8/24/16 19:50');
INSERT INTO Table22(Request_Number,Request_Received_Date) VALUES (2016159834,'4/19/16 13:21');
INSERT INTO Table22(Request_Number,Request_Received_Date) VALUES (2016178443,'4/19/16 13:05');
The table has 2 columns: Request_Number and Request_Received_Date.
Request_Number is not unique and is mostly irrelevant here. I am looking for how many requests are received on a particular date, and hourly within that date (24 hours). Every entry for a date counts as one occurrence (TicketCount). I can use COUNT(*) over Request_Received_Date, grouping by date and hour.
I did just that and created a temporary table within my script:
CREATE TABLE #z (ForDate date, OnHour int, TicketCount int);

INSERT INTO #z (ForDate, OnHour, TicketCount)
SELECT CAST(Request_Received_Date AS date) AS ForDate,
       DATEPART(hh, Request_Received_Date) AS OnHour,
       COUNT(*) AS TicketCount /* hourly ticket count column */
FROM Table22
GROUP BY CAST(Request_Received_Date AS date), DATEPART(hh, Request_Received_Date);

SELECT * FROM #z ORDER BY ForDate DESC, OnHour ASC;
Now I am having the hardest time finding the median value of the count per day. I have tried many different formulas for median calculation and was able to make most of them work. Many different examples of median calculation can be found here:
https://sqlperformance.com/2012/08/t-sql-queries/median
I like the script below for finding the median; it is simple. But it finds the median across all values of Request_Received_Date, and I am unable to find a way to add a GROUP BY date clause to it.
DECLARE @Median DECIMAL(12,2);

SELECT @Median = (
    (SELECT MAX(TicketCount) FROM
        (SELECT TOP 50 PERCENT TicketCount FROM #z ORDER BY TicketCount) AS BottomHalf)
    +
    (SELECT MIN(TicketCount) FROM
        (SELECT TOP 50 PERCENT TicketCount FROM #z ORDER BY TicketCount DESC) AS TopHalf)
) / 2.0;  -- divide by 2.0, not 2: integer division would truncate a median like 2.5 to 2

SELECT @Median;
Any help will be really appreciated.
The expected result is something like this:
ForDate Median
10/6/2016 2
9/12/2016 1
9/6/2016 2.5
8/30/2016 1
8/24/2016 1
8/10/2016 1
8/9/2016 1
4/19/2016 1.5
How about something like this? (This only applies if you are on SQL Server 2012 or above.)
SELECT DISTINCT ForDate, PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY TicketCount) OVER (PARTITION BY ForDate) AS Median
FROM #z;
In short, SQL Server has two ways to calculate a median, PERCENTILE_DISC and PERCENTILE_CONT; you can read about them here: https://msdn.microsoft.com/en-us/library/hh231327.aspx
You can compare them both for this case with this code:
SELECT DISTINCT
ForDate
, PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY TicketCount) OVER (PARTITION BY ForDate) AS MedianDisc
, PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY TicketCount) OVER (PARTITION BY ForDate) AS MedianCont
FROM
#z;
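For what it's worth, on versions before SQL Server 2012 (no PERCENTILE_CONT), the TOP 50 PERCENT trick from the question can be applied per date with CROSS APPLY — a sketch against the #z temp table above:

```sql
-- Per-date median without window percentile functions (pre-2012 sketch):
-- compute the TOP 50 PERCENT halves once per distinct ForDate via CROSS APPLY.
SELECT d.ForDate, m.Median
FROM (SELECT DISTINCT ForDate FROM #z) d
CROSS APPLY (
    SELECT (
        (SELECT MAX(TicketCount) FROM
            (SELECT TOP 50 PERCENT TicketCount FROM #z
             WHERE ForDate = d.ForDate ORDER BY TicketCount) AS BottomHalf)
        +
        (SELECT MIN(TicketCount) FROM
            (SELECT TOP 50 PERCENT TicketCount FROM #z
             WHERE ForDate = d.ForDate ORDER BY TicketCount DESC) AS TopHalf)
    ) / 2.0 AS Median      -- 2.0 keeps fractional medians like 2.5
) m
ORDER BY d.ForDate DESC;
```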
I'm getting foreign exchange rates from the Riksbanken API. This API doesn't return values for holidays, Saturdays or Sundays, so my FX rates are missing for those days. However, when I convert currencies in my tables I do a join based on date & currency. So when I have a transaction on a Sunday I can't convert it, because the rate is NULL, and transaction amount * NULL = NULL.
Basically this is my situation:
CREATE TABLE #fxrates (cur varchar(5), rate decimal(10,2), rowDate date); -- decimal(10,2): a bare decimal is decimal(18,0) and would round 10.40 to 10
INSERT INTO #fxrates VALUES
('EUR',10.40,'2020-04-30')
, ('EUR',10.50,'2020-05-01')
, ('EUR',NULL,'2020-05-02')
, ('EUR',NULL,'2020-05-03')
, ('EUR',10.60,'2020-05-04')
, ('EUR',10.70,'2020-05-05')
CREATE TABLE #value (cur varchar(5), amount decimal(10,2), rowDate date);
INSERT INTO #value VALUES
('EUR',1500,'2020-04-30')
, ('EUR',9000,'2020-05-01')
, ('EUR',1000,'2020-05-02')
, ('EUR',300,'2020-05-03')
, ('EUR',160,'2020-05-04')
, ('EUR',170,'2020-05-05')
--How I convert the values
select v.amount * fx.rate as [Converted amount] from #fxrates fx
JOIN #value v
on fx.cur=v.cur
and fx.rowDate=v.rowDate
My solution for this would be to always replace a NULL with the most recent earlier value, based on date. However, I have no idea how that logic would be implemented in SQL. My fxrates table would then look like this:
('EUR',10.40,'2020-04-30')
('EUR',10.50,'2020-05-01')
('EUR',10.50,'2020-05-02')
('EUR',10.50,'2020-05-03')
('EUR',10.60,'2020-05-04')
('EUR',10.70,'2020-05-05')
Would replacing the NULL values be the best approach moving forward? And how can I achieve that using SQL?
I think you can use CROSS APPLY to get the latest exchange rate that is valid before or at the transaction date.
SELECT v.amount * fx.rate AS [Converted amount]
FROM #value v
CROSS APPLY (SELECT TOP 1
fx.rate
FROM #fxrates fx
WHERE fx.cur = v.cur
AND fx.rowdate <= v.rowdate
AND fx.rate IS NOT NULL
ORDER BY fx.rowdate DESC) fx;
You could use a correlated subquery here to find the most recent non-NULL rate for every date that is missing one:
SELECT
cur,
COALESCE(rate,
(SELECT TOP 1 f2.rate FROM #fxrates f2
WHERE f2.cur = f1.cur AND f2.rowDate < f1.rowDate AND f2.rate IS NOT NULL
ORDER BY f2.rowdate DESC)
) AS rate,
rowDate
FROM #fxrates f1
ORDER BY
cur,
rowDate;
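The carry-forward can also be written with window functions — a sketch, not part of either answer above, and it needs SQL Server 2012+ for the framed COUNT:

```sql
-- Last-observation-carried-forward: COUNT(rate) ignores NULLs, so each row's
-- running count of non-NULL rates labels the "group" begun by the last real
-- rate; MAX(rate) within that group then fills the NULL rows with that rate.
SELECT cur, rowDate,
       MAX(rate) OVER (PARTITION BY cur, grp) AS rate
FROM (
    SELECT *,
           COUNT(rate) OVER (PARTITION BY cur ORDER BY rowDate
                             ROWS UNBOUNDED PRECEDING) AS grp
    FROM #fxrates
) t
ORDER BY cur, rowDate;
```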
I have two tables in a SQL DB. They both contain 3 columns that match, plus additional columns that hold different info in each one. I want to write a query that interleaves them according to date / timestamp. Table A is for a machine that runs and takes a sample every 10 minutes. Table B is the logfile that has entries logged when an operator makes adjustments, turns the machine on / off, etc.
I have used the following query but it is giving me duplicates on table A.
I added the WHERE (BatchTable.Batch = 'HB20419' AND EventLogTable.Batch = 'HB20419') just to cut down on the amount of data being returned until I get the query figured out. One complication is that each table has its own date / time columns, which are named differently and are completely independent of each other.
SELECT BatchTable.Asset_Number,BatchTable.Recipe,BatchTable.Batch,BatchTable.Group_No, BatchTable.Sample_No, BatchTable.SampleDate, BatchTable.SampleTime, BatchTable.Weight, EventLogTable.EvtTime, EventLogTable.EvtValueBefore, EventLogTable.EvtValueAfter, EventLogTable.EvtComment
FROM BatchTable,EventLogTable
where(BatchTable.Batch = 'HB20419' and EventLogTable.Batch = 'HB20419')
order by Asset_Number, Recipe, Batch, Group_No, Sample_No ASC
Here is how that query would look using aliases, formatting and ANSI-92 style joins.
SELECT bt.Asset_Number
, bt.Recipe
, bt.Batch
, bt.Group_No
, bt.Sample_No
, bt.SampleDate
, bt.SampleTime
, bt.Weight
, elt.EvtTime
, elt.EvtValueBefore
, elt.EvtValueAfter
, elt.EvtComment
FROM BatchTable bt
join EventLogTable elt on elt.Batch = bt.Batch
WHERE bt.Batch = 'HB20419'
ORDER BY Asset_Number
, Recipe
, Batch
, Group_No
, Sample_No ASC
I had to make up some sample data, but it sounds like you want to union the two tables together to "interleave" them. You can do this by aliasing the column names to match and selecting NULL for the columns that belong to the opposite table. I acknowledge that I'm guessing at your desired outcome to some extent.
Make some sample data:
CREATE TABLE #batch (SampleDate VARCHAR(MAX), SampleTime VARCHAR(MAX), Recipe VARCHAR(MAX))
CREATE TABLE #event (EvtTime DATETIME, EvtComment VARCHAR(MAX))

INSERT INTO #batch (SampleDate, SampleTime, Recipe)
VALUES ('2018-08-09', '11:56:25 AM', 'Peanut Butter'),
       ('2018-08-09', '12:11:25 PM', 'Chocolate')

INSERT INTO #event (EvtTime, EvtComment)
VALUES ('2018-08-09 11:58:22 AM', 'Turned up speed'),
       ('2018-08-09 11:59:22 AM', 'Turned down temperature')
Then select and union to interleave:
SELECT CAST(SampleDate + ' ' + SampleTime AS datetime) AS [Date],
       Recipe,
       NULL AS EvtComment
FROM #batch
UNION
SELECT EvtTime AS [Date], NULL AS Recipe, EvtComment
FROM #event
ORDER BY [Date]
Which yields:
Date Recipe EvtComment
----------------------- ------------------------- -------------------------
2018-08-09 11:56:25.000 Peanut Butter NULL
2018-08-09 11:58:22.000 NULL Turned up speed
2018-08-09 11:59:22.000 NULL Turned down temperature
2018-08-09 12:11:25.000 Chocolate NULL
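One caveat worth noting (my observation, not the original answer's): UNION also removes duplicate rows across the two sides, so if identical rows could legitimately occur, UNION ALL is the safer interleave:

```sql
-- Same interleave, but UNION ALL keeps legitimately repeated rows that
-- UNION would silently collapse (and it skips the dedup sort).
SELECT CAST(SampleDate + ' ' + SampleTime AS datetime) AS [Date],
       Recipe, NULL AS EvtComment
FROM #batch
UNION ALL
SELECT EvtTime, NULL, EvtComment
FROM #event
ORDER BY [Date]
```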
Getting nowhere here and seems so simple.
Test data is:
CREATE TABLE #table (SpellAdminsionDate datetime, SpellDischargeDate datetime, Pat_code varchar(10))
insert into #table (SpellAdminsionDate, SpellDischargeDate, Pat_code) values('2016-09-12 15:55:00:000','2016-09-19 20:20:00:000','HEY3052275')
insert into #table(SpellAdminsionDate, SpellDischargeDate, Pat_code) values ('2016-09-07 17:17:00:000','2016-09-17 18:40:00:000','HEY0810155')
insert into #table(SpellAdminsionDate, SpellDischargeDate, Pat_code) values ('2016-09-14 16:50:00:000','2016-09-17 18:01:00:000','HEY1059266')
insert into #table(SpellAdminsionDate, SpellDischargeDate, Pat_code) values ('2016-09-15 02:47:00:000','2016-09-15 17:28:00:000','HEY0742883')
insert into #table(SpellAdminsionDate, SpellDischargeDate, Pat_code) values ('2016-08-27 00:11:00:000','2016-09-14 12:49:00:000','HEY3050628')
insert into #table(SpellAdminsionDate, SpellDischargeDate, Pat_code) values ('2016-09-10 12:24:00:000','2016-09-13 20:00:00:000','HEY0912392')
insert into #table(SpellAdminsionDate, SpellDischargeDate, Pat_code) values ('2016-09-12 12:51:00:000','2016-09-13 19:55:00:000','HEY0908691')
Select * from #table
Below is my simple code displaying the same thing:
SELECT c.SpellAdmissionDate,
c.SpellDischargeDate,
c.Pat_Code
FROM [CommDB].[dbo].[vwCivicaSLAM1617Live] c
WHERE c.Hrg_Code like 'VA%'
and c.Pat_Code like 'HEY%'
ORDER BY c.SpellDischargeDate desc
All I am after is a COUNT per day of active patients. For example, take 12/09/2016: on that date the result would be 5 (based on the test data), as the other 2 came in after the 12th.
If it makes it easier I do have a date reference table called DATE_REFERENCE which has every date available to me.
Allowing for the possibility of having no patients on a day, you want to use your date reference table as the primary table and LEFT JOIN to the patient information. You didn't identify a column name, so I just used [datecol].
SELECT
d.[datecol] the_date
, count(DISTINCT c.Pat_Code) num_patients
FROM DATE_REFERENCE d
LEFT JOIN [CommDB].[dbo].[vwCivicaSLAM1617Live] c
ON d.[datecol] BETWEEN c.SpellAdmissionDate AND c.SpellDischargeDate
AND c.Hrg_Code LIKE 'VA%'
AND c.Pat_Code LIKE 'HEY%'
GROUP BY
d.[datecol]
ORDER BY
d.[datecol] DESC
I suspect there may be more than this required, but without sample data and expected results it is difficult to know what you really need.
nb. I assume the date in that date_reference table is at midnight (00:00:00) or has a data type of date (without hours/minutes etc.)
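If those columns do carry a time component instead, the same query can compare on the date part only — a sketch, still assuming the [datecol] column name:

```sql
-- As above, but strip the time portion from the spell columns before the
-- BETWEEN comparison, so a midnight [datecol] matches whole days.
SELECT d.[datecol] AS the_date,
       COUNT(DISTINCT c.Pat_Code) AS num_patients
FROM DATE_REFERENCE d
LEFT JOIN [CommDB].[dbo].[vwCivicaSLAM1617Live] c
       ON d.[datecol] BETWEEN CAST(c.SpellAdmissionDate AS date)
                          AND CAST(c.SpellDischargeDate AS date)
      AND c.Hrg_Code LIKE 'VA%'
      AND c.Pat_Code LIKE 'HEY%'
GROUP BY d.[datecol]
ORDER BY d.[datecol] DESC
```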
Is this what you want?
SELECT dr.date,
(SELECT COUNT(*)
FROM [CommDB].[dbo].[vwCivicaSLAM1617Live] c
WHERE dr.date between c.SpellAdmissionDate and c.SpellDischargeDate) as cnt_per_day
FROM DATE_REFERENCE dr
Add the filters you want to the correlated query.
You can achieve this by joining to your date reference table and counting a DISTINCT of the patient reference:
SELECT
dateRef
,COUNT(DISTINCT Pat_Code) AS PatCount
FROM #table t
RIGHT JOIN #date_reference
ON SpellAdminsionDate <= dateRef
AND SpellDischargeDate >= dateRef
GROUP BY dateRef;
An active count of patients per date can be achieved as follows (for the temp data provided here):
SELECT DISTINCT CAST(T1.SpellAdminsionDate AS DATE) AS AdmissionDate,
       PATIENTS.TotalActive
FROM #table T1
CROSS APPLY (SELECT COUNT(*) AS TotalActive
             FROM #table T2
             WHERE CAST(T2.SpellAdminsionDate AS DATE) <= CAST(T1.SpellAdminsionDate AS DATE)
            ) PATIENTS
I think you need to bring the DATE_REFERENCE table into the query if it contains all the dates on which you need to count patients. But here is a sample of how you might get the required result from the main table alone:
SELECT c.SpellAdmissionDate, COUNT(c.Pat_code) AS PCounter
FROM [CommDB].[dbo].[vwCivicaSLAM1617Live] c
GROUP BY c.SpellAdmissionDate
ORDER BY c.SpellAdmissionDate DESC
I'm facing a tough situation and could not find a solution so far.
I have a table giving information based on date ranges. I'd like to have this information broken down by date, so I'm looking to convert each range into a row structure.
The extra difficulty is that the number of "periods" in the date range is variable.
The periodicity is deduced from the date range and the number of days in one period.
To be more specific, on one line of the table I've an
ID
start_date of the range
end_date of the range
number of days_in_the_period
numbers_periods
pricings to apply to each period in the range
Here is the initial table structure and the expected result:
CREATE TABLE Start(
 [Key] VARCHAR(11) NOT NULL -- KEY is a reserved word, so bracketed; it repeats across ranges, so no PRIMARY KEY
,Start_date VARCHAR(27) NOT NULL
,End_Date VARCHAR(27) NOT NULL
,Days_in_the_period INTEGER NOT NULL
,Nbr_periods INTEGER NOT NULL
,Pricing VARCHAR(6) NOT NULL
);
INSERT INTO Start([Key],Start_date,End_Date,Days_in_the_period,Nbr_periods,Pricing) VALUES ('010-1280001','2000-06-01 00:00:00.0000000','2001-12-01 00:00:00.0000000',30,19,'800,87');
INSERT INTO Start([Key],Start_date,End_Date,Days_in_the_period,Nbr_periods,Pricing) VALUES ('010-1280001','2002-01-01 00:00:00.0000000','2005-12-01 00:00:00.0000000',30,48,'440,32');
INSERT INTO Start([Key],Start_date,End_Date,Days_in_the_period,Nbr_periods,Pricing) VALUES ('010-1280001','2006-01-01 00:00:00.0000000','2007-02-01 00:00:00.0000000',30,14,'282,68');
INSERT INTO Start([Key],Start_date,End_Date,Days_in_the_period,Nbr_periods,Pricing) VALUES ('010-1280001','2007-03-01 00:00:00.0000000','2008-03-01 00:00:00.0000000',30,13,'283,99');
INSERT INTO Start([Key],Start_date,End_Date,Days_in_the_period,Nbr_periods,Pricing) VALUES ('010-1280001','2008-04-01 00:00:00.0000000','2009-01-01 00:00:00.0000000',60,5,'281,81');
INSERT INTO Start([Key],Start_date,End_Date,Days_in_the_period,Nbr_periods,Pricing) VALUES ('010-1280001','2009-02-01 00:00:00.0000000','2009-03-01 00:00:00.0000000',30,2,'281,81');
INSERT INTO Start([Key],Start_date,End_Date,Days_in_the_period,Nbr_periods,Pricing) VALUES ('010-1280001','2009-04-01 00:00:00.0000000','2019-07-01 00:00:00.0000000',30,124,'281,81');
INSERT INTO Start([Key],Start_date,End_Date,Days_in_the_period,Nbr_periods,Pricing) VALUES ('010-1280001','2019-08-01 00:00:00.0000000','2019-08-01 00:00:00.0000000',0,1,'372,96');
Expected
Key Date Pricing Days_in_the_period
010-1280001 2000-06-01 00:00:00.0000000 800,87 30
010-1280001 2000-07-01 00:00:00.0000000 800,87 30
… … … …
010-1280001 2008-04-01 00:00:00.0000000 281,81 60
010-1280001 2008-06-01 00:00:00.0000000 281,81 60
… … … …
010-1280001 2019-08-01 00:00:00.0000000 372,96 0
For information, the initial table contains about 100k records.
Does anyone have a brilliant idea for me?
Please revert for any clarification,
Tartino.
You can do this with the help of a recursive CTE. Two adjustments make it run against the sample data above: the varchar dates are cast to datetime2 so the anchor and recursive members agree on type, and OPTION (MAXRECURSION 0) lifts the default 100-level recursion limit (one of the ranges has 124 periods). Rows with Days_in_the_period = 0 are excluded from recursion so they don't loop forever:
;WITH cte AS (
    SELECT [Key],
           CAST(Start_date AS datetime2) AS [Start_Date],
           CAST(End_Date AS datetime2) AS [End_Date],
           Days_in_the_period,
           Nbr_periods,
           Pricing
    FROM Start
    UNION ALL
    SELECT c.[Key],
           DATEADD(month, c.Days_in_the_period / 30, c.[Start_Date]),
           c.[End_Date],
           c.Days_in_the_period,
           c.Nbr_periods,
           c.Pricing
    FROM cte c
    WHERE c.Days_in_the_period > 0
      AND c.[End_Date] >= DATEADD(month, c.Days_in_the_period / 30, c.[Start_Date])
)
SELECT [Key],
       [Start_Date] AS [Date],
       Pricing,
       Days_in_the_period
FROM cte
ORDER BY [Key], [Start_Date]
OPTION (MAXRECURSION 0);
Another way is to use a calendar table and join it with your table.
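For instance, here is a non-recursive sketch of that idea using an ad-hoc numbers (tally) table in place of a real calendar table; like the recursive version, it assumes Days_in_the_period maps to whole months via division by 30:

```sql
-- One output row per (range, period number): join each range to the numbers
-- 0 .. Nbr_periods-1 and advance Start_date by that many periods.
SELECT s.[Key],
       DATEADD(month, n.n * s.Days_in_the_period / 30,
               CAST(s.Start_date AS datetime2)) AS [Date],
       s.Pricing,
       s.Days_in_the_period
FROM Start s
JOIN (SELECT TOP (200) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1 AS n
      FROM sys.all_objects) n   -- ad-hoc tally; size it >= MAX(Nbr_periods)
  ON n.n < s.Nbr_periods
ORDER BY s.[Key], [Date];
```

On 100k source rows this avoids row-by-row recursion entirely, which is usually far cheaper than the recursive CTE.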
Account_ID Amount
123 200
Result
Account_ID Amount
123 200
123 -200
Typically, our database will have two transactions for a voided refund payment, but somehow a few records only have one transaction.
I know I can manually insert the same record into the table.
Is there any other way to clone a record and set the amount to negative without using an INSERT statement?
As M.Ali said, it is not good to clone records like this, but it can be done. I'm not sure whether this exactly suits your requirement or not:
CREATE TABLE #T ([Account_ID] int, [Amount] int);

INSERT INTO #T ([Account_ID], [Amount])
VALUES (123, 200);
;WITH CTE AS (
    SELECT Account_ID,
           Amount,
           ROW_NUMBER() OVER (PARTITION BY Amount ORDER BY (SELECT NULL)) AS RN
    FROM #T
    CROSS APPLY (VALUES ('Account_ID', Account_ID), ('Amount', Amount)) M(v, s)
)
SELECT Account_ID,
       CASE WHEN RN = 1 THEN CAST(Amount AS varchar)
            ELSE '-' + CAST(Amount AS varchar)
       END AS Amount
FROM CTE
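A simpler sketch of the same idea: let CROSS APPLY (VALUES ...) emit the positive and negated amount directly, which keeps Amount numeric instead of building a signed string:

```sql
-- Two rows per source row: the original amount and its negation.
SELECT t.Account_ID, v.Amount
FROM #T t
CROSS APPLY (VALUES (t.Amount), (-t.Amount)) v(Amount);
```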