Nested CASE WHEN with COUNT and math expression - SQL Server

How can I achieve the result below?
id   id_status   rate
25   X           62.5%
15   Y           37.5%
I have tried this:
SELECT
    COUNT(tab.id) AS id,
    tab.status AS id_status,
    (CASE
        WHEN tab.status = 'X' THEN (25/40) * 100 -- this is where I'm stuck (40 = total of ids)
        WHEN tab.status = 'Y' THEN 100 - ((25/40) * 100)
    END) AS rate
FROM table AS tab
WHERE tab.status IN ('X', 'Y')
GROUP BY ROLLUP (tab.status)

You can use a window function to get the total count:
select count(tab.id) as id,
       tab.status as id_status,
       200.0 * count(tab.id)
             / sum(count(*)) over(order by status
                                  rows between unbounded preceding and unbounded following) as rate
from your_table tab
where tab.status in ('X', 'Y')
group by rollup(tab.status)
Note the explicit window specification, because the default frame is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, and the factor of 200 (rather than 100) because ROLLUP adds a total row, which doubles the window sum.
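For example, with 25 ids of status X and 15 of status Y (40 in total), ROLLUP produces grouped counts of 25, 15 and 40, so the window sum is 80 and the factor of 200 brings the rates back to 62.5%, 37.5% and 100%. A minimal repro sketch, assuming a hypothetical table variable @t stands in for your table:
DECLARE @t TABLE (id int IDENTITY(1,1), status char(1));
INSERT INTO @t (status)
SELECT TOP (25) 'X' FROM sys.all_objects    -- 25 sample 'X' rows
UNION ALL
SELECT TOP (15) 'Y' FROM sys.all_objects;   -- 15 sample 'Y' rows

SELECT COUNT(t.id) AS id,
       t.status    AS id_status,
       200.0 * COUNT(t.id)
             / SUM(COUNT(*)) OVER (ORDER BY t.status
                                   ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS rate
FROM @t AS t
WHERE t.status IN ('X', 'Y')
GROUP BY ROLLUP (t.status);
-- grouped counts: 25 (X), 15 (Y), 40 (rollup total); window sum = 80
-- 200 * 25 / 80 = 62.5, 200 * 15 / 80 = 37.5, and 100 on the total row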

Related

I would like the number '1000' to appear once only, and then '0' for the remaining records until the next month appears - maybe a CASE-type statement?

I am using SQL, and I would like the number '1000' to appear once per month. I have a record set in which the first of every month appears multiple times. I would like '1000' to appear once only, and then '0' for the remaining records until the next month appears. I would like the output below - maybe a CASE-type statement or ORDER/PARTITION BY? I am using SQL Server 2018. Please see the table below for how I would like the data to appear.
Many Thanks :)
Date         Amount
01/01/2022   1000
01/01/2022   0
01/01/2022   0
01/02/2022   1000
01/02/2022   0
01/02/2022   0
01/03/2022   1000
01/03/2022   0
Solution for your problem:
WITH CT1 AS
(
SELECT *,
ROW_NUMBER() OVER(PARTITION BY CONCAT(MONTH([Date]),YEAR([Date])) ORDER BY [Date]) as rn
FROM your_table
)
SELECT [Date],
CASE WHEN rn = 1 THEN 1000 ELSE 0 END AS Amount
FROM CT1;
Given just a list of dates, you could use ROW_NUMBER and a conditional expression to arbitrarily assign one row of each month a value of 1000:
select *,
Iif(Row_Number() over(partition by Month(date) order by (select null)) = 1, 1000, 0) Amount
from t
order by [date], Amount desc;
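If the data can span multiple years, a minimal variant of the same idea (assuming the same table t) also partitions by year, so that e.g. January 2022 and January 2023 are numbered separately:
select *,
       Iif(Row_Number() over(partition by Year(date), Month(date) order by (select null)) = 1, 1000, 0) Amount
from t
order by [date], Amount desc;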

I want to get the accumulated amount every hour in Oracle

select
    order_price,
    To_char(to_date(order_date,'YYYY-MM-DD HH24:MI:SS'),'yyyymmdd hh24') as order_date,
    SUM(order_price) OVER(ORDER BY To_char(to_date(order_date,'YYYY-MM-DD HH24:MI:SS'),'yyyymmdd hh24')) as "bth"
from order_tbl
where seller_no=100
order by order_date;
I got this result.
But the data I want to get is as follows.
order_price   order_date    bth
20000         20220524 15   52500
13000         20220524 15   52500
19500         20220524 15   52500
19600         20220524 16   72100
222000        20220524 17   738700
and even if there is no data...
0             20220524 18   738700
0             20220524 19   738700
0             20220524 20   738700
...
0             20220525 10   738700
13600         20220525 11   787300
like this.
I want to get the order_date and bth for every hour, even if there is no data for that hour. It's too difficult for me - how can I do it? (I will remove order_price and add DISTINCT later.)
You can try this: (read the guide below)
db<>fiddle
WITH all_hour AS (
SELECT LEVEL - 1 AS hour
FROM dual
CONNECT BY LEVEL <= 24
),
all_date AS (
SELECT TO_CHAR(DATE'2022-05-24' + LEVEL - 1, 'YYYYMMDD') AS dt
FROM dual
CONNECT BY LEVEL <= (DATE'2022-05-27' - DATE'2022-05-24' + 1)
),
all_date_hour AS (
SELECT dt || ' ' || (CASE WHEN hour < 10 THEN '0' || TO_CHAR(hour) ELSE TO_CHAR(hour) END) AS order_date
FROM all_date
CROSS JOIN all_hour
),
your_order AS (
SELECT
order_price,
TO_CHAR(TO_DATE(order_date,'YYYY-MM-DD HH24:MI:SS'),'YYYYMMDD HH24') AS order_date,
seller_no
FROM order_tbl
),
your_sum AS (
SELECT adh.order_date, SUM(CASE WHEN yo.seller_no = 100 THEN yo.order_price ELSE 0 END) AS bth
FROM all_date_hour adh
LEFT JOIN your_order yo ON adh.order_date = yo.order_date
GROUP BY adh.order_date
)
SELECT order_date, SUM(bth) OVER(ORDER BY order_date) AS bth
FROM your_sum
ORDER BY order_date;
Summary:
(1) Table 1: all_hour
includes the hour numbers 0 to 23 (matching TO_CHAR(..., 'HH24'))
(2) Table 2: all_date
includes dates from '2022-05-24' to '2022-05-27'.
If your preferred range is '2022-01-01' to '2022-12-31', simply change both occurrences of DATE'2022-05-24' to DATE'2022-01-01' and DATE'2022-05-27' to DATE'2022-12-31'.
(3) Table 3: all_date_hour
includes date-hours in the format 'YYYYMMDD HH24', e.g. '20220524 01'
it is the result of cross joining the first and second tables
(4) Table 4: your_order
same as your sample table, order_tbl, just reformatting order_date into the format 'YYYYMMDD HH24', e.g. '20220524 01'
(5) Table 5: your_sum (NOT ACCUMULATED YET)
a simple summation of order_price, grouped by order_date
a LEFT JOIN is used here so that every date-hour from all_date_hour is included
any additional conditions can be added inside the CASE expression in your_sum, for example SUM(CASE WHEN yo.seller_no = 100 AND youradditionalcondition = somecondition THEN yo.order_price ELSE 0 END)
(6) Final Select Query:
the accumulated sum is computed here using SUM() OVER(ORDER BY *yourexpr*)
You can use this query - explained later:
select
to_date('01/01/2022 00:00:00','dd/mm/yyyy hh24:mi:ss') + all_hours.sequence_hour/24 order_date_all,
order_price,
To_char(order_date, 'yyyymmdd hh24') as order_date_real,
SUM(order_price) OVER(ORDER BY To_char(order_date, 'yyyymmdd hh24')) as "bth"
from order_tbl ,
(SELECT LEVEL sequence_hour
FROM dual
CONNECT BY LEVEL <= 10) all_hours
where trunc((order_date(+)-to_date('01/01/2022 00:00:00','dd/mm/yyyy hh24:mi:ss'))*24) = all_hours.sequence_hour
and seller_no(+)=100
order by 1;
Explanation:
You don't need to perform To_char(to_date(order_date,'YYYY-MM-DD HH24:MI:SS'),'yyyymmdd hh24'): order_date is already a DATE, so converting it to a date again is not needed.
I have added a maximum number of hours - the subquery all_hours. Just for the example it is limited to 10; you can change it to any other value.
I have added a starting point in time from which you want to display the data, "01/01/2022" - change it if you want. Note that it appears in 2 places.
I have added an outer join with order_tbl - note the "(+)" in the WHERE conditions. If you want to add additional conditions on order_tbl, remember to add the (+) on the right side of the column, as I did with "seller_no".
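The same idea can also be written with ANSI join syntax instead of the (+) operator; a sketch, assuming the same starting point of 01/01/2022 and the same 10-hour window:
select to_date('01/01/2022 00:00:00','dd/mm/yyyy hh24:mi:ss') + ah.sequence_hour/24 as order_date_all,
       o.order_price,
       to_char(o.order_date, 'yyyymmdd hh24') as order_date_real,
       sum(o.order_price) over(order by ah.sequence_hour) as bth   -- cumulative total through each hour
from (select level as sequence_hour from dual connect by level <= 10) ah
left join order_tbl o
  on trunc((o.order_date - to_date('01/01/2022 00:00:00','dd/mm/yyyy hh24:mi:ss')) * 24) = ah.sequence_hour
 and o.seller_no = 100
order by 1;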

Not able to filter the required data

Hi, I have a table Student where data is inserted row-wise. I want to select only those students whose marks are more than 50% in all subjects; if marks in any subject are less than 50%, that student should not appear in the output and all of that student's records should be excluded. There is no primary key.
I tried the code below:
Select * into #temp1 from Student where percent >=0.5 and group by Roll_Number
and I am getting the error:
is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
And if I try it like this:
Select * into #temp1 from Student where percent >=0.5
then I get students who have more than 50% in even a single subject, which is not what I want in the output.
The table structure is as follows:
Student_Name   Roll_Number   Subject   Marks   Percent
Ashutosh       1234          English   40      40%
Ishan          1231          Maths     60      60%
Atul           1232          Maths     30      30%
Ashutosh       1234          Maths     70      70%
Now the output should only give:
Ishan          1231          Maths     60      60%
You can use COUNT() OVER() to indicate, for every student, whether any row is below 50%, then use this as the filter criterion:
with s as (
    select *,
           count(case when [Percent] < 0.5 then 1 end) over(partition by Student_Name) as pc
    from Student
)
select *
from s
where [Percent] >= 0.5 and pc = 0
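The same requirement can also be expressed without a window function; a sketch using NOT EXISTS, assuming Roll_Number identifies a student and Percent is stored as a fraction such as 0.4:
select s.*
from Student s
where not exists (
    -- exclude the student entirely if any of their subjects is below 50%
    select 1
    from Student f
    where f.Roll_Number = s.Roll_Number
      and f.[Percent] < 0.5
);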
You can get the desired result by using a subquery / cte and a window function in order to check if the student has at least 50% in all subjects:
DECLARE @Student TABLE(
    Student_Name VARCHAR(20)
    ,Roll_Number int
    ,Subject VARCHAR(20)
    ,Marks int
    ,Perc DECIMAL(5,2)
)
INSERT INTO @Student VALUES
 ('Ashutosh',1234,'English',40,0.4)
,('Ishan',1231,'Maths',60,0.6)
,('Atul',1232,'Maths',30,0.3)
,('Ashutosh',1234,'Maths',70,0.7);
WITH cteFilter AS(
    SELECT *, ROW_NUMBER() OVER (PARTITION BY Student_Name, Roll_Number ORDER BY Perc ASC, Marks ASC) rn
    FROM @Student
)
SELECT *
FROM cteFilter
WHERE rn = 1
AND Perc >= 0.5

How do you select a number of random rows from different AgeGroups?

I am trying to create a for loop in Python to connect to Snowflake, since Snowflake does not support loops.
I want to select a number of random rows from different AgeGroups, e.g. 1500 rows from AgeGroup "30-40", 1200 rows from AgeGroup "40-50", and 875 rows from AgeGroup "50-60".
Any ideas how to do it, or an alternative to a loop in Snowflake?
Have you looked at Snowflake's stored procedures? They are JavaScript-based and would allow you to loop natively in Snowflake:
https://docs.snowflake.net/manuals/sql-reference/stored-procedures-overview.html
What do you mean by "Snowflake doesn't have loops"? SQL has "loops" if you can find them...
The following query does what you asked for:
WITH POPULATION AS ( /* 10,000 persons with random age 0-100 */
SELECT 'Person ' || SEQ2() ID, ABS(RANDOM()) % 100 AGE
FROM TABLE(GENERATOR(ROWCOUNT => 10000))
)
SELECT
ID,
AGE,
CASE
WHEN AGE < 30 THEN '0-30'
WHEN AGE < 40 THEN '30-40'
WHEN AGE < 50 THEN '40-50'
WHEN AGE < 60 THEN '50-60'
ELSE '60-100'
END AGE_GROUP,
ROW_NUMBER() OVER (PARTITION BY AGE_GROUP ORDER BY RANDOM()) DRAW_ORDER
FROM POPULATION
QUALIFY DRAW_ORDER <= DECODE(AGE_GROUP, '30-40', 1500, '40-50', 1200, '50-60', 875, 0);
Addendum:
As pointed out by waldente, a simpler and more efficient way is to use SAMPLE:
WITH
POPULATION_30_40 AS (SELECT * FROM POPULATION WHERE AGE >= 30 AND AGE < 40),
POPULATION_40_50 AS (SELECT * FROM POPULATION WHERE AGE >= 40 AND AGE < 50),
POPULATION_50_60 AS (SELECT * FROM POPULATION WHERE AGE >= 50 AND AGE < 60)
SELECT * FROM POPULATION_30_40 SAMPLE(1500 ROWS) UNION ALL
SELECT * FROM POPULATION_40_50 SAMPLE(1200 ROWS) UNION ALL
SELECT * FROM POPULATION_50_60 SAMPLE(875 ROWS)
If you want to draw n random samples from each group you could create a subquery containing a row number that is randomly distributed within each group, and then select the top n rows from each group.
If you have a table like this:
USER   DATE
1      2018-11-04
1      2018-11-04
1      2018-12-07
1      2018-10-09
1      2018-10-09
1      2018-11-07
1      2018-11-09
1      2018-11-09
2      2019-11-02
2      2019-10-02
2      2019-11-03
2      2019-11-06
3      2019-11-10
3      2019-11-13
3      2019-11-15
This query could be used to return two random rows for User 2 and 3, and 3 random rows for user 1:
SELECT User, Date
FROM (
SELECT *, ROW_NUMBER() OVER(PARTITION BY User ORDER BY RANDOM()) as random_row
FROM Users)
WHERE
(User = 3 AND random_row < 3) OR
(User = 2 AND random_row < 3) OR
(User = 1 AND random_row < 4);
So in your case, partition on and filter by age_group instead of User.
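A sketch adapted to the age-group case (people and age_group are placeholder names for your table and column, with the group labels and counts from the question):
SELECT *
FROM (
    -- random, per-group draw order
    SELECT *, ROW_NUMBER() OVER(PARTITION BY age_group ORDER BY RANDOM()) AS random_row
    FROM people
)
WHERE (age_group = '30-40' AND random_row <= 1500)
   OR (age_group = '40-50' AND random_row <= 1200)
   OR (age_group = '50-60' AND random_row <= 875);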
Snowflake has support for random and deterministic table sampling. For Example:
Return a sample of a table in which each row has a 10% probability of being included in the sample:
SELECT * FROM testtable SAMPLE (10);
https://docs.snowflake.net/manuals/sql-reference/constructs/sample.html
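Since the question asks for fixed row counts rather than percentages, SAMPLE also accepts a row count; for example, to pull 875 random rows from a table:
SELECT * FROM testtable SAMPLE (875 ROWS);
Note that the sample is taken from the table as a whole, so to get fixed counts per age group you would still split by group first (as in the addendum above).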

SQL Divide values by a total sum, groupwise

I have a table like:
user_id   operation    amount
1         purchase     10
1         sale         40
2         purchase     100
2         sale         20
2         conversion   15
3         sale         70
4         conversion   40
given by the SQL query:
SELECT
user_id,
operation,
COUNT(item_num) AS amount
FROM MyTable
GROUP BY user_id, operation
I want to calculate, for each user, the percentage of the total amount for each operation, and it would be nice to place them in columns (actually dividing the numbers):
user_id   purchase   sale     conversion
1         10/50      40/50    0/50
2         100/135    20/135   15/135
3         0/70       70/70    0/70
4         0/40       0/40     40/40
EDIT:
Thanks to the intuition given in the responses, I was able to find the solution that suits me best
WITH result
AS
(
SELECT
[user_id],
[operation],
CAST(COUNT([item_num]) AS float) AS amount,
SUM(COUNT([item_num])) over(partition by [user_id]) AS total_amount
FROM Mytable
GROUP BY user_id, operation
)
SELECT
[user_id],
ROUND(ISNULL([purchase], 0) / total_amount, 2) AS purchase,
ROUND(ISNULL([sale], 0) / total_amount, 2) AS sale,
ROUND(ISNULL([conversion], 0) / total_amount, 2) AS conversion
FROM result
PIVOT
(
MAX(amount)
FOR operation IN ([purchase], [sale], [conversion])
) x
ORDER BY user_id
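Assuming the aggregated per-user, per-operation counts match the sample table above, this returns approximately:
user_id   purchase   sale   conversion
1         0.2        0.8    0
2         0.74       0.15   0.11
3         0          1      0
4         0          0      1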
This code will work for you. You will need a CTE with the columns plus a SUM() OVER() calculating the total per user_id.
With that information, all you need to do is pivot the result and apply the desired formatting. If you need the actual division rather than just showing what is being divided, remove the formatting and concatenation.
WITH Sum_Over AS
(
select user_id,operation, amount,sum(amount) over(partition by user_id) AS Total_Sum
from #test
)
SELECT user_id
,CAST(ISNULL([purchase],0) AS VARCHAR(5))+'/'+CAST(Total_Sum AS VARCHAR(5)) AS [purchase]
,CAST(ISNULL([sale],0) AS VARCHAR(5))+'/'+CAST(Total_Sum AS VARCHAR(5)) AS [sale]
,CAST(ISNULL([conversion],0) AS VARCHAR(5))+'/'+CAST(Total_Sum AS VARCHAR(5)) AS [conversion]
FROM Sum_Over
PIVOT (
max(amount)
FOR operation IN ([purchase],[sale],[conversion])
)x
ORDER BY user_id
The answer from Jader is cleaner. This one uses Group By:
With result As
(
SELECT
user_id,
Sum(amount) as tot,
Case When operation = 'purchase' Then
convert(nvarchar(5),sum(amount)) + '/' + convert(nVarChar(50),(Select sum(amount) From MyTable t Where t.user_id = x.user_id))
End As [purchase],
Case When operation = 'sale' Then
convert(nvarchar(5),sum(amount)) + '/' + convert(nVarChar(50),(Select sum(amount) From MyTable t Where t.user_id = x.user_id))
End As [sale],
Case When operation = 'conversion' Then
convert(nvarchar(5),sum(amount)) + '/' + convert(nVarChar(50),(Select sum(amount) From MyTable t Where t.user_id = x.user_id))
End As [conversion]
FROM MyTable x
GROUP BY user_id, operation
)
Select user_id,
isnull(Max(purchase),'0/' + convert(nVarChar(50),sum(tot))) As purchase,
isnull(Max(sale),'0/' + convert(nVarChar(50),sum(tot))) As sale,
isnull(Max(conversion),'0/' + convert(nVarChar(50),sum(tot))) As conversion
From result
Group By user_id
