How do I group these values - sql-server

I have a person's race, gender, and age range, and I have to take:
Race 1 - Gender 1 - Age Range
Race 1 - Gender 2 - Age Range
Race 2 - Gender 1 - Age Range
Race 2 - Gender 2 - Age Range
and turn it into:
Group # | Average Age
Group 1 | 20-30
Group 2 | 40-50
Group 3 | 30-40
Group 4 | 40-50
The age is entered as a range (20-30, 30-40, 40-50), so I have to find the most frequent string for each group, but I don't know how to tie it all together into 2 columns and 4 rows. I'm still new and would like to learn. Can anyone explain how I can do this?

I'm not quite clear on your table structure but perhaps something like this would work.
select GroupType, age
from (select race + gender as GroupType , age, count(*) as frequency,
ROW_NUMBER() OVER (PARTITION BY race + gender ORDER BY COUNT(*) DESC) as seqnum
from tbl
group by race + gender, age) g
where seqnum = 1
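
To try the ROW_NUMBER() idea above without a SQL Server instance, here is a minimal sketch that runs an equivalent query in SQLite through Python's sqlite3 module. The sample rows, the R1/R2 and M/F labels, the '-' separator, and the tie-break on age are assumptions made for illustration; the table and column names follow the answer. The counts are computed in an inner query first and ranked in an outer one, which is a choice made here to keep the query portable.

```python
import sqlite3

def most_common_age_range(rows):
    """Return (group, age_range) pairs: the most frequent age range per race/gender group."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE tbl (race TEXT, gender TEXT, age TEXT)")
    con.executemany("INSERT INTO tbl VALUES (?, ?, ?)", rows)
    query = """
        SELECT GroupType, age
        FROM (SELECT GroupType, age,
                     -- rank age ranges inside each group, most frequent first;
                     -- ties are broken by the age string (an assumption)
                     ROW_NUMBER() OVER (PARTITION BY GroupType
                                        ORDER BY frequency DESC, age) AS seqnum
              FROM (SELECT race || '-' || gender AS GroupType, age,
                           COUNT(*) AS frequency
                    FROM tbl
                    GROUP BY race, gender, age) counted) ranked
        WHERE seqnum = 1
        ORDER BY GroupType
    """
    result = con.execute(query).fetchall()
    con.close()
    return result

# Hypothetical sample data: (race, gender, age range)
sample = [
    ("R1", "M", "20-30"), ("R1", "M", "20-30"), ("R1", "M", "30-40"),
    ("R1", "F", "40-50"),
    ("R2", "M", "30-40"), ("R2", "M", "30-40"), ("R2", "M", "40-50"),
    ("R2", "F", "40-50"), ("R2", "F", "40-50"),
]
print(most_common_age_range(sample))
```

Each group contributes exactly one row, so four race/gender combinations give the 2 columns and 4 rows asked for in the question.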

Related

How to consolidate rows in table for the given scenario?

Let's say I have a table
CustId Name Age Gender Business Code
1 John 24 Male Automobiles 1
2 Peter 30 Male Space 3
2 Peter 30 Male IT null
3 Kris 48 Female Infra null
I need output as follows
CustId Name Age Gender Business Code
1 John 24 Male Automobiles 1
2 Peter 30 Male Space 3
3 Kris 48 Female CodeNotAvailable null
Peter has two businesses, one with a code and one without. So, the row without a code is removed.
Kris has a business without a code, so CodeNotAvailable needs to be displayed in the Business column.
We can use ROW_NUMBER() to number the rows per customer and pick the first one. In SQL Server, NULL sorts first in ascending order, so we order by Code DESC to get the non-null value as the first row in ROW_NUMBER(). A CASE expression then replaces the business name when the surviving row has no code.
SELECT CustId, Name, Age, Gender,
CASE WHEN Code IS NULL THEN 'CodeNotAvailable' ELSE Business END AS Business,
Code
FROM
(
SELECT *, ROW_NUMBER() OVER(PARTITION BY CustId ORDER BY Code DESC) AS rnk
FROM myTab) AS t
WHERE rnk = 1
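;with r as (
select CustId, Name, Age, Gender,
case when Code is null then 'CodeNotAvailable' else Business end as Business,
Code
from myTab t
-- keep a row without a code only when the customer has no row with a code
where Code is not null
or not exists (select 1 from myTab m where m.CustId = t.CustId and m.Code is not null)
)
select max(CustId) CustId, Name, Age, Gender, Business, Code
from r
group by Name, Age, Gender, Business, Code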
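
As a rough check of the ROW_NUMBER() approach, the sketch below loads the four sample rows from the question into an in-memory SQLite database and keeps one row per customer. SQLite stands in for SQL Server here, which is an assumption; it also sorts NULL last under ORDER BY ... DESC, so the ordering trick carries over. The CASE expression that fills in CodeNotAvailable is part of this sketch.

```python
import sqlite3

def consolidate_customers():
    """Keep one row per CustId, preferring the row that has a Code."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE myTab (CustId INT, Name TEXT, Age INT, "
        "Gender TEXT, Business TEXT, Code INT)"
    )
    # Sample rows copied from the question; None becomes SQL NULL.
    con.executemany("INSERT INTO myTab VALUES (?, ?, ?, ?, ?, ?)", [
        (1, "John", 24, "Male", "Automobiles", 1),
        (2, "Peter", 30, "Male", "Space", 3),
        (2, "Peter", 30, "Male", "IT", None),
        (3, "Kris", 48, "Female", "Infra", None),
    ])
    query = """
        SELECT CustId, Name, Age, Gender,
               CASE WHEN Code IS NULL THEN 'CodeNotAvailable'
                    ELSE Business END AS Business,
               Code
        FROM (SELECT *,
                     -- non-NULL codes sort before NULL in descending order
                     ROW_NUMBER() OVER (PARTITION BY CustId
                                        ORDER BY Code DESC) AS rnk
              FROM myTab) t
        WHERE rnk = 1
        ORDER BY CustId
    """
    result = con.execute(query).fetchall()
    con.close()
    return result

for row in consolidate_customers():
    print(row)
```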

How to order by one column, but rank based on a different column that is not numeric?

I have four columns that I am trying to rank. They need to be grouped by employee ID and then listed low to high by order number. Once everything is in order, I want the rank of where each city falls in that order. If the same city is listed right after itself for the same employee, then I want those rows ranked the same.
An example of the table is below. The order is correct, but the ranking is not for what I'm trying to do.
Name Employee_ID Order_Number City Rank
John 1 1 Boston 1
John 1 2 Boston 2
Will 2 1 Peabody 1
Will 2 2 Weston 2
Will 2 3 Newton 3
select Name, Employee_ID, Order_Number, City,
dense_rank() over(partition by Employee_ID order by Order_Number) as rank
from #Employee
The results I actually want are:
Name Employee_ID Order_Number City Rank
John 1 1 Boston 1
John 1 2 Boston 1
Will 2 1 Peabody 1
Will 2 2 Weston 2
Will 2 3 Newton 3
Then I would eventually remove the duplicate cities to end up with:
Name Employee_ID Order_Number City Rank
John 1 1 Boston 1
Will 2 1 Peabody 1
Will 2 2 Weston 2
Will 2 3 Newton 3
You can try the following script to get your desired output.
SELECT Name, Employee_ID, Order_Number, City,
ROW_NUMBER() OVER (PARTITION BY Employee_ID ORDER BY Order_Number) rank
FROM
(
-- rank = 1 marks the first row of each city per employee
select Name, Employee_ID, Order_Number, City,
dense_rank() over(partition by Employee_ID, City order by Order_Number) as rank
from #Employee
)A
WHERE rank = 1
Output for your sample data:
Name Employee_ID Order_Number City rank
John 1 1 Boston 1
Will 2 1 Peabody 1
Will 2 2 Weston 2
Will 2 3 Newton 3
You can use LAG() to check whether the previous city is the same. If the previous city is different or NULL, we take the rank as it is; if the cities are the same, rank - 1 gives the same number as the row above.
with cte as (select Name, Employee_ID, Order_Number, City,
dense_rank() over (partition by Employee_ID order by Order_Number) as rank,
lag(City) over (partition by Employee_ID order by Order_Number) as previousCity
from #Employee)
select
Name, Employee_ID, Order_Number, City,
case when previousCity = city then rank - 1
else rank end as rank
from cte
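
The rank - 1 adjustment covers one repeat, but a run of three identical cities, or any city that follows a repeated pair, would still be numbered from the raw dense_rank. One possible generalization, sketched here as an assumption and not taken from the answers, is to flag each row where the city changes and take a running sum of those flags. The sketch runs in SQLite through Python's sqlite3 module; the table is named Employee because SQLite has no #temp tables, and the employee "Ann" is a hypothetical addition that exercises a run of three.

```python
import sqlite3

def rank_city_runs(rows):
    """Rank cities per employee so that consecutive repeats share a rank."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE Employee (Name TEXT, Employee_ID INT, "
        "Order_Number INT, City TEXT)"
    )
    con.executemany("INSERT INTO Employee VALUES (?, ?, ?, ?)", rows)
    query = """
        WITH flagged AS (
            SELECT Name, Employee_ID, Order_Number, City,
                   -- 1 when the city differs from the previous row (or there is none)
                   CASE WHEN LAG(City) OVER (PARTITION BY Employee_ID
                                             ORDER BY Order_Number) = City
                        THEN 0 ELSE 1 END AS changed
            FROM Employee
        )
        SELECT Name, Employee_ID, Order_Number, City,
               -- the running count of changes is the rank of the current run
               SUM(changed) OVER (PARTITION BY Employee_ID
                                  ORDER BY Order_Number) AS city_rank
        FROM flagged
        ORDER BY Employee_ID, Order_Number
    """
    result = con.execute(query).fetchall()
    con.close()
    return result

rows = [
    ("John", 1, 1, "Boston"), ("John", 1, 2, "Boston"),
    ("Will", 2, 1, "Peabody"), ("Will", 2, 2, "Weston"), ("Will", 2, 3, "Newton"),
    # Hypothetical employee: three identical cities in a row, then a new one
    ("Ann", 3, 1, "Salem"), ("Ann", 3, 2, "Salem"),
    ("Ann", 3, 3, "Salem"), ("Ann", 3, 4, "Lynn"),
]
for row in rank_city_runs(rows):
    print(row)
```

Filtering the flagged rows on changed = 1 would then give the deduplicated list from the question.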

How to aggregate over two columns with a condition on the secondary one?

Beginner question - not sure how to phrase the question(s) succinctly (hence why I failed to find something similar in the archive?), so let's go with a fictitious example:
1) You own a hotel chain and want to find out how many rooms, in total over all of your hotels, that are occupied by adult males only.
2) The same as above, but for whatever reason you also want to find out the number of rooms occupied by single males and by 2+ males respectively.
You have a table containing, among other things, the following columns:
| guest_ID | hotel_name | room_ID | gender |
So the [guest_ID] is some ID column with a unique ID for every person staying at the hotel. The [hotel_name] is the name of each branch and should be irrelevant in the query. [room_ID] is the room number in each respective branch but is unique, so room 237 in The Pink Flamingo has a different ID than room 237 in The Stanley, an int or whatever. And let's say [gender] can take on the values 'male', 'female' or 'child'.
I want to make sure I don't accidentally pick up guest rows from 'mixed' rooms in the result.
how many rooms, in total over all of your hotels, that are occupied by adult males only
SELECT
COUNT(*) AS TOTAL_ROOMS
FROM
(
SELECT
room_ID
FROM
MyTable
OUTER APPLY
(
SELECT CASE WHEN gender = 'male' THEN 1 ELSE 0 END AS IS_MALE
) AS T
GROUP BY
room_ID
HAVING
SUM(T.IS_MALE)/COUNT(*) = 1
) AS ROOMS
The same as above, but for whatever reason you also want to find out the number of rooms occupied by single males
SELECT
COUNT(*) AS TOTAL_ROOMS
FROM
(
SELECT
room_ID
FROM
MyTable
OUTER APPLY
(
SELECT CASE WHEN gender = 'male' THEN 1 ELSE 0 END AS IS_MALE
) AS T
GROUP BY
room_ID
HAVING
SUM(T.IS_MALE)/COUNT(*) = 1 AND COUNT(*) = 1
) AS ROOMS
and by 2+ males
SELECT
COUNT(*) AS TOTAL_ROOMS
FROM
(
SELECT
room_ID
FROM
MyTable
OUTER APPLY
(
SELECT CASE WHEN gender = 'male' THEN 1 ELSE 0 END AS IS_MALE
) AS T
GROUP BY
room_ID
HAVING
SUM(T.IS_MALE)/COUNT(*) = 1 AND COUNT(*) >= 2
) AS ROOMS
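
The three queries above differ only in the HAVING clause, so the counts can also be collected in a single pass. The sketch below is one way to do that, assuming SQLite (via Python's sqlite3) as a stand-in for SQL Server and using made-up guest rows; only the table and column names come from the question. The inner query keeps the male-only rooms together with their guest count, and the outer query splits them into single and 2+ occupancy.

```python
import sqlite3

def male_only_room_counts(guests):
    """Return (male_only_rooms, single_male_rooms, multi_male_rooms)."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE MyTable (guest_ID INT, hotel_name TEXT, "
        "room_ID INT, gender TEXT)"
    )
    con.executemany("INSERT INTO MyTable VALUES (?, ?, ?, ?)", guests)
    query = """
        SELECT COUNT(*) AS male_only,
               SUM(CASE WHEN guest_count = 1 THEN 1 ELSE 0 END) AS single_male,
               SUM(CASE WHEN guest_count >= 2 THEN 1 ELSE 0 END) AS multi_male
        FROM (SELECT room_ID, COUNT(*) AS guest_count
              FROM MyTable
              GROUP BY room_ID
              -- a room qualifies only if every guest in it is male
              HAVING SUM(CASE WHEN gender = 'male' THEN 1 ELSE 0 END) = COUNT(*)
             ) rooms
    """
    result = con.execute(query).fetchone()
    con.close()
    return result

# Hypothetical guests: (guest_ID, hotel_name, room_ID, gender)
guests = [
    (1, "The Stanley", 1, "male"),
    (2, "The Stanley", 2, "male"), (3, "The Stanley", 2, "male"),
    (4, "The Stanley", 3, "male"), (5, "The Stanley", 3, "female"),
    (6, "The Pink Flamingo", 4, "female"), (7, "The Pink Flamingo", 4, "child"),
    (8, "The Pink Flamingo", 5, "male"), (9, "The Pink Flamingo", 5, "male"),
    (10, "The Pink Flamingo", 5, "male"),
    (11, "The Pink Flamingo", 6, "male"),
]
print(male_only_room_counts(guests))
```

Comparing the male count with COUNT(*) directly avoids the integer division used in the HAVING clauses above while selecting the same rooms.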

Find and replace rows with similar value in one column in Oracle SQL

I want to find the rows which are similar to each other, and replace them with a new row. My table looks like this:
OrderID | Price | Minimum Number | Maximum Number | Volume
1 45 2 10 250
2 46 2 10 250
3 60 2 10 250
"Similar" in this context means that the rows that have same Maximum Number, Minimum Number, and Volume. Prices can be different, but the difference can be at most 2.
In this example, orders with OrderID of 1 and 2 are similar, but 3 is not (since even if it has same Minimum Number, Maximum Number, and Volume, its price is not within 2 units from orders 1 and 2).
Then, I want orders 1 and 2 to be replaced by a new order, let's say OrderID 4, which has the same Minimum Number and Maximum Number. Its Volume has to be the sum of the volumes of the orders it is replacing. Its Price can be the Price of any of the previous orders that will be deleted in the output table (45 or 46 in this example). So, the output for the example above would be:
OrderID | Price | Minimum Number | Maximum Number | Volume
4 45 2 10 500
3 60 2 10 250
Here is a way to do this in SQL Server 2012 or Oracle. The idea is to use lag() to find where groups should begin and end and then aggregate.
select min(OrderID) as OrderID, min(Price) as Price, MinimumNumber, MaximumNumber, sum(Volume) as Volume
from (select t.*,
-- a running count of price jumps larger than 2 numbers the groups
sum(case when prev_price < Price - 2 then 1 else 0 end) over
(partition by MinimumNumber, MaximumNumber, Volume order by Price) as grp
from (select t.*,
lag(Price) over (partition by MinimumNumber, MaximumNumber, Volume
order by Price
) as prev_price
from orders t -- "orders" is a placeholder table name
) t
) t
-- Price is left out of the grouping so that similar orders collapse into one row
group by grp, MinimumNumber, MaximumNumber, Volume;
The only issue is the setting of the id. I'm not sure what the exact rule is for that.
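
To see the lag()-based grouping at work on the three sample orders, here is a small sketch that runs it in SQLite through Python's sqlite3 module. SQLite in place of SQL Server or Oracle, the table name orders, and the column names without spaces are assumptions. The sketch groups by grp and the matching columns only, so orders 1 and 2 collapse into one row, and it keeps the smallest OrderID of each group since the rule for the new id is left open.

```python
import sqlite3

def merge_similar_orders(orders):
    """Merge orders that share min/max/volume and whose prices chain within 2."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE orders (OrderID INT, Price INT, MinimumNumber INT, "
        "MaximumNumber INT, Volume INT)"
    )
    con.executemany("INSERT INTO orders VALUES (?, ?, ?, ?, ?)", orders)
    query = """
        SELECT MIN(OrderID) AS OrderID, MIN(Price) AS Price,
               MinimumNumber, MaximumNumber, SUM(Volume) AS Volume
        FROM (SELECT t.*,
                     -- each price jump larger than 2 starts a new group
                     SUM(CASE WHEN prev_price < Price - 2 THEN 1 ELSE 0 END) OVER
                         (PARTITION BY MinimumNumber, MaximumNumber, Volume
                          ORDER BY Price) AS grp
              FROM (SELECT o.*,
                           LAG(Price) OVER
                               (PARTITION BY MinimumNumber, MaximumNumber, Volume
                                ORDER BY Price) AS prev_price
                    FROM orders o) t) g
        GROUP BY grp, MinimumNumber, MaximumNumber, Volume
        ORDER BY OrderID
    """
    result = con.execute(query).fetchall()
    con.close()
    return result

# Sample rows from the question: (OrderID, Price, Min, Max, Volume)
sample_orders = [(1, 45, 2, 10, 250), (2, 46, 2, 10, 250), (3, 60, 2, 10, 250)]
for row in merge_similar_orders(sample_orders):
    print(row)
```

Because each price is compared with its immediate predecessor, prices such as 45, 47 and 49 would chain into a single group even though 45 and 49 differ by more than 2; whether that is acceptable depends on how "similar" is meant.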

Take average of only most recent group

There's one table named StudentScore which has the fields Score, CourseID, StudentID and Semester. The latter three form the primary key.
I want to write a stored procedure to get the average score of each student. But the rule is quite complex and I don't know how to express it in one query. Nested queries should be avoided if possible.
Here is the rule:
If a student takes a course more than once, only the last score should be counted.
For example, given the following data:
StudentID | CourseID | Semester | Score
1 1 1 80
1 2 1 40
1 3 1 60
1 2 2 50
1 3 2 20
2 1 1 90
The stored procedure should return:
StudentID | AvgScore
1 50 // which is avg(80, 50, 20)
2 90
Please suggest a stored procedure that is as efficient as possible. Thanks!
;WITH x AS
(
SELECT StudentID, Score, rn = ROW_NUMBER() OVER
(PARTITION BY StudentID, CourseID
ORDER BY Semester DESC)
FROM dbo.StudentScore
)
SELECT StudentID, AvgScore = AVG(Score)
FROM x
WHERE rn = 1
GROUP BY StudentID;
If you want something rounded to certain decimal places, maybe:
;WITH x AS
(
SELECT StudentID, Score = 1.0*Score, rn = ROW_NUMBER() OVER
(PARTITION BY StudentID, CourseID
ORDER BY Semester DESC)
FROM dbo.StudentScore
)
SELECT StudentID, AvgScore = CONVERT(DECIMAL(10,2), AVG(Score))
FROM x
WHERE rn = 1
GROUP BY StudentID;
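
The same latest-attempt logic can be tried on the sample data from the question with the sketch below, which uses SQLite through Python's sqlite3 module as a stand-in for SQL Server (an assumption). The alias syntax is changed to AS because the rn = ... form is specific to T-SQL. SQLite's AVG always returns a floating-point value, whereas in SQL Server AVG over an integer column returns an integer, which is why the second query above multiplies by 1.0.

```python
import sqlite3

def average_latest_scores(scores):
    """Average each student's scores, counting only the latest attempt per course."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE StudentScore (StudentID INT, CourseID INT, "
        "Semester INT, Score INT, "
        "PRIMARY KEY (StudentID, CourseID, Semester))"
    )
    con.executemany("INSERT INTO StudentScore VALUES (?, ?, ?, ?)", scores)
    query = """
        WITH x AS (
            SELECT StudentID, Score,
                   -- rn = 1 is the most recent semester for this student and course
                   ROW_NUMBER() OVER (PARTITION BY StudentID, CourseID
                                      ORDER BY Semester DESC) AS rn
            FROM StudentScore
        )
        SELECT StudentID, AVG(Score) AS AvgScore
        FROM x
        WHERE rn = 1
        GROUP BY StudentID
        ORDER BY StudentID
    """
    result = con.execute(query).fetchall()
    con.close()
    return result

# Sample rows from the question: (StudentID, CourseID, Semester, Score)
scores = [
    (1, 1, 1, 80), (1, 2, 1, 40), (1, 3, 1, 60),
    (1, 2, 2, 50), (1, 3, 2, 20),
    (2, 1, 1, 90),
]
print(average_latest_scores(scores))
```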
