How to get a column value from the row holding the max of multiple other columns?

I need to get the G value from the row that contains the maximum across columns H, I and J.
In the example below: after grouping, the max value of H, I or J is 170, so I need the value of column G from that row, which is 06/25/2022 07:00:00.
I used the following query. It seems to work, but it returned a lot of missing values after the GROUP BY:
"Select C,MAX(MAX(H),MAX(I),MAX(J)) as d1,G GROUP BY C HAVING H=d1 OR I=d1 OR j=d1"
How do I fix this?

Use a CTE that returns the max of H, I and J for each C like this:
WITH cte AS (
SELECT C, MAX(MAX(H), MAX(I), MAX(J)) max
FROM tablename
GROUP BY C
)
SELECT t.C, t.G
FROM tablename t
WHERE (t.c, MAX(t.H, t.I, t.J)) IN (SELECT C, max FROM cte);
For your sample data, maybe it is more suitable to GROUP BY B.
Or, with ROW_NUMBER() window function:
WITH cte AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY C ORDER BY MAX(H, I, J) DESC) rn
FROM tablename
)
SELECT C, G
FROM cte
WHERE rn = 1;
Or, with FIRST_VALUE() window function:
SELECT DISTINCT C,
FIRST_VALUE(G) OVER (PARTITION BY C ORDER BY MAX(H, I, J) DESC) G
FROM tablename;
See the demo.
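A minimal sketch of the ROW_NUMBER() variant above, run through Python's sqlite3 module so it can be tried locally. The sample rows are made up for illustration (they are not the asker's data), and window functions assume SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tablename (C TEXT, G TEXT, H INTEGER, I INTEGER, J INTEGER)"
)
# Hypothetical sample rows: group 'a' peaks at 170 in column I at 07:00:00.
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?, ?, ?, ?)",
    [
        ("a", "06/25/2022 06:00:00", 100, 120, 90),
        ("a", "06/25/2022 07:00:00", 110, 170, 95),
        ("a", "06/25/2022 08:00:00", 130, 140, 150),
        ("b", "06/26/2022 06:00:00", 10, 20, 30),
        ("b", "06/26/2022 07:00:00", 5, 25, 15),
    ],
)

# MAX(H, I, J) with several arguments is SQLite's scalar (per-row) max.
query = """
WITH cte AS (
  SELECT *, ROW_NUMBER() OVER (
           PARTITION BY C ORDER BY MAX(H, I, J) DESC) AS rn
  FROM tablename
)
SELECT C, G FROM cte WHERE rn = 1 ORDER BY C
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Each group returns exactly one row here because the per-row maxima are distinct within a group; with ties, ROW_NUMBER() would pick one of the tied rows arbitrarily.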

Related

Finding point of interest on a square wave using sql

Good day,
I have a sql table with the following setup:
DataPoints{ DateTime timeStampUtc , bit value}
The points are on a minute interval, and store either a 1(on) or a 0(off).
I need to write a stored procedure to find the points of interest from all the data points.
I have a simplified drawing below:
I need to find the corner points only. Please note that there may be many data points between a value change. For example:
{0,0,0,0,0,0,0,1,1,1,1,0,0,0}
This is my thinking atm (high level)
Select timeStampUtc, Value
From Data Points
Where Value before or value after differs by 1 or -1
I am struggling to convert this concept to SQL, and I also have a feeling there is a more elegant mathematical solution that I am not aware of. This must be a common problem in electronics?
I have wrapped the table in a CTE and joined every row in the CTE to the next row of itself, with the added condition that consecutive rows must differ in value.
This returns all rows where the value changes.
;WITH CTE AS(
SELECT ROW_NUMBER() OVER(ORDER BY TimeStampUTC) AS id, VALUE, TIMESTAMPUTC
FROM DataPoints
)
SELECT CTE.TimeStampUTC as "Time when the value changes", CTE.id, *
FROM CTE
INNER JOIN CTE as CTE2
ON CTE.id = CTE2.id + 1
AND CTE.Value != CTE2.Value
Here's a working fiddle: http://sqlfiddle.com/#!6/a0ddc/3
If I got it correct, you are looking for something like this:
with cte as (
select * from (values (1,0),(2,0),(3,1),(4,1),(5,0),(6,1),(7,0),(8,0),(9,1)) t(a,b)
)
select
min(a), b
from (
select
a, b, sum(c) over (order by a rows unbounded preceding) grp
from (
select
*, iif(b = lag(b) over (order by a), 0, 1) c
from
cte
) t
) t
group by b, grp
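The same "compare each row with the previous one" idea can also be sketched with LAG() alone. The snippet below uses Python's sqlite3 module (SQLite 3.25+ assumed for window functions) and the example sequence from the question; the timestamps are invented for illustration.

```python
import sqlite3

# The example sequence from the question, one point per minute.
values = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE DataPoints (timeStampUtc TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO DataPoints VALUES (?, ?)",
    # Hypothetical timestamps: minute i of an arbitrary hour.
    [(f"2022-01-01 00:{i:02d}:00", v) for i, v in enumerate(values)],
)

# Keep only rows whose value differs from the previous row's value.
# The first row has prev_value NULL, so the comparison is NULL and it is dropped.
query = """
SELECT timeStampUtc, value
FROM (
  SELECT timeStampUtc, value,
         LAG(value) OVER (ORDER BY timeStampUtc) AS prev_value
  FROM DataPoints
) AS t
WHERE value <> prev_value
ORDER BY timeStampUtc
"""
changes = conn.execute(query).fetchall()
print(changes)
```

This returns the first point after each edge. If the last point before the edge is also needed, the same pattern with LEAD() in place of LAG() should give it.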

only display one row when key field is the same

I have created a key field (C) by joining two columns (A & B). I want to run a SQL query that says: if a value in column C is not unique, take only the top row.
Sample data:-
A B C D
10022 Blue 10022Blue Buggy
10300 Red 10300Red Noodle
10300 Red 10300Red Sammy
so I only want one line to show for 10300Red
Cheers
One way to do it is with a cte and ROW_NUMBER():
;WITH CTE AS
(
SELECT A,
B,
C,
D,
ROW_NUMBER() OVER(PARTITION BY C ORDER BY (SELECT NULL)) rn
FROM Table
)
SELECT A, B, C, D
FROM CTE
WHERE rn = 1
Note: You did say you want the "first" record, but you didn't specify the order of the records. Since tables in a relational database are unsorted by nature, "first" is simply an arbitrary row, hence "order by (select null)"
Do it this way:
select distinct A, B, C from tablename
You can find the result set by grouping it, then join it with the main table.
SELECT
A.*
FROM
YourTable A INNER JOIN
(
SELECT
G.C,
MAX(G.D) D
FROM
YourTable G
GROUP BY
G.C
) B ON A.C = B.C AND A.D = B.D
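To make the difference between these answers concrete, here is a small sketch using Python's sqlite3 module and the sample data from the question. The table name items is hypothetical, and ordering by D is an assumption made only so that "first" is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# 'items' is a hypothetical table name; the rows are the question's sample data.
conn.execute("CREATE TABLE items (A INTEGER, B TEXT, C TEXT, D TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?, ?)",
    [
        (10022, "Blue", "10022Blue", "Buggy"),
        (10300, "Red", "10300Red", "Noodle"),
        (10300, "Red", "10300Red", "Sammy"),
    ],
)

# ROW_NUMBER() with an explicit order: keeps the alphabetically first D per key.
first_by_d = conn.execute("""
WITH cte AS (
  SELECT A, B, C, D,
         ROW_NUMBER() OVER (PARTITION BY C ORDER BY D) AS rn
  FROM items
)
SELECT A, B, C, D FROM cte WHERE rn = 1 ORDER BY C
""").fetchall()

# GROUP BY + join: keeps the row whose D equals MAX(D) per key.
max_by_d = conn.execute("""
SELECT x.A, x.B, x.C, x.D
FROM items x
INNER JOIN (SELECT C, MAX(D) AS D FROM items GROUP BY C) m
  ON x.C = m.C AND x.D = m.D
ORDER BY x.C
""").fetchall()

print(first_by_d)
print(max_by_d)
```

Both return one row per key, but they disagree on which row survives for 10300Red (Noodle versus Sammy), so it is worth deciding what "top row" means before picking an approach.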

SQL syntax for complex GROUP BY with OVER statement: calculating Gini coefficient for multiple sets

I want to calculate the Gini coefficient for a number of sets, contained in a two-column table (here called #cits) that holds a value and a set-ID. I have been experimenting with different Gini-coefficient calculations, described here (StackExchange query) and here (StackOverflow question with some good replies). Both of the examples only calculate one coefficient for one table, whereas I would like to do it with a GROUP BY clause.
The #cits table contains two columns, c and cid, being the value and set-ID respectively.
Here is my current try (incomplete):
select count(c) as numC,
sum(c) as totalC,
(select row_number() over(order by c asc, cid) id, c from #cits) as a
from #cits group by cid
selecting numC and totalC works well, of course, but the next line is giving me a headache. I can see that the syntax is wrong, but I can't figure out how to assign the row_number() per c per cid.
EDIT:
Based on the suggestions, I used partition, like so:
select cid,sumC = sum(a.id * a.c)
into #srep
from (
select cid,row_number() over (partition by cid order by c asc) id,
c
from #cits
) as a
group by a.cid
select count(c) as numC,
sum(c) as totalC, b.sumC
into #gtmp
from #cits a
join #srep b
on a.cid = b.cid
group by a.cid,b.sumC
select
gini = 2 * sumC / (totalC * numC) - (numC - 1) / numC
from #gtmp
This almost works. I get a result, but it is >1, which is unexpected, as the Gini-coefficient should be between 0 and 1. As stated in the comments, I would have preferred a one-query solution as well, but it is not a major issue at all.
You can "partition" the data so row numbering would start over for each ID...
but I'm not sure this is what you're after..
I'm assuming you want the CID displayed as you are grouping by it.
select count(c) as numC
, sum(c) as totalC
, row_number() over(partition by cID order by c asc) as a
, cid
from #cits group by cid
Note you don't need the subquery.
Yeah this isn't likely right.
output
NumC TotalC A CID
24 383 1 1
15 232 1 2
If I'm understanding correctly, you need numC and totalC for each C in a cid set, as well as the position of the c inside of that set. This should get you what you need:
select
rn.cid,
rn.c,
row_number() over (partition by rn.cid order by rn.c) as id,
agg.numC,
agg.totalC
from #cits rn
left outer join
(
select
cid,
count(c) as numC,
sum(c) as totalC
from #cits
group by cid
) agg
on rn.cid = agg.cid
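For a one-query version, the partitioned row number can feed straight into a grouped aggregate. The sketch below runs in SQLite via Python's sqlite3 module (the table is named cits without the # prefix, and the data is made up). It assumes the commonly cited formula for values sorted ascending with 1-based rank i: G = 2 * sum(i * x_i) / (n * sum(x_i)) - (n + 1) / n. Note that this uses (n + 1) / n rather than the (numC - 1) / numC in the question, and forces floating-point division; if the columns are integers, integer division may be part of why the result came out above 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cits (c INTEGER, cid INTEGER)")
# Hypothetical data: set 1 is perfectly equal, set 2 has one holder of everything.
conn.executemany(
    "INSERT INTO cits VALUES (?, ?)",
    [(1, 1), (1, 1), (1, 1), (1, 1), (0, 2), (0, 2), (0, 2), (10, 2)],
)

query = """
WITH ranked AS (
  SELECT cid, c,
         ROW_NUMBER() OVER (PARTITION BY cid ORDER BY c) AS id
  FROM cits
)
SELECT cid,
       -- 2.0 and 1.0 force floating-point arithmetic instead of integer division
       2.0 * SUM(id * c) / (COUNT(c) * SUM(c))
         - (COUNT(c) + 1.0) / COUNT(c) AS gini
FROM ranked
GROUP BY cid
ORDER BY cid
"""
gini = dict(conn.execute(query).fetchall())
print(gini)
```

With these rows the equal set comes out at 0 and the fully concentrated set of four at 0.75, which is (n - 1) / n as expected for that case. A set whose values sum to zero would divide by zero and needs separate handling.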

GROUP BY doesn't contain specific column

I have the following statement in MSSQL
SELECT a, b, MAX(t)
FROM table
GROUP BY a, b
What I want is just to show c and d columns for each specific row in the result. How can I do that?
It sounds like you're looking for ROW_NUMBER() or RANK() (the former will ignore ties, the latter will include them), something like:
;With Ranked as (
SELECT a,b,c,d,t,
ROW_NUMBER() OVER (PARTITION BY a,b
ORDER BY t desc) as rn
FROM table
)
SELECT * from Ranked where rn = 1
Which will return one row for each unique combination of the a,b columns, choosing the other values such that they come from the row with the highest t value (and, as I say, this variant ignores ties).
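The tie behaviour mentioned above can be checked with a small sketch, here using Python's sqlite3 module (SQLite 3.25+ assumed). The table name readings and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# 'readings' and its contents are invented for illustration.
conn.execute("CREATE TABLE readings (a TEXT, b TEXT, c TEXT, d TEXT, t INTEGER)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?, ?, ?)",
    [
        ("x", "y", "c1", "d1", 5),  # tied for the highest t in group (x, y)
        ("x", "y", "c2", "d2", 5),  # tied for the highest t in group (x, y)
        ("x", "y", "c3", "d3", 1),
        ("p", "q", "c4", "d4", 7),
    ],
)

def top_rows(ranking_function):
    """Return the rows ranked first per (a, b) using the given window function."""
    # ranking_function is only ever one of the two fixed names used below.
    query = f"""
    SELECT a, b, c, d, t
    FROM (
      SELECT a, b, c, d, t,
             {ranking_function}() OVER (PARTITION BY a, b ORDER BY t DESC) AS rn
      FROM readings
    ) AS ranked
    WHERE rn = 1
    """
    return conn.execute(query).fetchall()

with_row_number = top_rows("ROW_NUMBER")  # one row per group, ties broken arbitrarily
with_rank = top_rows("RANK")              # all tied rows per group

print(len(with_row_number), len(with_rank))
```

ROW_NUMBER() yields two rows (one per group), while RANK() yields three because both tied rows in group (x, y) get rank 1.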

How can I order by count with pagination?

I have to migrate some SQL from PostgreSQL to SQL Server (2005+). On PostgreSQL I had:
select count(id) as count, date
from table
group by date
order by count
limit 10 offset 25
Now I need the same SQL for SQL Server. I tried the query below, but I get the error: Invalid column name 'count'. How can I solve it?
select * from (
select row_number() over (order by count) as row, count(id) as count, date
from table
group by date
) a where a.row >= 25 and a.row < 35
You can't reference an alias by name, at the same scope, except in an ending ORDER BY (it is an invalid reference inside of a windowing function at the same scope).
To get the exact same results, it may need to be extended to (nesting scope for clarity):
SELECT c, d FROM
(
SELECT c, d, ROW_NUMBER() OVER (ORDER BY c) AS row FROM
(
SELECT d = [date], c = COUNT(id) FROM dbo.table GROUP BY [date]
) AS x
) AS y WHERE row >= 25 AND row < 35;
This can be shortened a little bit as per mohan's answer.
SELECT c, d FROM
(
SELECT COUNT(id), [date], ROW_NUMBER() OVER (ORDER BY COUNT(id))
FROM dbo.table GROUP BY [date]
) AS y(c, d, row)
WHERE row >= 25 AND row < 35;
In SQL Server 2012, it's much easier with OFFSET / FETCH - closer to the syntax you're used to, but actually using ANSI-compatible syntax rather than proprietary voodoo.
SELECT c = COUNT(id), d = [date]
FROM dbo.table GROUP BY [date]
ORDER BY COUNT(id)
OFFSET 25 ROWS FETCH NEXT 10 ROWS ONLY;
I blogged about this functionality in 2010 (lots of good comments there too) and should probably invest some time doing some serious performance tests.
And I agree with @ajon - I hope your real tables, columns and queries don't abuse reserved words like this.
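As a rough check of how the nested ROW_NUMBER() form lines up with an offset-based page, here is a sketch in SQLite via Python's sqlite3 module. SQLite's LIMIT / OFFSET stands in for the PostgreSQL original; the table events, the column event_date, and the generated data are all hypothetical, with every date given a distinct count so the ordering is unambiguous.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, event_date TEXT)")
# Hypothetical data: date number n appears n times, so every count is distinct.
conn.executemany(
    "INSERT INTO events (event_date) VALUES (?)",
    [(f"day-{n:02d}",) for n in range(1, 41) for _ in range(n)],
)

# Offset-based page: skip 25 groups, take the next 10.
page_limit = conn.execute("""
SELECT COUNT(id) AS c, event_date AS d
FROM events
GROUP BY event_date
ORDER BY c
LIMIT 10 OFFSET 25
""").fetchall()

# Row-number-based page over the same grouped result.
page_rownum = conn.execute("""
SELECT c, d
FROM (
  SELECT c, d, ROW_NUMBER() OVER (ORDER BY c) AS rn
  FROM (
    SELECT COUNT(id) AS c, event_date AS d
    FROM events
    GROUP BY event_date
  ) AS x
) AS y
WHERE rn > 25 AND rn <= 35
ORDER BY rn
""").fetchall()

print(page_limit == page_rownum)
```

Skipping 25 rows and taking 10 corresponds to row numbers 26 through 35, so the filter that matches is rn > 25 AND rn <= 35; a filter of row >= 25 AND row < 35 starts one row earlier. If several dates share the same count, the order among them is not defined in either form, and a tiebreaker column in the ORDER BY is needed for stable pages.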
It works
DECLARE @startrow int=0,@endrow int=0
;with CTE AS (
select row_number() over ( order by count(id)) as row,count(id) AS count, date
from table
group by date
)
SELECT * FROM CTE
WHERE row between @startrow and @endrow
I think this will do it
select * from (
select row_number() over (order by count(id)) as row, count(id) as count, date
from table
group by date
) a where a.row >= 25 and a.row < 35
Also, I don't know which version of SQL Server you are using, but SQL Server 2012 has a new paging feature (OFFSET / FETCH).
