T-SQL select rows where [col] = MIN([col]) - sql-server

I have a data set produced from a UNION query that aggregates data from 2 sources.
I want to select that data based on whether or not data was found in only of those sources,or both.
The data relevant parts of the set looks like this, there are a number of other columns:
row
preference
group
position
1
1
111
1
2
1
111
2
3
1
111
3
4
1
135
1
5
1
135
2
6
1
135
3
7
2
111
1
8
2
135
1
The [preference] column combined with the [group] column is what I'm trying to filter on, I want to return all the rows that have the same [preference] as the MIN([preference]) for each [group]
The desired output given the data above would be rows 1 -> 6
The [preference] column indicates the original source of the data in the UNION query so a legitimate data set could look like:
row
preference
group
position
1
1
111
1
2
1
111
2
3
1
111
3
4
2
111
1
5
2
135
1
In which case the desired output would be rows 1,2,3, & 5
What I can't work out is how to do (not real code):
SELECT * WHERE [preference] = MIN([preference]) PARTITION BY [group]

One way to do this is using RANK:
SELECT row
, preference
, [group]
, position
FROM (
SELECT row
, preference
, [group]
, position
, RANK() OVER (PARTITION BY [group] ORDER BY preference) AS seq
FROM t) t2
WHERE seq = 1
Demo here

Should by doable via simple inner join:
SELECT t1.*
FROM t AS t1
INNER JOIN (SELECT [group], MIN(preference) AS preference
FROM t
GROUP BY [group]
) t2 ON t1.[group] = t2.[group]
AND t1.preference = t2.preference

Related

Classifying rows into a grouping column that shows the data is related to prior rows

I have a set of data that I want to classify into groups based on a prior record id existing on the newer rows. The initial record of the group has a prior sequence id = 0.
The data is as follows:
customer id
sequence id
prior_sequence id
1
1
0
1
2
1
1
3
2
2
4
0
2
5
4
2
6
0
2
7
6
Ideally, I would like to create the following grouping column and yield the following results:
customer id
sequence id
prior sequence id
grouping
1
1
0
1
1
2
1
1
1
3
2
1
2
4
0
2
2
5
4
2
2
6
0
3
2
7
6
3
I've attempted to utilize island gap logic utilizing the ROW_NUMBER() function. However, I have been unsuccessful in doing so. I suspect the need here is more along the lines of a recursive CTE, which I am attempting at the moment.
I agree that a recursive CTE will do the job. Something like:
WITH reccte AS
(
/*query that determines starting point for recursion
*
* In this case we want all records with no prior_sequence_id
*/
SELECT
customer_id,
sequence_id,
prior_sequence_id,
/*establish grouping*/
ROW_NUMBER() OVER (ORDER BY sequence_id) as grouping
FROM yourtable
WHERE prior_sequence_id = 0
UNION
/*join the recursive CTe back to the table and iterate*/
SELECT
yourtable.customer_id,
yourtable.sequence_id,
yourtable.prior_sequence_id,
reccte.grouping
FROM reccte
INNER JOIN yourtable ON reccte.sequence_id = yourtable.prior_sequence_id
)
SELECT * FROM reccte;
It looks like you could use a simple correlated query, at least given your sample data:
select *, (
select Sum(Iif(prior_sequence_id = 0, 1, 0))
from t t2
where t2.sequence_id <= t.sequence_id
) Grouping
from t;
See Example Fiddle

Increment column values based on two columns in Oracle database

I have a table Test , that has below structure:
Id CID RO Other Columns
1 111 2
2 111 1
3 111 6
4 111 6
5 111 8
6 111 5
7 101 4
8 101 4
9 101 3
Resultant order in RO should be like below:
-> For One CID and ascending order of RO should get order (RO) replaced with 1,2,3,4 and so on
Final Order in RO column:
(RO column's value got replaced)
Id CID RO (New) RO Other Columns
1 111 2 2
2 111 1 1
3 111 6 4
4 111 6 5
5 111 8 6
6 111 5 3
7 101 4 2
8 101 4 3
9 101 3 1
There are hundreds of cids like that in table. Please let me know if this can be achieved in single query using some Oracle function or some procedure needs to be written. Any lead or example would be helpful.
Thanks
The NEW_RO column can be calculated with the analytic function ROW_NUMBER():
select ... ,
row_number() over (partition by cid order by ro) as new_ro [, ...]
In your data, there are ties for RO within the same CID. Do you care, in that case, which row gets what NEW_RO value? If, for example, in the case of same RO you also want to (further) order by ID, you can change the above to
select ... ,
row_number() over (partition by cid order by ro, id) as new_ro [, ...]
EDIT: I missed the fact that you need to UPDATE the RO values with the NEW_RO values. Analytic functions can't be used in an UPDATE statement (not directly anyway); the MERGE statement is the perfect alternative for this:
merge into test
using ( select id,
row_number() over (partition by cid order by ro, id) as new_ro
from test
) s
on (test.id = s.id)
when matched then update set ro = s.new_ro
;
Addressing the follow up question in the comment on #mathguy's answer. If you've got a query that is producing the new values you want, and want to quickly write an update, I like to use merge:
MERGE INTO your_table target
USING (your_query_here) source
ON (Target.ID = Source.ID)
WHEN MATCHED THEN
UPDATE SET Target.column = Source.new_value
https://docs.oracle.com/cd/B28359_01/server.111/b28286/statements_9016.htm#SQLRF01606
MERGE can do more than that, but I've found it handy in this "correct the data" situation.

How to select the value from the table based on category_id USING SQL SERVER

How to select the value from the table based on category_id?
I have a table like this. Please help me.
Table A
ID Name category_id
-------------------
1 A 1
2 A 1
3 B 1
4 C 2
5 C 2
6 D 2
7 E 3
8 E 3
9 F 3
How to get the below mentioned output from table A?
ID Name category_id
--------------------
1 A 1
2 A 1
4 C 2
5 C 2
7 E 3
8 E 3
Give a row number for each row based on group by category_id and sort by ascending order of ID. Then select the rows having row number 1 and 2.
Query
;with cte as (
select [rn] = row_number() over(
partition by [category_id]
order by [ID]
), *
from [your_table_name]
)
select [ID], [Name], [category_id]
from cte
where [rn] < 3;
Kindly run this query It really help You Out.
SELECT tbl.id,tbl.name, tbl.category_id FROM TableA as tbl WHERE
tbl.name IN(SELECT tbl2.name FROM TableA tbl2 GROUP BY tbl2.name HAVING Count(tbl2.name)> 1)
Code select all category_id from TableA which has Name entries more then one. If there is single entry of any name group by category_id then such data will be excluded. In above example questioner want to eliminate those records that have single Name entity like wise category_id 1 has name entries A and B among which A has two entries and B has single entry so he want to eliminate B from result set.

Performance issue with CTE SQL Server query

We have a table with a parent child relationship, that represents a deep tree structure.
We are using a view with a CTE to query the data but the performance is poor (see code and execution plan below).
Is there any way we can improve the performance?
WITH cte (ParentJobTypeId, Id) AS
(
SELECT
Id, Id
FROM
dbo.JobTypes
UNION ALL
SELECT
e.Id, cte.Id
FROM
cte
INNER JOIN
dbo.JobTypes AS e ON e.ParentJobTypeId = cte.ParentJobTypeId
)
SELECT
ISNULL(Id, 0) AS ParentJobTypeId,
ISNULL(ParentJobTypeId, 0) AS Id
FROM
cte
A quick example of using the range keys. As I mentioned before, hierarchies were 127K points and some sections where 15 levels deep
The cte Builds, let's assume the hier results will be will be stored in a table (indexed as well)
Declare #Table table(ID int,ParentID int,[Status] varchar(50))
Insert #Table values
(1,101,'Pending'),
(2,101,'Complete'),
(3,101,'Complete'),
(4,102,'Complete'),
(101,null,null),
(102,null,null)
;With cteOH (ID,ParentID,Lvl,Seq)
as (
Select ID,ParentID,Lvl=1,cast(Format(ID,'000000') + '/' as varchar(500)) from #Table where ParentID is null
Union All
Select h.ID,h.ParentID,cteOH.Lvl+1,Seq=cast(cteOH.Seq + Format(h.ID,'000000') + '/' as varchar(500)) From #Table h INNER JOIN cteOH ON h.ParentID = cteOH.ID
),
cteR1 as (Select ID,Seq,R1=Row_Number() over (Order by Seq) From cteOH),
cteR2 as (Select A.ID,R2 = max(B.R1) From cteOH A Join cteR1 B on (B.Seq Like A.Seq+'%') Group By A.ID)
Select B.R1
,C.R2
,A.Lvl
,A.ID
,A.ParentID
Into #TempHier
From cteOH A
Join cteR1 B on (A.ID=B.ID)
Join cteR2 C on (A.ID=C.ID)
Select * from #TempHier
Select H.R1
,H.R2
,H.Lvl
,H.ID
,H.ParentID
,Total = count(*)
,Complete = sum(case when D.Status = 'Complete' then 1 else 0 end)
,Pending = sum(case when D.Status = 'Pending' then 1 else 0 end)
,PctCmpl = format(sum(case when D.Status = 'Complete' then 1.0 else 0.0 end)/count(*),'##0.00%')
From #TempHier H
Join (Select _R1=B.R1,A.* From #Table A Join #TempHier B on A.ID=B.ID) D on D._R1 between H.R1 and H.R2
Group By H.R1
,H.R2
,H.Lvl
,H.ID
,H.ParentID
Order By 1
Returns the hier in a #Temp table for now. Notice the R1 and R2, I call these the range keys. Data (without recursion) can be selected and aggregated via these keys
R1 R2 Lvl ID ParentID
1 4 1 101 NULL
2 2 2 1 101
3 3 2 2 101
4 4 2 3 101
5 6 1 102 NULL
6 6 2 4 102
VERY SIMPLE EXAMPLE: Illustrates the rolling the data up the hier.
R1 R2 Lvl ID ParentID Total Complete Pending PctCmpl
1 4 1 101 NULL 4 2 1 50.00%
2 2 2 1 101 1 0 1 0.00%
3 3 2 2 101 1 1 0 100.00%
4 4 2 3 101 1 1 0 100.00%
5 6 1 102 NULL 2 1 0 50.00%
6 6 2 4 102 1 1 0 100.00%
The real beauty of the the range keys, is if you know an ID, you know where it exists (all descendants and ancestors).

Get Sum of Count

The View obtains the first three columns. I need to add one more column (totalCount) to the view that obtains the total count:
CId CCId CCount totalCount
1 a 3 6
1 a 3 6
1 b 3 6
1 c 3 6
2 b 2 6
2 b 2 6
2 a 2 6
2 a 2 6
3 v 1 6
How to get the totalCount as 6?
(Business rule for Cid=1 Ccount=3 Cid=2 Ccount=2 Cid=3 Ccount=1 So the totalCount =3+2+1 =6)
SELECT a.CID, a.CCID, a.CCOUNT,
b.TotalCount
FROM Table1 a, (SELECT SUM(DISTINCT cCOunt) TotalCount
FROM Table1) b
SQLFiddle Demo
UPDATE
As Andomar pointed out on the comment, An update has been made on the query,
SELECT a.CID, a.CCID, a.CCOUNT,
b.TotalCount
FROM Table1 a,
(
SELECT SUM(TotalCount) TotalCount
FROM
(
SELECT MAX(cCOunt) TotalCount
FROM Table1
GROUP BY CId
) c
) b
SQLFiddle Demo
With this code I came to the desired result:
select CId
,CCId
,CCount
,(select SUM(a.tcount)
from (select distinct CId ,CCount as tcount
from dbo.Test) as a ) totalcount
from dbo.Test
From your example data, I'm assuming a Cid can only have one, possibly repeated, value of CCount. In that case you can pick a random one (say max) using a group by, and sum those:
select sum(OneCCCount) as TotalCount
from (
select max(CCount) as OneCCCount
from YourTable
group by
CId
) as SubQueryAlias

Resources