Increment column values based on two columns in Oracle database - database

I have a table Test that has the structure below:
Id CID RO Other Columns
1 111 2
2 111 1
3 111 6
4 111 6
5 111 8
6 111 5
7 101 4
8 101 4
9 101 3
The resulting order in RO should be as follows:
-> Within each CID, rows are taken in ascending order of RO, and RO is replaced with 1, 2, 3, 4 and so on.
Final order in the RO column:
(the RO column's values get replaced)
Id CID RO (New) RO Other Columns
1 111 2 2
2 111 1 1
3 111 6 4
4 111 6 5
5 111 8 6
6 111 5 3
7 101 4 2
8 101 4 3
9 101 3 1
There are hundreds of CIDs like that in the table. Please let me know whether this can be achieved in a single query using some Oracle function, or whether a procedure needs to be written. Any lead or example would be helpful.
Thanks

The NEW_RO column can be calculated with the analytic function ROW_NUMBER():
select ... ,
row_number() over (partition by cid order by ro) as new_ro [, ...]
In your data, there are ties for RO within the same CID. Do you care, in that case, which row gets what NEW_RO value? If, for example, in the case of same RO you also want to (further) order by ID, you can change the above to
select ... ,
row_number() over (partition by cid order by ro, id) as new_ro [, ...]
EDIT: I missed the fact that you need to UPDATE the RO values with the NEW_RO values. Analytic functions can't be used in an UPDATE statement (not directly anyway); the MERGE statement is the perfect alternative for this:
merge into test
using ( select id,
row_number() over (partition by cid order by ro, id) as new_ro
from test
) s
on (test.id = s.id)
when matched then update set ro = s.new_ro
;
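To check the renumbering against the question's expected values, here is a minimal sketch in Python using the standard-library sqlite3 module. It is not Oracle: SQLite has no MERGE, so the sketch computes NEW_RO with the same ROW_NUMBER() expression and then applies it with a plain parameterised UPDATE. The function name renumber is made up for illustration, and window functions need SQLite 3.25 or later.

```python
import sqlite3

# Sample rows (id, cid, ro) copied from the question.
ROWS = [
    (1, 111, 2), (2, 111, 1), (3, 111, 6), (4, 111, 6), (5, 111, 8),
    (6, 111, 5), (7, 101, 4), (8, 101, 4), (9, 101, 3),
]


def renumber(rows):
    """Return {id: new ro} after renumbering ro within each cid."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE test (id INTEGER PRIMARY KEY, cid INTEGER, ro INTEGER)")
    conn.executemany("INSERT INTO test VALUES (?, ?, ?)", rows)

    # Same analytic expression as in the answer; id breaks ties on ro.
    new_values = conn.execute(
        """
        SELECT id,
               ROW_NUMBER() OVER (PARTITION BY cid ORDER BY ro, id) AS new_ro
        FROM test
        """
    ).fetchall()

    # Stand-in for MERGE ... WHEN MATCHED THEN UPDATE (SQLite lacks MERGE).
    conn.executemany(
        "UPDATE test SET ro = ? WHERE id = ?",
        [(new_ro, row_id) for row_id, new_ro in new_values],
    )
    result = dict(conn.execute("SELECT id, ro FROM test ORDER BY id").fetchall())
    conn.close()
    return result


NEW_RO = renumber(ROWS)
print(NEW_RO)
```

With id as the tie-breaker, the two rows with RO = 6 (ids 3 and 4) receive 4 and 5 in that order, which is what the question's expected table shows.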

Addressing the follow-up question in the comment on @mathguy's answer. If you've got a query that is producing the new values you want, and want to quickly write an update, I like to use merge:
MERGE INTO your_table target
USING (your_query_here) source
ON (Target.ID = Source.ID)
WHEN MATCHED THEN
UPDATE SET Target.column = Source.new_value
https://docs.oracle.com/cd/B28359_01/server.111/b28286/statements_9016.htm#SQLRF01606
MERGE can do more than that, but I've found it handy in this "correct the data" situation.

Related

Finding A Time When A Value Changed

I am still learning many new things about SQL such as PARTITION BY and CTEs. I am currently working on a query which I have cobbled together from a similar question I found online. However, I can not seem to get it to work as intended.
The problem is as follows: I have been tasked with showing rank promotions in an organization from the beginning of 2022 to today. I am working with 2 primary tables, an EMPLOYEES table and a PERIODS table. The PERIODS table captures a snapshot of any given employee each month, including their rank at the time. Each of these months is also assigned a PeriodID (e.g. Jan 2022 = PeriodID 131). Our EMPLOYEES table holds the employee's current rank. These ranks are stored as an int (e.g. 1, 2, 3, with 1 being the lowest rank). It is possible for an employee to rank up more than once in any given month.
I have simplified the used query as much as I can for the sake of this problem. Query follows as:
;WITH x AS
(
SELECT
e.EmployeeID, p.PeriodID, p.RankID,
rn = ROW_NUMBER() OVER (PARTITION BY e.EmployeeID ORDER BY p.PeriodID DESC)
FROM employees e
LEFT JOIN periods p on p.EmployeeID= e.EmployeeID
WHERE p.PeriodID <= 131 AND p.PeriodID >=118 --This is the time range mentioned above
),
rest AS (SELECT * FROM x WHERE rn > 1)
SELECT
main.EmployeeID,
PeriodID = MIN(
CASE
WHEN main.CurrentRankID = Rest.RankID
THEN rest.PeriodID ELSE main.PeriodID
END),
main.RankID, rest.RankID
FROM x AS main LEFT OUTER JOIN rest ON main.EmployeeID = rest.EmployeeID
AND rest.rn >1
LEFT JOIN periods p on p.EmployeeID = e.EmployeeID
WHERE main.rn = 1
AND NOT EXISTS
(
SELECT 1 FROM rest AS rest2
WHERE EmployeeID = rest.EmployeeID
AND rn < rest.rn
AND main.RankID <> rest.RankID
)
and p.PeriodID <= 131 AND p.PeriodID >=118
GROUP BY main.EmployeeID, main.PeriodID, main.RankID, rest.RankID
As mentioned before, this query was borrowed from a similar question and modified for my own use. I imagine the bones of the query are good and I have messed up a variable somewhere, but I cannot locate the problem line. The end goal is for the query to produce a table showing the EmployeeID, the PeriodID, the rank they were promoted from, and the rank they were promoted to, in the month the promotion was earned. Similar to the below.
EmployeeID PeriodID PreviousRankID NewRank
123 131 1 2
123 133 2 3
Instead, my query is spitting out repeating previous/current ranks and the PeriodIDs seem to be static (such as what is shown below).
EmployeeID PeriodID PreviousRankID NewRank
123 131 1 1
123 131 1 1
I am hoping someone with a greater knowledge base on these functions is able to quickly notice my mistake.
If we assume some example DML/DDL (it's really helpful to provide this with your question):
DECLARE @Employees TABLE (EmployeeID INT IDENTITY, Name VARCHAR(20), RankID INT);
DECLARE @Periods TABLE (PeriodID INT, EmployeeID INT, RankID INT);
INSERT INTO @Employees (Name, RankID) VALUES ('Jonathan', 10),('Christopher', 10),('James', 10),('Jean-Luc', 8);
INSERT INTO @Periods (PeriodID, EmployeeID, RankID) VALUES
(1,1,1),(2,1,1),(3,1,1),(4,1,8 ),(5,1,10),(6,1,10),
(1,2,1),(2,2,1),(3,2,1),(4,2,8 ),(5,2,8 ),(6,2,10),
(1,3,1),(2,3,1),(3,3,7),(4,3,10),(5,3,10),(6,3,10),
(1,4,1),(2,4,1),(3,4,1),(4,4,8 ),(5,4,9 ),(6,4,8 );
Then we can accomplish what I think you're looking for using an OUTER APPLY that aggregates the values based on the current-row values:
SELECT e.EmployeeID, e.Name, e.RankID AS CurrentRank, ap.PeriodID AS ThisPeriod, p.PeriodID AS LastRankChangePeriodID, p.RankID AS LastRankChangedFrom, ap.RankID - p.RankID AS LastRankChanged
FROM @Employees e
LEFT OUTER JOIN @Periods ap
ON e.EmployeeID = ap.EmployeeID
OUTER APPLY (
SELECT EmployeeID, MAX(PeriodID) AS PeriodID
FROM @Periods
WHERE EmployeeID = e.EmployeeID
AND RankID <> ap.RankID
AND PeriodID < ap.PeriodID
GROUP BY EmployeeID
) a
LEFT OUTER JOIN @Periods p
ON a.EmployeeID = p.EmployeeID
AND a.PeriodID = p.PeriodID
ORDER BY e.EmployeeID, ap.PeriodID DESC
Using the correlated subquery we get a view of the data which we can filter using the current-row values, and we aggregate that to return the period we're looking for (where it's before this period, and it's not the same rank). Then it's just a join back to the Periods table to get the values.
You used a LEFT JOIN, so I've preserved that using an OUTER APPLY. If you wanted to filter using it, it would be a CROSS APPLY instead.
EmployeeID Name        CurrentRank ThisPeriod LastRankChangePeriodID LastRankChangedFrom LastRankChanged
1          Jonathan    10          6          4                      8                   2
1          Jonathan    10          5          4                      8                   2
1          Jonathan    10          4          3                      1                   7
1          Jonathan    10          3          NULL                   NULL                NULL
1          Jonathan    10          2          NULL                   NULL                NULL
1          Jonathan    10          1          NULL                   NULL                NULL
2          Christopher 10          6          5                      8                   2
2          Christopher 10          5          3                      1                   7
2          Christopher 10          4          3                      1                   7
2          Christopher 10          3          NULL                   NULL                NULL
2          Christopher 10          2          NULL                   NULL                NULL
2          Christopher 10          1          NULL                   NULL                NULL
3          James       10          6          3                      7                   3
3          James       10          5          3                      7                   3
3          James       10          4          3                      7                   3
3          James       10          3          2                      1                   6
3          James       10          2          NULL                   NULL                NULL
3          James       10          1          NULL                   NULL                NULL
4          Jean-Luc    8           6          5                      9                   -1
4          Jean-Luc    8           5          4                      8                   1
4          Jean-Luc    8           4          3                      1                   7
4          Jean-Luc    8           3          NULL                   NULL                NULL
4          Jean-Luc    8           2          NULL                   NULL                NULL
4          Jean-Luc    8           1          NULL                   NULL                NULL
Now we can see what the previous change looked like for each period. Jonathan currently has RankID 10. The last time that was different was in PeriodID 4, when it was 8. The same is true looking back from PeriodID 5. In PeriodID 4 he had RankID 8, and prior to that he had RankID 1. Before that his rank hadn't changed.
Jean-Luc was actually demoted as his last change. I don't know if this is possible within your model.
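If the goal is the question's four-column layout (EmployeeID, PeriodID, previous rank, new rank), one alternative worth considering is LAG() rather than the OUTER APPLY used above. The sketch below is an assumption-laden illustration, not the answer's method: it runs on Python's sqlite3 (SQLite 3.25 or later) instead of SQL Server, the function name rank_changes is invented, and the sample rows cover only Jonathan and Jean-Luc, chosen to mirror the result table above (Jean-Luc's final period is rank 8 here).

```python
import sqlite3

# (PeriodID, EmployeeID, RankID) for employees 1 (Jonathan) and 4 (Jean-Luc).
PERIODS = [
    (1, 1, 1), (2, 1, 1), (3, 1, 1), (4, 1, 8), (5, 1, 10), (6, 1, 10),
    (1, 4, 1), (2, 4, 1), (3, 4, 1), (4, 4, 8), (5, 4, 9), (6, 4, 8),
]


def rank_changes(periods):
    """Return (EmployeeID, PeriodID, PreviousRankID, NewRank) per rank change."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE periods (PeriodID INTEGER, EmployeeID INTEGER, RankID INTEGER)"
    )
    conn.executemany("INSERT INTO periods VALUES (?, ?, ?)", periods)
    rows = conn.execute(
        """
        SELECT EmployeeID, PeriodID, PreviousRankID, RankID AS NewRank
        FROM (
            SELECT EmployeeID, PeriodID, RankID,
                   LAG(RankID) OVER (PARTITION BY EmployeeID
                                     ORDER BY PeriodID) AS PreviousRankID
            FROM periods
        ) AS s
        -- The first period has a NULL previous rank, so the comparison
        -- is NULL and the row is filtered out.
        WHERE PreviousRankID <> RankID
        ORDER BY EmployeeID, PeriodID
        """
    ).fetchall()
    conn.close()
    return rows


CHANGES = rank_changes(PERIODS)
for row in CHANGES:
    print(row)
```

Because the comparison is `<>`, demotions are reported as well; restricting to `PreviousRankID < RankID` would keep promotions only. With monthly snapshots, two promotions inside one month still show up as a single change.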

T-SQL select rows where [col] = MIN([col])

I have a data set produced from a UNION query that aggregates data from 2 sources.
I want to select that data based on whether data was found in only one of those sources or in both.
The relevant parts of the data set look like this (there are a number of other columns):
row preference group position
1 1 111 1
2 1 111 2
3 1 111 3
4 1 135 1
5 1 135 2
6 1 135 3
7 2 111 1
8 2 135 1
The [preference] column combined with the [group] column is what I'm trying to filter on: I want to return all the rows whose [preference] equals the MIN([preference]) for their [group].
The desired output given the data above would be rows 1 -> 6
The [preference] column indicates the original source of the data in the UNION query so a legitimate data set could look like:
row preference group position
1 1 111 1
2 1 111 2
3 1 111 3
4 2 111 1
5 2 135 1
In which case the desired output would be rows 1,2,3, & 5
What I can't work out is how to do (not real code):
SELECT * WHERE [preference] = MIN([preference]) PARTITION BY [group]
One way to do this is using RANK:
SELECT row
, preference
, [group]
, position
FROM (
SELECT row
, preference
, [group]
, position
, RANK() OVER (PARTITION BY [group] ORDER BY preference) AS seq
FROM t) t2
WHERE seq = 1
Should be doable via a simple inner join:
SELECT t1.*
FROM t AS t1
INNER JOIN (SELECT [group], MIN(preference) AS preference
FROM t
GROUP BY [group]
) t2 ON t1.[group] = t2.[group]
AND t1.preference = t2.preference
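Both approaches can be compared on the question's second data set. This is a rough sketch using Python's sqlite3 rather than SQL Server; the columns are renamed to row_id and grp here (an assumption made to avoid quoting keywords), and select_ids is a hypothetical helper.

```python
import sqlite3

# (row_id, preference, grp, position): the question's second data set.
DATA = [
    (1, 1, 111, 1), (2, 1, 111, 2), (3, 1, 111, 3),
    (4, 2, 111, 1), (5, 2, 135, 1),
]

# RANK() gives every row with the lowest preference in its group seq = 1.
RANK_SQL = """
SELECT row_id FROM (
    SELECT row_id,
           RANK() OVER (PARTITION BY grp ORDER BY preference) AS seq
    FROM t
) AS t2
WHERE seq = 1
ORDER BY row_id
"""

# Join each row to its group's minimum preference.
JOIN_SQL = """
SELECT t1.row_id
FROM t AS t1
INNER JOIN (SELECT grp, MIN(preference) AS preference
            FROM t
            GROUP BY grp) AS t2
    ON t1.grp = t2.grp AND t1.preference = t2.preference
ORDER BY t1.row_id
"""


def select_ids(sql, data):
    """Load data into an in-memory table t and return the selected row ids."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE t (row_id INTEGER, preference INTEGER, grp INTEGER, position INTEGER)"
    )
    conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", data)
    ids = [r[0] for r in conn.execute(sql)]
    conn.close()
    return ids


RANK_IDS = select_ids(RANK_SQL, DATA)
JOIN_IDS = select_ids(JOIN_SQL, DATA)
print(RANK_IDS, JOIN_IDS)
```

RANK() is used rather than ROW_NUMBER() because tied rows must all be kept; ROW_NUMBER() would return only one row per group.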

How to split Row into multiple column using T-SQL

There are three columns. Wherever D_ID=13, Value_amount holds the mode of payment, and wherever D_ID=10, Value_amount holds the amount.
ID D_ID Value_amount
1 13 2
1 13 2
1 10 1500
1 10 1500
2 13 1
2 13 1
2 10 2000
2 10 2000
Now I have to add two more columns amount and mode_of_payment and result should come like below
ID amount mode_of_payment
1 1500 2
1 1500 2
2 2000 1
2 2000 1
This is too long for a comment.
Simply put, your data is severely flawed. For the example data you've given, you're "ok", because the rows have the same values for the same ID, but what about when they don't? Let's assume, for example, we have data that looks like this:
ID D_ID Value_amount
1 13 1 --1
1 13 2 --2
1 10 1500 --3
1 10 1000 --4
2 13 1 --5
2 13 2 --6
2 10 2000 --7
2 10 3000 --8
I've added a "row number" next to data, for demonstration purposes only.
Here, what row is row "1" related to? Row "3" or row "4"? How do you know? There's no always-ascending value in your data, so row "3" could just as easily be row "4". In fact, if we were to order the data using ID ASC, D_ID DESC, Value_amount ASC then rows 3 and 4 would "swap" in order. This could mean that when you attempt a solution, the order is wrong.
Tables aren't stored in any particular order; they are unordered. What determines the order the data is presented in is the ORDER BY clause, and if you don't have a value to define that "order", then that "order" is lost as soon as you INSERT it.
If, however, we add an always-ascending value to your data, you can achieve this.
CREATE TABLE dbo.YourTable (UID int IDENTITY,
ID int,
DID int,
Value_amount int);
GO
INSERT INTO dbo.YourTable (ID, DID, Value_amount)
VALUES (1,13,1 ),
(1,13,2 ),
(1,10,1500),
(1,10,1000),
(2,13,1 ),
(2,13,2 ),
(2,10,2000),
(2,10,3000);
GO
WITH RNs AS(
SELECT ID,
DID,
Value_amount,
ROW_NUMBER() OVER (PARTITION BY ID, DID ORDER BY UID ASC) AS RN
FROM dbo.YourTable)
SELECT ID,
MAX(CASE DID WHEN 10 THEN Value_Amount END) AS Amount, -- DID 10 holds the amount
MAX(CASE DID WHEN 13 THEN Value_Amount END) AS PaymentMode -- DID 13 holds the mode of payment
FROM RNs
GROUP BY RN,
ID;
GO
DROP TABLE dbo.YourTable;
Of course, you need to fix your design to implement this, but you need to do that anyway.
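The pairing logic can be tried out without SQL Server. The following is a minimal sketch on Python's sqlite3, where an AUTOINCREMENT column named uid plays the role of the IDENTITY column; it follows the question's statement that D_ID = 10 holds the amount and D_ID = 13 the mode of payment. The lower-case names and the pivot function are illustrative only.

```python
import sqlite3

# (id, d_id, value_amount) in insertion order, as in the answer's sample data.
VALUES = [
    (1, 13, 1), (1, 13, 2), (1, 10, 1500), (1, 10, 1000),
    (2, 13, 1), (2, 13, 2), (2, 10, 2000), (2, 10, 3000),
]


def pivot(values):
    """Pair the n-th payment-mode row with the n-th amount row of each id."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE your_table (
            uid INTEGER PRIMARY KEY AUTOINCREMENT,  -- always-ascending value
            id INTEGER,
            d_id INTEGER,
            value_amount INTEGER
        )
        """
    )
    conn.executemany(
        "INSERT INTO your_table (id, d_id, value_amount) VALUES (?, ?, ?)", values
    )
    rows = conn.execute(
        """
        WITH rns AS (
            SELECT id, d_id, value_amount,
                   ROW_NUMBER() OVER (PARTITION BY id, d_id ORDER BY uid) AS rn
            FROM your_table
        )
        SELECT id,
               MAX(CASE d_id WHEN 10 THEN value_amount END) AS amount,
               MAX(CASE d_id WHEN 13 THEN value_amount END) AS mode_of_payment
        FROM rns
        GROUP BY id, rn
        ORDER BY id, rn
        """
    ).fetchall()
    conn.close()
    return rows


PIVOTED = pivot(VALUES)
for row in PIVOTED:
    print(row)
```

Each (id, rn) group contains exactly one D_ID = 10 row and one D_ID = 13 row, so MAX() simply picks the single non-NULL value in each CASE expression.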

T-SQL select rows by oldest date and unique category

I'm using Microsoft SQL. I have a table that contains information stored by two different categories and a date. For example:
ID Cat1 Cat2 Date/Time Data
1 1 A 11:00 456
2 1 B 11:01 789
3 1 A 11:01 123
4 2 A 11:05 987
5 2 B 11:06 654
6 1 A 11:06 321
I want to extract one line for each unique combination of Cat1 and Cat2 and I need the line with the oldest date. In the above I want ID = 1, 2, 4, and 5.
Thanks
Have a look at row_number() on MSDN.
SELECT *
FROM (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY col1, col2 ORDER BY date_time, id) rn
FROM mytable
) q
WHERE rn = 1
Quassnoi's answer is fine, but I'm a bit uncomfortable with how it handles dups. It seems to return based on insertion order, but I'm not sure if even that can be guaranteed? (see these two fiddles for an example where the result changes based on insertion order: dup at the end, dup at the beginning)
Plus, I kinda like staying with old-school SQL when I can, so I would do it this way (see this fiddle for how it handles dups):
select *
from my_table t1
left join my_table t2
on t1.cat1 = t2.cat1
and t1.cat2 = t2.cat2
and t1.datetime > t2.datetime
where t2.datetime is null
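The difference between the two queries shows up only when two rows in the same category pair share the oldest timestamp. Here is a small sketch on Python's sqlite3 (not SQL Server) that runs both against the question's rows, then again with one extra tied row; the TIE row, the lower-case column names and the oldest_ids helper are made up for the demonstration.

```python
import sqlite3

# (id, cat1, cat2, date_time, data) from the question.
ROWS = [
    (1, 1, "A", "11:00", 456), (2, 1, "B", "11:01", 789),
    (3, 1, "A", "11:01", 123), (4, 2, "A", "11:05", 987),
    (5, 2, "B", "11:06", 654), (6, 1, "A", "11:06", 321),
]
# Hypothetical extra row that ties with id 5 on (cat1, cat2, date_time).
TIE = (7, 2, "B", "11:06", 111)

ROW_NUMBER_SQL = """
SELECT id FROM (
    SELECT id,
           ROW_NUMBER() OVER (PARTITION BY cat1, cat2
                              ORDER BY date_time, id) AS rn
    FROM my_table
) AS q
WHERE rn = 1
ORDER BY id
"""

ANTI_JOIN_SQL = """
SELECT t1.id
FROM my_table AS t1
LEFT JOIN my_table AS t2
    ON t1.cat1 = t2.cat1
   AND t1.cat2 = t2.cat2
   AND t1.date_time > t2.date_time
WHERE t2.date_time IS NULL
ORDER BY t1.id
"""


def oldest_ids(sql, rows):
    """Load rows into an in-memory my_table and return the ids the query keeps."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE my_table (id INTEGER, cat1 INTEGER, cat2 TEXT, date_time TEXT, data INTEGER)"
    )
    conn.executemany("INSERT INTO my_table VALUES (?, ?, ?, ?, ?)", rows)
    ids = [r[0] for r in conn.execute(sql)]
    conn.close()
    return ids


BASE_RN = oldest_ids(ROW_NUMBER_SQL, ROWS)
BASE_ANTI = oldest_ids(ANTI_JOIN_SQL, ROWS)
TIE_RN = oldest_ids(ROW_NUMBER_SQL, ROWS + [TIE])
TIE_ANTI = oldest_ids(ANTI_JOIN_SQL, ROWS + [TIE])
print(BASE_RN, BASE_ANTI, TIE_RN, TIE_ANTI)
```

On the original rows both queries keep ids 1, 2, 4 and 5. With the tied row added, ROW_NUMBER() with id as a tie-breaker still keeps exactly one row per category pair, while the anti-join keeps every row that shares the oldest timestamp.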

How to use generate Id for different values in calculated columns?

I have a big query (which is already ordered as per my needs), one of the columns is calculated (varchar combination of other columns in the query). I need an incremental integer to identify this calculated column (duplicates should have the same id).
I can't use rank because the order in which I need the incremental number uses different criteria from the one used to generate the calculated column.
This is what I need:
OrderByColumn CalculatedColumn GeneratedId
1 ggg 1
1 aaa 2
1 ggg 1
1 fff 3
2 vvv 4
2 ddd 5
3 ggg 1
4 rrr 6
5 aaa 2
5 ooo 7
5 kkk 8
8 vvv 4
9 aaa 2
Use
ROW_NUMBER() OVER (PARTITION BY XXX ORDER BY YYY)
assuming you are using SQL2005 or better
http://msdn.microsoft.com/en-us/library/ms186734.aspx
-- though like you said this doesn't solve your dupes with same ID thing - ahhh! Give me a moment - should be able to do this pretty easy
Edit:
Here you go -
http://sqlfiddle.com/#!3/2f014/2
-- Select stuff:
select vals.val as genid, ord.* from ord
-- Join back to a distinct list of CalculatedColumn with a row_number() to id them
inner join
(select calculatedcolumn, row_number() over (order by calculatedcolumn) as val from ord group by calculatedcolumn) as vals on vals.calculatedcolumn = ord.calculatedcolumn
order by ord.orderbycolumn
Of course this is using the calculated column in the subquery - so you will need to re-calculate unless you store the value in a temp table or table variable
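Note that ordering the ROW_NUMBER() by calculatedcolumn hands out ids alphabetically, whereas the question's expected ids follow the order of first appearance. A possible variation, sketched here on Python's sqlite3, numbers each distinct value by the position at which it first occurs. It assumes a column that records the row order of the big query; that column is called seq here and is hypothetical, since OrderByColumn alone contains ties. The function name generated_ids is also invented.

```python
import sqlite3

# (seq, orderbycolumn, calculatedcolumn); seq is an assumed row-order column.
ORD = [
    (1, 1, "ggg"), (2, 1, "aaa"), (3, 1, "ggg"), (4, 1, "fff"),
    (5, 2, "vvv"), (6, 2, "ddd"), (7, 3, "ggg"), (8, 4, "rrr"),
    (9, 5, "aaa"), (10, 5, "ooo"), (11, 5, "kkk"), (12, 8, "vvv"),
    (13, 9, "aaa"),
]


def generated_ids(rows):
    """Return the generated id of every row, in seq order."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ord (seq INTEGER, orderbycolumn INTEGER, calculatedcolumn TEXT)"
    )
    conn.executemany("INSERT INTO ord VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT vals.genid
        FROM ord
        INNER JOIN (
            -- Number the distinct values by where each one first appears.
            SELECT calculatedcolumn,
                   ROW_NUMBER() OVER (ORDER BY first_seq) AS genid
            FROM (
                SELECT calculatedcolumn, MIN(seq) AS first_seq
                FROM ord
                GROUP BY calculatedcolumn
            ) AS firsts
        ) AS vals ON vals.calculatedcolumn = ord.calculatedcolumn
        ORDER BY ord.seq
        """
    ).fetchall()
    conn.close()
    return [r[0] for r in result]


GENERATED = generated_ids(ORD)
print(GENERATED)
```

Duplicates share an id because the numbering is done once per distinct value and then joined back to every row.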
