SQL Server / Oracle: How to increase index column dependent on another column?

Given the following table:
Column1 Column2 idx
-------------------
1 1 0
2 1 0
3 2 0
4 3 0
5 3 0
1 3 0
How could I increment the idx column depending on Column2 in SQL Server and Oracle with an UPDATE statement?
I would like to have:
Column1 Column2 idx
--------------------
1 1 0
2 1 1
3 2 0
4 3 0
5 3 1
1 3 2
Thank you!

This (or a similar) approach should work in SQL Server (Oracle does not allow updating through a CTE like this):
;with x as (
select idx, row_number() over(partition by Column2 order by Column1) - 1 as new_idx -- row_number() starts at 1, idx starts at 0
from tbl
)
update x set idx = new_idx
(Here I assume there is a typo in the 6th row for Column1; if not, some other column is needed for the ordering.)

With Oracle you need a MERGE statement for this:
merge into tbl x using (
select rowid as rid,
row_number() over(partition by Column2 order by Column1) - 1 as new_idx
from tbl
) t on (t.rid = x.rowid)
when matched then
update set idx = t.new_idx;
Instead of using rowid you can replace the join with the primary key columns of the table.
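As a quick way to sanity-check the numbering logic, here is a minimal sketch in Python using the standard-library sqlite3 module as a stand-in database. It is not the SQL Server or Oracle syntax above: it assumes SQLite 3.25+ (for window functions), assumes the sixth row's Column1 is 6 as the first answer supposes, and applies the update in two steps. The function name renumber_idx is made up for illustration.

```python
import sqlite3

def renumber_idx(rows):
    """Return (Column1, Column2, idx) with idx numbered from 0 within each Column2."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE tbl (Column1 INTEGER, Column2 INTEGER, idx INTEGER)")
    con.executemany("INSERT INTO tbl VALUES (?, ?, 0)", rows)
    # ROW_NUMBER() starts at 1, so subtract 1 to get a zero-based idx.
    pairs = con.execute(
        "SELECT ROW_NUMBER() OVER (PARTITION BY Column2 ORDER BY Column1) - 1, rowid "
        "FROM tbl"
    ).fetchall()
    # Write the computed numbers back, matching rows by rowid.
    con.executemany("UPDATE tbl SET idx = ? WHERE rowid = ?", pairs)
    result = con.execute(
        "SELECT Column1, Column2, idx FROM tbl ORDER BY Column2, Column1"
    ).fetchall()
    con.close()
    return result

# Sample data from the question, with the assumed typo fix (6 instead of 1).
SAMPLE = [(1, 1), (2, 1), (3, 2), (4, 3), (5, 3), (6, 3)]
print(renumber_idx(SAMPLE))
```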

Related

How to generate random 0 and 1 with 80-20 probability in sql server

I have a table with 10 records and a column named RandomNumber whose data type is bit.
Now I want to fill this column randomly so that 80 percent of the records (8 records) get 0 and 20 percent (2 records) get 1.
For example:
Id RandomNumber
1 0
2 0
3 0
4 1
5 0
6 0
7 0
8 1
9 0
10 0
One way is to use ORDER BY NEWID() to assign 1 to two random rows (20%) and 0 to the remaining rows (80%), which are found by excluding the rows assigned 1.
CREATE TABLE dbo.Example(
Id int NOT NULL CONSTRAINT PK_Test PRIMARY KEY
);
INSERT INTO dbo.Example VALUES(1),(2),(3),(4),(5),(6),(7),(8),(9),(10);
WITH ones AS (
SELECT TOP (2) Id, 1 AS RandomNumber
FROM dbo.Example
ORDER BY NEWID()
)
SELECT Id, 0 AS RandomNumber
FROM dbo.Example
WHERE Id NOT IN(SELECT Id FROM ones)
UNION ALL
SELECT Id, 1 AS RandomNumber
FROM ones
ORDER BY Id;
Alternatively, use ROW_NUMBER() OVER(ORDER BY NEWID()) and a CASE expression:
WITH example AS (
SELECT Id, ROW_NUMBER() OVER(ORDER BY NEWID()) AS rownum
FROM dbo.Example
)
SELECT Id, CASE WHEN rownum <= 2 THEN 1 ELSE 0 END AS RandomNumber
FROM example
ORDER BY Id;
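The ROW_NUMBER() variant can be tried outside SQL Server with a rough sketch like the one below. It uses Python's sqlite3 module, with SQLite's random() standing in for NEWID(); the function name assign_random_bits and its parameters are invented for illustration, and SQLite 3.25+ is assumed for window functions.

```python
import sqlite3

def assign_random_bits(n_rows=10, n_ones=2):
    """Return (Id, RandomNumber) rows where exactly n_ones random rows get 1."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE Example (Id INTEGER PRIMARY KEY)")
    con.executemany(
        "INSERT INTO Example VALUES (?)", [(i,) for i in range(1, n_rows + 1)]
    )
    rows = con.execute(
        """
        WITH shuffled AS (
            -- random() plays the role of NEWID(): it gives a random ordering
            SELECT Id, ROW_NUMBER() OVER (ORDER BY random()) AS rownum
            FROM Example
        )
        SELECT Id, CASE WHEN rownum <= ? THEN 1 ELSE 0 END AS RandomNumber
        FROM shuffled
        ORDER BY Id
        """,
        (n_ones,),
    ).fetchall()
    con.close()
    return rows

for row in assign_random_bits():
    print(row)
```

Because the split is done by row position in a shuffled order, the 80/20 ratio is exact rather than merely expected on average.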

Classifying rows into a grouping column that shows the data is related to prior rows

I have a set of data that I want to classify into groups based on a prior record id existing on the newer rows. The initial record of the group has a prior sequence id = 0.
The data is as follows:
customer_id sequence_id prior_sequence_id
1 1 0
1 2 1
1 3 2
2 4 0
2 5 4
2 6 0
2 7 6
Ideally, I would like to create the following grouping column and yield the following results:
customer_id sequence_id prior_sequence_id grouping
1 1 0 1
1 2 1 1
1 3 2 1
2 4 0 2
2 5 4 2
2 6 0 3
2 7 6 3
I've attempted to use gaps-and-islands logic with the ROW_NUMBER() function, but without success. I suspect this calls for a recursive CTE, which I am attempting at the moment.
I agree that a recursive CTE will do the job. Something like:
WITH reccte AS
(
/*query that determines starting point for recursion
*
* In this case we want all records with no prior_sequence_id
*/
SELECT
customer_id,
sequence_id,
prior_sequence_id,
/*establish grouping*/
ROW_NUMBER() OVER (ORDER BY sequence_id) as grouping
FROM yourtable
WHERE prior_sequence_id = 0
UNION ALL
/*join the recursive CTE back to the table and iterate (SQL Server requires UNION ALL here)*/
SELECT
yourtable.customer_id,
yourtable.sequence_id,
yourtable.prior_sequence_id,
reccte.grouping
FROM reccte
INNER JOIN yourtable ON reccte.sequence_id = yourtable.prior_sequence_id
)
SELECT * FROM reccte;
It looks like you could use a simple correlated query, at least given your sample data:
select *, (
select Sum(Iif(prior_sequence_id = 0, 1, 0))
from t t2
where t2.sequence_id <= t.sequence_id
) Grouping
from t;
See Example Fiddle
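The correlated count can also be written as a running window SUM, which may be worth comparing. The sketch below is one possible formulation, not taken from either answer, run through Python's sqlite3 module (SQLite 3.25+ assumed). Like the correlated query, it relies on each chain being contiguous in sequence_id order, as in the sample data; the names running_groups and grp are invented for illustration.

```python
import sqlite3

# Sample data from the question: (customer_id, sequence_id, prior_sequence_id)
ROWS = [(1, 1, 0), (1, 2, 1), (1, 3, 2), (2, 4, 0), (2, 5, 4), (2, 6, 0), (2, 7, 6)]

def running_groups(rows):
    """Return (sequence_id, grp), where grp counts chain starts seen so far."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE t (customer_id INTEGER, sequence_id INTEGER, "
        "prior_sequence_id INTEGER)"
    )
    con.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
    result = con.execute(
        """
        SELECT sequence_id,
               -- each row with prior_sequence_id = 0 starts a new group
               SUM(CASE WHEN prior_sequence_id = 0 THEN 1 ELSE 0 END)
                   OVER (ORDER BY sequence_id) AS grp
        FROM t
        ORDER BY sequence_id
        """
    ).fetchall()
    con.close()
    return result

print(running_groups(ROWS))
```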

SQL Server Group By Excluding Some Values

I have some records like below:
ID Val Amount
1 0 3
2 0 3
3 0 4
4 1 2
5 1 3
6 2 3
7 2 4
I want to group this data by the column Val and get the sum(amount), but do not group the ones with Val = 0.
The result set I need is like below:
Val Amount
0 3
0 3
0 4
1 5
2 7
I did it in two ways, but neither seems to be the best way:
The first is by using unions: first select the rows with Val = 0, then group the rows with Val <> 0, and union the two result sets.
The second is a little better. Let's say the data is in the table #Table:
WITH g AS
(
SELECT Val, Amount, CASE WHEN Val = '0' then Val + ID
else Val END A FROM #table
)
SELECT CASE WHEN A LIKE '0%' THEN 0 ELSE A END AS A, SUM(Amount)
FROM g
GROUP BY A
This also works, but having to concatenate with the ID column (or a row number) and then using a LEFT function to remove it is not a best practice.
So I'm looking for a better approach, one that both reads better and performs better.
I work on SQL Server 2008, but I'm open to any solutions which require newer versions.
The shortest way of doing it is the following:
SELECT Val, SUM(Amount)
FROM mytable
GROUP BY Val, CASE WHEN Val = 0 THEN ID ELSE 0 END
Demo here
You can also do it using window functions:
;WITH CTE AS (
SELECT ID, Val, Amount,
DENSE_RANK() OVER (PARTITION BY Val
ORDER BY CASE
WHEN Val = 0 THEN ID
ELSE 0
END) AS rank
FROM mytable
)
SELECT Val, SUM(Amount) AS total_amount
FROM CTE
GROUP BY Val, rank
The result set returned by the CTE is:
ID Val Amount rank
--------------------
1 0 3 1
2 0 3 2
3 0 4 3
4 1 2 1
5 1 3 1
6 2 3 1
7 2 4 1
So using rank you can differentiate between 0 and the rest of Val values.
Demo here
You can use both methods and see how they compare to each other in terms of performance.
Use a union here. The top half of the union below aggregates the amounts for the non-zero values, and the bottom half brings in the zero-value records unaggregated.
SELECT Val, SUM(Amount) AS Amount
FROM g
WHERE Val <> 0
GROUP BY Val
UNION ALL
SELECT Val, Amount
FROM g
WHERE Val = 0
ORDER BY Val;
Demo
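To see why the extra CASE expression in the GROUP BY keeps the zero rows apart, here is a small sketch using Python's sqlite3 module as a stand-in for SQL Server. The table name mytable follows the first answer; the function name and the ORDER BY clause (added only to make the output order predictable) are assumptions of this example.

```python
import sqlite3

# Sample data from the question: (ID, Val, Amount)
DATA = [(1, 0, 3), (2, 0, 3), (3, 0, 4), (4, 1, 2), (5, 1, 3), (6, 2, 3), (7, 2, 4)]

def sum_except_zero(rows):
    """Sum Amount per Val, but leave rows with Val = 0 ungrouped."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE mytable (ID INTEGER, Val INTEGER, Amount INTEGER)")
    con.executemany("INSERT INTO mytable VALUES (?, ?, ?)", rows)
    result = con.execute(
        """
        SELECT Val, SUM(Amount)
        FROM mytable
        -- for Val = 0 the second key is the unique ID, so each row is its own group;
        -- for other values it is the constant 0, so the rows collapse per Val
        GROUP BY Val, CASE WHEN Val = 0 THEN ID ELSE 0 END
        ORDER BY Val, MIN(ID)
        """
    ).fetchall()
    con.close()
    return result

print(sum_except_zero(DATA))
```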

Compare previous column value in SQL Server

There is a table with below structure:
ID XID RChange
1 1 1
2 2 1
3 3 1
4 1 0
5 2 0
6 3 1
The ID column is an identity column.
The XID column will have some repeating values.
RChange will be either 1 or 0.
I need the rows where the RChange value changes from 1 to 0 for the same XID,
i.e. in the above table, RChange has changed from 1 to 0 for XID 1 and 2, but for XID 3 it has stayed at 1.
So I need to write a query which will retrieve:
ID XID RChange
4 1 0
5 2 0
So, please help me with your ideas.
You have not included a timestamp so I am assuming the ID column will determine the order.
;WITH byXID AS
(
SELECT ID, XID, RChange, ROW_NUMBER() OVER(PARTITION BY XID ORDER BY ID) rn
FROM Table1
)
SELECT t1.ID, t1.XID, t1.RChange
FROM byXID t1
INNER JOIN byXID t0 ON t1.XID = t0.XID AND t0.rn = t1.rn - 1
WHERE t1.RChange = 0 AND t0.RChange = 1
SQL Fiddle demo
This is another approach:
SELECT t2.ID, t2.XID, t2.RChange
FROM Table1 t1 JOIN Table1 t2
ON t1.XID = t2.XID AND t1.ID < t2.ID -- the row with RChange = 1 must come before the row with RChange = 0
WHERE t1.RChange = 1 AND t2.RChange = 0
Sql fiddle demo (Thanks #Ic. for the fiddle)
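On versions with LAG() (SQL Server 2012 and later), comparing each row with its predecessor can be written without a self join. This is a different technique from the two answers above, sketched here with Python's sqlite3 module (SQLite 3.25+ assumed) as a stand-in; it also assumes, like the first answer, that ID determines the order. The names changed_to_zero and prev are invented for illustration.

```python
import sqlite3

# Sample data from the question: (ID, XID, RChange)
TABLE1 = [(1, 1, 1), (2, 2, 1), (3, 3, 1), (4, 1, 0), (5, 2, 0), (6, 3, 1)]

def changed_to_zero(rows):
    """Return rows whose RChange is 0 while the previous row for the same XID had 1."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE Table1 (ID INTEGER, XID INTEGER, RChange INTEGER)")
    con.executemany("INSERT INTO Table1 VALUES (?, ?, ?)", rows)
    result = con.execute(
        """
        SELECT ID, XID, RChange
        FROM (
            SELECT ID, XID, RChange,
                   -- previous RChange for the same XID, in ID order (NULL for the first row)
                   LAG(RChange) OVER (PARTITION BY XID ORDER BY ID) AS prev
            FROM Table1
        ) AS x
        WHERE RChange = 0 AND prev = 1
        ORDER BY ID
        """
    ).fetchall()
    con.close()
    return result

print(changed_to_zero(TABLE1))
```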

Filter Duplicate Rows on Conditions

I would like to filter duplicate rows so that the row with the minimum modified and maximum active is picked for each unique combination of rid and did. Should I use a self join, or is there a better-performing approach?
Example:
id rid modified active did
1 1 2010-09-07 11:37:44.850 1 1
2 1 2010-09-07 11:38:44.000 1 1
3 1 2010-09-07 11:39:44.000 1 1
4 1 2010-09-07 11:40:44.000 0 1
5 2 2010-09-07 11:41:44.000 1 1
6 1 2010-09-07 11:42:44.000 1 2
Output expected is
1 1 2010-09-07 11:37:44.850 1 1
5 2 2010-09-07 11:41:44.000 1 1
6 1 2010-09-07 11:42:44.000 1 2
Commenting on the first answer: the suggestion does not work for the dataset below (where active = 0 on the row with the minimum modified):
id rid modified active did
1 1 2010-09-07 11:37:44.850 1 1
2 1 2010-09-07 11:38:44.000 1 1
3 1 2010-09-07 11:39:44.000 1 1
4 1 2010-09-07 11:36:44.000 0 1
5 2 2010-09-07 11:41:44.000 1 1
6 1 2010-09-07 11:42:44.000 1 2
Assuming SQL Server 2005+. Use RANK() instead of ROW_NUMBER() if you want ties returned.
;WITH YourTable as
(
SELECT 1 id,1 rid,cast('2010-09-07 11:37:44.850' as datetime) modified, 1 active,1 did union all
SELECT 2,1,'2010-09-07 11:38:44.000', 1,1 union all
SELECT 3,1,'2010-09-07 11:39:44.000', 1,1 union all
SELECT 4,1,'2010-09-07 11:36:44.000', 0,1 union all
SELECT 5,2,'2010-09-07 11:41:44.000', 1,1 union all
SELECT 6,1,'2010-09-07 11:42:44.000', 1,2
),cte as
(
SELECT id,rid,modified,active, did,
ROW_NUMBER() OVER (PARTITION BY rid,did ORDER BY active DESC, modified ASC ) RN
FROM YourTable
)
SELECT id,rid,modified,active, did
FROM cte
WHERE rn=1
order by id
select id, rid, min(modified), max(active), did from foo group by rid, did order by id;
You can get good performance with a CROSS APPLY if you have a table that has one row for each combination of rid and did:
SELECT
X.*
FROM
ParentTable P
CROSS APPLY (
SELECT TOP 1 *
FROM YourTable T
WHERE P.rid = T.rid AND P.did = T.did
ORDER BY active DESC, modified
) X
Substituting (SELECT DISTINCT rid, did FROM YourTable) for ParentTable would work but will hurt performance.
Also, here is my crazy, single scan magic query which can often outperform other methods:
SELECT
id = Convert(int, Substring(Packed, 10, 4)),
rid,
modified = Convert(datetime, Substring(Packed, 2, 8)),
Active = Convert(bit, 1 - Substring(Packed, 1, 1)),
did
FROM
(
SELECT
rid,
did,
-- datetime is 8 bytes, so pack it as binary(8) to keep the date part
Packed = Min(Convert(binary(1), 1 - active) + Convert(binary(8), modified) + Convert(binary(4), id))
FROM
YourTable
GROUP BY
rid,
did
) X
This method is not recommended because it's not easy to understand, and it's very easy to make mistakes with it. But it's a fun oddity because it can outperform other methods in some cases.
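For comparison, the ROW_NUMBER() approach from the earlier answer can be checked against the asker's second dataset with a short sketch. It uses Python's sqlite3 module (SQLite 3.25+ assumed) rather than SQL Server, and stores modified as ISO-formatted text, which sorts correctly as a string; the function name pick_rows is invented for illustration.

```python
import sqlite3

# Second dataset from the question: (id, rid, modified, active, did)
DATASET = [
    (1, 1, "2010-09-07 11:37:44.850", 1, 1),
    (2, 1, "2010-09-07 11:38:44.000", 1, 1),
    (3, 1, "2010-09-07 11:39:44.000", 1, 1),
    (4, 1, "2010-09-07 11:36:44.000", 0, 1),
    (5, 2, "2010-09-07 11:41:44.000", 1, 1),
    (6, 1, "2010-09-07 11:42:44.000", 1, 2),
]

def pick_rows(rows):
    """Keep one row per (rid, did): highest active first, then earliest modified."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE YourTable (id INTEGER, rid INTEGER, modified TEXT, "
        "active INTEGER, did INTEGER)"
    )
    con.executemany("INSERT INTO YourTable VALUES (?, ?, ?, ?, ?)", rows)
    result = con.execute(
        """
        WITH cte AS (
            SELECT id, rid, modified, active, did,
                   ROW_NUMBER() OVER (
                       PARTITION BY rid, did
                       ORDER BY active DESC, modified ASC
                   ) AS rn
            FROM YourTable
        )
        SELECT id, rid, modified, active, did
        FROM cte
        WHERE rn = 1
        ORDER BY id
        """
    ).fetchall()
    con.close()
    return result

for row in pick_rows(DATASET):
    print(row)
```

Ordering by active DESC before modified is what keeps row 4 (inactive, but with the earliest timestamp) from winning its group.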
