updating min value on the second column when the first column appears more than once - sql-server

I'm struggling with how to do this in one step.
I have a column with values that vary between 1 and roughly 20. Linked to it is a second column whose value is normally between 1 and 5.
When a value in column Number 1 appears more than once, I need to update column Number 2 to 99, but only on the row that has the highest Number 2 value for that Number 1.
I have added a pic to explain better.
Basically, id is unique; if a value in Number 1 appears more than once, I need to update Number 2 on the row where Number 2 is highest.

You can use row_number() to find the row with the highest No2 value, and count() over() to check whether more than one row is present for a No1 value.
MS SQL Server 2008 Schema Setup:
create table YourTable
(
No1 int,
No2 int
);
insert into YourTable values
(1, 3),
(1, 2),
(2, 1);
Query 1:
with C as
(
select No2,
row_number() over(partition by No1 order by No2 desc) as rn,
count(*) over(partition by No1) as c
from YourTable
)
update C
set No2 = 99
where rn = 1 and
c > 1
Results:
Query 2:
select *
from YourTable
Results:
| NO1 | NO2 |
|-----|-----|
| 1 | 99 |
| 1 | 2 |
| 2 | 1 |
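Before running the update, it can help to preview which rows would change. The sketch below selects from the same CTE instead of updating it. It assumes the table also has the unique id column mentioned in the question (the schema above omits it) and uses it as a tie-breaker, so the chosen row is deterministic when two rows share the highest No2. The temp table name #YourTable is only for illustration.

```sql
-- Assumption: the real table has the unique id column from the question.
create table #YourTable (id int primary key, No1 int, No2 int);

insert into #YourTable values
(1, 1, 3),
(2, 1, 2),
(3, 2, 1);

with C as
(
    select id, No1, No2,
           -- id breaks ties between rows that share the highest No2
           row_number() over(partition by No1 order by No2 desc, id) as rn,
           count(*) over(partition by No1) as c
    from #YourTable
)
select id, No1, No2
from C
where rn = 1 and
      c > 1;   -- exactly the rows the update would set to 99

drop table #YourTable;
```

With this sample data only the row with id 1 (No1 = 1, No2 = 3) is returned.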

Related

Count unique field when another field is all NULL

I have a table with three columns, which can contain duplicate rows:
org - int NULL
id - int NULL
complete - bit NULL
So I might have data like so:
org | id | complete
-------------------
1 | 1 | 1
1 | 2 | NULL
1 | 2 | 1
1 | 3 | 1
2 | 3 | 1
2 | 4 | 1
2 | 4 | NULL
I want to get a count of all distinct id by org. That's easy enough to do with a COUNT(DISTINCT id) expression. Where I'm running into trouble now is I also want a count of all distinct id where any of the complete values isn't 1.
So from the above I'd want this output:
org | distinct id | distinct incomplete id
------------------------------------------
1 | 3 | 1
2 | 2 | 1
For org 2, id 4 has a NULL value, so I can't count id 4 as fully complete; only id 3 is complete, which results in a 1 in the distinct incomplete id column. I don't know how to fill in the ???? part of the query below.
SELECT org, COUNT(DISTINCT id) TotalPeople, ???? IncompletePeople
FROM table
GROUP BY org
Try the following approach
DECLARE @T TABLE
(
Org INT,
Id INT,
Complete BIT
)
INSERT INTO @T
VALUES(1,1,1),(1,2,NULL),(1,2,1),(1,3,1),(2,3,1),(2,4,1),(2,4,NULL)
SELECT
Org,
DistinctId = COUNT(DISTINCT Id),
DistinctIncompleteId = SUM(CASE Complete WHEN 1 THEN 0 ELSE 1 END)
FROM @T
GROUP BY Org
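Note that SUM(CASE ...) counts rows, so an id with two incomplete rows would be counted twice. A possible variant, sketched here under the assumption that anything other than 1 (including NULL) means incomplete, counts each such id once by wrapping the CASE in COUNT(DISTINCT ...):

```sql
DECLARE @T TABLE (Org INT, Id INT, Complete BIT);

INSERT INTO @T
VALUES (1,1,1),(1,2,NULL),(1,2,1),(1,3,1),(2,3,1),(2,4,1),(2,4,NULL);

SELECT
    Org,
    DistinctId = COUNT(DISTINCT Id),
    -- CASE yields Id for incomplete rows and NULL otherwise;
    -- COUNT(DISTINCT ...) ignores the NULLs and counts each id once.
    DistinctIncompleteId = COUNT(DISTINCT CASE WHEN Complete = 1 THEN NULL ELSE Id END)
FROM @T
GROUP BY Org;
```

For the sample data this returns 3 and 1 for org 1, and 2 and 1 for org 2.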
You may try this way
create table #temptable ( org int, id int, comp int )
Go
insert into #temptable ( org, id, comp )
values ( 1,1,1)
,( 1, 2, null)
,( 1, 2, 1)
,( 1, 3, 1)
,( 2, 3, 1)
,( 2, 4, null)
,( 2, 4, 1)
select d.org, d.DistinctId, f.incompleteId
from (
select COUNT(distinct id) as DistinctId, org
from #temptable
group by org
) as d
full outer join (
select COUNT(distinct id) as incompleteId, org
from #temptable
where comp is null
group by org
) as f on d.org = f.org
go
drop table #temptable
Group by org and by complete, then filter with HAVING complete = 1. Hope the code below helps you:
SELECT org, COUNT(id) TotalPeople, complete
FROM table
GROUP BY org, complete
HAVING complete = 1 -- use "complete IS NULL" here instead to count the incomplete rows

Using multiple row results on a formula, based by group

Is there a way to use the results from multiple rows in a formula, computed separately for each group?
I have the following formula:
result = (1st vfg ) / (1 + (1st vfg / 2nd vfg) + (1st vfg / 3rd vfg) + ... + (1st vfg / *nth* vfg) )
vfg = value from group
For example, the table below:
Group | Value
---------------
1 | 1000
1 | 280
1 | 280
2 | 1000
Note: I guarantee that there will be no 0 (zero) or NULLs in the value for the first table
Should give me the following result:
Group | Result
---------------
1 | 122.85
2 | 1000 -> If there is only one value on the group, the result will be the value itself
You need a column that indicates the row order within a group (timestamps, the sequence number, identity column, etc.). Rows in a database table have no implicit order. Once you have that, you can use a CTE and window functions to solve the problem:
;WITH
cte AS
(
SELECT [Group]
, [Value]
, FIRST_VALUE([Value]) OVER (PARTITION BY [Group] ORDER BY RowOrder) AS FirstValue
, FIRST_VALUE([Value]) OVER (PARTITION BY [Group] ORDER BY RowOrder) / [Value] AS Subtotal
FROM MyTable
)
SELECT [Group]
, AVG(FirstValue) / SUM(Subtotal) AS Result
FROM cte
GROUP BY [Group]
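Since the first value divided by itself is 1, the denominator of the formula is the sum of (1st vfg / i-th vfg) over all rows of the group, and the first value cancels out: the result equals 1 / SUM(1 / Value). Under that reading no row order is needed. A minimal sketch, assuming the table variable @MyTable stands in for the real table, and casting to FLOAT to avoid integer division:

```sql
DECLARE @MyTable TABLE ([Group] INT, [Value] INT);

INSERT INTO @MyTable ([Group], [Value])
VALUES (1, 1000), (1, 280), (1, 280), (2, 1000);

-- result = v1 / (v1/v1 + v1/v2 + ... + v1/vn) = 1 / (1/v1 + 1/v2 + ... + 1/vn)
SELECT [Group],
       1.0 / SUM(1.0 / CAST([Value] AS FLOAT)) AS Result
FROM @MyTable
GROUP BY [Group];
```

For group 1 this gives about 122.81 (the 122.85 in the question seems to come from rounding the intermediate values), and for group 2 it gives 1000.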

Field equal 1 display

I am using SQL Server 2008. When an orderno has duplicate records, some with activityCode 1 and some with activityCode 0, I would like to get only the records where activityCode equals 1.
Records where activityCode equals 0 should also be displayed, but only if the same orderno has no record with activityCode 1. I hope this is clear and makes sense, but let me know if I need to provide more details. Thanks
--create table
create table po_v
(
orderno int,
amount decimal(18, 2),
activityCode int
)
--insert values
insert into po_v values
(170268, 2774.31, 0),
(17001988, 288.82, 0),
(17001988, 433.23, 1),
(170271, 3786, 1),
(170271, 8476, 0),
(170055, 34567, 0)
--Results
170268 | 2774.31 | 0
17001988 | 433.23 | 1
170271 | 3786 | 1
170055 | 34567 | 0
*****Updated*****
I have inserted two new records and the results have been updated. The data in the actual table has other numbers besides 0 and 1. The select statement displays the correct orderno's but I would like the other records for the orderno to display also. The partition only populates one record per orderno. If possible I would like to see the records with the same activityCode.
--insert values
insert into po_v values
(170271, 3799, 1),
(172525, 44445, 2)
--select statement
SELECT Orderno,
Amount,
Activitycode
FROM (SELECT orderno,
amount,
activitycode,
ROW_NUMBER()
OVER(
PARTITION BY orderno
ORDER BY activitycode DESC) AS dup
FROM Po_v)dt
WHERE dt.dup = 1
ORDER BY 1
--select statement results
170055 | 34567 | 0
170268 | 2774.31 | 0
170271 | 3786 | 1
172525 | 44445 | 2
17001988 | 433.23 | 1
--expected results
170055 | 34567 | 0
170268 | 2774.31 | 0
170271 | 3786 | 1
170271 | 3799 | 1
172525 | 44445 | 2
17001988 | 433.23 | 1
Not totally clear what you are trying to do here but this returns the output you are expecting.
select orderno
, amount
, activityCode
from
(
select *
, RowNum = ROW_NUMBER() over(partition by orderno order by activityCode desc)
from po_v
) x
where x.RowNum = 1
---EDIT---
With the new details this is a very different question. As I understand it now, you want all rows that share the max activity code for each orderno. You can do this pretty easily with a CTE.
with MyGroups as
(
select orderno
, Activitycode = max(activitycode)
from po_v
group by orderno
)
select *
from po_v p
join MyGroups g on g.orderno = p.orderno
and g.Activitycode = p.Activitycode
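An alternative sketch that stays close to the ROW_NUMBER() query: RANK() assigns the same number to ties, so every row that shares the highest activityCode of an orderno gets rank 1. The table variable @po_v and its column types are assumptions made so the example runs on its own.

```sql
DECLARE @po_v TABLE (orderno INT, amount DECIMAL(18, 2), activityCode INT);

INSERT INTO @po_v VALUES
(170268, 2774.31, 0),
(17001988, 288.82, 0),
(17001988, 433.23, 1),
(170271, 3786, 1),
(170271, 8476, 0),
(170055, 34567, 0),
(170271, 3799, 1),
(172525, 44445, 2);

SELECT orderno, amount, activityCode
FROM (
    SELECT orderno, amount, activityCode,
           -- RANK() gives tied rows the same rank, unlike ROW_NUMBER()
           rnk = RANK() OVER (PARTITION BY orderno ORDER BY activityCode DESC)
    FROM @po_v
) x
WHERE x.rnk = 1
ORDER BY orderno;
```

With this data both activityCode 1 rows of orderno 170271 are returned, which matches the expected results in the question.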
Try this
SELECT Orderno,
Amount,
Activitycode
FROM (SELECT orderno,
amount,
activitycode,
ROW_NUMBER()
OVER(
PARTITION BY orderno
ORDER BY activitycode DESC) AS dup
FROM Po_v)dt
WHERE dt.dup = 1
ORDER BY 1
Result
Orderno Amount Activitycode
------------------------------------
170055 34567.00 0
170268 2774.31 0
170271 3786.00 1
17001988 433.23 1

How can I group / window date ordered events delineated by an arbitrary expression?

I would like to group some data together based on dates and some (potentially arbitrary) indicator:
Date | Ind
================
2016-01-02 | 1
2016-01-03 | 5
2016-03-02 | 10
2016-03-05 | 15
2016-05-10 | 6
2016-05-11 | 2
I would like to group together subsequent (date-ordered) rows but breaking the group after Indicator >= 10:
Date | Ind | Group
========================
2016-01-02 | 1 | 1
2016-01-03 | 5 | 1
2016-03-02 | 10 | 1
2016-03-05 | 15 | 2
2016-05-10 | 6 | 3
2016-05-11 | 2 | 3
I did find a promising technique at the end of a blog post: "Use this Neat Window Function Trick to Calculate Time Differences in a Time Series" (the final subsection, "Extra Bonus"), but the important part of the query uses a keyword (FILTER) that doesn't seem to be supported in SQL Server (and a quick Google later and I'm not sure where it is supported!).
I'm still hopeful a technique using a window function might be the answer. I just need a counter that I can add to every row, (like RANK or ROW_NUMBER does) but that only increments when some arbitrary condition evaluates as true. Is there a way to do this in SQL Server?
Here is the solution:
DECLARE @t TABLE ([Date] DATETIME, Ind INT)
INSERT INTO @t
VALUES
('2016-01-02', 1),
('2016-01-03', 5),
('2016-03-02', 10),
('2016-03-05', 15),
('2016-05-10', 6),
('2016-05-11', 2)
SELECT [Date],
Ind,
1 + SUM([Group]) OVER(ORDER BY [Date]) AS [Group]
FROM
(
SELECT *,
CASE WHEN LAG(ind) OVER(ORDER BY [Date]) >= 10
THEN 1
ELSE 0
END AS [Group]
FROM @t
) t
Just mark a row as 1 when the previous row's Ind is greater than or equal to 10, else 0. A running sum will then give you the desired result.
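The same counter can also be written without the derived table, as a conditional running sum over the rows strictly before the current one (SUM over a CASE expression is a common stand-in for an aggregate FILTER clause). This is only a sketch; it assumes SQL Server 2012 or later for the ROWS frame and that the dates are unique.

```sql
DECLARE @t TABLE ([Date] DATETIME, Ind INT);

INSERT INTO @t VALUES
('2016-01-02', 1), ('2016-01-03', 5), ('2016-03-02', 10),
('2016-03-05', 15), ('2016-05-10', 6), ('2016-05-11', 2);

SELECT [Date],
       Ind,
       -- count the earlier rows with Ind >= 10; the first row has an empty
       -- frame, so SUM returns NULL and COALESCE turns it into 0
       1 + COALESCE(SUM(CASE WHEN Ind >= 10 THEN 1 ELSE 0 END)
                    OVER (ORDER BY [Date]
                          ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING), 0) AS [Group]
FROM @t
ORDER BY [Date];
```

For the sample rows this yields groups 1, 1, 1, 2, 3, 3, the same as the expected output in the question.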
Giving full credit to Giorgi for the idea, but I've modified his answer (both for my benefit and for future readers).
Just change the CASE expression to check whether 30 or more days have elapsed since the previous record:
DECLARE @t TABLE ([Date] DATETIME)
INSERT INTO @t
VALUES
('2016-01-02'),
('2016-01-03'),
('2016-03-02'),
('2016-03-05'),
('2016-05-10'),
('2016-05-11')
SELECT [Date],
1 + SUM([Group]) OVER(ORDER BY [Date]) AS [Group]
FROM
(
SELECT [Date],
CASE WHEN DATEADD(d, -30, [Date]) >= LAG([Date]) OVER(ORDER BY [Date])
THEN 1
ELSE 0
END AS [Group]
FROM @t
) t

SQL Server SUM based on subsequent records

Microsoft SQL Server 2012 (SP1) - 11.0.3156.0 (X64)
I am not sure of the best way to word this and have tried a few different searches with different combinations of words without success.
I only want to sum the Sequence = 1 rows when there are also rows with Sequence > 1 for the same ID; in the table below, these Sequence = 1 rows are marked with *. I don't care about checking that Sequence 2, 3, etc. match the same pattern, because if they exist at all I need to sum them.
I have data that looks like this:
| Sequence | ID | Num | OtherID |
|----------|----|-----|---------|
| 1 | 1 | 10 | 1 |*
| 2 | 1 | 15 | 1 |
| 3 | 1 | 20 | 1 |
| 1 | 2 | 10 | 1 |*
| 2 | 2 | 15 | 1 |
| 1 | 3 | 10 | 1 |
| 1 | 1 | 40 | 3 |
I need to sum the Num column but only when there is more than one sequence. My output would look like this:
Sequence Sum OtherID
1 20 1
2 30 1
3 20 1
I have tried grouping the queries in a bunch of different ways but really by the time I get to the sum, I don't know how to look ahead to make sure there are greater than 1 sequences for an ID.
My query at the moment looks something like this:
select Sequence, Sum(Num) as [Sum], OtherID
from tbl
where ID in (Select ID from tbl where Sequence > 1)
Group by Sequence, OtherID
tbl is a CTE that I wrapped around my query and it partially works, but is not really the filter I wanted.
If this is something that just shouldn't be done or can't be done then I can handle that, but if it's something I am missing I'd like to fix the query.
Edit:
I can't give the full query here but I started with this table/data (to get the above output). The OtherID is there because the data has the same ID/Sequence combinations but that OtherID helps separate them out so the rows are not identical (multiple questions on a form).
Create table #tmpTable (ID int, Sequence int, Num int, OtherID int)
insert into #tmpTable (ID, Sequence, Num, OtherID) values (1, 1, 10, 1)
insert into #tmpTable (ID, Sequence, Num, OtherID) values (1, 2, 15, 1)
insert into #tmpTable (ID, Sequence, Num, OtherID) values (1, 3, 20, 1)
insert into #tmpTable (ID, Sequence, Num, OtherID) values (2, 1, 10, 1)
insert into #tmpTable (ID, Sequence, Num, OtherID) values (2, 2, 15, 1)
insert into #tmpTable (ID, Sequence, Num, OtherID) values (3, 1, 10, 1)
insert into #tmpTable (ID, Sequence, Num, OtherID) values (1, 1, 40, 3)
The following will sum over Sequence and OtherID, but only when either the sequence is greater than 1, or there is another row with the same ID and OtherID but a different sequence.
Query:
select Sequence, Sum(Num) as SumNum, OtherID from #tmpTable a
where Sequence > 1
or exists (select * from #tmpTable b
where a.ID = b.ID
and a.OtherID = b.OtherID
and b.Sequence <> a.Sequence)
group by Sequence, OtherID;
It looks like you are trying to sum by Sequence and OtherID if the Count of ID >1, so you could do something like below:
select Sequence, Sum(Num) as [Sum], OtherID
from tbl
where ID in (Select ID from tbl where Sequence > 1)
Group by Sequence, OtherID
Having count(id)>1
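Another possible approach, sketched with the question's #tmpTable data: compute the highest Sequence per ID and OtherID with a window function, keep only the rows whose combination goes beyond Sequence 1, and then group. Partitioning by both ID and OtherID is an assumption based on the edit in the question, which says OtherID separates otherwise identical ID/Sequence combinations.

```sql
CREATE TABLE #tmpTable (ID INT, Sequence INT, Num INT, OtherID INT);

INSERT INTO #tmpTable (ID, Sequence, Num, OtherID) VALUES
(1, 1, 10, 1), (1, 2, 15, 1), (1, 3, 20, 1),
(2, 1, 10, 1), (2, 2, 15, 1),
(3, 1, 10, 1),
(1, 1, 40, 3);

WITH s AS
(
    SELECT Sequence, Num, OtherID,
           -- highest Sequence within each ID / OtherID combination
           MAX(Sequence) OVER (PARTITION BY ID, OtherID) AS MaxSeq
    FROM #tmpTable
)
SELECT Sequence, SUM(Num) AS [Sum], OtherID
FROM s
WHERE MaxSeq > 1          -- drop combinations that only have Sequence 1
GROUP BY Sequence, OtherID
ORDER BY Sequence;

DROP TABLE #tmpTable;
```

On the sample data this returns the three rows from the question's desired output (20, 30 and 20 for OtherID 1). The row with OtherID 3 is excluded because its ID/OtherID combination has no Sequence above 1.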
