Creating unique identifier column(1 or zero) Rank () SQL SERVER - sql-server

I am trying to create a column in SQL SERVER that shows 1 OR 0(zero). I have a column of customer numbers that appear more than once. At the first hit on a unique non-repeated customer number it should show one and if it is repeated then 0(zero). How can I create this ?
CustNumber Unique
25122134 1
25122134 0
25122134 0
25122136 1
25122136 0
the solutions I am considering and trying out now are Rank() and Rank_DENSE().

declare #test table
(
CustNumber int
)
insert into #test values
(25122134),
(25122134),
(25122134),
(25122136),
(25122136)
select
* ,
// each CustNumber in partition has the same rank, but different row_number
case when (row_number() over (partition by CustNumber order by CustNumber)) = 1
then 1 else 0 end as [Unique]
// the 1st is unique, the rest (2..n) are not
from #test
order by CustNumber, [Unique] desc
// unique in each group should be displayed first

You don't want RANK because that, by definition, produces the same output for identical inputs.
ROW_NUMBER() and a simple CASE expression should do it:
;WITH Numbered as (
SELECT CustNumber,
ROW_NUMBER() OVER (PARTITION BY CustNumber
ORDER BY CustNumber) as rn --Unusual - pick a real column if you have a preference
FROM YourUnnamedTable
)
SELECT CustNumber,CASE WHEN rn = 1 THEN 1 ELSE 0 END as [Unique]
FROM Numbered

Related

Check which all records have duplicate using window function

I am using ROW_NUMBER() to find duplicates, encapsulated under CTEs. I am using ROW_NUMBER() because I also want to have column which shows how many duplicate rows are present in the table.
The below code gives only records greater than 1. That is row number 2 and 3. But how can I include row number 1 of duplicate records?
If I removed T>1, than output also contain records which does not have duplicate like records with Part "'0020R5',100".
DDL:
DROP TABLE #TEST
CREATE TABLE #TEST
(
PART VARCHAR(30),
ALTPART int
)
INSERT #TEST
SELECT '15-AB78',100 UNION ALL
SELECT '15-AB78',110 UNION ALL
SELECT '16-A9-1',100 UNION ALL
SELECT '16-A9-1',110 UNION ALL
SELECT '16-B97-2',100 UNION ALL
SELECT '16-B97-2',110 UNION ALL
SELECT '0020R5',100
Query:
WITH TEST(PART,ALTPART,T) AS
(
SELECT PART,ALTPART,ROW_NUMBER() OVER (PARTITION BY PART ORDER BY ALTPART ASC) AS T FROM #TEST
)
SELECT PART,ALTPART,T
FROM TEST
WHERE T>1
ORDER BY PART
GO
Current output:
'15-AB78',110,2
'16-A9-1',110,2
'16-B97-2',110,2
Expected Result:
'15-AB78',100,1
'15-AB78',110,2
'16-A9-1',100,1
'16-A9-1',110,2
'16-B97-2',100,1
'16-B97-2',110,2
You need to add another window function to count the number of duplicates in each group.
Something along these lines:
WITH TEST(PART,ALTPART,DuplicateNum, DuplicateCnt) AS
(
SELECT PART,ALTPART,
-- Number each duplicate
ROW_NUMBER() OVER (PARTITION BY PART ORDER BY ALTPART ASC) AS DuplicateNum,
-- Count duplicates
COUNT() OVER (PARTITION BY PART ) AS DuplicateCnt,
FROM #TEST
)
SELECT PART, ALTPART, DuplicateNum
FROM TEST
WHERE DuplicateCnt > 1
ORDER BY PART

SQL Server : how to divide database records into even, random groups

tblNames
OrganizationID (int)
LastName (varchar)
...
GroupNumber (int)
GroupNumber is currently NULL for all records, I need an UPDATE statement to update this column.
I need to split up records on an OrganizationID level into even, random groups.
If there are < 20,000 records for an OrganizationID, I need 2 even, random groups. So records for that OrganizationID will have a GroupNumber of 1 or 2. There will be the same (or if odd number of records difference of only 1) number of records for GroupNumber = 1 and for GroupNumber = 2, and there will be no recognizable way to tell how a person got into a GroupNumber - i.e. LastNames that start with A-L are group 1, M-Z are group 2 would not be OK.
If there are > 20,000 records for an OrganizationID, I need 4 even, random groups. So records for that OrganizationID will have a GroupNumber values of 1, 2, 3, or 4. There will be the same (or if odd number of records difference of only 1) number of records for each GroupNumber, and there will be no recognizable way to tell how a person got into a GroupNumber - i.e. LastNames that start with A-F are group 1, G-L are group 2, etc. would not be OK.
There are only about 20 organizations, so I can run an update statement 20 times, once per organizationID if needed.
I have full control of the table so I can add keys or columns, but for now this is what it is.
Would appreciate any help.
Create row numbers randomly (with ROW_NUMBER and GETID). Then get their modulo 2 or 4 depending on the record count to get buckets 0 to 1 or 0 to 3.
select
organizationid, lastname, ...,
case when cnt <= 20000 then rn % 2 else rn % 4 end as bucket
from
(
select
organizationid, lastname, ...,
row_number() over(order by newid()) as rn,
count(*) over () as cnt
from mytable
) randomized;
UPDATE: I suppose the update statement would have to look something like this:
with randomized as
(
select
groupnumber,
row_number() over(order by newid()) as rn,
count(*) over () as cnt
from mytable
)
update randomzized
set groupnumber = case when cnt <= 20000 then rn % 2 else rn % 4 end + 1;
Another slightly different approach;
Setting up some fake data:
if object_id('tempdb.dbo.#Orgs') is not null drop table #Orgs
create table #Orgs
(
RID int identity(1,1) primary key clustered,
OrganizationId int,
LastName varchar(36),
GroupId int
)
insert into #Orgs (OrganizationId, LastName)
select top 40000 row_number() over (order by (select null)) % 20000, newid()
from sys.all_objects a, sys.all_objects b
then using the rarely useful ntile() function to get as close to identically sized groups as possible. Sorting by newid() essentially sorts the data randomly (or as random as generating one guid to the next is).
declare #NumRandomGroups int = 4
update o
set GroupId = x.GroupId
from #orgs o
inner join (select RID, GroupId = ntile(#NumRandomGroups) over (order by newid())
from #orgs) x
on o.RID = x.RID
select GroupId, count(1)
from #Orgs
group by GroupId
select *
from #Orgs
order by RID
You can then set #NumRandomGroups to whatever you want it to be based on the count of Organizations

unique chat records sql

I have DB which having 5 column as follows:
message_id
user_id_send
user_id_rec
message_date
message_details
Looking for a SQL Serve Query, I want to Filter Results from two columns (user_id_send,user_id_rec)for Given User ID based on following constrains:
Get the Latest Record (filtered on date or message_id)
Only Unique Records (1 - 2 , 2 - 1 are same so only one record will be returned which ever is the latest one)
Ordered by Descending based on message_id
SQL Query
The main purpose of this query is to get records of user_id to find out to whom he has sent messages and from whom he had received messages.
I have also attached the sheet for your reference.
Here is my try
WITH t
AS (SELECT *
FROM messages
WHERE user_id_sender = 1)
SELECT DISTINCT user_id_reciever,
*
FROM t;
WITH h
AS (SELECT *
FROM messages
WHERE user_id_reciever = 1)
SELECT DISTINCT user_id_sender,
*
FROM h;
;WITH tmpMsg AS (
SELECT M2.message_id
,M2.user_id_receiver
,M2.user_id_sender
,M2.message_date
,M2.message_details
,ROW_NUMBER() OVER (PARTITION BY user_id_receiver+user_id_sender ORDER BY message_date DESC) AS 'RowNum'
FROM messages M2
WHERE M2.user_id_receiver = 1
OR M2.user_id_sender = 1
)
SELECT T.message_id
,T.user_id_receiver
,T.user_id_sender
,T.message_date
,T.message_details
FROM tmpMsg T
WHERE RowNum <= 1
The above should fetch you the results you are looking for when you query for a particular user_id (replace the 1 with parameter e.g. #p_user_id). The user_id_receiver+user_id_sender in the PARTITION clause ensure that records with user id combinations such as 1 - 2, 2 - 1 are not selected twice.
Hope this helps.
select * from
(
select ROW_NUMBER() over (order by message_date DESC) as rowno,
* from messages
where user_id_receiver = 1
--order by message_date DESC
) T where T.rowno = 1
UNION ALL
select * from
(
select ROW_NUMBER() over (order by message_date DESC) as rowno,
* from messages
where user_id_sender = 1
-- order by message_date DESC
) T where T.rowno = 1
Explanation: For each group of user_id_sender, it orders internally by message_date desc, and then adds row numbers, and we only want the first one (chronologically last). Then do the same for user_id_receiver, and union the results together to get 1 result set with all the desired rows. You can then add your own order by clause and additional where conditions at the end as required.
Of course, this only works for any 1 user_id at a time (replace =1 with #user_id).
To get a result from all user_id's at once, is a totally different query, so I hope this helps?

Get the missing value in a sequence of numbers

I made the following query for the SQL Server backend
SELECT TOP(1) (v.rownum + 99)
FROM
(
SELECT incrementNo-99 as id, ROW_NUMBER() OVER (ORDER BY incrementNo) as rownum
FROM proposals
WHERE [year] = '12'
) as v
WHERE v.rownum <> v.id
ORDER BY v.rownum
to find the first unused proposal number.
(It's not about the lastrecord +1)
But I realized ROW_NUMBER is not supported in access.
I looked and I can't find something similar.
Does anyone know how to get the same result as a ROW_NUMBER in access?
Maybe there's a better way of doing this.
Actually people insert their proposal No (incrementID) with no constraint. This number looks like this 13-152. xx- is for the current year and the -xxx is the proposal number. The last 3 digits are supposed to be incremental but in some case maybe 10 times a year they have to skip some numbers. That's why I can't have the auto increment.
So I do this query so when they open the form, the default number is the first unused.
How it works:
Because the number starts at 100, I do -99 so it starts at 1.
Then I compare the row number with the id so it looks like this
ROW NUMBER | ID
1 1 (100)
2 2 (101)
3 3 (102)
4 5 (104)<--------- WRONG
5 6 (105)
So now I know that we skip 4. So I return (4 - 99) = 103
If there's a better way, I don't mind changing but I really like this query.
If there's really no other way and I can't simulate a row number in access, i will use the pass through query.
Thank you
From your question it appears that you are looking for a gap in a sequence of numbers, so:
SELECT b.akey, (
SELECT Top 1 akey
FROM table1 a
WHERE a.akey > b.akey) AS [next]
FROM table1 AS b
WHERE (
SELECT Top 1 akey
FROM table1 a
WHERE a.akey > b.akey) <> [b].[akey]+1
ORDER BY b.akey
Where table1 is the table and akey is the sequenced number.
SELECT T.Value, T.next -1 FROM (
SELECT b.Value , (
SELECT Top 1 Value
FROM tblSequence a
WHERE a.Value > b.Value) AS [next]
FROM tblSequence b
) T WHERE T.next <> T.Value +1

If any clause when grouping

Doing a Sum() on a column adds up the values in that column based on group by. But lets say I want to sum these values only if all the values are not null or not 0, then I need a clause which checks if any of the values is 0 before it does the sum. How can I implement such a clause?
I'm using sql server 2005.
Thanks,
Barry
Let's supose your table schema is:
myTable( id, colA, value)
Then, one approach is:
Select colA, sum(value)
from myTable
group by colA
having count( id ) = count( nullif( value, 0 ))
Notice that nullif is a MSSQL server function. YOu should adapt code to your rdbms brand.
Explanation:
count aggregate function only count not null values. Here a counting null values test.
You say that 0+2+3=0 for this case. Assuming that NULL+2+3 should also be zero:
SELECT GroupField,
SUM(Value) * MIN(CASE WHEN COALESCE(Value, 0) = 0 THEN 0 ELSE 1 END)
FROM SumNonZero
GROUP BY GroupField
The above statement gives this result
GroupField (No column name)
case1 5
case2 0
case3 0
with this test data
CREATE TABLE SumNonZero (
GroupField CHAR(5) NOT NULL,
Value INT
)
INSERT INTO SumNonZero(GroupField, Value)
SELECT 'case1', 2
UNION ALL SELECT 'case1', 3
UNION ALL SELECT 'case2', 0
UNION ALL SELECT 'case2', 2
UNION ALL SELECT 'case2', 3
UNION ALL SELECT 'case3', NULL
UNION ALL SELECT 'case3', 3
UNION ALL SELECT 'case3', 4
It makes no sense to eliminate 0 from a SUM because it wont impact the sum.
But you may want to SUM based on another field:
select FIELD, sum(
case when(OTHER_FIELD>0) then FIELD
else 0
end)
from TABLE
group by TABLE

Resources