Find rows with duplicate entries in SQL Server

I need to find rows with duplicate order numbers in a table in SQL Server.
For example SELECT * shows:
No
-----
5001
5002
5003
5003
5003
5004
5005
5006
5006
I want to get
No
------
5003
5003
5003
5006
5006
Is it possible to write a query to do that?

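Although this question has been answered a thousand times on SO, use GROUP BY + HAVING (this returns each duplicated No once; join the result back to the table if you need every duplicate row):
SELECT No
FROM dbo.tablename
GROUP BY No
HAVING COUNT(*) > 1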

SELECT S.No
FROM [dbo].[Shows] S
INNER JOIN (
SELECT [No]
FROM [dbo].[Shows]
GROUP BY [No]
HAVING COUNT(*) > 1
) J
ON S.No = J.No

You can use COUNT() OVER() to get the count of rows per No and, in an outer query, filter out all non-duplicate rows:
SELECT [No]
FROM (
SELECT [No], COUNT(*) OVER (PARTITION BY [No]) AS cnt
FROM mytable ) t
WHERE cnt > 1
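As a quick way to check the logic outside SQL Server, here is a minimal sketch that runs the GROUP BY + HAVING join on the question's sample data using Python's built-in sqlite3 module. The table name orders and column name order_no are assumptions made for the sketch, and SQLite is only a stand-in for SQL Server here.

```python
import sqlite3


def duplicate_rows(order_numbers):
    """Return every row whose order number occurs more than once."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table and column names, chosen for this sketch only.
    conn.execute("CREATE TABLE orders (order_no INTEGER)")
    conn.executemany(
        "INSERT INTO orders (order_no) VALUES (?)",
        [(n,) for n in order_numbers],
    )
    rows = conn.execute(
        """
        SELECT o.order_no
        FROM orders AS o
        JOIN (
            -- order numbers that appear more than once
            SELECT order_no
            FROM orders
            GROUP BY order_no
            HAVING COUNT(*) > 1
        ) AS d ON d.order_no = o.order_no
        ORDER BY o.order_no
        """
    ).fetchall()
    conn.close()
    return [r[0] for r in rows]


sample = [5001, 5002, 5003, 5003, 5003, 5004, 5005, 5006, 5006]
print(duplicate_rows(sample))  # [5003, 5003, 5003, 5006, 5006]
```

Joining back to the table is what turns "one row per duplicated number" into "every duplicated row", which is the output the question asks for.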

Related

Choose row that equal to the max value from a query

I want to know who has the most friends in the app I own, based on its transactions: a friend is any user he was paid by or paid himself.
I can't get the query to show only those who have the maximum number of friends (it can be one user or many, and it can change over time, so I can't use LIMIT).
;with relationships as
(
select
paid as 'auser',
Member_No as 'afriend'
from Payments$
union all
select
member_no as 'auser',
paid as 'afriend'
from Payments$
),
DistinctRelationships AS (
SELECT DISTINCT *
FROM relationships
)
select
afriend,
count(*) cnt
from DistinctRelationShips
GROUP BY
afriend
order by
count(*) desc
I just can't figure it out, I've tried count, max(count), where = max, nothing worked.
It's a two-column table, "Member_No" and "Paid": Member_No is the member who pays the money, and Paid is the one who receives it.
Member_No Paid
14 18
17 1
12 20
12 11
20 8
6 3
2 4
9 20
8 10
5 20
14 16
5 2
12 1
14 10
It's from Excel, but I loaded it into sql-server.
It's just a sample, there are 1000 more rows
It seems like you are massively over-complicating this. There is no need for self-joining.
Just unpivot each row so you have both sides of the relationship, then group by one side and count the distinct values of the other side:
SELECT
-- for just the first then SELECT TOP (1)
-- for all that tie for the top place use SELECT TOP (1) WITH TIES
v.Id,
Relationships = COUNT(DISTINCT v.Other),
TotalTransactions = COUNT(*)
FROM Payments$ p
CROSS APPLY (VALUES
(p.Member_No, p.Paid),
(p.Paid, p.Member_No)
) v(Id, Other)
GROUP BY
v.Id
ORDER BY
COUNT(DISTINCT v.Other) DESC;
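The unpivot-then-count idea can be tried on the sample rows with Python's built-in sqlite3 module. This is only a sketch under assumptions: SQLite has no CROSS APPLY or TOP (1) WITH TIES, so the unpivot is written as a UNION ALL and the ties are kept by comparing against the maximum count. The table name payments and the CTE names are made up for the example.

```python
import sqlite3

# Sample rows from the question: (Member_No, Paid).
PAYMENTS = [
    (14, 18), (17, 1), (12, 20), (12, 11), (20, 8), (6, 3), (2, 4),
    (9, 20), (8, 10), (5, 20), (14, 16), (5, 2), (12, 1), (14, 10),
]


def most_connected(payments):
    """Return (id, distinct friend count) for every user tied for first place."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE payments (member_no INTEGER, paid INTEGER)")
    conn.executemany("INSERT INTO payments VALUES (?, ?)", payments)
    rows = conn.execute(
        """
        WITH pairs(id, other) AS (
            -- unpivot: each payment contributes both directions
            SELECT member_no, paid FROM payments
            UNION ALL
            SELECT paid, member_no FROM payments
        ),
        counts AS (
            SELECT id, COUNT(DISTINCT other) AS relationships
            FROM pairs
            GROUP BY id
        )
        SELECT id, relationships
        FROM counts
        WHERE relationships = (SELECT MAX(relationships) FROM counts)
        ORDER BY id
        """
    ).fetchall()
    conn.close()
    return rows


print(most_connected(PAYMENTS))  # [(20, 4)]
```

On the 14 sample rows, user 20 is linked to users 12, 8, 9 and 5, which is the largest number of distinct counterparties.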

How to Sum (MAX values) from different value groups in same column SQL Server

I have a table like this:
Date Consec_Days
2015-01-01 1
2015-01-03 1
2015-01-06 1
2015-01-07 2
2015-01-09 1
2015-01-12 1
2015-01-13 2
2015-01-14 3
2015-01-17 1
I need to Sum the max value (days) for each of the consecutive groupings where Consec_Days are > 1. So the correct result would be 5 days.
This is a type of gaps-and-islands problem.
There are many solutions; here is a simple one:
Get the start points of each group using LAG.
Calculate a grouping ID using a windowed conditional count.
Group by that ID, take the maximum of each group, and sum those maxima.
WITH StartPoints AS (
SELECT *,
-- a row starts a new group when the previous row's count is 1
IsStart = CASE WHEN LAG(Consec_Days) OVER (ORDER BY Date) = 1 THEN 1 END
FROM YourTable t
),
Groupings AS (
SELECT *,
-- running count of start points gives each group its own ID
GroupId = COUNT(IsStart) OVER (ORDER BY Date)
FROM StartPoints
WHERE Consec_Days > 1
),
GroupMax AS (
SELECT MaxDays = MAX(Consec_Days)
FROM Groupings
GROUP BY
GroupId
)
SELECT
SUM(MaxDays)
FROM GroupMax;
with cte as (
select Consec_Days,
coalesce(lead(Consec_Days) over (order by Date), 1) as next
from YourTable
)
select sum(Consec_Days)
from cte
where Consec_Days <> 1 and next = 1
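To see why the LEAD version sums the last value of each run, it can be run on the sample table with Python's built-in sqlite3 module. This is a sketch with assumed names (table days, columns day and consec_days, alias next_days), and it needs a SQLite build with window function support (3.25 or later).

```python
import sqlite3

# Sample rows from the question: (Date, Consec_Days).
DAYS = [
    ("2015-01-01", 1), ("2015-01-03", 1), ("2015-01-06", 1),
    ("2015-01-07", 2), ("2015-01-09", 1), ("2015-01-12", 1),
    ("2015-01-13", 2), ("2015-01-14", 3), ("2015-01-17", 1),
]


def sum_of_run_maxima(rows):
    """Sum the final (largest) count of every run longer than one day."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE days (day TEXT, consec_days INTEGER)")
    conn.executemany("INSERT INTO days VALUES (?, ?)", rows)
    total = conn.execute(
        """
        WITH cte AS (
            SELECT consec_days,
                   -- value on the following row; 1 when there is no next row
                   COALESCE(LEAD(consec_days) OVER (ORDER BY day), 1) AS next_days
            FROM days
        )
        -- a run ends where the next row restarts at 1
        SELECT SUM(consec_days)
        FROM cte
        WHERE consec_days <> 1 AND next_days = 1
        """
    ).fetchone()[0]
    conn.close()
    return total or 0  # SUM over no rows is NULL


print(sum_of_run_maxima(DAYS))  # 5
```

The two rows that qualify are 2015-01-07 (2) and 2015-01-14 (3), giving the expected 5.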

How to get non matching records in sql server in same table?

I have a table, say StudentBillDetails, in which data is saved annually; Yrid references another table. Now I am stuck with a problem: I want to retrieve the non-matching records described below.
Stid BillNo Yrid
1 525 3
1 525 1
1 525 4
2 443 4
2 442 1
2 443 3
In the table above, as you can see, StId 1 has the same BillNo for all three years, but StId 2 has a conflict in Yrid 1. I want to get records of this type.
If you just want to flag Stid values which have conflicts then the following simple query should work:
SELECT Stid
FROM yourTable
GROUP BY Stid
HAVING COUNT(DISTINCT BillNo) > 1
If you want the entire records you could try joining your table to the above query:
SELECT t1.*
FROM yourTable t1
INNER JOIN
( SELECT Stid FROM yourTable GROUP BY Stid HAVING COUNT(DISTINCT BillNo) > 1 ) t2
ON t1.Stid = t2.Stid
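Both queries can be checked against the question's six rows with Python's built-in sqlite3 module. This is a minimal sketch: the table name bills and the function names are assumptions, and SQLite stands in for SQL Server.

```python
import sqlite3

# Sample rows from the question: (Stid, BillNo, Yrid).
BILLS = [
    (1, 525, 3), (1, 525, 1), (1, 525, 4),
    (2, 443, 4), (2, 442, 1), (2, 443, 3),
]


def _load(rows):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE bills (stid INTEGER, billno INTEGER, yrid INTEGER)")
    conn.executemany("INSERT INTO bills VALUES (?, ?, ?)", rows)
    return conn


def conflicting_students(rows):
    """Stid values that have more than one distinct BillNo."""
    conn = _load(rows)
    result = conn.execute(
        """
        SELECT stid
        FROM bills
        GROUP BY stid
        HAVING COUNT(DISTINCT billno) > 1
        ORDER BY stid
        """
    ).fetchall()
    conn.close()
    return [r[0] for r in result]


def conflicting_records(rows):
    """Full records for every Stid flagged as conflicting."""
    conn = _load(rows)
    result = conn.execute(
        """
        SELECT t1.stid, t1.billno, t1.yrid
        FROM bills AS t1
        JOIN (
            SELECT stid
            FROM bills
            GROUP BY stid
            HAVING COUNT(DISTINCT billno) > 1
        ) AS t2 ON t1.stid = t2.stid
        ORDER BY t1.stid, t1.yrid
        """
    ).fetchall()
    conn.close()
    return result


print(conflicting_students(BILLS))  # [2]
print(conflicting_records(BILLS))   # [(2, 442, 1), (2, 443, 3), (2, 443, 4)]
```

Note that the join returns all of Stid 2's rows, not only the odd one out; picking out just the row with BillNo 442 would need an extra rule for which value counts as the "correct" one.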

T-SQL getting all unique groups with their usage count

How do I find the unique groups that are present in my table, and display how often that type of group is used?
For example (SQL Server 2008R2)
So, I would like to find out how many times the combination of
PMI 100
RT 100
VT 100
is present in my table and for how many itemid's it is used;
These three form a group because together they are assigned to a single itemid. The same combination is assigned to id 2527 and 2529, so therefore this group is used at least twice. (usagecount = 2)
(and I want to know that for all types of groups that are appearing)
The entire dataset is quite large, about 5.000.000 records, so I'd like to avoid using a cursor.
The number of code/pct combinations per itemid varies between 1 and 6.
The values in the "code" field are not known up front, there are more than a dozen values on average
I tried using pivot, but I got stuck eventually and I also tried various combinations of GROUP-BY and counts.
Any bright ideas?
Example output:
code pct groupid usagecount
PMI 100 1 234
RT 100 1 234
VT 100 1 234
CD 5 2 567
PMI 100 2 567
VT 100 2 567
PMI 100 3 123
PT 100 3 123
VT 100 3 123
RT 100 4 39
VT 100 4 39
etc
Just using a simple group:
SELECT
code
, pct
, COUNT(*)
FROM myTable
GROUP BY
code
, pct
Not too sure if that's more like what you're looking for:
select
uniqueGrp
, count(*)
from (
select distinct
itemid
from myTable
) as I
cross apply (
select
cast(code as varchar(max)) + cast(pct as varchar(max)) + '_'
from myTable
where myTable.itemid = I.itemid
order by code, pct
for xml path('')
) as x(uniqueGrp)
group by uniqueGrp
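The signature idea behind the FOR XML PATH query (build one ordered key per itemid, then count how many itemids share each key) can be sketched with Python's built-in sqlite3 module. The assumptions here are: the column names come from the question, the table name my_table is made up, the rows for itemid 2527 and 2529 follow the question's description, and itemid 2530 is a hypothetical extra row added to give a second group.

```python
import sqlite3
from collections import Counter
from itertools import groupby

# (itemid, code, pct). Items 2527 and 2529 share a group as described in
# the question; item 2530 is hypothetical illustration data.
ROWS = [
    (2527, "PMI", 100), (2527, "RT", 100), (2527, "VT", 100),
    (2529, "VT", 100), (2529, "PMI", 100), (2529, "RT", 100),
    (2530, "RT", 100), (2530, "VT", 100),
]


def group_usage(rows):
    """Map each distinct set of (code, pct) pairs to the number of itemids using it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE my_table (itemid INTEGER, code TEXT, pct INTEGER)")
    conn.executemany("INSERT INTO my_table VALUES (?, ?, ?)", rows)
    # Sorting by code and pct makes the signature independent of insert order.
    ordered = conn.execute(
        "SELECT itemid, code, pct FROM my_table ORDER BY itemid, code, pct"
    ).fetchall()
    conn.close()

    usage = Counter()
    for _, members in groupby(ordered, key=lambda r: r[0]):
        # One signature per itemid: the ordered tuple of its (code, pct) pairs.
        signature = tuple((code, pct) for _, code, pct in members)
        usage[signature] += 1
    return dict(usage)


for signature, count in group_usage(ROWS).items():
    print(signature, count)
```

A tuple is used as the signature rather than a concatenated string, since plain concatenation of code and pct could in principle make two different groups look identical (for example when a code ends in digits).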
Either of these should return each combination of code and pct, with a group id for the code and the total number of rows for that code. You could also extend them with the count of each specific code/pct combination, for example to work out its percentage contribution.
select
distinct
t.code, t.pct, v.groupcol, v.vol
from
[tablename] t
inner join (select code, rank() over(order by count(*)) as groupcol,
count(*) as vol from [tablename] s
group by code) v on v.code=t.code
or
select
t.code, t.pct, v.groupcol, v.vol
from
(select code, pct from [tablename] group by code, pct) t
inner join (select code, rank() over(order by count(*)) as groupcol,
count(*) as vol from [tablename] s
group by code) v on v.code=t.code
Grouping by code and pct should be enough, I think. See the following:
select code,pct,count(*)
from [table] as p
group by code,pct

SQL Server 2008 how to select top [column value] and random record?

I'm using SQL Server 2008. I want to select random rows, where the number of rows to select depends on a column value in another table. How can I do this?
My SQL statement is something like this, but it is wrong:
select top b.number a.name, a.link_id
from A a
left join B b on b.link_id = a.link_id
order by newid()
Here are my tables and the expected result.
Table A:
name link_id
james 100
albert 100
susan 100
simon 101
tom 101
fion 101
Table B:
link_id number
100 2
101 1
Expected result:
when run 1st time, result may be:
name link_id
james 100
susan 100
fion 101
2nd time result may be:
albert 100
susan 100
simon 101
3rd time could be:
james 100
albert 100
fion 101
Explanation:
In table B, link_id 100 has number 2,
meaning that 2 random records should be selected from table A for link_id = 100,
and 1 random record should be selected for link_id = 101.
You can use the ROW_NUMBER() function:
SELECT A.name, A.link_id
FROM(
SELECT name,link_id, ROW_NUMBER()OVER(PARTITION BY link_id ORDER BY NEWID()) rn
FROM dbo.tblA
) AS A
JOIN dbo.tblB AS B
ON A.link_id = B.link_id
WHERE A.rn <= B.number;
Here is a SqlFiddle to show this in action: http://sqlfiddle.com/#!3/92eac/2
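The ROW_NUMBER() approach can also be tried locally with Python's built-in sqlite3 module. This is a sketch under assumptions: SQLite's random() replaces NEWID(), the tables are simply named a and b, and a SQLite build with window functions (3.25 or later) is required. The picked names change from run to run, so only the per-group counts are stable.

```python
import sqlite3
from collections import Counter

# Sample data from the question.
TABLE_A = [
    ("james", 100), ("albert", 100), ("susan", 100),
    ("simon", 101), ("tom", 101), ("fion", 101),
]
TABLE_B = [(100, 2), (101, 1)]


def random_pick(a_rows, b_rows):
    """Pick b.number random rows from a for each link_id."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE a (name TEXT, link_id INTEGER)")
    conn.execute("CREATE TABLE b (link_id INTEGER, number INTEGER)")
    conn.executemany("INSERT INTO a VALUES (?, ?)", a_rows)
    conn.executemany("INSERT INTO b VALUES (?, ?)", b_rows)
    rows = conn.execute(
        """
        SELECT r.name, r.link_id
        FROM (
            -- number the rows of each link_id in a random order
            SELECT name, link_id,
                   ROW_NUMBER() OVER (PARTITION BY link_id ORDER BY random()) AS rn
            FROM a
        ) AS r
        JOIN b ON b.link_id = r.link_id
        WHERE r.rn <= b.number
        """
    ).fetchall()
    conn.close()
    return rows


picked = random_pick(TABLE_A, TABLE_B)
print(picked)
print(Counter(link_id for _, link_id in picked))
```

Because the row numbers restart for every link_id, the filter rn <= number keeps exactly as many rows per group as table B asks for, provided table A has at least that many.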
Try this:
SELECT a.*
FROM b
CROSS APPLY
(
SELECT TOP (b.number) a.*
FROM a
WHERE a.link_id = b.link_id
ORDER BY
NEWID()
) a
