How to consolidate rows in table for the given scenario? - sql-server

Let's say I have a table
CustId Name Age Gender Business Code
1 John 24 Male Automobiles 1
2 Peter 30 Male Space 3
2 Peter 30 Male IT null
3 Kris 48 Female Infra null
I need output as follows
CustId Name Age Gender Business Code
1 John 24 Male Automobiles 1
2 Peter 30 Male Space 3
3 Kris 48 Female CodeNotAvailable null
Peter has two businesses one with code and another without code. So, the row without code is removed.
Kris has business without code, so need to display CodeNotAvailable in Business column.

We can use ROW_NUMBER() to number the rows for each customer and pick the first one. In ascending order, SQL Server sorts NULL first, so we use ORDER BY Code DESC to make the row with a non-null code row number 1 within each CustId.
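SELECT CustId, Name, Age, Gender,
-- show the placeholder when the surviving row has no code
CASE WHEN Code IS NULL THEN 'CodeNotAvailable' ELSE Business END AS Business,
Code
FROM
(
SELECT *, ROW_NUMBER() OVER(PARTITION BY CustId ORDER BY Code DESC) AS rnk
FROM myTab -- myTab stands in for the real table name; TABLE is a reserved word
) AS t
WHERE rnk = 1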
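
To try the ROW_NUMBER() idea without a SQL Server instance, here is a minimal sketch that runs the same pattern on an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in (an assumption, not what the question uses), though it also sorts NULL lowest, so ORDER BY Code DESC behaves the same way. The table name myTab and the CASE expression for CodeNotAvailable are illustrative choices.

```python
import sqlite3

# In-memory SQLite database as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE myTab (CustId INT, Name TEXT, Age INT, Gender TEXT, Business TEXT, Code INT)"
)
conn.executemany(
    "INSERT INTO myTab VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "John", 24, "Male", "Automobiles", 1),
        (2, "Peter", 30, "Male", "Space", 3),
        (2, "Peter", 30, "Male", "IT", None),
        (3, "Kris", 48, "Female", "Infra", None),
    ],
)

# Rank rows per customer so a non-NULL Code comes first, keep rank 1,
# and swap in the placeholder when the kept row still has no code.
query = """
SELECT CustId, Name, Age, Gender,
       CASE WHEN Code IS NULL THEN 'CodeNotAvailable' ELSE Business END AS Business,
       Code
FROM (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY CustId ORDER BY Code DESC) AS rnk
    FROM myTab
) AS t
WHERE rnk = 1
ORDER BY CustId
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Peter's IT row is dropped because his Space row has a code, while Kris keeps her only row with the Business column replaced.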

;with r as (
select Custid, Name, Age, Gender,
case when code is null then 'CodeNotAvailable' else Business end as Business,
Code
from myTab
)
select max(CustId) CustId, Name, Age, Gender, Business, Code
from r
group by Name, Age, Gender, Business, Code

Related

How do I group these values

I have to take a person's race, gender, and age range. The combinations look like this:
Race 1 - Gender 1 - Age Range
Race 1 - Gender 2 - Age Range
Race 2 - Gender 1 - Age Range
Race 2 - Gender 2 - Age Range
and turn it into:
Group # | Average Age
Group 1 | 20-30
Group 2 | 40-50
Group 3 | 30-40
Group 4 | 40-50
The age is inputted as 20-30, 30-40, 40-50 so I have to find the most repeated string but I don't know how to tie it all together in 2 columns and 4 rows. I'm still new and would like to learn. Can anyone explain how I can do this?
I'm not quite clear on your table structure but perhaps something like this would work.
select GroupType, age
from (select race + gender as GroupType , age, count(*) as frequency,
ROW_NUMBER() OVER (PARTITION BY race + gender ORDER BY COUNT(*) DESC) as seqnum
from tbl
group by race + gender, age) g
where seqnum = 1
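
Since the table structure is not given, the following is only a sketch of the "most frequent age range per group" idea, run on SQLite through Python's sqlite3 module rather than on SQL Server. The table name tbl, its columns, and the sample rows are all hypothetical. SQLite concatenates strings with || where T-SQL uses +, and the counting is done in an inner subquery before ranking.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical layout: one row per person.
conn.execute("CREATE TABLE tbl (race TEXT, gender TEXT, age TEXT)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?)",
    [
        ("R1", "G1", "20-30"), ("R1", "G1", "20-30"), ("R1", "G1", "30-40"),
        ("R1", "G2", "40-50"), ("R1", "G2", "40-50"), ("R1", "G2", "20-30"),
        ("R2", "G1", "30-40"), ("R2", "G1", "30-40"), ("R2", "G1", "40-50"),
        ("R2", "G2", "40-50"),
    ],
)

# Step 1: count each age range per race/gender group.
# Step 2: rank the ranges inside each group by frequency.
# Step 3: keep the top-ranked range for every group.
query = """
SELECT grp, age
FROM (
    SELECT grp, age,
           ROW_NUMBER() OVER (PARTITION BY grp ORDER BY frequency DESC) AS seqnum
    FROM (
        SELECT race || '-' || gender AS grp, age, COUNT(*) AS frequency
        FROM tbl
        GROUP BY race, gender, age
    ) AS counts
) AS ranked
WHERE seqnum = 1
ORDER BY grp
"""
result = conn.execute(query).fetchall()
for grp, age in result:
    print(grp, age)
```

If two age ranges tie within a group, ROW_NUMBER() picks one arbitrarily; adding a second ORDER BY key would make that choice deterministic.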

How to filter columns with multiple values for each ID in SQL Server

I have the result set below and want to select a single record when the same ID has two records with different values in the Age and status columns (ID 4 in this example).
ID, name, and country come from table A; Age and status come from table B.
ID name country Age status
----------------------------------------------
1 Prasad India NULL NULL
2 John USA NULL NULL
3 GREG AUS NULL NULL
4 RAVI India NULL NULL
4 RAVI India 18 Years and Above 1
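Go with this:
Select *
From
(
Select t2.*,
-- DESC puts the row with a non-NULL Age first, because SQL Server sorts NULL lowest
ROW_NUMBER() over(partition by ID order by name, country, Age desc, status desc) as rn
From yourtable t2
) t -- a derived table needs an alias in SQL Server
Where rn = 1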

Removing Duplicates of two columns in a query

I have a select * query which gives lots of row and lots of columns of results. I have an issue with duplicates of one column A when given the same value of another column B that I would like to only include one of.
Basically I have a column that tells me the "name" of object and another that tells me the "number". Sometimes I have an object "name" with more than one entry for a given object "number". I only want distinct "numbers" within a "name" but I want the query to give the entire table when this is true and not just these two columns.
Name Number ColumnC ColumnD
Bob 1 93 12
Bob 2 432 546
Bob 3 443 76
This example above is fine
Name Number ColumnC ColumnD
Bob 1 93 12
Bob 2 432 546
Bill 1 443 76
Bill 2 54 1856
This example above is fine
Name Number ColumnC ColumnD
Bob 1 93 12
Bob 2 432 546
Bob 2 209 17
This example above is not fine, I only want one of the Bob 2's.
Try this if you are using SQL Server 2005 or later:
With ranked_records AS
(
select *,
ROW_NUMBER() OVER(Partition By name, number Order By name) [ranked]
from MyTable
)
select * from ranked_records
where ranked = 1
If you just want the Name and number, then
SELECT DISTINCT Name, Number FROM Table1
If you want to know how many of each there are, then
SELECT Name, Number, COUNT(*) FROM Table1 GROUP BY Name, Number
By using a Common Table Expression (CTE) and the ROW_NUMBER() OVER (PARTITION BY ...) syntax as follows:
WITH
CTE AS
(
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY Name, Number ORDER BY Name, Number) AS R
FROM
dbo.ATable
)
SELECT
*
FROM
CTE
WHERE
R = 1
WITH
CTE AS
(
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY Plant, BatchNumber ORDER BY Plant, BatchNumber) AS R
FROM dbo.StatisticalReports WHERE dbo.StatisticalReports.FermBatchStartTime >= DATEADD(d, -90, GETDATE())
)
SELECT
*
FROM
CTE
WHERE
R = 1
ORDER BY Plant, FermBatchStartTime -- the outer query can only see the CTE's columns, not dbo.StatisticalReports
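
The ROW_NUMBER() deduplication from the answers above can be checked against the question's third sample with a small sketch. It runs on SQLite via Python's sqlite3 module as a stand-in for SQL Server. One change is an assumption of mine: the window is ordered by ColumnC so that the surviving duplicate is deterministic, whereas ordering only by the partition columns lets the engine keep either row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (Name TEXT, Number INT, ColumnC INT, ColumnD INT)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?, ?)",
    [("Bob", 1, 93, 12), ("Bob", 2, 432, 546), ("Bob", 2, 209, 17)],
)

# Number the rows inside each (Name, Number) pair and keep the first one.
# ORDER BY ColumnC is an illustrative tie-breaker, not part of the answers.
query = """
WITH ranked_records AS (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY Name, Number ORDER BY ColumnC) AS ranked
    FROM MyTable
)
SELECT Name, Number, ColumnC, ColumnD
FROM ranked_records
WHERE ranked = 1
ORDER BY Name, Number
"""
deduped = conn.execute(query).fetchall()
for row in deduped:
    print(row)
```

All columns of the kept row come through intact, which is what DISTINCT on two columns alone cannot give you.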

Use Sum in certain conditions

I have say the following rows
Country Population
IE 30
IE 20
UK 15
DE 20
DE 10
UK 20
BE 5
So basically I want to net the values together only for IE and DE... the rest I just want the values
So this would sum them all ..
Select Country, Sum(Population) From CountryPopulation group by Country
and I can add a where clause to exclude all other countries except IE and DE... but I also want these in the result set but just not summed.
So the table above would look like this when summed
Country Population
IE 50 -- Summed
UK 15 -- Original Value
DE 30 -- Summed
UK 20 -- Original Value
BE 5 -- Original Value
The problem is that I can't get a SUM with IF or CASE to work, because the query has to be aggregated with GROUP BY. The only other ways I can think of are to
sum all the IE and DE rows and union the result with the rest of the data,
or
maybe use a CTE.
Is there a nice, slick way of doing this?
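Select case when Country in ('IE','DE') then 'IE_DE' else Country end as Country,
Sum(Population)
From CountryPopulation
-- the select list must repeat the grouping expression; note this pools IE with DE and also sums the rows of every other country
group by case when Country in ('IE','DE')
then 'IE_DE'
else Country
end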
declare @t table (Country char(2), Population int)
insert into @t (Country, Population) values
('IE',30),
('IE',20),
('UK',15),
('DE',20),
('DE',10),
('UK',20),
('BE',5 )
; With Ordered as (
select Country,Population,CASE
WHEN Country in ('IE','DE') THEN 1
ELSE ROW_NUMBER() OVER (ORDER BY Country)
END as rn
from @t
)
select Country,rn,SUM(Population)
from Ordered
group by Country,rn
Produces:
Country rn
------- -------------------- -----------
BE 1 5
DE 1 30
IE 1 50
UK 6 15
UK 7 20
The trick is to just introduce a unique value for each row, except for the IE and DE rows that all get a 1. If the source rows all, actually, already have such a unique value then the CTE can be simplified (or avoided, at the expense of having to place the CASE expression in the GROUP BY as well as the SELECT)
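
As a runnable sketch of that trick, the query below applies the same CASE around ROW_NUMBER() to the sample data, using SQLite through Python's sqlite3 module instead of SQL Server (an assumption made so the example is self-contained). The table name CountryPopulation comes from the question; the Total alias and the final ORDER BY are added only to make the output stable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CountryPopulation (Country TEXT, Population INT)")
conn.executemany(
    "INSERT INTO CountryPopulation VALUES (?, ?)",
    [("IE", 30), ("IE", 20), ("UK", 15), ("DE", 20), ("DE", 10), ("UK", 20), ("BE", 5)],
)

# IE and DE rows all share rn = 1, so they collapse per country.
# Every other row gets its own row number and stays a group of one.
query = """
WITH Ordered AS (
    SELECT Country, Population,
           CASE WHEN Country IN ('IE', 'DE') THEN 1
                ELSE ROW_NUMBER() OVER (ORDER BY Country)
           END AS rn
    FROM CountryPopulation
)
SELECT Country, SUM(Population) AS Total
FROM Ordered
GROUP BY Country, rn
ORDER BY Country, Total
"""
totals = conn.execute(query).fetchall()
for country, total in totals:
    print(country, total)
```

The two UK rows stay separate because their row numbers differ, while IE and DE are each summed into a single row.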
You could also use UNION ALL and divide this query into two:
SELECT P.country,
P.population
FROM (SELECT country,
Population = Sum(population)
FROM dbo.countrypopulation cp
WHERE country IN ( 'IE', 'DE' )
GROUP BY country
UNION ALL
SELECT country, population
FROM dbo.countrypopulation cp
WHERE country NOT IN ( 'IE', 'DE' )
) P
ORDER BY P.population DESC
Even if this is not so concise it is readable and efficient.
sql-fiddle

SQL query like GROUP BY with OR condition

I'll try to describe the real situation. In our company we have a reservation system with a table, let's call it Customers, where e-mail and phone contacts are saved with each incoming order. That's the part of the system I can't change. My problem is how to count unique customers. By a unique customer I mean a group of people who share either the same e-mail or the same phone number.
Example 1: In real life you can imagine Tom and Sandra, who are married. Tom, who ordered 4 products, entered 3 different e-mail addresses and 2 different phone numbers in our reservation system. He shares one of those numbers with Sandra (as a home phone), so I can presume they are connected somehow. Besides this shared phone number, Sandra also entered her private one, and she used a single e-mail address for both of her orders. For me this means all of the following rows count as one unique customer. So in fact this unique customer may grow into a whole family.
ID E-mail Phone Comment
---- ------------------- -------------- ------------------------------
0 tom@email.com +44 111 111 First row
1 tommy@email.com +44 111 111 Same phone, different e-mail
2 thomas@email.com +44 111 111 Same phone, different e-mail
3 thomas@email.com +44 222 222 Same e-mail, different phone
4 sandra@email.com +44 222 222 Same phone, different e-mail
5 sandra@email.com +44 333 333 Same e-mail, different phone
As ypercube said I will probably need a recursion to count all of these unique customers.
Example 2: Here is an example of what I want to do. Is it possible to get the count of unique customers without recursion, for instance by using a cursor, or is recursion necessary?
ID E-mail Phone Comment
---- ------------------- -------------- ------------------------------
0 linsey@email.com +44 111 111 ─┐
1 louise@email.com +44 111 111 ├─ 1. unique customer
2 louise@email.com +44 222 222 ─┘
---- ------------------- -------------- ------------------------------
3 steven@email.com +44 333 333 ─┐
4 steven@email.com +44 444 444 ├─ 2. unique customer
5 sandra@email.com +44 444 444 ─┘
---- ------------------- -------------- ------------------------------
6 george@email.com +44 555 555 ─── 3. unique customer
---- ------------------- -------------- ------------------------------
7 xavier@email.com +44 666 666 ─┐
8 xavier@email.com +44 777 777 ├─ 4. unique customer
9 xavier@email.com +44 888 888 ─┘
---- ------------------- -------------- ------------------------------
10 robert@email.com +44 999 999 ─┐
11 miriam@email.com +44 999 999 ├─ 5. unique customer
12 sherry@email.com +44 999 999 ─┘
---- ------------------- -------------- ------------------------------
----------------------------------------------------------------------
Result ∑ = 5 unique customers
----------------------------------------------------------------------
I've tried a query with GROUP BY but I don't know how to group the result by either first or second column. I'm looking for let's say something like
SELECT COUNT(*) FROM Customers
GROUP BY Email OR Phone
Thanks again for any suggestions
P.S.
I really appreciate the answers given to this question before the complete rephrase. The answers here may no longer correspond to the update, so please don't downvote them (except the question, of course :). I completely rewrote this post. Thanks, and sorry for my wrong start.
Here is a full solution using a recursive CTE.
;WITH Nodes AS
(
SELECT DENSE_RANK() OVER (ORDER BY Part, PartRank) SetId
, [ID]
FROM
(
SELECT [ID], 1 Part, DENSE_RANK() OVER (ORDER BY [E-mail]) PartRank
FROM dbo.Customer
UNION ALL
SELECT [ID], 2, DENSE_RANK() OVER (ORDER BY Phone) PartRank
FROM dbo.Customer
) A
),
Links AS
(
SELECT DISTINCT A.Id, B.Id LinkedId
FROM Nodes A
JOIN Nodes B ON B.SetId = A.SetId AND B.Id < A.Id
),
Routes AS
(
SELECT DISTINCT Id, Id LinkedId
FROM dbo.Customer
UNION ALL
SELECT DISTINCT Id, LinkedId
FROM Links
UNION ALL
SELECT A.Id, B.LinkedId
FROM Links A
JOIN Routes B ON B.Id = A.LinkedId AND B.LinkedId < A.Id
),
TransitiveClosure AS
(
SELECT Id, Id LinkedId
FROM Links
UNION
SELECT LinkedId Id, LinkedId
FROM Links
UNION
SELECT Id, LinkedId
FROM Routes
),
UniqueCustomers AS
(
SELECT Id, MIN(LinkedId) UniqueCustomerId
FROM TransitiveClosure
GROUP BY Id
)
SELECT A.Id, A.[E-mail], A.Phone, B.UniqueCustomerId
FROM dbo.Customer A
JOIN UniqueCustomers B ON B.Id = A.Id
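
The grouping the question asks for amounts to finding connected components: two rows are linked when they share an e-mail or a phone, and links chain transitively. Outside the database, one common way to compute this is a union-find (disjoint-set) structure. The sketch below is an alternative technique, not a translation of the CTE above; the function name count_unique_customers and the tuple layout are illustrative assumptions. The data is Example 2 from the question.

```python
def count_unique_customers(rows):
    """Count groups of rows connected by a shared e-mail or phone.

    rows is a list of (email, phone) tuples; the row index is the node id.
    """
    parent = list(range(len(rows)))

    def find(i):
        # Follow parent links to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the smaller index as the root.
            parent[max(root_a, root_b)] = min(root_a, root_b)

    first_seen = {}  # ("email" | "phone", value) -> first row index with it
    for idx, (email, phone) in enumerate(rows):
        for key in (("email", email), ("phone", phone)):
            if key in first_seen:
                union(idx, first_seen[key])
            else:
                first_seen[key] = idx

    return len({find(i) for i in range(len(rows))})


example_2 = [
    ("linsey@email.com", "+44 111 111"),
    ("louise@email.com", "+44 111 111"),
    ("louise@email.com", "+44 222 222"),
    ("steven@email.com", "+44 333 333"),
    ("steven@email.com", "+44 444 444"),
    ("sandra@email.com", "+44 444 444"),
    ("george@email.com", "+44 555 555"),
    ("xavier@email.com", "+44 666 666"),
    ("xavier@email.com", "+44 777 777"),
    ("xavier@email.com", "+44 888 888"),
    ("robert@email.com", "+44 999 999"),
    ("miriam@email.com", "+44 999 999"),
    ("sherry@email.com", "+44 999 999"),
]
unique_count = count_unique_customers(example_2)
print(unique_count)
```

This runs in a single pass over the rows, so it may be a practical fallback when the recursive query becomes slow on a large table.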
Finding groups that have only same Phone:
SELECT
ID
, Name
, Phone
, DENSE_RANK() OVER (ORDER BY Phone) AS GroupPhone
FROM
MyTable
ORDER BY
GroupPhone
, ID
Finding groups that have only same Name:
SELECT
ID
, Name
, Phone
, DENSE_RANK() OVER (ORDER BY Name) AS GroupName
FROM
MyTable
ORDER BY
GroupName
, ID
Now, for the (complex) query you describe, let's say we have a table like this instead:
ID Name Phone
---- ------------- -------------
0 Kate +44 333 333
1 Sandra +44 000 000
2 Thomas +44 222 222
3 Robert +44 000 000
4 Thomas +44 444 444
5 George +44 222 222
6 Kate +44 000 000
7 Robert +44 444 444
--------------------------------
Should all these be in one group? They all share a name or a phone with someone else, forming a "chain" of related persons:
0-6 same name
6-1-3 same phone
3-7 same name
7-4 same phone
4-2 same name
2-5 same phone
For the dataset in the example you could write something like this:
;WITH Temp AS (
SELECT Name, Phone,
DENSE_RANK() OVER (ORDER BY Name) AS NameGroup,
DENSE_RANK() OVER (ORDER BY Phone) AS PhoneGroup
FROM MyTable)
SELECT MAX(Phone), MAX(Name), COUNT(*)
FROM Temp
GROUP BY NameGroup, PhoneGroup
I don't know if this is the best solution, but here it is:
SELECT
MyTable.ID, MyTable.Name, MyTable.Phone,
CASE WHEN N.No = 1 AND P.No = 1 THEN 1
WHEN N.No = 1 AND P.No > 1 THEN 2
WHEN N.No > 1 OR P.No > 1 THEN 3
END as GroupRes
FROM
MyTable
JOIN (SELECT Name, count(Name) No FROM MyTable GROUP BY Name) N on MyTable.Name = N.Name
JOIN (SELECT Phone, count(Phone) No FROM MyTable GROUP BY Phone) P on MyTable.Phone = P.Phone
The problem is that some of the joins are made on varchar columns, which could increase execution time.
Here is my solution:
SELECT p.LastName, P.FirstName, P.HomePhone,
CASE
WHEN ph.PhoneCount=1 THEN
CASE
WHEN n.NameCount=1 THEN 'unique name and phone'
ELSE 'common name'
END
ELSE
CASE
WHEN n.NameCount=1 THEN 'common phone'
ELSE 'common phone and name'
END
END
FROM Contacts p
INNER JOIN
(SELECT HomePhone, count(LastName) as PhoneCount
FROM Contacts
GROUP BY HomePhone) ph ON ph.HomePhone = p.HomePhone
INNER JOIN
(SELECT FirstName, count(LastName) as NameCount
FROM Contacts
GROUP BY FirstName) n ON n.FirstName = p.FirstName
LastN FirstN Phone Comment
Hoover Brenda 8138282334 unique name and phone
Washington Brian 9044563211 common name
Roosevelt Brian 7737653279 common name
Reagan Charles 7734567869 unique name and phone
