Concatenate distinct surname values for each distinct name value - sql-server

What I want to achieve is the concatenation of all DISTINCT surname values for each DISTINCT name value.
What I have managed so far is the concatenation for each DISTINCT name value, but unfortunately it includes all surname values, duplicates too.
Below is my code:
SELECT DISTINCT ST2.[Name],
SUBSTRING(
(
SELECT ','+ST1.Surname AS [text()]
FROM [Ext_Names] ST1
WHERE ST1.[Name] = ST2.[Name]
ORDER BY ST1.[Name]
FOR XML PATH ('')
), 2, 1000) [Surname]
FROM [Ext_Names] ST2
Sample Data
Result
Desired output

You need to select the distinct values first, then aggregate. If you are running SQL Server 2017 or higher, you can use string_agg():
select name, string_agg(surname, ',') within group (order by surname) surnames
from (select distinct name, surname from ext_names) t
group by name
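To make the "distinct first, then aggregate" idea easy to try out, here is a minimal sketch using Python's built-in sqlite3 module. It is not SQL Server: SQLite's group_concat() stands in for string_agg(), and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ext_names (name TEXT, surname TEXT)")
# Hypothetical sample rows; each name has a repeated surname.
conn.executemany(
    "INSERT INTO ext_names VALUES (?, ?)",
    [
        ("John", "Smith"),
        ("John", "Smith"),
        ("John", "Doe"),
        ("Mary", "Jones"),
        ("Mary", "Jones"),
    ],
)

rows = conn.execute(
    """
    SELECT name, group_concat(surname, ',') AS surnames
    FROM (SELECT DISTINCT name, surname FROM ext_names) AS t
    GROUP BY name
    """
).fetchall()

# SQLite does not promise an order inside group_concat, so sort in Python.
result = {name: sorted(surnames.split(",")) for name, surnames in rows}
print(result)
```

Each surname shows up once per name because the duplicates are removed in the derived table before the aggregate runs.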

You can do it with this query, which selects the distinct surnames in a derived table before concatenating them:
SELECT DISTINCT ST2.[Name],
SUBSTRING(
(
SELECT ',' + T.Surname AS [text()]
FROM (SELECT DISTINCT ST1.Surname FROM Ext_Names ST1 WHERE ST1.[Name] = ST2.[Name]) T
ORDER BY T.Surname
FOR XML PATH ('')
), 2, 1000) [Surname]
FROM Ext_Names ST2

Related

TSQL: group by Substring (Name) and retrieve ID in SELECT

We have company data stored in a table. To de-duplicate the rows, we need to identify duplicate companies using the following criterion: if the first five letters of CompanyName, the City, and the postal code match the same fields of another record, it is a duplicate. We will remove the duplicates later. The problem I am running into is that I can't retrieve the IDs of these records, since I am not grouping on ID.
I am using the following SQL:
Select count(ID) as DupCount
, SUBSTRING(Name,1,5) as Name
, City
, PostalCode
from tblCompany
group by SUBSTRING(Name,1,5)
, City
, PostalCode
Having count(ID) > 1
order by count(ID) desc
How do I retrieve the ID of these records?
You can use window functions:
Select c.*
from (select c.*,
count(*) over (partition by left(Name, 5), City, PostalCode) as cnt
from tblCompany c
) c
where cnt >= 2;
This will return the individual rows with dups. You can then summarize this or do what you want with the result set.
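As a rough, runnable illustration of the window-function approach, here is a sketch with Python's sqlite3 module (SQLite 3.25 or later is assumed for window functions). substr() replaces left(), and the company rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblCompany (ID INTEGER, Name TEXT, City TEXT, PostalCode TEXT)"
)
# Hypothetical rows: IDs 1 and 2 share the first five letters, city and postal code.
conn.executemany(
    "INSERT INTO tblCompany VALUES (?, ?, ?, ?)",
    [
        (1, "Acme Corp", "Oslo", "0150"),
        (2, "Acme Corporation", "Oslo", "0150"),
        (3, "Acme Corp", "Bergen", "5003"),
        (4, "Globex", "Oslo", "0150"),
    ],
)

dup_ids = [
    row[0]
    for row in conn.execute(
        """
        SELECT ID
        FROM (SELECT c.*,
                     COUNT(*) OVER (PARTITION BY substr(Name, 1, 5), City, PostalCode) AS cnt
              FROM tblCompany c) AS x
        WHERE cnt >= 2
        ORDER BY ID
        """
    )
]
print(dup_ids)
```

Because the count is attached to every row instead of collapsing the group, the ID column stays available in the outer query.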
Use group_concat() to get the ids as a comma-separated list (note that group_concat() is MySQL syntax; SQL Server 2017 and later offer string_agg() instead):
select
SUBSTRING(Name,1,5) as Name,
City,
PostalCode,
count(ID) as counter,
group_concat(id order by id) as ids
from tblCompany
group by SUBSTRING(Name,1,5), City, PostalCode
having count(ID) > 1
order by count(ID) desc

How to use DISTINCT keyword in SQL Server?

How do I use the DISTINCT keyword in SQL Server? Specifically, can it be applied to a single given field?
select id, name, age
from dbo.XXX
The query returns multiple rows. I would like to find out how many distinct values of id, name, or age there are.
select distinct id, name, age from dbo.XXX or
select id, distinct name, age from dbo.XXX or
select id, name, distinct age from dbo.XXX
To sum up, I would like a single SQL statement that returns the distinct count of each field, something like select distinct id, distinct name, distinct age from dbo.XXX
Dense_Rank can be used to calculate a distinct count for any column and multiple columns:
Select col1, col2, col3,
dense_rank() over (partition by [col1] order by [Unique ID]) + dense_rank() over (partition by [col1] order by [Unique ID] desc) - 1 as DistCountCol1,
dense_rank() over (partition by [col2] order by [Unique ID]) + dense_rank() over (partition by [col2] order by [Unique ID] desc) - 1 as DistCountCol2,
dense_rank() over (partition by [col3] order by [Unique ID]) + dense_rank() over (partition by [col3] order by [Unique ID] desc) - 1 as DistCountCol3
from [table]
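If the dense_rank arithmetic looks opaque, the following sketch may help. It uses Python's sqlite3 module (SQLite 3.25+ assumed) with a made-up table t, where uid plays the role of [Unique ID]. Within each col1 partition, the ascending rank plus the descending rank minus 1 equals the number of distinct uid values in that partition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (uid INTEGER, col1 TEXT)")
# Hypothetical data: partition 'a' has uids {1, 2}, partition 'b' has uid {3}.
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, "a"), (1, "a"), (2, "a"), (3, "b")],
)

rows = conn.execute(
    """
    SELECT uid, col1,
           DENSE_RANK() OVER (PARTITION BY col1 ORDER BY uid)
         + DENSE_RANK() OVER (PARTITION BY col1 ORDER BY uid DESC) - 1 AS dist_count
    FROM t
    ORDER BY col1, uid
    """
).fetchall()
print(rows)
```

Every row in partition 'a' reports 2 and the single row in partition 'b' reports 1, which matches the distinct uid counts.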
select distinct ID
from dbo.XXX
Select distinct name
from dbo.XXX
Select distinct age
from dbo.XXX
If you want to know how many rows you have for each distinct ID or Name or Age, you can use the following:
Select ID, count(id) as [ID_Recurrence]
from dbo.XXX
group by ID
Select Age, count(age) as [Age_Recurrence]
from dbo.XXX
group by Age
Select Name, count(name) as [Name_Recurrence]
from dbo.XXX
group by Name
The DISTINCT keyword returns unique rows, as in the following:
SELECT DISTINCT ID FROM SomeTable
SELECT DISTINCT ID , SCORE FROM SomeTable
If you want one row per unique value of a column, try the following code (copied from elsewhere):
select t.id, t.player_name, t.team
from tablename t
join (select team, min(id) as minid from tablename group by team) x
on x.team = t.team and x.minid = t.id
select COUNT(distinct id) uniqueIDCount
from dbo.XXX
This counts the distinct values of the id field. If you want to count distinct values for a combination of fields, you must concatenate the fields. Assuming your id is an integer and name is nvarchar:
select COUNT(distinct CONVERT(nvarchar, id) + name) uniqueIDCount
from dbo.XXX
Note that although this approach looks concise, it is probably not the most efficient one. Here is a more efficient, though slightly more involved, method:
with c as (
select distinct id, name
from dbo.XXX
)
select COUNT(1)
from c
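One more reason to prefer the SELECT DISTINCT form: plain concatenation can merge different pairs into the same string. The sketch below shows this with Python's sqlite3 module; CAST and || stand in for CONVERT and +, and the rows are made up to trigger the collision.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE xxx (id INTEGER, name TEXT)")
# Hypothetical rows: (1, '1a') and (11, 'a') both concatenate to '11a'.
conn.executemany(
    "INSERT INTO xxx VALUES (?, ?)",
    [(1, "1a"), (11, "a"), (11, "a")],
)

# Count distinct concatenated strings.
concat_count = conn.execute(
    "SELECT COUNT(DISTINCT CAST(id AS TEXT) || name) FROM xxx"
).fetchone()[0]

# Count distinct (id, name) pairs via a derived table.
pair_count = conn.execute(
    "SELECT COUNT(*) FROM (SELECT DISTINCT id, name FROM xxx) AS c"
).fetchone()[0]

print(concat_count, pair_count)
```

The concatenated count comes out as 1 while there are really 2 distinct pairs. Adding a separator that cannot occur in the data would reduce the risk, but the derived-table form avoids the issue entirely.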
Not sure why it's complicated. You can run three separate queries and UNION them to return a single result set if you want.

Error in SQLServer: Subquery returned more than 1 value

I would like to insert data into the Clients table from two different tables (Surname and Name). Moreover, I would like a third column (email) that is a concatenation of the first two. When I try the code below, it gives me the following error: "Subquery returned more than 1 value".
insert into CLIENTS (LastName,Firstname, EMAIL)
select (select top 150 Surname from Surname order by NEWID()),
(select top 150 Name from Name order by Newid()),
(select concat(concat(FirstName, LastName),'@novaims.com') from clients);
Could you please help me understand where the problem is?
The error message says it all: your subqueries can return more than one row. Try this:
;WITH cte
AS (SELECT 1 AS val
UNION ALL
SELECT val + 1
FROM cte
WHERE val < 150)
SELECT FirstName,
LastName,
Concat(FirstName, LastName, '@novaims.com')
FROM cte
OUTER apply (SELECT TOP 1 Surname FROM Surname ORDER BY Newid()) s (LastName)
OUTER apply (SELECT TOP 1 NAME FROM NAME ORDER BY Newid()) n (FirstName)
Option (Maxrecursion 0)
You need to move the table references to the from clause. I think this does what you want:
insert into CLIENTS (LastName, Firstname, EMAIL)
select surname, name, concat(name, surname, '@novaims.com')
from (select Surname, row_number() over (order by newid()) as seqnum
from Surname
) s join
(select Name, row_number() over (order by newid()) as seqnum
from Name
) n
on n.seqnum = s.seqnum;
Another method uses apply:
insert into CLIENTS (LastName, Firstname, EMAIL)
select top 150 s.surname, n.name, concat(n.name, s.surname, '@novaims.com')
from surname s cross apply
(select top 1 n.*
from name n
order by newid()
) n
order by newid();
This is more similar to your original idea. Do note, though, that the same name can appear more than once. And the performance should be better for the first version (because the sort is only happening once on each table).
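For anyone who wants to experiment with the row_number() pairing from the first version, here is a small sketch using Python's sqlite3 module (SQLite 3.25+ assumed). random() stands in for newid(), || stands in for concat(), and the sample names and the '@example.com' domain are placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Surname (Surname TEXT)")
conn.execute("CREATE TABLE Name (Name TEXT)")
conn.execute("CREATE TABLE Clients (LastName TEXT, FirstName TEXT, Email TEXT)")
# Hypothetical sample data.
conn.executemany("INSERT INTO Surname VALUES (?)", [("Smith",), ("Jones",), ("Brown",)])
conn.executemany("INSERT INTO Name VALUES (?)", [("Ann",), ("Bob",), ("Cid",)])

# Number each table in random order, then pair rows that got the same number.
conn.execute(
    """
    INSERT INTO Clients (LastName, FirstName, Email)
    SELECT s.Surname, n.Name, n.Name || s.Surname || '@example.com'
    FROM (SELECT Surname, ROW_NUMBER() OVER (ORDER BY random()) AS seqnum
          FROM Surname) AS s
    JOIN (SELECT Name, ROW_NUMBER() OVER (ORDER BY random()) AS seqnum
          FROM Name) AS n
      ON n.seqnum = s.seqnum
    """
)

clients = conn.execute("SELECT LastName, FirstName, Email FROM Clients").fetchall()
print(clients)
```

The pairing differs from run to run, but every surname and every name is used exactly once, because each sequence number appears once per table.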

How to find the maximum value in join without using if in sql stored procedure

I have two tables like the ones below:
A
Id Name
1 a
2 b
B
Id Name
1 t
6 s
My requirement is to find the maximum id across both tables and display the name and id for that maximum, without using CASE or IF.
I found the maximum using the query below:
SELECT MAX(id)
FROM (SELECT id,name FROM A
UNION
SELECT id,name FROM B) as c
The query above finds the maximum, 6, but I am not able to get the name. I tried the query below, but it does not work:
SELECT MAX(id)
FROM (SELECT id,name FROM A
UNION
SELECT id,name FROM B) as c
How to find the name?
Any help will be greatly appreciated!!!
First combine the tables, since you need to search both. Next, determine the id you need. Then JOIN that id back to the combined result to retrieve the name that belongs to it:
WITH tmpTable AS (
SELECT id,name FROM A
UNION
SELECT id,name FROM B
)
, max AS (
SELECT MAX(id) id
FROM tmpTable
)
SELECT t.id, t.name
FROM max m
JOIN tmpTable t ON m.id = t.id
You could use ROW_NUMBER(). You have to UNION ALL TableA and TableB first.
WITH TableA(Id, Name) AS(
SELECT 1, 'a' UNION ALL
SELECT 2, 'b'
)
,TableB(Id, Name) AS(
SELECT 1, 't' UNION ALL
SELECT 6, 's'
)
,Combined(Id, Name) AS(
SELECT * FROM TableA UNION ALL
SELECT * FROM TableB
)
,CTE AS(
SELECT *, RN = ROW_NUMBER() OVER(ORDER BY ID DESC) FROM Combined
)
SELECT Id, Name
FROM CTE
WHERE RN = 1
Just ORDER BY over the union and take the first row:
SELECT TOP 1 * FROM (SELECT * FROM A UNION SELECT * FROM B) x
ORDER BY ID DESC
This won't show ties though.
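To see what "won't show ties" means in practice, here is a sketch with Python's sqlite3 module. LIMIT 1 stands in for TOP 1, and the row (6, 'u') is a hypothetical addition to the question's data so that the maximum id is tied.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE B (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO A VALUES (?, ?)", [(1, "a"), (2, "b")])
# (6, 'u') is a hypothetical extra row that creates a tie on the maximum id.
conn.executemany("INSERT INTO B VALUES (?, ?)", [(1, "t"), (6, "s"), (6, "u")])

combined = "SELECT id, name FROM A UNION SELECT id, name FROM B"

# First row only: exactly one of the tied rows is returned.
top_one = conn.execute(
    f"SELECT id, name FROM ({combined}) AS x ORDER BY id DESC LIMIT 1"
).fetchall()

# Filter on the maximum instead: every tied row is returned.
all_max = conn.execute(
    f"""
    WITH c AS ({combined})
    SELECT id, name FROM c
    WHERE id = (SELECT MAX(id) FROM c)
    ORDER BY name
    """
).fetchall()

print(top_one, all_max)
```

If ties matter, comparing against MAX(id) returns all of them without needing CASE or IF.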
Since you stated that you are using SQL Server 2008, I used FULL JOIN and nested SELECTs to get what you're looking for. See below:
SELECT
(SELECT
1,
ISNULL(A.Id,B.Id)Id
FROM A FULL JOIN B ON A.Id=B.Id) AS Id,
(SELECT
1,
ISNULL(A.Name,B.Name)Name
FROM A FULL JOIN B ON A.Id=B.Id) AS Name
You can use the ROW_NUMBER() or DENSE_RANK() functions to number the rows by Id in descending order, and then select the rows whose new orderId equals 1.
Use:
ROW_NUMBER() to get only one row (even if the maximum id is repeated)
DENSE_RANK() to get all rows that share the maximum id
Here is an example:
DECLARE @tb1 AS TABLE
(
Id INT
,[Name] NVARCHAR(255)
)
DECLARE @tb2 AS TABLE
(
Id INT
,[Name] NVARCHAR(255)
)
INSERT INTO @tb1 VALUES (1, 'A');
INSERT INTO @tb1 VALUES (7, 'B');
INSERT INTO @tb2 VALUES (4, 'C');
INSERT INTO @tb1 VALUES (7, 'D');
SELECT * FROM
(SELECT Id, Name, ROW_NUMBER() OVER (ORDER BY Id DESC) AS [orderId]
FROM
(SELECT Id, Name FROM @tb1
UNION
SELECT Id, Name FROM @tb2) as tb3) AS TB
WHERE [orderId] = 1
SELECT * FROM
(SELECT Id, Name, DENSE_RANK() OVER (ORDER BY Id DESC) AS [orderId]
FROM
(SELECT Id, Name FROM @tb1
UNION
SELECT Id, Name FROM @tb2) as tb3) AS TB
WHERE [orderId] = 1
Results are:

Can I get comma separated values from sub query? If not, how to get this done?

I have a table
Create table Country_State_Mapping
(
Country nvarchar(max),
State nvarchar(max)
)
With 5 records.
Insert into Country_State_Mapping values('INDIA', 'Maharastra')
Insert into Country_State_Mapping values('INDIA', 'Bengal')
Insert into Country_State_Mapping values('INDIA', 'Karnatak')
Insert into Country_State_Mapping values('USA', 'Alaska')
Insert into Country_State_Mapping values('USA', 'California')
I need to write a SQL query that gives me 2 records with 2 columns each, as below.
The first column is Country and the second is AllStates.
The first record (2 columns) will be
India and Maharastra,Bengal,Karnatak
and the second
USA and Alaska,California
I tried it like this:
select distinct
OutTable.Country,
(select State
from Country_State_Mapping InnerTable
where InnerTable.Country = OutTable.Country)
from Country_State_Mapping AS OutTable
but it did not work...
SELECT Country, AllStates =
STUFF((SELECT ', ' + State
FROM Country_State_Mapping b
WHERE b.Country = a.Country
FOR XML PATH('')), 1, 2, '')
FROM Country_State_Mapping a
GROUP BY Country
This is a bit nasty and potentially slow if the number of records in the Country_State_Mapping table is large, but it does get the job done. It uses the recursive feature of Common Table Expressions, introduced in SQL Server 2005.
;WITH Base
AS
(
SELECT
ROW_NUMBER() OVER (PARTITION BY Country ORDER BY Country, [State] DESC) AS CountryRowId,
ROW_NUMBER() OVER (ORDER BY Country, [State]) AS RowId,
Country,
[State]
FROM Country_State_Mapping
),
Recur
AS
(
SELECT
CountryRowId,
RowId,
Country,
[State]
FROM Base
WHERE RowId = 1
UNION ALL
SELECT
B.CountryRowId,
B.RowId,
B.Country,
CASE WHEN R.Country <> B.Country THEN B.[State] ELSE R.[State] + ',' + B.[State] END
FROM Recur R
INNER JOIN Base B
ON R.RowId + 1 = B.RowId
)
SELECT *
FROM Recur
WHERE CountryRowId = 1
OPTION (MAXRECURSION 0)--Dangerous
