Parse Thai Name into First Last - sql-server

I need to parse a list of FullNames into First and Last Name. If a middle name is included, it should be included in the first name field.
John Smith would be:
FirstName = John
LastName = Smith
John J. Smith would be:
FirstName = John J.
LastName = Smith
The issue is that the names might be in either the Thai or the English character set. I need to properly parse either set. I have tried just about everything...
DECLARE @FullName NVARCHAR(MAX) = N'กล้วยไม้ สวามิวัศดุ์'
--DECLARE @FullName NVARCHAR(MAX) = N'Mark C. Wilson'
SELECT
LEN(@FullName) AS StringLength,
LEN(@FullName) - LEN(REPLACE(@FullName,N' ', N'')),
LEN(REPLACE(@FullName,N' ', N'')),
@FullName AS FullName,
REVERSE(@FullName) AS ReverseName, -- This is obviously not the reverse of the string
CHARINDEX(N' ', REVERSE(@FullName)) AS LastSpaceLocation,
CHARINDEX(N' ', @FullName) AS FirstSpaceLocation,
LEN(@FullName) AS LenString,
STUFF(@FullName, 1, CHARINDEX(N' ', @FullName), N'') as FirstName,
RIGHT(@FullName, LEN(@FullName) - CHARINDEX(N' ', @FullName) + 1) as LastName,
LEFT(@FullName, LEN(@FullName) - CHARINDEX(N' ', REVERSE(@FullName))) AS FirstName,
STUFF(RIGHT(@FullName, CHARINDEX(N' ', REVERSE(@FullName))),1,1,N'') AS LastName,
LEN(@FullName),
REVERSE(@FullName),
REVERSE(' '),
LEN(@FullName) - CHARINDEX(reverse(' '), REVERSE(@FullName)) - LEN(' ') + 1
The REVERSE simply does not work when the Thai character set is used.

I can't read Thai (I'm not that bright), but perhaps this may help.
Here we use a CROSS APPLY to "fix" the string (periods become pipes, spaces become periods), and then it is a small matter of PARSENAME() and CONCAT().
I should add, parsing names is a slippery slope. One needs to consider:
Multi-word last names, e.g. De la Cruz
Suffixes, e.g. Richard R Cappelletti MD
Example
Declare @YourTable table (FullName nvarchar(100))
Insert Into @YourTable values
('John Smith')
,('John J. Smith')
,(N'กล้วยไม้ สวามิวัศดุ์')
Select A.*
,LastName = replace(parsename(S,1),'|','.')
,FirstName = replace(concat(parsename(S,4),' '+parsename(S,3),' '+parsename(S,2)),'|','.')
From @YourTable A
Cross Apply ( values (replace(replace(FullName,'.','|'),' ','.'))) B(S)
Returns
FullName               LastName      FirstName
John Smith             Smith         John
John J. Smith          Smith         John J.
กล้วยไม้ สวามิวัศดุ์      สวามิวัศดุ์      กล้วยไม้
EDIT 2008 Version
Select A.*
,LastName = replace(parsename(S,1),'|','.')
,FirstName = replace( IsNull(parsename(S,4),'') + IsNull(' '+parsename(S,3),'') + IsNull(' '+parsename(S,2),''),'|','.')
From @YourTable A
Cross Apply ( values (replace(replace(FullName,'.','|'),' ','.'))) B(S)
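One caveat with the PARSENAME() approach, sketched here under assumptions: PARSENAME() was built for object names of up to four parts, so a name with more than four space-separated words comes back as NULL. The table variable @Names and the WordCount column below are made up for illustration; the idea is only to flag rows that would need a different method.

```sql
-- Hypothetical sample data; @Names is not part of the original question.
DECLARE @Names TABLE (FullName nvarchar(100));
INSERT INTO @Names VALUES
 (N'John J. Smith')
,(N'Maria Anna Teresa De Cruz');   -- five words, one more than PARSENAME() accepts

SELECT A.FullName
      -- number of space-separated words = number of spaces + 1
      ,WordCount = LEN(A.FullName) - LEN(REPLACE(A.FullName, N' ', N'')) + 1
      -- NULL when the string has more than four parts
      ,LastName  = REPLACE(PARSENAME(B.S, 1), N'|', N'.')
FROM @Names A
CROSS APPLY (VALUES (REPLACE(REPLACE(A.FullName, N'.', N'|'), N' ', N'.'))) B(S);
```

Rows where WordCount is greater than 4 could then be routed to a last-space split or reviewed by hand.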

I'm Thai, and one thing I know is that Thai people don't use middle names.

Related

SQL Server : how to split fullname into first and last name and show only first 100 characters for last name

Below is the select statement. I'm trying to split the [Display name] column into Firstname and Lastname. But when I try to insert the data into the destination table, I get an error
Data Truncation error
The problem is with the Lastname column in the destination table, it only stores 50 characters. I need to only insert first 20 characters of the last name.
SELECT
SUBSTRING(DisplayName, 1, CHARINDEX(' ', DisplayName) - 1) AS [FirstName],
SUBSTRING(DisplayName, CHARINDEX(' ', DisplayName) + 1, LEN(DisplayName)) AS [LastName]
FROM
[Destination_Table]
WHERE
VoidFlag = 0
AND ControlPlanCode = @ControlPlan
This is the result from the above select :
FirstName LastName
-------------------------------------------
Lynn Trepanier
Becky Simonds
Mary Bell
Lynn Trepanier
Enrollment Services Enrollment Services
Wendy Ferenc
Patrick McGrath
Kevin Weishaar
Benefit Configuration Service Benefit Configuration Service
Try this
SELECT
SUBSTRING(DisplayName, 1, CHARINDEX(' ', DisplayName) - 1) AS [FirstName],
case when LEN(DisplayName) -CHARINDEX(' ', DisplayName) <= 50 then
SUBSTRING(DisplayName, CHARINDEX(' ', DisplayName) + 1, LEN(DisplayName) -CHARINDEX(' ', DisplayName) )
else
SUBSTRING(DisplayName, CHARINDEX(' ', DisplayName) + 1,50 )
end
AS [LastName]
FROM
[Destination_Table]
First, take the text after the first space; its length is LEN(DisplayName) - CHARINDEX(' ', DisplayName).
Second, test that length. If it is greater than 50, take only the first 50 characters.
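The CASE can probably be dropped: SUBSTRING() simply stops at the end of the string when the requested length runs past it, so asking for 50 characters caps the value without a separate length test. A minimal sketch, assuming a 50-character LastName column; the table variable @Source and its rows are made up for illustration. Appending a space before calling CHARINDEX() also keeps single-word names from raising an invalid length error.

```sql
-- Hypothetical sample data standing in for the real source table.
DECLARE @Source TABLE (DisplayName varchar(200));
INSERT INTO @Source VALUES
 ('Lynn Trepanier')
,('Benefit Configuration Service')
,('Madonna');                       -- no space at all

SELECT
    -- DisplayName + ' ' always contains a space, so the length is never negative
    LEFT(DisplayName, CHARINDEX(' ', DisplayName + ' ') - 1) AS [FirstName],
    -- at most 50 characters after the first space; empty string when there is no space
    SUBSTRING(DisplayName, CHARINDEX(' ', DisplayName + ' ') + 1, 50) AS [LastName]
FROM @Source;
```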

How to remove the whitespace while concatenating while some have Null values?

I am trying to concatenate the names of the players, where some of the players have no middle name. Concatenating as below, I get an extra white space for players without a middle name, while the logic holds for players with a middle name. How do I remove the unwanted whitespace only where the column is NULL?
I want only the initial of the middle name in the concatenated expression.
SELECT m_playerid, first_name + ' ' + SUBSTRING (coalesce (middle_name, ' '), 1,1) + ' ' + last_name as [Full name]
, game as Game, inns as Innings, [scores] as Scores FROM odsports
Shouldn't I be introducing a condition to get rid of the
whitespace for NULL? I am stuck!
You can use the fact that concatenating a NULL to anything with the + operator produces a NULL whereas the CONCAT function converts NULL to empty string.
So CONCAT(first_name, ' ', LEFT(middle_name,1) + ' ', last_name) will handle null middle names as you want - as in the following example
WITH T(first_name, middle_name, last_name) AS
(
SELECT 'Franklin', 'Delano', 'Roosevelt' union all
SELECT 'Barack', NULL, 'Obama'
)
SELECT CONCAT(first_name, ' ', LEFT(middle_name,1) + ' ', last_name)
FROM T
Returns
+----------------------+
| (No column name) |
+----------------------+
| Franklin D Roosevelt |
| Barack Obama |
+----------------------+
Add a replace for double spaces, as well as use isnull function. Try this
SELECT
m_playerid,
REPLACE(
LTRIM(RTRIM(ISNULL(first_name ,'')))
+CASE WHEN middle_name IS NULL
THEN ' '
ELSE ' '+LEFT(ISNULL(middle_name,' '),1)+' ' END
+
LTRIM(RTRIM(ISNULL(last_name,'')))
,'  ',' ') as [Full name], -- collapse a double space into a single space
game as Game,
inns as Innings,
[scores] as Scores
FROM odsports
Try this:
SELECT m_playerid,
COALESCE(first_name + ' ' + middle_name + ' ' + last_name,
first_name + ' ' + last_name,
first_name,
last_name) as [Full name],
game as Game,
inns as Innings,
[scores] as Scores
FROM odsports
SELECT
m_playerid,
LTRIM(CONCAT(first_name,Space(1),LTRIM(RTRIM(middle_name+space(1)+last_name))))
as [Full name],
game as Game,
inns as Innings,
[scores] as Scores
FROM odsports

SQL: How can I Parse firstname, lastname and title from fullname?

I have a column named Employee name in table 1
Example: Mr.FirstName LastName
and so on with various titles, but there are also employee names without a title in the same column. I want to split the single column and insert into a new table (table 2) with three different columns: FirstName, LastName and Title. While doing the insert into the new table I am not able to split the employee name column as described. Any help will be really appreciated. I started with LINQ, so I am not familiar with many SQL functions.
Update : Sample data
Here's one example.
DECLARE @name varchar(100) = 'Mr.FirstName LastName'
SELECT
LEFT(@name, CHARINDEX('.', @name)) AS Title,
SUBSTRING(@name, CHARINDEX('.', @name)+1, CHARINDEX(' ', @name)-CHARINDEX('.', @name)-1) AS FirstName, -- the -1 leaves out the space itself
SUBSTRING(@name, CHARINDEX(' ', @name)+1, 1000) AS LastName
It takes...
the left part till the . as Title.
everything after the . until the first space as FirstName
everything after the first space as LastName
Note: There's no check for errors, if the name does not fit into this pattern.
For the simplest case you specified in the question the following query should work
SELECT *,
SUBSTRING(Employee_name, 0, CHARINDEX('.', Employee_name)) AS Title,
SUBSTRING(Employee_name,
CHARINDEX('.', Employee_name)+1,
CHARINDEX(' ', Employee_name)-CHARINDEX('.', Employee_name)-1) AS FirstName, -- length runs from the period to the first space
SUBSTRING(Employee_name,
CHARINDEX(' ', Employee_name)+1,
LEN(Employee_name)) AS LastName
FROM Employee;
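The question also mentions names that have no title at all. A sketch of one way to cover both cases, assuming a title is whatever ends at a period that appears before the first space; the table variable @Employees, the EmployeeName column and the P alias are made up for illustration and are not from the original posts.

```sql
-- Hypothetical sample data: one name with a title, one without.
DECLARE @Employees TABLE (EmployeeName varchar(100));
INSERT INTO @Employees VALUES
 ('Mr.FirstName LastName')
,('FirstName LastName');

SELECT E.EmployeeName,
       -- a title exists only when a period occurs before the first space
       Title     = CASE WHEN P.DotPos > 0 AND P.DotPos < P.SpacePos
                        THEN LEFT(E.EmployeeName, P.DotPos)
                        ELSE '' END,
       -- first name starts after the title when there is one, else at position 1
       FirstName = CASE WHEN P.DotPos > 0 AND P.DotPos < P.SpacePos
                        THEN SUBSTRING(E.EmployeeName, P.DotPos + 1, P.SpacePos - P.DotPos - 1)
                        ELSE LEFT(E.EmployeeName, P.SpacePos - 1) END,
       -- everything after the first space
       LastName  = SUBSTRING(E.EmployeeName, P.SpacePos + 1, 100)
FROM @Employees E
CROSS APPLY (SELECT DotPos   = CHARINDEX('.', E.EmployeeName),
                    SpacePos = CHARINDEX(' ', E.EmployeeName + ' ')) AS P;
```

Computing the two positions once in the CROSS APPLY keeps the CASE expressions readable. Like the answers above, this does not try to handle middle initials or multi-word last names.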

SQL: Concatenate column values in a single row into a string separated by comma

Let's say I have a table like this in SQL Server:
Id  City       Province          Country
1   Vancouver  British Columbia  Canada
2   New York   null              null
3   null       Adama             null
4   null       null              France
5   Winnepeg   Manitoba          null
6   null       Quebec            Canada
7   Seattle    null              USA
How can I get a query result in which the location is a concatenation of the City, Province, and Country separated by ", ", with nulls omitted? I'd like to ensure that there aren't any trailing commas, leading commas, or empty strings. For example:
Id Location
1 Vancouver, British Columbia, Canada
2 New York
3 Adama
4 France
5 Winnepeg, Manitoba
6 Quebec, Canada
7 Seattle, USA
I think this takes care of all of the issues I spotted in other answers. No need to test the length of the output or check if the leading character is a comma, no worry about concatenating non-string types, no significant increase in complexity when other columns (e.g. Postal Code) are inevitably added...
DECLARE @x TABLE(Id INT, City VARCHAR(32), Province VARCHAR(32), Country VARCHAR(32));
INSERT @x(Id, City, Province, Country) VALUES
(1,'Vancouver','British Columbia','Canada'),
(2,'New York' , null , null ),
(3, null ,'Adama' , null ),
(4, null , null ,'France'),
(5,'Winnepeg' ,'Manitoba' , null ),
(6, null ,'Quebec' ,'Canada'),
(7,'Seattle' , null ,'USA' );
SELECT Id, Location = STUFF(
COALESCE(', ' + RTRIM(City), '')
+ COALESCE(', ' + RTRIM(Province), '')
+ COALESCE(', ' + RTRIM(Country), '')
, 1, 2, '')
FROM @x;
SQL Server 2012 added a new T-SQL function called CONCAT, but it is not useful here, since you still have to optionally include commas between discovered values, and there is no facility to do that - it just munges values together with no option for a separator. This avoids having to worry about non-string types, but doesn't allow you to handle nulls vs. non-nulls very elegantly.
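On SQL Server 2017 and later, CONCAT_WS() fills the gap described above: it takes a separator as its first argument and skips NULL arguments. A minimal sketch on a few of the same rows; the table variable @Locations is made up here, and the NULLIF() calls are added on the assumption that empty strings should be treated like NULLs, which the question does not state explicitly.

```sql
-- Hypothetical copy of part of the sample data, plus one row with an empty string.
DECLARE @Locations TABLE (Id int, City varchar(32), Province varchar(32), Country varchar(32));
INSERT INTO @Locations (Id, City, Province, Country) VALUES
 (1, 'Vancouver', 'British Columbia', 'Canada')
,(3, NULL,        'Adama',            NULL)
,(7, 'Seattle',   NULL,               'USA')
,(8, 'Paris',     '',                 'France');  -- empty string rather than NULL

SELECT Id,
       -- NULL arguments are skipped together with their separator;
       -- NULLIF turns empty strings into NULL so they are skipped as well
       Location = CONCAT_WS(', ', NULLIF(City, ''), NULLIF(Province, ''), NULLIF(Country, ''))
FROM @Locations;
```

Adding another column such as a postal code is then just one more argument.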
select Id ,
Coalesce( City + ',' +Province + ',' + Country,
City+ ',' + Province,
Province + ',' + Country,
City+ ',' + Country,
City,
Province,
Country
) as location
from table
This is a hard problem, because the commas have to go in-between:
select id, coalesce(city+', ', '')+coalesce(province+', ', '')+coalesce(country, '')
from t
seems like it should work, but we can get an extraneous comma at the end, such as when country is NULL. So, it needs to be a bit more complicated:
select id,
(case when right(val, 2) = ', ' then left(val, len(val) - 1)
else val
end) as val
from (select id, coalesce(city+', ', '')+coalesce(province+', ', '')+coalesce(country, '') as val
from t
) t
Without a lot of intermediate logic, I think the simplest way is to add a comma to each element, and then remove any extraneous comma at the end.
Use the '+' operator.
Understand that null values don't work with the '+' operator (so for example: 'Winnepeg' + null = null), so be sure to use the ISNULL() or COALESCE() functions to replace nulls with an empty string, e.g.: ISNULL('Winnepeg','') + ISNULL(null,'').
Also, if it is even remotely possible that one of your columns could be interpreted as a number, then be sure to use the CAST() function as well, in order to avoid conversion errors, e.g.: CAST('Winnepeg' as varchar(100)).
Most of the examples so far neglect one or more pieces of this. Also -- some of the examples use subqueries or do a length check, which you really ought not to do -- just not necessary -- though your optimizer might save you anyway if you do.
Good Luck
ugly but it will work for MS SQL:
select
id,
case
when right(rtrim(coalesce(city + ', ','') + coalesce(province + ', ','') + coalesce(country,'')),1)=',' then left(rtrim(coalesce(city + ', ','') + coalesce(province + ', ','') + coalesce(country,'')),LEN(rtrim(coalesce(city + ', ','') + coalesce(province + ', ','') + coalesce(country,'')))-1)
else rtrim(coalesce(city + ', ','') + coalesce(province + ', ','') + coalesce(country,''))
end
from
table
I know it's an old question, but should someone stumble upon this today, SQL Server 2017 and later have the STRING_AGG function, with the WITHIN GROUP option:
with level1 as
(select id,city as varcharColumn,1 as columnRanking from mytable
union
select id,province,2 from mytable
union
select id,country,3 from mytable)
select STRING_AGG(varcharColumn,', ')
within group(order by columnRanking)
from level1
group by id
Should empty strings exist alongside NULLs, they should be excluded with a WHERE clause in level1.
Here is an option:
SELECT (CASE WHEN City IS NULL THEN '' ELSE City + ', ' END) +
(CASE WHEN Province IS NULL THEN '' ELSE Province + ', ' END) +
(CASE WHEN Country IS NULL THEN '' ELSE Country END) AS LOCATION
FROM MYTABLE

How to switch data in a column and update the column with result information in SQL Server

I have a table USERS and it has a column FULLNAME that contains full users names in the format FirstName LastName and I need to switch the data, and update it in the same column, using this format LastName, FirstName.
Ex. James Brown needs to be switched to Brown, James and updated in the same column (FULLNAME)
Is there any way to do it?
Thanks.
The best solution would be to split the elements into two columns (LastName, FirstName). However, if you want to try and squeeze it into 1 column, and you're assuming that every name is split by a single space:
DECLARE @Users TABLE ( Name VARCHAR(100) )
INSERT INTO @Users
( Name
)
SELECT 'James Brown'
UNION ALL
SELECT 'Mary Ann Watson'
SELECT RIGHT(Name, CHARINDEX(' ', REVERSE(Name)) - 1) + ', ' + LEFT(Name,
LEN(Name)
- CHARINDEX(' ',
REVERSE(Name)))
FROM @Users
This also assumes that if they have more than two names separated by a space, then the last word is the last name, and everything else is first name. Works for "Mary Ann Watson", but not "George Tucker Jones", if "Tucker Jones" is a last name.
Assuming they only have 1 space in the name:
UPDATE USERS
SET FULLNAME = RIGHT(FULLNAME,len(FULLNAME) - CHARINDEX(' ',FULLNAME)) +', '+ LEFT(FULLNAME,charindex(' ',FULLNAME)-1)
WHERE LEN(FULLNAME) - LEN(REPLACE(FULLNAME, ' ', '')) = 1
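Either statement is worth guarding before it touches real data. A sketch, assuming names are already trimmed and that a comma marks a row that has already been converted; the table variable @Demo stands in for USERS and its rows are made up for illustration. The last-space split follows the first answer, and the extra WHERE conditions skip single-word names and keep a second run from swapping rows back.

```sql
-- Hypothetical stand-in for the USERS table.
DECLARE @Demo TABLE (FULLNAME varchar(100));
INSERT INTO @Demo VALUES
 ('James Brown')
,('Mary Ann Watson')
,('Cher')            -- no space: left untouched
,('Brown, James');   -- already converted: left untouched

UPDATE @Demo
SET FULLNAME = RIGHT(FULLNAME, CHARINDEX(' ', REVERSE(FULLNAME)) - 1)   -- last word
             + ', '
             + LEFT(FULLNAME, LEN(FULLNAME) - CHARINDEX(' ', REVERSE(FULLNAME)))  -- everything before it
WHERE FULLNAME LIKE '% %'        -- only names that contain a space
  AND FULLNAME NOT LIKE '%,%';   -- skip rows that already look converted

SELECT FULLNAME FROM @Demo;
```

Running the same expression in a plain SELECT first, next to the original column, is a cheap way to review the result before committing the UPDATE.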
