snowflake concat two fields with space - snowflake-cloud-data-platform

I have two fields: First_name (e.g. John) and Surname (e.g. Doe).
How do I concatenate these for a join to another table whose Name column holds the full name (e.g. John Doe)?
Trying concat(First_name, Surname) gives me JohnDoe.
Thanks!

There looks to be a concat with separator function https://docs.snowflake.com/en/sql-reference/functions/concat_ws.html
Try concat_ws(' ', First_name, Surname);
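Since the question is really about a join, here is a minimal sketch of how the concatenated value could be used in the join condition. The table names people and accounts are hypothetical; only the column names come from the question.

```sql
-- Hypothetical tables: people(first_name, surname) and accounts(name).
select
    p.first_name
    ,p.surname
    ,a.name
from people as p
join accounts as a
    on a.name = concat_ws(' ', p.first_name, p.surname);
```

Rows where first_name or surname is null produce a null key and therefore match nothing in an inner join.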

For the cases where you will only have two fields, there are three equally trivial methods, CONCAT, ||, & CONCAT_WS:
select
first_name
,surname
,concat(first_name, ' ', surname) as way1
,first_name || ' ' || surname as way2
,concat_ws(' ',first_name, surname) as way3
from values
('john', 'doe'),
('the rock', null)
as t(first_name, surname);
| FIRST_NAME | SURNAME | WAY1 | WAY2 | WAY3 |
| john | doe | john doe | john doe | john doe |
| the rock | null | null | null | null |
But if any of those tokens is null, the whole result is null.
If you have more than two tokens and none are null, concat_ws starts to win.
And if you want something that handles nulls without leaving unwanted whitespace, you need array_construct_compact/array_to_string:
select
one
,two
,three
,concat(one, ' ', two, ' ', three) as way1
,one || ' ' || two || ' ' || three as way2
,concat_ws(' ',one, two, three) as way3
,array_to_string(array_construct_compact(one, two, three), ' ') as way4
from values
('john', 'doe', ''),
('the rock', null, null),
('three','happy', 'words')
as t(one, two, three);
| ONE | TWO | THREE | WAY1 | WAY2 | WAY3 | WAY4 |
| john | doe | '' | john doe | john doe | john doe | john doe |
| the rock | null | null | null | null | null | the rock |
| three | happy | words | three happy words | three happy words | three happy words | three happy words |
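One caveat with the array approach: array_construct_compact drops nulls but keeps empty strings, so the first row (whose third token is '') still ends with a trailing separator. A possible variation, sketched here with a hypothetical way5 column, is to turn empty strings into nulls with nullif before compacting:

```sql
select
    one
    ,two
    ,three
    -- nullif(x, '') returns null for empty strings, which the compact constructor then drops
    ,array_to_string(
        array_construct_compact(nullif(one, ''), nullif(two, ''), nullif(three, '')),
        ' ') as way5
from values
    ('john', 'doe', ''),
    ('the rock', null, null),
    ('three', 'happy', 'words')
    as t(one, two, three);
```

With this variation the first row should come out as 'john doe' with no trailing space, while the other two rows are unchanged.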

Related

Concat in select formula - snowflake

When I use concat in a select, the output also contains the concatenated value in another column.
Example:
SELECT
firstname,
surname,
concat(firstname,' ',surname) AS fullname
FROM
employee
Source data:
| firstname | surname |
| John | Kenedy |
Output data:
| firstname | surname | fullname |
| Kenedy John | Kenedy | Kenedy Kenedy John |
Am I using concat the wrong way?
Hello, you have bad syntax; this should work:
SELECT CONCAT(firstname, ' ', surname) as fullname FROM employee;
Result:
+-----------------+
| fullname |
+-----------------+
| John Kenedy |
| Abraham Lincoln |
+-----------------+
Better still, don't use the concat function; use the || operator instead. If you use concat() and need to concatenate a bunch of things, nesting all the concats within each other gets very ugly very quickly.
Which do you prefer?
select concat('I ', concat('think ', concat('data ', concat('is ', 'fun '))))
-or-
select 'I ' || 'think ' || 'data ' || 'is ' || 'fun '
Your source data firstname column is not the same as your output data firstname column. If you were to run your concat function on the source data as you've presented it, then I believe you would get the results you expect.
Edit 1: Removing duplicate words from a record with SQL
Use a SPLIT_TO_TABLE table function to split each part of the concatenation to an individual row
Use QUALIFY clause to filter out duplicate words for each flattened record
Grouping by the firstname and surname, use a LISTAGG function to concatenate together each unique word using an ORDER BY clause to preserve the order of the words
CREATE OR REPLACE TEMPORARY TABLE TMP_EMPLOYEE
AS
SELECT $1 AS FIRSTNAME
,$2 AS SURNAME
FROM VALUES
('John','Kenedy')
,('Kenedy John','Kenedy')
;
WITH A AS (
SELECT E.FIRSTNAME
,E.SURNAME
,STT.SEQ
,STT.INDEX
,STT.VALUE
FROM TMP_EMPLOYEE E
,LATERAL SPLIT_TO_TABLE(FIRSTNAME || ' ' || SURNAME,' ') STT
QUALIFY ROW_NUMBER() OVER(PARTITION BY STT.SEQ,STT.VALUE ORDER BY STT.INDEX) = 1
)
SELECT A.FIRSTNAME
,A.SURNAME
,LISTAGG(A.VALUE,' ') WITHIN GROUP(ORDER BY A.INDEX) AS FULLNAME
FROM A
GROUP BY A.FIRSTNAME,A.SURNAME
;
Notes
This does not compare any two or more records to each other to find duplicates

Cannot concatenate string columns

I'd like to concatenate three columns - street, street number and city to one column "adress". The strange thing is that I cannot do it for some reason.
This is what I have tried so far:
SELECT street,
street_num,
city,
isnull(street,'') + '' + isnull(street_num,'') + '' + isnull(city,'') AS tst1, --doesnt work
concat(isnull(street,''),' ',isnull(street_num,''), ' ', isnull(city,'')) AS tst2, --doesnt work
(street_num + ' ' + street) AS tst3, --does work
(street_num + ' ' + city) AS tst4, --does work
(city + ' ' + street) AS tst5 --doesnt work
FROM [DB].[dbo].[adresses]
Note that neither + nor concat works: in these cases the result shows only the first column, street. However, if I start with the street number and add the street or the city, it does work. But if I then try to add a third column, it is not shown.
If it helps, the table was pulled from Oracle by OPENQUERY and the table structure is as follows:
street VARCHAR(100), null
street_num VARCHAR(50), null
city VARCHAR(100), null
I am on MSSQL 2014.
EDIT
As asked in the comments: I can't show the data, as I am dealing with the addresses of our customers. Below are two dummy records plus the expected result (adress) as an example:
street | street_num | city | adress
--------------------------------------------------------------------
avenida pino alto | 45 | avila | avenida pino alto 45 avila
rue de abaixo | 86 | madrid | rue de abaixo 86 madrid
Furthermore, if I copy the records and do something like this, it works, of course.
SELECT 'avenida pino alto' + ' ' + '45' + ' ' + 'avila'
Based on comments, it seems that your street column contains some char/data that causes problems.
I have no idea what it could be, but you can try to find out like this:
select top 10
street,
len(street) as streetCharLen,
cast(street as varbinary(500)) as streetBytes
from [DB].[dbo].[adresses]
Then compare what the different columns tell you.
Here's a quick sample:
declare @t table (
id int,
thestring varchar(50)
)
insert into @t values (1, 'test')
select thestring,
len(thestring) as slen,
cast(thestring as varbinary(100)) as sbytes
from @t
If in this sample, the slen is not 4, or the sbytes contains something that does not map back to one of the characters that I see when selecting, then something is wrong with the string.
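To narrow the search to suspicious rows only, one possible sketch is to look for the first character outside the printable ASCII range. It assumes the table and column names from the question; the column aliases are made up, and the binary collation is there so that the character range is compared by code point.

```sql
-- Sketch only: flags rows whose street contains a character outside ' ' .. '~'.
-- Legitimate accented letters are flagged too, so inspect the hits by hand.
SELECT TOP 10
    street,
    LEN(street)        AS charLen,   -- LEN ignores trailing spaces
    DATALENGTH(street) AS byteLen,   -- DATALENGTH counts every byte
    PATINDEX('%[^ -~]%', street COLLATE Latin1_General_BIN) AS firstOddCharPos
FROM [DB].[dbo].[adresses]
WHERE PATINDEX('%[^ -~]%', street COLLATE Latin1_General_BIN) > 0;
```

A difference between charLen and byteLen, or a non-zero firstOddCharPos near the end of the string, would point at trailing or hidden characters as the cause.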
Use convert(varchar,[exp]):
SELECT street,
street_num,
city,
isnull(convert(varchar,street),'') + '' + isnull(convert(varchar,street_num),'') + '' + isnull(convert(varchar,city),'') AS tst1
FROM [DB].[dbo].[adresses]
Try the following:
select isnull(convert(varchar(250), street), '') + isnull(convert(varchar(250), street_num), '') +
isnull(convert(varchar(250), city), '') as 'Adress'
from .......(your query)

Make unique column in SQL

I have a table which has duplicate data.
This is my current table:
Id Name
1 shahin Zen
2 shahin Zen & Aaron Henley
3 Fred Sayz feat. Antonia Lucas
4 Fred Sayz feat. Lawrence Alexander
5 Fred Sayz feat. Sibel
Note: I cannot use distinct because the names do not fully match.
I want to make a table from this table, like this:
ID Name
1 shahin
2 Fred
Has anyone solved this kind of problem?
Thanks in advance.
If you just want to get the distinct first words of the rows:
select distinct substring(Name, 0, charindex(' ', Name, 0))
from myTable
You can also restrict this to rows that contain a space character by adding a where clause:
where charindex(' ', Name, 0) > 0
If you just need the first names, try this:
SELECT
LEFT(name, CHARINDEX(' ', name))
FROM Table1
GROUP BY LEFT(name, CHARINDEX(' ', name))
You need to account for those records that don't have a space...
Select Distinct Left(name,CharIndex(' ',name+' '))
From myTable
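The answers above return only the names, while the question also asks for a new ID column. A minimal T-SQL sketch of one way to add it, assuming the table is called myTable as above and that numbering by the smallest original Id per first word is acceptable:

```sql
-- Group by the first word and number the groups in order of first appearance.
SELECT
    ROW_NUMBER() OVER (ORDER BY MIN(Id)) AS ID,
    LEFT(Name, CHARINDEX(' ', Name + ' ') - 1) AS Name
FROM myTable
GROUP BY LEFT(Name, CHARINDEX(' ', Name + ' ') - 1);
```

Appending ' ' inside CHARINDEX keeps single-word names from producing a zero position, and the - 1 drops the trailing space from the result.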

Separating firstname surname

If the original field looks like paul#yates then this syntax picks out the surname correctly
substring(surname,CHARINDEX('#',surname+'#')+1,LEN(name3))
However, if the field is paul#b#yates then the surname looks like #b#yates. I want the middle letter to be dropped so that only the surname is picked out.
Any ideas?
You can;
;with T(name) as (
select 'paul#yates' union
select 'paul#b#yates'
)
select
right(name, charindex('#', reverse(name) + '#') - 1)
from T
>>
yates
yates
You could reverse the array, split it until you find the first "#", take that part and reverse it again.
If this is Java, there should be an array.reverse function; otherwise you may need to write it on your own.
You could also cut the string into pieces until there are no more "#" signs left and then take the last part (the substring search should return "-1" or something similar), but I like my first idea better.
Here's an example for you
declare @t table (name varchar(max));
insert @t select
'john' union all select
'john#t#bill' union all select
'joe#public';
select firstname=left(name,-1+charindex('#',name+'#')),
surname=case when name like '%#%' then
stuff(name,1,len(name)+1-charindex('#',reverse(name)+'#'),'')
end
from @t;
-- results
FIRSTNAME SURNAME
john (null)
john bill
joe public

SQL Remove almost duplicate rows

I have a table that unfortunately contains bad data, and I'm trying to filter some of it out. I am sure that the LName, FName combination is unique, since the data set is small enough to verify.
LName, FName, Email
----- ----- -----
Smith Bob bsmith@example.com
Smith Bob NULL
Doe Jane NULL
White Don dwhite@example.com
I would like the query results to bring back the "duplicate" record that does not have a NULL email, yet still bring back a NULL Email when there is no duplicate.
E.g.
Smith Bob bsmith@example.com
Doe Jane NULL
White Don dwhite@example.com
I think the solution is similar to Sql, remove duplicate rows by value, but I don't really understand if the asker's requirements are the same as mine.
Any suggestions?
Thanks
You can use ROW_NUMBER() analytic function:
SELECT *
FROM (
SELECT a.*, ROW_NUMBER() OVER(PARTITION BY LName, FName ORDER BY Email DESC) rnk
FROM <YOUR_TABLE> a
) a
WHERE RNK = 1
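Note that ordering by Email DESC relies on where the database sorts NULLs, and that default differs between products. A sketch of a variant that makes the intent explicit, assuming the table is called YourTable:

```sql
SELECT *
FROM (
    SELECT a.*,
           ROW_NUMBER() OVER (
               PARTITION BY LName, FName
               -- non-null emails sort first (0), null emails last (1)
               ORDER BY CASE WHEN Email IS NULL THEN 1 ELSE 0 END, Email
           ) AS rnk
    FROM YourTable a
) ranked
WHERE rnk = 1;
```

The CASE expression puts rows with an email ahead of rows without one regardless of the engine's default NULL ordering.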
This drops the null rows if there are any non null values.
SELECT lname
, fname
, MIN(email)
FROM YourTable
GROUP BY
lname
, fname
Test script
DECLARE @Test TABLE (
LName VARCHAR(32)
, FName VARCHAR(32)
, Email VARCHAR(32)
)
INSERT INTO @Test
SELECT 'Smith', 'Bob', 'bsmith@example.com'
UNION ALL SELECT 'Smith', 'Bob', NULL
UNION ALL SELECT 'Doe', 'Jane', NULL
UNION ALL SELECT 'White', 'Don', 'dwhite@example.com'
SELECT lname
, fname
, MIN(Email)
FROM @Test
GROUP BY
lname
, fname
Here is a relatively simple query that uses standard SQL and does just this:
SELECT * FROM Person P
WHERE Email IS NOT NULL OR -- Take all people with non-null e-mails
Email IS NULL AND -- and all people with null e-mails, as long as
NOT EXISTS -- there is no duplicate record of the same person
(SELECT * -- with a non-null e-mail
FROM Person P2
WHERE P2.LName=P.LName AND P2.FName=P.FName AND P2.Email IS NOT NULL)
Since there are plenty of SQL solutions posted already, you may want to create a data fix to remove the bad data, then add the necessary constraints to prevent bad data from ever being inserted. Bad data in a database is a side effect of poor design.
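As a rough T-SQL illustration of such a data fix, assuming the table is called Person as in the query above (the constraint name UQ_Person_Name is made up for this sketch):

```sql
-- Step 1: delete null-email rows that have a non-null-email twin.
DELETE P
FROM Person AS P
WHERE P.Email IS NULL
  AND EXISTS (SELECT 1
              FROM Person AS P2
              WHERE P2.LName = P.LName
                AND P2.FName = P.FName
                AND P2.Email IS NOT NULL);

-- Step 2: prevent new duplicates. This fails if any duplicate names remain,
-- for example two rows for the same person that both have an email.
ALTER TABLE Person
    ADD CONSTRAINT UQ_Person_Name UNIQUE (LName, FName);
```

Run the delete inside a transaction and check the affected row count before committing, since this permanently removes data.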
