Using special character to JOIN - snowflake-cloud-data-platform

I have a table, tableA, that was loaded from an external stage. One column, col1, contains special characters, for example 'Español'. I need to join tableA with tableB:
select * from tableA
join tableB
on tableA.col1 = tableB.col1
I know tableB.col1 contains exactly the same value, 'Español', but the join does not match it. Does anyone know why, and how to get the rows to join?
Thanks.

If the join is failing it is because the values in col1 are not equivalent even though they might look the same when displayed. I wonder if using hex_encode(col1) on each table might surface the difference?
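One way values can look identical yet compare unequal is a difference in Unicode normalization. This is only an assumption about the cause here; the question does not confirm it. The Python sketch below (not Snowflake code; the variable names are made up for illustration) builds 'Español' twice, once with a precomposed ñ and once with n followed by a combining tilde, and shows that a hex dump exposes the difference and that normalizing both sides to NFC makes them equal.

```python
import unicodedata

# Two strings that render the same but are stored differently (illustrative values).
composed = "Espa\u00f1ol"     # ñ as a single code point (NFC form)
decomposed = "Espan\u0303ol"  # n followed by a combining tilde (NFD form)


def hex_utf8(value: str) -> str:
    """Return the UTF-8 bytes of a string as hex, similar in spirit to a hex dump of the column."""
    return value.encode("utf-8").hex()


# An equality comparison, like the join condition, sees different values.
equal_before = composed == decomposed

# The hex dumps make the hidden difference visible.
hex_composed = hex_utf8(composed)
hex_decomposed = hex_utf8(decomposed)

# Normalizing both sides to the same form makes them comparable again.
equal_after = (
    unicodedata.normalize("NFC", composed)
    == unicodedata.normalize("NFC", decomposed)
)

print(equal_before, hex_composed, hex_decomposed, equal_after)
```

If the hex output of the two columns differs in this way, cleaning the data at load time so both tables store the same form is one option. Other causes, such as trailing whitespace or a different file encoding on the stage, would show up in the hex output too.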

Related

How to use INSERT INTO with a LEFT JOIN?

I would appreciate if someone could help out.
There are two tables that I need to combine. So far I have been doing this every time.
SELECT * INTO TABLE_3
FROM TABLE_1 LEFT JOIN TABLE_2
ON TABLE_1.[DATE] = TABLE_2.[DATE1]
However, I would like to skip the part of creating a new table and insert the columns I need directly into the existing table.
I tried doing this,
INSERT INTO [TABLE_1] (USD,EUR,RUR)
SELECT USD,EUR,RUR
FROM TABLE_1 AS T1 LEFT JOIN TABLE_2 AS T2
ON T1.[DATE] = T2.[DATE1]
but got an error saying that my column names are ambiguous.
I use SQL Server 2014.
Instead of using the bare column names, qualify each one with a table alias so it is clear which table the column comes from. The error most likely occurs because both tables contain the columns you are trying to select, so you need to name the exact table:
INSERT INTO [TABLE_1] (USD,EUR,RUR)
SELECT [T1/T2].USD,[T1/T2].EUR,[T1/T2].RUR
FROM TABLE_1 AS T1 LEFT JOIN TABLE_2 AS T2
ON T1.[DATE] = T2.[DATE1]
Here [T1/T2] is a placeholder: replace it with either T1 or T2 for each column, depending on your business logic. That resolves the ambiguity error.
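To see the ambiguity and the fix in action, here is a small sketch using Python's built-in sqlite3 module rather than SQL Server. The sample rows are invented, and the choice of T2 as the source of the three columns is an assumption made purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TABLE_1 ("DATE" TEXT, USD REAL, EUR REAL, RUR REAL);
    CREATE TABLE TABLE_2 (DATE1 TEXT, USD REAL, EUR REAL, RUR REAL);
    INSERT INTO TABLE_1 VALUES ('2015-01-01', 1.0, 0.9, 60.0);
    INSERT INTO TABLE_2 VALUES ('2015-01-01', 1.1, 0.8, 61.0);
""")

# Unqualified column: both tables have USD, so the engine cannot tell which one is meant.
try:
    conn.execute("""
        SELECT USD
        FROM TABLE_1 AS T1 LEFT JOIN TABLE_2 AS T2
        ON T1."DATE" = T2.DATE1
    """)
    ambiguous = False
except sqlite3.OperationalError as exc:
    ambiguous = "ambiguous" in str(exc)

# Qualified columns: taking the values from T2 is an assumption for this example.
conn.execute("""
    INSERT INTO TABLE_1 (USD, EUR, RUR)
    SELECT T2.USD, T2.EUR, T2.RUR
    FROM TABLE_1 AS T1 LEFT JOIN TABLE_2 AS T2
    ON T1."DATE" = T2.DATE1
""")

rows = conn.execute(
    'SELECT "DATE", USD, EUR, RUR FROM TABLE_1 ORDER BY USD'
).fetchall()
print(ambiguous, rows)
```

Note that INSERT appends new rows (here with an empty DATE) instead of filling columns in the rows that already exist. If the goal is to fill in existing rows, an UPDATE may be what is actually wanted.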

Shortcut for adding table to column name SQL-server 2014

Stupidly simple question, but I just don't know what to google!
If I create a query like this:
Select id, data
from table1
Now I want to join with table2. I can immediately see that the id column is no longer unique and I have to change it to
table1.id
Is there any smart way (like a keyboard shortcut) to do this, instead of manually adding table1 to every column? Either before I add the join, to ensure that all columns will be unique, or afterwards, with suggestions based on the different possible tables.
No, there is no helper.
But note that you can alias the table name:
select x.Col1, y.Col2
from ALongTableName x
inner join AReallyReallyLongTableName y on x.Id = y.OtherId
which can also make queries clearer, and is very much necessary when doing self joins.
First of all, you should start using aliases:
SQL aliases are used to give a database table, or a column in a table,
a temporary name.
Basically aliases are created to make column names more readable.
This will narrow down your problem and make your code maintenance easier. If that's not enough, I guess you could start using auto-completion tools, such as these:
SQL Complete
SQL Prompt
ApexSQL Complete
These have your desired functionality, however, they do not always work as expected (at least for me).
You can use table aliases, like this:
SELECT A.ID, A.data
FROM TableA A
INNER JOIN TableB B
ON A.ID = B.ID
You only need the A. or B. prefix when both tables have a column with the same name. If the column name exists in only one table, you can leave the prefix off, like this:
SELECT A.ID, data -- works if TableB has no column named data
FROM TableA A
INNER JOIN TableB B
ON A.ID = B.ID
Or:
Select A.*, B.ID
FROM TableA A
INNER JOIN TableB B
ON A.ID = B.ID

Unnecessary DS_BCAST_INNER after GROUP BY in Redshift

I have a problem and would like to know if there is a solution to this.
My query contains a completely unnecessary table broadcast (DS_BCAST_INNER).
Imagine you have Table1 and Table2 both having the same distkey MediaId.
When I join both tables directly there is no redistribution which is good. But when I try to do something similar to:
WITH t1
AS
(
SELECT MediaId, ... FROM Table1 ...predicates... GROUP BY MediaId, ...
),
t2 AS
(
SELECT MediaId, ... FROM Table2 ...predicates... GROUP BY MediaId, ...
)
Select ... FROM t1 JOIN t2 ON t1.MediaId = t2.MediaId ....
I see DS_BCAST_INNER in the execution plan shown by the EXPLAIN command, even though it is obviously unnecessary.
How can I avoid it?
Run an EXPLAIN on this and look at the underlying data types of your tables (before the group by).
I've seen this recently where Table1 was a char(36) and Table2 has a varchar(36); this caused a cast and a broadcast, since the hashing of a char and a varchar is (probably) different. (The varchar will have a length prefix that is probably being included in the hash... :-( )
The data types on the join must be EXACTLY the same, not nearly. E.g. an INT to a BIGINT will probably have the same issue.
(Haven't checked this, but possibly even nullability?)

left join not showing null values

I need to find the items that exist in table A but not in table B. Now that would be really simple in MySQL doing a join like this
select * from A
left join B on A.key=B.key
where B.key is null
However for some reason this is not working in MSSQL. I have created the query without the where clause to see all the results and I only see matches, not null values. Do you have any idea why this is not working?
I know I can alternatively use "where not exists", but I want to know why it is not working with a join.
I am adding the code for your review
select Absences.CustomerID, b.*
from (
select * from openquery(JUAN_INTERFACE,'select cmp_wwn from Planet_Customers where i_outcome =4')) b
left join Absences on Absences.CustomerID = b.cmp_wwn
where Absences.Type = 3223
Your where clause is filtering out null values:
where Absences.Type = 3223
You are left-joining from the openquery subquery to Absences; and then filtering only rows that have a specific (non-null) value in an Absences column.
Did you mean to join the other way around, from Absences to openquery?
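The effect of that WHERE clause can be reproduced with a small sketch using Python's built-in sqlite3 module instead of SQL Server and openquery. The customers table and its rows are stand-ins invented for illustration. Moving the filter into the ON clause, shown second, is one common way to keep the unmatched rows; it is not something the answer above prescribes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- customers stands in for the openquery result (hypothetical data)
    CREATE TABLE customers (cmp_wwn INTEGER);
    CREATE TABLE Absences (CustomerID INTEGER, Type INTEGER);
    INSERT INTO customers VALUES (1), (2), (3);
    INSERT INTO Absences VALUES (1, 3223), (2, 9999);
""")

# Filter in WHERE: rows where Absences.Type is NULL are discarded,
# so the LEFT JOIN behaves like an INNER JOIN.
where_filtered = conn.execute("""
    SELECT b.cmp_wwn, a.CustomerID
    FROM customers AS b
    LEFT JOIN Absences AS a ON a.CustomerID = b.cmp_wwn
    WHERE a.Type = 3223
    ORDER BY b.cmp_wwn
""").fetchall()

# Filter in ON: the condition only decides what counts as a match,
# so unmatched customers are kept with NULLs.
on_filtered = conn.execute("""
    SELECT b.cmp_wwn, a.CustomerID
    FROM customers AS b
    LEFT JOIN Absences AS a
        ON a.CustomerID = b.cmp_wwn AND a.Type = 3223
    ORDER BY b.cmp_wwn
""").fetchall()

# Anti-join: customers with no absence of type 3223.
missing = conn.execute("""
    SELECT b.cmp_wwn
    FROM customers AS b
    LEFT JOIN Absences AS a
        ON a.CustomerID = b.cmp_wwn AND a.Type = 3223
    WHERE a.CustomerID IS NULL
    ORDER BY b.cmp_wwn
""").fetchall()

print(where_filtered, on_filtered, missing)
```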

How does t-sql update work without a join

I think my head is muddy or something. I'm trying to figure out how a t-sql update works without a join when updating one table from another. I've always used joins in the past but came across a stored proc where someone else created one without a join. This update is being used in SQL 2008R2 and it works.
Update table1
SET col1 = (SELECT TOP 1 colX FROM table2 WHERE colZ = colY),
col2 = (SELECT TOP 1 colE FROM table2 WHERE colZ = colY)
Obviously, colY is a field in table1. To get the same results in a select statement (not update), a join is required. I guess I don't understand how an update works behind the scenes but it must be doing some kind of join?
SQL Server translates those subqueries into joins. You can look at this by getting the query plan. You can write an equivalent query with UPDATE ... FROM ... JOIN syntax and observe the query plan to be essentially the same.
The sample code shown is unusual, hard to understand, redundant and inflexible. I recommend against using this style.
No, it's doing a subquery, well, two in this case. It would be damn painful if you had another 98 columns.
You can do something similar for select
select *,
(SELECT TOP 1 colX FROM table2 WHERE colZ = colY) as col1
From table1
A left join would simply be more efficient.
Unless the DBMS optimises it, your example runs the subquery(ies) for each row in the table.
Got to say whoever wrote it is less than competent.
These subqueries are what is called correlated subqueries. If you were to write the same query as a SELECT rather than an UPDATE it would look like this.
SELECT col1 = (SELECT TOP 1 table2.colX FROM table2 WHERE table2.colZ = table1.colY),
col2 = (SELECT TOP 1 table2.colE FROM table2 WHERE table2.colZ = table1.colY)
FROM table1
The JOIN is in the fact that you are referencing a column from an outside table on the inside of the subquery. Table1 is referenced in the UPDATE command. You can include a FROM clause but it isn't required for a setup like this.
You can use the same syntax in a SELECT with no join, but you need to alias the table if colY also exists in table2
SELECT (SELECT TOP 1 colX FROM table2 WHERE colZ = T.colY)
, (SELECT TOP 1 colE FROM table2 WHERE colZ = T.colY)
FROM table1 AS T
I only ever use this sort of thing when building up an ad hoc query just for my own information. If it's going to be put into any sort of permanent code I'll convert it to a join, as it's easier to read and more maintainable.
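As a rough check of the behaviour described in these answers, the sketch below runs the same style of UPDATE with Python's built-in sqlite3 module. SQLite has no TOP, so LIMIT 1 is used in its place, and the sample rows are invented. It then compares the updated table with what a LEFT JOIN returns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (colY INTEGER, col1 TEXT, col2 TEXT);
    CREATE TABLE table2 (colZ INTEGER, colX TEXT, colE TEXT);
    INSERT INTO table1 (colY) VALUES (1), (2), (3);
    INSERT INTO table2 VALUES (1, 'x1', 'e1'), (2, 'x2', 'e2');
""")

# Correlated subqueries: colY is not a column of table2, so it resolves to the
# row of table1 that is currently being updated. LIMIT 1 replaces TOP 1.
conn.execute("""
    UPDATE table1
    SET col1 = (SELECT colX FROM table2 WHERE colZ = colY LIMIT 1),
        col2 = (SELECT colE FROM table2 WHERE colZ = colY LIMIT 1)
""")
updated = conn.execute(
    "SELECT colY, col1, col2 FROM table1 ORDER BY colY"
).fetchall()

# The same values expressed as a LEFT JOIN (each colY has at most one match here).
joined = conn.execute("""
    SELECT t1.colY, t2.colX, t2.colE
    FROM table1 AS t1
    LEFT JOIN table2 AS t2 ON t2.colZ = t1.colY
    ORDER BY t1.colY
""").fetchall()

print(updated, joined)
```

Rows of table1 with no match in table2 end up with NULL in both columns, which is the same result a LEFT JOIN gives. With several matching rows per key the two forms can differ, because the subquery keeps one row while the join returns all of them.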
