SQL Server - coalesce data on duplicate keys when MERGE-ing - sql-server

I've stumbled across an annoying situation where my source query results have duplicate keys with differing data. Unfortunately I need to back-fill any NULLs.
I tried with a MERGE but I get a key error.
The equivalent query in MySQL (that I cannot convert) is:
Please note that I have changed all the field and table names
INSERT INTO user_brief (name, high_score, colour)
SELECT
u.name,
h.high_score,
p.colour
FROM foo_table AS f
LEFT JOIN users AS u ON f.user_id = u.id
LEFT JOIN high_scores AS h ON f.user_id = h.id
LEFT JOIN preferences AS p ON f.user_id = p.id
ON DUPLICATE KEY
UPDATE
name = COALESCE(user_brief.name, VALUES(name)),
high_score = COALESCE(user_brief.high_score, VALUES(high_score)),
colour = COALESCE(user_brief.colour, VALUES(colour));
SELECT Query Results
If we take just the SELECT you would get the following results:
name | high_score | colour
---------------------------
foo | NULL | brown
foo | 40 | NULL
bar | 29 | blue
...
Desired Results
name | high_score | colour
---------------------------
foo | 40 | brown
bar | 29 | blue
...
As you can see, the rows have been flattened (not sure if that's the correct term), taking the first non-NULL value for each column of each name-keyed record.
My attempted MERGE solution (but it gets key errors):
MERGE INTO user_brief AS target
USING (SELECT
u.name,
h.high_score,
p.colour
FROM foo_table AS f
LEFT JOIN users AS u ON f.user_id = u.id
LEFT JOIN high_scores AS h ON f.user_id = h.id
LEFT JOIN preferences AS p ON f.user_id = p.id) AS source
ON target.name = source.name
WHEN MATCHED THEN
UPDATE SET
target.name = COALESCE(source.name, target.name),
target.high_score = COALESCE(source.high_score, target.high_score),
target.colour = COALESCE(source.colour, target.colour)
WHEN NOT MATCHED BY TARGET THEN
INSERT (name, high_score, colour)
VALUES (source.name, source.high_score, source.colour);

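The MERGE fails because the source contains several rows for the same name. You could use GROUP BY to flatten the source to one row per name first; aggregates such as MIN ignore NULLs, so each column keeps a non-NULL value whenever one exists: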
WITH source AS (
SELECT
u.name,
high_score = MIN(h.high_score),
colour = MIN(p.colour)
FROM foo_table AS f
LEFT JOIN users AS u ON f.user_id = u.id
LEFT JOIN high_scores AS h ON f.user_id = h.id
LEFT JOIN preferences AS p ON f.user_id = p.id
GROUP BY u.name
)
MERGE INTO user_brief AS target
USING source
ON target.name = source.name
WHEN MATCHED THEN
UPDATE SET
target.name = COALESCE(source.name, target.name),
target.high_score = COALESCE(source.high_score, target.high_score),
target.colour = COALESCE(source.colour, target.colour)
WHEN NOT MATCHED BY TARGET THEN
INSERT (name, high_score, colour)
VALUES (source.name, source.high_score, source.colour);
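To see why the GROUP BY step removes the duplicate keys, here is a minimal sketch using Python's built-in sqlite3 module instead of SQL Server (an assumption made only so the example is self-contained; SQLite has no MERGE). The table name source_rows is hypothetical and stands in for the joined SELECT, loaded with the sample rows from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-in for the joined SELECT in the question.
    CREATE TABLE source_rows (name TEXT, high_score INTEGER, colour TEXT);
    INSERT INTO source_rows VALUES
        ('foo', NULL, 'brown'),
        ('foo', 40,   NULL),
        ('bar', 29,   'blue');
""")

# MIN skips NULLs, so each group collapses to one row with the
# non-NULL value of every column (when one exists).
flattened = conn.execute("""
    SELECT name, MIN(high_score) AS high_score, MIN(colour) AS colour
    FROM source_rows
    GROUP BY name
    ORDER BY name
""").fetchall()

print(flattened)  # [('bar', 29, 'blue'), ('foo', 40, 'brown')]
```

Note that MIN returns the smallest non-NULL value rather than the "first" one, which only matters if a name has several different non-NULL values in the same column.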

Related

Join with sub query vs Join to table - why is join to table incorrect?

I have 2 queries:
The first uses sub-query joins and pulls everything back correctly.
The second uses plain joins and returns an incorrect GrossAnnualDebit, which overall is much lower than the GrossAnnualDebit figure from the sub-query joins.
SELECT prty_id AS PropertyID,
ISNULL(SUM(tr.grs_val_trans), 0) + ISNULL(SUM(voi.grs_valtrs), 0) AS GrossAnnualDebit
FROM qlfdat..hgmprty1 p1
LEFT JOIN
(
SELECT prty_ref,
SUM(grs_val_trans) AS grs_val_trans
FROM qlfdat..hratrans
WHERE trans_ppyy BETWEEN 201805 AND 201904
AND trans_type = 'D'
GROUP BY prty_ref
) AS tr ON tr.prty_ref = p1.prty_id
LEFT JOIN
(
SELECT prty_ref,
SUM(grs_valtrs) AS grs_valtrs
FROM qlfdat..hraptvtt
WHERE trans_ppyy BETWEEN 201805 AND 201904
GROUP BY prty_ref
) AS voi ON voi.prty_ref = p1.prty_id
GROUP BY prty_id;
SELECT prty_id AS PropertyID,
ISNULL(SUM(tr.grs_val_trans), 0) + ISNULL(SUM(voi.grs_valtrs), 0) AS GrossAnnualDebit
FROM qlfdat..hgmprty1 p1
LEFT JOIN qlfdat..hratrans AS tr ON tr.prty_ref = p1.prty_id
AND tr.trans_type = 'D'
AND tr.trans_ppyy BETWEEN 201805 AND 201904
LEFT JOIN qlfdat..hraptvtt AS voi ON voi.prty_ref = p1.prty_id
AND voi.trans_ppyy BETWEEN 201805 AND 201904
AND voi.trans_ppyy = tr.trans_ppyy
GROUP BY prty_id;
I can't tell exactly what the issue is without sample data, but one difference I can see is that in the second query the voi join no longer depends on p1 alone: it also depends on tr through voi.trans_ppyy = tr.trans_ppyy. Any voi row without a tr row in the same period is therefore dropped, which might cause your issue.
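The effect of the extra voi.trans_ppyy = tr.trans_ppyy condition can be reproduced on a tiny data set. The sketch below uses Python's sqlite3 module as a stand-in for SQL Server, with hypothetical simplified tables (prop, tr, voi) and made-up amounts; it is only meant to show how the two query shapes can disagree, not to reproduce the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified versions of hgmprty1, hratrans and hraptvtt.
    CREATE TABLE prop (prty_id INTEGER);
    CREATE TABLE tr   (prty_ref INTEGER, ppyy INTEGER, val INTEGER);
    CREATE TABLE voi  (prty_ref INTEGER, ppyy INTEGER, val INTEGER);
    INSERT INTO prop VALUES (1);
    INSERT INTO tr   VALUES (1, 201805, 100), (1, 201806, 100);
    INSERT INTO voi  VALUES (1, 201807, 50);  -- period with no tr row
""")

# Shape 1: aggregate each table on its own, then join the totals.
pre_aggregated = conn.execute("""
    SELECT p.prty_id, COALESCE(t.val, 0) + COALESCE(v.val, 0)
    FROM prop p
    LEFT JOIN (SELECT prty_ref, SUM(val) AS val FROM tr  GROUP BY prty_ref) t
           ON t.prty_ref = p.prty_id
    LEFT JOIN (SELECT prty_ref, SUM(val) AS val FROM voi GROUP BY prty_ref) v
           ON v.prty_ref = p.prty_id
""").fetchall()

# Shape 2: join the raw rows; voi only matches when tr has the same period.
direct_join = conn.execute("""
    SELECT p.prty_id, COALESCE(SUM(t.val), 0) + COALESCE(SUM(v.val), 0)
    FROM prop p
    LEFT JOIN tr  t ON t.prty_ref = p.prty_id
    LEFT JOIN voi v ON v.prty_ref = p.prty_id AND v.ppyy = t.ppyy
    GROUP BY p.prty_id
""").fetchall()

print(pre_aggregated)  # [(1, 250)]
print(direct_join)     # [(1, 200)]
```

Here the voi amount of 50 is lost in the second shape because no tr row exists for period 201807. With other data the raw join can also over-count, since each tr row is repeated once per matching voi row before SUM is applied.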

EF6 - Generating unneeded nested queries

I have the following tables:
MAIN_TBL:
Col1 | Col2 | Col3
------------------
A | B | C
D | E | F
And:
REF_TBL:
Ref1 | Ref2 | Ref3
------------------
A | G1 | Foo
D | G1 | Bar
Q | G2 | Xyz
I wish to write the following SQL query:
SELECT M.Col1
FROM MAIN_TBL M
LEFT JOIN REF_TBL R
ON R.Ref1 = M.Col1
AND R.Ref2 = 'G1'
WHERE M.Col3 = 'C'
I wrote the following LINQ query:
from main in dbContext.MAIN_TBL
join refr in dbContext.REF_TBL
on "G1" equals refr.Ref2
into refrLookup
from refr in refrLookup.DefaultIfEmpty()
where main.Col1 == refr.Col1
select main.Col1
And the generated SQL was:
SELECT
[MAIN_TBL].[Col1]
FROM (SELECT
[MAIN_TBL].[Col1] AS [Col1],
[MAIN_TBL].[Col2] AS [Col2],
[MAIN_TBL].[Col3] AS [Col3]
FROM [MAIN_TBL]) AS [Extent1]
INNER JOIN (SELECT
[REF_TBL].[Ref1] AS [Ref1],
[REF_TBL].[Ref2] AS [Ref2],
[REF_TBL].[Ref3] AS [Ref3]
FROM [REF_TBL]) AS [Extent2] ON [Extent1].[Col1] = [Extent2].[Ref1]
WHERE ('G1' = [Extent2].[DESCRIPTION]) AND ([Extent2].[Ref1] IS NOT NULL) AND CAST( [Extent1].[Col3] AS VARCHAR) = 'C') ...
Looks like it is nesting a query within another query, while I just want it to pull from the table. What am I doing wrong?
I may be wrong, but your LINQ query does not seem to do the same thing as your SQL query, especially in the left join clause.
I would go for this if you want something similar to your SQL query:
from main in dbContext.MAIN_TBL.Where(x => x.Col3 == "C")
join refr in dbContext.REF_TBL
on new{n = "G1", c = main.Col1} equals new{n = refr.Ref2, c = refr.Ref1}
into refrLookup
from r2 in refrLookup.DefaultIfEmpty()
select main.Col1
By the way, it doesn't make much sense to left join a table that is not used in the select clause: you will just get multiple identical Col1 values if there is more than one related row in the left-joined table.

INNER JOIN clause ignoring NULL values

I am looking to query some data that pertains to medications a patient has been prescribed that are in a certain category. But I also want to show patients that do not have any medications. My query so far:
SELECT
pd.fname,
pd.lname,
pp.drug_name,
pp.drug_strength
FROM
patient_data pd
FULL OUTER JOIN patient_prescr pp on pp.pid = pd.pid
FULL OUTER JOIN formulary f on pp.med_id = f.id
INNER JOIN formulary_categories fc on f.category = fc.id AND fc.id in (34,36,37,38,5)
WHERE
pd.lname = 'Test'
When I apply the INNER JOIN to formulary_categories, I can correctly restrict the results to the drug categories I want, but the query then does NOT include patients who have no medications.
With the INNER JOIN joining the formulary_categories table, my results look like this:
-----------------------------------------------------------------------
fname | lname | drug_name | drug_strength
-----------------------------------------------------------------------
Cathy Test Clonazepam 0.5mg
Larry Test Librium 25mg
Jennifer Test Vistrail 25mg
-----------------------------------------------------------------------
If I change the INNER JOIN to a FULL OUTER JOIN, it simply ignores the category constraint and pulls all categories.
However, the query will not include patients who have no medications prescribed. I'd like my results to look something like:
-----------------------------------------------------------------------
fname | lname | drug_name | drug_strength
-----------------------------------------------------------------------
Cathy Test Clonazepam 0.5mg
Larry Test Librium 25mg
Joe Test NULL NULL
Jennifer Test Vistrail 25mg
Steve Test NULL NULL
-----------------------------------------------------------------------
You are actually looking for LEFT JOIN:
SELECT
pd.fname,
pd.lname,
pp.drug_name,
pp.drug_strength
FROM
patient_data pd
FULL OUTER JOIN patient_prescr pp on pp.pid = pd.pid
FULL OUTER JOIN formulary f on pp.med_id = f.id
LEFT JOIN formulary_categories fc on f.category = fc.id
AND fc.id in (34,36,37,38,5)
WHERE
pd.lname = 'Test'
A LEFT JOIN does not filter out rows from the left side when no match is found in the right-hand table (or result set); it returns NULL for the columns that come from the table where no match was found (just like in your expected output sample).
You can also take a look at the best article (in my opinion) for understanding all types of JOINs, here.
You simply need LEFT OUTER JOIN :
SELECT
pd.fname,
pd.lname,
pp.drug_name,
pp.drug_strength
FROM
patient_data pd
FULL OUTER JOIN patient_prescr pp on pp.pid = pd.pid
FULL OUTER JOIN formulary f on pp.med_id = f.id
LEFT OUTER JOIN formulary_categories fc on f.category = fc.id AND fc.id in (34,36,37,38,5)
WHERE
pd.lname = 'Test'
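One caveat with moving the category filter into an outer join on formulary_categories: pp.drug_name comes from patient_prescr, so drugs outside the listed categories can still show up. A possible alternative, sketched here with Python's sqlite3 module as a stand-in for SQL Server, is to filter the prescriptions inside a derived table and LEFT JOIN that to the patients. The sample rows (including the Aspirin prescription and category 99) are made up, and the sketch assumes that filtering on formulary.category directly is equivalent to joining formulary_categories on its id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE patient_data (pid INTEGER, fname TEXT, lname TEXT);
    CREATE TABLE formulary (id INTEGER, category INTEGER);
    CREATE TABLE patient_prescr (pid INTEGER, med_id INTEGER,
                                 drug_name TEXT, drug_strength TEXT);
    -- Made-up sample data: Joe has no prescriptions at all,
    -- Steve only has a drug outside the wanted categories.
    INSERT INTO patient_data VALUES
        (1, 'Cathy', 'Test'), (2, 'Joe', 'Test'), (3, 'Steve', 'Test');
    INSERT INTO formulary VALUES (10, 34), (11, 99);
    INSERT INTO patient_prescr VALUES
        (1, 10, 'Clonazepam', '0.5mg'),
        (3, 11, 'Aspirin', '81mg');
""")

rows = conn.execute("""
    SELECT pd.fname, m.drug_name, m.drug_strength
    FROM patient_data pd
    LEFT JOIN (
        -- Only prescriptions whose drug is in a wanted category.
        SELECT pp.pid, pp.drug_name, pp.drug_strength
        FROM patient_prescr pp
        JOIN formulary f ON f.id = pp.med_id
        WHERE f.category IN (34, 36, 37, 38, 5)
    ) AS m ON m.pid = pd.pid
    WHERE pd.lname = 'Test'
    ORDER BY pd.fname
""").fetchall()

print(rows)
# [('Cathy', 'Clonazepam', '0.5mg'), ('Joe', None, None), ('Steve', None, None)]
```

Every patient named Test is kept, and the drug columns are NULL (None in Python) unless the patient has a prescription in one of the listed categories.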

Get element names

I got 4 tables as follows:
tbProjekt
--------------
Id
Every machine has a ProjektId that refers to a project:
tblMaszyna
--------------
Id
ProjektId
tblElement
--------------
Id
Name
In this table I associate elements with machines:
tblMaszElem
--------------
Id
IdElem
IdMach
I would like to get the Name from tblElement of every element that belongs to a machine belonging to a specified project, say ProjektId 10. How can I achieve that?
select e.Name
from tblElement e
inner join tblMaszElem me on me.IdElem = e.Id
inner join tblMaszyna m on m.Id = me.IdMach
inner join tbProjekt p on p.Id = m.ProjektId
where
p.Id = 10
This should do. It selects the Name column of all entries in the tblElement table that are associated with a machine that is associated with the project whose ID is 10.
Please check this sample and its comment
select
te.name
from
tblMaszElem tmem
inner join tblElement te on te.id = tmem.IdElem
inner join tblMaszyna tmzy on tmzy.id = tmem.IdMach
--inner join tbProjekt tp on tp.id = tmzy.ProjektId --not needed: ProjektId is already on tblMaszyna
where
tmzy.ProjektId = 10
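The join chain from the answers above can be checked on a few rows. This is a minimal sketch using Python's sqlite3 module as a stand-in for SQL Server; the element names and ids are made up, and tbProjekt is left out because the filter only needs tblMaszyna.ProjektId.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblMaszyna  (Id INTEGER, ProjektId INTEGER);
    CREATE TABLE tblElement  (Id INTEGER, Name TEXT);
    CREATE TABLE tblMaszElem (Id INTEGER, IdElem INTEGER, IdMach INTEGER);
    -- Made-up data: machine 1 is in project 10, machine 2 in project 20.
    INSERT INTO tblMaszyna  VALUES (1, 10), (2, 20);
    INSERT INTO tblElement  VALUES (1, 'bolt'), (2, 'gear'), (3, 'belt');
    INSERT INTO tblMaszElem VALUES (1, 1, 1), (2, 2, 1), (3, 3, 2);
""")

# Link table -> element and link table -> machine, filtered by project.
names = [row[0] for row in conn.execute("""
    SELECT e.Name
    FROM tblMaszElem me
    INNER JOIN tblElement e ON e.Id = me.IdElem
    INNER JOIN tblMaszyna m ON m.Id = me.IdMach
    WHERE m.ProjektId = ?
    ORDER BY e.Name
""", (10,))]

print(names)  # ['bolt', 'gear']
```

If the same element can be attached to several machines of one project, adding DISTINCT to the SELECT avoids repeated names.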

How to retrieve data from two tables related using a third table, SQL Server

I have three tables(simplified)
movie(id int primary key identity, title varchar(20) not null)
genre(id int primary key identity, type varchar(10) not null)
movie_genre(movie_id int references movie(id),
genre_id int references genre(id),
primary key(movie_id, genre_id))
Data in movie
id title
---------------------------
1 | Inception
2 | The Dark Knight
Data in genre
id type
---------------------
1 | action
2 | adventure
3 | thriller
Data in movie_genre
movie_id genre_id
----------------------------
1 | 1
1 | 2
2 | 1
2 | 3
I want to display movie name with its genre types displayed in one column. So, the output would be
title | genres
-----------------------------------------
Inception | action adventure
The Dark Knight | action thriller
I tried to do it in this way
select
movie.title, genre.type
from
movie, genre
where
movie.id = movie_genre.movie_id
and genre.id = movie_genre.genre_id;
but it says :
The multi-part identifier "movie_genre.movie_id" could not be bound.
The multi-part identifier "movie_genre.genre_id" could not be bound.
I am very new to SQL, any help would be appreciated.
Edit :
Using
SELECT G.[Type] ,M.[Title]
FROM movie_genre MG
LEFT JOIN genre G ON MG.genre_id = G.ID
LEFT JOIN movie M ON MG.Movie_ID = M.ID
OR
select movie.title, genre.type
from movie, genre, movie_genre
where
movie.id = movie_genre.movie_id
and genre.id = movie_genre.genre_id;
The output is now,
title | genres
-----------------------------------------
Inception | action
Inception | adventure
The Dark Knight | action
The Dark Knight | thriller
How could I display genres in one row?
SELECT G.[Type]
,M.[Title]
FROM movie_genre MG
LEFT JOIN genre G ON MG.genre_id = G.ID
LEFT JOIN movie M ON MG.Movie_ID = M.ID
To get a list
SELECT DISTINCT M.[Title]
,STUFF((
SELECT ' ' + G.[Type]
FROM genre G INNER JOIN movie_genre MG
ON MG.genre_id = G.ID
WHERE MG.Movie_id = Mov.Movie_id
FOR XML PATH(''),TYPE)
.value('.','NVARCHAR(MAX)'),1,1, '') Genre
FROM movie_genre Mov
INNER JOIN movie M ON Mov.Movie_ID = M.ID
OR
SELECT DISTINCT M.[Title]
,STUFF(List,1,1, '') Genre
FROM movie_genre Mov
INNER JOIN movie M
ON Mov.Movie_ID = M.ID
CROSS APPLY
(
SELECT ' ' + G.[Type]
FROM genre G INNER JOIN movie_genre MG
ON MG.genre_id = G.ID
WHERE MG.Movie_id = Mov.Movie_id
FOR XML PATH('')
)Gen(List)
I believe you will need to add 'movie_genre' to the FROM clause, e.g.:
SELECT movie.title, genre.type FROM movie, genre, movie_genre WHERE ....
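To illustrate the "one row per movie" result, here is a minimal sketch using Python's sqlite3 module with the sample data from the question. SQLite is only a stand-in here: it has no STUFF or FOR XML PATH, so the sketch uses SQLite's GROUP_CONCAT aggregate instead (on SQL Server 2017 and later, STRING_AGG plays a similar role). SQLite does not guarantee the order of the concatenated values, so the result is compared as sets.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movie (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE genre (id INTEGER PRIMARY KEY, type TEXT NOT NULL);
    CREATE TABLE movie_genre (
        movie_id INTEGER REFERENCES movie(id),
        genre_id INTEGER REFERENCES genre(id),
        PRIMARY KEY (movie_id, genre_id)
    );
    INSERT INTO movie VALUES (1, 'Inception'), (2, 'The Dark Knight');
    INSERT INTO genre VALUES (1, 'action'), (2, 'adventure'), (3, 'thriller');
    INSERT INTO movie_genre VALUES (1, 1), (1, 2), (2, 1), (2, 3);
""")

rows = conn.execute("""
    SELECT m.title, GROUP_CONCAT(g.type, ' ') AS genres
    FROM movie m
    INNER JOIN movie_genre mg ON mg.movie_id = m.id
    INNER JOIN genre g ON g.id = mg.genre_id
    GROUP BY m.id, m.title
    ORDER BY m.id
""").fetchall()

# One row per movie; split the genre string to compare without
# depending on concatenation order.
result = {title: set(genres.split(" ")) for title, genres in rows}
print(result)
```

Each movie appears once, with all of its genres collected into a single space-separated column.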
