SQL Server: left outer join on two relations to 2 tables

I have 2 tables with primary keys, and a third table (or several tables) that references these 2 primary tables and holds some extra values for one or both primary keys.
I need to write SQL that always returns as much information as possible by joining these 3 tables. Best result: all 3 tables joined. Medium result: at least one of the primary keys (or both) is selected. Worst result: all columns are null.
The main idea is to combine the two primary tables with many extra tables, which may be empty but should still allow results from the tables that do have values.
I started with 3 tables but got stuck on the second join.
It works only when I join the first table. Joining the second one produces an error.
What SQL statement should I use in place of the ? below?
http://sqlfiddle.com/#!18/7438b/3
CREATE TABLE [AGENCIES]
(
[AGENCY_NAME] [CHAR](9),
id INT IDENTITY(1,1) NOT NULL PRIMARY KEY
);
CREATE TABLE [PERSONS]
(
[NAME] [CHAR](9),
id INT IDENTITY(1,1) NOT NULL PRIMARY KEY
);
CREATE TABLE [AGENCY_PERSON]
(
agency_id INT FOREIGN KEY REFERENCES agencies(id),
person_id INT FOREIGN KEY REFERENCES persons(id),
[TITLE] [CHAR](9) NULL,
id INT IDENTITY(1,1) NOT NULL PRIMARY KEY
);
INSERT INTO agencies (AGENCY_NAME)
VALUES ('AgencyOne'), ('AgencyTwo'), ('Agency3');
INSERT INTO persons (name)
VALUES ('PersonOne'), ('PersonTwo'), ('Person3');
INSERT INTO AGENCY_PERSON (agency_id, person_id, title)
VALUES (1, 1, 'TitleOne'), (1, 2, 'TitleTwo');
SELECT * FROM AGENCY_PERSON;
-- works fine for one primary table
SELECT [AGENCY_NAME], [TITLE]
FROM agencies
LEFT OUTER JOIN [AGENCY_PERSON] ON [AGENCY_PERSON].agency_id = agencies.id
WHERE [AGENCY_NAME] = 'AgencyOne';
-- error for two primary tables: Msg 4104 - The multi-part identifier "agencies.id" could not be bound.
SELECT [AGENCY_NAME], [TITLE], persons.name
FROM agencies, persons
LEFT OUTER JOIN [AGENCY_PERSON] ON [AGENCY_PERSON].agency_id = agencies.id
AND [AGENCY_PERSON].person_id = persons.id
WHERE [AGENCY_NAME] = 'AgencyOne';
-- select ? 'AgencyOne' - all records exist
-- AgencyOne, TitleOne, PersonOne
-- select ? 'TitleTwo' - both records on primary tables exist, but none in join table
-- AgencyOne, TitleTwo, NULL
-- select ? 'Agency3' - one of primary tables exist
-- Agency3, NULL, NULL
-- select ? 'Title3' - one of primary tables exist
-- NULL, Title3, NULL
-- select ? 'AgencyX' - nothing exists
-- NULL, NULL, NULL
forpas gave a good answer, but it works in reverse: the extra tables are left joined by the primaries, which requires that the extra tables exist and have values. What I need is the opposite: the extra tables should be joined onto the primaries. For example, there could be more extra tables such as PERSON_PHONE, PERSON_ADDRES or AGENCY_PERSON_LOCATION. As long as the agency or person exists (even with no values in these extra tables), the result should be a row with the existing agency and person and nulls in all other columns from the joined tables.

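Your code would work if you did not use that old-style (comma) cross join. An explicit JOIN is evaluated before the comma, so the ON clause of your LEFT OUTER JOIN can see only persons and AGENCY_PERSON, and agencies.id cannot be bound: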
from agencies, persons
So write it like this:
select a.[AGENCY_NAME], ap.[TITLE], p.name
from agencies as a cross join persons as p
left outer join [AGENCY_PERSON] as ap
on ap.agency_id = a.id and ap.person_id = p.id
where a.[AGENCY_NAME] = 'AgencyOne';
I used aliases for all the tables involved and qualified every column with the alias of the table it belongs to.
Results:
> AGENCY_NAME | TITLE | name
> :---------- | :-------- | :--------
> AgencyOne | TitleOne | PersonOne
> AgencyOne | TitleTwo | PersonTwo
> AgencyOne | null | Person3
I'm not sure if this is what you want as output but I believe you see now how you can join all 3 tables.
In case you want only the matching rows of the tables, then you should do inner joins:
select a.[AGENCY_NAME], ap.[TITLE], p.name
from [AGENCY_PERSON] as ap
inner join agencies as a on ap.agency_id = a.id
inner join persons as p on ap.person_id = p.id
where a.[AGENCY_NAME] = 'AgencyOne';
Results:
> AGENCY_NAME | TITLE | name
> :---------- | :-------- | :--------
> AgencyOne | TitleOne | PersonOne
> AgencyOne | TitleTwo | PersonTwo
See the demo.
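To address the follow-up about additional extra tables, here is a minimal sketch of how the cross join version might be extended. PERSON_PHONE is only named in the question, so its columns (person_id, phone) and the sample row are assumptions made up for illustration. Each extra table gets its own LEFT OUTER JOIN, so an empty extra table only produces NULLs and does not remove the agency/person row.

```sql
-- Hypothetical extra table; the question mentions PERSON_PHONE by name only,
-- so these columns are assumed.
CREATE TABLE PERSON_PHONE
(
    person_id INT FOREIGN KEY REFERENCES persons(id),
    phone VARCHAR(20) NULL,
    id INT IDENTITY(1,1) NOT NULL PRIMARY KEY
);

-- Made-up sample row for PersonOne only.
INSERT INTO PERSON_PHONE (person_id, phone)
VALUES (1, '555-0100');

-- The primaries are combined first; every extra table is then left joined,
-- so missing rows in an extra table show up as NULLs.
SELECT a.AGENCY_NAME, p.name, ap.TITLE, pp.phone
FROM agencies AS a
CROSS JOIN persons AS p
LEFT OUTER JOIN AGENCY_PERSON AS ap
    ON ap.agency_id = a.id AND ap.person_id = p.id
LEFT OUTER JOIN PERSON_PHONE AS pp
    ON pp.person_id = p.id
WHERE a.AGENCY_NAME = 'AgencyOne';
```

With the sample data from the question this should still return one row per person for AgencyOne, with TITLE and phone filled in only where a matching row exists.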

Related

TSQL Reference Table with redundant keys

I'm currently working on a stored procedure on SQL Server 2016. My database has an existing table structure, and I need to add another table that references the same table as an existing one.
Thus, I have two 1:1 relations to the same table.
The problem is that the same key values from 2 different origin tables end up twice in the same target table.
Target table:
FK_Tables | Text
----------------
1 | Table One Text Id: 1
1 | Table Two Text Id: 1 // The error: Same FK_Tables 2 times
Table One:
ID | OtherField
---------
1 | 42
Table Two:
ID | CoolField
---------
1 | 22
Table One and Table Two both currently reference the table Reference Table.
Do you know how I can solve this problem of the same ID appearing twice?
Thanks!!
You need to add a column for each table you're referencing, otherwise you wouldn't know where the ID is coming from if they were all inserted into the same field. Something like this:
/*
CREATE TEST TABLES
*/
DROP TABLE IF EXISTS tbOne;
CREATE TABLE tbOne ( ID INT IDENTITY(1,1) NOT NULL PRIMARY KEY
, TXT VARCHAR(10)
);
DROP TABLE IF EXISTS tbTwo;
CREATE TABLE tbTwo ( ID INT IDENTITY(1,1) NOT NULL PRIMARY KEY
, TXT VARCHAR(10)
);
DROP TABLE IF EXISTS Target;
CREATE TABLE Target ( ID INT IDENTITY(1,1) NOT NULL PRIMARY KEY
, FKTB1 INT
, FKTB2 INT
, TXT VARCHAR(100)
);
-- 1st FK tbOne
ALTER TABLE Target ADD CONSTRAINT FK_One FOREIGN KEY (FKTB1) REFERENCES tbOne (ID);
--2nd FK tbTwo
ALTER TABLE Target ADD CONSTRAINT FK_Two FOREIGN KEY (FKTB2) REFERENCES tbTwo (ID);
-- Populate test tables
INSERT INTO tbOne (TXT)
SELECT TOP 100 LEFT(text, 10)
FROM SYS.messages
INSERT INTO tbTwo (TXT)
SELECT TOP 100 LEFT(text, 10)
FROM SYS.messages
INSERT INTO [Target] (FKTB1, FKTB2, TXT)
SELECT 1, 1, 'Test - constraint'
-- Check result set
SELECT *
FROM tbTwo
SELECT *
FROM tbOne
SELECT *
FROM [Target] T
INNER JOIN tbOne TB1
ON T.FKTB1 = TB1.ID
INNER JOIN tbTwo TB2
ON T.FKTB2 = TB2.ID
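As a rough sketch of how the two rows from the question could be stored with this layout: each row fills only the foreign key column of the table it came from and leaves the other one NULL. This assumes the tbOne, tbTwo and Target tables from the script above already exist and that tbOne and tbTwo each contain a row with ID = 1; the LEFT JOINs are my choice so that rows with one NULL key are not dropped.

```sql
-- One row per origin table; the unused foreign key column stays NULL.
INSERT INTO Target (FKTB1, FKTB2, TXT)
VALUES (1, NULL, 'Table One Text Id: 1'),
       (NULL, 1, 'Table Two Text Id: 1');

-- LEFT JOINs keep rows that reference only one of the two parent tables.
SELECT T.ID,
       T.TXT,
       TB1.TXT AS one_txt,
       TB2.TXT AS two_txt
FROM Target AS T
LEFT JOIN tbOne AS TB1
    ON TB1.ID = T.FKTB1
LEFT JOIN tbTwo AS TB2
    ON TB2.ID = T.FKTB2;
```

The same ID value can now appear in both columns without ambiguity, because the column itself says which table the ID belongs to.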

Lookup delimited values in a table in sql-server

In a table A I have a column (varchar(30)) city-id with values such as 1,2,3 or 2,4.
The description of the value is stored in another table B, e.g.
1 Amsterdam
2 The Hague
3 Maastricht
4 Rotterdam
How must I join table A with table B to get the descriptions in one or maybe more rows?
Assuming this is what you meant:
Table A:
id
-------
1
2
3
Table B:
id | Place
-----------
1 | Amsterdam
2 | The Hague
3 | Maastricht
4 | Rotterdam
Keep id column in both tables as auto increment, and PK.
Then just do a simple inner join.
select * from A inner join B on (A.id = B.id);
The ideal way to deal with such scenarios is to have a normalized table, as Collin suggested. In case that can't be done, here is the way to go about it:
You would need to use a table-valued function to split the comma-separated value. If you have SQL Server 2016, there is a built-in STRING_SPLIT function; if not, you would need to create one, as shown in this link.
create table dbo.sCity(
CityId varchar(30)
);
create table dbo.sCityDescription(
CityId int
,CityDescription varchar(30)
);
insert into dbo.sCity values
('1,2,3')
,('2,4');
insert into dbo.sCityDescription values
(1,'Amsterdam')
,(2,'The Hague')
,(3,'Maastricht')
,(4,'Rotterdam');
select ctds.CityDescription
,sst.Value as 'CityId'
from dbo.sCity ct
cross apply dbo.SplitString(CityId,',') sst
join dbo.sCityDescription ctds
on sst.Value = ctds.CityId;
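For comparison, a sketch of the same lookup using the built-in STRING_SPLIT instead of a user-defined dbo.SplitString. It assumes SQL Server 2016 or later with database compatibility level 130 or higher, and reuses the dbo.sCity and dbo.sCityDescription tables created above. STRING_SPLIT returns a single column named value; TRY_CAST is my addition so that a non-numeric fragment yields NULL instead of a conversion error.

```sql
-- STRING_SPLIT returns one row per list element in a column called value.
SELECT ctds.CityDescription,
       s.value AS CityId
FROM dbo.sCity AS ct
CROSS APPLY STRING_SPLIT(ct.CityId, ',') AS s
JOIN dbo.sCityDescription AS ctds
    ON ctds.CityId = TRY_CAST(s.value AS int);
```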

Choice of Clustered and Non-Clustered Index in SQL Server

I have two standalone tables Person and Bank in one database. Moreover I have a reference table Customers in another database.
Table: Person - Database #1
PersonId (PK) Name
_________________________________
1 Ram
2 Raj
3 John
4 Emma
Table: Bank - Database #1
BankId (PK) Name
_________________________________
1 ICICI
2 HDFC
3 SBI
Table: Customers - Database #2
CustomerId (PK) BankId PersonId UserName PIN
_____________________________________________________________
1 1 1 person1 7456bb
2 1 4 person4 NULL
3 2 1 person1 5691io
4 3 2 person2 7892yh
5 3 4 person4 1596pl
I need to execute the following queries efficiently; without an index, each of them will scan the whole table:
SELECT
c.CustomerId, p.Name, b.Name, c.UserName, c.PIN
FROM
DB2.Customers c
INNER JOIN
DB1.Person p ON p.PersonId = c.PersonId
INNER JOIN
DB1.Bank b ON b.BankId = c.BankId
SELECT *
FROM DB2.Customers c
WHERE c.UserName = 'person1'
SELECT *
FROM DB2.Customers c
WHERE c.UserName = 'person2' AND c.PIN = '7892yh'
So, I need a CREATE TABLE statement for the table Customers. The constraints are as follows:
CustomerId int NOT NULL primary key
BankId int NOT NULL -- needs index and non-unique
PersonId int NOT NULL -- needs index and non-unique
UserName varchar(25) NOT NULL -- needs index and non-unique
PIN varchar(10) NULL -- needs index and non-unique
The table doesn't have FK relationships because it is on a different database server. So I need an efficient table structure for fetching records using JOIN. I don't know which index is efficient in this scenario, clustered or non-clustered.
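One possible starting point, offered only as a sketch: a table can have just one clustered index, so a common choice is to cluster on the primary key and cover the other lookups with non-clustered indexes. The dbo schema and all index names below are assumptions; whether each index pays off depends on the real data volume and should be checked against the actual execution plans.

```sql
-- Clustered index on the primary key (only one clustered index is allowed per table).
CREATE TABLE dbo.Customers
(
    CustomerId INT NOT NULL,
    BankId INT NOT NULL,
    PersonId INT NOT NULL,
    UserName VARCHAR(25) NOT NULL,
    PIN VARCHAR(10) NULL,
    CONSTRAINT PK_Customers PRIMARY KEY CLUSTERED (CustomerId)
);

-- Serves both WHERE UserName = ... and WHERE UserName = ... AND PIN = ...
CREATE NONCLUSTERED INDEX IX_Customers_UserName_PIN
    ON dbo.Customers (UserName, PIN);

-- Support the joins to Bank and Person (non-unique).
CREATE NONCLUSTERED INDEX IX_Customers_BankId
    ON dbo.Customers (BankId);

CREATE NONCLUSTERED INDEX IX_Customers_PersonId
    ON dbo.Customers (PersonId);
```

The composite (UserName, PIN) index is used here instead of two separate ones because both filter queries lead with UserName.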

Left Join containing where clause INSIDE join

Let's say we have the following table structure:
-- Temp tables: DECLARE ... TABLE only accepts @-prefixed table variables
CREATE TABLE #Person
(
PersonId INT,
Name VARCHAR(50)
)
CREATE TABLE #Address
(
AddressId INT IDENTITY(1,1),
PersonId INT
)
And we insert two person records:
INSERT INTO #Person (PersonId, Name) VALUES (1, 'John Doe')
INSERT INTO #Person (PersonId, Name) VALUES (2, 'Jane Doe')
But we only insert an address record for John:
INSERT INTO #Address (PersonId) VALUES (1)
If I execute the following queries, I get different results:
SELECT *
FROM #Person p
LEFT JOIN #Address a
ON p.PersonId = a.PersonId AND a.PersonId IS NULL
PersonId | Name | AddressId | PersonId
1 | John Doe | NULL | NULL
2 | Jane Doe | NULL | NULL
VS
SELECT *
FROM #Person p
LEFT JOIN #Address a
ON p.PersonId = a.PersonId
WHERE a.PersonId IS NULL
PersonId | Name | AddressId | PersonId
2 | Jane Doe | NULL | NULL
Why are the queries returning different results?
The extra condition in the first query's ON clause can never be met, so it displays all rows from the #Person table with NULLs for #Address (a typical left join). In the second query, the WHERE clause is applied after the join, so it filters the joined rows and displays the expected result.
First:
All (two) records from Person are returned and 0 records from Address are joined, because no address has PersonId = NULL. No additional filter is applied after that, so you see two records from Person.
Second:
All (two) records from Person are returned, and one of them is joined to the Address row with PersonId = 1. Then your WHERE filter is applied, and the record with the joined PersonId = 1 disappears.
The ON clause defines which rows from the two tables match.
The WHERE clause filters the rows after the join.
The 1st query returns 2 rows because LEFT JOIN returns all the rows from the left table, irrespective of any match in the right table.
The 2nd query returns 1 row because, for PersonId = 1, the #Address table contains a matching record, hence a.PersonId is NOT NULL and that row is filtered out.
Make it a habit to read your SQL query starting from the WHERE condition and then look at your joins; this will give you a clearer understanding of what is going to be returned.
In this case you said WHERE a.PersonId IS NULL, so the SELECT runs, joins using the given join criteria, and only then keeps the rows where a.PersonId is NULL.
That is how your query is processed, hence the different result sets.
In contrast, when there is no WHERE clause, the rows of the left table (p) do not have to exist in (a), but the ON condition also requires the matching rows of (a) to have a NULL PersonId, which can never equal p.PersonId. So nothing matches and every row of (p) comes back with NULLs.
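To make the ON versus WHERE difference concrete, here is a small sketch. The tables are re-declared as table variables (@Person, @Address) purely so that the block runs on its own in a single batch; the NOT EXISTS form is my addition to show the intent of the second query ("persons without an address") written as an explicit anti-join.

```sql
DECLARE @Person TABLE (PersonId INT, Name VARCHAR(50));
DECLARE @Address TABLE (AddressId INT IDENTITY(1,1), PersonId INT);

INSERT INTO @Person (PersonId, Name) VALUES (1, 'John Doe'), (2, 'Jane Doe');
INSERT INTO @Address (PersonId) VALUES (1);

-- Filter in WHERE: the join is evaluated first, then only the
-- unmatched persons (NULL on the address side) are kept.
SELECT p.PersonId, p.Name
FROM @Person AS p
LEFT JOIN @Address AS a
    ON a.PersonId = p.PersonId
WHERE a.PersonId IS NULL;

-- Same intent as an anti-join: persons for whom no address row exists.
SELECT p.PersonId, p.Name
FROM @Person AS p
WHERE NOT EXISTS (SELECT 1
                  FROM @Address AS a
                  WHERE a.PersonId = p.PersonId);
```

Both statements return only Jane Doe. Moving the IS NULL test into the ON clause instead makes the join condition impossible to satisfy, which is why the first query in the question returns every person with NULL address columns.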

Outer SELECT from INSERT RETURNING and inner SELECT statement

I am trying to write a query that copies a row in a table to that same table, gives it a new sequential primary key, and associates it with a new foreign key. I need to associate the new primary key with another foreign key that's not inserted and exists in a different relational table (a lookup table).
I'd like to be able to do this as a single transaction, but I can't seem to find a way to associate the original row with the copied row as the unique id is new for the copied row. This is going to be a bit of a mouthful, but here's my specific question:
Can an outer SELECT clause enclose an inner INSERT with inner SELECT clause and RETURNING such that values from both the inner SELECT and the INSERT's RETURNING clause are selected and properly joined? Here's what I've attempted:
WITH batch_select AS (
SELECT id, owner_id, 1992 AS project_id
FROM batch
WHERE project_id = 1921
),
batch_insert AS (
INSERT INTO batch (owner_id, project_id)
SELECT bs.owner_id, bs.project_id
FROM batch_select bs
RETURNING id
)
SELECT bs.id AS origin_id, bi.id AS destination_id
FROM batch_select bs, batch_insert bi;
I need the origin_id to correspond to the destination_id. Obviously right now it's just a CROSS JOIN where everything is paired with everything and isn't very useful. I'd also be using the results of the last SELECT statement to run the INSERT into the lookup table, something like this (batch_join_select query could be implemented in the last insert, but has been left for clarity):
WITH batch_select AS (
SELECT id, owner_id, 1992 AS project_id
FROM batch
WHERE project_id = 1921
),
batch_insert AS (
INSERT INTO batch (owner_id, project_id)
SELECT bs.owner_id, bs.project_id
FROM batch_select bs
RETURNING id
),
batch_join_select AS (
SELECT bs.id AS origin_id, bi.id AS destination_id
FROM batch_select bs, batch_insert bi
)
INSERT INTO lookup_batch_container (batch_id, container_id)
SELECT bjs.destination_id, lbc.container_id
FROM batch_join_select bjs
INNER JOIN lookup_batch_container lbc ON lbc.batch_id = bjs.origin_id;
I found a similar question on the dba exchange, but the accepted answer doesn't correctly associate the two when there's more than one row.
Do I just have to do this with several transactions?
[EDIT] Adding some minimal schema:
Table lookup_batch_container
Column | Type | Modifiers
--------------+---------+-----------
batch_id | integer | not null
container_id | integer | not null
Indexes:
"lookup_batch_container_batch_id_container_id_key" UNIQUE CONSTRAINT, btree (batch_id, container_id)
Foreign-key constraints:
"lookup_batch_container_batch_id_fkey" FOREIGN KEY (batch_id) REFERENCES batch(id) ON DELETE CASCADE
"lookup_batch_container_container_id_fkey" FOREIGN KEY (container_id) REFERENCES container(id) ON DELETE CASCADE
Table batch
Column | Type | Modifiers
------------------+-----------------------------+------------------------------------------------------------------------------------
id | integer | not null default nextval('batch_id_seq'::regclass)
owner_id | integer | not null
project_id | integer | not null
Indexes:
"batch_pkey" PRIMARY KEY, btree (id)
Foreign-key constraints:
"batch_project_id_fkey" FOREIGN KEY (project_id) REFERENCES project(id) ON DELETE CASCADE
"batch_owner_id_fkey" FOREIGN KEY (owner_id) REFERENCES owner(id) ON DELETE CASCADE
Referenced by:
TABLE "lookup_batch_container" CONSTRAINT "lookup_batch_container_batch_id_fkey" FOREIGN KEY (batch_id) REFERENCES batch(id) ON DELETE CASCADE
Table container
Column | Type | Modifiers
-----------------------+-----------------------------+------------------------------------------------------------------------------
id | integer | not null default nextval('stirplate_source_file_container_id_seq'::regclass)
owner_id | integer | not null
status | container_status_enum | not null default 'new'::container_status_enum
name | text | not null
Indexes:
"container_pkey" PRIMARY KEY, btree (id)
Foreign-key constraints:
"container_owner_id_fkey" FOREIGN KEY (owner_id) REFERENCES owner(id) ON DELETE CASCADE
Referenced by:
TABLE "lookup_batch_container" CONSTRAINT "lookup_batch_container_container_id_fkey" FOREIGN KEY (container_id) REFERENCES container(id) ON DELETE CASCADE
with batch_select as (
select id, owner_id, 1992 as project_id
from batch
where project_id = 1921
), batch_insert as (
insert into batch (owner_id, project_id)
select owner_id, project_id
from batch_select
order by id
returning *
)
select unnest(oid) as oid, unnest(did) as did
from (
select
array_agg(distinct bs.id order by bs.id) as oid,
array_agg(distinct bi.id order by bi.id) as did
from
batch_select bs
inner join
batch_insert bi using (owner_id, project_id)
) s
;
oid | did
-----+-----
1 | 4
2 | 5
3 | 6
Given this batch table:
create table batch (
id serial primary key,
owner_id integer,
project_id integer
);
insert into batch (owner_id, project_id) values
(1,1921),(1,1921),(2,1921);
It would be simpler if the primary key were (id, owner_id, project_id), wouldn't it?
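An alternative that does not depend on insertion order, sketched under some assumptions: the new ids are drawn from the sequence up front, so each original row carries its own destination id through the whole statement. The sequence name batch_id_seq is taken from the schema in the question; supplying id explicitly in the INSERT and joining back on it are my choices, not something from the original attempts.

```sql
WITH batch_select AS (
    -- Reserve the new id for each source row before inserting,
    -- so origin and destination are paired by construction.
    SELECT id AS origin_id,
           nextval('batch_id_seq') AS destination_id,
           owner_id
    FROM batch
    WHERE project_id = 1921
),
batch_insert AS (
    INSERT INTO batch (id, owner_id, project_id)
    SELECT destination_id, owner_id, 1992
    FROM batch_select
    RETURNING id
)
-- Copy the container links from each original batch to its copy.
INSERT INTO lookup_batch_container (batch_id, container_id)
SELECT bs.destination_id, lbc.container_id
FROM batch_select bs
INNER JOIN batch_insert bi ON bi.id = bs.destination_id
INNER JOIN lookup_batch_container lbc ON lbc.batch_id = bs.origin_id;
```

Because everything happens in one statement, it also runs as a single transaction, which is what the question asked for.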
