Need to make NULL=Value evaluate to TRUE - SQL Server

I have a dimension table I'm trying to create that would require records with NULLs to be overwritten by a value when all other non-null fields match.
This logic works and shows what I mean by "null=Value evaluates to TRUE":
UPDATE A
SET
A.SSN = COALESCE(A.SSN, B.SSN)
,A.DOB = COALESCE(A.DOB, B.DOB)
,A.ID_1 = COALESCE(A.ID_1, B.ID_1)
,A.ID_2 = COALESCE(A.ID_2, B.ID_2)
,A.ID_3 = COALESCE(A.ID_3, B.ID_3)
,A.ID_4 = COALESCE(A.ID_4, B.ID_4)
FROM #TESTED1 A
INNER JOIN #TESTED1 B
ON (A.SSN = B.SSN
OR A.SSN IS NULL
OR B.SSN IS NULL)
AND (A.DOB = B.DOB
OR A.DOB IS NULL
OR B.DOB IS NULL)
AND (A.ID_1 = B.ID_1
OR A.ID_1 IS NULL
OR B.ID_1 IS NULL)
AND (A.ID_2 = B.ID_2
OR A.ID_2 IS NULL
OR B.ID_2 IS NULL)
AND (A.ID_3 = B.ID_3
OR A.ID_3 IS NULL
OR B.ID_3 IS NULL)
AND (A.ID_4 = B.ID_4
OR A.ID_4 IS NULL
OR B.ID_4 IS NULL)
WHERE A.ArbitraryTableID <> B.ArbitraryTableID
but it takes drastically longer as more records are evaluated (the growth is worse than linear): 10k records take 9 seconds, 100k records take 9 minutes, and so on. I'm trying to do an initial load of around 30 million records, and after that I will have to evaluate the entire table in a MERGE operation against another 10k records every day.
For example, I would need three rows that all exist in the same table to combine into two rows with all values populated.
Unfortunately, members can have multiple IDs, so I can't count on any one of these IDs being unique, or even existing at all, to cut down on my join conditions.

For the performance of this query, make sure you have an index covering all the columns you are joining on.
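For instance, a minimal sketch against the temp table from the question (the index name is mine):
CREATE NONCLUSTERED INDEX IX_TESTED1_JoinCols
    ON #TESTED1 (SSN, DOB, ID_1, ID_2, ID_3, ID_4);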
I did a quick example of what you described:
declare @test table (
    row_name nvarchar(50),
    id1 int null,
    id2 int null,
    id3 int null
)
insert into @test values ('row1', 1, 2, 3), ('row2', 1, 4, 5), ('row3', 11, null, null), ('row4', null, 4, null), ('row5', 3, 6, 5), ('row6', 3, null, null)
select *
from @test t1
inner join @test t2 on (
    (t1.id1 = t2.id1 or t1.id1 is null or t2.id1 is null)
    and (t1.id2 = t2.id2 or t1.id2 is null or t2.id2 is null)
    and (t1.id3 = t2.id3 or t1.id3 is null or t2.id3 is null)
)
where t1.row_name <> t2.row_name
order by t1.row_name
There are a couple of possible problems I see in my test output:
row3 and row4 in my example match even though they have no IDs in common. I'm guessing this is not desired, but if you really have several independent systems with different keys, isn't it possible that a lot of rows fall into this scenario? Every row with only id1 set and every row with only id2 set will match each other.
row1 and row4 do not match even though they should by transitivity (row1.id1 -> row2.id1, row2.id2 -> row4.id2).

Based on your response to my comment, I suggest the following solution:
a master record identifying the member/customer
child records for each master record storing the respective IDs
Replace your UPDATE statement with:
INSERTs into the master table for all records in table A that are guaranteed to be unique (e.g. by SSN)
INSERTs into the child table for all records in table A with non-NULL ID attributes
marking records in table A as processed by UPDATEing a foreign key column that references the master record's IDENTITY primary key
INSERTs into the child table for all records from A that you can safely assign to existing master records, again setting the FK
This solution would resolve the performance issues resulting from the multi-condition self-JOIN, and it also marks processed source records.
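A minimal sketch of that structure (all table and column names here are illustrative, not from the question):
-- Master record per member; table A's new FK column would reference this key
CREATE TABLE MemberMaster (
    MemberId INT IDENTITY(1,1) PRIMARY KEY,
    SSN CHAR(9) NULL,
    DOB DATE NULL
);
-- One child row per known identifier of a member
CREATE TABLE MemberIdentifier (
    MemberId INT NOT NULL REFERENCES MemberMaster (MemberId),
    IdType VARCHAR(10) NOT NULL,  -- e.g. 'ID_1' .. 'ID_4'
    IdValue VARCHAR(50) NOT NULL,
    PRIMARY KEY (IdType, IdValue, MemberId)
);
Looking up the master record for an incoming row then becomes an index seek on (IdType, IdValue) instead of a multi-column OR join.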

Related

SQL Server copy rows to second table

I have a table for bookings (table_b) that has around 1.3M rows. A second table (table_s) is used to note when these rows need to be accessed by a separate application.
Currently there are triggers that create a record in table_s, but this doesn't help with all the existing data.
I believe I need a query that selects the rows that exist in table_b but not in table_s and then inserts a row for each one.
Here is my current attempt, but I don't think it is formed correctly:
DECLARE @b_id [INT] = 0;
WHILE (1 = 1)
BEGIN
    SELECT TOP 10
        @b_id = MIN([b].[b_id])
    FROM
        [table_b] AS [b]
    LEFT JOIN
        [table_s] AS [s] ON [b].[b_id] = [s].[b_id]
    WHERE
        [s].[b_id] IS NULL;
    IF @b_id IS NULL
        BREAK;
    INSERT INTO [table_s] ([b_id], [processed])
    VALUES (@b_id, 0);
END;
Syntactically everything is fine, but there are some misconceptions in your query:
select top 10 @b_id = MIN(b.b_id)
A variable can hold just one value, so even though you SELECT TOP 10, a single value is assigned to the variable. Your current approach therefore loops once for every missing record.
For inserting 1 million records I don't think we need to split the insert into batches. Try it this way:
INSERT INTO table_s
(b_id,
processed)
SELECT b_id,
0
FROM table_b AS b
WHERE NOT EXISTS (SELECT 1
FROM table_s AS s
WHERE b.b_id = s.b_id)

postgresql: Insert two values in table b if both values are not in table a

I'm doing an assignment where I am to build an SQL database of tournament results. Players can be added by name, and when the database has two or more players who have not already been assigned to a match, two players should be matched against each other.
For instance, if the tables are currently empty and I add Joe as a player, and then also add James, the table now has two players who are not in the matches table, so a new row should be created in matches with their p_id values as left_player_P_id and right_player_P_id.
I thought it would be a good idea to create a function and a trigger, so that every time a row is added to the players table the SQL code would run and create the row in matches as needed. I am open to other ways of doing this.
I've tried multiple different approaches, including "SQL - Insert if the number of rows is greater than" and "Using IF ELSE statement based on Count to execute different Insert statements", but I am now at a loss.
Problematic code:
This approach returns a syntax error.
IF ((select count(*) from players_not_in_any_matches) >= 2)
begin
insert into matches values (
(select p_id from players_not_in_any_matches limit 1),
(select p_id from players_not_in_any_matches limit 1 offset 1)
)
end;
Alternative approach (still problematic code):
This approach seems more promising (but less readable). However, it inserts even if there are no rows returned inside the where not exists.
insert into matches (left_player_p_id, right_player_p_id)
select
(select p_id from players_not_in_any_matches limit 1),
(select p_id from players_not_in_any_matches limit 1 offset 1)
where not exists (
select * from players_not_in_any_matches offset 2
);
Tables
CREATE TABLE players (
p_id serial PRIMARY KEY,
full_name text
);
CREATE TABLE matches(
left_player_P_id integer REFERENCES players,
right_player_P_id integer REFERENCES players,
winner integer REFERENCES players
);
Views
-- view for getting all players not currently assigned to a match
create view players_not_in_any_matches as
select * from players
where p_id not in (
select left_player_p_id from matches
) and
p_id not in (
select right_player_p_id from matches
);
Try:
insert into matches (left_player_p_id, right_player_p_id)
select p1.p_id, p2.p_id
from players p1
join players p2
on p1.p_id <> p2.p_id
and not exists(
select 1 from matches m
where p1.p_id in (m.left_player_p_id, m.right_player_p_id)
)
and not exists(
select 1 from matches m
where p2.p_id in (m.left_player_p_id, m.right_player_p_id)
)
limit 1
Anti-joins (the not-exists operators) in the above query could be simplified a bit using LEFT JOINs:
insert into matches (left_player_p_id, right_player_p_id)
select p1.p_id, p2.p_id
from players p1
join players p2
  on p1.p_id <> p2.p_id
left join matches m1
  on p1.p_id in (m1.left_player_p_id, m1.right_player_p_id)
left join matches m2
  on p2.p_id in (m2.left_player_p_id, m2.right_player_p_id)
where m1.left_player_p_id is null
  and m2.left_player_p_id is null
limit 1
but in my opinion the former query is more readable, while the latter one looks tricky.
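Since the question mentions wanting a trigger, here is a minimal sketch of how the first query could be wired up (the function and trigger names are mine), reusing the players_not_in_any_matches view:
create or replace function pair_waiting_players() returns trigger as $$
begin
    -- pair up two waiting players, if at least two exist
    insert into matches (left_player_p_id, right_player_p_id)
    select p1.p_id, p2.p_id
    from players_not_in_any_matches p1
    join players_not_in_any_matches p2 on p1.p_id < p2.p_id
    limit 1;
    return new;
end;
$$ language plpgsql;

create trigger pair_players_after_insert
after insert on players
for each row execute procedure pair_waiting_players();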

conditional "next value for sequence"

scenario:
SQL Server 2012. A table named "Test" has two fields, "CounterNo" and "Value", both integers.
There are 4 sequence objects defined, named sq1, sq2, sq3, sq4.
I want to do these on inserts:
if CounterNo = 1 then Value = next value for sq1
if CounterNo = 2 then Value = next value for sq2
if CounterNo = 3 then Value = next value for sq3
My first thought was to create a custom function and assign it as the default value of the Value field, but custom functions do not support "next value for" sequence objects.
Another way is using a trigger, but that table already has a trigger.
Using a stored procedure for inserts would be the best way, but Entity Framework 5 Code-First does not support it.
Can you suggest a way to achieve this?
(If you can show me how to do it with custom functions, you can post that here as well; it's another question of mine.)
Update:
In reality there are 23 fields in that table, with primary keys set, and I'm currently generating this counter value on the software side using a "counter table". It is not good to generate counter values on the client side.
I'm using 4 sequence objects as counters because they represent different types of records.
If I used 4 counters on the same record at the same time, all of them would generate next values. I want only the related counter to generate its next value while the others remain the same.
I'm not sure if I fully understand your use case, but maybe the following sample illustrates what you need.
Create Table Vouchers (
Id uniqueidentifier Not Null Default NewId()
, Discriminator varchar(100) Not Null
, VoucherNumber int Null
-- ...
, MoreData nvarchar(100) Null
);
go
Create Sequence InvoiceSequence AS int Start With 1 Increment By 1;
Create Sequence OrderSequence AS int Start With 1 Increment By 1;
go
Create Trigger TR_Voucher_Insert_VoucherNumber On Vouchers After Insert As
If Exists (Select 1 From inserted Where Discriminator = 'Invoice')
Update v
Set VoucherNumber = Next Value For InvoiceSequence
From Vouchers v Inner Join inserted i On (v.Id = i.Id)
Where i.Discriminator = 'Invoice';
If Exists (Select 1 From inserted Where Discriminator = 'Order')
Update v
Set VoucherNumber = Next Value For OrderSequence
From Vouchers v Inner Join inserted i On (v.Id = i.Id)
Where i.Discriminator = 'Order';
go
Insert Into Vouchers (Discriminator, MoreData)
Values ('Invoice', 'Much')
, ('Invoice', 'More')
, ('Order', 'Data')
, ('Invoice', 'And')
, ('Order', 'Again')
;
go
Select * From Vouchers;
Now Invoice- and Order-Numbers will be incremented independently. And as you can have multiple insert triggers on the same table, that shouldn't be an issue.
I think you're thinking about this in the wrong way. You have 3 values, and which counter they belong to is determined by another column. Switch it around: create 3 columns and remove the CounterNo column.
If you have a table with Value1, Value2 and Value3, then the counter is implied by the column in which the value resides. Create a unique index on these three columns, add an identity column for a primary key, and you're sorted; you can do it all in a stored procedure easily.
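A sketch of that restructuring (table and column names are illustrative):
-- One column per counter; the counter number is implied by the column
CREATE TABLE Test (
    TestId INT IDENTITY(1,1) PRIMARY KEY,
    Value1 INT NULL,
    Value2 INT NULL,
    Value3 INT NULL
);
CREATE UNIQUE INDEX UX_Test_Values ON Test (Value1, Value2, Value3);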
If you have four different types of records, use four different tables, with a separate identity column in each one.
If you need to see all the data together, then use a view to combine them:
create view v_AllTypes as
select * from type1 union all
select * from type2 union all
select * from type3 union all
select * from type4;
Alternatively, do the calculation of the sequence number on output:
select t.*,
row_number() over (partition by CounterNo order by t.id) as TypeSeqNum
from v_AllTypes t;
Something seems amiss with your data model if it requires conditional updates to four identity columns.

Better way to join null values than picking a magic value?

I need to join two tables that are more or less the same (one is a staging table whose data goes into the other).
Some of the columns are nullable, and when the values are null, the join in my merge statement does not match. (This is normal behavior for nulls.)
The problem is that when they don't match, the row is deleted and recreated, changing the identity value of the row in the actual table.
I know that I can do something like this to join nulls:
on coalesce(target.SomeId, -9999) = coalesce(source.SomeId, -9999)
But I don't like having to pick out a number that I hope will never be used. (It feels dirty.)
Is there a better way to make a join on a nullable column than using a magic number like this?
Let's go with this:
target.SomeId = source.SomeId
or (target.SomeId is null and source.SomeId is null)
Conceptually, this should make sense: either both values are null, or both values are equal to each other. This should also perform better, as the coalesce on the join column forces a table scan. I've converted the coalesce style to the one above and seen tremendous performance gains.
I almost exclusively use the following pattern
ON EXISTS (SELECT target.SomeId INTERSECT SELECT source.SomeId)
after picking it up from Paul White's blog post here.
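For context, here is a sketch of that pattern inside a MERGE (table and column names are assumptions, not from the question):
-- INTERSECT treats NULLs as equal, so the join matches NULL-to-NULL without a magic value
MERGE dbo.Target AS t
USING dbo.Staging AS s
    ON EXISTS (SELECT t.SomeId INTERSECT SELECT s.SomeId)
WHEN MATCHED THEN
    UPDATE SET t.SomeValue = s.SomeValue
WHEN NOT MATCHED THEN
    INSERT (SomeId, SomeValue) VALUES (s.SomeId, s.SomeValue);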
ON ((target.SomeId IS NULL AND source.SomeId IS NULL)
    OR (target.SomeId IS NOT NULL AND source.SomeId IS NOT NULL
        AND target.SomeId = source.SomeId))
Assuming you mean columns that aren't part of the joining key, then
...and (isnull(source.ColX, target.ColX) = isnull(target.ColX, source.ColX)
or (source.ColX is null and target.ColX is null))
should cover all the possibilities: the first line catches the cases where both values are not null or only one value is not null, and the second line catches the case where both are null. Pretty ugly, but then that's what happens when you get too many nulls in your system.
The result is strange: it contains rows resulting from the INNER JOIN plus rows resulting from a CROSS JOIN between the NULL IDs of the source and target tables (e.g. {S-C, T-C}, {S-C, T-D}, {S-C, T-E}, {S-D, T-C}, {S-D, T-D}, {S-D, T-E}).
Look at this example:
DECLARE @Source TABLE (SomeId INT NULL, Name VARCHAR(10) NOT NULL);
DECLARE @Target TABLE (SomeId INT NULL, Name VARCHAR(10) NOT NULL);
INSERT @Source VALUES (1,'S-A'),(2,'S-B'),(NULL,'S-C'),(NULL,'S-D');
INSERT @Target VALUES (1,'T-A'),(2,'T-B'),(NULL,'T-C'),(NULL,'T-D'),(NULL,'T-E'),(6,'T-F');
SELECT s.*, t.*
FROM @Source s
INNER JOIN @Target t ON s.SomeId = t.SomeId OR (s.SomeId IS NULL AND t.SomeId IS NULL);
SELECT s.*, t.*
FROM @Source s
INNER JOIN @Target t ON ISNULL(s.SomeId, -9999) = ISNULL(t.SomeId, -9999);
Results:
SomeId Name SomeId Name
----------- ---------- ----------- ----------
1 S-A 1 T-A <- INNER JOIN
2 S-B 2 T-B <- INNER JOIN
NULL S-C NULL T-C <- "CROSS JOIN"
NULL S-C NULL T-D <- "CROSS JOIN"
NULL S-C NULL T-E <- "CROSS JOIN"
NULL S-D NULL T-C <- "CROSS JOIN"
NULL S-D NULL T-D <- "CROSS JOIN"
NULL S-D NULL T-E <- "CROSS JOIN"
You could use a special character as the sentinel value. Otherwise, you can try:
on (target.SomeId is null OR source.SomeId is null OR target.SomeId = source.SomeId)

T-SQL finding of exactly same values in referenced table

Let's assume I have 3 tables in my SQL Server 2008 database:
CREATE TABLE [dbo].[Properties](
[PropertyId] [int] NOT NULL,
[PropertyName] [nvarchar](50) NOT NULL
)
CREATE TABLE [dbo].[Entities](
[EntityId] [int] NOT NULL,
[EntityName] [nvarchar](50) NOT NULL
)
CREATE TABLE [dbo].[PropertyValues](
[EntityId] [int] NOT NULL,
[PropertyId] [int] NOT NULL,
[PropertyValue] [int] NOT NULL
)
The Properties table contains the set of possible properties whose values can be configured for business objects.
The Entities table contains the business objects which are configured from the app.
The PropertyValues table contains the selected property values for the business objects. Each business object can have its own set of properties (i.e. "Property1" can be configured for the first object but not for the second one).
My task is to find business objects which are exactly same as given object (ones which have exactly same set of properties with exactly same values). Performance is critical.
Any suggestions?
[ADDED]
For example, there is an entry in the Entities table with EntityId = 1. In the PropertyValues table there are 3 rows related to this entry:
EntityId PropertyId PropertyValue
1 4 Val4
1 5 Val5
1 6 Val6
The requirement is to find other entries in the Entities table which have 3 related rows in the PropertyValues table, where those rows contain the same data as the rows for EntityId = 1 (apart from the EntityId column).
[ADDED]
Please, see my new question: Best approach to store data which attributes can vary
[BOUNTY1]
Thanks to all; the answers were very helpful. My task has been complicated a little (but the complication can be useful for performance purposes). Please see the details below:
The new table named EntityTypes is added
EntityTypeId column has been added into Entities and Properties tables
Now there are several types of entities, and each entity type has its own set of properties.
Is it possible to increase performance using this information?
[BOUNTY2]
There is the second complication:
An IsDeleted column is added to the Properties table
The PropertyValues table can have values for properties which have already been deleted from the database. Entities which have such properties are considered invalid.
Some entities don't have values for every property of their EntityType's set. These entities are also considered invalid.
The question is: how can I write a script which will select all Entities together with an additional IsValid column for them?
;with cteSource as
(
select PropertyId,
PropertyValue
from PropertyValues
where EntityId = @EntityID
)
select PV.EntityId
from PropertyValues as PV
inner join cteSource as S
on PV.PropertyId = S.PropertyId and
PV.PropertyValue = S.PropertyValue and
PV.EntityId <> @EntityID
group by PV.EntityId
having count(*) = (select count(*)
from cteSource) and
count(*) = (select count(*)
from PropertyValues as PV1
where PV1.EntityId = PV.EntityId)
For your addition you can add this where clause:
where -- exclude entities with deleted properties
PV.EntityID not in (select PV2.EntityID
from Properties as P
inner join PropertyValues as PV2
on P.PropertyID = PV2.PropertyID
where P.IsDeleted = 1)
-- exclude entities with missing EntityType
and PV.EntityID not in (select E.EntityID
from Entities as E
where E.EntityTypeId is null)
Edit:
If you want to test the query against some sample data you can do so here:
https://data.stackexchange.com/stackoverflow/q/110243/matching-properties
My task is to find business objects which are exactly same as given object (ones which have exactly same set of properties with exactly same values). Performance is critical.
The approach might vary depending on the average number of properties objects will typically have: a few versus dozens.
Assuming that objects have a varying number of properties:
I would start with a composite non-unique index on the dyad (PropertyValues.PropertyId, PropertyValues.PropertyValue) for select-performance.
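For instance (the index name is mine):
CREATE NONCLUSTERED INDEX IX_PropertyValues_Property_Value
    ON PropertyValues (PropertyId, PropertyValue);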
Then, given an entity ID, I would select its (propertyid, propertyvalue) pairs into a cursor.
[EDIT:
Not sure whether (entityid, propertyid) is unique in your system or if you are allowing multiple instances of the same property id for an entity, e.g. FavoriteColors:
entityid propertyid property value
1 17 blue
1 17 dark blue
1 17 sky blue
1 17 ultramarine
You would also need either a non-unique index on the monad (PropertyValues.entityid) or a composite index on (PropertyValues.entityid,PropertyValues.propertyid); the composite index would be unique if you wanted to prevent the same propertyid from being associated with an entity more than once.
If a property can occur multiple times, you should probably have a CanBeMultivalued flag in your Properties table. You should have a unique index on the triad (entityid, propertyid, propertyvalue) if you wanted to prevent this:
entityid propertyid property value
1 17 blue
1 17 blue
If you have this triad indexed, you would not need (entityid) index or the (entityid, propertyid) composite index in the PropertyValues table.
[/EDIT]
Then I would create a temp table to store matching entity ids.
Then I would iterate my cursor above to grab the given entity's propertyid, propertyvalue pairs, one pair at a time, and issue a select statement with each iteration:
insert into temp
select entityid from PropertyValues
where propertyid = mycursor.propertyid and propertyvalue = mycursor.propertyvalue
At the end of the loop, you have a non-distinct set of entityids in your temp table for all entities that had at least one of the properties in common with the given object. But the ones you want must have all properties in common.
Since you know how many properties the given object has, you can do the following to fetch only those entities that have all of the properties in common with the given object:
select entityid from temp
group by entityid having count(entityid) = {the number of properties in the given object}
ADDENDUM:
After the first property-value pair of the given object is used to select all potential matches, your temp table will not be missing any possible matches; rather, it will contain entityids that were not perfect matches, which must be discarded in some manner, either by being ignored (by your group by ... having clause) or by being explicitly removed from the temp table.
Also, after the first iteration of the loop, you could explore the possibility that an inner join between the temp table and the PropertyValues table might offer some performance gain:
select entityid from propertyvalues
>> inner join temp on temp.entityid = propertyvalues.entityid <<
where propertyid = mycursor.propertyid and propertyvalue = mycursor.propertyvalue
And you might also try removing entityids from temp after the first iteration:
delete from temp
where not exists
(
select entityid from propertyvalues
inner join temp on temp.entityid = propertyvalues.entityid
where propertyid = mycursor.propertyid and propertyvalue = mycursor.propertyvalue
)
Alternatively, this looping approach could be optimized further by storing some metadata about property frequency. Optimally, when looking for matches for a given entity, you'd want to begin with the least frequently occurring property-value pair. You could order the given object's property-value pairs by ascending frequency, so that in your loop you'd be looking for the rarest one first. That would reduce the set of potential matches to its smallest possible size on the first iteration of the loop.
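A sketch of that ordering (the @GivenEntityId parameter is illustrative):
-- rank the given entity's pairs from rarest to most common across the whole table
SELECT pv.PropertyId, pv.PropertyValue, f.Freq
FROM PropertyValues AS pv
JOIN (SELECT PropertyId, PropertyValue, COUNT(*) AS Freq
      FROM PropertyValues
      GROUP BY PropertyId, PropertyValue) AS f
    ON f.PropertyId = pv.PropertyId
   AND f.PropertyValue = pv.PropertyValue
WHERE pv.EntityId = @GivenEntityId
ORDER BY f.Freq;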
Of course, if temp were empty at any time after the given object's first property-value pair was used to look for matches, you would know that there are no matches for your given object, because you have found a property-value that no other entity possesses, and you could exit the loop and return a null set.
One way to look at this: if I have all the baseball cards you have, we don't necessarily have the same collection, as I may have more. But if you also have all the baseball cards that I have, then we have exactly the same cards. This case is a little more complex, since we are comparing sets per entity. We could compute the match count, my count, and your count and compare those 3 counts, but that takes 3 joins. The solution below uses 2 joins, and I think it would be faster than the 3-join option.
The bonus questions did not make sense to me: they mention a change to a table, but that table name does not match any of the tables given. A full table description is needed for those bonus questions.
Below is the 2-join option:
select [m1].[IDa] as [EntityId1], [m1].[IDb] as [EntityId2]
from
    ( select [PV1].[EntityId] as [IDa], [PV2].[EntityId] as [IDb]
      from [PropertyValues] as [PV1]
      left outer join [PropertyValues] as [PV2]
          on [PV2].[EntityId] <> [PV1].[EntityId]
         and [PV2].[PropertyId] = [PV1].[PropertyId]
         and [PV2].[PropertyValue] = [PV1].[PropertyValue]
      group by [PV1].[EntityId], [PV2].[EntityId]
      having count(*) = count([PV2].[EntityId])
    ) as [m1]
join
    ( select [PV1].[EntityId] as [IDa], [PV2].[EntityId] as [IDb]
      from [PropertyValues] as [PV1]
      right outer join [PropertyValues] as [PV2]
          on [PV2].[EntityId] <> [PV1].[EntityId]
         and [PV2].[PropertyId] = [PV1].[PropertyId]
         and [PV2].[PropertyValue] = [PV1].[PropertyValue]
      group by [PV1].[EntityId], [PV2].[EntityId]
      having count(*) = count([PV1].[EntityId])
    ) as [m2]
    on [m1].[IDa] = [m2].[IDa] and [m1].[IDb] = [m2].[IDb]
Below is the 3-join count-based option:
select [m1].[IDa] as [EntityId1], [m1].[IDb] as [EntityId2]
from
    ( select [PV1].[EntityId] as [IDa], [PV2].[EntityId] as [IDb], COUNT(*) as [count]
      from [PropertyValues] as [PV1]
      join [PropertyValues] as [PV2]
          on [PV2].[EntityId] <> [PV1].[EntityId]
         and [PV2].[PropertyId] = [PV1].[PropertyId]
         and [PV2].[PropertyValue] = [PV1].[PropertyValue]
      group by [PV1].[EntityId], [PV2].[EntityId]
    ) as [m1]
join
    ( select [PV1].[EntityId] as [IDa], COUNT(*) as [count]
      from [PropertyValues] as [PV1]
      group by [PV1].[EntityId]
    ) as [m2]
    on [m1].[IDa] = [m2].[IDa] and [m1].[count] = [m2].[count]
join
    ( select [PV2].[EntityId] as [IDb], COUNT(*) as [count]
      from [PropertyValues] as [PV2]
      group by [PV2].[EntityId]
    ) as [m3]
    on [m1].[IDb] = [m3].[IDb] and [m1].[count] = [m3].[count]
My task is to find business objects which are exactly same as given
object (ones which have exactly same set of properties with exactly
same values).
if the "given objec"t is described as e.g. #PropertyValues, so the query would be:
create table #PropertyValues(
[PropertyId] [int] NOT NULL,
[PropertyValue] [int] NOT NULL
)
insert #PropertyValues
select
3, 3 -- e.g.
declare @cnt int
select @cnt = count(*) from #PropertyValues
select
    EntityId
from
    PropertyValues pv
    left join #PropertyValues t on t.PropertyId = pv.PropertyId and t.PropertyValue = pv.PropertyValue
group by
    EntityId
having
    count(t.PropertyId) = @cnt
    and count(pv.PropertyId) = @cnt
drop table #PropertyValues
But if performance is that critical, you can create a special indexed field on the Entities table, e.g. EntityIndex varchar(8000), filled by a trigger on the PropertyValues table with convert(char(10), PropertyId) + convert(char(10), PropertyValue) concatenated for all properties of the entity (sorted!). This makes a very fast seek on that field possible.
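A sketch of the concatenation such a trigger could maintain, using FOR XML PATH (available on SQL Server 2008); the EntityIndex alias is illustrative:
SELECT e.EntityId,
       (SELECT CONVERT(char(10), pv.PropertyId) + CONVERT(char(10), pv.PropertyValue)
        FROM PropertyValues AS pv
        WHERE pv.EntityId = e.EntityId
        ORDER BY pv.PropertyId, pv.PropertyValue
        FOR XML PATH('')) AS EntityIndex
FROM Entities AS e;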
I think this is just a simple self-join:
select P2.EntityID,E.EntityName
from PropertyValues P1
inner join PropertyValues P2
on P1.PropertyID = P2.PropertyID
and P1.PropertyValue = P2.PropertyValue
inner join Entities E
on P2.EntityID = E.EntityID
where P1.EntityId = 1
and P2.EntityId <> 1
group by P2.EntityID, E.EntityName
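As written, this returns any entity sharing at least one property value with entity 1. To require an exact match, count comparisons could be added along the lines of the CTE answer above; a sketch:
select P2.EntityId, E.EntityName
from PropertyValues P1
inner join PropertyValues P2
    on P1.PropertyId = P2.PropertyId
   and P1.PropertyValue = P2.PropertyValue
inner join Entities E
    on P2.EntityId = E.EntityId
where P1.EntityId = 1
  and P2.EntityId <> 1
group by P2.EntityId, E.EntityName
having count(*) = (select count(*) from PropertyValues where EntityId = 1)
   and count(*) = (select count(*) from PropertyValues pv where pv.EntityId = P2.EntityId)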
