SQL Server select join detect if common column between two tables are different - sql-server

I am trying to write a function to check between two tables which have a common column with the same name and ID values.
Table 1: CompanyRecords
CompanyRecordsID CompanyId CompanyName CompanyProcessID
-----------------------------------------------------------
1 222 Sears 123
2 333 JCPenny 456
Table 2: JointCompanies
JointCompaniesID CompanyId CompanyName ComanyProcessID
-----------------------------------------------------------
3 222 KMart 123
4 444 Walmart 001
They both use the same foreign key CompanyProcessID with value 123.
How do I write a select statement when it is passed the CompanyProcessID to tell if the CompanyId has changed for the same CompanyProcessId.
I assume it is a join between the two tables with WHERE CompanyProcessID
Thanks for any help.

Is this what you want?
select max(case when cr.name = jc.name then 0 else 1 end) as name_not_same
from CompanyRecords cr join
JointCompanies jc
on cr.ComanyProcessID = jc.ComanyProcessID
where cr.ComanyProcessID = ?

Related

SQL Check to see if any value in a list exists in another table

I have a temp table that looks something like this:
Record DepartmentId PositionId EmployeeId StatusId CustomerId
1 Null Null Null 4
2 7 454 Null Null
3 Null 454 Null 3
3 Null Null Null Null 214
3 Null Null Null Null 100
3 Null Null Null Null 312
4 Null Null Null Null 357
I inserted the above into the temp table from tables that looked like this:
Record Table Record-to-Department Record-to-Position
Record Name Record DepartmentId Record PositionId
1 Red 2 7 2 454
2 blue 3 454
3 Green
4 Purple
Record-To-Status Record-To-Customer
Record StatusId Record CustomerId
1 4 3 214
3 3 3 100
3 312
4 357
I have an employee whose record looks something like this:
EmployeeId DepartmentId PositionId StatusId
342 7 454 4
Employee Customers:
EmployeeId CustomerId
342 357
342 95
342 720
In this scenario, it would return Record 1 (because it matches the StatusId), Record 2 (because it matches both the DepartmentId and the PositionId), but it would not return Record 3 because it only matches the PositionId and not the StatusId, and it would return RecordId 4 because one of the Employee CustomerIds matches the CustomerId on Record 4.
I got part of this answer on another question enter link description here (please forgive me I am new and trying to figure out how to ask everything I need to know), but I can't figure out how to handle the multi-records.
I tried selecting the Employees customer Id's into a table variable and then attempted to use the Coalesce like this:
Declare #Customers table(CustomerId int)
INSERT INTO #Customers(CustomerId)
SELECT DISTINCT S.CustomerId
FROM employee_Customers
Select * from tbl
WHERE
COAlesce(StatusId,#StatusId)=#StatusId AND
COALESCE(DepartmentId,#DepartmentId)=#DepartmentId AND
Coalesce(PositionId,#PositionId)=#PositionId AND
Coalesce(EmployeeCompanyId,#EmployeeCompanyId) = #EmployeeCompanyId AND
COALESCE((Select CustomerId from tbl_Requirement_to_Customer),(Select CustomerId from #Customers)) = (Select CustomerId from #Customers)
But I receive the error "Subquery Returned more than 1 value".
I have a possible solution you can try. I don't think it will be plug-and-play but hopefully you can adapt it to your situation. I am using just the data as presented in your temp table, your employee record and your Employee-customers correlation.
The basic logic is to join your temp table to the employee(s) using or condition, but then to get a count of populated values, and compare this count to a count of the number of matching values, which must be at least the first count and greater than zero.
This returns your desired output:
select t.*
from Temp t
left join emp e on e.DepartmentId=t.DepartmentId or e.PositionId=t.PositionId or e.EmployeeId=t.EmployeeId or e.StatusId=t.StatusId
outer apply (
select case when exists (
select * from EmployeeCustomers ec join emp e on e.EmployeeId=ec.EmployeeId where ec.CustomerID=t.CustomerId
) then 1 else 0 end CustomerIdMatch
)c
outer apply (
values (
Iif(t.departmentId is null,0,1) +
Iif(t.PositionId is null,0,1) +
Iif(t.EmployeeId is null,0,1) +
Iif(t.StatusId is null,0,1) +
c.CustomerIdMatch
))x(Cnt)
outer apply (
values (
Iif(t.departmentId=e.DepartmentId,1,0) +
Iif(t.PositionId=e.PositionId,1,0) +
Iif(t.EmployeeId=e.EmployeeId,1,0) +
Iif(t.StatusId=e.StatusId,1,0) +
c.CustomerIdMatch
))y(Cnt2)
where cnt2>=cnt and cnt2>0
See working DB<>Fiddle

SQL Server query involving subqueries - performance issues

I have three tables:
Table 1: | dbo.pc_a21a22 |
batchNbr Other columns...
-------- ----------------
12345
12346
12347
Table 2: | dbo.outcome |
passageId record
---------- ---------
00003 200
00003 9
00004 7
Table 3: | dbo.passage |
passageId passageTime batchNbr
---------- ------------- ---------
00001 2015.01.01 12345
00002 2016.01.01 12345
00003 2017.01.01 12345
00004 2018.01.01 12346
What I want to do: for each batchNbr in Table 1 get first its latest passageTime and the corresponding passageID from Table 3. With that passageID, get the relevant rows in Table 2 and establish whether any of these rows contains the record 200. Per passageId there are at most 2 records in Table 2
What is the most efficient way to do this?
I have already created a query that works, but it's awfully slow and thus unfit for tables with millions of rows. Any suggestion on how to either change the query or do it another way? Altering the table structure is not an option, I only have read rights to the database.
My current solution (slow):
SELECT TOP 50000
a.batchNbr,
CAST ( CASE WHEN 200 in (SELECT TOP 2 record FROM dbo.outcome where passageId in (
SELECT SubqueryResults.passageId From (SELECT Top 1 passageId FROM dbo.passage pass WHERE pass.batchNbr = a.batchNbr ORDER BY passageTime Desc) SubqueryResults
)
) then 1 else 0 end as bit) as KGT_IO_END
FROM dbo.pc_a21a22 a
The desired output is:
batchNbr 200present
--------- ----------
12345 1
12346 0
I suggest you use table joining rather than subqueries.
select
a.*, b.*
from
dbo.table1 a
join
dbo.table2 b on a.id = b.id
where
/*your where clause for filtering*/
EDIT:
You could use this as a reference Join vs. sub-query
Try this
SELECT TOP 50000 a.*, (CASE WHEN b.record = 200 THEN 1 ELSE 0 END) AS
KGT_IO_END
FROM dbo.Test1 AS a
LEFT OUTER JOIN
(SELECT record, p.batchNbr
FROM dbo.Test2 AS o
LEFT OUTER JOIN (SELECT MAX(passageId) AS passageId, batchNbr FROM
dbo.Test3 GROUP BY batchNbr) AS p ON o.passageId = p.passageId
) AS b ON a.batchNbr = b.batchNbr;
The MAX subquery is to get the latest passageId by batchNbr.
However, your example won't get the record 200, since the passageId of the record with 200 is 00001, while the latest passageId of the batchNbr 12345 is 00003.
I used LEFT OUTER JOIN since the passageId from Table2 no longer match any of the latest passageId from Table3. The resulting subquery would have no records to join to Table1. Therefore INNER JOIN would not show any records from your example data.
Output from your example data:
batchNbr KGT_IO_END
12345 0
12346 0
12347 0
Output if we change the passageId of record 200 to 00003 (the latest for 12345)
batchNbr KGT_IO_END
12345 1
12346 0
12347 0

Getting sub products from same table

I have two tables, table1 has end products and product parts, table2 has a mapping between table1.productid and partsid. i.e.
table1
--------------------------------------------
productid descrip code cost ....etc
1235 product A 07 12.5 ......
789 labor 03 2.5 ....
839 part1 03 5 ....
and table2 looks like
table2
--------------------------------------------
productid partsID quantity
1235 789 1
1235 839 2
2341 2315 2
.....
I need to select the end product that meets specific code and go pull up the parts from the part table and display like below:
Resuls
--------------------------------------------
productid descrip code partsID cost ....etc
1235 product A 07 789 2.5 ......
1235 product A 03 839 5 ....
.......
Basically for each end product, line up the parts to the right and cost and all other details associated with the parts and not the end product.
Appreciate any help with this.
Could you not join the Two tables across ProductID field:
SELECT * FROM
dbo.TABLE1 o INNER JOIN
dbo.TABLE2 t ON o.ProductId = t.ProductId
WHERE o.code = 7
OR o.code = 3
etc ... whatever additional criteria you need
If you also need to Join the partsID you could say
AND o.PartId = t.partsId
It looks like you want to get something like this query, but I'm not sure about the code=07, you've specified in the example. Probably the code must be 03 for both parts.
SELECT
prd.productid,
prd.descrip,
prt.code,
prdprt.partsID,
prt.cost,
....etc
FROM table1 prd
JOIN table2 prdprt ON prd.productid=prdprt.productid
JOIN table1 prt ON prdprt.partsID=prt.productid
WHERE prd.code in ('07', ...othecodes) --if you filter by product's code
OR prt.code in ('03', ...othecodes) --if you filter by parts's code

SQL Server - Update Column with Handing Duplicate and Unique Rows Based Upon Timestamp

I'm working with SQL Server 2005 and looking to export some data off of a table I have. However, prior to do that I need to update a status column based upon a field called "VisitNumber", which can contain multiple entries same value entries. I have a table set up in the following manner. There are more columns to it, but I am just putting in what's relevant to my issue
ID Name MyReport VisitNumber DateTimeStamp Status
-- --------- -------- ----------- ----------------------- ------
1 Test John Test123 123 2014-01-01 05.00.00.000
2 Test John Test456 123 2014-01-01 07.00.00.000
3 Test Sue Test123 555 2014-01-02 08.00.00.000
4 Test Ann Test123 888 2014-01-02 09.00.00.000
5 Test Ann Test456 888 2014-01-02 10.00.00.000
6 Test Ann Test789 888 2014-01-02 11.00.00.000
Field Notes
ID column is a unique ID in incremental numbers
MyReport is a text value and can actually be thousands of characters. Shortened for simplicity. In my scenario the text would be completely different
Rest of fields are varchar
My Goal
I need to address putting in a status of "F" for two conditions:
* If there is only one VisitNumber, update the status column of "F"
* If there is more than one visit number, only put "F" for the one based upon the earliest timestamp. For the other ones, put in a status of "A"
So going back to my table, here is the expectation
ID Name MyReport VisitNumber DateTimeStamp Status
-- --------- -------- ----------- ----------------------- ------
1 Test John Test123 123 2014-01-01 05.00.00.000 F
2 Test John Test456 123 2014-01-01 07.00.00.000 A
3 Test Sue Test123 555 2014-01-02 08.00.00.000 F
4 Test Ann Test123 888 2014-01-02 09.00.00.000 F
5 Test Ann Test456 888 2014-01-02 10.00.00.000 A
6 Test Ann Test789 888 2014-01-02 11.00.00.000 A
I was thinking I could handle this by splitting each types of duplicates/triplicates+ (2,3,4,5). Then updating every other (or every 3,4,5 rows). Then delete those from the original table and combine them together to export the data in SSIS. But I am thinking there is a much more efficient way of handling it.
Any thoughts? I can accomplish this by updating the table directly in SQL for this status column and then export normally through SSIS. Or if there is some way I can manipulate the column for the exact conditions I need, I can do it all in SSIS. I am just not sure how to proceed with this.
WITH cte AS
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY VisitNumber ORDER BY DateTimeStamp) rn from MyTable
)
UPDATE cte
SET [status] = (CASE WHEN rn = 1 THEN 'F' ELSE 'A' END)
I put together a test script to check the results. For your purposes, use the update statements and replace the temp table with your table name.
create table #temp1 (id int, [name] varchar(50), myreport varchar(50), visitnumber varchar(50), dts datetime, [status] varchar(1))
insert into #temp1 (id,[name],myreport,visitnumber, dts) values (1,'Test John','Test123','123','2014-01-01 05:00')
insert into #temp1 (id,[name],myreport,visitnumber, dts) values (2,'Test John','Test456','123','2014-01-01 07:00')
insert into #temp1 (id,[name],myreport,visitnumber, dts) values (3,'Test Sue','Test123','555','2014-01-01 08:00')
insert into #temp1 (id,[name],myreport,visitnumber, dts) values (4,'Test Ann','Test123','888','2014-01-01 09:00')
insert into #temp1 (id,[name],myreport,visitnumber, dts) values (5,'Test Ann','Test456','888','2014-01-01 10:00')
insert into #temp1 (id,[name],myreport,visitnumber, dts) values (6,'Test Ann','Test789','888','2014-01-01 11:00')
select * from #temp1;
update #temp1 set status = 'F'
where id in (
select id from #temp1 t1
join (select min(dts) as mindts, visitnumber
from #temp1
group by visitNumber) t2
on t1.visitnumber = t2.visitnumber
and t1.dts = t2.mindts)
update #temp1 set status = 'A'
where id not in (
select id from #temp1 t1
join (select min(dts) as mindts, visitnumber
from #temp1
group by visitNumber) t2
on t1.visitnumber = t2.visitnumber
and t1.dts = t2.mindts)
select * from #temp1;
drop table #temp1
Hope this helps

SQL query like GROUP BY with OR condition

I'll try to describe the real situation. In our company we have a reservation system with a table, let's call it Customers, where e-mail and phone contacts are saved with each incoming order - that's the part of a system I can't change. I'm facing the problem how to get count of unique customers. With the unique customer I mean group of people who has either the same e-mail or same phone number.
Example 1: From the real life you can imagine Tom and Sandra who are married. Tom, who ordered 4 products, filled in our reservation system 3 different e-mail addresses and 2 different phone numbers when one of them shares with Sandra (as a homephone) so I can presume they are connected somehow. Sandra except this shared phone number filled also her private one and for both orders she used only one e-mail address. For me this means to count all of the following rows as one unique customer. So in fact this unique customer may grow up into the whole family.
ID E-mail Phone Comment
---- ------------------- -------------- ------------------------------
0 tom#email.com +44 111 111 First row
1 tommy#email.com +44 111 111 Same phone, different e-mail
2 thomas#email.com +44 111 111 Same phone, different e-mail
3 thomas#email.com +44 222 222 Same e-mail, different phone
4 sandra#email.com +44 222 222 Same phone, different e-mail
5 sandra#email.com +44 333 333 Same e-mail, different phone
As ypercube said I will probably need a recursion to count all of these unique customers.
Example 2: Here is the example of what I want to do.Is it possible to get count of unique customers without using recursion for instance by using cursor or something or is the recursion necessary ?
ID E-mail Phone Comment
---- ------------------- -------------- ------------------------------
0 linsey#email.com +44 111 111 ─┐
1 louise#email.com +44 111 111 ├─ 1. unique customer
2 louise#email.com +44 222 222 ─┘
---- ------------------- -------------- ------------------------------
3 steven#email.com +44 333 333 ─┐
4 steven#email.com +44 444 444 ├─ 2. unique customer
5 sandra#email.com +44 444 444 ─┘
---- ------------------- -------------- ------------------------------
6 george#email.com +44 555 555 ─── 3. unique customer
---- ------------------- -------------- ------------------------------
7 xavier#email.com +44 666 666 ─┐
8 xavier#email.com +44 777 777 ├─ 4. unique customer
9 xavier#email.com +44 888 888 ─┘
---- ------------------- -------------- ------------------------------
10 robert#email.com +44 999 999 ─┐
11 miriam#email.com +44 999 999 ├─ 5. unique customer
12 sherry#email.com +44 999 999 ─┘
---- ------------------- -------------- ------------------------------
----------------------------------------------------------------------
Result ∑ = 5 unique customers
----------------------------------------------------------------------
I've tried a query with GROUP BY but I don't know how to group the result by either first or second column. I'm looking for let's say something like
SELECT COUNT(*) FROM Customers
GROUP BY Email OR Phone
Thanks again for any suggestions
P.S.
I really appreciate the answers for this question before the complete rephrase. Now the answers here may not correspond to the update so please don't downvote here if you're going to do it (except the question of course :). I completely rewrote this post.Thanks and sorry for my wrong start.
Here is a full solution using a recursive CTE.
;WITH Nodes AS
(
SELECT DENSE_RANK() OVER (ORDER BY Part, PartRank) SetId
, [ID]
FROM
(
SELECT [ID], 1 Part, DENSE_RANK() OVER (ORDER BY [E-mail]) PartRank
FROM dbo.Customer
UNION ALL
SELECT [ID], 2, DENSE_RANK() OVER (ORDER BY Phone) PartRank
FROM dbo.Customer
) A
),
Links AS
(
SELECT DISTINCT A.Id, B.Id LinkedId
FROM Nodes A
JOIN Nodes B ON B.SetId = A.SetId AND B.Id < A.Id
),
Routes AS
(
SELECT DISTINCT Id, Id LinkedId
FROM dbo.Customer
UNION ALL
SELECT DISTINCT Id, LinkedId
FROM Links
UNION ALL
SELECT A.Id, B.LinkedId
FROM Links A
JOIN Routes B ON B.Id = A.LinkedId AND B.LinkedId < A.Id
),
TransitiveClosure AS
(
SELECT Id, Id LinkedId
FROM Links
UNION
SELECT LinkedId Id, LinkedId
FROM Links
UNION
SELECT Id, LinkedId
FROM Routes
),
UniqueCustomers AS
(
SELECT Id, MIN(LinkedId) UniqueCustomerId
FROM TransitiveClosure
GROUP BY Id
)
SELECT A.Id, A.[E-mail], A.Phone, B.UniqueCustomerId
FROM dbo.Customer A
JOIN UniqueCustomers B ON B.Id = A.Id
Finding groups that have only same Phone:
SELECT
ID
, Name
, Phone
, DENSE_RANK() OVER (ORDER BY Phone) AS GroupPhone
FROM
MyTable
ORDER BY
GroupPhone
, ID
Finding groups that have only same Name:
SELECT
ID
, Name
, Phone
, DENSE_RANK() OVER (ORDER BY Name) AS GroupName
FROM
MyTable
ORDER BY
GroupName
, ID
Now, for the (complex) query you describe, let's say we have a table like this instead:
ID Name Phone
---- ------------- -------------
0 Kate +44 333 333
1 Sandra +44 000 000
2 Thomas +44 222 222
3 Robert +44 000 000
4 Thomas +44 444 444
5 George +44 222 222
6 Kate +44 000 000
7 Robert +44 444 444
--------------------------------
Should all these be in one group? As they all share name or phone with someone else, forming a "chain" of relative persons:
0-6 same name
6-1-3 same phone
3-7 same name
7-4 same-phone
4-2 same name
2-5 bame phone
For the dataset in the example you could write something like this:
;WITH Temp AS (
SELECT Name, Phone,
DENSE_RANK() OVER (ORDER BY Name) AS NameGroup,
DENSE_RANK() OVER (ORDER BY Phone) AS PhoneGroup
FROM MyTable)
SELECT MAX(Phone), MAX(Name), COUNT(*)
FROM Temp
GROUP BY NameGroup, PhoneGroup
I don't know if this is the best solution, but here it is:
SELECT
MyTable.ID, MyTable.Name, MyTable.Phone,
CASE WHEN N.No = 1 AND P.No = 1 THEN 1
WHEN N.No = 1 AND P.No > 1 THEN 2
WHEN N.No > 1 OR P.No > 1 THEN 3
END as GroupRes
FROM
MyTable
JOIN (SELECT Name, count(Name) No FROM MyTable GROUP BY Name) N on MyTable.Name = N.Name
JOIN (SELECT Phone, count(Phone) No FROM MyTable GROUP BY Phone) P on MyTable.Phone = P.Phone
The problem is that here are some joins made on varchars and could end up in increasing execution time.
Here is my solution:
SELECT p.LastName, P.FirstName, P.HomePhone,
CASE
WHEN ph.PhoneCount=1 THEN
CASE
WHEN n.NameCount=1 THEN 'unique name and phone'
ELSE 'common name'
END
ELSE
CASE
WHEN n.NameCount=1 THEN 'common phone'
ELSE 'common phone and name'
END
END
FROM Contacts p
INNER JOIN
(SELECT HomePhone, count(LastName) as PhoneCount
FROM Contacts
GROUP BY HomePhone) ph ON ph.HomePhone = p.HomePhone
INNER JOIN
(SELECT FirstName, count(LastName) as NameCount
FROM Contacts
GROUP BY FirstName) n ON n.FirstName = p.FirstName
LastN FirstN Phone Comment
Hoover Brenda 8138282334 unique name and phone
Washington Brian 9044563211 common name
Roosevelt Brian 7737653279 common name
Reagan Charles 7734567869 unique name and phone

Resources