TSQL Group By & Count not aggregating as expected - sql-server

I have a query that returns 6 rows and I want to aggregate the information to provide a single row with a count of instances. The without aggregate query returns the correct data but when I add a GroupBy and Count to the query it returns 2 rows.
The underlying ID (SR01.ReportKey) shown in the first result has two records so I think the Group By is somehow using this field in the grouping.
NOTE: The ReportKey is not actually used in the query I just had it in the first result for information purposes.
Question :
Any idea why the Group By is not grouping all the rows into a single result with a count of 6?
Without aggregate
Query :
SELECT
'Open' AS RecStatus,
ISNULL(UWZone.UWZoneID,'') AS ZoneID,
ISNULL(UWZone.UWZoneName,'') AS ZoneName,
Branch.BranchID,
ISNULL(Branch.BranchName,'') AS BranchName,
UW.UWID AS ServicingRep,
ISNULL(UW.UWName,'') + '/' + ISNULL(UA.UWName, '') AS RepName
FROM ProductivityRecommendations
INNER JOIN SR01 ON SR01.ReportKey = ProductivityRecommendations.ReportKey
LEFT JOIN UW ON SR01.Underwriter = UW.UWID
LEFT JOIN UW AS UA ON SR01.UA = UA.UWID
LEFT JOIN Branch ON SR01.ProdBranch = Branch.BranchID
LEFT JOIN UWZone ON UWZone.UWZoneAbbrev = Branch.UWZone
WHERE ISNULL(SR01.ServicingBranch,'-') <> '-'
AND ProductivityRecommendations.DateComplete BETWEEN #DateFrom AND #DateTo
AND ProductivityRecommendations.RecCriticality IN ('CRI', 'CCM')
AND ProductivityRecommendations.RecStatus IN ('N','O','U','A','R')
AND DateRecIssued IS NOT NULL
AND (#Zone IS NULL OR (UWZone.UWZoneID IN (SELECT val FROM ufn_SplitMax(#Zone ,','))))
AND (#Branch IS NULL OR (Branch.BranchID IN (SELECT val FROM ufn_SplitMax(#Branch ,','))))
AND (#RepID IS NULL OR (SR01.Underwriter IN(SELECT val FROM ufn_SplitMax(#RepID ,','))) OR #RepID IS NULL OR (SR01.UA IN(SELECT val FROM ufn_SplitMax(#RepID ,','))))
AND (#InsuredNumber IS NULL OR (ProductivityRecommendations.CustNum IN (SELECT val FROM ufn_SplitMax(#InsuredNumber ,','))))
Results :
Adding aggregates
Query :
SELECT
'Open' AS RecStatus,
ISNULL(UWZone.UWZoneID,'') AS ZoneID,
ISNULL(UWZone.UWZoneName,'') AS ZoneName,
Branch.BranchID,
ISNULL(Branch.BranchName,'') AS BranchName,
UW.UWID AS ServicingRep,
ISNULL(UW.UWName,'') + '/' + ISNULL(UA.UWName, '') AS RepName,
COUNT(ProductivityRecommendations.RecStatus) AS Requests
FROM ProductivityRecommendations
INNER JOIN SR01 ON SR01.ReportKey = ProductivityRecommendations.ReportKey
LEFT JOIN UW ON SR01.Underwriter = UW.UWID
LEFT JOIN UW AS UA ON SR01.UA = UA.UWID
LEFT JOIN Branch ON SR01.ProdBranch = Branch.BranchID
LEFT JOIN UWZone ON UWZone.UWZoneAbbrev = Branch.UWZone
WHERE ISNULL(SR01.ServicingBranch,'-') <> '-'
AND ProductivityRecommendations.DateComplete BETWEEN #DateFrom AND #DateTo
AND ProductivityRecommendations.RecCriticality IN ('CRI', 'CCM')
AND ProductivityRecommendations.RecStatus IN ('N','O','U','A','R')
AND DateRecIssued IS NOT NULL
AND (#Zone IS NULL OR (UWZone.UWZoneID IN (SELECT val FROM ufn_SplitMax(#Zone ,','))))
AND (#Branch IS NULL OR (Branch.BranchID IN (SELECT val FROM ufn_SplitMax(#Branch ,','))))
AND (#RepID IS NULL OR (SR01.Underwriter IN(SELECT val FROM ufn_SplitMax(#RepID ,','))) OR #RepID IS NULL OR (SR01.UA IN(SELECT val FROM ufn_SplitMax(#RepID ,','))))
AND (#InsuredNumber IS NULL OR (ProductivityRecommendations.CustNum IN (SELECT val FROM ufn_SplitMax(#InsuredNumber ,','))))
GROUP BY UWZone.UWZoneID, UWZone.UWZoneName, Branch.BranchID, Branch.BranchName, SR01.ServicingRep, UW.UWID, ISNULL(UW.UWName,'') + '/' + ISNULL(UA.UWName, '')
Results :

as #Johan said in a comment:
This will should give you 1 row only in your described senario:
GROUP BY
ISNULL(UWZone.UWZoneID,''),
ISNULL(UWZone.UWZoneName,''),
Branch.BranchID,
ISNULL(Branch.BranchName,'') ,
UW.UWID,
ISNULL(UW.UWName,''),
ISNULL(UA.UWName, '')

Try a "select distinct" of the columns that you are grouping by, and also a LEN to see if they have spaces or other char that are not seen, and also test for NULLs. Then, you decide how to handel these columns, using ISNULL, COALESCE, CASE, and/or WHERE statements, depending on what you need.

Related

Convert PostgreSQL to MS SQL

I am needing help converting a PostgreSQL query to MSSQ.
Below is what i have done so far but i am issuing with the function and array areas which i do not think are allowed in MS SQL.
Is there something that that i need to do change the function and looks the WHERE statement has an array in it too.
I have added the select statement for the #temp table but when i create the #temp table i am getting errors saying incorrect syntax
CREATE FUNCTION pm_aggregate_report
(
_facility_ids uuid[]
, _risk_ids uuid[] DEFAULT NULL::uuid[]
, _assignee_ids uuid[] DEFAULT NULL::uuid[]
, _start_date date DEFAULT NULL::date
, _end_date date DEFAULT NULL::date
)
RETURNS TABLE
(
facility character varying
, pm_id uuid, grouped_pm boolean
, risk_id uuid
, risk character varying
, pm_status_id uuid
, user_id uuid
, assignee text
, completed_by uuid
, total_labor bigint
)
CREATE TABLE #tmp_pm_aggregate
(
facility_id VARCHAR(126),
pm_id VARCHAR(126),
grouped_pm VARCHAR(126),
risk_id VARCHAR(126),
pm_status_id VARCHAR(126),
user_id VARCHAR(126),
completed_by VARCHAR(126)
)
SELECT DISTINCT
COALESCE(gp.facility_id, a.facility_id) as facility_id,
COALESCE(p.grouped_pm_id, p.id) as pm_id,
CASE WHEN p.grouped_pm_id IS NULL THEN false ELSE true END as grouped_pm,
COALESCE(gp.risk_id, a.risk_id) as risk_id,
COALESCE(gp.pm_status_id, p.pm_status_id) as pm_status_id,
COALESCE(gass.user_id, sass.user_id) as user_id,
COALESCE(gp.completed_by, p.completed_by) as completed_by
FROM pms p
JOIN assets a
ON p.asset_id = a.id
LEFT JOIN grouped_pms gp
ON p.grouped_pm_id = gp.id
LEFT JOIN assignees sass
ON p.id = sass.record_id
AND sass.type = 'single_pm'
LEFT JOIN assignees gass
ON p.grouped_pm_id = gass.record_id
AND gass.type = 'grouped_pm'
LEFT JOIN users u
ON (sass.user_id = u.id OR gass.user_id = u.id)
WHERE a.facility_id = ANY(_facility_ids)
AND NOT a.is_component
AND COALESCE(gp.pm_status_id, p.pm_status_id) in ('f9bdfc17-3bb5-4ec0-8477-24ef05ea3b9b', '06fc910c-3d07-4284-8f6e-8fb3873f5333')
AND COALESCE(gp.completion_date, p.completion_date) BETWEEN COALESCE(_start_date, '1/1/2000') AND COALESCE(_end_date, '1/1/3000')
AND COALESCE(gp.show_date, p.show_date) <= CURRENT_TIMESTAMP
AND COALESCE(gass.user_id, sass.user_id) IS NOT NULL
AND u.user_type_id != 'ec823d98-7023-4908-8006-2e33ddf2c11b'
AND (_risk_ids IS NULL OR COALESCE(gp.risk_id, a.risk_id) = ANY(_risk_ids)
AND (_assignee_ids IS NULL OR COALESCE(gass.user_id, sass.user_id) = ANY(_assignee_ids);
SELECT
f.name as facility,
t.pm_id,
t.grouped_pm,
t.risk_id,
r.name as risk,
t.pm_status_id,
t.user_id,
u.name_last + ', ' + u.name_first as assignee,
t.completed_by,
ISNULL(gwl.total_labor, swl.total_labor) as total_labor
FROM #tmp_pm_aggregate t
JOIN facilities f
ON t.facility_id = f.id
JOIN risks r
ON t.risk_id = r.id
JOIN users u
ON t.user_id = u.id
LEFT JOIN (SELECT wl.record_id, wl.user_id, SUM(wl.labor_time) as total_labor
FROM work_logs wl
WHERE wl.type = 'single_pm'
GROUP BY wl.record_id, wl.user_id) as swl
ON t.pm_id = swl.record_id
AND t.user_id = swl.user_id
AND t.grouped_pm = false
LEFT JOIN (SELECT wl.record_id, wl.user_id, SUM(wl.labor_time) as total_labor
FROM work_logs wl
WHERE wl.type = 'grouped_pm'
GROUP BY wl.record_id, wl.user_id) as gwl
ON t.pm_id = gwl.record_id
AND t.user_id = gwl.user_id
AND t.grouped_pm = true
ORDER BY facility,
assignee,
risk;
DROP TABLE #tmp_pm_aggregate;
You can create an inline Table Valued Function, and simply return a resultset from it. You do not need (and cannot use) a temp table, you do not declare the returned "rowset" shape.
For the array parameters, you can use a Table Type:
CREATE TYPE dbo.GuidList (value uniqueidentifier NOT NULL PRIMARY KEY);
Because the table parameters are actual tables, you must query them like this (NOT EXISTS (SELECT 1 FROM #risk_ids) OR ISNULL(gp.risk_id, a.risk_id) IN (SELECT r.value FROM #risk_ids))
The parameters must start with #
There is no boolean type, you must use bit
Always use deterministic date formats for literals. yyyymmdd works for dates. Do you need to take into account hours and minutes, because you haven't?
ISNULL generally performs better than COALESCE in SQL Server, as the compiler understands it better
You may want to pass a separate parameter showing whether you passed in anything for the optional table parameters
I suggest you look carefully at the actual query: why does it need DISTINCT? It performs poorly, and is usually a code-smell indicating poorly thought-out joins. Perhaps you need to combine the two joins on assignees, or perhaps you should use a row-numbering strategy somewhere.
CREATE FUNCTION dbo.pm_aggregate_report
(
#facility_ids dbo.GuidList
, #risk_ids dbo.GuidList
, #assignee_ids dbo.GuidList
, #start_date date
, #end_date date
)
RETURNS TABLE AS RETURN
SELECT DISTINCT -- why DISTINCT, perhaps rethink your joins
ISNULL(gp.facility_id, a.facility_id) as facility_id,
ISNULL(p.grouped_pm_id, p.id) as pm_id,
CASE WHEN p.grouped_pm_id IS NULL THEN CAST(0 AS bit) ELSE CAST(1 AS bit) END as grouped_pm,
ISNULL(gp.risk_id, a.risk_id) as risk_id,
ISNULL(gp.pm_status_id, p.pm_status_id) as pm_status_id,
ISNULL(gass.user_id, sass.user_id) as user_id,
ISNULL(gp.completed_by, p.completed_by) as completed_by
FROM pms p
JOIN assets a
ON p.asset_id = a.id
LEFT JOIN grouped_pms gp
ON p.grouped_pm_id = gp.id
LEFT JOIN assignees sass
ON p.id = sass.record_id
AND sass.type = 'single_pm'
LEFT JOIN assignees gass
ON p.grouped_pm_id = gass.record_id
AND gass.type = 'grouped_pm'
LEFT JOIN users u
ON (sass.user_id = u.id OR gass.user_id = u.id) -- is this doubling up your rows?
WHERE a.facility_id IN (SELECT f.value FROM #facility_ids f)
AND a.is_component = 0
AND ISNULL(gp.pm_status_id, p.pm_status_id) in ('f9bdfc17-3bb5-4ec0-8477-24ef05ea3b9b', '06fc910c-3d07-4284-8f6e-8fb3873f5333')
AND ISNULL(gp.completion_date, p.completion_date) BETWEEN ISNULL(#start_date, '20000101') AND ISNULL(#end_date, '30000101') -- perhaps use >= AND <
AND ISNULL(gp.show_date, p.show_date) <= CURRENT_TIMESTAMP
AND ISNULL(gass.user_id, sass.user_id) IS NOT NULL
AND u.user_type_id != 'ec823d98-7023-4908-8006-2e33ddf2c11b'
AND (NOT EXISTS (SELECT 1 FROM #risk_ids) OR ISNULL(gp.risk_id, a.risk_id) IN (SELECT r.value FROM #risk_ids))
AND (NOT EXISTS (SELECT 1 FROM #assignee_ids) OR ISNULL(gass.user_id, sass.user_id) IN (SELECT aid.value FROM #assignee_ids aid));

Unable concate NULL value in SQL using CONCAT, COALESCE and ISNULL

I have a query with multiple joins where I want to combine records from two columns into one. If one column is empty then I want to show one column value as result. I tried with CONCAT, COALEASE and ISNULL but no luck. What am I missing here?
My objective is, create one column which has combination of s.Script AS Original and FromAnotherTable from query. Below query runs but throws Invalid column name 'Original' and Invalid column name 'FromAnotherTable'. when I try to use CONCAT, COALEASE or ISNULL .
SQL Query:
SELECT DISTINCT
c.Name AS CallCenter,
LTRIM(RTRIM(s.Name)) Name,
d.DNIS,
s.ScriptId,
s.Script AS Original,
(
SELECT TOP 5 CCSL.Line+'; '
FROM CallCenterScriptLine CCSL
WHERE CCSL.ScriptId = s.ScriptId
ORDER BY ScriptLineId FOR XML PATH('')
) AS FromAnotherTable,
--CONCAT(s.Script, SELECT TOP 5 CCSL.Line+'; ' FROM dbo.CallCenterScriptLine ccsl WHERE ccsl.ScriptId = s.ScriptId ORDER BY ccsl.ScriptLineId xml path(''))
--CONCAT(Original, FromAnotherTable) AS Option1,
--COALESCE(Original, '') + FromAnotherTable AS Option2,
--ISNULL(Original, '') + FromAnotherTable AS Option3,,
r.UnitName AS Store,
r.UnitNumber
FROM CallCenterScript s WITH (NOLOCK)
INNER JOIN CallCenterDNIS d WITH (NOLOCK) ON d.ScriptId = s.ScriptId
INNER JOIN CallCenter c WITH (NOLOCK) ON c.Id = s.CallCenterId
INNER JOIN CallCenterDNISRestaurant ccd WITH (NOLOCK) ON ccd.CallCenterDNISId = d.CallCenterDNISId
INNER JOIN dbo.Restaurant r WITH (NOLOCK) ON r.RestaurantID = ccd.CallCenterRestaurantId
WHERE c.Id = 5
AND (1 = 1)
AND (s.IsDeleted = 0 OR s.IsDeleted IS NULL)
ORDER BY DNIS ASC;
Output:
This works:
DECLARE #Column1 VARCHAR(50) = 'Foo',
#Column2 VARCHAR(50) = NULL;
SELECT CONCAT(#Column1,#Column2);
SELECT COALESCE(#Column2, '') + #Column1
SELECT ISNULL(#Column2, '') + #Column1
So I am not sure what I am missing in my original query.
Look at row 3 in the results you are getting. In your concatenated columns (Option1, 2, 3) you are getting the first script column twice. Not the first one + the second one like you expect.
The reason is because you've aliased your subquery "script" which is the same name as another column in your query, which makes it ambiguous.
Change the alias of the subquery and the problem should go away. I'm frankly surprised your query didn't raise an error.
EDIT: You can't use a column alias in another column's definition in the same level of the query. In other words, you can't do this:
SELECT
SomeColumn AS A
, (Subquery that returns a column) AS B
, A + B --this is not allowed
FROM ...
You can either create a CTE that returns the aliased columns and then concatenate them in the main query that selects from the CTE, or you have to use the original sources of the aliases, like so:
SELECT
SomeColumn AS A
, (Subquery that returns a column) AS B
, SomeColumn + (Subquery that returns a column) --this is fine
FROM ...
I took another approach where instead on creating separate column, I used ISNULL in my subQuery which returns my desired result.
Query:
SELECT DISTINCT
c.Name AS CallCenter,
LTRIM(RTRIM(s.Name)) Name,
d.DNIS,
s.ScriptId,
s.Script AS Original,
(
SELECT TOP 5 ISNULL(CCSL.Line, '')+'; ' + ISNULL(s.Script, '')
FROM CallCenterScriptLine CCSL
WHERE CCSL.ScriptId = s.ScriptId
ORDER BY ScriptLineId FOR XML PATH('')
) AS FromAnotherTable,
r.UnitName AS Store,
r.UnitNumber
FROM CallCenterScript s WITH (NOLOCK)
INNER JOIN CallCenterDNIS d WITH (NOLOCK) ON d.ScriptId = s.ScriptId
INNER JOIN CallCenter c WITH (NOLOCK) ON c.Id = s.CallCenterId
INNER JOIN CallCenterDNISRestaurant ccd WITH (NOLOCK) ON ccd.CallCenterDNISId = d.CallCenterDNISId
INNER JOIN dbo.Restaurant r WITH (NOLOCK) ON r.RestaurantID = ccd.CallCenterRestaurantId
WHERE c.Id = 5
AND (1 = 1)
AND (s.IsDeleted = 0 OR s.IsDeleted IS NULL)
ORDER BY DNIS ASC;
Here's a simplified example using table variables.
Instead of using a subquery for a field, it uses a CROSS APPLY.
And CONCAT in combination with STUFF is used to glue the strings together.
declare #Foo table (fooID int identity(1,1) primary key, Script varchar(30));
declare #Bar table (barID int identity(1,1) primary key, fooID int, Line varchar(30));
insert into #Foo (Script) values
('Test1'),('Test2'),(NULL);
insert into #Bar (fooID, Line) values
(1,'X'),(1,'Y'),(2,NULL),(3,'X'),(3,'Y');
select
f.fooID,
f.Script,
x.Lines,
CONCAT(Script+'; ', STUFF(x.Lines,1,2,'')) as NewScript
from #Foo f
cross apply (
select '; '+b.Line
from #Bar b
where b.fooID = f.fooID
FOR XML PATH('')
) x(Lines)
Result:
fooID Script Lines NewScript
----- ------- ------- -----------
1 Test1 ; X; Y Test1; X; Y
2 Test2 NULL Test2;
3 NULL ; X; Y X; Y

SQL combine two queries result into one dataset

I am trying to combine two SQL queries the first is
SELECT
EAC.Person.FirstName,
EAC.Person.Id,
EAC.Person.LastName,
EAC.Person.EmployeeId,
EAC.Person.IsDeleted,
Controller.Cards.SiteCode,
Controller.Cards.CardCode,
Controller.Cards.ActivationDate,
Controller.Cards.ExpirationDate,
Controller.Cards.Status,
EAC.[Group].Name
FROM
EAC.Person
INNER JOIN
Controller.Cards ON EAC.Person.Id = Controller.Cards.PersonId
INNER JOIN
EAC.GroupPersonMap ON EAC.Person.Id = EAC.GroupPersonMap.PersonId
INNER JOIN
EAC.[Group] ON EAC.GroupPersonMap.GroupId = EAC.[Group].Id
And the second one is
SELECT
IsActive, ActivationDateUTC, ExpirationDateUTC,
Sitecode + '-' + Cardcode AS Credential, 'Badge' AS Type,
CASE
WHEN isActive = 0
THEN 'InActive'
WHEN ActivationDateUTC > GetUTCDate()
THEN 'Pending'
WHEN ExpirationDAteUTC < GetUTCDate()
THEN 'Expired'
ELSE 'Active'
END AS Status
FROM
EAC.Credential
JOIN
EAC.WiegandCredential ON Credential.ID = WiegandCredential.CredentialId
WHERE
PersonID = '32'
Where I would like to run the second query for each user of the first query using EAC.Person.Id instead of the '32'.
I would like all the data to be returned in one Dataset so I can use it in Report Builder.
I have been fighting with this all day and am hoping one of you smart guys can give me a hand. Thanks in advance.
Based on your description in the comments, I understand that the connection between the two datasets is actually the PersonID field, which exists in both EAC.Credential and EAC.Person; however, in EAC.Credential, duplicate values exist for PersonID, and you want only the most recent one for each PersonID.
There are a few ways to do this, and it will depend on the number of rows returned, the indexes, etc., but I think maybe you're looking for something like this...?
SELECT
EAC.Person.FirstName
,EAC.Person.Id
,EAC.Person.LastName
,EAC.Person.EmployeeId
,EAC.Person.IsDeleted
,Controller.Cards.SiteCode
,Controller.Cards.CardCode
,Controller.Cards.ActivationDate
,Controller.Cards.ExpirationDate
,Controller.Cards.Status
,EAC.[Group].Name
,X.IsActive
,X.ActivationDateUTC
,X.ExpirationDateUTC
,X.Credential
,X.Type
,X.Status
FROM EAC.Person
INNER JOIN Controller.Cards
ON EAC.Person.Id = Controller.Cards.PersonId
INNER JOIN EAC.GroupPersonMap
ON EAC.Person.Id = EAC.GroupPersonMap.PersonId
INNER JOIN EAC.[Group]
ON EAC.GroupPersonMap.GroupId = EAC.[Group].Id
CROSS APPLY
(
SELECT TOP 1
IsActive
,ActivationDateUTC
,ExpirationDateUTC
,Sitecode + '-' + Cardcode AS Credential
,'Badge' AS Type
,'Status' =
CASE
WHEN isActive = 0
THEN 'InActive'
WHEN ActivationDateUTC > GETUTCDATE()
THEN 'Pending'
WHEN ExpirationDateUTC < GETUTCDATE()
THEN 'Expired'
ELSE 'Active'
END
FROM EAC.Credential
INNER JOIN EAC.WiegandCredential
ON EAC.Credential.ID = EAC.WiegandCredential.CredentialId
WHERE EAC.Credential.PersonID = EAC.Person.PersonID
ORDER BY EAC.Credential.ID DESC
) AS X
-- Optionally, you can also add conditions to return specific rows, i.e.:
-- WHERE EAC.Person.PersonID = 32
This option uses a CROSS APPLY, which means that every row of the first dataset will return additional values from the second dataset, based on the criteria that you described. In this CROSS APPLY, I'm joining the two datasets based on the fact that PersonID exists in both EAC.Person (in your first dataset) as well as in EAC.Credential. I then specify that I want only the TOP 1 row for each PersonID, with an ORDER BY specifying that we want the most recent (highest) value of ID for each PersonID.
The CROSS APPLY is aliased as "X", so in your original SELECT you now have several values prefixed with the X. alias, which just means that you're taking these fields from the second query and attaching them to your original results.
CROSS APPLY requires that a matching entry exists in both subsets of data, much like an INNER JOIN, so you'll want to check and make sure that the relevant values exist and are returned correctly.
I think this is pretty close to the direction you're trying to go. If not, let me know and I'll update the answer. Good luck!
Try like this;
select Query1.*, Query2.* from (
SELECT
EAC.Person.FirstName,
EAC.Person.Id as PersonId,
EAC.Person.LastName,
EAC.Person.EmployeeId,
EAC.Person.IsDeleted,
Controller.Cards.SiteCode,
Controller.Cards.CardCode,
Controller.Cards.ActivationDate,
Controller.Cards.ExpirationDate,
Controller.Cards.Status,
EAC.[Group].Name
FROM
EAC.Person
INNER JOIN
Controller.Cards ON EAC.Person.Id = Controller.Cards.PersonId
INNER JOIN
EAC.GroupPersonMap ON EAC.Person.Id = EAC.GroupPersonMap.PersonId
INNER JOIN
EAC.[Group] ON EAC.GroupPersonMap.GroupId = EAC.[Group].Id)
Query1 inner join (SELECT top 100
IsActive, ActivationDateUTC, ExpirationDateUTC,
Sitecode + '-' + Cardcode AS Credential, 'Badge' AS Type,
CASE
WHEN isActive = 0
THEN 'InActive'
WHEN ActivationDateUTC > GetUTCDate()
THEN 'Pending'
WHEN ExpirationDAteUTC < GetUTCDate()
THEN 'Expired'
ELSE 'Active'
END AS Status
FROM
EAC.Credential
JOIN
EAC.WiegandCredential ON Credential.ID = WiegandCredential.CredentialId
ORDER BY EAC.Credential.ID DESC) Query2 ON Query1.PersonId = Query2.PersonID
Just select two queries to join them like Query1 and Query2 by equaling PersonId data.

Update records SQL?

First when I started this project seemed very simple. Two tables, field tbl1_USERMASTERID in Table 1 should be update from field tbl2_USERMASTERID Table 2. After I looked deeply in Table 2, there is no unique ID that I can use as a key to join these two tables. Only way to match the records from Table 1 and Table 2 is based on FIRST_NAME, LAST_NAME AND DOB. So I have to find records in Table 1 where:
tbl1_FIRST_NAME equals tbl2_FIRST_NAME
AND
tbl1_LAST_NAME equals tbl2_LAST_NAME
AND
tbl1_DOB equals tbl2_DOB
and then update USERMASTERID field. I was afraid that this can cause some duplicates and some users will end up with USERMASTERID that does not belong to them. So if I find more than one record based on first,last name and dob those records would not be updated. I would like just to skip and leave them blank. That way I wouldn't populate invalid USERMASTERID. I'm not sure what is the best way to approach this problem, should I use SQL or ColdFusion (my server side language)? Also how to detect more than one matching record?
Here is what I have so far:
UPDATE Table1 AS tbl1
LEFT OUTER JOIN Table2 AS tbl2
ON tbl1.dob = tbl2.dob
AND tbl1.fname = tbl2.fname
AND tbl1.lname = tbl2.lname
SET tbl1.usermasterid = tbl2.usermasterid
WHERE LTRIM(RTRIM(tbl1.usermasterid)) = ''
Here is query where I tried to detect duplicates:
SELECT DISTINCT
tbl1.FName,
tbl1.LName,
tbl1.dob,
COUNT(*) AS count
FROM Table1 AS tbl1
LEFT OUTER JOIN Table2 AS tbl2
ON tbl1.dob = tbl2.dob
AND tbl1.FName = tbl2.first
AND tbl1.LName = tbl2.last
WHERE LTRIM(RTRIM(tbl1.usermasterid)) = ''
AND LTRIM(RTRIM(tbl1.first)) <> ''
AND LTRIM(RTRIM(tbl1.last)) <> ''
AND LTRIM(RTRIM(tbl1.dob)) <> ''
GROUP BY tbl1.FName,tbl1.LName,tbl1.dob
Some data after I tested query above:
First Last DOB Count
John Cook 2008-07-11 2
Kate Witt 2013-06-05 1
Deb Ruis 2016-01-22 1
Mike Bennet 2007-01-15 1
Kristy Cruz 1997-10-20 1
Colin Jones 2011-10-13 1
Kevin Smith 2010-02-24 1
Corey Bruce 2008-04-11 1
Shawn Maiers 2016-08-28 1
Alenn Fitchner 1998-05-17 1
If anyone have idea how I can prevent/skip updating duplicate records or how to improve this query please let me know. Thank you.
You could check for and avoid duplicate matches using with common_table_expression (Transact-SQL)
along with row_number()., like so:
with cte as (
select
t.fname
, t.lname
, t.dob
, t.usermasterid
, NewUserMasterId = t2.usermasterid
, rn = row_number() over (partition by t.fname, t.lname, t.dob order by t2.usermasterid)
from table1 as t
inner join table2 as t2 on t.dob = t2.dob
and t.fname = t2.fname
and t.lname = t2.lname
and ltrim(rtrim(t.usermasterid)) = ''
)
--/* confirm these are the rows you want updated
select *
from cte as t
where t.NewUserMasterId != ''
and not exists (
select 1
from cte as i
where t.dob = i.dob
and t.fname = i.fname
and t.lname = i.lname
and i.rn>1
);
--*/
/* update those where only 1 usermasterid matches this record
update t
set t.usermasterid = t.NewUserMasterId
from cte as t
where t.NewUserMasterId != ''
and not exists (
select 1
from cte as i
where t.dob = i.dob
and t.fname = i.fname
and t.lname = i.lname
and i.rn>1
);
--*/
I use the cte to extract out the sub query for readability. Per the documentation, a common table expression (cte):
Specifies a temporary named result set, known as a common table expression (CTE). This is derived from a simple query and defined within the execution scope of a single SELECT, INSERT, UPDATE, or DELETE statement.
Using row_number() to assign a number for each row, starting at 1 for each partition of t.fname, t.lname, t.dob. Having those numbered allows us to check for the existence of duplicates with the not exists() clause with ... and i.rn>1
You could use a CTE to filter out the duplicates from Table1 before joining:
; with CTE as (select *
, count(ID) over (partition by LastName, FirstName, DoB) as IDs
from Table1)
update a
set a.ID = b.ID
from Table2 a
left join CTE b
on a.FirstName = b.FirstName
and a.LastName = b.LastName
and a.Dob = b.Dob
and b.IDs = 1
This will work provided there are no exact duplicates (same demographics and same ID) in table 1. If there are exact duplicates, they will also be excluded from the join, but you can filter them out before the CTE to avoid this.
Please try below SQL:
UPDATE Table1 AS tbl1
INNER JOIN Table2 AS tbl2
ON tbl1.dob = tbl2.dob
AND tbl1.fname = tbl2.fname
AND tbl1.lname = tbl2.lname
LEFT JOIN Table2 AS tbl3
ON tbl3.dob = tbl2.dob
AND tbl3.fname = tbl2.fname
AND tbl3.lname = tbl2.lname
AND tbl3.usermasterid <> tbl2.usermasterid
SET tbl1.usermasterid = tbl2.usermasterid
WHERE LTRIM(RTRIM(tbl1.usermasterid)) = ''
AND tbl3.usermasterid is null

Intersect query with no duplicates

I have not used sql server in a large complex scale in years, and Looking for help on how to proper sintax intersect type query to joing these two data sets, and not create duplicate names. Some patients will have both an order and a clinical event entry and some will only have a clinical event.
Data Set 1
SELECT
distinct
ea.alias as FIN,
per.NAME_Last + ', ' + per.NAME_FIRST + ' ' + Isnull(per.NAME_MIDDLE, '') as PatientName,
oa.action_dt_tm as CirOrder,
od.ORIG_ORDER_DT_TM as DischOrder,
e.disch_dt_tm as ActualDisch,
prs.NAME_FULL_FORMATTED as OrderedBy,
from pathway py
join encounter e on e.CERNER_ENCOUNTER_ID = py.encntr_id
join encntr_alias ea on ea.CERNER_ENCNTR_ID = e.CERNER_ENCOUNTER_ID and ea.ENCNTR_ALIAS_TYPE_WCD = 1049
join person per on per.CERNER_PERSON_ID = e.cerner_PERSON_ID
join orders o on o.CERNER_ENCNTR_ID= e.CERNER_ENCOUNTER_ID and o.CATALOG_wCD = '82111' -- communication order
and o.pathway_catalog_id = '43809296' ---Circumcision Order
join order_action oa on oa.[CERNER_ORDER_ID] = o.CERNER_ORDER_ID and oa.ACTION_TYPE_WCD = '2494'--ordered
join orders od on od.CERNER_ENCNTR_ID= e.CERNER_ENCOUNTER_ID and od.CATALOG_WCD = '203520' --- Discharge Patient
join prsnl prs on prs.CERNER_PERSON_ID = oa.order_provider_id
where py.pathway_catalog_id = '43809296' and ---Circumcision Order
oa.action_dt_tm > '2016-01-01 00:00:00'
and oa.ACTION_DT_TM < '2016-01-19 23:59:59'
--use the report prompts as parameters for the action_dt_tm
Data Set 2
SELECT
distinct e.[CERNER_ENCOUNTER_ID],
ea.alias as FIN,
per.NAME_Last + ', ' + per.NAME_FIRST + ' ' + Isnull(per.NAME_MIDDLE, '') as PatientName,
ce.EVENT_END_DT_TM as CircTime,
od.ORIG_ORDER_DT_TM as DischOrder,
e.disch_dt_tm as ActualDisch,
'' OrderedBy, -- should be blank for this set
cv.DISPLAY
from encounter e
join clinical_event ce on e.CERNER_ENCOUNTER_ID = ce.CERNER_ENCNTR_ID
join encntr_alias ea on ea.CERNER_ENCNTR_ID = e.CERNER_ENCOUNTER_ID and ea.ENCNTR_ALIAS_TYPE_WCD = 1049
join person per on per.CERNER_PERSON_ID = e.cerner_PERSON_ID
join orders od on od.CERNER_ENCNTR_ID= e.CERNER_ENCOUNTER_ID and od.CATALOG_WCD = '203520' --- Discharge Patient
left outer join ENCNTR_LOC_HIST elh on elh.CERNER_ENCNTR_ID = e.CERNER_ENCOUNTER_ID
left outer join CODE_VALUE cv on cv.CODE_VALUE_WK = elh.LOC_NURSE_UNIT_WCD
where ce.event_wcd = '201148' ---Newborn Circumcision
and ce.[RESULT_VAL] = 'Newborn Circumcision'
and ce.EVENT_END_DT_TM > '2016-01-01 00:00:00'
and ce.event_end_dt_tm < '2016-01-19 23:59:59’
and ce.RESULT_STATUS_WCD = '25'
and elh.ACTIVE_STATUS_DT_TM < ce.event_end_dt_tm -- Circ time between the location's active time and end time.
and elh.END_EFFECTIVE_DT_TM > ce.[EVENT_END_DT_TM]
--use the report prompts as parameters for the ce.[EVENT_END_DT_TM]
The structure of an intersect query is as simple as:
select statement 1
intersect
select statement 2
intersect
select statement 3
...
This will return all columns that are in both select statements. The columns returned in the select statements must be of the same quantity and type (or at least be convertible to common type).
You can also do an intersect type of query just using inner joins to filter out records in the one query that are not in the other. So for a simple example let's say you have two tables of colors.
Select distinct ColorTable1.Color
from ColorTable1
join ColorTable2
on ColorTable1.Color = ColorTable2.Color
This will return all the distinct colors in ColorTable1 that are also in ColorTable2. Using joins to filter could help your query perform better, but it does take more thought.
Also see: Set Operators (Transact-SQL)

Resources