Filtering a table based on a non-unique column to give only unique fields in SQL


How can I filter a table so that it includes only one row for each value of a column (it does not matter which one)?
The SQL query used to produce the result below looks like this:
SELECT DISTINCT
S.Id AS ReferenceID,
M.NewModuleID AS ModuleId,
SM.Compulsory
FROM
Struct S
INNER JOIN
StructModule SM
ON SM.StructId = S.Id
INNER JOIN
ModuleMap M
ON M.StructId = S.Id
AND SM.ModuleId = M.OldModuleId
However, this does not return the values the way I need them. The returned table looks like this:
ReferenceID NewModuleID Compulsory
1 100 1
1 210 0
2 251 1
2 251 0
However, I would like the SQL query to return a unique value for the NewModuleID field, ideally taking the first occurrence of each value.
The relevant columns of the above tables are as follows:
Struct:
ID (INT)
StructModule:
ID (INT)
StructID (INT)
ModuleID (INT)
Compulsory (BIT)
ModuleMap:
ID (INT)
OldModuleId (INT)
StructID (INT)
NewModuleID (INT)

Your question is not very clear, but after reading the following statement:
However I would like the SQL query to return a unique value for the
NewModuleID field. Ideally taking the first occurrence of a value
I guess that you are looking for something like the following query:
SELECT * FROM
(
SELECT
S.Id AS ReferenceID,
M.NewModuleID AS ModuleId,
SM.Compulsory ,
ROW_NUMBER() OVER(PARTITION BY S.ID, M.NewModuleID ORDER BY M.NewModuleID) RN
FROM
Struct S
INNER JOIN
StructModule SM
ON SM.StructId = S.Id
INNER JOIN
ModuleMap M
ON M.StructId = S.Id
AND SM.ModuleId = M.OldModuleId
)T
WHERE RN=1
Note: You don't need DISTINCT when you filter on RN = 1, because each (S.ID, M.NewModuleID) partition then contributes exactly one row.
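The ROW_NUMBER() filter above can be tried out with a minimal sketch. It assumes Python's bundled sqlite3 module (SQLite 3.25 or later for window functions) as a stand-in for SQL Server, and a hypothetical table named joined that holds the four rows shown in the question instead of the real three-table join.

```python
import sqlite3

# Hypothetical stand-in for the output of the Struct / StructModule / ModuleMap join.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE joined (ReferenceID INTEGER, ModuleId INTEGER, Compulsory INTEGER);
INSERT INTO joined VALUES (1, 100, 1), (1, 210, 0), (2, 251, 1), (2, 251, 0);
""")

# Number the rows inside each (ReferenceID, ModuleId) group and keep only row 1.
rows = conn.execute("""
SELECT ReferenceID, ModuleId, Compulsory
FROM (
    SELECT ReferenceID, ModuleId, Compulsory,
           ROW_NUMBER() OVER (
               PARTITION BY ReferenceID, ModuleId
               ORDER BY ModuleId
           ) AS RN
    FROM joined
) AS T
WHERE RN = 1
ORDER BY ReferenceID, ModuleId
""").fetchall()

for row in rows:
    print(row)
```

Because the ORDER BY inside the window uses the same column as the partition, every row in a group ties, so which Compulsory value survives for (2, 251) is arbitrary. If that matters, ordering by something like Compulsory DESC would make the choice deterministic.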

Related

Oracle get only last 1 row data on multiple tables query

I have an Oracle query from which I want to get only the last row:
SELECT
R.FORM_NO,
R.PART_NO,
L.L_FORM_NO,
L.HDR_ID,
L.CP_ID_SLC_FORM_NO,
S.FORM_NO,
S.PART_NO,
S.CP_ID
FROM
WA_T_QC_REVISION R,
WA_T_QC_REVISION_LIST L,
WA_T_QC_CP_SELECTED S
WHERE
R.FORM_NO = L.HDR_ID AND
S.FORM_NO = L.CP_ID_SLC_FORM_NO AND
R.PART_NO = 'PA03670-B501'
ORDER BY R.FORM_NO DESC
When I try wrapping the query like this:
SELECT * FROM(
SELECT
R.FORM_NO,
R.PART_NO,
L.L_FORM_NO,
L.HDR_ID,
L.CP_ID_SLC_FORM_NO,
S.FORM_NO,
S.PART_NO,
S.CP_ID
FROM
WA_T_QC_REVISION R,
WA_T_QC_REVISION_LIST L,
WA_T_QC_CP_SELECTED S
WHERE
R.FORM_NO = L.HDR_ID AND
S.FORM_NO = L.CP_ID_SLC_FORM_NO AND
R.PART_NO = 'PA03670-B501'
ORDER BY R.FORM_NO DESC)
WHERE ROWNUM <= 1
I get this error:
ORA-00918: column ambiguously defined
What I want is to get only the last row from these tables.

The immediate fix here is to just alias the columns having the same name such that they no longer have the same name, e.g.
SELECT * FROM (
SELECT
R.FORM_NO AS FORM_NO_R,
R.PART_NO AS PART_NO_R,
L.L_FORM_NO,
L.HDR_ID,
L.CP_ID_SLC_FORM_NO,
S.FORM_NO AS FORM_NO_S,
S.PART_NO AS PART_NO_S,
S.CP_ID
FROM WA_T_QC_REVISION R
INNER JOIN WA_T_QC_REVISION_LIST L
ON R.FORM_NO = L.HDR_ID
INNER JOIN WA_T_QC_CP_SELECTED S
ON S.FORM_NO = L.CP_ID_SLC_FORM_NO
WHERE
R.PART_NO = 'PA03670-B501'
ORDER BY R.FORM_NO DESC
)
WHERE ROWNUM <= 1
Note also that I replaced your implicit joins with explicit inner joins. Using formal join syntax is the preferred way of writing queries (and has been for more than 25 years).
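As a rough illustration of the aliasing fix, here is a minimal sketch using Python's sqlite3 module rather than Oracle. The short table names and sample rows are hypothetical, LIMIT 1 stands in for ROWNUM <= 1, and the ORDER BY is placed on the outer query so the sketch does not depend on how SQLite treats ordering inside a subquery. SQLite does not raise ORA-00918, so the sketch only shows that the aliased columns all get distinct names in the outer SELECT *.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical, shortened stand-ins for the three WA_T_QC_* tables.
CREATE TABLE revision (FORM_NO INTEGER, PART_NO TEXT);
CREATE TABLE revision_list (L_FORM_NO INTEGER, HDR_ID INTEGER, CP_ID_SLC_FORM_NO INTEGER);
CREATE TABLE cp_selected (FORM_NO INTEGER, PART_NO TEXT, CP_ID INTEGER);
INSERT INTO revision VALUES (1, 'PA03670-B501'), (2, 'PA03670-B501');
INSERT INTO revision_list VALUES (10, 1, 100), (20, 2, 200);
INSERT INTO cp_selected VALUES (100, 'PA03670-B501', 7), (200, 'PA03670-B501', 8);
""")

cur = conn.execute("""
SELECT * FROM (
    SELECT R.FORM_NO AS FORM_NO_R, R.PART_NO AS PART_NO_R,
           L.L_FORM_NO, L.HDR_ID, L.CP_ID_SLC_FORM_NO,
           S.FORM_NO AS FORM_NO_S, S.PART_NO AS PART_NO_S, S.CP_ID
    FROM revision AS R
    INNER JOIN revision_list AS L ON R.FORM_NO = L.HDR_ID
    INNER JOIN cp_selected AS S ON S.FORM_NO = L.CP_ID_SLC_FORM_NO
    WHERE R.PART_NO = 'PA03670-B501'
)
ORDER BY FORM_NO_R DESC  -- only possible because the alias is unambiguous
LIMIT 1                  -- SQLite stand-in for Oracle's ROWNUM <= 1
""")
column_names = [d[0] for d in cur.description]
latest = cur.fetchone()

print(column_names)
print(latest)
```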

Update records SQL?

When I first started this project it seemed very simple: two tables, where field tbl1_USERMASTERID in Table 1 should be updated from field tbl2_USERMASTERID in Table 2. After looking more closely at Table 2, I found there is no unique ID that I can use as a key to join these two tables. The only way to match the records from Table 1 and Table 2 is by FIRST_NAME, LAST_NAME and DOB. So I have to find records in Table 1 where:
tbl1_FIRST_NAME equals tbl2_FIRST_NAME
AND
tbl1_LAST_NAME equals tbl2_LAST_NAME
AND
tbl1_DOB equals tbl2_DOB
and then update the USERMASTERID field. I was afraid that this could produce duplicate matches, so that some users would end up with a USERMASTERID that does not belong to them. So if I find more than one record for the same first name, last name and DOB, those records should not be updated; I would like to skip them and leave them blank. That way I won't populate an invalid USERMASTERID. I'm not sure what the best way to approach this problem is: should I use SQL or ColdFusion (my server-side language)? Also, how can I detect more than one matching record?
Here is what I have so far:
UPDATE Table1 AS tbl1
LEFT OUTER JOIN Table2 AS tbl2
ON tbl1.dob = tbl2.dob
AND tbl1.fname = tbl2.fname
AND tbl1.lname = tbl2.lname
SET tbl1.usermasterid = tbl2.usermasterid
WHERE LTRIM(RTRIM(tbl1.usermasterid)) = ''
Here is the query where I tried to detect duplicates:
SELECT DISTINCT
tbl1.FName,
tbl1.LName,
tbl1.dob,
COUNT(*) AS count
FROM Table1 AS tbl1
LEFT OUTER JOIN Table2 AS tbl2
ON tbl1.dob = tbl2.dob
AND tbl1.FName = tbl2.first
AND tbl1.LName = tbl2.last
WHERE LTRIM(RTRIM(tbl1.usermasterid)) = ''
AND LTRIM(RTRIM(tbl1.first)) <> ''
AND LTRIM(RTRIM(tbl1.last)) <> ''
AND LTRIM(RTRIM(tbl1.dob)) <> ''
GROUP BY tbl1.FName,tbl1.LName,tbl1.dob
Some of the data returned when I tested the query above:
First   Last      DOB         Count
John    Cook      2008-07-11  2
Kate    Witt      2013-06-05  1
Deb     Ruis      2016-01-22  1
Mike    Bennet    2007-01-15  1
Kristy  Cruz      1997-10-20  1
Colin   Jones     2011-10-13  1
Kevin   Smith     2010-02-24  1
Corey   Bruce     2008-04-11  1
Shawn   Maiers    2016-08-28  1
Alenn   Fitchner  1998-05-17  1
If anyone has an idea how I can prevent or skip updating duplicate records, or how to improve this query, please let me know. Thank you.

You could check for and avoid duplicate matches using a common table expression (CTE) along with row_number(), like so:
with cte as (
select
t.fname
, t.lname
, t.dob
, t.usermasterid
, NewUserMasterId = t2.usermasterid
, rn = row_number() over (partition by t.fname, t.lname, t.dob order by t2.usermasterid)
from table1 as t
inner join table2 as t2 on t.dob = t2.dob
and t.fname = t2.fname
and t.lname = t2.lname
and ltrim(rtrim(t.usermasterid)) = ''
)
--/* confirm these are the rows you want updated
select *
from cte as t
where t.NewUserMasterId != ''
and not exists (
select 1
from cte as i
where t.dob = i.dob
and t.fname = i.fname
and t.lname = i.lname
and i.rn>1
);
--*/
/* update those where only 1 usermasterid matches this record
update t
set t.usermasterid = t.NewUserMasterId
from cte as t
where t.NewUserMasterId != ''
and not exists (
select 1
from cte as i
where t.dob = i.dob
and t.fname = i.fname
and t.lname = i.lname
and i.rn>1
);
--*/
I use the CTE to pull the subquery out for readability. Per the documentation, a common table expression (CTE):
Specifies a temporary named result set, known as a common table expression (CTE). This is derived from a simple query and defined within the execution scope of a single SELECT, INSERT, UPDATE, or DELETE statement.
row_number() assigns a number to each row, starting at 1 within each partition of t.fname, t.lname, t.dob. Having the rows numbered lets us check for duplicates with the not exists() clause: any partition that contains a row with i.rn > 1 has more than one match, so its rows are skipped.
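The same idea can be exercised end to end with a small sketch. It assumes Python's sqlite3 module (SQLite 3.25 or later) in place of SQL Server, hypothetical sample rows, and a slightly simplified variation: the CTE numbers the candidate rows in table2 only, and the UPDATE uses correlated subqueries instead of T-SQL's UPDATE ... FROM.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (fname TEXT, lname TEXT, dob TEXT, usermasterid TEXT);
CREATE TABLE table2 (fname TEXT, lname TEXT, dob TEXT, usermasterid TEXT);
-- Hypothetical data: John Cook has two candidate ids, Kate Witt exactly one.
INSERT INTO table1 VALUES
    ('John', 'Cook', '2008-07-11', ''),
    ('Kate', 'Witt', '2013-06-05', '');
INSERT INTO table2 VALUES
    ('John', 'Cook', '2008-07-11', 'U1'),
    ('John', 'Cook', '2008-07-11', 'U2'),
    ('Kate', 'Witt', '2013-06-05', 'U3');
""")

conn.execute("""
WITH cte AS (
    -- Number the candidate ids per person; rn > 1 means an ambiguous match.
    SELECT fname, lname, dob, usermasterid,
           ROW_NUMBER() OVER (
               PARTITION BY fname, lname, dob ORDER BY usermasterid
           ) AS rn
    FROM table2
)
UPDATE table1
SET usermasterid = (
    SELECT c.usermasterid FROM cte AS c
    WHERE c.fname = table1.fname AND c.lname = table1.lname
      AND c.dob = table1.dob AND c.rn = 1
)
WHERE TRIM(usermasterid) = ''
  AND EXISTS (
      SELECT 1 FROM cte AS c
      WHERE c.fname = table1.fname AND c.lname = table1.lname
        AND c.dob = table1.dob
  )
  AND NOT EXISTS (
      SELECT 1 FROM cte AS c
      WHERE c.fname = table1.fname AND c.lname = table1.lname
        AND c.dob = table1.dob AND c.rn > 1
  )
""")

updated = conn.execute(
    "SELECT fname, usermasterid FROM table1 ORDER BY fname"
).fetchall()
print(updated)
```

In this sketch the row for John Cook stays blank because two different ids match, while Kate Witt receives her single matching id.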

You could use a CTE to filter out the duplicates from Table1 before joining:
; with CTE as (select *
, count(ID) over (partition by LastName, FirstName, DoB) as IDs
from Table1)
update a
set a.ID = b.ID
from Table2 a
left join CTE b
on a.FirstName = b.FirstName
and a.LastName = b.LastName
and a.Dob = b.Dob
and b.IDs = 1
This will work provided there are no exact duplicates (same demographics and same ID) in table 1. If there are exact duplicates, they will also be excluded from the join, but you can filter them out before the CTE to avoid this.
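To see what the windowed count produces, here is a minimal sketch using Python's sqlite3 module (SQLite 3.25 or later) with hypothetical rows. It only runs the SELECT part of the CTE and lists which IDs would survive the IDs = 1 filter; it does not perform the update.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table1 (ID INTEGER, FirstName TEXT, LastName TEXT, DoB TEXT);
-- Hypothetical data: two different IDs share John Cook's demographics.
INSERT INTO Table1 VALUES
    (1, 'John', 'Cook', '2008-07-11'),
    (2, 'John', 'Cook', '2008-07-11'),
    (3, 'Kate', 'Witt', '2013-06-05');
""")

# Every row gets the number of rows that share its name and date of birth.
counted = conn.execute("""
SELECT ID, FirstName, LastName, DoB,
       COUNT(ID) OVER (PARTITION BY LastName, FirstName, DoB) AS IDs
FROM Table1
ORDER BY ID
""").fetchall()

# Only rows whose demographics occur once are safe to use as a join source.
unique_ids = [row[0] for row in counted if row[4] == 1]

print(counted)
print(unique_ids)
```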

Please try below SQL:
UPDATE Table1 AS tbl1
INNER JOIN Table2 AS tbl2
ON tbl1.dob = tbl2.dob
AND tbl1.fname = tbl2.fname
AND tbl1.lname = tbl2.lname
LEFT JOIN Table2 AS tbl3
ON tbl3.dob = tbl2.dob
AND tbl3.fname = tbl2.fname
AND tbl3.lname = tbl2.lname
AND tbl3.usermasterid <> tbl2.usermasterid
SET tbl1.usermasterid = tbl2.usermasterid
WHERE LTRIM(RTRIM(tbl1.usermasterid)) = ''
AND tbl3.usermasterid is null
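The second self-join above acts as an anti-join: tbl3 finds a conflicting id for the same person, and the IS NULL test keeps only matches without one. A minimal sketch of that filter as a SELECT, assuming Python's sqlite3 module and hypothetical sample rows, could look like the following. Note that the UPDATE ... JOIN ... SET form is MySQL/Access-style syntax; in SQL Server the same joins would go into an UPDATE ... SET ... FROM statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table1 (fname TEXT, lname TEXT, dob TEXT, usermasterid TEXT);
CREATE TABLE Table2 (fname TEXT, lname TEXT, dob TEXT, usermasterid TEXT);
-- Hypothetical data: John Cook matches two different ids, Kate Witt one.
INSERT INTO Table1 VALUES
    ('John', 'Cook', '2008-07-11', ''),
    ('Kate', 'Witt', '2013-06-05', '');
INSERT INTO Table2 VALUES
    ('John', 'Cook', '2008-07-11', 'U1'),
    ('John', 'Cook', '2008-07-11', 'U2'),
    ('Kate', 'Witt', '2013-06-05', 'U3');
""")

# tbl3 looks for a second Table2 row for the same person with a different id.
# Keeping only rows where no such row exists leaves the unambiguous matches.
safe_matches = conn.execute("""
SELECT tbl1.fname, tbl2.usermasterid
FROM Table1 AS tbl1
INNER JOIN Table2 AS tbl2
    ON tbl1.dob = tbl2.dob
   AND tbl1.fname = tbl2.fname
   AND tbl1.lname = tbl2.lname
LEFT JOIN Table2 AS tbl3
    ON tbl3.dob = tbl2.dob
   AND tbl3.fname = tbl2.fname
   AND tbl3.lname = tbl2.lname
   AND tbl3.usermasterid <> tbl2.usermasterid
WHERE TRIM(tbl1.usermasterid) = ''
  AND tbl3.usermasterid IS NULL
ORDER BY tbl1.fname
""").fetchall()

print(safe_matches)
```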

SQL: Select a column independent of where clause

SELECT TOP 1000 p.Title,p.Distributor, SUM(r.SalesVolume) AS VolumeOfSales,
CAST(SUM(r.CustomerPrice*r.SalesVolume) as decimal (18,0)) AS ValueOfSales,
CAST (AVG(r.CustomerPrice) as decimal (18,1)) AS AvgPrice,
p.MS_ContentType AS category ,Min(c.WeekId) AS ReleaseWeek
from Product p
INNER JOIN RawData r
ON p.ProductId = r.ProductId
INNER JOIN Calendar c
ON r.DayId = c.DayId
WHERE c.WeekId BETWEEN ('20145231') AND ('20145252')
AND p.Distributor IN ('WARNER', 'TF1', 'GAUMONT')
AND p.VODEST IN ('VOD', 'EST')
AND p.ContentFlavor IN ('SD', 'HD', 'NC')
AND p.MS_ExternalID1 IN ('ADVENTURE/ACTION', 'ANIMATION/FAMILY', 'COMEDY')
AND p.MS_ContentType IN ('FILM', 'TV', 'OTHERS')
AND r.CountryId = 1
GROUP BY p.Title,p.Distributor,p.MS_ContentType
ORDER BY VolumeOfSales DESC, ValueOfSales DESC
I want to modify the above query so that only the ReleaseWeek column is independent of the condition WHERE c.WeekId BETWEEN ('20145231') AND ('20145252').
The result that I get looks like:
`Title Distributor VolumeOfSales ValueOfSales AvgPrice category ReleaseWeek
Divergente M6SND 94038 450095 4.0 Film 20145233`
However, what I really want is for ReleaseWeek to be the first value of c.WeekId for that Title in the whole database, not the first one between ('20145231') AND ('20145252'). What is the best way to modify it? Any leads would be greatly appreciated.

LEFT JOIN gets heavy as the number of records in the second table increases

I am trying to run a SELECT query using LEFT JOIN. I get a COUNT from my second table (the table on the right side of the LEFT JOIN). This query becomes slow as the number of records in the second table goes up. My first and second tables have a one-to-many relationship. The second table's CampaignRunId column is a foreign key to the first table's Id. This is a simplified version of my query:
SELECT a.[Id]
,a.CampaignId
,a.[Inserted] AS 'Date'
,COUNT(b.Id) AS 'Received'
FROM [CampaignRun] AS a
LEFT JOIN [CampaignRecipient] AS b
ON a.Id = b.CampaignRunId
GROUP BY
a.[Id], a.CampaignId,a.[Inserted]
HAVING
a.CampaignId = 637
ORDER BY
a.[Inserted] DESC
The number 637 is just an example for one of the records.
Is there a way to make this query run faster?

Use a sub-select to calculate Received:
SELECT a.[Id]
,a.CampaignId
,a.[Inserted] AS 'Date'
, (SELECT COUNT(*) FROM [CampaignRecipient] AS b
WHERE a.Id = b.CampaignRunId ) AS 'Received'
FROM [CampaignRun] AS a
WHERE a.CampaignId = 637
ORDER BY a.[Inserted] DESC

You have an unneeded HAVING clause here. The condition can be moved to the WHERE clause, so that rows are filtered before they are joined and grouped:
SELECT a.[Id]
,a.CampaignId
,a.[Inserted] AS 'Date'
,COUNT(b.Id) AS 'Received'
FROM [CampaignRun] AS a
LEFT JOIN [CampaignRecipient] AS b
ON a.Id = b.CampaignRunId
WHERE a.CampaignId = 637
GROUP BY a.[Id], a.CampaignId,a.[Inserted]
ORDER BY a.[Inserted] DESC
Also ensure that you have an index on the foreign key column CampaignRunId in the [CampaignRecipient] table. It's considered good practice.
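A quick way to convince yourself that moving the condition from HAVING to WHERE does not change the result is the sketch below. It assumes Python's sqlite3 module as a stand-in for SQL Server, hypothetical sample rows, and a hypothetical index name; it checks equivalence only and says nothing about the actual speed-up on your data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CampaignRun (Id INTEGER PRIMARY KEY, CampaignId INTEGER, Inserted TEXT);
CREATE TABLE CampaignRecipient (
    Id INTEGER PRIMARY KEY,
    CampaignRunId INTEGER REFERENCES CampaignRun (Id)
);
-- Index on the foreign key column (the index name is made up for this sketch).
CREATE INDEX IX_CampaignRecipient_CampaignRunId
    ON CampaignRecipient (CampaignRunId);
INSERT INTO CampaignRun VALUES
    (1, 637, '2020-01-01'), (2, 637, '2020-01-02'), (3, 999, '2020-01-03');
INSERT INTO CampaignRecipient VALUES (1, 1), (2, 1), (3, 2), (4, 3);
""")

# Original shape: group every run, then discard other campaigns.
with_having = conn.execute("""
SELECT a.Id, a.CampaignId, a.Inserted, COUNT(b.Id) AS Received
FROM CampaignRun AS a
LEFT JOIN CampaignRecipient AS b ON a.Id = b.CampaignRunId
GROUP BY a.Id, a.CampaignId, a.Inserted
HAVING a.CampaignId = 637
ORDER BY a.Inserted DESC
""").fetchall()

# Suggested shape: filter first, then join and group only the matching runs.
with_where = conn.execute("""
SELECT a.Id, a.CampaignId, a.Inserted, COUNT(b.Id) AS Received
FROM CampaignRun AS a
LEFT JOIN CampaignRecipient AS b ON a.Id = b.CampaignRunId
WHERE a.CampaignId = 637
GROUP BY a.Id, a.CampaignId, a.Inserted
ORDER BY a.Inserted DESC
""").fetchall()

print(with_having)
print(with_where)
```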

SQL Select random from multiple table and order by specific criteria on one table

I need to select a random record from 3 tables and ensure I am ordering by photoOrder:
Select TOP 1(a.id), a.mls_number, a.parcel_name, a.property_type, a.ownership_type, b.filename, b.photoOrder, c.county_Name
From property as a
Inner JOIN
listingPhotos as b on a.id = b.ListingID
LEFT JOIN
counties as C on a.county_name = c.id
WHERE a.isCommercial = 'True'
Order By NEWID()
This query works, but I need to ensure that the b.filename record is chosen according to b.photoOrder, so that b.photoOrder is always 1.
The b table (listingPhotos) has multiple photo files per property, and I need to select only the photo that is first in the photo order.
Thanks

You could subquery your listingPhotos table and limit to WHERE PhotoOrder = 1:
Select TOP 1(a.id), a.mls_number, a.parcel_name, a.property_type, a.ownership_type, b.filename, b.photoOrder, c.county_Name
From property as a
Inner JOIN
(SELECT ListingID , filename, PhotoOrder FROM listingPhotos WHERE PhotoORder = 1
) as b on a.id = b.ListingID
LEFT JOIN
counties as C on a.county_name = c.id
WHERE a.isCommercial = 'True'
Order By NEWID()
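The effect of the derived table can be checked with a minimal sketch. It assumes Python's sqlite3 module, where ORDER BY RANDOM() LIMIT 1 stands in for SQL Server's TOP 1 ... ORDER BY NEWID(), uses a trimmed-down column list with hypothetical sample rows, and leaves out the counties join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE property (id INTEGER PRIMARY KEY, parcel_name TEXT, isCommercial TEXT);
CREATE TABLE listingPhotos (ListingID INTEGER, filename TEXT, photoOrder INTEGER);
INSERT INTO property VALUES
    (1, 'Lot A', 'True'), (2, 'Lot B', 'True'), (3, 'Lot C', 'False');
INSERT INTO listingPhotos VALUES
    (1, 'a1.jpg', 1), (1, 'a2.jpg', 2),
    (2, 'b1.jpg', 1), (2, 'b2.jpg', 2),
    (3, 'c1.jpg', 1);
""")

QUERY = """
SELECT a.id, a.parcel_name, b.filename, b.photoOrder
FROM property AS a
INNER JOIN (
    -- Keep only the first photo of each listing before joining.
    SELECT ListingID, filename, photoOrder
    FROM listingPhotos
    WHERE photoOrder = 1
) AS b ON a.id = b.ListingID
WHERE a.isCommercial = 'True'
ORDER BY RANDOM()   -- SQLite stand-in for ORDER BY NEWID()
LIMIT 1             -- SQLite stand-in for TOP 1
"""

def pick_random_listing():
    """Return one random commercial property with its first photo."""
    return conn.execute(QUERY).fetchone()

picks = [pick_random_listing() for _ in range(20)]
print(picks[0])
```

Whichever property is picked, the photo row always has photoOrder 1, because the other photos never reach the join.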
