Select distinct rows from multiple column query - sql-server

I have this query that currently returns all the pertinent information I need, but I need it to only return unique DISPLAYCLAIMNUMBER rows. I have tried throwing a DISTINCT before it, but that gives me an error for invalid syntax. I'm pretty confused on how to put DISTINCT on something that is in the middle of my SELECT. The order of my query is also important, because I'm sending the data to an Excel table. I'm sure this is an extremely inefficient, but the database is very fragmented, and I have to refer to several tables, just for one piece of data. Feel free to improve, if there is a way to.
SELECT cd.NAMECREFID, clm.CONTACTNAME, cd.ASSIGNEDPOLICY, clm.DISPLAYCLAIMNUMBER, clm.CAUSEOFLOSS, scp.PERILCODE, scp.DESCRIBE, clm.ASSIGNEDDATETIME, clm.CLOSEDATETIME, clp.PAYMENT
FROM CLMSTAT clm
inner join SCPERTBL scp on clm.ASSIGNEDPERILNUMBER = scp.PERILCODE
inner join CLMPYMT clp on clm.ASSIGNEDCLAIMNUMBER = clp.ASSIGNEDCLAIMNUMBER
inner join SNAMES sn on clm.CONTACTNAME = sn.NAME
inner join TCUSDTL cd on sn.NAMEID = cd.NAMEID
WHERE clm.CONTACTNAME Like '%Smith%'
ORDER BY clm.CONTACTNAME, clm.ASSIGNEDDATETIME DESC

You should simply try this
SELECT
DISTINCT -- at the start
cd.NAMECREFID, clm.CONTACTNAME, cd.ASSIGNEDPOLICY, clm.DISPLAYCLAIMNUMBER, clm.CAUSEOFLOSS, scp.PERILCODE, scp.DESCRIBE, clm.ASSIGNEDDATETIME, clm.CLOSEDATETIME, clp.PAYMENT
FROM CLMSTAT clm
inner join SCPERTBL scp on clm.ASSIGNEDPERILNUMBER = scp.PERILCODE
inner join CLMPYMT clp on clm.ASSIGNEDCLAIMNUMBER = clp.ASSIGNEDCLAIMNUMBER
inner join SNAMES sn on clm.CONTACTNAME = sn.NAME
inner join TCUSDTL cd on sn.NAMEID = cd.NAMEID
WHERE clm.CONTACTNAME Like '%Smith%'
ORDER BY clm.CONTACTNAME, clm.ASSIGNEDDATETIME DESC
This will give you a unique combination of all columns in the SELECT list.
However if this is not what you desire, and you simply want unique values for clm.DISPLAYCLAIMNUMBER only, I'd ask why do you need rest? If you don't then your query's SELECT list should be like below
SELECT DISTINCT clm.DISPLAYCLAIMNUMBER
FROM CLMSTAT clm
...

SELECT cd.NAMECREFID, clm.CONTACTNAME, cd.ASSIGNEDPOLICY, clm.DISPLAYCLAIMNUMBER, clm.CAUSEOFLOSS, scp.PERILCODE, scp.DESCRIBE, clm.ASSIGNEDDATETIME, clm.CLOSEDATETIME, clp.PAYMENT
FROM CLMSTAT clm
inner join SCPERTBL scp on clm.ASSIGNEDPERILNUMBER = scp.PERILCODE
inner join CLMPYMT clp on clm.ASSIGNEDCLAIMNUMBER = clp.ASSIGNEDCLAIMNUMBER
inner join SNAMES sn on clm.CONTACTNAME = sn.NAME
inner join TCUSDTL cd on sn.NAMEID = cd.NAMEID
WHERE clm.CONTACTNAME Like '%Smith%'
GROUP BY clm.DISPLAYCLAIMNUMBER
ORDER BY clm.CONTACTNAME, clm.ASSIGNEDDATETIME DESC

Related

SQL Server - using Over/Partition By in complex query

We have many complex queries which involve a lot of columns and joins (see the example below) that are implemented as views.
In some cases these queries return duplicate rows which then have to be programmatically removed by the consuming app. Therefore, we would like to enhance the SQL query to eliminate the duplicates, and speed up the retrieval process.
I know that I can use OVER / PARTITION BY logic to do this, but I am not sure of how to modify the queries to obtain a working syntax.
Here is an example:
SELECT
Main.MfgOrder.OrderNumber,
Main.MfgOrder.DesignBOMID,
Main.Design_Plant.PlantID,
Main.MfgOrder_Operation.OrderOpID,
Main.MfgOrder_Operation.DesignOpID,
Main.MfgOrder_Operation.OpSeq,
Main.MfgOrder_Operation.Description,
Main.MfgOrder_Operation.CompletionStatus,
Main.MfgOrder__Shift.OrderShiftID,
Main.MfgOrder__Shift.WorkCenterMachineID,
Main.MfgOrder___Event.OrderEventID,
Main.MfgOrder____Reel.OrderReelID,
Main.MfgOrder____Reel.ReelNumber,
Main.MfgOrder____Reel.Location,
Main.MfgOrder____Reel.Test_Status AS Test_Status_Reel,
Main.MfgOrder____Reel.Test_Disposition AS Test_Disposition_Reel,
Main.MfgOrder____Reel.LabReleased,
Main.MfgOrder____Reel.ShipReelsBypassSet,
Main.MfgOrder_____Length.OrderLengthID,
Main.MfgOrder_____Length.LengthType,
Main.MfgOrder_____Length.LocationOnReel,
Main.MfgOrder_____Length.LocationOnLength,
Main.MfgOrder_____Length.TrialNumber,
Main.MfgOrder_____Length.SampleNumber,
Main.MfgOrder_____Length.PrintNumber,
Main.MfgOrder_____Length.Test_Status AS Test_Status_Length,
Main.MfgOrder_____Length.Test_Category,
Main.MfgOrder_____Length.Test_Disposition AS Test_Disposition_Length,
Main.MfgOrder_____Length.SampleSubmittedBy,
Main.MfgOrder_____Length.SampleSubmittedDate,
Main.MfgOrder_____Length.BypassTesting,
Main.MfgOrder_____Length_OperatorQty.Sample1Destination,
Main.MfgOrder_____Length_OperatorQty.Sample2Destination,
Main.MfgOrder_____Length_OperatorQty.Sample3Destination,
Main.MfgOrder______Component.OrderComponentID,
Main.MfgOrder______Component.DesignComponentID,
Main.MfgOrder______Component.ItemNo,
Main.MfgOrder_______Test.LabTestID,
Main.MfgOrder_______Test.OrderTestID,
Main.MfgOrder_______Test.TestComplete,
Main.MfgOrder_______Test.TestStatus,
Main.MfgOrder________Marker2.OrderMarkerID,
Master.Color.ColorName,
Master.LabTest.ExcludeFromPassFail,
CASE
WHEN Main.Design_Component.Component_Label IS NULL
THEN 'Unknown'
ELSE Main.Design_Component.Component_Label
END AS Component_Label
FROM
Main.MfgOrder
INNER JOIN
Main.Design__BOM ON Main.MfgOrder.DesignBOMID = Main.Design__BOM.DesignBOMID
INNER JOIN
Main.Design_Plant ON Main.Design__BOM.DesignPlantID = Main.Design_Plant.DesignPlantID
INNER JOIN
Main.MfgOrder_Operation ON Main.MfgOrder.OrderNumber = Main.MfgOrder_Operation.OrderNumber
INNER JOIN
Main.MfgOrder__Shift ON Main.MfgOrder_Operation.OrderOpID = Main.MfgOrder__Shift.OrderOpID
INNER JOIN
Main.MfgOrder___Event ON Main.MfgOrder__Shift.OrderShiftID = Main.MfgOrder___Event.OrderShiftID
INNER JOIN
Main.MfgOrder____Reel ON Main.MfgOrder___Event.OrderEventID = Main.MfgOrder____Reel.OrderEventID
INNER JOIN
Main.MfgOrder_____Length ON Main.MfgOrder____Reel.OrderReelID = Main.MfgOrder_____Length.OrderReelID
LEFT OUTER JOIN
Main.MfgOrder______Component ON Main.MfgOrder_____Length.OrderLengthID = Main.MfgOrder______Component.OrderLengthID
LEFT OUTER JOIN
Main.MfgOrder_______Test ON Main.MfgOrder______Component.OrderComponentID = Main.MfgOrder_______Test.OrderComponentID
LEFT OUTER JOIN
Main.MfgOrder________Marker2 ON Main.MfgOrder_______Test.OrderTestID = Main.MfgOrder________Marker2.OrderTestID
LEFT OUTER JOIN
Main.Design_Component ON Main.MfgOrder______Component.DesignComponentID = Main.Design_Component.DesignComponentID
LEFT OUTER JOIN
Master.Color ON Main.MfgOrder______Component.TapeColorID = Master.Color.ColorNumber
LEFT OUTER JOIN
Master.LabTest ON Main.MfgOrder_______Test.LabTestID = Master.LabTest.LabTestID
LEFT OUTER JOIN
Main.MfgOrder_____Length_OperatorQty ON Main.MfgOrder______Component.OrderLengthID = Main.MfgOrder_____Length_OperatorQty.OrderLengthID
you can use row_number as below: Below query will not select duplicate only on OrderNumber, if you need to add other columns you add accordingly
Select * from (
Select
RowN = Row_Number() over( partition by Main.MfgOrder.OrderNumber order by Main.MfgOrder.OrderNumber),
--- All your select columns and all your query with joins
) a
Where a.RowN = 1
Is the entire row duplicated exactly? If so then just add DISTINCT
SELECT DISTINCT
...
FROM
...
If you are getting duplicate rows where most columns the same but some columns are different then GROUP BY the columns that are the same and select MIN(column_name) for the ones that are causing the extra rows to appear.

groupby in view of sql returns aggregate error

I am trying to create a view with this query as you can see here:
SELECT dbo.Lines.LineNumber, dbo.Lines.DocumentNumber, dbo.Joints.JointNumber, dbo.Joints.JointSize, dbo.Joints.ShopField, dbo.Joints.WPS, dbo.WeldDetails.StateStep2 AS WeldState, dbo.Welds.WeldNumber,
dbo.FitUps.FitUpNumber, MAX(dbo.WeldDetails.Id) AS WeldDetailId, MAX(dbo.FitUpDetails.Id) AS FitupDetailId, dbo.Joints.Id AS JointId, dbo.Ends.Name, dbo.Joints.THK, dbo.FitUpDetails.StateStep2 AS FitupState,
dbo.Joints.Revision, dbo.Joints.Note
FROM dbo.FitUps INNER JOIN
dbo.Welds INNER JOIN
dbo.Joints INNER JOIN
dbo.WeldDetails ON dbo.Joints.Id = dbo.WeldDetails.JointId INNER JOIN
dbo.FitUpDetails ON dbo.Joints.Id = dbo.FitUpDetails.JointId ON dbo.Welds.Id = dbo.WeldDetails.WeldId ON dbo.FitUps.Id = dbo.FitUpDetails.FitUpId INNER JOIN
dbo.Lines ON dbo.Joints.LineId = dbo.Lines.Id INNER JOIN
dbo.Ends ON dbo.Joints.EndId = dbo.Ends.Id
GROUP BY dbo.Joints.Id
But when i want to save the view i get this error :
Here is a part of my data :
Every joint id can have multi fitupdetailid and welddetailid in my view i want just show the maximum value of fitupdetailid and welddetailid of my joint.
I rewrote your query with a more readable join structure than what your GUI spit out. This should run for you and fix your error. Whether the results are what you want or not depends on your data. You may also want to re-order the grouping to group how you want, hierarchically. But all of those columns will need to be in the grouping in one form or another.
SELECT
dbo.Lines.LineNumber,
dbo.Lines.DocumentNumber,
dbo.Joints.JointNumber,
dbo.Joints.JointSize,
dbo.Joints.ShopField,
dbo.Joints.WPS,
dbo.WeldDetails.StateStep2 AS WeldState,
dbo.Welds.WeldNumber,
dbo.FitUps.FitUpNumber,
MAX(dbo.WeldDetails.Id) AS WeldDetailId,
MAX(dbo.FitUpDetails.Id) AS FitupDetailId,
dbo.Joints.Id AS JointId,
dbo.Ends.Name,
dbo.Joints.THK,
dbo.FitUpDetails.StateStep2 AS FitupState,
dbo.Joints.Revision,
dbo.Joints.Note
FROM
dbo.FitUps
INNER JOIN dbo.FitUpDetails ON dbo.FitUps.Id = dbo.FitUpDetails.FitUpId
INNER JOIN dbo.Joints ON dbo.Joints.Id = dbo.FitUpDetails.JointId
INNER JOIN dbo.WeldDetails ON dbo.Joints.Id = dbo.WeldDetails.JointId
INNER JOIN dbo.Welds ON dbo.Welds.Id = dbo.WeldDetails.WeldId
INNER JOIN dbo.Lines ON dbo.Joints.LineId = dbo.Lines.Id
INNER JOIN dbo.Ends ON dbo.Joints.EndId = dbo.Ends.Id
GROUP BY
dbo.Lines.LineNumber,
dbo.Lines.DocumentNumber,
dbo.Joints.JointNumber,
dbo.Joints.JointSize,
dbo.Joints.ShopField,
dbo.Joints.WPS,
dbo.WeldDetails.StateStep2,
dbo.Welds.WeldNumber,
dbo.FitUps.FitUpNumber,
dbo.Joints.Id,
dbo.Ends.Name,
dbo.Joints.THK,
dbo.FitUpDetails.StateStep2,
dbo.Joints.Revision,
dbo.Joints.Note

MS SQL Table Joins - Multiple Tables

I am new to MS SQL and am having trouble joining 4 tables within a query.
I am trying to join Orders, Order Lines, Client, and Picked tables to create a query to show quantity ordered and picked for a client. If I comment out the last inner join for Picked, I get the correct results. When I include the inner join for Picked the query returns results but data that should be in the Picked fields is NULL. One order line can have 1 or more Picked lines.
SELECT W_Warehouse, OH.OrderID, OH.RequiredDate, C.Client, OL.LineNbr, OL.QtyOrd, P.QtyPick
FROM Order
INNER JOIN Warehouse on Order.OH_WHS = Warehouse.W_PK
INNER JOIN Client on Order.O_Client = Client.C_PK
INNER JOIN OrderLine on Order.O_PK = OrderLine.OL_PK
INNER JOIN Picked on OrderLine.O_PK = Picked.P_PK
WHERE C.CLIENT = 'WENDYS'
Without knowing the data in the tables it is difficult to answer precisely.
But as you say you have 1+ rows in the Picked table, you probably want to do aggregation with GROUP BY and SUM()
Maybe this is what you're looking for:
SELECT
W.W_Warehouse,
OH.OrderID,
OH.RequiredDate,
C.Client,
OL.LineNbr,
OL.QtyOrd,
P.QtyPick
FROM
Order OH
INNER JOIN Warehouse W on OH.OH_WHS = W.W_PK
INNER JOIN Client C on OH.O_Client = C.C_PK
INNER JOIN OrderLine OL on OH.O_PK = OL.OL_PK
CROSS APPLY (
select sum(QtyPick) as QtyPick
from Picked P
where OL.O_PK = P.P_PK
) P
WHERE
C.CLIENT = 'WENDYS'
It calculates the sum of QtyPick separately so it doesn't increase the number of lines in the result.

Group By not working in SQL Server

CREATE VIEW [dbo].[Payment_Transaction_vw]
AS
SELECT payment_trans_id,
Student_Info.student_fname,
Student_Info.student_lname,
Student_Info.ID_Number,
Trimester_Payment.deadline,
Transaction_Info.trans_name,
Payment_Transaction.amount,
Payment_Transaction.date_paid
FROM [Payment_Transaction]
INNER JOIN Student_Info
ON Payment_Transaction.student_info_id = Student_Info.student_info_id
INNER JOIN Trimester_Payment
ON Payment_Transaction.trimester_id = Trimester_Payment.trimester_id
INNER JOIN Transaction_Info
ON Payment_Transaction.trans_info_id = Transaction_Info.trans_info_id
GROUP BY ID_Number,trans_name;
That is my script to make a view in sql server in visual studio, I wanted to group the ID_Number & trans_name which have a repeating values in the table Payment_Transactions. I wanted that this ID_Number with the trans_name will only displayed once. I also want to sum up the amount paid for every ID_number with the same trans_name.
When using group by you want to make sure that any unique value will be aggregated or consolidated so that it can be displayed in one row. As it is right now, payment_trans_id (and others) are still unique and since you chose to display these the group by cannot be done.
What do you want to do with payment_trans_id, date_paid, amount ... all other columns really?
Example using MAX(), MIN() and AVG():
SELECT
MAX(payment_trans_id) AS payment_trans_id,
Transaction_Info.trans_name,
Student_Info.ID_Number,
AVG(Payment_Transaction.amount) AS amount,
MIN(Payment_Transaction.date_paid) AS date_paid
FROM [Payment_Transaction]
INNER JOIN Student_Info ON Payment_Transaction.student_info_id = Student_Info.student_info_id
INNER JOIN Trimester_Payment ON Payment_Transaction.trimester_id = Trimester_Payment.trimester_id
INNER JOIN Transaction_Info ON Payment_Transaction.trans_info_id = Transaction_Info.trans_info_id
GROUP BY ID_Number, trans_name;
For support in EF, perhaps this will be sufficient (perhaps not):
SELECT
ISNULL(MAX(payment_trans_id),0) AS Id,
Transaction_Info.trans_name,
Student_Info.ID_Number,
AVG(Payment_Transaction.amount) AS amount,
MIN(Payment_Transaction.date_paid) AS date_paid
FROM [Payment_Transaction]
INNER JOIN Student_Info ON Payment_Transaction.student_info_id = Student_Info.student_info_id
INNER JOIN Trimester_Payment ON Payment_Transaction.trimester_id = Trimester_Payment.trimester_id
INNER JOIN Transaction_Info ON Payment_Transaction.trans_info_id = Transaction_Info.trans_info_id
GROUP BY ID_Number, trans_name;
You have to aggregate columns that are not in group by clause.
As example, which one of date_paid (for payment_trans_id = 1 and 2) you want to return? 6/25/2015 or 5/6/2015? SQL server cant know, so you get multiple rows.

TSQL Inner select using outer join

I have a query that is working for the most part until I had to add the inner select for "Trainers".
As you can see in the code below, I am trying to get all of the trainers for each of the segment ID's.
I am getting an error on the first inner selects where clause WHERE trn.segmentID = tes.teSegmentID saying that tes.teSegmentID is not defined.
Is there another way to approach this query in order to get the trainers like I am trying to accomplish?
SELECT *,
(SELECT e2.[FirstName] AS trainerFirst,
e2.[LastName] AS trainerLast
FROM BS_Training_Trainers AS trn
LEFT OUTER JOIN
employeeTable AS e2
ON trn.trainerEmpID = e2.EmpID
WHERE trn.segmentID = tes.teSegmentID
FOR XML PATH ('trainer'), TYPE, ELEMENTS, ROOT ('trainers'))
FROM dbo.BS_TrainingEvents AS a
WHERE a.trainingEventID IN (SELECT tes.trainingEventID
FROM dbo.BS_TrainingEvent_Segments AS tes
INNER JOIN
dbo.BS_TrainingEvent_SegmentDetails AS tesd
ON tesd.segmentID = tes.teSegmentID
INNER JOIN
dbo.BS_LocaleCodes AS locale
ON locale.localeID = tesd.localeID
WHERE locale.location = 'Baltimore');
It seems like you're taking the scenic route towards this:
SELECT a.*,
X.[FirstName],
X.[LastName]
FROM dbo.BS_TrainingEvents AS a
LEFT OUTER JOIN (SELECT e2.[FirstName], e2.[LastName], locale.location FROM dbo.BS_TrainingEvent_Segments AS tes
INNER JOIN dbo.BS_Training_Trainers AS trn ON trn.segmentID = tes.teSegmentID
INNER JOIN dbo.BS_TrainingEvent_SegmentDetails AS tesd ON tesd.segmentID = tes.teSegmentID
INNER JOIN dbo.BS_LocaleCodes AS locale ON locale.localeID = tesd.localeID
LEFT OUTER JOIN employeeTable AS e2 ON trn.trainerEmpID = e2.EmpID) AS X ON a.trainingEventID = X.trainingEventID
WHERE X.location = 'Baltimore';
Not sure if I got all those joins right, it was hard to decode from all the nesting you have going on.
If I have guessed table relationships from their names correctly, the only way to solve this is to reference the same filtering condition twice: first, in the XML generation part, and second in the outer level of the query:
with cte as (
select distinct tes.trainingEventID, tes.teSegmentID
from dbo.BS_TrainingEvent_Segments AS tes
INNER JOIN dbo.BS_TrainingEvent_SegmentDetails AS tesd ON tesd.segmentID = tes.teSegmentID
INNER JOIN dbo.BS_LocaleCodes AS locale ON locale.localeID = tesd.localeID
WHERE locale.location = 'Baltimore'
)
SELECT a.*, (
SELECT e2.[FirstName] AS trainerFirst, e2.[LastName] AS trainerLast
FROM BS_Training_Trainers AS trn
LEFT OUTER JOIN employeeTable AS e2 ON trn.trainerEmpID = e2.EmpID
inner join cte c on trn.segmentID = c.teSegmentID
FOR XML PATH ('trainer'), TYPE, ELEMENTS, ROOT ('trainers')
)
FROM dbo.BS_TrainingEvents AS a
where exists (select 0 from cte c where c.testrainingEventID = a.trainingEventID);
It's difficult to tell whether this is completely correct, of course, but I hope you get the idea.
Oh yes, and if you would have an event with multiple Baltimore segments, you will never be able to tell which trainer takes which one. But you can always add more data into XML to resolve this.

Resources