Group By not working in SQL Server - sql-server

CREATE VIEW [dbo].[Payment_Transaction_vw]
AS
SELECT payment_trans_id,
Student_Info.student_fname,
Student_Info.student_lname,
Student_Info.ID_Number,
Trimester_Payment.deadline,
Transaction_Info.trans_name,
Payment_Transaction.amount,
Payment_Transaction.date_paid
FROM [Payment_Transaction]
INNER JOIN Student_Info
ON Payment_Transaction.student_info_id = Student_Info.student_info_id
INNER JOIN Trimester_Payment
ON Payment_Transaction.trimester_id = Trimester_Payment.trimester_id
INNER JOIN Transaction_Info
ON Payment_Transaction.trans_info_id = Transaction_Info.trans_info_id
GROUP BY ID_Number,trans_name;
That is my script to make a view in sql server in visual studio, I wanted to group the ID_Number & trans_name which have a repeating values in the table Payment_Transactions. I wanted that this ID_Number with the trans_name will only displayed once. I also want to sum up the amount paid for every ID_number with the same trans_name.

When using group by you want to make sure that any unique value will be aggregated or consolidated so that it can be displayed in one row. As it is right now, payment_trans_id (and others) are still unique and since you chose to display these the group by cannot be done.
What do you want to do with payment_trans_id, date_paid, amount ... all other columns really?
Example using MAX(), MIN() and AVG():
SELECT
MAX(payment_trans_id) AS payment_trans_id,
Transaction_Info.trans_name,
Student_Info.ID_Number,
AVG(Payment_Transaction.amount) AS amount,
MIN(Payment_Transaction.date_paid) AS date_paid
FROM [Payment_Transaction]
INNER JOIN Student_Info ON Payment_Transaction.student_info_id = Student_Info.student_info_id
INNER JOIN Trimester_Payment ON Payment_Transaction.trimester_id = Trimester_Payment.trimester_id
INNER JOIN Transaction_Info ON Payment_Transaction.trans_info_id = Transaction_Info.trans_info_id
GROUP BY ID_Number, trans_name;
For support in EF, perhaps this will be sufficient (perhaps not):
SELECT
ISNULL(MAX(payment_trans_id),0) AS Id,
Transaction_Info.trans_name,
Student_Info.ID_Number,
AVG(Payment_Transaction.amount) AS amount,
MIN(Payment_Transaction.date_paid) AS date_paid
FROM [Payment_Transaction]
INNER JOIN Student_Info ON Payment_Transaction.student_info_id = Student_Info.student_info_id
INNER JOIN Trimester_Payment ON Payment_Transaction.trimester_id = Trimester_Payment.trimester_id
INNER JOIN Transaction_Info ON Payment_Transaction.trans_info_id = Transaction_Info.trans_info_id
GROUP BY ID_Number, trans_name;

You have to aggregate columns that are not in group by clause.
As example, which one of date_paid (for payment_trans_id = 1 and 2) you want to return? 6/25/2015 or 5/6/2015? SQL server cant know, so you get multiple rows.

Related

Join 4 tables and sum quantity for 2 tables using id from one table

My tables:
Order is:
PurchaseOrderHead
PurchaseOrder
ReceivingNoteHead
ReceivingNote
I want the output like this
MaterialID, PO.Quantity, RN.Quantity so far
There can be multiple receiving notes for a given purchaseorderhead_id as every ReceivingNoteHead will have a PurchaseOrderHeadID.
My attempt:
select
PurchaseOrder.MaterialID,
sum(distinct PurchaseOrder.Quantity) as "Sum_Quantity",
sum(ReceivingNote.Quantity) as "ReceivingNote_Quantity",
PurchaseOrderHead.id
from
(((dbo.PurchaseOrder
inner join
dbo.PurchaseOrderHead on (PurchaseOrderHead.id = PurchaseOrder.PurchaseOrderHeadID))
left outer join
dbo.ReceivingNoteHead ReceivingNoteHead (ReceivingNoteHead.PurchaseOrderHeadID = PurchaseOrderHead.id))
left outer join
dbo.ReceivingNote on (ReceivingNote.ReceivingNoteHeadID = ReceivingNoteHead.id))
group by
PurchaseOrder.MaterialID,
PurchaseOrderHead.id
having
(PurchaseOrderHead.id = 1004)
But ReceivingNote Quantities are repeated when there's no ReceivingNote MaterialID that matches PurchaseOrder's MaterialID.
This also does not work when theres multiple same MaterialID in either PurchaseOrder or ReceivingNote
I would like to learn whether I need to break the ReceivingNote table into 2 tables because of PurchaseOrderHeadID? And I want to get rid of the sum distinct because it's not the way I want it to be.
Maybe by first aggregating the material purchases in a sub-query.
Then left join that to the materials on the receiving end.
Untested notepad scribble:
SELECT
poMat.MaterialID,
poMat.TotQuantity AS [PurchaseOrder_Quantity],
SUM(rn.Quantity) AS [ReceivingNote_Quantity],
poMat.PurchaseOrderHeadID
FROM
(
SELECT
po.PurchaseOrderHeadID,
po.MaterialID,
SUM(po.Quantity) AS TotQuantity
FROM dbo.PurchaseOrder po
-- Uncomment to filter on the PurchaseOrderHeadID
-- WHERE po.PurchaseOrderHeadID = 1004
GROUP BY
po.PurchaseOrderHeadID,
po.MaterialID
) poMat
LEFT JOIN dbo.ReceivingNoteHead rnH
ON rnH.PurchaseOrderHeadID = poMat.PurchaseOrderHeadID
LEFT JOIN dbo.ReceivingNote rn
ON rn.ReceivingNoteHeadID = rnH.id
AND rn.MaterialID = poMat.MaterialID
GROUP BY
poMat.PurchaseOrderHeadID,
poMat.MaterialID,
poMat.TotQuantity
ORDER BY
poMat.PurchaseOrderHeadID,
poMat.MaterialID;
This however, won't show received materials that don't have a matching purchased material.
You are getting duplicate because the table ReceivingNoteHead does not have the PurchaseOrder.ID in it. Add the column PurchaseOrderID in ReceivingNoteHead and you should be good to go
select
PurchaseOrder.MaterialID,
sum(PurchaseOrder.Quantity) as "Sum_Quantity",
sum(ReceivingNote.Quantity) as "ReceivingNote_Quantity",
PurchaseOrderHead.id
from
dbo.PurchaseOrder
inner join
dbo.PurchaseOrderHead on PurchaseOrderHead.id = PurchaseOrder.PurchaseOrderHeadID
left outer join
dbo.ReceivingNoteHead ReceivingNoteHead ReceivingNoteHead.PurchaseOrderHeadID = PurchaseOrderHead.id *and ReceivingNoteHead.PurchaseOrderID=PurchaseOrder.ID*
left outer join
dbo.ReceivingNote on ReceivingNote.ReceivingNoteHeadID = ReceivingNoteHead.id
group by
PurchaseOrder.MaterialID,
PurchaseOrderHead.id
having
PurchaseOrderHead.id = 1004

SQL Server - using Over/Partition By in complex query

We have many complex queries which involve a lot of columns and joins (see the example below) that are implemented as views.
In some cases these queries return duplicate rows which then have to be programmatically removed by the consuming app. Therefore, we would like to enhance the SQL query to eliminate the duplicates, and speed up the retrieval process.
I know that I can use OVER / PARTITION BY logic to do this, but I am not sure of how to modify the queries to obtain a working syntax.
Here is an example:
SELECT
Main.MfgOrder.OrderNumber,
Main.MfgOrder.DesignBOMID,
Main.Design_Plant.PlantID,
Main.MfgOrder_Operation.OrderOpID,
Main.MfgOrder_Operation.DesignOpID,
Main.MfgOrder_Operation.OpSeq,
Main.MfgOrder_Operation.Description,
Main.MfgOrder_Operation.CompletionStatus,
Main.MfgOrder__Shift.OrderShiftID,
Main.MfgOrder__Shift.WorkCenterMachineID,
Main.MfgOrder___Event.OrderEventID,
Main.MfgOrder____Reel.OrderReelID,
Main.MfgOrder____Reel.ReelNumber,
Main.MfgOrder____Reel.Location,
Main.MfgOrder____Reel.Test_Status AS Test_Status_Reel,
Main.MfgOrder____Reel.Test_Disposition AS Test_Disposition_Reel,
Main.MfgOrder____Reel.LabReleased,
Main.MfgOrder____Reel.ShipReelsBypassSet,
Main.MfgOrder_____Length.OrderLengthID,
Main.MfgOrder_____Length.LengthType,
Main.MfgOrder_____Length.LocationOnReel,
Main.MfgOrder_____Length.LocationOnLength,
Main.MfgOrder_____Length.TrialNumber,
Main.MfgOrder_____Length.SampleNumber,
Main.MfgOrder_____Length.PrintNumber,
Main.MfgOrder_____Length.Test_Status AS Test_Status_Length,
Main.MfgOrder_____Length.Test_Category,
Main.MfgOrder_____Length.Test_Disposition AS Test_Disposition_Length,
Main.MfgOrder_____Length.SampleSubmittedBy,
Main.MfgOrder_____Length.SampleSubmittedDate,
Main.MfgOrder_____Length.BypassTesting,
Main.MfgOrder_____Length_OperatorQty.Sample1Destination,
Main.MfgOrder_____Length_OperatorQty.Sample2Destination,
Main.MfgOrder_____Length_OperatorQty.Sample3Destination,
Main.MfgOrder______Component.OrderComponentID,
Main.MfgOrder______Component.DesignComponentID,
Main.MfgOrder______Component.ItemNo,
Main.MfgOrder_______Test.LabTestID,
Main.MfgOrder_______Test.OrderTestID,
Main.MfgOrder_______Test.TestComplete,
Main.MfgOrder_______Test.TestStatus,
Main.MfgOrder________Marker2.OrderMarkerID,
Master.Color.ColorName,
Master.LabTest.ExcludeFromPassFail,
CASE
WHEN Main.Design_Component.Component_Label IS NULL
THEN 'Unknown'
ELSE Main.Design_Component.Component_Label
END AS Component_Label
FROM
Main.MfgOrder
INNER JOIN
Main.Design__BOM ON Main.MfgOrder.DesignBOMID = Main.Design__BOM.DesignBOMID
INNER JOIN
Main.Design_Plant ON Main.Design__BOM.DesignPlantID = Main.Design_Plant.DesignPlantID
INNER JOIN
Main.MfgOrder_Operation ON Main.MfgOrder.OrderNumber = Main.MfgOrder_Operation.OrderNumber
INNER JOIN
Main.MfgOrder__Shift ON Main.MfgOrder_Operation.OrderOpID = Main.MfgOrder__Shift.OrderOpID
INNER JOIN
Main.MfgOrder___Event ON Main.MfgOrder__Shift.OrderShiftID = Main.MfgOrder___Event.OrderShiftID
INNER JOIN
Main.MfgOrder____Reel ON Main.MfgOrder___Event.OrderEventID = Main.MfgOrder____Reel.OrderEventID
INNER JOIN
Main.MfgOrder_____Length ON Main.MfgOrder____Reel.OrderReelID = Main.MfgOrder_____Length.OrderReelID
LEFT OUTER JOIN
Main.MfgOrder______Component ON Main.MfgOrder_____Length.OrderLengthID = Main.MfgOrder______Component.OrderLengthID
LEFT OUTER JOIN
Main.MfgOrder_______Test ON Main.MfgOrder______Component.OrderComponentID = Main.MfgOrder_______Test.OrderComponentID
LEFT OUTER JOIN
Main.MfgOrder________Marker2 ON Main.MfgOrder_______Test.OrderTestID = Main.MfgOrder________Marker2.OrderTestID
LEFT OUTER JOIN
Main.Design_Component ON Main.MfgOrder______Component.DesignComponentID = Main.Design_Component.DesignComponentID
LEFT OUTER JOIN
Master.Color ON Main.MfgOrder______Component.TapeColorID = Master.Color.ColorNumber
LEFT OUTER JOIN
Master.LabTest ON Main.MfgOrder_______Test.LabTestID = Master.LabTest.LabTestID
LEFT OUTER JOIN
Main.MfgOrder_____Length_OperatorQty ON Main.MfgOrder______Component.OrderLengthID = Main.MfgOrder_____Length_OperatorQty.OrderLengthID
you can use row_number as below: Below query will not select duplicate only on OrderNumber, if you need to add other columns you add accordingly
Select * from (
Select
RowN = Row_Number() over( partition by Main.MfgOrder.OrderNumber order by Main.MfgOrder.OrderNumber),
--- All your select columns and all your query with joins
) a
Where a.RowN = 1
Is the entire row duplicated exactly? If so then just add DISTINCT
SELECT DISTINCT
...
FROM
...
If you are getting duplicate rows where most columns the same but some columns are different then GROUP BY the columns that are the same and select MIN(column_name) for the ones that are causing the extra rows to appear.

MS SQL Table Joins - Multiple Tables

I am new to MS SQL and am having trouble joining 4 tables within a query.
I am trying to join Orders, Order Lines, Client, and Picked tables to create a query to show quantity ordered and picked for a client. If I comment out the last inner join for Picked, I get the correct results. When I include the inner join for Picked the query returns results but data that should be in the Picked fields is NULL. One order line can have 1 or more Picked lines.
SELECT W_Warehouse, OH.OrderID, OH.RequiredDate, C.Client, OL.LineNbr, OL.QtyOrd, P.QtyPick
FROM Order
INNER JOIN Warehouse on Order.OH_WHS = Warehouse.W_PK
INNER JOIN Client on Order.O_Client = Client.C_PK
INNER JOIN OrderLine on Order.O_PK = OrderLine.OL_PK
INNER JOIN Picked on OrderLine.O_PK = Picked.P_PK
WHERE C.CLIENT = 'WENDYS'
Without knowing the data in the tables it is difficult to answer precisely.
But as you say you have 1+ rows in the Picked table, you probably want to do aggregation with GROUP BY and SUM()
Maybe this is what you're looking for:
SELECT
W.W_Warehouse,
OH.OrderID,
OH.RequiredDate,
C.Client,
OL.LineNbr,
OL.QtyOrd,
P.QtyPick
FROM
Order OH
INNER JOIN Warehouse W on OH.OH_WHS = W.W_PK
INNER JOIN Client C on OH.O_Client = C.C_PK
INNER JOIN OrderLine OL on OH.O_PK = OL.OL_PK
CROSS APPLY (
select sum(QtyPick) as QtyPick
from Picked P
where OL.O_PK = P.P_PK
) P
WHERE
C.CLIENT = 'WENDYS'
It calculates the sum of QtyPick separately so it doesn't increase the number of lines in the result.

How to join additional table when left outer not working

I have an existing proc which I have chopped up for brevity's sake
SELECT col1, col2
FROM (
col1, col2
SELECT col3--aggregate columns
FROM iep i
INNER JOIN student s ON s.studentID = i.studentID
INNER JOIN dbo.IDuration id ON i.IepID = id.iepID
INNER JOIN AppointmentStudent as ON s.studentID = as.studentID
INNER JOIN Appointment a ON as.appointmentID = a.appointmentID
INNER JOIN AppointmentTherapist at ON a.appointmentID = at.appointmentID
WHERE s.studentID = #studentID
GROUP BY col1, col2
) t
The aggregate columns summarizes appointments into the weeks of the year, but it only does sos for the weeks the student had appointments. I have an additional table called SchoolWeekYear that is populated with all of the weeks of the year that I am trying to integrate to this proc so I get 52 records back and not just the handful I am currently getting.
SELECT col1, col2
FROM (
col1, col2
SELECT col3--aggregate columns
FROM iep i
INNER JOIN student s ON s.studentID = i.studentID
INNER JOIN dbo.IDuration id ON i.IepID = id.iepID
INNER JOIN AppointmentStudent as ON s.studentID = as.studentID
INNER JOIN Appointment a ON as.appointmentID = a.appointmentID
LEFT OUTER JOIN SchoolWeekYear swy on a.calWeekNumber = swy.calWeekNumber
INNER JOIN AppointmentTherapist at ON a.appointmentID = at.appointmentID
WHERE s.studentID = #studentID
GROUP BY col1, col2
) t
Is this possible?
You need to integrate SchoolWeekYear into the existing table set at an earlier stage.
To show you the principle, let us simplify the problem even further. Let there be a table called WeeklyData with columns WeekNumber and SomeData. Some weeks might have multiple entries, some others none. So this query
SELECT
WeekNumber,
AGG(SomeData)
FROM
WeeklyData
GROUP BY
WeekNumber
;
would return only weeks present in WeeklyData. If you want to return data for all weeks, use a corresponding reference table (let it be called AllWeeks) like this:
SELECT
aw.WeekNumber,
AGG(wd.SomeData)
FROM
AllWeeks AS aw
LEFT JOIN
WeeklyData AS wd ON aw.WeekNumber = wd.WeekNumber
GROUP BY
aw.WeekNumber
;
So, you take the reference table (AllWeeks) and join the data table (WeeklyData) to it, not the other round.
Now, what if the original query was slightly more complex? Let us now suppose the data table is called StudentWeeklyData and has a column called StudentID which is a reference to a Students table. Let us also imagine the query is similar to yours in that it logically includes the Students table before the data table is joined and filters the results on the primary key of Students:
SELECT
s.StudentID,
s.StudentName,
swd.WeekNumber,
AGG(swd.SomeData)
FROM
Students AS s
INNER JOIN
StudentWeeklyData AS swd ON s.StudentID = swd.StudentID
WHERE
s.StudentID = #StudentID
GROUP BY
s.StudentID,
s.StudentName,
swd.WeekNumber
;
(Not every detail matters here, I just wanted to use a more similar example for you that would still be simple enough to understand.) Again, this would return only weeks where the specified student has data in StudentWeeklyTable. If you wanted to return all weeks for the student (some of them potentially empty, of course), this is how you could go about it:
SELECT
s.StudentID,
s.StudentName,
aw.WeekNumber,
AGG(swd.SomeData)
FROM
Students AS s
CROSS JOIN
AllWeeks AS aw
LEFT JOIN
StudentWeeklyData AS swd ON s.StudentID = swd.StudentID
AND aw.WeekNumber = swd.WeekNumber
WHERE
s.StudentID = #StudentID
GROUP BY
s.StudentID,
s.StudentName,
aw.WeekNumber
;
Here you can see again that the AllWeeks table is included before the data table. The difference to the previous case is we are not left-joining the result of the join between Students and StudentWeekly to AllWeeks, nor are we left-joining the data table itself specifically to AllWeeks. Instead, the data table is joined to the result of a cross join, Students × AllWeeks.
Returning to your specific situation, I realise that in your case even more tables are involved. Since you are not specifying how all those tables are related to one another, I can only guess that SchoolWeekYear should be cross-joined somewhere after FROM and before this line:
INNER JOIN Appointment a ON as.appointmentID = a.appointmentID
and that the said line should be modified like this:
LEFT JOIN Appointment a ON as.appointmentID = a.appointmentID
AND swy.calWeekNumber = a.calWeekNumber
the swy being an alias assigned to SchoolWeekYear.
It is also worth noting that there is a subsequent inner join with AppointmentTherapist. That join would eliminate the effect of the above left join if it remained unchanged, because its condition references the Appointment table. Perhaps, the syntactically easiest way to fix the issue would be to change that inner join to a left one too, although there is another way: instead of
LEFT JOIN Appointment a ON as.appointmentID = a.appointmentID
AND swy.calWeekNumber = a.calWeekNumber
LEFT JOIN AppointmentTherapist at ON a.appointmentID = at.appointmentID
you could use this syntax:
LEFT JOIN
Appointment a
INNER JOIN AppointmentTherapist at ON a.appointmentID = at.appointmentID
ON as.appointmentID = a.appointmentID
AND swy.calWeekNumber = a.calWeekNumber
That way the logical order of joining would be changed: Appointment and AppointmentTherapist would be first inner-joined with each other, then the result set would be outer-joined to the result of the previously specified joins.
It is possible. But if you have multiple row with some calWeekNumber on the SchoolWeekYear table, your aggregate function return wrong result.
If you want all lines in SchoolWeekYear shown, regardless of a match, you should use RIGHT OUTER JOIN instead of LEFT.

Group by in SQL Server

I have a query in SQL Server
SELECT
k12_dms_contacts_master.prefix_id AS prefix,
k12_dms_contacts_master.first_name,
k12_dms_contacts_master.last_name,
k12_dms_contacts_master.email,
k12_dms_institution_master.inst_name,
k12_dms_institution_master.address,
k12_dms_cities.name AS city_name,
k12_dms_zip_codes.zip_code,
k12_dms_institution_master.type_id,
k12_dms_contacts_institution_jobtitles.glevel_id,
k12_dms_districts.name AS district_name,
k12_dms_counties.name AS county_name,
k12_dms_institution_master.state_id,
k12_dms_institution_master.phone,
k12_dms_contacts_institution_jobtitles.job_title_id
FROM
k12_dms_institution_master
INNER JOIN k12_dms_contacts_institution_jobtitles ON k12_dms_contacts_institution_jobtitles.inst_id = k12_dms_institution_master.id
INNER JOIN k12_dms_contacts_master ON k12_dms_contacts_institution_jobtitles.contact_id = k12_dms_contacts_master.id
INNER JOIN k12_dms_cities ON k12_dms_cities.id = k12_dms_institution_master.city_id
INNER JOIN k12_dms_districts ON k12_dms_districts.id = k12_dms_institution_master.district_id
INNER JOIN k12_dms_counties ON k12_dms_counties.id = k12_dms_institution_master.county_id
INNER JOIN k12_dms_zip_codes ON k12_dms_zip_codes.id = k12_dms_institution_master.zip_code_id
WHERE
k12_dms_zip_codes.zip_code IN ('92678', '92679', '92688', '92690', '92691', '92692', '92693', '92694', '92877',
'92879', '92881', '92883')
ORDER BY
k12_dms_institution_master.state_id,
k12_dms_institution_master.inst_name ASC
Now I want to perform GROUP BY on Email address and Institution name but I am getting this error :
Column 'k12_dms_contacts_master.prefix_id' is invalid in the select
list because it is not contained in either an aggregate function or
the GROUP BY clause.
Any help would be highly appreciable.
The error message says it all.
You have created a group and since this column is not part of the "group by" nor an aggregation of all the groups column (like sum or count) you can't use it in the select clause.
Please note that the return of a group by is one row per group. Logically, that column would be different for any group member so it can not fit one line!

Resources