Group by in SQL Server - sql-server

I have a query in SQL Server:
SELECT
k12_dms_contacts_master.prefix_id AS prefix,
k12_dms_contacts_master.first_name,
k12_dms_contacts_master.last_name,
k12_dms_contacts_master.email,
k12_dms_institution_master.inst_name,
k12_dms_institution_master.address,
k12_dms_cities.name AS city_name,
k12_dms_zip_codes.zip_code,
k12_dms_institution_master.type_id,
k12_dms_contacts_institution_jobtitles.glevel_id,
k12_dms_districts.name AS district_name,
k12_dms_counties.name AS county_name,
k12_dms_institution_master.state_id,
k12_dms_institution_master.phone,
k12_dms_contacts_institution_jobtitles.job_title_id
FROM
k12_dms_institution_master
INNER JOIN k12_dms_contacts_institution_jobtitles ON k12_dms_contacts_institution_jobtitles.inst_id = k12_dms_institution_master.id
INNER JOIN k12_dms_contacts_master ON k12_dms_contacts_institution_jobtitles.contact_id = k12_dms_contacts_master.id
INNER JOIN k12_dms_cities ON k12_dms_cities.id = k12_dms_institution_master.city_id
INNER JOIN k12_dms_districts ON k12_dms_districts.id = k12_dms_institution_master.district_id
INNER JOIN k12_dms_counties ON k12_dms_counties.id = k12_dms_institution_master.county_id
INNER JOIN k12_dms_zip_codes ON k12_dms_zip_codes.id = k12_dms_institution_master.zip_code_id
WHERE
k12_dms_zip_codes.zip_code IN ('92678', '92679', '92688', '92690', '92691', '92692', '92693', '92694', '92877',
'92879', '92881', '92883')
ORDER BY
k12_dms_institution_master.state_id,
k12_dms_institution_master.inst_name ASC
Now I want to GROUP BY email address and institution name, but I am getting this error:
Column 'k12_dms_contacts_master.prefix_id' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Any help would be highly appreciated.

The error message says it all.
Once you group, every column in the select list must either appear in the GROUP BY clause or be wrapped in an aggregate function (such as SUM or COUNT). prefix_id is neither, so you can't use it in the select clause.
Note that GROUP BY returns one row per group. That column could hold a different value for each member of the group, so it cannot fit in a single row!
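The two usual ways out of this error can be sketched as follows. This is only an illustration, assuming a hypothetical, simplified contacts table in place of the joined tables from the question. It runs on SQLite through Python's sqlite3 module so that it is self-contained. SQLite would not raise the SQL Server error for a bare column, so only the two valid forms are shown.

```python
import sqlite3

# Hypothetical, simplified stand-in for the joined result in the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (prefix_id INTEGER, email TEXT, inst_name TEXT)")
conn.executemany(
    "INSERT INTO contacts VALUES (?, ?, ?)",
    [
        (1, "a@example.com", "Lincoln High"),
        (2, "a@example.com", "Lincoln High"),  # same group, different prefix_id
        (1, "b@example.com", "Lincoln High"),
    ],
)

# Option 1: aggregate the extra column, so each group still yields one row.
aggregated = conn.execute(
    """
    SELECT email, inst_name, MIN(prefix_id) AS prefix, COUNT(*) AS n
    FROM contacts
    GROUP BY email, inst_name
    ORDER BY email
    """
).fetchall()

# Option 2: add the extra column to GROUP BY, which splits the groups further.
widened = conn.execute(
    """
    SELECT email, inst_name, prefix_id
    FROM contacts
    GROUP BY email, inst_name, prefix_id
    ORDER BY email, prefix_id
    """
).fetchall()

print(aggregated)  # two rows, one per (email, inst_name)
print(widened)     # three rows, one per (email, inst_name, prefix_id)
```

Which option is right depends on whether you really want one row per email and institution (option 1) or are happy with one row per distinct combination of all selected columns (option 2).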

Related

Why do I have duplicate records in my JOIN

I am retrieving data from table ProductionReportMetrics where I have column NetRate_QuoteID. Then to that result set I need to get Description column.
And in order to get a Description column, I need to join 3 tables:
NetRate_Quote_Insur_Quote
NetRate_Quote_Insur_Quote_Locat
NetRate_Quote_Insur_Quote_Locat_Liabi
But after that my premium is completely off.
What am I doing wrong here?
SELECT QLL.Description,
QLL.ClassCode,
prm.NetRate_QuoteID,
QL.LocationID,
ISNULL(SUM(premium),0) AS NetWrittenPremium,
MONTH(prm.EffectiveDate) AS EffMonth
FROM ProductionReportMetrics prm
LEFT JOIN NetRate_Quote_Insur_Quote Q
ON prm.NetRate_QuoteID = Q.QuoteID
INNER JOIN NetRate_Quote_Insur_Quote_Locat QL
ON Q.QuoteID = QL.QuoteID
INNER JOIN NetRate_Quote_Insur_Quote_Locat_Liabi QLL
ON QL.LocationID = QLL.LocationID
WHERE YEAR(prm.EffectiveDate) = 2016 AND
CompanyLine = 'Ironshore Insurance Company'
GROUP BY MONTH(prm.EffectiveDate),
QLL.Description,
QLL.ClassCode,
prm.NetRate_QuoteID,
QL.LocationID
I think the problem is in this table:
What am I missing in this query?
select
ClassCode,
QLL.Description,
sum(Premium)
from ProductionReportMetrics prm
LEFT JOIN NetRate_Quote_Insur_Quote Q ON prm.NetRate_QuoteID = Q.QuoteID
LEFT JOIN NetRate_Quote_Insur_Quote_Locat QL ON Q.QuoteID = QL.QuoteID
LEFT JOIN
(SELECT * FROM NetRate_Quote_Insur_Quote_Locat_Liabi nqI
JOIN ( SELECT LocationID, MAX(ClassCode)
FROM NetRate_Quote_Insur_Quote_Locat_Liabi GROUP BY LocationID ) nqA
ON nqA.LocationID = nqI.LocationID ) QLL ON QLL.LocationID = QL.LocationID
where Year(prm.EffectiveDate) = 2016 AND CompanyLine = 'Ironshore Insurance Company'
GROUP BY Q.QuoteID,QL.QuoteID,QL.LocationID
Now it says
Msg 8156, Level 16, State 1, Line 14
The column 'LocationID' was specified multiple times for 'QLL'.
It looks like DVT basically hit on the answer. The only reason you would get different amounts (i.e. duplicated rows) as a result of a join is that one of the joined tables does not have a 1:1 relationship with the primary table.
I would suggest you do a quick check against those tables, comparing row counts.
--this should be your baseline count
SELECT COUNT(*)
FROM ProductionReportMetrics prm
GROUP BY MONTH(prm.EffectiveDate),
prm.NetRate_QuoteID
--this will be a check against the first joined table.
SELECT COUNT(*)
FROM NetRate_Quote_Insur_Quote Q
WHERE QuoteID IN
(SELECT NetRate_QuoteID
FROM ProductionReportMetrics prm
GROUP BY MONTH(prm.EffectiveDate),
prm.NetRate_QuoteID)
Basically you will want to do a similar check against each of your joined tables. If any of the joined tables are part of the grouping statement, make sure they are also in the grouping of the count check statement. Also make sure to alter the WHERE clause of the check count statement to use the join clause columns you were using.
Once you find a table that returns the incorrect number of rows, you will know which table is causing the problem. Then you just have to decide how to reduce that table to distinct rows (some type of aggregation).
This advice is really just to show you how to QA this particular query. Break it up into the smallest possible parts. In this case, we know that it is a join that is causing the problem, so take it one join at a time until you find the offender.
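To make the fan-out concrete, here is a minimal sketch with made-up data. The tables metrics and locations are hypothetical stand-ins for ProductionReportMetrics and one of the joined tables, not the real schema. It runs on SQLite through Python's sqlite3 module only so that it is runnable as-is. The same SUM and HAVING checks should carry over to SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Quote 1 has two locations, so the join is 1:many rather than 1:1.
conn.executescript(
    """
    CREATE TABLE metrics (quote_id INTEGER, premium REAL);
    CREATE TABLE locations (quote_id INTEGER, location_id INTEGER);
    INSERT INTO metrics VALUES (1, 100.0), (2, 50.0);
    INSERT INTO locations VALUES (1, 10), (1, 11), (2, 20);
    """
)

# Baseline: the premium total before any join.
baseline = conn.execute("SELECT SUM(premium) FROM metrics").fetchone()[0]

# After the join, the premium of quote 1 is counted once per matching location.
joined = conn.execute(
    """
    SELECT SUM(m.premium)
    FROM metrics m
    JOIN locations l ON l.quote_id = m.quote_id
    """
).fetchone()[0]

# Join keys that occur more than once in the joined table cause the duplication.
offenders = conn.execute(
    """
    SELECT quote_id, COUNT(*) AS n
    FROM locations
    GROUP BY quote_id
    HAVING COUNT(*) > 1
    """
).fetchall()

print(baseline, joined, offenders)
```

If the offenders query returns any rows for a joined table, that table needs to be reduced to one row per join key (or the premium summed before joining) to keep the totals correct.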

Use groupby with inner join to get unique record on single column

I need a unique record per a.[id], so I am using GROUP BY, but I am getting this error:
Msg 8120, Level 16, State 1, Line 3
Column 'dbo.assessment_dfn.name' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Query
SELECT a.[id] AS AssessmentID
--,ad.[code] AS Assessment_Dfn_Code
,ad.[name] AS Assessment_Dfn_name
,mdfn.[id] AS Module_Dfn_ID
,a.[student_stage]
,a.[assessments_sitting]
,a.[sitting_date]
,ads.[sitting_number]
,s.[id] AS StudentID
,a.[assessor] AS AssessorID
,mdfn.[lead] AS ModuleLead
,a.[submission_date]
,a.[status]
,a.[complete]
,a.[assessment]
,a.[saved]
,ele_cliPro.[grade] AS GradeID
,codGrd.[name] AS GradeName
FROM [adb_TestDb].[dbo].[assessments] AS a (NOLOCK)
INNER JOIN [adb_TestDb].[dbo].[student_stage] AS ss ON a.[student_stage] = ss.[id]
INNER JOIN [dbo].[assessment_dfn_sittings] ads WITH (NOLOCK) ON ads.[id] = a.[assessments_sitting]
INNER JOIN [adb_TestDb].[dbo].[students] AS s ON ss.student = s.[id]
INNER JOIN [dbo].[assessment_dfn] ad WITH (NOLOCK) ON ad.[id] = ads.[assessment]
INNER JOIN [dbo].[module_dfn] AS mdfn ON ad.[module] = mdfn.[id]
INNER JOIN [adb_TestDb].[dbo].[elements_clinicalprocedures] AS ele_cliPro ON a.[id] = ele_cliPro.[assessment]
INNER JOIN [adb_TestDb].[dbo].[codes_grade_details] AS codGrd ON ele_cliPro.[grade] = codGrd.[id] AND codGrd.[id] = 4
where a.id= 2532
group by a.[id]
As the error message says, when you use GROUP BY you cannot select columns that are neither in that clause nor inside an aggregate function.
If you need only one record and you know all the values are the same, you could wrap each of the required columns in an aggregate function (e.g. MAX). Use this only if you fully understand that you will get just one of the possible values!

Group By not working in SQL Server

CREATE VIEW [dbo].[Payment_Transaction_vw]
AS
SELECT payment_trans_id,
Student_Info.student_fname,
Student_Info.student_lname,
Student_Info.ID_Number,
Trimester_Payment.deadline,
Transaction_Info.trans_name,
Payment_Transaction.amount,
Payment_Transaction.date_paid
FROM [Payment_Transaction]
INNER JOIN Student_Info
ON Payment_Transaction.student_info_id = Student_Info.student_info_id
INNER JOIN Trimester_Payment
ON Payment_Transaction.trimester_id = Trimester_Payment.trimester_id
INNER JOIN Transaction_Info
ON Payment_Transaction.trans_info_id = Transaction_Info.trans_info_id
GROUP BY ID_Number,trans_name;
That is my script to create a view in SQL Server from Visual Studio. I want to group by ID_Number and trans_name, which have repeating values in the table Payment_Transaction, so that each ID_Number with a given trans_name is displayed only once. I also want to sum up the amount paid for every ID_Number with the same trans_name.
When using GROUP BY, you need to make sure that every value that differs within a group is aggregated or consolidated so that it can be displayed in one row. As it is right now, payment_trans_id (and others) are still unique per row, and since you chose to display them the GROUP BY cannot be done.
What do you want to do with payment_trans_id, date_paid, amount ... all other columns really?
Example using MAX(), MIN() and AVG():
SELECT
MAX(payment_trans_id) AS payment_trans_id,
Transaction_Info.trans_name,
Student_Info.ID_Number,
AVG(Payment_Transaction.amount) AS amount,
MIN(Payment_Transaction.date_paid) AS date_paid
FROM [Payment_Transaction]
INNER JOIN Student_Info ON Payment_Transaction.student_info_id = Student_Info.student_info_id
INNER JOIN Trimester_Payment ON Payment_Transaction.trimester_id = Trimester_Payment.trimester_id
INNER JOIN Transaction_Info ON Payment_Transaction.trans_info_id = Transaction_Info.trans_info_id
GROUP BY ID_Number, trans_name;
For support in EF, perhaps this will be sufficient (perhaps not):
SELECT
ISNULL(MAX(payment_trans_id),0) AS Id,
Transaction_Info.trans_name,
Student_Info.ID_Number,
AVG(Payment_Transaction.amount) AS amount,
MIN(Payment_Transaction.date_paid) AS date_paid
FROM [Payment_Transaction]
INNER JOIN Student_Info ON Payment_Transaction.student_info_id = Student_Info.student_info_id
INNER JOIN Trimester_Payment ON Payment_Transaction.trimester_id = Trimester_Payment.trimester_id
INNER JOIN Transaction_Info ON Payment_Transaction.trans_info_id = Transaction_Info.trans_info_id
GROUP BY ID_Number, trans_name;
You have to aggregate columns that are not in the GROUP BY clause.
For example, which date_paid (for payment_trans_id = 1 and 2) do you want to return? 6/25/2015 or 5/6/2015? SQL Server can't know which one to pick, so it rejects the query.
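Since the question asks for the amount summed per ID_Number and trans_name, a SUM-based variant might look like the sketch below. It assumes a hypothetical, already-flattened payments table instead of the four joined tables, with made-up rows and dates stored as ISO text. It runs on SQLite via Python's sqlite3 purely so it can be executed as-is. The total_paid and last_paid names are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical flattened table; the real view joins four tables.
conn.execute(
    "CREATE TABLE payments (ID_Number TEXT, trans_name TEXT, amount REAL, date_paid TEXT)"
)
conn.executemany(
    "INSERT INTO payments VALUES (?, ?, ?, ?)",
    [
        ("S-001", "Tuition", 500.0, "2015-05-06"),
        ("S-001", "Tuition", 250.0, "2015-06-25"),
        ("S-002", "Tuition", 300.0, "2015-06-01"),
    ],
)

# One row per (ID_Number, trans_name); every other column is aggregated.
rows = conn.execute(
    """
    SELECT ID_Number,
           trans_name,
           SUM(amount)    AS total_paid,
           MAX(date_paid) AS last_paid
    FROM payments
    GROUP BY ID_Number, trans_name
    ORDER BY ID_Number
    """
).fetchall()

for row in rows:
    print(row)
```

Here MAX(date_paid) is one possible answer to the "which date?" question above; MIN would be equally valid if the first payment date is what you want.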

Sql limit rows returned by Inner Join

SELECT TOP (100) PERCENT dbo.Travelers.InsDate,
dbo.Certificates.CertificateNumber,
dbo.Certificates.Payment,
dbo.Travelers.FirstName,
dbo.Travelers.LastName,
dbo.Travelers.DOB,
dbo.Travelers.Address,
dbo.Travelers.City,
dbo.Travelers.State,
dbo.Travelers.Zip,
dbo.Travelers.Email,
dbo.Travelers.BestPhone,
dbo.Buyers.Name,
dbo.Buyers.SalesRep,
dbo.Sales.BoxNumber
FROM dbo.Sales
INNER JOIN dbo.Buyers ON dbo.Sales.BuyerID = dbo.Buyers.ID
INNER JOIN dbo.Travelers
INNER JOIN dbo.Certificates ON dbo.Travelers.CertificateID = dbo.Certificates.ID ON dbo.Sales.BoxNumber = LEFT(dbo.Certificates.CertificateNumber, 4)
WHERE (dbo.Certificates.PaymentCode = '1')
ORDER BY dbo.Travelers.InsDate DESC
This query is returning multiple records with the same CertificateNumber. I want it to return DISTINCT CertificateNumber values, but since the BoxNumber is derived from the CertificateNumber it is returning multiple rows.
I have tried DISTINCT and GROUP BY.
Does anyone have any suggestions?
Upon further research another issue was exposed: BoxNumbers were sold to multiple buyers. Once I fixed this issue it ran fine.

Query Executing Problem

Using SQL 2005: “Taking too much time to execute”
I want to filter the dates so that holidays are not displayed. I am using three tables with an INNER JOIN.
When I run the query below, it takes too much time to execute, because I filter CardEventDate against three tables.
Query
SELECT
PERSONID, CardEventDate tmp_cardevent3
WHERE (CardEventDate NOT IN
(SELECT T_CARDEVENT.CARDEVENTDATE
FROM T_PERSON
INNER JOIN T_CARDEVENT ON T_PERSON.PERSONID = T_CARDEVENT.PERSONID
INNER JOIN DUAL_PRO_II_TAS.dbo.T_WORKINOUTTIME ON T_CARDEVENT.CARDEVENTDAY = DUAL_PRO_II_TAS.dbo.T_WORKINOUTTIME.DAYCODE
AND T_PERSON.TACODE = DUAL_PRO_II_TAS.dbo.T_WORKINOUTTIME.TACODE
WHERE (DUAL_PRO_II_TAS.dbo.T_WORKINOUTTIME.HOLIDAY = 'true')
)
)
ORDER BY PERSONID, CardEventDate DESC
Is there any other way to do the date filter for the query above?
I am looking for alternatives to my query.
I'm pretty sure that it's not the joined tables that are the problem, but rather the "not in" that makes it slow.
Try to use a join instead:
select m.PERSONID, m.CardEventDate
from T_PERSON p
inner join T_CARDEVENT c on p.PERSONID = c.PERSONID
inner join DUAL_PRO_II_TAS.dbo.T_WORKINOUTTIME w
on c.CARDEVENTDAY = w.DAYCODE
and p.TACODE = w.TACODE
and w.HOLIDAY = 'true'
right join tmp_cardevent3 m on m.CardEventDate = c.CardEventDate
where c.CardEventDate is null
order by m.PERSONID, m.CardEventDate desc
(There is a from clause missing from your query, so I don't know what table you are trying to get the data from.)
Edit:
Put tmp_cardevent3 in the correct place.
Have you created indices on all of the columns that you are using to do the joins? In particular, I'd consider indices on PERSONID in T_CARDEVENT, TACODE in both T_PERSON and T_WORKINOUTTIME, and HOLIDAY in T_WORKINOUTTIME.
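For comparison, here are the three common ways to write this kind of "exclude the holidays" filter, as a small sketch. The events and holidays tables and their data are hypothetical simplifications of tmp_cardevent3 and the holiday subquery. It runs on SQLite through Python's sqlite3 just to show that the three forms return the same rows here. Which one is fastest on SQL Server 2005 depends on the optimizer and the available indexes, so treat it as something to measure rather than a guarantee.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE events (person_id INTEGER, event_date TEXT);
    CREATE TABLE holidays (holiday_date TEXT);
    INSERT INTO events VALUES (1, '2024-01-01'), (1, '2024-01-02'), (2, '2024-01-02');
    INSERT INTO holidays VALUES ('2024-01-01');
    """
)

# Original style: NOT IN with a subquery.
not_in = conn.execute(
    """
    SELECT person_id, event_date
    FROM events
    WHERE event_date NOT IN (SELECT holiday_date FROM holidays)
    ORDER BY person_id, event_date
    """
).fetchall()

# Alternative 1: correlated NOT EXISTS.
not_exists = conn.execute(
    """
    SELECT e.person_id, e.event_date
    FROM events e
    WHERE NOT EXISTS (SELECT 1 FROM holidays h WHERE h.holiday_date = e.event_date)
    ORDER BY e.person_id, e.event_date
    """
).fetchall()

# Alternative 2: outer join and keep only the rows that found no match.
anti_join = conn.execute(
    """
    SELECT e.person_id, e.event_date
    FROM events e
    LEFT JOIN holidays h ON h.holiday_date = e.event_date
    WHERE h.holiday_date IS NULL
    ORDER BY e.person_id, e.event_date
    """
).fetchall()

print(not_in, not_exists, anti_join)
```

The outer-join form is the same idea as the right join in the answer above, just written with the table being filtered on the left.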
