Why does this query work only when I use group by? - sql-server

This query works:
select p.Nombre as Nombre, c.Nombre as Categoria, s.Nombre as Subcategoria FROM Producto as p
inner join Subcategoria as s ON p.IDSubcategoria = s.ID
inner join Categoria as c on s.IDCategoria = c.ID
group by p.Nombre, c.Nombre, s.Nombre
order by p.Nombre
But when I remove s.Nombre from the GROUP BY clause, I get this error:
Msg 8120, Level 16, State 1, Line 1
Column 'Subcategoria.Nombre' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Can someone explain what the GROUP BY clause does and why it allows the query to work?
In the interest of learning! Thanks.

When you write group by p.Nombre, you are specifying that there should be exactly 1 row of output for each distinct p.Nombre. Hence, every other field in the select clause must be aggregated, so that if several rows share the same p.Nombre, their values can be 'collapsed' into one.
By grouping on p.Nombre, c.Nombre, s.Nombre, you are saying that there should be exactly 1 row of output for each distinct tuple of those three values. Hence, it works, because every field displayed is part of the grouping clause.
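To make the "one row per distinct tuple" point concrete, here is a small sketch. It uses Python's built-in sqlite3 module as a stand-in for SQL Server, and the sample rows and the Producto.ID column are made up for illustration. Note that SQLite tolerates non-aggregated columns outside the GROUP BY, so it will not reproduce Msg 8120; the sketch only shows how many rows each grouping produces.

```python
import sqlite3

# In-memory SQLite database; an assumption standing in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Categoria (ID INTEGER PRIMARY KEY, Nombre TEXT);
    CREATE TABLE Subcategoria (ID INTEGER PRIMARY KEY, Nombre TEXT, IDCategoria INTEGER);
    CREATE TABLE Producto (ID INTEGER PRIMARY KEY, Nombre TEXT, IDSubcategoria INTEGER);
    -- Hypothetical data: two products share the name 'Cable'
    -- but sit in different subcategories.
    INSERT INTO Categoria VALUES (1, 'Electronica');
    INSERT INTO Subcategoria VALUES (10, 'Audio', 1), (11, 'Video', 1);
    INSERT INTO Producto VALUES (100, 'Cable', 10), (101, 'Cable', 11), (102, 'Radio', 10);
""")

# Grouping on the name alone: one row per name. COUNT(DISTINCT ...) shows
# how many different subcategory names would have to fit into that one row.
by_name = conn.execute("""
    SELECT p.Nombre, COUNT(DISTINCT s.Nombre) AS Subcategorias
    FROM Producto AS p
    INNER JOIN Subcategoria AS s ON p.IDSubcategoria = s.ID
    INNER JOIN Categoria AS c ON s.IDCategoria = c.ID
    GROUP BY p.Nombre
    ORDER BY p.Nombre
""").fetchall()

# Grouping on all three columns: one row per distinct (name, category, subcategory).
by_tuple = conn.execute("""
    SELECT p.Nombre, c.Nombre, s.Nombre
    FROM Producto AS p
    INNER JOIN Subcategoria AS s ON p.IDSubcategoria = s.ID
    INNER JOIN Categoria AS c ON s.IDCategoria = c.ID
    GROUP BY p.Nombre, c.Nombre, s.Nombre
    ORDER BY p.Nombre, s.Nombre
""").fetchall()

print(by_name)
print(by_tuple)
```

With this data, 'Cable' has two candidate subcategory names when grouped by name only, which is exactly the ambiguity SQL Server refuses to resolve on its own.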

If you use a GROUP BY clause, the SELECT list can contain only:
the fields that you already use in the GROUP BY clause
aggregates (min, max, count, ...) on other fields
A small example:
MyTable
FieldA FieldB
a      1
a      2
b      3
b      5
Query:
select FieldA, FieldB from MyTable group by FieldA
FieldA FieldB
a      ?
b      ?
Which value do you want in FieldB for the row a? It could be 1, 2, or 3 (1+2).
For the first you need the min(FieldB) aggregate function, for the second max(FieldB), and for the third sum(FieldB).
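The three choices can be tried on the sample table directly. This is a minimal sketch using Python's built-in sqlite3 module in place of SQL Server (an assumption made so the example is runnable); the aliases MinB, MaxB and SumB are made up.

```python
import sqlite3

# In-memory SQLite database; an assumption standing in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (FieldA TEXT, FieldB INTEGER)")
conn.executemany(
    "INSERT INTO MyTable (FieldA, FieldB) VALUES (?, ?)",
    [("a", 1), ("a", 2), ("b", 3), ("b", 5)],
)

# One output row per distinct FieldA; each aggregate states how the
# FieldB values of that group are collapsed into a single value.
rows = conn.execute("""
    SELECT FieldA,
           MIN(FieldB) AS MinB,
           MAX(FieldB) AS MaxB,
           SUM(FieldB) AS SumB
    FROM MyTable
    GROUP BY FieldA
    ORDER BY FieldA
""").fetchall()

for row in rows:
    print(row)
```

For the group a this gives 1, 2 and 3, matching the three options listed above.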

The GROUP BY clause collapses rows that have the same values in the columns listed in GROUP BY into a single row. For any other column in your SELECT list, the SQL engine needs to be told how to combine the values, by way of an aggregate function such as MIN, MAX, SUM or AVG. If you don't specify one, the engine raises an error because it doesn't know which value to return.
E.g. (MAX is used here because Nombre is presumably a text column, which SUM does not accept):
select p.Nombre as Nombre, c.Nombre as Categoria, MAX(s.Nombre) as Subcategoria FROM Producto as p
inner join Subcategoria as s ON p.IDSubcategoria = s.ID
inner join Categoria as c on s.IDCategoria = c.ID
group by p.Nombre, c.Nombre
order by p.Nombre

A GROUP BY clause is only required when the select list mixes aggregate functions such as COUNT or MAX with non-aggregated columns. With no aggregates at all, as here, its only effect is to remove duplicate rows. In your case it is simpler to remove duplicates by adding DISTINCT to the select clause and dropping the GROUP BY clause altogether.
select DISTINCT p.Nombre as Nombre, c.Nombre as Categoria, s.Nombre as Subcategoria FROM Producto as p
inner join Subcategoria as s ON p.IDSubcategoria = s.ID
inner join Categoria as c on s.IDCategoria = c.ID
order by p.Nombre
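A quick way to convince yourself that DISTINCT and a GROUP BY over every selected column return the same rows is to run both on data with duplicates. The sketch below uses Python's built-in sqlite3 module instead of SQL Server, and the sample rows and the Producto.ID column are assumptions for illustration.

```python
import sqlite3

# In-memory SQLite database; an assumption standing in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Categoria (ID INTEGER PRIMARY KEY, Nombre TEXT);
    CREATE TABLE Subcategoria (ID INTEGER PRIMARY KEY, Nombre TEXT, IDCategoria INTEGER);
    CREATE TABLE Producto (ID INTEGER PRIMARY KEY, Nombre TEXT, IDSubcategoria INTEGER);
    -- Hypothetical data: 'Cable' appears twice in the same subcategory.
    INSERT INTO Categoria VALUES (1, 'Electronica');
    INSERT INTO Subcategoria VALUES (10, 'Audio', 1), (11, 'Video', 1);
    INSERT INTO Producto VALUES (100, 'Cable', 10), (101, 'Cable', 10), (102, 'Radio', 11);
""")

COLUMNS = "p.Nombre AS Nombre, c.Nombre AS Categoria, s.Nombre AS Subcategoria"
JOINS = """
    FROM Producto AS p
    INNER JOIN Subcategoria AS s ON p.IDSubcategoria = s.ID
    INNER JOIN Categoria AS c ON s.IDCategoria = c.ID
"""

# No duplicate removal: one row per product, so 'Cable' shows up twice.
plain_rows = conn.execute(
    f"SELECT {COLUMNS} {JOINS} ORDER BY p.Nombre"
).fetchall()

# Duplicate removal with DISTINCT.
distinct_rows = conn.execute(
    f"SELECT DISTINCT {COLUMNS} {JOINS} ORDER BY p.Nombre"
).fetchall()

# Duplicate removal with GROUP BY on every selected column.
grouped_rows = conn.execute(
    f"SELECT {COLUMNS} {JOINS} GROUP BY p.Nombre, c.Nombre, s.Nombre ORDER BY p.Nombre"
).fetchall()

print(len(plain_rows), distinct_rows == grouped_rows)
```

DISTINCT states the intent (remove duplicates) more directly, which is why it is the simpler choice when no aggregate is involved.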

Related

Incorrect Sum On Table Join

I am writing a query but I'm getting a wrong result. The tables are as follows:
Tbl1(ProId, price,VId)
Tbl2(ProId, price, VId)
I have written this query:
SELECT
a.ProId, b.ProId,
SUM(a.price) - SUM(b.price) AS TotalPro
FROM
tbl1 AS a
INNER JOIN
tbl2 AS b ON a.ProId = b.ProId
WHERE
a.VId = '1234'
GROUP BY
a.ProId, b.ProId;
This query returns an incorrect answer. When I sum the price from table one and table two separately and subtract them, the answer is fine. But when I join, I get the wrong answer and I don't know why. ProId is the same in both tables, and the values are the same.
I guess you want something like the query below:
SELECT ProId, SUM(price) AS TotalPro
FROM (
SELECT a.ProId, a.price
FROM Tbl1 a
WHERE a.VId='1234'
UNION ALL
SELECT b.ProId, -b.price
FROM Tbl2 b
--WHERE b.VId ='1234' (?)
) sub
GROUP BY ProId;
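The issue with the JOIN is that when a ProId has several rows in both tables, every row of one table is paired with every matching row of the other, so some prices are summed multiple times.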
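The effect is easy to reproduce with a few rows. The sketch below uses Python's built-in sqlite3 module rather than SQL Server, and the prices are made up: Tbl1 has two rows and Tbl2 three rows for the same ProId, so the join produces six rows.

```python
import sqlite3

# In-memory SQLite database; an assumption standing in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Tbl1 (ProId INTEGER, price INTEGER, VId TEXT);
    CREATE TABLE Tbl2 (ProId INTEGER, price INTEGER, VId TEXT);
    -- Hypothetical data: sums are 30 and 15, so the expected difference is 15.
    INSERT INTO Tbl1 VALUES (1, 10, '1234'), (1, 20, '1234');
    INSERT INTO Tbl2 VALUES (1, 5, '1234'), (1, 7, '1234'), (1, 3, '1234');
""")

# Join first, then sum: each Tbl1 price is counted 3 times and
# each Tbl2 price 2 times, giving 90 - 30 instead of 30 - 15.
joined = conn.execute("""
    SELECT a.ProId, SUM(a.price) - SUM(b.price) AS TotalPro
    FROM Tbl1 AS a
    INNER JOIN Tbl2 AS b ON a.ProId = b.ProId
    WHERE a.VId = '1234'
    GROUP BY a.ProId
""").fetchall()

# UNION ALL stacks the rows instead of pairing them, so each price
# is counted exactly once (Tbl2 prices negated).
unioned = conn.execute("""
    SELECT ProId, SUM(price) AS TotalPro
    FROM (
        SELECT a.ProId AS ProId, a.price AS price
        FROM Tbl1 AS a
        WHERE a.VId = '1234'
        UNION ALL
        SELECT b.ProId, -b.price
        FROM Tbl2 AS b
    ) AS sub
    GROUP BY ProId
""").fetchall()

print(joined, unioned)
```

Whether Tbl2 should also be filtered on VId is left open here, just as in the commented-out line of the query above.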

Duplicates when using inner join in t-sql

I know I am missing something. I have two temp tables with identical values apart from a filter, and I am trying to join them in a stored procedure, but I am getting duplicate values.
Below is the sample code:
SELECT DISTINCT
B.SUBSCRIBER_TAX_ID, B.MEMBER_FIRST_NAME, B.MEMBER_LAST_NAME,
B.BENEFIT_PLAN_NAME AS MEDICAL_PLAN, B.MEMBER_EFF_DATE AS MED_EFF_DATE, B.MEMBER_TERMINATION_DATE AS MED_END_DATE,
P.BENEFIT_PLAN_NAME AS PHARM_PLAN_NAME, P.MEMBER_EFF_DATE AS PHARM_EFF_DATE, P.MEMBER_TERMINATION_DATE AS PHARM_ENDdATE
FROM #BH_MED B
INNER JOIN #BH_PHARM P ON B.MEMBER_HCC_ID = P.MEMBER_HCC_ID
order by b.BENEFIT_PLAN_NAME,P.BENEFIT_PLAN_NAME
I want results with distinct abc, def values in column 3 and column 6.
Use GROUP BY:
SELECT DISTINCT
B.SUBSCRIBER_TAX_ID, B.MEMBER_FIRST_NAME, B.MEMBER_LAST_NAME,
B.BENEFIT_PLAN_NAME AS MEDICAL_PLAN, B.MEMBER_EFF_DATE AS MED_EFF_DATE, B.MEMBER_TERMINATION_DATE AS MED_END_DATE,
P.BENEFIT_PLAN_NAME AS PHARM_PLAN_NAME, P.MEMBER_EFF_DATE AS PHARM_EFF_DATE, P.MEMBER_TERMINATION_DATE AS PHARM_ENDdATE
FROM #BH_MED B
INNER JOIN #BH_PHARM P ON B.MEMBER_HCC_ID = P.MEMBER_HCC_ID
GROUP BY B.SUBSCRIBER_TAX_ID, B.MEMBER_FIRST_NAME, B.MEMBER_LAST_NAME,B.BENEFIT_PLAN_NAME,B.MEMBER_EFF_DATE,B.MEMBER_TERMINATION_DATE,P.BENEFIT_PLAN_NAME,P.MEMBER_EFF_DATE,P.MEMBER_TERMINATION_DATE
order by b.BENEFIT_PLAN_NAME,P.BENEFIT_PLAN_NAME

Use GROUP BY with inner join to get a unique record on a single column

I need a unique record per a.[id], so I am using GROUP BY, but I am getting this error:
Msg 8120, Level 16, State 1, Line 3
Column 'dbo.assessment_dfn.name' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Query
SELECT a.[id] AS AssessmentID
--,ad.[code] AS Assessment_Dfn_Code
,ad.[name] AS Assessment_Dfn_name
,mdfn.[id] AS Module_Dfn_ID
,a.[student_stage]
,a.[assessments_sitting]
,a.[sitting_date]
,ads.[sitting_number]
,s.[id] AS StudentID
,a.[assessor] AS AssessorID
,mdfn.[lead] AS ModuleLead
,a.[submission_date]
,a.[status]
,a.[complete]
,a.[assessment]
,a.[saved]
,ele_cliPro.[grade] AS GradeID
,codGrd.[name] AS GradeName
FROM [adb_TestDb].[dbo].[assessments] AS a (NOLOCK)
INNER JOIN [adb_TestDb].[dbo].[student_stage] AS ss ON a.[student_stage] = ss.[id]
INNER JOIN [dbo].[assessment_dfn_sittings] ads WITH (NOLOCK) ON ads.[id] = a.[assessments_sitting]
INNER JOIN [adb_TestDb].[dbo].[students] AS s ON ss.student = s.[id]
INNER JOIN [dbo].[assessment_dfn] ad WITH (NOLOCK) ON ad.[id] = ads.[assessment]
INNER JOIN [dbo].[module_dfn] AS mdfn ON ad.[module] = mdfn.[id]
INNER JOIN [adb_TestDb].[dbo].[elements_clinicalprocedures] AS ele_cliPro ON a.[id] = ele_cliPro.[assessment]
INNER JOIN [adb_TestDb].[dbo].[codes_grade_details] AS codGrd ON ele_cliPro.[grade] = codGrd.[id] AND codGrd.[id] = 4
where a.id= 2532
group by a.[id]
As the error message says, when you use GROUP BY you cannot select columns that are neither in that clause nor inside an aggregate function.
If you need only one record per id and you know that all values within the group are the same, you could wrap each of the other columns in an aggregate function (e.g. MAX). Use this only if you are fully aware that, whenever the values do differ, you will get just one of the possible values.
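Here is a minimal sketch of that workaround, using Python's built-in sqlite3 module in place of SQL Server. The two tables are heavily simplified, hypothetical versions of assessments and elements_clinicalprocedures, and the data is made up. Comparing MIN and MAX of a column is one way to check whether the "all values are the same" assumption actually holds.

```python
import sqlite3

# In-memory SQLite database; an assumption standing in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified, hypothetical versions of the tables in the question.
    CREATE TABLE assessments (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE elements (assessment INTEGER, grade TEXT);
    INSERT INTO assessments VALUES (2532, 'Anatomy');
    INSERT INTO elements VALUES (2532, 'Pass'), (2532, 'Pass'), (2532, 'Merit');
""")

# The join yields three rows for assessment 2532. Grouping by a.id gives
# one row; every other column must be wrapped in an aggregate.
row = conn.execute("""
    SELECT a.id,
           MAX(a.name)  AS name,        -- constant within the group, so MAX is safe
           COUNT(*)     AS joined_rows,
           MIN(e.grade) AS min_grade,   -- differs from max_grade, so grade is
           MAX(e.grade) AS max_grade    -- NOT constant within the group
    FROM assessments AS a
    INNER JOIN elements AS e ON a.id = e.assessment
    WHERE a.id = 2532
    GROUP BY a.id
""").fetchone()

print(row)
```

Here name is safe to collapse with MAX, while grade is not: MAX(e.grade) silently picks one of two different values.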

SQL: Select a column independent of where clause

SELECT TOP 1000 p.Title,p.Distributor, SUM(r.SalesVolume) AS VolumeOfSales,
CAST(SUM(r.CustomerPrice*r.SalesVolume) as decimal (18,0)) AS ValueOfSales,
CAST (AVG(r.CustomerPrice) as decimal (18,1)) AS AvgPrice,
p.MS_ContentType AS category ,Min(c.WeekId) AS ReleaseWeek
from Product p
INNER JOIN RawData r
ON p.ProductId = r.ProductId
INNER JOIN Calendar c
ON r.DayId = c.DayId
WHERE c.WeekId BETWEEN ('20145231') AND ('20145252')
AND p.Distributor IN ('WARNER', 'TF1', 'GAUMONT')
AND p.VODEST IN ('VOD', 'EST')
AND p.ContentFlavor IN ('SD', 'HD', 'NC')
AND p.MS_ExternalID1 IN ('ADVENTURE/ACTION', 'ANIMATION/FAMILY', 'COMEDY')
AND p.MS_ContentType IN ('FILM', 'TV', 'OTHERS')
AND r.CountryId = 1
GROUP BY p.Title,p.Distributor,p.MS_ContentType
ORDER BY VolumeOfSales DESC, ValueOfSales DESC
I want to modify the above query so that only the column ReleaseWeek is independent of the condition WHERE c.WeekId BETWEEN ('20145231') AND ('20145252').
The result that I derive looks like:
Title Distributor VolumeOfSales ValueOfSales AvgPrice category ReleaseWeek
Divergente M6SND 94038 450095 4.0 Film 20145233
However, what I really want is for ReleaseWeek to be the first value of c.WeekId for that Title in the whole database, not the first one between ('20145231') AND ('20145252'). What is the best way to modify it? Any leads would be greatly appreciated.
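One possible approach, sketched under assumptions: compute the earliest WeekId per Title in a separate derived table that has no week filter, and join it to the filtered query. The example uses Python's built-in sqlite3 module instead of SQL Server, a cut-down hypothetical schema (only the columns needed), and made-up rows; the alias rel is also made up.

```python
import sqlite3

# In-memory SQLite database; an assumption standing in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Cut-down, hypothetical versions of the tables in the question.
    CREATE TABLE Product (ProductId INTEGER PRIMARY KEY, Title TEXT);
    CREATE TABLE Calendar (DayId INTEGER PRIMARY KEY, WeekId TEXT);
    CREATE TABLE RawData (ProductId INTEGER, DayId INTEGER, SalesVolume INTEGER);
    INSERT INTO Product VALUES (1, 'Divergente');
    INSERT INTO Calendar VALUES (1, '20145201'), (2, '20145233'), (3, '20145240');
    INSERT INTO RawData VALUES (1, 1, 10), (1, 2, 20), (1, 3, 30);
""")

rows = conn.execute("""
    SELECT p.Title,
           SUM(r.SalesVolume) AS VolumeOfSales,
           rel.ReleaseWeek
    FROM Product AS p
    INNER JOIN RawData AS r ON p.ProductId = r.ProductId
    INNER JOIN Calendar AS c ON r.DayId = c.DayId
    INNER JOIN (
        -- Earliest week per title over ALL data: no WeekId filter here.
        SELECT p2.Title AS Title, MIN(c2.WeekId) AS ReleaseWeek
        FROM Product AS p2
        INNER JOIN RawData AS r2 ON p2.ProductId = r2.ProductId
        INNER JOIN Calendar AS c2 ON r2.DayId = c2.DayId
        GROUP BY p2.Title
    ) AS rel ON rel.Title = p.Title
    -- The week filter only affects the sales aggregates.
    WHERE c.WeekId BETWEEN '20145231' AND '20145252'
    GROUP BY p.Title, rel.ReleaseWeek
""").fetchall()

print(rows)
```

The sales total only covers the filtered weeks, while ReleaseWeek comes from the unfiltered derived table. In the real query the other filters (Distributor, CountryId and so on) may or may not belong inside the derived table as well, depending on what "first week" should mean.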

Group by in SQL Server

I have a query in SQL Server
SELECT
k12_dms_contacts_master.prefix_id AS prefix,
k12_dms_contacts_master.first_name,
k12_dms_contacts_master.last_name,
k12_dms_contacts_master.email,
k12_dms_institution_master.inst_name,
k12_dms_institution_master.address,
k12_dms_cities.name AS city_name,
k12_dms_zip_codes.zip_code,
k12_dms_institution_master.type_id,
k12_dms_contacts_institution_jobtitles.glevel_id,
k12_dms_districts.name AS district_name,
k12_dms_counties.name AS county_name,
k12_dms_institution_master.state_id,
k12_dms_institution_master.phone,
k12_dms_contacts_institution_jobtitles.job_title_id
FROM
k12_dms_institution_master
INNER JOIN k12_dms_contacts_institution_jobtitles ON k12_dms_contacts_institution_jobtitles.inst_id = k12_dms_institution_master.id
INNER JOIN k12_dms_contacts_master ON k12_dms_contacts_institution_jobtitles.contact_id = k12_dms_contacts_master.id
INNER JOIN k12_dms_cities ON k12_dms_cities.id = k12_dms_institution_master.city_id
INNER JOIN k12_dms_districts ON k12_dms_districts.id = k12_dms_institution_master.district_id
INNER JOIN k12_dms_counties ON k12_dms_counties.id = k12_dms_institution_master.county_id
INNER JOIN k12_dms_zip_codes ON k12_dms_zip_codes.id = k12_dms_institution_master.zip_code_id
WHERE
k12_dms_zip_codes.zip_code IN ('92678', '92679', '92688', '92690', '92691', '92692', '92693', '92694', '92877',
'92879', '92881', '92883')
ORDER BY
k12_dms_institution_master.state_id,
k12_dms_institution_master.inst_name ASC
Now I want to perform a GROUP BY on email address and institution name, but I am getting this error:
Column 'k12_dms_contacts_master.prefix_id' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Any help would be highly appreciated.
The error message says it all.
You have created a group, and since this column is neither part of the GROUP BY clause nor wrapped in an aggregate over the group's rows (such as SUM or COUNT), you can't use it in the select list.
Note that a GROUP BY returns one row per group. That column may hold a different value for each member of the group, so it cannot fit into a single row.

Resources