How to use GROUP BY in a SQL Server query?

I have a problem with GROUP BY in SQL Server.
I have this simple SQL statement:
select *
from Factors
group by moshtari_ID
and I get this error:
Msg 8120, Level 16, State 1, Line 1
Column 'Factors.ID' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Without GROUP BY the query returns its rows normally; with GROUP BY it fails with the error above.
What am I doing wrong?

In general, once you start GROUPing, every column listed in your SELECT must be either a column in your GROUP or some aggregate thereof. Let's say you have a table like this:
| ID | Name | City |
| 1 | Foo bar | San Jose |
| 2 | Bar foo | San Jose |
| 3 | Baz Foo | Santa Clara |
If you wanted to get a list of all the cities in your database, and tried:
SELECT * FROM table GROUP BY City
...that would fail, because you're asking for columns (ID and Name) that aren't in the GROUP BY clause. You could instead:
SELECT City, count(City) as Cnt FROM table GROUP BY City
...and that would get you:
| City | Cnt |
| San Jose | 2 |
| Santa Clara | 1 |
...but would NOT get you ID or Name. You can do more complicated things with e.g. subselects or self-joins, but basically what you're trying to do isn't possible as-stated. Break down your problem further (what do you want the data to look like?), and go from there.
Good luck!
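A minimal sketch of the city example above, run from Python against an in-memory SQLite database so it can be tried without a server. SQLite is only a stand-in here: it is more permissive than SQL Server about ungrouped columns, so it will not reproduce Msg 8120. The table name `people` is an assumption, chosen because `table` is a reserved word.

```python
import sqlite3

# In-memory database holding the three sample rows from the answer.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (ID INTEGER, Name TEXT, City TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [
        (1, "Foo bar", "San Jose"),
        (2, "Bar foo", "San Jose"),
        (3, "Baz Foo", "Santa Clara"),
    ],
)

# Only the grouped column and an aggregate appear in the SELECT list.
rows = conn.execute(
    "SELECT City, COUNT(City) AS Cnt FROM people GROUP BY City ORDER BY City"
).fetchall()
print(rows)  # [('San Jose', 2), ('Santa Clara', 1)]
```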

When you group, you can select only the columns you group by. All other columns need to be aggregated. This can be done with functions like min(), avg(), count(), ...
Why is this? Because GROUP BY collapses multiple records into one row per group. But what about the columns whose values differ within a group? The database needs a rule for which value to display for them, and that rule is the aggregate function.

You need to apply an aggregate function such as max(), avg() or count() to every column that is not in the GROUP BY.
For example, this query sums totalAmount for each moshtari_ID:
select moshtari_ID,sum (totalAmount) from Factors group by moshtari_ID;
output will be
moshtari_ID SUM
2 120000
1 200000
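The query above can be tried with a small script like the following. This is a sketch only: it uses SQLite instead of SQL Server, keeps just two columns of Factors, and the three rows are made up so that the totals match the output shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced, hypothetical version of the Factors table.
conn.execute("CREATE TABLE Factors (moshtari_ID INTEGER, totalAmount INTEGER)")
conn.executemany(
    "INSERT INTO Factors VALUES (?, ?)",
    [(1, 150000), (1, 50000), (2, 120000)],  # invented sample rows
)

# One output row per moshtari_ID, with the amounts summed per group.
totals = conn.execute(
    "SELECT moshtari_ID, SUM(totalAmount) AS total "
    "FROM Factors GROUP BY moshtari_ID ORDER BY moshtari_ID"
).fetchall()
print(totals)  # [(1, 200000), (2, 120000)]
```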

Try this:
select *
from Factors
Group by ID, date, time, factorNo, trackingNo, totalAmount, createAt, updateAt, bark_ID, moshtari_ID

If you use a GROUP BY clause, the SELECT list can contain only the grouped columns and aggregate functions.
Syntax:
SELECT expression1, expression2, ... expression_n,
aggregate_function (aggregate_expression)
FROM tables
[WHERE conditions]
GROUP BY expression1, expression2, ... expression_n
[ORDER BY expression [ ASC | DESC ]];

Related

In PostgreSQL, how can I extract matching items from a list?

I have a query in PostgreSQL that returns results like this, records with a string and a json array:
id | property_list
-----+-------------------------------------------------------------------------------
"i1" | [{"a":{"b":"no"}}, {"a":{"b":"yes"}}, {"a":{"b":"true"}}, {"a":{"b":"false"}}]
"i2" | [{"a":{"b":"yes"}}, {"a":{"b":"no"}}, {"a":{"b":"no"}}]
What I need is something like this:
id | yes_or_true
-----+------------
"i1" | 2
"i2" | 1
I need to count the properties in property_list where a.b equals "yes" or "true".
There are more properties, but there is always an a.b property with a string as its value.
I can solve this using a PL/pgSQL function, but for some reason, I'm in a situation where I can't use a PL/pgSQL function. How can I solve this in the query?
You can do this using json_array_elements (or jsonb_array_elements if the column is jsonb) and a subquery:
SELECT
id,
(SELECT count(*)
FROM json_array_elements(property_list) el
WHERE el->'a'->>'b' IN ('true','yes')
) AS yes_or_true
FROM the_table
A lateral join to jsonb_array_elements() will solve this:
with indat (id, property_list) as (
values
('i1', '[{"a":{"b":"no"}}, {"a":{"b":"yes"}}, {"a":{"b":"true"}}, {"a":{"b":"false"}}]'::jsonb),
('i2', '[{"a":{"b":"yes"}}, {"a":{"b":"no"}}, {"a":{"b":"no"}}]'::jsonb)
)
select id, count(*) filter (where jdat->'a'->>'b' in ('yes', 'true'))
from indat
cross join lateral jsonb_array_elements(property_list) as j(jdat)
group by id;
id | count
----+-------
i1 | 2
i2 | 1
(2 rows)
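To check the expected counts independently of the database, the same filtering logic can be sketched in plain Python. This is not a replacement for the queries above, only a way to see what they compute; the helper name `count_matches` is made up for the illustration.

```python
import json

# The two sample records from the question, as JSON text.
records = {
    "i1": '[{"a":{"b":"no"}}, {"a":{"b":"yes"}}, {"a":{"b":"true"}}, {"a":{"b":"false"}}]',
    "i2": '[{"a":{"b":"yes"}}, {"a":{"b":"no"}}, {"a":{"b":"no"}}]',
}


def count_matches(property_list_json, wanted=("yes", "true")):
    """Count array elements whose a.b value is one of the wanted strings."""
    elements = json.loads(property_list_json)
    return sum(1 for el in elements if el["a"]["b"] in wanted)


yes_or_true = {rec_id: count_matches(plist) for rec_id, plist in records.items()}
print(yes_or_true)  # {'i1': 2, 'i2': 1}
```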

SQL GROUP BY with columns which contain mirrored values

Sorry for the bad title. I couldn't think of a better way to describe my issue.
I have the following table:
Category | A | B
A | 1 | 2
A | 2 | 1
B | 3 | 4
B | 4 | 3
I would like to group the data by Category, return only 1 line per category, but provide both values of columns A and B.
So the result should look like this:
category | resultA | resultB
A | 1 | 2
B | 4 | 3
How can this be achieved?
I tried this statement:
SELECT category, a, b
FROM table
GROUP BY category
but obviously, I get the following errors:
Column 'a' is invalid in the select list because it is not contained
in either an aggregate function or the GROUP BY clause.
Column 'b' is invalid in the select list because it is not contained in either an
aggregate function or the GROUP BY clause.
How can I achieve the desired result?
Try this:
SELECT category, MIN(a) AS resultA, MAX(a) AS resultB
FROM table
GROUP BY category
If the values are mirrored then you can get both values using MIN, MAX applied on a single column like a.
It seems you don't really want to aggregate per category, but rather to remove duplicate rows from your result (or rather rows that you consider duplicates).
You consider a pair (x,y) equal to the pair (y,x). To find duplicates, you can put the lower value in the first place and the greater in the second and then apply DISTINCT on the rows:
select distinct
category,
case when a < b then a else b end as attr1,
case when a < b then b else a end as attr2
from mytable;
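A quick way to try the DISTINCT approach is the sketch below, which runs the same statement against the four sample rows in an in-memory SQLite database (a stand-in for SQL Server; the CASE and DISTINCT syntax used here is common to both).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (category TEXT, a INTEGER, b INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [("A", 1, 2), ("A", 2, 1), ("B", 3, 4), ("B", 4, 3)],
)

# Put the smaller value first so (x, y) and (y, x) become identical rows,
# then let DISTINCT drop the duplicates.
pairs = conn.execute(
    """
    SELECT DISTINCT
        category,
        CASE WHEN a < b THEN a ELSE b END AS attr1,
        CASE WHEN a < b THEN b ELSE a END AS attr2
    FROM mytable
    ORDER BY category
    """
).fetchall()
print(pairs)  # [('A', 1, 2), ('B', 3, 4)]
```

Note that category B comes back as (3, 4) rather than the (4, 3) shown in the question, because the pair is normalised to smaller-value-first.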
This assumes you want an arbitrary record from each set of duplicates per category.
Here is one trick using a table value constructor and the Row_Number window function:
;with cte as
(
SELECT *,
(SELECT Min(min_val) FROM (VALUES (a),(b))tc(min_val)) min_val,
(SELECT Max(max_val) FROM (VALUES (a),(b))tc(max_val)) max_val
FROM (VALUES ('A',1,2),
('A',2,1),
('B',3,4),
('B',4,3)) tc(Category, A, B)
)
select Category,A,B from
(
Select Row_Number() Over (Partition by category, min_val, max_val order by (select NULL)) as Rn, *
From cte
) A
Where Rn = 1

Best way to concat 1 to n values into single field from two tables

T-SQL
Imagine two tables looking like this:
Table: students
==============================
| TeacherID | SName |
| 1 | Thompson |
| 1 | Nickles |
| 2 | Cree |
==============================
Table: teacher
====================================================
| TeacherID | TName | + many other fields |
| 1 | Pipers | |
| 2 | Slinger | |
====================================================
The field names are completely arbitrary.
I want to create a query with the following output:
================================================================
| TeacherName | many other fields | Students |
| Pipers | | Thompson,Nickles |
================================================================
Currently I have something like this:
SELECT *
FROM teacher
LEFT JOIN (
SELECT DISTINCT
EL2.teacherID,
STUFF(( SELECT ',' + SName
FROM students
WHERE EL2.teacherID = students.teacherID
FOR XML PATH('')
),1,1,'') AS "Students"
FROM students, teacher EL2) t1
ON t1.teacherID = teacher.teacherID
WHERE t1.Students LIKE '%Thompson%'
This works and gives me what I need. The WHERE clause is there to illustrate that I also absolutely need to be able to filter on whether a teacher has that student, but then put all students that teacher has into the concatenated field.
My question now is if there is a better way to do this.
I already looked at this:
Concatenate many rows into a single text string?
But it didn't help me much because, one, I couldn't get it to work with two separate tables and, two, I couldn't filter the way I needed.
The SQL Management Studio execution plan indicates that the SELECT DISTINCT is very expensive, and others have said that the reliance on XML PATH is not optimal because its behaviour can change.
Be careful with a DISTINCT on names, as you might have two students with the same name! And by the way: GROUP BY is in most cases a better performing approach to get a distinct list...
You might try something like this:
SELECT t.*
,STUFF(( SELECT ',' + s.SName
FROM students AS s
WHERE t.teacherID = s.teacherID
FOR XML PATH('')
),1,1,'') AS Students
FROM teacher AS t
WHERE EXISTS(SELECT 1 FROM students AS x WHERE x.teacherID=t.teacherID /*AND [PUT YOUR FILTER HERE]*/)
If I understand this correctly you want to find only teachers where one given student is connected to the teacher. And in this case you want to find all students bound to all teachers connected to the given student, correct?
At the end you find a /*AND [PUT YOUR FILTER HERE]*/ comment. In its place you should put something like AND x.StudentId=123 (or, with the sample tables shown, AND x.SName='Thompson'). This will filter the teachers to the rows connected with this student only. For these teachers all students are concatenated...
Use FOR XML PATH:
select
TeacherID,
Tname,
stuff((select ','+s.sname from students s where s.teacherid=t.teacherid
for xml path('')),1,1,'')as students
from
teacher t
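The "filter by one student, but list all of the teacher's students" pattern from the answers above can be sketched as follows. This is an illustration under assumptions: it runs on SQLite, whose `group_concat` aggregate stands in for the `STUFF(... FOR XML PATH(''))` idiom (SQL Server 2017 and later offer `STRING_AGG` for the same purpose), and it filters on `SName` because the sample tables have no student ID column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE teacher (TeacherID INTEGER, TName TEXT);
    CREATE TABLE students (TeacherID INTEGER, SName TEXT);
    INSERT INTO teacher VALUES (1, 'Pipers'), (2, 'Slinger');
    INSERT INTO students VALUES (1, 'Thompson'), (1, 'Nickles'), (2, 'Cree');
    """
)

# EXISTS keeps only teachers who have the given student; the join plus
# group_concat then collects every student of those teachers.
result = conn.execute(
    """
    SELECT t.TName, group_concat(s.SName, ',') AS Students
    FROM teacher AS t
    JOIN students AS s ON s.TeacherID = t.TeacherID
    WHERE EXISTS (SELECT 1 FROM students AS x
                  WHERE x.TeacherID = t.TeacherID AND x.SName = 'Thompson')
    GROUP BY t.TeacherID, t.TName
    """
).fetchall()
print(result)
```

The order of names inside the concatenated string is not guaranteed by SQLite, so treat it as a set rather than a sorted list.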

how to select column name that is not in group by clause

SELECT [a],[b],[c],COUNT(*) AS Hits
FROM [dbo].[ta] join [dbo].[tb] on [tb].[id] = [ta].[id]
GROUP BY [a],[b],[c]
I want to select column c, but without putting it in the GROUP BY clause.
When column c is part of the GROUP BY clause, the query returns too few records.
My current query is
SELECT top 20 [headshot],
[athleteId],
[athleteName],
COUNT(*) AS Hits
FROM [dbo].[tblRep_Usecase2]
join [dbo].[tblRep_Login] on [tblRep_Login].[ID] = [tblRep_Usecase2].[ID]
where athleteName != 'NULL'
and headshot != 'NULL'
and headshot != ''
and convert(date,[tblRep_Usecase2].[insertDate]) >='11/9/2015'
and convert(date,[tblRep_Usecase2].[insertDate]) <='11/9/2015'
and [tblRep_Usecase2].[appsportid]='41'
GROUP BY [headshot],[athleteId],[athleteName]
HAVING COUNT(*) > 1 order by Hits DESC
Short answer : You can't.
Longer answer : You have to apply a formula to define how the "c" data is discriminated.
For instance, if you have a table about kids like
|a |b |c |
|-------|-------|---------|
|Bob |Sarah |Amanda |
|Bob |Sarah |Steve |
|Bob |Amanda |Sarah Jr.|
You ask the database for the number of kids of each couple (so Bob+Sarah = 2, Bob+Amanda = 1), but then you also ask it for the name of a kid. Since it cannot tell which kid you want, it cannot give you a result.
In this case, maybe Bob, Sarah, Amanda, Steve and Sarah Jr. are in a table elsewhere with their ages, so if you want only the oldest child, you would need to return a, b, "subquery", "count". Depending on the situation, a simple aggregate function like MAX/MIN also works.
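As a sketch of the "simple aggregate" option, the script below groups the kids table by a and b and uses MIN(c) as the rule for picking one child's name per couple. It runs on SQLite as a stand-in for SQL Server; the table name `kids` is an assumption, and MIN is just one possible rule (it returns the alphabetically first name, not the oldest child).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kids (a TEXT, b TEXT, c TEXT)")
conn.executemany(
    "INSERT INTO kids VALUES (?, ?, ?)",
    [
        ("Bob", "Sarah", "Amanda"),
        ("Bob", "Sarah", "Steve"),
        ("Bob", "Amanda", "Sarah Jr."),
    ],
)

# c is not in the GROUP BY, so it is wrapped in an aggregate instead.
hits = conn.execute(
    "SELECT a, b, MIN(c) AS c, COUNT(*) AS Hits "
    "FROM kids GROUP BY a, b ORDER BY a, b"
).fetchall()
print(hits)  # [('Bob', 'Amanda', 'Sarah Jr.', 1), ('Bob', 'Sarah', 'Amanda', 2)]
```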

How to perform statistical computations in a query?

I have a table which is filled with float values. I need to calculate the number of results grouped by their distribution around the mean value (Gaussian Distribution). Basically, it is calculated like this:
SELECT COUNT(*), FloatColumn - AVG(FloatColumn) - STDEV(FloatColumn)
FROM Data
GROUP BY FloatColumn - AVG(FloatColumn) - STDEV(FloatColumn)
But for obvious reasons, SQL Server gives this error: Cannot use an aggregate or a subquery in an expression used for the group by list of a GROUP BY clause.
My question is, can I somehow leave this computation to SQL Server? Or do I have to do it the old fashioned way? Retrieve all the data, and do the calculation myself?
To get the aggregate of the whole set you can use an empty OVER clause
WITH T(Result)
AS (SELECT FloatColumn - Avg(FloatColumn) OVER() - Stdev(FloatColumn) OVER ()
FROM Data)
SELECT Count(*),
Result
FROM T
GROUP BY Result
You can perform a pre-aggregation of the data, and join back to the table.
Schema Setup:
create table data(floatcolumn float);
insert data values
(1234.56),
(134.56),
(134.56),
(234.56),
(1349),
(900);
Query 1:
SELECT COUNT(*) C, D.FloatColumn - A
FROM
(
SELECT AVG(FloatColumn) + STDEV(FloatColumn) A
FROM Data
) preagg
CROSS JOIN Data D
GROUP BY FloatColumn - A;
Results:
| C | COLUMN_1 |
--------------------------
| 2 | -1196.876067819572 |
| 1 | -1096.876067819572 |
| 1 | -431.436067819572 |
| 1 | -96.876067819572 |
| 1 | 17.563932180428 |
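The pre-aggregation idea (compute mean plus standard deviation once, then group every row by its offset from that value) can also be sketched outside the database. The values below are made up for illustration and are not the fiddle data; `statistics.stdev` is the sample standard deviation, which is also what SQL Server's STDEV computes.

```python
from collections import Counter
from statistics import mean, stdev

# Hypothetical float column; not the data from the schema setup above.
values = [2.0, 2.0, 4.0, 8.0]

# Step 1: aggregate over the whole set once (the "preagg" subquery).
offset = mean(values) + stdev(values)

# Step 2: group each row by its distance from that single value.
counts = Counter(v - offset for v in values)

for key, count in sorted(counts.items()):
    print(count, key)
```

Equal input values produce identical keys, so the two 2.0 rows fall into one group with a count of 2.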
