T-SQL GROUP BY a nvarchar error - sql-server

I was wondering if anyone could help me with a request kindly:
Here is the data:
Table 1: EMPLOYEE
FK: DID
PK: Name
UserName
Table 2: DEPARTMENT
PK: DID
TerminationDate
I’m looking to find the number of terminated employees in the quarter. Here is the T-SQL so far:
SELECT
DEPARTMENT.name AS Name,
COUNT(e.userName)
FROM
EMPLOYEE AS e
JOIN
DEPARTMENT ON e.department = DEPARTMENT.DID
UNION
SELECT
u.eu, u.name
FROM
(SELECT
dd.name, COUNT(ee.userName) AS eu
FROM
DEPARTMENT AS dd
JOIN
EMPLOYEE AS ee ON dd.DID = ee.department
AND ee.terminationDate IS NOT NULL
WHERE
ee.terminationDate IS NOT NULL
AND ee.terminationDate BETWEEN '2015-04-01' AND '2015-06-30'
GROUP BY
dd.name, ee.userName) AS u
GROUP BY
u.eu, u.name, Name
ORDER BY
Name
The error is:
Msg 8120, Level 16, State 1, Line 1
Column 'DEPARTMENT.name' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.

In the first part of your query, the department name appears in the SELECT list next to an aggregate, but that SELECT has no GROUP BY clause at all. Add one:
select DEPARTMENT.name as Name, COUNT(e.userName)
from EMPLOYEE as e
join DEPARTMENT on e.department = DEPARTMENT.DID
group by DEPARTMENT.Name --add this line

The number of problems with this query is bewildering. To start with, the table definitions you have listed do not match your query. You list TerminationDate as an attribute of the DEPARTMENT table, whereas the query accesses it through the EMPLOYEE table, which is where it makes sense for it to be. So I'm going to assume that your table structure really is:
Table 1: EMPLOYEE
FK: DID
PK: Name
UserName
TerminationDate
Table 2: DEPARTMENT
PK: DID
Name
The way you have used UNION in this query leads me to believe that you misunderstand what it does. UNION combines the rows of two result sets into one. In set theory {1,2,3} union {3,4} is {1,2,3,4}; SQL's UNION likewise removes duplicate rows, whereas UNION ALL keeps them and would return 1,2,3,3,4. In SQL Server, you have to make sure that the number of columns in the first SELECT matches every SELECT that is unioned with it, and that the corresponding column types are compatible. So for example:
Select 1 A, 2 B
Union
Select 3 B
Will give this error because the first select has two columns and the second has only one:
Msg 205, Level 16, State 1, Line 3 All queries combined using a UNION,
INTERSECT or EXCEPT operator must have an equal number of expressions
in their target lists.
And this query:
Select 1 A
Union
Select 'Bob' A
Will give this error because the column types do not match:
Msg 245, Level 16, State 1, Line 2 Conversion failed when converting
the varchar value 'Bob' to data type int.
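As a minimal sketch of the UNION behaviour described above, the following uses Python's built-in sqlite3 module with an in-memory SQLite database as a stand-in for SQL Server (an assumption made purely so the example is runnable; the tables s1 and s2 are hypothetical). It shows UNION dropping the duplicate 3, UNION ALL keeping it, and a column-count mismatch being rejected. SQLite is dynamically typed, so it would not reproduce the Msg 245 type-conversion error.

```python
import sqlite3

# In-memory SQLite database, used only as a stand-in for SQL Server.
conn = sqlite3.connect(":memory:")

# Two hypothetical single-column tables holding {1,2,3} and {3,4}.
conn.execute("CREATE TABLE s1 (n INTEGER)")
conn.execute("CREATE TABLE s2 (n INTEGER)")
conn.executemany("INSERT INTO s1 VALUES (?)", [(1,), (2,), (3,)])
conn.executemany("INSERT INTO s2 VALUES (?)", [(3,), (4,)])

# UNION removes duplicate rows from the combined result.
union_rows = [r[0] for r in conn.execute(
    "SELECT n FROM s1 UNION SELECT n FROM s2 ORDER BY n")]

# UNION ALL keeps every row from both inputs.
union_all_rows = [r[0] for r in conn.execute(
    "SELECT n FROM s1 UNION ALL SELECT n FROM s2 ORDER BY n")]


def column_count_mismatch_fails() -> bool:
    """Return True if a UNION whose SELECTs differ in column count is rejected."""
    try:
        conn.execute("SELECT 1 AS A, 2 AS B UNION SELECT 3 AS B")
    except sqlite3.OperationalError:
        return True
    return False


print(union_rows)                     # [1, 2, 3, 4]
print(union_all_rows)                 # [1, 2, 3, 3, 4]
print(column_count_mismatch_fails())  # True
```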
If all you are looking for is the number of employees terminated between two dates, then there really isn't anything you need to union. All the employees are already together in one set (the EMPLOYEE table), so you just need to filter on the dates you want and count the rows. You don't even need the DEPARTMENT table to calculate this.
I could write the query for you... but I'll leave the rest to you.
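For readers who want to see the shape of that filter-and-count approach, here is a hedged sketch. It assumes the corrected table structure above (TerminationDate on EMPLOYEE), stores dates as ISO text, and runs on an in-memory SQLite database through Python's sqlite3 module rather than on SQL Server; the sample rows are invented for illustration.

```python
import sqlite3

# In-memory SQLite database as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMPLOYEE (userName TEXT, department INTEGER, terminationDate TEXT)"
)

# Invented sample rows: two terminations inside Q2 2015, one outside,
# and one employee who is still active (NULL termination date).
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
    [
        ("ann", 1, "2015-04-15"),
        ("bob", 1, None),
        ("cy", 2, "2015-06-30"),
        ("dee", 2, "2015-07-01"),
    ],
)

# Filter on the date range, then count. Rows with a NULL terminationDate
# never satisfy BETWEEN, so no separate IS NOT NULL test is needed.
terminated_in_quarter = conn.execute(
    """
    SELECT COUNT(*)
    FROM EMPLOYEE
    WHERE terminationDate BETWEEN '2015-04-01' AND '2015-06-30'
    """
).fetchone()[0]

print(terminated_in_quarter)  # 2
```

If the real column is a datetime with a time component, a half-open range (>= '2015-04-01' AND < '2015-07-01') is the safer filter, because BETWEEN with '2015-06-30' would stop at midnight at the start of that day.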

Related

component 'DEPARTMENT_ID' must be declared PLSQL

I have the following code.
DECLARE
TYPE t_dep IS TABLE OF DEPARTMENTS%ROWTYPE
INDEX BY BINARY_INTEGER;
v_dep t_dep;
BEGIN
FOR dep_rec IN
(SELECT department_name, location_id FROM Departments
ORDER BY department_id ASC)
LOOP
v_dep(dep_rec.department_id) := dep_rec;
END LOOP;
END;
And I get this error:
Error report -
ORA-06550: line 10, column 23:
PLS-00302: component 'DEPARTMENT_ID' must be declared
ORA-06550: line 10, column 9:
PL/SQL: Statement ignored
06550. 00000 - "line %s, column %s:\n%s"
*Cause: Usually a PL/SQL compilation error.
*Action:
This is the departments table.
Here is my task:
Write an anonymous PL/SQL block that declares and populates an INDEX BY table of records containing
department data. The table of records should use the departmentid as a primary key, and each element should
contain department name and location id. The data should be stored in the INDEX BY table of records in ascending
sequence of departmentid. The block should not display any output.
How can I deal with this error?
When you create your loop over this implicit cursor
SELECT department_name, location_id
FROM Departments
ORDER BY department_id ASC
you say that you are only interested in the columns department_name and location_id.
If you also need the ID, you have to add it to the select list:
SELECT department_name, location_id, department_id
Also, your structure is defined as
TABLE OF DEPARTMENTS%ROWTYPE
so you need all the columns of the table, in the right order, to populate it: you need manager_id too.
Another point: if you define a structure as INDEX BY BINARY_INTEGER, you have to use a binary integer to index it; are you sure department_id is a binary integer?
DEP_REC contains department_name and location_id, while you're using department_id in this statement:
v_dep(dep_rec.department_id) := dep_rec;
-------------
As it doesn't exist, your code fails.
How to fix it? Include department_id into cursor FOR loop's SELECT statement.
Also, as DEPARTMENTS table contains 4 columns, you'll have to select them all in order to make it work (currently, MANAGER_ID is missing).

GROUP BY or Aggregation Function error message [duplicate]

This question already has answers here:
GROUP BY / aggregate function confusion in SQL
(5 answers)
Closed 3 years ago.
I got an error -
Column 'Employee.EmpID' is invalid in the select list because it is
not contained in either an aggregate function or the GROUP BY clause.
select loc.LocationID, emp.EmpID
from Employee as emp full join Location as loc
on emp.LocationID = loc.LocationID
group by loc.LocationID
This situation fits the answer given by Bill Karwin.
Correction to the above; this fits the answer by ExactaBox -
select loc.LocationID, count(emp.EmpID) -- not count(*), don't want to count nulls
from Employee as emp full join Location as loc
on emp.LocationID = loc.LocationID
group by loc.LocationID
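The comment about count(*) in the corrected query can be checked with a small sketch. It runs on an in-memory SQLite database through Python's sqlite3 module as a stand-in for SQL Server, and it uses a LEFT JOIN from Location instead of the FULL JOIN above, because older SQLite versions lack FULL JOIN; both choices and the sample rows are assumptions made for illustration.

```python
import sqlite3

# In-memory SQLite database as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Location (LocationID INTEGER)")
conn.execute("CREATE TABLE Employee (EmpID INTEGER, LocationID INTEGER)")

# Location 10 has two employees; location 20 has none.
conn.executemany("INSERT INTO Location VALUES (?)", [(10,), (20,)])
conn.executemany("INSERT INTO Employee VALUES (?, ?)", [(1, 10), (2, 10)])

# COUNT(emp.EmpID) skips the NULL produced by the outer join for an empty
# location, while COUNT(*) counts the joined row itself.
counts = conn.execute(
    """
    SELECT loc.LocationID, COUNT(emp.EmpID), COUNT(*)
    FROM Location AS loc
    LEFT JOIN Employee AS emp ON emp.LocationID = loc.LocationID
    GROUP BY loc.LocationID
    ORDER BY loc.LocationID
    """
).fetchall()

print(counts)  # [(10, 2, 2), (20, 0, 1)]
```

Location 20 reports 0 employees with COUNT(emp.EmpID) but 1 with COUNT(*), which is why the corrected query counts the employee column.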
ORIGINAL QUESTION -
For the SQL query -
select *
from Employee as emp full join Location as loc
on emp.LocationID = loc.LocationID
group by (loc.LocationID)
I don't understand why I get this error. All I want to do is join the tables and then group all the employees in a particular location together.
I think I have a partial explanation for my own question. Tell me if it's OK -
To group all employees that work in the same location, we first have to mention the LocationID.
Then we cannot list each employee ID next to it. Rather, we give the total number of employees in that location, i.e. we should SUM() the employees working in that location. Why we do it the latter way, I am not sure.
So, this explains the "it is not contained in either an aggregate function" part of the error.
What is the explanation for the GROUP BY clause part of the error ?
Suppose I have the following table T:
a b
--------
1 abc
1 def
1 ghi
2 jkl
2 mno
2 pqr
And I do the following query:
SELECT a, b
FROM T
GROUP BY a
The output should have two rows, one row where a=1 and a second row where a=2.
But what should the value of b show on each of these two rows? There are three possibilities in each case, and nothing in the query makes it clear which value to choose for b in each group. It's ambiguous.
This demonstrates the single-value rule, which prohibits the undefined results you get when you run a GROUP BY query, and you include any columns in the select-list that are neither part of the grouping criteria, nor appear in aggregate functions (SUM, MIN, MAX, etc.).
Fixing it might look like this:
SELECT a, MAX(b) AS x
FROM T
GROUP BY a
Now it's clear that you want the following result:
a x
--------
1 ghi
2 pqr
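To make the ambiguity concrete, the sketch below loads the table T from this answer into an in-memory SQLite database via Python's sqlite3 module and asks each group how many candidate values of b it holds, along with its smallest and largest one. SQLite is only a stand-in here; unlike SQL Server it tolerates a bare non-aggregated column under GROUP BY, so the sketch uses aggregates only and does not try to reproduce the error itself.

```python
import sqlite3

# In-memory SQLite database as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (a INTEGER, b TEXT)")
conn.executemany(
    "INSERT INTO T VALUES (?, ?)",
    [(1, "abc"), (1, "def"), (1, "ghi"), (2, "jkl"), (2, "mno"), (2, "pqr")],
)

# Every group has three candidate values for b, so b can only appear in the
# select list once an aggregate says which single value to report.
summary = conn.execute(
    """
    SELECT a, COUNT(b) AS candidates, MIN(b) AS smallest, MAX(b) AS largest
    FROM T
    GROUP BY a
    ORDER BY a
    """
).fetchall()

print(summary)  # [(1, 3, 'abc', 'ghi'), (2, 3, 'jkl', 'pqr')]
```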
Your query would work in MySQL if the ONLY_FULL_GROUP_BY SQL mode were disabled (it was off by default before MySQL 5.7.5 and has been on by default since). But in this case, you are using a different RDBMS. So to make your query work, add all non-aggregated columns to your GROUP BY clause, e.g.
SELECT col1, col2, SUM(col3) totalSUM
FROM tableName
GROUP BY col1, col2
A non-aggregated column is one that is not passed into an aggregate function such as SUM, MAX, COUNT, etc.
Basically, what this error is saying is that if you use a GROUP BY clause, the result is a relation/table with one row per group. So in your SELECT list you can only use the columns that you are grouping by, plus aggregate functions applied to the other columns, because the individual values of those other columns do not appear in the resulting table.
"All I want to do is join the tables and then group all the employees
in a particular location together."
It sounds like what you want is for the output of the SQL statement to list every employee in the company, but first all the people in the Anaheim office, then the people in the Buffalo office, then the people in the Cleveland office (A, B, C, get it, obviously I don't know what locations you have).
In that case, drop the GROUP BY clause. All you need is ORDER BY loc.LocationID

Understanding an ambiguous column name for inner query

I ran into a weird query today that I thought would fail, but it succeeded in an unexpected way. Here's a minimal reproduction of it.
Tables and data:
CREATE TABLE Employee(ID int, Name varchar(max))
CREATE TABLE Engineer(ID int, Title varchar(max))
GO
INSERT INTO Employee(ID, Name) VALUES (1, 'Bobby')
INSERT INTO Employee(ID, Name) VALUES (2, 'Sue')
INSERT INTO Engineer(ID, Title) VALUES (1, 'Electrical Engineer')
INSERT INTO Engineer(ID, Title) VALUES (2, 'Network Engineer')
Queries:
--Find all Engineers with same title as Bobby has
SELECT * FROM Engineer WHERE Title IN (select Title from Employee WHERE Name = 'Bobby')
This returns all rows in Engineer table (unexpected, I thought it would fail). Note that the above query is incorrect. The inner query uses a column "Title" which doesn't exist in the table being selected from ("Employee"). So it must be binding the Title column value from Engineer in the outer query....which is always equal to itself so all rows are returned I think.
I can force the binding by fully qualifying the column name, and that fails as expected:
SELECT * FROM Engineer WHERE Title IN
(select Empl.Title from Employee Empl WHERE Name = 'Bobby')
This fails with "Invalid column name 'Title'."
Apparently if I were to add the Title column to the Employee table, it uses the Employee.Title column value instead.
ALTER TABLE Employee ADD Title varchar(max)
GO
UPDATE Employee SET Title = 'Electrical Engineer' WHERE ID = 1
UPDATE Employee SET Title = 'Network Engineer' WHERE ID = 2
SELECT * FROM Engineer WHERE Title IN
(select Title from Employee WHERE Name = 'Bobby')
This returns just one row (as expected).
I kind of understand what is happening here, what I'm looking for is a link to some documentation or some keyword that would help me read up and understand it fully (or even some explanation).
Of course it fails. There is no column named Title in your Employee table. In the query that does work it is a subquery so it is pulling Title from Engineer.
You can avoid this entirely if you develop the habit of ALWAYS referencing columns with 2 part naming instead of just the column name.
But in your queries you should start learning how to use joins instead of subqueries for everything. Your code would be far less confusing.
Since Title is not qualified and Employee has no such column, it uses the Title from table Engineer:
SELECT * FROM Engineer WHERE Title IN (select Title from Employee WHERE Name = 'Bobby')
In the last query it uses the closest Title (from Employee).
If you use aliases and two-part names, you stay out of this confusion.
As far as documentation goes, finding the closest column is probably an undocumented feature.
I found the documentation on the behavior: Qualifying Column Names in Subqueries
The general rule is that column names in a statement are implicitly qualified by the table referenced in the FROM clause at the same level. If a column does not exist in the table referenced in the FROM clause of a subquery, it is implicitly qualified by the table referenced in the FROM clause of the outer query.
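The rule quoted above can be reproduced in a few lines. The sketch below is an assumption-laden stand-in: it uses Python's sqlite3 module and an in-memory SQLite database instead of SQL Server, and declares the text columns as TEXT rather than varchar(max). SQLite resolves unqualified names outward in the same way, so the three cases from the question behave the same.

```python
import sqlite3

# In-memory SQLite database as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (ID INTEGER, Name TEXT)")
conn.execute("CREATE TABLE Engineer (ID INTEGER, Title TEXT)")
conn.executemany("INSERT INTO Employee VALUES (?, ?)", [(1, "Bobby"), (2, "Sue")])
conn.executemany(
    "INSERT INTO Engineer VALUES (?, ?)",
    [(1, "Electrical Engineer"), (2, "Network Engineer")],
)

QUERY = """
    SELECT * FROM Engineer
    WHERE Title IN (SELECT Title FROM Employee WHERE Name = 'Bobby')
    ORDER BY ID
"""

# 1. Employee has no Title column, so the unqualified name binds to the
#    outer Engineer.Title and every Engineer row matches itself.
unqualified_rows = conn.execute(QUERY).fetchall()

# 2. Qualifying the column with the inner table's alias makes the mistake
#    visible: the statement is rejected.
try:
    conn.execute(
        "SELECT * FROM Engineer WHERE Title IN "
        "(SELECT Empl.Title FROM Employee Empl WHERE Name = 'Bobby')"
    )
    qualified_fails = False
except sqlite3.OperationalError:
    qualified_fails = True

# 3. Once Employee has its own Title column, the same unqualified query
#    binds to the inner table instead.
conn.execute("ALTER TABLE Employee ADD COLUMN Title TEXT")
conn.execute("UPDATE Employee SET Title = 'Electrical Engineer' WHERE ID = 1")
conn.execute("UPDATE Employee SET Title = 'Network Engineer' WHERE ID = 2")
after_alter_rows = conn.execute(QUERY).fetchall()

print(unqualified_rows)  # both Engineer rows
print(qualified_fails)   # True
print(after_alter_rows)  # [(1, 'Electrical Engineer')]
```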

spx for moving values to new table

I am trying to create an spx which, for my ID (which is 1009), will move the data from 9 columns into a new table:
The old table has 9 columns:
CD_Train
CD_Date
CD_Score
Notes_Train
Notes_Date
Notes_Score
Ann_Train
Ann_Date
Ann_Score
userid - common in both tables
ID - 1009 - only exists in this table
and my new table has:
TrainingID,
TrainingType,
Score,
Date,
Status,
userid
TrainingType will have 3 values: Notes, CD, Ann
and other fields like score will get data from notes_score and so on
and date will get data from notes_date,cd_date depending upon in which column cd training went
status will get value from Notes_Train, cd_train and so on
Based on this, I am lost as to how I should do it.
I tried writing one SQL query against the users table with a join, but I cannot work out how to get it right.
No idea yet, how to fill your column trainingId but the rest can be done by applying some UNION ALL clauses:
INSERT INTO tbl2 (TrainingType, Date, Score, Status, userid)
SELECT 'CD', CD_Date, CD_Score, CD_Train, userid FROM tbl1 WHERE CD_Date > 0
UNION ALL
SELECT 'Notes', Notes_Date, Notes_Score, Notes_Train, userid FROM tbl1 WHERE Notes_Date > 0
UNION ALL
SELECT 'Ann', Ann_Date, Ann_Score, Ann_Train, userid FROM tbl1 WHERE Ann_Date > 0
I don't know as yet whether all columns are filled in each row. That is the reason for the where clauses which should filter out only those rows with relevant data in the three selected columns.
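A runnable sketch of this UNION ALL unpivot follows, using Python's sqlite3 module and an in-memory SQLite database as a stand-in for the real server. Several things here are assumptions rather than facts from the question: the column types, the sample row, the use of IS NOT NULL in place of the > 0 filters, and letting an auto-assigned integer primary key fill TrainingID (an identity column would be the rough SQL Server counterpart).

```python
import sqlite3

# In-memory SQLite database as a stand-in for the real server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE tbl1 (
        userid INTEGER,
        CD_Train TEXT, CD_Date TEXT, CD_Score INTEGER,
        Notes_Train TEXT, Notes_Date TEXT, Notes_Score INTEGER,
        Ann_Train TEXT, Ann_Date TEXT, Ann_Score INTEGER
    )
    """
)
# TrainingID is an auto-assigned integer key here (an assumption).
conn.execute(
    """
    CREATE TABLE tbl2 (
        TrainingID INTEGER PRIMARY KEY,
        TrainingType TEXT, Score INTEGER, Date TEXT, Status TEXT, userid INTEGER
    )
    """
)

# Invented sample row: user 7 took the CD and Ann trainings but not Notes.
conn.execute(
    "INSERT INTO tbl1 VALUES (7, 'Passed', '2015-01-10', 90, "
    "NULL, NULL, NULL, 'Passed', '2015-03-01', 70)"
)

# One SELECT per training type, stacked with UNION ALL; each WHERE clause
# skips rows where that training was never taken.
conn.execute(
    """
    INSERT INTO tbl2 (TrainingType, Date, Score, Status, userid)
    SELECT 'CD', CD_Date, CD_Score, CD_Train, userid
      FROM tbl1 WHERE CD_Date IS NOT NULL
    UNION ALL
    SELECT 'Notes', Notes_Date, Notes_Score, Notes_Train, userid
      FROM tbl1 WHERE Notes_Date IS NOT NULL
    UNION ALL
    SELECT 'Ann', Ann_Date, Ann_Score, Ann_Train, userid
      FROM tbl1 WHERE Ann_Date IS NOT NULL
    """
)

moved = conn.execute(
    "SELECT TrainingType, Score, Date, Status, userid "
    "FROM tbl2 ORDER BY TrainingType"
).fetchall()
print(moved)
```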

Obtain Duplicated Data

Please suggest an SQL query to find duplicate customers across different stores, e.g. customer table has id, name, phone, storeid in it, I need to write queries for the following:
Duplicate customers within a store
Duplicate customers across different stores
Table data:
id name phone storeid
-----------------------------------
1 abc 123 4
2 abc 123 4
3 abc 123 5
The first query should show only first 2 records, and the second query should show all 3 records.
You can do something like the following:
SELECT Name,Phone, COUNT(Id) NumberOfTimes, StoreID
FROM Customers
GROUP BY Name,Phone,StoreID
HAVING COUNT(Id) > 1
ORDER BY StoreID
Hope this helps.
Solution
You can try this for the first query:
SELECT *
FROM customer
WHERE 1 < (
SELECT COUNT(name)
FROM customer
WHERE name IN (
SELECT name FROM customer
)
) AND
1 < (
SELECT COUNT(storeid)
FROM customer
WHERE storeid IN (
SELECT storeid FROM customer
)
);
Now, for the second query, use the above one, but remove everything after and including the AND.
Explanation
Let's look at the query step-by-step:
SELECT *
FROM customer
This is stating you want all the columns from the customers table.
WHERE 1 < (
SELECT COUNT(name)
FROM customer
WHERE name IN (
SELECT name FROM customer
)
)
This is a pretty long query, so let's look from inside-outward.
WHERE name IN (
SELECT name FROM customer
)
This time we're getting all the names of customers and checking whether there is a match in our current table. To be truthful, we might not need this whole section....
SELECT COUNT(name)
FROM customer
This is stating we want the total number of times each name appears (count) in the customers table that matches the where clause.
WHERE 1 < (
....
)
Here, we are comparing the result from the subquery (the number of duplicated names) and checking to see if it is greater than 1 (i.e., there is a duplicate).
AND
.....
The AND keyword indicates that this second condition must be true in addition to the previous conditions.
The full query should return all entries where both the names and store ids are duplicated; if you remove everything including and after the AND, that will result in all entries which have the same name, but not necessarily the right store id.
Notes
The other two answers are suggesting grouping duplicated data, but in your particular case, I think you do want the duplicated entries as per your expected results (albeit you should add more expected output info than that).
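One caveat: the subqueries in the query above never refer to the outer row, so each COUNT is computed over the whole table rather than per customer. A hedged sketch of a correlated variant that does return the duplicated entries themselves is shown below. It runs on an in-memory SQLite database through Python's sqlite3 module as a stand-in, treats "duplicate" as same name and phone (an assumption), and adds a fourth, non-duplicate row to the question's sample data.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the real server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (id INTEGER, name TEXT, phone TEXT, storeid INTEGER)"
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?, ?)",
    [
        (1, "abc", "123", 4),
        (2, "abc", "123", 4),
        (3, "abc", "123", 5),
        (4, "xyz", "999", 5),  # extra non-duplicate row, added for illustration
    ],
)

# Duplicates within a store: another row exists with the same name, phone
# and storeid but a different id.
within_store = [r[0] for r in conn.execute(
    """
    SELECT c.id FROM customer AS c
    WHERE EXISTS (SELECT 1 FROM customer AS d
                  WHERE d.name = c.name AND d.phone = c.phone
                    AND d.storeid = c.storeid AND d.id <> c.id)
    ORDER BY c.id
    """
)]

# Duplicates across stores: same test without the storeid condition.
across_stores = [r[0] for r in conn.execute(
    """
    SELECT c.id FROM customer AS c
    WHERE EXISTS (SELECT 1 FROM customer AS d
                  WHERE d.name = c.name AND d.phone = c.phone
                    AND d.id <> c.id)
    ORDER BY c.id
    """
)]

print(within_store)   # [1, 2]
print(across_stores)  # [1, 2, 3]
```

These results match the expected output stated in the question: the first two records for the within-store case and all three for the cross-store case.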
SELECT storeName, customerName FROM customer
WHERE storeid IN ( -- compare against the store ids returned by the inner query
SELECT c.storeid
FROM customer c
RIGHT JOIN store s ON (c.storeid = s.id)
GROUP BY c.storeid
HAVING COUNT(*) > 1
)
Basically, we are grouping by storeids, which allows us to count the times they occur in the customer table. We get the id of a case where there are multiple occurrences, and we select the storeName and CustomerName from the customer table that contains the id we got from the inner query.

Resources