GROUP BY Works with Select columns not in GROUP BY - database

why is this GROUP BY still working when the SELECTed columns are neither in the GROUP BY clause, nor aggregate function.
DATABASE SCHEMA HERE
SELECT FirstName,
LastName,
City,
Email,
COUNT(I.CustomerId) AS Invoices
FROM Customers C INNER JOIN Invoices I
ON C.CustomerId = I.CustomerId
GROUP BY C.CustomerId

This syntax is allowed and documented in SQLite: Bare columns in an aggregate query.
The columns FirstName, LastName, City, Email are called bare columns.
Such columns get an arbitrary value with the exception of the case where one (and only this one) of MIN() or MAX() is used. In this case the values of the bare columns are taken from the row that contains then min or max aggregated value.
In any case be careful when you use this syntax because you would get unexpected results.

Two things I want to talk about:
Group by will work if the column name exists in the table you are working on. In your query, you have inner join Customer table with Invoice table. From your schema, I can see in the Invoice table CustomerId column exists.
In SQL serve have to give all the column name that you selected plus your desired column name. What I mean by that your query should be like this.
SELECT FirstName,
LastName,
City,
Email,
COUNT(I.CustomerId) AS Invoices
FROM Customers C INNER JOIN Invoices I
ON C.CustomerId = I.CustomerId
GROUP BY C.CustomerId,
LastName,
City,
Email
So, I think you are using MySQL that's why it's working.

Related

Insert Into TABLE from 3 separate tables

I am getting my face kicked in....
I have a total of 4 tables
1. Business (BusinessID, CustomerID, BusName, Territory)
2. Customer (CustomerID, Name)
3. Sales (BusinessID, CustomerID, Territory, Jan, Feb, Mar, Apr, May, Jun)
4. Performance (this is the table I want the info in)
I've already created the table to have the following columns, BusinessID, CustomerID, BusName, Name, Territory, Jan,Feb,Mar,Apr,May,Jun
Every time I try to insert its not properly joining and I am getting a bunch of errors "multi-part identifier could not be bound"
insert into Performance (BusinessID, CustomerID, BusName, Name, Territory, January2018, February2018, March2018, April2018, May2018, June2018)
select Business.BusinessID, Customer.CustomerID, Business.BusName, Customer.Name, Sales.Territory, Sales.January2018, Sales.February2018, Sales.March2018, Sales.April2018, Sales.May2018, Sales.June2018
from Business A
inner join Customer B ON a.CustomerID = b.CustomerID
inner join Sales C ON b.CustomerID = c.CustomerID;
Due to this error I had to do 3 seperate insert into and that caused a bunch of nulls....
face palm is happening and could use some advice.
Image: enter image description here
Thanks,
VeryNew2SQL
You have used table ALIASES, so you have to use those aliases in you SELECT
A for Business, B for Customer and C for Sales.
Read about ALIASES here.
select A.BusinessID, B.CustomerID, A.BusName, B.Name, C.Territory, C.January2018, C.February2018, C.March2018, C.April2018, C.May2018, C.June2018
from Business A
inner join Customer B ON a.CustomerID = b.CustomerID
inner join Sales C ON b.CustomerID = c.CustomerID;
When you create a table alias in your FROM and JOIN clauses, you need to refer to the aliases in your SELECT statement and not the actual table names.
Alternatively, leave your SELECT statement as it is, and adjust your table names to remove the alias. You'll then need the join conditions to refer to your actual table names, rather than the alias. So for example;
select Business.BusinessID, Customer.CustomerID, Business.BusName, Customer.Name, Sales.Territory, Sales.January2018, Sales.February2018, Sales.March2018, Sales.April2018, Sales.May2018, Sales.June2018
from Business
inner join Customer ON Business.CustomerID = Customer.CustomerID
inner join Sales ON Customer.CustomerID = Sales.CustomerID;
Even just try running the SELECT statement above first to make sure you get the query correct before trying it in your insert.

SQL Server query issues with GROUP BY

I have to write a query regarding the statement below:
List all directors who directed 50 movies or more, in descending order of the number of movies they directed. Return the directors' names and the number of movies each of them directed.
I have written multiple variations but I keep getting errors.
It involves joins. The tables involved are:
Directors (directorID, firstname, lastname),
Movie_Directors (directorID, movieID).
What I have tried so far is:
SELECT DISTINCT
firstname, lastname,
COUNT(movie_directors.directorID)
FROM
dbo.movie_directors
INNER JOIN
directors ON directors.directorID = movie_directors.directorID
GROUP BY
firstname, lastname
HAVING
COUNT(movie_directors.directorID) >= 50
Is this correct?
Whenever you use a GROUP BY, any column not in an aggregate function MUST be in the GROUP BY clause MSDN - Group By (Transact SQL).
The reasoning is this: a GROUP by smashes records by the unique sets of values of each column in the group by, so any column not in the GROUP BY or HAVING clause would be outside of the purpose of a group by.
So by forcing an aggregate function for the columns, we guarantee the select statement is purposeful in its results...which should be how you code anyways.
Also, COUNT() ignores NULL values anyways and your ON predicate will only return on matches between the two tables on the director_ID. INNER JOIN will not return null results.
So use use a COUNT(<group by colum>) in your select statement.
Lastly, your HAVING clause is another predicate and can only be used with a GROUP BY.
MSDN - HAVING (Transact-SQL)

SQLite - Return 0 if null

I have an assignment in Database Management Systems in which I have to write queries for given problems.
I have 4 problems, of which I solved 3 and stuck with the last one.
Details:
Using version 1.4 of the Chinook Database
(https://chinookdatabase.codeplex.com/).
SQLite DB Browser
Chinook Sqlite AutoIncrementPKs.sqlite file​ in the directory with Chinook files is the database I am working on
Problem Statement:
Write a query to generate a ranked list of employees based upon the amount of money brought in via customer invoices for which they were the support representative. The result set (see figure below) should have the following fields (in order) for all employees (even those that did not support any customers): ID (e_id), first name (e_first name), last name (e_last_name), title (e_title), and invoice total (total_invoices). The rows should be sorted by the invoice total (greatest first), then by last name (alphabetically), then first name (alphabetically). The invoice total should be preceded by a dollar sign ($) and have two digits after the decimal point (rounded, as appropriate); in the case of employees without any invoices, you should output a $0.00, not NULL. You may find it useful to look at the IFNULL, ROUND, and PRINTF functions of SQLite.
Desired Output:
My Query:
Select Employee.EmployeeId as e_id,
Employee.FirstName as e_first_name,
Employee.LastName as e_last_name,
Employee.Title as e_title,
'$' || printf("%.2f", Sum(Invoice.Total)) as total_invoices
From Invoice Inner Join Customer On Customer.CustomerId = Invoice.CustomerId
Inner Join Employee On Employee.EmployeeId = Customer.SupportRepId
Group by Employee.EmployeeId
Having Invoice.CustomerId in
(Select Customer.CustomerId From Customer
Where Customer.SupportRepId in
(Select Employee.EmployeeId From Employee Inner Join Customer On Employee.EmployeeId = Customer.SupportRepId)
)
order by sum(Invoice.Total) desc
My Output:
As you can see, the first three rows are correct but the later rows are not printed because employees don't have any invoices and hence EmployeeID is null.
How do I print the rows in this condition?
I tried with Coalesce and ifnull functions but I can't get them to work.
I'd really appreciate if someone can modify my query to get matching solutions.
Thanks!
P.S: This is the schema of Chinook Database
It often happens that it is simpler to use subqueries:
SELECT EmployeeId,
FirstMame,
LastName,
Title,
(SELECT printf("...", ifnull(sum(Total), 0))
FROM Invoice
JOIN Customer USING (CustomerId)
WHERE Customer.SupportRepId = Employee.EmployeeId
) AS total_invoices
FROM Employee
ORDER BY total_invoices DESC;
(The inner join could be replaced with a subquery, too.)
But it's possible that you are supposed to show that you have learned about outer joins, which generate a fake row containing NULL values if a matching row is not found:
...
FROM Employee
LEFT JOIN Customer ON Employee.EmployeeId = Customer.SupportRepId
LEFT JOIN Invoice USING (CustomerID)
...
And if you want to be a smartass, replace ifnull(sum(...), 0) with total(...).

Using group by to update multiple records in an update query

I have two tables. Table1 has company data (Company ID, Company Name ...), so single record for each company.
Table2 has information about departments in that company (Department ID, Department Name, Company ID, Company Name ... ). So, second table might have n number of records where same company id is used.
Problem is one of our trigger failed to work properly, and no one noticed till now. So, when Company Name was updated in Table1, it never reflected in Table2.
To correct this, I have to do something like the below query:
Update Table2
Set
[Company Name] = (select [Company Name]
from Table1
where Table2.Company ID = Table1.Company ID)
Group By Table2.Company ID
Basically, I am trying to update all records in Table2 to use the same name as Table1, for each record in Table1.
I am a bit confused about how to create the inner select clause.
P.S. Sorry, it might be a bit confusing. Kindly do let me know how to reword it the best.
Don't need to use group by ...
UPDATE T2
SET [Company Nane] = T1.[Company Nane]
FROM Table1 T1
INNER JOIN Table2 T2
ON T1.[Company ID] = T2.[Company ID]
As other have mentioned: Remove GROUP BY and your query works fine.
GROUP BY is used to produce a result row in a query that is an aggregate of other rows. E.g. one record per company from your department table with the most busy department per company. You cannot update such a result record, for that record does not exist in the table. You can only update table records.
So remove GROUP BY from your query and you have it straight-forward.
Update DepartmentTable
Set [Company Name] =
(
select [Company Name]
from CompanyTable
where CompanyTable.[Company ID] = DepartmentTable.[Company ID]
);

Need help creating a query for a non-normalized database

I've never worked with a non-normalized database before, so I'll try and explain my problem as best I can. So I have two tables:
The customers table holds all the customers information, and the orders table holds all the orders that they have placed. I haven't listed all the fields in the tables, just the ones that I need. The customer number in both tables is not the primary key, but I'm inner joining on them anyway. So the problem I'm having is that I don't know how to make a query that:
Selects all the customers with their first name, last name, and email, and also show the most recent orderdate, most recent total, and most recent ordertype. I know that I have to use a max() aggregate for the date, but that's as far as I got. Please help a noob out.
You can try:
SELECT FirstName,
LastName,
Email,
OrderDate,
OrderTotal,
OrderType
FROM Customers AS C
INNER JOIN Order AS O
ON O.CustomerNumber = C.CustomerNumber AND
O.OrderDate = (
SELECT MAX (O1.OrderDate)
FROM Order AS O1
WHERE O1.CustomerNumber = C.CustomerNumber)
)
assuming that Orders.OrderDate is unique for each CustomerNumber, does this work for you? if a single CustomerNumber has more than one entry in Order for OrderDate, you'll get each of those rows.
select c.FirstName, c.LastName, c.Email, o.OrderDate, o.OrderTotal, o.OrderType
from Customers c
join
(select CusomterNumber, max(OrderDate) as MostRecentOrderDate
from Orders
group by CustomerNumber
) mro on mro.CustomerNumber=s.CustomerNumber
join Orders o on o.OrderDate=mro.MostRecentOrdeDate and
o.CustomerNumber=mro.CustomerNumber
Try this:
SELECT
Customers.*, Orders.*
FROM
Customers
JOIN
(SELECT
Customer_Number,
MAX(Order_Date) OrderDate
FROM
Orders
GROUP BY
Customer_Number
) as Ord ON Customers.Customer_Number = Ord.Customer_Number
JOIN Order ON Orders.Customer_Number = Ord.Customer_Number
If you are doing this with SQL Server use the query designer and basically all you want to do is do a join since you have two keys that are the same one in Customer Table ->Customer Join on Order->Customer alias the Customer table as C and Orders table as O
so for example
SELECT Customer.*, Orders.*
From Customer c, Orders O INNER JOIN O where C.Customer Number = O.Customer Number
This should be enough to get you started.. if you don't want all the fields then fully qualify the names for example
SELECT C.FirstName, C.LastName, O.OrderDate, O.OrderType FROM Customer C, Orders O
WHERE C.Customer NUmber = O.Customer Number //this is another way of doing a Join when working with the where Clause.

Resources