SELECT statement with sub-query - sql-server

Instructions:
Business case: The accounting department would like a report of the top ten vendors with their last invoice date and average invoice amount.
Write a SELECT statement that returns three columns:
VendorName (from the Vendors table)
LatestInv (summary function that returns the last entry from InvoiceDate)
AverageInv: (summary function that returns the average from InvoiceTotal)
Hint: you will need to join two tables before joining to the derived table (the subquery)
Subquery portion: SELECT statement that returns the top ten VendorID and AverageInv (same name and function as described in the outer query). Group the results by the appropriate column and sort the results by AverageInv from largest to smallest. Correlate the subquery as BestVendors and join it to the correct table (where both share a key field).
Group the outer query by the appropriate column and sort the results by LatestInv, most recent to oldest.
My code
SELECT VendorName, MAX(InvoiceDate) AS LatestInv, AVG(InvoiceTotal) AS AverageInv
FROM Vendors v JOIN
(SELECT TOP 10 VendorID, AVG(InvoiceTotal) AS AverageInv
FROM Invoices
GROUP BY VendorID
ORDER BY AverageInv DESC) AS BestVendors
ON v.VendorID = BestVendors.VendorID
GROUP BY VendorName
ORDER BY LatestInv
MAX(InvoiceDate) and AVG(InvoiceTotal) both have a red line under them because they come from the Invoices table, not the Vendors table. However, if I use FROM Invoices in the outer query, then VendorName won't be recognized. How do I fix this and get the result set that this question is looking for?
Also these pics show some sample data from the Invoices and Vendors Table

Try this:
SELECT VendorName, BestVendors.LatestInv, BestVendors.AverageInv
FROM Vendors v
INNER JOIN
(
SELECT TOP 10 VendorID
,AVG(InvoiceTotal) AS AverageInv
,MAX(InvoiceDate) AS LatestInv
FROM Invoices
GROUP BY VendorID
ORDER BY AverageInv DESC
) AS BestVendors
ON v.VendorID = BestVendors.VendorID
ORDER BY LatestInv DESC
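For anyone who wants to try the idea without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows are made up, and since SQLite has no TOP, LIMIT 2 stands in for TOP 10 on this tiny data set. The point it illustrates is the same as above: compute both aggregates inside the derived table, then join to Vendors only to pick up the name.

```python
import sqlite3

# In-memory database with hypothetical sample data (not the real assignment data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Vendors (VendorID INTEGER PRIMARY KEY, VendorName TEXT);
CREATE TABLE Invoices (InvoiceID INTEGER PRIMARY KEY, VendorID INTEGER,
                       InvoiceDate TEXT, InvoiceTotal REAL);
INSERT INTO Vendors VALUES (1, 'Acme'), (2, 'Bolt'), (3, 'Crane');
INSERT INTO Invoices (VendorID, InvoiceDate, InvoiceTotal) VALUES
  (1, '2023-01-10', 100.0), (1, '2023-03-05', 300.0),
  (2, '2023-02-20', 50.0),
  (3, '2023-04-01', 400.0), (3, '2023-01-15', 600.0);
""")

# Both aggregates live in the derived table; the outer query needs no GROUP BY.
# LIMIT 2 is the SQLite stand-in for SQL Server's TOP 10.
query = """
SELECT v.VendorName, b.LatestInv, b.AverageInv
FROM Vendors v
JOIN (
    SELECT VendorID,
           AVG(InvoiceTotal) AS AverageInv,
           MAX(InvoiceDate)  AS LatestInv
    FROM Invoices
    GROUP BY VendorID
    ORDER BY AverageInv DESC
    LIMIT 2
) AS b ON v.VendorID = b.VendorID
ORDER BY b.LatestInv DESC
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With this data, Bolt has the lowest average and drops out, and the two remaining vendors come back ordered by their latest invoice date.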

Related

SQL Project using a where clause

This is what I am working with. I'm new to SQL and still learning, and I've been stuck on this for a few days now. Any advice would be appreciated. I attached an image of the goal I'm trying to achieve.
OrderItem And Product Table
Order And OrderItem Table(https://i.stack.imgur.com/pdbMT.png)
Scenario: Our boss would like to see the OrderNumber, OrderDate, Product Name, UnitPrice and Quantity for products that have TotalAmounts larger than the average
Create a query with a subquery in the WHERE clause. OrderNumber, OrderDate and TotalAmount come from the Order table. ProductName comes from the Product table. UnitPrice and Quantity come from the OrderItem table.
This is the code I came up with but it causes product name to run endlessly and displays wrong info.
USE TestCorp;
SELECT DISTINCT OrderNumber,
OrderDate,
ProductName,
i.UnitPrice,
Quantity,
TotalAmount
FROM [Order], Product
JOIN OrderItem i ON Product.UnitPrice = i.UnitPrice
WHERE TotalAmount < ( SELECT AVG(TotalAmount)
FROM [Order]
)
ORDER BY TotalAmount DESC;
Best guess assuming joins and fields not provided.
SELECT O.OrderNumber, O.orderDate, P.ProductName, OI.UnitPrice, OI.Quantity, O.TotalAmount
FROM [Order] O
INNER JOIN OrderItem OI
on O.ID = OI.orderID
INNER JOIN Product P
on P.ID= OI.ProductID
CROSS JOIN (SELECT avg(TotalAmount) AvgTotalAmount FROM [Order]) z
WHERE O.TotalAmount > z.AvgTotalAmount
Notes:
You're mixing join notations. Don't use the comma (,) and INNER JOIN together; that mixes the old comma-style (ANSI-89) and explicit JOIN (ANSI-92) syntaxes.
I'm not sure why you have a cross join to Product to begin with.
You don't specify how to join Order to OrderItem.
It seems very odd to be joining on price. Join on order ID or product ID, maybe?
You could cross join to an "average" result so it's available on every record. (I aliased this inline view "z" in my attempt.)
What the above does is include all orders. For each order, an order item must be associated for it to be included. Then, for each order item, a ProductID must be present and related to a record in Product. If for some reason an order item record doesn't have a related entry in the Product table, it gets excluded.
I use a cross join to get the average, as it's executed one time and applied/joined to every record.
If we use the query in the WHERE clause, it could be executed one time for EVERY record (unless the DB engine optimizer figures it out and generates a better plan).
I assume:
Order.ID relates to OrderItem.OrderID
OrderItem.productID relates to Product.ID
Order.TotalAmount is what we are wanting to "Average" and compare against
Every Order has an Order Item entry
Every Order Item entry has a related product.
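Since the assignment asks for a subquery in the WHERE clause, here is a small sketch (Python's built-in sqlite3, made-up rows, and the same assumed keys as listed above) that runs both the WHERE-subquery form and the cross-join form and checks that they return the same rows. The table layout is my guess, not the real TestCorp schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema following the assumptions above (Order.ID, OrderItem.OrderID, ...).
conn.executescript("""
CREATE TABLE [Order] (ID INTEGER PRIMARY KEY, OrderNumber TEXT,
                      OrderDate TEXT, TotalAmount REAL);
CREATE TABLE Product (ID INTEGER PRIMARY KEY, ProductName TEXT);
CREATE TABLE OrderItem (ID INTEGER PRIMARY KEY, OrderID INTEGER,
                        ProductID INTEGER, UnitPrice REAL, Quantity INTEGER);
INSERT INTO Product VALUES (1, 'Widget'), (2, 'Gadget');
INSERT INTO [Order] VALUES (1, 'A001', '2023-01-01', 100.0),
                           (2, 'A002', '2023-01-02', 200.0),
                           (3, 'A003', '2023-01-03', 600.0);
INSERT INTO OrderItem VALUES (1, 1, 1, 10.0, 10), (2, 2, 2, 20.0, 10),
                             (3, 3, 1, 10.0, 20), (4, 3, 2, 20.0, 20);
""")

select_and_joins = """
SELECT O.OrderNumber, P.ProductName, OI.UnitPrice, OI.Quantity
FROM [Order] O
JOIN OrderItem OI ON O.ID = OI.OrderID
JOIN Product P ON P.ID = OI.ProductID
"""

# Form 1: uncorrelated scalar subquery in the WHERE clause.
where_sql = select_and_joins + """
WHERE O.TotalAmount > (SELECT AVG(TotalAmount) FROM [Order])
ORDER BY O.OrderNumber, P.ProductName
"""

# Form 2: cross join to a one-row derived table holding the average.
cross_sql = select_and_joins + """
CROSS JOIN (SELECT AVG(TotalAmount) AS AvgTotalAmount FROM [Order]) z
WHERE O.TotalAmount > z.AvgTotalAmount
ORDER BY O.OrderNumber, P.ProductName
"""

where_rows = conn.execute(where_sql).fetchall()
cross_rows = conn.execute(cross_sql).fetchall()
print(where_rows)
```

The average here is 300, so only order A003 qualifies, and both forms return its two items.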

SQL - Filter calculated column with calculated column

I'm trying to find the most-dosed patients in a database. The sum of the doses has to be calculated, and then I have to dynamically list the patients who have been dosed that much. The query has to be dynamic, and there can be more than 5 patients listed. For example, if the 5 highest dose totals are 7, 6, 5, 4 and 3, but 3 people have gotten 5 doses, I'd have to list 7 people in total (the patients getting 7, 6, 5, 5, 5, 4 and 3 doses). I'm having issues because you cannot refer to a column alias in a WHERE clause, and I have no idea how to fix this.
The query goes like this:
SELECT
info.NAME, SUM(therapy.DOSE) AS total
FROM
dbo.PATIENT_INFORMATION_TBL info
JOIN
dbo.PATIENT_THERAPY_TBL therapy ON info.HOSPITAL_NUMBER = therapy.HOSPITAL_NUMBER
LEFT JOIN
dbo.FORMULARY_CLINICAL clinical ON clinical.ITEMID = therapy.ITEMID
WHERE
total IN (SELECT DISTINCT TOP 5 SUM(t.DOSE) AS 'DOSES'
FROM dbo.PATIENT_INFORMATION_TBL i
JOIN dbo.PATIENT_THERAPY_TBL t ON i.HOSPITAL_NUMBER = t.HOSPITAL_NUMBER
LEFT JOIN dbo.FORMULARY_CLINICAL c ON c.ITEMID = t.ITEMID
GROUP BY NAME
ORDER BY 'DOSES' DESC)
GROUP BY
info.NAME
ORDER BY
total DESC
The database looks like this:
The main question is: how can I use a where/having clause where I need to compare a calculated column to a list of dynamically calculated values?
I'm using Microsoft's SQL Server 2012. The DISTINCT in the subquery is needed so that only the top 5 distinct dosages appear (e.g. with DISTINCT I get 7,6,5,4,3; without DISTINCT I get 7,6,6,5,4; my goal is the first one).
Most DBMSes support Standard SQL Analytical Functions like DENSE_RANK:
with cte as
(
SELECT info.NAME, SUM(therapy.DOSE) as total,
DENSE_RANK() OVER (ORDER BY SUM(therapy.DOSE) DESC) AS dr
FROM dbo.PATIENT_INFORMATION_TBL info
JOIN dbo.PATIENT_THERAPY_TBL therapy ON info.HOSPITAL_NUMBER=therapy.HOSPITAL_NUMBER
LEFT JOIN dbo.FORMULARY_CLINICAL clinical ON clinical.ITEMID=therapy.ITEMID
GROUP BY info.NAME
)
select *
from cte
where dr <= 5 -- only the five highest doses
ORDER BY total desc
Btw, you probably don't need the LEFT JOIN as you're not selecting any column from dbo.FORMULARY_CLINICAL
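Here is a small runnable sketch of the DENSE_RANK idea using Python's built-in sqlite3 (window functions need SQLite 3.25 or later). The rows are invented, the table names drop the dbo. prefix, and I split the query into two CTEs (totals first, then ranking) purely for readability. It keeps the top 2 dose totals rather than 5 so the tie is visible on a tiny data set.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data; only the columns the query needs.
conn.executescript("""
CREATE TABLE PATIENT_INFORMATION_TBL (HOSPITAL_NUMBER INTEGER PRIMARY KEY, NAME TEXT);
CREATE TABLE PATIENT_THERAPY_TBL (HOSPITAL_NUMBER INTEGER, DOSE INTEGER);
INSERT INTO PATIENT_INFORMATION_TBL VALUES (1, 'Ann'), (2, 'Ben'), (3, 'Cal'), (4, 'Dee');
INSERT INTO PATIENT_THERAPY_TBL VALUES (1, 3), (1, 4), (2, 5), (3, 2), (3, 3), (4, 1);
""")

query = """
WITH totals AS (
    SELECT info.NAME, SUM(therapy.DOSE) AS total
    FROM PATIENT_INFORMATION_TBL info
    JOIN PATIENT_THERAPY_TBL therapy
      ON info.HOSPITAL_NUMBER = therapy.HOSPITAL_NUMBER
    GROUP BY info.NAME
),
ranked AS (
    -- Tied totals share a rank, and the next rank is not skipped.
    SELECT NAME, total, DENSE_RANK() OVER (ORDER BY total DESC) AS dr
    FROM totals
)
SELECT NAME, total, dr
FROM ranked
WHERE dr <= 2          -- keep the two highest distinct totals, ties included
ORDER BY total DESC, NAME
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Ben and Cal both total 5, so they share rank 2 and three patients come back even though only two distinct totals were requested.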

SQLite - Return 0 if null

I have an assignment in Database Management Systems in which I have to write queries for given problems.
I have 4 problems, of which I solved 3, and I am stuck on the last one.
Details:
Using version 1.4 of the Chinook Database
(https://chinookdatabase.codeplex.com/).
SQLite DB Browser
Chinook Sqlite AutoIncrementPKs.sqlite file in the directory with Chinook files is the database I am working on
Problem Statement:
Write a query to generate a ranked list of employees based upon the amount of money brought in via customer invoices for which they were the support representative. The result set (see figure below) should have the following fields (in order) for all employees (even those that did not support any customers): ID (e_id), first name (e_first_name), last name (e_last_name), title (e_title), and invoice total (total_invoices). The rows should be sorted by the invoice total (greatest first), then by last name (alphabetically), then first name (alphabetically). The invoice total should be preceded by a dollar sign ($) and have two digits after the decimal point (rounded, as appropriate); in the case of employees without any invoices, you should output a $0.00, not NULL. You may find it useful to look at the IFNULL, ROUND, and PRINTF functions of SQLite.
Desired Output:
My Query:
Select Employee.EmployeeId as e_id,
Employee.FirstName as e_first_name,
Employee.LastName as e_last_name,
Employee.Title as e_title,
'$' || printf("%.2f", Sum(Invoice.Total)) as total_invoices
From Invoice Inner Join Customer On Customer.CustomerId = Invoice.CustomerId
Inner Join Employee On Employee.EmployeeId = Customer.SupportRepId
Group by Employee.EmployeeId
Having Invoice.CustomerId in
(Select Customer.CustomerId From Customer
Where Customer.SupportRepId in
(Select Employee.EmployeeId From Employee Inner Join Customer On Employee.EmployeeId = Customer.SupportRepId)
)
order by sum(Invoice.Total) desc
My Output:
As you can see, the first three rows are correct but the later rows are not printed because employees don't have any invoices and hence EmployeeID is null.
How do I print the rows in this condition?
I tried with Coalesce and ifnull functions but I can't get them to work.
I'd really appreciate if someone can modify my query to get matching solutions.
Thanks!
P.S: This is the schema of Chinook Database
It often happens that it is simpler to use subqueries:
SELECT EmployeeId,
FirstName,
LastName,
Title,
(SELECT printf("...", ifnull(sum(Total), 0))
FROM Invoice
JOIN Customer USING (CustomerId)
WHERE Customer.SupportRepId = Employee.EmployeeId
) AS total_invoices
FROM Employee
ORDER BY total_invoices DESC;
(The inner join could be replaced with a subquery, too.)
But it's possible that you are supposed to show that you have learned about outer joins, which generate a fake row containing NULL values if a matching row is not found:
...
FROM Employee
LEFT JOIN Customer ON Employee.EmployeeId = Customer.SupportRepId
LEFT JOIN Invoice USING (CustomerID)
...
And if you want to be a smartass, replace ifnull(sum(...), 0) with total(...).
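To make the outer-join variant concrete, here is a minimal sketch run through Python's built-in sqlite3. The three tables are cut down to the columns used and filled with made-up rows, so this is not the real Chinook data, and the format string '$%.2f' is my own choice based on the problem statement. Note that it sorts on the numeric sum rather than on the formatted string, since sorting the text would compare character by character.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Cut-down, hypothetical stand-ins for the Chinook tables.
conn.executescript("""
CREATE TABLE Employee (EmployeeId INTEGER PRIMARY KEY, FirstName TEXT,
                       LastName TEXT, Title TEXT);
CREATE TABLE Customer (CustomerId INTEGER PRIMARY KEY, SupportRepId INTEGER);
CREATE TABLE Invoice (InvoiceId INTEGER PRIMARY KEY, CustomerId INTEGER, Total REAL);
INSERT INTO Employee VALUES (1, 'Jane', 'Peacock', 'Sales Support Agent'),
                            (2, 'Andrew', 'Adams', 'General Manager'),
                            (3, 'Nancy', 'Edwards', 'Sales Manager');
INSERT INTO Customer VALUES (10, 1), (11, 1);
INSERT INTO Invoice (CustomerId, Total) VALUES (10, 5.50), (11, 4.25), (10, 1.25);
""")

query = """
SELECT e.EmployeeId AS e_id,
       e.FirstName  AS e_first_name,
       e.LastName   AS e_last_name,
       e.Title      AS e_title,
       -- total() returns 0.0 when there are no rows to add, so no IFNULL is needed
       printf('$%.2f', total(i.Total)) AS total_invoices
FROM Employee e
LEFT JOIN Customer c ON c.SupportRepId = e.EmployeeId
LEFT JOIN Invoice  i ON i.CustomerId = c.CustomerId
GROUP BY e.EmployeeId
ORDER BY total(i.Total) DESC, e.LastName, e.FirstName
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Employees without customers survive the two LEFT JOINs as rows with NULL invoice columns, and total() turns those into 0.0 before formatting.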

Include subselect only if there is one result using tsql

We have an invoice, an invoice detail and an order table. The tables are linked by the invoice detail rows because the invoice details are grouped by delivery date, so an invoice often covers multiple order numbers.
Now I would like to build a view that displays the order number if only one order is involved in the invoice, using a subselect of some kind.
I came up with this one, but it still generates an error reporting that the subquery returns more than one result:
SELECT Invoice.Id, Invoice.TotalAmount,
(SELECT DISTINCT OrderId FROM InvoiceDetail
WHERE InvoiceDetail.InvoiceId = Invoice.Id
GROUP BY OrderId HAVING COUNT(DISTICT OrderId) = 1) AS OrderId
FROM Invoice
Any ideas to get this to work?
How about:
SELECT
Invoice.Id,
Invoice.TotalAmount,
OneOrder.OrderId
FROM
Invoice
LEFT JOIN (
SELECT InvoiceId, MIN(OrderId) OrderId
FROM InvoiceDetail
GROUP BY InvoiceId
HAVING COUNT(DISTINCT OrderId) = 1
) OneOrder ON OneOrder.InvoiceId = Invoice.Id
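A quick way to sanity-check the LEFT JOIN version is to run it on a few made-up rows. The sketch below uses Python's built-in sqlite3 with a guessed minimal schema (only the columns the query touches) and adds an ORDER BY so the output is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical minimal schema and data.
conn.executescript("""
CREATE TABLE Invoice (Id INTEGER PRIMARY KEY, TotalAmount REAL);
CREATE TABLE InvoiceDetail (Id INTEGER PRIMARY KEY, InvoiceId INTEGER, OrderId INTEGER);
INSERT INTO Invoice VALUES (1, 100.0), (2, 250.0), (3, 75.0);
-- Invoice 1: two detail rows, same order. Invoice 2: two different orders.
-- Invoice 3: no detail rows at all.
INSERT INTO InvoiceDetail (InvoiceId, OrderId) VALUES (1, 500), (1, 500), (2, 501), (2, 502);
""")

query = """
SELECT Invoice.Id, Invoice.TotalAmount, OneOrder.OrderId
FROM Invoice
LEFT JOIN (
    -- One row per invoice, kept only when it touches exactly one distinct order.
    SELECT InvoiceId, MIN(OrderId) AS OrderId
    FROM InvoiceDetail
    GROUP BY InvoiceId
    HAVING COUNT(DISTINCT OrderId) = 1
) OneOrder ON OneOrder.InvoiceId = Invoice.Id
ORDER BY Invoice.Id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Invoice 1 gets its single order number even though it has two detail rows, while invoices 2 and 3 come back with NULL (None in Python) in the OrderId column.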
Tested correct:
SELECT Id, TotalAmount, OrderInfo.OrderId
FROM Invoice
JOIN
(
SELECT InvoiceId, OrderId
FROM InvoiceDetail
JOIN Invoice
ON InvoiceDetail.InvoiceId = Invoice.Id
GROUP BY InvoiceId, OrderId
HAVING COUNT(OrderId)=1
) AS OrderInfo
ON Invoice.Id=OrderInfo.InvoiceId
Note the lack of DISTINCT in the HAVING clause; including it would be incorrect (it would cause multiple order ids to count as one, breaking the expected behavior).
Change
GROUP BY OrderId
to
GROUP BY InvoiceDetail.InvoiceId
Your problem may just be the typo in the HAVING clause. See DISTICT.

Using SELECT TOP from one column, then sorting on a different column

I'm using SQL Server 2005, and I want to query for the vendors generating the most revenue, sorted by the vendor's name. Below is the query I have tried. The inner subquery gets the 15 largest vendors sorted by revenue, and I try to order those results by the vendor name.
SELECT Revenue, VendorName
FROM (
SELECT TOP 15
SUM(po.POTotal) AS Revenue
, Vendors.VendorName AS VendorName
FROM PurchaseOrders po
INNER JOIN Vendors ON po.Vendor_ID = Vendors.Vendor_ID
WHERE ...
GROUP BY Vendors.VendorName
ORDER BY Revenue DESC
)
ORDER BY VendorName ASC
But this gives me an error message:
Msg 156, Level 15, State 1, Line 14
Incorrect syntax near the keyword 'ORDER'.
Is there another way to do this? I think this might be possible with a view, but I'd prefer not to do it that way.
I apologize if this is a duplicate, I don't even know what to search for to see if this has already been asked.
Add an alias for the subquery:
SELECT Revenue, VendorName
FROM (SELECT TOP 15
SUM(po.POTotal) AS Revenue,
v.VendorName AS VendorName
FROM PurchaseOrders po
JOIN Vendors v
ON po.Vendor_ID = v.Vendor_ID
WHERE ...
GROUP BY v.VendorName
ORDER BY Revenue DESC) Z
ORDER BY VendorName ASC
You need to give your derived table an alias:
...
ORDER BY Revenue DESC
) AS DerivedTable
ORDER BY VendorName;
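The pattern itself (take the top N by one column inside a derived table, then re-sort by another column outside) can be tried out with the sketch below, which uses Python's built-in sqlite3 and invented rows. SQLite has no TOP, so LIMIT 2 stands in for TOP 15, and the elided WHERE clause is simply left out. One caveat: SQLite does not insist on the derived-table alias the way SQL Server does, so this shows the shape of the query rather than reproducing the Msg 156 error.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data.
conn.executescript("""
CREATE TABLE Vendors (Vendor_ID INTEGER PRIMARY KEY, VendorName TEXT);
CREATE TABLE PurchaseOrders (PO_ID INTEGER PRIMARY KEY, Vendor_ID INTEGER, POTotal REAL);
INSERT INTO Vendors VALUES (1, 'Zeta'), (2, 'Alpha'), (3, 'Mid');
INSERT INTO PurchaseOrders (Vendor_ID, POTotal) VALUES
  (1, 400.0), (1, 500.0), (2, 500.0), (3, 100.0);
""")

query = """
SELECT Revenue, VendorName
FROM (
    SELECT SUM(po.POTotal) AS Revenue, v.VendorName AS VendorName
    FROM PurchaseOrders po
    JOIN Vendors v ON po.Vendor_ID = v.Vendor_ID
    GROUP BY v.VendorName
    ORDER BY Revenue DESC      -- inner sort decides who makes the cut
    LIMIT 2                    -- SQLite stand-in for TOP 15
) AS Z                         -- the alias SQL Server requires
ORDER BY VendorName ASC        -- outer sort decides the display order
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The lowest-revenue vendor is dropped by the inner query, and the two survivors are then listed alphabetically.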
I believe you can do this with a CTE:
WITH revenue (Revenue, VendorName)
AS
(SELECT TOP 15 SUM(po.POTotal) AS Revenue, Vendors.VendorName AS VendorName
FROM PurchaseOrders po
INNER JOIN Vendors
ON po.Vendor_ID = Vendors.Vendor_ID
WHERE ...
GROUP BY Vendors.VendorName
ORDER BY Revenue DESC)
SELECT Revenue, VendorName
FROM revenue
ORDER BY VendorName ASC
You can also do this without a sub-query if you like --
SELECT sum(po.POTotal) as Revenue, vendors.VendorName
FROM PurchaseOrders po INNER JOIN Vendors ON po.Vendor_ID = Vendors.Vendor_ID
WHERE ...
GROUP BY Vendors.VendorName
ORDER BY sum(po.POTotal) DESC, VendorName ASC
Try that and see if it works - we do the same sort of thing here and this was our solution...
Sorry, forgot the TOP 15 in the query above - it needs to go just before the sum() aggregate function.
