Problem Statement:
Write a query to return 2015 sales information for each supplier. We would like to include all suppliers in the result set, regardless of whether their products were sold in 2015.
Sales are determined using Sales.Orders and Sales.OrderLines as in the previous two questions. However, since we are asking for this information from the perspective of the supplier, you also need to use the tables Warehouse.StockItems and Purchasing.Suppliers.
The columns required in the result set are:
SupplierID - As it appears in the table Purchasing.Suppliers.
SupplierName - As it appears in the table Purchasing.Suppliers.
OrderCount - The number of orders placed on products for each supplier.
Sales - The subtotal from the orders placed, calculated from Quantity and UnitPrice of the table Sales.OrderLines.
The results should be sorted such that the supplier with the highest sales is at the top. If two suppliers have the same sales, next use the order count with the highest count at the top. If two suppliers have the same sales and order count, use the supplier name in ascending order as the final tie breaker. This will ensure a deterministic result.
I am using the WideWorldImporters Microsoft sample database. I am trying to return the 2015 sales information for each supplier in Purchasing.Suppliers, with the OrderCount and the sum of the 2015 sales in their own columns. I am having trouble with the joins, since I have to connect Suppliers to Warehouse.StockItems and then connect those items to the specific OrderLines, which have a StockItemID field.
The problem is that I would usually join Orders to OrderLines, so that I could filter to 2015 orders and therefore 2015 order lines. With the join path described above, however, I reach OrderLines first and have to join from OrderLines to Orders.
So I joined those Orders back to OrderLines to get the shape I am used to. Here is my attempt at a solution:
<pre>
SELECT S.SupplierID
,S.SupplierName
,COUNT(DISTINCT O.OrderID) AS OrderCount
,ISNULL(SUM(OLP.Quantity * OLP.UnitPrice), 0.00) AS Sales
FROM Purchasing.Suppliers AS S
LEFT OUTER JOIN Warehouse.StockItems AS W ON S.SupplierID = W.SupplierID
LEFT OUTER JOIN Sales.OrderLines AS OL ON W.StockItemID = OL.StockItemID
LEFT OUTER JOIN Sales.Orders AS O ON OL.OrderID = O.OrderID
AND O.OrderDate BETWEEN '2015-01-01' AND '2015-12-31'
LEFT OUTER JOIN Sales.OrderLines AS OLP ON O.OrderID = OLP.OrderID
GROUP BY S.SupplierID
,S.SupplierName
ORDER BY Sales DESC
,OrderCount
,SupplierName;
</pre>
Edit:
Results:
The result appears to include every supplier as expected, even those that had no sales or orders. I am not sure whether the calculated sales figure is correct, though, or how to verify it. Does anyone see a flaw in my query?
I have no idea if this is correct or the most efficient way to solve this problem. I do have constraints that I can only use joins, no subqueries, unions, etc.
Any help in understanding would be appreciated. Thank you.
To benchmark the orders, without regard to suppliers or order lines:
/* query 1 */
SELECT
COUNT(*) AS ordercount
FROM Sales.Orders AS o
WHERE o.OrderDate >= '20150101' AND o.OrderDate < '20160101'
Then to benchmark the order lines, without regard to suppliers:
/* query 2 */
SELECT
COUNT(DISTINCT o.OrderID) AS ordercount
, SUM(olp.Quantity * olp.UnitPrice) AS sales
FROM Sales.Orders AS o
INNER JOIN Sales.OrderLines AS olp ON olP.OrderID = o.OrderID
WHERE o.OrderDate >= '20150101' AND o.OrderDate < '20160101'
Now start introducing more joins; if the values change, the most recently added join is at fault:
/* query 3 */
SELECT
COUNT(DISTINCT o.OrderID) AS ordercount
, SUM(olp.Quantity * olp.UnitPrice) AS sales
FROM Sales.Orders AS o
INNER JOIN Sales.OrderLines AS olp ON olP.OrderID = o.OrderID
INNER JOIN Warehouse.StockItems AS w ON w.StockItemID = olp.StockItemID
WHERE o.OrderDate >= '20150101' AND o.OrderDate < '20160101'
and then:
/* query 4 */
SELECT
COUNT(DISTINCT o.OrderID) AS ordercount
, SUM(olp.Quantity * olp.UnitPrice) AS sales
FROM Sales.Orders AS o
INNER JOIN Sales.OrderLines AS olp ON olP.OrderID = o.OrderID
INNER JOIN Warehouse.StockItems AS w ON w.StockItemID = olp.StockItemID
INNER JOIN Purchasing.Suppliers s ON s.SupplierID = w.SupplierID
WHERE o.OrderDate >= '20150101' AND o.OrderDate < '20160101'
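The benchmarking idea can be tried on a toy data set. The sketch below is only an illustration: it uses Python's built-in sqlite3 module rather than SQL Server, and the two tables and their rows are hypothetical stand-ins for Sales.Orders and Sales.OrderLines. It compares a baseline total with the total obtained after OrderLines is joined a second time, which is the kind of change the step-by-step queries are meant to expose.

```python
import sqlite3

# Toy tables loosely mirroring Sales.Orders / Sales.OrderLines (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, OrderDate TEXT);
CREATE TABLE OrderLines (OrderLineID INTEGER PRIMARY KEY, OrderID INTEGER,
                         StockItemID INTEGER, Quantity INTEGER, UnitPrice REAL);
INSERT INTO Orders VALUES (1, '2015-03-01'), (2, '2015-06-15');
INSERT INTO OrderLines VALUES
    (1, 1, 10, 2, 5.0),   -- order 1, line worth 10
    (2, 1, 11, 1, 20.0),  -- order 1, line worth 20
    (3, 2, 10, 3, 5.0);   -- order 2, line worth 15
""")

# Baseline: each order line is counted exactly once.
baseline = conn.execute("""
    SELECT SUM(olp.Quantity * olp.UnitPrice)
    FROM Orders AS o
    INNER JOIN OrderLines AS olp ON olp.OrderID = o.OrderID
""").fetchone()[0]

# Joining OrderLines a second time repeats every line of an order once per
# line of that order, so the total is inflated.
doubled = conn.execute("""
    SELECT SUM(olp.Quantity * olp.UnitPrice)
    FROM OrderLines AS ol
    INNER JOIN Orders AS o ON o.OrderID = ol.OrderID
    INNER JOIN OrderLines AS olp ON olp.OrderID = o.OrderID
""").fetchone()[0]

print(baseline, doubled)
```

If the second number differs from the first, the extra join has multiplied rows, which is the signal to look for when adding joins one at a time.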
You certainly don't need to join order lines twice. As for the left joins, it depends on what you are trying to achieve, e.g.:
Only suppliers with orders in the date range:
/* query 5 */
SELECT
s.SupplierID
, s.SupplierName
, COUNT(DISTINCT o.OrderID) AS ordercount
, ISNULL(SUM(olp.Quantity * olp.UnitPrice), 0.00) AS sales
FROM Purchasing.Suppliers s
INNER JOIN Warehouse.StockItems AS w ON s.SupplierID = w.SupplierID
INNER JOIN Sales.OrderLines AS olp ON w.StockItemID = olp.StockItemID
INNER JOIN Sales.Orders AS o ON olP.OrderID = o.OrderID
WHERE o.OrderDate >= '20150101'
AND o.OrderDate < '20160101' -- note: this is "the next" day
GROUP BY
s.SupplierID
, s.SupplierName
ORDER BY
sales DESC
, ordercount DESC
, SupplierName;
All suppliers with stock references:
/* query 6 */
SELECT
s.SupplierID
, s.SupplierName
, COUNT(DISTINCT o.OrderID) AS ordercount
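-- count a line only when its order matched the 2015 filter (o.OrderID is NULL otherwise)
, ISNULL(SUM(CASE WHEN o.OrderID IS NOT NULL THEN olp.Quantity * olp.UnitPrice END), 0.00) AS sales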
FROM Purchasing.Suppliers s
INNER JOIN Warehouse.StockItems AS w ON s.SupplierID = w.SupplierID
LEFT JOIN Sales.OrderLines AS olp ON w.StockItemID = olp.StockItemID
LEFT JOIN Sales.Orders AS o ON olP.OrderID = o.OrderID
AND o.OrderDate >= '20150101'
AND o.OrderDate < '20160101' -- note: this is "the next" day
GROUP BY
s.SupplierID
, s.SupplierName
ORDER BY
sales DESC
, ordercount DESC
, SupplierName;
Every supplier:
/* query 7 */
SELECT
s.SupplierID
, s.SupplierName
, COUNT(DISTINCT o.OrderID) AS ordercount
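-- count a line only when its order matched the 2015 filter (o.OrderID is NULL otherwise)
, ISNULL(SUM(CASE WHEN o.OrderID IS NOT NULL THEN olp.Quantity * olp.UnitPrice END), 0.00) AS sales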
FROM Purchasing.Suppliers s
LEFT JOIN Warehouse.StockItems AS w ON s.SupplierID = w.SupplierID
LEFT JOIN Sales.OrderLines AS olp ON w.StockItemID = olp.StockItemID
LEFT JOIN Sales.Orders AS o ON olP.OrderID = o.OrderID
AND o.OrderDate >= '20150101'
AND o.OrderDate < '20160101' -- note: this is "the next" day
GROUP BY
s.SupplierID
, s.SupplierName
ORDER BY
sales DESC
, ordercount DESC
, SupplierName;
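In a LEFT JOIN chain shaped like query 7, order lines from other years still survive the join when the date filter sits on the Orders join, so it is worth checking that the sales sum only counts lines whose order actually matched. The sketch below is a hedged illustration using Python's sqlite3 module instead of SQL Server; the table names are simplified, the rows are hypothetical, and COALESCE stands in for ISNULL. It compares an unguarded sum with one guarded by a CASE on o.OrderID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Suppliers (SupplierID INTEGER PRIMARY KEY, SupplierName TEXT);
CREATE TABLE StockItems (StockItemID INTEGER PRIMARY KEY, SupplierID INTEGER);
CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, OrderDate TEXT);
CREATE TABLE OrderLines (OrderLineID INTEGER PRIMARY KEY, OrderID INTEGER,
                         StockItemID INTEGER, Quantity INTEGER, UnitPrice REAL);
INSERT INTO Suppliers VALUES (1, 'Acme'), (2, 'NoStock');
INSERT INTO StockItems VALUES (10, 1);
INSERT INTO Orders VALUES (1, '2014-05-01'), (2, '2015-05-01');
INSERT INTO OrderLines VALUES (1, 1, 10, 1, 100.0),  -- 2014 sale
                              (2, 2, 10, 1, 30.0);   -- 2015 sale
""")

rows = conn.execute("""
    SELECT s.SupplierName,
           COUNT(DISTINCT o.OrderID) AS ordercount,
           -- sums every order line of the supplier, whatever the year
           COALESCE(SUM(olp.Quantity * olp.UnitPrice), 0) AS unguarded,
           -- sums only the lines whose order passed the 2015 filter
           COALESCE(SUM(CASE WHEN o.OrderID IS NOT NULL
                             THEN olp.Quantity * olp.UnitPrice END), 0) AS guarded
    FROM Suppliers AS s
    LEFT JOIN StockItems AS w ON w.SupplierID = s.SupplierID
    LEFT JOIN OrderLines AS olp ON olp.StockItemID = w.StockItemID
    LEFT JOIN Orders AS o ON o.OrderID = olp.OrderID
                         AND o.OrderDate >= '2015-01-01'
                         AND o.OrderDate < '2016-01-01'
    GROUP BY s.SupplierID, s.SupplierName
    ORDER BY s.SupplierID
""").fetchall()

for row in rows:
    print(row)
```

The supplier with no stock items still appears with zeros, while the guarded column leaves out the 2014 line that the unguarded column picks up.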
Please be very cautious about using BETWEEN for date ranges. The most reliable way to define a date range is to use >= and < as shown above; that way the time precision of the data does not matter. Also, YYYYMMDD is the safest date literal format in T-SQL.
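The boundary effect can be sketched with a small example. This is an assumption-laden stand-in: it uses Python's sqlite3 module, where the timestamps are stored and compared as ISO-formatted text, to mimic a SQL Server datetime column in which the literal '2015-12-31' means midnight at the start of that day. The rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, OrderDate TEXT);
INSERT INTO Orders VALUES
    (1, '2015-01-01 00:00:00'),
    (2, '2015-12-31 14:30:00'),  -- last day of 2015, during the afternoon
    (3, '2016-01-01 00:00:00');
""")

# BETWEEN stops at the upper literal, so anything later on 31 December is lost.
between_count = conn.execute("""
    SELECT COUNT(*) FROM Orders
    WHERE OrderDate BETWEEN '2015-01-01' AND '2015-12-31'
""").fetchone()[0]

# The half-open range keeps all of 2015 regardless of the time part.
half_open_count = conn.execute("""
    SELECT COUNT(*) FROM Orders
    WHERE OrderDate >= '2015-01-01' AND OrderDate < '2016-01-01'
""").fetchone()[0]

print(between_count, half_open_count)
```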
Related
I have a sample db with 3 different tables, customers, orders, orderdetails.
The assignment is to show the customer name and address from the customers table, and then show each order's total amount by order ID. OrderDetails lists each order ID several times, one row per item with a unit price and quantity, so I have to multiply those and then sum the results per order.
Customers has a CustomerID field, which I can use to join to Orders, which has the same field. The Orders table has OrderID, which I can use to join to OrderDetails and sum the order total, but I do not know how to put the information together. The Customers table only has CustomerID and none of the fields needed to calculate the order total. So I'm trying to piece together data from 3 tables that share some related columns, though no single column is present in all of them.
I can do 2 separate select statements and each do what I expect but I have been trying to get the info together and have been unable to.
SELECT c.CustomerID, c.[Address], o.orderid
FROM Customers c
Join Orders o
ON c.CustomerID = o.CustomerID
--how to join these together?
SELECT od.orderid, SUM(od.UnitPrice*od.Quantity) as 'Subtotal'
FROM OrderDetails od
Join Orders o
ON od.OrderID = o.OrderID
Group by od.OrderID
I am trying to show this with the following information:
Customer Name, Address, OrderID, and Order Total.
Try this:
SELECT c.CustomerID, c.[Address], od.orderid, SUM(od.UnitPrice* od.Quantity) as 'Subtotal'
FROM OrderDetails od
Join Orders o
ON od.OrderID = o.OrderID
join Customers c ON c.CustomerID = o.CustomerID
Group by c.CustomerID, c.[Address], od.OrderID
You can join the three tables together as shown below. I am using a derived table, aliased od, to calculate the subtotal at the OrderID level.
SELECT c.CustomerID, c.[Address], o.orderid, SUM(od.Subtotal) as 'Subtotal'
FROM Customers c
Join Orders o
ON c.CustomerID = o.CustomerID
join (SELECT orderid, SUM(od.UnitPrice*od.Quantity) as Subtotal from OrderDetails od GROUP BY OrderId) as od
ON od.OrderID = o.OrderID
group by c.CustomerID, c.[Address], o.orderid
Mukesh's answer led me directly to the finish. I was able to leave the CustomerID out of the result with the following query. This helped a lot and I appreciate everyone's input.
SELECT c.CompanyName, c.[Address], od.orderid, SUM(od.UnitPrice* od.Quantity) as 'Subtotal'
FROM OrderDetails od
Join Orders o
ON od.OrderID = o.OrderID
join Customers c ON c.CustomerID = o.CustomerID
Group by c.CompanyName, c.[Address], od.OrderID
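For anyone who wants to check the shape of the result without the sample database, here is a minimal sketch of the same three-table join. It runs on Python's sqlite3 module rather than SQL Server, and the customer, orders, and detail rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (CustomerID TEXT PRIMARY KEY, CompanyName TEXT, Address TEXT);
CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID TEXT);
CREATE TABLE OrderDetails (OrderID INTEGER, UnitPrice REAL, Quantity INTEGER);
INSERT INTO Customers VALUES ('A', 'Alpha Ltd', '1 Main St');
INSERT INTO Orders VALUES (100, 'A'), (101, 'A');
INSERT INTO OrderDetails VALUES (100, 2.5, 4), (100, 10.0, 1), (101, 3.0, 3);
""")

# One row per order: the detail rows are multiplied out and summed per OrderID,
# and the customer columns ride along in the GROUP BY.
totals = conn.execute("""
    SELECT c.CompanyName, c.Address, o.OrderID,
           SUM(od.UnitPrice * od.Quantity) AS Subtotal
    FROM OrderDetails AS od
    JOIN Orders AS o ON o.OrderID = od.OrderID
    JOIN Customers AS c ON c.CustomerID = o.CustomerID
    GROUP BY c.CompanyName, c.Address, o.OrderID
    ORDER BY o.OrderID
""").fetchall()

for row in totals:
    print(row)
```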
I am using the WideWorldImporters Microsoft sample database. I am trying to return one row per customer, with columns showing their total sales for 2014, for 2015, and for the two years combined. I am not allowed to use subqueries of any kind.
For this problem I am trying to solve it like this:
SELECT C.CustomerID
,C.CustomerName
,ISNULL(SUM(OL.Quantity * OL.UnitPrice), 0.00) AS [2014Sales]
,ISNULL(SUM(OLP.Quantity * OLP.UnitPrice), 0.00) AS [2015Sales]
,ISNULL(SUM(OL.Quantity * OL.UnitPrice), 0.00)
+ ISNULL(SUM(OLP.Quantity * OLP.UnitPrice), 0.00) AS [TotalSales]
FROM Sales.Customers AS C
LEFT OUTER JOIN Sales.Orders AS O ON C.CustomerID = O.CustomerID
AND O.OrderDate BETWEEN '2014-01-01' AND '2014-12-31'
LEFT OUTER JOIN Sales.OrderLines AS OL ON O.OrderID = OL.OrderID
LEFT OUTER JOIN Sales.Orders AS OP ON O.CustomerID = C.CustomerID
AND O.OrderDate BETWEEN '2015-01-01' AND '2015-12-31'
LEFT OUTER JOIN Sales.OrderLines AS OLP ON OL.OrderID = O.OrderID
GROUP BY C.CustomerID
,C.CustomerName
ORDER BY TotalSales DESC, CustomerID;
I am having a hard time understanding joins. I come from having some object oriented experience and I can't quite wrap my head around relational joins. I can see how I could solve this with subqueries, one for 2015, one for 2014.
As it stands my query is running endlessly, which means that the way I tried to join twice must be trying to combine too many combinations of rows. Any help explaining what is happening here and how to fix my query would be much appreciated.
While editing your query I found some lines that didn't make sense:
You joined the same tables a second time, but the ON clauses don't reference the new aliases.
Notice that you aliased the second Sales.Orders and Sales.OrderLines as OP and OLP respectively, yet the join criteria still use O.CustomerID = C.CustomerID and OL.OrderID = O.OrderID:
LEFT OUTER JOIN Sales.OrderLines AS OL ON O.OrderID = OL.OrderID
LEFT OUTER JOIN Sales.Orders AS OP ON O.CustomerID = C.CustomerID
AND O.OrderDate BETWEEN '2015-01-01' AND '2015-12-31'
LEFT OUTER JOIN Sales.OrderLines AS OLP ON OL.OrderID = O.OrderID
See if this works; I just corrected your join criteria.
SELECT C.CustomerID
,C.CustomerName
,ISNULL(SUM(OL.Quantity * OL.UnitPrice), 0.00) AS [2014Sales]
,ISNULL(SUM(OLP.Quantity * OLP.UnitPrice), 0.00) AS [2015Sales]
,ISNULL(SUM(OL.Quantity * OL.UnitPrice), 0.00)
+ ISNULL(SUM(OLP.Quantity * OLP.UnitPrice), 0.00) AS [TotalSales]
FROM Sales.Customers AS C
LEFT OUTER JOIN Sales.Orders AS O ON C.CustomerID = O.CustomerID
AND CONVERT(date, O.OrderDate) BETWEEN '2014-01-01' AND '2014-12-31'
LEFT OUTER JOIN Sales.OrderLines AS OL ON O.OrderID = OL.OrderID
LEFT OUTER JOIN Sales.Orders AS OP ON C.CustomerID = OP.CustomerID
AND CONVERT(date, OP.OrderDate) BETWEEN '2015-01-01' AND '2015-12-31'
LEFT OUTER JOIN Sales.OrderLines AS OLP ON OP.OrderID = OLP.OrderID
GROUP BY C.CustomerID
,C.CustomerName
ORDER BY [TotalSales] DESC, CustomerID;
I also wrapped the date column in CONVERT(date, ...) to ignore the time part.
EDIT:
Without using a subquery I don't think this will work, so I'll leave this here in case you are allowed to use one.
SELECT [2014].CustomerID
,[2014].CustomerName
,[2014Sales]
,[2015Sales]
,[2014Sales] + [2015Sales] AS [TotalSales]
FROM
(SELECT C.CustomerID
,C.CustomerName
,ISNULL(SUM(OL.Quantity * OL.UnitPrice), 0.00) AS [2014Sales]
FROM Sales.Customers AS C
-- the date filter lives in the ON clause so customers without 2014 orders are kept
LEFT OUTER JOIN Sales.Orders AS O ON C.CustomerID = O.CustomerID
AND O.OrderDate BETWEEN '2014-01-01' AND '2014-12-31'
LEFT OUTER JOIN Sales.OrderLines AS OL ON O.OrderID = OL.OrderID
GROUP BY C.CustomerID
,C.CustomerName) AS [2014]
LEFT OUTER JOIN
(SELECT C2.CustomerID
,C2.CustomerName
,ISNULL(SUM(OLP.Quantity * OLP.UnitPrice), 0.00) AS [2015Sales]
FROM Sales.Customers AS C2
-- same idea for 2015: filter in the ON clause, not in WHERE
LEFT OUTER JOIN Sales.Orders AS OP ON OP.CustomerID = C2.CustomerID
AND OP.OrderDate BETWEEN '2015-01-01' AND '2015-12-31'
LEFT OUTER JOIN Sales.OrderLines AS OLP ON OP.OrderID = OLP.OrderID
GROUP BY C2.CustomerID
,C2.CustomerName) AS [2015] ON [2014].CustomerID = [2015].CustomerId
ORDER BY [2014Sales] + [2015Sales] DESC, [2014].CustomerID;
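A join-only alternative that may fit the "no subqueries" constraint is conditional aggregation: join Orders and OrderLines once over both years and let a CASE expression route each line into the right column. The sketch below is not taken from the thread; it uses Python's sqlite3 module in place of SQL Server, simplified table names, hypothetical rows, and COALESCE where T-SQL would use ISNULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT);
CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER, OrderDate TEXT);
CREATE TABLE OrderLines (OrderID INTEGER, Quantity INTEGER, UnitPrice REAL);
INSERT INTO Customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO Orders VALUES (1, 1, '2014-02-01'), (2, 1, '2015-07-01'),
                          (3, 2, '2015-03-10'), (4, 2, '2013-01-05');
INSERT INTO OrderLines VALUES (1, 2, 10.0), (1, 1, 5.0), (2, 4, 10.0),
                              (3, 1, 7.0), (4, 9, 9.0);
""")

sales = conn.execute("""
    SELECT c.CustomerID, c.CustomerName,
           -- each order line is joined once, then routed to a year column
           COALESCE(SUM(CASE WHEN o.OrderDate < '2015-01-01'
                             THEN ol.Quantity * ol.UnitPrice END), 0) AS Sales2014,
           COALESCE(SUM(CASE WHEN o.OrderDate >= '2015-01-01'
                             THEN ol.Quantity * ol.UnitPrice END), 0) AS Sales2015,
           COALESCE(SUM(ol.Quantity * ol.UnitPrice), 0) AS TotalSales
    FROM Customers AS c
    LEFT JOIN Orders AS o ON o.CustomerID = c.CustomerID
                         AND o.OrderDate >= '2014-01-01'
                         AND o.OrderDate < '2016-01-01'
    LEFT JOIN OrderLines AS ol ON ol.OrderID = o.OrderID
    GROUP BY c.CustomerID, c.CustomerName
    ORDER BY TotalSales DESC, c.CustomerID
""").fetchall()

for row in sales:
    print(row)
```

Because Orders and OrderLines are each joined only once, the 2014 rows are never paired with the 2015 rows, which avoids the row multiplication that made the double-join version run so long.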
I'm trying to rank a query by not just one count, but by two.
I want to rank customers by the order items per orders.
WITH CTE AS
(
SELECT
o.CustomerId,
COUNT(DISTINCT o.OrderId) AS OrderCount,
COUNT(oi.OrderItemId) AS OrderItemCount
FROM
OrderItem oi
INNER JOIN
Order o ON o.OrderId = oi.OrderId
WHERE
o.CategoryId = 52 -- website sales
GROUP BY
o.CustomerId
)
SELECT
cust.Code,
cust.DisplayTitle,
CTE.OrderCount,
CTE.OrderItemCount,
--AVG(CTE.OrderItemCount/CTE.OrderCount) AS SumProduct ????
FROM
CTE
INNER JOIN
Customer cust ON cust.CustomerId = CTE.CustomerId
GROUP BY
cust.Code,
cust.DisplayTitle,
CTE.OrderCount,
CTE.OrderItemCount
ORDER BY
SumProduct DESC
I'm basically trying to implement the T-SQL equivalent of SUMPRODUCT() in Excel.
SELECT
o.CustomerId,
COUNT(DISTINCT o.OrderId) AS OrderCount,
COUNT(oi.OrderItemId) AS OrderItemCount,
1.0 * COUNT(oi.OrderItemId) / COUNT(DISTINCT o.OrderId) AS AvgItemsPerOrder -- 1.0 * avoids integer division
FROM OrderItem oi
INNER JOIN [Order] o ON o.OrderId = oi.OrderId -- Order is a reserved word, so bracket it
WHERE o.CategoryId = 52 -- website sales
GROUP BY o.CustomerId
ORDER BY AvgItemsPerOrder DESC
Then just add the join to Customer.
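Dividing one COUNT by another is integer division in T-SQL unless one operand is made non-integer, which matters for an average such as items per order. The sketch below shows the effect with Python's sqlite3 module, which truncates integer division in the same way; the tables are cut down to the columns needed and the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE [Order] (OrderId INTEGER PRIMARY KEY, CustomerId INTEGER);
CREATE TABLE OrderItem (OrderItemId INTEGER PRIMARY KEY, OrderId INTEGER);
INSERT INTO [Order] VALUES (1, 7), (2, 7);
INSERT INTO OrderItem VALUES (1, 1), (2, 1), (3, 2);  -- 3 items over 2 orders
""")

# First column: integer / integer truncates. Second column: multiplying by 1.0
# first forces a fractional result.
int_avg, real_avg = conn.execute("""
    SELECT COUNT(oi.OrderItemId) / COUNT(DISTINCT o.OrderId),
           1.0 * COUNT(oi.OrderItemId) / COUNT(DISTINCT o.OrderId)
    FROM OrderItem AS oi
    INNER JOIN [Order] AS o ON o.OrderId = oi.OrderId
    GROUP BY o.CustomerId
""").fetchone()

print(int_avg, real_avg)
```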
I am trying to implement a pivoted table in sql but it is not working. What I currently have is the following:
WITH Pivoted
AS
(
select vg.ParentProductCategoryName, c.CompanyName, sd.LineTotal
FROM SalesLT.Product p join SalesLT.vGetAllCategories vg on p.ProductCategoryID = vg.ProductCategoryID
Join SalesLT.SalesOrderDetail sd on p.ProductID = sd.ProductID
JOIN SalesLT.SalesOrderHeader as soh ON sd.SalesOrderID = soh.SalesOrderID
JOIN SalesLT.Customer AS c ON soh.CustomerID = c.CustomerID
pivot(Sum([LineTotal]) for [ParentProductCategoryName] in (Accessories, Bikes, Clothing, Components)) AS sales
)
select * from Pivoted p;
;
I get the error:
multi part "Column name" Identifier could not be bounded.
If I removed the column names in the select part and used * instead, I get:
The column 'ProductCategoryID' was specified multiple times for...
What I want is to have a view of the total Revenue (as specified by the sum of the lineTotal in the SalesOrderDetail Table) per each ParentProductCategoryName (in vGetAllCategories) stated (pivoted as columns) with respect to each CompanyName (in Customer). How to better achieve this? Thanks.
Not sure why you'd need a CTE for this, but put your joins in a derived table and pivot that derived table instead.
SELECT *
FROM (SELECT vg.ParentProductCategoryName,
c.CompanyName,
sd.LineTotal
FROM SalesLT.Product p
JOIN SalesLT.vGetAllCategories vg ON p.ProductCategoryID = vg.ProductCategoryID
JOIN SalesLT.SalesOrderDetail sd ON p.ProductID = sd.ProductID
JOIN SalesLT.SalesOrderHeader AS soh ON sd.SalesOrderID = soh.SalesOrderID
JOIN SalesLT.Customer AS c ON soh.CustomerID = c.CustomerID
) t
PIVOT( SUM([LineTotal])
FOR [ParentProductCategoryName] IN (Accessories,Bikes,Clothing,Components) ) AS sales
Or you could just use the SUM aggregate with CASE:
SELECT c.CompanyName,
Accessories = SUM(CASE WHEN vg.ParentProductCategoryName = 'Accessories' THEN sd.LineTotal END),
Bikes = SUM(CASE WHEN vg.ParentProductCategoryName = 'Bikes' THEN sd.LineTotal END),
Clothing = SUM(CASE WHEN vg.ParentProductCategoryName = 'Clothing' THEN sd.LineTotal END),
Components = SUM(CASE WHEN vg.ParentProductCategoryName = 'Components' THEN sd.LineTotal END)
FROM SalesLT.Product p
JOIN SalesLT.vGetAllCategories vg ON p.ProductCategoryID = vg.ProductCategoryID
JOIN SalesLT.SalesOrderDetail sd ON p.ProductID = sd.ProductID
JOIN SalesLT.SalesOrderHeader AS soh ON sd.SalesOrderID = soh.SalesOrderID
JOIN SalesLT.Customer AS c ON soh.CustomerID = c.CustomerID
GROUP BY c.CompanyName
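The SUM with CASE pattern is portable, so it can be tried outside SQL Server. The following is a minimal sketch using Python's sqlite3 module; the single flattened Sales table and its rows are hypothetical and stand in for the result of the joins above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sales (CompanyName TEXT, Category TEXT, LineTotal REAL);
INSERT INTO Sales VALUES
    ('Alpha', 'Bikes', 100.0), ('Alpha', 'Bikes', 50.0),
    ('Alpha', 'Clothing', 20.0), ('Beta', 'Accessories', 5.0);
""")

# Each CASE passes LineTotal through only for its own category, so every SUM
# becomes one pivoted column; categories with no rows come back as NULL (None).
pivoted = conn.execute("""
    SELECT CompanyName,
           SUM(CASE WHEN Category = 'Accessories' THEN LineTotal END) AS Accessories,
           SUM(CASE WHEN Category = 'Bikes'       THEN LineTotal END) AS Bikes,
           SUM(CASE WHEN Category = 'Clothing'    THEN LineTotal END) AS Clothing
    FROM Sales
    GROUP BY CompanyName
    ORDER BY CompanyName
""").fetchall()

for row in pivoted:
    print(row)
```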
AdventureWorks2012 DB: I am trying to return the top 1 or 2 employees from the Finance and Engineering departments who have worked the longest. I can't get my query to return both; only results from Engineering show. Any suggestions?
SELECT TOP 2 EDH.StartDate, E.BusinessEntityID, D.Name, EDH.EndDate, DATEDIFF(hour,EDH.StartDate, GETDATE()) AS HoursWorked
FROM HumanResources.Employee E
INNER JOIN Person.Person PP ON E.BusinessEntityID = PP.BusinessEntityID
INNER JOIN HumanResources.EmployeeDepartmentHistory EDH ON E.BusinessEntityID = EDH.BusinessEntityID
INNER JOIN HumanResources.Department D ON D.DepartmentID = EDH.DepartmentID
WHERE (D.Name LIKE 'Finance' OR D.Name = 'Engineering')
AND EDH.EndDate IS NULL
GROUP BY D.Name, EDH.StartDate,E.BusinessEntityID,EDH.EndDate
ORDER BY EDH.StartDate ASC
Your problem is that the employees from Engineering happen to have started before the employees from Finance. The ORDER BY applies to all of your records (both departments), and TOP 2 then grabs the two employees with the earliest start dates, regardless of department.
If you are trying to write a query that returns the first employee from each department, you're going to have to get a bit more complex. Here is an example that uses the ROW_NUMBER() function to order employees within each department by their start date, then filters those records to only return employees who are the first individuals in their department.
SELECT
StartDate,
BusinessEntityID,
Name,
EndDate,
HoursWorked
FROM
(
SELECT
EDH.StartDate,
E.BusinessEntityID,
D.Name,
EDH.EndDate,
DATEDIFF(hour,EDH.StartDate, GETDATE()) AS HoursWorked,
ROW_NUMBER() OVER (PARTITION BY D.Name ORDER BY EDH.StartDate) AS RowNumberWithinDepartment
FROM
HumanResources.Employee E
INNER JOIN
Person.Person PP ON E.BusinessEntityID = PP.BusinessEntityID
INNER JOIN
HumanResources.EmployeeDepartmentHistory EDH ON E.BusinessEntityID = EDH.BusinessEntityID
INNER JOIN
HumanResources.Department D ON D.DepartmentID = EDH.DepartmentID
WHERE
(D.Name LIKE 'Finance' OR D.Name = 'Engineering') AND
EDH.EndDate IS NULL
GROUP BY D.Name, EDH.StartDate,E.BusinessEntityID,EDH.EndDate
) x
WHERE RowNumberWithinDepartment = 1
ORDER BY StartDate ASC
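The difference between a global TOP and a per-department ROW_NUMBER() can be seen on a few rows. This sketch assumes Python's sqlite3 module built against SQLite 3.25 or later (needed for window functions), uses LIMIT in place of TOP, and works on a hypothetical flattened History table rather than the AdventureWorks schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE History (BusinessEntityID INTEGER, Name TEXT, StartDate TEXT);
INSERT INTO History VALUES
    (1, 'Engineering', '2007-01-10'), (2, 'Engineering', '2008-05-01'),
    (3, 'Finance', '2009-02-01'),     (4, 'Finance', '2008-12-15');
""")

# Global "top 2": both rows come from the department that started earliest.
top2 = conn.execute("""
    SELECT BusinessEntityID FROM History
    ORDER BY StartDate LIMIT 2
""").fetchall()

# Numbering restarts inside each department, so rn = 1 picks one row per department.
first_per_dept = conn.execute("""
    SELECT Name, BusinessEntityID
    FROM (SELECT Name, BusinessEntityID,
                 ROW_NUMBER() OVER (PARTITION BY Name ORDER BY StartDate) AS rn
          FROM History) AS x
    WHERE rn = 1
    ORDER BY Name
""").fetchall()

print(top2, first_per_dept)
```

Changing the filter to rn <= 2 would return up to two employees per department instead of one.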