Group by grouping sets wtihout including all columns selected - sql-server

I have 4 columns: AccountNumber, Company, Batch, and Amount. I want to sum(amount) by Company and Batch but not by AccountNumber as AccountNumber is always unique and this wouldn't really accomplish anything. Is there a way using grouping sets to sum(amount) by Company and Batch but not by AccountNumber while still displaying AccountNumbers in the results set?
Thank you

Yes, this kind of problem requires a sub-query and a join. It is easy to understand if we think of the problem as two steps:
This will sum by company and batch:
select company, batch, sum(amount) as sum_amount
from atable
group by company, batch
Now just join that to your account numbers
select atable.accountnumber, subq.company, subq.batch, subq.sum_amount
from (
select company, batch, sum(amount) as sum_amount
from atable
group by company, batch
) subq
join atable on subq.company = atable.company and subq.batch = atable.batch

select b.Company, b.Batch, b.AccountNumber, s.Total
from baseTable b
inner join
(
select Company, Batch, sum(Amount) Total
from basetable
group by Company, Batch
) s on s.Company = b.Company and s.Batch = b.Batch
This will do it, but does give the total on each line. Not really any way to prevent that, though.

Related

Column 'ACCOUNT.ACCOUNT_ID' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause

I am trying to get available balance on last(max) date. I am trying to write below query but it is showing error.
select ACCOUNT_ID,AVAIL_BALANCE,OPEN_DATE,MAX(LAST_ACTIVITY_DATE)
from ACCOUNT
group by CUST_ID;
Column 'ACCOUNT.ACCOUNT_ID' is invalid in the select list because it
is not contained in either an aggregate function or the GROUP BY
clause.
I am new to sql. Can anyone let me know where I am wrong in this query?
Any column not having a calculation/function on it must be in the GROUP BY clause.
select ACCOUNT_ID,AVAIL_BALANCE,OPEN_DATE,MAX(LAST_ACTIVITY_DATE)
from ACCOUNT
group by ACCOUNT_ID,AVAIL_BALANCE,OPEN_DATE;
If you're wanting the most recent row for each customer, think ROW_NUMBER(), not GROUP BY:
;With Numbered as (
select *,ROW_NUMBER() OVER (
PARTITION BY CUST_ID
ORDER BY LAST_ACTIVITY_DATE desc) rn
from Account
)
select ACCOUNT_ID,AVAIL_BALANCE,OPEN_DATE,LAST_ACTIVITY_DATE
from Numbered
where rn=1
I think you want to select one records having max(LAST_ACTIVITY_DATE) for each CUST_ID.
For this you can use TOP 1 WITH TIES like following.
SELECT TOP 1 WITH TIES account_id,
avail_balance,
open_date,
last_activity_date
FROM account
ORDER BY Row_number()
OVER (
partition BY cust_id
ORDER BY last_activity_date DESC)
Issue with your query is, you can't select non aggregated column in select if you don't specify those columns in group by
If you want to get the max activity date for a customer then your query should be as below
select CUST_ID, MAX(LAST_ACTIVITY_DATE)
from ACCOUNT
group by CUST_ID;
You can't select any other column which is not in the group by clause. The error message also giving the same message.
with query(CUST_ID, LAST_ACTIVITY_DATE) as
(
select
CUST_ID,
MAX(LAST_ACTIVITY_DATE) as LAST_ACTIVITY_DATE
from ACCOUNT
group by CUST_ID
)
select
a.ACCOUNT_ID,
a.AVAIL_BALANCE,
a.OPEN_DATE,
a.LAST_ACTIVITY_DATE
from ACCOUNT as a
inner join query as q
on a.CUST_ID = q.CUST_ID
and a.LAST_ACTIVITY_DATE = q.LAST_ACTIVITY_DATE

Unique vs MAX in SQL statement

I have a table with three columns:
PERSON
VISITOR
DATE
The table is basically a transactional table. The following is true:
There are multiple rows per person
There are multiple rows per visitor
There are multiple rows of a given person/visitor combination.
Assumed unique person/date combination
What I need is
I want visitor for each Person's MAX Date.
I cannot have multiple persons in the output.
Person must be unique.
visitor may repeat.
I have tried:
SELECT
ROW_NUMBER() OVER (PARTITION BY PERSON, VISITOR ORDER BY Date DESC) row_num,
PERSON,
VISITOR as VISITOR
FROM
`TABLE`
ORDER BY
PERSON
Maybe this... not sure I fully understand question. Sample data /expected results would help.
You said you wanted only the 1 person with the visitor per max date so the row_num of 1 will be the record w/ the max date. and since we partition by person it will not matter if person A had 3 visitors. only the person and their Most recent visitor will be listed.
WITH cte as (
SELECT ROW_NUMBER() OVER (PARTITION BY PERSON ORDER BY Date DESC) row_num
, PERSON
, VISITOR as VISITOR
FROM `TABLE`)
SELECT *
FROM cte
WHERE row_Num = 1
I think this can be done with a cross apply too though i'm not as good at using them yet...
SELECT A.Person, A.Visitor, A.Date
FROM table A
CROSS APPLY (SELECT TOP 1 *
FROM TABLE B
WHERE A.Person = B.Person
and A.Visitor = B.Visitor
and A.Date = B.Date
ORDER BY DATE DESC) C
Essentially the inner query runs for each record on the outer query; thus only the top most record will be returned thus the newest date.
select a.* from myTable as a inner join (
SELECT person, max(date) as maxDate from myTable group by person
) as b
on a.date = b.maxDate
and a.person = b.person;
I am weak in reading and writing English.
In my opinion the answer may be:
SELECT `PERSON`, `VISITOR`, MAX(`DATE`) AS `DATE`
FROM `TABLE`
GROUP BY `PERSON`, `VISITOR`;

Subtracting two columns within the sql query

I have been trying to subtract two columns in sql server to form a third one. Below is my query
select AD.Id, Sum(APS.Amount) AS TotalDue,
isnull((select sum(Amount) from Activation where InvoiceId in (select InvoiceId from Invoices where AgreementId = AD.Id)),0)
As AllocatedToDate
from AdvantageDetails AD
inner join AllPaymentsSubstantial APS
on APS.AgreementId=AD.Id
where AD.OrganizationId=30
group by AD.Id
What I tried is below but it is not working. :
select AD.Id, Sum(APS.Amount) AS TotalDue,
isnull((select sum(Amount) from Activation where InvoiceId in (select InvoiceId from Invoices where AgreementId = AD.Id)),0)
As AllocatedToDate , (TotalDue-AllocatedToDate) as NewColumn
from AdvantageDetails AD
inner join AllPaymentsSubstantial APS
on APS.AgreementId=AD.Id
where AD.OrganizationId=30
group by AD.Id
At last I tried it using a CTE which worked fine. But I want to do it without creating CTE. Can there be any other way for performing the same functionality. I do not want to use CTE because it is forcasted that there
can be other columns which will be calculated in future.
with CTE as(select AD.Id, Sum(APS.Amount) AS TotalDue,
isnull((select sum(Amount) from Activation where InvoiceId in (select InvoiceId from Invoices where AgreementId = AD.Id)),0)
As AllocatedToDate , (TotalDue-AllocatedToDate) as NewColumn
from AdvantageDetails AD
inner join AllPaymentsSubstantial APS
on APS.AgreementId=AD.Id
where AD.OrganizationId=30
group by AD.Id) select * , (CTE.TotalDue-CTE.AllocatedToDate)As Newcolumn from CTE
You can do it without a CTE by repeating the entire formula that makes up AllocatedToDate.
You cannot use the alias of a column in the SELECT list, so you cannot do this:
SELECT {some calculation} AS ColumnA, (ColumnA - ColumnB) AS ColumnC
If you don't want to use a CTE or derived table, you have to do this:
SELECT {some calculation} AS ColumnA, ({some calculation} - ColumnB) AS ColumnC
And by the way, I can't imagine why the possibility of future columns being added is a reason not to use a CTE. To me, it sounds like a reason TO use a CTE, as you will only have to make changes in one place in the code, and not duplicate the same code in different places in the same query.
You can just use nested queries:
select Id, TotalDue, AllocatedToDate, (TotalDue-AllocatedToDate) as NewColumn
from (
select AD.Id, Sum(APS.Amount) AS TotalDue,
isnull((select sum(Amount) from Activation where InvoiceId in (select InvoiceId from Invoices where AgreementId = AD.Id)),0)
As AllocatedToDate
from AdvantageDetails AD
inner join AllPaymentsSubstantial APS
on APS.AgreementId=AD.Id
where AD.OrganizationId=30
group by AD.Id
) x

Need help creating a query for a non-normalized database

I've never worked with a non-normalized database before, so I'll try and explain my problem as best I can. So I have two tables:
The customers table holds all the customers information, and the orders table holds all the orders that they have placed. I haven't listed all the fields in the tables, just the ones that I need. The customer number in both tables is not the primary key, but I'm inner joining on them anyway. So the problem I'm having is that I don't know how to make a query that:
Selects all the customers with their first name, last name, and email, and also show the most recent orderdate, most recent total, and most recent ordertype. I know that I have to use a max() aggregate for the date, but that's as far as I got. Please help a noob out.
You can try:
SELECT FirstName,
LastName,
Email,
OrderDate,
OrderTotal,
OrderType
FROM Customers AS C
INNER JOIN Order AS O
ON O.CustomerNumber = C.CustomerNumber AND
O.OrderDate = (
SELECT MAX (O1.OrderDate)
FROM Order AS O1
WHERE O1.CustomerNumber = C.CustomerNumber)
)
assuming that Orders.OrderDate is unique for each CustomerNumber, does this work for you? if a single CustomerNumber has more than one entry in Order for OrderDate, you'll get each of those rows.
select c.FirstName, c.LastName, c.Email, o.OrderDate, o.OrderTotal, o.OrderType
from Customers c
join
(select CusomterNumber, max(OrderDate) as MostRecentOrderDate
from Orders
group by CustomerNumber
) mro on mro.CustomerNumber=s.CustomerNumber
join Orders o on o.OrderDate=mro.MostRecentOrdeDate and
o.CustomerNumber=mro.CustomerNumber
Try this:
SELECT
Customers.*, Orders.*
FROM
Customers
JOIN
(SELECT
Customer_Number,
MAX(Order_Date) OrderDate
FROM
Orders
GROUP BY
Customer_Number
) as Ord ON Customers.Customer_Number = Ord.Customer_Number
JOIN Order ON Orders.Customer_Number = Ord.Customer_Number
If you are doing this with SQL Server use the query designer and basically all you want to do is do a join since you have two keys that are the same one in Customer Table ->Customer Join on Order->Customer alias the Customer table as C and Orders table as O
so for example
SELECT Customer.*, Orders.*
From Customer c, Orders O INNER JOIN O where C.Customer Number = O.Customer Number
This should be enough to get you started.. if you don't want all the fields then fully qualify the names for example
SELECT C.FirstName, C.LastName, O.OrderDate, O.OrderType FROM Customer C, Orders O
WHERE C.Customer NUmber = O.Customer Number //this is another way of doing a Join when working with the where Clause.

Include subselect only if there is one result using tsql

We have an invoice, a invoice detail and a order table and the tables are linked by the invoice detail rows because the invoice details are grouped by delivery date so a invoice often covers multiple order numbers.
Now I would like to build a view that would display the order number if there is only one order involved in the invoice by using a subselect of some kind.
I came up with this one but it still generates an error reporting that the subquery return more than one result
SELECT Invoice.Id, Invoice.TotalAmount,
(SELECT DISTINCT OrderId FROM InvoiceDetail
WHERE InvoiceDetail.InvoiceId = Invoice.Id
GROUP BY OrderId HAVING COUNT(DISTICT OrderId) = 1) AS OrderId
FROM Invoice
Any ideas to get this to work?
How about:
SELECT
Invoice.Id,
Invoice.TotalAmount,
OneOrder.OrderId
FROM
Invoice
LEFT JOIN (
SELECT InvoiceId, MIN(OrderId) OrderId
FROM InvoiceDetail
GROUP BY InvoiceId
HAVING COUNT(DISTINCT OrderId) = 1
) OneOrder ON OneOrder.InvoiceId = Invoice.Id
Tested correct:
SELECT Id, TotalAmount, OrderInfo.OrderId
FROM Invoice
JOIN
(
SELECT InvoiceId, OrderId
FROM InvoiceDetail
JOIN Invoice
ON InvoiceDetail.InvoiceId = Invoice.Id
GROUP BY InvoiceId, OrderId
HAVING COUNT(OrderId)=1
) AS OrderInfo
ON Invoice.Id=OrderInfo.InvoiceId
Notice lack of DISTINCT in HAVING clause, which is incorrect (it would cause multiple order ids to count as one, breaking the expected behavior)
Change
GROUP BY OrderId
to
GROUP BY InvoiceDetail.InvoiceId
Your problem may just be the typo in the HAVING clause. See DISTICT.

Resources