Sql Server Group by subquery - sql-server

I have a database of sales transactions that have sales of multiple items identified by a unique 'sale number' in the salenum field. Some sales are taxable and are identified by a code field with the value 'T'. The non taxed sale just omits the 'T' value to indicate it is a non taxed transaction.
QUANTITY PARTN COST PRICE CODE SALENUM
3 SAS6895 2.38 9.99 D 411436
1 GELBKP 7.4458 11.5409 D 411436
3 BRW.5 0.1471 0.228 D 411436
1 GWG 24.5668 45.00 D 411436
1 MODC4 1.3767 3.5 D 411436
1 GPFQ 6.9969 10.8451 D 411436
1 Tax 6.605 6.605 T 411436
1 OTC 0.4144 0.99 D 411437
1 S777 1.71 2.6505 D 411437
In sale number 411436, the row with code T shows that this sale was taxed. Sale number 411437 has no T row, so it is an exempt sale.
What I want to do is query the table and sum the transactions grouped by sale number as taxable sales, and then run another query that shows exempt sales.
select sum(quantity*price) as total from business where date = '8/28/2014'
group by salenum
will show both types, but I can't filter by the tax row. I think this could be done with a subquery, but I am lost on the syntax at this point.
Thanks in advance
PS, I studied the help section on how to clearly state my question so feed back would be appreciated so I can be a better member of this forum

If you want two rows, one for taxable and one for non-taxable, then you can use two levels of aggregation. For instance, to get the results on two different rows:
select IsTaxable, sum(sumqp)
from (select salenum, sum(quantity * price) as sumqp,
max(case when code = 'T' then 1 else 0 end) as IsTaxable
from transactions t
group by salenum
) t
group by IsTaxable;
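To see the two-level aggregation run against the sample rows from the question, here is a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in for SQL Server (the query itself uses only standard SQL), and the column types are guesses. Note that the totals include the Tax line itself, just as the original sum(quantity*price) does.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions "
    "(quantity REAL, partn TEXT, cost REAL, price REAL, code TEXT, salenum INTEGER)"
)
# Sample rows copied from the question.
rows = [
    (3, "SAS6895", 2.38, 9.99, "D", 411436),
    (1, "GELBKP", 7.4458, 11.5409, "D", 411436),
    (3, "BRW.5", 0.1471, 0.228, "D", 411436),
    (1, "GWG", 24.5668, 45.00, "D", 411436),
    (1, "MODC4", 1.3767, 3.5, "D", 411436),
    (1, "GPFQ", 6.9969, 10.8451, "D", 411436),
    (1, "Tax", 6.605, 6.605, "T", 411436),
    (1, "OTC", 0.4144, 0.99, "D", 411437),
    (1, "S777", 1.71, 2.6505, "D", 411437),
]
conn.executemany("INSERT INTO transactions VALUES (?, ?, ?, ?, ?, ?)", rows)

# Inner level: one row per sale, flagged 1 if any line has code 'T'.
# Outer level: one row per flag value.
query = """
SELECT IsTaxable, SUM(sumqp) AS total
FROM (SELECT salenum,
             SUM(quantity * price) AS sumqp,
             MAX(CASE WHEN code = 'T' THEN 1 ELSE 0 END) AS IsTaxable
      FROM transactions
      GROUP BY salenum) AS t
GROUP BY IsTaxable
ORDER BY IsTaxable
"""
totals = {is_taxable: total for is_taxable, total in conn.execute(query)}
print(totals)  # key 0 = exempt sales, key 1 = taxable sales
```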

You can also calculate the two values as two columns:
select salenum, sum(CASE code WHEN 'T' THEN quantity * price ELSE 0 END) as TaxedAmount,
sum(CASE code WHEN 'D' THEN quantity * price ELSE 0 END) as NoTaxedAmount
from transactions t
group by salenum

Related

In T-SQL What is the best way to handle groups within groups?

This issue is from a booking system, the task is to add a flag to the table and also produce total number of visitors.
ItineraryID|BookingID|Type       |NumberOfPeople|*Flag
1001       |211      |Entry Fee  |2             |F
1001       |212      |Camping Fee|2             |T
1002       |361      |Entry Fee  |4             |T
1003       |388      |Entry Fee  |2             |F
1003       |389      |Entry Fee  |2             |F
1003       |390      |Camping Fee|2             |T
1003       |391      |Camping Fee|2             |T
1005       |401      |Camping Fee|2             |T
The last column is the one I want to create, and I can't find a good way to design the SQL query.
When an itinerary is issued, the visitors pay an Entry Fee and/or a Camping Fee. If both camping and entry were paid, then we should count "number of people" from the camping fee rows (mark those T). If an itinerary has only entry or only camping rows, then mark them T.
Further explanation:
Booking system had some bugs, so visitors may pay camping fee only and not buy entry ticket, e.g. 1005
Booking system has the ability to make group purchase and indicate visitor info separately. e.g. 1003: two couples made one transaction, paid for both entry and camping
For ItineraryID 1001 the total number of people is 2; for 1003 it is 4. For the example table above, to produce a total number of visitors, SUM(case when Type='Camping Fee' then NumberOfPeople else 0 end) OVER (PARTITION BY ItineraryID, Type) should be OK, but I am wondering whether there is a more robust way to do it.
And I am stuck at the flag column creation, the real table has over a million rows...
Consider this:
WITH tots AS(
SELECT
itineraryID,
SUM(CASE WHEN Type = 'Entry Fee' THEN NumberOfPeople ELSE 0 END) as E, -- ELSE 0 so a missing fee type sums to 0, not NULL
SUM(CASE WHEN Type = 'Camping Fee' THEN NumberOfPeople ELSE 0 END) as C
FROM
t
GROUP BY itineraryID
)
If we join this back to our table (SELECT * FROM t JOIN tots ON t.itineraryID = tots.itineraryID) then we can use the E and C values per row to work some things out:
If E or C is 0 then mark T ("If an itinerary only have entry or camping, then mark T")
If E = C and it's a Camping row then mark 'T'
If E = C and it's an Entry row mark 'F'
After this logic is done in a SELECT CASE WHEN, you just need to convert it to an UPDATE JOIN where you modify t (UPDATE t SET flag = CASE WHEN ... FROM t JOIN tots ...)
Or you can make a new table from the result of the select (or you can make a view of it and just query that, and it will work out the T/F dynamically each time)
NB: Your example data didn't seem to consider what happens if 2 entry and 4 camping are bought.. But it's easy to extend the logic
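The rules above can be sketched end to end as follows. This is only an illustration under some assumptions: it uses Python's sqlite3 module rather than SQL Server, the table is called t as in the answer, and the totals use ELSE 0 so that a missing fee type sums to 0 instead of NULL. Where both fee types exist, it marks every Camping row 'T' and every Entry row 'F' without checking E = C, which is one possible way to handle the unequal case the NB mentions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (ItineraryID INTEGER, BookingID INTEGER, Type TEXT, NumberOfPeople INTEGER)"
)
# Sample rows from the question, without the Flag column we want to derive.
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", [
    (1001, 211, "Entry Fee", 2),
    (1001, 212, "Camping Fee", 2),
    (1002, 361, "Entry Fee", 4),
    (1003, 388, "Entry Fee", 2),
    (1003, 389, "Entry Fee", 2),
    (1003, 390, "Camping Fee", 2),
    (1003, 391, "Camping Fee", 2),
    (1005, 401, "Camping Fee", 2),
])

query = """
WITH tots AS (
    SELECT ItineraryID,
           SUM(CASE WHEN Type = 'Entry Fee' THEN NumberOfPeople ELSE 0 END) AS E,
           SUM(CASE WHEN Type = 'Camping Fee' THEN NumberOfPeople ELSE 0 END) AS C
    FROM t
    GROUP BY ItineraryID
)
SELECT t.BookingID,
       CASE
           WHEN tots.E = 0 OR tots.C = 0 THEN 'T'  -- only one fee type on the itinerary
           WHEN t.Type = 'Camping Fee' THEN 'T'    -- both types: count the camping rows
           ELSE 'F'                                -- both types: skip the entry rows
       END AS Flag
FROM t
JOIN tots ON t.ItineraryID = tots.ItineraryID
ORDER BY t.BookingID
"""
flags = dict(conn.execute(query).fetchall())
print(flags)
```

The same CASE expression could then be moved into the UPDATE ... FROM form or a view, as described above.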

Selecting same MSSQL table with different condition to get the difference

I have a table FinTrans As
Seq|Ledger|Debit_Credit|Amount
1 |130000|Debit |105
2 |120000|Debit |1456
3 |130000|Credit |500
4 |130000|Debit |9680
5 |130000|Credit |1432
6 |120000|Debit |1628
I want to find (sum of Debit Amount) - (sum of Credit Amount) for each ledger.
E.g., in the case above, for Ledger 130000:
the sum of Debit Amount = 105 + 9680 = 9785
the sum of Credit Amount = 500 + 1432 = 1932
Difference = 7853
How can I write a SQL query on the same table?
You can put a CASE expression inside an aggregate function. This is called conditional aggregation.
SELECT Ledger, SUM(Amount * CASE WHEN Debit_Credit = 'Credit' THEN -1 ELSE 1 END) As Difference
FROM FinTrans
-- WHERE Ledger = 130000 -- optional
GROUP BY Ledger
It works here because of the commutative property, which says you don't have to add up all the credits and debits separately to subtract one from the other; you can do all the additions and subtractions in any order and still end up with the same result.
If you really want to, you can do it this way:
SELECT Ledger,
SUM(CASE WHEN Debit_Credit = 'Debit' THEN Amount ELSE 0 END)
- SUM(CASE WHEN Debit_Credit = 'Credit' THEN Amount ELSE 0 END) As Difference
FROM FinTrans
-- WHERE Ledger = 130000 -- optional
GROUP BY Ledger
It more resembles the problem description. But it's more complicated and slower, and again, it's not needed.
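A quick way to convince yourself that the two forms agree is to run both against the sample rows. The sketch below assumes Python's sqlite3 module in place of SQL Server and integer amounts; the queries are otherwise the ones above, grouped by the unqualified Ledger column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE FinTrans (Seq INTEGER, Ledger INTEGER, Debit_Credit TEXT, Amount INTEGER)"
)
# Sample rows from the question.
conn.executemany("INSERT INTO FinTrans VALUES (?, ?, ?, ?)", [
    (1, 130000, "Debit", 105),
    (2, 120000, "Debit", 1456),
    (3, 130000, "Credit", 500),
    (4, 130000, "Debit", 9680),
    (5, 130000, "Credit", 1432),
    (6, 120000, "Debit", 1628),
])

# Form 1: flip the sign of credits and take a single SUM.
signed_sum = """
SELECT Ledger,
       SUM(Amount * CASE WHEN Debit_Credit = 'Credit' THEN -1 ELSE 1 END) AS Difference
FROM FinTrans
GROUP BY Ledger
ORDER BY Ledger
"""

# Form 2: sum debits and credits separately, then subtract.
two_sums = """
SELECT Ledger,
       SUM(CASE WHEN Debit_Credit = 'Debit' THEN Amount ELSE 0 END)
     - SUM(CASE WHEN Debit_Credit = 'Credit' THEN Amount ELSE 0 END) AS Difference
FROM FinTrans
GROUP BY Ledger
ORDER BY Ledger
"""

signed_result = conn.execute(signed_sum).fetchall()
split_result = conn.execute(two_sums).fetchall()
print(signed_result)
print(split_result)
```

Both queries return one row per ledger with the same difference.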

SQL Server - Calculate AVG() using Joins

I have a Cab transport application
Each driver has a Trip and for each trip, there can be multiple customers (Cab pooling) giving their feedback.
Now I want to get the IDs of those drivers who got more than 10 five-star ratings (5*) and a minimum of 20% five-star ratings out of the total ratings received from their customers.
Let's say a driver got a total 40 feedbacks in the last 30 days out of which 16 are 5-star ratings, then this driver has met the criteria of minimum 10 5* star ratings and more than 20% 5* ratings. This driver id should be fetched.
SELECT TR.[DriverId]
,100.0 * AVG(CASE
WHEN FE.[Rating] = 5
THEN 1.0
ELSE 0
END) AS Percentage
FROM tblFeedback FE
LEFT OUTER JOIN tblTrip TR ON FE.TripId = TR.TripId
WHERE FE.DATE >= GETDATE() - 30
AND FE.Rating = 5
GROUP BY DriverId
HAVING COUNT(CASE
WHEN FE.[Rating] = 5
THEN DriverId
END) >= 10
AND 100 * AVG(CASE
WHEN FE.[Rating] = 5
THEN 1.0
ELSE 0
END) > 20
The above query is showing the Percentage as 100.000 for all the Drivers whose Id's are fetched, even those drivers whose total percentage is 18% are also fetched and their percentage is shown as 100%.
This query has screwed my report completely
Try this. You need to include all the ratings in order to calculate the percentage:
SELECT r.[DriverId], 100.0*r.five_stars/r.total_ratings AS Percentage
FROM (
SELECT TR.[DriverId],
SUM(CASE WHEN FE.Rating = 5 THEN 1 ELSE 0 END) AS five_stars,
COUNT(*) AS total_ratings -- every rating counts toward the denominator
FROM tblFeedback FE
INNER JOIN tblTrip TR ON FE.TripId = TR.TripId
WHERE FE.DATE >= GETDATE() - 30
GROUP BY TR.DriverId) r
WHERE r.five_stars>=10
AND 100.0*r.five_stars/r.total_ratings>20.0;
I think the issue is in your WHERE clause. This line in particular:
AND FE.Rating = 5
This is forcing the tblFeedback table to only return records that have a five-star rating, and therefore, only the five-star ratings are used in the calculation. Try taking that line out and see if the calculations are any closer to what you expect.
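To illustrate why the denominator must include every rating, here is a small sketch with made-up data. Assumptions: Python's sqlite3 module stands in for SQL Server, the 30-day date filter is left out because GETDATE() does not exist in SQLite, each hypothetical driver has a single trip, and the three drivers and their rating counts are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblTrip (TripId INTEGER PRIMARY KEY, DriverId INTEGER)")
conn.execute("CREATE TABLE tblFeedback (TripId INTEGER, Rating INTEGER)")

# Hypothetical drivers: (driver id, number of 5-star ratings, number of 3-star ratings)
drivers = [
    (1, 16, 24),  # 16 of 40 = 40%, at least 10 five-stars: should qualify
    (2, 12, 88),  # 12 of 100 = 12%: fails the percentage test
    (3, 5, 0),    # 100% five-star but only 5 of them: fails the count test
]
for driver_id, fives, others in drivers:
    conn.execute("INSERT INTO tblTrip VALUES (?, ?)", (driver_id, driver_id))
    conn.executemany(
        "INSERT INTO tblFeedback VALUES (?, ?)",
        [(driver_id, 5)] * fives + [(driver_id, 3)] * others,
    )

# No "Rating = 5" filter in WHERE: all ratings feed COUNT(*).
query = """
SELECT r.DriverId, 100.0 * r.five_stars / r.total_ratings AS Percentage
FROM (SELECT TR.DriverId,
             SUM(CASE WHEN FE.Rating = 5 THEN 1 ELSE 0 END) AS five_stars,
             COUNT(*) AS total_ratings
      FROM tblFeedback FE
      INNER JOIN tblTrip TR ON FE.TripId = TR.TripId
      GROUP BY TR.DriverId) AS r
WHERE r.five_stars >= 10
  AND 100.0 * r.five_stars / r.total_ratings > 20.0
"""
qualifying = conn.execute(query).fetchall()
print(qualifying)
```

Adding "AND FE.Rating = 5" back into the inner query would make total_ratings equal five_stars, which is exactly the 100% symptom described in the question.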

SQL Server - monthly avg of count

I want to be able to find out the monthly average of a count
My code at the moment is
SELECT
company,
COUNT(company) AS 'count'
FROM Information
GROUP BY company
I basically need it to be
SELECT company,
count(company) as 'count'
avg(count(company)) per month as 'average'
FROM Information
group by company
I want the result to look something like this
company count monthly average
a 5 6
b 13 14
c 2 2
d 45 45
e 23 21
f 6 5
A very simple approach would be to count per company and month first and then aggregate this data to get the total and average per company.
select
company,
sum(cnt) as records,
avg(cnt) as records_per_month
from
(
select company, year(start_date) as yr, month(start_date) as mth, count(*) as cnt -- derived-table columns need names in SQL Server
from information
group by company, year(start_date), month(start_date)
) agg
group by company;
But read my comment to your question.
SELECT YEAR(yourDate) * 100 + MONTH(yourDate) AS YYMM,
company,
COUNT(company) AS 'count',
AVG(COUNT(company)) OVER (PARTITION BY company) AS 'average' -- average of the monthly counts for each company
FROM Information
GROUP BY company
,YEAR(yourDate) * 100 + MONTH(yourDate)
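The count-per-month-then-average idea can be tried on a few rows like this. It is a sketch under assumptions: Python's sqlite3 module replaces SQL Server, so strftime('%Y-%m', ...) stands in for YEAR() and MONTH(); the start_date column name and the sample rows are hypothetical. Months in which a company has no rows at all do not appear in the inner result, so they do not pull the average down.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE information (company TEXT, start_date TEXT)")
# Hypothetical rows: company 'a' has 3 records in January and 1 in February,
# company 'b' has 2 records in January.
conn.executemany("INSERT INTO information VALUES (?, ?)", [
    ("a", "2014-01-05"), ("a", "2014-01-20"), ("a", "2014-01-28"),
    ("a", "2014-02-11"),
    ("b", "2014-01-07"), ("b", "2014-01-09"),
])

# Inner query: one row per company and month with its count.
# Outer query: total and average of those monthly counts per company.
query = """
SELECT company, SUM(cnt) AS records, AVG(cnt) AS records_per_month
FROM (SELECT company, strftime('%Y-%m', start_date) AS ym, COUNT(*) AS cnt
      FROM information
      GROUP BY company, strftime('%Y-%m', start_date)) AS agg
GROUP BY company
ORDER BY company
"""
result = conn.execute(query).fetchall()
print(result)
```

One difference to keep in mind: SQLite's AVG always returns a floating-point value, while SQL Server's AVG over an int column returns an int, so on SQL Server you may want avg(1.0 * cnt).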

sql cross join - what use has anyone found for it? [closed]

Today, for the first time in 10 years of development with sql server I used a cross join in a production query. I needed to pad a result set to a report and found that a cross join between two tables with a creative where clause was a good solution. I was wondering what use has anyone found in production code for the cross join?
Update: the code posted by Tony Andrews is very close to what I used the cross join for. Believe me, I understand the implications of using a cross join and would not do so lightly. I was excited to have finally used it (I'm such a nerd) - sort of like the time I first used a full outer join.
Thanks to everyone for the answers! Here's how I used the cross join:
SELECT CLASS, [Trans-Date] as Trans_Date,
SUM(CASE TRANS
WHEN 'SCR' THEN [Std-Labor-Value]
WHEN 'S+' THEN [Std-Labor-Value]
WHEN 'S-' THEN [Std-Labor-Value]
WHEN 'SAL' THEN [Std-Labor-Value]
WHEN 'OUT' THEN [Std-Labor-Value]
ELSE 0
END) AS [LABOR SCRAP],
SUM(CASE TRANS
WHEN 'SCR' THEN [Std-Material-Value]
WHEN 'S+' THEN [Std-Material-Value]
WHEN 'S-' THEN [Std-Material-Value]
WHEN 'SAL' THEN [Std-Material-Value]
ELSE 0
END) AS [MATERIAL SCRAP],
SUM(CASE TRANS WHEN 'RWK' THEN [Act-Labor-Value] ELSE 0 END) AS [LABOR REWORK],
SUM(CASE TRANS
WHEN 'PRD' THEN [Act-Labor-Value]
WHEN 'TRN' THEN [Act-Labor-Value]
WHEN 'RWK' THEN [Act-Labor-Value]
ELSE 0
END) AS [ACTUAL LABOR],
SUM(CASE TRANS
WHEN 'PRD' THEN [Std-Labor-Value]
WHEN 'TRN' THEN [Std-Labor-Value]
ELSE 0
END) AS [STANDARD LABOR],
SUM(CASE TRANS
WHEN 'PRD' THEN [Act-Labor-Value] - [Std-Labor-Value]
WHEN 'TRN' THEN [Act-Labor-Value] - [Std-Labor-Value]
--WHEN 'RWK' THEN [Act-Labor-Value]
ELSE 0 END) -- - SUM([Std-Labor-Value]) -- - SUM(CASE TRANS WHEN 'RWK' THEN [Act-Labor-Value] ELSE 0 END)
AS [LABOR VARIANCE]
FROM v_Labor_Dist_Detail
where [Trans-Date] between @startdate and @enddate
--and CLASS = (CASE @class WHEN '~ALL' THEN CLASS ELSE @class END)
GROUP BY [Trans-Date], CLASS
UNION --REL 2/6/09 Pad result set with any missing dates for each class.
select distinct [Description] as class, cast([Date] as datetime) as [Trans-Date], 0,0,0,0,0,0
FROM Calendar_To_Fiscal cross join PRMS.Product_Class
where cast([Date] as datetime) between @startdate and @enddate and
not exists (select class FROM v_Labor_Dist_Detail vl where [Trans-Date] between @startdate and @enddate
and vl.[Trans-Date] = cast(Calendar_To_Fiscal.[Date] as datetime)
and vl.class= PRMS.Product_Class.[Description]
GROUP BY [Trans-Date], CLASS)
order by [Trans-Date], CLASS
A typical legitimate use of a cross join would be a report that shows e.g. total sales by product and region. If no sales were made of product P in region R then we want to see a row with a zero, rather than just not showing a row.
select r.region_name, p.product_name, coalesce(sum(s.sales_amount), 0) as total_sales -- 0 instead of NULL when there are no sales
from regions r
cross join products p
left outer join sales s on s.region_id = r.region_id
and s.product_id = p.product_id
group by r.region_name, p.product_name
order by r.region_name, p.product_name;
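Here is a small runnable sketch of that padding pattern. Assumptions: Python's sqlite3 module is used in place of SQL Server, the two regions, two products and three sales rows are invented, and COALESCE is applied so that combinations without sales show 0 rather than NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE regions (region_id INTEGER, region_name TEXT);
CREATE TABLE products (product_id INTEGER, product_name TEXT);
CREATE TABLE sales (region_id INTEGER, product_id INTEGER, sales_amount REAL);
-- Hypothetical data: no sales of Q in East, no sales of P in West.
INSERT INTO regions VALUES (1, 'East'), (2, 'West');
INSERT INTO products VALUES (1, 'P'), (2, 'Q');
INSERT INTO sales VALUES (1, 1, 10.0), (1, 1, 5.0), (2, 2, 7.5);
""")

# CROSS JOIN builds every (region, product) pair; the LEFT OUTER JOIN then
# attaches sales where they exist, so pairs without sales still get a row.
query = """
SELECT r.region_name, p.product_name,
       COALESCE(SUM(s.sales_amount), 0) AS total_sales
FROM regions r
CROSS JOIN products p
LEFT OUTER JOIN sales s ON s.region_id = r.region_id
                       AND s.product_id = p.product_id
GROUP BY r.region_name, p.product_name
ORDER BY r.region_name, p.product_name
"""
report = conn.execute(query).fetchall()
for row in report:
    print(row)
```

With an inner join from sales to regions and products instead, the two empty combinations would simply be missing from the report.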
One use I've come across a lot is splitting records out into several records, mainly for reporting purposes.
Imagine a string where each character represents some event during the corresponding hour.
ID | Hourly Event Data
1 | -----X-------X-------X--
2 | ---X-----X------X-------
3 | -----X---X--X-----------
4 | ----------------X--X-X--
5 | ---X--------X-------X---
6 | -------X-------X-----X--
Now you want a report which shows how many events happened in each hour. Cross join the table with a table of IDs 1 to 24, then work your magic...
SELECT
[hours].id,
SUM(CASE WHEN SUBSTRING([data].string, [hours].id, 1) = 'X' THEN 1 ELSE 0 END)
FROM
[data]
CROSS JOIN
[hours]
GROUP BY
[hours].id
=>
1, 0
2, 0
3, 0
4, 2
5, 0
6, 2
7, 0
8, 1
9, 0
10, 2
11, 0
12, 0
13, 2
14, 1
15, 0
16, 1
17, 2
18, 0
19, 0
20, 1
21, 1
22, 3
23, 0
24, 0
I have different reports that prefilter the recordset (by various lines of business within the firm), but there were calculations that required percentages of revenue firm-wide. The recordsource had to contain the firm total instead of relying on calculating the overall sum in the report itself.
Example: The recordset has balances for each client and the Line of Business the client's revenue comes from. The report may only show 'retail' clients. There is no way to get a sum of the balances for the entire firm, but the report shows the percentage of the firm's revenue.
Since there are different balance fields, I felt it was less complicated to join with the view that has the several firm-wide balances (I can also reuse this view of firm totals) instead of adding multiple fields made up of subqueries.
Another one is an update statement where multiple records needed to be created (one record for each step in a preset workflow process).
Here's one, where the CROSS JOIN substitutes for an INNER JOIN. This is useful and legitimate when there are no identical values between two tables on which to join. For example, suppose you have a table that contains version 1, version 2 and version 3 of some statement or company document, all saved in a SQL Server table so that you can recreate a document that is associated with an order, on the fly, long after the order, and long after your document was rewritten into a new version. But only one of the two tables you need to join (the Documents table) has a VersionID column. Here is a way to do this:
SELECT DocumentText, VersionID =
(
SELECT d.VersionID
FROM Documents d
CROSS JOIN Orders o
WHERE o.DateOrdered BETWEEN d.EffectiveStart AND d.EffectiveEnd
)
FROM Documents
I've used a CROSS JOIN recently in a report that we use for sales forecasting. The report needs to break out the amount of sales that a salesperson has made in each General Ledger account.
So in the report I do something to this effect:
SELECT gla.AccountN, s.SalespersonN
FROM
GLAccounts gla
CROSS JOIN Salesperson s
WHERE (gla.SalesAnalysis = 1 OR gla.AccountN = 47500)
This gives me every GL account for every sales person like:
SalesPsn AccountN
1000 40100
1000 40200
1000 40300
1000 48150
1000 49980
1000 49990
1005 40100
1005 40200
1005 40300
1054 48150
1054 49980
1054 49990
1078 40100
1078 40200
1078 40300
1078 48150
1078 49980
1078 49990
1081 40100
1081 40200
1081 40300
1081 48150
1081 49980
1081 49990
1188 40100
1188 40200
1188 40300
1188 48150
1188 49980
1188 49990
For charting (reports) where every grouping must have a record even if it is zero.
(e.g. RadCharts)
I had combinations of an insolvency field in my source data.
There are 5 distinct types, but the data had combinations of 2 of these. So I created a lookup table of the 5 distinct values, then used a cross join in an insert statement to fill out the rest, like so:
insert into LK_Insolvency (code,value)
select a.code+b.code, a.value+' '+b.value
from LK_Insolvency a
cross join LK_Insolvency b
where a.code <> b.code -- this makes sure the cross product of a value with itself is not used, as this does not appear in the source data
I personally try to avoid Cartesian products in my queries. I suppose having a result set of every combination of your join could be useful, but usually if I end up with one, I know I have something wrong.
