I have an invoicing solution that uses Azure SQL to store and calculate invoice data. I have been asked to provide 'credit' functionality so that, rather than recovering charges from the customer, the totals are deducted from an amount of available credit and reflected in the invoice (solution xyz may have 1,500 worth of charges, but deducting that from an available credit of 10,000 means it is effectively zeroed and leaves 8,500 credit remaining). Unfortunately, after several days I haven't been able to work out how to do this.
I am able to get a list of items and their costs from sql easily:
invoice_id   contact_id   solution_id   total      date
------------------------------------------------------------
202104-015   52           10000         30317.27   2021-05-22
202104-015   52           10001         2399.90    2021-05-22
202104-015   52           10005         8302.27    2021-05-22
202104-015   52           10060         3625.22    2021-05-22
202104-015   52           10111         22.87      2021-05-22
202104-015   52           10115         435.99     2021-05-22
I have another table that shows the credit available for the given contact:
id   credit_id   owner_id   total_applied   date_applied
---------------------------------------------------------
1    C00001      52         500000.00       2021-05-14
I have tried the following SQL statement, based on another Stack Overflow question, to subtract from the previous row, thinking each row would then reflect the remaining credit:
SELECT
    s.invoice_id,
    s.solution_id,
    s.total,
    -- LAG only sees the single previous row, so only that one total is subtracted
    cr.total_applied - COALESCE(LAG(s.total) OVER (ORDER BY s.solution_id), 0) AS credit_available,
    s.date
FROM
    invoices s
    LEFT JOIN credits cr ON
        cr.credit_id = 'C00001'
Whilst this does subtract, it only subtracts from the row above it, not all of the rows above it:
invoice_id   solution_id   total      credit_available   date
------------------------------------------------------------------
202104-015   10000         30317.27   500000.00          2021-05-22
202104-015   10001         2399.90    469682.73          2021-05-22
202104-015   10005         8302.27    497600.10          2021-05-22
202104-015   10060         3625.22    491697.73          2021-05-22
202104-015   10111         22.87      496374.78          2021-05-22
202104-015   10115         435.99     499977.13          2021-05-22
I've also tried various queries with a mess of CASE statements.
I'm at the point where I am contemplating using PowerShell or similar to do the task instead (loop through each solution, check whether there is enough available credit, update a deduction table, go to the next one, and so on), but I'd rather keep it all in SQL if I can.
Anyone have some pointers for this beginner?
You don't need window functions; use a sub-query that sums the totals of the previous invoice rows. Be sure to index the table appropriately so that performance does not become a problem.
There are two sub-queries: one for the sum of the previous totals and another to get the date of the next credit for the contact.
SELECT [inv].[invoice_id],
       [inv].[solution_id],
       [inv].[total],
       -- subquery that sums the previous totals of the same contact
       [cr].[total_applied] - COALESCE((
           SELECT SUM([inv_inner].[total])
           FROM [dbo].[invoices] AS [inv_inner]
           WHERE [inv_inner].[contact_id] = [inv].[contact_id]
             AND [inv_inner].[solution_id] < [inv].[solution_id]
       ), 0) AS [credit_available],
       [inv].[date]
FROM [dbo].[invoices] [inv]
LEFT JOIN [dbo].[credits] [cr]
    ON [cr].[owner_id] = [inv].[contact_id]
    -- here, we make sure that the credit is available for the correct period
    -- invoice date >= credit date_applied
    AND [inv].[date] >= [cr].[date_applied]
    -- and invoice date < next date_applied, or tomorrow in case there is no next date_applied
    AND [inv].[date] < COALESCE((
        SELECT MIN([cr2].[date_applied])
        FROM [dbo].[credits] [cr2]
        WHERE [cr2].[owner_id] = [cr].[owner_id]
          AND [cr2].[date_applied] > [cr].[date_applied]
    ), GETDATE()+1)
    AND [cr].[credit_id] = 'C00001';
This query works, but only for the sample data in this question. Please study it and adapt it to your real-world problem.
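As an alternative to the sub-query, the running deduction can also be written as a windowed SUM, which is what the LAG attempt in the question was reaching for. The sketch below is only an illustration: it uses Python's sqlite3 module with an in-memory database as a stand-in for Azure SQL (an assumption, and it needs SQLite 3.25 or newer for window functions), and it reports the credit remaining after each row's charge rather than before it.

```python
import sqlite3

# In-memory SQLite stands in for Azure SQL; schema and rows mirror the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE invoices (
    invoice_id TEXT, contact_id INTEGER, solution_id INTEGER,
    total REAL, date TEXT
);
CREATE TABLE credits (
    id INTEGER, credit_id TEXT, owner_id INTEGER,
    total_applied REAL, date_applied TEXT
);
INSERT INTO invoices VALUES
    ('202104-015', 52, 10000, 30317.27, '2021-05-22'),
    ('202104-015', 52, 10001,  2399.90, '2021-05-22'),
    ('202104-015', 52, 10005,  8302.27, '2021-05-22'),
    ('202104-015', 52, 10060,  3625.22, '2021-05-22'),
    ('202104-015', 52, 10111,    22.87, '2021-05-22'),
    ('202104-015', 52, 10115,   435.99, '2021-05-22');
INSERT INTO credits VALUES (1, 'C00001', 52, 500000.00, '2021-05-14');
""")

# SUM(...) OVER accumulates every row up to and including the current one,
# unlike LAG, which only looks at the single previous row.
RUNNING_CREDIT_SQL = """
SELECT inv.solution_id,
       inv.total,
       cr.total_applied - SUM(inv.total) OVER (
           PARTITION BY inv.contact_id
           ORDER BY inv.solution_id
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) AS credit_remaining
FROM invoices AS inv
JOIN credits AS cr
  ON cr.owner_id = inv.contact_id
 AND cr.credit_id = 'C00001'
ORDER BY inv.solution_id
"""


def remaining_credit(connection):
    """Return (solution_id, total, credit_remaining) rows, rounded to cents."""
    return [
        (solution_id, total, round(remaining, 2))
        for solution_id, total, remaining in connection.execute(RUNNING_CREDIT_SQL)
    ]


rows = remaining_credit(conn)
for row in rows:
    print(row)
```

The same SUM ... OVER (ORDER BY ... ROWS UNBOUNDED PRECEDING) form is available in Azure SQL and SQL Server 2012 or later; the date-range matching of credits from the query above is left out here to keep the sketch short.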
This is a pretty complex scenario. Sadly, I cannot spend the time to offer a complete solution here, but I can give you some tips and points of attention:
Be sure to determine the actual remaining credit based on the complete invoice history. If you introduce filtering (in a WHERE-clause, for example, or by including joins with other tables), the results should not be affected by it. You should probably pre-calculate the available credit per invoice detail record in a temporary table or in a CTE and use that data in your main query.
Make sure that you take the date_applied value of the credit into account. Before a credit is applied to a customer, that customer should probably have less credit or no credit at all. That should be reflected correctly on historical invoices, I guess.
Make sure you determine the correct amount of total credit. It is unclear from the information provided in your question how that should be determined or calculated. Is only the latest total_applied value from the credits table active? Or should all the historical total_applied values be summed to get the total available credit?
Include a correct join between your invoices table and your credits table. Currently, this join is hard-coded in your query.
Also take actual payments by customers into account. Payments affect the available credit, I assume. Also note that, unless you are OK with a history that changes, you need to take the payment dates into account as well (just like the credit change dates).
I'm not sure how you would solve your scenario using PowerShell... I do know for sure, that this can be tackled with SQL.
I cannot say anything about the resulting performance, however. These kinds of calculations surely come with a price tag attached in that regard. If you need high performance, I guess it might be more practical to include columns in your invoices table to physically store the available credit with each invoice detail record.
Edit
I have experimented a little with your scenario and your additional comments.
My solution implementation uses two CTEs:
The first CTE (cte_invoice_credit_dates) retrieves the date of the active credit record for specific invoice IDs.
The second CTE (cte_contact_invoice_summarized_totals) calculates the invoice totals of all the invoices of a specific contact. Since you want to summarize on solution detail per invoice as well, I also included the solution ID per invoice in the querying logic.
The main query selects all columns from the invoices table and uses the data from the two CTEs to calculate three additional columns in the result set:
Column credit_assigned represents the total assigned credit at the invoice's date.
Column summarized_total shows the contact's cumulative invoice total.
Column credit_available shows the remaining credit.
WITH
[cte_invoice_credit_dates] AS (
SELECT DISTINCT
I.[invoice_id],
C.[date_applied]
FROM
[invoices] AS I
OUTER APPLY (SELECT TOP (1) [date_applied]
FROM [credits]
WHERE
[owner_id] = I.[contact_id] AND
[date_applied] <= I.[date]
ORDER BY [date_applied] DESC) AS C
),
[cte_contact_invoice_summarized_totals] AS (
SELECT
I.[contact_id],
I.[invoice_id],
I.[solution_id],
SUM(H.[total]) AS [total]
FROM
[invoices] AS I
INNER JOIN [invoices] AS H ON
H.[contact_id] = I.[contact_id] AND
H.[invoice_id] = I.[invoice_id] AND
H.[solution_id] <= I.[solution_id] AND
H.[date] <= I.[date]
GROUP BY
I.[contact_id],
I.[invoice_id],
I.[solution_id]
)
SELECT
I.[invoice_id],
I.[contact_id],
I.[solution_id],
I.[total],
I.[date],
COALESCE(C.[total_applied], 0) AS [credit_assigned],
H.[total] AS [summarized_total],
COALESCE(C.[total_applied] - H.[total], 0) AS [credit_available]
FROM
[invoices] AS I
INNER JOIN [cte_contact_invoice_summarized_totals] AS H ON
H.[contact_id] = I.[contact_id] AND
H.[invoice_id] = I.[invoice_id] AND
H.[solution_id] = I.[solution_id]
LEFT JOIN [cte_invoice_credit_dates] AS CD ON
CD.[invoice_id] = I.[invoice_id]
LEFT JOIN [credits] AS C ON
C.[owner_id] = I.[contact_id] AND
C.[date_applied] = CD.[date_applied]
ORDER BY
I.[invoice_id],
I.[solution_id];
Related
Well, I'm using PostgreSQL (though it won't matter if you can suggest a solution in any SQL syntax). I have a table like this:
employee   department   salary
-------------------------------
1          sales        30,000
2          sales        25,000
3          marketing    45,000
4          marketing    55,000
so on...
What I want to achieve is:
employee   department   salary   difference
--------------------------------------------
1          sales        30,000   0/null
2          sales        25,000   5,000
3          marketing    45,000   0/null
4          marketing    55,000   10,000
So technically I want to compute the difference in value between consecutive rows. However, I can't use window functions (I don't know why, but avoiding them is a requirement of this challenge).
In a perfect world, we'd be able to use the lag() or lead() functions partitioned by department and store the difference in another column, but I don't know how to do it without them.
I tried subqueries in multiple ways, but every time I ended up with NULL or 0 in the new column.
You can join the table to itself, matching each employee with the previous row within the same department. Note that the join condition below assumes employee IDs are consecutive within a department.
SELECT t1.employee, t1.department, t1.salary, ABS(t2.salary - t1.salary) AS difference
FROM tab t1
LEFT JOIN tab t2
ON t1.department = t2.department AND t1.employee = t2.employee +1
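If the employee IDs within a department have gaps, the `employee + 1` condition finds no match. A possible variation, sketched here with Python's sqlite3 module as a stand-in for PostgreSQL (an assumption; the table name `tab` is taken from the answer and employee 6 is a made-up extra row), looks up the largest smaller ID in the same department with a correlated sub-query, still without window functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tab (employee INTEGER PRIMARY KEY, department TEXT, salary INTEGER);
INSERT INTO tab VALUES
    (1, 'sales',     30000),
    (2, 'sales',     25000),
    (3, 'marketing', 45000),
    (4, 'marketing', 55000),
    (6, 'sales',     28000);  -- hypothetical extra row: gap in IDs
""")

# The sub-query picks the closest earlier employee in the same department,
# so the self-join no longer depends on IDs being consecutive.
DIFF_SQL = """
SELECT t1.employee, t1.department, t1.salary,
       ABS(t1.salary - t2.salary) AS difference
FROM tab AS t1
LEFT JOIN tab AS t2
  ON t2.department = t1.department
 AND t2.employee = (SELECT MAX(p.employee)
                    FROM tab AS p
                    WHERE p.department = t1.department
                      AND p.employee < t1.employee)
ORDER BY t1.employee
"""

rows = conn.execute(DIFF_SQL).fetchall()
for row in rows:
    print(row)
```

The first row of each department has no predecessor, so its difference comes back as NULL (None in Python); wrap the expression in COALESCE(..., 0) if a zero is preferred.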
I've stumbled upon a problem that is giving me huge headaches, which is the following:
I have a table Deals, that contains information about this entity from our Sales CRM. I also have a table Company, that contains information about the companies pegged to those deals.
I was asked to compute a metric called Pipeline Conversion Rate, which is calculated as:
Won deals / Created Deals
So far, everything is clear. However, I was asked to compute this metric in a sliding-window fashion, that is, looking only at the prior 90 days. The catch is that the two counts use different dates: the prior 90 days of created deals (the denominator) are determined by the created date, while the prior 90 days of won deals (the numerator) are determined by the closed date (both columns are part of the Deals table).
There wouldn't be any problem if we could use this kind of window function in Snowflake, like the following (I know the syntax may not be exactly right, but you get the idea):
count(deal_id) over (
partition by is_inbound, sales_agent, sales_tier, country
order by created_date range between 90 days preceding and current row
) as created_deals_last_90_days,
count(case when is_deal_won then deal_id end) over (
partition by is_inbound, sales_agent, sales_tier, country
order by created_date range between 90 days preceding and current row
) as won_deals_last_90_days
But we can't as far as I know. So my current workaround is the following (taken from this post):
select
calendar_date,
is_inbound,
sales_tier,
sales_agent,
country,
(
select count(deal_id)
from deals
where d.is_inbound = is_inbound
and d.sales_tier = sales_tier
and d.sales_agent = sales_agent
and d.country = country
and created_date between cal.calendar_date - 90 and cal.calendar_date
) as created_deals_last_90_days,
(
select count(case when is_deal_won then deal_id end)
from deals
where d.is_inbound = is_inbound
and d.sales_tier = sales_tier
and d.sales_agent = sales_agent
and d.country = country
and closed_date between cal.calendar_date - 90 and cal.calendar_date
) as won_deals_last_90_days
from calendar as cal
left join deals as d on cal.calendar_date between d.created_date and d.closed_date
*Note that I am using a calendar table as the base table here in order to have visibility of all calendar dates; without it, I would be missing the dates on which no new deals were created (which could happen on weekends).
The problem is that the figures don't match when I cross-check the raw data against the output of this query, and I have no idea how to make this (ugly) workaround, well... work.
Any ideas are more than welcome!
Well, it turns out it was way easier than I expected. After some trial-and-error, I figured out the only thing that could be failing was the JOIN condition in the outer query:
on cal.calendar_date between d.created_date and d.closed_date
This assumed that the calendar date had to fall between both dates, which is wrong. I changed that part of the outer query to:
on cal.calendar_date >= d.created_date
This captures all deals that were created on or before the calendar_date, and therefore all of them, since created_date is a mandatory field.
Maintaining the rest of the query as is, and assuming that there will be no nulls in any of the partitions, the results are the ones I expected.
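For comparison, the trailing 90-day counts can also be driven from the calendar table alone, with no outer join to deals at all, which sidesteps the join-condition problem entirely. This is a simplified sketch, not the Snowflake query: it uses Python's sqlite3 module with made-up rows, leaves out the partition columns (is_inbound, sales_agent, sales_tier, country), and stores dates as ISO strings so that SQLite's date() function can subtract 90 days.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE calendar (calendar_date TEXT PRIMARY KEY);
CREATE TABLE deals (
    deal_id INTEGER PRIMARY KEY,
    created_date TEXT NOT NULL,
    closed_date TEXT,              -- NULL while the deal is still open
    is_deal_won INTEGER NOT NULL
);
INSERT INTO calendar VALUES ('2021-03-31'), ('2021-05-15');
-- Hypothetical sample deals, invented for this sketch.
INSERT INTO deals VALUES
    (1, '2021-01-10', '2021-02-01', 1),
    (2, '2021-01-20', '2021-03-15', 0),
    (3, '2021-03-01', '2021-03-20', 1),
    (4, '2021-03-25', NULL,         0),
    (5, '2021-04-10', '2021-05-01', 1);
""")

# Each count is its own correlated sub-query, so created deals are windowed
# on created_date and won deals on closed_date, independently of each other.
TRAILING_SQL = """
SELECT cal.calendar_date,
       (SELECT COUNT(*) FROM deals AS d
         WHERE d.created_date BETWEEN date(cal.calendar_date, '-90 days')
                                  AND cal.calendar_date) AS created_deals_last_90_days,
       (SELECT COUNT(*) FROM deals AS d
         WHERE d.is_deal_won = 1
           AND d.closed_date BETWEEN date(cal.calendar_date, '-90 days')
                                 AND cal.calendar_date) AS won_deals_last_90_days
FROM calendar AS cal
ORDER BY cal.calendar_date
"""


def conversion_rates(connection):
    """Return (date, created, won, won / created) per calendar date."""
    result = []
    for day, created, won in connection.execute(TRAILING_SQL):
        rate = won / created if created else None  # avoid division by zero
        result.append((day, created, won, rate))
    return result


rates = conversion_rates(conn)
for row in rates:
    print(row)
```

To restore the per-partition breakdown, the outer query would have to be driven by the calendar crossed with the distinct partition values, and each sub-query would need matching equality conditions.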
I am trying to write a stored procedure that will run every day and check whether invoices are past due. I want to pull all unpaid invoices from the table, then go through them and find the difference between today's date and the date the order was placed. From there I want to check the account terms for that order (basically how long they have to pay), and if they have gone over the terms I will calculate a service charge and update balancedue. I have a basic idea of what to do, but I don't know how to go through the selected records without looping through each one. I thought there was a better way to do it in SQL Server.
The invoice table has an accountid, ispaid, and creationdate. The account table has the terms for the account. Then I have an accountbalance table with several fields I would update if needed.
Accountbalance fields
balancedue
pastdue30
pastdue60
pastdue90
pastdueover90
The accountid gets me from invoice to account and accountbalance, and the date tells me how long it has been. I would then just update the accountbalance according to the terms and how long the invoice has been past due. I know it's a little hard to understand without seeing it.
This is basically what I am trying to do; I am just not sure how to do it for each record:
select * from invoice where ispaid = 0
days = currentdate - invoicecreationdate
switch (days)
case 30
update balance
case 60
update balance
case 90
update balance
if(days > terms)
update balance add servicecharge
Your additions help, but I'm still a little unsure on what's going on here (for example, what are the pastdueXX fields in Accountbalance and how do they relate to the balancedue field?). Also, can one account have multiple past due invoices?
It sounds like you're looking for something similar to the following:
update ab set balancedue =
(case when datediff(day, i.creationdate, getdate()) > 90 then ab.balancedue + ...
when datediff(day, i.creationdate, getdate()) > 60 then ab.balancedue + ...
...
end)
from accountbalance ab
join account a
on ab.accountid = a.accountid
join invoice i
on a.accountid = i.accountid
where i.ispaid = 0 -- only unpaid invoices, as in the question
Sorry for the vagueness, but again, still have some questions.
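To make the set-based idea concrete, here is a rough sketch that fills the past-due buckets for every account in one UPDATE, with no per-row loop. Several things in it are assumptions that the question does not confirm: the invoice `amount` column is hypothetical, the buckets are read as invoice age in days since creationdate (1-30, 31-60, 61-90, over 90), and account terms and the service charge are left out. It runs on SQLite through Python's sqlite3 module, so julianday() takes the place of SQL Server's DATEDIFF.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE invoice (
    invoiceid INTEGER PRIMARY KEY, accountid INTEGER,
    ispaid INTEGER, creationdate TEXT,
    amount REAL                    -- hypothetical column, not named in the question
);
CREATE TABLE accountbalance (
    accountid INTEGER PRIMARY KEY,
    pastdue30 REAL DEFAULT 0, pastdue60 REAL DEFAULT 0,
    pastdue90 REAL DEFAULT 0, pastdueover90 REAL DEFAULT 0
);
-- Made-up sample rows for illustration.
INSERT INTO invoice VALUES
    (1, 1, 0, '2021-01-01', 100.0),
    (2, 1, 0, '2021-03-01',  50.0),
    (3, 1, 0, '2021-04-20',  70.0),
    (4, 1, 1, '2021-02-01', 999.0),   -- paid, must be ignored
    (5, 2, 0, '2021-03-20',  40.0);
INSERT INTO accountbalance (accountid) VALUES (1), (2);
""")

# One statement recomputes all four buckets for every account.
# Assumed bucket meaning: age of the unpaid invoice in whole days.
AGING_SQL = """
WITH aged AS (
    SELECT accountid, amount,
           CAST(julianday(:as_of) - julianday(creationdate) AS INTEGER) AS days
    FROM invoice
    WHERE ispaid = 0
)
UPDATE accountbalance SET
    pastdue30 = (SELECT COALESCE(SUM(amount), 0) FROM aged
                  WHERE aged.accountid = accountbalance.accountid
                    AND days BETWEEN 1 AND 30),
    pastdue60 = (SELECT COALESCE(SUM(amount), 0) FROM aged
                  WHERE aged.accountid = accountbalance.accountid
                    AND days BETWEEN 31 AND 60),
    pastdue90 = (SELECT COALESCE(SUM(amount), 0) FROM aged
                  WHERE aged.accountid = accountbalance.accountid
                    AND days BETWEEN 61 AND 90),
    pastdueover90 = (SELECT COALESCE(SUM(amount), 0) FROM aged
                      WHERE aged.accountid = accountbalance.accountid
                        AND days > 90)
"""


def refresh_aging(connection, as_of):
    """Recompute the aging buckets as of the given ISO date string."""
    connection.execute(AGING_SQL, {"as_of": as_of})
    connection.commit()


refresh_aging(conn, "2021-05-01")
balances = {
    row[0]: row[1:]
    for row in conn.execute(
        "SELECT accountid, pastdue30, pastdue60, pastdue90, pastdueover90 "
        "FROM accountbalance ORDER BY accountid"
    )
}
print(balances)
```

Passing the as-of date as a parameter instead of reading the clock keeps the run repeatable; a daily job would simply pass today's date.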
INTRODUCTION TO DATABASE TABLE BEING USED -
I am working on a “Stock Market Prices” based Database Table. My table has got the data for the following FIELDS –
ID
SYMBOL
OPEN
HIGH
LOW
CLOSE
VOLUME
VOLUME CHANGE
VOLUME CHANGE %
OPEN_INT
SECTOR
TIMESTAMP
New data gets added to the table daily “Monday to Friday”, based on the stock market price changes for that day. The current requirement is based on the VOLUME field, which shows the volume traded for a particular stock on daily basis.
REQUIREMENT –
To get the Average and Total Volume for last 10,15 and 30 Days respectively.
METHOD USED CURRENTLY -
I created these 9 SEPARATE QUERIES in order to get my desired results –
First I have created these 3 queries to take out the most recent last 10,15 and 30 dates from the current table:
qryLast10DaysStored
qryLast15DaysStored
qryLast30DaysStored
Then I have created these 3 queries for getting the respective AVERAGES:
qrySymbolAvgVolume10Days
qrySymbolAvgVolume15Days
qrySymbolAvgVolume30Days
And then I have created these 3 queries for getting the respective TOTALS:
qrySymbolTotalVolume10Days
qrySymbolTotalVolume15Days
qrySymbolTotalVolume30Days
PROBLEM BEING FACED WITH CURRENT METHOD -
Now, my problem is that I have ended up with so many different queries, whereas I wanted to get the output from one single query, as shown in the snapshot of the Excel sheet:
http://i49.tinypic.com/256tgcp.png
SOLUTION NEEDED -
Is there some way by which I can get these required fields into ONE SINGLE QUERY, so that I do not have to look into multiple places for the required fields? Can someone please tell me how to get all these separate queries into one -
A) Either by taking out or moving the results from these separate individual queries to one.
B) Or by making a new query which calculates all these fields within itself, so that these separate individual queries are no longer needed. This would be a better solution I think.
One Clarification about Dates –
Someone might wonder why I used Top 10, 15 and 30 to get the last 10, 15 and 30 date values. Why not just use the PC date to get these values, or something like:
("VOLUME","tbl-B", "TimeStamp BETWEEN Date() - 10 AND Date()")
The answer is that I require my query to "Read" the date from the "TIMESTAMP" Field, and then perform its calculations accordingly for LAST / MOST RECENT "10 days, 15 days, 30 days” FOR WHICH THE DATA IS AVAILABLE IN THE TABLE, WITHOUT BOTHERING WHAT THE CURRENT DATE IS. It should not depend upon the current date in any way.
If there is any better method or more efficient way to create these queries, then please enlighten.
You have separate queries to compute 10DayTotalVolume and 10DayAvgVolume. I suspect you can compute both in one query, qry10DayVolumes.
SELECT
b.SYMBOL,
Sum(b.VOLUME) AS 10DayTotalVolume,
Avg(b.VOLUME) AS 10DayAvgVolume
FROM
[tbl-B] AS b INNER JOIN
qryLast10DaysStored AS q
ON b.TIMESTAMP = q.TIMESTAMP
GROUP BY b.SYMBOL;
However, that makes me wonder whether 10DayAvgVolume can ever be anything other than 10DayTotalVolume / 10.
Similar considerations apply to the 15 and 30 day values.
Ultimately, I think you want something based on a starting point like this:
SELECT
q10.SYMBOL,
q10.[10DayTotalVolume],
q10.[10DayAvgVolume],
q15.[15DayTotalVolume],
q15.[15DayAvgVolume],
q30.[30DayTotalVolume],
q30.[30DayAvgVolume]
FROM
(qry10DayVolumes AS q10
INNER JOIN qry15DayVolumes AS q15
ON q10.SYMBOL = q15.SYMBOL)
INNER JOIN qry30DayVolumes AS q30
ON q10.SYMBOL = q30.SYMBOL;
That assumes you have created qry15DayVolumes and qry30DayVolumes following the approach I suggested for qry10DayVolumes.
If you want to cut down the number of queries, you could use subqueries for each of the qry??DayVolumes saved queries, but try it this way first to make sure the logic is correct.
In that second query above, there can be a problem due to field names which start with digits. Enclose those names in square brackets or re-alias them in qry10DayVolumes, qry15DayVolumes, and qry30DayVolumes using alias names which begin with letters instead of digits.
I tested the query as written above with the "2nd Upload.mdb" you uploaded, and it ran without error from Access 2007. Here is the first row of the result set from that query:
SYMBOL 10DayTotalVolume 10DayAvgVolume 15DayTotalVolume 15DayAvgVolume 30DayTotalVolume 30DayAvgVolume
ACC-1 42909 4290.9 54892 3659.46666666667 89669 2988.96666666667
Access doesn't support most advanced SQL syntax and clauses, so this is a bit of a hack, but it works, and is fast on your small sample. You're basically running 3 queries but the Union clauses allow you to combine into one:
select
Symbol,
sum([10DayTotalVol]) as 10DayTotalV,
sum([10DayAvgVol]) as 10DayAvgV,
sum([15DayTotalVol]) as 15DayTotalV,
sum([15DayAvgVol]) as 15DayAvgV,
sum([30DayTotalVol]) as 30DayTotalV,
sum([30DayAvgVol]) as 30DayAvgV
from (
select
Symbol,
sum(volume) as 10DayTotalVol, avg(volume) as 10DayAvgVol,
0 as 15DayTotalVol, 0 as 15DayAvgVol,
0 as 30DayTotalVol, 0 as 30DayAvgVol
from
[tbl-b]
where
timestamp >= (select min(ts) from (select distinct top 10 timestamp as ts from [tbl-b] order by timestamp desc ))
group by
Symbol
UNION
select
Symbol,
0, 0,
sum(volume), avg(volume),
0, 0
from
[tbl-b]
where
timestamp >= (select min(ts) from (select distinct top 15 timestamp as ts from [tbl-b] order by timestamp desc ))
group by
Symbol
UNION
select
Symbol,
0, 0,
0, 0,
sum(volume), avg(volume)
from
[tbl-b]
where
timestamp >= (select min(ts) from (select distinct top 30 timestamp as ts from [tbl-b] order by timestamp desc ))
group by
Symbol
) s
group by
Symbol
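The same result can be reached in a single pass by ranking the stored dates by recency and using conditional aggregation. The sketch below only illustrates the idea and is not Access-ready: it runs on SQLite through Python's sqlite3 module, and Access SQL has neither CASE (it uses IIf) nor COUNT(DISTINCT ...). The `prices` table and its generated rows are invented for the example.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prices (symbol TEXT, ts TEXT, volume INTEGER)")

# Hypothetical sample: 40 stored dates; ACC-1 volume grows by 10 per day,
# XYZ-2 trades a constant 5 per day.
start = date(2021, 1, 1)
sample = []
for i in range(1, 41):
    day = (start + timedelta(days=i)).isoformat()
    sample.append(("ACC-1", day, i * 10))
    sample.append(("XYZ-2", day, 5))
conn.executemany("INSERT INTO prices VALUES (?, ?, ?)", sample)

# recency = 1 for the most recent stored date, 2 for the one before, ...
# It depends only on dates present in the table, never on the current date.
VOLUME_SQL = """
WITH ranked AS (
    SELECT d.ts,
           (SELECT COUNT(DISTINCT n.ts) FROM prices AS n WHERE n.ts >= d.ts) AS recency
    FROM (SELECT DISTINCT ts FROM prices) AS d
)
SELECT p.symbol,
       SUM(CASE WHEN r.recency <= 10 THEN p.volume END) AS total_10,
       AVG(CASE WHEN r.recency <= 10 THEN p.volume END) AS avg_10,
       SUM(CASE WHEN r.recency <= 15 THEN p.volume END) AS total_15,
       AVG(CASE WHEN r.recency <= 15 THEN p.volume END) AS avg_15,
       SUM(p.volume) AS total_30,
       AVG(p.volume) AS avg_30
FROM prices AS p
JOIN ranked AS r ON r.ts = p.ts
WHERE r.recency <= 30          -- nothing older than the 30 newest dates is read
GROUP BY p.symbol
ORDER BY p.symbol
"""

summary = {row[0]: row[1:] for row in conn.execute(VOLUME_SQL)}
for symbol, values in summary.items():
    print(symbol, values)
```

Because the CASE expressions yield NULL outside their window, SUM and AVG skip those rows, so each average is taken over the days the symbol actually traded within that window.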
Using SQL Server 2005. I am building an inventory/purchasing program and I’m at the point where I need the user to “check out” equipment. When he selects a product, I need to query which stock locations have the available Qty, and tell the user which location to walk to/ retrieve product.
Here is a query for a particular [StockLocation_Products].ProductID, with a particular assigned [ProductUsages].ProductUsageID.
SELECT
    PROD.ProductID,
    PROD.ProductName,
    SL.Room,
    SL.StockSpace,
    SLPPU.ResvQty,
    PUSG.ProductUsage
FROM [StockLocations] SL
INNER JOIN [StockLocation_Products] SLP ON SL.StockLocationID = SLP.StockLocationID
INNER JOIN [StockLocation_Product_ProductUsages] SLPPU ON SLP.StockLocationID = SLPPU.StockLocationID AND SLP.ProductID = SLPPU.ProductID
INNER JOIN [ProductUsages] PUSG ON SLPPU.ProductUsageID = PUSG.ProductUsageID
INNER JOIN [Products] PROD ON SLPPU.ProductID = PROD.ProductID
WHERE SLP.ProductID = 4 AND PUSG.ProductUsageID = 1
This query returns:
ProductID ProductName Room StockSpace ResvQty ProductUsage
------------------------------------------------------------------------------------------------------------------------
4 Addonics Pocket DVD+/-R/RW B700 5-D 12 MC Pool
4 Addonics Pocket DVD+/-R/RW B700 6-B 10 MC Pool
4 Addonics Pocket DVD+/-R/RW B700 6-C 21 MC Pool
4 Addonics Pocket DVD+/-R/RW B700 6-D 20 MC Pool
I thought maybe I could use an additional HAVING clause to make this query return which combination of StockSpace(s) you’d need to visit to satisfy a request for some Qty. E.g. User needs to pull 30 of Product (ID =4).
But I don’t really understand how to use GROUP BY with HAVING SUM(), to achieve what I want.
I tried various things in my group by / having clause, but I just don’t get any results.
GROUP BY PROD.ProductID,PROD.ProductName,SL.Room,SL.StockSpace,SLPPU.ResvQty,PUSG.ProductUsage
HAVING SUM(ResvQty) >= 30;
I want results that show (at least one) combination of StockSpaces which sums up to 30, so I can tell the user “you can get 21 units from space ‘6-C’, and 9 units from ‘6-B’. There may be multiple combinations of rows that could sum() >= 30, but I need at least how to find one combination that does! Help!
You can have an inner select, such as:
SELECT count_of_foo, count(bar), baz
FROM (SELECT count(foo) AS count_of_foo, bar, baz, other1, other2
      FROM complex_query WHERE foo = bar
      GROUP BY bar, baz, other1, other2
      HAVING count(foo) > 1) inner_query
GROUP BY count_of_foo, baz;
This will give you the ability to add more group by after the HAVING clause.
What you are trying to do is a running sum, which you can get with various techniques in SQL. I think the most efficient query, especially if you are trying to do this all in the same query, is to use a CTE (here's one example).
Another technique that doesn't rely on CTE requires the data to be populated into another table (could be a temp table, though) and basically you do a join-and-sort operation as you go.
Once you get the data to include a running sum, then you can simply select the values from which the running sum is less than or equal to the total number that you are trying to locate.
And here is a nice summary of several of the different techniques.
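The running-sum idea can be sketched as follows, using the four stock spaces from the question. This is an illustration under assumptions, not the program's actual query: the joined result is flattened into a hypothetical `stock` table, it runs on SQLite through Python's sqlite3 module, and the running sum is computed with a correlated sub-query inside a CTE because SQL Server 2005 does not support SUM ... OVER (ORDER BY ...).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical flattened view of the query result in the question.
CREATE TABLE stock (StockSpace TEXT PRIMARY KEY, ResvQty INTEGER NOT NULL);
INSERT INTO stock VALUES ('5-D', 12), ('6-B', 10), ('6-C', 21), ('6-D', 20);
""")

# running_qty is the cumulative quantity when walking the spaces in
# StockSpace order. A space is visited only if the spaces before it did not
# already cover the request; the last visited space is trimmed to the remainder.
PICK_SQL = """
WITH running AS (
    SELECT s.StockSpace, s.ResvQty,
           (SELECT SUM(p.ResvQty) FROM stock AS p
             WHERE p.StockSpace <= s.StockSpace) AS running_qty
    FROM stock AS s
)
SELECT StockSpace,
       CASE WHEN running_qty <= :needed THEN ResvQty
            ELSE :needed - (running_qty - ResvQty)
       END AS take_qty
FROM running
WHERE running_qty - ResvQty < :needed
ORDER BY StockSpace
"""


def pick_locations(connection, needed):
    """Return (StockSpace, take_qty) pairs covering `needed` units if possible."""
    return connection.execute(PICK_SQL, {"needed": needed}).fetchall()


picks = pick_locations(conn, 30)
print(picks)
```

The ordering decides which combination is returned; ordering by ResvQty descending instead would visit fewer spaces. If the total stock is smaller than the request, every space is returned in full, so the caller should compare the sum of take_qty with the requested quantity.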