Creating an amortization schedule in snowflake - snowflake-cloud-data-platform

I have a view in Snowflake that gives me the following:
loan date
loan amount
maturity date
payment frequency (weekly, biweekly, semimonthly, monthly)
payment amount
I want to generate a sort of amortization schedule off of this, if you will. So if I have a loan with a date of 1/1/2022, a maturity date of 3/9/2022, and a payment frequency of biweekly at $50 per payment, I would want to see an output like:
LoanID   Payment Date   Payment Amount   Payment Frequency
------   ------------   --------------   -----------------
abc123   1/15/2022      $50.00           biweekly
abc123   1/29/2022      $50.00           biweekly
abc123   2/12/2022      $50.00           biweekly
abc123   2/26/2022      $50.00           biweekly
abc123   3/09/2022      $50.00           biweekly
I'm assuming I need some sort of loop while payment date < maturity date and sum(payment amount) < loan amount, but I'm not sure how to set that up properly for a view with thousands of loans. Any help you all can provide would be incredible, and I'm very grateful!

You can get this by writing a recursive CTE. Just remember that the recursion depth is limited to 100 iterations by default; if you need to go deeper, look at the MAX_RECURSIONS parameter.
This is just example code; you should extend it with some protection for extreme data.
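If some loans need more than 100 payment rows, the cap can be raised, for example at session level (a sketch, assuming your role is allowed to set the parameter; pick a limit that covers your longest loan):
ALTER SESSION SET MAX_RECURSIONS = 1000;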
Sample data:
CREATE OR REPLACE TABLE LoanTable (
LoanID STRING,
Loan_date DATE,
Loan_amount NUMERIC(12,2),
Maturity_date DATE,
Payment_frequency STRING,
Payment_amount NUMERIC(12,2)
);
INSERT INTO LoanTable
VALUES ('abc123', '1/1/2022', 250, '3/9/2022', 'biweekly', 50);
Query:
WITH Recursive_CTE AS (
SELECT LoanID,
CASE Payment_frequency WHEN 'weekly' THEN DATEADD(WEEK, 1, Loan_date)
WHEN 'biweekly' THEN DATEADD(WEEK, 2, Loan_date)
WHEN 'semimonthly' THEN DATEADD(DAY, 15, Loan_date) -- I don't know how the semimonthly value is determined??
WHEN 'monthly' THEN DATEADD(MONTH, 1, Loan_date) END AS Payment_Date,
Payment_amount,
Loan_amount - Payment_amount AS Left_to_pay,
Payment_frequency,
Maturity_date
FROM LoanTable
UNION ALL
SELECT LoanID,
CASE Payment_frequency WHEN 'weekly' THEN DATEADD(WEEK, 1, Payment_Date)
WHEN 'biweekly' THEN DATEADD(WEEK, 2, Payment_Date)
WHEN 'semimonthly' THEN DATEADD(DAY, 15, Payment_Date) -- I don't know how the semimonthly value is determined??
WHEN 'monthly' THEN DATEADD(MONTH, 1, Payment_Date) END AS Payment_Date,
Payment_amount,
IFF(Left_to_pay - Payment_amount < 0, Left_to_pay, Left_to_pay - Payment_amount) AS Left_to_pay,
Payment_frequency,
Maturity_date
FROM Recursive_CTE
WHERE Left_to_pay > 0
)
SELECT LoanID, IFF(Payment_Date > Maturity_date, Maturity_date, Payment_Date) AS Payment_Date, Payment_amount, Left_to_pay, Payment_frequency
FROM Recursive_CTE
ORDER BY LoanID, Payment_Date;
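A quick way to check whether the default of 100 recursions is enough for your data (a sketch that assumes weekly is your shortest frequency, so it gives a worst-case period count; loans deeper than the limit will error out):
SELECT MAX(CEIL(DATEDIFF('day', Loan_date, Maturity_date) / 7)) AS max_possible_periods
FROM LoanTable;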

Here's how to do amortization via a JavaScript UDF, with an example of how to call it. I had some trouble getting the JSON out of the function, so I returned it as a text string, stripped the double quotes, flattened it, and converted it to a table. Maybe someone better at JavaScript could modify it to return the table pre-cleaned.
CREATE OR REPLACE FUNCTION AMORTIZATIONTABLE("AMOUNTFINANCED" FLOAT, "INTEREST" FLOAT, "PERIODS" FLOAT)
RETURNS STRING
LANGUAGE javascript
AS $$
const annuity = (AMOUNTFINANCED, INTEREST, PERIODS) => AMOUNTFINANCED * (INTEREST / (1 - (1 + INTEREST)**(-PERIODS)));
const balance_t = (AMOUNTFINANCED, INTEREST, P) => {
const period_movements = {
base: AMOUNTFINANCED
}
period_movements.interest = AMOUNTFINANCED * INTEREST;
period_movements.amortization = P - (AMOUNTFINANCED * INTEREST);
period_movements.annuity = P;
period_movements.final_value = Math.round((AMOUNTFINANCED - period_movements.amortization) * 100) / 100;
return period_movements;
}
const display_mortgage = (AMOUNTFINANCED, INTEREST, PERIODS) => {
var data = [];
const payements = annuity(AMOUNTFINANCED, INTEREST, PERIODS);
let movements = balance_t(AMOUNTFINANCED, INTEREST, payements);
while (movements.final_value > -.01) {
data.push(movements);
movements = balance_t(movements.final_value, INTEREST, payements);
}
return data;
}
const data2 = display_mortgage(AMOUNTFINANCED, INTEREST, PERIODS);
return JSON.stringify(data2);
$$;
SELECT
INDEX + 1 AS Period,
a.VALUE:base AS CurrPrincipalBal,
a.VALUE:annuity AS TotalPayment,
a.VALUE:amortization AS PrincipalPmt,
a.VALUE:interest AS InterestPmt,
a.VALUE:final_value AS NewPrincipalBal
FROM
(SELECT * FROM TABLE(flatten(INPUT => parse_json(REPLACE(AMORTIZATIONTABLE(20000.00, 0.04, 12.00),'"',''))))) AS a;
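In the direction the answer suggests, here is an untested sketch of a VARIANT-returning version (AMORTIZATIONTABLE_V is a hypothetical name), which would make the REPLACE/PARSE_JSON step unnecessary:
CREATE OR REPLACE FUNCTION AMORTIZATIONTABLE_V("AMOUNTFINANCED" FLOAT, "INTEREST" FLOAT, "PERIODS" FLOAT)
RETURNS VARIANT
LANGUAGE javascript
AS $$
// hypothetical variant of the answer above: same annuity math, but the array is returned directly
const payment = AMOUNTFINANCED * (INTEREST / (1 - (1 + INTEREST)**(-PERIODS)));
const rows = [];
let balance = AMOUNTFINANCED;
while (true) {
    const interest = balance * INTEREST;
    const amortization = payment - interest;
    const final_value = Math.round((balance - amortization) * 100) / 100;
    if (final_value <= -0.01) break;  // same stop condition as the original loop
    rows.push({base: balance, interest: interest, amortization: amortization, annuity: payment, final_value: final_value});
    balance = final_value;
}
return rows;  // Snowflake converts the JavaScript array to a VARIANT
$$;
SELECT
INDEX + 1 AS Period,
a.VALUE:base AS CurrPrincipalBal,
a.VALUE:annuity AS TotalPayment,
a.VALUE:amortization AS PrincipalPmt,
a.VALUE:interest AS InterestPmt,
a.VALUE:final_value AS NewPrincipalBal
FROM TABLE(flatten(INPUT => AMORTIZATIONTABLE_V(20000.00, 0.04, 12.00))) AS a;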

Table generator is another approach.
Thanks to Simon for making this solution better. Respect!
WITH CTE_MY_DATE AS
(SELECT DATEADD(DAY, row_number() over (order by null)-1, '1900-01-01')::date AS MY_DATE FROM table(generator(rowcount => 18000)))
SELECT
date(MY_DATE) CALENDAR_DATE,
concat( decode(extract ('dayofweek_iso', date(MY_DATE)),1,'Monday',2, 'Tuesday',3, 'Wednesday',4, 'Thursday',5, 'Friday',6, 'Saturday',7, 'Sunday'),TO_CHAR(date(MY_DATE), ', MMMM DD, YYYY')) FULL_DATE_DESC
,row_number() over (partition by 1 order by calendar_date ) MOD_IS_COOL
FROM
CTE_MY_DATE
where
CALENDAR_DATE
between '2022-01-02' and '2022-09-03'
qualify
mod(MOD_IS_COOL, 14) = 0

So I thought I could write this "cleaner" using a table generator.
To be fair, I feel this is cleaner than the recursive CTE.
Two things to note: your "max loan period possible" needs to be used in place of the 1,000 rows I generate.
The semimonthly date is found by taking the number of days between the monthly anchor dates and halving it; it's also normal to just use 15 days for the odd periods (a quick check of that midpoint math is shown below).
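A quick check of that midpoint calculation for a loan dated 2022-01-01 (the first, odd-numbered period):
SELECT DATEADD('day',
    FLOOR(DATEDIFF('day', '2022-01-01'::date, '2022-02-01'::date) / 2),  -- 31 days / 2 -> 15
    '2022-01-01'::date) AS first_semimonthly_payment;                    -- 2022-01-16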
The full query:
WITH loans_table(loanid, loan_date, loan_amount,
maturity_date, payment_frequency,
payment_amount) as (
SELECT * FROM VALUES
('abc123', '2022-01-01'::date, 250, '2022-03-09'::date, 'biweekly', 50)
), table_of_numbers as (
SELECT row_number() over(order by null) as rn
FROM TABLE(generator(ROWCOUNT => 1000))
/* that 1000 should be larger than any loan period length you have */
), loan_enrich as (
SELECT
*
,CASE Payment_frequency
WHEN 'weekly' THEN 7
WHEN 'biweekly' THEN 14
WHEN 'semimonthly' THEN 14
WHEN 'monthly' THEN 28
END as period_lo_days
,datediff('day', loan_date, maturity_date) as loan_days
,CEIL(loan_days / period_lo_days) as loan_periods
FROM loans_table
)
SELECT
l.loanid,
CASE payment_frequency
WHEN 'weekly' THEN dateadd('week', r.rn, l.loan_date)
WHEN 'biweekly' THEN dateadd('week', r.rn * 2, l.loan_date)
WHEN 'semimonthly' THEN
case r.rn%2
when 0 then dateadd('month', r.rn/2, l.loan_date)
when 1 then dateadd('days', floor(datediff('days', dateadd('month', (r.rn-1)/2, l.loan_date), dateadd('month', (r.rn+1)/2, l.loan_date))/2), dateadd('month', (r.rn-1)/2, l.loan_date))
end
WHEN 'monthly' THEN dateadd('month', r.rn, l.loan_date)
END as payment_date,
l.payment_amount,
l.payment_frequency
FROM loan_enrich AS l
JOIN table_of_numbers AS r
ON l.loan_periods >= r.rn
ORDER BY 1, r.rn;
gives:
LOANID   PAYMENT_DATE   PAYMENT_AMOUNT   PAYMENT_FREQUENCY
------   ------------   --------------   -----------------
abc123   2022-01-15     50               biweekly
abc123   2022-01-29     50               biweekly
abc123   2022-02-12     50               biweekly
abc123   2022-02-26     50               biweekly
abc123   2022-03-12     50               biweekly
So this can be boosted to add a semimonthly15 frequency, which is always 15 days later; we can do some filtering in case the number of generated rows is more than we need; and we can show logic for handling a final payment that is smaller than the prior payments:
WITH loans_table(loanid, loan_date, loan_amount,
maturity_date, payment_frequency,
payment_amount) as (
SELECT * FROM VALUES
('abc123', '2022-01-01'::date, 250, '2022-03-09'::date, 'biweekly', 50),
('abc124', '2022-01-01'::date, 249, '2022-03-09'::date, 'semimonthly', 50),
('abc125', '2022-01-01'::date, 249, '2022-03-09'::date, 'semimonthly15', 50)
), table_of_numbers as (
SELECT row_number() over(order by null) as rn
FROM TABLE(generator(ROWCOUNT => 1000))
/* that 1000 should be larger than any loan period length you have */
), loan_enrich as (
SELECT
*
,CASE Payment_frequency
WHEN 'weekly' THEN 7
WHEN 'biweekly' THEN 14
WHEN 'semimonthly' THEN 14
WHEN 'semimonthly15' THEN 14
WHEN 'monthly' THEN 28
END as period_lo_days
,datediff('day', loan_date, maturity_date) as loan_days
,CEIL(loan_days / period_lo_days) as loan_periods
FROM loans_table
)
SELECT
l.loanid,
CASE payment_frequency
WHEN 'weekly' THEN dateadd('week', r.rn, l.loan_date)
WHEN 'biweekly' THEN dateadd('week', r.rn * 2, l.loan_date)
WHEN 'semimonthly' THEN
case r.rn%2
when 0 then dateadd('month', r.rn/2, l.loan_date)
when 1 then dateadd('days', floor(datediff('days', dateadd('month', (r.rn-1)/2, l.loan_date), dateadd('month', (r.rn+1)/2, l.loan_date))/2), dateadd('month', (r.rn-1)/2, l.loan_date))
end
WHEN 'semimonthly15' THEN
case r.rn%2
when 0 then dateadd('month', r.rn/2, l.loan_date)
when 1 then dateadd('days', 15, dateadd('month', (r.rn-1)/2, l.loan_date))
end
WHEN 'monthly' THEN dateadd('month', r.rn, l.loan_date)
END as payment_date,
l.payment_amount,
l.payment_frequency,
l.loan_amount,
l.loan_amount - least(l.loan_amount, l.payment_amount * r.rn) as still_to_pay,
least(l.loan_amount - least(l.loan_amount, l.payment_amount * (r.rn-1)), l.payment_amount) as this_payment
FROM loan_enrich AS l
JOIN table_of_numbers AS r
ON l.loan_periods >= r.rn
WHERE this_payment > 0
ORDER BY 1, r.rn
gives:
LOANID   PAYMENT_DATE   PAYMENT_AMOUNT   PAYMENT_FREQUENCY   LOAN_AMOUNT   STILL_TO_PAY   THIS_PAYMENT
------   ------------   --------------   -----------------   -----------   ------------   ------------
abc123   2022-01-15     50               biweekly            250           200            50
abc123   2022-01-29     50               biweekly            250           150            50
abc123   2022-02-12     50               biweekly            250           100            50
abc123   2022-02-26     50               biweekly            250           50             50
abc123   2022-03-12     50               biweekly            250           0              50
abc124   2022-01-16     50               semimonthly         249           199            50
abc124   2022-02-01     50               semimonthly         249           149            50
abc124   2022-02-15     50               semimonthly         249           99             50
abc124   2022-03-01     50               semimonthly         249           49             50
abc124   2022-03-16     50               semimonthly         249           0              49
abc125   2022-01-16     50               semimonthly15       249           199            50
abc125   2022-02-01     50               semimonthly15       249           149            50
abc125   2022-02-16     50               semimonthly15       249           99             50
abc125   2022-03-01     50               semimonthly15       249           49             50
abc125   2022-03-16     50               semimonthly15       249           0              49
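As a worked check of the final-payment expressions for loan abc124 (249 owed, 50 per payment, looking at the 5th payment row):
SELECT 249 - LEAST(249, 50 * 5)            AS still_to_pay,   -- 249 - 249 = 0
       LEAST(249 - LEAST(249, 50 * 4), 50) AS this_payment;   -- least(49, 50) = 49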

Related

Snowflake: Aggregating by a sliding window (the past 60 mins) for a dataset where the sampling frequency is non-uniform

I have data with a non-uniform sampling distribution. I want to aggregate the data on a rolling/sliding basis (the past 60 mins).
In order to achieve an hourly average (partitioned by city), I used the following code, which worked.
SELECT *,
AVG(VALUE) OVER (PARTITION BY CITY, DATE_AND_HOUR ORDER BY TIMESTAMP)
FROM
(
SELECT *,
date_trunc('HOUR', TIMESTAMP) as DATE_AND_Hour
FROM SAMPLE_DATA
)
However, my desired output is as follows:
I know Snowflake doesn't support RANGE, and I can't specify ROWS BETWEEN in a window function because my sampling distribution is non-uniform.
I read some potential solutions on this page, but they don't work in Snowflake: sum last n days quantity using sql window function
Essentially, it's an analogous problem.
You can solve this with a self-join:
with data as (
select *
from temp_fh_wikipedia.public.wikipedia_2020
where title in ('San_Francisco', 'Los_Angeles')
and wiki='en'
and datehour > '2020-10-13'
)
select a.title, a.datehour, a.views, avg(b.views) avg_previous_5h
from data a
join (
select *
from data
) b
on a.title=b.title
and b.datehour between timestampadd(hour, -5, a.datehour) and a.datehour
group by 1, 2, 3
order by 1, 2
limit 100
Just change hour to minute if you want the last x minutes.
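For example, the same pattern over the last 60 minutes just changes the join window (a sketch against a hypothetical readings(city, ts, value) table, not one of the tables above):
select a.city, a.ts, a.value, avg(b.value) as avg_previous_60m
from readings a
join readings b
  on a.city = b.city
 and b.ts between timestampadd(minute, -60, a.ts) and a.ts
group by 1, 2, 3
order by 1, 2;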
Firstly, what you show as "average" in your example is actually the "sum", and your first "Shanghai" result is including a "Beijing" result.
You have two options: build a fixed-size window dataset (build partials for each minute) and then use a fixed-size window frame over that, OR self-join and just aggregate (as Felipe has shown).
If you have very dense data, you might find the former more performant; if you have sparse data, the latter approach should be faster, and it is definitely faster to code.
So the simple one first:
with data(city, timestamp, value) as (
select column1, try_to_timestamp(column2, 'yyyy/mm/dd hh:mi'), column3 from values
('beijing', '2022/05/25 10:33', 22),
('beijing', '2022/05/25 10:37', 20),
('beijing', '2022/05/25 11:36', 29),
('beijing', '2022/05/26 11:36', 28),
('beijing', '2022/05/26 10:00', 21),
('shanghai', '2022/05/26 11:00', 33),
('shanghai', '2022/05/26 11:46', 35),
('shanghai', '2022/05/26 12:40', 37)
)
select a.*
,avg(b.value) as p60_avg
,count(b.value)-1 as p60_count
,sum(b.value) as p60_sum
from data as a
left join data as b
on a.city = b.city and b.timestamp between dateadd(hour, -1, a.timestamp) and a.timestamp
group by 1,2,3
order by 1,2
gives:
CITY       TIMESTAMP                 VALUE   P60_AVG   P60_COUNT   P60_SUM
--------   -----------------------   -----   -------   ---------   -------
beijing    2022-05-25 10:33:00.000   22      22        0           22
beijing    2022-05-25 10:37:00.000   20      21        1           42
beijing    2022-05-25 11:36:00.000   29      24.5      1           49
beijing    2022-05-26 10:00:00.000   21      21        0           21
beijing    2022-05-26 11:36:00.000   28      28        0           28
shanghai   2022-05-26 11:00:00.000   33      33        0           33
shanghai   2022-05-26 11:46:00.000   35      34        1           68
shanghai   2022-05-26 12:40:00.000   37      36        1           72
The dense version:
with data(city, timestamp, value) as (
select column1, try_to_timestamp(column2, 'yyyy/mm/dd hh:mi'), column3 from values
('beijing', '2022/05/25 10:33', 22),
('beijing', '2022/05/25 10:37', 20),
('beijing', '2022/05/25 11:36', 29),
('beijing', '2022/05/26 11:36', 28),
('beijing', '2022/05/26 10:00', 21),
('shanghai', '2022/05/26 11:00', 33),
('shanghai', '2022/05/26 11:46', 35),
('shanghai', '2022/05/26 12:40', 37)
), filled_time as (
select city,
dateadd(minute, row_number() over(partition by city order by null)-1, min_t) as timestamp
from (
select
city, min(timestamp) as min_t, max(timestamp) as max_t
from data
group by 1
), table(generator(ROWCOUNT => 10000))
qualify timestamp <= max_t
)
select
ft.city
,ft.timestamp
,avg(d.value) over (partition by ft.city order by ft.timestamp ROWS BETWEEN 60 PRECEDING AND current row ) as p60_avg
from filled_time as ft
left join data as d
on ft.city = d.city and ft.timestamp = d.timestamp
order by 1,2;
gives:
CITY      TIMESTAMP                 P60_AVG
-------   -----------------------   -------
beijing   2022-05-25 10:33:00.000   22
beijing   2022-05-25 10:34:00.000   22
beijing   2022-05-25 10:35:00.000   22
beijing   2022-05-25 10:36:00.000   22
beijing   2022-05-25 10:37:00.000   21
beijing   2022-05-25 10:38:00.000   21
beijing   2022-05-25 10:39:00.000   21
beijing   2022-05-25 10:40:00.000   21
beijing   2022-05-25 10:41:00.000   21
beijing   2022-05-25 10:42:00.000   21
beijing   2022-05-25 10:43:00.000   21
beijing   2022-05-25 10:44:00.000   21
beijing   2022-05-25 10:45:00.000   21
beijing   2022-05-25 10:46:00.000   21
snip...
And those "extra" rows could be dumped with a qualify
select
ft.city
,ft.timestamp
,avg(d.value) over (partition by ft.city order by ft.timestamp ROWS BETWEEN 60 PRECEDING AND current row ) as p60_avg
--,count(b.value)-1 as p60_count
--,sum(b.value) as p60_sum
from filled_time as ft
left join data as d
on ft.city = d.city and ft.timestamp = d.timestamp
qualify d.value is not null
order by 1,2;

Create SQL Pivot Table Depending On Different Periods of times

I want to create a pivot table for the due values of different customers, pivoted into periods: the first column holds the total notes due from today until 30 days from now, the second the values due in the period (Now + 30) < Due < (Now + 60),
the next the values due in (Now + 60) < Due < (Now + 90), and the last the values due today.
This is my code, which gets the raw data:
SELECT [ADD].AccountID
,SUM(convert(money,ADDN.Amount - ISNULL(CollectedValue,0))) AS [Total Rest Amount]
,ADDN.DueDate AS [Due Date]
FROM [Accounting].[AccDocumentDetailsNotes] ADDN
INNER JOIN Accounting.AccDocumentDetails [ADD]
ON ADDN.AccDocumentDetailID = [ADD].ID
INNER JOIN Accounting.AccDocumentHeader ADH
ON ADH.ID = [ADD].AccDocumentHeaderID
INNER JOIN [Accounting].[AccNotesCollectors] ANC
ON ANC.NoteID = ADDN.ID
INNER JOIN Accounting.AccAccounts AA
ON AA.ID = [ADD].AccountID
GROUP BY [ADD].AccountID,ADDN.DueDate,[CodeTypePart],ADDN.Amount,CollectedValue
HAVING [CodeTypePart] = 'NR' AND convert(money,ADDN.Amount - ISNULL(CollectedValue,0)) > 0
And this is a historical sample from the result:
AccountID Total Rest Amount Due Date
----------- --------------------- -----------------------
25 6800.00 2017-02-23 17:31:09.000
25 1700.00 2017-02-23 17:31:09.000
25 10602.00 2017-05-28 16:28:14.000
27 14500.00 2017-02-28 14:53:57.000
30 120150.00 2017-02-24 00:23:20.000
30 117050.00 2017-02-24 00:23:20.000
33 2000.00 2017-04-04 20:48:51.193
45 39500.00 2017-04-18 20:13:46.000
45 31300.00 2017-04-18 20:13:46.000
45 9000.00 2017-04-18 20:13:46.000
45 32200.00 2017-04-22 16:38:47.803
46 32500.00 2017-02-23 20:14:24.000
46 15910.00 2017-02-23 20:14:24.000
And I want it to look like this:
So you need to break down your data into groups by how overdue it is, and then pivot on that. Then to get the total, you can add together all the sub-columns.
select
AccountID,
isnull([90+],0)+isnull( [today 61-90],0)+ isnull( [today 31-60],0)+isnull( [today-30],0) total,
[90+], [today 61-90], [today 31-60], [today-30]
from
(
select AccountId, Amount,
CASE
WHEN datediff(d, duedate, getdate()) <= 30 THEN 'today-30'
when datediff(d, duedate, getdate()) between 31 and 60 then 'today 31-60'
when datediff(d, duedate, getdate()) between 61 and 90 then 'today 61-90'
else '90+'
END as daysoverdue
from #t
) src
pivot
( sum(Amount) for daysoverdue in ([90+], [today 61-90], [today 31-60], [today-30] ))p
Try this:
;with data as (
select
Today = cast(getdate() as date),
Plus30 = dateadd(d, 30, cast(getdate() as date) ),
Plus60 = dateadd(d, 60, cast(getdate() as date) ),
Plus90 = dateadd(d, 90, cast(getdate() as date) ),
EndOfTime = cast('21991231' as date),
t.*
from #t as t
)
select
AccountId,
Total = sum(Amount),
Due0To30 = sum(pvt.Due0To30),
Due31To60 = sum(pvt.Due31To60),
Due61To90 = sum(pvt.Due61To90),
Due91Plus = sum(pvt.Due91Plus)
from data
cross apply (values
(Today, Plus30, Amount, 0, 0, 0),
(Plus30, Plus60, 0, Amount, 0, 0),
(Plus60, Plus90, 0, 0, Amount, 0),
(Plus90, EndOfTime, 0, 0, 0, Amount)
)pvt(StartDate,EndDate,Due0To30, Due31To60, Due61To90, Due91Plus)
where [Due Date] >= pvt.StartDate
and [Due Date] < pvt.EndDate
group by AccountID
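A minimal illustration (a hypothetical standalone row, not the real tables) of what the CROSS APPLY does: each source row is fanned out into four bucket candidates, and the WHERE keeps only the candidate whose [StartDate, EndDate) window contains the due date:
declare @today date = cast(getdate() as date);
select pvt.*
from (select Amount = 100.00, DueDate = dateadd(d, 45, @today)) t
cross apply (values
(@today, dateadd(d, 30, @today), t.Amount, 0, 0, 0),
(dateadd(d, 30, @today), dateadd(d, 60, @today), 0, t.Amount, 0, 0),
(dateadd(d, 60, @today), dateadd(d, 90, @today), 0, 0, t.Amount, 0),
(dateadd(d, 90, @today), cast('21991231' as date), 0, 0, 0, t.Amount)
) pvt (StartDate, EndDate, Due0To30, Due31To60, Due61To90, Due91Plus)
where t.DueDate >= pvt.StartDate
and t.DueDate < pvt.EndDate;  -- only the Due31To60 row survives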

SQL Server time sheet calculation

I have a time punch program that outputs the data set below. RECTYPE_43 holds the in (1) and out (2) punches. I need a query that looks at LOGINDATE_43, LOGINTIME_43, and RECTYPE_43 and gets the difference between the 1 and 2 punches.
I thought this would be easier than it has proven to be.
empid_43 RECTYPE_43 LOGINDATE_43 LOGINTIME_43
------------------------------------------------------------
127 1 2016-10-21 00:00:00.000 0558
127 2 2016-10-21 00:00:00.000 1430
127 2 2016-10-21 00:00:00.000 1201
127 1 2016-10-21 00:00:00.000 1228
127 1 2016-10-24 00:00:00.000 0557
127 2 2016-10-24 00:00:00.000 1200
127 1 2016-10-24 00:00:00.000 1228
127 2 2016-10-24 00:00:00.000 1430
2589 2 2016-10-21 00:00:00.000 1431
2589 1 2016-10-21 00:00:00.000 0556
2589 1 2016-10-24 00:00:00.000 0550
2589 2 2016-10-24 00:00:00.000 1431
2589 2 2016-10-24 00:00:00.000 1201
2589 1 2016-10-24 00:00:00.000 1226
69 1 2016-10-24 00:00:00.000 1229
69 2 2016-10-24 00:00:00.000 1430
69 1 2016-10-24 00:00:00.000 0555
69 2 2016-10-24 00:00:00.000 1200
You can use a CTE to get all the punch-ins and then a subquery to find the first punch out that comes after that time...
;WITH ctePunchIn AS (
SELECT empid_43, LOGINDATE_43 AS Date_In, LOGINTIME_43 AS Time_In
FROM #Table1
WHERE [RECTYPE_43] = 1
)
SELECT
empid_43, Date_In, Time_In
,(SELECT TOP 1 LOGINTIME_43 FROM #Table1 WHERE
(empid_43 = ctePunchIn.empid_43)
AND
(LOGINDATE_43 = ctePunchIn.Date_In)
AND
(LOGINTIME_43 > ctePunchIn.Time_In)
AND
(RECTYPE_43 = 2)
ORDER BY empid_43, Date_In, LOGINTIME_43) AS Time_Out
FROM
ctePunchIn
Dazedandconfused's answer works if the logout time is on the same date as the login time, but if the user logs out on a different day from logging in, it will not work.
e.g.
INSERT into Punch (empId_43, RecType_43, LoginDate_43, LoginTime_43)
VALUES (15, 1, '2016-01-01', '2305'),
(15, 2, '2016-01-02', '0005');
In order to accommodate this, you need to know what the next item in the table is for that employee. With that, you can ensure that the next item is also a logout event. This will help capture situations where someone has forgotten to punch out.
Extending the CTE can provide a more complete solution:
WITH Data AS
(
SELECT empId_43,
RecType_43,
LoginDate_43,
LoginTime_43,
RowNum = ROW_NUMBER() OVER (PARTITION BY empId_43
ORDER BY LoginDate_43, LoginTime_43)
FROM Punch
)
SELECT PIn.empId_43 [Employee],
PIn.LoginDate_43 [LoginDate],
PIn.LoginTime_43 [LoginTime],
POut.LoginDate_43 [LogoutDate],
POut.LoginTime_43 [LogoutTime]
FROM Data PIn
LEFT JOIN Data POut ON PIn.empId_43 = POut.empId_43
AND POut.RecType_43 = 2
AND POut.RowNum = PIn.RowNum + 1
WHERE PIn.RecType_43 = 1
ORDER BY PIn.empId_43, PIn.LoginDate_43, PIn.LoginTime_43;
However, ROW_NUMBER can be inefficient. Doing this is best when looking at a small subset (e.g. a particular date range), as shown in the sketch below.
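For example (a sketch with a hypothetical date range), filtering inside the CTE before numbering keeps the window small:
WITH Data AS
(
    SELECT empId_43,
           RecType_43,
           LoginDate_43,
           LoginTime_43,
           RowNum = ROW_NUMBER() OVER (PARTITION BY empId_43
                                       ORDER BY LoginDate_43, LoginTime_43)
    FROM Punch
    WHERE LoginDate_43 >= '2016-10-17'
      AND LoginDate_43 < '2016-10-31'
)
SELECT * FROM Data;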
A slightly different way of doing it:
select
punchIn.empid_43,
punchIn.login as dateTime_in,
punchout.login as dateTime_out
from
(
SELECT empId_43,
RecType_43,
LoginDate_43,
LoginTime_43,
dateadd(minute, right(logintime_43,2),
dateadd(hour, left(LoginTime_43,2),
LoginDate_43)) as login,
RowNum = ROW_NUMBER() OVER (PARTITION BY empId_43
ORDER BY LoginDate_43, LoginTime_43)
FROM Punch
where rectype_43 = 1
) punchIn left outer join
(
SELECT empId_43,
RecType_43,
LoginDate_43,
LoginTime_43,
dateadd(minute, right(logintime_43,2),
dateadd(hour, left(LoginTime_43,2),
LoginDate_43)) as login,
RowNum = ROW_NUMBER() OVER (PARTITION BY empId_43
ORDER BY LoginDate_43, LoginTime_43)
FROM Punch
where rectype_43 = 2
) punchOut on
punchin.empId_43 = punchout.empId_43 and
punchin.rownum = punchout.rownum
This assumes all punch-in rows have a corresponding punch-out row.

How to Format SQL Query? Use Pivot?

Here is my Query. The results that I get are correct but I'm having trouble getting it in the desired format. I've tried to use Pivot, but I get errors. Any ideas?
Query:
DECLARE @SMonth DATETIME
SET @SMonth = '12/01/2015'
SELECT
SMonth 'Sales Month',
c.CustNumber 'Customer',
b.Description 'Brand',
Sum (SaleQuantity) 'Qty'
FROM
DistStructure.Customer c
JOIN Sales.Sale s ON s.CustId = c.CustId
JOIN Sales.Import i on i.ImportRefId = s.ImportRefId
JOIN AppSecurity.Log l on l.LogId = s.ImportRefId
JOIN Sales.Prod p on p.ProdId = s.ProdId
JOIN Sales.Brand b on b.BrandId = p.BrandId
WHERE
s.SMonth = @SMonth AND
i.ImportStatId = 50
Group By
CustNumber,
SMonth,
Description
Order By
CustNumber
Query Results:
Sales Month Customer Brand Qty
----------------------------------------------------
2015-12-01 00:00:00.000 030554 FS 29
2015-12-01 00:00:00.000 030554 BS 5
2015-12-01 00:00:00.000 032204 FZ 21
2015-12-01 00:00:00.000 032204 BS 14
2015-12-01 00:00:00.000 032204 FS 114
2015-12-01 00:00:00.000 034312 FZ 8
2015-12-01 00:00:00.000 034312 FS 104
2015-12-01 00:00:00.000 034312 BS 16
2015-12-01 00:00:00.000 034983 FS 63
2015-12-01 00:00:00.000 034983 BS 18
2015-12-01 00:00:00.000 034983 FZ 3
Desired Format:
Note: The Customer should be rolled up by Brand (so there is only one row per Customer) and then totaled. If the Brand has no data a zero should be placed in the spot.
Sales Month Customer BS FS FZ Total
--------------------------------------------------------------
2015-12-01 00:00:00.000 030554 5 29 0 34
2015-12-01 00:00:00.000 032204 14 114 21 149
2015-12-01 00:00:00.000 034312 16 104 8 128
2015-12-01 00:00:00.000 034983 18 63 3 84
Here is one way, using conditional aggregation, to alter your existing query to get the desired result format.
;with cte as
(
SELECT [Sales Month]=SMonth,
[Customer]= c.CustNumber,
[BS] = Sum(CASE WHEN b.Description = 'BS' THEN SaleQuantity ELSE 0 END),
[FS]= Sum(CASE WHEN b.Description = 'FS' THEN SaleQuantity ELSE 0 END),
[FZ]= Sum(CASE WHEN b.Description = 'FZ' THEN SaleQuantity ELSE 0 END)
FROM DistStructure.Customer c
JOIN Sales.Sale s
ON s.CustId = c.CustId
JOIN Sales.Import i
ON i.ImportRefId = s.ImportRefId
JOIN AppSecurity.Log l
ON l.LogId = s.ImportRefId
JOIN Sales.Prod p
ON p.ProdId = s.ProdId
JOIN Sales.Brand b
ON b.BrandId = p.BrandId
WHERE s.SMonth = @SMonth
AND i.ImportStatId = 50
GROUP BY CustNumber,
SMonth
)
SELECT [Sales Month],
[Customer],
[BS],
[FS],
[FZ],
TOTAL=[BS] + [FS] + [FZ]
FROM cte
ORDER BY [Customer]
Note: if the number of Brands is unknown, then you need to use dynamic SQL; a sketch follows.
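A rough sketch of the dynamic version (it assumes the rows live in a temp table #src with the same Brand/Qty columns, since table variables such as @t below are not visible inside sp_executesql; NULL handling and the Total column are omitted for brevity):
DECLARE @cols nvarchar(max), @sql nvarchar(max);
SELECT @cols = STUFF((SELECT ',' + QUOTENAME(Brand)
                      FROM #src GROUP BY Brand ORDER BY Brand
                      FOR XML PATH('')), 1, 1, '');
SET @sql = N'select pvt.[Sales Month], pvt.Customer, ' + @cols + N'
from #src as t
pivot ( sum(Qty) for Brand in (' + @cols + N') ) as pvt;';
EXEC sp_executesql @sql;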
I believe this is what you are looking for:
/*
Setup Sample Table
*/
declare @t table
(
[Sales Month] datetime,
Customer nvarchar(6),
Brand nvarchar(2),
Qty tinyint
)
/*
Insert sample data
*/
insert into @t
([Sales Month], Customer, Brand, Qty)
values ('2015-12-01', '030554', N'FS', 29),
('2015-12-01', '030554', N'BS', 5),
('2015-12-01', '032204', N'FZ', 21),
('2015-12-01', '032204', N'BS', 14),
('2015-12-01', '032204', N'FS', 114),
('2015-12-01', '034312', N'FZ', 8),
('2015-12-01', '034312', N'FS', 104),
('2015-12-01', '034312', N'BS', 16),
('2015-12-01', '034983', N'FS', 63),
('2015-12-01', '034983', N'BS', 18),
('2015-12-01', '034983', N'FZ', 3)
/*
Generating desired output
*/
select pvt.[Sales Month],
pvt.Customer,
isnull(pvt.BS, 0) as BS,
isnull(pvt.FS, 0) as FS,
isnull(pvt.FZ, 0) as FZ,
isnull(pvt.BS, 0) + isnull(pvt.FS, 0) + isnull(pvt.FZ, 0) as Total
from @t as t pivot
( sum(Qty) for Brand in (BS, FS, FZ) ) as pvt

How to get weekly report with weekstart date and end date as column name in SQL Server

Data:
Date Productivity
-------------------------
01/06/2015 50
01/06/2015 50
02/06/2015 60
02/06/2015 50
01/06/2015 55
03/06/2015 50
03/06/2015 50
03/06/2015 50
04/06/2015 50
04/06/2015 50
04/06/2015 50
05/06/2015 50
05/06/2015 50
05/06/2015 50
06/06/2015 50
06/06/2015 50
08/06/2015 50
08/06/2015 50
09/06/2015 50
10/06/2015 50
11/06/2015 50
12/06/2015 50
13/06/2015 50
13/06/2015 50
13/06/2015 50
I want output like this, which contains the weekly average productivity:
Date Productivity
------------------------------------------
01/06/2015-06/06/2015 50.93
08/06/2015-13/06/2015 50
SqlFiddle
SELECT
[date] = CONVERT(NVARCHAR(100), MIN([date]), 103 ) + '-' +
CONVERT(NVARCHAR(100), MAX([date]), 103 )
,[Productivity] = CAST(AVG(Productivity * 1.0) AS DECIMAL(10,2))
FROM tab
GROUP BY DATEPART(wk, [Date]);
EDIT:
Added calculating week start and end if data doesn't contain all days.
SqlFiddleDemo2
set datefirst 1;
SELECT
[date] = CONVERT(NVARCHAR(100), DATEADD(dd, -(DATEPART(dw, MIN([date]))-1), MIN([date])), 103 ) + ' - ' + CONVERT(NVARCHAR(100), DATEADD(dd, 7-(DATEPART(dw, MAX([date]))), MAX([date])), 103 )
,[Productivity] = CAST(AVG(Productivity * 1.0) AS DECIMAL(10,2))
FROM tab
GROUP BY DATEPART(wk, [Date]);
Lad's answer will do what you need. But if weeks that have no entries should also appear with an average of 0, you may have to write a CTE that generates those weeks and LEFT JOIN it with Lad's query; a sketch of that follows.
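A sketch of that idea (it keeps things simple by reporting the DATEPART week number rather than the start/end date labels):
;WITH bounds AS (
    SELECT DATEPART(wk, MIN([date])) AS wk_min, DATEPART(wk, MAX([date])) AS wk_max FROM tab
), all_weeks AS (
    SELECT wk_min AS wk FROM bounds
    UNION ALL
    SELECT a.wk + 1 FROM all_weeks a JOIN bounds b ON a.wk < b.wk_max
), weekly AS (
    SELECT DATEPART(wk, [date]) AS wk,
           CAST(AVG(Productivity * 1.0) AS DECIMAL(10,2)) AS Productivity
    FROM tab
    GROUP BY DATEPART(wk, [date])
)
SELECT a.wk AS [Week], ISNULL(w.Productivity, 0) AS Productivity
FROM all_weeks a
LEFT JOIN weekly w ON w.wk = a.wk
ORDER BY a.wk;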
