Here is what I have: a function that returns loan exceptions. If someone has a missing document, that's an exception; a required signature is an exception; and so on.
The problem is that EVERY row returned by this function contains all of the information for the loan, including the amount.
So if there are 6 exceptions on a single loan and the loan amount is 1000, then totaling the amount by exceptions gives you 6000, because 1000 is stored in every detail record.
Here is a similar set of records that I am returning:
poolDesc | loanNumber | Exception   | Amount
---------+------------+-------------+-------
Consumer | 123        | Missing Sig | 100
Consumer | 123        | Missing Doc | 100
Consumer | 123        | Late Pymt   | 100
Estate   | 456        | Address Ent | 2000
Estate   | 456        | Missing Doc | 2000
Estate   | 789        | Missing Sig | 1000
Consumer | 345        | Missing Sig | 500
What I am looking for out of that selection is:
POOL     | CountExceptions | LoanAmount
---------+-----------------+-----------
Consumer | 4               | 600
Estate   | 3               | 3000
There has to be a way to do this; it's going into an SSRS report, if that helps.
Thanks
SELECT poolDesc AS Pool,
       SUM(CountExceptions) AS CountExceptions,
       SUM(LoanAmount) AS LoanAmount
FROM (
    SELECT poolDesc,
           COUNT(*) AS CountExceptions,
           SUM(Amount) OVER (PARTITION BY poolDesc, loanNumber) AS LoanAmount
    FROM loanExceptions
    GROUP BY poolDesc, loanNumber, Amount
) a
GROUP BY poolDesc
Full test script:
create table loanExceptions (
poolDesc varchar(50),
loanNumber int,
Exception varchar(50),
Amount float)
insert into loanExceptions values
('Consumer',123,'Missing Sig',100),
('Consumer',123,'Missing Doc',100),
('Consumer',123,'Late Pymt',100),
('Estate',456,'Address Ent',2000),
('Estate',456,'Missing Doc',2000),
('Estate',789,'Missing Sig',1000),
('Consumer',345,'Missing Sig',500)
SELECT poolDesc AS Pool,
       SUM(CountExceptions) AS CountExceptions,
       SUM(LoanAmount) AS LoanAmount
FROM (
    SELECT poolDesc,
           COUNT(*) AS CountExceptions,
           SUM(Amount) OVER (PARTITION BY poolDesc, loanNumber) AS LoanAmount
    FROM loanExceptions
    GROUP BY poolDesc, loanNumber, Amount
) a
GROUP BY poolDesc
DROP TABLE loanExceptions
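An equivalent way to get one amount per loan, without the window function, is a plain two-step aggregation: since Amount repeats on every exception row of a loan, MAX(Amount) (or MIN) recovers the loan amount exactly once per loan. A sketch against the same loanExceptions table:

```sql
-- Sketch: two-level aggregation, no window function.
-- Inner query collapses to one row per loan; MAX(Amount) is safe because
-- the amount is identical on every exception row of that loan.
SELECT poolDesc AS Pool,
       SUM(CountExceptions) AS CountExceptions,
       SUM(LoanAmount)      AS LoanAmount
FROM (
    SELECT poolDesc,
           loanNumber,
           COUNT(*)    AS CountExceptions,
           MAX(Amount) AS LoanAmount
    FROM loanExceptions
    GROUP BY poolDesc, loanNumber
) AS perLoan
GROUP BY poolDesc;
```

Against the sample data this should return the same Consumer 4/600 and Estate 3/3000 rows as the desired output.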
Related
I work in an organization that has over 75,000 employees. In our payroll system, each employee has 32 unique banks which store things like Sick Time, Vacation Time, Banked Overtime, etc.
Here are the existing tables
Employee
(
Employee_key INT IDENTITY(1,1),
Lastname,
Firstname
)
Employee Key | Lastname | Firstname
-----------------------------------
100 | Smith | John
Bank
(
Bank_key INT IDENTITY(1,1),
Bank_name VARCHAR(50)
)
Bank_key | Bank_name
---------------------
100 | VACATION
Employee_balance
(
Employee_key INT, --FK to Employee
Bank_key INT, --FK to Bank
Bank_balance NUMERIC(10,5) -- Aggregate value of bank including future dated entries
)
Employee_key | Bank_Key | Bank_Balance
--------------------------------------
100 | 100 | 0
Employee_balance_trans
(
Employee_key INT, --FK to Employee
Bank_key INT, --FK to Bank
Trans_dt DATE, -- transaction date that affects the bank
Bank_delta NUMERIC(10,5)
)
Employee_key | Bank_key | Trans_dt | Bank_delta
--------------------------------------------------
100 | 100 | 20230701 | -8.0
100 | 100 | 20230801 | -8.0
100 | 100 | 20230901 | -8.0
100 | 100 | 20231001 | -8.0
100 | 100 | 20231101 | -8.0
This employee has 5 vacation days booked into the future, for a total of 40 hours. As of January 1, the employee had 40 hours in their vacation bank, but because the employee_balance table is net of all future dated entries, I have to do some SQL processing to get the value for a current date.
SELECT eb.employee_key,
       eb.bank_key,
       '2023-01-01',
       eb.bank_balance - ISNULL(SUM(ebt.bank_delta), 0)
FROM employee a
INNER JOIN employee_balance eb ON eb.employee_key = a.employee_key
LEFT OUTER JOIN wfms.employee_balance_trans ebt ON ebt.bank_key = eb.bank_key
    AND ebt.employee_key = eb.employee_key
    AND ebt.trans_dt > '2023-01-01'
GROUP BY eb.employee_key, eb.bank_key, eb.bank_balance
Running this query using 2023-01-01 returns a bank value of 40 hours. Running the query on 2023-07-01 returns a value of 32 hours and so on. This query is fine for calculating a single employee balance. The problem starts when a manager of a department with 1000 employees wants to see a report showing the employee banks at the beginning and end of each month.
I created a new table as follows:
Employee_bank_history
(
employee_key INT, --FK to employee
bank_key INT, --FK to bank
bank_date DATE,
bank_balance NUMERIC (10,5) -- Contains the bank balance as of the bank date
)
The table has a unique clustered index consisting of employee_key, bank_key and bank_date. The table is also populated every evening with a date range from December 31 2021 to Current Date. The start date gets reset every year, so there will be a maximum of 730 days worth of data. This means that at the maximum date range of 730 days, there will be almost 2 billion rows. (75,000 employees X 32 banks X 730 days.)
Currently, I am loading 950 million rows, and the following INSERT statement takes 30-45 minutes.
DECLARE @StartDate DATE = DATEFROMPARTS(DATEPART(YY, GETDATE()) - 2, 12, 31)
DECLARE @EndDate DATE = GETDATE()
;
WITH cte_bank_dates AS
(
    SELECT [date]
    FROM dim_date
    WHERE [date] BETWEEN @StartDate AND @EndDate
)
INSERT INTO employee_bank_history
SELECT de.employee_key,
       deb.bank_key,
       cte.[date],
       deb.bank_balance - ISNULL(SUM(feb.bank_delta), 0)
FROM employee de
INNER JOIN employee_balance deb ON deb.employee_key = de.employee_key
CROSS JOIN cte_bank_dates cte
LEFT OUTER JOIN employee_balance_trans feb ON feb.bank_key = deb.bank_key
    AND feb.employee_key = deb.employee_key
    AND feb.trans_dt > cte.[date]
GROUP BY de.employee_key, deb.bank_key, cte.[date], deb.bank_balance
OPTION (MAXRECURSION 0)
I use the CTE to get only the dates in the correct range. I need each date in the range so that I know which future-dated transactions to exclude from the aggregate. The resulting query to get bank balances as of a given date is blazing fast.
Today, I had my hands slapped and was told that the CROSS JOIN to the CTE was not needed and to optimize the query because it was slowing everything else down when it runs.
Leaving aside the fact that it will run overnight once in production, I'm left to wonder if there's a better way to populate this table for every employee, every bank and every date. The number of rows is unavoidable, as is the calculation to strip out future dated transactions from the employee bank balance.
Does anyone have any idea how I might make this faster, and less resource intensive on the server?
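One direction worth exploring, sketched below: since the report only needs balances at the beginning and end of each month, the date CTE could be filtered down to month boundaries before the CROSS JOIN, shrinking roughly 730 snapshot dates to about 48 per year-range. This is only a sketch reusing the table and column names defined above; it assumes the report truly never needs mid-month dates.

```sql
-- Sketch: snapshot only the first/last day of each month instead of every day.
-- Cuts the row count by roughly a factor of 15 before the CROSS JOIN runs.
DECLARE @StartDate DATE = DATEFROMPARTS(YEAR(GETDATE()) - 2, 12, 31);
DECLARE @EndDate   DATE = GETDATE();

WITH cte_bank_dates AS
(
    SELECT [date]
    FROM dim_date
    WHERE [date] BETWEEN @StartDate AND @EndDate
      AND (DAY([date]) = 1               -- first of month
           OR [date] = EOMONTH([date]))  -- last of month
)
INSERT INTO employee_bank_history
SELECT de.employee_key,
       deb.bank_key,
       cte.[date],
       deb.bank_balance - ISNULL(SUM(feb.bank_delta), 0)
FROM employee de
INNER JOIN employee_balance deb ON deb.employee_key = de.employee_key
CROSS JOIN cte_bank_dates cte
LEFT OUTER JOIN employee_balance_trans feb ON feb.bank_key = deb.bank_key
    AND feb.employee_key = deb.employee_key
    AND feb.trans_dt > cte.[date]
GROUP BY de.employee_key, deb.bank_key, cte.[date], deb.bank_balance;
```

EOMONTH and DATEFROMPARTS require SQL Server 2012 or later.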
I am looking for an elegant solution to find the preferred channel for each client.
As input we get a list of transactions containing clientid, date, invoice_id, channel and amount. For every client we need to find the preferred channel based on amount.
In case a specific client's top channels are tied, the outcome should be a RANDOM choice among those channels.
Input data:
Clients ID | Date | Invoice Id | Channel | Amount
-----------+------------+------------+---------+--------
Client #1 | 01-01-2020 | 0000000001 | Retail | 90
Client #1 | 07-01-2020 | 0000000002 | Website | 180
Client #2 | 08-01-2020 | 0000000003 | Retail | 70
Client #2 | 09-01-2020 | 0000000004 | Website | 70
Client #3 | 10-01-2020 | 0000000005 | Retail | 140
Client #4 | 11-01-2020 | 0000000006 | Retail | 70
Client #4 | 13-01-2020 | 0000000007 | Website | 30
Desired output:
Clients ID | Top-Channel
-----------+-----------------
Client #1 | Website >> website 180 > retail 90
Client #2 | Retail >> random choice from Retail and Website
Client #3 | Retail >> retail 140 > website 0
Client #4 | Retail >> retail 70 > website 30
Usually I solve such tasks with GROUP BY manipulations, adding a random number less than 1, and other tricks. But most probably there is a better solution.
This is for Microsoft SQL Server
If you have the totals, then you can use window functions:
select t.*
from (select t.*,
row_number() over (partition by client_id order by amount desc) as seqnum
from t
) t
where seqnum = 1;
If you need to aggregate to get the totals, the same approach works with aggregation:
select t.*
from (select t.client_id, t.channel, sum(amount) as total_amount,
row_number() over (partition by client_id order by sum(amount) desc) as seqnum
from t
group by t.client_id, t.channel
) t
where seqnum = 1;
So, keeping your desired output in mind, I wrote the following T-SQL without using GROUP BY:
declare @clients_id int = 1
declare @clients_id_max int = (select max(clients_id) from random)
declare @tab1 table (clients_id int, [Top-Channel] nvarchar(10), amount int)
declare @tab2 table (clients_id int, remarks nvarchar(100))

while @clients_id <= @clients_id_max
begin
    if ((select count(*) from random where clients_id = @clients_id) > 1)
    begin
        insert into @tab2
        select top 1 a.clients_id,
               a.channel + ' ' + cast(a.amount as nvarchar(5)) + ' ; ' + b.channel + ' ' + cast(b.amount as nvarchar(5)) as remarks
        from random a, random b
        where a.clients_id = @clients_id and a.clients_id = b.clients_id and a.channel <> b.channel
        order by a.amount desc
    end
    else
    begin
        insert into @tab2
        select a.clients_id, a.channel + ' ' + cast(a.amount as nvarchar(5)) as remarks
        from random a, random b
        where a.clients_id = @clients_id and a.clients_id = b.clients_id
        order by a.amount desc
    end

    insert into @tab1
    select top 1 clients_id, Channel as [Top-Channel], amount
    from random
    where clients_id = @clients_id
    order by amount desc

    set @clients_id = @clients_id + 1
end

select a.clients_id, a.[Top-Channel], b.Remarks
from @tab1 a
join @tab2 b on a.clients_id = b.clients_id
Query output: https://i.stack.imgur.com/7RzcV.jpg
This will work:
select distinct ID,FIRST_VALUE(Channel) over (partition by ID order by amount desc,NEWID())
from Table1
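If amounts first need to be totalled per client and channel (as the desired output implies), the NEWID() tie-break combines naturally with aggregation. A sketch against the sample data, with table and column names assumed to match the other answers:

```sql
-- Sketch: total amount per client/channel, then keep the top channel,
-- breaking exact ties randomly via NEWID() in the ORDER BY.
WITH totals AS
(
    SELECT clients_id,
           channel,
           SUM(amount) AS total_amount
    FROM random               -- assumed table name, as in the loop answer
    GROUP BY clients_id, channel
)
SELECT clients_id,
       channel AS [Top-Channel]
FROM (
    SELECT clients_id,
           channel,
           ROW_NUMBER() OVER (PARTITION BY clients_id
                              ORDER BY total_amount DESC, NEWID()) AS seqnum
    FROM totals
) ranked
WHERE seqnum = 1;
```

Summing first matters when a client has several invoices on one channel; ordering by the raw row amount could pick a channel with one large invoice over a channel with a larger total.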
For a customer, I'm sending an XML file with the sales orders to another system, and I sum the quantities of each item across all sales order lines (e.g. if "ItemA" appears in 10 sales orders with different quantities in each one, I sum the quantity and send the total).
In return, I get a response indicating whether the requested quantities can be delivered to the customers or not. If not, I still get the total quantity that can be delivered. However, there could be situations where I request 100 pieces of "ItemA" but only 98 can be delivered. In cases like this, I need to distribute (UPDATE a custom field with) those 98 pieces FIFO, according to the requested quantity on each sales order and the registration date of each sales order.
I tried to use a WHILE LOOP but I couldn't achieve the desired result. Here's my piece of code:
DECLARE @PickedQty int
DECLARE @PickedERPQty int
DECLARE @OrderedERPQty int = 2

SET @PickedQty = 30 -- total deliverable quantity (assumed here; 30 in the example below)

WHILE (@PickedQty > 0)
BEGIN
    SET @PickedERPQty = (SELECT CASE WHEN @PickedQty > @OrderedERPQty THEN @OrderedERPQty ELSE @PickedQty END)
    SET @PickedQty = @PickedQty - @PickedERPQty
    PRINT @PickedQty
    IF @PickedQty >= 0
    BEGIN
        UPDATE OrderLines
        SET UDFValue2 = @PickedERPQty
        WHERE fDocID = '82DADC71-6706-44C7-9B78-7FCB55D94A69'
    END
    IF @PickedQty <= 0
        BREAK;
END
GO
Example of response
I requested 35 pieces but only 30 are available to be delivered. I need to distribute those 30 pieces across the sales orders, based on requested quantity and FIFO by order date. So, in this example, the first three lines get RealQty equal to the requested quantity (because there is stock), and the last one gets the remaining 5 pieces.
ord_Code CustOrderCode Date ItemCode ReqQty AvailQty RealQty
----------------------------------------------------------------------------
141389 CV/2539 2018-11-25 PX085 10 30 10
141389 CV/2550 2018-11-26 PX085 5 30 5
141389 CV/2563 2018-11-27 PX085 10 30 10
141389 CV/2564 2018-11-28 PX085 10 30 5
Could anyone give me a hint? Thanks
This might be more verbose than it needs to be, but I'll leave it to you to skinny it down if that's possible.
Set up the data:
DECLARE @OrderLines TABLE(
 ord_Code INTEGER NOT NULL
,CustOrderCode VARCHAR(7) NOT NULL
,[Date] DATE NOT NULL
,ItemCode VARCHAR(5) NOT NULL
,ReqQty INTEGER NOT NULL
,AvailQty INTEGER NOT NULL
,RealQty INTEGER NOT NULL
);
INSERT INTO @OrderLines(ord_Code,CustOrderCode,[Date],ItemCode,ReqQty,AvailQty,RealQty) VALUES (141389,'CV/2539','2018-11-25','PX085',10,0,0);
INSERT INTO @OrderLines(ord_Code,CustOrderCode,[Date],ItemCode,ReqQty,AvailQty,RealQty) VALUES (141389,'CV/2550','2018-11-26','PX085', 5,0,0);
INSERT INTO @OrderLines(ord_Code,CustOrderCode,[Date],ItemCode,ReqQty,AvailQty,RealQty) VALUES (141389,'CV/2563','2018-11-27','PX085',10,0,0);
INSERT INTO @OrderLines(ord_Code,CustOrderCode,[Date],ItemCode,ReqQty,AvailQty,RealQty) VALUES (141389,'CV/2564','2018-11-28','PX085',10,0,0);
DECLARE @AvailQty INTEGER = 30;
For running totals, SUM() OVER is the preferred technique for SQL Server 2012 and up, so I started off with some variants on that. This query brought in some useful numbers:
SELECT
    ol.ord_Code,
    ol.CustOrderCode,
    ol.[Date],
    ol.ItemCode,
    ol.ReqQty,
    @AvailQty AS AvailQty,
    SUM(ReqQty) OVER (PARTITION BY ord_Code ORDER BY [Date]) AS TotalOrderedQty,
    @AvailQty - SUM(ReqQty) OVER (PARTITION BY ord_Code ORDER BY [Date]) AS RemainingQty
FROM
    @OrderLines AS ol;
Then I used the RemainingQty to do a little math. The CASE expression is hairy, but the first step checks to see if the RemainingQty after processing this row will be positive, and if it is, we fulfill the order. If not, we fulfill what we can. The nested CASE is there to stop negative numbers from coming into the result set.
SELECT
    ol.ord_Code,
    ol.CustOrderCode,
    ol.[Date],
    ol.ItemCode,
    ol.ReqQty,
    @AvailQty AS AvailQty,
    SUM(ReqQty) OVER (PARTITION BY ord_Code ORDER BY [Date]) AS TotalOrderedQty,
    @AvailQty - SUM(ReqQty) OVER (PARTITION BY ord_Code ORDER BY [Date]) AS RemainingQty,
    CASE
        WHEN (@AvailQty - SUM(ReqQty) OVER (PARTITION BY ord_Code ORDER BY [Date])) > 0
            THEN ol.ReqQty
        ELSE
            CASE
                WHEN ol.ReqQty + (@AvailQty - SUM(ReqQty) OVER (PARTITION BY ord_Code ORDER BY [Date])) > 0
                    THEN ol.ReqQty + (@AvailQty - SUM(ReqQty) OVER (PARTITION BY ord_Code ORDER BY [Date]))
                ELSE 0
            END
    END AS RealQty
FROM
    @OrderLines AS ol;
Windowing functions (like SUM() OVER) can only be in SELECT and ORDER BY clauses, so I had to do a derived table with a JOIN. A CTE would work here, too, if you prefer. But I used that derived table to UPDATE the base table.
UPDATE Lines
SET
 Lines.AvailQty = d.AvailQty
,Lines.RealQty = d.RealQty
FROM
    @OrderLines AS Lines
JOIN
(
    SELECT
        ol.ord_Code,
        ol.CustOrderCode,
        ol.[Date],
        ol.ItemCode,
        @AvailQty AS AvailQty,
        CASE
            WHEN (@AvailQty - SUM(ReqQty) OVER (PARTITION BY ord_Code ORDER BY [Date])) > 0
                THEN ol.ReqQty
            ELSE
                CASE
                    WHEN ol.ReqQty + (@AvailQty - SUM(ReqQty) OVER (PARTITION BY ord_Code ORDER BY [Date])) > 0
                        THEN ol.ReqQty + (@AvailQty - SUM(ReqQty) OVER (PARTITION BY ord_Code ORDER BY [Date]))
                    ELSE 0
                END
        END AS RealQty
    FROM
        @OrderLines AS ol
) AS d
    ON  d.CustOrderCode = Lines.CustOrderCode
    AND d.ord_Code = Lines.ord_Code
    AND d.ItemCode = Lines.ItemCode
    AND d.[Date] = Lines.[Date];

SELECT * FROM @OrderLines;
Results:
+----------+---------------+---------------------+----------+--------+----------+---------+
| ord_Code | CustOrderCode | Date | ItemCode | ReqQty | AvailQty | RealQty |
+----------+---------------+---------------------+----------+--------+----------+---------+
| 141389 | CV/2539 | 25.11.2018 00:00:00 | PX085 | 10 | 30 | 10 |
| 141389 | CV/2550 | 26.11.2018 00:00:00 | PX085 | 5 | 30 | 5 |
| 141389 | CV/2563 | 27.11.2018 00:00:00 | PX085 | 10 | 30 | 10 |
| 141389 | CV/2564 | 28.11.2018 00:00:00 | PX085 | 10 | 30 | 5 |
+----------+---------------+---------------------+----------+--------+----------+---------+
Play with different available qty values here: https://rextester.com/MMFAR17436
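The answer above invites trimming, so here is one possible condensation (a sketch only, reusing the same table variable and @AvailQty, written with the usual @ prefix): update through a CTE so the running total is named once, and fold the nested CASE into a single expression.

```sql
-- Sketch: FIFO allocation in one UPDATE through a CTE.
-- RunningReq is the cumulative requested quantity in date order; each line
-- gets whatever is left of @AvailQty before it, capped at its own ReqQty.
WITH Alloc AS
(
    SELECT ReqQty,
           AvailQty,
           RealQty,
           SUM(ReqQty) OVER (PARTITION BY ord_Code ORDER BY [Date]) AS RunningReq
    FROM @OrderLines
)
UPDATE Alloc
SET AvailQty = @AvailQty,
    RealQty  = CASE
                   WHEN @AvailQty - RunningReq >= 0 THEN ReqQty                 -- fully covered
                   WHEN @AvailQty - RunningReq + ReqQty > 0
                       THEN @AvailQty - RunningReq + ReqQty                     -- partial fill
                   ELSE 0                                                       -- stock exhausted
               END;
```

Updating through a CTE is legal here because every modified column comes from the single underlying table; with the sample data this should reproduce the 10/5/10/5 allocation shown above.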
I'm looking for a way to truncate (drop extra decimal places) in SQL. I've found a way, but I'm having a problem with values that do not have 3 decimal places.
I have the following data
ProductID | Price | Amount
------------+----------+---------
100 | 50.01 | 1
101 | 25 | 0.789
It's very simple, all I need to do is get the total from each product (Price * Amount).
My query:
select
    [ProductID],
    [Price],
    [Amount],
    round(SUM([Price] * [Amount]), 2, 1) as 'Total'
from
    [Tables]
group by [ProductID], [Price], [Amount]
What I get is:
ProductID | Price | Amount | Total
-----------+-----------+-----------+-----------
100 | 50.01 | 1 | 50 <=======
101 | 25 | 0.789 | 19.72
So, if my calculator is working, the results of these simple operations are:
(50.01 * 1) = 50.01
(25 * 0.789) = 19.725
Question: SQL does the trick dropping the 5 from 19.725, but why does (50.01 * 1) equal 50?
I do know that if I use Round((value),2,0) I'll get 50.01, but if I do that 19.725 becomes 19.73 and that is not correct for my application.
What can I do to fix this?
This happens because Price and Amount are floats: 50.01 has no exact binary floating-point representation, so 50.01 * 1 is stored as a value slightly below 50.01 (roughly 50.009999...), and truncating to two decimal places drops it to 50.00. If you cast price and amount to either the numeric or decimal data type as shown below, you should arrive at the expected result:
DECLARE #Tables table
(
ProductID int,
Price float,
Amount float
);
INSERT #Tables
(ProductID, Price, Amount)
VALUES
(100, 50.01, 1),
(101, 25, 0.789);
SELECT ProductID
,Price
,Amount
,ROUND(SUM((CAST(Price AS decimal(5,2)) * CAST(Amount AS decimal(5,3)))),2,1) AS 'Total'
FROM #Tables
GROUP BY ProductID, Price, Amount;
(2 row(s) affected)
ProductID Price Amount Total
----------- ---------------------- ---------------------- ---------------------------------------
100 50.01 1 50.01000
101 25 0.789 19.72000
(2 row(s) affected)
SELECT ProductID,
Price,
Amount,
CAST(SUBSTRING(CAST(CAST(Price * Amount AS decimal(18,3)) AS VARCHAR),0, LEN(CAST(CAST(Price * Amount AS decimal(18,3)) AS VARCHAR))) AS DECIMAL(18,2)) AS Total
FROM [Tables]
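A simpler variant of the same idea, sketched here, is to cast the product to decimal first and only then apply ROUND's truncation flag, which avoids the string manipulation entirely:

```sql
-- Sketch: remove float noise by casting to decimal before truncating.
-- CAST(50.009999... AS decimal(18,3)) = 50.010, so ROUND(..., 2, 1)
-- truncates it to 50.01; likewise 19.725 truncates to 19.72.
SELECT ProductID,
       Price,
       Amount,
       ROUND(CAST(Price * Amount AS decimal(18,3)), 2, 1) AS Total
FROM [Tables];
```

The third argument of ROUND being nonzero means truncate rather than round, which is what keeps 19.725 from becoming 19.73.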
Using Microsoft SQL Server Express Edition (64-bit) 10.0.550.0
I'm trying to extract data from an Autodesk Vault server. The SQL involved to get to the required data is too advanced for my current level of knowledge, so I'm trying to lay a puzzle using bits from Google and StackOverflow as pieces. Using this excellent answer I was able to transpose the vertical data into a manageable horizontal format.
The Autodesk Vault database stores information about CAD drawings (among other things). The main vertical table dbo.Property holds information about all the different revisions of each CAD drawing. The problem I'm currently facing is that I'm getting too much data. I just want the data from the latest revision of each CAD drawing.
Here's my SQL so far
select
CreateDate,
EntityID,
PartNumber,
CategoryName,
[Subject],
Title
from
(
select
EntityID,
CreateDate,
[53] as PartNumber,
[28] as CategoryName,
[42] as [Subject],
[43] as Title
from
(
select
p.Value,
p.PropertyDefID,
p.EntityID,
e.CreateDate
from dbo.Property as p
inner join dbo.Entity as e on p.EntityID = e.EntityId
where p.PropertyDefID in(28, 42, 43, 53)
and e.EntityClassID = 8
) t1
pivot
(
max(Value)
for PropertyDefID in([28], [42], [43], [53])
) t2
) t3
where PartNumber is not null
and PartNumber != ''
and CategoryName = 'Drawing'
-- (1) additional condition
order by PartNumber, CreateDate desc
Where dbo.Property.Value is of sql_variant datatype. The query above results in a data set similar to this:
CreateDate | EntityID | PartNumber | CategoryName | Subject | Title
---------------------------------------------------------------------
2016-01-01 | 59046 | 10001 | Drawing | Xxxxx | Yyyyy
2016-05-01 | 60137 | 10001 | Drawing | Xxxxx | Yyyyy
2016-08-01 | 62518 | 10001 | Drawing | Xxxx | Yyyyyy
2016-12-16 | 63007 | 10001 | Drawing | Xxxxxx | Yyyyyy
2016-01-01 | 45776 | 10002 | Drawing | Zzzzz | NULL
2016-11-01 | 65011 | 10002 | Drawing | Zzzzzz | NULL
...
(about 23000 rows)
The problem that I have is that I'm getting all revisions for each drawing. In the example above I only want the latest revision for PartNumber=10001 dated '2016-12-16' etc.
I have also looked at this answer on how to group and select rows where one of the columns has a max value, but I just can't seem to figure out how to combine the two. I tried adding the following snippet to the commented line in the above query, but it fails on many different levels.
and (PartNumber, CreateDate) in
(
select PartNumber, max(CreateDate)
from t3
group by PartNumber
)
The reason I'm tagging this question "pivot", although the pivoting is already done, is that I suspect that the pivoting is what's causing me trouble. I just haven't been able to wrap my head around this pivoting stuff yet, and my SQL optimization skills are seriously lacking. Maybe the filtering should be done at an inner level?
Drawing inspiration from the comment provided by @Strawberry, I kept working and tweaking until I got something that seems to work. I had to use a PIVOT inside a PIVOT for it all to work.
Edit: At first I used views, but then the prerequisites changed as I had to work with a read-only database user. Fortunately, I was still allowed to create temporary tables.
This is the final result.
if object_id('tempdb.dbo.#Properties', 'U') is not null
drop table #Properties
create table #Properties
(
PartNumber nvarchar(max),
[Subject] nvarchar(max),
Title nvarchar(max),
CreateDate datetime
)
insert into #Properties
(
PartNumber,
[Subject],
Title,
CreateDate
)
select
convert(nvarchar(max), PartNumber),
convert(nvarchar(max), [Subject]),
convert(nvarchar(max), Title),
convert(datetime, CreateDate)
from
(
select
EntityID,
CreateDate,
[53] as PartNumber,
[42] as [Subject],
[43] as Title
from
(
select
p.Value,
p.PropertyDefID,
p.EntityID,
e.CreateDate
from dbo.Property as p
inner join dbo.Entity as e on p.EntityID = e.EntityId
where p.PropertyDefID in (42, 43, 53)
and e.EntityClassID = 8
and p.EntityID in
(
select
max(EntityID) as MaxEntityID
from
(
select
EntityID,
[28] as CategoryName,
[53] as PartNumber
from
(
select
p.Value,
p.EntityID,
p.PropertyDefID
from dbo.Property as p
inner join dbo.Entity as e on p.EntityID = e.EntityId
where p.PropertyDefID in (28, 53)
and e.EntityClassID = 8 -- FileIteration
) as t1
pivot
(
max(Value)
for PropertyDefID in ([28], [53])
) as t2
) as t3
where CategoryName = 'Drawing'
group by PartNumber
)
) as t4
pivot
(
max(Value)
for PropertyDefID in ([42], [43], [53])
) as t5
) as t6
where PartNumber is not null
and PartNumber != ''
order by PartNumber
select * from #Properties;
-- search conditions go here
I had to change the suggested join to a where x in(y) because the join was insanely slow (I terminated the query after four minutes). Now the resulting data set (which takes ~2 seconds to produce) looks promising:
PartNumber | Subject | Title | CreateDate | ...
-----------------------------------------------------------------------
100000 | Xxxxxx | Yyyyyy | 2015-08-17 09-10 | ...
100001 | Zzzzzz | NULL | 2015-09-02 15-23 | ...
...
(about 8900 rows)
No more old revisions in the set.
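For reference, the "latest revision per part" step can also be expressed with ROW_NUMBER() instead of the inner PIVOT plus MAX(EntityID). This is only a sketch built on the pivoted query from the question, untested against a real Vault schema:

```sql
-- Sketch: keep only the newest revision of each drawing via ROW_NUMBER().
select CreateDate, EntityID, PartNumber, CategoryName, [Subject], Title
from (
    select t3.*,
           row_number() over (partition by PartNumber
                              order by CreateDate desc) as rn
    from (
        -- t3 = the pivoted query from the question
        select EntityID, CreateDate,
               [53] as PartNumber, [28] as CategoryName,
               [42] as [Subject], [43] as Title
        from (
            select p.Value, p.PropertyDefID, p.EntityID, e.CreateDate
            from dbo.Property as p
            inner join dbo.Entity as e on p.EntityID = e.EntityId
            where p.PropertyDefID in (28, 42, 43, 53)
              and e.EntityClassID = 8
        ) src
        pivot (max(Value) for PropertyDefID in ([28], [42], [43], [53])) pvt
    ) t3
    where PartNumber is not null and PartNumber != ''
      and CategoryName = 'Drawing'
) ranked
where rn = 1
order by PartNumber;
```

Whether this beats the MAX(EntityID) subquery on a 23000-row property set would have to be checked against the actual execution plans.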