Get total sum in SQL Server for unknown location? - sql-server

There is an insurance policy, and each policy can be paid by 1-3 agents.
Line #1, for example: for policy ID 1, the agent whose ID is 100 paid 123.
Line #3, for example: for policy ID 3, the agent whose ID is 999 paid 741, and another agent whose ID is 100 paid 874.
(This is not how the data should be modeled, but it is how I have it.)
How can I find how much agent ID 100 has paid in total?
(123+541+874+557+471+552)
I have a very ugly solution based on unions.

In a well normalized model this is a simple query. You can 'normalize' in a CTE query then sum:
with cte as (
select agent1id as id, agent1sum as s
from insurance where agent1id is not null
union all
select agent2id as id, agent2sum as s
from insurance where agent2id is not null
union all
select agent3id as id, agent3sum as s
from insurance where agent3id is not null
)
select sum( s)
from cte
where id = 100
This approach is index-friendly if the table has indexes on the agent columns: an index-friendly query can use those indexes instead of a full table scan.

Something like
SELECT SUM(
CASE WHEN agent1id=100 THEN agent1sum ELSE 0 END +
CASE WHEN agent2id=100 THEN agent2sum ELSE 0 END +
CASE WHEN agent3id=100 THEN agent3sum ELSE 0 END)
FROM insurance
should aggregate it properly. If you need to do it for all agents, I'd use the agent table, or a CTE before this query that gets the distinct agent IDs, and use those IDs in place of the hard-coded 100 above.
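To get the totals for all agents at once, the "normalize, then sum" idea can be extended with a GROUP BY. The sketch below is only an illustration: it runs the SQL against an in-memory SQLite database from Python rather than against SQL Server, the `policyid` column name is an assumption, and the row for policy 2 is invented (only the payments for policies 1 and 3 are stated in the question).

```python
import sqlite3

# Hypothetical data: policies 1 and 3 follow the question, policy 2 is invented.
rows = [
    (1, 100, 123, None, None, None, None),
    (2, 200, 50, 100, 541, None, None),
    (3, 999, 741, 100, 874, None, None),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE insurance (policyid INTEGER,"
    " agent1id INTEGER, agent1sum INTEGER,"
    " agent2id INTEGER, agent2sum INTEGER,"
    " agent3id INTEGER, agent3sum INTEGER)"
)
conn.executemany("INSERT INTO insurance VALUES (?, ?, ?, ?, ?, ?, ?)", rows)

# Unpivot the three agent column pairs into (id, s) rows, then sum per agent.
totals = dict(conn.execute("""
    WITH cte AS (
        SELECT agent1id AS id, agent1sum AS s FROM insurance WHERE agent1id IS NOT NULL
        UNION ALL
        SELECT agent2id, agent2sum FROM insurance WHERE agent2id IS NOT NULL
        UNION ALL
        SELECT agent3id, agent3sum FROM insurance WHERE agent3id IS NOT NULL
    )
    SELECT id, SUM(s) FROM cte GROUP BY id
""").fetchall())

print(totals)
```

With this invented data, agent 100 is summed across all three column positions (123 + 541 + 874).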

Related

Adding conditions at the WHERE clause gives more results

I use SQL Server. I have a table with lots of columns, the important ones being:
· User_name
· Partition - Date in xxxx-xx-xx format
· Game - a string that works as an ID
· Credits - A number
· Bet - Another number
· Prize - Another number
· Num_Spins - Another number
I wrote a query to select the rows that interest me for a given date range.
Select distinct CONCAT(User_Name, DATALENGTH(User_Name)) as User_name, Partition, Game, Bet, Num_spins, Credits, Prize
from ***
where Partition>='2019-09-01' and Partition<'2019-11-17' and Bet>0 and credits is not null
and User_Name IN (Select distinct userName from *** where GeoIpCountryCode='ES')
I wish I could make that a view or something, but unfortunately I don't have the privileges to do so. Therefore, I use it as a subquery.
From those rows, I want to find the ones whose numbers satisfy a certain condition: (Credits+Bet-Prize) > 100000 and num_spins>5
Select user_name, partition, count(Game) as difMachines
FROM
(
Select distinct CONCAT(User_Name, DATALENGTH(User_Name)) as User_name, Partition, Game, Bet, Num_spins, Credits, Prize
from ***
where Partition>='2019-09-01' and Partition<'2019-11-17' and Bet>0 and credits is not null
and User_Name IN (Select distinct userName from *** where GeoIpCountryCode='ES')
) as A
where
(Credits+Bet-Prize) > 100000 and num_spins>5
group by User_Name, Partition;
Now I have all the information I need. I run the last query to group these results by week so I can analyze them:
Select datepart(week,Partition) as Week, count (distinct user_name) as Users
from (
Select user_name, partition, count(Game) as difMachines
FROM
(
Select distinct CONCAT(User_Name, DATALENGTH(User_Name)) as User_name, Partition, Game, Bet, Num_spins, Credits, Prize
from ***
where Partition>='2019-09-01' and Partition<'2019-11-17' and Bet>0 and credits is not null
and User_Name IN (Select distinct userName from *** where GeoIpCountryCode='ES')
) as A
where
(Credits+Bet-Prize) > 100000 and num_spins>5
group by User_Name, Partition
) as B
Where difMachines=1
group by datepart(week,Partition)
order by Week asc;
I know the query can be optimized, but that's not what troubles me. The problem is that when I run this query, I get 17050 users for week 36. If I replace the condition (Credits+Bet-Prize) > 100000 and num_spins>5 with (Credits+Bet-Prize) > 100000 (that is, I simply remove the num_spins>5 part), I get 16800 users instead.
To sum up, I get more results by being more restrictive in my query. That does not make sense to me. Can someone please help, or point me in the right direction?
Thank you
You are counting only the groups that pass the filter difMachines=1, aren't you? If you remove the num_spins>5 filter, more rows survive into each group, so some groups that had a count of 1 now have a count greater than 1 and are dropped by difMachines=1. Here is an example like yours:
declare @t table
(
[user_name] varchar(5), [partition] date, Game varchar(10),num_spins int
)
insert into @t
select 'a','20191101','g1',1
union all
select 'a','20191101','g1',2
union all
select 'a','20191101','g1',3
union all
select 'a','20191101','g1',4
union all
select 'a','20191101','g1',5
union all
select 'a','20191101','g1',6
union all
select 'b','20191101','g1',7
-- with num_spins>5 both 'a' and 'b' have cnt=1; without it only 'b' does
select * from
(
select [user_name],[partition],count(game) cnt
from @t
where num_spins>5
group by [user_name],[partition]
)a
where cnt=1
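The same effect can be reproduced outside the database. The following is a minimal Python sketch with hypothetical rows shaped like the example above (user 'a' with num_spins 1 to 6, user 'b' with num_spins 7); the helper name `users_with_single_row` is made up for illustration.

```python
from collections import Counter

# Hypothetical rows mirroring the example: (user_name, num_spins).
rows = [("a", n) for n in range(1, 7)] + [("b", 7)]


def users_with_single_row(rows, min_spins=None):
    """Filter rows first (like WHERE), count per user (like GROUP BY),
    then keep users whose count is exactly 1 (like the outer WHERE cnt=1)."""
    kept = [user for user, spins in rows if min_spins is None or spins > min_spins]
    counts = Counter(kept)
    return sorted(user for user, cnt in counts.items() if cnt == 1)


with_filter = users_with_single_row(rows, min_spins=5)
without_filter = users_with_single_row(rows)
print(with_filter)     # ['a', 'b']
print(without_filter)  # ['b']
```

The stricter inner filter shrinks user 'a' from six rows to one, so 'a' now passes the count = 1 test and the outer result grows.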

Find dup records with different extensions in SQL Server

I have a subscriptions table. Sample records:
SUBS_ID | SUBS Name
1 | SC FORM 124
2 | SC FORM 124-R
I need to find both records, as the subscription name is exactly the same apart from the -R extension.
Really bad throwaway code written straight here and untested, but...
with cte As (Select Name, Id
From Subs
Where Name Not Like '%-R'
)
Select cte.Id, cte.Name, M.Name
From Subs As M
Join cte
On cte.Name + '-R' = M.Name
You can use row_Number and partition by as below:
Select * from (
Select *, DupeRecords = Row_number() over(partition by replace([Subs Name],'-R','') order by Subs_Id)
from #yoursubs
) a Where a.DupeRecords > 1
Based on your latest criteria:
So, in the above example when I query the table I should get all 3
records ...the first one being the base record and the remaining 2
being the extensions – SQL User 17 mins ago
SELECT distinct
0 as Subs_ID
, CASE WHEN SUBS_Name like '%-%' THEN left(SUBS_Name,charindex('-',SUBS_Name)-1) ELSE SUBS_Name END AS SUB_NAME_MAIN
, '' as Extension
FROM
subs
UNION
SELECT
Subs_ID
, CASE WHEN SUBS_Name like '%-%' THEN left(SUBS_Name,charindex('-',SUBS_Name)-1) ELSE SUBS_Name END AS SUB_NAME_MAIN
, CASE WHEN SUBS_Name like '%-%' THEN RIGHT(SUBS_Name, LEN(SUBS_Name) - charindex('-',SUBS_Name)+1) ELSE '' END AS Extension
FROM
subs
will produce the following result: a 'Master' row that is given an arbitrary ID number of '0', followed by each row of that master's family and its extension.
Subs_ID SUB_NAME_MAIN Extension
----------- -------------------- --------------------
0 SC FORM 124
1 SC FORM 124
2 SC FORM 124 -R

Complex grouping algorithm with combinations in Sql server

I have a complex grouping problem. I need to solve it on SQL Server 2005, but a solution that works on a more recent release is OK (we will upgrade soon).
test table:
CREATE TABLE [dbo].[testGrouping](
[Family] [varchar](50) NOT NULL,
[Person] [varchar](50) NOT NULL,
[transNr] [int] NOT NULL,
[Amount] [numeric](6, 2) NOT NULL,
[ExpectedGroup] [int] NULL
)
test data
INSERT INTO [testGrouping]([Family],[Person],[transNr],[Amount],[ExpectedGroup])
SELECT 'f1','p1',1, 10.00,1
union SELECT 'f1','p1',2 , -9.00,1
union SELECT 'f1','p2',3 , -1.00,1
union SELECT 'f2','p3',4 , 50.00,2
union SELECT 'f2','p4',5 ,-50.01,2
union SELECT 'f2','p5',6 ,-30.00,3
union SELECT 'f2','p5',7 , 20.00,3
union SELECT 'f2','p5',8 , 10.00,3
union SELECT 'f3','p7',9 , -1.00,4
union SELECT 'f3','p7',10, -2.00,4
union SELECT 'f3','p7',11, -6.00,null
union SELECT 'f3','p9',12, 2.00,null
union SELECT 'f3','p7',13, 3.00,4
union SELECT 'f2','p6',14,100.00,null
Now the problem. ExpectedGroup starts as null; I must fill it with my code.
The requirement is to identify groups of records with ABS(sum(amount)) <= 0.01
In detail:
I can group 2 or more records of the same "person";
After grouping by persons, I can search groups in the same "family"
Each record can belong to 1 group only
Each person can belong to 1 family only
Records that cannot be grouped have group = null
A group can have more than 2 records (and that's the real challenge!)
In the real data each "Family" can have up to 200 records, and each "Person" can have up to 10 records.
Amount is always <> 0
Explanation of grouping in sample data:
Group 1:
Include all the records of family f1 because no partial combination of person in that family has ABS(sum(amount)) <= 0.01
Group 2:
Persons p3 and p4 have ABS(sum(amount)) <= 0.01.
Group 3:
Person p5 has ABS(sum(amount)) <= 0.01. So family f2 is divided into 2 groups, and a single record (transNr 14) has no group.
Group 4:
In family f3 you could group transNr 10 and 12, but there is a group that belongs to person p7 only, therefore it has higher priority.
I could easily find groups like Group 1
select family, sum(amount) from testGrouping group by Family HAVING ABS(sum(amount)) <= 0.01
and also group 3
select person, sum(amount) from testGrouping group by Person HAVING ABS(sum(amount)) <= 0.01
But other cases are trickier (see family f2: there are several ways to construct groups there; grouping by p5 is trivial, but the other records are not so easy).
My idea in pseudo code is:
-- process the easy cases...
group by person, set a group number to persons having ABS(sum(amount)) <= 0.01
group by family, set a group number to families having ABS(sum(amount)) <= 0.01
-- process the remaining records
For each person
    Generate all combinations of not grouped records of that person
    For each combination of records
        IF ABS(sum(amount)) of the combination <= 0.01 THEN
            Assign a group to records of the combination
            Recalculate the combinations (we have less records to work with)
        END IF
    Next combination
Next person
For each family
    Generate all combinations of not grouped records of that family
    For each combination of records
        IF ABS(sum(amount)) of the combination <= 0.01 THEN
            Assign a group to records of the combination
            Recalculate the combinations (we have less records to work with)
        END IF
    Next combination
Next family
(At each step I can use only records not assigned to a group in previous steps; each "For each" translates into a cursor.)
My questions are:
Can you suggest a better algorithm? (The solution must be SQL only, but describing it in pseudocode is OK.) I think my pseudocode translates into spaghetti code of nested loops, cursors, GOTOs and other ugly constructs. (Performance is not critical; a few minutes to process about 10,000 records is acceptable.)
How can I implement the "Generate all combinations" part? In the sample, for family f2 I should try all possible groups of 2 records, then all possible groups of 3, and so on until all combinations have been tested. transNr is the unique record ID.
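To make the "Generate all combinations" step concrete, here is a minimal sketch of the enumeration order described above (all groups of 2, then all groups of 3, and so on). It is written in Python rather than SQL, so it only illustrates the search and is not the SQL-only solution the question asks for. The function name `find_groups` and the greedy "take the first matching combination" rule are assumptions; the search is exponential in the worst case, so it may not be practical for a family of 200 records.

```python
from decimal import Decimal
from itertools import combinations

TOLERANCE = Decimal("0.01")


def find_groups(records):
    """Greedily pick subsets whose amounts sum to within TOLERANCE of zero.

    records: list of (transNr, amount) pairs for one person or one family.
    Smaller subsets are tried first and each record is used at most once.
    Returns (list of groups as transNr lists, list of ungrouped transNr).
    """
    remaining = list(records)
    groups = []
    size = 2
    while size <= len(remaining):
        match = next(
            (combo for combo in combinations(remaining, size)
             if abs(sum(amount for _, amount in combo)) <= TOLERANCE),
            None,
        )
        if match is None:
            size += 1  # nothing of this size balances, try larger subsets
        else:
            groups.append([nr for nr, _ in match])
            # fewer records are left, so combinations are recomputed next pass
            remaining = [rec for rec in remaining if rec not in match]
    return groups, [nr for nr, _ in remaining]


# Records of person p7 from the sample data.
p7 = [(9, Decimal("-1.00")), (10, Decimal("-2.00")),
      (11, Decimal("-6.00")), (13, Decimal("3.00"))]
print(find_groups(p7))  # ([[9, 10, 13]], [11])
```

Amounts are held as `Decimal` so that a sum such as 50.00 - 50.01 compares exactly against the 0.01 tolerance instead of relying on binary floating point.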

Left outer join for first row in group only

I have a table that looks like this:
BANK       ACCOUNT_NAME  EXCESS  DEBT
Acme Bank  Checking1     500     300
Acme Bank  Personal      200     100
Bank One   Business      100     50
I need a SQL query that returns:
BANK       ACCOUNT_NAME  EXCESS  DEBT  AVAILABLE
Acme Bank  Checking1     500     300   300
Acme Bank  Personal      200     100   NULL
Bank One   Business      100     50    50
AVAILABLE would be the Sum(EXCESS) - Sum(DEBT) grouped by BANK. AVAILABLE would then appear only on the first row of BANK-ACCOUNT_NAME combination. How do I do this?
My first attempt results in AVAILABLE having values on all rows, which is not intended. I only want the first row in the group to have an AVAILABLE value.
-- OUTER and INNER are reserved words, so short aliases are used instead
SELECT
o.BANK
,o.ACCOUNT_NAME
,o.EXCESS
,o.DEBT
,i2.AVAILABLE
FROM BankBalances AS o
CROSS APPLY
(
SELECT TOP 1
Bank
,SUM(EXCESS) - SUM(DEBT) AS AVAILABLE
FROM BankBalances AS i
WHERE o.BANK = i.BANK
GROUP BY Bank
) AS i2
You can use the following query:
SELECT BANK, ACCOUNT_NAME, EXCESS, DEBT,
CASE WHEN ROW_NUMBER() OVER (PARTITION BY BANK ORDER BY ACCOUNT_NAME) = 1
THEN SUM(EXCESS) OVER (PARTITION BY BANK) -
SUM(DEBT) OVER (PARTITION BY BANK)
ELSE NULL
END AS AVAILABLE
FROM BankBalances
You can use windowed version of SUM in order to avoid CROSS APPLY. ROW_NUMBER is simply used to check for first row.
I have made the assumption that first row is considered the one having the 'minimum' ACCOUNT_NAME value within each BANK partition.
You can use ROW_NUMBER and SUM() OVER with PARTITION BY, like this:
;WITH CTE AS
(
SELECT
BANK
,ACCOUNT_NAME
,EXCESS
,DEBT
,SUM(EXCESS - DEBT) OVER(PARTITION BY BANK) AS AVAILABLE
,ROW_NUMBER()OVER(PARTITION BY BANK ORDER BY ACCOUNT_NAME ASC) rn
FROM BankBalances
)
SELECT BANK
,ACCOUNT_NAME
,EXCESS
,DEBT
,CASE WHEN rn = 1 THEN AVAILABLE ELSE null end as AVAILABLE
FROM CTE
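The window-function approach can be checked against the sample rows from the question. This is a rough sketch that runs the query on an in-memory SQLite database from Python, not on SQL Server; it assumes a SQLite build with window-function support (3.25 or later), and, as in the answers above, it treats the row with the lowest ACCOUNT_NAME as the "first" row of each bank.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE BankBalances"
    " (BANK TEXT, ACCOUNT_NAME TEXT, EXCESS INTEGER, DEBT INTEGER)"
)
conn.executemany(
    "INSERT INTO BankBalances VALUES (?, ?, ?, ?)",
    [
        ("Acme Bank", "Checking1", 500, 300),
        ("Acme Bank", "Personal", 200, 100),
        ("Bank One", "Business", 100, 50),
    ],
)

# AVAILABLE is the per-bank total, shown only on the first row of each bank;
# a CASE without ELSE yields NULL (None in Python) for the other rows.
result = conn.execute("""
    SELECT BANK, ACCOUNT_NAME, EXCESS, DEBT,
           CASE WHEN ROW_NUMBER() OVER (PARTITION BY BANK ORDER BY ACCOUNT_NAME) = 1
                THEN SUM(EXCESS - DEBT) OVER (PARTITION BY BANK)
           END AS AVAILABLE
    FROM BankBalances
    ORDER BY BANK, ACCOUNT_NAME
""").fetchall()

for row in result:
    print(row)
```

The three printed rows carry 300, None and 50 in the AVAILABLE position, matching the desired output in the question.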

Identify unique values from 2 rows

There can be multiple account names assigned to one account number. Since there are a million rows in the DB, I want to find out how to query the account numbers that have only one account name assigned. Please see the sample data below:
Account # 100, 100, 500, 650, 250, 250, 600, 400, 400
Account Name ABA, DSA, ABA, DSA, ABA, DSA, DSA, ABA, ABA
The result of the query should be account # 500, 650, 400, because these account numbers have only one account name assigned to them.
Accounts 100 and 250 have multiple account names assigned to them. How do I filter for account numbers with only one account name assigned? Please help.
GROUP BY and HAVING is the way to go as stated in a comment above.
select account_num
from your_table
group by account_num
having COUNT(*) = 1
Some people will frown on the * in COUNT(), but if this is an ad hoc, non-production query, I don't think it's a big deal in terms of performance.
Based on the question, I believe the query should return 400,500,600,650.
Using a query with the clause of having count(*) =1 would not return 400 based on the data given in the question.
create table #Accounts (
AccountNumber int not null
, AccountName char(3) not null
)
insert into #Accounts (AccountNumber,AccountName) values
(100,'ABA')
,(100,'DSA')
,(500,'ABA')
,(650,'DSA')
,(250,'ABA')
,(250,'DSA')
,(600,'DSA')
,(400,'ABA')
,(400,'ABA');
with cte as (
select AccountNumber
, AccountName
, RowNumber= row_number() over (
partition by AccountNumber
order by AccountName
)
from #Accounts
group by AccountNumber, AccountName
)
select AccountNumber
, AccountName
from cte
where not exists (
select 1
from cte as i
where i.AccountNumber = cte.AccountNumber
and i.RowNumber = 2
);
This will return the account number and name where the account number is linked to only one name.
NOTE: In your example data, account number 400 is linked to the same name twice...
I'm not sure if that should count as linked to one name or not...
This will include AccountNo 400
SELECT DISTINCT AccountNumber, AccountName
FROM #Accounts a
WHERE (
SELECT COUNT(DISTINCT AccountName)
FROM #Accounts b
WHERE a.AccountNumber = b.AccountNumber) = 1
This will NOT include Account No 400 (No DISTINCT in Sub Query)
SELECT DISTINCT AccountNumber, AccountName
FROM #Accounts a
WHERE (
SELECT COUNT(AccountName)
FROM #Accounts b
WHERE a.AccountNumber = b.AccountNumber) = 1
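The difference between the two counting rules can also be expressed with GROUP BY and HAVING. The sketch below loads the sample data from the question into an in-memory SQLite database from Python (so it is an illustration, not a SQL Server test); the table name `Accounts` and the variable names are placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Accounts (AccountNumber INTEGER, AccountName TEXT)")
conn.executemany(
    "INSERT INTO Accounts VALUES (?, ?)",
    [(100, "ABA"), (100, "DSA"), (500, "ABA"), (650, "DSA"), (250, "ABA"),
     (250, "DSA"), (600, "DSA"), (400, "ABA"), (400, "ABA")],
)

# One distinct name per account number: the duplicated (400, 'ABA') row still counts once.
one_distinct_name = [r[0] for r in conn.execute("""
    SELECT AccountNumber FROM Accounts
    GROUP BY AccountNumber
    HAVING COUNT(DISTINCT AccountName) = 1
    ORDER BY AccountNumber
""")]

# Exactly one row per account number: 400 is excluded because it appears twice.
one_row_only = [r[0] for r in conn.execute("""
    SELECT AccountNumber FROM Accounts
    GROUP BY AccountNumber
    HAVING COUNT(*) = 1
    ORDER BY AccountNumber
""")]

print(one_distinct_name)  # [400, 500, 600, 650]
print(one_row_only)       # [500, 600, 650]
```

Which of the two lists is wanted depends on whether the repeated 400/ABA row should count as one assignment or two.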
