Formatting data in sql - sql-server

I have few tables and basically I'm working out on telerik reports. The structure and the sample data I have is given below:
IF EXISTS(SELECT 1 FROM sys.tables WHERE object_id = OBJECT_ID('Leave'))
BEGIN;
DROP TABLE [Leave];
END;
GO
IF EXISTS(SELECT 1 FROM sys.tables WHERE object_id = OBJECT_ID('Addition'))
BEGIN;
DROP TABLE [Addition];
END;
GO
IF EXISTS(SELECT 1 FROM sys.tables WHERE object_id = OBJECT_ID('Deduction'))
BEGIN;
DROP TABLE [Deduction];
END;
GO
IF EXISTS(SELECT 1 FROM sys.tables WHERE object_id = OBJECT_ID('EmployeeInfo'))
BEGIN;
DROP TABLE [EmployeeInfo];
END;
GO
CREATE TABLE [EmployeeInfo] (
[EmpID] INT NOT NULL PRIMARY KEY,
[EmployeeName] VARCHAR(255)
);
CREATE TABLE [Addition] (
[AdditionID] INT NOT NULL PRIMARY KEY,
[AdditionType] VARCHAR(255),
[Amount] VARCHAR(255),
[EmpID] INT FOREIGN KEY REFERENCES EmployeeInfo(EmpID)
);
CREATE TABLE [Deduction] (
[DeductionID] INT NOT NULL PRIMARY KEY,
[DeductionType] VARCHAR(255),
[Amount] VARCHAR(255),
[EmpID] INT FOREIGN KEY REFERENCES EmployeeInfo(EmpID)
);
CREATE TABLE [Leave] (
[LeaveID] INT NOT NULL PRIMARY KEY,
[LeaveType] VARCHAR(255) NULL,
[DateFrom] VARCHAR(255),
[DateTo] VARCHAR(255),
[Approved] Binary,
[EmpID] INT FOREIGN KEY REFERENCES EmployeeInfo(EmpID)
);
GO
INSERT INTO EmployeeInfo([EmpID], [EmployeeName]) VALUES
(1, 'Marcia'),
(2, 'Lacey'),
(3, 'Fay'),
(4, 'Mohammad'),
(5, 'Mike')
INSERT INTO Addition([AdditionID], [AdditionType], [Amount], [EmpID]) VALUES
(1, 'Bonus', '2000', 2),
(2, 'Increment', '5000', 5)
INSERT INTO Deduction([DeductionID], [DeductionType], [Amount], [EmpID]) VALUES
(1, 'Late Deductions', '2000', 4),
(2, 'Delayed Project Completion', '5000', 1)
INSERT INTO Leave([LeaveID],[LeaveType],[DateFrom],[DateTo], [Approved], [EmpID]) VALUES
(1, 'Annual Leave','2018-01-08 04:52:03','2018-01-10 20:30:53', 1, 1),
(2, 'Sick Leave','2018-02-10 03:34:41','2018-02-14 04:52:14', 1, 2),
(3, 'Casual Leave','2018-01-04 11:06:18','2018-01-05 04:11:00', 1, 3),
(4, 'Annual Leave','2018-01-17 17:09:34','2018-01-21 14:30:44', 1, 4),
(5, 'Casual Leave','2018-01-09 23:31:16','2018-01-12 15:11:17', 1, 3),
(6, 'Annual Leave','2018-02-16 18:01:03','2018-02-19 17:16:04', 1, 2)
The query I am using to get the output is something like this:
SELECT Info.EmployeeName, Addition.AdditionType, Addition.Amount, Deduction.DeductionType, Deduction.Amount,
Leave.LeaveType,
SUM(DATEDIFF(Day, Leave.DateFrom, Leave.DateTo)) [#OfLeaves],
DatePart(MONTH, Leave.DateFrom)
FROM EmployeeInfo Info
LEFT JOIN Leave
ON Info.EmpID = Leave.EmpID
LEFT JOIN Addition
ON Info.EmpID = Addition.EmpID
LEFT JOIN Deduction
ON Info.EmpID = Deduction.EmpID
WHERE Approved = 1
GROUP BY Info.EmployeeName, Addition.AdditionType, Addition.Amount, Deduction.DeductionType, Deduction.Amount,
Leave.LeaveType,
DatePart(MONTH, Leave.DateFrom)
I actually want to get the output which I could be able to show on the report but somehow as I'm using joins the data is repeating on multiple rows for same user and that's why it's also appearing multiple times on the report.
The output I am getting is something like this
Fay NULL NULL NULL NULL Casual Leave 4 1
Lacey Bonus 2000 NULL NULL Annual Leave 3 2
Lacey Bonus 2000 NULL NULL Sick Leave 4 2
Marcia NULL NULL Delayed Project Completion 5000 Annual Leave 2 1
Mohammad NULL NULL Late Deductions 2000 Annual Leave 4 1
Although what I want it looks something like this:
Fay NULL NULL NULL NULL Casual Leave 4 1
Lacey Bonus 2000 NULL NULL Annual Leave 3 2
Lacey NULL NULL NULL NULL Sick Leave 4 2
Marcia NULL NULL Delayed Project Completion 5000 Annual Leave 2 1
Mohammad NULL NULL Late Deductions 2000 Annual Leave 4 1
As there was only one bonus and it was not allocated multiple times than it should appear one time. I am stuck in formatting the table layout so I think I might able to get a hint in formatting the output in query so I won't have to do there.
Best,

My own recommendation on this case is to change the left joins to a single table in the following way:
select
info.employeename, additiontype, additionamount, deductiontype, deductionamount, leavetype, #ofleaves, leavemth
from Employeeinfo info
join
(
Select
Leave.empid, null as additiontype, null as additionamount, null as deductiontype, null as deductionamount, leave.leavetype, DATEDIFF(Day, Leave.DateFrom, Leave.DateTo) [#OfLeaves], DatePart(MONTH, DateFrom) leavemth
from leave
where approved = 1
Union all
Select
Addition.empid, additiontype, amount, null, null, null, null, null
From addition
Union all
Select empid, null, null, deductiontype, amount, null, null, null
From deduction
) payadj on payadj.empid= info.empid
This approach separates the different pay adjustments into the different columns and also ensures that you don't get the double ups where this joins add multiple employee IDs.
You might need to explicitly name all the null columns for each Union - I haven't tested it, but I thought you only need to name the columns in a union all once.
The output comes in the format below;
employeename bonus leavetype
Lacey 2000 null
Lacey null Sick Leave
Lacey null Annual Leave
Rather than type out the full result set here is a link to sqlfiddle;
http://sqlfiddle.com/#!18/935e9/5/0

The problem you're facing is based on how you are joining the tables together. It's not syntax that's necessarily wrong but how we look at the data and how we understand the relationships between the tables. When doing the LEFT JOINs your query is able to find EmpIDs in each table and it is happy with that and grabs the records (or returns NULL if there are no records matching the EmpID). That isn't really what you're looking for since it can join too much together. So let's see why this is happening. If we take out the join to the Addition table your results would look like this:
Fay NULL NULL Casual Leave 4 1
Lacey NULL NULL Annual Leave 3 2
Lacey NULL NULL Sick Leave 4 2
Marcia Delayed Project Completion 5000 Annual Leave 2 1
Mohammad Late Deductions 2000 Annual Leave 4 1
You are still left with two rows for Lacey. The reason for these two rows is because of the join to the Leave table. Lacey has taken two leaves of absence. One for Sick Leave and the other for Annual Leave. Both of those records share the same EmpID of 2. So when you join to the Addition table (and/or to the rest of the tables) on EmpID the join looks for all matching records to complete that join. There's a single Addition record that matches two Leave records joined on EmpID. Thus, you end up with two Bonus results--the same Addition record for the two Leave records. Try running this query and check the results, it should also illustrate the problem:
SELECT l.LeaveType, l.EmpID, a.AdditionType, a.Amount
FROM Leave l
LEFT JOIN Addition a ON a.EmpID = l.EmpID
The results using your provided data would be:
Annual Leave 1 NULL NULL
Sick Leave 2 Bonus 2000
Casual Leave 3 NULL NULL
Annual Leave 4 NULL NULL
Casual Leave 3 NULL NULL
Annual Leave 2 Bonus 2000
So the data itself isn't wrong. It's just that when joining on EmpID in this way the relationships may be confusing.
So the problem is the relationship between the Leave table and the others. It doesn't make sense to join Leave to the Addition or Deduction tables directly on EmpID because it may look as though Lacey received a bonus for each leave of absence for example. This is what you are experiencing here.
I would suggest three separate queries (and potentially three reports). One to return the leave of absence data and the others for the Addition and Deduction data. Something like:
--Return each employee's leaves of absence
SELECT e.EmployeeName
, l.LeaveType
, SUM(DATEDIFF(Day, l.DateFrom, l.DateTo)) [#OfLeaves]
, DatePart(MONTH, l.DateFrom)
FROM EmployeeInfo e
LEFT JOIN Leave l ON e.EmpID = l.EmpID
WHERE l.Approved = 1
--Return each employee's Additions
SELECT e.EmployeeName
, a.AdditionType
, a.Amount
FROM EmployeeInfo e
LEFT JOIN Addition a ON e.EmpID = a.EmpID
--Return each employee's Deductions
SELECT e.EmployeeName
, d.DeductionType
, d.Amount
FROM EmployeeInfo e
LEFT JOIN Deduction d ON e.EmpID = d.EmpID
Having three queries should better represent the relationship the EmployeeInfo table has with each of the others and separate concerns. From there you can GROUP BY the different types of data and aggregate the values and get total counts and sums.
Here are some resources which may help if you hadn't found these already:
Explanation of SQL Joins: https://blog.codinghorror.com/a-visual-explanation-of-sql-joins/
SQL Join Examples: https://www.w3schools.com/sql/sql_join.asp
Telerik Reporting Documentation: https://docs.telerik.com/reporting/overview

Related

How to convert rows into columns

I have a table named Shift and I am defining OffDays in this table using below mentioned columns.
CREATE TABLE [tbl_Shift](
[OffDay1] [nvarchar](25) NOT NULL CONSTRAINT [DF_tbl_Shift_OffDay1] DEFAULT (N'Sunday'),
[IsAlternateOffDay2] [bit] NULL,
[OffDay2] [nvarchar](25) NULL
)
INSERT INTO [tbl_Shift] VALUES ('Sunday', 'True', 'Saturday')
AlternateOffDay is bit so if it is True than OffDay2 can be defined.
I want the result to be shown like below in case I have OffDay2 as Saturday.
Holidays
----------
Sunday
Saturday
I have tried this but the result comes up in 2 columns and 1 row and also it skips 2nd if first is not null but that's not the problem. I just want them to be in 2 rows.
Select DISTINCT ISNULL(OffDay1,OffDay2) from [HRM].[tbl_Shift]
In this case, the simplest solution is to use a UNION (which implicitly does a DISTINCT):
(simplified, just add any WHERE clause as necessary to e.g. ignore OffDay2 where the flag is not set)
SELECT OffDay1 FROM [HRM].[tbl_Shift]
UNION
SELECT OffDay2 FROM [HRM].[tbl_Shift]
Alternatively, you could look into UNPIVOT which is used for switching column values to multiple rows. Something like this:
SELECT DISTINCT TheDay
FROM
(SELECT OffDay1, OffDay2
FROM tbl_Shift) p
UNPIVOT
(TheDay FOR EachDay IN
(OffDay1, OffDay2)
)AS unpvt;

Create table 6 x 6 with automatic spill from upline

I am creating a code to company MMN. the idea is a system which has a table 6 x 6 with automatic spill.
For example.
I register 6 new persons.
John
Peter
Mary
Lary
Anderson
Paul
When I register my 7th the system automatic follow the order below me and put into John network. When I register the 8th the system automatic follow the order below me and put into Peter network.
Table 6 x 6
Firt level: 6
Second level: 36
I am trying to creating a test with stored procedure in sqlserver.
I am stuck in the part how I can do automatically put the new person registered to below me when I reach the limit of the table.
Creating a Matrix would be denormalizing your data. It is usually best practice NOT to do this, as it makes data manipulation a lot more difficult, among other reasons. How would you prevent the rows from being more than 6? You'd have to add a weird constraint like so:
create table #matrix ( ID int identity(1,1),
Name1 varchar(64),
Name2 varchar(64),
Name3 varchar(64),
Name4 varchar(64),
Name5 varchar(64),
Name6 varchar(64),
CONSTRAINT ID_PK PRIMARY KEY (ID),
CONSTRAINT Configuration_SixRows CHECK (ID <= 6))
I'm betting you aren't doing this, and thus, you can't "ensure" no more than 6 rows is inserted into your table. If you are doing this, then you'd have to insert data one row at a time which goes against everything SQL Server is about. This would be to check if the first column is full yet, then move to the second, then the third, etc... it just doesn't make sense.
Instead, I would create a ParentID column to relate your names to their respective network as you stated. This can be done with a computed column like so:
declare #table table (ID int identity(1,1),
Names varchar(64),
ParentID as case
when ID <= 6 then null
else replace(ID % 6,0,6)
end)
insert into #table
values
('John')
,('Peter')
,('Mary')
,('Lary')
,('Anderson')
,('Paul')
,('Seven')
,('Eight')
,('Nine')
,('Ten')
,('Eleven')
,('Twelve')
,('Thirteen')
,('Fourteen')
select * from #table
Then, if you wanted to display it in a matrix you would use PIVOT(), specifically Dynamic Pivot. There are a lot of examples on Stack Overflow on how to do this. This also accounts for if you want the matrix to be larger than 6 X N... perhaps the network grows so each member has 50 individuals... thus 6 (rows) X 51 (columns)
IF it's only going to be 6 columns, or not many more, then you can also use a simple join logic...
select
t.ID
,t.Names
,t2.Names
,t3.Names
from #table t
left join
#table t2 on t2.ParentID = t.ID and t2.ID = t.ID + 6
left join
#table t3 on t3.ParentID = t.ID and t3.ID = t.ID + 12
--continue on
where
t.ParentID is null
You can see this in action with This OnLine DEMO
Here is some information on normalization

SQL Function to return sequential id's

Consider this simple INSERT
INSERT INTO Assignment (CustomerId,UserId)
SELECT CustomerId,123 FROM Customers
That will obviously assign UserId=123 to all customers.
What I need to do is assign them to 3 userId's sequentially, so 3 users get one third of the accounts equally.
INSERT INTO Assignment (CustomerId,UserId)
SELECT CustomerId,fnGetNextId() FROM Customers
Could I create a function to return sequentially from a list of 3 ID's?, i.e. each time the function is called it returns the next one in the list?
Thanks
Could I create a function to return sequentially from a list of 3 ID's?,
If you create a SEQUENCE, then you can assign incremental numbers with the NEXT VALUE FOR (Transact-SQL) expression.
This is a strange requirement, but the modulus operator (%) should help you out without the need for functions, sequences, or altering your database structure. This assumes that the IDs are integers. If they're not, you can use ROW_NUMBER or a number of other tactics to get a distinct number value for each customer.
Obviously, you would replace the SELECT statement with an INSERT once you're satisfied with the code, but it's good practice to always select when developing before inserting.
SETUP WITH SAMPLE DATA:
DECLARE #Users TABLE (ID int, [Name] varchar(50))
DECLARE #Customers TABLE (ID int, [Name] varchar(50))
DECLARE #Assignment TABLE (CustomerID int, UserID int)
INSERT INTO #Customers
VALUES
(1, 'Joe'),
(2, 'Jane'),
(3, 'Jon'),
(4, 'Jake'),
(5, 'Jerry'),
(6, 'Jesus')
INSERT INTO #Users
VALUES
(1, 'Ted'),
(2, 'Ned'),
(3, 'Fred')
QUERY:
SELECT C.Name AS [CustomerName], U.Name AS [UserName]
FROM #Customers C
JOIN #Users U
ON
CASE WHEN C.ID % 3 = 0 THEN 1
WHEN C.ID % 3 = 1 THEN 2
WHEN C.ID % 3 = 2 THEN 3
END = U.ID
You would change the THEN 1 to whatever your first UserID is, THEN 2 with the second UserID, and THEN 3 with the third UserID. If you end up with another user and want to split the customers 4 ways, you would do replace the CASE statement with the following:
CASE WHEN C.ID % 4 = 0 THEN 1
WHEN C.ID % 4 = 1 THEN 2
WHEN C.ID % 4 = 2 THEN 3
WHEN C.ID % 4 = 3 THEN 4
END = U.ID
OUTPUT:
CustomerName UserName
-------------------------------------------------- --------------------------------------------------
Joe Ned
Jane Fred
Jon Ted
Jake Ned
Jerry Fred
Jesus Ted
(6 row(s) affected)
Lastly, you will want to select the IDs for your actual insert, but I selected the names so the results are easier to understand. Please let me know if this needs clarification.
Here's one way to produce Assignment as an automatically rebalancing view:
CREATE VIEW dbo.Assignment WITH SCHEMABINDING AS
WITH SeqUsers AS (
SELECT UserID, ROW_NUMBER() OVER (ORDER BY UserID) - 1 AS _ord
FROM dbo.Users
), SeqCustomers AS (
SELECT CustomerID, ROW_NUMBER() OVER (ORDER BY CustomerID) - 1 AS _ord
FROM dbo.Customers
)
-- INSERT Assignment(CustomerID, UserID)
SELECT SeqCustomers.CustomerID, SeqUsers.UserID
FROM SeqUsers
JOIN SeqCustomers ON SeqUsers._ord = SeqCustomers._ord % (SELECT COUNT(*) FROM SeqUsers)
;
This shifts assignments around if you insert a new user, which could be quite undesirable, and it's also not efficient if you had to JOIN on it. You can easily repurpose the query it contains for one-time inserts (the commented-out INSERT). The key technique there is joining on ROW_NUMBER()s.

Options for indexing a view with cte

I have a view for which I want to create an Indexed view. After a lot of energy I was able to put the sql query in place for the view and It looks like this -
ALTER VIEW [dbo].[FriendBalances] WITH SCHEMABINDING as
WITH
trans (Amount,PaidBy,PaidFor, Id) AS
(SELECT Amount,userid AS PaidBy, PaidForUsers_FbUserId AS PaidFor, Id FROM dbo.Transactions
FULL JOIN dbo.TransactionUser ON dbo.Transactions.Id = dbo.TransactionUser.TransactionsPaidFor_Id),
bal (PaidBy,PaidFor,Balance) AS
(SELECT PaidBy,PaidFor, SUM( Amount/ transactionCounts.[_count]) AS Balance FROM trans
JOIN (SELECT Id,COUNT(*)AS _count FROM trans GROUP BY Id) AS transactionCounts ON trans.Id = transactionCounts.Id AND trans.PaidBy <> trans.PaidFor
GROUP BY trans.PaidBy,trans.PaidFor )
SELECT ISNULL(bal.PaidBy,bal2.PaidFor)AS PaidBy,ISNULL(bal.PaidFor,bal2.PaidBy)AS PaidFor,
ISNULL( bal.Balance,0)-ISNULL(bal2.Balance,0) AS Balance
FROM bal
left JOIN bal AS bal2 ON bal.PaidBy = bal2.PaidFor AND bal.PaidFor = bal2.Paidby
WHERE ISNULL( bal.Balance,0)>ISNULL(bal2.Balance,0)
Sample Data for FriendBalances View -
PaidBy PaidFor Balance
------ ------- -------
9990 9991 1000
9990 9992 2000
9990 9993 1000
9991 9993 1000
9991 9994 1000
It is mainly a join of 2 tables.
Transactions -
CREATE TABLE [dbo].[Transactions](
[Id] [int] IDENTITY(1,1) NOT NULL,
[Date] [datetime] NOT NULL,
[Amount] [float] NOT NULL,
[UserId] [bigint] NOT NULL,
[Remarks] [nvarchar](255) NULL,
[GroupFbGroupId] [bigint] NULL,
CONSTRAINT [PK_Transactions] PRIMARY KEY CLUSTERED
Sample data in Transactions Table -
Id Date Amount UserId Remarks GroupFbGroupId
-- ----------------------- ------ ------ -------------- --------------
1 2001-01-01 00:00:00.000 3000 9990 this is a test NULL
2 2001-01-01 00:00:00.000 3000 9990 this is a test NULL
3 2001-01-01 00:00:00.000 3000 9991 this is a test NULL
TransactionUsers -
CREATE TABLE [dbo].[TransactionUser](
[TransactionsPaidFor_Id] [bigint] NOT NULL,
[PaidForUsers_FbUserId] [bigint] NOT NULL
) ON [PRIMARY]
Sample Data in TransactionUser Table -
TransactionsPaidFor_Id PaidForUsers_FbUserId
---------------------- ---------------------
1 9991
1 9992
1 9993
2 9990
2 9991
2 9992
3 9990
3 9993
3 9994
Now I am not able to create a view because my query contains cte(s). What are the options that I have now?
If cte can be removed, what should be the other option which would help in creating indexed views.
Here is the error message -
Msg 10137, Level 16, State 1, Line 1 Cannot create index on view "ShareBill.Test.Database.dbo.FriendBalances" because it references common table expression "trans". Views referencing common table expressions cannot be indexed. Consider not indexing the view, or removing the common table expression from the view definition.
The concept:
Transaction mainly consists of:
an Amount that was paid
UserId of the User who paid that amount
and some more information which is not important for now.
TransactionUser table is a mapping between a Transaction and a User Table. Essentially a transaction can be shared between multiple persons. So we store that in this table.
So we have transactions where 1 person is paying for it and other are sharing the amount. So if A pays 100$ for B then B would owe 100$ to A. Similarly if B pays 90$ for A then B would owe only $10 to A. Now if A pays 300$ for A,b,c that means B would owe 110$ and C would owe 10$ to A.
So in this particular view we are aggregating the effective amount that has been paid (if any) between 2 users and thus know how much a person owes another person.
Okay, this gives you an indexed view (that needs an additional view on top of to sort out the who-owes-who detail), but it may not satisfy your requirements still.
/* Transactions table, as before, but with handy unique constraint for FK Target */
CREATE TABLE [dbo].[Transactions](
[Id] [int] IDENTITY(1,1) NOT NULL,
[Date] [datetime] NOT NULL,
[Amount] [float] NOT NULL,
[UserId] [bigint] NOT NULL,
[Remarks] [nvarchar](255) NULL,
[GroupFbGroupId] [bigint] NULL,
CONSTRAINT [PK_Transactions] PRIMARY KEY CLUSTERED (Id),
constraint UQ_Transactions_XRef UNIQUE (Id,Amount,UserId)
)
Nothing surprising so far, I hope
/* Much expanded TransactionUser table, we'll hide it away and most of the maintenance is automatic */
CREATE TABLE [dbo]._TransactionUser(
[TransactionsPaidFor_Id] int NOT NULL,
[PaidForUsers_FbUserId] [bigint] NOT NULL,
Amount float not null,
PaidByUserId bigint not null,
UserCount int not null,
LowUserID as CASE WHEN [PaidForUsers_FbUserId] < PaidByUserId THEN [PaidForUsers_FbUserId] ELSE PaidByUserId END,
HighUserID as CASE WHEN [PaidForUsers_FbUserId] < PaidByUserId THEN PaidByUserId ELSE [PaidForUsers_FbUserId] END,
PerUserDelta as (Amount/UserCount) * CASE WHEN [PaidForUsers_FbUserId] < PaidByUserId THEN -1 ELSE 1 END,
constraint PK__TransactionUser PRIMARY KEY ([TransactionsPaidFor_Id],[PaidForUsers_FbUserId]),
constraint FK__TransactionUser_Transactions FOREIGN KEY ([TransactionsPaidFor_Id]) references dbo.Transactions,
constraint FK__TransactionUser_Transaction_XRef FOREIGN KEY ([TransactionsPaidFor_Id],Amount,PaidByUserID)
references dbo.Transactions (Id,Amount,UserId) ON UPDATE CASCADE
)
This table now maintains enough information to allow the view to be constructed. The rest of the work we do is to construct/maintain the data in the table. Note that, with the foreign key constraint, we've already ensured that if, say, an amount is changed in the transactions table, everything gets recalculated.
/* View that mimics the original TransactionUser table -
in fact it has the same name so existing code doesn't need to change */
CREATE VIEW dbo.TransactionUser
with schemabinding
as
select
[TransactionsPaidFor_Id],
[PaidForUsers_FbUserId]
from
dbo._TransactionUser
GO
/* Effectively the PK on the original table */
CREATE UNIQUE CLUSTERED INDEX PK_TransactionUser on dbo.TransactionUser ([TransactionsPaidFor_Id],[PaidForUsers_FbUserId])
Anything that's already written to work against TransactionUser will now work against this view, and be none the wiser. Except, they can't insert/update/delete the rows without some help:
/* Now we write the trigger that maintains the underlying table */
CREATE TRIGGER dbo.T_TransactionUser_IUD
ON dbo.TransactionUser
INSTEAD OF INSERT, UPDATE, DELETE
AS
SET NOCOUNT ON;
/* Every delete affects *every* row for the same transaction
We need to drop the counts on every remaining row, as well as removing the actual rows we're interested in */
WITH DropCounts as (
select TransactionsPaidFor_Id,COUNT(*) as Cnt from deleted group by TransactionsPaidFor_Id
), KeptRows as (
select tu.TransactionsPaidFor_Id,tu.PaidForUsers_FbUserId,UserCount - dc.Cnt as NewCount
from dbo._TransactionUser tu left join deleted d
on tu.TransactionsPaidFor_Id = d.TransactionsPaidFor_Id and
tu.PaidForUsers_FbUserId = d.PaidForUsers_FbUserId
inner join DropCounts dc
on
tu.TransactionsPaidFor_Id = dc.TransactionsPaidFor_Id
where
d.PaidForUsers_FbUserId is null
), ChangeSet as (
select TransactionsPaidFor_Id,PaidForUsers_FbUserId,NewCount,1 as Keep
from KeptRows
union all
select TransactionsPaidFor_Id,PaidForUsers_FbUserId,null,0
from deleted
)
merge into dbo._TransactionUser tu
using ChangeSet cs on tu.TransactionsPaidFor_Id = cs.TransactionsPaidFor_Id and tu.PaidForUsers_FbUserId = cs.PaidForUsers_FbUserId
when matched and cs.Keep = 1 then update set UserCount = cs.NewCount
when matched then delete;
/* Every insert affects *every* row for the same transaction
This is why the indexed view couldn't be generated */
WITH TU as (
select TransactionsPaidFor_Id,PaidForUsers_FbUserId,Amount,PaidByUserId from dbo._TransactionUser
where TransactionsPaidFor_Id in (select TransactionsPaidFor_Id from inserted)
union all
select TransactionsPaidFor_Id,PaidForUsers_FbUserId,Amount,UserId
from inserted i inner join dbo.Transactions t on i.TransactionsPaidFor_Id = t.Id
), CountedTU as (
select TransactionsPaidFor_Id,PaidForUsers_FbUserId,Amount,PaidByUserId,
COUNT(*) OVER (PARTITION BY TransactionsPaidFor_Id) as Cnt
from TU
)
merge into dbo._TransactionUser tu
using CountedTU new on tu.TransactionsPaidFor_Id = new.TransactionsPaidFor_Id and tu.PaidForUsers_FbUserId = new.PaidForUsers_FbUserId
when matched then update set Amount = new.Amount,PaidByUserId = new.PaidByUserId,UserCount = new.Cnt
when not matched then insert
([TransactionsPaidFor_Id],[PaidForUsers_FbUserId],Amount,PaidByUserId,UserCount)
values (new.TransactionsPaidFor_Id,new.PaidForUsers_FbUserId,new.Amount,new.PaidByUserId,new.Cnt);
Now that the underlying table is being maintained, we can finally write the indexed view you wanted in the first place... almost. The issue is that the totals we create may be positive or negative, because we've normalized the transactions so that we can easily sum them:
CREATE VIEW [dbo]._FriendBalances
WITH SCHEMABINDING
as
SELECT
LowUserID,
HighUserID,
SUM(PerUserDelta) as Balance,
COUNT_BIG(*) as Cnt
FROM dbo._TransactionUser
WHERE LowUserID != HighUserID
GROUP BY
LowUserID,
HighUserID
GO
create unique clustered index IX__FriendBalances on dbo._FriendBalances (LowUserID, HighUserID)
So we finally create a view, built on the indexed view above, that if the balance is negative, we flip the person owed, and the person owing around. But it will use the index on the above view, which is most of the work we were seeking to save by having the indexed view:
create view dbo.FriendBalances
as
select
CASE WHEN Balance >= 0 THEN LowUserID ELSE HighUserID END as PaidBy,
CASE WHEN Balance >= 0 THEN HighUserID ELSE LowUserID END as PaidFor,
ABS(Balance) as Balance
from
dbo._FriendBalances WITH (NOEXPAND)
Now, finally, we insert your sample data:
set identity_insert dbo.Transactions on --Ensure we get IDs we know
GO
insert into dbo.Transactions (Id,[Date] , Amount , UserId , Remarks ,GroupFbGroupId)
select 1 ,'2001-01-01T00:00:00.000', 3000, 9990 ,'this is a test', NULL union all
select 2 ,'2001-01-01T00:00:00.000', 3000, 9990 ,'this is a test', NULL union all
select 3 ,'2001-01-01T00:00:00.000', 3000, 9991 ,'this is a test', NULL
GO
set identity_insert dbo.Transactions off
GO
insert into dbo.TransactionUser (TransactionsPaidFor_Id, PaidForUsers_FbUserId)
select 1, 9991 union all
select 1, 9992 union all
select 1, 9993 union all
select 2, 9990 union all
select 2, 9991 union all
select 2, 9992 union all
select 3, 9990 union all
select 3, 9993 union all
select 3, 9994
And query the final view:
select * from dbo.FriendBalances
PaidBy PaidFor Balance
9990 9991 1000
9990 9992 2000
9990 9993 1000
9991 9993 1000
9991 9994 1000
Now, there is additional work we could do, if we were concerned that someone may find a way to dodge the triggers and perform direct changes to the base tables. The first would be yet another indexed view, that will ensure that every row for the same transaction has the same UserCount value. Finally, with a few additional columns, check constraints, FK constraints and more work in the triggers, I think we can ensure that the UserCount is correct - but it may add more overhead than you want.
I can add scripts for these aspects if you want me to - it depends on how restrictive you want/need the database to be.

A very complicated SQL query issue

I have 2 tables ...
Customer
CustomerIdentification
Customer table has 2 fields
CustomerId varchar(20)
Customer_Id_Link varchar(50)
CustomerIdentification table has 3 fields
CustomerId varchar(20)
Identification_Number varchar(50)
Personal_ID_Type_Code int -- is a foreign key to another table but thats irrelevant
Basically, Customer is the customer master table (with CustomerID as primary key) and CustomerIdentification can have several pieces of identifications for a given customer. In other words, CustomerId in CustomerIdentification is a foriegn key to Customer table. A customer can have many pieces of identifications, each having a Identification_Number and Personal_ID_Type_Code (which is an integer that tells you whether the identification is a passport, sin, drivers license etc.).
Now, customer table has the following data: Customer_Id_Link is blank (empty string) at this point
CustomerId Customer_Id_Link
--------------------------------
'CU-1' <Blank>
'CU-2' <Blank>
'CU-3' <Blank>
'CU-4' <Blank>
'CU-5' <Blank>
and CustomerIdentification table has the following data:
CustomerId Identification_Number Personal_ID_Type_Code
------------------------------------------------------------
'CU-1' 'A' 1
'CU-1' 'A' 2
'CU-1' 'A' 3
'CU-2' 'A' 1
'CU-2' 'B' 3
'CU-2' 'C' 4
'CU-3' 'A' 1
'CU-3' 'B' 2
'CU-3' 'C' 4
'CU-4' 'A' 1
'CU-4' 'B' 2
'CU-4' 'B' 3
'CU-5' 'B' 3
Essentially, more than one customer can have same Identification_Number and Personal_ID_Type_Code in CustomerIdentification. When this happens, all Customer_Id_Link fields need to be updated with a common value (could be a GUID or whatever). But the processing for this is more complex.
Rules are these:
For matching Personal_ID_Type_Code and Identification_Number fields between Customer Records
- Compare the Identification_Number fields for all other common Personal_ID_Type_Code fields for all the Customer Records from the above match
- if true, then link the Customer Records
For example:
Match ID 1 A for CU-1, CU-2, CU-3, CU-4
Exception ID 2 mismatch (A on CU-1 vs B on CU-3)
No linkage done
Match ID 2 B for CU-3, CU-4
No ID mismatch
Link CU-3 and CU-4 (update Customer_Id_Link field with a common value in customer table for both)
Match ID 3 A for CU-1, CU-4
Exception ID 2 mismatch (A vs B)
No linkage done
Match ID 3 B for CU-2, CU-5
No ID mismatch
Link CU-2 and CU-5 (update Customer_Id_Link field with a common value in customer table for both) Match ID 4 C for CU-2, CU-3
CU-2 already linked, keep CU-5 to customer linking list
CU-3 already linked, keep CU-4 to customer linking list
Exception ID 3 mismatch (B on CU-2 vs A on CU-4)
No linkage done (previous linkage remains)
Any help will be appreciated. This has kept me awake for two days now, and I cant seem to be able to find the solution. Ideally, the solution will be a stored procedure that I can execute to do customer linking.
 - SQL Server 2008 R2 Standard 64 bit
UPDATE-------------------------------
I knew it was going to be tough to explain this problem, so I take the blame. But essentially, I want to be able to link all the customers that have same identificationNumbers, only, a customer can have more than 1 identificationNumber. Take example 1. 1 A (1 being Personal_id_type_code and A being identificationNumber exists for 4 different customers. CU-1, CU-2, CU-3, CU-4. So they could potentially be the same customer that exists 4 different times in customer table with different customer ID. We need to link them with 1 common value. However, CU-1 has 2 other identifications and if even 1 of them is different from the other 3 (CU-2, CU-3, CU-4) they are not the same customer. So ID 2 with Num A does not match with ID 2 for CU-3 (its B) and same for CU-4. Also, even though ID 2 num A does not exist in CU-2, CU-1's ID 3 and num A does not match with CU-2s ID 3 (its B). Therefore its not a match at all.
Next common Id's and num is 2-b which exists in CU-3 and CU-4. These two customers are in fact same cause both have ID 1 - A and ID 2 - B. ID 4 - C and ID 3 - A is irrelevant cause both IDs are different. Which essentially means this customer has 4 IDs I A, 2 B, 4 C and 3 A. So now we need to link this customer with a common unique value (guid) in customer table.
I hope I explained this very complicated issue now. It is tough to explain as this is a very unique problem.
I've changed your data model a bit to try and make it a bit more obvious what's going on..
CREATE TABLE [dbo].[Customer]
(
[CustomerName] VARCHAR(20) NOT NULL,
[CustomerLink] VARBINARY(20) NULL
)
CREATE TABLE [dbo].[CustomerIdentification]
(
[CustomerName] VARCHAR(20) NOT NULL,
[ID] VARCHAR(50) NOT NULL,
[IDType] VARCHAR(16) NOT NULL
)
And I've added some more test data..
INSERT [dbo].[Customer]
([CustomerName])
VALUES ('Fred'),
('Bob'),
('Vince'),
('Tom'),
('Alice'),
('Matt'),
('Dan')
INSERT [dbo].[CustomerIdentification]
VALUES
('Fred', 'A', 'Passport'),
('Fred', 'A', 'SIN'),
('Fred', 'A', 'Drivers Licence'),
('Bob', 'A', 'Passport'),
('Bob', 'B', 'Drivers Licence'),
('Bob', 'C', 'Credit Card'),
('Vince', 'A', 'Passport'),
('Vince', 'B', 'SIN'),
('Vince', 'C', 'Credit Card'),
('Tom', 'A', 'Passport'),
('Tom', 'B', 'SIN'),
('Tom', 'B', 'Drivers Licence'),
('Alice', 'B', 'Drivers Licence'),
('Matt', 'X', 'Drivers Licence'),
('Dan', 'X', 'Drivers Licence')
Is this what you're looking for:
;WITH [cteNonMatchingIDs] AS (
-- Pairs where the IDType is the same, but
-- name and ID don't match
SELECT ci3.[CustomerName] AS [CustomerName1],
ci4.[CustomerName] AS [CustomerName2]
FROM [dbo].[CustomerIdentification] ci3
INNER JOIN [dbo].[CustomerIdentification] ci4
ON ci3.[IDType] = ci4.[IDType]
WHERE ci3.[CustomerName] <> ci4.[CustomerName]
AND ci3.[ID] <> ci4.[ID]
),
[cteMatchedPairs] AS (
-- Pairs where the IDType and ID match, and
-- there aren't any non matching IDs for the
-- CustomerName
SELECT DISTINCT
ci1.[CustomerName] AS [CustomerName1],
ci2.[CustomerName] AS [CustomerName2]
FROM [dbo].[CustomerIdentification] ci1
LEFT JOIN [dbo].[CustomerIdentification] ci2
ON ci1.[CustomerName] <> ci2.[CustomerName]
AND ci1.[IDType] = ci2.[IDType]
WHERE ci1.[ID] = ISNULL(ci2.[ID], ci1.[ID])
AND NOT EXISTS (
SELECT 1
FROM [cteNonMatchingIDs]
WHERE ci1.[CustomerName] = [CustomerName1] -- correlated subquery
AND ci2.[CustomerName] = [CustomerName2]
)
AND ci1.[CustomerName] < ci2.[CustomerName]
),
[cteMatchedList] ([CustomerName], [CustomerNameList]) AS (
-- Turn the matched pairs into list of matching
-- CustomerNames
SELECT [CustomerName1],
[CustomerNameList]
FROM (
SELECT [CustomerName1],
CONVERT(VARCHAR(1000), '$'
+ [CustomerName1] + '$'
+ [CustomerName2]) AS [CustomerNameList]
FROM [cteMatchedPairs]
UNION ALL
SELECT [CustomerName2],
CONVERT(VARCHAR(1000), '$'
+ [CustomerName2]) AS [CustomerNameList]
FROM [cteMatchedPairs]
) [cteMatchedPairs]
UNION ALL
SELECT [cteMatchedList].[CustomerName],
CONVERT(VARCHAR(1000),[CustomerNameList] + '$'
+ [cteMatchedPairs].[CustomerName2])
FROM [cteMatchedList] -- recursive CTE
INNER JOIN [cteMatchedPairs]
ON RIGHT([cteMatchedList].[CustomerNameList],
LEN([cteMatchedPairs].[CustomerName1])
) = [cteMatchedPairs].[CustomerName1]
),
[cteSubstringLists] AS (
SELECT r1.[CustomerName],
r2.[CustomerNameList]
FROM [cteMatchedList] r1
INNER JOIN [cteMatchedList] r2
ON r2.[CustomerNameList] LIKE '%' + r1.[CustomerNameList] + '%'
),
[cteCustomerLink] AS (
SELECT DISTINCT
x1.[CustomerName],
HASHBYTES('SHA1', x2.[CustomerNameList]) AS [CustomerLink]
FROM (
SELECT [CustomerName],
MAX(LEN([CustomerNameList])) AS [MAX LEN CustomerList]
FROM [cteSubstringLists]
GROUP BY [CustomerName]
) x1
INNER JOIN (
SELECT [CustomerName],
LEN([CustomerNameList]) AS [LEN CustomerList],
[CustomerNameList]
FROM [cteSubstringLists]
) x2
ON x1.[MAX LEN CustomerList] = x2.[LEN CustomerList]
AND x1.[CustomerName] = x2.[CustomerName]
)
UPDATE c
SET [CustomerLink] = cl.[CustomerLink]
FROM [dbo].[Customer] c
INNER JOIN [cteCustomerLink] cl
ON cl.[CustomerName] = c.[CustomerName]
SELECT *
FROM [dbo].[Customer]

Resources