Create BD structure for multiple stats - database

my cuestion is simple but there are so many possibilities...
Imagine you have a car, the car has a state (available, rented, in workshop, sold,...) and each state has different attributes:
available: we want to keep the employee has established the state and date.
rented: employee, client(company or person), start date, end date and price
in workshop: workshop, date.
sold: client(company or person), employee, date, price.
and finally, we want to keep a record of states of each car.

Easy. This is one entity (car) that has a multi-valued attribute (state). Let's use status, the word state has other more obvious meanings. However, the entity has other associated attributes that only apply according to the status, so the status attribute can be treated as if it were an entity. This the status of a car is represented as a relationship with the status "entity." Your other requirement that the status be recorded is also easy. Be aware, though, that easy does not necessarily mean simple.
create table Cars(
ID int auto_generated primary key,
... ... -- other attributes
);
The status has a base-class, sub-class format
create table Status(
ID char( 1 ) not null primary key, -- 'A', 'R', 'W', 'S'
Label varchar not null - 'Available', 'Rented', 'In Workshop', 'Sold'
constraint UQ_StatusLabel unique( ID, Label )
);
Now the tables that will contain the actual status information for the cars.
create table AStatus(
A_ID char( 1 ) check( A_ID = 'A' ),
EstablishedBy int not null references Employees( ID ),
);
create table RStatus(
R_ID char( 1 ) check( R_ID = 'R' ),
RentedBy int not null references Employees( ID ),
RentedTo int not null references Clients( ID ),
Price currency not null
);
create table WStatus(
W_ID char( 1 ) check( W_ID = 'W' ),
WS_ID int not null references Workshops( ID )
);
create table SStatus(
S_ID char( 1 ) check( S_ID = 'S' ),
SoldBy int not null references Employees( ID ),
SoldTo int not null references Clients( ID ),
Price currency not null
);
Finally, the intersection table that ties these all together.
create table CarStatus(
CarID int not null references Cars( ID ),
EffDate date not null default Now(),
StatusID char( 1 ) not null references Status( ID ),
constraint PK_CarStatus primary key( CarID, EffDate )
);
Let's follow a car as it goes into the workshop, becomes available, is rented out, returned (becomes available again) and then sold. All these changes take place (conveniently enough) on the first date of successive months.
CarStatus xStatus
1 1/1/16 W W 17 -- Jan 1, Car #1 goes to workshop 17
1 2/1/16 A A 2 -- Feb 1, Employee #2 makes if available
1 3/1/16 R R 2 12 24 -- Mar 1, Employee #2 rents to client #12 for $24
1 4/1/16 A A 3 -- Apr 1, Employee #3 returns car to Available
1 5/1/16 S S 3 13 20000 -- May 1, Employee #3 sells car to client #13 for $20000
Note that the table rows do not replace each other. So the CarStatus table contains the five rows as shown, AStatus contains two rows, and each of the other xStatus tables contain one row each.
To read the history of what has happened to car # 1:
select c.id CarID, s.Label Status, cs.EffDate "Effective Date"
from CarStatus cs
join Cars c
on c.ID = cs.CarID
join Status s
on s.ID = cs.StatusID
where cs.CarID = 1
order by cs.EffDate;
CarID Status Effective Date
1 In Workshop 1/1/16
1 Available 2/1/16
1 Rented 3/1/16
1 Available 4/1/16
1 Sold 5/1/16
Note there is only one date associated with a rental. The return date is the date the car next becomes available. If there is some length of time between a return and when it can become available again, such as for cleaning, then add another status. Something like, oh, Cleaning. (Or you could add a "ReturnDate" field to the RStatus table, but that opens you up for the possibility of anomalous data so you have to be extra vigilant.)
To show the current state of a car is a little more involved. But I'm using Version Normal Form (search for it under my name for more details) which is a standard pattern I use for maintaining a history of data changes. So the query also fits a standard pattern which is quickly learned.
select c.id CarID, s.Label Status, cs.EffDate "Effective Date"
from CarStatus cs
join CarStatus cs2
on cs2.CarID = cs.CarID
and cs2.EffDate =(
select Max( cs3.EffDate )
from CarStatus cs3
where cs3.EffDate <= Now() )
join Cars c
on c.ID = cs.CarID
join Status s
on s.ID = cs.StatusID
where cs.CarID = 1
order by cs.EffDate;
This will show the last row of the result set above because that is the last status entered. You can remove the WHERE and ORDER BY clauses and use this query to create a view which shows the current status of all the cars.
Here's the cool part. Suppose you want to see the status of the car as of 15 Mar. Just change the "Now()" in the query above to "to_date('3/15/16')" and you will get the third row in the result set above: Rented. So you can look at the current status or the past status on any particular date using the same query.

Related

Matching in 2 Waves. Second "Wave" is based on First "Wave"

I have a "matching" scenario where I need to match records from a table.
I've altered my situation to use the Northwind database .. for illustration purposes.
Given a "set" of data (put in my #holder table below), I need to find matches based on the following criteria.
If both lastname and firstname match, match on TWO or more of the following : (city-state-together OR zip) , phone, extension
If one of either lastname OR firstname match, match on THREE or more of the following : (city-state-together OR zip) , phone,
extension
Note that "city-state-together OR zip" means that I need to match on the combination of city-and-state ........ or zip..........and if all three match, (city-state-and-zip), that should still only count as "1" for the "(ColumnCityStateZipEnum + ColumnHomePhoneEnum + ColumnExtensionEnum)" calculation.
I've come up with the below. But I have 7 left joins.
Is there another way to do this kind of problem in SQL?
Use Northwind /* Or NorthwindPartial */
GO
declare #holder table ( holderidentitykey int identity (1,1), lastname varchar(32) , firstname varchar(48) , city varchar(32) , stateabbr varchar(32) , zip varchar(5) , homephone varchar(16) , extension varchar(8) )
insert into #holder ( lastname , firstname , city , stateabbr, zip, homephone , extension )
select null , null, null, null, null , null, null
union all select 'Davolio' , 'Nancy', null, null, '98122' , '(206) 555-9857', null /* should 'match'. lastname, firstname and TWO of the other data-elements */
union all select 'Davolio' , null, null, null, null , null, null
union all select 'Fuller' , 'Andrew', 'Tacoma', 'WA', null , null, null
union all select 'Peacock' , 'MaggyNotAMatchNoPhone', 'Redmond', 'WA', '98052' , null, null
union all select 'Peacock' , 'MaggyNotAMatchWithPhoneAndExtension', 'Redmond', 'WA', '98052' , '(206) 555-8122', '5176' /* should 'match'. lastname and THREE of the other data-elements */
/*
If both lastname and firstname match, match on TWO or more of the following : (city-state-together OR zip) , phone, extension
If one of either lastname OR firstname match, match on THREE or more of the following : (city-state-together OR zip) , phone, extension
*/
select distinct * from
(
select
holderidentitykey,
ColumnLastNameFirstNameEnum =
case
when h.lastname = eLastName.LastName and h.firstname = eFirstName.FirstName then 2
when h.lastname = eLastName.LastName then 1
when h.firstname = eFirstName.FirstName then 1
else 0
end
,
ColumnCityStateZipEnum =
case
when h.zip = eZip.PostalCode then 1
when h.city = eCity.City and h.stateabbr = eState.Region then 1
else 0
end
,
ColumnHomePhoneEnum =
case
when h.homephone = eHomePhone.HomePhone then 1
else 0
end
,
ColumnExtensionEnum =
case
when h.extension = eExtension.Extension then 1
else 0
end
, eLastName.LastName , eFirstName.FirstName, eZip.PostalCode, eCity.City, eState.Region, eHomePhone.HomePhone, eExtension.Extension
from
#holder h
left join dbo.Employees eLastName on h.lastname = eLastName.LastName
left join dbo.Employees eFirstName on h.firstname = eFirstName.FirstName
left join dbo.Employees eZip on h.zip = eZip.PostalCode
left join dbo.Employees eCity on h.city = eCity.City
left join dbo.Employees eState on h.stateabbr = eState.Region
left join dbo.Employees eHomePhone on h.homephone = eHomePhone.HomePhone
left join dbo.Employees eExtension on h.extension = eExtension.Extension
) as derived1
where
derived1.ColumnLastNameFirstNameEnum >= 2 and (ColumnCityStateZipEnum + ColumnHomePhoneEnum + ColumnExtensionEnum) >= 2
OR
derived1.ColumnLastNameFirstNameEnum >= 1 and (ColumnCityStateZipEnum + ColumnHomePhoneEnum + ColumnExtensionEnum) >= 3
-- select * from dbo.Employees e
Here is a "partial" Northwind creation if you don't have one handy.
SET NOCOUNT ON
GO
USE master
GO
if exists (select * from sysdatabases where name='NorthwindPartial')
drop database NorthwindPartial
go
DECLARE #device_directory NVARCHAR(520)
SELECT #device_directory = SUBSTRING(filename, 1, CHARINDEX(N'master.mdf', LOWER(filename)) - 1)
FROM master.dbo.sysaltfiles WHERE dbid = 1 AND fileid = 1
EXECUTE (N'CREATE DATABASE NorthwindPartial
ON PRIMARY (NAME = N''NorthwindPartial'', FILENAME = N''' + #device_directory + N'northwndPartial.mdf'')
LOG ON (NAME = N''NorthwindPartial_log'', FILENAME = N''' + #device_directory + N'northwndPartial.ldf'')')
go
GO
set quoted_identifier on
GO
/* Set DATEFORMAT so that the date strings are interpreted correctly regardless of
the default DATEFORMAT on the server.
*/
SET DATEFORMAT mdy
GO
use "NorthwindPartial"
go
if exists (select * from sysobjects where id = object_id('dbo.Employees') and sysstat & 0xf = 3)
drop table "dbo"."Employees"
GO
CREATE TABLE "Employees" (
"EmployeeID" "int" IDENTITY (1, 1) NOT NULL ,
"LastName" nvarchar (20) NOT NULL ,
"FirstName" nvarchar (10) NOT NULL ,
"Title" nvarchar (30) NULL ,
"TitleOfCourtesy" nvarchar (25) NULL ,
"BirthDate" "datetime" NULL ,
"HireDate" "datetime" NULL ,
"Address" nvarchar (60) NULL ,
"City" nvarchar (15) NULL ,
"Region" nvarchar (15) NULL ,
"PostalCode" nvarchar (10) NULL ,
"Country" nvarchar (15) NULL ,
"HomePhone" nvarchar (24) NULL ,
"Extension" nvarchar (4) NULL ,
"Photo" "image" NULL ,
"Notes" "ntext" NULL ,
"ReportsTo" "int" NULL ,
"PhotoPath" nvarchar (255) NULL ,
CONSTRAINT "PK_Employees" PRIMARY KEY CLUSTERED
(
"EmployeeID"
),
CONSTRAINT "FK_Employees_Employees" FOREIGN KEY
(
"ReportsTo"
) REFERENCES "dbo"."Employees" (
"EmployeeID"
),
CONSTRAINT "CK_Birthdate" CHECK (BirthDate < getdate())
)
GO
CREATE INDEX "LastName" ON "dbo"."Employees"("LastName")
GO
CREATE INDEX "PostalCode" ON "dbo"."Employees"("PostalCode")
GO
set quoted_identifier on
go
set identity_insert "Employees" on
go
ALTER TABLE "Employees" NOCHECK CONSTRAINT ALL
go
INSERT "Employees"("EmployeeID","LastName","FirstName","Title","TitleOfCourtesy","BirthDate","HireDate","Address","City","Region","PostalCode","Country","HomePhone","Extension","Photo","Notes","ReportsTo","PhotoPath") VALUES(1,'Davolio','Nancy','Sales Representative','Ms.','12/08/1948','05/01/1992','507 - 20th Ave. E.
Apt. 2A','Seattle','WA','98122','USA','(206) 555-9857','5467',null,'Education includes a BA in psychology from Colorado State University in 1970. She also completed "The Art of the Cold Call." Nancy is a member of Toastmasters International.',2,'http://accweb/emmployees/davolio.bmp')
GO
INSERT "Employees"("EmployeeID","LastName","FirstName","Title","TitleOfCourtesy","BirthDate","HireDate","Address","City","Region","PostalCode","Country","HomePhone","Extension","Photo","Notes","ReportsTo","PhotoPath") VALUES(2,'Fuller','Andrew','Vice President, Sales','Dr.','02/19/1952','08/14/1992','908 W. Capital Way','Tacoma','WA','98401','USA','(206) 555-9482','3457',null,'Andrew received his BTS commercial in 1974 and a Ph.D. in international marketing from the University of Dallas in 1981. He is fluent in French and Italian and reads German. He joined the company as a sales representative, was promoted to sales manager in January 1992 and to vice president of sales in March 1993. Andrew is a member of the Sales Management Roundtable, the Seattle Chamber of Commerce, and the Pacific Rim Importers Association.',NULL,'http://accweb/emmployees/fuller.bmp')
GO
INSERT "Employees"("EmployeeID","LastName","FirstName","Title","TitleOfCourtesy","BirthDate","HireDate","Address","City","Region","PostalCode","Country","HomePhone","Extension","Photo","Notes","ReportsTo","PhotoPath") VALUES(3,'Leverling','Janet','Sales Representative','Ms.','08/30/1963','04/01/1992','722 Moss Bay Blvd.','Kirkland','WA','98033','USA','(206) 555-3412','3355',null,'Janet has a BS degree in chemistry from Boston College (1984). She has also completed a certificate program in food retailing management. Janet was hired as a sales associate in 1991 and promoted to sales representative in February 1992.',2,'http://accweb/emmployees/leverling.bmp')
GO
INSERT "Employees"("EmployeeID","LastName","FirstName","Title","TitleOfCourtesy","BirthDate","HireDate","Address","City","Region","PostalCode","Country","HomePhone","Extension","Photo","Notes","ReportsTo","PhotoPath") VALUES(4,'Peacock','Margaret','Sales Representative','Mrs.','09/19/1937','05/03/1993','4110 Old Redmond Rd.','Redmond','WA','98052','USA','(206) 555-8122','5176',null,'Margaret holds a BA in English literature from Concordia College (1958) and an MA from the American Institute of Culinary Arts (1966). She was assigned to the London office temporarily from July through November 1992.',2,'http://accweb/emmployees/peacock.bmp')
GO
INSERT "Employees"("EmployeeID","LastName","FirstName","Title","TitleOfCourtesy","BirthDate","HireDate","Address","City","Region","PostalCode","Country","HomePhone","Extension","Photo","Notes","ReportsTo","PhotoPath") VALUES(5,'Buchanan','Steven','Sales Manager','Mr.','03/04/1955','10/17/1993','14 Garrett Hill','London',NULL,'SW1 8JR','UK','(71) 555-4848','3453',null,'Steven Buchanan graduated from St. Andrews University, Scotland, with a BSC degree in 1976. Upon joining the company as a sales representative in 1992, he spent 6 months in an orientation program at the Seattle office and then returned to his permanent post in London. He was promoted to sales manager in March 1993. Mr. Buchanan has completed the courses "Successful Telemarketing" and "International Sales Management." He is fluent in French.',2,'http://accweb/emmployees/buchanan.bmp')
GO
INSERT "Employees"("EmployeeID","LastName","FirstName","Title","TitleOfCourtesy","BirthDate","HireDate","Address","City","Region","PostalCode","Country","HomePhone","Extension","Photo","Notes","ReportsTo","PhotoPath") VALUES(6,'Suyama','Michael','Sales Representative','Mr.','07/02/1963','10/17/1993','Coventry House
Miner Rd.','London',NULL,'EC2 7JR','UK','(71) 555-7773','428',null,'Michael is a graduate of Sussex University (MA, economics, 1983) and the University of California at Los Angeles (MBA, marketing, 1986). He has also taken the courses "Multi-Cultural Selling" and "Time Management for the Sales Professional." He is fluent in Japanese and can read and write French, Portuguese, and Spanish.',5,'http://accweb/emmployees/davolio.bmp')
GO
INSERT "Employees"("EmployeeID","LastName","FirstName","Title","TitleOfCourtesy","BirthDate","HireDate","Address","City","Region","PostalCode","Country","HomePhone","Extension","Photo","Notes","ReportsTo","PhotoPath") VALUES(7,'King','Robert','Sales Representative','Mr.','05/29/1960','01/02/1994','Edgeham Hollow
Winchester Way','London',NULL,'RG1 9SP','UK','(71) 555-5598','465',null,'Robert King served in the Peace Corps and traveled extensively before completing his degree in English at the University of Michigan in 1992, the year he joined the company. After completing a course entitled "Selling in Europe," he was transferred to the London office in March 1993.',5,'http://accweb/emmployees/davolio.bmp')
GO
INSERT "Employees"("EmployeeID","LastName","FirstName","Title","TitleOfCourtesy","BirthDate","HireDate","Address","City","Region","PostalCode","Country","HomePhone","Extension","Photo","Notes","ReportsTo","PhotoPath") VALUES(8,'Callahan','Laura','Inside Sales Coordinator','Ms.','01/09/1958','03/05/1994','4726 - 11th Ave. N.E.','Seattle','WA','98105','USA','(206) 555-1189','2344',null,'Laura received a BA in psychology from the University of Washington. She has also completed a course in business French. She reads and writes French.',2,'http://accweb/emmployees/davolio.bmp')
GO
INSERT "Employees"("EmployeeID","LastName","FirstName","Title","TitleOfCourtesy","BirthDate","HireDate","Address","City","Region","PostalCode","Country","HomePhone","Extension","Photo","Notes","ReportsTo","PhotoPath") VALUES(9,'Dodsworth','Anne','Sales Representative','Ms.','01/27/1966','11/15/1994','7 Houndstooth Rd.','London',NULL,'WG2 7LT','UK','(71) 555-4444','452',null,'Anne has a BA degree in English from St. Lawrence College. She is fluent in French and German.',5,'http://accweb/emmployees/davolio.bmp')
go
set identity_insert "Employees" off
go
ALTER TABLE "Employees" CHECK CONSTRAINT ALL
go
set quoted_identifier on
go
Select * from "Employees"
It's probably helpful to analyse your match rule a little, if we break it down we can see that the non-negotiable condition for a match is that either the FirstName OR the LastName matches. So let's build a query where we join only those rows from the employee table:
...
FROM #holder As h
JOIN Employee As e
ON h.FirstName = e.FirstName
OR h.LastName = e.LastName
...
Now that we're only looking at rows which meet the minimum criteria, we can assess the others. Basically your rule says that if either FirstName or LastName match, then we need a minimum of three of the following (let's assume that we matched FirstName):
Match LastName
Match City AND State, OR PostalCode
Match HomePhone
Match Extension
You present different rules depending if both FirstName and LastName match, but as long as you have one of those two then it so happens that the rules are mathematically equivalent from the perspective that I'm taking.
So we can take our potential match rows and just count how many of those matching attributes there are, and filter out rows where there aren't enough.
Select h.holderidentitykey, e.*
From #holder As h
Join Employees As e
On h.FirstName = e.FirstName
Or h.lastname = e.LastName
Where iif(h.firstname = e.firstname, 1, 0) +
iif(h.lastname = e.LastName, 1, 0) +
iif((h.city = e.City AND h.stateabbr = e.Region) OR h.zip = e.PostalCode, 1, 0) +
iif(h.homephone = e.HomePhone, 1, 0) +
iif(h.extension = e.Extension, 1, 0) >= 4;
Please note that this approach may not scale well if you have large tables (1M+) and want to match often, but if/when those situations occur then you could look at refactoring.

Multiple rows in one

I know there are similar questions like mine but unfortunately I haven't found the corresponding solution to my problem.
First off here's a simplified overview of my tables:
Partner table: PartnerID, Name
Address table: PartnerID, Street, Postcode, City, ValidFrom
Contact table: PartnerID, TypID, TelNr, Email, ValidFrom
A partner can have one or more addresses as well as contact info. With the contact info a partner could have let's say 2 tel numbers, 1 mobile number and 1 email. In the table it would look like this:
PartnerID TypID TelNr Email ValidFrom
1 1 0041 / 044 - 2002020 01.01.2010
1 1 0041 / 044 - 2003030 01.01.2011
1 2 0041 / 079 - 7003030 01.04.2011
1 3 myemail#hotmail.com 01.06.2011
What I need in the end is, combining all tables for each partner, is like this:
PartnerID Name Street Postcode City TelNr Email
1 SomeGuy MostActualStreet MostActualPC MostActualCity MostActual Nr (either type 1 or 2) MostActual Email
Any help?
Here's some T-SQL that gets the answer I think you're looking for if by "Most Actual"
you mean "most recent":
WITH LatestAddress (PartnerID,Street,PostCode,City)
AS (
SELECT PartnerID,Street,PostCode,City
FROM [Address] a
WHERE ValidFrom = (
SELECT MAX(ValidFrom)
FROM [Address] aMax
WHERE aMax.PartnerID = a.PartnerID
)
)
SELECT p.PartnerID,p.Name,la.Street,la.PostCode,la.City
,(SELECT TOP 1 TelNr FROM Contact c WHERE c.PartnerID = p.PartnerID AND TelNr IS NOT NULL ORDER BY ValidFrom DESC) AS MostRecentTelNr
,(SELECT TOP 1 Email FROM Contact c WHERE c.PartnerID = p.PartnerID AND Email IS NOT NULL ORDER BY ValidFrom DESC) AS MostRecentEmail
FROM [Partner] p
LEFT OUTER JOIN LatestAddress la ON p.PartnerID = la.PartnerID
Breaking it down, this example used a common table expression (CTE) to get the latest address information for each Partner
WITH LatestAddress (PartnerID,Street,PostCode,City)
AS (
SELECT PartnerID,Street,PostCode,City
FROM [Address] a
WHERE ValidFrom = (
SELECT MAX(ValidFrom)
FROM [Address] aMax
WHERE aMax.PartnerID = a.PartnerID
)
)
I left-joined from the Partner table to the CTE, because I didn't want partners who don't have addresses to get left out of the results.
FROM [Partner] p
LEFT OUTER JOIN LatestAddress la ON p.PartnerID = la.PartnerID
In the SELECT statement, I selected columns from the Partner table, the CTE, and I wrote two subselects, one for the latest non-null telephone number for each partner, and one for the latest non-null email address for each partner. I was able to do this as a subselect because I knew I was returning a scalar value by selecting the TOP 1.
SELECT p.PartnerID,p.Name,la.Street,la.PostCode,la.City
,(SELECT TOP 1 TelNr FROM Contact c WHERE c.PartnerID = p.PartnerID AND TelNr IS NOT NULL ORDER BY ValidFrom DESC) AS MostRecentTelNr
,(SELECT TOP 1 Email FROM Contact c WHERE c.PartnerID = p.PartnerID AND Email IS NOT NULL ORDER BY ValidFrom DESC) AS MostRecentEmail
I would strongly recommend that you separate your Contact table into separate telephone number and email tables, each with their own ValidDate if you need to be able to keep old phone numbers and email addresses.
Check out my answer on another post which explains how to get the most actual information in a case like yours : Aggregate SQL Function to grab only the first from each group
PS : The DATE_FIELD would be ValidFrom in your case.

Options for indexing a view with cte

I have a view for which I want to create an Indexed view. After a lot of energy I was able to put the sql query in place for the view and It looks like this -
ALTER VIEW [dbo].[FriendBalances] WITH SCHEMABINDING as
WITH
trans (Amount,PaidBy,PaidFor, Id) AS
(SELECT Amount,userid AS PaidBy, PaidForUsers_FbUserId AS PaidFor, Id FROM dbo.Transactions
FULL JOIN dbo.TransactionUser ON dbo.Transactions.Id = dbo.TransactionUser.TransactionsPaidFor_Id),
bal (PaidBy,PaidFor,Balance) AS
(SELECT PaidBy,PaidFor, SUM( Amount/ transactionCounts.[_count]) AS Balance FROM trans
JOIN (SELECT Id,COUNT(*)AS _count FROM trans GROUP BY Id) AS transactionCounts ON trans.Id = transactionCounts.Id AND trans.PaidBy <> trans.PaidFor
GROUP BY trans.PaidBy,trans.PaidFor )
SELECT ISNULL(bal.PaidBy,bal2.PaidFor)AS PaidBy,ISNULL(bal.PaidFor,bal2.PaidBy)AS PaidFor,
ISNULL( bal.Balance,0)-ISNULL(bal2.Balance,0) AS Balance
FROM bal
left JOIN bal AS bal2 ON bal.PaidBy = bal2.PaidFor AND bal.PaidFor = bal2.Paidby
WHERE ISNULL( bal.Balance,0)>ISNULL(bal2.Balance,0)
Sample Data for FriendBalances View -
PaidBy PaidFor Balance
------ ------- -------
9990 9991 1000
9990 9992 2000
9990 9993 1000
9991 9993 1000
9991 9994 1000
It is mainly a join of 2 tables.
Transactions -
CREATE TABLE [dbo].[Transactions](
[Id] [int] IDENTITY(1,1) NOT NULL,
[Date] [datetime] NOT NULL,
[Amount] [float] NOT NULL,
[UserId] [bigint] NOT NULL,
[Remarks] [nvarchar](255) NULL,
[GroupFbGroupId] [bigint] NULL,
CONSTRAINT [PK_Transactions] PRIMARY KEY CLUSTERED
Sample data in Transactions Table -
Id Date Amount UserId Remarks GroupFbGroupId
-- ----------------------- ------ ------ -------------- --------------
1 2001-01-01 00:00:00.000 3000 9990 this is a test NULL
2 2001-01-01 00:00:00.000 3000 9990 this is a test NULL
3 2001-01-01 00:00:00.000 3000 9991 this is a test NULL
TransactionUsers -
CREATE TABLE [dbo].[TransactionUser](
[TransactionsPaidFor_Id] [bigint] NOT NULL,
[PaidForUsers_FbUserId] [bigint] NOT NULL
) ON [PRIMARY]
Sample Data in TransactionUser Table -
TransactionsPaidFor_Id PaidForUsers_FbUserId
---------------------- ---------------------
1 9991
1 9992
1 9993
2 9990
2 9991
2 9992
3 9990
3 9993
3 9994
Now I am not able to create a view because my query contains cte(s). What are the options that I have now?
If cte can be removed, what should be the other option which would help in creating indexed views.
Here is the error message -
Msg 10137, Level 16, State 1, Line 1 Cannot create index on view "ShareBill.Test.Database.dbo.FriendBalances" because it references common table expression "trans". Views referencing common table expressions cannot be indexed. Consider not indexing the view, or removing the common table expression from the view definition.
The concept:
Transaction mainly consists of:
an Amount that was paid
UserId of the User who paid that amount
and some more information which is not important for now.
TransactionUser table is a mapping between a Transaction and a User Table. Essentially a transaction can be shared between multiple persons. So we store that in this table.
So we have transactions where 1 person is paying for it and other are sharing the amount. So if A pays 100$ for B then B would owe 100$ to A. Similarly if B pays 90$ for A then B would owe only $10 to A. Now if A pays 300$ for A,b,c that means B would owe 110$ and C would owe 10$ to A.
So in this particular view we are aggregating the effective amount that has been paid (if any) between 2 users and thus know how much a person owes another person.
Okay, this gives you an indexed view (that needs an additional view on top of to sort out the who-owes-who detail), but it may not satisfy your requirements still.
/* Transactions table, as before, but with handy unique constraint for FK Target */
CREATE TABLE [dbo].[Transactions](
[Id] [int] IDENTITY(1,1) NOT NULL,
[Date] [datetime] NOT NULL,
[Amount] [float] NOT NULL,
[UserId] [bigint] NOT NULL,
[Remarks] [nvarchar](255) NULL,
[GroupFbGroupId] [bigint] NULL,
CONSTRAINT [PK_Transactions] PRIMARY KEY CLUSTERED (Id),
constraint UQ_Transactions_XRef UNIQUE (Id,Amount,UserId)
)
Nothing surprising so far, I hope
/* Much expanded TransactionUser table, we'll hide it away and most of the maintenance is automatic */
CREATE TABLE [dbo]._TransactionUser(
[TransactionsPaidFor_Id] int NOT NULL,
[PaidForUsers_FbUserId] [bigint] NOT NULL,
Amount float not null,
PaidByUserId bigint not null,
UserCount int not null,
LowUserID as CASE WHEN [PaidForUsers_FbUserId] < PaidByUserId THEN [PaidForUsers_FbUserId] ELSE PaidByUserId END,
HighUserID as CASE WHEN [PaidForUsers_FbUserId] < PaidByUserId THEN PaidByUserId ELSE [PaidForUsers_FbUserId] END,
PerUserDelta as (Amount/UserCount) * CASE WHEN [PaidForUsers_FbUserId] < PaidByUserId THEN -1 ELSE 1 END,
constraint PK__TransactionUser PRIMARY KEY ([TransactionsPaidFor_Id],[PaidForUsers_FbUserId]),
constraint FK__TransactionUser_Transactions FOREIGN KEY ([TransactionsPaidFor_Id]) references dbo.Transactions,
constraint FK__TransactionUser_Transaction_XRef FOREIGN KEY ([TransactionsPaidFor_Id],Amount,PaidByUserID)
references dbo.Transactions (Id,Amount,UserId) ON UPDATE CASCADE
)
This table now maintains enough information to allow the view to be constructed. The rest of the work we do is to construct/maintain the data in the table. Note that, with the foreign key constraint, we've already ensured that if, say, an amount is changed in the transactions table, everything gets recalculated.
/* View that mimics the original TransactionUser table -
in fact it has the same name so existing code doesn't need to change */
CREATE VIEW dbo.TransactionUser
with schemabinding
as
select
[TransactionsPaidFor_Id],
[PaidForUsers_FbUserId]
from
dbo._TransactionUser
GO
/* Effectively the PK on the original table */
CREATE UNIQUE CLUSTERED INDEX PK_TransactionUser on dbo.TransactionUser ([TransactionsPaidFor_Id],[PaidForUsers_FbUserId])
Anything that's already written to work against TransactionUser will now work against this view, and be none the wiser. Except, they can't insert/update/delete the rows without some help:
/* Now we write the trigger that maintains the underlying table */
CREATE TRIGGER dbo.T_TransactionUser_IUD
ON dbo.TransactionUser
INSTEAD OF INSERT, UPDATE, DELETE
AS
SET NOCOUNT ON;
/* Every delete affects *every* row for the same transaction
We need to drop the counts on every remaining row, as well as removing the actual rows we're interested in */
WITH DropCounts as (
select TransactionsPaidFor_Id,COUNT(*) as Cnt from deleted group by TransactionsPaidFor_Id
), KeptRows as (
select tu.TransactionsPaidFor_Id,tu.PaidForUsers_FbUserId,UserCount - dc.Cnt as NewCount
from dbo._TransactionUser tu left join deleted d
on tu.TransactionsPaidFor_Id = d.TransactionsPaidFor_Id and
tu.PaidForUsers_FbUserId = d.PaidForUsers_FbUserId
inner join DropCounts dc
on
tu.TransactionsPaidFor_Id = dc.TransactionsPaidFor_Id
where
d.PaidForUsers_FbUserId is null
), ChangeSet as (
select TransactionsPaidFor_Id,PaidForUsers_FbUserId,NewCount,1 as Keep
from KeptRows
union all
select TransactionsPaidFor_Id,PaidForUsers_FbUserId,null,0
from deleted
)
merge into dbo._TransactionUser tu
using ChangeSet cs on tu.TransactionsPaidFor_Id = cs.TransactionsPaidFor_Id and tu.PaidForUsers_FbUserId = cs.PaidForUsers_FbUserId
when matched and cs.Keep = 1 then update set UserCount = cs.NewCount
when matched then delete;
/* Every insert affects *every* row for the same transaction
This is why the indexed view couldn't be generated */
WITH TU as (
select TransactionsPaidFor_Id,PaidForUsers_FbUserId,Amount,PaidByUserId from dbo._TransactionUser
where TransactionsPaidFor_Id in (select TransactionsPaidFor_Id from inserted)
union all
select TransactionsPaidFor_Id,PaidForUsers_FbUserId,Amount,UserId
from inserted i inner join dbo.Transactions t on i.TransactionsPaidFor_Id = t.Id
), CountedTU as (
select TransactionsPaidFor_Id,PaidForUsers_FbUserId,Amount,PaidByUserId,
COUNT(*) OVER (PARTITION BY TransactionsPaidFor_Id) as Cnt
from TU
)
merge into dbo._TransactionUser tu
using CountedTU new on tu.TransactionsPaidFor_Id = new.TransactionsPaidFor_Id and tu.PaidForUsers_FbUserId = new.PaidForUsers_FbUserId
when matched then update set Amount = new.Amount,PaidByUserId = new.PaidByUserId,UserCount = new.Cnt
when not matched then insert
([TransactionsPaidFor_Id],[PaidForUsers_FbUserId],Amount,PaidByUserId,UserCount)
values (new.TransactionsPaidFor_Id,new.PaidForUsers_FbUserId,new.Amount,new.PaidByUserId,new.Cnt);
Now that the underlying table is being maintained, we can finally write the indexed view you wanted in the first place... almost. The issue is that the totals we create may be positive or negative, because we've normalized the transactions so that we can easily sum them:
CREATE VIEW [dbo]._FriendBalances
WITH SCHEMABINDING
as
SELECT
LowUserID,
HighUserID,
SUM(PerUserDelta) as Balance,
COUNT_BIG(*) as Cnt
FROM dbo._TransactionUser
WHERE LowUserID != HighUserID
GROUP BY
LowUserID,
HighUserID
GO
create unique clustered index IX__FriendBalances on dbo._FriendBalances (LowUserID, HighUserID)
So we finally create a view, built on the indexed view above, that if the balance is negative, we flip the person owed, and the person owing around. But it will use the index on the above view, which is most of the work we were seeking to save by having the indexed view:
create view dbo.FriendBalances
as
select
CASE WHEN Balance >= 0 THEN LowUserID ELSE HighUserID END as PaidBy,
CASE WHEN Balance >= 0 THEN HighUserID ELSE LowUserID END as PaidFor,
ABS(Balance) as Balance
from
dbo._FriendBalances WITH (NOEXPAND)
Now, finally, we insert your sample data:
set identity_insert dbo.Transactions on --Ensure we get IDs we know
GO
insert into dbo.Transactions (Id,[Date] , Amount , UserId , Remarks ,GroupFbGroupId)
select 1 ,'2001-01-01T00:00:00.000', 3000, 9990 ,'this is a test', NULL union all
select 2 ,'2001-01-01T00:00:00.000', 3000, 9990 ,'this is a test', NULL union all
select 3 ,'2001-01-01T00:00:00.000', 3000, 9991 ,'this is a test', NULL
GO
set identity_insert dbo.Transactions off
GO
insert into dbo.TransactionUser (TransactionsPaidFor_Id, PaidForUsers_FbUserId)
select 1, 9991 union all
select 1, 9992 union all
select 1, 9993 union all
select 2, 9990 union all
select 2, 9991 union all
select 2, 9992 union all
select 3, 9990 union all
select 3, 9993 union all
select 3, 9994
And query the final view:
select * from dbo.FriendBalances
PaidBy PaidFor Balance
9990 9991 1000
9990 9992 2000
9990 9993 1000
9991 9993 1000
9991 9994 1000
Now, there is additional work we could do, if we were concerned that someone may find a way to dodge the triggers and perform direct changes to the base tables. The first would be yet another indexed view, that will ensure that every row for the same transaction has the same UserCount value. Finally, with a few additional columns, check constraints, FK constraints and more work in the triggers, I think we can ensure that the UserCount is correct - but it may add more overhead than you want.
I can add scripts for these aspects if you want me to - it depends on how restrictive you want/need the database to be.

A very complicated SQL query issue

I have 2 tables ...
Customer
CustomerIdentification
Customer table has 2 fields
CustomerId varchar(20)
Customer_Id_Link varchar(50)
CustomerIdentification table has 3 fields
CustomerId varchar(20)
Identification_Number varchar(50)
Personal_ID_Type_Code int -- is a foreign key to another table but thats irrelevant
Basically, Customer is the customer master table (with CustomerID as primary key) and CustomerIdentification can have several pieces of identifications for a given customer. In other words, CustomerId in CustomerIdentification is a foriegn key to Customer table. A customer can have many pieces of identifications, each having a Identification_Number and Personal_ID_Type_Code (which is an integer that tells you whether the identification is a passport, sin, drivers license etc.).
Now, customer table has the following data: Customer_Id_Link is blank (empty string) at this point
CustomerId Customer_Id_Link
--------------------------------
'CU-1' <Blank>
'CU-2' <Blank>
'CU-3' <Blank>
'CU-4' <Blank>
'CU-5' <Blank>
and CustomerIdentification table has the following data:
CustomerId Identification_Number Personal_ID_Type_Code
------------------------------------------------------------
'CU-1' 'A' 1
'CU-1' 'A' 2
'CU-1' 'A' 3
'CU-2' 'A' 1
'CU-2' 'B' 3
'CU-2' 'C' 4
'CU-3' 'A' 1
'CU-3' 'B' 2
'CU-3' 'C' 4
'CU-4' 'A' 1
'CU-4' 'B' 2
'CU-4' 'B' 3
'CU-5' 'B' 3
Essentially, more than one customer can have same Identification_Number and Personal_ID_Type_Code in CustomerIdentification. When this happens, all Customer_Id_Link fields need to be updated with a common value (could be a GUID or whatever). But the processing for this is more complex.
Rules are these:
For matching Personal_ID_Type_Code and Identification_Number fields between Customer Records
- Compare the Identification_Number fields for all other common Personal_ID_Type_Code fields for all the Customer Records from the above match
- if true, then link the Customer Records
For example:
Match ID 1 A for CU-1, CU-2, CU-3, CU-4
Exception ID 2 mismatch (A on CU-1 vs B on CU-3)
No linkage done
Match ID 2 B for CU-3, CU-4
No ID mismatch
Link CU-3 and CU-4 (update Customer_Id_Link field with a common value in customer table for both)
Match ID 3 A for CU-1, CU-4
Exception ID 2 mismatch (A vs B)
No linkage done
Match ID 3 B for CU-2, CU-5
No ID mismatch
Link CU-2 and CU-5 (update Customer_Id_Link field with a common value in customer table for both) Match ID 4 C for CU-2, CU-3
CU-2 already linked, keep CU-5 to customer linking list
CU-3 already linked, keep CU-4 to customer linking list
Exception ID 3 mismatch (B on CU-2 vs A on CU-4)
No linkage done (previous linkage remains)
Any help will be appreciated. This has kept me awake for two days now, and I cant seem to be able to find the solution. Ideally, the solution will be a stored procedure that I can execute to do customer linking.
 - SQL Server 2008 R2 Standard 64 bit
UPDATE-------------------------------
I knew it was going to be tough to explain this problem, so I take the blame. But essentially, I want to be able to link all the customers that have same identificationNumbers, only, a customer can have more than 1 identificationNumber. Take example 1. 1 A (1 being Personal_id_type_code and A being identificationNumber exists for 4 different customers. CU-1, CU-2, CU-3, CU-4. So they could potentially be the same customer that exists 4 different times in customer table with different customer ID. We need to link them with 1 common value. However, CU-1 has 2 other identifications and if even 1 of them is different from the other 3 (CU-2, CU-3, CU-4) they are not the same customer. So ID 2 with Num A does not match with ID 2 for CU-3 (its B) and same for CU-4. Also, even though ID 2 num A does not exist in CU-2, CU-1's ID 3 and num A does not match with CU-2s ID 3 (its B). Therefore its not a match at all.
Next common Id's and num is 2-b which exists in CU-3 and CU-4. These two customers are in fact same cause both have ID 1 - A and ID 2 - B. ID 4 - C and ID 3 - A is irrelevant cause both IDs are different. Which essentially means this customer has 4 IDs I A, 2 B, 4 C and 3 A. So now we need to link this customer with a common unique value (guid) in customer table.
I hope I explained this very complicated issue now. It is tough to explain as this is a very unique problem.
I've changed your data model a bit to try and make it a bit more obvious what's going on..
CREATE TABLE [dbo].[Customer]
(
[CustomerName] VARCHAR(20) NOT NULL,
[CustomerLink] VARBINARY(20) NULL
)
CREATE TABLE [dbo].[CustomerIdentification]
(
[CustomerName] VARCHAR(20) NOT NULL,
[ID] VARCHAR(50) NOT NULL,
[IDType] VARCHAR(16) NOT NULL
)
And I've added some more test data..
INSERT [dbo].[Customer]
([CustomerName])
VALUES ('Fred'),
('Bob'),
('Vince'),
('Tom'),
('Alice'),
('Matt'),
('Dan')
INSERT [dbo].[CustomerIdentification]
VALUES
('Fred', 'A', 'Passport'),
('Fred', 'A', 'SIN'),
('Fred', 'A', 'Drivers Licence'),
('Bob', 'A', 'Passport'),
('Bob', 'B', 'Drivers Licence'),
('Bob', 'C', 'Credit Card'),
('Vince', 'A', 'Passport'),
('Vince', 'B', 'SIN'),
('Vince', 'C', 'Credit Card'),
('Tom', 'A', 'Passport'),
('Tom', 'B', 'SIN'),
('Tom', 'B', 'Drivers Licence'),
('Alice', 'B', 'Drivers Licence'),
('Matt', 'X', 'Drivers Licence'),
('Dan', 'X', 'Drivers Licence')
Is this what you're looking for:
;WITH [cteNonMatchingIDs] AS (
-- Pairs where the IDType is the same, but
-- name and ID don't match
SELECT ci3.[CustomerName] AS [CustomerName1],
ci4.[CustomerName] AS [CustomerName2]
FROM [dbo].[CustomerIdentification] ci3
INNER JOIN [dbo].[CustomerIdentification] ci4
ON ci3.[IDType] = ci4.[IDType]
WHERE ci3.[CustomerName] <> ci4.[CustomerName]
AND ci3.[ID] <> ci4.[ID]
),
[cteMatchedPairs] AS (
-- Pairs where the IDType and ID match, and
-- there aren't any non matching IDs for the
-- CustomerName
SELECT DISTINCT
ci1.[CustomerName] AS [CustomerName1],
ci2.[CustomerName] AS [CustomerName2]
FROM [dbo].[CustomerIdentification] ci1
LEFT JOIN [dbo].[CustomerIdentification] ci2
ON ci1.[CustomerName] <> ci2.[CustomerName]
AND ci1.[IDType] = ci2.[IDType]
WHERE ci1.[ID] = ISNULL(ci2.[ID], ci1.[ID])
AND NOT EXISTS (
SELECT 1
FROM [cteNonMatchingIDs]
WHERE ci1.[CustomerName] = [CustomerName1] -- correlated subquery
AND ci2.[CustomerName] = [CustomerName2]
)
AND ci1.[CustomerName] < ci2.[CustomerName]
),
[cteMatchedList] ([CustomerName], [CustomerNameList]) AS (
-- Turn the matched pairs into list of matching
-- CustomerNames
SELECT [CustomerName1],
[CustomerNameList]
FROM (
SELECT [CustomerName1],
CONVERT(VARCHAR(1000), '$'
+ [CustomerName1] + '$'
+ [CustomerName2]) AS [CustomerNameList]
FROM [cteMatchedPairs]
UNION ALL
SELECT [CustomerName2],
CONVERT(VARCHAR(1000), '$'
+ [CustomerName2]) AS [CustomerNameList]
FROM [cteMatchedPairs]
) [cteMatchedPairs]
UNION ALL
SELECT [cteMatchedList].[CustomerName],
CONVERT(VARCHAR(1000),[CustomerNameList] + '$'
+ [cteMatchedPairs].[CustomerName2])
FROM [cteMatchedList] -- recursive CTE
INNER JOIN [cteMatchedPairs]
ON RIGHT([cteMatchedList].[CustomerNameList],
LEN([cteMatchedPairs].[CustomerName1])
) = [cteMatchedPairs].[CustomerName1]
),
[cteSubstringLists] AS (
SELECT r1.[CustomerName],
r2.[CustomerNameList]
FROM [cteMatchedList] r1
INNER JOIN [cteMatchedList] r2
ON r2.[CustomerNameList] LIKE '%' + r1.[CustomerNameList] + '%'
),
[cteCustomerLink] AS (
SELECT DISTINCT
x1.[CustomerName],
HASHBYTES('SHA1', x2.[CustomerNameList]) AS [CustomerLink]
FROM (
SELECT [CustomerName],
MAX(LEN([CustomerNameList])) AS [MAX LEN CustomerList]
FROM [cteSubstringLists]
GROUP BY [CustomerName]
) x1
INNER JOIN (
SELECT [CustomerName],
LEN([CustomerNameList]) AS [LEN CustomerList],
[CustomerNameList]
FROM [cteSubstringLists]
) x2
ON x1.[MAX LEN CustomerList] = x2.[LEN CustomerList]
AND x1.[CustomerName] = x2.[CustomerName]
)
UPDATE c
SET [CustomerLink] = cl.[CustomerLink]
FROM [dbo].[Customer] c
INNER JOIN [cteCustomerLink] cl
ON cl.[CustomerName] = c.[CustomerName]
SELECT *
FROM [dbo].[Customer]

Problem with unique SQL query

I want to select all records, but have the query only return a single record per Product Name. My table looks similar to:
SellId ProductName Comment
1 Cake dasd
2 Cake dasdasd
3 Bread dasdasdd
where the Product Name is not unique. I want the query to return a single record per ProductName with results like:
SellId ProductName Comment
1 Cake dasd
3 Bread dasdasdd
I have tried this query,
Select distict ProductName,Comment ,SellId from TBL#Sells
but it is returning multiple records with the same ProductName. My table is not realy as simple as this, this is just a sample. What is the solution? Is it clear?
Select ProductName,
min(Comment) , min(SellId) from TBL#Sells
group by ProductName
If y ou only want one record per productname, you ofcourse have to choose what value you want for the other fields.
If you aggregate (using group by) you can choose an aggregate function,
htat's a function that takes a list of values and return only one : here I have chosen MIN : that is the smallest walue for each field.
NOTE : comment and sellid can come from different records, since MIN is taken...
Othter aggregates you might find useful :
FIRST : first record encountered
LAST : last record encoutered
AVG : average
COUNT : number of records
first/last have the advantage that all fields are from the same record.
SELECT S.ProductName, S.Comment, S.SellId
FROM
Sells S
JOIN (SELECT MAX(SellId)
FROM Sells
GROUP BY ProductName) AS TopSell ON TopSell.SellId = S.SellId
This will get the latest comment as your selected comment assuming that SellId is an auto-incremented identity that goes up.
I know, you've got an answer already, I'd like to offer a way that was fastest in terms of performance for me, in a similar situation. I'm assuming that SellId is Primary Key and identity. You'd want an index on ProductName for best performance.
select
Sells.*
from
(
select
distinct ProductName
from
Sells
) x
join
Sells
on
Sells.ProductName = x.ProductName
and Sells.SellId =
(
select
top 1 s2.SellId
from
Sells s2
where
x.ProductName = s2.ProductName
Order By SellId
)
A slower method, (but still better than Group By and MIN on a long char column) is this:
select
*
from
(
select
*,ROW_NUMBER() over (PARTITION BY ProductName order by SellId) OccurenceId
from sells
) x
where
OccurenceId = 1
An advantage of this one is that it's much easier to read.
create table Sale
(
SaleId int not null
constraint PK_Sale primary key,
ProductName varchar(100) not null,
Comment varchar(100) not null
)
insert Sale
values
(1, 'Cake', 'dasd'),
(2, 'Cake', 'dasdasd'),
(3, 'Bread', 'dasdasdd')
-- Option #1 with over()
select *
from Sale
where SaleId in
(
select SaleId
from
(
select SaleId, row_number() over(partition by ProductName order by SaleId) RowNumber
from Sale
) tt
where RowNumber = 1
)
order by SaleId
-- Option #2
select *
from Sale
where SaleId in
(
select min(SaleId)
from Sale
group by ProductName
)
order by SaleId
drop table Sale

Resources