How to ensure uniqueness in many-to-many relationship table? - database

Users have many roles, roles have many users.
In USERS_ROLES table, have 3 columns: USERS_ROLES_ID, USER_ID, ROLE_ID
Usually USERS_ROLES_ID is just sequentially generated. Someone told me it's supposed to guarantee that user_id and role_id cross product are unique, so the primary key USERS_ROLES_ID should actually be some sort of combination of both USER_ID and ROLE_ID. How is this done, usually? (for example, USER_ID * (big number here) + ROLE_ID)?? Every example I could find uses a naive sequential primary key generation of the many-to-many join table.

Having a sequentially generated USERS_ROLE_ID primary key will not guarantee a unique combination of USER_ID and ROLE_ID. Adding a unique index on (USER_ID, ROLE_ID) will.

Gerrat is right. I found the full answer here: http://weblogs.sqlteam.com/jeffs/archive/2007/08/23/composite_primary_keys.aspx
Create table CustomerProducts
(
Customer_ProductID int identity primary key,
CustomerID int references Customers(CustomerID) not null,
ProductID int references Products(ProductID) not null,
OrderLimit int not null
)
This is what I see in perhaps most of the databases that I’ve worked
with over the years. The reason for designing a table in this manner?
Honestly, I don’t know! I can only surmise that it is because of the
lack of understanding what a primary key of a table really is, and
that it can be something other than an identity and that it can be
comprised of more than just a single column. As I mentioned, it seems
that many database architects are simply not aware of this fact.
Consider instead the following design:
Create table CustomerProducts (
CustomerID int references Customers(CustomerID) not null,
ProductID int references Products(ProductID) not null,
OrderLimit int not null,
Primary key (CustomerID, ProductID) )
Notice here that we have eliminated the identity column, and have
instead defined a composite (multi-column) primary key as the
combination of the CustomerID and ProductID columns. Therefore, we do
not have to create an additional unique constraint. We also do not
need an additional identity column that really serves no purpose. We
have not only simplified our data model physically, but we’ve also
made it more logically sound and the primary key of this table
accurately explains what it is this table is modeling – the
relationship of a CustomerID to a ProductID.

Related

Creating a foreign key against a composite key in MS SQL Server

I'm trying to create a foreign key between two tables. Problem is one of those tables has a composite primary key..
My tables are products (one row per product) and product_price_history (many rows per product).
I have a composite key in product_price_history, which is product id and start date of a specific price for that product.
Here's my code :
CREATE TABLE products (
product_id INT IDENTITY(1,1) PRIMARY KEY,
product_name VARCHAR(50) NOT NULL,
product_desc VARCHAR(255) NULL,
product_group_id INT
)
CREATE TABLE product_price_history (
product_id INT NOT NULL,
start_date DATE NOT NULL,
end_date DATE NULL,
price NUMERIC (6,2) NOT NULL
)
ALTER TABLE product_price_history
ADD CONSTRAINT pk_product_id_start_dt
PRIMARY KEY (product_id,start_date)
Now I'm trying to create a foreign key between the products table and the product_price_history table but I can't because its a composite key.
Also it doesn't make sense to add the start date (the other part of the foreign key) to the products table.
What's the best way to deal with this? Can I create a foreign key between these tables? Do I even NEED a foreign key?
My intentions here are
to enforce uniqueness of the product price information. A product can only have one price at any time.
to link these two tables so there's a logical join between them, and I can show this in a database diagram
The foreign key on the product_price_history table should only include product_id. Your target is to ensure that any entry product_price_history already has "parent" entry in products. That has nothing to do with start_date.
The way I see this situation, in theory, fully normalized version of the tables would have to have current_price as unique value in products table. And the product_price_history is simply a log table.
It's not necessary to do it this way, with a physical field, but thinking from this perspective helps to see where your tables model is slightly de-normalized.
Also, if you make product_price_history table anything but simple log table, how do you ensure that new start_date is newer than previous end_date? You can't even express that as a primary key. What if you edit start_date later? I would even think to create different compaund key for product_price_history table. Perhaps product_id+insert_date or only auto-increment id, while still keeping foreign key relationship to the products.product_id.

Foreign key to table A and B, where A already have a foreign key to B

Suppose there is a table called Accounts:
CREATE TABLE Accounts
(
[Id] int not null primary key identity(1,1)
[Username] varchar(20) not null unique,
[Password] varchar(20) not null
)
Then, there is another tabled called Characters. Each account can have N characters. So I can use a foreign key to link these characters.
CREATE TABLE Characters
(
[AccountId] int not null foreign key references Accounts([Id]),
[Id] int not null primary key identity(1,1),
[Nickname] varchar(20) not null unique,
[Level] int not null default 0,
)
Each character can have multiple equipments (inventory), so there is a Equipments table.
Since each equipment is linked to a character, I should use foreign key again, and there comes the problem.
Me and my coworker were arguing about which foreign key to use.
Since each character has a unique Id, I told him that we could use foreign key to that Id and that would be enough. As follows:
CREATE TABLE Equipments
(
[CharId] int not null foreign key references Characters([Id]),
[ItemId] int not null
)
He told me that we must use a foreign key to the character id AND the account id, as follows:
CREATE TABLE Equipments
(
[AccountId] int not null foreign key references Accounts([Id]), /*is this necessary?*/
[CharId] int not null foreign key references Characters([Id]),
[ItemId] int not null
)
I'm not expert in Sql Server and in my opinion, the foreign key to the account id is completely unecessary but he keeps telling me that we must use it and it will help performance because the more foreign key you use, it will be better.
So, should I use foreign key to account id and character id or character id is good enough?
As you said, there is a one-to-many relationship between Account and Character (and hence, a character cannot belong to more than one account).
Similarly, as you described, each record in Equipments only corresponse to a unique record in Characters. The relation from Account to Equipments hence can be inferred, and so, there is no need to create an extra column in the Equipments table. Also, the data integrity is preserved just by the two foreign keys already created, so that should not be a problem when you go without the AccountId column in the Equipments table.
Regarding the performance argument, this is a case-by-case situation, and it depends on a lot of other things (number of records, business logic,...). Having unnecessary foreign key can even hurt performance since the database/server will need to maintain that foreign key while operate. Also, I found that if you do not have the key and when you find out that you need it, it is easier to add one in than to remove an existing one, especially when you have to create a whole new column for this one (this last piece is a mere personal opinion).
You should use it only if you plan to interrogate equipment directly for an account which is faster than joining with account via char. Otherwise, no, you shouldn't use it.
You are correct, but for a more important reason.
If you include Accountid in the Equipments table, then you have a second relationship to the Accounts table. Perhaps this is allowed, but in all likelihood, you intend to have the Characters.AccountId be the account id for a row in Equipments.
You would then get the appropriate account id by using a join to the Equipments table.

composite vs surrogate primary key

I am designing a database with the following requirements:
An organization can exist on its own
An organization can have any number of distinct terms (date range)
An organization can have any number of survey types (student, teacher, parent, etc)
A survey form is assigned a term and survey type
A structure for this might be:
Organization
- OrganizationId INT IDENTITY(1,1) NOT NULL PRIMARY KEY
Term
- TermId INT IDENTITY(1,1) NOT NULL PRIMARY KEY
- OrganizationId INT NOT NULL REFERENCES Organization(OrganizationId)
SurveyType
- SurveyTypeId IDENTITY(1,1) NOT NULL PRIMARY KEY
- OrganizationId INT NOT NULL REFERENCES Organization(OrganizationId)
SurveyForm
- SurveyFormId INT IDENTITY(1,1) NOT NULL PRIMARY KEY
- SurveyTypeId INT NOT NULL REFERENCES SurveyType(SurveyTypeId)
- TermId INT NOT NULL REFERENCES Term(TermId)
That structure keeps with what seems to be a popular emphasis on a single surrogate primary key. However that structure sacrifices data integrity because it is very easy for a SurveyForm record to have a TermId or SurveyTypeId from different Organizations.
To address data integrity, it would seem you would have to add OrganizationId and use it in the composite keys (OrganizationId, SurveyTypeId) and (OrganizationId, TermId). That is somewhat tolerable in this example but as the schema becomes more complete, the composite key sizes increase.
So my question is, how do people generally approach this now (most references online are from 2008 when I think its possible there were different database design concerns)? As a corollary, when is it acceptable to add foreign keys to a table to reduce the number of tables joined for common expressions?
Academically speaking, you can migrate the Organization key along both lineages. That's just 4 bytes, after all:
create table dbo.Organization (
OrganizationId INT IDENTITY(1,1) PRIMARY KEY
);
go
create table dbo.Term (
TermId INT IDENTITY(1,1) NOT NULL,
OrganizationId INT NOT NULL REFERENCES dbo.Organization(OrganizationId),
primary key (OrganizationId, TermId)
);
go
create table dbo.SurveyType (
SurveyTypeId int IDENTITY(1,1) NOT NULL,
OrganizationId INT NOT NULL REFERENCES dbo.Organization(OrganizationId),
primary key (OrganizationId, SurveyTypeId)
);
go
create table dbo.SurveyForm (
SurveyFormId INT IDENTITY(1,1) NOT NULL,
OrganizationId int not null,
SurveyTypeId INT NOT NULL,
TermId INT NOT NULL,
primary key (OrganizationId, SurveyTypeId, TermId),
foreign key (OrganizationId, TermId) references dbo.Term (OrganizationId, TermId),
foreign key (OrganizationId, SurveyTypeId) references dbo.SurveyType (OrganizationId, SurveyTypeId)
);
go
These tables definitely violate some NF, I don't remember which one exactly, but I'm sure you can handle it yourself.
While this design approach can almost be considered a must for a warehouse (esp. if you aggregate data from different sources), I would never recommend it for any real-life OLTP. Much simpler solution would be:
Perform all modifications via a stored procedure, which will have proper checks against this kind of possible discrepancy.
Make sure that no user would have permissions to directly add / modify data in the dbo.SurveyForm, thus circumventing the business rules implemented in the aforementioned SP.
I think there could be a way to avoid circular references, firstly by defining who really depends on who and removing redundant dependencies.
The question is... are Organizations allowed to be randomly associated to Terms without caring about any Survey association? I wonder if Organizations really need to be associated to a Term directly or indirectly through Surveys. If, for example, an Organization CANNOT be associated to a Term that is not associated to the Organization's Survey then the Organization-Term relationship is useless, if it is the other way around, then the Organization-SurveyType is not needed

Database design - composite key relationship issue

I had posted a similar question before, but this is more specific. Please have a look at the following diagram:
The explanation of this design is as follows:
Bakers produce many Products
The same Product can be produced by more than one Baker
Bakers change their pricing from time-to-time for certain (of their) Products
Orders can be created, but not necessarily finalised
The aim here is to allow the store manager to create an Order "Basket" based on whatever goods are required, and also allow the system being created to determine the best price at that time based on what Products are contained within the Order.
I therefore envisaged the ProductOrders table to initially hold the productID and associated orderID, whilst maintaining a null (undetermined) value for bakerID and pricingDate, as that would be determined and updated by the system, which would then constitute a finalised order.
Now that you have an idea of what I am trying to do, please advise me on how to to best set these relationships up.
Thank you!
If I understand correctly, an unfinalised order is not yet assigned a baker / pricing (meaning when an order is placed, no baker has yet been selected to bake the product).
In which case, the order is probably placed against the Products Table and then "Finalized" against the BakersProducts table.
A solution could be to give ProductsOrders 2 separate "ProductID's", one being for the original ordered ProductId (i.e. Non Nullable) - say ProductId, and the second being part of the Foreign key to the assigned BakersProducts (say ProductId2). Meaning that in ProductsOrders, the composite foreign keys BakerId, ProductId2 and PricingDate are all nullable, as they will only be set once the order is Finalized.
In order to remove this redundancy, what you might also consider is using surrogate keys instead of the composite keys. This way BakersProducts would have a surrogate PK (e.g. BakersProductId) which would then be referenced as a nullable FK in ProductsOrders. This would also avoid the confusion with the Direct FK in ProductsOrders to Product.ProductId (which from above, was the original Product line as part of the Order).
HTH?
Edit:
CREATE TABLE dbo.BakersProducts
(
BakerProductId int identity(1,1) not null, -- New Surrogate PK here
BakerId int not null,
ProductId int not null,
PricingDate datetime not null,
Price money not null,
StockLevel bigint not null,
CONSTRAINT PK_BakerProducts PRIMARY KEY(BakerProductId),
CONSTRAINT FK_BakerProductsProducts FOREIGN KEY(ProductId) REFERENCES dbo.Products(ProductId),
CONSTRAINT FK_BakerProductsBaker FOREIGN KEY(BakerId) REFERENCES dbo.Bakers(BakerId),
CONSTRAINT U_BakerProductsPrice UNIQUE(BakerId, ProductId, PricingDate) -- Unique Constraint mimicks the original PK for uniqueness ... could also use a unique index
)
CREATE TABLE dbo.ProductOrders
(
OrderId INT NOT NULL,
ProductId INT NOT NULL, -- This is the original Ordered Product set when order is created
BakerProductId INT NULL, -- This is nullable and gets set when Order is finalised with a baker
OrderQuantity BIGINT NOT NULL,
CONSTRAINT FK_ProductsOrdersBakersProducts FOREIGN KEY(BakersProductId) REFERENCES dbo.BakersProducts(BakerProductId)
.. Other Keys here
)

When should I combine two foreign keys as a single foreign key?

My teach said I should combine two foreign keys into a single primary key. But my thought process is that that would allow for only one combination of each foreign key.
Imagine I have a Product, Purchase, PurchaseDetail.
In PurchaseDetail I have two foreign keys, one for product and one for purchase. My teacher said that I should combine these two foreign keys into a single one. But can't a product be in many different purchases? And many purchases have many products?
I'm confused.
Thanks!
Edit: This is the SQL my teacher saw and then gave feedback upon. Thanks for the guidance guys. (I changed the essential to English)
create table Purchase
(
ID int primary key identity(1,1),
IDCliente int foreign key references Cliente(ID),
IDEmpleado int foreign key references Empleado(ID),
Fecha datetime not null,
Hora datetime not null,
Amount float not null,
)
create table PurchaseDetail
(
ID int primary key identity(1,1),
IDPurchase int foreign key references Purchase(ID),
IDProductOffering int foreign key references ProductOffering(ID),
Quantity int not null
)
create table Product
(
ID int primary key identity(1,1),
IDProveedor int foreign key references Proveedor(ID),
Nombre nvarchar(256) not null,
IDSubcategoria int foreign key references Subcategoria(ID),
IDMarca int foreign key references Marca(ID),
Fotografia nvarchar(1024) not null
)
create table ProductOffering
(
ID int primary key identity(1,1),
IDProduct int foreign key references Product(ID),
Price float not null,
OfferDate datetime not null,
)
Maybe I'm confused about good database schema design. Thanks again!
I imagine he's suggesting:
Product - one primary key (product id), which implies a unique product id
Purchase - one primary key (purchase id), which implies a unique purchase id
PurchaseDetail - two foreign keys (product id),(purchase id), plus one unique constraint on (product id + purchase id)
Plus some people argue that all tables should have their own primary key that doesn't depend on anything else (purchase detail id). Some DBMS make this mandatory.
This means that you can't have two rows in PurchaseDetail that have the same product and purchase. That makes sense, assuming there is also a quantity column on PurchaseDetail, so that one purchase can have more than one of each product.
Note that there is a difference between a unique constraint and a foreign key. A foreign key merely says that there should be an item with that id in the parent table - it will let you create as many references to that item as you want in the child table. You need to specify that the column or combination of columns are unique if you want to avoid duplicates. A primary key on the other hand implies a unique constraint.
Exact syntax for defining all of this varies by language, but those are the principles.
I don't agree with the single key, but they could be a compound key (which I tend to dislike). They can be two different fields each restricted to the ID in the corresponding tables.
Not sure why the same product iD would need to be listed more than once for a single purchase? Isn't that why you indicate quantity? Maybe the need to do a separate line item for a purchase and a discount?
I believe thelem has answered correctly. But there is another option. You could add a new primary key column to the details table, so it looks like this:
detail_id int (PK)
product_id int (FK)
purchsae_id int (FK)
This is not really necessary, but it could be useful if you need to ever need to reference the details table as a foreign key - having a single primary key field makes for smaller indexes and foreign key reference (and they are a little easier to type).
That depends on what data you need to represent.
If you use the two foreign keys as the primary key for the purchase detail, a product may only occur once in each purchase. A purchase may however still contain many products, and a product may still occur in many purchases.
If the purchase detail contains more information, you may need to be able to use a product more than once in a purchase. For example if the purchase detail contains size and color, and you want to by a red T-shirt size XL and a blue T-shirt size S.
Perhaps he is suggesting a many-to-many table where it's Primary Key is comprised of the Foreign Keys to the mapped tables:
PurchaseDetail:
ProductId int (FK)
PurchaseId int (FK)
PK(ProductId, PurchaseId)
This can also be modelled as
PurchaseDetail:
PurchaseDetailId int (PK, Identity)
ProductId int (FK)
PurchaseId int (FK)
The second form is useful if you want to refer to Purchase details elsewhere in your model, and also in some RDBMS's it is beneficial to have a PK on a montonically increasing integer.

Resources