I'm building a comment system in PostgreSQL that lets users comment on different entities that I already have (such as products, articles, photos, and so on), as well as "like" those comments. For the moment, I came up with this:
(note: the foreign key between comment_board and product/article/photo is very loose here. ref_id is just storing the id, which is used in conjunction with the comment_board_type to determine which table it is)
Obviously, this doesn't give good data integrity. What can I do to improve it? Also, I know every product/article/photo will need a comment_board. Does that mean I should add a comment_board_id column to each product/article/photo entity, such as this?:
I do recognize this SO solution, but it made me second-guess supertypes and the complexities of it: Database design - articles, blog posts, photos, stories
Any guidance is appreciated!
I ended up just pointing the comments directly to the product/photo/article tables. Here is what I came up with in total:
CREATE TABLE comment (
id SERIAL PRIMARY KEY,
created_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT (now()),
updated_at TIMESTAMP WITH TIME ZONE,
account_id INT NOT NULL REFERENCES account(id),
text VARCHAR NOT NULL,
-- commentable sections
product_id INT REFERENCES product(id),
photo_id INT REFERENCES photo(id),
article_id INT REFERENCES article(id),
-- constraint to make sure this comment appears in only one place
CONSTRAINT comment_entity_check CHECK(
(product_id IS NOT NULL)::INT
+
(photo_id IS NOT NULL)::INT
+
(article_id IS NOT NULL)::INT
= 1
)
);
CREATE TABLE comment_likes (
id SERIAL PRIMARY KEY,
created_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT (now()),
updated_at TIMESTAMP WITH TIME ZONE,
account_id INT NOT NULL REFERENCES account(id),
comment_id INT NOT NULL REFERENCES comment(id),
-- comments can only be liked once by an account.
UNIQUE(account_id, comment_id)
);
Resulting in:
This makes it so that I have to do one less join to an intermediary table. Also, it lets me add a field and update the constraints easily.
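To see the "exactly one target" check in action, here is a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in for PostgreSQL (so it runs without a server), drops the ::INT casts because SQLite already evaluates IS NOT NULL to 0 or 1, and leaves out the foreign keys and timestamps. The try_insert helper is hypothetical.

```python
import sqlite3

# Assumption: in-memory SQLite replaces PostgreSQL for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE comment (
        id         INTEGER PRIMARY KEY,
        text       TEXT NOT NULL,
        product_id INTEGER,
        photo_id   INTEGER,
        article_id INTEGER,
        -- the comment must point at exactly one commentable entity
        CONSTRAINT comment_entity_check CHECK (
            (product_id IS NOT NULL)
          + (photo_id   IS NOT NULL)
          + (article_id IS NOT NULL)
          = 1
        )
    )
""")

def try_insert(product_id, photo_id, article_id):
    """Return True if the row is accepted, False if the CHECK rejects it."""
    try:
        conn.execute(
            "INSERT INTO comment (text, product_id, photo_id, article_id) "
            "VALUES ('hello', ?, ?, ?)",
            (product_id, photo_id, article_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False

one_target = try_insert(7, None, None)     # exactly one parent set
two_targets = try_insert(7, 3, None)       # two parents set
no_target = try_insert(None, None, None)   # no parent set
print(one_target, two_targets, no_target)
```

Only the first insert should be accepted; the other two violate comment_entity_check.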
I am trying to model the following in a postgres db.
I have N 'datasets'. These datasets are things like survey results, national statistics, aggregated data, etc. Each has a name, a source institution, a method, and so on. This is the metadata of a dataset; I have created tables for it, along with tables for codifying the research methods etc. The 'root' metadata table is called 'Datasets'. Each row represents one dataset.
I then need to store and access the actual data associated with this dataset. So I need to create a table that contains that data. How do I represent the relationship between this table and its corresponding row in the 'Datasets' table?
An example:
'hea' is a set of survey responses. It is unaggregated, so each row is one survey response. I create a table called 'HeaData' that contains this data.
'cso' is a set of aggregated employment data. Each row is an economic sector. I create a table called 'CsoData' that contains this data.
I create a row for each of these in the 'Datasets' table with the relevant metadata, and they have ids of 1 and 2 respectively.
What is the best way to relate 1 to the HeaData table and 2 to the CsoData table?
I will eventually be accessing this data with Scala Slick, so if the database design could just 'plug and play' with Slick, that would be ideal.
Add a column to the Datasets table which designates which type of dataset it represents. Then a 1 may mean HEA and 2 may mean CSO. A check constraint would limit the field to one of the two values. If new types of datasets are added later, the only change needed is to change the constraint. If it is defined as a foreign key to a "type of dataset" table, you just need to add the new type of dataset there.
Form a unique index on the PK and the new field.
Add the same field to each of the subtables, but with a check constraint that limits the value in the HEA table to the HEA type and in the CSO table to the CSO type. Then make the pair (ID, new field) in each subtable a foreign key to the same pair in the Datasets table.
This limits the ID value to only one of the subtables and it must be the one defined in the Datasets table. That is, if you define a HEA dataset entry with an ID value of 1000 and the HEA type value, the only subtable that can contain an ID value of 1000 is the HEA table.
create table Datasets(
    ID      int generated always as identity, -- PostgreSQL; use your DBMS's auto-increment
    DSType  char( 3 ) not null check( DSType in( 'HEA', 'CSO' ) ),
    -- [everything else]
    constraint PK_Datasets primary key( ID ),
    constraint UQ_Dataset_Type unique( ID, DSType ) -- needed for references
);
create table HEA(
    ID      int not null,
    DSType  char( 3 ) not null check( DSType = 'HEA' ), -- making this a constant value
    -- [other HEA data]
    constraint PK_HEA primary key( ID ),
    constraint FK_HEA_Dataset_PK foreign key( ID )
        references Datasets( ID ),
    constraint FK_HEA_Dataset_Type foreign key( ID, DSType )
        references Datasets( ID, DSType )
);
The same idea with the CSO subtable.
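As a rough check that the composite foreign key really pins each ID to one subtable, here is a small sketch. It assumes Python's built-in sqlite3 module instead of PostgreSQL, lower-case table names, and a hypothetical accepts helper; the name column is invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
    CREATE TABLE datasets (
        id      INTEGER PRIMARY KEY,
        ds_type TEXT NOT NULL CHECK (ds_type IN ('HEA', 'CSO')),
        name    TEXT NOT NULL,               -- assumed metadata column
        UNIQUE (id, ds_type)                 -- target of the composite FKs
    );
    CREATE TABLE hea (
        id      INTEGER PRIMARY KEY,
        ds_type TEXT NOT NULL DEFAULT 'HEA' CHECK (ds_type = 'HEA'),
        FOREIGN KEY (id, ds_type) REFERENCES datasets (id, ds_type)
    );
    CREATE TABLE cso (
        id      INTEGER PRIMARY KEY,
        ds_type TEXT NOT NULL DEFAULT 'CSO' CHECK (ds_type = 'CSO'),
        FOREIGN KEY (id, ds_type) REFERENCES datasets (id, ds_type)
    );
    INSERT INTO datasets (id, ds_type, name)
    VALUES (1, 'HEA', 'hea survey'), (2, 'CSO', 'cso employment');
""")

def accepts(sql):
    """Return True if the statement succeeds, False on a constraint violation."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

hea_ok = accepts("INSERT INTO hea (id) VALUES (1)")           # matches (1, 'HEA')
cso_same_id = accepts("INSERT INTO cso (id) VALUES (1)")      # no (1, 'CSO') parent
hea_wrong_type = accepts("INSERT INTO hea (id, ds_type) VALUES (2, 'CSO')")
cso_ok = accepts("INSERT INTO cso (id) VALUES (2)")           # matches (2, 'CSO')
print(hea_ok, cso_same_id, hea_wrong_type, cso_ok)
```

The second insert fails on the foreign key and the third on the check constraint, which is the behaviour described above: dataset 1 can only ever live in the HEA subtable.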
I would recommend an HEA and CSO view that would show the complete dataset rows, metadata and type-specific data, joined together. With triggers on those views, they can be the DML points for the application code. Then the apps don't have to keep track of how that data is laid out in the database, making it a lot easier to make improvements should the opportunity present itself.
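A view with a trigger acting as the DML point might look roughly like the sketch below. It assumes SQLite (via Python's sqlite3) rather than PostgreSQL, where the same idea would use an INSTEAD OF trigger calling a trigger function. The view name hea_full, the trigger name, and the name and respondent columns are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE datasets (
        id      INTEGER PRIMARY KEY,
        ds_type TEXT NOT NULL,
        name    TEXT NOT NULL,
        UNIQUE (id, ds_type)
    );
    CREATE TABLE hea (
        id         INTEGER PRIMARY KEY,
        ds_type    TEXT NOT NULL DEFAULT 'HEA' CHECK (ds_type = 'HEA'),
        respondent TEXT NOT NULL,            -- assumed type-specific column
        FOREIGN KEY (id, ds_type) REFERENCES datasets (id, ds_type)
    );

    -- The view shows metadata and type-specific data as one row.
    CREATE VIEW hea_full AS
        SELECT d.id, d.name, h.respondent
        FROM datasets AS d
        JOIN hea AS h ON h.id = d.id;

    -- Inserts against the view are split across the two base tables.
    CREATE TRIGGER hea_full_insert INSTEAD OF INSERT ON hea_full
    BEGIN
        INSERT INTO datasets (id, ds_type, name) VALUES (NEW.id, 'HEA', NEW.name);
        INSERT INTO hea (id, respondent) VALUES (NEW.id, NEW.respondent);
    END;
""")

# Application code only ever touches the view.
conn.execute(
    "INSERT INTO hea_full (id, name, respondent) VALUES (1, 'hea survey', 'r-001')"
)
rows = conn.execute("SELECT id, name, respondent FROM hea_full").fetchall()
print(rows)
```

The application never has to supply the type value; the trigger fills it in, so the table layout can change later without touching the callers.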
I'm new to the hasAndBelongsToMany relationships in CakePHP. I'm trying to figure out the best way to implement the following scenario. Do I use hasAndBelongsToMany or hasMany through?
In my application I am creating Repair Orders. The Repair Orders have Op Codes, which are assigned to employees. There can be more than one Op Code per Repair Order. So, it's likely I'll have something similar to the following:
RepairOrder1 - OpCode1 completed by Employee1, OpCode2 completed by Employee2
From what I read in the manual, the reason you can't use HABTM relationships for this type of situation is that when saving the data, the original information is removed first. However, in 2.1 there is a 'unique' key that you can set to have it keep the existing information first. So do I create a HABTM or a hasMany through?
EDIT
Here's my schema:
/* repair_orders */
CREATE TABLE repair_orders (
id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
number VARCHAR(16), -- Auto-generated repair order number
created DATETIME DEFAULT NULL,
modified DATETIME DEFAULT NULL
);
/* Employees */
CREATE TABLE employees (
id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(16),
created DATETIME DEFAULT NULL,
modified DATETIME DEFAULT NULL
);
/* op_codes */
CREATE TABLE op_codes (
id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(16),
sale_material DECIMAL(10,2),
cost_material DECIMAL(10,2),
created DATETIME DEFAULT NULL,
modified DATETIME DEFAULT NULL
);
I am trying to assign more than one op code to the repair orders, and also keep track of which employee did that specific op code. So I can have a repair order with 3 op codes, each with a different employee. So I think that requires another table, something like:
/* repair_order_assignments */
CREATE TABLE repair_order_assignments (
id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
repair_order_id INT(16),
op_code_id INT(16),
employee_id INT(16)
);
So my associations would be:
Repair_Order_Assignment hasMany Employee, Op_Code
Employee belongsTo Repair_Order_Assignment
Op_Code belongsTo Repair_Order_Assignment
Repair_Order_Assignment hasAndBelongsToMany Repair_Order
Repair_Order hasAndBelongsToMany Repair_Order_Assignment
Then according to the manual, when hasAndBelongsToMany associations are saved, the association is deleted first. You would lose the extra data in the columns as it is not replaced in the new insert. However, it goes on to note that in 2.1 there is a unique variable that can be set to save the extra data. Will this setup work with the unique key or should I use a hasMany through approach?
I would think you'd use neither in this case:
OpCode belongsTo RepairOrder
OpCode belongsTo Employee
RepairOrder hasMany OpCode
Employee hasMany OpCode
When retrieving your data, use CakePHP's Containable behavior:
$this->RepairOrder->find('all', array(
    'contain' => array(
        'OpCode' => array(
            'Employee'
        )
    )
));
I have 3 tables:
Product (shampoo, toothpaste,..)
FrequenceOfUse (3 times/day, once/day,...)
User
As you can imagine I want to store in a database how much the users use products. What should be the relations between those tables?
Javi
It will be a many-to-many relationship between User and Product and one-to-many relationship between FrequenceOfUse and UserProduct (the latter being a relationship itself).
Note that there is no consensus on whether relationships can have relationships on their own. So some would prefer to model it as:
Entity User
Entity Product
Entity Usage
Entity FrequenceOfUse
1-N relationship: User participates in Usage
1-N relationship: Product participates in Usage
N-1 relationship: Usage is performed with Frequence
Both these models are relationally modeled as:
CREATE TABLE "user" (id INT NOT NULL PRIMARY KEY, name TEXT, ...);
CREATE TABLE product (id INT NOT NULL PRIMARY KEY, name TEXT, ...);
CREATE TABLE frequence (id INT NOT NULL PRIMARY KEY, description TEXT, ...);
CREATE TABLE usage
(
    "user" INT NOT NULL REFERENCES "user" (id),
    product INT NOT NULL REFERENCES product (id),
    frequence INT NOT NULL REFERENCES frequence (id),
    PRIMARY KEY ("user", product)
);
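For illustration, here is a small runnable sketch of that schema, assuming Python's built-in sqlite3 module and some invented sample rows. It shows the join that reports who uses what how often, and that the composite primary key allows only one frequency per user/product pair.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE "user"    (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE product   (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE frequence (id INTEGER PRIMARY KEY, description TEXT);
    CREATE TABLE usage (
        "user"    INTEGER NOT NULL REFERENCES "user" (id),
        product   INTEGER NOT NULL REFERENCES product (id),
        frequence INTEGER NOT NULL REFERENCES frequence (id),
        PRIMARY KEY ("user", product)
    );
    -- sample data, invented for the example
    INSERT INTO "user"    VALUES (1, 'alice');
    INSERT INTO product   VALUES (1, 'shampoo'), (2, 'toothpaste');
    INSERT INTO frequence VALUES (1, 'once/day'), (2, '3 times/day');
    INSERT INTO usage     VALUES (1, 1, 1), (1, 2, 2);
""")

report = conn.execute("""
    SELECT u.name, p.name, f.description
    FROM usage AS us
    JOIN "user"    AS u ON u.id = us."user"
    JOIN product   AS p ON p.id = us.product
    JOIN frequence AS f ON f.id = us.frequence
    ORDER BY p.name
""").fetchall()
print(report)

# A second row for the same user and product violates the primary key.
try:
    conn.execute("INSERT INTO usage VALUES (1, 1, 2)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(duplicate_rejected)
```

If a user should be able to change frequency over time, an UPDATE of the existing usage row fits this key; keeping a history would need a different key.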
You may find this post in my blog useful:
What is entity-relationship model?
I've got 3 relevant tables in my database.
CREATE TABLE dbo.[Group]
(
ID int NOT NULL,
Name varchar(50) NOT NULL
)
CREATE TABLE dbo.[User]
(
ID int NOT NULL,
Name varchar(50) NOT NULL
)
CREATE TABLE dbo.Ticket
(
ID int NOT NULL,
Owner int NOT NULL,
Subject varchar(50) NULL
)
Users belong to multiple groups. This is done via a many to many relationship, but irrelevant in this case. A ticket can be owned by either a group or a user, via the dbo.Ticket.Owner field.
What would be the MOST CORRECT way to describe this relationship between a ticket and, optionally, a user or a group?
I'm thinking that I should add a flag in the ticket table that says what type owns it.
You have a few options, all varying in "correctness" and ease of use. As always, the right design depends on your needs.
You could simply create two columns in Ticket, OwnedByUserId and OwnedByGroupId, and have nullable Foreign Keys to each table.
You could create M:M reference tables enabling both ticket:user and ticket:group relationships. Perhaps in future you will want to allow a single ticket to be owned by multiple users or groups? This design does not enforce that a ticket must be owned by a single entity only.
You could create a default group for every user and have tickets simply owned by either a true Group or a User's default Group.
Or (my choice) model an entity that acts as a base for both Users and Groups, and have tickets owned by that entity.
Here's a rough example using your posted schema:
create table dbo.PartyType
(
PartyTypeId tinyint primary key,
PartyTypeName varchar(10)
)
insert into dbo.PartyType
values(1, 'User'), (2, 'Group');
create table dbo.Party
(
PartyId int identity(1,1) primary key,
PartyTypeId tinyint references dbo.PartyType(PartyTypeId),
unique (PartyId, PartyTypeId)
)
CREATE TABLE dbo.[Group]
(
ID int primary key,
Name varchar(50) NOT NULL,
PartyTypeId as cast(2 as tinyint) persisted,
foreign key (ID, PartyTypeId) references Party(PartyId, PartyTypeID)
)
CREATE TABLE dbo.[User]
(
ID int primary key,
Name varchar(50) NOT NULL,
PartyTypeId as cast(1 as tinyint) persisted,
foreign key (ID, PartyTypeId) references Party(PartyID, PartyTypeID)
)
CREATE TABLE dbo.Ticket
(
ID int primary key,
[Owner] int NOT NULL references dbo.Party(PartyId),
[Subject] varchar(50) NULL
)
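To give a feel for how this model is used day to day, here is a hedged sketch that assumes SQLite through Python's sqlite3 module instead of SQL Server. The persisted computed columns are replaced by a DEFAULT plus a CHECK, the PartyType lookup table is folded into a CHECK, and add_owner is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE party (
        party_id      INTEGER PRIMARY KEY,
        party_type_id INTEGER NOT NULL CHECK (party_type_id IN (1, 2)), -- 1 = User, 2 = Group
        UNIQUE (party_id, party_type_id)
    );
    CREATE TABLE "user" (
        id            INTEGER PRIMARY KEY,
        name          TEXT NOT NULL,
        party_type_id INTEGER NOT NULL DEFAULT 1 CHECK (party_type_id = 1),
        FOREIGN KEY (id, party_type_id) REFERENCES party (party_id, party_type_id)
    );
    CREATE TABLE "group" (
        id            INTEGER PRIMARY KEY,
        name          TEXT NOT NULL,
        party_type_id INTEGER NOT NULL DEFAULT 2 CHECK (party_type_id = 2),
        FOREIGN KEY (id, party_type_id) REFERENCES party (party_id, party_type_id)
    );
    CREATE TABLE ticket (
        id      INTEGER PRIMARY KEY,
        owner   INTEGER NOT NULL REFERENCES party (party_id),
        subject TEXT
    );
""")

def add_owner(table, type_id, name):
    """Insert the party row first, then the subtype row that shares its id."""
    party_id = conn.execute(
        "INSERT INTO party (party_type_id) VALUES (?)", (type_id,)
    ).lastrowid
    conn.execute(f'INSERT INTO "{table}" (id, name) VALUES (?, ?)', (party_id, name))
    return party_id

alice = add_owner("user", 1, "alice")
support = add_owner("group", 2, "support")
conn.execute("INSERT INTO ticket (owner, subject) VALUES (?, 'printer jam')", (alice,))
conn.execute("INSERT INTO ticket (owner, subject) VALUES (?, 'vpn down')", (support,))

# A party registered as a user cannot also be inserted as a group.
try:
    conn.execute('INSERT INTO "group" (id, name) VALUES (?, ?)', (alice, "oops"))
    wrong_subtype_rejected = False
except sqlite3.IntegrityError:
    wrong_subtype_rejected = True

# User and group ids never collide, so one of the two joins matches per ticket.
owners = conn.execute("""
    SELECT t.subject, COALESCE(u.name, g.name) AS owner_name
    FROM ticket AS t
    LEFT JOIN "user"  AS u ON u.id = t.owner
    LEFT JOIN "group" AS g ON g.id = t.owner
    ORDER BY t.id
""").fetchall()
print(owners, wrong_subtype_rejected)
```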
The first option in @Nathan Skerl's list is what was implemented in a project I once worked with, where a similar relationship was established between three tables. (One of them referenced two others, one at a time.)
So, the referencing table had two foreign key columns, and also it had a constraint to guarantee that exactly one table (not both, not neither) was referenced by a single row.
Here's how it could look when applied to your tables:
CREATE TABLE dbo.[Group]
(
ID int NOT NULL CONSTRAINT PK_Group PRIMARY KEY,
Name varchar(50) NOT NULL
);
CREATE TABLE dbo.[User]
(
ID int NOT NULL CONSTRAINT PK_User PRIMARY KEY,
Name varchar(50) NOT NULL
);
CREATE TABLE dbo.Ticket
(
ID int NOT NULL CONSTRAINT PK_Ticket PRIMARY KEY,
OwnerGroup int NULL
CONSTRAINT FK_Ticket_Group FOREIGN KEY REFERENCES dbo.[Group] (ID),
OwnerUser int NULL
CONSTRAINT FK_Ticket_User FOREIGN KEY REFERENCES dbo.[User] (ID),
Subject varchar(50) NULL,
CONSTRAINT CK_Ticket_GroupUser CHECK (
CASE WHEN OwnerGroup IS NULL THEN 0 ELSE 1 END +
CASE WHEN OwnerUser IS NULL THEN 0 ELSE 1 END = 1
)
);
As you can see, the Ticket table has two columns, OwnerGroup and OwnerUser, both of which are nullable foreign keys. (The respective columns in the other two tables are made primary keys accordingly.) The CK_Ticket_GroupUser check constraint ensures that only one of the two foreign key columns contains a reference (the other being NULL, that's why both have to be nullable).
(The primary key on Ticket.ID is not necessary for this particular implementation, but it definitely wouldn't harm to have one in a table like this.)
Another approach is to create an association table that contains columns for each potential resource type. In your example, each of the two existing owner types has their own table (which means you have something to reference). If this will always be the case you can have something like this:
CREATE TABLE dbo.[Group]
(
ID int NOT NULL,
Name varchar(50) NOT NULL
)
CREATE TABLE dbo.[User]
(
ID int NOT NULL,
Name varchar(50) NOT NULL
)
CREATE TABLE dbo.Ticket
(
ID int NOT NULL,
Owner_ID int NOT NULL,
Subject varchar(50) NULL
)
CREATE TABLE dbo.Owner
(
ID int NOT NULL,
User_ID int NULL,
Group_ID int NULL,
{{AdditionalEntity_ID}} int NULL
)
With this solution, you would continue to add new columns as you add new entities to the database, and you would delete and recreate the foreign key constraint pattern shown by @Nathan Skerl. This solution is very similar to @Nathan Skerl's but looks different (it comes down to preference).
If you are not going to have a new Table for each new Owner type then maybe it would be good to include an owner_type instead of a foreign key column for each potential Owner:
CREATE TABLE dbo.[Group]
(
ID int NOT NULL,
Name varchar(50) NOT NULL
)
CREATE TABLE dbo.[User]
(
ID int NOT NULL,
Name varchar(50) NOT NULL
)
CREATE TABLE dbo.Ticket
(
ID int NOT NULL,
Owner_ID int NOT NULL,
Owner_Type varchar(50) NOT NULL, -- In our example, this would be "User" or "Group"
Subject varchar(50) NULL
)
With the above method, you could add as many owner types as you want. Owner_ID would not have a foreign key constraint but would be used as a reference to the other tables. The downside is that you would have to look at the data to see which owner types exist, since it isn't immediately obvious from the schema. I would only suggest this if you don't know the owner types beforehand and they won't be linking to other tables. If you do know the owner types beforehand, I would go with a solution like @Nathan Skerl's.
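Since nothing stops Owner_ID from pointing at a row that does not exist, it may be worth running an orphan check now and then. The sketch below assumes SQLite via Python's sqlite3 module, lower-case names, and invented sample rows; it only detects dangling owners, it does not prevent them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE "user"  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE "group" (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE ticket (
        id         INTEGER PRIMARY KEY,
        owner_id   INTEGER NOT NULL,   -- no foreign key: the target table varies
        owner_type TEXT NOT NULL,      -- 'User' or 'Group'
        subject    TEXT
    );
    INSERT INTO "user" VALUES (1, 'alice');
    INSERT INTO ticket VALUES (1, 1,  'User',  'printer jam');
    INSERT INTO ticket VALUES (2, 99, 'Group', 'vpn down');  -- accepted, but group 99 is missing
""")

# Tickets whose owner row cannot be found in the table named by owner_type.
orphans = conn.execute("""
    SELECT t.id
    FROM ticket AS t
    WHERE (t.owner_type = 'User'
           AND NOT EXISTS (SELECT 1 FROM "user" AS u WHERE u.id = t.owner_id))
       OR (t.owner_type = 'Group'
           AND NOT EXISTS (SELECT 1 FROM "group" AS g WHERE g.id = t.owner_id))
    ORDER BY t.id
""").fetchall()
print(orphans)
```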
Sorry if I got some SQL wrong, I just threw this together.
Yet another option is to have, in Ticket, one column specifying the owning entity type (User or Group) and a second column with the referenced User or Group id, and NOT to use foreign keys but instead rely on a trigger to enforce referential integrity.
Two advantages I see here over Nathan's excellent model (above):
More immediate clarity and simplicity.
Simpler queries to write.
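A trigger of that kind could be sketched as follows. This assumes SQLite through Python's sqlite3 module rather than SQL Server, and the trigger and column names are hypothetical. It only covers inserts into ticket; a complete version would also need triggers for updates to ticket and for deletes from the user and group tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE "user"  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE "group" (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE ticket (
        id         INTEGER PRIMARY KEY,
        owner_type TEXT NOT NULL CHECK (owner_type IN ('User', 'Group')),
        owner_id   INTEGER NOT NULL,
        subject    TEXT
    );

    -- Reject a ticket whose owner row does not exist in the matching table.
    CREATE TRIGGER ticket_owner_exists BEFORE INSERT ON ticket
    BEGIN
        SELECT RAISE(ABORT, 'ticket owner does not exist')
        WHERE (NEW.owner_type = 'User'
               AND NOT EXISTS (SELECT 1 FROM "user" WHERE id = NEW.owner_id))
           OR (NEW.owner_type = 'Group'
               AND NOT EXISTS (SELECT 1 FROM "group" WHERE id = NEW.owner_id));
    END;

    INSERT INTO "user"  VALUES (1, 'alice');
    INSERT INTO "group" VALUES (1, 'support');
""")

def try_ticket(owner_type, owner_id):
    """Return True if the ticket is stored, False if the trigger aborts it."""
    try:
        conn.execute(
            "INSERT INTO ticket (owner_type, owner_id, subject) VALUES (?, ?, 'x')",
            (owner_type, owner_id),
        )
        return True
    except sqlite3.DatabaseError:
        return False

existing_user = try_ticket("User", 1)
existing_group = try_ticket("Group", 1)
missing_group = try_ticket("Group", 42)
print(existing_user, existing_group, missing_group)
```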
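You can also use an enum to identify whether the owner is a user or a group, like this (CREATE TYPE ... AS ENUM is PostgreSQL syntax; SQL Server has no native enum type, so a CHECK constraint would play that role there):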
CREATE TABLE dbo.Group
(
ID int NOT NULL,
Name varchar(50) NOT NULL
)
CREATE TABLE dbo.User
(
ID int NOT NULL,
Name varchar(50) NOT NULL
)
CREATE TYPE Enum_OwnerType AS ENUM ('Group', 'User');
CREATE TABLE dbo.Ticket
(
ID int NOT NULL,
Owner int NOT NULL,
OwnerType Enum_OwnerType NOT NULL,
Subject varchar(50) NULL
)
Maybe it's no better than any of proposed solutions, it might not offer any advantage. In fact, I think that this might require altering Enum_OwnerType and even ticket in order to change OwnerType, I guess... I hope it's useful anyway.
I have many cases like this, and I just use a polymorphic association, as below.
Example:
I have a turnovers table with the columns id, amount, and user_id, and I need to know what each record refers to. So I add two fields, table_id and table_type, and my final turnovers table has the columns id, amount, user_id, table_id, table_type.
If the new record is about an order, it is inserted like this:
[1,25000,2,22,order]
and if the new record is about a credit increase, like this:
[1,25000,2,23,credit]
Note:
With M:M tables, it takes a lot of time to retrieve the records.
As for my approach:
The con is that the number of records in the turnovers table keeps growing.
The pros are more flexibility for new kinds of records, plus readability and searchability.
nathan_jr's 4th option (model an entity that acts as a base for both Users and Groups, and have tickets owned by that entity) doesn't enforce referential integrity on PartyId. You'd have to do that in the application layer, which invites all sorts of trouble. You can't really call it an antipattern when Django's GenericForeignKey implements the same solution, but no doubt you can design something more robust and performant using your framework's ORM (using something like Django's multi-table inheritance).
CREATE TABLE dbo.OwnerType
(
ID int identity(1,1) NOT NULL primary key,
Name varchar(50) NULL
)
insert into OwnerType (Name) values ('User');
insert into OwnerType (Name) values ('Group');
I think that would be the most general way to represent what you want instead of using a flag.
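One way the lookup table could be wired to tickets is sketched below, assuming SQLite via Python's sqlite3 module. The owner_type_id column on ticket is my assumption; the answer above only shows the OwnerType table itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE owner_type (
        id   INTEGER PRIMARY KEY,      -- auto-assigned, like an identity column
        name TEXT NOT NULL UNIQUE
    );
    INSERT INTO owner_type (name) VALUES ('User');
    INSERT INTO owner_type (name) VALUES ('Group');

    -- Hypothetical ticket table: owner_type_id is an assumed column.
    CREATE TABLE ticket (
        id            INTEGER PRIMARY KEY,
        owner         INTEGER NOT NULL,
        owner_type_id INTEGER NOT NULL REFERENCES owner_type (id),
        subject       TEXT
    );
""")

# Map type names to their generated ids.
type_ids = dict(conn.execute("SELECT name, id FROM owner_type"))

conn.execute(
    "INSERT INTO ticket (owner, owner_type_id, subject) VALUES (1, ?, 'printer jam')",
    (type_ids["User"],),
)

# An owner type that is not in the lookup table is refused by the foreign key.
try:
    conn.execute(
        "INSERT INTO ticket (owner, owner_type_id, subject) VALUES (1, 99, 'bad type')"
    )
    unknown_type_rejected = False
except sqlite3.IntegrityError:
    unknown_type_rejected = True
print(type_ids, unknown_type_rejected)
```

Adding a new owner type is then a plain INSERT into owner_type, with no schema change. Note that the owner column itself still has no foreign key in this sketch.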