While designing an ER schema for a simple database, I ran into the following problem:
I get a cycle in the diagram, and I don't know whether it is redundant or whether I could eliminate it somehow.
Here is the problem in broad terms:
The Visit entity records visits to London by a vehicle. It contains information on the vehicle's arrival, departure and total visit time.
The Vehicle entity contains information on the vehicle's place of origin, CO2 emissions and number plate.
The Date entity stores, for each date, the day of the week it falls on and the name of the holiday, if any, in each region that has been added.
The region of the Date entity is matched to the region of the Vehicle entity. The entry_date/end_date of the Visit entity is matched to the date of the Date entity. Finally, the number plate of the Vehicle entity is matched to the number plate of the Visit entity. This produces the cycle I mentioned at the beginning.
The ER diagram is as follows:
If there are any questions about the problem that I have not explained, please do not hesitate to ask me. I welcome suggestions for improving the ER diagram, either to remove the cycle or to simply keep it as it is if you think it is correct.
My two cents -
"Date" is really not a good name for an entity or table.
First, it is too general to convey what you really refer to.
Second, it is a reserved keyword in most common languages, so it just causes unnecessary trouble when programming.
You use "Date" to get the holiday name (for a particular region) and the weekday, right?
My suggestion is that you only need to store holidays in this table, because the weekday can be worked out in most common programming languages.
This "Date" table is just a lookup table to help you find holidays; you do not need to enforce a relationship between "Date" and Visit.
I'd also suggest you add a Region table to enforce consistent naming.
Here is the DB diagram; I renamed "date" to "holiday".
Here is the SQL Server implementation -
create table region (
region_code varchar(100) primary key
,region_name varchar(100)
)
create table holiday (
holiday_date date not null
,region_code varchar(100) not null
,holiday_name varchar(100) not null
)
alter table holiday add primary key (holiday_date, region_code)
alter table holiday add foreign key (region_code) references region (region_code)
create table vehicle (
number_plate varchar(100) primary key
,region_code varchar(100) not null
,CO2_emission varchar(100)
)
alter table vehicle add foreign key (region_code) references region (region_code)
create table visit (
number_plate varchar(100) not null
,entry_date date not null
,end_date date
)
alter table visit add primary key (number_plate, entry_date)
alter table visit add foreign key (number_plate) references vehicle (number_plate)
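To illustrate the lookup-table idea above, here is a minimal sketch that runs against SQLite through Python's sqlite3 module rather than SQL Server. The column types are simplified, the region table is left out, and the sample rows are hypothetical. It shows the holiday being found with a plain join on date and region, and the weekday being computed from the date rather than stored.

```python
import sqlite3
from datetime import date

WEEKDAYS = ("Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday")

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE holiday (
        holiday_date TEXT NOT NULL,
        region_code  TEXT NOT NULL,
        holiday_name TEXT NOT NULL,
        PRIMARY KEY (holiday_date, region_code)
    );
    CREATE TABLE vehicle (
        number_plate TEXT PRIMARY KEY,
        region_code  TEXT NOT NULL
    );
    CREATE TABLE visit (
        number_plate TEXT NOT NULL REFERENCES vehicle (number_plate),
        entry_date   TEXT NOT NULL,
        end_date     TEXT,
        PRIMARY KEY (number_plate, entry_date)
    );
    -- Hypothetical sample rows, for illustration only.
    INSERT INTO holiday VALUES ('2024-12-25', 'UK', 'Christmas Day');
    INSERT INTO vehicle VALUES ('AB12 CDE', 'UK');
    INSERT INTO visit VALUES ('AB12 CDE', '2024-12-25', '2024-12-26');
    INSERT INTO visit VALUES ('AB12 CDE', '2024-12-30', NULL);
""")

# The holiday is looked up by date and region; there is no foreign key
# between visit and holiday, so the cycle never appears.
rows = conn.execute("""
    SELECT vi.entry_date, h.holiday_name
    FROM visit AS vi
    JOIN vehicle AS ve ON ve.number_plate = vi.number_plate
    LEFT JOIN holiday AS h
           ON h.holiday_date = vi.entry_date
          AND h.region_code  = ve.region_code
    ORDER BY vi.entry_date
""").fetchall()

# The weekday is derived from the date (date.weekday(): Monday is 0).
report = [(d, WEEKDAYS[date.fromisoformat(d).weekday()], name)
          for d, name in rows]
print(report)
```

Visits on ordinary days simply come back with no holiday name, which is why the holiday table only needs rows for actual holidays.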
The Relate relationship between the Vehicle and Date entities is redundant here. You can still find the dates of each visit a vehicle made from the existing relationships. If you convert the ER diagram to DB tables, this becomes clearer.
Why are entry_date and exit_date two attributes of the Visit entity? They are already covered by the many-to-2 relationship with the Date entity. Remove these two attributes, along with number_plate, from Visit. Lastly, add a unique id to the Visit entity.
For a small shift-management project, I'm trying to build a weekly schedule of shifts for our employees, based on three shifts per day, where one shift can hold more than one employee.
I've created an employee table, a work_day table that holds the date of the shift, and three join tables, one for each shift of the day.
CREATE TABLE employee(
id SERIAL PRIMARY KEY,
"name" VARCHAR(128) NOT NULL,
archived BOOLEAN DEFAULT FALSE
);
CREATE TABLE work_day(
id SERIAL PRIMARY KEY,
"date" DATE NOT NULL UNIQUE
);
CREATE TABLE morning_shift(
employee_id INTEGER REFERENCES employee(id),
shift_id INTEGER REFERENCES work_day(id),
PRIMARY KEY(employee_id, shift_id)
);
CREATE TABLE evening_shift(
employee_id INTEGER REFERENCES employee(id),
shift_id INTEGER REFERENCES work_day(id),
PRIMARY KEY(employee_id, shift_id)
);
CREATE TABLE night_shift(
employee_id INTEGER REFERENCES employee(id),
shift_id INTEGER REFERENCES work_day(id),
PRIMARY KEY(employee_id, shift_id)
);
The plan I had in mind is to create a view that would materialize a presentation of a work day:
Date
Morning Shift(name1, name2)
Evening Shift(name3, name4)
Night Shift(name5)
That way I can query whole work days as objects in my projects.
The issue is that I have very little experience with databases, and this has proved far more difficult than I had imagined. I've been trying for the last couple of days, and I have finally set my ego aside to ask for your help: how do you create a view like that? There are so many confusing joins involved that I can't wrap my head around it.
Thank you very much in advance.
As others mentioned in the comments, there is room for improvement regarding the DB design. However, this is how to create a view: just join all the tables you need data from and select the fields you want:
CREATE VIEW shifts AS
SELECT *
FROM work_day inner join morning_shift on work_day.id = morning_shift.shift_id
inner join evening_shift on work_day.id = evening_shift.shift_id
... (more joins)
Take a look at this Postgres tutorial page on joins
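For the one-row-per-work-day shape the question asks for, one possible approach is to aggregate the names of each shift in a correlated subquery instead of joining all three shift tables at once. The sketch below is an assumption-laden illustration, not a tested PostgreSQL script: it runs on SQLite through Python's sqlite3 module, the view name work_day_overview and the sample employees are made up, and SQLite's group_concat stands in for PostgreSQL's string_agg.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE work_day (id INTEGER PRIMARY KEY, "date" TEXT NOT NULL UNIQUE);
    CREATE TABLE morning_shift (
        employee_id INTEGER REFERENCES employee (id),
        shift_id    INTEGER REFERENCES work_day (id),
        PRIMARY KEY (employee_id, shift_id)
    );
    CREATE TABLE evening_shift (
        employee_id INTEGER REFERENCES employee (id),
        shift_id    INTEGER REFERENCES work_day (id),
        PRIMARY KEY (employee_id, shift_id)
    );
    CREATE TABLE night_shift (
        employee_id INTEGER REFERENCES employee (id),
        shift_id    INTEGER REFERENCES work_day (id),
        PRIMARY KEY (employee_id, shift_id)
    );

    -- One row per work day; each shift column is a comma-separated name list.
    -- (In PostgreSQL, string_agg(e.name, ', ') plays the role of group_concat.)
    CREATE VIEW work_day_overview AS
    SELECT wd."date" AS work_date,
           (SELECT group_concat(e.name, ', ')
              FROM morning_shift AS s
              JOIN employee AS e ON e.id = s.employee_id
             WHERE s.shift_id = wd.id) AS morning,
           (SELECT group_concat(e.name, ', ')
              FROM evening_shift AS s
              JOIN employee AS e ON e.id = s.employee_id
             WHERE s.shift_id = wd.id) AS evening,
           (SELECT group_concat(e.name, ', ')
              FROM night_shift AS s
              JOIN employee AS e ON e.id = s.employee_id
             WHERE s.shift_id = wd.id) AS night
    FROM work_day AS wd;

    -- Hypothetical sample data.
    INSERT INTO employee (id, name) VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO work_day (id, "date") VALUES (1, '2024-01-01');
    INSERT INTO morning_shift VALUES (1, 1), (2, 1);
    INSERT INTO night_shift VALUES (3, 1);
""")

row = conn.execute("SELECT * FROM work_day_overview").fetchone()
print(row)
```

Aggregating per shift avoids the row multiplication that inner-joining all three shift tables would cause, and a day with an empty shift still appears, with NULL in that column.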
I'm building a comment system in PostgreSQL where users can comment on different entities that I already have (such as products, articles, photos, and so on), as well as "like" those comments. For the moment, I came up with this:
(note: the foreign key between comment_board and product/article/photo is very loose here. ref_id is just storing the id, which is used in conjunction with the comment_board_type to determine which table it is)
Obviously, this doesn't give good data integrity. What can I do to improve it? Also, I know every product/article/photo will need a comment_board. Does that mean I should add a comment_board_id to each product/article/photo entity, like this?
I do recognize this SO solution, but it made me second-guess supertypes and the complexities of it: Database design - articles, blog posts, photos, stories
Any guidance is appreciated!
I ended up just pointing the comments directly to the product/photo/article fields. Here is what I came up with in total:
CREATE TABLE comment (
id SERIAL PRIMARY KEY,
created_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT (now()),
updated_at TIMESTAMP WITH TIME ZONE,
account_id INT NOT NULL REFERENCES account(id),
text VARCHAR NOT NULL,
-- commentable sections
product_id INT REFERENCES product(id),
photo_id INT REFERENCES photo(id),
article_id INT REFERENCES article(id),
-- constraint to make sure this comment appears in only one place
CONSTRAINT comment_entity_check CHECK(
(product_id IS NOT NULL)::INT
+
(photo_id IS NOT NULL)::INT
+
(article_id IS NOT NULL)::INT
= 1
)
);
CREATE TABLE comment_likes (
id SERIAL PRIMARY KEY,
created_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT (now()),
updated_at TIMESTAMP WITH TIME ZONE,
account_id INT NOT NULL REFERENCES account(id),
comment_id INT NOT NULL REFERENCES comment(id),
-- comments can only be liked once by an account.
UNIQUE(account_id, comment_id)
);
Resulting in:
This makes it so that I have to do one less join to an intermediary table. Also, it lets me add a field and update the constraints easily.
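As a quick check of how the "exactly one target" constraint behaves, here is a small sketch using SQLite through Python's sqlite3 module instead of PostgreSQL. The table is cut down to the relevant columns, the foreign keys are omitted, and the helper try_insert is hypothetical. SQLite already treats the result of IS NOT NULL as 0 or 1, so the ::INT casts are not needed there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE comment (
        id         INTEGER PRIMARY KEY,
        text       TEXT NOT NULL,
        product_id INTEGER,
        photo_id   INTEGER,
        article_id INTEGER,
        -- Exactly one of the three target columns must be set.
        CHECK (
            (product_id IS NOT NULL)
          + (photo_id   IS NOT NULL)
          + (article_id IS NOT NULL) = 1
        )
    )
""")


def try_insert(product_id, photo_id, article_id):
    """Return True if the row is accepted, False if the CHECK rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO comment (text, product_id, photo_id, article_id) "
                "VALUES (?, ?, ?, ?)",
                ("hello", product_id, photo_id, article_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


results = {
    "one target": try_insert(1, None, None),
    "two targets": try_insert(1, 2, None),
    "no target": try_insert(None, None, None),
}
print(results)
```

Adding a new commentable entity then means adding one nullable column and one more term to the sum.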
I need some guidance on designing the schema for invoices in a multi-tenant application.
I have a table called EmployeePay which holds all the information required to generate an invoice. The invoice table would have the invoice number, invoice created date and VAT rate. I am thinking of creating a Sequence object for each tenant to generate an invoice number.
EmployeePay Table: EmployeeID, Hours, Rate, InvoiceID (FK)
Invoice Table: InvoiceID (PK) (Identity), InvoiceNumber, InvoiceDate, VATRate, TenantID
Is it okay to have hundreds of Sequence objects in a database, as I'll have to create one for each tenant? I'll also have to create the same number of stored procedures, each returning the next invoice number (I prefer a separate stored procedure for each tenant rather than one large stored procedure with hundreds of choices in a select case statement).
Another concern: is it sound in theory to insert into the master table (Invoice) based on the transaction table (EmployeePay) and then use its primary key (InvoiceID) to update the transaction table?
Thanks in advance.
First, determine whether the relationship is one-to-many or many-to-many. If one employee will have many invoices, then it is a one-to-many relationship and you can create your tables as follows:
EmployeePay Table: EmployeeID (PK) (Identity), Hours, Rate
Invoice Table: InvoiceID (PK) (Identity), EmployeeID (FK), InvoiceNumber, InvoiceDate, VATRate, TenantID
EDIT:
I don't know which database you are using, but for an incrementing sequence:
For MySQL, check this LINK.
If you are using Oracle, check this LINK.
I would suggest creating another table, which could be called InvoiceNumber, containing InvoiceNumberId (Int), TenantId (FK) and CurrentSequenceNumber (Int).
CurrentSequenceNumber is a simple integer that is used to generate the next invoice number. InvoiceNumberId is an identity column that serves as the primary key (you may or may not have it).
The structure of the table will look like below.
Now you need to create only one stored procedure, which takes TenantId as an input parameter and is responsible for generating the next invoice number by reading CurrentSequenceNumber from the above table.
For example, if we need to generate a new invoice number for the tenant with id 15, the SP will apply your business logic. I am assuming it just creates a string with "Inv-" as a prefix followed by the incremented value of CurrentSequenceNumber, so the output of the procedure will be:
Inv-0009
After generating this number, the SP will update the stored value to 9 for InvoiceNumberId 3.
So everything will be managed by a single table and a single procedure.
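The single-table, single-procedure idea can be sketched as follows. This is only an illustration under assumptions: it uses SQLite through Python's sqlite3 module rather than a real stored procedure, the Python function next_invoice_number stands in for the procedure, TenantId doubles as the primary key (the optional InvoiceNumberId column is left out), and the starting value 8 for tenant 15 is chosen to match the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invoice_number (
        tenant_id               INTEGER PRIMARY KEY,
        current_sequence_number INTEGER NOT NULL DEFAULT 0
    );
    -- Hypothetical starting state: tenant 15 has already issued 8 invoices.
    INSERT INTO invoice_number (tenant_id, current_sequence_number) VALUES (15, 8);
""")


def next_invoice_number(conn, tenant_id):
    """Increment the tenant's counter and return the formatted invoice number."""
    with conn:  # increment and read happen in one transaction
        cur = conn.execute(
            "UPDATE invoice_number "
            "SET current_sequence_number = current_sequence_number + 1 "
            "WHERE tenant_id = ?",
            (tenant_id,),
        )
        if cur.rowcount != 1:
            raise KeyError(f"no counter row for tenant {tenant_id}")
        (number,) = conn.execute(
            "SELECT current_sequence_number FROM invoice_number "
            "WHERE tenant_id = ?",
            (tenant_id,),
        ).fetchone()
    return f"Inv-{number:04d}"


first = next_invoice_number(conn, 15)
second = next_invoice_number(conn, 15)
print(first, second)
```

Onboarding a new tenant then only requires inserting one row, with no new sequence object or procedure.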
I'm designing my first SQL Server database and wonder if there's a better way to accomplish what I'm trying to do.
The goal is to be able to create one of 14 documents based on 200+ different document sections (titles, headings, paragraphs, lists, etc). Each document section is part of 1 or more documents.
My application does a single database lookup for a particular document and retrieves the data stored in the 50 text fields.
To do this, I first stored each unique document section in a "sections" table, giving each section a unique identifier (sectionID) and made the identifier a primary key, for example:
dbo.sections
sectionID(pk) sectionText
iv1 this is the text for the first section
AHv1 this text is for another section
APv2 more text to include
.
.
.
EFv3 another text section
GHv2 this is the last section text in the table
I then created a second table called "documents" to store each document name and the sections that belong to it. There are 51 columns in this table. The first column is the document name and the other 50 columns store the ids of the sections (they're named section1, section2, ...) that make up that particular document. Each of the section columns is a foreign key that references the primary key in the "sections" table, for example:
dbo.documents
docID section1(fk) section2(fk) ... section50(fk)
option1 iv1 AHv1 ... GHv2
option2 iv1 APv2 ... EFv3
All of this seems straightforward to me. However, in order to get the text for each document to be part of a given record, I have to create a view that does 50 joins of the sections table. By doing that, each document id and its text are stored in one row of a table.
Is there a better way to get the same end result? Or a better design? It seems like there may be a lot of overhead to join the data between tables.
Any input would be greatly appreciated!
Let's say you have one table, document, with a one-to-many relationship with a second table, documentSection. Document has a PK field documentID, documentSection's PK is compound, documentID and sectionID, so when the two tables are joined, it's only on the documentID field. Then you won't need one column for each section.
Actually, it sounds like you have all of the document section text stored in your section table, which can be used in multiple documents. Maintenance nightmares aside, you can have Section be your primary table and sectionDocument have the one-to-many relationship, but you may need to introduce a sectionSequence field to keep the sections of your document in sequence. You'll actually need the sequence field regardless of which table is primary.
Regarding your comment: let's say you have a section table with a PK field sectionID. Then you can have a sectionDocument table with a compound PK, sectionID and documentID, which will probably need to include a sequence number. You're currently using the ordinal position of the column to identify the sequence of the section in the document, but as you say, you don't want 50 relationships to the section table. The way to handle that is to define the sections vertically instead of horizontally: in rows instead of columns. You can also have a document table with PK documentID and the document name.
Building on (and maybe clarifying) what Beth is talking about, you might consider a three-table approach. The lords-of-data generally refer to normalization rules or normal forms to describe patterns in your data that result in great flexibility and performance.
At first blush, these rules seem to spread your data out, but it's very worthwhile learning about these patterns. You don't have to worry about your database "joining a lot", as this is what relational databases are really good at, and normalized databases are really easy to join up.
For example, in order to select all the section texts in order for a given document, you would do something like this:
select
s.SectionText
from
Documents d
inner join
DocumentSections ds
on
d.DocumentId=ds.DocumentId
inner join
Sections s
on
ds.SectionId = s.SectionId
where
d.DocumentId = 'MyDoc'
order by
ds.Position
Basically, this converts your 50 columns in documents to an unlimited number of rows in DocumentSections.
Here's how you'd define such a system in SQL Server:
create table dbo.Sections
(
SectionId
nvarchar(8) not null
constraint [Sections.SectionId.PrimaryKey]
primary key clustered,
SectionText
nvarchar( max ) not null
)
create table dbo.Documents
(
DocumentId
nvarchar(8) not null
constraint [Documents.DocumentId.PrimaryKey]
primary key clustered,
Name
nvarchar( 255 ) not null
constraint [Documents.Name.Unique]
unique nonclustered
)
create table dbo.DocumentSections
(
DocumentId
nvarchar(8) not null
constraint [DocumentSections.to.Documents]
foreign key references dbo.Documents( DocumentId )
on delete cascade,
SectionId
nvarchar(8) not null
constraint [DocumentSections.to.Sections]
foreign key references dbo.Sections( SectionId )
on delete cascade,
Position
int not null,
constraint [DocumentSections.DocumentId.SectionId.PrimaryKey]
primary key clustered( DocumentId, SectionId ),
constraint [DocumentSections.DocumentId.Position.Unique]
unique ( DocumentId, Position )
)
There are a couple of things worth noting:
In this code, if you delete a row from Documents, the DocumentSections rows also go away (but not the Sections that were used in the Documents row). Likewise, if you delete a Sections row, the DocumentSections rows for that deleted Sections row go away, leaving the Documents untouched. This is done with the on delete cascade clauses in the foreign key constraints. They're totally optional, but I showed them just for fun. This is often very handy.
I added a restriction (again optional) that prevents a Section from being used more than once in a Document. If that's not what you want, you can just remove that whole constraint.
I picked nvarchar(8) for the size of the key fields - for no particular reason. If you make these bigger, be sure to increase the width in the referring tables, too.
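To try the three-table layout end to end, here is a small sketch on SQLite through Python's sqlite3 module. It is an approximation of the SQL Server definitions above: types are simplified to TEXT and INTEGER, the constraint names are dropped, the document names are made up, and the helper document_text is hypothetical. The section ids and texts are taken from the question's example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE Sections (
        SectionId   TEXT PRIMARY KEY,
        SectionText TEXT NOT NULL
    );
    CREATE TABLE Documents (
        DocumentId TEXT PRIMARY KEY,
        Name       TEXT NOT NULL UNIQUE
    );
    CREATE TABLE DocumentSections (
        DocumentId TEXT NOT NULL
            REFERENCES Documents (DocumentId) ON DELETE CASCADE,
        SectionId  TEXT NOT NULL
            REFERENCES Sections (SectionId) ON DELETE CASCADE,
        Position   INTEGER NOT NULL,
        PRIMARY KEY (DocumentId, SectionId),
        UNIQUE (DocumentId, Position)
    );
    INSERT INTO Sections VALUES
        ('iv1',  'this is the text for the first section'),
        ('AHv1', 'this text is for another section'),
        ('APv2', 'more text to include');
    INSERT INTO Documents VALUES ('option1', 'Option 1'), ('option2', 'Option 2');
    INSERT INTO DocumentSections VALUES
        ('option1', 'iv1', 1), ('option1', 'AHv1', 2),
        ('option2', 'iv1', 1), ('option2', 'APv2', 2);
""")


def document_text(document_id):
    """Return the section texts of one document, in Position order."""
    rows = conn.execute("""
        SELECT s.SectionText
        FROM DocumentSections AS ds
        JOIN Sections AS s ON s.SectionId = ds.SectionId
        WHERE ds.DocumentId = ?
        ORDER BY ds.Position
    """, (document_id,)).fetchall()
    return [text for (text,) in rows]


before = document_text("option1")

# Deleting a section cascades to DocumentSections but leaves Documents alone.
with conn:
    conn.execute("DELETE FROM Sections WHERE SectionId = 'AHv1'")

after = document_text("option1")
print(before, after)
```

The shared section 'iv1' is stored once and reused by both documents, and a document can grow past 50 sections without any schema change.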
I'm creating a clinic management system where I need to store Medical History for a patient. The user can select multiple history conditions for a single patient, however, each clinic has its own fixed set of Medical History fields.
For example:
Clinic 1:
DiseaseOne
DiseaseTwo
DiseaseThree
Clinic 2:
DiseaseFour
DiseaseFive
DiseaseSix
For a patient visit at a specific clinic, the user should be able to check one or more diseases for the patient's medical history, based on the clinic type.
I thought of two ways of storing the Medical History data:
First Option:
Add the fields to the corresponding clinic Patient Visit Record:
PatientClinic1VisitRecord:
PatientClinic1VisitRecordId
VisitDate
MedHist_DiseaseOne
MedHist_DiseaseTwo
MedHist_DiseaseThree
And fill up each MedHist field with the value "True/False" based on the user input.
Second Option:
Have a single MedicalHistory Table that holds all Clinics Medical History detail as well as another table to hold the Patient's medical history in its corresponding visit.
MedicalHistory
ClinicId
MedicalHistoryFieldId
MedicalHistoryFieldName
MedicalHistoryPatientClinicVisit
VisitId
MedicalHistoryFieldId
MedicalHistoryFieldValue
I'm not sure whether these approaches are good practice. Is there a third approach that would be better?
If you are only interested in the diseases the person has had, then storing the false / non-existent diseases is quite pointless. Without knowing all the details it is hard to give the best solution, but I would probably create something like this:
Person:
PersonID
Name
Address
Clinic:
ClinicID
Name
Address
Disease:
DiseaseID
Name
MedicalHistory:
HistoryID (identity, primary key)
PersonID
ClinicID
VisitDate (either date or datetime2 field depending what you need)
DiseaseID
Details, Notes etc
I created this table on the assumption that people most likely have only one disease per visit, so when there are occasionally several, more rows can be added instead of creating a separate table for the visit, which would make queries more complex.
If you also need to track situations where a disease was checked but the result was negative, then a new status field is needed in the history table.
If you need to limit which diseases can be entered by which clinic, you'll need a separate table for that too.
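The "separate table" for limiting diseases per clinic could look like the sketch below. Everything here is an assumption for illustration: the table name ClinicDisease, the composite foreign key from MedicalHistory, the helper record_history and the sample rows are not from the original answer, and the code runs on SQLite through Python's sqlite3 module with simplified types.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE Person  (PersonID  INTEGER PRIMARY KEY, Name TEXT NOT NULL);
    CREATE TABLE Clinic  (ClinicID  INTEGER PRIMARY KEY, Name TEXT NOT NULL);
    CREATE TABLE Disease (DiseaseID INTEGER PRIMARY KEY, Name TEXT NOT NULL);

    -- Hypothetical table: the diseases each clinic is allowed to record.
    CREATE TABLE ClinicDisease (
        ClinicID  INTEGER NOT NULL REFERENCES Clinic (ClinicID),
        DiseaseID INTEGER NOT NULL REFERENCES Disease (DiseaseID),
        PRIMARY KEY (ClinicID, DiseaseID)
    );

    CREATE TABLE MedicalHistory (
        HistoryID INTEGER PRIMARY KEY,
        PersonID  INTEGER NOT NULL REFERENCES Person (PersonID),
        ClinicID  INTEGER NOT NULL,
        VisitDate TEXT    NOT NULL,
        DiseaseID INTEGER NOT NULL,
        -- Only (clinic, disease) pairs listed in ClinicDisease are accepted.
        FOREIGN KEY (ClinicID, DiseaseID)
            REFERENCES ClinicDisease (ClinicID, DiseaseID)
    );

    -- Hypothetical sample data.
    INSERT INTO Person  VALUES (1, 'Pat');
    INSERT INTO Clinic  VALUES (1, 'Clinic 1'), (2, 'Clinic 2');
    INSERT INTO Disease VALUES (1, 'DiseaseOne'), (4, 'DiseaseFour');
    INSERT INTO ClinicDisease VALUES (1, 1), (2, 4);
""")


def record_history(person_id, clinic_id, visit_date, disease_id):
    """Insert one history row; return False if the clinic may not record it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO MedicalHistory "
                "(PersonID, ClinicID, VisitDate, DiseaseID) VALUES (?, ?, ?, ?)",
                (person_id, clinic_id, visit_date, disease_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


allowed = record_history(1, 1, "2024-03-01", 1)   # DiseaseOne at Clinic 1
rejected = record_history(1, 1, "2024-03-01", 4)  # DiseaseFour belongs to Clinic 2
print(allowed, rejected)
```

Only diseases that were actually checked produce rows, so nothing is stored for the unchecked ones.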
Create a set of relational tables to get a robust and flexible system, enabling the clinics to add an arbitrary number of diseases, patients, and visits. Also, constructing queries for various group-by criteria will become easier for you.
Build a set of 4 tables plus a Many-to-Many (M2M) "linking" table as given below. The first 3 tables will be less-frequently updated tables. On each visit of a patient to a clinic, add 1 row to the [Visits] table, containing the full detail of the visit EXCEPT disease information. Add 1 row to the M2M [MedicalHistory] table for EACH disease for which the patient will be consulting on that visit.
On a side note - consider using Table-Valued Parameters for passing a number of rows (1 row per disease being consulted) from your front-end program to the SQL Server stored procedure.
Table [Clinics]
ClinicId Primary Key
ClinicName
- more columns -
Table [Diseases]
DiseaseId Primary Key
ClinicId Foreign Key into the [Clinics] table
DiseaseName
- more columns -
Table [Patients]
PatientId Primary Key
ClinicId Foreign Key into the [Clinics] table
PatientName
- more columns -
Table [Visits]
VisitId Primary Key
VisitDate
DoctorId Foreign Key into another table called [Doctor]
BillingAmount
- more columns -
And finally the M2M table: [MedicalHistory]. (Important - All the FK fields should be combined together to form the PK of this table.)
ClinicId Foreign Key into the [Clinics] table
DiseaseId Foreign Key into the [Diseases] table
PatientId Foreign Key into the [Patients] table
VisitId Foreign Key into the [Visits] table
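A minimal sketch of this layout, assuming SQLite through Python's sqlite3 module, simplified types, hypothetical sample rows, and only the columns needed for the demonstration (DoctorId and BillingAmount are left out). It shows one visit producing one MedicalHistory row per disease, the composite primary key rejecting a duplicate row, and a simple group-by query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE Clinics (
        ClinicId   INTEGER PRIMARY KEY,
        ClinicName TEXT NOT NULL
    );
    CREATE TABLE Diseases (
        DiseaseId   INTEGER PRIMARY KEY,
        ClinicId    INTEGER NOT NULL REFERENCES Clinics (ClinicId),
        DiseaseName TEXT NOT NULL
    );
    CREATE TABLE Patients (
        PatientId   INTEGER PRIMARY KEY,
        ClinicId    INTEGER NOT NULL REFERENCES Clinics (ClinicId),
        PatientName TEXT NOT NULL
    );
    CREATE TABLE Visits (
        VisitId   INTEGER PRIMARY KEY,
        VisitDate TEXT NOT NULL
    );
    -- M2M linking table: all four foreign keys together form the primary key.
    CREATE TABLE MedicalHistory (
        ClinicId  INTEGER NOT NULL REFERENCES Clinics (ClinicId),
        DiseaseId INTEGER NOT NULL REFERENCES Diseases (DiseaseId),
        PatientId INTEGER NOT NULL REFERENCES Patients (PatientId),
        VisitId   INTEGER NOT NULL REFERENCES Visits (VisitId),
        PRIMARY KEY (ClinicId, DiseaseId, PatientId, VisitId)
    );

    -- Hypothetical sample data: one visit, two diseases consulted.
    INSERT INTO Clinics  VALUES (1, 'Clinic 1');
    INSERT INTO Diseases VALUES (1, 1, 'DiseaseOne'), (2, 1, 'DiseaseTwo');
    INSERT INTO Patients VALUES (1, 1, 'Pat');
    INSERT INTO Visits   VALUES (1, '2024-03-01');
    INSERT INTO MedicalHistory VALUES (1, 1, 1, 1), (1, 2, 1, 1);
""")

# The composite primary key rejects recording the same disease twice per visit.
try:
    with conn:
        conn.execute("INSERT INTO MedicalHistory VALUES (1, 1, 1, 1)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Number of diseases consulted per visit and patient.
summary = conn.execute("""
    SELECT v.VisitDate, p.PatientName, COUNT(*) AS diseases
    FROM MedicalHistory AS mh
    JOIN Visits   AS v ON v.VisitId   = mh.VisitId
    JOIN Patients AS p ON p.PatientId = mh.PatientId
    GROUP BY v.VisitId, v.VisitDate, p.PatientId, p.PatientName
""").fetchall()
print(duplicate_rejected, summary)
```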