How to design these tables better? - database

Firstly, sorry about the title; I couldn't find a better one.
I have a database that stores some devices depending on their number:
|-----------|-------------|-------------|
| device_id | device_name | device_type |
|-----------|-------------|-------------|
Each device is one of two types, 3-port or 1-port, and each port has a specific name, for example:
Device 1122 is type 3-port, port names are (kitchen,
living_room, bed_room).
Device 1123 is type 1-port, port name is (boiler).
The design I have in mind is:
|-----------|--------------|---------|-------------|----------|
| device_id | device_name  | port_1  | port_2      | port_3   |
|-----------|--------------|---------|-------------|----------|
| 1122      | First floor  | kitchen | living_room | bed_room |
| 1123      | Second floor | boiler  | null        | null     |
|-----------|--------------|---------|-------------|----------|
But my design is not good: if I had 100 devices of type 1-port, I would leave 200 fields empty.
Can you please help me create a better design?

I am pasting in my comment as an answer so you can mark the question answered.
You could break out ports to a separate, normalized table with
deviceId, port number, and port name. You will have one record for
each device and port combination with a foreign key reference back to
the main devices table. This will reduce empty fields and allow more
than 3 ports should the requirements change. However this comes at the
cost of an additional table and duplication of the key. From a space
perspective you may not end up much better. Then again, storage is
pretty cheap so I would not sweat it too much.
Zohar's answer is much more complete, so I will not have a problem if you accept his answer. However you should accept an answer to close the question.
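The separate ports table described in this answer can be sketched as follows. This is only a minimal illustration using Python's built-in sqlite3 module; the table names devices and device_ports and the helper ports_of are assumptions, not names from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    CREATE TABLE devices (
        device_id   INTEGER PRIMARY KEY,
        device_name TEXT NOT NULL
    );
    -- One row per device and port combination (hypothetical table name).
    CREATE TABLE device_ports (
        device_id   INTEGER NOT NULL REFERENCES devices (device_id),
        port_number INTEGER NOT NULL,
        port_name   TEXT NOT NULL,
        PRIMARY KEY (device_id, port_number)
    );
""")
conn.executemany("INSERT INTO devices VALUES (?, ?)",
                 [(1122, "First floor"), (1123, "Second floor")])
conn.executemany("INSERT INTO device_ports VALUES (?, ?, ?)",
                 [(1122, 1, "kitchen"), (1122, 2, "living_room"),
                  (1122, 3, "bed_room"), (1123, 1, "boiler")])

def ports_of(device_id):
    """Return the port names of one device, ordered by port number."""
    rows = conn.execute(
        "SELECT port_name FROM device_ports WHERE device_id = ? ORDER BY port_number",
        (device_id,))
    return [name for (name,) in rows]

print(ports_of(1122))  # ['kitchen', 'living_room', 'bed_room']
print(ports_of(1123))  # ['boiler']
```

A 1-port device stores one row instead of one value plus two nulls, and a device with more than 3 ports needs no schema change.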

A fully normalized schema will have device type table (where you specify the number of ports for each device type),
a device table with a unique device name and device id,
a ports table with a unique port name and a unique port id,
and an intersection table with device id and port id where the combination of these 2 columns is the primary key.
You should consider adding a constraint on inserts into the intersection table to make sure you don't add more records than the device type allows. This can be a check constraint if your target db supports one that can count rows in a table; many databases need a trigger for this instead.
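Here is a sketch of this schema in SQL:
CREATE TABLE TblDeviceType
(
    DeviceType_Id int PRIMARY KEY,
    DeviceType_Name varchar(20) NOT NULL UNIQUE,
    DeviceType_PortCount int NOT NULL -- number of ports for this device type
);
CREATE TABLE TblDevice
(
    Device_Id int PRIMARY KEY,
    Device_Name varchar(30) NOT NULL UNIQUE,
    Device_Type int NOT NULL REFERENCES TblDeviceType (DeviceType_Id)
);
CREATE TABLE TblPorts
(
    Port_Id int PRIMARY KEY,
    Port_Name varchar(30) NOT NULL UNIQUE
);
CREATE TABLE TblDeviceToPort
(
    DeviceToPort_Device int NOT NULL REFERENCES TblDevice (Device_Id),
    DeviceToPort_Port int NOT NULL REFERENCES TblPorts (Port_Id),
    PRIMARY KEY (DeviceToPort_Device, DeviceToPort_Port)
);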
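The limit on records per device type mentioned above could be enforced with a trigger where a check constraint cannot count rows. The sketch below uses Python's sqlite3 module with simplified, hypothetical table names; the port_count column, the limit_ports trigger and the attach helper are assumptions for illustration, and trigger syntax differs between databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE device_type (
        type_id    INTEGER PRIMARY KEY,
        type_name  TEXT NOT NULL UNIQUE,
        port_count INTEGER NOT NULL      -- assumed column: ports allowed per type
    );
    CREATE TABLE device (
        device_id   INTEGER PRIMARY KEY,
        device_type INTEGER NOT NULL REFERENCES device_type (type_id)
    );
    CREATE TABLE port (
        port_id   INTEGER PRIMARY KEY,
        port_name TEXT NOT NULL UNIQUE
    );
    CREATE TABLE device_to_port (
        device_id INTEGER NOT NULL REFERENCES device (device_id),
        port_id   INTEGER NOT NULL REFERENCES port (port_id),
        PRIMARY KEY (device_id, port_id)
    );
    -- Reject an insert when the device already has as many ports as its type allows.
    CREATE TRIGGER limit_ports BEFORE INSERT ON device_to_port
    WHEN (SELECT COUNT(*) FROM device_to_port WHERE device_id = NEW.device_id)
         >= (SELECT t.port_count FROM device d
             JOIN device_type t ON t.type_id = d.device_type
             WHERE d.device_id = NEW.device_id)
    BEGIN
        SELECT RAISE(ABORT, 'too many ports for this device type');
    END;
    INSERT INTO device_type VALUES (1, '1-port', 1), (3, '3-port', 3);
    INSERT INTO device VALUES (1122, 3), (1123, 1);
    INSERT INTO port VALUES (1, 'kitchen'), (2, 'living_room'),
                            (3, 'bed_room'), (4, 'boiler');
""")

def attach(device_id, port_id):
    """Return True if the port was attached, False if the database rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO device_to_port VALUES (?, ?)",
                         (device_id, port_id))
        return True
    except sqlite3.DatabaseError:
        return False

print(attach(1123, 4))  # True: the 1-port device gets its single port
print(attach(1123, 1))  # False: a second port exceeds the limit
```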

Related

SQL Server - Database import from CSV/XLS

Have a basic question regarding how to solve this import problem.
Have a CSV with ca. 40 fields, that need to be inserted across ca. 5 tables.
Let's say the tables are like this:
tpeople
Column Name | Datatype
GUID | uniqueidentifier
Fname | varchar
Lname | varchar
UserEnteredGUID | uniqueidentifier
tcompany
Column Name | Datatype
GUID | uniqueidentifier
CompanyTypeGUID | uniqueidentifier
PrintName | varchar
Website | varchar
tcompanyLocation
Column Name | Datatype
GUID | uniqueidentifier
CompanyGUID | uniqueidentifier
City | varchar
As we can see, the database is normalized and the tables reference each other through different GUIDs.
My question is, when I will write for example a Python script to enter data, how should I handle the GUIDs?
For example I want to add:
Fname: John
Lname: Smith
Company: IBM
Location: New York
Website: www.ibm.com
UserEntered: Admin
How do I make sure all relations/GUID are correct?
I would try:
insert into tpeople(GUID,Fname,Lname,UserEnteredGUID) values('','John','Smith',???)
Question
How to get UserEnteredGUID? Do I have to make a select on GUID from UserEntered table where user equals "Admin"?
Or here:
insert into tcompany(GUID,CompanyTypeGUID,PrintName,Website) values('',??,'IBM','www.ibm.com')
Here the same? How should I handle CompanyTypeGUID? It would also mean that I have to populate CompanyType table BEFORE I add anything to tcompany table?
This does not look right to me. It feels like thinking from back to front, working out how each table is connected to the others. There has to be a way to insert records into a normalized database where this GUID and foreign key handling is somehow automated.
I hope somebody understands my problem and can guide me towards a solution.
Thanks!
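One common way to handle this is the lookup the question itself suggests: resolve or create each parent row first, keep its GUID, and then insert the child rows. The sketch below only illustrates that order of operations with Python's sqlite3 and uuid modules; it is not SQL Server code. The tusers table, its UserName column and the helpers get_or_create and import_row are hypothetical, GUIDs are stored as text, and CompanyTypeGUID is left out to keep it short.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE tusers (GUID TEXT PRIMARY KEY, UserName TEXT UNIQUE);  -- hypothetical
    CREATE TABLE tcompany (GUID TEXT PRIMARY KEY, PrintName TEXT UNIQUE, Website TEXT);
    CREATE TABLE tcompanyLocation (GUID TEXT PRIMARY KEY,
                                   CompanyGUID TEXT REFERENCES tcompany (GUID),
                                   City TEXT);
    CREATE TABLE tpeople (GUID TEXT PRIMARY KEY, Fname TEXT, Lname TEXT,
                          UserEnteredGUID TEXT REFERENCES tusers (GUID));
""")

def get_or_create(table, key_col, key_val, **extra):
    """Return the GUID of the row where key_col = key_val, inserting it if missing.

    table and column names are trusted constants here, never user input.
    """
    row = conn.execute(f"SELECT GUID FROM {table} WHERE {key_col} = ?",
                       (key_val,)).fetchone()
    if row:
        return row[0]
    guid = str(uuid.uuid4())
    cols = ["GUID", key_col, *extra]
    marks = ", ".join("?" * len(cols))
    conn.execute(f"INSERT INTO {table} ({', '.join(cols)}) VALUES ({marks})",
                 (guid, key_val, *extra.values()))
    return guid

def import_row(rec):
    """Insert one CSV record, parents first so every foreign key already exists."""
    user = get_or_create("tusers", "UserName", rec["UserEntered"])
    company = get_or_create("tcompany", "PrintName", rec["Company"],
                            Website=rec["Website"])
    # No de-duplication of locations here, to keep the sketch short.
    conn.execute("INSERT INTO tcompanyLocation VALUES (?, ?, ?)",
                 (str(uuid.uuid4()), company, rec["Location"]))
    conn.execute("INSERT INTO tpeople VALUES (?, ?, ?, ?)",
                 (str(uuid.uuid4()), rec["Fname"], rec["Lname"], user))

import_row({"Fname": "John", "Lname": "Smith", "Company": "IBM",
            "Location": "New York", "Website": "www.ibm.com",
            "UserEntered": "Admin"})
conn.commit()
```

Lookup tables such as the company types do need to be populated before the rows that reference them, but a helper like this hides that ordering from the import loop.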

how do I model subtyping in a relational schema?

Is the following DB-schema ok?
REQUEST-TABLE
REQUEST-ID | TYPE | META-1 | META-2 |
This table stores all the requests, each of which has a unique REQUEST-ID. The TYPE is either A, B or C and tells us which table contains the specific request parameters. META-1 and META-2 are just some additional info like timestamps. Other than that, we have the tables for the respective types, which store the parameters for the respective requests.
TYPE-A-TABLE
REQUEST-ID | PARAM_X | PARAM_Y | PARAM_Z
TYPE-B-TABLE
REQUEST-ID | PARAM_I | PARAM_J
TYPE-C-TABLE
REQUEST-ID | PARAM_L | PARAM_M | PARAM_N | PARAM_O | PARAM_P | PARAM_Q
The REQUEST-ID is the foreign key into the REQUEST-TABLE.
Is this design normal/best-practice? Or is there a better/smarter way? What are the alternatives?
It somehow feels strange to me, having to do a query on the REQUEST-TABLE to find out which TYPE-TABLE contains the information I need, to then do the actual query I'm interested in.
For instance, imagine a method which, given an ID, should retrieve the parameters. This method would need 2 database accesses:
- Find correct table to query
- Query table to get the parameters
Note: In reality we have like 10 types of requests, i.e. 10 TYPE tables. Moreover there are many entries in each of the tables.
Meta-Note: I find it hard to come up with a proper title for this question (one that is not overly broad). Please feel free to make suggestions or edit the title.
For exclusive types, you just need to make sure rows in one type table can't reference rows in any other type table.
create table requests (
    request_id integer primary key,
    request_type char(1) not null
        -- You could also use a table to constrain valid types.
        check (request_type in ('A', 'B', 'C', 'D')),
    meta_1 char(1) not null,
    meta_2 char(1) not null,
    -- Foreign key constraints don't reference request_id alone. If they
    -- did, they might reference the wrong type.
    unique (request_id, request_type)
);
You need that apparently redundant unique constraint so the pair of columns can be the target of a foreign key constraint.
create table type_a (
    request_id integer not null,
    request_type char(1) not null default 'A'
        check (request_type = 'A'),
    primary key (request_id),
    foreign key (request_id, request_type)
        references requests (request_id, request_type) on delete cascade,
    param_x char(1) not null,
    param_y char(1) not null,
    param_z char(1) not null
);
The check() constraint guarantees that only 'A' can be stored in the request_type column. The foreign key constraint guarantees that each row will reference an 'A' row in the table "requests". Other type tables are similar.
create table type_b (
    request_id integer not null,
    request_type char(1) not null default 'B'
        check (request_type = 'B'),
    primary key (request_id),
    foreign key (request_id, request_type)
        references requests (request_id, request_type) on delete cascade,
    param_i char(1) not null,
    param_j char(1) not null
);
Repeat for each type table.
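To see the two constraints working together, here is a small check run through Python's sqlite3 module. It is a sketch with trimmed-down versions of the tables above (fewer columns, only types 'A' and 'B'); the helper try_insert is an assumption for illustration. SQLite needs table constraints after the column definitions, so the foreign key is moved to the end.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    create table requests (
        request_id   integer primary key,
        request_type char(1) not null check (request_type in ('A', 'B')),
        unique (request_id, request_type)
    );
    create table type_a (
        request_id   integer primary key,
        request_type char(1) not null default 'A' check (request_type = 'A'),
        param_x      char(1) not null,
        foreign key (request_id, request_type)
            references requests (request_id, request_type) on delete cascade
    );
    create table type_b (
        request_id   integer primary key,
        request_type char(1) not null default 'B' check (request_type = 'B'),
        param_i      char(1) not null,
        foreign key (request_id, request_type)
            references requests (request_id, request_type) on delete cascade
    );
    insert into requests values (1, 'A'), (2, 'B');
""")

def try_insert(sql, params):
    """Return True if the insert succeeded, False if a constraint rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False

# Request 1 is type 'A', so it may have a row in type_a ...
print(try_insert("insert into type_a (request_id, param_x) values (?, ?)", (1, "x")))  # True
# ... but not in type_b: the pair (1, 'B') does not exist in requests.
print(try_insert("insert into type_b (request_id, param_i) values (?, ?)", (1, "i")))  # False
```

The check constraint stops a type table from claiming another type, and the two-column foreign key stops it from pointing at a request of the wrong type.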
I usually create one updatable view for each type. The views join the table "requests" with one type table. Application code uses the views instead of the base tables. When I do that, it usually makes sense to revoke privileges on the base tables. (Not shown.)
If you don't know which type something is, then there's no alternative to running one query to get the type, and another query to select or update.
select request_type from requests where request_id = 42;
-- Say it returns 'A'. I'd use the view type_a_only.
update type_a_only
set param_x = '!' where request_id = 42;
In my own work, it's pretty rare to not know the type, but it does happen sometimes.
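The two-step lookup can be wrapped in one function so that callers do not see it. The following is a rough sketch using Python's sqlite3 module with simplified tables; the view definitions, the VIEWS mapping and load_request are assumptions, since the answer does not show its views. SQLite views are read-only, so this covers only the select case, not the update.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table requests (request_id integer primary key,
                           request_type char(1) not null, meta_1 text);
    create table type_a (request_id integer primary key, param_x text);
    create table type_b (request_id integer primary key, param_i text);
    -- One view per type, joining the common columns with the type's parameters.
    create view type_a_only as
        select r.request_id, r.meta_1, a.param_x
        from requests r join type_a a on a.request_id = r.request_id;
    create view type_b_only as
        select r.request_id, r.meta_1, b.param_i
        from requests r join type_b b on b.request_id = r.request_id;
    insert into requests values (42, 'A', 'm'), (43, 'B', 'n');
    insert into type_a values (42, 'x');
    insert into type_b values (43, 'i');
""")
conn.row_factory = sqlite3.Row  # rows can then be converted to dicts

# Fixed mapping from type code to view name (never built from user input).
VIEWS = {"A": "type_a_only", "B": "type_b_only"}

def load_request(request_id):
    """First query finds the type, second query reads the matching view."""
    row = conn.execute("select request_type from requests where request_id = ?",
                       (request_id,)).fetchone()
    if row is None:
        return None
    view = VIEWS[row["request_type"]]
    found = conn.execute(f"select * from {view} where request_id = ?",
                         (request_id,)).fetchone()
    return dict(found) if found is not None else None

print(load_request(42))  # {'request_id': 42, 'meta_1': 'm', 'param_x': 'x'}
```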
The phrase you may be looking for is "how do I model inheritance in a relational schema". It's been asked before. Whilst this is a reference to object oriented software design, the basic question is the same: how do I deal with data where there is a "x is a type of y" relationship.
In your case, "request" is the abstract class, and TypeA, TypeB etc. are the subclasses.
Your solution is one of the classic answers - "table per subclass". It's clean and easy to maintain, but does mean you can have multiple database access requests to retrieve the data.

sql-server one to one relational tables for different types of users

I have 2 kinds of users. They all share some information, such as username, email, password, phone, ..., but each kind of user also has settings that apply to that kind only. I have thought of 3 ways to do it:
1. Having 1 users table with all columns for all types of users (each row will have several empty columns that are not relevant to that kind of user).
2. Having 3 tables: users, usertype1 and usertype2. The shared settings are saved in the users table, with a one-to-one relation to usertype1 or usertype2 (based on the user type).
3. One user table and one settings table. This is more dynamic, but the problem is that I have to use the varchar type for all settings.
Which one is wiser to use? I'm concerned that the number of user types might increase in the future too.
Your second option is the most suitable. In addition to what you have described, I would add a column User_Type to the Main_User table, which references a table User_Type with only two values:
User_Type
PK_TypeID User_Type
1 usertype1
2 usertype2
Main_User
Only the columns that every user will have, like first name, last name, DOB, UserID and any other information that every user will have.
U_ID | Column1 | Column2 | Column3 | User_Type --<-- User Type values(1,2)
-- from User_Type Table
Type 1
This will be only for users of type one and will hold only the columns that a Type1 user has. Make U_ID a foreign key referencing the U_ID column in the main users table.
U_ID | Column1 | Column2 | Column3
Type 2
This will be only for users of type two and will hold only the columns that a Type2 user has. Make U_ID a foreign key referencing the U_ID column in the main users table.
U_ID | Column1 | Column2 | Column3
I would think that option 3 would be the best, and you wouldn't have to use varchar for all settings. What you'll need is a third table for lookup values as (Setting_ID, Setting_Description). Use the Setting_ID in your settings table along with the value for that setting.

(PostgreSQL) "Advanced" Check Constraint Question

I use PostgreSQL but am looking for an SQL answer that is as standard as possible.
I have the following table "docs" --
Column | Type | Modifiers
------------+------------------------+--------------------
id | character varying(32) | not null
version | integer | not null default 1
link_id | character varying(32) |
content | character varying(128) |
Indexes:
"docs_pkey" PRIMARY KEY, btree (id, version)
id and link_id are for documents that have linkage relationship between each other, so link_id self references id.
The problem comes with version. Now id is no longer the primary key (it won't be unique either) and can't be referenced by link_id as a foreign key --
my_db=# ALTER TABLE docs ADD FOREIGN KEY(link_id) REFERENCES docs (id) ;
ERROR: there is no unique constraint matching given keys for referenced table "docs"
I tried to search for check constraint on something like "if exists" but didn't find anything.
Any tip will be much appreciated.
I usually do like this:
table document (id, common, columns, current_revision)
table revision (id, doc_id, content, version)
which means that a document has a one-to-many relation with its revisions, AND a one-to-one relation to the current revision.
That way, you can always select a complete document for the current revision with a simple join, and you will only have one unique row in your documents table which you can link parent/child relations in, but still have versioning.
Sticking as close to your model as possible, you can split your table into two, one which has 1 row per 'doc' and one with 1 row per 'version':
You have the following table "versions" --
Column | Type | Modifiers
------------+------------------------+--------------------
id | character varying(32) | not null
version | integer | not null default 1
content | character varying(128) |
Indexes:
"versions_pkey" PRIMARY KEY, btree (id, version)
And the following table "docs" --
Column | Type | Modifiers
------------+------------------------+--------------------
id | character varying(32) | not null
link_id | character varying(32) |
Indexes:
"docs_pkey" PRIMARY KEY, btree (id)
Now
my_db=# ALTER TABLE docs ADD FOREIGN KEY(link_id) REFERENCES docs (id) ;
is allowed, and you also want:
my_db=# ALTER TABLE versions ADD FOREIGN KEY(id) REFERENCES docs;
Of course there is nothing stopping you getting a 'combined' view similar to your original table:
CREATE VIEW v_docs AS
SELECT id, version, link_id, content from docs join versions using(id);
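The split can be tried end to end with a short script. This is a sketch using Python's sqlite3 module rather than PostgreSQL, so the foreign keys are declared inline instead of with ALTER TABLE; the sample rows and the helper add_doc are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    CREATE TABLE docs (
        id      varchar(32) NOT NULL PRIMARY KEY,
        link_id varchar(32) REFERENCES docs (id)   -- self reference now works
    );
    CREATE TABLE versions (
        id      varchar(32) NOT NULL REFERENCES docs (id),
        version integer NOT NULL DEFAULT 1,
        content varchar(128),
        PRIMARY KEY (id, version)
    );
    CREATE VIEW v_docs AS
        SELECT id, version, link_id, content FROM docs JOIN versions USING (id);
    INSERT INTO docs VALUES ('a', NULL), ('b', 'a');
    INSERT INTO versions VALUES ('a', 1, 'first'), ('a', 2, 'second'),
                                ('b', 1, 'linked');
""")

def add_doc(doc_id, link_id=None):
    """Return True if the doc was added, False if link_id points at no document."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO docs VALUES (?, ?)", (doc_id, link_id))
        return True
    except sqlite3.IntegrityError:
        return False

rows = conn.execute(
    "SELECT id, version, link_id, content FROM v_docs ORDER BY id, version").fetchall()
for row in rows:
    print(row)
```

Each version appears as its own row in v_docs, while link_id is stored once per document and is checked against docs.id.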
If it suits what you want, you can simply create a FOREIGN KEY that includes the version field. That's the only way to point to a unique row.
If that doesn't work, you can write a TRIGGER (for all UPDATEs and INSERTs on the table) that makes the check. Note that you will also need a trigger on the docs table, that restricts modifications on that table that would break the key (such as a DELETE or UPDATE on the key value itself).
You cannot do this with a CHECK constraint, because a CHECK constraint cannot access data in another table.

Any difference in the way a primary key is defined in Postgres?

I am wondering if all these are exactly the same or if there is some difference.
Method 1:
CREATE TABLE testtable
(
    id serial,
    title character varying,
    CONSTRAINT id PRIMARY KEY (id)
);
Method 2:
CREATE TABLE testtable
(
    id serial PRIMARY KEY,
    title character varying,
);
Method 3:
CREATE TABLE testtable
(
    id integer PRIMARY KEY,
    title character varying,
);
CREATE SEQUENCE testtable_id_seq
    START WITH 1
    INCREMENT BY 1
    NO MAXVALUE
    NO MINVALUE
    CACHE 1;
ALTER SEQUENCE testtable_id_seq OWNED BY testtable.id;
Update: I found something on the web saying that by using a raw sequence you can pre-allocate memory for primary keys which helps if you plan on doing several thousand inserts in the next minute.
Try it and see; remove the trailing "," after "varying" on the second and third so they run, execute each of them, then do:
\d testtable
after each one and you can see what happens. Then drop the table and move on to the next one. It will look like this:
Column | Type | Modifiers
--------+-------------------+--------------------------------------------------------
id | integer | not null default nextval('testtable_id_seq'::regclass)
title | character varying |
Indexes:
"id" PRIMARY KEY, btree (id)
Column | Type | Modifiers
--------+-------------------+--------------------------------------------------------
id | integer | not null default nextval('testtable_id_seq'::regclass)
title | character varying |
Indexes:
"testtable_pkey" PRIMARY KEY, btree (id)
Column | Type | Modifiers
--------+-------------------+-----------
id | integer | not null
title | character varying |
Indexes:
"testtable_pkey" PRIMARY KEY, btree (id)
The first and second are almost identical, except that the primary key constraint is named differently. In the third, the id is no longer filled in from the sequence when you insert into the table. You need to create the sequence first, then create the table like this:
CREATE TABLE testtable
(
    id integer PRIMARY KEY DEFAULT nextval('testtable_id_seq'),
    title character varying
);
That gives you something that looks the same as the second one. The only upside to that is that you can use the CACHE directive to pre-allocate some number of sequence numbers. It's possible for sequence allocation to be a big enough resource drain that you need to lower the contention, but you'd need to be doing several thousand inserts per second, not per minute, before that's likely to happen.
No semantic difference between method 1 and method 2.
Method 3 is quite similar, too - it's what happens implicitly when using serial. However, when using serial, Postgres also records a dependency of the sequence on the table. So, if you drop the table created in method 1 or 2, the sequence gets dropped as well.

Resources