How to implement a flexible table - database

I am a novice database user (not a designer). I would like to implement the following in a Postgres database.
I would like the database to contain the following information:
Table1
Classroom | Classroom Type | AV System | Student1 | Student2 | ... | Student36
1A | 'Square' | 1 | 'Anson' | 'Antonie'| ... | 'Zalda'
1B | 'Rectangle' | 2 | 'Allen' | 'Andy' | ... | 'Zeth'
The seating plan for each student is stored separately, which is why I created a second table:
Table2
Classroom Type | Student1 Position | Student2 Position | ... | Student36 Position
'Square' | (1,1) | (1,2) | ... | (6,6)
'Rectangle' | (1,1) | (1,2) | ... | (4,9)
Table3
AV System | TV | Number of Speaker
1 | 'LCD' | 2
2 | 'Projector' | 4
The reason for this implementation is to draw a seating plan. However, I don't think this is a good implementation, so I would like to find another way that gives me some flexibility when I want to scale it up.
Thanks in advance.

This is not how relational databases work. In a relational database you don't repeat attributes, you create 1:N relationships. This process is called normalization and one of its main goals is to prevent duplication of data.
As far as I can tell, the following structure would do what you want:
-- a table to store all possible classroom types ("Square", "Rectangle", ...)
create table classroom_type
(
type_id integer not null primary key,
type_name varchar(20) not null,
unique (type_name)
);
-- a table to store all classrooms
create table classroom
(
room_id integer not null primary key,
room_name varchar(5) not null,
room_type integer not null references classroom_type,
unique (room_name)
);
-- a table containing all students
create table student
(
student_id integer not null primary key,
student_name varchar(100) not null
--- ... possibly more attributes like date of birth and others ....
);
-- this table stores the combinations which student has which position in which classroom
create table seating_plan
(
student_id integer not null references student,
room_id integer not null references classroom,
position varchar(10) not null,
primary key (student_id, room_id), -- make sure the same student is seated only once in a room
unique (room_id, position) -- make sure each position is only used once inside a room
);
I used integer for the ID columns, but you will most likely want to use serial so that unique values are generated for them automatically.
The model most likely needs to be extended to include a school year as well, because student Allen might be in room 1A this year but in 3C next year. This would be another attribute of the seating_plan table (and would be part of the primary key).
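As a rough illustration of the school-year extension, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for Postgres so that it can be run anywhere. The school_year column, the years 2023 and 2024, and the position format '1,1' are assumptions made for the example, not part of the schema above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
create table classroom (room_id integer primary key, room_name text not null unique);
create table student (student_id integer primary key, student_name text not null);
create table seating_plan (
    student_id  integer not null references student,
    room_id     integer not null references classroom,
    school_year integer not null,              -- assumed extra attribute
    position    text not null,
    primary key (student_id, room_id, school_year),
    unique (room_id, school_year, position)    -- one student per seat per year
);
insert into classroom values (1, '1A'), (2, '3C');
insert into student values (1, 'Allen'), (2, 'Andy');
insert into seating_plan values (1, 1, 2023, '1,1'), (1, 2, 2024, '1,1');
""")

# Where does Allen sit in each school year?
rows = conn.execute("""
    select c.room_name, sp.school_year, sp.position
    from seating_plan sp
    join classroom c on c.room_id = sp.room_id
    where sp.student_id = 1
    order by sp.school_year
""").fetchall()

# A second student on an occupied seat in the same room and year is rejected.
duplicate_rejected = False
try:
    conn.execute("insert into seating_plan values (2, 1, 2023, '1,1')")
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

With the year in both keys, the same student can appear in a different room each year while a seat still cannot be assigned twice within one year.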

ClassRoomTypes: Id, Name
ClassRooms: Id, Name, ClassRoomType_Id (-> ClassRoomTypes.Id)
TableTypes: Id, Name, Size_X, Size_Y
Tables: Id, TableType_Id (-> TableTypes.Id)
ClassRoomToTables: Id, ClassRoom_Id, Table_Id
ClassRoomToTable_Students: Id, ClassRoomToTable_Id (-> ClassRoomToTables.Id), Student_Id
OR
Students: Id, Name, ClassRoomToTable_Id
Now an explanation:
A class room has a list of tables.
A table has some parameters (e.g. student capacity, Size_X, Size_Y, etc.).
A table is also a concept: it is not uniquely identified, and the same table concept is used in many class rooms.
One or many students sit at many tables in different class rooms (the ClassRoomToTable_Students table)
OR
one or many students may sit only at a table from a specific class room (ClassRoomToTable_Id in Students).
You may get some inspiration from my point of view; I do not guarantee that it totally fits your domain case. Good luck.

Related

The effect of index on unique constraint columns

How is the table made?
create table market_post
(
.
.
.
d_id varchar(20) constraint unique_d_id unique,
.
.
.
);
create index market_post_d_i_219a22_idx on market_post (d_id, is_deleted);
It should be noted that the code above is the DDL of the table; I created the indexes and the unique constraint (via ALTER ...) after the table had been created and was already full of data.
Sometimes it allows duplicate values in d_id and sometimes it does not!
Let's test:
TEST1
SELECT id,d_id
FROM public.market_post
WHERE id in (1910764,2584556)
Result:
--------------------------------
| id | d_id |is_deleted|
--------------------------------
|1910764 | QYynk1fG | true |
--------------------------------
|2584556 | gYkgfj_M | true |
--------------------------------
Now I want to update:
UPDATE public.market_post SET d_id = 'gYkgfj_M' WHERE id = 1910764
Result:
[2022-07-24 10:31:52] 1 row affected in 116 ms
OMG! Now the result is:
---------------------
| id | d_id |
---------------------
|1910764 | gYkgfj_M |
---------------------
|2584556 | gYkgfj_M |
---------------------
Interesting point:
SELECT id,d_id FROM public.market_post WHERE d_id='gYkgfj_M'
only returns one row!
---------------------
| id | d_id |
---------------------
|1910764 | gYkgfj_M |
---------------------
TEST2
SELECT id,d_id
FROM public.market_post
WHERE id in (191076 , 258455)
Result:
--------------------------------
| id | d_id |is_deleted|
--------------------------------
|191076 | SYyFk1fA | false |
--------------------------------
|258455 | fYkDfjbb | false |
--------------------------------
Now I want to update:
UPDATE public.market_post SET d_id = 'fYkDfjbb' WHERE id = 191076
Result:
[23505] ERROR: duplicate key value violates unique constraint "unique_d_id"
Detail: Key (d_id)=(fYkDfjbb) already exists.
This shows that the duplicate value is rejected for rows where is_deleted=false.
Does the unique constraint not work in Postgres? (Of course it should work.) Or has the index affected it?
Is this a bug? No: I tested it on a new table (on my server and in SQL Fiddle) and everything works correctly there, so there isn't any bug.
But the old table does not work.
It should be noted that I created the indexes and the unique constraint when the table was full of data.
Version: 12
As #ErwinBrandstetter said in the comments, I should have rebuilt the indexes! Apparently, I had fallen into a Postgres bug.
After running REINDEX TABLE market_post; everything was solved.
The hard part was that I had to find the rows with duplicate d_id values and delete one of each pair! As I said in the question, Postgres returns only one of the records. I loaded the entire table with the help of pandas, identified the duplicate values, and then removed them.
After these steps, it was time to REINDEX.
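The duplicate hunt can be sketched in plain Python as well. This is not the pandas code I actually ran, only a minimal illustration of the idea. It assumes the (id, d_id) pairs were read with a full table read rather than a lookup through the broken index, and the function name find_duplicate_d_ids is made up for the example.

```python
from collections import defaultdict

def find_duplicate_d_ids(rows):
    """Return {d_id: [ids...]} for every d_id that occurs on more than one row."""
    ids_by_d_id = defaultdict(list)
    for row_id, d_id in rows:
        if d_id is not None:  # NULLs never violate a unique constraint
            ids_by_d_id[d_id].append(row_id)
    return {d_id: ids for d_id, ids in ids_by_d_id.items() if len(ids) > 1}

# Sample rows taken from the tests in the question.
rows = [
    (1910764, "gYkgfj_M"),
    (2584556, "gYkgfj_M"),
    (191076, "SYyFk1fA"),
]
duplicates = find_duplicate_d_ids(rows)
```

For each d_id in the result, all but one of the listed ids have to be deleted or given a new d_id before the REINDEX.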

postgres update with join slow performance

I have the tables below and am trying to do an update from the second table to the first one. It took more than 15 minutes, at which point I killed it.
Basically I am just trying to copy one field from one table into a field of the other. Both tables have around 2.5 million rows. How can we optimize this operation?
first table:
\d table1
Table "public.fa_market_urunu"
Column | Type | Collation | Nullable | Default
--------------+-----------------------------+-----------+----------+-----------------------
id | character varying | | not null |
ad | character varying | | |
url | character varying | | |
image_url | character varying | | |
satici_id | character varying | | not null |
satici | character varying | | not null |
category_id | character varying | | |
date_created | timestamp with time zone | | not null | now()
last_updated | timestamp(3) with time zone | | not null | now()
fiyat | double precision | | |
Indexes:
"tbl1_pkey" PRIMARY KEY, btree (id)
"tbl1_satici" UNIQUE, btree (id, satici)
"tbl1_satici_id" UNIQUE, btree (satici, id)
"tbl1_satici_id_last_updated" UNIQUE, btree (satici, id, last_updated)
"tbl1_satici_id_satici_key" UNIQUE CONSTRAINT, btree (satici_id, satici)
"tbl1_satici_last_updated_id" UNIQUE, btree (satici, last_updated, id)
"tbl1_last_updated" btree (last_updated)
"tbl1_satici_category" btree (satici, category_id)
"tbl1_satici_category_last_updated" btree (satici, category_id, last_updated)
"tbl1_satici_last_updated" btree (satici, last_updated)
second table:
\d table2
Table "public.temp_son_fiyat"
Column | Type | Collation | Nullable | Default
---------+-------------------+-----------+----------+---------
urun_id | character varying | | |
satici | character varying | | |
fiyat | double precision | | |
Indexes:
"ind_u" UNIQUE, btree (urun_id, satici)
My operation:
UPDATE table1 mu
SET fiyat = fn.fiyat
FROM table2 AS fn
WHERE mu.satici_id = fn.urun_id AND mu.satici = fn.satici;
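This happens because of the indexes. Because of MVCC, every UPDATE in Postgres writes a new version of the row regardless of which column is updated, and unless the update qualifies as a heap-only tuple (HOT) update, every index on the table gets a new entry as well. With ten indexes on table1, that is a lot of extra work for 2.5 million rows. To make it faster, dropping the indexes before the update and recreating them afterwards, or writing the result into a new table and swapping it in, would work (if it is possible to do those).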
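The "swap to a new table" idea could look roughly like the sketch below. It uses Python's built-in sqlite3 module as a stand-in for Postgres so that it is runnable. The table is cut down to four columns, and the sample rows and the name table1_new are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table table1 (id text primary key, satici_id text not null,
                     satici text not null, fiyat real);
create table table2 (urun_id text, satici text, fiyat real);
insert into table1 values ('a', 'u1', 's1', 1.0), ('b', 'u2', 's1', 2.0);
insert into table2 values ('u1', 's1', 9.5);

-- Build the replacement in a single pass; it has no indexes to maintain yet.
create table table1_new as
select mu.id, mu.satici_id, mu.satici,
       case when fn.urun_id is not null then fn.fiyat else mu.fiyat end as fiyat
from table1 mu
left join table2 fn
       on mu.satici_id = fn.urun_id and mu.satici = fn.satici;

-- Swap the tables. Keys, constraints and indexes must be recreated afterwards.
drop table table1;
alter table table1_new rename to table1;
""")

rows = conn.execute("select id, fiyat from table1 order by id").fetchall()
```

Rows without a match in table2 keep their old fiyat, which mirrors what the UPDATE ... FROM statement would have left untouched.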

Find duplicates across many to many relationship in SQL Server

I am trying to find potential duplicates in a many-to-many join in a SQL Server database.
I have a database of students attending classes and have the following tables: Lessons, Attendees, Classrooms and Students.
I am trying to find duplicates where the same group of students may have been entered twice for the same date and classroom.
Students to Lessons is many-to-many broken down by the Attendees table. The LessonID, StudentID, ClassroomID fields are SQL Server Identity primary keys. Attendees is simply the join table with a compound key of student and lesson.
Lessons:
LessonID
LessonDate
ClassroomID
Students:
StudentID
Attendees:
LessonID
StudentID
Classrooms:
ClassroomID
It is legitimate that the same group of students may have attended different classes on the same day in the same classroom, but I want to flag them up as potential duplicates, in case the record has erroneously been entered twice.
I can’t figure out how to find matching sets of students for the same classroom on the same date.
So, an example of duplicate data I would expect to find would be:
Lessons:
+----------+-------------+------------+
| LessonID | ClassroomID | LessonDate |
+----------+-------------+------------+
| 335867 | 347 | 06/01/2020 |
| 335872 | 347 | 06/01/2020 |
+----------+-------------+------------+
Attendees:
+----------+-----------+
| LessonID | StudentID |
+----------+-----------+
| 335867   | 432       |
| 335867   | 1398      |
| 335867   | 5107      |
| 335872   | 432       |
| 335872   | 1398      |
| 335872   | 5107      |
+----------+-----------+
Another way to look at this would be: for any given Lesson, which other lessons (if any) have the same students in the same classroom on the same day.
I found a solution myself using the STRING_AGG function to flatten out the hierarchy. I added the following query to the database:
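-- WITHIN GROUP (ORDER BY ...) fixes the concatenation order, so identical groups always produce identical strings
SELECT Lessons.LessonID, Lessons.ClassroomID, Lessons.LessonDate, STRING_AGG(CAST(Attendees.StudentID AS varchar(10)), '-') WITHIN GROUP (ORDER BY Attendees.StudentID) AS team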
FROM Lessons INNER JOIN
Attendees ON Lessons.LessonID = Attendees.LessonID
GROUP BY Lessons.LessonID, Lessons.ClassroomID, Lessons.LessonDate
This gives lesson data that looks like this:
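+----------+-------------+------------+--------------+
| LessonID | ClassroomID | LessonDate | team         |
+----------+-------------+------------+--------------+
| 1        | 17          | 2006-01-04 | 3-5-10-23    |
| 2        | 18          | 2006-01-04 | 2-17-252     |
| 3        | 18          | 2006-01-04 | 2-16-18      |
| 4        | 18          | 2006-01-04 | 2-6-11-16-18 |
+----------+-------------+------------+--------------+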
which I can then simply query against.
I will turn this into a stored procedure passing in for my chosen lesson: LessonDate, ClassroomID and its own "STRING_AGG" team of students, as filters.
The STRING_AGG function is only available from SQL Server 2017. So for older versions you can use the FOR XML PATH('') syntax, concatenating with a hyphen, with a STUFF to remove the leading hyphen:
SELECT dbo.Lessons.LessonID, dbo.Lessons.ClassroomID, dbo.Lessons.LessonDate,
(
stuff(
(select '-' + cast(StudentId as varchar(10))
FROM Attendees
WHERE Attendees.LessonId = Lessons.Lessonid
ORDER BY StudentId -- fixed order, so identical groups give identical strings
FOR XML path('')
),1,1,'')
)
as Team
FROM dbo.Lessons
You could concatenate with a comma instead for standard CSV format if preferred.
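The same comparison can also be done outside the database. The sketch below is only an illustration in Python, not part of my SQL solution. It groups lessons by classroom, date and the set of attending students, using the example rows from the question; the function name potential_duplicates is made up.

```python
from collections import defaultdict

# (LessonID, ClassroomID, LessonDate) and (LessonID, StudentID) rows from the example.
lessons = [(335867, 347, "06/01/2020"), (335872, 347, "06/01/2020")]
attendees = [
    (335867, 432), (335867, 1398), (335867, 5107),
    (335872, 432), (335872, 1398), (335872, 5107),
]

def potential_duplicates(lessons, attendees):
    """Return lists of LessonIDs sharing classroom, date and student set."""
    students_by_lesson = defaultdict(set)
    for lesson_id, student_id in attendees:
        students_by_lesson[lesson_id].add(student_id)

    groups = defaultdict(list)
    for lesson_id, classroom_id, lesson_date in lessons:
        # A frozenset ignores order, so no sorting of student IDs is needed.
        key = (classroom_id, lesson_date, frozenset(students_by_lesson[lesson_id]))
        groups[key].append(lesson_id)

    return [sorted(ids) for ids in groups.values() if len(ids) > 1]

duplicate_groups = potential_duplicates(lessons, attendees)
```

Note that two lessons with no attendees at all in the same classroom on the same date would also be reported, since their student sets are both empty.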

PostgreSQL conditional join - if column is not NULL

I have a table "temp"
author | title | bibkey | Data
-----------------------------------
John | JsPaper | John2008 | 65
Kate | KsPaper | | 60
| | Data2015 | 80
From this I want to produce two tables, a 'sample_table' and a 'ref_table' like so:
sample_table:
sample_id|ref_id| data
--------------------------
1 | 1 | 65
2 | 2 | 60
3 | 3 | 80
ref_table:
ref_id | author | title | bibkey
--------------------------------------
1 | John | JsPaper | John2008
2 | Kate | KsPaper |
3 | | | Data2015
I've created both tables
CREATE TABLE ref_table (
ref_id serial PRIMARY KEY,
author text,
title text,
bibkey text
);
CREATE TABLE sample_table (
sample_id serial PRIMARY KEY,
ref_id integer REFERENCES ref_table(ref_id),
data numeric
);
And inserted the unique author, title, bibkey rows into the reference table as above. What I want to do now is join temp to ref_table to get the ref_ids for sample_table. For my insert statement I currently have:
INSERT INTO sample_table (
ref_id,data
)
SELECT ref.ref_id, t.data
FROM
temp t
LEFT JOIN
ref_table ref ON COALESCE(ref.author,'00000') = COALESCE(t.author,'00000')
AND COALESCE(ref.title,'00000') = COALESCE(t.title,'00000')
AND COALESCE(ref.bibkey,'00000') = COALESCE(t.bibkey,'00000');
However, I really want a conditional statement in the join, rather than matching on all three columns as I do now:
IF a bibkey exists for that row, I know it is unique, and join only on that.
If bibkey is NULL, then join on both author and title for the unique pair, and not bibkey.
Is this possible?
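One possible shape for such a condition is an OR between the two cases inside the ON clause. The sketch below tries this with Python's built-in sqlite3 module as a stand-in for PostgreSQL, so treat it as an illustration rather than a tested Postgres answer. The source table is called temp_rows here instead of temp, purely to stay clear of SQLite's TEMP keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table temp_rows (author text, title text, bibkey text, data numeric);
create table ref_table (ref_id integer primary key, author text, title text, bibkey text);
insert into temp_rows values ('John', 'JsPaper', 'John2008', 65),
                             ('Kate', 'KsPaper', NULL, 60),
                             (NULL, NULL, 'Data2015', 80);
insert into ref_table values (1, 'John', 'JsPaper', 'John2008'),
                             (2, 'Kate', 'KsPaper', NULL),
                             (3, NULL, NULL, 'Data2015');
""")

rows = conn.execute("""
    select ref.ref_id, t.data
    from temp_rows t
    left join ref_table ref
      -- case 1: a bibkey exists, so match on it alone
      on (t.bibkey is not null and ref.bibkey = t.bibkey)
      -- case 2: no bibkey, so match on the author/title pair instead
      or (t.bibkey is null and ref.bibkey is null
          and ref.author = t.author and ref.title = t.title)
    order by ref.ref_id
""").fetchall()
```

The two branches are mutually exclusive because each one tests t.bibkey, so a row from the source table can only match through one of them.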

Database structure for storing personal skills

I need to design a database for storing a person's skills. A person can have none, one or several skills. What is a good way to store this so that skills are easy to modify and fast to search?
I have been thinking of:
1. using a bit array, where each bit position represents a skill,
2. a relation table in which each row links a person to a skill,
3. each skill as a field in the person table.
Any other suggestions, or what should I aim for?
First, we need a persons table (all code examples use MySQL syntax):
CREATE TABLE IF NOT EXISTS `persons` (
`id` int unsigned NOT NULL AUTO_INCREMENT,
`first_name` varchar(50) NOT NULL,
`last_name` varchar(50) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB Comment='Persons';
And pretend this is the data in the table:
|----|------------|-----------|
| id | first_name | last_name |
|----|------------|-----------|
| 1 | John | Doe |
| 2 | Benny | Hill |
| 3 | Linus | Torvalds |
| 4 | Donald | Knuth |
| .. | .......... | ......... |
|----|------------|-----------|
Then we need a skills table to hold all known skills:
CREATE TABLE IF NOT EXISTS `skills` (
`id` int unsigned NOT NULL AUTO_INCREMENT,
`name` varchar(50) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB Comment='Skills';
|----|---------------|
| id | name |
|----|---------------|
| 1 | Swimming |
| 2 | Pilot |
| 3 | Writing |
| 4 | Create kernel |
| 5 | Astronaut |
| .. | ............. |
|----|---------------|
Finally we need a table that associates a person with a skill:
CREATE TABLE IF NOT EXISTS `persons_skills` (
`person_id` int unsigned NOT NULL,
`skill_id` int unsigned NOT NULL,
PRIMARY KEY (`person_id`, `skill_id`),
KEY (`person_id`),
KEY (`skill_id`)
) ENGINE=InnoDB Comment='Skills held by every person';
ALTER TABLE `persons_skills`
ADD FOREIGN KEY (`person_id`) REFERENCES `persons` (`id`) ON DELETE CASCADE ON UPDATE CASCADE,
ADD FOREIGN KEY (`skill_id`) REFERENCES `skills` (`id`) ON DELETE CASCADE ON UPDATE CASCADE;
The primary key is defined so that no person can be associated with the same skill more than once and both columns are foreign key to their respective tables.
Assume the data below:
|-----------|----------|
| person_id | skill_id |
|-----------|----------|
| 1 | 1 |
| 2 | 1 |
| 2 | 2 |
| 3 | 1 |
| 3 | 4 |
| 4 | 2 |
| 4 | 3 |
| ......... | ........ |
|-----------|----------|
This data would indicate that John Doe, Benny Hill and Linus Torvalds all have the skill "Swimming". Benny Hill and Donald Knuth are both pilots. Linus Torvalds created a kernel. And Donald Knuth is a writer. None of the persons are an Astronaut...
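To show how a search works against this structure, here is a minimal sketch using Python's built-in sqlite3 module in place of MySQL, loaded with the sample data above. The helper name people_with_skill is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table persons (id integer primary key, first_name text not null, last_name text not null);
create table skills (id integer primary key, name text not null);
create table persons_skills (
    person_id integer not null references persons(id),
    skill_id  integer not null references skills(id),
    primary key (person_id, skill_id)
);
insert into persons values (1, 'John', 'Doe'), (2, 'Benny', 'Hill'),
                           (3, 'Linus', 'Torvalds'), (4, 'Donald', 'Knuth');
insert into skills values (1, 'Swimming'), (2, 'Pilot'), (3, 'Writing'),
                          (4, 'Create kernel'), (5, 'Astronaut');
insert into persons_skills values (1, 1), (2, 1), (2, 2), (3, 1), (3, 4), (4, 2), (4, 3);
""")

def people_with_skill(skill_name):
    """Return the full names of everyone holding the given skill."""
    rows = conn.execute("""
        select p.first_name || ' ' || p.last_name
        from persons p
        join persons_skills ps on ps.person_id = p.id
        join skills s on s.id = ps.skill_id
        where s.name = ?
        order by p.id
    """, (skill_name,)).fetchall()
    return [name for (name,) in rows]

swimmers = people_with_skill("Swimming")
astronauts = people_with_skill("Astronaut")
```

Adding or removing a skill for a person is then a single insert or delete on persons_skills, with no change to the table structure.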
It's a classic many-to-many relationship, so I would suggest a persons table, a skills table and a personToSkill table. Your other suggested solutions might be tempting at first, but they are both a maintenance hell.
