Relational Design: Column Attributes - database

I have a system that allows a person to select a form type that they want to fill out from a drop down box. From this, the rest of the fields for that particular form are shown, the user fills them out, and submits the entry.
Form Table:
| form_id | age_enabled | profession_enabled | salary_enabled | name_enabled |
This describes the metadata of a form so the system will know how to draw it. So each _enabled column is a boolean true if the form should include a field to be filled out for this column.
Entry Table:
| entry_id | form_id | age | profession | salary | name | country |
This stores a submitted form, where age, profession, etc. store the actual values filled out in the form (or null if the field didn't exist in the form).
Users can add new forms to the system on the fly.
Now the main question: I would like to add the ability for a user designing a new form to be able to include a list of possible values for an attribute (e.g. profession is a drop down list of say 20 professions instead of just a text box when filling out the form). I can't simply store a global list of possible values for each column because each form will have a different list of values to pick from.
The only solution I can come up with is to include another set of columns in Form table like profession_values and then store the values in a character delimited format. I am concerned that a column may one day have a large number of possible values and this column will get out of control.
Note that new columns can be added later to Form if necessary (and thus Entry in turn), but 90% of forms have the same base set of columns, so I think this design is better than an EAV design. Thoughts?
I have never seen a relational design for such a system (as a whole) and I can't seem to figure out a decent way to do this.

Create a new table to contain groups of values:
-- "values" and "group" are reserved words in SQL, so they must be quoted
CREATE TABLE "values" (
    id SERIAL,
    "group" INT NOT NULL,
    value TEXT NOT NULL,
    label TEXT NOT NULL,
    PRIMARY KEY (id),
    UNIQUE ("group", value)
);
For example:
INSERT INTO "values" ("group", value, label) VALUES (1, 'NY', 'New York');
INSERT INTO "values" ("group", value, label) VALUES (1, 'CA', 'California');
INSERT INTO "values" ("group", value, label) VALUES (1, 'FL', 'Florida');
So, group 1 contains three possible values for your drop-down selector. Then, your form table can reference what group a particular column uses.
Note also that you should add fields to a form via rows, not columns. That is, your app shouldn't be adjusting the schema when you add new forms; it should only create new rows. So, make each field its own row:
CREATE TABLE form (
    id SERIAL,
    name TEXT NOT NULL,
    PRIMARY KEY (id)
);
CREATE TABLE form_fields (
    id SERIAL,
    form_id INT NOT NULL REFERENCES form(id),
    field_label TEXT NOT NULL,
    field_type TEXT NOT NULL,  -- 'text' or 'select'
    field_select INT,          -- group number in the values table; NULL for free-text fields
    PRIMARY KEY (id)
);
INSERT INTO form (name) VALUES ('new form');
-- pseudocode: fetch the generated form id in your application
$id = last_insert_id()
INSERT INTO form_fields (form_id, field_label, field_type) VALUES ($id, 'age', 'text');
INSERT INTO form_fields (form_id, field_label, field_type) VALUES ($id, 'profession', 'text');
INSERT INTO form_fields (form_id, field_label, field_type) VALUES ($id, 'salary', 'text');
INSERT INTO form_fields (form_id, field_label, field_type, field_select) VALUES ($id, 'state', 'select', 1);
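To show how an application might read this layout back, here is a minimal sketch using Python's built-in sqlite3 module. It assumes SQLite rather than the PostgreSQL-style SERIAL used above, stores field_type as text, and treats field_select as a group number; the describe_form helper is a hypothetical name, not part of the answer.

```python
import sqlite3

# Sketch only: SQLite stands in for the real database, and "values"/"group"
# are quoted because they are SQL keywords.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE "values" (
    id INTEGER PRIMARY KEY,
    "group" INTEGER NOT NULL,
    value TEXT NOT NULL,
    label TEXT NOT NULL,
    UNIQUE ("group", value)
);
CREATE TABLE form (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE form_fields (
    id INTEGER PRIMARY KEY,
    form_id INTEGER NOT NULL REFERENCES form(id),
    field_label TEXT NOT NULL,
    field_type TEXT NOT NULL,
    field_select INTEGER
);
""")
conn.executemany(
    'INSERT INTO "values" ("group", value, label) VALUES (?, ?, ?)',
    [(1, "NY", "New York"), (1, "CA", "California"), (1, "FL", "Florida")],
)
form_id = conn.execute("INSERT INTO form (name) VALUES ('new form')").lastrowid
conn.executemany(
    "INSERT INTO form_fields (form_id, field_label, field_type, field_select)"
    " VALUES (?, ?, ?, ?)",
    [(form_id, "age", "text", None), (form_id, "state", "select", 1)],
)


def describe_form(conn, form_id):
    """Return one (label, type, options) tuple per field, in insertion order."""
    fields = []
    rows = conn.execute(
        "SELECT field_label, field_type, field_select"
        " FROM form_fields WHERE form_id = ? ORDER BY id",
        (form_id,),
    ).fetchall()
    for label, ftype, group in rows:
        options = []
        if group is not None:
            # Only 'select' fields carry a group of allowed values.
            options = conn.execute(
                'SELECT value, label FROM "values" WHERE "group" = ? ORDER BY id',
                (group,),
            ).fetchall()
        fields.append((label, ftype, options))
    return fields


layout = describe_form(conn, form_id)
```

Adding a new form or a new drop-down list is then only a matter of inserting rows; the schema itself never changes.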

I think you are starting from the wrong place entirely.
| form_id | age_enabled | profession_enabled | salary_enabled | name_enabled |
Are you just going to keep adding columns to this table for every single form field you can ever have? In general, the list could be endless.
How will your application code display a form if all the fields are in columns in this table?
What about a form table like this:
| form_id | form description |
Then another table, formAttributes with one row per entry on the form:
| attribute_id | form_id | position | name | type |
Then a third table, formAttributeValidValues, with one row per valid value of an attribute:
| attribute_id | value_id | value |
This may seem like more work to begin with, but it really isn't. Think about how easy it is to add or remove an attribute or a value on a form. Also think about how your application will render the form:
for form_element in (select name, type, attribute_id
                     from formAttributes
                     where form_id = :bind
                     order by position asc) loop
    render_form_element
    if form_element.type = 'list of values' then
        render_values with 'select ... from formAttributeValidValues'
    end if
end loop;
The dilemma then becomes how to store the form results. Ideally you would store them with one row per form element, in a table that looks something like:
| completed_form_id | form_id | attribute_id | value |
If you only ever work on one form at a time, this model will work well. If you want to do aggregations over lots of forms, the resulting queries become more difficult; however, that is reporting, which can run in a different process from the online form entry. You can start to think of things like pivot queries to transform the rows into columns, or materialized views to pull together forms of the same type, etc.
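As a rough illustration of the pivot idea, the sketch below stores two completed forms one row per element and then turns them back into one row per form with conditional aggregation. It uses Python's sqlite3 module; the table name completedFormValues and the sample data are assumptions made up for the example.

```python
import sqlite3

# Sketch only: completedFormValues is a hypothetical name for the results table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE formAttributes (
    attribute_id INTEGER PRIMARY KEY,
    form_id INTEGER NOT NULL,
    position INTEGER NOT NULL,
    name TEXT NOT NULL,
    type TEXT NOT NULL
);
CREATE TABLE completedFormValues (
    completed_form_id INTEGER NOT NULL,
    form_id INTEGER NOT NULL,
    attribute_id INTEGER NOT NULL REFERENCES formAttributes(attribute_id),
    value TEXT,
    PRIMARY KEY (completed_form_id, attribute_id)
);
INSERT INTO formAttributes VALUES
    (1, 10, 1, 'age', 'text'),
    (2, 10, 2, 'profession', 'list of values');
INSERT INTO completedFormValues VALUES
    (100, 10, 1, '34'), (100, 10, 2, 'engineer'),
    (101, 10, 1, '51'), (101, 10, 2, 'teacher');
""")

# One output row per completed form; each CASE picks out a single attribute.
pivot_sql = """
SELECT v.completed_form_id,
       MAX(CASE WHEN a.name = 'age' THEN v.value END) AS age,
       MAX(CASE WHEN a.name = 'profession' THEN v.value END) AS profession
FROM completedFormValues AS v
JOIN formAttributes AS a ON a.attribute_id = v.attribute_id
WHERE v.form_id = ?
GROUP BY v.completed_form_id
ORDER BY v.completed_form_id
"""
pivoted = conn.execute(pivot_sql, (10,)).fetchall()
```

The column list of the pivot query has to be generated from formAttributes for each form type, which is the extra reporting work mentioned above.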

Related

CassandraDB table with multiple Key-Value

I am a new CassandraDB user. I am trying to create a table which has 3 static columns, for example "name", "city" and "age", and then I was thinking in two "key" and "value" columns, since my table could receive a lot of inputs. How can I define this table? I am trying to achieve something scalable, i.e:
Table columns --> "Name", "City", "Age", "Key", "Value"
Name: Mark
City: Liverpool
Age: 26
Key: Car
Value: Audi A3
Key: Job
Value: Computer Engineer
Key: Main hobby
Value: Football
I am looking for the TABLE DEFINITION.. Any help? Thank you so so much in advance.
If I understand correctly, you want to create a key-value store, grouped by "name", "city" and "age". There are a few solutions for this.
The first uses STATIC columns:
create table record_by_id(
    recordId text,
    name text static,
    city text static,
    age int static,
    key text,
    value text,
    primary key (recordId, key)
);
With this table design, name, city and age remain constant for the same recordId. You can add any number of key-value pairs for the same recordId.
The second approach would be:
create table record_by_id(
    name text,
    city text,
    age int,
    key text,
    value text,
    primary key ((name, city, age), key)
);
In this design, name, city and age are all part of the partition key, and the key column is the clustering key.
Both approaches are scalable, but the first is easier to maintain.
table which has 3 static columns
So by "static" I assume you're not referring to Cassandra's definition of static columns. Which is cool, I know what you mean. But the mention did give me an idea of how to approach this:
trying to create the table definition
I see two ways to go about this.
CREATE TABLE user_properties (
name TEXT,
city TEXT STATIC,
age INT STATIC,
key TEXT,
value TEXT,
PRIMARY KEY (name,key));
Because we have static columns (only stored w/ the partition key name) adding more key/values is just a matter of adding more keys to the same name, so INSERTing data looks like this:
INSERT INTO user_properties (name,city,age,key,value)
VALUES ('Mark','Liverpool',26,'Car','Audi A3');
INSERT INTO user_properties (name,key,value)
VALUES ('Mark','Job','Computer Engineer');
INSERT INTO user_properties (name,key,value)
VALUES ('Mark','Main hobby','Football');
Querying looks like this:
> SELECT * FROM user_properties WHERE name='Mark';
name | key | age | city | value
------+------------+-----+-----------+-------------------
Mark | Car | 26 | Liverpool | Audi A3
Mark | Job | 26 | Liverpool | Computer Engineer
Mark | Main hobby | 26 | Liverpool | Football
(3 rows)
This is the "simple" way to go about it.
Or
CREATE TABLE user_properties_map (
name TEXT,
city TEXT,
age INT,
kv MAP<TEXT,TEXT>,
PRIMARY KEY (name));
With a single partition key as the PRIMARY KEY, we can INSERT everything in one shot:
INSERT INTO user_properties_map (name,city,age,kv)
VALUES ('Mark','Liverpool',26,{'Car':'Audi A3',
'Job':'Computer Engineer',
'Main hobby':'Football'});
And querying looks like this:
> SELECT * FROM user_properties_map WHERE name='Mark';
name | age | city | kv
------+-----+-----------+--------------------------------------------------------------------------
Mark | 26 | Liverpool | {'Car': 'Audi A3', 'Job': 'Computer Engineer', 'Main hobby': 'Football'}
(1 rows)
This has the added benefit of putting the properties into a map, which might be helpful if that's the way you're intending to work with it on the application side. The drawbacks are that Cassandra collections are best kept under 100 items, the writes are a little more complicated, and you can't query individual entries of the map.
But by keying on name (might want to also include last name or something else to help with uniqueness), data should scale fine. And partition growth won't be a problem, unless you're planning on thousands of key/value pairs.
Basically, choose the structure based on the standard Cassandra advice of considering how you'd query the data, and then build the table to suit it.

Database normalization. Which is better, inserting in one row or multiple row?

I'm currently designing my tables. I have three types of user: pyd, ppp and ppk. Which is better: inserting the data in one row, or in multiple rows? Or do you have any other suggestion? Thanks.
I would go for 3 tables:
user_type
typeID | typeDescription
Main_table
id_main_table | id_user | id_type
table_bhg_i
id_bhg_i | id_main_table | data1 | data2 | data3
Although I see you are inserting IDs for each user, I don't quite understand how you are going to differentiate between the users. Had I designed this DB, I would have gone for tables like these:
tableName: UserTypes
This table would contain two fields: the first would be the ID and the second would be the type of user, like
UsertypeID | UserType
UsertypeID is the primary key and can be auto-incremented, while UserType would hold your user types (pyd, ppk and so on). Designing it this way gives you the flexibility to add user types later without changing the schema of the table.
Next, you can add a table for creating multiple users of a particular type. This table refers to the UsertypeID of the previous table, which makes it easy to add new users and removes redundancy.
tableName: Users
This table would contain the user's ID, the user's name, and the UsertypeID of the user's type:
UserId | UserName | UserTypeID
The next thing you can do is make a table to hold the data; let it be called DataTable.
tableName: DataTable
This table will contain the users' data and reference the users:
DataTabID | DataFields (can be any number of columns) | UserID (references Users table)
These tables should be more than sufficient. If you have doubts, ask me in chat.

References in a table

I have a table like this, that contains items that are added to the database.
Catalog table example
id | element | catalog
0 | mazda | car
1 | penguin | animal
2 | zebra | animal
etc....
And then I have a table where the user selects items from that table, and I keep a reference of what has been selected like this
User table example
id | name | age | itemsSelected
0 | john | 18 | 2;3;7;9
So what I am trying to say is that I keep a reference to what the user has selected as a string of IDs, but this seems a tad troublesome.
When I run a query to get information about a user, all I get is the string 2;3;7;9, when what I really want is an array of the items corresponding to those IDs.
Right now I get the IDs, split the string, and then run another query to find the elements the IDs correspond to.
Is there an easier way to do this, if my question is understandable?
Yes, there is a way to do this. You create a third table which maps rows of A to rows of B. This is called a many-to-many relationship, and the third table is often called a junction table.
You have your Catalogue table (int, varchar(MAX), varchar(MAX)) or similar.
You have your User table (int, varchar(MAX), varchar(MAX), varchar(MAX)) or similar, essentially, remove the last column and then create another table:
You create a UserCatalogue table: (int UserId, int CatalogueId) with a Primary Key on both columns. Then the UserId column gets a Foreign-Key to User.Id, and the CatalogueId column gets a Foreign-Key to Catalogue.Id. This preserves the relationship and eases queries. It also means that if Catalogue.Id number 22 does not exist, you cannot accidentally insert it as a relation between the two. This is called referential integrity: if you tell SQL Server "this column must reference this other table", SQL Server will enforce that relationship.
After you create this, for each itemsSelected you add an entry: I.e.
UserId | CatalogueId
0 | 2
0 | 3
0 | 7
0 | 9
This also allows you to use JOINs on the tables, so one query returns the selected items directly instead of splitting a string and querying again.
Additionally, and unrelated to the question, you can also optimize the Catalogue table you have a bit, and create another table for CatalogueGroup, which contains your last column there (catalog: car, animal) which is referenced via a Foreign-Key Relationship in the current Catalogue table definition you have. This will also save storage space and speed up SQL Server work, as it no longer has to read a string column if you only want the element value.
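A minimal sketch of the junction-table approach, using Python's sqlite3 module in place of SQL Server (an assumption made so the example runs anywhere). The sample rows are taken from the question where possible; note that SQLite only enforces foreign keys after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.executescript("""
CREATE TABLE Catalogue (
    id INTEGER PRIMARY KEY,
    element TEXT NOT NULL,
    catalog TEXT NOT NULL
);
CREATE TABLE User (id INTEGER PRIMARY KEY, name TEXT NOT NULL, age INTEGER);
CREATE TABLE UserCatalogue (
    UserId INTEGER NOT NULL REFERENCES User(id),
    CatalogueId INTEGER NOT NULL REFERENCES Catalogue(id),
    PRIMARY KEY (UserId, CatalogueId)
);
INSERT INTO Catalogue VALUES
    (0, 'mazda', 'car'), (1, 'penguin', 'animal'), (2, 'zebra', 'animal');
INSERT INTO User VALUES (0, 'john', 18);
INSERT INTO UserCatalogue VALUES (0, 1), (0, 2);
""")

# One JOIN replaces "split the string, then query again".
selected = conn.execute(
    """
    SELECT c.element
    FROM UserCatalogue AS uc
    JOIN Catalogue AS c ON c.id = uc.CatalogueId
    WHERE uc.UserId = ?
    ORDER BY c.id
    """,
    (0,),
).fetchall()

# Referential integrity: catalogue item 22 does not exist, so this is rejected.
try:
    conn.execute("INSERT INTO UserCatalogue VALUES (0, 22)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```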

Use INT PRIMARY KEY value in other table, same database call

I have created the following tables in SQLite3 to tag items (after reading this great response). The tags are saved in the table Tags, and the table ItemsTags records the relation between one item (from the table Items) and one or more tags (from the table Tags).
CREATE TABLE Items (ItemID INTEGER PRIMARY KEY, Title TEXT, Comment TEXT);
CREATE TABLE Tags (TagID INTEGER PRIMARY KEY, Title TEXT);
CREATE TABLE ItemsTags (ItemID INTEGER, TagID INTEGER);
When submitting a new row, the user will enter a title and a comment (which will be saved in the table Items) and choose one or more tags (which are chosen from, or added to, the table Tags). So far, I've for instance managed to do this:
INSERT INTO Items (Title, Comment) VALUES ('First title', 'First comment');
I want the column ItemID to be an INTEGER PRIMARY KEY, but at the same time, I want to access that value in the same call. Say, for instance, that my table Tags has the following layout:
TagID | Title
------|----------
1 | First tag
2 | Second tag
and that I want to tag the above-mentioned item ("First title", which has the ItemID 1) with TagID 1 and 2, and save the relation to the table ItemsTags. After I'm done, I want the following changes to be made:
Table: Items
ItemID | Title | Comment
--------|-------------|--------------
1 | First title | First comment
Table: Tags
TagID | Title
------|----------
1 | First tag
2 | Second tag
Table: ItemsTags
TagID | ItemID
------|---------
1 | 1
2 | 1
How can I achieve this? Thanks in advance!
You cannot insert rows into two separate tables with a single statement. (SQLite 3.7.11 and later do accept several rows in one VALUES list, so the two tag rows could share one INSERT, but one row per statement keeps things simple.) You will need four statements in this case:
INSERT INTO Items (Title, Comment) VALUES ('First title', 'First comment');
SELECT last_insert_rowid(); -- To get last inserted id
INSERT INTO ItemsTags (TagID, ItemID) VALUES (1, :LastID);
INSERT INTO ItemsTags (TagID, ItemID) VALUES (2, :LastID);
If you place them all inside a transaction, they will all get committed at the same time, with only a single lock placed on the database file.
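From application code, the same sequence might look like the sketch below, which uses Python's sqlite3 module. The add_item helper is a hypothetical name; the cursor's lastrowid attribute plays the role of last_insert_rowid(), and the with block wraps the inserts in one transaction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Items (ItemID INTEGER PRIMARY KEY, Title TEXT, Comment TEXT);
CREATE TABLE Tags (TagID INTEGER PRIMARY KEY, Title TEXT);
CREATE TABLE ItemsTags (ItemID INTEGER, TagID INTEGER);
INSERT INTO Tags (Title) VALUES ('First tag'), ('Second tag');
""")


def add_item(conn, title, comment, tag_ids):
    """Insert an item and its tag links atomically; return the new ItemID."""
    with conn:  # commits on success, rolls back if any statement raises
        cur = conn.execute(
            "INSERT INTO Items (Title, Comment) VALUES (?, ?)",
            (title, comment),
        )
        item_id = cur.lastrowid  # the generated INTEGER PRIMARY KEY
        conn.executemany(
            "INSERT INTO ItemsTags (TagID, ItemID) VALUES (?, ?)",
            [(tag_id, item_id) for tag_id in tag_ids],
        )
    return item_id


new_id = add_item(conn, "First title", "First comment", [1, 2])
links = conn.execute(
    "SELECT TagID, ItemID FROM ItemsTags ORDER BY TagID"
).fetchall()
```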

SQL Database design help needed

I am trying to limit the number of tables in my database because I hear that's good (I am a novice web developer). I want to have the user input a list into the database. So they input a title, an overall comment, and then start making the list. I can't figure out how to do this without making a new table for each list, because, say, one user wants a list with 44 values and another user wants a list of 10 values. I would really appreciate any help/insight you can give me.
Basically, you want to make a table for the user lists, where each row in the table refers to one user's lists, and another table for the user list values, where each row in the table has a column for a reference to the list it belongs to, and a column for the value the user input.
Your Table Could Be:
UserID, int
ListID, int (Primary Key-Unique Identifier)
Title, VarChar(250)
Comment, VarChar(MAX)
Example Content:
1 | 1 | The Title | My Comment
1 | 2 | The Other Title | My other comment
2 | 3 | First Comment | Second Person, first comment
Each user just gets their lists from a query:
SELECT ListID, Title, Comment FROM the_Table
WHERE UserID = @UserID
You can get away with a single table of lines for all the lists, say for example simply
CREATE TABLE ListLines (
listID INTEGER,
lineNo INTEGER,
line TEXT,
PRIMARY KEY (listID, lineNo),
FOREIGN KEY (listID) REFERENCES Lists
);
with the table of lists becoming:
CREATE TABLE Lists (
listID INTEGER PRIMARY KEY,
userID INTEGER,
title TEXT,
comment TEXT,
FOREIGN KEY (userID) REFERENCES Users
);
assuming you have a Users table with primary key userID INTEGER with per-user information (name, etc, etc).
So to get all the lines of a list given its ID you just
SELECT line FROM ListLines
WHERE listID=:whateverid
ORDER BY lineNo;
or you could UNION that with e.g. the title:
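-- the title gets lineNo 0 so it sorts first (assumes list lines are numbered from 1)
SELECT 0 AS lineNo, title AS line FROM Lists
WHERE listID=:whateverid
UNION ALL
SELECT lineNo, line FROM ListLines
WHERE listID=:whateverid
ORDER BY lineNo;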
and so on. This flexible and efficient arrangement is the relational way of doing things...
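To make the point about list length concrete, here is a small sketch with Python's sqlite3 module in which a 2-line list and a 44-line list share the same two tables. The save_list and load_list helpers are hypothetical names, and the Users table is left out to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Lists (
    listID INTEGER PRIMARY KEY,
    userID INTEGER,
    title TEXT,
    comment TEXT
);
CREATE TABLE ListLines (
    listID INTEGER REFERENCES Lists(listID),
    lineNo INTEGER,
    line TEXT,
    PRIMARY KEY (listID, lineNo)
);
""")


def save_list(conn, user_id, title, comment, lines):
    """Store a list of any length: one Lists row plus one ListLines row per line."""
    with conn:  # single transaction for the header row and all its lines
        list_id = conn.execute(
            "INSERT INTO Lists (userID, title, comment) VALUES (?, ?, ?)",
            (user_id, title, comment),
        ).lastrowid
        conn.executemany(
            "INSERT INTO ListLines (listID, lineNo, line) VALUES (?, ?, ?)",
            [(list_id, n, text) for n, text in enumerate(lines, start=1)],
        )
    return list_id


def load_list(conn, list_id):
    """Return the lines of one list in their original order."""
    rows = conn.execute(
        "SELECT line FROM ListLines WHERE listID = ? ORDER BY lineNo",
        (list_id,),
    )
    return [line for (line,) in rows]


short_id = save_list(conn, 1, "Groceries", "weekly", ["milk", "eggs"])
long_id = save_list(conn, 2, "Countdown", "", [str(n) for n in range(44, 0, -1)])
```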
