Managing both fixed and user-defined values - sql-server

I'm designing the database for an application in which the user is presented with questions, and he must answer them. Think of it either as a questionnaire or as a quiz game, the concept applies to both. I plan to have:
a table with the questions
a table with the possible answers, each of them linked to the question it belongs to with a foreign key (let's keep things simple and assume it's a 1:many relationship, where answers cannot be shared between questions)
a table with the answers that users provided (with foreign keys to the question, the answer and the user ID)
Since many of the questions will be common cases, like yes/no, I decided I'd specify a "question type" enumeration to each question. If the application sees a yes/no question, for example, it means there are no answers in the database, and the application will automatically add the two answers, "Yes" and "No". This saves me hundreds or thousands of useless rows in the answers table.
However, I'm not sure how I should define the table to record user answers. Without the special types of questions, I'd just record the question ID, the answer ID and the user ID, which means "user X answered Y to question Z". However, "yes/no" questions would not have a matching answer in the table, so I can't use the answer ID.
Even making the answers shareable between questions (by making a many-to-many relationship between questions and answers) is not a good solution. Sure, it would allow me to define "Yes" and "No" as regular answers, but then applications should be aware that a "yes/no" question uses answers (say) 7 and 8 - or, when creating a "yes/no" question answers 7 and 8 should be bound to that question. But this means that these "special" answers' IDs must be hardcoded somewhere else. Also, this would not scale well should I add more special types of question in the future.
How should I proceed? Ideally, I need to store in each row of my "user answers" table either a fixed value or a foreign key to the answers table. Is there a better solution than using two columns, one of which is NULL?
I'm using SQL Server, if that matters.

Based on your description I think I'd go on the route of adding another column to the table and making the FK column nullable.
You'd probably have only a few choices for those special questions, so a nullable TINYINT column would do, and it costs only 1 extra byte per answer row. If this extra column happens to raise the number of columns past a multiple of eight, say from 8 to 9 or from 16 to 17, then you pay another extra byte for the growth of the null bitmap. But it's 2 extra bytes per row in the worst case.
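The two-column layout suggested above can be sketched as follows. This is only an illustration under stated assumptions: SQLite (through Python's sqlite3 module) stands in for SQL Server, all table and column names are hypothetical, and the CHECK constraint is an addition that is not part of the answer. It enforces that exactly one of the two columns is filled in.

```python
import sqlite3

# SQLite stands in for SQL Server; all names below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE question (
    question_id   INTEGER PRIMARY KEY,
    question_type TEXT NOT NULL,      -- e.g. 'custom' or 'yes_no'
    body          TEXT NOT NULL
);
CREATE TABLE answer (
    answer_id   INTEGER PRIMARY KEY,
    question_id INTEGER NOT NULL REFERENCES question(question_id),
    body        TEXT NOT NULL
);
CREATE TABLE user_answer (
    user_id     INTEGER NOT NULL,
    question_id INTEGER NOT NULL REFERENCES question(question_id),
    answer_id   INTEGER REFERENCES answer(answer_id),  -- NULL for special types
    fixed_value INTEGER,  -- NULL for regular questions (TINYINT in SQL Server)
    PRIMARY KEY (user_id, question_id),
    -- assumption, not from the answer: exactly one column must be filled in
    CHECK ((answer_id IS NULL) <> (fixed_value IS NULL))
);
""")

conn.execute("INSERT INTO question VALUES (1, 'custom', 'Favourite colour?')")
conn.execute("INSERT INTO question VALUES (2, 'yes_no', 'Do you like tea?')")
conn.execute("INSERT INTO answer VALUES (10, 1, 'Blue')")

# Regular question: the row points at the answers table.
conn.execute("INSERT INTO user_answer VALUES (42, 1, 10, NULL)")
# Yes/no question: the row stores a fixed value (1 = yes, 0 = no).
conn.execute("INSERT INTO user_answer VALUES (42, 2, NULL, 1)")

# A row with neither value violates the CHECK constraint.
try:
    conn.execute("INSERT INTO user_answer VALUES (43, 1, NULL, NULL)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

stored = conn.execute("SELECT COUNT(*) FROM user_answer").fetchone()[0]
```

The meaning of each fixed value (here 1 for "Yes" and 0 for "No") would still live in the application, next to the question type enumeration.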

Related

How to design a database table from a form? [closed]

I'm learning how to design databases, and I've been asked to create the table that will hold this form: Medical History. I'm learning to use Django/Python and I've already made the markup in HTML and CSS, but I don't think that making each question on the form a column would be the best approach. For example, I've thought of making the family history a separate table, while in the review of systems I want to make each item a set.
A pragmatic approach is to define tables based on the following criteria:
1) easy to select data from them (avoiding many JOINs or convoluted queries that require ORs or string splitting)
2) easy to understand (each concept maps to one table)
=> normalized structures usually do the trick here
Of course, these criteria are challenged in highly transactional environments (many INSERTs, UPDATEs, DELETEs).
I would assume that your case has moderate INSERTs but more SELECTs (reports).
For the Family history section I would normalize everything:
DiseaseType
    DiseaseTypeId
    Code -- used to identify the row independently of a name that can change over time
    Name -- breast cancer, colon cancer etc.
CollateralOption
    CollateralOptionId
    Code -- I would put UNIQUE constraints on Codes and Names
    Name -- no, yes, father
FamilyHistory
    FamilyHistoryId INT PK IDENTITY -- this may be omitted, but I prefer to have it when using an ORM
    PatientId -> FK -> Patient
    DiseaseTypeId -> FK -> DiseaseType
    CollateralOptionId -> FK -> CollateralOption
    Checked BIT -- optional: you may omit this and store records only for the checked ones
                -- having it may put some pressure on storage
                -- but it prevents some "stuffing" in the queries
These structures make it easy to COUNT the number of patients with colon cancer cases in their family, for example.
In short: if there is no serious reason against it, go for normalized structures.
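As a rough illustration of the counting query mentioned above, here is a minimal sketch using SQLite through Python's sqlite3 module. The sample rows, the 'COLON' and 'BREAST' codes and the 'Mother' option are made up for the example, the Patient table is left out, and the optional Checked column is omitted so that a row exists only for a checked box.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DiseaseType (
    DiseaseTypeId INTEGER PRIMARY KEY,
    Code TEXT NOT NULL UNIQUE,
    Name TEXT NOT NULL UNIQUE
);
CREATE TABLE CollateralOption (
    CollateralOptionId INTEGER PRIMARY KEY,
    Code TEXT NOT NULL UNIQUE,
    Name TEXT NOT NULL UNIQUE
);
CREATE TABLE FamilyHistory (
    FamilyHistoryId INTEGER PRIMARY KEY,
    PatientId INTEGER NOT NULL,  -- would reference Patient in a full schema
    DiseaseTypeId INTEGER NOT NULL REFERENCES DiseaseType(DiseaseTypeId),
    CollateralOptionId INTEGER NOT NULL
        REFERENCES CollateralOption(CollateralOptionId)
);
-- Made-up lookup values and sample rows.
INSERT INTO DiseaseType VALUES
    (1, 'BREAST', 'Breast cancer'), (2, 'COLON', 'Colon cancer');
INSERT INTO CollateralOption VALUES
    (1, 'FATHER', 'Father'), (2, 'MOTHER', 'Mother');
INSERT INTO FamilyHistory (PatientId, DiseaseTypeId, CollateralOptionId) VALUES
    (100, 2, 1),   -- patient 100: colon cancer, father
    (100, 2, 2),   -- patient 100: colon cancer, mother
    (101, 2, 2),   -- patient 101: colon cancer, mother
    (102, 1, 2);   -- patient 102: breast cancer, mother
""")

# Number of distinct patients with colon cancer in their family history.
colon_patients = conn.execute("""
    SELECT COUNT(DISTINCT fh.PatientId)
    FROM FamilyHistory fh
    JOIN DiseaseType dt ON dt.DiseaseTypeId = fh.DiseaseTypeId
    WHERE dt.Code = 'COLON'
""").fetchone()[0]
```

Filtering on Code rather than Name keeps the query stable if a display name is edited later, which is the reason the answer gives for having both columns.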
I don't see any advantage in performing design tricks on this data structure. Yes, making a boolean attribute of each of your checkboxes, and a string attribute of each of your free texts, will lead to a high number of attributes in one table. But this is just the logical structure of your data. All these attributes are dependent on the key, some person id (or at least that's what I assume, as a medical layman). Also, I assume that they are independent of each other, i.e. not determined by some other combination of attributes. So they go in the same table. Putting them in several tables won't gain anything, but will force you to do lots of joins if you query on different types of attributes (like all patients whose mother had breast cancer and who now have breast lumps).
I don't know exactly what you mean by making sets of some attributes. Do you mean to have just one attribute, and encode the sequence of boolean values e.g. in one integer, like 5 for yes-no-yes? Again that's not worth the trouble, as it won't save any space or whatever, but will make queries more complicated.
If you are still in doubt, try to formulate the most frequent use cases for those data, which will probably be typical queries on combinations of these attributes. Then we might see whether a different structure would make your life easier.

What is the best database design when multiple tables share a common data model? [closed]

It is very common to come across a situation where you have multiple tables, let's say Posts, Entries and News, and users can comment on any of them.
The natural way to model this would look something like this:
However, using the parent_type attribute to set up the primary key seems ambiguous to me, because it doesn't clearly describe which parent the comment belongs to.
It seems to me that the clearest way to understand the relationship would be to create an intermediate table for each relationship: posts_comments, entries_comments, news_comments. But then again, this doesn't make much sense in terms of database design, because you don't really need an intermediate table for a one-to-many relationship; they are only needed for many-to-many relationships.
Then maybe a third approach would be to make a different Comment model depending on what it belongs to: PostComment, EntryComment. This would make it easier to understand what each comment is for, but I think it is a terrible idea, because you could end up with a bunch of different tables with repeated columns when a comment could just be represented by one table.
So, what is the usual approach for this situation?
I agree that you don't want to create three tables that are almost identical if you can help it. I've seen plenty of databases that do this and it's a pain, because any change is likely to affect all the variations, so you have to make the change three times instead of once, and there are three times as many queries to change. And sooner or later someone will make a change and not know he should make the same change to the other two tables or be in a hurry or forget, and then six months later someone else comes along and wonders if there's a reason why the three tables are subtly different or if this is just carelessness.
But you also don't want to have a column with an ambiguous definition. You especially don't want a foreign key that can refer to different tables depending on the content of a type field. That would be really ugly.
A solution to a similar problem that I used once was to create an intermediate table. In this case, it would be -- I don't know if you have a word that encompasses news, posts, and events, so let me call them all collectively "articles". So we create an article_comments table. It may have no data except an ID. Then news, posts, and events all have a pointer to article_comments, and comments has a pointer to article_comments.
So if you want all the comments for a given news record, it's:
select whatever
from news n
join article_comments ac on ac.article_comments_id = n.article_comments_id
join comments c on c.article_comments_id = ac.article_comments_id
where n.news_id = @nid
Note that with this structure, all FKs are true FKs to a single table. You don't make article_comments point to news, posts, and events; you make news, posts and events point to article_comments, so all the FKs are clean.
Yes, it's an extra table to read, which would slow down queries a bit. But I think that's a price worth paying to keep the FKs clean.
One admittedly clumsy query with this structure would be if you want to know which article a given comment is for if you don't know the type of article. That would have to be:
select whatever
from comments c
join article_comments ac on ac.article_comments_id = c.article_comments_id
left join news n on n.article_comments_id = ac.article_comments_id
left join posts p on p.article_comments_id = ac.article_comments_id
left join events e on e.article_comments_id = ac.article_comments_id
where c.comment_id = @cid
and then see which of news, post, and event turns up non-null. (You could also do it with a join to a subquery that's a union, but I think that would be uglier.)
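To make the intermediate-table idea concrete, here is a minimal runnable sketch using SQLite through Python's sqlite3 module. The column lists and sample rows are assumptions made for the example, and only news and posts are modelled to keep it short. Both queries described above are shown: all comments for one news record, and the parent of a comment whose article type is unknown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- The intermediate table carries nothing but an ID.
CREATE TABLE article_comments (article_comments_id INTEGER PRIMARY KEY);

-- Every kind of article points at article_comments ...
CREATE TABLE news (
    news_id INTEGER PRIMARY KEY,
    title   TEXT,
    article_comments_id INTEGER NOT NULL
        REFERENCES article_comments(article_comments_id)
);
CREATE TABLE posts (
    post_id INTEGER PRIMARY KEY,
    title   TEXT,
    article_comments_id INTEGER NOT NULL
        REFERENCES article_comments(article_comments_id)
);
-- ... and so does every comment, so each FK targets a single table.
CREATE TABLE comments (
    comment_id INTEGER PRIMARY KEY,
    body       TEXT,
    article_comments_id INTEGER NOT NULL
        REFERENCES article_comments(article_comments_id)
);

-- Made-up sample data.
INSERT INTO article_comments VALUES (1), (2);
INSERT INTO news  VALUES (7, 'Some news item', 1);
INSERT INTO posts VALUES (3, 'Some post', 2);
INSERT INTO comments VALUES
    (50, 'on the news item', 1),
    (51, 'also on the news item', 1),
    (52, 'on the post', 2);
""")

# All comments for a given news record.
news_comments = [row[0] for row in conn.execute("""
    SELECT c.body
    FROM news n
    JOIN article_comments ac ON ac.article_comments_id = n.article_comments_id
    JOIN comments c ON c.article_comments_id = ac.article_comments_id
    WHERE n.news_id = ?
    ORDER BY c.comment_id
""", (7,))]

# Which article does comment 52 belong to? Exactly one side is non-NULL.
parent = conn.execute("""
    SELECT n.news_id, p.post_id
    FROM comments c
    JOIN article_comments ac ON ac.article_comments_id = c.article_comments_id
    LEFT JOIN news n  ON n.article_comments_id = ac.article_comments_id
    LEFT JOIN posts p ON p.article_comments_id = ac.article_comments_id
    WHERE c.comment_id = ?
""", (52,)).fetchone()
```

In this sample, the second query returns a NULL news_id and a post_id of 3, which is how the caller learns that the comment belongs to a post.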
A late answer, but you can also design the model as follows:
Entity:
    uid: string
    type: string (post, news, ...)
    ...
Comment:
    uid: string
    entity_id: string
    body: string
Another way (only one table):
Entity:
    uid: string
    parent: string
    type: string (post, news, comment, revision, ...)
    metadata: jsonb
Hope this is useful to someone!

Questionnaire to database design

I have read through a lot of the threads here and have found a good amount of useful input...but there are a couple of questions that remain unanswered.
I am storing questions & answers from a questionnaire in a database.
I have the tables:
Survey (surveyID)
Question (questionID, surveyID, questionType, Question)
Answer (answerID, userID, questionID, answer)
User (userID, username)
Question 1: multi-value questions...I would have a separate row for each value in the answer table....but have the same questionID and userID. But then how would you work the following:
-what are your coping strategies (multi-value)
-how frequently do you use each coping strategy?
i.e. a one-one relationship of coping strategy-frequency.
The solution above (i.e. one row per answer) doesn't work, because you need the relation between the specific coping strategy and the frequency.
A similar question is for the following:
have you been involved in conflicts over land-use rights?
with whom? (multi-value)
for what reasons?
(i.e. what were your reasons for conflict with the neighbours, what were your reasons for conflict with the authorities?) ...i.e. one to many on a multi-value attribute
Thank you in advance, I hope I have explained my query sufficiently well.
Becky

Should a table with only 1 field of useful data be its own table

Just got a question here about a database table. If the table only has a primary key (identity) and 1 column of useful data, is it okay to be its own table or should it be in the parent table as just the data?
The table stores Security Questions that users set up when they create their account; these are used to reset the password if they want to change it or have forgotten it. I have the ID of the question and the question string in this table.
The reason I have it in its own table is that the same question could be used by many users, so why store the question many times in the parent table? That's my thinking; I just wanted a few others' opinions on this.
EDIT: The Security Questions are going to be input by my team, not the user themselves. The user will pick one of the questions to use.
I would suggest this sample design using bridge table:
You can have multiple questions for a user as well as their answers unique. Also, the questions can be same for multiple users.
You should always try to prevent duplicates, which is why your solution is the best.
It will also keep your database smaller: a foreign key with an int value is smaller than a string.
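The diagram from the bridge-table answer is not available here, so the following is only a guess at what such a design might look like, sketched with SQLite through Python's sqlite3 module. The table names (security_question, app_user, user_security_answer) and the sample data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- Lookup table: each question string is stored exactly once.
CREATE TABLE security_question (
    question_id INTEGER PRIMARY KEY,
    question    TEXT NOT NULL UNIQUE
);
CREATE TABLE app_user (
    user_id  INTEGER PRIMARY KEY,
    username TEXT NOT NULL UNIQUE
);
-- Bridge table: which user picked which question, and that user's answer.
CREATE TABLE user_security_answer (
    user_id     INTEGER NOT NULL REFERENCES app_user(user_id),
    question_id INTEGER NOT NULL REFERENCES security_question(question_id),
    answer      TEXT NOT NULL,
    PRIMARY KEY (user_id, question_id)
);

-- Made-up sample data: two users pick the same question.
INSERT INTO security_question VALUES (1, 'What was the name of your first pet?');
INSERT INTO app_user VALUES (1, 'alice'), (2, 'bob');
INSERT INTO user_security_answer VALUES (1, 1, 'Rex'), (2, 1, 'Tom');
""")

rows = conn.execute("""
    SELECT u.username, q.question
    FROM user_security_answer a
    JOIN app_user u ON u.user_id = a.user_id
    JOIN security_question q ON q.question_id = a.question_id
    ORDER BY u.username
""").fetchall()

question_rows = conn.execute(
    "SELECT COUNT(*) FROM security_question"
).fetchone()[0]
```

The composite primary key on the bridge table lets a user have several questions while preventing the same question from being answered twice by one user.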

Schema for Inspection/Exam Questions/Answers with scoring

I'm writing an application that will generate inspections for our locations. Basically, think of them as health inspection forms. Each "inspection" will have a series of questions and answers. The answers can be numeric (1,2,3,4,5 - which represent their point values), multiple choice ('Yes','No') that map to points (1 for yes, 0 for no), or flat text answers that do not map to points but might be used by the application layer for averaging. So for example, we could have a field for "Sauce Temperature" which carries no points, but could be used for reporting down the road.
Questions can be reused on multiple inspection forms but can have different point values. So can answers.
I'm having trouble figuring out the schema for this. My instinct says EAV would be a good way to go, but the more I think about it, the more I'm thinking more of a data warehouse model would be better.
Particularly, I'm having a problem figuring out the best way to map the min_points, max_points and no_points to each question/answer. This is where I am thinking I'm going to have to use EAV. I'm kind of stuck on it actually. If it was a survey or something where there were no points, or the same point value for each answer, it would be pretty simple. Question table, answer table, some boilerplate tables for input type and so forth. But since each question MAY have a point value, and that point value may change depending on which location is using that question, I'm not sure how to proceed.
So, the example questions are as follows
Was the food hot [Yes, No] Possible points = 5 (5 for yes, 0 for no)
Was the food tasty [1,2,3,4,5] Possible points = 5 (1 for 1, 2 for 2, etc)
Was the manager on duty [Yes, No] Possible points = 5 (5 for yes, 0 for no)
Was the building clean [1,2,3,4,5] Possible Points = 10 (2 for 1, 4 for 2, 6 for 3, etc)
Was the staff professional [Yes, No] Possible Points = 5 (5 for yes, 0 for no)
Freezer Temp [numerical text input]
Manager on duty [text input]
Since all the answers can have different data types and point values I'm not sure how to build out the database for them.
I'm thinking (Other tables, names and other imp details left out or changed for brevity)
CREATE TABLE IF NOT EXISTS inspection (
    id mediumint(8) unsigned not null auto_increment PRIMARY KEY,
    store_id mediumint(8) unsigned not null,
    inspection_id mediumint(8) unsigned not null,
    date_created datetime,
    date_modified timestamp,
    INDEX IDX_STORE (store_id),
    INDEX IDX_inspection (inspection_id),
    FOREIGN KEY (store_id) REFERENCES store (store_id) ON DELETE CASCADE,
    FOREIGN KEY (inspection_id) REFERENCES inspection (inspection_id) ON DELETE CASCADE
);
CREATE TABLE IF NOT EXISTS input_type (
    input_type_id tinyint(4) unsigned not null auto_increment PRIMARY KEY,
    input_type_name varchar(255),
    date_created datetime,
    date_modified timestamp
);
CREATE TABLE IF NOT EXISTS inspection_question (
    question_id mediumint(8) unsigned not null auto_increment PRIMARY KEY,
    question text,
    input_type_id tinyint(4) unsigned, -- same type as input_type.input_type_id
    date_created datetime,
    date_modified timestamp
);
CREATE TABLE IF NOT EXISTS inspection_option (
    option_id,
    value
);
But here's where I'm kind of stuck. I'm not sure how to build the question answers tables to account for points, no points, and different data types.
Also, I know I'll need mapping tables for stores to inspections and so forth, but I've left those all off for now, since it's not important to the question.
So, should I make a table for answers where all possible answers (built from either the options table or entered as text) are stored in that table and then a mapping table to map an "answer" to a "question" (for any particular inspection) and store the points there?
I'm just not thinking right. I could use some help.
There’s no right or wrong answer here, I’m just tossing out some ideas and discussion points.
I would propose that the basic “unit” isn’t the question, but the pair of question + answer type (e.g. 1-5, text, or whatever). Seems to me that Was the food hot / range 1 to 5 and Was the food hot / text description are so very different you’d go nuts trying to relate a question with two (or more) answer types (let alone answer keys for those answers--ignore that for now, I pick up on that later). Call the pair a QnA item. You may end up with a lot of similar pairs, but hey, it's what you've got to work with.
So you have a “pool” of QnA items. How are they selected for use? Are specific forms (or questionnaires) built from items in the pool, or are they randomly selected every time a questionnaire is filled out? Are forms specifically related to location, or might a form be used at any location? How fussy are they at building their forms/questionnaires? How the QnA items are collected/associated with one another and/or their ultimate results is pretty important, and you should work it all out before you start writing code, unless you really like rewriting code.
Given a defined QnA item, you should also have an “answer key” for that item – a means by which a given answer (as based on the item's answer type) is measured: Zero, Value, Value * 2, whatever. This apparently can vary from usage to usage (questionnaire to questionnaire? Does it differ based on the location at which the questionnaire is presented? If so, how or why?) Are there standardized answer key algorithms (always zero, always Value * 2, etc.) or are these also extremely free-form? Determining how they are used/associated with the QnA items will be essential for proper modeling.
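One possible way to model the QnA item plus per-form answer key idea is sketched below, using SQLite through Python's sqlite3 module. This is just one reading of the discussion, not a recommended final design: the table names (qna_item, answer_key, response), the use of form_id as the unit that the key varies by, and the sample response values are all assumptions. The questions and point values come from the examples in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- A QnA item is a question paired with an answer type.
CREATE TABLE qna_item (
    item_id     INTEGER PRIMARY KEY,
    question    TEXT NOT NULL,
    answer_type TEXT NOT NULL     -- 'yes_no', 'range_1_5' or 'text'
);
-- Answer key: points for each possible answer of an item on a given form.
-- Items with no rows here (e.g. free text) simply carry no points.
CREATE TABLE answer_key (
    form_id      INTEGER NOT NULL,
    item_id      INTEGER NOT NULL REFERENCES qna_item(item_id),
    answer_value TEXT NOT NULL,
    points       INTEGER NOT NULL,
    PRIMARY KEY (form_id, item_id, answer_value)
);
CREATE TABLE response (
    form_id      INTEGER NOT NULL,
    item_id      INTEGER NOT NULL REFERENCES qna_item(item_id),
    answer_value TEXT NOT NULL
);

INSERT INTO qna_item VALUES
    (1, 'Was the food hot', 'yes_no'),
    (2, 'Was the building clean', 'range_1_5'),
    (3, 'Freezer Temp', 'text');
INSERT INTO answer_key VALUES
    (1, 1, 'Yes', 5), (1, 1, 'No', 0),
    (1, 2, '1', 2), (1, 2, '2', 4), (1, 2, '3', 6),
    (1, 2, '4', 8), (1, 2, '5', 10);
-- One filled-in form (made-up answers).
INSERT INTO response VALUES (1, 1, 'Yes'), (1, 2, '3'), (1, 3, '-18');
""")

# Points earned: answers without a key row contribute nothing.
score = conn.execute("""
    SELECT COALESCE(SUM(k.points), 0)
    FROM response r
    LEFT JOIN answer_key k
           ON k.form_id = r.form_id
          AND k.item_id = r.item_id
          AND k.answer_value = r.answer_value
    WHERE r.form_id = 1
""").fetchone()[0]

# Points possible: the best answer of every scored item on the form.
max_score = conn.execute("""
    SELECT SUM(best) FROM (
        SELECT MAX(points) AS best
        FROM answer_key
        WHERE form_id = 1
        GROUP BY item_id
    )
""").fetchone()[0]
```

Storing the key as rows keeps it free-form; if the keys turn out to follow a few standard rules such as "Value * 2", a rule column on the form-item link could replace the per-answer rows.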

Resources