Database Design for Conditional Questionnaire

I am designing a database schema to support a business case in which a user can submit a request (for him/herself or on someone else's behalf). To process and complete the request, the submitter will be prompted with questions based on their answers to prior questions. That is to say, the next question is conditional on the current question's answer.
Each question will have an associated type, which will drive the user form for that particular question. A question of type boolean indicates Yes/No radio buttons for the answer. A question of type multiple indicates a multiple-choice answer, where users select one of several radio options.
I have two questions:
How can I modify my schema to "link" answers to multiple choice questions? (ie "the following answers are available for question X.")
How should the answers drive the next question? (ie. "for question #1, if answer A is chosen, then GOTO question 5")
My question_relationships table will let me specify that question 1 is the parent of question 5, and question 5 is the parent of question 6. But I really need the answers to drive this logic.
question
-id
-question_name
-question_text
-question_hint
-question_type (boolean, multiple)
question_relationship
-id
-fk_parent_question_id
-fk_child_question_id
request
-id
-person_id
-submitter_id
-submit_date
-status
request_answer
-id
-fk_request_id
-fk_question_id
-answer_text
-answer_boolean
I have seen the answers in db design - survey/quiz with questions and answers, but I believe that my scenario is a bit different.

A table has an associated fill-in-the-(named-)blanks statement aka predicate. Rows that make it into a true statement go in the table. Rows that make it into a false statement stay out. That's how we interpret a table (base, view or query) and update a base. Each table represents an application relationship.
(So your predicate-style quote for 2 is exactly how to give a table's meaning. That is useful because JOIN's meaning is then the AND of its arguments' meanings, UNION's the OR, EXCEPT's the AND NOT, etc.)
How can I modify my schema to "link" answers to multiple choice
questions? (ie "the following answers are available for question X.")
// question [question_id] has available answer [answer_id]
question_answers(question_id, answer_id)
How should the answers drive the next question?
(ie. "for question #1, if answer A is chosen, then GOTO question 5")
// for question [this_id] if answer [answer_id] is chosen then go to question [next_id]
next_question(this_id, answer_id, next_id)
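As a rough illustration of the two tables above, here is a minimal sketch using Python's built-in sqlite3 module. The answer table, the sample rows, and the helper names available_answers and next_question_id are assumptions made for the example; they are not part of the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE question (id INTEGER PRIMARY KEY, question_text TEXT NOT NULL);
-- Hypothetical table: the original schema has no table of predefined answers.
CREATE TABLE answer (id INTEGER PRIMARY KEY, answer_text TEXT NOT NULL);
-- question [question_id] has available answer [answer_id]
CREATE TABLE question_answers (
    question_id INTEGER NOT NULL REFERENCES question (id),
    answer_id   INTEGER NOT NULL REFERENCES answer (id),
    PRIMARY KEY (question_id, answer_id)
);
-- for question [this_id] if answer [answer_id] is chosen then go to question [next_id]
CREATE TABLE next_question (
    this_id   INTEGER NOT NULL,
    answer_id INTEGER NOT NULL,
    next_id   INTEGER NOT NULL REFERENCES question (id),
    PRIMARY KEY (this_id, answer_id),
    FOREIGN KEY (this_id, answer_id)
        REFERENCES question_answers (question_id, answer_id)
);
-- Made-up sample data.
INSERT INTO question VALUES (1, 'Who is the request for?'),
                            (5, 'Confirm your details?'),
                            (6, 'Who are you submitting for?');
INSERT INTO answer VALUES (1, 'Myself'), (2, 'Someone else');
INSERT INTO question_answers VALUES (1, 1), (1, 2);
INSERT INTO next_question VALUES (1, 1, 5), (1, 2, 6);
""")


def available_answers(conn, question_id):
    """Return the answer texts offered for a question, ordered by answer id."""
    rows = conn.execute(
        "SELECT a.answer_text FROM question_answers qa "
        "JOIN answer a ON a.id = qa.answer_id "
        "WHERE qa.question_id = ? ORDER BY a.id",
        (question_id,),
    ).fetchall()
    return [text for (text,) in rows]


def next_question_id(conn, this_id, answer_id):
    """Return the id of the question to ask next, or None if no row applies."""
    row = conn.execute(
        "SELECT next_id FROM next_question WHERE this_id = ? AND answer_id = ?",
        (this_id, answer_id),
    ).fetchone()
    return row[0] if row else None
```

The composite foreign key on next_question means a transition can only be defined for an answer that is actually available for that question.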
PS
There are many different ways of representing graphs (nodes with edges between them) via tables. Here the nodes are questions and the edges are this-next question pairs. Different tables support different kinds of graphs and different patterns of reading and update. (I chose one reflecting your application, but framed my answer to help you find your best representation via proper design yourself.)
PPS
If the question that follows another can depend on the user's path through earlier questions, i.e. it is context-dependent:
// in context [this_id] if answer [answer_id] is chosen then go to context [next_id]
next_context(this_id, answer_id, next_id)
What a "context" is depends on aspects of your application that you have not given. What you have given suggests that your only notion of context is the current question. Also, depending on what a context contains, this table may need normalization. You might want independent notions of current context vs current question. (Topic: finite state machines.)
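To make the finite state machine reading concrete, here is a small sketch in plain Python. The rows of next_context are held as a dictionary keyed by (this_id, answer_id); the context names, the answers, and the function name trace are invented for illustration.

```python
# Rows of next_context(this_id, answer_id, next_id) as a lookup table.
# All identifiers below are hypothetical sample data.
NEXT_CONTEXT = {
    ("q1", "A"): "q5",
    ("q1", "B"): "q2",
    ("q5", "yes"): "q6",
}


def trace(transitions, start, answers):
    """Follow the transitions for a sequence of answers.

    Returns the list of contexts visited, beginning with `start`.
    Stops early when no transition exists for (current context, answer).
    """
    path = [start]
    current = start
    for answer in answers:
        nxt = transitions.get((current, answer))
        if nxt is None:
            break
        path.append(nxt)
        current = nxt
    return path
```

For example, trace(NEXT_CONTEXT, "q1", ["A", "yes"]) visits q1, then q5, then q6. If a context has to carry more than the current question, the key would become a richer value, which is where the normalization concern comes in.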

Related

Deal with Many Choice and Matching Questions in Testing Schema

Struggling to find a way to allow many choice questions and matching questions in my testing schema. I'm guessing I should have a column in the QUESTIONS_LINK table denoting the number of correct answers for that question. For matching, I just have no clue how to logically relate the somewhat "unrelated" options with the answers for each so that it can be grabbed to be displayed on the web page. Note: strong/weak relationship lines may not be accurate.

what should be database schema for online voting app

I want to develop an app where events/questions are posted by an admin and users vote or answer. A question can be one of three different types, and each type can have different options. The admin can view reports about each question.
E.g.
WHQuestion: What is the right age for marriage? (1) >20 (2) =20 (3) 20<
Voting: Who is the best captain? (1) ABC (2) PQR and so on...
YesNoQuestion: Is Dhoni a good captain? (1) Yes (2) No
So I am confused about the database schema and tables. How should I manage them?
All questions are multiple choice with one or zero correct answers. So: One question, some answers, one optional correct answer.
Question: question_no, text
Answer: question_no, answer_no, text
As to how to store which answer per question is correct, there are two options:
Store the answer_no in the question record. I consider this the better option. A DBMS featuring deferred constraints (so a question record can reference an answer record and vice versa) would be a good thing to have here. If there is no correct answer, the answer_no is null.
Have a flag in the answers table and then mark one answer per question as correct and the others as incorrect. This would be appropriate if multiple correct answers per question were possible. For one correct answer, however, this would be the worse option of the two. To guarantee data consistency you would apply some special check, which can be a bit complicated (for instance a function-based index to guarantee uniqueness). For no correct answer you could store the same value or even null for all answers. However, you must then read the answers to find out that this is a vote question. So again, option 1 is the better choice, because you see this immediately in the question record.
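Option 1 could look roughly like the following sketch, written against SQLite through Python's sqlite3 module. The column name correct_answer_no and the sample questions are assumptions; the point is only that a deferred foreign key lets the question row be inserted before the answer it points to, and that a null marks a vote question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
conn.executescript("""
CREATE TABLE question (
    question_no       INTEGER PRIMARY KEY,
    text              TEXT NOT NULL,
    correct_answer_no INTEGER,  -- NULL means there is no correct answer (a vote)
    FOREIGN KEY (question_no, correct_answer_no)
        REFERENCES answer (question_no, answer_no)
        DEFERRABLE INITIALLY DEFERRED  -- checked at COMMIT, not per statement
);
CREATE TABLE answer (
    question_no INTEGER NOT NULL REFERENCES question (question_no),
    answer_no   INTEGER NOT NULL,
    text        TEXT NOT NULL,
    PRIMARY KEY (question_no, answer_no)
);
""")

# One transaction: the question rows go in first, the answers follow,
# and the deferred constraint is verified when the block commits.
with conn:
    conn.execute("INSERT INTO question VALUES (1, 'Is Dhoni a good captain?', NULL)")
    conn.execute("INSERT INTO question VALUES (2, 'What is 2 + 2?', 2)")
    conn.executemany(
        "INSERT INTO answer VALUES (?, ?, ?)",
        [(1, 1, "Yes"), (1, 2, "No"), (2, 1, "3"), (2, 2, "4")],
    )


def correct_answer_text(conn, question_no):
    """Return the text of the correct answer, or None for a vote question."""
    row = conn.execute(
        "SELECT a.text FROM question q "
        "JOIN answer a ON a.question_no = q.question_no "
        "             AND a.answer_no = q.correct_answer_no "
        "WHERE q.question_no = ?",
        (question_no,),
    ).fetchone()
    return row[0] if row else None
```

Whether a question is a vote is visible from the question row alone, which is the advantage claimed for option 1.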

A topic to experienced database architects [closed]

Closed 9 years ago.
I face the following problem.
I'm creating a database for (say) human beings' info. All the human beings may be classified in one of the three categories: adult female, adult male, child. It is clear that the parameters like "height" and "weight" are applicable to all of the categories. The parameter "number of children" is applicable only to adults, while the parameter "number of pregnancies" is applicable to females only. Also, each parameter may be classified as mandatory or optional depending on the category (for example, for adults the parameter "number of ex-partners" is optional).
When I load (say) "height" and "weight", I check whether the info in these two fields is self-consistent. I.e., I mark as a mistake the record which has height=6'4'' and weight=10 lb (obviously, this is physically impossible). I have several similar verification rules.
When I insert a record about a human being, I need to reflect the following characteristics of the info:
the maximum possible info for the category of this particular human being (including all the optional parameters).
the required minimum of information for the category (i.e., mandatory fields only)
what has actually been inserted for this particular human being (i.e., it is possible to insert whatever I have for this person no matter whether it is smaller than the amount of required minimum of info or not). The non-trivial issue here is that a field "XXX" may have NULL value because I have never inserted anything there OR because I have intentionally inserted exactly NULL value. The same logic with the fields that have a default value. So somewhere should be reflected that I have processed this particular field.
what amount of inserted information has been verified (i.e., even if I load some 5 fields, I can check for self-consistency only 3 fields while ignoring the 2 left).
So my question is how to technically organize it. Currently, all these required features are either hardcoded with no unified logic or broken into completely independent blocks. I need to create a unified approach.
I have some naive ideas in this regard. For example, for each category of human beings, I can create and store a list of possible fields (I call it a "template"). I can mark those fields that are mandatory.
When I insert a record about a human being, I copy the template and mark which fields from this template have actually been processed. At the next stage, I can mark in this copy of the template those fields that are currently to be verified.
The verification module is adapted as follows: for each verification procedure I create a list of the fields it uses. Then I call only those verification procedures whose fields are actually marked "to be verified" in the copy of the template for the particular human being (see the previous paragraph).
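The template idea can be sketched in a few lines of Python. Everything concrete here is an assumption made for illustration: the field names, which fields are mandatory, and the plausibility threshold in height_weight_ok. A field counts as processed when its key is present in the record, even if the value is None, which separates "never inserted" from "intentionally NULL".

```python
# Per-category templates: field name -> True if mandatory, False if optional.
# Hypothetical contents; only the optional ex-partners field comes from the text.
TEMPLATES = {
    "adult_male": {
        "height_in": True,
        "weight_lb": True,
        "number_of_children": True,
        "number_of_ex_partners": False,
    },
    "child": {"height_in": True, "weight_lb": True},
}


def height_weight_ok(record):
    # Arbitrary illustrative threshold, not a real medical rule.
    return record["weight_lb"] >= 0.5 * record["height_in"]


# Each verification procedure declares the fields it uses.
CHECKS = {
    "height_weight": (("height_in", "weight_lb"), height_weight_ok),
    "ex_partners_non_negative": (
        ("number_of_ex_partners",),
        lambda record: record["number_of_ex_partners"] >= 0,
    ),
}


def missing_mandatory(category, record):
    """List mandatory fields that were never processed (key absent)."""
    return [
        field
        for field, mandatory in TEMPLATES[category].items()
        if mandatory and field not in record
    ]


def run_checks(record):
    """Run only the checks whose fields are all present and non-null."""
    results = {}
    for name, (fields, check) in CHECKS.items():
        if all(record.get(field) is not None for field in fields):
            results[name] = check(record)
    return results
```

With the record {"height_in": 76, "weight_lb": 10, "number_of_children": None}, no mandatory field is missing for adult_male, only the height/weight check runs, and it fails.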
As you see, this is the most straightforward way to solve this problem. But my guess is that there are a lot of quite standardized approaches that I'm not aware of. I really doubt that I'm the first in the world to solve such a problem. I don't like my solution because it is really painful to write the code that correctly reflects in this copied template all the "updates" happening to a record.
So, I ask you to share your opinion on how you would solve this problem.
I think there are two questions here:
how do I store polymorphic data in a database?
how do I validate complex business rules?
You should address them separately - trying to solve both at once is probably too hard.
There are a few approaches to polymorphic data in RDBMSes - ORMs use the term inheritance mapping, for instance. The three solutions here - table per class hierarchy, table per subclass and table per concrete class - are "pure" relational solutions. You can also use the "Entity-Attribute-Value" design, or use a document approach (storing data in structured formats such as XML or JSON) - these are not "pure" relational options, but have their place.
Validating complex business rules is often done using rule engines - these are super cool bits of technology, but you have to be sure that your problem really fits with their solution - deciding to invest in a rules engine means your project changes into a rules engine project, not a "humans" project. Alternatively, most mainstream solutions to this embody the business logic about the entities in the application's business logic layer. It sounds like you're outgrowing this.
This exact problem, both in health terms and in terms of a financial instrument, is used as a primary example in Martin Fowler's book Analysis Patterns. It is an extensive topic. As @NevilleK says, you are trying to deal with two questions, and it is best to deal with them separately. One ultra-simplified way of approaching these problems is:
1 Storage of polymorphic data - only put mandatory data that is common to the category in the category table. For optional data put these in a separate table in 1-1 relationship to the category table. Entries are made in these optional tables only if there is a value to be recorded. The record of the verification of the data can also be put in these additional tables.
2 Validate complex business rules - it is useful to consider the types of error that can arise. There are a number of ways of classifying the errors, but the one I have found most useful is: (a) type errors, where one can tell that the value is in error just by looking at the data, eg 1980-02-30; (b) context errors, where one can detect the error only by reference to previously captured data, eg DoB 1995-03-15, date of marriage 1996-08-26; and (c) lies to the system, where the data type is ok and the context is ok, but the information can only be detected as incorrect at a later date when more information comes to light, eg if I register my DoB as 1990-12-31 when it is something different. This latter class of error typically has to be dealt with by procedures outside the system being developed.
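The first two error classes can be checked mechanically. The following Python sketch uses the dates from the examples above; the function names and the minimum age of 16 are assumptions chosen only to make the context check concrete.

```python
from datetime import date


def parse_date(text):
    """Type check: return a date, or None if the text is not a real ISO date."""
    try:
        return date.fromisoformat(text)
    except ValueError:
        return None


def age_on(dob, day):
    """Whole years between a date of birth and a later day."""
    before_birthday = (day.month, day.day) < (dob.month, dob.day)
    return day.year - dob.year - before_birthday


def marriage_is_plausible(dob, marriage, min_age_years=16):
    """Context check: compare a new value with previously captured data.

    min_age_years is a hypothetical business rule, not a legal fact.
    """
    return age_on(dob, marriage) >= min_age_years
```

Here parse_date("1980-02-30") returns None because the value is wrong on its face, while the marriage date 1996-08-26 only looks wrong once it is compared with the stored DoB 1995-03-15. The third class, lies to the system, cannot be caught by either function.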
I would use the Party Role pattern (Silverston):
Party
id
name
Individual : Party
current_weight
current_height
PartyRole
id
party_id
from_date
to_date (nullable)
AdultRole : PartyRole
number_of_children
FemaleAdultRole : AdultRole
number_of_pregnancies
Postgres has a temporal extension such that you could enforce that a party could only play one role at a time (yet maintain their role histories).
Use table inheritance. For simplicity use Single Table Inheritance (has nulls), for no nulls use Class Table Inheritance.

Logic for recommender application

I am developing an application in which users answer perhaps 10 questions, each with 3-4 options. After the 10th question, it needs to suggest a certain solution based on the responses. Since there are hundreds of permutations and combinations, what logic and database design would this require?
Thanks
EDIT: some more detailed explanation
Say my application is used to recommend a data plan from various mobile operators, based on the user answering questions like the time spent on the internet, the type of files being downloaded and so on. So, if the response to question 1 was a and to question 2 was c, etc., then it would recommend a certain plan. If the response to question 1 was b and to question 2 was c, then it would recommend a different plan. So, if there were 10 questions, the number of combinations can be quite large. Is there a certain algorithm that can handle this?
I. what would be the logic?
If I understand correctly, you would define "rules" such as
If the answer to question 5 is either A or B, then the suggested plan would be planB; otherwise execute the rest of the rules.
So you would use a rule engine e.g.: http://www.jboss.org/drools/
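Before reaching for a full rule engine, the "first matching rule wins" idea can be tried out in plain Python. This is a stand-in sketch, not the Drools API; the second rule, the plan name planA, and the function name recommend are made up for the example.

```python
# Each rule: ({question number: set of accepted answers}, suggested plan).
# Rules are tried in order; the first one whose conditions all hold wins.
RULES = [
    ({5: {"A", "B"}}, "planB"),         # the rule quoted above
    ({1: {"a"}, 2: {"c"}}, "planA"),    # hypothetical second rule
]


def recommend(answers, rules=RULES, default=None):
    """Return the plan of the first rule matched by `answers`.

    `answers` maps question number -> chosen option. An unanswered
    question never satisfies a condition.
    """
    for conditions, plan in rules:
        if all(answers.get(q) in allowed for q, allowed in conditions.items()):
            return plan
    return default
```

Because each rule names only the questions it cares about, ten questions do not require one row per combination of answers.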
II. what would be the database design?
This is quite simple:
USERS table,
QUESTIONS table and
ANSWERS table which would refer to the two others
Possibly there would be a QUESTIONNAIRE table as well, and the QUESTIONS table would refer to it.
Just a 'quick' comment: consider letting the user see how the companies/plans recommended to them change as they answer each question.
For example, if I am most interested in price that would be the question I would answer first and immediately see the 3 cheapest plans/products recommended to me.
The second question could be coverage and if I then could see the 3 plans with best coverage (in my area) that would be interesting too.
When I answer the third question about smart phone features and say I want internet, then the first question should spit out the 3 cheapest plans/products that include internet. Obviously they could change.
And so on...
Maybe it also could be a good idea to let the user "dive into" each question and see the full range of options for that answer. As a user I would appreciate that.
The comments above are just how I would like a form to be made for me. I don't want to answer 10 questions about stuff I don't really put any value on. Each user is different and will prefer to make their choice based on their own questions.
So, based on the above, it would be like a check list where the top answers would be the plans/products with the most fitting check marks. To give immediate responses as the user answers or alters each question, AJAX would probably be your choice.
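The check list idea amounts to counting, for each plan, how many of the user's wishes it satisfies. A minimal sketch in Python follows; the plans, prices, feature names, and the tie-break on price are all invented for the example.

```python
# Hypothetical catalogue: plan name -> price and the check marks it can earn.
PLANS = {
    "basic": {"price": 10, "features": {"calls"}},
    "surf": {"price": 20, "features": {"calls", "internet"}},
    "max": {"price": 35, "features": {"calls", "internet", "coverage"}},
}


def rank_plans(plans, wanted, top=3):
    """Return up to `top` plan names, best match first.

    Plans are ordered by number of wanted features they offer
    (descending), with cheaper plans first among equals.
    """
    def sort_key(name):
        plan = plans[name]
        return (-len(plan["features"] & wanted), plan["price"])

    return sorted(plans, key=sort_key)[:top]
```

Re-running rank_plans after every answer gives the immediate feedback described above: with no wishes the cheapest plans lead, and adding "internet" moves the plans that include it to the top.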

User's text in my database as a separate table or within other data

This question is based on my plan at the thread.
I have the following table
[Image: table diagram] http://files.getdropbox.com/u/175564/table-problem-2.png
where kysymys is a question in English.
I would like to know how I should store the data of a user's question:
in a separate table where I have the parameters question-id and question-body OR
in the current table where I have other parameters too
I need to neutralize the question-body somehow in the future so that a user cannot submit code which breaks my system.
How would you store the data of the user's text?
This will depend:
You mention: "where kysymys is a question in English."
Are you planning to have the same question in other languages?
If that's the case, normalize the question and question body out to another table. That way, given a language and a question ID, you can retrieve the right one.
However, if the question is only going to be in English, just leave it in the same table. That's perfectly fine.
Are you planning to store revisions of a question ? e.g. StackOverflow allows you to revise the question text and it stores the history.
If this is the case I would store the text separately. You would store answers/comments referenced against the question-id, but the question text would be held in a separate table.
Your data neutralisation issue (above) is orthogonal to this (a separate issue of data sanitisation/cleansing).
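On the neutralisation point, one common approach is to store the text unchanged through a parameterized query and to escape it only when it is displayed. The sketch below uses Python's sqlite3 and html modules; the table layout and the function names save_question and render_question are assumptions, and a web page is assumed as the output target.

```python
import html
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE question ("
    "question_id INTEGER PRIMARY KEY, question_body TEXT NOT NULL)"
)


def save_question(conn, body):
    """Store the user's text verbatim; the ? placeholder keeps it out of the SQL."""
    cursor = conn.execute(
        "INSERT INTO question (question_body) VALUES (?)", (body,)
    )
    conn.commit()
    return cursor.lastrowid


def render_question(conn, question_id):
    """Return the stored text escaped for inclusion in an HTML page."""
    row = conn.execute(
        "SELECT question_body FROM question WHERE question_id = ?",
        (question_id,),
    ).fetchone()
    return html.escape(row[0])
```

This works the same way whether the text lives in its own table or alongside the other columns, which is why the storage decision can be made independently.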
