Efficient way to store a dynamic questionnaire? - sql-server

In reference to this question, I am facing almost the same scenario, except that in my case the questions are probably static. (They are subject to change from time to time, and I still think adding a column for each question is a bad idea; but even if I did add them, how should the answers be specified and retrieved?) The answers, however, come in different types: for example, an answer could be yes/no, list items, free text, list items OR free text ("Other, please specify"), multiple-selectable list items, etc.
What would be an efficient way to implement this?

Shimmy, I have written a four-part article that addresses this issue - see Creating a Dynamic, Data-Driven User Interface. The article looks at how to let a user define what data to store about clients, so it's not an exact examination of your question, but it's pretty close. Namely, my article shows how to let an end user define the type of data to store, which is along the lines of what you want.
The following ER diagram gives the gist of the data model:
Here, DynamicAttributesForClients is the table that indicates what user-created attributes a user wants to track for his clients. In short, each attribute has a DataTypeId value, which indicates whether it's a Boolean attribute, a Text attribute, a Numeric attribute, and so on. In your case, this table would store the questions of the survey.
The DynamicValuesForClients table holds the values stored for a particular client for a particular attribute. In your case, this table would store the answers to the questions of the survey. The actual value is stored in the DynamicValue column, which is of type sql_variant, allowing any type of data - numeric, bit, string, etc. - to be stored there.
My article does not address how to handle multiple-choice questions, where a user may select one option from a preset list of options, but enhancing the data model to allow this is pretty straightforward. You would create a new table named DynamicListOptions with the following columns:
DynamicListOptionId - a primary key
DynamicAttributeId - specifies what attribute these options are associated with
OptionText - the option text
So if you had an attribute that was a multiple-choice option you'd populate the drop-down list in the user interface with the options returned from the query:
SELECT OptionText
FROM DynamicListOptions
WHERE DynamicAttributeId = ...
Finally, you would store the selected DynamicListOptionId value in the DynamicValuesForClients.DynamicValue column to record the list option they selected (or use NULL if they did not choose an item).
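The three tables described above can be sketched as follows. This is only an illustration, not the article's actual schema: it uses Python's built-in sqlite3 module instead of SQL Server, an untyped column stands in for sql_variant, and the table name DynamicAttributes, the columns AttributeName, DataType and ClientId, and the helpers list_options and save_answer are all assumptions made for the example.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DynamicAttributes (
    DynamicAttributeId INTEGER PRIMARY KEY,
    AttributeName      TEXT NOT NULL,   -- the question text (hypothetical column)
    DataType           TEXT NOT NULL    -- e.g. 'bool', 'text', 'list' (hypothetical)
);
CREATE TABLE DynamicListOptions (
    DynamicListOptionId INTEGER PRIMARY KEY,
    DynamicAttributeId  INTEGER NOT NULL REFERENCES DynamicAttributes,
    OptionText          TEXT NOT NULL
);
CREATE TABLE DynamicValues (
    ClientId           INTEGER NOT NULL,
    DynamicAttributeId INTEGER NOT NULL REFERENCES DynamicAttributes,
    DynamicValue,                       -- untyped column, stands in for sql_variant
    PRIMARY KEY (ClientId, DynamicAttributeId)
);
""")

conn.execute("INSERT INTO DynamicAttributes VALUES (1, 'Favourite colour?', 'list')")
conn.executemany(
    "INSERT INTO DynamicListOptions (DynamicAttributeId, OptionText) VALUES (?, ?)",
    [(1, "Red"), (1, "Green"), (1, "Blue")],
)


def list_options(attribute_id):
    """Return (option id, option text) pairs to populate a drop-down list."""
    rows = conn.execute(
        "SELECT DynamicListOptionId, OptionText FROM DynamicListOptions "
        "WHERE DynamicAttributeId = ? ORDER BY DynamicListOptionId",
        (attribute_id,),
    )
    return rows.fetchall()


def save_answer(client_id, attribute_id, value):
    """Store the selected option id, or None if nothing was chosen."""
    conn.execute(
        "INSERT OR REPLACE INTO DynamicValues VALUES (?, ?, ?)",
        (client_id, attribute_id, value),
    )


options = list_options(1)
save_answer(42, 1, options[1][0])  # client 42 picks the second option
save_answer(43, 1, None)           # client 43 leaves the question unanswered
```

Storing the option id rather than the option text means an option can be renamed later without touching the stored answers.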
Give the article a read through. There is a complete, working demo you can download, which includes the complete database and its model. Also, the four articles that make up the series explore the data model in depth and show how to build a web-based (ASP.NET) user interface for letting users define dynamic attributes, how to display them for data entry, and so forth.
Happy Programming!

This may not fit you exactly, but here's what I've got at my part-time job.
I have a questions table, an answers table, and a survey table. For each new survey I create a survey build (because each survey is unique, but questions and answers are repeated a lot). I then have a respondent table that contains some information about the respondent (it also links back to the survey table; I forgot that in the diagram). I also have a response table that links the respondent and the survey build. This probably isn't the best way, but it works for me, and it works pretty fast (we're at about 1 million+ rows in the response table and it handles like a dream).
With this model I get reusable questions, reusable answers (a lot of our questions use "Yes" and "No"), and a rather slim response table.

Related

Designing a database for personal project

I am pretty new to designing databases, and I am currently working on a substantial project of mine which requires a pretty big database. I therefore have a couple of questions to get my database ready for implementation. --Do keep in mind that this project is focused on Laravel--
Question 1:
My project makes use of posts, but not only one kind. I have a system where three sorts of posts can be created: a standard post, a profile post and a company post. All these posts can contain images. Currently, I have a column called Post_photo in each of these post tables. Is this the right way to store pictures associated with a post? It is illustrated in the image below.
Image: https://imgur.com/a/b9FWL
Question 2:
Every post can contain comments, and to connect these comments to a post they need to refer to it. But because I have three different variations of posts, I set my comments table up like this: the comment table has both a Post_ID column and a Company_post_ID column instead of a single Post_ID. Is this the right way to connect comments to posts? Or do I need to make another table called company_comments? If not, how can I accomplish this?
I use the same system for my likes and category tables as well, because I need to link my likes and categories to posts. Is this the right way? For a visual of what I am talking about, see the picture above.
Thanks for taking the time to read this!
The following assumes that you are using a relational database.
Answer 1: If there can be more than one picture or file per post, then the best practice would be creating a table for photos that references the post's ID.
This way, when you load the post, you would query the photos table for rows whose PostID field matches your post's id.
Answer 2: If the three types of post are very similar (and contain similar data), consider having only one post table, and include a field that indicates the type of post. For example, a field called postType could store an integer (0-2) that corresponds to the type. This would simplify your comments table, as you would only reference the postID.
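A minimal sketch of both answers together, assuming a single posts table with a type column, a separate photos table, and a comments table that references only the post id. It uses Python's sqlite3 module purely to keep the example runnable (the question's project is Laravel-based, so treat the SQL as the relevant part); the table names, the column names and the photos_for helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE posts (
    id        INTEGER PRIMARY KEY,
    post_type INTEGER NOT NULL CHECK (post_type IN (0, 1, 2)),
    body      TEXT NOT NULL
);
CREATE TABLE post_photos (          -- zero or more photos per post
    id      INTEGER PRIMARY KEY,
    post_id INTEGER NOT NULL REFERENCES posts(id),
    path    TEXT NOT NULL
);
CREATE TABLE comments (             -- one foreign key covers all post types
    id      INTEGER PRIMARY KEY,
    post_id INTEGER NOT NULL REFERENCES posts(id),
    body    TEXT NOT NULL
);
""")

# Hypothetical mapping of the 0-2 integer to the three kinds of post.
STANDARD, PROFILE, COMPANY = 0, 1, 2

conn.execute("INSERT INTO posts VALUES (1, ?, 'We are hiring')", (COMPANY,))
conn.executemany(
    "INSERT INTO post_photos (post_id, path) VALUES (?, ?)",
    [(1, "office.jpg"), (1, "team.jpg")],
)
conn.execute("INSERT INTO comments (post_id, body) VALUES (1, 'Nice!')")


def photos_for(post_id):
    """Return the photo paths attached to one post, in insertion order."""
    rows = conn.execute(
        "SELECT path FROM post_photos WHERE post_id = ? ORDER BY id", (post_id,)
    )
    return [row[0] for row in rows]
```

The same single post_id reference would work for likes and categories, which removes the need for one foreign key column per post type.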
As a final note, you might find this thread about storing binary data in databases helpful: (Storing files in SQL Server)

Best way to store comments with mentions (#FirstName) in database

Was wondering what is the best way to store comments in a database (sql) that allows mentioning of other users by a non-unique natural name?
E.g. This is a comment for #John.
The client application would also need to detect and link to corresponding user profile if his/her name was clicked.
My initial thought was to replace the user's first name with the id and some metadata and store that in the DB: This is a comment for <John_51/> where 51 is the id of that user. Clients can then parse that and display the appropriate user name and profile links.
Is this a good approach?
Some background:
What I would like to achieve is similar to facebook posts where it allows you to 'tag' a user by just mentioning their name (not the unique username) in a post. It doesn't have to be as complex as facebook as what I need it for isn't for a post, but just comments (which can only be text, as opposed to posts which could be text mixed with videos/images/etc).
The solution would affect the database side (how the comments are stored) and also the client side (how the comments are parsed and displayed to the user). The clients are mobile apps for iOS and Android but also looking to expand to a web application as well.
I don't think the language matters as much but for completeness sake, I'm using Python's Flask with SQLAlchemy frameworks on the backend.
Current DB schema for comments
COMMENT TABLE:
id (<PK>)
post_id (id of the post that the comment is for: <FK on a post object>)
author_id (id of the creator of the comment: <FK on a user object>)
text (comment text: <String>)
timestamp (comment date: <Date>)
Edit:
I ended up going with metadata in the comment. E.g.
Hey <mention userid="785" tagname="JohnnyBravo"/>!
I included the user's name (tagname) as well so that client application can extract the name directly from the comment text instead of adding another step to look up who user 785 is.
The big problem here is that if the username is not a stable reference, you need to abstract it to an id reference, while still keeping the text reconstructable and the references queryable.
Embedded collections and dynamic typing are a great option if you're using a NoSQL database. It would be fairly straightforward.
{
_id: ...,
text: [
"Wow ",
51,
", your selfie looks really great, even better than ",
72,
"'s does."
],
...
}
That way you could query references, while still easily reconstructing the content. BUT since you're using SQLAlchemy, that's a no go. Your methodology seems fine, but because you're doing magic in the string you'll need to escape your delimiters (as well as escape the escape character) if they exist in the text. Personally, I would use # as the delimiter since it's already a special character. You'd also need to identify the end of the id, in case the user sticks a bunch of numbers after the #mention, so
Wow #51#, your selfie looks really great, even better than #72#'s does. email me! john\#foo.com. Division time!!! with backslashes! 12\\4 = 3
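To make the escaping rules concrete, here is a rough Python sketch of a parser for that stored format. It assumes a mention is written as # followed by digits and a closing #, and that a backslash makes the next character literal; the function names parse_comment, escape_text, render and mentioned_ids are made up for this example.

```python
def escape_text(text):
    """Escape user-typed text before mention tokens are spliced into it."""
    return text.replace("\\", "\\\\").replace("#", "\\#")


def parse_comment(stored):
    """Split stored text into literal strings and integer user ids."""
    parts, buf, i = [], [], 0
    while i < len(stored):
        ch = stored[i]
        if ch == "\\" and i + 1 < len(stored):
            buf.append(stored[i + 1])        # escaped character, keep it literally
            i += 2
        elif ch == "#":
            end = stored.find("#", i + 1)
            digits = stored[i + 1:end] if end != -1 else ""
            if digits.isdecimal():           # "#51#" is a mention of user 51
                if buf:
                    parts.append("".join(buf))
                    buf = []
                parts.append(int(digits))
                i = end + 1
            else:                            # stray delimiter, treat it as text
                buf.append(ch)
                i += 1
        else:
            buf.append(ch)
            i += 1
    if buf:
        parts.append("".join(buf))
    return parts


def mentioned_ids(parts):
    """User ids referenced by a parsed comment."""
    return [p for p in parts if isinstance(p, int)]


def render(parts, names):
    """Rebuild display text, substituting names for ids ('?' if unknown)."""
    return "".join(names.get(p, "?") if isinstance(p, int) else p for p in parts)


stored = r"Wow #51#, your selfie looks really great, even better than #72#'s does. john\#foo.com 12\\4 = 3"
parts = parse_comment(stored)
```

The parsed list has the same shape as the embedded-collection example above (strings interleaved with ids), so a client can render it without caring how it was stored.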
If querying posts for references is also important to you, you'll also need to maintain a separate POST__USER junction table that stores a row for each pairing of a post and a mentioned user id, so that when you load an object into memory, you can construct a collection. You could decide to add the junction table later, but it would be a fairly expensive migration.
If #name is not unique, you have to somehow associate the non-unique name, via the session, with the unique owner of the natural name, and do this ideally before storing it in the database. Storing a non-unique name in the database, if it cannot be resolved to its unique owner, is not of much value.
Since you mention "sql" I assume you're using a relational database. If that is the case, once you have resolved #name to its unique owner, I would create a one-to-many relationship between posting or comment and userids; that would allow a comment or post to reference more than one user.
TABLE: COMMENT_MENTIONEDUSERS
commentid
userid
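A small sketch of that junction table, using Python's sqlite3 module for the sake of a runnable example (the asker's stack is Flask with SQLAlchemy, so this is illustrative only). The comments table is reduced to two columns, and add_comment and comments_mentioning are hypothetical helpers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE comments (
    id   INTEGER PRIMARY KEY,
    text TEXT NOT NULL
);
CREATE TABLE comment_mentioned_users (
    commentid INTEGER NOT NULL REFERENCES comments(id),
    userid    INTEGER NOT NULL,
    PRIMARY KEY (commentid, userid)      -- one row per (comment, mentioned user)
);
""")


def add_comment(text, mentioned_user_ids):
    """Insert a comment plus one junction row per already-resolved user id."""
    cur = conn.execute("INSERT INTO comments (text) VALUES (?)", (text,))
    conn.executemany(
        "INSERT OR IGNORE INTO comment_mentioned_users VALUES (?, ?)",
        [(cur.lastrowid, uid) for uid in mentioned_user_ids],
    )
    return cur.lastrowid


def comments_mentioning(user_id):
    """Ids of all comments that mention the given user."""
    rows = conn.execute(
        "SELECT c.id FROM comments c "
        "JOIN comment_mentioned_users m ON m.commentid = c.id "
        "WHERE m.userid = ? ORDER BY c.id",
        (user_id,),
    )
    return [row[0] for row in rows]


first = add_comment("This is a comment for #51#.", [51])
second = add_comment("#51# and #72#, look at this.", [51, 72])
```

With the mentions in their own table, "which comments mention user 51" becomes an indexed lookup instead of a text search over every comment.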
I would recommend storing the comment as markdown since it's now quite widespread. In your case, "This is a comment for [#John](/user/johnID)".
Markdown is pretty standard and you shouldn't have an issue finding a package for editing / viewing.

How to model a database structure with repeating fields in every table

I'm in the process of structuring a database model for my new project. All the entities in my model (it is a CMS, with entities such as page, content, menu, template and a bunch of others) have the same date and name attributes in common.
More specifically each entity contains the following for the dates: IsCreated, IsValidFrom, IsPublished, IsDeleted, IsEdited and IsExpired, and for names: CreatedByNameId, ValidFromByNameId, PublishedByNameId and so on...
I'm going to use EF5 for mapping to objects.
The question is simple: what is the best way to structure this? Having all the fields in every table (which I am not obliged to do...) or having two separate tables which the others can relate to?
Thanks in advance /Finn.
First of all - give this a read - http://www.agiledata.org/essays/mappingObjects.html
You really need to think about your queries/access paths. There are many tradeoffs between different implementations.
In reply to your example though,
Given the following setup:
COMMON
  ValidFromByNameId
SPECIFIC1
  FieldA
SPECIFIC2
  FieldB
Querying by the COMMON attributes is easy, but you'll have to work some magic when pulling up the subclasses (unless EF5 does it for you).
If the primary questions you're asking are about SPECIFIC1 and SPECIFIC2, then perhaps this isn't the right model. Having the COMMON table doesn't necessarily buy you much, as it will introduce a join to load any SPECIFIC1 object. In this case, I'd probably just have duplicate columns.
This answer is intentionally partial as a full answer is better handled by the numerous articles and blogs already out there. Search for "mapping object hierarchies to databases"

Custom Fields for a Form representing an object

I have an architectural question concerning custom fields in a view for an object. Let's say you have a User Object with some basic information like firstname, lastname, ... that can be used by all customers.
Now, we often get a request from a customer to add a couple of custom fields typical for their domain. Our current solution is an xml data column where key-value pairs are stored. This has been OK so far, but now we'll have to find a more architectural solution.
For instance, now, a customer wants a dropdown where it can select the value for its custom field. We could still store the selected value in the xml data column, but where do we store all those dropdown values...
I know that in sharepoint you can also add custom fields like dropdowns and I was wondering how to deal with this best. I want to avoid creating custom tables for customers, or having a table with 90 columns (10 basic and then 10 for each customer), ...
You get the idea, it should be generic and be able to deal with all sorts of problems in the future.
What I was thinking about is a table UserConfiguration where each record has a foreign key to the Customer (Channel in our database), then a column FieldName, a column FieldType and a column Values. The Values column should be an xml type column, because for a dropdown we'll need to store multiple values. Also, each value can have extra data attached to it (not just a name). The other problem then is how to store the selected value. I don't like the idea of having foreign keys to xml in my database (I read somewhere that Azure can't handle this all too well). Do you just store the name of the value (and what if the value were to disappear from the xml)?
Any documentation, links on this kind of problems would also be great. I'm trying to find a design pattern that deals with this kind of problem in the database.
I want to answer your question in two parts:
1) Implementing custom fields in a database server
2) Restricting custom fields to an enumeration of values
Although common solutions to 1) are discussed in the question referenced by #Simon, maybe you are looking for a bit of discussion on what the problem is and why it hasn't been solved for us already.
databases are great for structured, typed data
custom fields are inherently less structured
therefore, custom fields are more difficult to work with in a database
some or many of the advantages of using a database are lost
some queries may be more difficult or impossible
type safety may be lost (in the database)
data integrity may no longer be enforced (by the database)
it's a lot more work for the implementers and maintainers
As discussed in the other question, there's no perfect solution.
But these benefits/features still need to be implemented somewhere, and so often the application becomes responsible for data integrity and type safety.
For situations like these, people have created Object-Relational Mapping tools, although, as Jeff Atwood says, even using an ORM could create more problems than it solves. However, you mentioned that it 'should be generic and be able to deal with all sorts of problems in the future' -- this makes me think an ORM might be your best bet.
So, to sum up my answer, this is a known problem with known solutions, none of which are completely satisfactory (because it's so hard). Pick your poison.
To answer the second part of (what I think is) your question:
As mentioned in the linked question, you could implement Entity-Attribute-Value in your database for custom fields, and then add an extra table to hold the legal values for each attribute. Then, the attribute/value pair of the EAV table is a foreign key into the attribute-value table.
For example,
CREATE TABLE `attribute_value` ( -- enumerations go in this table
`attribute` varchar(30),
`value` varchar(30),
PRIMARY KEY (`attribute`, `value`)
);
CREATE TABLE `eav` ( -- now the values of attributes are restricted
`entityid` int,
`attribute` varchar(30),
`value` varchar(30),
PRIMARY KEY (`entityid`, `attribute`),
FOREIGN KEY (`attribute`, `value`) REFERENCES `attribute_value`(`attribute`, `value`)
);
Of course, this solution isn't perfect or complete -- it's only supposed to illustrate the idea. For instance, it uses varchars, and lacks a type column. Also, who gets to decide what the possible values for each attribute are? Can these be changed at any time by the user?
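To see the constraint doing its job, the same two tables can be exercised with Python's built-in sqlite3 module. This is a sketch under assumptions: SQLite stands in for the MySQL dialect used above, foreign key enforcement has to be switched on explicitly, and the set_value helper and the sample 'size' attribute are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite does not enforce FKs by default
conn.executescript("""
CREATE TABLE attribute_value (             -- enumerations go in this table
    attribute TEXT,
    value     TEXT,
    PRIMARY KEY (attribute, value)
);
CREATE TABLE eav (                         -- values are restricted by the FK
    entityid  INTEGER,
    attribute TEXT,
    value     TEXT,
    PRIMARY KEY (entityid, attribute),
    FOREIGN KEY (attribute, value) REFERENCES attribute_value (attribute, value)
);
""")

# The legal choices for a hypothetical 'size' drop-down.
conn.executemany(
    "INSERT INTO attribute_value VALUES (?, ?)",
    [("size", "small"), ("size", "large")],
)


def set_value(entity_id, attribute, value):
    """Try to store a custom field value; return False if the database rejects it."""
    try:
        conn.execute("INSERT INTO eav VALUES (?, ?, ?)", (entity_id, attribute, value))
        return True
    except sqlite3.IntegrityError:
        return False


accepted = set_value(1, "size", "large")      # listed in attribute_value
rejected = set_value(2, "size", "enormous")   # not listed, so the FK fails
```

The drop-down in the UI and the integrity check in the database then read from the same attribute_value rows.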
I'm doing something similar for a customer. I've created a JSON FieldType which holds the entire JSON stream of a complex object and a String containing the FQTN (FullQualifiedTypeName) of my C# model class.
By using custom New-, Edit- and Display-Forms we ensured that our custom objects are rendered the correct way for the best user experience.
To promote fields from the complex C# model to the SharePoint list, we've built something like what Microsoft did in InfoPath. Users are able to select Properties or MetaData from the complex C# type, which will be automatically promoted to the hosting SharePoint list.
The big advantage of JSON is that it's smaller than XML and easier to work with in the web world. (JavaScript...)
When you let the users create the data models, I would recommend looking at a document database or 'NoSQL', since that is exactly what you want: to store schemaless data structures.
Also, SharePoint stores metadata the way you mentioned (10 columns for text, 5 for dates etc).
That said, in my current project (locked in SharePoint, so Framework 3.5 + SQL Server and all the constraints that follow) we use a somewhat similar structure as below:
Form
  Id
Attribute (or Field)
  Name
  Type (enum) Text, List, Dates, Formulas etc
  Hidden (bool)
  Mandatory
  DefaultValue
  Options (for lists)
  Readonly
  Mask (for SSN etc)
  Length (for text fields)
  Order
Metadata
  FormId
  AttributeId
  Text (the value for everything but dates)
  Date (the value for dates)
Our formulas employ functions such as Increment: INC([attribute1][attribute2], 6) and this would produce something like 000999 for the 999th instance of the combined values for attribute 1 and attribute 2 for a form, this is stored as:
AttributeIncrementFormula
AttributeId
Counter
Token
Other 'formulas' (aka anything non-trivial) such as barcodes are stored as single metadata values. In the actual implementation, we would have something like this:
var form = formRepository.GetById(1);
var firstName = form.Metadata["firstname"].Value;
Value above is a readonly property that decides whether we should get the value from Text or Date and if some additional transform is required. Note that the database here is merely a storage, we hold all the domain complexity in the application.
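For illustration, the idea of a read-only value that picks between the Text and Date columns might look like this in Python. It is a sketch, not the production C# code described above: the class names MetadataEntry and Form, the field names, and the rule "type Dates reads the date column, everything else reads the text column" are assumptions based on the table outline.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, Optional, Union


@dataclass
class MetadataEntry:
    attribute_type: str                  # e.g. "Text", "List", "Dates"
    text_value: Optional[str] = None     # the Text column
    date_value: Optional[date] = None    # the Date column

    @property
    def value(self) -> Union[str, date, None]:
        """Read-only view that hides which column actually holds the data."""
        if self.attribute_type == "Dates":
            return self.date_value
        return self.text_value


@dataclass
class Form:
    id: int
    metadata: Dict[str, MetadataEntry] = field(default_factory=dict)


form = Form(
    id=1,
    metadata={
        "firstname": MetadataEntry("Text", text_value="Ada"),
        "birthday": MetadataEntry("Dates", date_value=date(1990, 5, 17)),
    },
)
first_name = form.metadata["firstname"].value
```

Callers only ever touch value, so any extra transform (formatting, masking) can be added inside the property without changing them.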
We also let our customer decide which attribute is the form title for example, so if firstname is the form title, they'll set an in-memory param that spans the entire application to be something like Params.InMemory.TitleAttributeId = <user-defined-id>.
I hope this gives you some insight on a production impl of a similar scenario.
This is really more of a comment than an answer, but I need more space than SO will allow for comments, so here 'tis:
I think your UserConfiguration table approach is good, and would suggest only abstracting the "type" and "value" pieces of your design a bit more:
Since your application will need to validate user input, each notion of "type" will have an associated piece of evaluation logic. Obviously the more of this you can abstract into data the easier it will be to keep your code small. Enumerated lists are a good start, but if your "validator" logic can be extended to handle pattern matching for text strings and Boolean logical expressions (e.g. to describe/enforce constraints on input values), then you can express pretty much any "type" of input that your application may need to handle in terms of (relatively) simple "atoms" that you can map naturally to DB tables.
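One way to read "validators as data" is sketched below in Python. The spec format (dicts with a kind key), the kinds enum, pattern, all_of and any_of, and the function make_validator are all invented for this example; the point is only that each spec is plain data that could live in a table row or a JSON column.

```python
import re


def make_validator(spec):
    """Build a predicate from a plain-data description of a field type."""
    kind = spec["kind"]
    if kind == "enum":                       # value must be one of a fixed list
        allowed = set(spec["values"])
        return lambda v: v in allowed
    if kind == "pattern":                    # value must match a regular expression
        rx = re.compile(spec["regex"])
        return lambda v: isinstance(v, str) and rx.fullmatch(v) is not None
    if kind == "all_of":                     # logical AND of sub-validators
        checks = [make_validator(s) for s in spec["of"]]
        return lambda v: all(check(v) for check in checks)
    if kind == "any_of":                     # logical OR of sub-validators
        checks = [make_validator(s) for s in spec["of"]]
        return lambda v: any(check(v) for check in checks)
    raise ValueError(f"unknown validator kind: {kind!r}")


yes_no = make_validator({"kind": "enum", "values": ["Yes", "No"]})
ssn = make_validator({"kind": "pattern", "regex": r"\d{3}-\d{2}-\d{4}"})
# A list choice OR a non-empty free-text answer.
list_or_other = make_validator({
    "kind": "any_of",
    "of": [
        {"kind": "enum", "values": ["Red", "Green"]},
        {"kind": "pattern", "regex": r".+"},
    ],
})
```

New field types then become new rows of data rather than new code paths, as long as they can be composed from the existing atoms.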
When storing a user-specified value, you can either store the "raw" data (e.g. in JSON) and a foreign key to the associated "type", or you can add a lookup/cache system that assigns an integer to each new value that is encountered by the system ("novelty" can be checked by checking a hash of the "raw" data, for example). The latter approach obviously scales better if you're expecting lots of data duplication (which of course you would in the case of a multiple-choice menu).
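The lookup/cache idea can be sketched in a few lines of Python. This is an in-memory illustration only: the class name ValueCache and its methods are hypothetical, SHA-256 over canonical JSON is just one possible choice of hash, and a real system would keep the two mappings in database tables.

```python
import hashlib
import json


class ValueCache:
    """Assigns a small integer id to each distinct raw value."""

    def __init__(self):
        self._id_by_hash = {}   # hash of the raw JSON -> integer id
        self._raw_by_id = {}    # integer id -> raw JSON text

    def intern(self, value):
        """Return the id for a value, allocating a new one only if it is novel."""
        raw = json.dumps(value, sort_keys=True, separators=(",", ":"))
        digest = hashlib.sha256(raw.encode("utf-8")).hexdigest()
        value_id = self._id_by_hash.get(digest)
        if value_id is None:
            value_id = len(self._raw_by_id) + 1
            self._id_by_hash[digest] = value_id
            self._raw_by_id[value_id] = raw
        return value_id

    def lookup(self, value_id):
        """Recover the original value from its id."""
        return json.loads(self._raw_by_id[value_id])


cache = ValueCache()
yes_id = cache.intern({"label": "Yes"})
no_id = cache.intern({"label": "No"})
again_id = cache.intern({"label": "Yes"})   # duplicate, reuses the first id
```

Sorting the keys before hashing means two values that differ only in key order map to the same id.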

MS Access 07 - Q re lookup column vs many-to-many; Q re checkboxes in many-to-many forms

I'm creating a database with Access. This is just a test database, similar to my requirements, so I can get my skills up before creating one for work. I've created a database for a fictional school as this is a good playground and rich data (many students have many subjects have many teachers, etc).
Question 1
What is the difference, if any, between using a Lookup column and a many-to-many associate table?
Example: I have Tables 'Teacher' and 'Subject'. Many teachers have many subjects. I can, and have, created a table 'Teacher_Subject' and run queries with this.
I have then created a lookup column in teachers table with data from subjects. The lookup column seems to take the place of the teacher_subject table. (though the data on relationships is obviously duplicated between lookup table and teacher_subject and may vary). Which one is the 'better' option? Is there a snag with using lookup tables?
(I realize that this is a very 'general' question. Links to other resources and answers saying 'that depends...' are appreciated)
Question 2
What attracts me to lookup tables is the following: When creating a form for entering subjects for teachers, with lookup I can simply create checkboxes and click a subject for a teacher 'on' or 'off'. Each click on/off creates/removes a record in the lookup column (which replaces teacher_subject).
If I use a form from a query on teacher_subject, with teacher as the main form and subject as a subform, I run into this problem: in the subform I can either select each subject that teacher has in a combo box, i.e. click, scroll down, select, go to next row, click, scroll down, etc. (takes too long), OR I can create a list box listing all available subjects in each row but allowing me to select only one (takes up too much space). Is it possible to have a click on/off list box for teacher_subject, creating/removing a record there with each click?
Note - I know zero SQL or VB. If the correct answer is "you need to know SQL for this" then that's cool. I just need to know.
Thanks!
Lookup columns in tables will cause you more stress than joy. Unless you need them for SharePoint, they should be avoided. You may wish to read http://r937.com/relational.html and http://www.mvps.org/access/tencommandments.htm
I wouldn't use them. Your example is fine, but there are limitations. What do you do when you need to reference another field from the Subject table other than the name? How would you differentiate subjects that are only offered on a semester basis?
You have no way of getting a count of how many subjects each teacher is assigned without some ugly coding.
Another limitation arises when you start identifying who taught which courses during a given school year.
I'm kind of unclear on your second question, but it sounds to me like you need a subform with a dropdown list.
If you want to do the checkbox thing, it quickly becomes a lot more complicated. To me, you're starting from user interface and working backwards to structure, instead of going the other direction.
I hesitate to mention it, but in terms of full disclosure you should know that in A2007 and A2010, you have multi-value fields available, and they are presented with exactly the UI you describe. But they have many of the same problems as lookup fields, and are quite complex to work with in code. Behind the scenes, they are implemented with a standard many-to-many join table, but it's all hidden from you.
I wish MS would make the listbox with checkbox control that is used with MV fields available for all listboxes, but binding that to a many-to-many join table would be complex if the listbox control were not designed for that (with link child/link master properties, for instance).
I tried to come up with a way to offer you the UI feature you prefer from multi-value fields without actually using multi-value fields. That seems challenging to me.
The closest I could come up with is to load a disconnected recordset with your "List" choices and a check box field. Then create a form, or subform, based on that recordset which you present in datasheet view. It could look similar to a combo bound to a multi-value field. In the after update event of the checkbox field, you would need code to add or remove a record from the junction table as required.
However, I don't know if this is something you would care to tackle. Earlier you indicated a willingness to learn SQL if needed; the approach I'm suggesting would also require VBA. Maybe take a look at Danny Lesandrini's article, Create In-Memory ADO Recordsets, to see whether it is something you could use.
OTOH, maybe the most appropriate answer for you is to keep the multi-value fields and get on with the rest of your life. I'm stuck. But now that we know you are actually using multi-value fields, perhaps someone else will be able to offer you a more appropriate suggestion.

Resources