I was wondering what the best way is to store comments in a database (SQL) so that other users can be mentioned by a non-unique natural name.
E.g. This is a comment for #John.
The client application would also need to detect and link to corresponding user profile if his/her name was clicked.
My initial thought was to replace the user's first name with the id and some metadata and store that in the DB: This is a comment for <John_51/> where 51 is the id of that user. Clients can then parse that and display the appropriate user name and profile links.
Is this a good approach?
Some background:
What I would like to achieve is similar to facebook posts where it allows you to 'tag' a user by just mentioning their name (not the unique username) in a post. It doesn't have to be as complex as facebook as what I need it for isn't for a post, but just comments (which can only be text, as opposed to posts which could be text mixed with videos/images/etc).
The solution would affect the database side (how the comments are stored) and also the client side (how the comments are parsed and displayed to the user). The clients are mobile apps for iOS and Android but also looking to expand to a web application as well.
I don't think the language matters much, but for completeness' sake, I'm using Python's Flask with SQLAlchemy on the backend.
Current DB schema for comments
COMMENT TABLE:
id (<PK>)
post_id (id of the post that the comment is for: <FK on a post object>)
author_id (id of the creator of the comment: <FK on a user object>)
text (comment text: <String>)
timestamp (comment date: <Date>)
Edit:
I ended up going with metadata in the comment. E.g.
Hey <mention userid="785" tagname="JohnnyBravo"/>!
I included the user's name (tagname) as well so that client application can extract the name directly from the comment text instead of adding another step to look up who user 785 is.
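For anyone wanting to see what the client-side parsing could look like, here is a minimal sketch in Python. It assumes the tag always has exactly the shape shown above (userid first, then tagname, double quotes, self-closing); the names MENTION_RE, parse_comment and mentioned_user_ids are made up for illustration.

```python
import re

# Assumed tag shape, taken from the example above:
# <mention userid="785" tagname="JohnnyBravo"/>
MENTION_RE = re.compile(r'<mention userid="(\d+)" tagname="([^"]*)"/>')


def parse_comment(text):
    """Split stored comment text into ("text", str) and ("mention", id, name) segments."""
    segments = []
    pos = 0
    for m in MENTION_RE.finditer(text):
        if m.start() > pos:
            segments.append(("text", text[pos:m.start()]))
        segments.append(("mention", int(m.group(1)), m.group(2)))
        pos = m.end()
    if pos < len(text):
        segments.append(("text", text[pos:]))
    return segments


def mentioned_user_ids(text):
    """Return the user ids referenced in a comment, in order of appearance."""
    return [int(m.group(1)) for m in MENTION_RE.finditer(text)]


segments = parse_comment('Hey <mention userid="785" tagname="JohnnyBravo"/>!')
```

A client can render the "mention" segments as profile links and the "text" segments as-is. Text typed by users that happens to look like a mention tag would still need escaping before storage, which this sketch does not handle.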
The big problem here is that if the username is not a stable reference, you need to abstract it to an id reference while still keeping the text reconstructable and the references queryable.
Embedded collections and dynamic typing are a great option if you're using a NoSQL database. It would be fairly straightforward.
{
_id: ...,
text: [
"Wow ",
51,
", your selfie looks really great, even better than ",
72,
"'s does."
],
...
}
That way you could query references, while still easily reconstructing the content. BUT since you're using SQLAlchemy, that's a no go. Your methodology seems fine, but because you're doing magic in the string you'll need to escape your delimiters (as well as the escape character itself) if they exist in the text. Personally, I would use # as the delimiter since it's already a special character. You'd also need to identify the end of the id, in case the user sticks a bunch of numbers after the #mention, so:
Wow #51#, your selfie looks really great, even better than #72#'s does. email me! john\#foo.com. Division time!!! with backslashes! 12\\4 = 3
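A rough Python sketch of that escaping scheme, assuming # wraps an id on both sides and backslash is the escape character as in the example line above. The function names encode and decode are hypothetical, and malformed input is not handled gracefully (an unclosed # raises ValueError).

```python
def encode(parts):
    """Build the stored string from a list of str (literal text) and int (user id)."""
    out = []
    for part in parts:
        if isinstance(part, int):
            out.append(f"#{part}#")
        else:
            # Escape the escape character first, then the delimiter.
            out.append(part.replace("\\", "\\\\").replace("#", "\\#"))
    return "".join(out)


def decode(stored):
    """Inverse of encode: return a list of str and int parts."""
    parts, buf, i = [], [], 0
    while i < len(stored):
        ch = stored[i]
        if ch == "\\" and i + 1 < len(stored):
            buf.append(stored[i + 1])  # escaped character is taken literally
            i += 2
        elif ch == "#":
            end = stored.index("#", i + 1)  # closing delimiter of the id
            if buf:
                parts.append("".join(buf))
                buf = []
            parts.append(int(stored[i + 1:end]))
            i = end + 1
        else:
            buf.append(ch)
            i += 1
    if buf:
        parts.append("".join(buf))
    return parts


# The example line from this answer, written as a raw string.
example = r"Wow #51#, your selfie looks really great, even better than #72#'s does. email me! john\#foo.com. Division time!!! with backslashes! 12\\4 = 3"
decoded = decode(example)
```

The integers in the decoded list are the user ids to resolve to display names; everything else is literal text.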
If querying posts for references is also important to you, you'll also need to maintain a separate POST__USER junction table that stores one row per post and referenced user id, so that when you load an object into memory, you can construct a collection. You could decide to add the junction table later, but it would be a fairly expensive migration.
If #name is not unique, you have to somehow associate the non-unique name, via the session, with the unique owner of the natural name, and do this ideally before storing it in the database. Storing a non-unique name in the database, if it cannot be resolved to its unique owner, is not of much value.
Since you mention "sql" I assume you're using a relational database. If that is the case, once you have resolved #name to its unique owner, I would create a one-to-many relationship between posting or comment and userids; that would allow a comment or post to reference more than one user.
TABLE: COMMENT_MENTIONEDUSERS
commentid
userid
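As a sketch of how that table could be used, here is a small example with Python's built-in sqlite3 module (my choice for a self-contained demo; the asker's stack is SQLAlchemy). The comment table is reduced to two columns, and the helper names add_comment and comments_mentioning are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE comment (id INTEGER PRIMARY KEY, text TEXT NOT NULL);
    CREATE TABLE comment_mentionedusers (
        commentid INTEGER NOT NULL REFERENCES comment(id),
        userid    INTEGER NOT NULL,
        PRIMARY KEY (commentid, userid)
    );
""")


def add_comment(conn, text, user_ids):
    """Store a comment plus one junction row per mentioned user."""
    cur = conn.execute("INSERT INTO comment (text) VALUES (?)", (text,))
    comment_id = cur.lastrowid
    conn.executemany(
        "INSERT OR IGNORE INTO comment_mentionedusers (commentid, userid) "
        "VALUES (?, ?)",
        [(comment_id, uid) for uid in user_ids],
    )
    return comment_id


def comments_mentioning(conn, user_id):
    """Return ids of all comments that mention the given user."""
    rows = conn.execute(
        "SELECT c.id FROM comment c "
        "JOIN comment_mentionedusers m ON m.commentid = c.id "
        "WHERE m.userid = ? ORDER BY c.id",
        (user_id,),
    )
    return [r[0] for r in rows]


first = add_comment(conn, "This is a comment for #John.", [51])
second = add_comment(conn, "#John and #Jane, take a look.", [51, 72])
```

With this in place, "which comments mention user 51?" is a plain join rather than a text search.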
I would recommend storing the comment as markdown since it's now quite widespread. In your case, "This is a comment for [#John](/user/johnID)".
Markdown is pretty standard and you shouldn't have an issue finding a package for editing / viewing.
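If you go the markdown route, pulling the mentions back out is a small regex job. A minimal sketch, assuming every mention is written exactly as [#Name](/user/ID) like in the example above; MD_MENTION_RE and both function names are made up for illustration.

```python
import re

# Assumed mention shape: [#Name](/user/ID)
MD_MENTION_RE = re.compile(r"\[#([^\]]+)\]\(/user/([^)\s]+)\)")


def extract_mentions(markdown_text):
    """Return (display name, user id) pairs for every mention link."""
    return [(m.group(1), m.group(2)) for m in MD_MENTION_RE.finditer(markdown_text)]


def to_plain_text(markdown_text):
    """Replace each mention link with its '#Name' form, e.g. for notifications."""
    return MD_MENTION_RE.sub(lambda m: "#" + m.group(1), markdown_text)


mentions = extract_mentions("This is a comment for [#John](/user/51).")
```

For actual display you would hand the text to whatever markdown renderer the client already uses; the regex is only needed when the server wants the list of mentioned users.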
Related
I'm an intern student at a company that does both wiring and aircon services. The job that they gave me was to make a database for them. I don't have any experience in anything related to databases.
So I started looking up videos and other material to learn at least a bit about databases, and after 1.5 months of learning I made something that works.
In the database that I created,
I have 1 table (CustomerDetailsT):
CustomerID (pk)
CustomerName
PhoneNumber
Address
Aircond (type and model of AC, e.g. WM Daikin 1.0HP)
AcDetails (what has been done for the ac.)
Others (yes/no) (Wiring, installing a fan and so on)
WhatHasBeenDone (shows what has been done for others)
Then 3 queries (CustomerOthersDetailsQ, CustomerAcDetailsQ, CustomerDetailsQ). CustomerAcDetailsQ has CustomerName, PhoneNumber, Address, Aircond and AcDetails. CustomerOthersDetailsQ has CustomerName, PhoneNumber, Address, Others, and WhatHasBeenDone. CustomerDetailsQ has CustomerID, CustomerName, PhoneNumber and Address.
And 1 form with 3 subforms.
It's a search form that searches for customers as we type in their name/phone number and shows what has been done for the customer.
With this, I have created what the company wants, but now they want to add dates. Dates which would show when we have done something for a customer. Dates for Aircond and the Others stuff.
I've tried with what I know and it didn't work. I tried searching on YouTube and Google, but still couldn't find an answer.
How can I go about doing this? I have tried having separate tables for each service, but it became a hassle when I wanted to create a new customer. I hope I can get some help; I can send pictures if someone needs them.
The customer search form: https://i.stack.imgur.com/mtrmC.png
Example of a customer that has AC installation: https://i.stack.imgur.com/A3Y9d.png
Example of a customer that has both AC and wiring done: https://i.stack.imgur.com/dsGL5.png
Acknowledging the question is too broad, here is some guidance. One of the nice things about Access is that each database is a single file. First protect your work by finding that file and making two copies: a backup and a play-around version. Only mess with the play-around version.
Your question indicates you are still learning Table Normalization and 1 to many relationships. Both of these topics are general to all databases, so you don't have to restrict yourself to just Access when looking for guides and Youtube videos.
Part of normalization is putting separate entities into their own tables. Also, in Access there is a big payoff for using the Relationship Tool, so here is a rather lame example of normalization:
Make sure to select the checkboxes when setting up relationships.
WhatHasBeenDone should also have WhatHasBeenDoneDate. I've wrapped AC and Other as Unit because later it will be easier than having two WhatHasBeenDone tables (one for AC, one for Other).
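To make the normalized layout concrete outside of Access, here is a rough sketch using Python's built-in sqlite3 module instead. The table and column names are my guesses based on the names used in this answer (Customers, UnitTypes with UnitTypeID and UnitDescription, WhatHasBeenDone with WhatHasBeenDoneDate); WorkDone, the history helper and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforce the relationships
conn.executescript("""
    CREATE TABLE Customers (
        CustomerID   INTEGER PRIMARY KEY,
        CustomerName TEXT NOT NULL,
        PhoneNumber  TEXT,
        Address      TEXT
    );
    CREATE TABLE UnitTypes (
        UnitTypeID      INTEGER PRIMARY KEY,
        UnitDescription TEXT NOT NULL
    );
    -- One row per job, so a customer can have many dated entries.
    CREATE TABLE WhatHasBeenDone (
        WhatHasBeenDoneID   INTEGER PRIMARY KEY,
        CustomerID          INTEGER NOT NULL REFERENCES Customers(CustomerID),
        UnitTypeID          INTEGER NOT NULL REFERENCES UnitTypes(UnitTypeID),
        WorkDone            TEXT NOT NULL,
        WhatHasBeenDoneDate TEXT NOT NULL
    );
    -- Made-up sample data.
    INSERT INTO Customers VALUES (1, 'Example Customer', '000-0000', 'Example Street');
    INSERT INTO UnitTypes VALUES (1, 'Aircond');
    INSERT INTO UnitTypes VALUES (2, 'Others');
    INSERT INTO WhatHasBeenDone VALUES (1, 1, 1, 'Installed WM Daikin 1.0HP', '2021-03-01');
    INSERT INTO WhatHasBeenDone VALUES (2, 1, 2, 'Wiring', '2021-04-15');
""")


def history(conn, customer_id):
    """Return (date, unit description, work done) rows for one customer, oldest first."""
    rows = conn.execute(
        "SELECT w.WhatHasBeenDoneDate, u.UnitDescription, w.WorkDone "
        "FROM WhatHasBeenDone w JOIN UnitTypes u ON u.UnitTypeID = w.UnitTypeID "
        "WHERE w.CustomerID = ? ORDER BY w.WhatHasBeenDoneDate",
        (customer_id,),
    )
    return rows.fetchall()


jobs = history(conn, 1)
```

The point is only the shape: customer details live in one place, and each job is its own dated row pointing back at the customer and the unit type.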
Now imagine someone taking the customer request call. They just want to see a form to enter the customer details, request, unit type, etc. They don't want to see those tables. Even with training, entering data directly in the tables is error-prone. The person fulfilling the request just wants to enter what they did and when. That's how you start to figure out what your final data entry forms will look like.
Since we normalized the tables and used the relationships tool, the payoff is Access can give us an assortment of working starter forms. Select Each Table and then hit Create and then hit Form. Choose your Favorites and start playing around from there. While playing, keep in mind that Access will not let you add an item on the many side of a relationship unless there is an item on the 1 side.
For example I selected the customers table and hit create form:
Access uses a concept of form and subform based on separate but related tables. So, to get a form that shows what has been done for each customer I created a form for the What has been done table, and dragged it onto the customers form:
Unless an ID is also being used as a part number or something, there is probably no reason for the person entering data to see it. So I removed the textboxes bound to IDs, except for UnitTypeID, where I replaced the textbox with a combobox that displays the user-friendly UnitDescription. The IDs are still part of the form recordsources; Access is still adding new IDs and using those IDs to put the appropriate data in the right tables.
Oh, didn't we need dates? I went back and added a date to the table and adjusted the subform accordingly. I also changed the subform format from single record to continuous records to show multiple dates:
In conclusion, and in my opinion, your final forms will use VBA behind the scenes to insert data from the forms into the tables. This is because either you will want to rapidly insert multiple records, or how the end users think about the data will not match the default forms-and-subforms approach Access depends upon to figure out how to insert the data. However, the default approach is fast and I always use it for version 1 of my Access databases.
P.S. For simplicity I avoided including any Many to Many relationships
I am pretty new to designing databases, and I am currently working on a substantial project of mine that requires a pretty big database. I therefore have a couple of questions to get my database ready for implementation. --Do keep in mind that this project is focused on Laravel--
Question 1:
My project makes use of posts, but not just one kind. I have a system where three sorts of posts can be created: a standard post, a profile post and a company post. All these posts can contain images. Currently, I have a column called Post_photo in each of these post tables. Is this the right way to store pictures associated with a post? It is illustrated in the image below.
Image: https://imgur.com/a/b9FWL
Question 2:
Every post can contain comments, and to connect these comments to a post you need to refer them to one. But because I have three different variations of posts, I set my comments table up like this: "Comment table consists of a Post_ID column and a Company_post_ID column" instead of it having one Post_ID. Is this the right way to connect comments to posts? Or do I need to make another table called company_comments? If not, how can I accomplish this?
I have this same system on my likes and category tables as well, because I need to relate my likes and categories to posts. Is this the right way? To get a visual of what I am talking about, there is a picture above.
Thanks for taking the time to read this!
The following assumes that you are using a relational database.
Answer 1: If there can be more than one picture or file per post, then the best practice would be creating a table for photos that references the post's ID.
This way, when you load the post, you would query the photos table for rows whose PostID field matches your post's id.
Answer 2: If the three types of post are very similar (and contain similar data), consider having only one post table, and include a field that indicates the type of post. For example, a field called postType could store an integer (0-2) that corresponds to the type. This would simplify your comments table, as you would only reference the postID.
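Both answers together could look roughly like the sketch below. It uses Python's built-in sqlite3 module just to keep the example self-contained (the asker is on Laravel, so treat the SQL as the relevant part). The table names, the body and path columns, and the mapping 0 = standard, 1 = profile, 2 = company are my assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (
        id       INTEGER PRIMARY KEY,
        -- Assumed mapping: 0 = standard, 1 = profile, 2 = company
        postType INTEGER NOT NULL CHECK (postType IN (0, 1, 2)),
        body     TEXT NOT NULL
    );
    CREATE TABLE photos (
        id     INTEGER PRIMARY KEY,
        PostID INTEGER NOT NULL REFERENCES posts(id),
        path   TEXT NOT NULL
    );
    CREATE TABLE comments (
        id     INTEGER PRIMARY KEY,
        PostID INTEGER NOT NULL REFERENCES posts(id),
        text   TEXT NOT NULL
    );
""")


def add_post(conn, post_type, body, photo_paths=()):
    """Insert a post of any type plus zero or more photo rows."""
    cur = conn.execute(
        "INSERT INTO posts (postType, body) VALUES (?, ?)", (post_type, body)
    )
    post_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO photos (PostID, path) VALUES (?, ?)",
        [(post_id, p) for p in photo_paths],
    )
    return post_id


def photos_for_post(conn, post_id):
    """Return the photo paths attached to one post."""
    rows = conn.execute(
        "SELECT path FROM photos WHERE PostID = ? ORDER BY id", (post_id,)
    )
    return [r[0] for r in rows]


company_post = add_post(conn, 2, "We are hiring", ["a.jpg", "b.jpg"])
conn.execute(
    "INSERT INTO comments (PostID, text) VALUES (?, ?)", (company_post, "Nice!")
)
```

Comments (and likewise likes or categories) then need only the single PostID column, whatever the post type is.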
As a final note, you might find this thread about storing binary data in databases helpful: (Storing files in SQL Server)
In reference to this question, I am facing almost the same scenario, except that in my case the questions are probably static. They are subject to change from time to time, and I still think it's not a good idea to add a column for each question; even if I decided to add them, how should the answers be specified and retrieved? The answers, however, are of different types: for example, an answer could be yes/no, list items, free text, list items OR free text (Other, please specify), multiple-selectable list items, etc.
What would be an efficient way to implement this?
Shimmy, I have written a four-part article that addresses this issue - see Creating a Dynamic, Data-Driven User Interface. The article looks at how to let a user define what data to store about clients, so it's not an exact examination of your question, but it's pretty close. Namely, my article shows how to let an end user define the type of data to store, which is along the lines of what you want.
The following ER diagram gives the gist of the data model:
Here, DynamicAttributesForClients is the table that indicates what user-created attributes a user wants to track for his clients. In short, each attribute has a DataTypeId value, which indicates whether it's a Boolean attribute, a Text attribute, a Numeric attribute, and so on. In your case, this table would store the questions of the survey.
The DynamicValuesForClients table holds the values stored for a particular client for a particular attribute. In your case, this table would store the answers to the questions of the survey. The actual value is stored in the DynamicValue column, which is of type sql_variant, allowing any type of data - numeric, bit, string, etc. - to be stored there.
My article does not address how to handle multiple-choice questions, where a user may select one option from a preset list of options, but enhancing the data model to allow this is pretty straightforward. You would create a new table named DynamicListOptions with the following columns:
DynamicListOptionId - a primary key
DynamicAttributeId - specifies what attribute these options are associated with
OptionText - the option text
So if you had an attribute that was a multiple-choice option you'd populate the drop-down list in the user interface with the options returned from the query:
SELECT OptionText
FROM DynamicListOptions
WHERE DynamicAttributeId = ...
Finally, you would store the selected DynamicListOptionId value in the DynamicValuesForClients.DynamicValue column to record the list option they selected (or use NULL if they did not choose an item).
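Here is a small sketch of that extension, using Python's built-in sqlite3 module rather than SQL Server so it runs anywhere. SQLite has no sql_variant, so DynamicValue is declared without a type, which lets it hold any value. The AttributeName and ClientId columns, the sample rows and the helper names are assumptions for illustration, not taken from the article.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DynamicAttributesForClients (
        DynamicAttributeId INTEGER PRIMARY KEY,
        AttributeName      TEXT NOT NULL,      -- assumed column: the question text
        DataTypeId         INTEGER NOT NULL
    );
    CREATE TABLE DynamicListOptions (
        DynamicListOptionId INTEGER PRIMARY KEY,
        DynamicAttributeId  INTEGER NOT NULL
            REFERENCES DynamicAttributesForClients(DynamicAttributeId),
        OptionText          TEXT NOT NULL
    );
    CREATE TABLE DynamicValuesForClients (
        ClientId           INTEGER NOT NULL,   -- assumed column
        DynamicAttributeId INTEGER NOT NULL
            REFERENCES DynamicAttributesForClients(DynamicAttributeId),
        DynamicValue,                          -- untyped: stands in for sql_variant
        PRIMARY KEY (ClientId, DynamicAttributeId)
    );
    -- Made-up sample data; the DataTypeId value 5 is arbitrary.
    INSERT INTO DynamicAttributesForClients VALUES (1, 'Preferred contact method', 5);
    INSERT INTO DynamicListOptions VALUES (10, 1, 'Email');
    INSERT INTO DynamicListOptions VALUES (11, 1, 'Phone');
""")


def list_options(conn, attribute_id):
    """Options to show in the drop-down for one multiple-choice attribute."""
    rows = conn.execute(
        "SELECT DynamicListOptionId, OptionText FROM DynamicListOptions "
        "WHERE DynamicAttributeId = ? ORDER BY DynamicListOptionId",
        (attribute_id,),
    )
    return rows.fetchall()


def record_choice(conn, client_id, attribute_id, option_id):
    """Store the chosen DynamicListOptionId (or None if nothing was chosen)."""
    conn.execute(
        "INSERT OR REPLACE INTO DynamicValuesForClients VALUES (?, ?, ?)",
        (client_id, attribute_id, option_id),
    )


record_choice(conn, 7, 1, 11)    # client 7 picked 'Phone'
record_choice(conn, 8, 1, None)  # client 8 left it blank
```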
Give the article a read through. There is a complete, working demo you can download, which includes the complete database and its model. Also, the four articles that make up the series explore the data model in depth and show how to build a web-based (ASP.NET) user interface for letting users define dynamic attributes, how to display them for data entry, and so forth.
Happy Programming!
This may not fit you exactly, but here's what I've got at my part-time job.
I have a questions table, an answers table, and a survey table. For each new survey I create a survey build (because each survey is unique, but questions and answers are repeated a lot). I then have a respondent table that contains some information about the respondent (it also links back to the survey table; I forgot that in the diagram). I also have a response table that links the respondent and the survey build. This probably isn't the best way, but it's the way that works for me, and it works pretty fast (we're at about 1 million+ rows in the response table and it handles like a dream).
With this model I get reusable questions, reusable answers (a lot of our questions use "Yes" and "No"), and a rather slim response table.
Is there an existing implementation or even a name for a type of database which allows multiple points of view? What I mean is for instance if one user changes an article's title then the change will only be visible to that particular user and everyone else will see the original title. If several users change it to the same new title then the new title becomes the 'master view', or the 'unfiltered view' of the database, initiated either by the database or from the application.
I'm coding in C# and am familiar with SQL and starting with MongoDB but the question is about the concept and whether abstractions, implementations, or design patterns of it exist.
If your "point of views" are completely separated, you could just use a new database for each user.
From your question it seems you want to have some degree of interconnectedness. Perhaps articles created by any user should be visible to everyone? Is it only changes after creation that should be private? How about updates from the original author?
I think you just need to specify the behavior you need, and design a database that can handle that.
One solution could be to use both the (article) id and the user id as the key for your tables. That way you can completely replace content for specific users. Let's say you want to find article 123 as seen by user 456, or, if that user hasn't edited it, as seen by user 789, or, if that user hasn't edited it either, just pick any version:
SELECT * FROM articles WHERE id = 123 ORDER BY user_id = 456 DESC, user_id = 789 DESC LIMIT 1
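To show the fallback ordering in action, here is a small runnable sketch with Python's built-in sqlite3 module (the asker uses C#; SQLite is only my choice for a self-contained demo). The title column, the sample rows and the title_for helper are made up. The comparison user_id = ? evaluates to 1 or 0, so sorting it descending puts the matching row first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE articles (
        id      INTEGER NOT NULL,
        user_id INTEGER NOT NULL,
        title   TEXT NOT NULL,
        PRIMARY KEY (id, user_id)   -- one version of each article per user
    );
    INSERT INTO articles VALUES (123, 789, 'Original title');
    INSERT INTO articles VALUES (123, 456, 'Title as edited by user 456');
""")


def title_for(conn, article_id, viewer_id, fallback_id):
    """Prefer the viewer's own version, then the fallback user's, then any."""
    row = conn.execute(
        "SELECT title FROM articles WHERE id = ? "
        "ORDER BY user_id = ? DESC, user_id = ? DESC LIMIT 1",
        (article_id, viewer_id, fallback_id),
    ).fetchone()
    return row[0] if row else None


own_view = title_for(conn, 123, 456, 789)
other_view = title_for(conn, 123, 111, 789)
```

Here user 456 sees their own edit, while user 111, who never edited the article, falls back to the version owned by user 789.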
I'm implementing a tag system similar to the StackOverflow tag system. When storing the tags and relating them to a question, should the relationship be directly with the tag name, or is it better to create a tagID field to "link" the question with the tag? Linking directly to the tag name looks easier, but it doesn't look good, mainly because working with statistics and/or tag categorization can (IMHO) be hard to manage that way. Another problem is when an admin decides to "fix" a tag name. If there isn't a tagID separate from the tag name, then I will be changing the key of the table...
What are your thoughts?
Thanks for all the replies. I will delete this post since there are other posts on the same subject. I wonder why the search and the suggestions didn't show those results for me...
Have a look at these related earlier SO questions:
What is the most efficient way to store tags in a database
Database design for tagging
How to design a database schema to support tagging with categories
Is there an ideal schema for tagging?
The last sentence in your question seems to answer it. Assuming the tags are stored in a tag table, I would always have an ID column (int or GUID) and a varchar/string column for the tag name. The many-to-many (junction) table that relates some other entity to one or more tags would have two columns: the ID of the "other entity" and the tag's ID.
It's then easy to edit a tag (to correct a mis-spelling for example) without touching the key. You should get much better performance when using queries that include joins with your junction table and it also means you're normalizing your data better.
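A minimal sketch of that layout, using Python's built-in sqlite3 module as a stand-in for whatever database you use. The table names, the sample rows and the helper names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    CREATE TABLE question (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    -- Junction table: only ids, never the tag name.
    CREATE TABLE question_tag (
        question_id INTEGER NOT NULL REFERENCES question(id),
        tag_id      INTEGER NOT NULL REFERENCES tag(id),
        PRIMARY KEY (question_id, tag_id)
    );
    INSERT INTO tag VALUES (1, 'databse-design');   -- misspelled on purpose
    INSERT INTO question VALUES (10, 'How should I store tags?');
    INSERT INTO question_tag VALUES (10, 1);
""")


def tags_for(conn, question_id):
    """Return the tag names attached to one question."""
    rows = conn.execute(
        "SELECT t.name FROM tag t "
        "JOIN question_tag qt ON qt.tag_id = t.id "
        "WHERE qt.question_id = ? ORDER BY t.name",
        (question_id,),
    )
    return [r[0] for r in rows]


def rename_tag(conn, tag_id, new_name):
    """Fix a tag name; the junction rows are untouched because they hold ids."""
    conn.execute("UPDATE tag SET name = ? WHERE id = ?", (new_name, tag_id))


rename_tag(conn, 1, "database-design")
```

The rename is a single-row update, and every question tagged with id 1 picks up the corrected name automatically.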
Remember, "the key, the whole key and nothing but the key, so help me Codd"! :)
If you foresee many tags, and are using a relational database, using an ID that the database supports natively (e.g. RID) internally may just give you better performance.
If that's not a concern: go by simple short tag names. You can give the tags long names which will be displayed in the user interface too where it makes sense (e.g. ask the user for one when creating a new tag). You are more likely to have to edit the long names, which nothing refers to directly, so this is not a problem.
As an aside, if you are using a relational database, it is probably not very difficult to change a tag name together with all its references with a simple query. It may be a slightly more expensive operation, but it is probably not going to be done frequently enough that you need to optimize for it. Also consider that you may have duplicate tags that you will want to merge, so you might want to be able to do that anyway.