Database schema about social networking like Facebook - database

Like Facebook, I have posts, comments and user profiles.
My thinking is:
Posts and comments do not need the details of a user.
Only user profiles need the details.
So I separated the user information into main and detail tables.
Here is the schema.
Question
Is it necessary to separate user data into main and details?
Why, or why not?
Thanks for replying!

I would recommend using separate tables because you may not need all of that information at one time. You could do it either way, but the question to ask is whether you need all of the data at once.
Table 1 (User Auth)
This table would hold only information for log-in and have three columns (user_name, hashed_password, UID)
So your query would select UID where user_name and hashed_password matched. I would also recommend never storing a readable password in a database table because that can become a security issue.
Table 2 (Basic Information)
This table would hold the least amount of information that you would get at signup to make a basic profile. The fields would consist of UID, name, DOB, zip, link_to_profile_photo, email and whatever basic information you would like. email is kind of special because if you require the user_name to be an email address there is no reason to have it twice.
Table 3 (Extended Information)
This table would hold any optional information that the user could enter, like phone_number, bio or address, keyed by UID.
Then after that you can add as many other tables as you like: one for posts, one for comments, etc.
An example of a Post table would be:
post_id, UID, the_post, date_of_post, likes, etc.
Then for Comments:
comment_id, for_post_id, UID, the_comment, date_of_comment, likes, etc.
Breaking it down into small sections would be more efficient in the long run.
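The three-table split described above could be sketched as follows. This is only a minimal illustration using Python's built-in sqlite3 module, not a definitive schema: the table names user_auth, user_basic and user_extended, the helper functions feed and profile, and the sample row are all assumptions made for the example.

```python
import sqlite3

# Hypothetical table names; the columns follow the answer above.
SCHEMA = """
CREATE TABLE user_auth (
    uid             INTEGER PRIMARY KEY,
    user_name       TEXT NOT NULL UNIQUE,
    hashed_password TEXT NOT NULL          -- never the readable password
);
CREATE TABLE user_basic (
    uid   INTEGER PRIMARY KEY REFERENCES user_auth(uid),
    name  TEXT NOT NULL,
    dob   TEXT,
    zip   TEXT,
    link_to_profile_photo TEXT
);
CREATE TABLE user_extended (               -- optional details only
    uid          INTEGER PRIMARY KEY REFERENCES user_auth(uid),
    phone_number TEXT,
    bio          TEXT,
    address      TEXT
);
CREATE TABLE post (
    post_id      INTEGER PRIMARY KEY,
    uid          INTEGER NOT NULL REFERENCES user_auth(uid),
    the_post     TEXT NOT NULL,
    date_of_post TEXT NOT NULL
);
"""

def open_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn

def feed(conn):
    """Posts with author names; user_extended is never read."""
    return conn.execute(
        "SELECT p.the_post, b.name FROM post p "
        "JOIN user_basic b ON b.uid = p.uid ORDER BY p.post_id"
    ).fetchall()

def profile(conn, uid):
    """Profile page; LEFT JOIN because the extended row is optional."""
    return conn.execute(
        "SELECT b.name, e.bio FROM user_basic b "
        "LEFT JOIN user_extended e ON e.uid = b.uid WHERE b.uid = ?",
        (uid,),
    ).fetchone()

conn = open_db()
# '<hash>' is a placeholder for the output of a real password-hashing function.
conn.execute("INSERT INTO user_auth VALUES (1, 'ann@example.com', '<hash>')")
conn.execute("INSERT INTO user_basic (uid, name) VALUES (1, 'Ann')")
conn.execute("INSERT INTO post VALUES (1, 1, 'hello', '2024-01-01')")
```

The feed query only touches post and user_basic, while the profile query is the one place that pulls in user_extended.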

Database performance is closely tied to disk seek time, which is often the bottleneck. For a large table, locating and reading an entry can require a long seek. Posts and comments do not need the user details, only the main user info, so you may get reduced read time when they read only the user ID and the main row. Joins with user_main_info will also tend to be faster. You can keep the small portion of data that you read most frequently in one table and the other detailed information in another table. But in a scenario where you always need to read all of the user information together, this won't give you any benefit.

1) Add a user information table.
ex:create table fb_users
(intuserid int primary key,
username varchar(50),
phoneno varchar(20),
emailid varchar(max))
2) Store friend requests.
2.a) Create a table called friends with the friend requestor, friend requested by, the status between both of them, and an active flag.
ex:create table fb_friends
(intfriendid int primary key,
intfriendrequestor int foreign key references fb_users(intuserid),
intfriendrequestedby int foreign key references fb_users(intuserid),
statusid int foreign key references fb_status(intstatusid), -- status id from the lookup table below; create fb_status first
active bit)
3) Create a lookup table for the status.
3.a) Create a table called status with statusname, statusdesc, and an active flag.
ex:create table fb_status
(intstatusid int primary key,
statusname varchar(50),
statusdesc varchar(255),
active bit)
The status could be
pending
approved
deleted
..etc
4) Similarly, for groups, likes and comments,
a table is created for each one of them, and each is linked to the user table
through a foreign key on intuserid.

Related

how to maintain friend requests data in sqlserver

I have a requirement in my app where one user can send the friend request to another user. We use SQL Server database as the backend. The structure of the table is like this.
CREATE TABLE FriendStatus
(FriendStatusId BIGINT PRIMARY KEY IDENTITY(1,1),
FromUserId BIGINT,
ToUserId BIGINT,
StatusId TINYINT,
SentTime DATETIME2,
ResponseTime DATETIME2);
I have a few questions related to this:
If user A sends a friend request to user B, should a friend request from user B to user A still be valid? I feel that should be the case; let me know if there is a better way of handling this.
Is it a good idea to store the users' data in a separate table called friends once user B approves user A's friend request? Once user B approves the request, should one record be inserted into the friends table with user A in col1 and user B in col2, and should we also insert a second record with user B in col1 and user A in col2? Or are two records unnecessary?
Is it a good idea to store the users' data in a separate table called friends once user B approves user A's friend request?
No, it's almost never a good idea to duplicate data in your database. You can get anomalies where the same data in two places has two different values. Which value is the correct value?
Here's one way to maintain the relationship
User
----
User ID
User Name
...
Friend
------
User ID 1
User ID 2
...
In the Friend table, the primary key is (User ID 1, User ID 2). You would also have a unique index on (User ID 2, User ID 1). It's up to you if you want to have one row or two rows for each relationship.
One row means you have to do two SELECTS with a UNION. One SELECT using the primary key and one SELECT using the unique index.
Two rows means you do a SELECT using the primary key.
You have the same one row / two row choice with the FriendRequest table.
FriendRequest
-------------
User ID 1
User ID 2
Status
Sent Time Stamp
Accepted Time Stamp
...
You can have one row or two rows for each request. In this case, I'd prefer the one row because I could determine which user initiated the friend request.
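As a rough sketch of the one-row variant of the Friend table, the lookup with two SELECTs and a UNION might look like this. It uses Python's built-in sqlite3 module; the column names user_id_1 and user_id_2, the index name friend_reverse, and the helper friends_of are assumptions made for the example.

```python
import sqlite3

def open_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE friend (
            user_id_1 INTEGER NOT NULL,
            user_id_2 INTEGER NOT NULL,
            PRIMARY KEY (user_id_1, user_id_2)
        );
        -- Reverse index so lookups by the second column are also indexed.
        CREATE UNIQUE INDEX friend_reverse ON friend (user_id_2, user_id_1);
    """)
    return conn

def friends_of(conn, user_id):
    """One row per friendship: search both columns and UNION the results."""
    rows = conn.execute("""
        SELECT user_id_2 FROM friend WHERE user_id_1 = ?
        UNION
        SELECT user_id_1 FROM friend WHERE user_id_2 = ?
        ORDER BY 1
    """, (user_id, user_id)).fetchall()
    return [r[0] for r in rows]

conn = open_db()
# Friendships 1-2 and 3-1, each stored once.
conn.executemany("INSERT INTO friend VALUES (?, ?)", [(1, 2), (3, 1)])
print(friends_of(conn, 1))
```

With the two-row variant, the second SELECT and the UNION disappear, at the cost of keeping both rows in sync.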
I'll take a few examples from Facebook to answer.
If user A sends a friend request to user B, then the friend request
from user B to User A should still valid?
No. A dialog box appears saying "You have already received a Friend Request from {name}". Also, from B's view, the link "Send Friend Request" to A should be changed to "Respond to friend request", with the corresponding code behind it.
Is it a good idea to store the users data in a separate table called friends...?
No, one record is enough. Additionally, you can have a new column to maintain the status: status={blocked|friends|pending}
Well, that's my idea. You are free to choose, since the application is yours. Think as a user too.
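One way to read this answer in code: keep a single row per pair with a status column, and check for a request in either direction before inserting a new one. The sketch below uses Python's built-in sqlite3 module; the table name friend_request, its column names, and the return values "sent", "respond" and "exists" are hypothetical, chosen only for the example.

```python
import sqlite3

def open_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE friend_request (
            from_user_id INTEGER NOT NULL,
            to_user_id   INTEGER NOT NULL,
            status       TEXT NOT NULL
                CHECK (status IN ('pending', 'friends', 'blocked')),
            PRIMARY KEY (from_user_id, to_user_id)
        )
    """)
    return conn

def send_request(conn, sender, recipient):
    """Insert a pending request unless the pair already has a row."""
    row = conn.execute(
        "SELECT from_user_id, status FROM friend_request "
        "WHERE (from_user_id = ? AND to_user_id = ?) "
        "   OR (from_user_id = ? AND to_user_id = ?)",
        (sender, recipient, recipient, sender),
    ).fetchone()
    if row is None:
        conn.execute(
            "INSERT INTO friend_request VALUES (?, ?, 'pending')",
            (sender, recipient),
        )
        return "sent"
    if row[1] == "pending" and row[0] == recipient:
        # The other user already asked: show "Respond to friend request".
        return "respond"
    return "exists"
```

Because the row records who sent the request, the application can tell the two directions apart without a second record.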

solution for single attribute in table

I am designing a database architecture. I have users. A user makes a request for an order. An order is associated with a payment. Once the payment is completed, I want to generate a sticker for that user.
A sticker has an initial price (i.e. $10). The admin can edit the sticker price, so if the admin changes it, orders placed after the change are generated with the new price.
My database architecture is as follows:
User(id, name, email, password)
Order(id, user_id, no_of_sticker,sticker_prize, address, status)
Payment(id, order_id, amount, date)
sticker(id, order_id, name, content)
sticker_info(sticker_prize)
Now, my question is: is it good to create a new table for just one single attribute?
The sticker_prize is only available for the admin to edit.
Please give your valuable suggestions.
Thanks in advance.
Creating the sticker_info table for the purpose of storing a single value is OK from a database design perspective. You should ensure you have a primary key on the table so you cannot get duplicate rows.
In larger systems, there are often lots of values like this, and often the solution is a table like: configuration( configId, configValue ).
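A minimal sketch of the configuration(configId, configValue) idea, using Python's built-in sqlite3 module. Everything beyond those two column names is an assumption made for the example: the table is called orders because ORDER is a reserved word, the price is stored in cents, and the helpers set_sticker_prize and place_order are hypothetical.

```python
import sqlite3

def open_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE configuration (
            configId    TEXT PRIMARY KEY,   -- primary key: no duplicate settings
            configValue TEXT NOT NULL
        );
        CREATE TABLE orders (
            id            INTEGER PRIMARY KEY,
            user_id       INTEGER NOT NULL,
            no_of_sticker INTEGER NOT NULL,
            sticker_prize INTEGER NOT NULL  -- cents, copied at order time
        );
        INSERT INTO configuration VALUES ('sticker_prize', '1000');
    """)
    return conn

def set_sticker_prize(conn, cents):
    """Admin edit: replace the single configuration row."""
    conn.execute(
        "INSERT OR REPLACE INTO configuration VALUES ('sticker_prize', ?)",
        (str(cents),),
    )

def place_order(conn, user_id, no_of_sticker):
    """Copy the current price into the order so later edits don't change it."""
    conn.execute("""
        INSERT INTO orders (user_id, no_of_sticker, sticker_prize)
        SELECT ?, ?, CAST(configValue AS INTEGER)
        FROM configuration WHERE configId = 'sticker_prize'
    """, (user_id, no_of_sticker))
```

Copying the value into the order row matches the Order table in the question, which already carries its own sticker_prize column.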

Database design, should I use varchar for Primary Key in this case?

I'm building a website where users will be able to create accounts, and every account will have its own subdomain. So there could be URLs like this:
www.user1.domain.com
www.user2.domain.com
...
They will have their own pages too, like this:
www.user1.domain.com/url-1/
www.user1.domain.com/url-2/
www.user2.domain.com/url-3/
...
So I need to store account_url and page_url in the database.
I did it like this: I have users, accounts and pages tables.
This is what my tables look like:
USERS:
user_id PK
user_name
user_pass
...
ACCOUNTS:
account_id PK
user_id FK
account_url
account_name
account_type
...
PAGES:
page_id PK
user_id FK
page_url
page_name
page_content
...
Now the problem is this. I get a URL like this:
www.user1.domain.com/page-url/
The only information I can fetch from the URL is account_url and page_url; the dispatcher/router gets these two variables. account_url is the subdomain, and page_url is the segment after the domain.
Since there will be multiple users, I always need the user_id so I can update/delete the rows that belong to them. So I need to update page_content where user_id belongs to this user and page_url is the one from the URL.
But I don't have user_id. When I want to update page_content, I first need to find user_id, like this:
SELECT user_id FROM accounts WHERE account_url = something
And then when I have user_id I can update the content of a page or do any other action.
So is this a good design?
It's normalized and clean, but in every action inside the controller I need to fetch user_id first just to be able to run the real query I wanted.
Now, I could use account_url as the primary key, and have all tables relate to that primary key. Then when I get the URL I already know the primary key, since it's in the URL.
Is this a good case for using a primary key that appears in the URL, or am I doing something wrong?
I prefer to always have my primary ID keys as integers for joins. That said, there are a bunch of ways to help make your site snappy.
You could index the account_url column so lookups are more efficient.
Or you could store the user's ID in a cookie and use that value instead of querying the database each time. Granted, you would want to do some session tracking so someone can't spoof someone else.
One presumes the user will be in control of the name of the subdomain, so embedding the user ID in the subdomain name probably wouldn't be workable; otherwise that is also an option.
You could keep the user ID and account_url in a separate table and cache that table so you don't hit the database for the vast majority of lookups.
My recommendation would be to keep the integer primary key, index the account_url column, and set a page load target time, say completing all database access and page rendering in under 1.5 seconds. When your site starts to respond slower than your threshold, you can analyze it to see where the actual problems lie and address them then.
In general, leave the database normalized as much as possible. If and when you can provably show (using metrics and actual measurements) that you need to denormalize for performance reasons, then think about doing that.
In this case, if you have a m-1 relationship between a domain and a user's account, you can effectively treat the domain as a user ID; you just have to join things in the right way. (and by m-1, I mean a single domain can only be "owned" by 1 user).
The key thing is that you don't need to get the user_id because you can get to it by joining the ACCOUNTS table as needed since it ties the domain to the user_id.
Lastly, to your question about using the domain as the primary key: you can do this, since a domain is required to be unique, but a surrogate primary key costs minimal overhead and gives you much more flexibility.
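To illustrate the point about not fetching user_id first, here is a minimal sketch using Python's built-in sqlite3 module. The tables follow the question; the UNIQUE constraints, the helper update_page, and the sample rows are assumptions made for the example.

```python
import sqlite3

def open_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users (user_id INTEGER PRIMARY KEY, user_name TEXT NOT NULL);
        CREATE TABLE accounts (
            account_id  INTEGER PRIMARY KEY,          -- surrogate integer key
            user_id     INTEGER NOT NULL REFERENCES users(user_id),
            account_url TEXT NOT NULL UNIQUE          -- UNIQUE also creates an index
        );
        CREATE TABLE pages (
            page_id      INTEGER PRIMARY KEY,
            user_id      INTEGER NOT NULL REFERENCES users(user_id),
            page_url     TEXT NOT NULL,
            page_content TEXT,
            UNIQUE (user_id, page_url)
        );
    """)
    return conn

def update_page(conn, account_url, page_url, content):
    """Resolve the subdomain to user_id inside the UPDATE itself."""
    cur = conn.execute("""
        UPDATE pages SET page_content = ?
        WHERE page_url = ?
          AND user_id = (SELECT user_id FROM accounts WHERE account_url = ?)
    """, (content, page_url, account_url))
    return cur.rowcount  # number of rows changed

conn = open_db()
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "u1"), (2, "u2")])
conn.executemany("INSERT INTO accounts VALUES (?, ?, ?)",
                 [(1, 1, "user1"), (2, 2, "user2")])
conn.executemany("INSERT INTO pages VALUES (?, ?, ?, ?)",
                 [(1, 1, "about", "old"), (2, 2, "about", "old")])
```

The controller passes only the two values it gets from the router, and the database does the user_id lookup as part of the same statement.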
You have two totally separate issues. Mapping subdomains and pages to a user is the easier of the two. The more difficult issue is "state". You need to create a state database (or similar module) to keep track of which user is currently logged in and whether they are still logged in when an update is received.
JZ touched on this in his comment. Don't confuse these two issues; they are separate and should be treated as such.

how are viewing permissions usually implemented in a relational database?

What's the standard relational database idiom for setting permissions for items?
Answers should be general; however, they should be applicable to the example below. Anything flies: adding columns, adding another table, whatever, as long as it works well.
Application / Example
Assume the Twitter database is extremely simple: we have one User table, which contains a login and user id; we have a Tweet table, which contains a tweet id, tweet text, and creator id; and we have a Follower table, which contains the id of the person being followed and the follower.
Now, assume Twitter wants to enable advanced privacy settings (viewing permissions), so that users can pick exactly which followers can view tweets. The settings can be:
Everyone on Twitter
Only current followers (which would of course have to be approved by the user, this doesn't really matter though) EDIT: Current as in, I get a new follower, he sees it; I remove a follower, he stops seeing it.
Specific followers (e.g., user id 5, 10, 234, and 1)
Only the owner
Under these circumstances, what's the best way to represent viewing permissions? The priorities, in order, are speed of lookup (you want to be able to figure out what tweets to display to a user quickly), speed of creation (you don't want to take forever to post a tweet), and efficient use of space (every time I post a tweet to everyone on my followers' list, I shouldn't have to add a row for each and every follower I have to some table.)
This looks like a typical many-to-many relationship. I don't see any restrictions in what you describe that would allow space savings with respect to the typical relational DB idiom for those, i.e., a table with two columns (both foreign keys, one into users and one into tweets). Since the current followers can and do change all the time, posting a tweet to all the followers that are current at the instant of posting (I assume that's what you mean?) does mean adding that many (extremely short) rows to that relationship table. The alternative, keeping a timestamped history of follower sets so you can reconstruct who was a follower at any given tweet-posting time, appears definitely worse in time and not substantially better in space.
If, on the other hand, you want to check followers at the time of viewing (rather than at the time of posting), then you could make a special userid artificially meaning "all followers of the current user" (just like you'll have one meaning "all users on Twitter"); the needed SQL to make the lookup fast, in that case, looks hairy but feasible (a UNION or OR with "all tweets for which I'm a follower of the author and the tweet is readable by [the artificial userid representing] all followers"). I'm not getting deep into that maze of SQL until and unless you confirm that it is this peculiar meaning that you have in mind (rather than the simple one which seems more natural to me but doesn't allow any space savings on the relationship table for the action of "post tweet to all followers").
Edit: the OP has clarified they mean the approach I mention in the second paragraph.
Then, assume userid is the primary key of the Users table; the Tweets table has a primary key tweetid and a foreign key author for the userid of each tweet's author; the Followers table is a typical many-to-many relationship table with two columns (both foreign keys into Users), follower and followee; and the Canread table is a not-so-typical many-to-many relationship table, still with two columns: the foreign key into Users is column reader, and the foreign key into Tweets is column tweet (phew ;-). Two special users, #everybody and #allfollowers, are defined with the above meanings, so that posting to everybody, to all followers, or to "just myself" each adds only one row to Canread; only selective posting to a specific list of N people adds N rows.
So the SQL for the set of tweet IDs a user #me can read is, I think, something like:
SELECT Tweets.tweetid
FROM Tweets
JOIN Canread ON(Tweets.tweetid=Canread.tweet)
WHERE Canread.reader IN (#me, #everybody)
UNION
SELECT Tweets.tweetid
FROM Tweets
JOIN Canread ON(Tweets.tweetid=Canread.tweet)
JOIN Followers ON(Tweets.author=Followers.followee)
WHERE Canread.reader=#allfollowers
AND Followers.follower=#me
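To try the query above against real rows, here is a small sketch using Python's built-in sqlite3 module. The negative sentinel ids standing in for #everybody and #allfollowers, the helper readable_tweets, and the sample data are assumptions made for the example; the SQL is the same UNION with parameters in place of #me.

```python
import sqlite3

EVERYBODY = -1      # assumed id for the special user #everybody
ALL_FOLLOWERS = -2  # assumed id for the special user #allfollowers

def open_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Tweets (tweetid INTEGER PRIMARY KEY, author INTEGER NOT NULL);
        CREATE TABLE Followers (follower INTEGER NOT NULL, followee INTEGER NOT NULL,
                                PRIMARY KEY (follower, followee));
        CREATE TABLE Canread (reader INTEGER NOT NULL, tweet INTEGER NOT NULL,
                              PRIMARY KEY (reader, tweet));
    """)
    return conn

def readable_tweets(conn, me):
    """Tweet ids that user `me` may read, per the UNION query above."""
    rows = conn.execute("""
        SELECT Tweets.tweetid FROM Tweets
        JOIN Canread ON Tweets.tweetid = Canread.tweet
        WHERE Canread.reader IN (?, ?)
        UNION
        SELECT Tweets.tweetid FROM Tweets
        JOIN Canread ON Tweets.tweetid = Canread.tweet
        JOIN Followers ON Tweets.author = Followers.followee
        WHERE Canread.reader = ? AND Followers.follower = ?
        ORDER BY 1
    """, (me, EVERYBODY, ALL_FOLLOWERS, me)).fetchall()
    return [r[0] for r in rows]

conn = open_db()
# User 1 writes four tweets; user 2 follows user 1; user 3 does not.
conn.executemany("INSERT INTO Tweets VALUES (?, 1)", [(10,), (11,), (12,), (13,)])
conn.execute("INSERT INTO Followers VALUES (2, 1)")
conn.executemany("INSERT INTO Canread VALUES (?, ?)", [
    (EVERYBODY, 10),      # public
    (ALL_FOLLOWERS, 11),  # current followers only
    (3, 12),              # one specific user
    (1, 13),              # only the owner
])
```

Each of the one-row cases (public, all followers, owner only) adds a single Canread row, which is the space saving the answer is after.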

How to model this one-to-one relation?

I have several entities which represent different types of users who need to be able to log in to a particular system. Additionally, they have different types of information associated with them.
For example: a "general user", which has an e-mail address, and an "admin user", which has a workstation number (note that this is a hypothetical case). Both entities also share common properties like first name, surname, address and telephone number. Finally, they naturally need to have a (unique) user name and a password to log in.
In the application, the user just has to fill in his user name and password, and the functionality of the application changes slightly according to the type of the user. You can imagine that the user name needs to be unique for this to work.
How should I model this effectively?
I can't just create two tables, because then I can't enforce a unique constraint on the user name.
I also can't put them all in just one table, because they have different types of specific information associated with them.
I think I might need 3 separate tables, one for "users" (with user name and password), one for the "general users" and another one for the "admin users", but how would the relations between these work? Or is there another solution?
(By the way, the target DBMS is MySQL, so I don't think generalization is supported in the database system itself).
Your three-table approach seems OK.
In the users table, have only ID, username, password, usertype.
In the general users table, have ID, UserID (from the users table), and the other fields.
Same thing for admin users.
The usertype field tells you which table to search for additional info:
if (usertype == 'admin')
    select * from admins where userid = :id;
else
    select * from general where userid = :id;
Two tables. USERS with user names, first, last, etc. ROLES with roles, and a link back to the user name (or user id or whatever). Put a unique constraint on the user name. Put workstation nbr, email, phone, whatever else you need, in the user table. Put 2 columns in the ROLES table -- USERID and ROLE.
You should decide how much specific information is being stored (or likely to be stored in the future) and make the decision based on that. If there are only a handful of fields for each user type then using a single table is alright.
USERS table (name, type, email, password, genfield1, genfield2, adminfield1, adminfield2)
Make sure to include the type field (don't assume that a user is of a given type just because some of the fields particular to that type are filled in). Any queries will just need to include the "AND usertype = " clause.
If there are many fields or rules associated with each type then your idea of three tables is the best.
USERS table (ID, type, name, password)
GENUSERS (ID, genfield1, genfield2)
ADMINUSERS(ID, adminfield1, adminfield2)
The constraints between IDs on the table are all you need (and the main USERS table keeps the IDs unique). Works very well in most situations but reports that include both types of users with their specific fields have to be done in two parts (unioned SQL or subqueries or multiple left joins).
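The three-table layout above could be sketched like this with Python's built-in sqlite3 module. The lowercase table names, the email and workstation columns (taken from the question's example), the helper report, and the sample rows are assumptions made for the example; the report uses the "multiple left joins" option mentioned above.

```python
import sqlite3

def open_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
    conn.executescript("""
        CREATE TABLE users (
            id       INTEGER PRIMARY KEY,
            type     TEXT NOT NULL CHECK (type IN ('general', 'admin')),
            name     TEXT NOT NULL UNIQUE,    -- one place to enforce uniqueness
            password TEXT NOT NULL            -- store a hash, not the password
        );
        CREATE TABLE genusers (
            id    INTEGER PRIMARY KEY REFERENCES users(id),
            email TEXT
        );
        CREATE TABLE adminusers (
            id          INTEGER PRIMARY KEY REFERENCES users(id),
            workstation TEXT
        );
    """)
    return conn

def report(conn):
    """All users with their type-specific fields, via two LEFT JOINs."""
    return conn.execute("""
        SELECT u.name, u.type, g.email, a.workstation
        FROM users u
        LEFT JOIN genusers g   ON g.id = u.id
        LEFT JOIN adminusers a ON a.id = u.id
        ORDER BY u.id
    """).fetchall()

conn = open_db()
# '<hash>' is a placeholder for a real password hash.
conn.execute("INSERT INTO users VALUES (1, 'general', 'ann', '<hash>')")
conn.execute("INSERT INTO users VALUES (2, 'admin', 'bob', '<hash>')")
conn.execute("INSERT INTO genusers VALUES (1, 'ann@example.com')")
conn.execute("INSERT INTO adminusers VALUES (2, 'WS-7')")
```

Because each subtype table reuses the users id as its own primary key, a user can have at most one row per subtype table.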
You can solve it with one 'general' users table containing the information that's available for all users and one table for every specific user type. In your example you will then need 3 tables.
Users: This table holds only information shared between all user types, i.e. UserId, Name, Address, etc.
GeneralUsers: This table 'extends' the Users table by providing a foreign key UserId that references the Users table. In addition, information specific to general users is held here, e.g. EmailAddress.
AdminUsers: As with GeneralUsers, this table also 'extends' the Users table by providing a foreign key UserId referencing the Users table. In addition, information specific to admin users is held here, e.g. WorkstationId.
With this approach you can add additional 'specializations' if the need arises by simply adding new tables that 'extend' the Users table using a foreign key reference. You can also create several levels of specialization. If, for example, admin users are general users as well as admin users, then AdminUsers could 'extend' GeneralUsers instead of Users simply by using a foreign key to GeneralUsers instead of Users.
When you need to retrieve data from this model you need to know which type of user to query. If, for example, you need to query a GeneralUser you will need something similar to:
SELECT * FROM GeneralUsers
LEFT JOIN Users ON GeneralUsers.UserId = Users.UserId
Or if querying an admin user
SELECT * FROM AdminUsers
LEFT JOIN Users ON AdminUsers.UserId = Users.UserId
If you have additional levels of specialization, for example by having admin users also being general users you just join your way back.
SELECT * FROM AdminUsers
LEFT JOIN GeneralUsers ON AdminUsers.UserId = GeneralUsers.UserId
LEFT JOIN Users ON GeneralUsers.UserId = Users.UserId
I most definitely would not do a model where you have separate tables as in GeneralUser, AdminUser and ReadOnlyUser.
In database design, a good rule of thumb is "Down beats across". Instead of multiple tables (one for each type), I would create a SystemUsers table, and a Roles table and define a join table to put SystemUsers in Roles. Also, I would define individual roles.
This way, a user can be added to and removed from multiple roles.
A role can have multiple permissions, which can be modified at any time.
Joins to other places do not need a GeneralUserId, AdminUserId and ReadOnlyUserId column - just a SystemUserId column.
This is very similar to the ASP.Net role based security model.
(Image: role-based security diagram, http://img52.imageshack.us/img52/2861/rolebasedsecurity.jpg)
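The role-based layout described in this answer could be sketched as follows, using Python's built-in sqlite3 module. The names SystemUsers and Roles come from the answer; the join table name UsersInRoles, the column names, the helper roles_of, and the sample rows are assumptions made for the example. Permissions per role are left out to keep the sketch short.

```python
import sqlite3

def open_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE SystemUsers (
            SystemUserId INTEGER PRIMARY KEY,
            UserName     TEXT NOT NULL UNIQUE
        );
        CREATE TABLE Roles (
            RoleId   INTEGER PRIMARY KEY,
            RoleName TEXT NOT NULL UNIQUE
        );
        -- Join table: one row per (user, role) membership.
        CREATE TABLE UsersInRoles (
            SystemUserId INTEGER NOT NULL REFERENCES SystemUsers(SystemUserId),
            RoleId       INTEGER NOT NULL REFERENCES Roles(RoleId),
            PRIMARY KEY (SystemUserId, RoleId)
        );
    """)
    return conn

def roles_of(conn, user_name):
    """Role names for one user, resolved through the join table."""
    rows = conn.execute("""
        SELECT r.RoleName
        FROM SystemUsers u
        JOIN UsersInRoles ur ON ur.SystemUserId = u.SystemUserId
        JOIN Roles r         ON r.RoleId = ur.RoleId
        WHERE u.UserName = ?
        ORDER BY r.RoleName
    """, (user_name,)).fetchall()
    return [r[0] for r in rows]

conn = open_db()
conn.execute("INSERT INTO SystemUsers VALUES (1, 'ann')")
conn.executemany("INSERT INTO Roles VALUES (?, ?)", [(1, "Admin"), (2, "General")])
conn.executemany("INSERT INTO UsersInRoles VALUES (?, ?)", [(1, 1), (1, 2)])
```

Adding or removing a role for a user is then a single INSERT or DELETE on the join table, and other tables only ever need a SystemUserId column.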

Resources