How to design database request/approval tables?

I have the following kind of request tables:
oversea_study_request
course_request
leave_request
In each of these request functions, the approving officer can post multiple remarks and can also approve or reject the request. The system must be able to capture the history of the actions taken.
What is the best way to design it?
Should I create a common table to store the approval information and remarks?
Or should I store the approval information and remarks in each request table instead?
Can someone advise on the pros and cons of each approach?

Similar to the fields organisation question here: How to better organise database to account for changing status in users; and my answer there:
If all the requests have the same fields, field types and info, including mandatory (NOT NULL) and optional, etc., then it's better to put all the requests into one requests table. Designate one field to be request_type, with an int for efficiency and SQL convenience, or an ENUM type. Example:
overseas study = 1
course = 2
leave = 3
Similarly, do that for the approvals table also...if the process is the same for each type then store those together. Store the request id (requests.id). Since you have multiple approval-comments and approval+rejection possible, store these in approvals.action and approvals.action_date. If "actions" are independent of "approve/reject" - that is, if you can post a comment without approving/rejecting OR if you can approve/reject without a comment - then store the actions and comments separately, and include the request.id.
So you have:
Table1: requests
id INT
request_type INT (or ENUM)
request_date DATETIME
...
Table2: approvals (or 'actions', to be general)
id
request_id # (refers to requests.id above)
action_type # (approve or reject)
action_date
comment
If comments and approvals are NOT necessarily together, then:
Table2: actions
id, request_id, action_type, action_date
Table3: comments
id, request_id, comment, comment_date
And of course, add the user_id, username, etc. tables/fields. (The id in each table is its own Primary Key.)
Each request + actions + comments can be found with a SELECT and LEFT JOINs
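A minimal sketch of the three-table variant above, wrapped in Python's sqlite3 module only so that it runs as-is. The column types, the sample rows and the text values used for action_type are assumptions for illustration. Note that joining both child tables to requests in a single query would pair every action with every comment, so the history here is built with a UNION ALL, and the LEFT JOIN is used for the per-request status.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE requests (
    id           INTEGER PRIMARY KEY,
    request_type INTEGER NOT NULL,   -- 1 = overseas study, 2 = course, 3 = leave
    request_date TEXT    NOT NULL
);
CREATE TABLE actions (
    id          INTEGER PRIMARY KEY,
    request_id  INTEGER NOT NULL REFERENCES requests(id),
    action_type TEXT    NOT NULL CHECK (action_type IN ('approve', 'reject')),
    action_date TEXT    NOT NULL
);
CREATE TABLE comments (
    id           INTEGER PRIMARY KEY,
    request_id   INTEGER NOT NULL REFERENCES requests(id),
    comment      TEXT    NOT NULL,
    comment_date TEXT    NOT NULL
);
INSERT INTO requests VALUES (1, 3, '2024-01-10'), (2, 2, '2024-01-11');
INSERT INTO comments VALUES (1, 1, 'Please attach the roster', '2024-01-11'),
                            (2, 1, 'Roster received',          '2024-01-12');
INSERT INTO actions  VALUES (1, 1, 'approve', '2024-01-13');
""")

# LEFT JOIN keeps requests that have no action yet (request 2 yields None).
status = conn.execute("""
    SELECT r.id, a.action_type
    FROM requests r
    LEFT JOIN actions a ON a.request_id = r.id
    ORDER BY r.id
""").fetchall()

# Full history of one request: comments and actions merged into one timeline.
history = conn.execute("""
    SELECT comment_date AS event_date, 'comment' AS kind, comment AS detail
    FROM comments WHERE request_id = ?
    UNION ALL
    SELECT action_date, 'action', action_type
    FROM actions WHERE request_id = ?
    ORDER BY event_date
""", (1, 1)).fetchall()

print(status)
for row in history:
    print(row)
```

Because actions and comments are only ever inserted, never updated, the two child tables double as the audit trail the question asks for.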
Btw, it's "overseas" study, not "oversea" study - it's not a course in an airplane ;-)

Related

Either of 2 columns is always redundant -- is there a better solution?

Say, I want to create a form for a feedback. If a registered user submits a feedback, his email address is used automatically because he's authenticated. If an anonymous user does that, he has to enter his email manually. My table would look like this:
feedbacks(id, user_id, email, body)
As you can see, it has a redundant column: either user_id or email. And for those who aren't familiar with the database structure it'll be confusing: why both email and user_id? Can they both be null? Or both have a value at the same time? In reality, exactly one of them must have a value, which isn't possible to enforce at the database level using constraints. Also, what if I insert values into both columns by mistake?
Thus, I wonder, is there any way to change the structure so that it's cleaner and the issue described above is resolved? Using a trigger isn't what I'm looking for.
In other words, the issue is "either of 2 columns is always redundant".

If you had several mutually exclusive columns, then you might have a good case for something called entity sub-typing. As it is, there is no good design reason for adding all of the extra overhead of this design pattern.
These are the basic options that you have:
Two mutually exclusive columns in one table - This is your current design. This is a good design because it lets you define a proper foreign key constraint on your user_id. You mention that it may be confusing for people that don't know the database well because the same kind of information might appear in one or the other place in the table. However, it's important to remember that even though both columns contain a string that happens to be in the form of an email address, to your system these things are semantically distinct. One is a foreign key to your user table. The other is a means of contacting (or identifying?) a non-member. You could avoid this apparent confusion in one of two ways: (a) give a more descriptive name to your email column, such as non_member_email or (b) create a view that coalesces user_id and email into a single column for displaying this information to people who would otherwise be confused.
Entity Subtyping - This approach has you create separate tables for logically separate groups of predicates (columns). These are joined together by a supertype table which gives a common primary key for all logical subtypes, as well as holding all other common predicates. You can google around to learn more about this design pattern. As I've already mentioned, this is overkill for your case because you only have one pair of mutually exclusive columns. If you think it's confusing to have this then having three tables (supertype, member subtype, non-member subtype) will really be confusing.
Column Overloading - This approach would have you combine both columns into a single one. This is feasible because you only need room in your table for one email address at a time. This is a terrible idea because it prevents you from creating a declarative referential constraint on user_id which is a very important tool for maintaining your data's referential integrity. It also conflates two semantically different pieces of information, which violates good database design principles.
The best choice is number 1. Don't worry about having two mutually exclusive columns. If you think more descriptive column names can't "comment" your way around the confusion this might cause, then use a view to hide the "complexity" of storing two things that look similar in two separate columns.
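Here is a rough sketch of option 1 with the view, run through Python's sqlite3 module so it can be tried directly. The users table with an email column, the non_member_email column name and the view name feedback_contacts are assumptions made for the example; the view resolves user_id to the member's stored address so that readers see a single contact column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    id    INTEGER PRIMARY KEY,
    email TEXT NOT NULL
);
CREATE TABLE feedbacks (
    id               INTEGER PRIMARY KEY,
    user_id          INTEGER REFERENCES users(id),  -- set for members
    non_member_email TEXT,                          -- set for anonymous users
    body             TEXT NOT NULL
);
-- The view hides the two-column storage behind one contact_email column.
CREATE VIEW feedback_contacts AS
SELECT f.id,
       COALESCE(u.email, f.non_member_email) AS contact_email,
       f.body
FROM feedbacks f
LEFT JOIN users u ON u.id = f.user_id;

INSERT INTO users VALUES (1, 'member@example.com');
INSERT INTO feedbacks VALUES (1, 1,    NULL,                'Nice site'),
                             (2, NULL, 'guest@example.com', 'Found a bug');
""")

contacts = conn.execute(
    "SELECT id, contact_email FROM feedback_contacts ORDER BY id"
).fetchall()
print(contacts)
```

The base table keeps its real foreign key on user_id, and anyone who finds the two columns confusing can simply query the view.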

If one must be exclusively filled:
create table feedbacks (
id integer,
user_id text,
email text,
body text,
check ((user_id is null)::int + (email is null)::int = 1)
);
The cast from boolean to integer yields either 1 or 0, so the sum must be 1.
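The ::int cast is PostgreSQL syntax. As a quick way to try the same idea without a server, here is a sketch using Python's sqlite3 module, where IS NULL already evaluates to 0 or 1 and no cast is needed. The try_insert helper and the sample values are made up for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE feedbacks (
    id      INTEGER,
    user_id TEXT,
    email   TEXT,
    body    TEXT,
    -- IS NULL evaluates to 0 or 1 in SQLite, so the sum counts the NULLs
    CHECK ((user_id IS NULL) + (email IS NULL) = 1)
)
""")

def try_insert(user_id, email):
    """Return True if the row is accepted, False if the CHECK rejects it."""
    try:
        conn.execute(
            "INSERT INTO feedbacks VALUES (1, ?, ?, 'text')", (user_id, email)
        )
        return True
    except sqlite3.IntegrityError:
        return False

results = [
    try_insert("u1", None),             # only user_id
    try_insert(None, "a@example.com"),  # only email
    try_insert(None, None),             # neither
    try_insert("u1", "a@example.com"),  # both
]
print(results)
```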

Remove the email field. If the user is registered, enter their user_id as you do now. If the user is not registered, search the user table for an anonymous entry with that email address. If exists, use that user_id. Otherwise, create an entry in the user table named 'Anonymous', storing the address and use the newly created user_id. There are two advantages:
You don't need mutually exclusive fields. As you have already noticed, these can be the cause of a lot of confusion and extra work to keep the data clean.
If an anonymous poster later registers, the existing "anonymous" user entry can be updated, thus preserving the user_id value and preserving all feedback (and any other activity you track for anonymous users) entered before registering. That is, if a user anonymously enters a few feedbacks then registers, the previous feedbacks remain associated with the now named user.
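A small sketch of this find-or-create approach, again through Python's sqlite3 module. The table layout, the UNIQUE constraint on email and the helper names user_id_for_email and add_feedback are assumptions for illustration; a real user table would also need some way to tell anonymous rows from registered ones.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    id    INTEGER PRIMARY KEY,
    name  TEXT NOT NULL,
    email TEXT NOT NULL UNIQUE
);
CREATE TABLE feedbacks (
    id      INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(id),
    body    TEXT NOT NULL
);
""")

def user_id_for_email(email):
    """Return the user id for this address, creating an 'Anonymous' row if needed."""
    row = conn.execute("SELECT id FROM users WHERE email = ?", (email,)).fetchone()
    if row:
        return row[0]
    cur = conn.execute(
        "INSERT INTO users (name, email) VALUES ('Anonymous', ?)", (email,)
    )
    return cur.lastrowid

def add_feedback(email, body):
    conn.execute(
        "INSERT INTO feedbacks (user_id, body) VALUES (?, ?)",
        (user_id_for_email(email), body),
    )

add_feedback("guest@example.com", "First note")
add_feedback("guest@example.com", "Second note")

# Registering later updates the same row, so earlier feedback stays attached.
conn.execute("UPDATE users SET name = 'Dana' WHERE email = 'guest@example.com'")

rows = conn.execute("""
    SELECT u.name, f.body
    FROM feedbacks f JOIN users u ON u.id = f.user_id
    ORDER BY f.id
""").fetchall()
print(rows)
```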

I might misunderstand the question, but why do you say it is impossible to do with constraints?
t=# CREATE TABLE feedbacks (
t(# id integer,
t(# user_id text CHECK (case when email is null then user_id is distinct from null else user_id is null end),
t(# email text CHECK (case when user_id is null then email is distinct from null else email is null end),
t(# body text
t(# );
CREATE TABLE
t=# insert into feedbacks select 1,null,null,'t';
ERROR: new row for relation "feedbacks" violates check constraint "feedbacks_check1"
DETAIL: Failing row contains (1, null, null, t).
t=# insert into feedbacks select 1,'t','t','t';
ERROR: new row for relation "feedbacks" violates check constraint "feedbacks_check1"
DETAIL: Failing row contains (1, t, t, t).
t=# insert into feedbacks select 1,'t',null,'t';
INSERT 0 1
t=# insert into feedbacks select 1,null,'t','t';
INSERT 0 1
t=# select * from feedbacks ;
 id | user_id | email | body
----+---------+-------+------
  1 | t       |       | t
  1 |         | t     | t
(2 rows)

Database schema about social networking like Facebook

Like Facebook, I have posts, comments and user profiles.
I THINK THAT
Posts and comments do not need the details of user
ONLY user profiles need the details
So I separate the user information into main and detail
Here is the schema.
Question
Is it necessary to separate user data into main and details?
WHY not or WHY yes?
Thanks for replying!

I would recommend using separate tables because you may not need all that information at one time. You could do it either way; the question to ask is whether you need all of the data at once.
Table 1 (User Auth)
This table would hold only information for log-in and have three columns (user_name, hashed_password, UID)
So your query would select UID where user_name and hashed_password matched. I would also recommend never storing a readable password in a database table because that can become a security issue.
Table 2 (Basic Information)
This table would hold the least amount of information that you would get at signup to make a basic profile. The fields would consist of UID, name, DOB, zip, link_to_profile_photo, email and whatever basic information you would like. email is kind of special because if you require the user_name to be an email address there is no reason to have it twice.
Table 3 (Extended Information)
This table would hold any optional information that the user could enter like phone_number, bio or address assigned by UID.
Then after that you can add as many other tables as you would like. One for posts, one for comments, etc.
An example of a Post table would be like:
post_id, UID, the_post, date_of_post, likes, etc.
Then for Comments
comment_id, for_post_id, UID, the_comment, date_of_comment, likes, etc.
Breaking it down into small sections would be more efficient in the long run.
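To make the split concrete, here is a rough sketch of the tables described above, using Python's sqlite3 module so it can be run directly. The table names user_auth, user_basic and user_extended, the column types and the sample data are assumptions, and the '<hash>' value is a placeholder where a real password hash would go.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user_auth (
    uid             INTEGER PRIMARY KEY,
    user_name       TEXT NOT NULL UNIQUE,
    hashed_password TEXT NOT NULL
);
CREATE TABLE user_basic (
    uid  INTEGER PRIMARY KEY REFERENCES user_auth(uid),
    name TEXT NOT NULL,
    zip  TEXT
);
CREATE TABLE user_extended (
    uid          INTEGER PRIMARY KEY REFERENCES user_auth(uid),
    phone_number TEXT,
    bio          TEXT
);
CREATE TABLE posts (
    post_id      INTEGER PRIMARY KEY,
    uid          INTEGER NOT NULL REFERENCES user_auth(uid),
    the_post     TEXT NOT NULL,
    date_of_post TEXT NOT NULL
);
INSERT INTO user_auth     VALUES (1, 'ann@example.com', '<hash>');
INSERT INTO user_basic    VALUES (1, 'Ann', '02110');
INSERT INTO user_extended VALUES (1, '555-0100', 'Likes databases');
INSERT INTO posts         VALUES (10, 1, 'Hello world', '2024-02-01');
""")

# A post listing needs only the name, so user_extended is never touched.
feed = conn.execute("""
    SELECT p.post_id, b.name, p.the_post
    FROM posts p JOIN user_basic b ON b.uid = p.uid
    ORDER BY p.date_of_post
""").fetchall()

# The profile page is the one place that pulls in the optional details.
profile = conn.execute("""
    SELECT b.name, e.bio
    FROM user_basic b LEFT JOIN user_extended e ON e.uid = b.uid
    WHERE b.uid = ?
""", (1,)).fetchone()

print(feed)
print(profile)
```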

Database performance is often bounded by disk seek time. In a large table, locating and reading an entry can take longer. Since posts and comments do not need the user details, only the main user info, reads get cheaper when they touch only the narrow main table, and joins against user_main_info are faster too. Keep the small portion of data you read most frequently in one table and the remaining details in another. If you always need to read all the user information together, though, the split gives you no benefit.

1) Add a user information table.
ex: create table fb_users
(intuserid int primary key,
username varchar(50),
phoneno varchar(20), -- text rather than int: phone numbers can overflow an int and may start with 0 or +
emailid varchar(max))
2) Sending a friend request
2.a) Create a table called fb_friends with columns for the friend requestor, the friend requested, the status between the two, and an active flag.
ex: create table fb_friends
(intfriendid int primary key,
intfriendrequestor int references fb_users(intuserid),
intfriendrequestedby int references fb_users(intuserid),
statusid int references fb_status(intstatusid), -- lookup table defined below; create it first
active bit)
3) Create a lookup table for the status
3.a) Create a table called fb_status with statusname, statusdesc, and an active flag.
ex: create table fb_status
(intstatusid int primary key,
statusname varchar(50),
statusdesc varchar(255),
active bit)
The status could be
pending
approval
deleted
..etc
4) Groups, likes, and comments work the same way:
create a table for each of them and link it to the user table
with a foreign key on intuserid.
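As a runnable illustration of the friend-request flow, here is a sketch using Python's sqlite3 module. It keeps the answer's column names but uses SQLite types, calls the status lookup table fb_status, and assumes intfriendrequestor is the sender and intfriendrequestedby is the recipient; the status_id and current_status helpers are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fb_users (
    intuserid INTEGER PRIMARY KEY,
    username  TEXT NOT NULL
);
CREATE TABLE fb_status (
    intstatusid INTEGER PRIMARY KEY,
    statusname  TEXT NOT NULL UNIQUE
);
CREATE TABLE fb_friends (
    intfriendid          INTEGER PRIMARY KEY,
    intfriendrequestor   INTEGER NOT NULL REFERENCES fb_users(intuserid),
    intfriendrequestedby INTEGER NOT NULL REFERENCES fb_users(intuserid),
    statusid             INTEGER NOT NULL REFERENCES fb_status(intstatusid),
    active               INTEGER NOT NULL DEFAULT 1
);
INSERT INTO fb_users  VALUES (1, 'ann'), (2, 'bob');
INSERT INTO fb_status VALUES (1, 'pending'), (2, 'approval'), (3, 'deleted');
""")

def status_id(name):
    """Look up the id of a status by name in the lookup table."""
    return conn.execute(
        "SELECT intstatusid FROM fb_status WHERE statusname = ?", (name,)
    ).fetchone()[0]

def current_status():
    """Status name of the request from user 1 to user 2."""
    return conn.execute(
        "SELECT s.statusname FROM fb_friends f"
        " JOIN fb_status s ON s.intstatusid = f.statusid"
        " WHERE f.intfriendrequestor = 1 AND f.intfriendrequestedby = 2"
    ).fetchone()[0]

# ann sends bob a request; it starts out pending.
conn.execute(
    "INSERT INTO fb_friends (intfriendrequestor, intfriendrequestedby, statusid)"
    " VALUES (?, ?, ?)",
    (1, 2, status_id("pending")),
)
before = current_status()

# bob accepts: only the status reference on the row changes.
conn.execute(
    "UPDATE fb_friends SET statusid = ?"
    " WHERE intfriendrequestor = 1 AND intfriendrequestedby = 2",
    (status_id("approval"),),
)
after = current_status()
print(before, after)
```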

Perplexing Complexity - Tricky Table Join

I've been struggling with this specific set of tables for some time. There are three tables (used to be two, but it was necessary to add a third). The tables are Request, PartInfo, and Status. Request is used by the customer via a form that enters data into the table. Status is for our service agents to keep track of progress on customer requests. PartInfo is the new table, containing common data accessed by both parties.
The trick is that with each request, there is a running log of changes to that request which are stored in the same table, and linked to the original request in that series via a self-joining key called FirstRequestID (which I'll abbreviate as FID). The same is true for the Status table. Here is my basic table structure as I have currently designed it (Note: It's not too late to change the architecture if there is a better approach):
Request      PartInfo      Status
-------      --------      ------
ID           ID            ID
FID          FID           FID
PartInfoID   PartNum       PartInfoID
ProductID    Revision      StatusID
CategoryID   Description   Comments
Now say I want to display the information on a particular request (including part info and status changes) in an ASP.NET GridView table. The "particular request" is identified by the FID.
Question:
How can I ensure that when I'm looking at either the Request history or the Status history, it's always pulling the proper information from the PartInfo (shared) table? In other words, what's the best way to link these three tables with the proper relationship without having 50 different junction tables to account for all the exceptions?

I apologize, but my first take on this schema was “this is a mess”. This data needs to be normalized. Unfortunately, there’s not enough information here to determine how to best do so. Based on your descriptions and part names, I come up with the following ideas.
The main entity is the Request.
Requests contain Products
Requests contain Categories (unless the Category is an attribute of a Product)
Products contain Parts (unless it’s Categories that contain Parts)
The implication is that a Requested Product is associated with an arbitrary number of Parts (as opposed to a standardized set of Parts for that Product).
Status is used to track changes in the state of a Request over time (and is not completely dependent upon Products, Categories, or Parts)
This suggests the following tables
REQUESTS
RequestId
DateTimeCreated
PRODUCTS
ProductId
-- Add CategoryId, if it’s a Product attribute
CATEGORIES
CategoryId
REQUESTPRODUCTS
RequestProductId
RequestId
ProductId
DateTimeAdded
-- Add StatusId if a status entry must be made every time a product is requested
-- Note extra surrogate key. RequestId + ProductId + DateTimeAdded should be the
-- natural key, unless two identical products can be requested at the same time
-- (in which case add a “Quantity” column)
REQUESTCATEGORIES
RequestId
CategoryId
DateTimeAdded
-- Surrogate key optional, as it’s not referenced by other tables
-- Drop, if categories are product attributes
PARTS
PartId
REQUESTPRODUCTPARTS
RequestProductId
PartId
-- Add StatusId if a status entry must be made every time a part is requested
STATUS
StatusId
RequestId
DateTimeAdded
Comments
There’s a lot of ways this could go. You may end up with a lot of “junction” tables, but then your data will have referential integrity and accurate SQL queries become much, much simpler to write.
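To show how queries simplify, here is a sketch of a cut-down version of that schema (only REQUESTS, PRODUCTS, REQUESTPRODUCTS and STATUS), run through Python's sqlite3 module. The column types and the sample rows are invented for the example. The product history and the status history of a request are each a plain filter on RequestId, with no self-joining key involved.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Requests (
    RequestId       INTEGER PRIMARY KEY,
    DateTimeCreated TEXT NOT NULL
);
CREATE TABLE Products (
    ProductId INTEGER PRIMARY KEY
);
CREATE TABLE RequestProducts (
    RequestProductId INTEGER PRIMARY KEY,
    RequestId        INTEGER NOT NULL REFERENCES Requests(RequestId),
    ProductId        INTEGER NOT NULL REFERENCES Products(ProductId),
    DateTimeAdded    TEXT NOT NULL
);
CREATE TABLE Status (
    StatusId      INTEGER PRIMARY KEY,
    RequestId     INTEGER NOT NULL REFERENCES Requests(RequestId),
    DateTimeAdded TEXT NOT NULL,
    Comments      TEXT
);
INSERT INTO Requests VALUES (1, '2024-03-01 09:00');
INSERT INTO Products VALUES (100), (200);
INSERT INTO RequestProducts VALUES (1, 1, 100, '2024-03-01 09:00'),
                                   (2, 1, 200, '2024-03-02 14:30');
INSERT INTO Status VALUES (1, 1, '2024-03-01 10:00', 'Received'),
                          (2, 1, '2024-03-03 08:15', 'Parts ordered');
""")

# Products attached to request 1, in the order they were added.
products = [row[0] for row in conn.execute(
    "SELECT ProductId FROM RequestProducts"
    " WHERE RequestId = ? ORDER BY DateTimeAdded", (1,)
)]

# Status log of the same request, oldest entry first.
status_log = [row[0] for row in conn.execute(
    "SELECT Comments FROM Status"
    " WHERE RequestId = ? ORDER BY DateTimeAdded", (1,)
)]

print(products)
print(status_log)
```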

How are viewing permissions usually implemented in a relational database?

What's the standard relational database idiom for setting permissions for items?
Answers should be general; however, they should be applicable to the example below. Anything flies: adding columns, adding another table, or whatever else, as long as it works well.
Application / Example
Assume the Twitter database is extremely simple: we have one User table, which contains a login and user id; we have a Tweet table, which contains a tweet id, tweet text, and creator id; and we have a Follower table, which contains the id of the person being followed and the follower.
Now, assume Twitter wants to enable advanced privacy settings (viewing permissions), so that users can pick exactly which followers can view tweets. The settings can be:
Everyone on Twitter
Only current followers (which would of course have to be approved by the user, this doesn't really matter though) EDIT: Current as in, I get a new follower, he sees it; I remove a follower, he stops seeing it.
Specific followers (e.g., user id 5, 10, 234, and 1)
Only the owner
Under these circumstances, what's the best way to represent viewing permissions? The priorities, in order, are speed of lookup (you want to be able to figure out what tweets to display to a user quickly), speed of creation (you don't want to take forever to post a tweet), and efficient use of space (every time I post a tweet to everyone on my followers' list, I shouldn't have to add a row for each and every follower I have to some table.)

Looks like a typical many-to-many relationship -- I don't see any restrictions on what you desire that would allow space savings wrt the typical relational DB idiom for those, i.e., a table with two columns (both foreign keys, one into users and one into tweets)... since the current followers can and do change all the time, posting a tweet to all the followers that are current at the instant of posting (I assume that's what you mean?) does mean adding that many (extremely short) rows to that relationship table (the alternative of keeping a timestamped history of follower sets so you can reconstruct who was a follower at any given tweet-posting time appears definitely worse in time and not substantially better in space).
If, on the other hand, you want to check followers at the time of viewing (rather than at the time of posting), then you could make a special userid artificially meaning "all followers of the current user" (just like you'll have one meaning "all users on Twitter"); the needed SQL to make the lookup fast, in that case, looks hairy but feasible (a UNION or OR with "all tweets for which I'm a follower of the author and the tweet is readable by [the artificial userid representing] all followers"). I'm not getting deep into that maze of SQL until and unless you confirm that it is this peculiar meaning that you have in mind (rather than the simple one which seems more natural to me but doesn't allow any space savings on the relationship table for the action of "post tweet to all followers").
Edit: the OP has clarified they mean the approach I mention in the second paragraph.
Then, assume userid is the primary key of the Users table, the Tweets table has a primary key tweetid and a foreign key author for the userid of each tweet's author, the Followers table is a typical many-to-many relationship table with the two columns (both foreign keys into Users) follower and followee, and the Canread table a not-so-typical many-to-many relationship table, still with two columns -- foreign key into Users is column reader, foreign key into Tweets is column tweet (phew;-). Two special users #everybody and #allfollowers are defined with the above meanings (so that posting to everybody, all followers, or "just myself", all add only one row to Canread -- only selective posting to a specific list of N people adds N rows).
So the SQL for the set of tweet IDs a user #me can read is, I think, something like:
SELECT Tweets.tweetid
FROM Tweets
JOIN Canread ON(Tweets.tweetid=Canread.tweet)
WHERE Canread.reader IN (#me, #everybody)
UNION
SELECT Tweets.tweetid
FROM Tweets
JOIN Canread ON(Tweets.tweetid=Canread.tweet)
JOIN Followers ON(Tweets.author=Followers.followee)
WHERE Canread.reader=#allfollowers
AND Followers.follower=#me
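For anyone who wants to try the query, here is a sketch that runs it through Python's sqlite3 module with the #me, #everybody and #allfollowers placeholders turned into bound parameters. The negative ids chosen for the two special users, the sample rows and the readable_tweets helper are assumptions for the demonstration.

```python
import sqlite3

EVERYBODY, ALL_FOLLOWERS = -1, -2   # hypothetical ids for the two special users

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Users     (userid INTEGER PRIMARY KEY);
CREATE TABLE Tweets    (tweetid INTEGER PRIMARY KEY,
                        author  INTEGER NOT NULL REFERENCES Users(userid));
CREATE TABLE Followers (follower INTEGER NOT NULL REFERENCES Users(userid),
                        followee INTEGER NOT NULL REFERENCES Users(userid),
                        PRIMARY KEY (follower, followee));
CREATE TABLE Canread   (reader INTEGER NOT NULL REFERENCES Users(userid),
                        tweet  INTEGER NOT NULL REFERENCES Tweets(tweetid),
                        PRIMARY KEY (reader, tweet));
INSERT INTO Users VALUES (-1), (-2), (1), (2), (3);
INSERT INTO Followers VALUES (2, 1);            -- user 2 follows user 1
INSERT INTO Tweets VALUES (10, 1), (11, 1), (12, 1), (13, 1);
INSERT INTO Canread VALUES (-1, 10),            -- tweet 10: everybody
                           (-2, 11),            -- tweet 11: all followers
                           (3, 12),             -- tweet 12: user 3 only
                           (1, 13);             -- tweet 13: owner only
""")

def readable_tweets(me):
    """Return the sorted tweet ids that user `me` is allowed to read."""
    rows = conn.execute("""
        SELECT Tweets.tweetid
        FROM Tweets
        JOIN Canread ON Tweets.tweetid = Canread.tweet
        WHERE Canread.reader IN (:me, :everybody)
        UNION
        SELECT Tweets.tweetid
        FROM Tweets
        JOIN Canread   ON Tweets.tweetid = Canread.tweet
        JOIN Followers ON Tweets.author = Followers.followee
        WHERE Canread.reader = :allfollowers
          AND Followers.follower = :me
    """, {"me": me, "everybody": EVERYBODY, "allfollowers": ALL_FOLLOWERS})
    return sorted(row[0] for row in rows)

print(readable_tweets(2))  # follower of the author
print(readable_tweets(3))  # named reader, not a follower
print(readable_tweets(1))  # the author
```

One thing the run makes visible: the author does not get their own followers-only tweet back from this query unless a self-follow row or an extra author check is added.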

Any simple approaches for managing customer data change requests for global reference files?

For the first time, I am developing in an environment in which there is a central repository for a number of different industry standard reference data tables and many different customers who need to select records from these industry standard reference data tables to fill in foreign key information for their customer specific records.
Because these industry standard reference files are utilized by all customers, I want to reserve Create/Update/Delete access to these records for global product administrators. However, I would like to implement a (semi-)automated interface by which specific customers could request record additions, deletions or modifications to any of the industry standard reference files that are shared among all customers.
I know I need something like a "data change request" table specifying:
user id,
user request datetime,
request type (insert, modify, delete),
a user entered text explanation of the change request,
the user request's current status (pending, declined, completed),
admin resolution datetime,
admin id,
an admin entered text description of the resolution,
etc.
What I can't figure out is how to elegantly handle the fact that these data change requests could apply to dozens of different tables with differing table column definitions. I would like to give the customer users making these data change requests a convenient way to enter their proposed record additions/modifications directly into CRUD screens that look very much like the reference table CRUD screens they don't have write/delete permissions for (with an additional text explanation and perhaps request priority field). I would also like to give the global admins a tool that allows them to view all the outstanding data change requests for the users they oversee sorted by date requested or user/date requested. Upon selecting a data change request record off the list, the admin would be directed to another CRUD screen that would be populated with the fields the customer users requested for the new/modified industry standard reference table record along with customer's text explanation, the request status and the text resolution explanation field. At this point the admin could accept/edit/reject the requested change and if accepted the affected industry standard reference file would be automatically updated with the appropriate fields and the data change request record's status, text resolution explanation and resolution datetime would all also be appropriately updated.
However, I want to keep the actual production reference tables as simple as possible and free from these extraneous and typically null customer change request fields. I'd also like the data change request file to aggregate all data change requests across all the reference tables yet somehow "point to" the specific reference table and primary key in question for modification & deletion requests or the specific reference table and associated customer user entered field values in question for record creation requests.
Does anybody have any ideas of how to design something like this effectively? Is there a cleaner, simpler way I am missing?

Option 1
If preserving the base tables is important then I would create a "change details" table as a child to your change request table. I'm envisioning something like
ChangeID
TableName
TableKeyValue
FieldName
ProposedValue
Add/Change/Delete Indicator
So you'd have a row in this table for every proposed field change. The challenge in this scenario is maintaining the mapping of TableName and FieldName values to the actual tables and fields. If your database structure is fairly static then this may not be an issue.
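A rough sketch of how an approved request could be applied under Option 1, using Python's sqlite3 module. It covers modifications only (the Add/Change/Delete indicator is left out), and the ChangeRequest table, the Company reference table, the REFERENCE_TABLES whitelist and the apply_change helper are all assumptions for illustration. The whitelist is the "mapping" mentioned above: table and column names cannot be passed as bound parameters, so they have to be checked against a known list before being placed into the statement.

```python
import sqlite3

# Whitelist of reference tables: name -> (key column, editable columns).
REFERENCE_TABLES = {"Company": ("CompanyCode", {"CompanyName", "CompanyAddress"})}

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Company (
    CompanyCode    TEXT PRIMARY KEY,
    CompanyName    TEXT,
    CompanyAddress TEXT
);
CREATE TABLE ChangeRequest (
    ChangeID INTEGER PRIMARY KEY,
    Status   TEXT NOT NULL DEFAULT 'pending'
);
CREATE TABLE ChangeDetail (
    ChangeID      INTEGER NOT NULL REFERENCES ChangeRequest(ChangeID),
    TableName     TEXT NOT NULL,
    TableKeyValue TEXT NOT NULL,
    FieldName     TEXT NOT NULL,
    ProposedValue TEXT
);
INSERT INTO Company VALUES ('COMP1', 'My Company', 'Boston');
INSERT INTO ChangeRequest (ChangeID) VALUES (1);
INSERT INTO ChangeDetail VALUES (1, 'Company', 'COMP1', 'CompanyName', 'New Name');
""")

def apply_change(change_id):
    """Apply every proposed field update of one request, then mark it completed."""
    details = conn.execute(
        "SELECT TableName, TableKeyValue, FieldName, ProposedValue"
        " FROM ChangeDetail WHERE ChangeID = ?", (change_id,)
    ).fetchall()
    with conn:  # one transaction: all field updates succeed or none do
        for table, key_value, field, value in details:
            key_column, fields = REFERENCE_TABLES[table]  # KeyError if unknown
            if field not in fields:
                raise ValueError(f"unknown field {table}.{field}")
            conn.execute(
                f'UPDATE "{table}" SET "{field}" = ? WHERE "{key_column}" = ?',
                (value, key_value),
            )
        conn.execute(
            "UPDATE ChangeRequest SET Status = 'completed' WHERE ChangeID = ?",
            (change_id,),
        )

apply_change(1)
company = conn.execute("SELECT * FROM Company").fetchone()
status = conn.execute("SELECT Status FROM ChangeRequest").fetchone()[0]
print(company, status)
```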
Option 2
Add a ChangeID field to each of your base tables. When a change is proposed add a record to the base table with the ChangeID populated. So as an example if you have a Company table, for a single company you could have multiple records:
CompanyCode  ChangeID  CompanyName  CompanyAddress
-----------  --------  -----------  --------------
COMP1                  My Company   Boston          <-- The "live" record
COMP1        1         New Name     Boston          <-- A proposed change
When the admin commits the change the existing live record is deleted or archived and the ChangeID value is removed from the proposed record making it the live record. It may be a little tricky to handle proposed deletions with this option. This option also has the potential for impacting performance of selecting live data for normal usage. However it does save you the hassle of maintaining a list of table names and field names somewhere in your code.
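A minimal sketch of the Option 2 commit step, using Python's sqlite3 module and the Company example above. It assumes a NULL ChangeID marks the live record and deletes the old live row rather than archiving it; the LiveCompany view and the commit_change helper are additions for the example, the view being one way to keep proposals out of normal reads.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Company (
    CompanyCode    TEXT NOT NULL,
    ChangeID       INTEGER,          -- NULL marks the live record
    CompanyName    TEXT,
    CompanyAddress TEXT
);
-- Normal reads go through a view so they never see proposals.
CREATE VIEW LiveCompany AS
SELECT CompanyCode, CompanyName, CompanyAddress
FROM Company WHERE ChangeID IS NULL;

INSERT INTO Company VALUES ('COMP1', NULL, 'My Company', 'Boston'),
                           ('COMP1', 1,    'New Name',   'Boston');
""")

def commit_change(change_id):
    """Replace the live record with the proposed one in a single transaction."""
    with conn:
        (code,) = conn.execute(
            "SELECT CompanyCode FROM Company WHERE ChangeID = ?", (change_id,)
        ).fetchone()
        # Remove the current live row, then promote the proposal.
        conn.execute(
            "DELETE FROM Company WHERE CompanyCode = ? AND ChangeID IS NULL",
            (code,),
        )
        conn.execute(
            "UPDATE Company SET ChangeID = NULL WHERE ChangeID = ?", (change_id,)
        )

before = conn.execute("SELECT * FROM LiveCompany").fetchall()
commit_change(1)
after = conn.execute("SELECT * FROM LiveCompany").fetchall()
print(before, after)
```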
I'm sure others will have some opinions!
