Table with user emails who haven't signed up? - database

Not sure if this question has been answered elsewhere; I searched but I couldn't find a similar question.
I have a database table (tb_users) with a list of users, ID is primary, EMAIL is unique
I have another database table (tb_memberships) with a list of memberships that refer to tb_users by ID.
I'd like the memberships be added and revoked based on email addresses, not based on their IDs so that a user who HASN'T signed up yet be granted the membership when they sign up. How should I design the table so that a user (that hasn't signed up yet) can be added to the memberships table. Is it a good idea to make the foreign key in tb_memberships the users' email address? Or should these 'pending' memberships have their own table?

Solved this by adding a "ghost_account" boolean to the users table. Using a scope in Rails, I ignored all users who are considered "ghosts" so they can't log in and can sign up normally (ghost_account boolean is set to false once the user signs up). That way, I have a user ID that can be linked to everything else, even if the user hasn't signed up yet.

Related

Sharded ancestor entities in GAE

I'm working on a GAE-based project involving a large user base (possibly millions of users). We use Datastore for persistency. Users will be identified both by username and by e-mail address, so these two properties should be unique across all entities of the kind. Because Datastore doesn't support unique fields other than ID, we need transactions to ensure uniqueness of these fields when new users are registered. And in order to have transactions, User entities need to be enclosed in entity groups.
Having large entity groups is not recommended, as pointed out here. Therefore, given a possible large number of stored users, I'm thinking of putting them into multiple smaller entity groups. Each group would have a common parent with ID generated from the two unique fields (a piece of the MD5 sum for instance). Inserting a new user could look like this (in Python):
#ndb.transactional
def register_new_user(login, email, full_name) :
# validation code omitted
user = User(login = login, email = email, full_name = full_name)
group_id = a_simple_hash(login, email)
group_key = ndb.Key('UserGroup', group_id)
query = User.query(ancestor = group_key).filter(ndb.OR(User.login = login, User.email = email))
if not query.get() :
user.put()
One problem I see with this solution is that it will be impossible to get a User by ID alone. We'd have to use complete entity keys.
Are there any other cons of such approach? Anyone tried something similar?
EDIT
As I've been pointed out in comments, a hash like the one outlined above would not work properly because it would only prevent registering users having non-unique e-mails together with non-unique usernames matching those e-mails. It would work if the hash was computed based on a single field.
Nevertheless, I find the concept of such sharding interesting by itself and perhaps worth of discussion.
An e-mail address is owned by a user and unique. So there is a very small change, somebody will (try to) use the same email address.
So my approch would be: get_or_insert a new login, which makes it easy to login (by key) and next verify if the e-mail address is unique.
If it not unique you can discard or .....do something else
Entity groups have meaning for transactions. I'am interested in your planned transactions, because I do not understand your entity group key hash. Which entities will be part of the entity group, and why?
A user with the same login will be part of another entity group, If i do understand your hash?
It looks like your entity group holds a single entity.
In my opinion you're overthinking here : what's the probability of having two users register with the same username at the same time ?
Very slim. Eventual consistency is good enough for this case, as you don't nanosecond precision...
unless you plan to have more users than facebook, with people registering every second.
Registering with the same email is virtually impossible for different users, since the check has already been done by the email provider for you!
Only a user could try to open two accounts with the same email address. Eventual consistency is good enough for this query too.
Your user entities each belong to their own entity group.
Actually in most use cases, your User is the most obvious root entity : people use the datastore because they need scalability, and most of the time huge scale is needed for user oriented apps.

User and Profile - Which Gets The Foreign Key To The Other?

The User table contains administrative info like username, password, and email.
A Profile has all the other information like about me, names, social network URLs, etc.
A profile isn't required. (A profile isn't created until the user fills in profile info and saves it).
So, I've seen 2 implementations:
Only User has a ProfileId FK to Profile - ProfileId is nullable and has a SET NULL delete rule.
Only Profile has a UserId FK to User - UserId is required, thus, non-nullable, and the DELETE CASCADE delete rule.
I don't think they're equivalent. Where would each situation fit better when a user doesn't always have to have a profile associated with it?
After much searching and a few discoveries, Profile gets the UserId FK based on the Principal/Dependent concept.
User is the principal because it can exist independently without even knowledge that a profile exists. The Profile is the dependent, which bears the foreign key column, and it being dependent means it can't exist until the User is created to give its newly-generated Id to Profile's UserId.

User being banned

I've seen this a lot on forums, or having a membership somewhere, so my question since a user can be banned by admin which depends on the situation. This situation might be like not returning the item or he did not follow the forum rules, or there is a situation where the forum or a shop does not need his service anymore. So obviously you can be banned forever or it can be temporary, where also this can be changed by admin. Would I have to make a new table ? since the ban can be more than one type and it can be changed my admin, how would I make a table ? not sure how to have an attribute for different types of bans...
You could add a field 'statusId' in the User table, and have all the possible statuses listed in a UserStatus table. No need for user_ID in the Userstatus table.
You could then use simple join queries to retrieve the status of a user, or a list of banned users, etc.
Hope this helps :)

Database Tables - To decouple or not?

Is it better to create tables that store a lot of data that are related to an entity (User for example) or many tables to store said data?
For example:
User Table
Name
Email
Subscription Id
Email Notifications
Permissions
Or
User Table
Name
Email
Subscription Table
User ID
Subscription ID
Notification Table
User ID
Receives?
... etc
Please consider code in this as well, or I would have posted to ServerVault.
From a relational design standpoint what is important is the normal form you're aiming for. In general, if the "column" would require multiple values (subscription_id1, subscription_id2, etc) then it is a repeating group, and that would indicate to you that it needs to be moved to a related table. You've provided very general table and column notes, but taking a cue from the fact that you named "Email Notifications" and "Permissions" with plurals, I'm going to assume that those require related tables.

How to model this one-to-one relation?

I have several entities which respresent different types of users who need to be able to log in to a particular system. Additionally, they have different types of information associated with them.
For example: a "general user", which has an e-mail address and "admin user", which has a workstation number (note that this a hypothetical case). Both entities also share common properties like first name, surname, address and telephone number. Finally, they naturally need to have a (unique) user name and a password to log in.
In the application, the user just has to fill in his user name and password, and the functionality of the application changes slightly according to the type of the user. You can imagine that the username needs to be unique for this work.
How should I model this effectively?
I can't just create two tables, because then I can't force a unique constaint on the user name.
I also can't put them all in just one table, because they have different types of specific information associated to them.
I think I might need 3 seperate tables, one for "users" (with user name and password), one for the "general users" and another one for the "admin users", but how would the relations between these work? Or is there another solution?
(By the way, the target DBMS is MySQL, so I don't think generalization is supported in the database system itself).
Your 3 tables approach seems Ok.
In users table have only ID, username, password,usertype.
In general users table have ID, UserID (from users table), other fields.
Same thing for admin users.
Usertype field will tell you from what table to search for additional info
if(usertype==admin)
select * from admins where userid=:id;
else
select * from general where userid=:id;
Two tables. USERS with user names, first, last, etc. ROLES with roles, and a link back to the user name (or user id or whatever). Put a unique constraint on the user name. Put workstation nbr, email, phone, whatever else you need, in the user table. Put 2 columns in the ROLES table -- USERID and ROLE.
You should decide how much specific information is being stored (or likely to be stored in the future) and make the decision based on that. If there are only a handful of fields for each user type then using a single table is alright.
USERS table (name, type, email, password, genfield1, genfield2, adminfield1, adminfield2)
Make sure to include the type (don't assume because some of the fields particular to that user are filled in that the user is of that type) field. Any queries will just need to include the "AND usertype = " clause.
If there are many fields or rules associated with each type then your idea of three tables is the best.
USERS table (ID, type, name, password)
GENUSERS (ID, genfield1, genfield2)
ADMINUSERS(ID, adminfield1, adminfield2)
The constraints between IDs on the table are all you need (and the main USERS table keeps the IDs unique). Works very well in most situations but reports that include both types of users with their specific fields have to be done in two parts (unioned SQL or subqueries or multiple left joins).
You can solve it with one 'general' users table containing the information thats available for all users and 1 table for every specific user type. In your example you will then need 3 tables.
Users: This table holds only information shared between all usertypes, ie. UserId, Name, Address, etc.
GeneralUsers: This table 'extends' the Users table by providing a foreing key UserId that references the Users table. In addition, information specific to general users are held here, fx. EmailAddress, etc.
AdminUsers: As with GeneralUsers, this table also 'extends' the Users table by providing a foreign key UserId referencing the Users table. In addition information specific to admin users are held here, fx. WorkstationId, etc.
With this approach you can add additional 'specializations' if the need arises by simply adding new tables that 'extends' the Users table using a foreign key reference. You can also create several levels of specialization. If for example admin users are general users as well as admin users then AdminUsers could 'extend' GeneralUsers instead of Users simply by using a foreing key to GeneralUsers instead of Users.
When you need to retreive data from this model you need to which type of user to query. If for example you need to query a GeneralUser you will need something similar to:
SELECT * FROM GeneralUsers
LEFT JOIN Users ON GeneralUsers.UserId = Users.UserId
Or if querying an admin user
SELECT * FROM AdminUsers
LEFT JOIN Users ON AdminUsers.UserId = Users.UserId
If you have additional levels of specialization, for example by having admin users also being general users you just join your way back.
SELECT * FROM AdminUsers
LEFT JOIN GeneralUsers ON AdminUsers.UserId = GeneralUsers.UserId
LEFT JOIN Users ON GeneralUsers.UsersId = Users.UserId
I most definitely would not do a model where you have separate tables as in GeneralUser, AdminUser and ReadOnlyUser.
In database design, a good rule of thumb is "Down beats across". Instead of multiple tables (one for each type), I would create a SystemUsers table, and a Roles table and define a join table to put SystemUsers in Roles. Also, I would define individual roles.
This way, a user can be added to and removed from multiple roles.
A role can have multiple permissions, which can be modified at any time.
Joins to other places do not need a GeneralUserId, AdminUserId and ReadOnlyUserId column - just a SystemUserId column.
This is very similar to the ASP.Net role based security model.
alt text http://img52.imageshack.us/img52/2861/rolebasedsecurity.jpg

Resources