Almost every web application now allows you to set up a basic profile (even Stack Exchange does). The question is: how should you store that data in your database?
Should you just add more columns to your users table, or should you set up another table called user_profiles that has a foreign key of user_id?
This is quite subjective:
Separate table:
- easier to fetch a user without the profile
- when a user doesn't have a profile (an optional one-to-one relationship), you pay nothing in terms of storage
- sharing a profile (?!?) - can't imagine such a scenario, but...
Single table:
- no JOINs required when loading
- related information is in one place
- strong one-to-one relationships (typically every user will have a profile, maybe created implicitly) tend to be merged into a single table
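To make the separate-table trade-off concrete, here is a minimal sketch using Python's built-in sqlite3 module. The column names (email, bio, website) and the sample rows are assumptions for illustration only; the question names only the users and user_profiles tables and the user_id foreign key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE users (
        id    INTEGER PRIMARY KEY,
        email TEXT NOT NULL UNIQUE
    );
    -- user_id is both primary key and foreign key, so a user
    -- has at most one profile row (one-to-zero-or-one).
    CREATE TABLE user_profiles (
        user_id INTEGER PRIMARY KEY REFERENCES users(id) ON DELETE CASCADE,
        bio     TEXT,
        website TEXT
    );
""")
conn.execute("INSERT INTO users (id, email) VALUES (1, 'ann@example.com')")
conn.execute("INSERT INTO users (id, email) VALUES (2, 'bob@example.com')")
conn.execute("INSERT INTO user_profiles (user_id, bio) VALUES (1, 'Likes databases')")

# Fetching a user without the profile touches only the users table.
user_only = conn.execute("SELECT id, email FROM users WHERE id = 2").fetchone()

# Loading user plus profile needs a JOIN; LEFT JOIN keeps users with no profile.
rows = conn.execute("""
    SELECT u.id, u.email, p.bio
    FROM users AS u
    LEFT JOIN user_profiles AS p ON p.user_id = u.id
    ORDER BY u.id
""").fetchall()
```

With a single table, the second query would need no JOIN, but every user row would carry the (possibly NULL) profile columns.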
I am getting started with microservice architectures and I have a couple of questions about data persistence and databases.
My understanding is that each microservice has its own database (not necessarily, but usually). Given that, consider a typical social media platform with users, posts and comments. There will be two microservices, a users microservice and a posts microservice. The users database has a users table, and the posts database has posts and comments tables.
My question is about the posts microservice. Each post and comment has an author, so normally we would create a foreign key pointing to the users table, but that table is in a different database. What to do then? From my perspective there are 2 options:
1. Add the authorId column to the table but not the foreign key constraint. If so, what would happen in the application when we retrieve that user's data from the users microservice using the authorId and the user's data is gone?
2. Create an authors table in the posts database. If so, what data should that table contain other than the user's id?
It just doesn't feel right to duplicate data that is already in the users database, but it also doesn't feel right to use the user's id without the FK constraint.
One thing to note: data growth is quite different for the two.
Users -> relatively static data.
Posts & Comments -> dynamic, and could grow far larger than the users data.
A two-microservice design looks good. I would prefer option 1 from your design.
Duplication is not bad. In database design it is normal to denormalize for better read performance. It also helps decouple the posts service from the users table, and may let you choose a different database if required. As for your question about what happens if the user's data is missing while the posts are still available: this can be handled with business logic and API design.
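As a rough illustration of handling a missing user in business logic, the sketch below shows a posts service tolerating an authorId that no longer resolves. Everything here is hypothetical: the Post shape, the "[deleted user]" placeholder and the fake_fetch_user stand-in for a call to the users microservice are assumptions, not part of the original design.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Post:
    id: int
    author_id: int  # plain column, no foreign key across services
    body: str


# Placeholder shown when the users service no longer knows the author.
DELETED_AUTHOR = {"id": None, "name": "[deleted user]"}


def render_post(post: Post, fetch_user: Callable[[int], Optional[dict]]) -> dict:
    """Combine a post with its author data, tolerating a missing user."""
    user = fetch_user(post.author_id)
    author = user if user is not None else DELETED_AUTHOR
    return {"id": post.id, "body": post.body, "author": author}


# Stand-in for a network call to the users microservice (hypothetical).
_fake_users = {1: {"id": 1, "name": "Ann"}}


def fake_fetch_user(user_id: int) -> Optional[dict]:
    return _fake_users.get(user_id)


known = render_post(Post(10, 1, "hello"), fake_fetch_user)
orphaned = render_post(Post(11, 99, "old post"), fake_fetch_user)
```

The post is still returned when the author is gone; only the author part of the response degrades.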
I am planning to use IdentityServer4 and ASP.NET Core Identity together. My website, which will be talking to IdentityServer4/ASP.NET Core Identity, expects a company name to come back with each user.
Should I create a new custom table called Company and add a column to the ASP.NET user table linking them together?
Or should this be a claim?
I know that once my user is authenticated and sent back to my main site, I will have a company table and they will be linked, but I am not sure how this should work for the purposes of identifying them.
I feel like it should be a claim, but I want to double-check since I am new to all this.
In terms of using IdentityServer, technically everything is a claim. The "user" object IdentityServer returns will have all the properties mapped as claims. In that sense, it really doesn't matter which approach you go with.
However, it's generally better to keep data on your user table, if it makes sense to. Something like a foreign key relationship is especially valuable to exist at a database level, as there's more value to that than simply getting a company name.
Storing data as claims is most useful when that data is transient or not applicable to every user. Typical examples include things like third-party access tokens, such as from Facebook. Storing that on the database-level would inevitably result in denormalization of your database table, so it makes more sense to use a claim.
So I have a User table that holds all user information within my application. Users can have at most one profile, but not all users will have a profile (admins, for example). Therefore, I made the relationship between these two tables one to zero-or-one.
This works just fine, but a Profile can have additional data that only relates to a Profile, for example links to the various social media accounts a user may have. These links are only ever shown when displaying a profile, so I went ahead and made the relationship from SocialMediaLink to Profile instead of to the User table. SocialMediaLinkType is just a lookup table for social media sites - "Twitter", "Facebook", etc.
Now my question stems from this second relationship. It seems odd to me to be referencing the UserId PK/FK column on the Profile table to the SocialMediaLink table rather than just referencing the UserId PK on the User table. But at the same time, a SocialMediaLink doesn't exactly have a direct reference to the User because it is really a part of the profile. A SocialMediaLink would only ever be shown when displaying the profile. Also, there are at least 3 other tables of additional Profile data, so there would be
Am I over-complicating this? Is it perfectly fine to have the User table just reference the SocialMediaLink and any other additional tables related to the Profile, even when some user records won't have those associated records (like in the case of an admin)? Or is there added benefit to keeping the link on the Profile table, say for a cascade delete where deleting the profile deletes all the additional table data without having to touch the User table at all?
Also, for what it's worth I am using Entity Framework with this database in the application, but I am more interested in what actual relational database theory suggests versus how to get things working with EF.
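For reference, the design described in the question can be sketched in plain SQL (run here through Python's sqlite3 module rather than EF). The non-key columns (Name, Bio, Url), the LinkId and TypeId keys and the sample rows are assumptions; the ON DELETE CASCADE clause is the cascade the question asks about, not something the original schema is known to have.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE User (UserId INTEGER PRIMARY KEY, Name TEXT NOT NULL);
    -- Shared primary key: one-to-zero-or-one with User.
    CREATE TABLE Profile (
        UserId INTEGER PRIMARY KEY REFERENCES User(UserId),
        Bio    TEXT
    );
    CREATE TABLE SocialMediaLinkType (TypeId INTEGER PRIMARY KEY, Name TEXT NOT NULL);
    -- Links hang off Profile, so they cannot exist without one.
    CREATE TABLE SocialMediaLink (
        LinkId INTEGER PRIMARY KEY,
        UserId INTEGER NOT NULL REFERENCES Profile(UserId) ON DELETE CASCADE,
        TypeId INTEGER NOT NULL REFERENCES SocialMediaLinkType(TypeId),
        Url    TEXT NOT NULL
    );
    INSERT INTO User VALUES (1, 'Ann'), (2, 'Admin');
    INSERT INTO Profile VALUES (1, 'Hello');
    INSERT INTO SocialMediaLinkType VALUES (1, 'Twitter');
    INSERT INTO SocialMediaLink VALUES (1, 1, 1, 'https://example.com/ann');
""")


def link_count() -> int:
    return conn.execute("SELECT COUNT(*) FROM SocialMediaLink").fetchone()[0]


# A user without a profile (the admin) cannot be given a link.
try:
    conn.execute("INSERT INTO SocialMediaLink VALUES (2, 2, 1, 'https://example.com/admin')")
    admin_link_allowed = True
except sqlite3.IntegrityError:
    admin_link_allowed = False

before = link_count()
conn.execute("DELETE FROM Profile WHERE UserId = 1")  # cascades to the links
after = link_count()
users_left = conn.execute("SELECT COUNT(*) FROM User").fetchone()[0]
```

Pointing the foreign key at Profile lets the database itself enforce "no links without a profile", and deleting a profile cleans up its links without touching the User row.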
I'm certainly no DBA and only a beginner when it comes to software development, so any help is appreciated. What is the most secure structure for storing the data from multiple parties in one database? For instance, if three people have access to the same tables, I want to make sure that each person can only see their own data. Is it best to create a unique ID for each person, store that along with the data, and then query based on that ID? Are there other considerations I should take into account as well?
You are on the right track, but mapping the USER ID into the table is probably not what you want, because in practice many users have access to the same corporation's data. In those cases you would store "CorpID" as a column, or more generically "ContextID". But yes, to limit access to data, each row should be able to convey who the data is for, either directly (the row actually contains a reference to CorpID, UserID, ContextID or the like) or by inference, by joining to other tables that reference the qualifier.
In practice, these rules are enforced by a middle tier that queries the database, providing the user context in some way so that only the correct records are selected out of the database and ultimately presented to the user.
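A minimal sketch of that middle-tier filter, assuming a hypothetical invoices table with a context_id column (the table, its columns and the sample rows are made up for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invoices (
        id         INTEGER PRIMARY KEY,
        context_id INTEGER NOT NULL,  -- which corporation/tenant owns the row
        amount     REAL NOT NULL
    );
    INSERT INTO invoices VALUES (1, 100, 25.0), (2, 100, 40.0), (3, 200, 99.0);
""")


def invoices_for(context_id: int) -> list:
    """Middle-tier query: the caller's context is always part of the WHERE clause."""
    return conn.execute(
        "SELECT id, amount FROM invoices WHERE context_id = ? ORDER BY id",
        (context_id,),  # bound parameter, never string concatenation
    ).fetchall()


tenant_100 = invoices_for(100)
tenant_200 = invoices_for(200)
unknown_tenant = invoices_for(300)
```

The context id should come from the authenticated session on the server side, not from anything the client can set freely.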
...three people have access to the same tables...
If these people can query the tables directly through some query tool like Toad, then we have a serious problem. If not, that is, if they access the data through some middle tier or service layer, then #wagregg's solution above holds.
Coming to the case where they do have direct access rights, one approach is:
1. Create database-level user accounts for each of the users.
2. Have another table with row-level grant information. Say your_table has a primary key column MY_PK_COL; then the structure of GRANTS_TABLE would be {USER_ID; MY_PK_COL}, with MY_PK_COL a foreign key to your_table.
3. Remove all privileges of the concerned users from your_table.
4. Create a view that returns only the granted rows: SELECT * FROM your_table WHERE MY_PK_COL IN (SELECT MY_PK_COL FROM GRANTS_TABLE WHERE USER_ID = getCurrentUserID());
5. Give your users SELECT/INSERT/UPDATE rights on this view.
Most database systems (MySQL, Oracle, SQL Server) provide a way to get the currently logged-in user (the one used in the connection string). They also provide ways to restrict access to certain tables. For your users, the view will then behave like a normal table; they will never know the difference.
A problem arises when there are too many users: provisioning a database-level user account for every one of them may become difficult. But then a DBMS like SQL Server can use Windows authentication, thereby reducing the user-creation problem.
In most scenarios, filtering at the middle tier is the best approach. But there are times when security is paramount, and a bug in the middle tier may allow malicious users to bypass the security; SQL injection is one example. Then you have to do what you have to do.
It sounds like you're talking about a multi-tenant architecture, but I can't tell for sure.
This SO answer has a summary of the issues, and links to an online article containing details about the trade-offs.
I am implementing an authentication system into an existing database system. Currently, the database has a "Person" table that includes things like: First Name, Last Name, Date of Birth, Email (username), etc. This is the primary table for a user.
I need to add the following fields for authentication: Password, IsLocked, LockDate, LastLoginDate.
Would you suggest putting these fields in the Person table or would you put them in a new Authentication table? My original plan was for "Person" to simply contain data about that person, not necessarily about authentication.
The other approach could be to store the password along with the email in Person, but then place the rest of the authentication data in a separate table. This way, the username and password would be in the same place, but the metadata would be in its own entity.
Anyone have any thoughts?
Thanks for the help!
Keep them separate so that users can query the system for information about a Person without necessarily having access to their account credentials.
This also has a nice side-effect where not all Person entities may have accounts.
Keep the account information separate. Your current business requirement may be for each person to have only one account, but it could come up in the future that a person needs to have multiple accounts, or even that you need an account that is shared by multiple people. Having a separate table for authentication means that such future changes will have a smaller impact on your code.
Also, from the perspective of protecting authentication information, the fewer people/processes that can access the account data the better off you'll be. It's much easier to implement table-level access than column-level access.
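One way the split could look, sketched with Python's sqlite3 module. The key names (PersonId), the column types and the sample rows are assumptions; the Password field from the question is shown as PasswordHash because a password should be stored as a salted hash rather than in plain text, and the hash value here is only a placeholder.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE Person (
        PersonId    INTEGER PRIMARY KEY,
        FirstName   TEXT NOT NULL,
        LastName    TEXT NOT NULL,
        DateOfBirth TEXT,
        Email       TEXT NOT NULL UNIQUE
    );
    -- Credentials live in their own table, so access to them can be
    -- granted separately from access to Person.
    CREATE TABLE Authentication (
        PersonId      INTEGER PRIMARY KEY REFERENCES Person(PersonId),
        PasswordHash  TEXT NOT NULL,
        IsLocked      INTEGER NOT NULL DEFAULT 0,
        LockDate      TEXT,
        LastLoginDate TEXT
    );
    INSERT INTO Person VALUES (1, 'Ann', 'Lee', '1990-01-01', 'ann@example.com');
    INSERT INTO Person VALUES (2, 'Bob', 'Ray', NULL, 'bob@example.com');
    INSERT INTO Authentication (PersonId, PasswordHash) VALUES (1, '<hash placeholder>');
""")

# Reading people never touches the credentials table.
people = conn.execute(
    "SELECT FirstName, LastName FROM Person ORDER BY PersonId"
).fetchall()

# People who have no account at all.
without_account = conn.execute("""
    SELECT p.Email
    FROM Person AS p
    LEFT JOIN Authentication AS a ON a.PersonId = p.PersonId
    WHERE a.PersonId IS NULL
""").fetchall()
```

Using PersonId as the primary key of Authentication keeps the relationship at one account per person; giving Authentication its own key instead would allow multiple accounts per person later.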
I don't think it makes much sense to create a separate table for Authentication data. Authentication can't exist independently of the Person, as far as I can tell - and there doesn't seem to be a way one Person could reasonably be associated with two Authentications (or vice versa).
In other words: There's a 1:1 relationship between Person and Authentication, so why split it off?