I am creating block lists to block user actions based on IP address, MAC address, email address, name (first/last name), trademark names and usernames. Should each of these be a separate lookup table, or can there be one blocked_list table with all of them in it? Each list is independent of the others.
The lists will be used in a few places:
User signup - block account signup based on IP, MAC, email & any disallowed First/last name
Username creation - block username creation based on restricted usernames
Profile details - block profile email being added based on disallowed emails
Public pages - block people from naming pages based on a restricted list of trademarked names.
Also, is it better to keep this in the DB or a text file? Except trademark names everything else will be in English. For trademarks I may use region specific blocking so need multi-lang support.
I'd make each their own table.
How would combining them make your queries run any faster? It wouldn't.
It isn't as if you won't know what kind of value you have. If you have a MAC address, check for it in the blocked MAC address table, where it would be the primary key, defined with exactly the right length and type for a MAC address.
What about having one table with all the values in it and a block_type column? You will need a lookup table for block_type where 1 = IP, 2 = MAC, and so on. This way you can manage everything with only two tables. But I'll let someone more professional answer this, as I am new to databases myself.
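The two-table idea above could look roughly like this. It is only a sketch using SQLite from Python; the table names, column names and sample values are assumptions made for illustration, not something from the question.

```python
import sqlite3

# Hypothetical schema: every name and sample value here is illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE block_type (
        block_type_id INTEGER PRIMARY KEY,
        name          TEXT NOT NULL UNIQUE      -- 'IP', 'MAC', 'EMAIL', ...
    );
    CREATE TABLE blocked_list (
        block_type_id INTEGER NOT NULL REFERENCES block_type(block_type_id),
        value         TEXT NOT NULL,
        PRIMARY KEY (block_type_id, value)      -- one entry per (type, value)
    );
""")
conn.executemany("INSERT INTO block_type VALUES (?, ?)",
                 [(1, "IP"), (2, "MAC"), (3, "EMAIL"), (4, "USERNAME")])
conn.executemany("INSERT INTO blocked_list VALUES (?, ?)",
                 [(1, "203.0.113.7"), (3, "spam@example.com"), (4, "admin")])


def is_blocked(conn, type_name, value):
    """Return True if value is on the block list for the given type."""
    row = conn.execute(
        """SELECT 1
             FROM blocked_list b
             JOIN block_type t ON t.block_type_id = b.block_type_id
            WHERE t.name = ? AND b.value = ?""",
        (type_name, value),
    ).fetchone()
    return row is not None
```

Note the trade-off against separate tables: here value has to be a generic text column, so the database cannot enforce a MAC-specific or IP-specific type or length.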
You can put all the entries in a single table. Check out the Entity-Attribute-Value approach or use a schemaless NoSQL datastore.
http://en.wikipedia.org/wiki/Entity-attribute-value_model
If you're processing the 'blocking' in the middle tier, you can just dump the lists as serialized objects (e.g. JSON) into the table.
Related
How to create a Facebook database on a small scale.
About the project:
Users can sign up and create an account (info is stored in the UserTB table)
Users can edit their profile & address information (info is stored in the ProfileTB & AddressTB tables)
Users can add their family members' information, but a family member is not a user of this application.
A family member can become a member of the application but doesn't have to be
Issue: How can users add their family members' information (sister/brother/dad)? A user's family members will also have first_name, address etc., so it may not make sense to create separate FamilyProfile or FamilyAddress tables. The main issue is that a user can add many family members, but a family member doesn't have to be a user of this application.
UserTB = track user login information
ProfileTB = Track user profile information
AddressTB = track user address information
FamilyTB = a user has multiple family members linked to the user table (Relationship is a string, e.g. brother, sister, dad etc.).
I solved this issue by adding Linked_ID to each table, but that has its own issues. For example, I could have 50 tables, and when a family member becomes a user, I will have to update User_ID in 50 different tables.
The AddressTB is unnecessary and should be merged into ProfileTB. I know why you separated Street into a different table (normalization), but there are far too many streets, they can sometimes have multiple names (in some countries at least), and people make typos when entering their street name, so it's not worth it!
On the other hand, you do need the FamilyTB (I would rename this table to RelationTB for clarity). The actual information for users is stored in ProfileTB, and only their relations are stored in FamilyTB.
Also, Family_ID cannot be a primary key on its own, because primary keys must be unique, and a user can have multiple brothers, which would violate this rule. Change it to the following composite key so that it's always unique:
Table = FamilyTB
Field = Family_ID (PK)
Field = User_ID (PK)
Field = Relationship
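Here is a rough sketch of that composite key in SQLite from Python. It assumes, beyond what the answer states, that Family_ID points at the family member's row in ProfileTB and that UserTB links to ProfileTB through a Profile_ID column; the Login and First_Name columns and the sample people are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE ProfileTB (
        Profile_ID INTEGER PRIMARY KEY,
        First_Name TEXT NOT NULL
    );
    CREATE TABLE UserTB (
        User_ID    INTEGER PRIMARY KEY,
        Profile_ID INTEGER NOT NULL UNIQUE REFERENCES ProfileTB(Profile_ID),
        Login      TEXT NOT NULL UNIQUE
    );
    CREATE TABLE FamilyTB (
        User_ID      INTEGER NOT NULL REFERENCES UserTB(User_ID),
        Family_ID    INTEGER NOT NULL REFERENCES ProfileTB(Profile_ID),
        Relationship TEXT NOT NULL,
        PRIMARY KEY (User_ID, Family_ID)   -- composite key: one row per pair
    );
    INSERT INTO ProfileTB VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol');
    INSERT INTO UserTB VALUES (1, 1, 'alice');
    INSERT INTO FamilyTB VALUES (1, 2, 'brother'), (1, 3, 'sister');
    -- Bob later signs up: one new UserTB row, no other table is touched.
    INSERT INTO UserTB VALUES (2, 2, 'bob');
""")


def family_of(conn, user_id):
    """List (first name, relationship) for a user's family members."""
    return conn.execute(
        """SELECT p.First_Name, f.Relationship
             FROM FamilyTB f
             JOIN ProfileTB p ON p.Profile_ID = f.Family_ID
            WHERE f.User_ID = ?
            ORDER BY f.Family_ID""",
        (user_id,),
    ).fetchall()
```

Under these assumptions, a family member becoming a user means adding a single UserTB row that points at the profile that already exists, which avoids rewriting IDs across many tables.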
Say for example, I have an ADDRESS table, that will store similar attributes of other entities like address, city, zip, country, etc. The entities are USER, COMPANY, BANK, BRANCH, etc. I would like to use this one table ADDRESS to store the addresses of the other entities rather than creating other tables for each entity to store the ADDRESS like so, USER_ADDRESS, COMPANY_ADDRESS, BANK_ADDRESS, BRANCH_ADDRESS.
Is this possible? Am I breaking any laws or conventions? What are the consequences, if any?
Each entity (USER, COMPANY, etc.) should contain a reference to an entry in the ADDRESS table.
There are a few issues:
If 2 users have the same address, they should reference the same address id.
You will need to normalise addresses so that you're not duplicating information (e.g. if you know the zip code, you can usually derive the city and country).
Of course, you may not want a well-normalised database. Saving the entire address as a string will improve read performance by reducing the number of join operations.
A lot of things depend on the exact use of the database.
It is fine to use a single ADDRESS table for that purpose and have an ADDRESS_ID in each of the other entities. Depends on the use case and the way you prefer to implement it. I most probably wouldn't do it. I also wouldn't do the other solution you're suggesting (an address table per entity).
So, let's say you want to implement a function to search for all the addresses, where it doesn't matter what type of entity is connected to it. You will have to search the ADDRESS table. If you get results, then you have to search the other four tables to see which record is connected to that address.
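To make that two-step search concrete, here is a small sketch in Python with SQLite. The table names (user_account, company, bank), columns and sample rows are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE address (
        address_id   INTEGER PRIMARY KEY,
        address      TEXT NOT NULL,
        zip_code     TEXT NOT NULL,
        country_code TEXT NOT NULL
    );
    CREATE TABLE user_account (
        id INTEGER PRIMARY KEY, name TEXT NOT NULL,
        address_id INTEGER REFERENCES address(address_id)
    );
    CREATE TABLE company (
        id INTEGER PRIMARY KEY, name TEXT NOT NULL,
        address_id INTEGER REFERENCES address(address_id)
    );
    CREATE TABLE bank (
        id INTEGER PRIMARY KEY, name TEXT NOT NULL,
        address_id INTEGER REFERENCES address(address_id)
    );
    INSERT INTO address VALUES (14, '1 Main Street', '12345', 'US');
    INSERT INTO user_account VALUES (17, 'Ann', 14);
    INSERT INTO company VALUES (1, 'Acme', NULL);
    INSERT INTO bank VALUES (1, 'First Bank', 14);
""")

# Fixed list of entity tables that carry an address_id column.
ENTITY_TABLES = ("user_account", "company", "bank")


def find_addresses(conn, zip_code):
    """Step 1: search the shared address table."""
    rows = conn.execute(
        "SELECT address_id FROM address WHERE zip_code = ?", (zip_code,))
    return [address_id for (address_id,) in rows]


def owners_of_address(conn, address_id):
    """Step 2: probe every entity table to see who uses the address."""
    owners = []
    for table in ENTITY_TABLES:
        rows = conn.execute(
            f"SELECT name FROM {table} WHERE address_id = ?", (address_id,))
        owners.extend((table, name) for (name,) in rows)
    return owners
```

Every address hit costs one extra query per entity table, which is the overhead described above.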
You could add a field ENTITY_TYPE in the ADDRESS table where you specify which type of entity it is connected to, so you don't have to search the four tables, but I don't recommend this since you can have consistency errors (USER 17 points to ADDRESS 14, but ADDRESS 14 has ENTITY_TYPE = BANK).
Now, with your other solution (having four separate tables to store the addresses of the four different entities) you're just going to have to search those four tables and then search the corresponding entity table to get the entity you're looking for.
My solution in most cases is adding the address fields to the entities tables themselves. Having ADDRESS, ZIP_CODE and COUNTRY_CODE (always use proper country codes, not country names https://en.wikipedia.org/wiki/List_of_ISO_3166_country_codes) will make it simple. When you present a list of items (users, banks, companies, offices, whatever), it's really common to show the name and the address at the same time in a table. Having no JOINS makes it faster and easier to process. If you want to update an address, it's on the table itself. No lookups!
Of course, like most things in programming, it depends on what your needs are.
Also, please, don't try to split the ADDRESS in more fields. I've seen ADDRESS_TYPE (street, road, avenue, square, ...), STREET_NAME, STREET_NUMBER, BLOCK_NUMBER, BLOCK_FLOOR, BLOCK_LETTER. I'm pretty sure you're never going to need something like SELECT * FROM USER WHERE STREET_NUMBER = 74.
I have an old application that needs upgrading. Doesn't everything nowadays?
The existing DB schema consists of predefined fields like phone, fax, email. Obviously with the social explosion over the last 5-7 years (or longer depending on your country) end users need more control over creating contact cards the way they see fit rather than just what I think might be useful.
I'm concerned here with "digital" addresses, i.e. one-line addresses such as phone=ccc ccc ccc ccc.
Since physical addresses are pretty standard in terms of requirements, in this case users will have to use what they are given (location, postal, delivery) in order to keep the scope manageable.
So I'm wondering what the best practice format for storing digital info is. To me it seems I have two choices:
A simple 4 field table (ContactId, AddressTypeId, Address, FormatterId)
1000, "phone", "ccc ccc ccc ccc", phoneformatter
1000, "facebook", "myfacebook", facebookformatter
This would then be JOINED anywhere it's needed. The table would get massive, though, and I suspect the join performance would degrade over time.
A json blob that would require additional processing once read (ContactId, Addresses)
1000, {"phone": "ccc ccc ccc ccc", "facebook": "myfacebook"}
Or ... something else.
This db is for use in a given country by customers only trading domestically with client bases ranging from 3000-12000 accounts and then however many contacts per account - averages about 10 in current system.
My primary concern is user flexibility but performance is a key consideration in that. So I dunno, just do whatever and throw heaps of hardware at it ;)
Application is in C# if that makes any difference re: post query processing.
I would not go for the JSON blob. This will be nasty if you need to answer any queries like:-
Does anyone have me in their Facebook contacts?
What's the most popular type of social media contact?
You would be forced to parse the JSON for every record and be unable to create a simple index.
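As a rough illustration of what that means, this Python sketch answers both questions against a blob column. The sample rows and values are invented, and each blob is assumed to be a single JSON object keyed by address type.

```python
import json
from collections import Counter

# Hypothetical (ContactId, Addresses) rows; the values are made up.
rows = [
    (1000, '{"phone": "ccc ccc ccc ccc", "facebook": "myfacebook"}'),
    (1001, '{"facebook": "otherpage", "twitter": "someone"}'),
    (1002, '{"facebook": "myfacebook"}'),
]


def contacts_with(rows, address_type, address):
    """Find contacts holding a given address: parses every single row."""
    return [contact_id for contact_id, blob in rows
            if json.loads(blob).get(address_type) == address]


def type_counts(rows):
    """Count how often each address type occurs: again a full scan."""
    counts = Counter()
    for _, blob in rows:
        counts.update(json.loads(blob).keys())
    return counts
```

Both functions have to load and decode every record, because nothing inside the blob is visible to an ordinary index.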
Your first option is nearly correct; however, FormatterId would need to be on an AddressType table. What you have is not normalised, as FormatterId depends only on AddressTypeId. So you would have three tables:
Contact
ContactAddress
AddressType
You haven't stated whether you need to store two addresses of the same type against a single contact, e.g. if someone has two Twitter accounts. Answering this question will allow you to define the correct primary key on ContactAddress. It would either be (ContactId, AddressTypeId), if you can only have one of each type per contact, or a synthetic key (ContactAddressId).
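A minimal sketch of those three tables, using SQLite from Python and the synthetic-key variant so that a contact can hold two addresses of the same type. The Name columns, the index, the formatter values and the sample data are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Contact (
        ContactId INTEGER PRIMARY KEY,
        Name      TEXT NOT NULL
    );
    CREATE TABLE AddressType (
        AddressTypeId INTEGER PRIMARY KEY,
        Name          TEXT NOT NULL UNIQUE,
        FormatterId   TEXT NOT NULL      -- depends only on the type
    );
    CREATE TABLE ContactAddress (
        ContactAddressId INTEGER PRIMARY KEY,   -- synthetic key
        ContactId     INTEGER NOT NULL REFERENCES Contact(ContactId),
        AddressTypeId INTEGER NOT NULL REFERENCES AddressType(AddressTypeId),
        Address       TEXT NOT NULL
    );
    CREATE INDEX idx_type_address ON ContactAddress (AddressTypeId, Address);

    INSERT INTO Contact VALUES (1000, 'Ann'), (1001, 'Bob'), (1002, 'Cy');
    INSERT INTO AddressType VALUES
        (1, 'phone', 'phoneformatter'),
        (2, 'facebook', 'facebookformatter'),
        (3, 'twitter', 'twitterformatter');
    INSERT INTO ContactAddress (ContactId, AddressTypeId, Address) VALUES
        (1000, 1, 'ccc ccc ccc ccc'), (1000, 2, 'myfacebook'),
        (1001, 2, 'otherpage'), (1001, 3, 'bob_one'), (1001, 3, 'bob_two'),
        (1002, 3, 'cy');
""")


def who_has(conn, type_name, address):
    """Contacts that hold the given address of the given type."""
    rows = conn.execute(
        """SELECT ca.ContactId
             FROM ContactAddress ca
             JOIN AddressType t ON t.AddressTypeId = ca.AddressTypeId
            WHERE t.Name = ? AND ca.Address = ?
            ORDER BY ca.ContactId""",
        (type_name, address))
    return [contact_id for (contact_id,) in rows]


def most_popular_type(conn):
    """Address type with the most rows, as (name, count)."""
    return conn.execute(
        """SELECT t.Name, COUNT(*) AS n
             FROM ContactAddress ca
             JOIN AddressType t ON t.AddressTypeId = ca.AddressTypeId
            GROUP BY t.Name
            ORDER BY n DESC, t.Name
            LIMIT 1""").fetchone()
```

If only one address of each type is allowed per contact, the synthetic key could be dropped in favour of PRIMARY KEY (ContactId, AddressTypeId).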
Well, I believe you have a table named contact
contact(contactid, contact details, other details)
and now you want to move the contact details out of the contact table, because they may contain digital addresses, phone numbers and so on.
But the table you are considering
(ContactId, AddressTypeId, Address, FormatterId) is not in normal form, and you can't uniquely identify a tuple without reading all four columns, which is bad; in this case indexing is not going to help you either.
So it would be better to have a separate table for each type of digital address, with an index on contactid:
facebookdetails(contactid, rest of the details)
phonedetails(contactid, rest of the details)
The query can then be a join of all the tables, and it will not degrade the performance.
Hope this will help :)
This is a new topic to me, I've read a few articles and I'm still unclear even to the point where I'm unsure if the following question relates to the title of this post or not.
My system sends data to a user. The user may elect for the data to be sent by:
XML
Email
Post
Depending on what the user chooses several additional but different variables are required. Email address for example is required for sending the data via email but it isn't for sending it via XML.
Assuming we had a database table which stored 'Data Delivery Choices' (XML, Email, or Post), would it be best to store the additionally required variables in this table, meaning that if XML was selected the email field would be empty in that row? Or would it be better to make three new tables to store the variables associated with each possible choice in the 'Data Delivery Choices' table, and then associate entries in these tables via the 'Data Delivery Choices' PK?
Or doesn't it matter which way it's done?
For the purposes of the question, forget the fact that we may already hold the user's email address etc. elsewhere.
Thanks!
Some RDBMS (PostgreSQL for example), allow for table inheritance. So you can define DataDeliveryChoices and then create 3 additional tables that inherit from it DataDeliveryXML, DataDeliveryEmail, ...
MSSQL allows you to store (and query) an XML document in a column, so you can store the additional data in XML (not classic database design, but very flexible should you need to add some data fields, without changing the schema).
Your way of adding three additional tables is IMO also an acceptable solution.
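For the three-table variant, something along these lines might work. It is a sketch in Python with SQLite; apart from the email address, the extra columns (EndpointUrl, PostalAddress) and all table names are hypothetical, since the question does not say which variables each method needs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE DataDeliveryChoice (
        ChoiceId INTEGER PRIMARY KEY,
        Method   TEXT NOT NULL CHECK (Method IN ('XML', 'Email', 'Post'))
    );
    -- One detail table per method, sharing the parent's primary key.
    CREATE TABLE DeliveryXML (
        ChoiceId    INTEGER PRIMARY KEY
                    REFERENCES DataDeliveryChoice(ChoiceId),
        EndpointUrl TEXT NOT NULL
    );
    CREATE TABLE DeliveryEmail (
        ChoiceId     INTEGER PRIMARY KEY
                     REFERENCES DataDeliveryChoice(ChoiceId),
        EmailAddress TEXT NOT NULL
    );
    CREATE TABLE DeliveryPost (
        ChoiceId      INTEGER PRIMARY KEY
                      REFERENCES DataDeliveryChoice(ChoiceId),
        PostalAddress TEXT NOT NULL
    );
    INSERT INTO DataDeliveryChoice VALUES (1, 'Email'), (2, 'Post');
    INSERT INTO DeliveryEmail VALUES (1, 'a@example.com');
    INSERT INTO DeliveryPost VALUES (2, '1 Main Street');
""")

DETAIL_TABLES = {"XML": "DeliveryXML",
                 "Email": "DeliveryEmail",
                 "Post": "DeliveryPost"}


def delivery_details(conn, choice_id):
    """Return (method, detail row as a dict) for one delivery choice."""
    (method,) = conn.execute(
        "SELECT Method FROM DataDeliveryChoice WHERE ChoiceId = ?",
        (choice_id,)).fetchone()
    cursor = conn.execute(
        f"SELECT * FROM {DETAIL_TABLES[method]} WHERE ChoiceId = ?",
        (choice_id,))
    row = cursor.fetchone()
    columns = [column[0] for column in cursor.description]
    return method, dict(zip(columns, row))
```

Each detail table can declare its own columns NOT NULL, so there are no empty email fields on XML rows. Nothing in this sketch stops a row from being added to the wrong detail table for its Method, so that check would have to live in the application or in a trigger.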
The possibilities are endless :)
Currently my users table has the below fields
Username
Password
Name
Surname
City
Address
Country
Region
TelNo
MobNo
Email
MembershipExpiry
NoOfMembers
DOB
Gender
Blocked
UserAttempts
BlockTime
Disabled
I'm not sure if I should put the address fields in another table. I have heard that I will be breaking 3NF if I don't although I can't understand why. Can someone please explain?
There are several points that are definitely not 3NF; and some questionable ones in addition:
Could there be multiple addresses per user?
Is an address optional or mandatory?
Does the information in City, Country, Region duplicate that in Address?
Could a user have multiple TelNos?
Is a TelNo optional or mandatory?
Could a user have multiple MobNos?
Is a MobNo optional or mandatory?
Could a user have multiple Emails?
Is an Email optional or mandatory?
Is NoOfMembers calculated from the count of users?
Can there be more than one UserAttempts?
Can there be more than one BlockTime per user?
If the answer to any of these questions is yes, then it indicates a problem with 3NF in that area. The reason for 3NF is to remove duplication of data; to ensure that updates, insertions and deletions leave the data in consistent form; and to minimise the storage of data - in particular there is no need to store data as "not yet known/unknown/null".
In addition to the questions asked here, there is also the question of what constitutes the primary key for your table - I would guess it is something to do with user, but name and the other information you give is unlikely to be unique, so will not suffice as a PK. (If you think name plus surname is unique are you suggesting that you will never have more than one John Smith?)
EDIT:
In the light of further information that some fields are optional, I would suggest that you separate out the optional fields into different tables, and establish 1-1 links between the new tables and the user table. This link would be established by creating a foreign key in the new table referring to the primary key of the user table. As you say none of the fields can have multiple values then they are unlikely to give you problems at present. If however any of these change, then not splitting them out will give you problems in upgrading the application and the data to support the application. You still need to address the primary key issue.
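A 1-1 split for an optional field could be sketched like this in Python with SQLite, using TelNo as the example. The user_id surrogate key and the table names are assumptions, as the original table has no such column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE users (
        user_id  INTEGER PRIMARY KEY,     -- hypothetical surrogate key
        username TEXT NOT NULL UNIQUE
    );
    -- Optional field moved out: the PK doubles as the FK, so the link
    -- is 1-1 and a user without a number simply has no row here.
    CREATE TABLE user_telno (
        user_id INTEGER PRIMARY KEY REFERENCES users(user_id),
        tel_no  TEXT NOT NULL
    );
    INSERT INTO users VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO user_telno VALUES (1, '555-0100');
""")


def users_with_telno(conn):
    """All users, with tel_no as None where no number was given."""
    return conn.execute(
        """SELECT u.username, t.tel_no
             FROM users u
             LEFT JOIN user_telno t ON t.user_id = u.user_id
            ORDER BY u.user_id""").fetchall()
```

If multiple numbers per user are needed later, the primary key on user_telno can be widened without touching the users table.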
As long as every user has one address and every address belongs to one user, they should go in the same table (a 1-to-1 relationship). However, if users aren't required to enter addresses (an optional relationship), a separate table would be appropriate. Also, in the odd case that many users share the same address (e.g. they're convicts in the same prison), you have a 1-to-many relationship, in which case a separate table would be the way to go. EDIT: And yes, as someone pointed out in the comments, if users have multiple addresses (a 1-to-many the other way around), there should also be separate tables.
Just as a point that I think might help someone with this question: I once had a situation where I put addresses right in the user/site/company/etc. tables because I thought, why would I ever need more than one address for them? Then, after we completed everything, it was brought to my attention by a different department that we needed the possibility of recording both a shipping address and a billing address.
The moral of the story is, this is a frequent requirement, so if you think you ever might want to record shipping and billing addresses, or can think of any other type of address you might want to record for a user, go ahead and put it in a separate table.
In today's age, I think phone numbers are a no brainer as well to be stored in a separate table. Everyone has mobile numbers, home numbers, work numbers, fax numbers, etc., and even if you only plan on asking for one, people will still put two in the field and separate them by a semi-colon (trust me). Just something else to consider in your database design.
The point is that if you can imagine having two addresses for the same user in the future, you should split now and have an address table with a FK pointing back to the users table.
P.S. Your table is missing an identity to be used as PK, something like Id or UserId or DataId, call it the way you want...
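Something like the following is what I have in mind, sketched in Python with SQLite. The address_type values, the column names and the rule of one address per type are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE users (
        user_id  INTEGER PRIMARY KEY,     -- the missing identity column
        username TEXT NOT NULL UNIQUE
    );
    CREATE TABLE addresses (
        address_id   INTEGER PRIMARY KEY,
        user_id      INTEGER NOT NULL REFERENCES users(user_id),
        address_type TEXT NOT NULL
                     CHECK (address_type IN ('shipping', 'billing')),
        address      TEXT NOT NULL,
        UNIQUE (user_id, address_type)    -- at most one of each per user
    );
    INSERT INTO users VALUES (1, 'ann');
    INSERT INTO addresses (user_id, address_type, address) VALUES
        (1, 'shipping', '1 Main Street'),
        (1, 'billing', 'PO Box 42');
""")


def addresses_for(conn, username):
    """Map address type to address for one user."""
    rows = conn.execute(
        """SELECT a.address_type, a.address
             FROM addresses a
             JOIN users u ON u.user_id = a.user_id
            WHERE u.username = ?""",
        (username,))
    return dict(rows.fetchall())
```

Adding a third kind of address later only means extending the CHECK list, not adding columns to the users table.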
By adding them to a separate table, you will have an easier time expanding your application if you decide to later. I generally have a simple user table with user_id or id, user_name, first_name, last_name, password, created_at & updated_at. I then have a profile table with the other info.
It's really all preference though.
You should never group two different types of data in a single table, period. The reason is that if your application is intended to be used in production, sooner or later different use cases will come up that require a more highly normalised table structure.
My recommendation - Adhere to SOLID principles even in DB design.