I am building a small database for a lab. We have some rules to make an ID String for every Item, so I do not want to store it in my database. The problem is that sometimes changes in the data, for example when the person responsible for that item changes, cause the ID String to change. But I cannot correct it on printed docs. How can I store the old version of that ID String?
I could simply not change it, but that would break the rules. Any suggestions?
To expand on Damir's point:
A "Smart Key" is what you have when you say
"We have some rules to make an ID String for every Item"
You're taking the name of the item, maybe a category code, and adding the
"person responsible for that item"
So if I were responsible for Beakers that item ID might be
GLASSWARE-BEAKER-SPAGE
That 'code' becomes a 'Smart key' when you use it in your database as a Primary Key.
This is an anti-pattern. Like most anti-patterns it's seductive. People like the idea of just looking at the key and knowing what kind of thing it is, what it is called and who do I ask to get more. All that information on a report or shelf-label with just a few characters. But it's an anti-pattern for the reason you mentioned - it has meaning and meaning can be changed.
As Damir suggests, you can store this value in another column that we'd call an ALTERNATE KEY or CANDIDATE KEY... it's unique, it could be a PK but it's not. You'll want a unique constraint on the column but not a Primary Key constraint.
It is important to distinguish between a primary key which is supposed to uniquely identify a row in a table and some kind of a smart key that products in catalogs usually have.
For a primary key use an auto-incrementing integer -- there are very few exceptions to this one.
Add columns for things that you are trying to represent in that smart key, like: Person, Project, Response etc.
Add a separate column for that key and treat it like any other field in the table -- this should keep people who are used to this kind of thinking happy.
"Smart key" is a misnomer here. From a db-design point of view, that key is rather dumb.
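A minimal sketch of that layout, using Python's built-in sqlite3 module. The table and column names (item, printed_doc, responsible, id_string) and the second person code JDOE are made up for illustration; only the GLASSWARE-BEAKER-SPAGE code comes from the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
CREATE TABLE item (
    item_id     INTEGER PRIMARY KEY,   -- surrogate key, never changes
    category    TEXT NOT NULL,
    name        TEXT NOT NULL,
    responsible TEXT NOT NULL,
    id_string   TEXT NOT NULL UNIQUE   -- the "smart key", kept as an alternate key
);
CREATE TABLE printed_doc (             -- hypothetical table that refers to items
    doc_id  INTEGER PRIMARY KEY,
    item_id INTEGER NOT NULL REFERENCES item (item_id)
);
""")

item_id = conn.execute(
    "INSERT INTO item (category, name, responsible, id_string) VALUES (?, ?, ?, ?)",
    ("GLASSWARE", "BEAKER", "SPAGE", "GLASSWARE-BEAKER-SPAGE"),
).lastrowid
conn.execute("INSERT INTO printed_doc (item_id) VALUES (?)", (item_id,))

# The responsible person changes: the ID string changes, the primary key does not.
conn.execute(
    "UPDATE item SET responsible = ?, id_string = ? WHERE item_id = ?",
    ("JDOE", "GLASSWARE-BEAKER-JDOE", item_id),
)

# The document row still points at the same item through item_id.
current_id_string = conn.execute(
    "SELECT i.id_string FROM printed_doc d JOIN item i ON i.item_id = d.item_id"
).fetchone()[0]
```

Nothing that references item_id has to be touched when the ID String is regenerated, which is the whole point of keeping the meaningful code out of the primary key.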
"for example the person responsible for that item changed, causes the change of ID String"
Looks like the workflow in your lab is broken. IDs should never change. Try to bring this to the attention of your superiors.
I'm really struggling to find a good DB design for my project using SQL server.
I've already implemented a few models which worked great till this point, but now that I need to add something extra I just can't find a good option and I'm stuck with it ATM.
I've supplied below 2 very simplified models (class diagrams) I've tried, but both models are not working well.
1st model: which I also prefer if it's possible to fix
I should explain first that a Msg and an Action can have the same basic id (e.g. 1), but when it is combined with tabID or groupID in a composite primary key, it becomes unique.
Here you can see that UserInput is created using only the basic ID, which makes it a problem to save both a Msg and an Action with id 1, for example.
Is there any way around this? Maybe a way to say that Action and Msg extend UserInput but define all the keys themselves?
2nd model:
Each Critical Point is related to either a Msg or an Action, but how can I define that when they have different sets of PK columns? I would like to keep referential integrity.
I would REALLY appreciate help on this issue.
Potential fix for 1st model
I do not understand why Action and Msg can have the same id. If you want to treat them both similarly (as UserInput) then the id of the UserInput table needs to be unique for them both. So each id of UserInput represents either an Action or a Msg.
I do not know if this is a good example, but let's say Action and Msg are similar to Car and Motorcycle. You still want to be able to uniquely identify them, so the id on the license plate should really be unique and thus should not exist in both groups.
Does the critical point need to know what it is used by?
If not, you just need a foreign key column "CriticalPointId" in your UserInput class. Because Action and Msg are subclasses, they can both access their CriticalPoint.
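A rough sketch of this fix, using Python's sqlite3 module. All table and column names are assumptions based on the class names in the question, and the tab_id and group_id columns are kept as plain attributes here rather than as key parts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
CREATE TABLE critical_point (
    critical_point_id INTEGER PRIMARY KEY,
    description       TEXT
);
CREATE TABLE user_input (              -- one id space shared by Msg and Action
    user_input_id     INTEGER PRIMARY KEY,
    critical_point_id INTEGER REFERENCES critical_point (critical_point_id)
);
CREATE TABLE msg (                     -- subtype: its PK is also an FK to user_input
    user_input_id INTEGER PRIMARY KEY REFERENCES user_input (user_input_id),
    tab_id        INTEGER NOT NULL
);
CREATE TABLE "action" (                -- quoted only because ACTION is an SQL keyword
    user_input_id INTEGER PRIMARY KEY REFERENCES user_input (user_input_id),
    group_id      INTEGER NOT NULL
);
""")

cp_id = conn.execute(
    "INSERT INTO critical_point (description) VALUES (?)", ("first checkpoint",)
).lastrowid

# Each Msg or Action first gets its own row, and so its own id, in user_input.
msg_id = conn.execute(
    "INSERT INTO user_input (critical_point_id) VALUES (?)", (cp_id,)
).lastrowid
conn.execute("INSERT INTO msg (user_input_id, tab_id) VALUES (?, ?)", (msg_id, 10))

action_id = conn.execute(
    "INSERT INTO user_input (critical_point_id) VALUES (?)", (cp_id,)
).lastrowid
conn.execute('INSERT INTO "action" (user_input_id, group_id) VALUES (?, ?)', (action_id, 20))

# An Action reaches its critical point through the shared parent row.
action_cp = conn.execute(
    'SELECT cp.description FROM "action" a '
    "JOIN user_input ui ON ui.user_input_id = a.user_input_id "
    "JOIN critical_point cp ON cp.critical_point_id = ui.critical_point_id "
    "WHERE a.user_input_id = ?",
    (action_id,),
).fetchone()[0]
```

Note that this sketch does not stop the same user_input row from being used by both a Msg and an Action; enforcing that would need an extra type column or a trigger.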
Potential fix for 2nd model
In this model you have unique ids in the Msg and Action table.
In that respect, it is very similar to my proposed fix for the first model, except for the fact that no UserInput table exists.
This might be the better solution if Msg and Action do not have anything in common (there are no properties in UserInput in the first model apart from the ID).
Supposing that the CriticalPoint does not need to know by what object it is used, you just need to specify a "CriticalPointId" foreign key column in both the Msg and the Action table.
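For illustration, the same idea in a small sqlite3 sketch. The composite keys (msg_id, tab_id) and (action_id, group_id) follow the description in the question; the exact column names and the insert_ok helper are my own assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
CREATE TABLE critical_point (
    critical_point_id INTEGER PRIMARY KEY,
    description       TEXT
);
CREATE TABLE msg (
    msg_id            INTEGER NOT NULL,
    tab_id            INTEGER NOT NULL,
    critical_point_id INTEGER REFERENCES critical_point (critical_point_id),
    PRIMARY KEY (msg_id, tab_id)
);
CREATE TABLE "action" (                -- quoted only because ACTION is an SQL keyword
    action_id         INTEGER NOT NULL,
    group_id          INTEGER NOT NULL,
    critical_point_id INTEGER REFERENCES critical_point (critical_point_id),
    PRIMARY KEY (action_id, group_id)
);
""")
conn.execute("INSERT INTO critical_point (critical_point_id, description) VALUES (1, 'checkpoint')")


def insert_ok(sql, params):
    """Run an insert and report whether the database accepted it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


# A Msg and an Action may both use the basic id 1, since they live in separate tables.
msg_inserted = insert_ok(
    "INSERT INTO msg (msg_id, tab_id, critical_point_id) VALUES (?, ?, ?)", (1, 10, 1))
action_inserted = insert_ok(
    'INSERT INTO "action" (action_id, group_id, critical_point_id) VALUES (?, ?, ?)', (1, 20, 1))

# Referential integrity: pointing at a critical point that does not exist is rejected.
dangling_inserted = insert_ok(
    "INSERT INTO msg (msg_id, tab_id, critical_point_id) VALUES (?, ?, ?)", (2, 10, 999))
```

The foreign key sits on the side that knows the relationship, so the critical_point table needs no knowledge of the different key shapes.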
I have the strong feeling that I can't see the wood for the trees, so I need your help.
Think of the following two tables:
create table Category (
  Category_Id integer primary key,
  Category_Desc nvarchar2(500)
);
create table Text (
  Text_Id integer primary key,
  Text nvarchar2(1000),
  Category_Id integer references Category (Category_Id)
);
This code is simplified, it's just to give an idea of the problem.
Consider the idea to save text parts for certain categories to use them in an interface, like messages ("You can't do that!", "Do this!",...), but also to create notes for other objects, e. g. like orders ("Important customer! Prioritize this order!").
Now for my question. Some of these text bits bring more information with them: if you add the "Important customer" note to an order, the Order.Prio_Flag is also set.
This is very specific information that only concerns text used by the category Order_Note. I don't want to add it to the Text table, since most of the entries are not affected by it and the table would get more and more crowded with special cases that apply to only a small part of its content.
I get the feeling the design is flawed, but I also don't want a table for every category, and I want to keep this as general as possible.
Keep in mind, this is a simplified view of the problem.
TL;DR: How do I add information to a table's content without adding new attributes, given that the new attribute would only be filled for a small number of entries?
Subtyping and dependent attributes are easy to do in a relational database. For example, if some Texts are important and need to have a dependent attribute (e.g. DisplayColor), you could add the following table to your schema:
CREATE TABLE ImportantText (
Text_Id integer NOT NULL ,
Display_Color integer NOT NULL ,
PRIMARY KEY (Text_Id),
CONSTRAINT ImportantTextSubtypeOfText
FOREIGN KEY (Text_Id) REFERENCES Text (Text_Id)
ON DELETE CASCADE ON UPDATE CASCADE
);
Many people think foreign key constraints establish relationships between entities. That's not what they're for. They're CONSTRAINTS, i.e. they limit the values in a column to be a subset of another column. In this way, a subtyping relation is established which can record additional properties.
In the table above, any element of ImportantText must be an element of Text, and will have all the attributes of Text (since it must be recorded in the Text table), as well as the additional attributes of ImportantText.
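To see the subtype table at work, here is a small sketch in Python's sqlite3 module. It reuses the table names from the question and from the DDL above; the sample rows and the color value are invented, and SQLite column types stand in for nvarchar2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
CREATE TABLE Category (
    Category_Id   INTEGER PRIMARY KEY,
    Category_Desc TEXT
);
CREATE TABLE "Text" (
    Text_Id     INTEGER PRIMARY KEY,
    "Text"      TEXT,
    Category_Id INTEGER REFERENCES Category (Category_Id)
);
CREATE TABLE ImportantText (
    Text_Id       INTEGER NOT NULL,
    Display_Color INTEGER NOT NULL,
    PRIMARY KEY (Text_Id),
    CONSTRAINT ImportantTextSubtypeOfText
        FOREIGN KEY (Text_Id) REFERENCES "Text" (Text_Id)
        ON DELETE CASCADE ON UPDATE CASCADE
);
""")

conn.execute("INSERT INTO Category VALUES (1, 'Message'), (2, 'Order_Note')")
conn.executemany(
    'INSERT INTO "Text" (Text_Id, "Text", Category_Id) VALUES (?, ?, ?)',
    [(1, "You can't do that!", 1),
     (2, "Important customer! Prioritize this order!", 2)],
)
# Only the special text gets a row in the subtype table.
conn.execute("INSERT INTO ImportantText (Text_Id, Display_Color) VALUES (2, 16711680)")

# Ordinary texts simply have no subtype row, so the extra attribute comes back as NULL.
rows = conn.execute(
    'SELECT t.Text_Id, t."Text", i.Display_Color '
    'FROM "Text" t LEFT JOIN ImportantText i ON i.Text_Id = t.Text_Id '
    "ORDER BY t.Text_Id"
).fetchall()

# Deleting the base row removes the dependent subtype row through ON DELETE CASCADE.
conn.execute('DELETE FROM "Text" WHERE Text_Id = 2')
remaining = conn.execute("SELECT COUNT(*) FROM ImportantText").fetchone()[0]
```

The Text table itself stays free of columns that only matter to a few rows, which is what the question asked for.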
I have a data schema similar to the following:
USERS:
id
name
email
phone number
...
PHOTOS:
id
width
height
filepath
...
I have an auditing table for any changes to the system
LOGS:
id
acting_user
date
record_type (enum: "users", "photos", "...")
record_id
record_field
new_value
Is there a name for this setup, where an enum in one of the fields refers to the name of one of the other tables, and record_type and record_id together effectively act as a foreign key to the record in the other table? Is this an anti-pattern? (Note: new_value and all the things we would be logging are the same data type, strings.)
"Is this an anti-pattern?"
Yes. Any pattern that makes you enforce referential integrity manually [1] is an anti-pattern.
Here is why using FOREIGN KEYs is so important and here is what to do in cases like yours.
"Is there a name for this setup where an enum in one of the fields refers to the name of one of the other tables?"
There is no standard term that I know of, but I have heard people call it "generic" or "polymorphic" FKs.
[1] As opposed to FOREIGN KEYs built into the DBMS.
Actually, I think 'Anti-Pattern' is a pretty good name for this set up, but it can be a realistic way to go - especially in this example.
I'll add a similar example with a new table which records LIKES of users, photos, etc., and show why it's bad. Then I'll explain why it might not be too bad for your LOGS example.
The LIKES table is:
Id
LikedByUserId
RecordType ("users", "photos", "...")
RecordId
This is pretty much the same as the LOGS table. The problem with this is that you cannot make RecordId a foreign key to the USERS table as well as to the PHOTOS table as well as to any other tables. If User 1234 is being liked, you couldn't insert the row unless there was also a PHOTO with ID 1234, and so on. For this reason, no RDBMS that I know of will let a single Foreign Key reference multiple Primary Keys - after all, Primary means 'only one' amongst other things.
So you'd have to create the LIKES table with no referential integrity. This may not be a bad thing sometimes, but in this case I think I'd want an important table such as LIKES to have valid entries.
To do LIKES properly, I would create the table as:
Id
LikedByUserId
UserId (allow null)
PhotoId (allow null)
OtherThingId (allow null)
...and create the appropriate foreign keys. This will actually make queries that read the data easier to read and maintain and probably more efficient too.
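A sketch of such a LIKES table in Python's sqlite3 module. Two things here are my own assumptions and not part of the description above: a separate liked_user_id column for the case where a user is the thing being liked, and a CHECK constraint that requires exactly one target per row. The try_like helper is also only for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
CREATE TABLE users  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE photos (id INTEGER PRIMARY KEY, filepath TEXT NOT NULL);
CREATE TABLE likes (
    id               INTEGER PRIMARY KEY,
    liked_by_user_id INTEGER NOT NULL REFERENCES users (id),
    liked_user_id    INTEGER REFERENCES users (id),    -- NULL unless a user is liked
    photo_id         INTEGER REFERENCES photos (id),   -- NULL unless a photo is liked
    -- assumption: every like has exactly one target
    CHECK ((liked_user_id IS NOT NULL) + (photo_id IS NOT NULL) = 1)
);
INSERT INTO users  (id, name)     VALUES (1, 'alice'), (2, 'bob');
INSERT INTO photos (id, filepath) VALUES (10, '/img/cat.jpg');
""")


def try_like(liked_by, liked_user=None, photo=None):
    """Insert a like and report whether the constraints accepted it."""
    try:
        conn.execute(
            "INSERT INTO likes (liked_by_user_id, liked_user_id, photo_id) VALUES (?, ?, ?)",
            (liked_by, liked_user, photo),
        )
        return True
    except sqlite3.IntegrityError:
        return False


photo_like_ok = try_like(1, photo=10)        # photo 10 exists
user_like_ok = try_like(1, liked_user=2)     # user 2 exists
missing_photo_ok = try_like(1, photo=999)    # rejected by the foreign key
no_target_ok = try_like(1)                   # rejected by the CHECK constraint
```

Each nullable column carries a real foreign key, so the database rejects likes that point at nothing without any manual checks in application code.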
However, a table like LOGS probably isn't central to the functionality of my system, and I'm only querying it ad hoc to check what's been happening. In that case I might not want to put in the extra effort and add the complexity that results in more efficient reading. I'm not sure I would actually skip it, though. It is an anti-pattern, but depending on usage it might be OK.
To emphasise the point, I would only do this if the system itself never queried the table; if the only people who look at the data are admins running ad-hoc queries against it, then it might be OK.
Cheers -
I'm building a small forum component for a website where sub-forums have different admins and mods responsible for them, and users can be banned from individual sub-forums. The ban table I have looks like this:
_Bantable_
user_id
group_id
start_date
end_date
banned_by
comment
At first I was going to use the first four columns as the primary key, but now I'm wondering if it would matter if I use one at all, since no-one would be banned at the same exact time from the same forum, and regardless I'd still have to check if they were already banned and during what interval. Should I just not use a key here, and simply create an index on the user_id, and group_id and search through those when needed?
It wasn't 100% clear, but it sounds like you want temporary ban functionality on a per user basis for a particular groupId. If this is the case, you should make a composite primary key:
user_id,
group_id,
end_date
This will let you do
SELECT * FROM bantable WHERE user_id=$currentUserToCheck AND group_id=$currentGroupToCheck AND end_date > $currentDate
or something like that
Note: if you want your primary key to be coherent in terms of whatever database design principle you're adhering to, then you can just make the primary key the user_id (because it is indeed a unique identifier), and then make a composite index on the three columns that I specified above.
Be absolutely sure that any queries you run against this table that require individual indexes have those indexes correctly generated.
Do you need the historical record of past bans?
If not, just create a composite PK on {user_id, group_id}. Whatever data is currently in the _Bantable_ determines who is currently banned. When the ban expires, just delete the corresponding row (and consider whether you need the end_date at all [1]).
If you do need the historic record, put an active ban into your original table as before, but when the ban expires, don't just delete it - instead move it into a separate "history" table, which would have a surrogate PK [2] independent from {user_id, group_id} (so the same user/group pair can be in multiple rows) and a trigger that prevents time overlaps (something like this).
[1] If this is the date at which the ban is going to end, then you do need it. If this is the date the ban has ended, then you don't - the row will be gone by then.
[2] Or alternatively, a PK on {user_id, group_id, start_date}.
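Here is a minimal sqlite3 sketch of the active table plus history table. The names ban, ban_history, is_banned and expire_bans are made up, dates are stored as ISO-8601 strings so they compare correctly as text, and the overlap-preventing trigger is left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ban (                         -- active bans only
    user_id    INTEGER NOT NULL,
    group_id   INTEGER NOT NULL,
    start_date TEXT NOT NULL,              -- ISO-8601, e.g. '2024-01-01'
    end_date   TEXT NOT NULL,
    banned_by  INTEGER NOT NULL,
    comment    TEXT,
    PRIMARY KEY (user_id, group_id)        -- at most one active ban per user and group
);
CREATE TABLE ban_history (                 -- expired bans, same pair may repeat
    ban_history_id INTEGER PRIMARY KEY,    -- surrogate key
    user_id    INTEGER NOT NULL,
    group_id   INTEGER NOT NULL,
    start_date TEXT NOT NULL,
    end_date   TEXT NOT NULL,
    banned_by  INTEGER NOT NULL,
    comment    TEXT
);
""")


def is_banned(conn, user_id, group_id, today):
    """True if the user has a ban in this group that covers the given date."""
    row = conn.execute(
        "SELECT 1 FROM ban WHERE user_id = ? AND group_id = ? "
        "AND start_date <= ? AND end_date > ?",
        (user_id, group_id, today, today),
    ).fetchone()
    return row is not None


def expire_bans(conn, today):
    """Move bans that have ended into the history table, in one transaction."""
    with conn:
        conn.execute(
            "INSERT INTO ban_history (user_id, group_id, start_date, end_date, banned_by, comment) "
            "SELECT user_id, group_id, start_date, end_date, banned_by, comment "
            "FROM ban WHERE end_date <= ?",
            (today,),
        )
        conn.execute("DELETE FROM ban WHERE end_date <= ?", (today,))


conn.execute("INSERT INTO ban VALUES (1, 2, '2024-01-01', '2024-02-01', 99, 'spam')")
banned_during = is_banned(conn, 1, 2, "2024-01-15")
expire_bans(conn, "2024-02-01")
banned_after = is_banned(conn, 1, 2, "2024-02-01")
history_count = conn.execute("SELECT COUNT(*) FROM ban_history").fetchone()[0]
```

A scheduled job could call something like expire_bans once a day; until it runs, the end_date check in is_banned still gives the right answer.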
Why don't you just take the user_id as the primary key? I mean, you don't even have to use auto_increment (which obviously would not make any sense here).
Guessing that you'd request the user_id anyway on login this would probably provide the best performance to look if there is even an entry for banning matters.
I am trying to implement a system on my website similar to Facebook's "Like" feature, where users can click a button that increments a counter. However, I have run into a problem with storing the data efficiently in my DB.
Each story has its own row in the stories table in my DB, with the columns like and users_like.
I want each person to only be able to like the story once. Therefore I need to somehow store data that shows that the user has, in fact, like++'d the post.
All I could think of was to have a column named users_like, append each user followed by a comma to that column using CONCAT, and then use the PHP explode function to split the data.
However, this method, as far as I know, is in the opposite direction of database normalization.
What is the best way to do this? I understand that "best" is subjective.
I cannot add a liked flag to the user table because there will be a vast number of stories the person could 'like.'
Thanks
You need a many to many table in your database that will store a foreign key to the stories table and a foreign key to the user table. You put a constraint on this table saying that the story fk - user fk combo must be unique.
You now don't even have to have a like column, you just count the number of rows in the many to many table corresponding to your story.
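As a rough illustration, here is that many-to-many table in Python's sqlite3 module. The names users, story_likes, like and like_count are assumptions; the uniqueness rule is expressed as a composite primary key, which has the same effect as a unique constraint on the pair.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
CREATE TABLE users   (id INTEGER PRIMARY KEY, name  TEXT NOT NULL);
CREATE TABLE stories (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE story_likes (
    story_id INTEGER NOT NULL REFERENCES stories (id),
    user_id  INTEGER NOT NULL REFERENCES users (id),
    PRIMARY KEY (story_id, user_id)       -- one like per user per story
);
INSERT INTO users   (id, name)  VALUES (1, 'alice'), (2, 'bob');
INSERT INTO stories (id, title) VALUES (100, 'First story');
""")


def like(conn, story_id, user_id):
    """Record a like; returns False if this user already liked the story."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO story_likes (story_id, user_id) VALUES (?, ?)",
        (story_id, user_id),
    )
    return cur.rowcount == 1


def like_count(conn, story_id):
    """The like counter is just the number of rows for the story."""
    return conn.execute(
        "SELECT COUNT(*) FROM story_likes WHERE story_id = ?", (story_id,)
    ).fetchone()[0]


first = like(conn, 100, 1)    # new like
repeat = like(conn, 100, 1)   # same user again, ignored
other = like(conn, 100, 2)    # a different user
total = like_count(conn, 100)
```

The database does the duplicate check, so there is no comma-separated column to parse and no separate counter to keep in sync.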