Circular foreign keys in PostgreSQL - database

I'm having a bit of an issue with circular foreign keys. I have a table posts with a uid column that is a BIGINT NOT NULL PRIMARY KEY, and I need to reference that in the last_seen_post column of my users table. The problem is that there's a sender_id column in posts that references users(user_id), meaning that I can't declare both foreign keys in the CREATE TABLE statements: whichever table is created first would reference a table that does not exist yet. Any fixes other than to drop the foreign key on the last_seen_post column?
I would really rather avoid dropping either foreign key, because the data integrity for this backend is mission critical. I know I could probably make another table to achieve this, but I was really hoping to get away with this, since it's way cheaper in terms of space and search time.
For context, the last_seen_post column is supposed to help the backend figure out what the user's "feed" is, i.e. the collection of posts from users they follow that they have not downloaded yet.

Alright, I figured it out, so I'm gonna leave the solution here if others need it in the future. The easiest thing to do is to create the column without the foreign key constraint and then do ALTER TABLE yourtbl ADD FOREIGN KEY (yourcolumn) REFERENCES othertbl(othercolumn); once both tables have been created!
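A minimal PostgreSQL sketch of that approach follows. The table and column names come from the question; the column types, the nullable last_seen_post, the constraint name and the sample rows are assumptions made for illustration.

```sql
-- users is created first, so last_seen_post starts out without a foreign key.
-- It is left nullable (an assumption) so a user can exist before any post does.
CREATE TABLE users (
    user_id        BIGINT NOT NULL PRIMARY KEY,
    last_seen_post BIGINT
);

-- posts can reference users immediately because users already exists.
CREATE TABLE posts (
    uid       BIGINT NOT NULL PRIMARY KEY,
    sender_id BIGINT NOT NULL REFERENCES users (user_id)
);

-- Both tables now exist, so the second half of the cycle can be added.
-- The constraint name is hypothetical.
ALTER TABLE users
    ADD CONSTRAINT users_last_seen_post_fkey
    FOREIGN KEY (last_seen_post) REFERENCES posts (uid);

-- Rows are inserted in an order that never points at a missing row.
INSERT INTO users (user_id) VALUES (1);
INSERT INTO posts (uid, sender_id) VALUES (100, 1);
UPDATE users SET last_seen_post = 100 WHERE user_id = 1;
```

If last_seen_post had to be NOT NULL, the insert order above would not work as written; declaring one of the constraints DEFERRABLE INITIALLY DEFERRED and inserting both rows in one transaction is one way PostgreSQL can handle that case.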

Related

ForeignKey column comes from a lot of tables

So I have a table with a column named parentKey. This column actually holds keys (which by definition are foreign keys) to MANY other tables (at least 4). It seems strange to me to even create a column like this, and I haven't seen a table built this way before, because you can't add a foreign key constraint when the column doesn't link to one single table. So I don't know if this should be allowed to exist. It's there and it has been created, but I'm not sure whether I should leave it like this.
My idea is to create a column for each of the possible tables, name each one accordingly (MyTable1Key, MyTable2Key, and so on) and make them foreign keys. The problem is that when one of the foreign keys is assigned, the others will be null (and they will never be assigned, so they will always stay null).
So should I leave this parentKey column as it is, or should I split it into separate columns linked to their tables by foreign keys and accept null values in some columns?
Unless you have a good reason, do not combine multiple foreign keys into a single column. As you've already noted, it removes the referential integrity that a foreign key would give you.
Either you will risk having a key which could belong to two tables or you have a master table somewhere that you should use as your foreign key reference. It is possible to have a primary key as a foreign key.
It sounds like you may be looking at the supertype-subtype pattern, in which case this question might give you some good ideas: How do I apply subtypes into an SQL Server database?
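The one-column-per-parent idea from the question can be sketched as below, written in PostgreSQL-style SQL. All table, column and constraint names are hypothetical, only two parent tables are shown, and the CHECK constraint is an addition that is not in the original discussion: it is one possible way to make sure exactly one of the nullable keys is filled in.

```sql
-- Two hypothetical parent tables.
CREATE TABLE my_table1 (id INT NOT NULL PRIMARY KEY);
CREATE TABLE my_table2 (id INT NOT NULL PRIMARY KEY);

-- One nullable foreign key column per possible parent,
-- instead of a single untyped parentKey column.
CREATE TABLE child_rows (
    id            INT NOT NULL PRIMARY KEY,
    my_table1_key INT NULL REFERENCES my_table1 (id),
    my_table2_key INT NULL REFERENCES my_table2 (id),
    -- Assumed rule: every row belongs to exactly one parent.
    CONSTRAINT exactly_one_parent CHECK (
        (my_table1_key IS NOT NULL AND my_table2_key IS NULL)
        OR (my_table1_key IS NULL AND my_table2_key IS NOT NULL)
    )
);
```

With four or more parents the CHECK expression grows quickly, which is part of why a shared supertype table can be the cleaner design.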

Should GUID be used as a foreign key to many tables

I designed a database to use GUIDs for the UserID but I added UserId as a foreign key to many other tables, and as I plan to have a very large number of database inputs I must design this well.
I also use ASP Membership tables (not profiles only membership, users and roles).
So currently I use a GUID as the PK and as the FK in every other table. Is this maybe bad design?
What I think may be better, and what prompted my question: should I add UserId(int) to the Users table as the primary key, use that field as the foreign key in other tables, and use the GUID UserId only to reference aspnet_membership?
aspnet_membership{UserID(uniqueIdentifier)}
Users{
UserID_FK(uniqueIdentifier) // FK to aspnet_membership table
UserID(int) // primary key in this table --> Should I add this
...
}
In other tables user can create items in tables and I always must add UserId for example:
TableA{TableA_PK(int)..., CreatedBy(uniqueIdentifier)}
TableB{TableB_PK(int)..., CreatedBy(uniqueIdentifier)}
TableC{TableC_PK(int)..., CreatedBy(uniqueIdentifier)}
...
Ultimately the answer is that it really depends.
Microsoft have documented the performance differences of each here. While the article differs slightly from your situation, as you have to use a UNIQUEIDENTIFIER to link back to asp membership, many of the discussion points still apply.
If you have to create your own users table anyway it would make more sense to have your own int primary key, and use the GUID as a foreign key. It keeps separate entities separate. What if at some point in the future you wanted to add a different membership to a user? You would then need to update a primary key, which would have to cascade to any tables referencing this and could be quite a performance hit. If it is just a unique column in your users table it is a simple update.
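A rough T-SQL sketch of that layout follows. The table and column names are taken from the question; the IDENTITY settings, the UNIQUE constraint on the GUID column and the stand-in definition of aspnet_membership are assumptions for illustration only.

```sql
-- Stand-in for the existing ASP.NET membership table so the sketch runs on its own.
-- Skip this statement where the real table is already present.
CREATE TABLE aspnet_membership (
    UserID UNIQUEIDENTIFIER NOT NULL PRIMARY KEY
);

-- Own users table: int primary key, GUID kept only as a unique link to membership.
CREATE TABLE Users (
    UserID    INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    UserID_FK UNIQUEIDENTIFIER NOT NULL UNIQUE
        REFERENCES aspnet_membership (UserID)
);

-- Other tables reference the int key rather than the GUID.
CREATE TABLE TableA (
    TableA_PK INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    CreatedBy INT NOT NULL REFERENCES Users (UserID)
);
```

In this arrangement, pointing a user at a different membership row changes only Users.UserID_FK, and nothing in TableA has to be touched.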
All binary datatypes (uniqueidentifier is a 16-byte value) are fine as foreign keys. But a GUID might cause another small problem as a primary key. By default the primary key is clustered, and GUIDs are generated randomly rather than sequentially like IDENTITY values, so inserts will frequently split pages, which degrades performance a little.
I'd go ahead with your current design.
If you're worried about inner join performance in your queries because they compare strings instead of ints, nowadays there is no difference at all.

MySQL Foreign keys: should i set it up?

Do I need to set up foreign keys in this situation?
I'm weak at database design, especially in MySQL. If I want to set up foreign keys for these tables, how should I define them? If a person is deleted, every row that refers to that people_id should be deleted along with it. Is that possible to set up when there are so many tables?
Thanks for any replies.
Yes. Foreign key constraints enforce referential integrity, a key tenet of ensuring that your data is reliable and of high quality. Otherwise, your people_address table could reference a people_id value that doesn't exist in the people table, and would be an orphan. A foreign key constraint would prevent that from happening.
So, just do it. There's really no good reason not to.
Define foreign keys such as the following on the people_email table:
ALTER TABLE people_email ADD CONSTRAINT FOREIGN KEY (people_id) REFERENCES people (id) ON DELETE CASCADE;
This will mean that you cannot enter a record in people_email where the people_id in that table does not exist in people. Also, if you delete the parent row in people, the rows referencing it in people_email will get automatically deleted.
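To see both effects end to end, here is a small MySQL sketch. The people and people_email tables and the people_id and id columns come from the answer; the remaining columns, the constraint name and the sample data are made up for illustration, and InnoDB is assumed because it is the engine that enforces foreign keys.

```sql
CREATE TABLE people (
    id   INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(100) NOT NULL
) ENGINE=InnoDB;

CREATE TABLE people_email (
    id        INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    people_id INT NOT NULL,
    email     VARCHAR(255) NOT NULL,
    -- Hypothetical constraint name; same rule as the ALTER TABLE statement above.
    CONSTRAINT fk_people_email_people
        FOREIGN KEY (people_id) REFERENCES people (id)
        ON DELETE CASCADE
) ENGINE=InnoDB;

INSERT INTO people (id, name) VALUES (1, 'Alice');
INSERT INTO people_email (people_id, email) VALUES (1, 'alice@example.com');

-- This insert would be rejected, because no row in people has id 99:
-- INSERT INTO people_email (people_id, email) VALUES (99, 'ghost@example.com');

-- Deleting the parent row also removes its people_email rows via the cascade.
DELETE FROM people WHERE id = 1;
SELECT COUNT(*) AS remaining_emails FROM people_email;
```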
I personally prefer to manually delete all the rows from the child tables and not use cascade deletes though. It's a bit of extra app dev work, but it makes me feel safer and also allows me some control over locking and ensuring that queries are as efficient as possible.

Should a database table always have primary keys?

Should I always have a primary key in my database tables?
Let's take the SO tagging. You can see the tags in any revision, so they are likely to be in a tag_rev table with the postID and revision number. Would I need a PK for that?
Also, since it is in a revision table and the tags there are not currently in use, should the tags be stored as a blob of tagIDs instead of multiple post_id/tagid pairs?
A table should have a primary key so that you could identify each row uniquely with it.
Technically, you can have tables without a primary key, but you'll be breaking good database design rules.
You should strive to have a primary key in any non-trivial table where you're likely to want to access (or update or delete) individual records by that key. Primary keys can consist of multiple columns, and formally speaking, will be the shortest available superkey; that is, the shortest available group of columns which, together, uniquely identify any row.
I don't know what the Stack Overflow database schema looks like (and from some of the things I've read on Jeff's blog, I don't want to), but in the situation you describe, it's entirely possible there is a primary key across the post identifier, revision number and tag value; certainly, that would be the shortest (and only) superkey available.
With regards to your second point, while it may be reasonable to argue in favour of aggregating values in archive tables, it does go against the principle that each row/column intersection in a table ought to contain one single value. While it may slightly simplify development, there is no reason you can't keep to a normalised table with versioned metadata, even for something as trivial as tags.
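As an illustration of such a composite key, a normalised revision table might look like the sketch below. The table name tag_rev comes from the question; the column names and types, and the use of a tag_id rather than the tag text, are assumptions, and nothing here reflects the real Stack Overflow schema.

```sql
-- One row per (post, revision, tag); the three columns together form the key.
CREATE TABLE tag_rev (
    post_id  INT NOT NULL,
    revision INT NOT NULL,
    tag_id   INT NOT NULL,
    PRIMARY KEY (post_id, revision, tag_id)
);

-- Revision 1 of post 42 carries two tags; revision 2 keeps only one of them.
INSERT INTO tag_rev (post_id, revision, tag_id) VALUES (42, 1, 7);
INSERT INTO tag_rev (post_id, revision, tag_id) VALUES (42, 1, 8);
INSERT INTO tag_rev (post_id, revision, tag_id) VALUES (42, 2, 7);

-- Repeating an existing triple would violate the primary key and be rejected:
-- INSERT INTO tag_rev (post_id, revision, tag_id) VALUES (42, 1, 7);
```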
I tend to agree that most tables should have a primary key. I can only think of two times where it doesn't make sense to do it.
If you have a table that relates keys to other keys. For example, to relate a user_id to an answer_id, that table wouldn't need a primary key.
A logging table, whose only real purpose is to create an audit trail.
Basically, if you are writing a table that may ever need to be referenced in a foreign key relationship then a primary key is important, and if you can't be positive it won't be, then just add the PK. :)
See this related question about whether an integer primary key is required. One of the answers uses tagging as an example:
Are there any good reasons to have a database table without an integer primary key
For more discussion of tagging and keys, see this question:
Id for tags in tag systems
From MySQL 5.5 Reference Manual section 13.1.17:
If you do not have a PRIMARY KEY and an application asks for the PRIMARY KEY in your tables, MySQL returns the first UNIQUE index that has no NULL columns as the PRIMARY KEY.
So, technically, the answer is no. However, as others have stated, in most cases it is quite useful.
I firmly believe every table should have a way to uniquely identify a record. For 99% of the tables, this is a primary key. For the rest you may get away with a unique index (I'm thinking of one-column lookup-type tables here). Any time I have had to work with a table without a way to uniquely identify records, there has been trouble.
I also believe that if you are using surrogate keys as your PK, you should, where at all possible, have a separate unique index on whatever combination of fields makes up the natural key. I realize there are all too many times when you don't have a true natural key (names are not unique, or what makes something unique might be spread across several parent/child tables), but if you do have one, please please please make sure it has a unique index or is created as the PK.
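A small sketch of that pairing follows, using a hypothetical countries table in which the ISO code plays the role of the natural key. The table, its columns and the constraint name are all invented for illustration.

```sql
CREATE TABLE countries (
    country_id INT NOT NULL PRIMARY KEY,   -- surrogate key used by foreign keys
    iso_code   CHAR(2) NOT NULL,           -- natural key
    name       VARCHAR(100) NOT NULL,
    -- The natural key still gets its own uniqueness guarantee.
    CONSTRAINT uq_countries_iso_code UNIQUE (iso_code)
);

INSERT INTO countries (country_id, iso_code, name) VALUES (1, 'NO', 'Norway');

-- A second row with the same iso_code would be rejected even though
-- its surrogate key is different:
-- INSERT INTO countries (country_id, iso_code, name) VALUES (2, 'NO', 'Norge');
```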
If there is no PK, how will you update or delete a single row? It would be impossible! To be honest, I have a few times used tables without a PK, for instance to store activity logs, but even in this case it is advisable to have one because the timestamps might not be granular enough. Temporary tables are another example. But according to relational theory the PK is mandatory.
It is good to have keys and relationships; they help a lot. However, if your app is good enough to handle the relationships itself, then you could possibly skip the keys (although I recommend that you have them).
Since I use Subsonic, I always create a primary key for all of my tables. Many DB Abstraction libraries require a primary key to work.
Note: that doesn't answer the "Grand Unified Theory" tone of your question, but I'm just saying that in practice, sometimes you MUST make a primary key for every table.
If it's a join table then I wouldn't say that you need a primary key. Suppose, for example, that you have tables PERSONS, SICKPEOPLE, and ILLNESSES. The ILLNESSES table has things like flu, cold, etc., each with a primary key. PERSONS has the usual stuff about people, each also with a primary key. The SICKPEOPLE table only has people in it who are sick, and it has two columns, PERSONID and ILLNESSID, foreign keys back to their respective tables, and no primary key. The PERSONS and ILLNESSES tables contain entities and entities get primary keys. The entries in the SICKPEOPLE table aren't entities and don't get primary keys.
Databases don't have keys, per se, but their constituent tables might. I assume you mean that, but just in case...
Anyway, tables with a large number of rows should absolutely have primary keys; tables with only a few rows don't need them, necessarily, though they don't hurt. It depends upon the usage and the size of the table. Purists will put primary keys in every table. This is not wrong; and neither is omitting PKs in small tables.
Edited to add a link to my blog entry on this question, in which I discuss a case in which database administration staff did not consider it necessary to include a primary key in a particular table. I think this illustrates my point adequately.
Cyberherbalist's Blog Post on Primary Keys

Delete a table referenced by foreign keys

What is the best way to delete a table referenced by foreign keys?
Is the intended goal to orphan those records and never use the foreign key again? If so, the method mentioned before about disabling the key is fine. Otherwise you may want to first delete the records referencing the table you want to delete (or update them to point to a more appropriate record, or to NULL if that makes sense in this case). I seem to be coming at this from a different direction than others: are you sure the foreign key is pointless, and if so, why not just remove it? At some point someone wanted to constrain this behavior, so before just disabling constraints I make sure I understand their purpose and have a good justification for bypassing those safeguards.
Remove the foreign key constraint and then delete the table once no-one is forced to recognize it. If the column in the second table (the one not being deleted) is not used elsewhere, then you should probably delete the whole column after removing the constraint.
You need to remove the constraint before you're allowed to delete the table referenced by it. SQL Server uses the following syntax (MySQL uses DROP FOREIGN KEY <foreignkey_name> in place of DROP CONSTRAINT):
ALTER TABLE <table_name> DROP CONSTRAINT <foreignkey_name>
Keep in mind that the constraint exists on the table that references the one you want to delete so that's the table you should be altering.
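The order of operations can be sketched as follows. All table and constraint names are hypothetical, and the statements use the DROP CONSTRAINT form accepted by SQL Server and PostgreSQL; older MySQL versions would need DROP FOREIGN KEY instead.

```sql
-- parent_items is the table to be dropped; child_items references it.
CREATE TABLE parent_items (
    id INT NOT NULL PRIMARY KEY
);

CREATE TABLE child_items (
    id        INT NOT NULL PRIMARY KEY,
    parent_id INT NULL,
    CONSTRAINT fk_child_parent
        FOREIGN KEY (parent_id) REFERENCES parent_items (id)
);

-- Dropping parent_items at this point fails, because fk_child_parent depends on it:
-- DROP TABLE parent_items;

-- The constraint lives on the referencing table, so that is the table to alter.
ALTER TABLE child_items DROP CONSTRAINT fk_child_parent;

-- With the constraint gone, the referenced table can be dropped.
DROP TABLE parent_items;
```

After these statements child_items.parent_id still holds whatever ids it held before, but nothing guarantees they refer to anything.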
Do NOT delete a table with foreign key constraints without considering the impact on the foreign key tables. Let me explain the impact of simply deleting the foreign key and then the table with an example.
Consider two tables - parts and orderdetails. There is a foreign key constraint that says a part must exist before it can be put into the orderdetails table. What is stored in the orderdetail table is the id for the part from the parts table, not the part name or description. Suppose you drop the foreign key and then drop the parts table. Now all the data in the orderdetail table is totally useless because you have no way of knowing what the part ordered was. This would include orders not yet shipped and orders that the customer might call and ask questions about. Further you now have no way to recreate that data except by restoring a backup (hope you have one).
Further, suppose you want to drop the table and recreate it to make a change to the table, then reload the information and put the foreign key back on. In this case you should probably use ALTER TABLE instead of drop and recreate, but if you don't, you may end up with id numbers that are not the same as they were originally, and thus the orders will now reference the wrong ids. This can be done safely, but you would have to do it very carefully and with a lot of thought as to the consequences.
By using ON DELETE CASCADE.
