As mentioned in the title: is it possible to create a many-to-many relationship between two tables that belong to two different databases? If yes, how can I do that with PostgreSQL?
The standard way of using foreign key constraints to enforce referential integrity is only possible within the same database (not across databases of the same cluster). But you can operate across multiple schemas in the same database.
Other than that, you can create tables just the same way. And even join tables dynamically among remote databases using dblink or FDW. Referential integrity cannot be guaranteed across databases by the RDBMS, though.
It does not matter much whether the other DB is on the same physical machine or even in the same DB cluster; that just makes the connection faster and more secure.
Or you can replicate data to a common database and add standard constraints there.
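For the cross-database querying part, here is a minimal sketch using postgres_fdw; the server address, credentials, and table/column names are all illustrative:

```sql
-- Hypothetical sketch: expose a table from a second database (db2) inside
-- the current one via a foreign data wrapper.
CREATE EXTENSION IF NOT EXISTS postgres_fdw;

CREATE SERVER db2_server
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'localhost', dbname 'db2');

CREATE USER MAPPING FOR CURRENT_USER
    SERVER db2_server
    OPTIONS (user 'app', password 'secret');

CREATE FOREIGN TABLE tasks (
    task_id bigint,
    title   text
) SERVER db2_server OPTIONS (table_name 'tasks');

-- The remote table can now be joined like a local one:
SELECT u.name, t.title
FROM   users u
JOIN   users_to_tasks ut ON ut.user_id = u.user_id
JOIN   tasks t           ON t.task_id = ut.task_id;
```

Remember that no foreign key constraint can point at the foreign table; the join works, but integrity is not enforced.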
It should be possible, but as has been stated you cannot expect much in the way of referential integrity.
If you follow the standard design pattern of using a linking table, you can model a sort of many-to-many relationship.
DB1.dbo.Users has the USER_ID primary key
DB2.dbo.Tasks has the TASK_ID primary key
you could create a table on either DB1 or DB2 that is UsersToTasks
DB1.dbo.UsersToTasks
USER_ID - KEY
TASK_ID - KEY
This way, a unique pairing of USER_ID and TASK_ID is used as the key of that table. The only thing is you cannot create a foreign key to the table in the other database.
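The linking table could be sketched like this, assuming it lives in DB1, so only the USER_ID side can get a real foreign key (names taken from above):

```sql
-- Hypothetical sketch on DB1; no FK to the Tasks table in DB2 is possible.
CREATE TABLE UsersToTasks (
    USER_ID int NOT NULL REFERENCES Users (USER_ID),  -- local FK works
    TASK_ID int NOT NULL,                             -- cannot reference DB2
    PRIMARY KEY (USER_ID, TASK_ID)                    -- enforces unique pairing
);
```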
As a pseudo workaround, you could write a trigger on DB2.dbo.Task that would write the TASK_ID to DB1.dbo.TASK_IDS and link that as the foreign key on the linking table above. I'm not sure, but you could also potentially create a delete trigger that would remove the TASK_ID as well.
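That trigger could be sketched with the dblink extension; everything here (connection string, table and column names) is illustrative, and error handling is omitted:

```sql
-- Hypothetical sketch on DB2: after each insert into tasks, copy the new
-- TASK_ID into a shadow table on DB1 that the linking table can reference.
CREATE OR REPLACE FUNCTION copy_task_id() RETURNS trigger AS $$
BEGIN
    PERFORM dblink_exec(
        'dbname=db1',
        format('INSERT INTO task_ids (task_id) VALUES (%s)', NEW.task_id)
    );
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER tasks_copy_id
AFTER INSERT ON tasks
FOR EACH ROW EXECUTE FUNCTION copy_task_id();
```

A matching AFTER DELETE trigger would remove the row again. Note this is best-effort only: the two databases are separate transactions, so the copy can still drift out of sync.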
http://solaimurugan.blogspot.com/2010/08/cross-database-triggers-in-postgresql.html
Let's consider the tables:
CREATE TABLE [dbo].[User] (
[Id] INT PRIMARY KEY,
...
);
CREATE TABLE [dbo].[UserInfo_1] (
[Id] INT PRIMARY KEY,
[UserId] INT,
...,
CONSTRAINT FK_UserInfo_1_UserId FOREIGN KEY ([UserId]) REFERENCES [dbo].[User] (Id)
);
CREATE TABLE [dbo].[UserInfo_2] (
[UserId] INT PRIMARY KEY,
...,
CONSTRAINT FK_UserInfo_2_UserId FOREIGN KEY ([UserId]) REFERENCES [dbo].[User] (Id)
);
What are the pros and cons of using a FOREIGN KEY for the UserInfo_1 and UserInfo_2 tables? Also in terms of an ORM.
I don't think there would be a con to using foreign keys on any form of table. In fact, I make sure to have a primary key on all tables I use, especially on temp and variable tables, since I know I will be joining and filtering with them.
Your first pair, User and UserInfo_1, is a one-to-many relationship, meaning a single User can have many different UserInfo_1 rows associated with it.
The second pair, User and UserInfo_2, is a one-to-one relationship, in which a single User can only ever have one UserInfo_2 associated with it.
In terms of performance, since the key columns are indexed they would perform relatively the same, depending on your filtering and what plan was cached. You may not entirely run into issues with cached query plans, though, as EF uses ad hoc statements; EF does run up the cached-plan memory, and it is typically recommended to curb that caching when using EF.
One-to-One
I am a fan of the one-to-one relationship, especially from a Domain-Driven Design perspective, and when implemented correctly. If every row of User is going to require the information from UserInfo_2, then I would keep those columns on the User table. If you know you will not be querying that information much, or not all Users will need those columns, or your main table is fairly large, I would keep it as a one-to-one relationship.
I personally like to use system versioning. I have tables which contain certain columns that update frequently and columns that almost never update. Those that I know update on a daily/weekly basis I group into a one-to-one related table off the main table, which should almost never update. But each business's needs and scenarios are different; no one design fits all situations.
Benefits of Indexing
Foreign key columns should be indexed. (Note that, unlike a primary key, a foreign key constraint in SQL Server does not create an index automatically; you have to add one yourself.) An index allows you to perform faster queries: the optimizer can use it to seek directly to what you are looking for. Without the index, your query plan can turn into a table scan, a row-by-row search, which can seriously slow your system down as the table grows.
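A sketch of adding such an index explicitly, using the tables from the question (the index name is an assumption):

```sql
-- SQL Server does not index FK columns automatically, so create one
-- to support joins and FK checks on UserInfo_1.UserId.
CREATE INDEX IX_UserInfo_1_UserId ON [dbo].[UserInfo_1] ([UserId]);
```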
Should you choose to create your tables without an Index, or in this case a Foreign Key index, you might find issues with the ORM aspect. When you query your database from EF, you will call your DbSet. If you had proper Foreign Key connections with your two tables, EF can utilize the .Include to join the two tables searching for what you need. Otherwise you would be forced to utilize two queries into the database for both tables.
In a project I worked on one time, a developer did that. He did not properly attach a Foreign Key connection between two objects and then didn't understand why EF would not properly return his values when he used the .Include and wasn't very fast. He thought it was EF's fault and had to do two queries to obtain the information he needed.
Well User > UserInfo_1 is a one-to-many relationship, as UserInfo_1.UserID is not a key. And EF 6 doesn't support alternate keys. EF Core does, so you could make it a key.
But the simplest design is always to collapse 1-1 relationships into a single table. In EF Core you can still have a main Entity Type and one or more separate Owned Entity Types. But on the database it's typically better to have them in a single table.
The second-simplest is to have both tables share the same key columns.
I am creating a user management database schema, using PostgreSQL as the database. Following is my approach. Please suggest if there is any performance issue with this structure.
Requirement:
Expecting around millions of users in the future.
I have to use the unique user id on other systems as well, maybe MongoDB, Redis, etc.
Approach:
I am using pseudo_encrypt() as unique user_id (BIGINT or BIGSERIAL), so that no one can guess other ids. For example: 3898573529235304961
Using user_id as foreign key in another table. I am not using primary key of user table as foreign key.
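For reference, the 32-bit pseudo_encrypt as found on the PostgreSQL wiki looks like this (a BIGINT variant follows the same Feistel-network pattern):

```sql
-- Maps each int to a unique, hard-to-guess int (a bijective permutation).
CREATE OR REPLACE FUNCTION pseudo_encrypt(value int) RETURNS int AS $$
DECLARE
    l1 int; l2 int; r1 int; r2 int; i int := 0;
BEGIN
    l1 := (value >> 16) & 65535;
    r1 := value & 65535;
    WHILE i < 3 LOOP
        l2 := r1;
        r2 := l1 # ((((1366 * r1 + 150889) % 714025) / 714025.0) * 32767)::int;
        l1 := l2;
        r1 := r2;
        i := i + 1;
    END LOOP;
    RETURN ((r1 << 16) + l1);
END;
$$ LANGUAGE plpgsql STRICT IMMUTABLE;
```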
Any suggestions?
Is using the unique key as the foreign key everywhere in other tables correct?
Any performance issue during CRUD operations and with complex joins?
Is using the unique key in other databases the correct way (in a distributed environment)?
You are wading into flame war territory here over the question of natural vs surrogate primary keys. I agree with you and often use unique keys as foreign keys, and designate natural primary keys as such. On PostgreSQL this is safe (on MySQL or MS SQL it would be a bad habit though).
In PostgreSQL the only differences between primary keys and unique constraints are:
A table can have only one primary key
primary key columns are NOT NULL
In practice, a column defined as NOT NULL UNIQUE behaves just about the same as a single-column primary key.
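A minimal sketch of that point: a UNIQUE NOT NULL column can be the target of a foreign key in PostgreSQL, exactly like a primary key (table and column names are illustrative):

```sql
CREATE TABLE users (
    id      bigserial PRIMARY KEY,
    user_id bigint NOT NULL UNIQUE   -- e.g. filled via pseudo_encrypt(id)
);

-- The FK references the unique column, not the primary key:
CREATE TABLE orders (
    order_id bigserial PRIMARY KEY,
    user_id  bigint NOT NULL REFERENCES users (user_id)
);
```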
On other DBs, the table structure is often optimized for primary key lookups, which is why this is a problem there; and there may be tools that don't like it, but those are questions outside the realm of DB design per se.
You are better off using normal serials and real access controls than trying to build things on obscurity. Obscurity-based controls are likely to perform worse, and to be less secure, than just doing things right.
Very new to SQL Server. I have a DB with about 20 tables, each with around 40 columns. How can I select two tables and see if they have any columns in common?
I basically want to see where I can make joins.. If there's a better way of quickly telling where I can combine info from two tables that could be helpful too.
First of all, in relational databases there is no such concept as "joinable tables and/or columns". You can always combine two relations (= tables), crossing every row in one relation with each row of the other (their Cartesian product), and then filter the rows based on some predicate (also called a "join" if the predicate involves columns of both relations).
The idea of "joinable" tables/columns comes into being only when thinking about the database schema. The schema's author can ask the database engine to enforce some referential integrity, by means of foreign keys.
Now if your database schema is well done (that is, its author was kind/clever enough to put referential integrity all over the schema) you can have a clue of which tables are joinable (by which columns).
To find those foreign keys, for each table you can run sp_help 'databasename.tablename' (you can omit the databasename. part, if it is the current database).
This command will output some facts about the given table, like its columns (along with their datatypes, requiredness, ...), its indexes and so on. Somewhere near the end it will list foreign keys along with where (if ever) its primary key is imported as foreign key on other tables.
For each key imported as a foreign key in another table, you have a candidate predicate for a join.
Please note that this procedure will only work if the foreign keys are set correctly. If they aren't, you can fix your database schema (but to do this you must know already which tables are joinable anyway). Also it won't show you joinable tables on other databases (in the same or linked server).
This also won't work for views.
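Alternatively, the catalog can list shared column names directly. A sketch against INFORMATION_SCHEMA (the two table names are placeholders); matching names are only a hint, not proof that a join is meaningful:

```sql
-- Columns that appear (by name) in both tables.
SELECT a.COLUMN_NAME
FROM   INFORMATION_SCHEMA.COLUMNS a
JOIN   INFORMATION_SCHEMA.COLUMNS b
       ON b.COLUMN_NAME = a.COLUMN_NAME
WHERE  a.TABLE_NAME = 'TableOne'
  AND  b.TABLE_NAME = 'TableTwo';
```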
Try to see in the SQL Management Studio, in the database diagram, there you find the relations between tables.
I've read a lot of tips and tutorials about normalization but I still find it hard to understand how and when we need normalization. So right now I need to know if this database design for an electricity monitoring system needs to be normalized or not.
So far I have one table with fields:
monitor_id
appliance_name
brand
ampere
uptime
power_kWh
price_kWh
status (ON/OFF)
This monitoring system monitors multiple appliances (TV, Fridge, washing machine) separately.
So does it need to be normalized further? If so, how?
Honestly, you can get away without normalizing every database. Normalization is good if the database is part of a project that affects many people, or if there are performance issues and the database does OLTP. Database normalization in many ways boils down to having more tables, each with fewer columns. Denormalization means having fewer tables with more columns.
I've never seen a real database with only one table, but that's ok. Some people denormalize their database for reporting purposes. So it isn't always necessary to normalize a database.
How do you normalize it? You need to have a primary key (on a column that is unique or a combination of two or more columns that are unique in their combined form). You would need to create another table and have a foreign key relationship. A foreign key relationship is a pair of columns that exist in two or more tables. These columns need to share the same data type. These act as a map from one table to another. The tables are usually separated by real-world purpose.
For example, you could have a table with status, uptime and monitor_id. This would have a foreign key relationship to the monitor_id between the two tables. Your original table could then drop the uptime and status columns. You could have a third table with Brands, Models and the things that all models have in common (e.g., power_kWh, ampere, etc.). There could be a foreign key relationship to the first table based on model. Then the brand column could be eliminated (via the DDL command DROP) from the first table as this third table will have it relating from the model name.
To create new tables, you'll need to invoke a DDL command CREATE TABLE newTable with a foreign key on the column that will in effect be shared by the new table and the original table. With foreign key constraints, the new tables will share a column. The tables will have less information in them (fewer columns) when they are highly normalized. But there will be more tables to accommodate and store all the data. This way you can update one table and not put a lock on all the other columns in a denormalized database with one big table.
Once the new tables have the data in the column or columns from the original table, you can drop those columns from the original table (except for the foreign key column). To drop columns, you invoke a DDL command (e.g., ALTER TABLE originalTable DROP COLUMN brand).
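The split described above could be sketched like this; the original table is assumed to be named appliance_monitor with monitor_id as its key, and the column types are guesses:

```sql
-- Move the frequently changing columns into their own table.
CREATE TABLE monitor_status (
    monitor_id int NOT NULL REFERENCES appliance_monitor (monitor_id),
    status     varchar(3) NOT NULL,   -- ON/OFF
    uptime     interval
);

-- Then drop the moved columns from the original table.
ALTER TABLE appliance_monitor
    DROP COLUMN status,
    DROP COLUMN uptime;
```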
In many ways, performance will be improved by normalization if you do many reads and writes (commit many transactions) against the table. But if you use the table for reporting and want to present all the data as it would normally sit in one table, a normalized database will hurt that performance.
By the way, normalizing the database can prevent redundant data. This can make the database consume less storage space and use less memory.
It is nice to have our database normalized. It helps us keep the data efficient, because we can prevent redundancy and save storage. When normalizing tables we need a primary key in each table, and we use it to connect to another table; when the primary key (unique in each table) appears in another table it is called a foreign key (used to connect the tables).
For example, say you already have this table:
Table name : appliances_tbl
-inside here you have
-appliance_id : as the primary key
-appliance_name
-brand
-model
and so on about this appliances...
Next you have another table :
Table name : appliance_info_tbl (anything for a table name and must be related to its fields)
-appliance_info_id : primary key
-appliance_price
-appliance_uptime
-appliance_description
-appliance_id : foreign key (so you can get the name of the appliance by using only its id)
and so on....
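The sketch above as actual DDL; the column types are assumptions:

```sql
CREATE TABLE appliances_tbl (
    appliance_id   int PRIMARY KEY,
    appliance_name varchar(100),
    brand          varchar(100),
    model          varchar(100)
);

CREATE TABLE appliance_info_tbl (
    appliance_info_id     int PRIMARY KEY,
    appliance_price       numeric(10,2),
    appliance_uptime      interval,
    appliance_description text,
    appliance_id          int REFERENCES appliances_tbl (appliance_id)
);
```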
You can add more tables like that; just make sure that you have a primary key in each table. You can also note the cardinality to make your normalization more understandable.
Hi I've set up two very basic tables. One table will act as a look up, with an identity field as a primary key. The other table uses the look up ID as a foreign key.
I have created a relationship constraint so now I cannot delete from the look up if the foreign key is used in the "main" table.
However, my issue is I can add a record with a foreign key that doesn't exist.
To my way of thinking this shouldn't be allowed, can anyone tell me what setting I need to use to enforce this and whether this is typical database design or not?
Thanks Dave
Your way of thinking is correct. Good database design provides some way of enforcing what is called "referential integrity". This is simply a buzzword for the concept you have derived on your own: namely, that a foreign key should be rejected if it refers to a nonexistent row. For a general discussion of referential integrity, see the following Wikipedia article; it's short.
http://en.wikipedia.org/wiki/Referential_integrity
Some programmers would like to enforce referential integrity inside their programs. In general, it's a much better plan to define a referential integrity constraint inside the database and let the DBMS do the enforcement. It's easier, it's faster, and it's more effective.
The SQL Data Definition Language (DDL) provides a way to declare a foreign key constraint when you create a table. The syntax differs a little between different dialects of SQL, but it's basically the same idea in all of them. Here's a capsule summary.
http://www.w3schools.com/sql/sql_foreignkey.asp
The documentation for SQL Server should have a description of the referential integrity constraint under the CREATE TABLE command.
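A capsule sketch in SQL Server syntax, matching your lookup/main setup (all names are illustrative). With the constraint declared this way, an INSERT into Main whose LookupId has no match in Lookup is rejected, and a DELETE from Lookup that is still referenced fails:

```sql
CREATE TABLE Lookup (
    LookupId int IDENTITY(1,1) PRIMARY KEY,
    Name     varchar(50) NOT NULL
);

CREATE TABLE Main (
    MainId   int IDENTITY(1,1) PRIMARY KEY,
    LookupId int NOT NULL,
    CONSTRAINT FK_Main_Lookup FOREIGN KEY (LookupId)
        REFERENCES Lookup (LookupId)
);
```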