How can I use arrays of references in SQLite? - arrays

I am implementing a system to represent a school schedule in SQL, and I want to have a table called Student which includes all of the student's classes. do i need to include references to a Class table as attributes class1,class2,class3,...,class12
or can I use a sort of array?

Since you are using relational database, it would be good to make a m:n relationship between Student and Class table. It would mean that you will have Student table with primary key student_id, Class table with primary key class_id, and one more table, called StudentClass with foreign keys fk_student_id and fk_class_id, plus some additional properties (depending on what do you want to achieve). That would be a good relational design.

You could have a field filled with a comma separated list, or you could keep a separate table of 'allowed classes', with associated data (unique ID number, name, description, teacher), then use foreign keys and an intermediate table to implement a many to many relationship of students to classes.
Many to many relationship
Foreign keys in SQLite
Support for foreign keys in SQLite is pretty good these days, and all the features you'll likely want are there.

Related

DataBase Analysis-junction table

This matter confuses me,
I have a College Information system the junction table between students table and subjects(curriculum) table, the primary key is composite key (StudentID, SubjectID) and both of them are Foreign keys but the student may be fail in exam and repeat the subject so we will have duplicate PK and we need to record all data. I have two ways to solve this matter but i don't know the best way?
Add new column as primary Key instead of composite key.
Join to the composite key Season Column and year column and the composite key will be(StudentID, SubjectID, Season, Year). I have to mention that i don't need this composite key as foreign key.
Which way is better for performance and DB integrity?
Subject and exam are separate (if related) concepts, so you should not try to represent them within the same table. Also, the fact that an exam has been held for the given subject is separate from the fact that any particular student took that exam. Split all these concepts into their own tables, and the model becomes more natural, for example:
Representing a student that took the same exam several times is just a matter of adding multiple rows to the STUDENT_EXAM table.
NOTE: STUDENT_SUBJECT just records the fact that the student has enrolled in the subject, but not when (which year/semester). Keeping semester-specific information may require additional tables and more complicated relationships within the model.
NOTE: There is a diamond-shaped dependency in this model. Since SUBJECT_ID was passed from the "top" (SUBJECT), down both "sides" (STUDENT_SUBJECT, EXAM) and then merged at the "bottom" (STUDENT_EXAM) of the diamond, a student cannot take an exam on a subject (s)he has not enrolled in.

Entity Framework won't resolve PK-FK relationship with composite primary key?

I'm trying to set up an EDM on an existing SQL Server infrastructure, and came across a problem.
The EDM will not resolve a PK-FK relationship to a composite foreign key.
My DB table structure looks something like this (names changed to protect the innocent):
I have a PERSONS table containing an INT column called PerID (PK)
I have an OFFICE table containing an INT column called OffID (PK)
I am tying these tables together using a table called OFFICEPERSONS, creating a many-to-many relationship between PERSONS and OFFICE. This table has two INT columns, PerID and OffID, which together form a composite primary key.
I have a table called OFFICELOCATION that contains two INT columns, LocID and OffID. These two columns comprise a composite primary key. Additionally, OffID is also a FK to the OFFICE table.
Finally, I have a table called OFFICEPERSONSLOCATION. This table has three INT columns: PerID, OffID, and LocID. All three columns comprise a composite primary key. LocID and OffID provide a FK relationship to OFFICELOCATION, and OffID and PerID provide a FK relationship to OFFICEPERSONS.
With me so far? Hopefully, I haven't lost you yet. When all is said and done, my structure looks like this:
This structure works great in SQL Server. In EDM? Not so much. It will NOT allow me to construct the relation between OFFICEPERSONSLOCATION and OFFICEPERSONS. I get the following error:
Error 6037: Foreign key constraint 'FK_OFFICEPERSONSLOCATION_OFFICEPERSONS' has been omitted from the storage model. Column 'OffID' of table 'Model.Store.OFFICEPERSONSLOCATION' is a foreign key participating in multiple relationships. A one-to-one Entity Model will not validate since data inconsistency is possible.
Huh? Data inconsistency?!? How?
How do I get my entity framework to recognize this?
I agree that it is the entity framework's problem, and the problem is stupid. Even if you have the UPDATE CASCADE to "no action", it is not like you could create an inconsistency, but no, it claims that you can somehow.
In any case, in this situation, if you are willing to use surrogate keys instead of composite keys, you can get around this, because the only place to change the ID reference is in the main table.
In this case, OffID could be "inconsistent", but by using ID's in the OFFICEPERSONS and OFFICELOCATIONS tables (and therefore reference in OFFICEPERSONSLOCATION), you are forced to have the OffId managed in its primary table.

Database design - defining a basic many-to-one relationship

This is a basic database design question. I want a table (or multiple tables) defining relationships between customers. I want it so PrimaryCustomer can be linked to multiple SecondaryCustomers, and can have many SecondaryCustomers with the same relationship.
PrimaryCustomerID RelationshipID SecondaryCustomerID
1) If the primary key is {PrimaryCustomerID} then I can only have one linked customer of any kind.
2) If the primary key is {PrimaryCustomerID, RelationshipID}, then I can only have one linked customer for each relationship type.
3) If the primary key is {PrimaryCustomerID, RelationshipID, SecondaryCustomerID}, then I can have whatever I like, but having all columns as the primary key seems completely wrong.
What's the right way to set things up?
A third alternative might be for the key to be (PrimaryCustomerId, SecondaryCustomerId), which would make sense if only one type of relationship is permitted per pair of customers. What keys to implement should be defined by what dependencies you need to represent in the table so that the table accurately represents the reality you are modelling. There's nothing wrong in principle with compound keys or all-key tables.
Number 3 is the right way to go for this data model. Linking tables often have all the columns in a join as all they do is link to other tables.
If a customer can only be linked to one primary customer then you can use a simple recursive relationship in the customer table itself.
CustomerID as PK
PrimaryCustomerID as FK to CustomerID
Nothing wrong with No 3.
If you need to prevent reverse-relationship duplicates, you can use
ALTER TABLE CustomerRelationship
ADD CONSTRAINT chk_id CHECK (PrimaryCustomerId < SecondaryCustomerId);

Is there an answer matrix I can use to decide if I need a foreign key or not?

For example, I have a table that stores classes, and a table that stores class_attributes. class_attributes has a class_attribute_id and a class_id, while classes has a class_id.
I'd guess if a dataset is "a solely child of" or "belongs solely to" or "is solely owned by", then I need a FK to identify the parent. Without class_id in the class_attributes table I could never find out to which class this attribute belongs to.
Maybe there's an helpful answer matrix for this?
Wikipedia is helpful.
In the context of relational
databases, a foreign key is a
referential constraint between two
tables.1 The foreign key identifies
a column or a set of columns in one
(referencing) table that refers to a
column or set of columns in another
(referenced) table. The columns in the
referencing table must be the primary
key or other candidate key in the
referenced table.
(and it goes on into more and more detail)
If you want to enforce the constraint that each row in class_attributes applies to exactly one row of classes, you need a foreign key. If you don't care about enforcing this (ie, you're fine to have attributes for non-existent classes), you don't need an FK.
I don't have an answer matrix, but just for clarification purposes, we're talking about Database Normalization:
http://en.wikipedia.org/wiki/Database_normalization
And to a certain extent Denormalization:
http://en.wikipedia.org/wiki/Denormalization
I would say, it's the other way around. First, you design what kind of objects you need to have. For those will create a table.
Part of this phase is designing the keys, that is the combinations of attributes (columns) that uniquely identify the object. You may or may not add an artificial key or surrogate key for convenience or performance reasons. From these keys, you typically elect one canonical key, the primary key, which you try to use consistently to identify objects in that table (you keep the other keys too, they serve to ensure unicity as a business rule, not so much for identificattion purposes.)
Then, you think what relationships exist between the objects. An object that is 'owned' by another object, or an object that refers to another object needs some way to identify its related object. In the corresponding table (child table) you add columns to make a foreign key to point to the primary key of the referenced table.
This takes care of all one to many relationships.
Sometimes, an object can be related multiple times to another object. For example, an order can be used to order multiple products, but a product can appear on multiple orders as well. For those relationships, you design a separate table (intersection table - in this example, order_items). This table will have a unique key created from two foreign keys: one pointing to the one parent (orders), one to the other parent (products). And again, you add the columns to the intersection table that you need to create those foreign keys.
So in short, you first design keys and foreign keys, only then you start adding columns to implement them.
Don't be concerned with the type of relationship -- it has more to do with the cardinality of the relationship.
If you have a one-to-many relationship, then you'd want to assign a Primary Key to the smaller of the tables, and store it as a Foreign Key in the larger table.
You'd also do it with one-to-one relationships, but some people argue that you should avoid them.
In the case of a many-to-many relationship, you'd want to make a join table, and then have each of the original tables have a foreign key to the join table.

One or Two Primary Keys in Many-to-Many Table?

I have the following tables in my database that have a many-to-many relationship, which is expressed by a connecting table that has foreign keys to the primary keys of each of the main tables:
Widget: WidgetID (PK), Title, Price
User: UserID (PK), FirstName, LastName
Assume that each User-Widget combination is unique. I can see two options for how to structure the connecting table that defines the data relationship:
UserWidgets1: UserWidgetID (PK), WidgetID (FK), UserID (FK)
UserWidgets2: WidgetID (PK, FK), UserID (PK, FK)
Option 1 has a single column for the Primary Key. However, this seems unnecessary since the only data being stored in the table is the relationship between the two primary tables, and this relationship itself can form a unique key. Thus leading to option 2, which has a two-column primary key, but loses the one-column unique identifier that option 1 has. I could also optionally add a two-column unique index (WidgetID, UserID) to the first table.
Is there any real difference between the two performance-wise, or any reason to prefer one approach over the other for structuring the UserWidgets many-to-many table?
You only have one primary key in either case. The second one is what's called a compound key. There's no good reason for introducing a new column. In practise, you will have to keep a unique index on all candidate keys. Adding a new column buys you nothing but maintenance overhead.
Go with option 2.
Personally, I would have the synthetic/surrogate key column in many-to-many tables for the following reasons:
If you've used numeric synthetic keys in your entity tables then having the same on the relationship tables maintains consistency in design and naming convention.
It may be the case in the future that the many-to-many table itself becomes a parent entity to a subordinate entity that needs a unique reference to an individual row.
It's not really going to use that much additional disk space.
The synthetic key is not a replacement to the natural/compound key nor becomes the PRIMARY KEY for that table just because it's the first column in the table, so I partially agree with the Josh Berkus article. However, I don't agree that natural keys are always good candidates for PRIMARY KEY's and certainly should not be used if they are to be used as foreign keys in other tables.
Option 2 uses a simple compund key, option 1 uses a surrogate key. Option 2 is preferred in most scenarios and is close to the relational model in that it is a good candidate key.
There are situations where you may want to use a surrogate key (Option 1)
You are not certain that the compound key is a good candidate key over time. Particularly with temporal data (data that changes over time). What if you wanted to add another row to the UserWidget table with the same UserId and WidgetId? Think of Employment(EmployeeId,EmployeeId) - it would work in most cases except if someone went back to work for the same employer at a later date
If you are creating messages/business transactions or something similar that requires an easier key to use for integration. Replication maybe?
If you want to create your own auditing mechanisms (or similar) and don't want keys to get too long.
As a rule of thumb, when modeling data you will find that most associative entities (many to many) are the result of an event. Person takes up employment, item is added to basket etc. Most events have a temporal dependency on the event, where the date or time is relevant - in which case a surrogate key may be the best alternative.
So, take option 2, but make sure that you have the complete model.
I agree with the previous answers but I have one remark to add.
If you want to add more information to the relation and allow more relations between the same two entities you need option one.
For example if you want to track all the times user 1 has used widget 664 in the userwidget table the userid and widgetid isn't unique anymore.
What is the benefit of a primary key in this scenario? Consider the option of no primary key:
UserWidgets3: WidgetID (FK), UserID (FK)
If you want uniqueness then use either the compound key (UserWidgets2) or a uniqueness constraint.
The usual performance advantage of having a primary key is that you often query the table by the primary key, which is fast. In the case of many-to-many tables you don't usually query by the primary key so there is no performance benefit. Many-to-many tables are queried by their foreign keys, so you should consider adding indexes on WidgetID and UserID.
Option 2 is the correct answer, unless you have a really good reason to add a surrogate numeric key (which you have done in option 1).
Surrogate numeric key columns are not 'primary keys'. Primary keys are technically one of the combination of columns that uniquely identify a record within a table.
Anyone building a database should read this article http://it.toolbox.com/blogs/database-soup/primary-keyvil-part-i-7327 by Josh Berkus to understand the difference between surrogate numeric key columns and primary keys.
In my experience the only real reason to add a surrogate numeric key to your table is if your primary key is a compound key and needs to be used as a foreign key reference in another table. Only then should you even think to add an extra column to the table.
Whenever I see a database structure where every table has an 'id' column the chances are it has been designed by someone who doesn't appreciate the relational model and it will invariably display one or more of the problems identified in Josh's article.
I would go with both.
Hear me out:
The compound key is obviously the nice, correct way to go in so far as reflecting the meaning of your data goes. No question.
However: I have had all sorts of trouble making hibernate work properly unless you use a single generated primary key - a surrogate key.
So I would use a logical and physical data model. The logical one has the compound key. The physical model - which implements the logical model - has the surrogate key and foreign keys.
Since each User-Widget combination is unique, you should represent that in your table by making the combination unique. In other words, go with option 2. Otherwise you may have two entries with the same widget and user IDs but different user-widget IDs.
The userwidgetid in the first table is not needed, as like you said the uniqueness comes from the combination of the widgetid and the userid.
I would use the second table, keep the foriegn keys and add a unique index on widgetid and userid.
So:
userwidgets( widgetid(fk), userid(fk),
unique_index(widgetid, userid)
)
There is some preformance gain in not having the extra primary key, as the database would not need to calculate the index for the key. In the above model though this index (through the unique_index) is still calculated, but I believe that this is easier to understand.

Resources