I am creating a database where we want to combine data of several sites into one database. I now have an issue with the unique constraint for the samplepoint table. for each site the samplepointname must be unique. in the old system I enforced this with a unique constraint. The problem in the new system is that the siteID's are not stored in te table with samplepoints because these are enheritted from the parent of samplepoints (projects).
can I create a unique constraint that include the siteID stored in its parent, or should I create a siteID field in the table itself
I'm a bit confused by some of the phrasing of the question, so I'm going to lay out some clarifying assumptions based on what I think is my best read of it. Hopefully these assumptions actually match your situation.
In the original configuration, you had:
a single site
represented by a single pair of tables named "project" and "samplepoints"
a unique constraint over a field named "samplepointname"
a field named "siteID" in in a table named "project"
it had previously been unnecessary to add "siteID" to "samplepoints" because there was only one row in "project" and that one row's single "siteID" was always implied throughout the table "samplepoints"
And in the new configuration you have the following changes:
multiple sites
one row for each site in the table "projects"
a unique value for each field "siteID" in "projects"
You've stated that the field "sitepointname" within each site must be unique, but not globally. So I'm going to work with that.
Given these assumptions, you almost certainly will not merely want but need to add "siteID" to your table "sitepoints". This is because you can no longer simply read from "projects" and "sitepoints" at the same time without either joining them or adding a WHERE clause to filter down to the relevant site.
In fact, if your table "sitepoints" has already been populated without "siteID" you may well need to obtain the original tables from all of the different sites, empty that consolidated table, and repopulate it such that "siteID" correctly represents each independent site.
After you've added the new field "siteID", you'll remove the UNIQUE constraint on the field. You're going to replace that with what you'll use instead, and if you don't remove it all names will need to be unique across all sites rather than just within each site.
If you're simply executing commands directly, this will create that index:
CREATE UNIQUE INDEX unique_sitepointnames ON sitepoints (siteID, sitepointname);
The index name "unique_sitepointnames" is just an identifier, it can be whatever you wish, but that's my suggestion for it as it's clear and describes the purpose.
Rather than "UNIQUE" being a constraint on the column, "UNIQUE" is here a constraint on the index. Any more options to how the index is created is just optimization.
There are a lot of questions asking this but I can't seem to find one with the specific solution.
I have a 3rd party database where every field is set to allow null. One of the columns ("Code") is a unique string ID and it is distinct.
Using entity framework I'd like to add this table, by telling EF to treat the column "Code" as a primary key.
I've created a view but I am not sure where to go from here.
I've seen some solutions that involve adding an extra row number to use as the primary key but I would prefer to use "Code" if possible.
Any ideas?
After some playing around I found a read-only solution
In the view I modify the column to be:
SELECT ISNULL(Code, -1) AS Code
Specifying ISNULL allows EF to infer a primary key. It is not ideal as I would like it to be writable as well. You get the message:
Error 6002: The table/view 'KittyCat.dbo.View_GetCatDetails' does not
have a primary key defined. The key has been inferred and the
definition was created as a read-only table/view.
I have a mySQL database which is accessed by web2py, but has one table (which has an auto increment column labelled 'id') which is also regularly altered by another script. This script frequently deletes and inserts new rows into the table, so that although the integers in the 'id' column are unique and ascending, there are also many intermediate numbers missing. Will this cause web2py problems in the future?
Note that I only access this table through a different column, which contains a different set of unique identifiers, so I don't really need the 'id' column at all: it's only there because the docs state that web2py requires it.
Having missing values in the id field would not affect web2py by any means. but deleting or changing an ID of a record while you are editing this record in web2py would result in an error. so just be careful your web2py users are not editing records during script is changing/deleting IDs.
The accepted answer is correct, but also note that depending on your use case, you might not need the auto-incrementing integer id field in your table at all, as the DAL can handle other types of primary keys via the primarykey argument to db.define_table(). In particular, if you are working with a read-only table and any references are to/from other keyed tables, you do not need the id field. For more details, see http://web2py.com/books/default/chapter/29/06/the-database-abstraction-layer#Legacy-databases-and-keyed-tables.
I have a problem that can be summarized as follow:
Assume that I am implementing an employee database. For each person depends on his position, different fields should be filled. So for example if the employee is a software engineer, I have the following columns:
Name
Family
Language
Technology
CanDevelopWeb
And if the employee is a business manager I have the following columns:
Name
Family
FieldOfExpertise
MaximumContractValue
BonusRate
And if the employee is a salesperson then some other columns and so on.
How can I implement this in database schema?
One way that I thought is to have some related tables:
CoreTable:
Name
Family
Type
And if type is one then the employee is a software developer and hence the remaining information should be in table SoftwareDeveloper:
Language
Technology
CanDevelopWeb
For business Managers I have another table with columns:
FieldOfExpertise
MaximumContractValue
BonusRate
The problem with this structure is that I am not sure how to make relationship between tables, as one table has relationship with several tables on one column.
How to enforce relational integrity?
There are a few schools of thought here.
(1) store nullable columns in a single table and only populate the relevant ones (check constraints can enforce integrity here). Some people don't like this because they are afraid of NULLs.
(2) your multi-table design where each type gets its own table. Tougher to enforce with DRI but probably trivial with application or trigger logic.
The only problem with either of those, is as soon as you add a new property (like CanReadUpsideDown), you have to make schema changes to accommodate for that - in (1) you need to add a new column and a new constraint, in (2) you need to add a new table if that represents a new "type" of employee.
(3) EAV, where you have a single table that stores property name and value pairs. You have less control over data integrity here, but you can certainly constraint the property names to certain strings. I wrote about this here:
What is so bad about EAV, anyway?
You are describing one ("class per table") of the 3 possible strategies for implementing the category (aka. inheritance, generalization, subclass) hierarchy.
The correct "propagation" of PK from the parent to child tables is naturally enforced by straightforward foreign keys between them, but ensuring both presence and the exclusivity of the child rows is another matter. It can be done (as noted in the link above), but the added complexity is probably not worth it and I'd generally recommend handling it at the application level.
I would add a field called EmployeeId in the EmployeeTable
I'd get rid of Type
For BusinessManager table and SoftwareDeveloper for example, I'll add EmployeeId
From here, you can then proceed to create Foreign Keys from BusinessManager, SoftwareDeveloper table to Employee
To further expand on your one way with the core table is to create a surrogate key based off an identity column. This will create a unique employee id for each employee (this will help you distinguish between employees with the same name as well).
The foreign keys preserve your referential integrity. You wouldn't necessarily need EmployeeTypeId as someone else mentioned as you could filter on existence in the SoftwareDeveloper or BusinessManagers tables. The column would instead act as a cached data point for easier querying.
You have to fill in the types in the below sample code and rename the foreign keys.
create table EmployeeType(
EmployeeTypeId
, EmployeeTypeName
, constraint PK_EmployeeType primary key (EmployeeTypeId)
)
create table Employees(
EmployeeId int identity(1,1)
, Name
, Family
, EmployeeTypeId
, constraint PK_Employees primary key (EmployeeId)
, constraint FK_blahblah foreign key (EmployeeTypeId) references EmployeeType(EmployeeTypeId)
)
create table SoftwareDeveloper(
EmployeeId
, Language
, Technology
, CanDevelopWeb
, constraint FK_blahblah foreign key (EmployeeId) references Employees(EmployeeId)
)
create table BusinessManagers(
EmployeeId
, FieldOfExpertise
, MaximumContractValue
, BonusRate
, constraint FK_blahblah foreign key (EmployeeId) references Employees(EmployeeId)
)
No existing SQL engine has solutions that make life easy on you in this situation.
Your problem is discussed at fairly large in "Practical Issues in Database Management", in the chapter on "entity subtyping". Commendable reading, not only for this particular chapter.
The proper solution, from a logical design perspective, would be similar to yours, but for the "type" column in the core table. You don't need that, since you can derive the 'type' from which non-core table the employee appears in.
What you need to look at is the business rules, aka data constraints, that will ensure the overall integrity (aka consistency) of the data (of course whether any of these actually apply is something your business users, not me, should tell you) :
Each named employee must have exactly one job, and thus some job detail somewhere. iow : (1) no named employees without any job detail whatsoever and (2) no named employees with >1 job detail.
(3) All job details must be for a named employee.
Of these, (3) is the only one you can implement declaratively if you are using an SQL engine. It's just a regular FK from the non-core tables to the core table.
(1) and (2) could be defined declaratively in standard SQL, using either CREATE ASSERTION or a CHECK CONSTRAINT involving references to other tables than the one the CHECK CONSTRAINT is defined on, but neither of those constructs are supported by any SQL engine I know.
One more thing about why [including] the 'type' column is a rather poor choice to make : it changes how constraint (3) must be formulated. For example, you can no longer say "all business managers must be named employees", but instead you'd have to say "all business managers are named employees whose type is <type here>". Iow, the "regular FK" to your core table has now become a reference to a VIEW on your core table, something you might want to declare as, say,
CREATE TABLE BUSMANS ... REFERENCES (SELECT ... FROM CORE WHERE TYPE='BM');
or
CREATE VIEW BM AS (SELECT ... FROM CORE WHERE TYPE='BM');
CREATE TABLE BUSMANS ... REFERENCES BM;
Once again something SQL doesn't allow you to do.
You can use all fields in the same table, but you'll need an extra table named Employee_Type (for example) and here you have to put Developer, Business Manager, ... of course with an unique ID. So your relation will be employee_type_id in Employee table.
Using PHP or ASP you can control what field you want to show depending the employee_type_id (or text) in a drop-down menu.
You are on the right track. You can set up PK/FK relationships from the general person table to each of the specialized tables. You should add a personID to all the tables to use for the relationship as you do not want to set up a relationship on name because it cannot be a PK as it is not unique. Also names change, they are a very poor choice for an FK relationship as a name change could cause many records to need to change. It is important to use separate tables rather than one because some of those things are in a one to many relationship. A Developer for instnce may have many differnt technologies and that sort of thing should NEVER be stored in a comma delimted list.
You could also set up trigger to enforce that records can only be added to a specialty table if the main record has a particular personType. However, be wary of doing this as you wil have peopl who change roles over time. Do you want to lose the history of wha the person knew when he was a developer when he gets promoted to a manager. Then if he decides to step back down to development (A frequent occurance) you would have to recreate his old record.
I'm building a claims database with the above schema so far. Three three-part key on tblPatient is to uniquely identify individual's claim for a certain problemX. The same patientID can appear in tblPatient as long as the admission/discharge dates are different.
This database is also concerned (not pictured) with claims that are NOT associated with problemX. These claims will be identified with another three-part key of patientID, claimsFromDate,claimsThroughDate. So, tblPatient.admissionDate and tblPatient.DischargeDate will not have to be equal to claimsFromDate and claimsThroughDate and if they are equal it's happenstance.
Since tblPatient.patientID is repeated more than once (for those that have more than one visit), I cannot simply copy it to another table without breaking unique constraints for a primary key. I need to be able to relate patientID with the rest of the claims. Do I need to redesign my tblPatient to only have one field as a primary key, or include the already-existing three-part key and just roll with it?
First of all: In a perfect, puristic, database-world you would split your patients database into two - one containing the patients, and another called "PatientClaims" or such. If you do not care for patient-specifc data apart from patient-ID, then you should at least rename the table.
That same puristic approach would also tell you the primary key is defined as "The only set of data that uniquely identifies the row" - which is likely your three fields. (I could assume that you could leave out DischargeDate, but only if you are sure that the logic for doing so is sound)
Seeing, however, that you have to work with:
1. A three-part key, which is never pleasent
2. Having that key-combo across two tables, and likely having to join them
I would suggest simply defining a new key - like "ClaimID" using whatever autoincrement functions your database of choice has available.
Unrelated note: Your whole state/county double-table set looks kinda weird - but that might just be me not understanding what you are modelling.