SQL Primary Key Decisions

In my scenario I am tracking a population of members and their doctor changes.
The columns concerned are:
MemberID | Prov_Nbr | Prov_Start_Date | Prov_End_Date | Prov_Update_Date
My question concerns the primary key.
In this scenario, would it be better to put the primary key on an auto-increment column added to the front, like so:
IDENTITY | MemberID | Prov_Nbr | Prov_Start_Date | Prov_End_Date | Prov_Update_Date
Or would it be better to create the primary key based on the business rules/uniqueness of the data?
MemberID - PK1 | Prov_Nbr - PK2 | Prov_Start_Date - PK3 | Prov_End_Date | Prov_Update_Date
This is how the data would look in the table after weekly processing:
MemberID | Prov_Nbr | Prov_Start_Date | Prov_End_Date | Prov_Update_Date
------------------------------------------------------------------------
ABC123 | IR456 | 2014-01-01 | null | null - original record
ABC123 | IR102 | 2014-04-01 | null | null - new record; sets the original record's `Prov_End_Date` to the new `Prov_Start_Date - 1 day`
So the table looks like this:
ABC123 | IR456 | 2014-01-01 | 2014-03-31 | null
ABC123 | IR102 | 2014-04-01 | null | 2014-04-30
Still with me?
There are situations where, due to the nature of the business, a member could have a "retro", which essentially means that this:
ABC123 | IR456 | 2014-01-01 | 2014-03-31 | null
ABC123 | IR102 | 2014-04-01 | null | 2014-04-30
gets a new record:
ABC123 | IR402 | 2014-01-01 | null | null
essentially retrofitting the original record with a new provider.
Would this case ruin the uniqueness of the data? Or would SQL know how to handle this as a primary key update?
Any help with this would be much appreciated.

I would actually put both of your solutions in place: create an identity column as your primary key (probably clustered) and add a unique key on (MemberID, Prov_Nbr, Prov_Start_Date).
Many prominent SQL Server bloggers extol the virtues of an identity column as the PK, including in situations like this one where it serves as a surrogate key. You can then additionally enforce your business rule with the unique key (UK). Of course, I hope I'm reading your requirements correctly, especially the "retro" part.
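As a rough sketch of that combination (the table name, constraint names, and column types below are assumptions, since the question does not state them), the table might be declared like this:

```sql
-- Hypothetical table name and data types; only the column names come from the question.
CREATE TABLE dbo.MemberProvider (
    MemberProviderID int IDENTITY(1,1) NOT NULL,   -- surrogate key
    MemberID         varchar(20) NOT NULL,
    Prov_Nbr         varchar(20) NOT NULL,
    Prov_Start_Date  date        NOT NULL,
    Prov_End_Date    date        NULL,
    Prov_Update_Date date        NULL,
    CONSTRAINT PK_MemberProvider PRIMARY KEY CLUSTERED (MemberProviderID),
    -- Business rule: one row per member, provider, and start date.
    CONSTRAINT UQ_MemberProvider_Member_Prov_Start
        UNIQUE (MemberID, Prov_Nbr, Prov_Start_Date)
);
```

Under this unique key, the "retro" row from the question (ABC123, IR402, 2014-01-01) would be accepted next to the original (ABC123, IR456, 2014-01-01), because the two rows differ in Prov_Nbr. If a retro is supposed to replace the original row instead, the weekly process has to update or end-date that row itself.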

Related

Restrict a bit field to true in only one child record at a time

I have a Person table, each of whose records belongs to a parent record from the Company table.
One Person is designated as "Organizer" for their parent Company. Initially I handled this by having a reference from the Company table back to the Person record that was its "Organizer", but the software I'm using to build my application layer falls over: it can't handle recursive references.
I've changed tack and added a bit field to the Person table to identify whether the person is an "Organizer" or not, but I need to ensure that there is only one "Organizer" for each Company record. If I use an AFTER UPDATE trigger on the Person table, an update on Person triggers an update on Person, and obviously I want to avoid recursive triggers.
How can I ensure that there is only ever one Person marked as the "Organizer" for its parent Company?
+-----------+---------+---------+-----------+ +-----------+---------+---------+-----------+
| FirstName | Surname | Company | Organizer | | FirstName | Surname | Company | Organizer |
+-----------+---------+---------+-----------+ +-----------+---------+---------+-----------+
| John | Smith | 1 | True | | John | Smith | 1 | True |
| Mike | Jones | 1 | NULL | | Mike | Jones | 1 | NULL |
| Fred | Green | 1 | NULL | | Fred | Green | 1 | NULL |
| James | McMahon | 2 | NULL | | James | McMahon | 2 | NULL |
| Philip | Stills | 2 | NULL | Making Philip organizer ==> | Philip | Stills | 2 | True |
| Hector | Berlioz | 2 | True | 'triggers' this change ==> | Hector | Berlioz | 2 | NULL |
+-----------+---------+---------+-----------+ +-----------+---------+---------+-----------+

So seeing as no-one has given an answer here, I'll post what I eventually did:
Created a separate table called Organizer, with only two fields:
CREATE TABLE Organizer (
    Company int NOT NULL UNIQUE,
    Person int NOT NULL,
    CONSTRAINT FK_Organizer_Company FOREIGN KEY (Company) REFERENCES Company(ID) ON DELETE CASCADE,
    CONSTRAINT FK_Organizer_Person FOREIGN KEY (Person) REFERENCES Person(ID) ON DELETE CASCADE,
    CONSTRAINT PK_Organizer_ID PRIMARY KEY (Company, Person)
);
By making the Company field unique, I can only ever have one organizer for any company.
ON DELETE CASCADE prevents me ending up with orphan organizer records for companies or people that don't exist.
Can't quite recall why I made the PRIMARY KEY both fields. Doesn't seem to hurt.
It was then just a matter of checking for an existing Organizer record and updating it if it existed, or inserting one if it didn't. I did this in the application layer, though I could just as easily have written a stored procedure that took Company.ID and Person.ID parameters, checked for an Organizer record with the former, and updated the table accordingly. It could even include a check for whether the Person actually belongs to that company, and return a value accordingly to the application layer.
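A minimal sketch of such a stored procedure might look like the following. The procedure name, parameter names, and return codes are hypothetical, and it assumes Person has a Company column that points at its parent company, as in the table in the question.

```sql
-- Hypothetical procedure; assumes Person(ID, Company) and the Organizer table above.
CREATE PROCEDURE dbo.SetOrganizer
    @CompanyID int,
    @PersonID  int
AS
BEGIN
    SET NOCOUNT ON;

    -- Refuse people who do not belong to the company (return code 1).
    IF NOT EXISTS (SELECT 1 FROM Person WHERE ID = @PersonID AND Company = @CompanyID)
        RETURN 1;

    BEGIN TRANSACTION;

    -- Try to repoint the existing organizer row for this company.
    -- The lock hints keep two concurrent callers from both deciding to insert.
    UPDATE Organizer WITH (UPDLOCK, SERIALIZABLE)
    SET Person = @PersonID
    WHERE Company = @CompanyID;

    -- No row yet for this company: create it.
    IF @@ROWCOUNT = 0
        INSERT INTO Organizer (Company, Person)
        VALUES (@CompanyID, @PersonID);

    COMMIT TRANSACTION;
    RETURN 0;
END;
```

The application layer would then call something like EXEC dbo.SetOrganizer @CompanyID = 2, @PersonID = 5 and inspect the return value.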

Ensuring that two column values are related in SQL Server

I'm using Microsoft SQL Server 2017 and was curious about how to constrain a specific relationship. I'm having a bit of trouble articulating so I'd prefer to share through an example.
Consider the following hypothetical database.
Customers
+---------------+
| Id | Name |
+---------------+
| 1 | Sam |
| 2 | Jane |
+---------------+
Addresses
+----------------------------------------+
| Id | CustomerId | Address |
+----------------------------------------+
| 1 | 1 | 105 Easy St |
| 2 | 1 | 9 Gale Blvd |
| 3 | 2 | 717 Fourth Ave |
+------+--------------+------------------+
Orders
+-----------------------------------+
| Id | CustomerId | AddressId |
+-----------------------------------+
| 1 | 1 | 1 |
| 2 | 2 | 3 |
| 3 | 1 | 3 | <--- Invalid Customer/Address Pair
+-----------------------------------+
Notice that the final Order links a customer to an address that isn't theirs. I'm looking for a way to prevent this.
(You may ask why I need the CustomerId in the Orders table at all. To be clear, I recognize that the Address already offers me the same information without the possibility of invalid pairs. However, I'd prefer to have an Order flattened such that I don't have to channel through an address to retrieve a customer.)
From the related reading I was able to find, it seems that one method may be to enable a CHECK constraint targeting a User-Defined Function. This User-Defined Function would be something like the following:
WHERE EXISTS (SELECT 1 FROM Addresses WHERE Id = Order.AddressId AND CustomerId = Order.CustomerId)
While I imagine this would work, the articles I was able to find were fairly general, so I don't feel entirely confident that this is my best option.
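For reference, a sketch of what that CHECK-plus-function approach could look like is below. The function and constraint names are hypothetical, the tables are assumed to live in the dbo schema, and the Id columns are assumed to be int.

```sql
-- Hypothetical scalar function: returns 1 when the address belongs to the customer.
CREATE FUNCTION dbo.AddressBelongsToCustomer (@AddressId int, @CustomerId int)
RETURNS bit
AS
BEGIN
    RETURN CASE
               WHEN EXISTS (SELECT 1
                            FROM dbo.Addresses
                            WHERE Id = @AddressId
                              AND CustomerId = @CustomerId)
               THEN 1
               ELSE 0
           END;
END;
GO  -- batch separator used by SSMS/sqlcmd, not a T-SQL statement

-- Reject Orders rows whose address is owned by a different customer.
ALTER TABLE dbo.Orders
    ADD CONSTRAINT CK_Orders_AddressMatchesCustomer
    CHECK (dbo.AddressBelongsToCustomer(AddressId, CustomerId) = 1);
```

One limitation of this sketch: the constraint is only evaluated when Orders rows are inserted or updated, so a later change to Addresses.CustomerId would not be caught.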
An alternative might be to remove the CustomerId column from the Addresses table entirely, and instead add another table with Id, CustomerId, AddressId. The Order would then reference this Id instead. Again, I don't love the idea of having to channel through an auxiliary table to get a Customer or Address.
Is there a cleaner way to do this? Or am I simply going about this all wrong?

Good question, however at the root it seems you are struggling with creating a foreign key constraint to something that is not a foreign key:
Orders.CustomerId -> Addresses.CustomerId
There is no simple built-in way to do this because it is normally not done. In ideal RDBMS practices you should strive to encapsulate data of specific types in their own tables only. In other words, try to avoid redundant data.
In the example above, address ownership is stored redundantly in both the Addresses table and the Orders table, and that redundancy is what requires additional checks to keep the two synchronized. This can easily get out of hand with bigger datasets.
You mentioned:
However, I'd prefer to have an Order flattened such that I don't have to channel through an address to retrieve a customer.
But that is why a relational database is relational. It does this so that distinct data can be kept distinct and referenced with relative IDs.
I think the best solution would be to simply drop this requirement.
In other words, just go with:
Customers
+---------------+
| Id | Name |
+---------------+
| 1 | Sam |
| 2 | Jane |
+---------------+
Addresses
+----------------------------------------+
| Id | CustomerId | Address |
+----------------------------------------+
| 1 | 1 | 105 Easy St |
| 2 | 1 | 9 Gale Blvd |
| 3 | 2 | 717 Fourth Ave |
+------+--------------+------------------+
Orders
+--------------------+
| Id | AddressId |
+--------------------+
| 1 | 1 |
| 2 | 3 |
| 3 | 3 | <--- Valid Order/Address Pair
+--------------------+
With that said, to accomplish your purpose exactly, you do have views available for this kind of thing:
create view CustomerOrders
as
select o.Id as OrderId,
       a.CustomerId,
       o.AddressId
from Orders o
join Addresses a on a.Id = o.AddressId
I know this is a pretty trivial use-case for a view but I wanted to put in a plug for it because they are often neglected and come in handy with organizing big data sets. Using WITH SCHEMABINDING they can also be indexed for performance.
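As a hedged sketch of the indexed variant (the view and index names are made up, and the tables are assumed to be in the dbo schema, since schema-bound views require two-part names):

```sql
-- Hypothetical schema-bound version of the view above.
CREATE VIEW dbo.CustomerOrdersIndexed
WITH SCHEMABINDING
AS
SELECT o.Id AS OrderId,
       a.CustomerId,
       o.AddressId
FROM dbo.Orders AS o
JOIN dbo.Addresses AS a ON a.Id = o.AddressId;
GO  -- batch separator used by SSMS/sqlcmd, not a T-SQL statement

-- The first index on a view must be unique and clustered.
CREATE UNIQUE CLUSTERED INDEX IX_CustomerOrdersIndexed_OrderId
    ON dbo.CustomerOrdersIndexed (OrderId);
```

Whether the optimizer uses the view's index automatically depends on the SQL Server edition; querying the view with the NOEXPAND hint is the usual way to make that explicit.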

You may ask why I need the CustomerId in the Orders table at all. To be clear, I recognize that the Address already offers me the same information without the possibility of invalid pairs. However, I'd prefer to have an Order flattened such that I don't have to channel through an address to retrieve a customer.
If you face performance problems, the first thing to do is create or amend proper indexes. DBMSs are usually good at join operations (given proper indexes). Yes, denormalization can sometimes help in performance tuning, but it should be a last resort. If that route is taken, one should really know what one is doing and be very careful not to lose more in the end than one has gained. I doubt that you're out of options here and really need to go down that path; you're likely barking up the wrong tree. Therefore I recommend you take the "normal", "sane" way: just drop customerid from orders and create proper indexes.
But if you really insist, you can make (id, customerid) a key in addresses (with a unique constraint) and then create a foreign key that references it:
ALTER TABLE addresses
ADD UNIQUE (id,
customerid);
ALTER TABLE orders
ADD FOREIGN KEY (addressid,
customerid)
REFERENCES addresses
(id,
customerid);

Zero_or_One vs One_or_only_One entity relation just for initially allowable Null field

I have confusion about Zero_or_One vs One_and_only_One entity relation in step 4 of the following scenario. The scenario is:
There are two entities: School and Teacher.
Without cardinality defined, ERD is:
______________________
| SCHOOL | ____________________
---------------------- | TEACHER |
| ID [PK] | --------------------
| Headmaster_ID [FK] |----------------| ID [PK] |
---------------------- --------------------
A teacher can be a headmaster of zero or one school but not more than one school. So ERD becomes:
______________________
| SCHOOL | ____________________
---------------------- | TEACHER |
| ID [PK] | --------------------
| Headmaster_ID [FK] | 0|-------------| ID [PK] |
---------------------- --------------------
A school normally has exactly one headmaster. But if Headmaster_ID is restricted to be Not_Null, then the headmaster (also a teacher) for a new school must be inserted into Teacher table before the new school is inserted into School table. I do not want this restriction in the database. So which one is correct for the given scenario?
______________________
| SCHOOL | ____________________
---------------------- | TEACHER |
| ID [PK] | --------------------
| Headmaster_ID [FK] | 0|----------|0 | ID [PK] |
---------------------- --------------------
VS
______________________
| SCHOOL | ____________________
---------------------- | TEACHER |
| ID [PK] | --------------------
| Headmaster_ID [FK] | 0|----------|| | ID [PK] |
---------------------- --------------------
So my question: is it OK to use Zero_or_One rather than One_and_only_One just because a field can be initially Null for a short time? Or something else should be done in this case? I want a better explanation for the issue in generic sense.

TL;DR
What matters about the "normal" presence of a headmaster vs a "short time" without one is whether it happens in the application domain of schools, teachers and headmasters or only in your particular implementation. The "normal" and "short" are irrelevant. An ER diagram records facts true of every situation/state that can ever arise for associations/tables in the application/database.
Neither of your proposed solutions is (quite) appropriate, whether or not there is always a headmaster in the application domain.
If you want no headmaster only until a school ever has one then you must use DBMS-specific security measures and/or non-declarative code to get what assurance you can.
The basic diagram
Your step 2 is unjustifiably presuming a FK. Here is the basic situation:
______________________
| SCHOOL | ____________________
---------------------- | TEACHER |
| ID [PK] |Headmastered_by --------------------
| | -------------------- | ID [PK] |
---------------------- Headmaster_of--------------------
Now that you have decided what entities are participating in a relationship and picked names per order of entities you can clarify just what relationship you mean. Then you can observe its cardinality (per entity & ordering). Then you can decide how to represent it.
If a school always has a headmaster
(Reading from a box along a line to a cardinality and another box:)
A school is Headmastered_by 1 teacher. (Every school participates.)
A teacher is Headmaster_of 0 or 1 school.
______________________
| SCHOOL | ____________________
---------------------- | TEACHER |
| ID [PK] |Headmastered_by --------------------
| | |0----------------|| | ID [PK] |
---------------------- Headmaster_of--------------------
Headmastered_by is not 0-or-many to 1 so it's not represented by a SCHOOL NOT NULL (relational) FK to TEACHER. It's not 0-or-many to 0-or-1 so it's not represented by a SCHOOL NULLable (SQL) FK to TEACHER. Similarly there is no FK in the other direction. However you could represent Headmastered_by by a SCHOOL NOT NULL FK to TEACHER that is also UNIQUE NOT NULL (a candidate/alternate key). Then schools and headmasters are 1:1 and some teachers may not be headmasters (not referenced).
______________________
| SCHOOL | ____________________
---------------------- | TEACHER |
| ID [PK] |Headmastered_by --------------------
| TEACHER_ID NOT NULL| |0----------------|| | ID [PK] |
| [UNIQUE] [FK] | Headmaster_of--------------------
----------------------
...but with "no restriction" on "insert into Teacher"
If you want a school to have exactly one headmaster in the application domain, then it's a little odd that you "do not want this restriction in the database" that "the headmaster (also a teacher) for a new school must be inserted into Teacher table before the new school is inserted into School." Without it, you either hope that users follow a protocol, or you use DBMS security measures and/or non-declarative code to force both changes into the same transaction, or at least to make users go out of their way not to do so.
If a school can sometimes have no headmaster
A school is Headmastered_by 0 or 1 teacher.
A teacher is Headmaster_of 0 or 1 school.
______________________
| SCHOOL | ____________________
---------------------- | TEACHER |
| ID [PK] |Headmastered_by --------------------
| | |0----------------0| | ID [PK] |
---------------------- Headmaster_of--------------------
You can represent this in two mirror-image ways using a NULLable FK in one table that is UNIQUE NULLable. (If your DBMS supports the SQL standard's multiple NULLs in a UNIQUE NULLable column.)
______________________
| ONE | ____________________
---------------------- | OTHER |
| ID [PK] |Headmaster_one --------------------
| OTHER_ID NULLable | |0----------------0| | ID [PK] |
| [UNIQUE][FK] | Headmaster_other--------------------
----------------------
You could represent it by adding a NULLable FK to SCHOOL that is also UNIQUE NULLable. Then schools and headmasters are 1:1 while some schools may have no headmaster (NULL) and some teachers may not be headmasters (not referenced).
You could represent it by adding a NULLable FK to TEACHER that is also UNIQUE NULLable. Then headmasters and schools are 1:1 while some teachers may not be headmasters (NULL) and some schools may not have headmasters (not referenced).
But you can represent this symmetrical situation symmetrically by adding a Headmaster table with UNIQUE NOT NULL CKs (SCHOOL_ID) and (TEACHER_ID) plus NOT NULL FKs to respective tables. The two CK constraints enforce 1:1 for headmastered schools & headmasters. But there can be headmasterless schools (not referenced) and non-headmaster teachers (not referenced).
____________________________________________
| SCHOOL | TEACHER | Headmaster
--------------------------------------------
| SCHOOL_ID NOT NULL | TEACHER_ID NOT NULL |
| [UNIQUE] [FK] | [UNIQUE] [FK] |
--------------------------------------------
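A sketch of that symmetrical Headmaster table in SQL follows. The int types and constraint names are assumptions; the column names come from the diagram above.

```sql
-- Each school appears at most once and each teacher appears at most once,
-- so headmastered schools and headmasters are paired 1:1.
CREATE TABLE Headmaster (
    SCHOOL_ID  int NOT NULL,
    TEACHER_ID int NOT NULL,
    CONSTRAINT UQ_Headmaster_School  UNIQUE (SCHOOL_ID),
    CONSTRAINT UQ_Headmaster_Teacher UNIQUE (TEACHER_ID),
    CONSTRAINT FK_Headmaster_School  FOREIGN KEY (SCHOOL_ID)  REFERENCES SCHOOL (ID),
    CONSTRAINT FK_Headmaster_Teacher FOREIGN KEY (TEACHER_ID) REFERENCES TEACHER (ID)
);
```

A school with no row in Headmaster is simply a school without a headmaster, so no NULLs are needed anywhere in this version.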
The obvious and natural initial representation for an application relationship is such an explicit table. If you start with this then you can rearrange to asymmetrical schemas with FKs etc. when there are appropriate special cases. Note that in asymmetrical versions a Headmastered_by/Headmaster_of relationship instance must be extracted from what is ostensibly/supposedly/called an entity table but is de facto a relationship table. (And the FK gets inappropriately called a "relationship".)
(The IDEF1X method records the use of NULLable vs NOT NULL FKs in its ER diagrams, rather than allowing a method to be chosen by implementers. This is a good idea, because the NULLability of an attribute is visible to users and affects the meaning of relationships/tables.) (Of course, that is also a good argument not to use ER formally and to specify directly by relational schemas.)
...but only initially
As to enforcing (in the application domain) that a school can have no headmaster only until it has had one: In a typical SQL DBMS you can't constrain this declaratively. You can write triggers and stored procedures restricting UPDATE when IS NOT NULL or UPDATing/DELETing rows, but these are bypassable by users. You can add a column showing the last value, or whether it was NULL, or a table for that, to restrict user updates to your table, but you can't restrict user access to that attribute/table. So it comes down to the security measures that your DBMS offers.

Condensing Row Data into a View

I have data in my PeopleInfo table where there are some people that have multiple records that I am trying to combine together into one record for a view.
All people data is almost the same except for the PlanId and PlanName. So:
| FirstName | LastName | SSN | PlanId | PlanName | Status | Price1 | Price2 |
|-----------|----------|-----------|--------|----------|-----------|---------|--------|
| John | Doe | 123456789 | 1 | Plan A | Primary | 9.00 | NULL |
|-----------|----------|-----------|--------|----------|-----------|---------|--------|
| John | Doe | 123456789 | 2 | Plan B | Secondary | NULL | 5.00 |
I would like to have only one John Doe record in my view, looking like this:
| FirstName | LastName | SSN | PlanId | PlanName | Status | Price1 | Price2 |
|-----------|----------|-----------|--------|----------|-----------|---------|--------|
| John | Doe | 123456789 | 1 | Plan A | Primary | 9.00 | 5.00 |
Where the Primary status determines which PlanId and PlanName to show. Can anyone help me with this query?

declare @t table (
    FNAME varchar(10), LNAME varchar(10), SSN varchar(10),
    PLANID int, PLANNAME varchar(10), stat varchar(10),
    Price1 decimal(18,2), Price2 decimal(18,2)
)
insert into @t (FNAME, LNAME, SSN, PLANID, PLANNAME, stat, Price1, Price2)
values ('john', 'doe', '12345', 1, 'PlanA', 'primary', 9.00, NULL),
       ('john', 'doe', '12345', 1, 'PlanB', 'secondary', NULL, 8.00)
-- One row per person. Aggregates skip NULLs, so each price column keeps its single non-NULL value.
-- PLANNAME and stat are simply the alphabetically first values, not values chosen by status.
select
    FNAME,
    LNAME,
    SSN,
    MAX(PLANID) PLANID,
    MIN(PLANNAME) PLANNAME,
    MIN(stat) stat,
    MIN(Price1) Price1,
    MIN(Price2) Price2
from @t
group by FNAME, LNAME, SSN

(I can't yet add a comment, so have an answer.)
The only thing that troubles me here is that i am also determining which PlanId and PlanName since they are different and i want to show a specific one based off of the Status field of both records.
Then you don't even need GROUPing. It would be much simpler. Just SELECT WHERE 'Primary' = Status, assuming that (A) there will always be a row with this Status for each user, and (B) you are happy to ignore all the others.
P.S. If you will only be using Primary and Secondary Statuses, you might want to change the column to a bit named something like isPrimaryPlan where 1 indicates true and 0 false. However, if you might bring in e.g. Bronze and Consolation Prize Plans later, then you'll need to retain a more variable datatype. Perhaps store the plans in a separate table and have an int FOREIGN KEY to it... I could go on!

OK, I'm back after having a sleep, which has improved my brain slightly.
First, let the record reflect that I don't like the database design here. The People and Plans should be separate tables, linked by foreign keys - via a 3rd table, e.g. PeoplePlans. That takes me to another point: the people here have no primary key (at least not that you have specified). So when writing the below, I had to pick the SSN, assuming that will always be present and unique.
Anyway, something like this should work, with the caveat that I'm not going to replicate the database structure to test it.
select
    FirstName,
    LastName,
    SSN,
    PlanId,
    PlanName,
    Status,
    _ca._sum_Price1,
    _ca._sum_Price2
from
    PeopleInfo as _Primary
    cross apply (
        select
            sum(Price1) as _sum_Price1,
            sum(Price2) as _sum_Price2
        from
            PeopleInfo
        where
            _Primary.SSN = SSN
    ) as _ca
where
    'Primary' = Status;
This SELECTs all People with Primary status in order to get you those rows. It then CROSS APPLYs their Primary and any other rows and takes the summed Prices.
Hopefully this makes sense. If not, you'll have to do some reading about CROSS APPLY, in addition to about good relational database design. ;-)
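Since the question asks for a view, here is a hedged sketch of the same idea wrapped in one, using a join to a grouped subquery in place of CROSS APPLY. The view name is hypothetical, and it assumes PeopleInfo is in the dbo schema and that SSN identifies a person.

```sql
-- Hypothetical view: one row per person, taken from the Primary row,
-- with prices summed over all of that person's rows.
CREATE VIEW dbo.PeopleInfoCondensed
AS
SELECT p.FirstName,
       p.LastName,
       p.SSN,
       p.PlanId,
       p.PlanName,
       p.Status,
       t.Price1,
       t.Price2
FROM dbo.PeopleInfo AS p
JOIN (
    -- SUM skips NULLs, so a price present on only one row survives as-is.
    SELECT SSN,
           SUM(Price1) AS Price1,
           SUM(Price2) AS Price2
    FROM dbo.PeopleInfo
    GROUP BY SSN
) AS t ON t.SSN = p.SSN
WHERE p.Status = 'Primary';
```

A person with no Primary row would not appear in this view at all, which matches assumption (A) in the earlier answer.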

Relationships Between Tables in MS Access

I'm completely new to databases and am having difficulty setting up relationships between 3 tables in MS Access 2013.
The idea is that I have a table with account info, a table with calls related to these accounts, and a table with all the possible call responses. I tried different combinations between them but nothing works.
1st table - Accounts : AccountID(PK) | AccountName | Language | Country | Email
2nd table - Calls : CallID(PK) | Account | Response | Comment | Date
3rd table - Responses: ResponseID(PK) | Response

When you have a table, it usually has a Primary Key field that is the main index of the table. In order for you to connect it with other tables, you usually do that by setting Foreign Key on the other table.
Let's say you have your Accounts table, and it has AccountID field as Primary Key. This field is unique (meaning no duplicate value for this field).
Now, you have the other table called Calls and you have a Foreign Key field called AccountID there, which points to the Accounts table.
Essentially you have Accounts with the following data:
AccountID | AccountName | Language | Country | Email
1 | FirstName | EN | US | some@email.com
2 | SecondName | EN | US | some@email.com
Now you have the other table Calls with Many calls
CallID(PK) | AccountID(FK) | ResponseID(FK) | Comment | Date
1 | 1 | 1 | a comment | 26/10
2 | 1 | 1 | a comment | 26/10
3 | 2 | 3 | a comment | 26/10
4 | 2 | 3 | a comment | 26/10
You can see the One to Many relationship: One accountID (in my example AccountID=1) to Many Calls (in my example 2 rows with AccountID=1 as foreign keys, rows 1 & 2) and AccountID=2 has also 2 rows of Calls (rows 3 and 4)
Same goes for the Responses table

Using this table structure:
Accounts : AccountID(PK) | AccountName | Language | Country | Email
Calls : CallID(PK) | AccountID(FK) | ResponseID(FK) | Comment | Date
Responses: ResponseID(PK) | Response
Accounts.AccountID is referenced by Calls.AccountID. 1:n – many calls for one account possible, but each call concerns just one account.
Responses.ResponseID is referenced by Calls.ResponseID. 1:n – many calls can get the same response from the prepared set, but each call gets exactly one of them.

To actually define the Relationships in Access, open the Relationships window...
... then follow the detailed instructions here:
How to define relationships between tables in an Access database
