Database schema consistency issue - sql-server

Part of my database schema involves the entities:
Jobs
Agencies
Agents
and relation JobAgent
Each Job has one Agency it belongs to
Each Agent belongs to one agency
Each Job has 0-n agents
The database will be SQL Server 2008
Here is my schema:
My problem is that Jobs.agencyid must always be equal to Agents.agencyid when related through JobAgent.
If Jobs.agencyid were to be updated to a new agency, the Agents would then belong to a different Agency than the Job.
What would be the best way to redesign my schema to avoid relying on triggers or application code to ensure this consistency?

AGENCIES
agency_id (pk)
JOBS
job_id (pk)
agency_id (fk to AGENCIES.agency_id)
AGENTS
agent_id (pk)
agency_id (fk to AGENCIES.agency_id)
JOBAGENT
job_id (fk to JOBS.job_id)
agent_id (fk to AGENTS.agent_id)
agency_id (fk to JOBS.agency_id, fk to AGENTS.agency_id)
You can define more than one foreign key constraint to a column - it just means that the value in JOBAGENT has to satisfy BOTH foreign key constraints to be allowed. But you'll have fun if you ever want to update jobs to a different agency... ;) SQL Server supports composite foreign keys: http://msdn.microsoft.com/en-us/library/ms175464.aspx
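The composite foreign key idea above can be sketched as follows. This is a minimal illustration, not the answerer's exact schema: it uses SQLite through Python's sqlite3 module so that it runs standalone, and SQL Server syntax differs in places. The UNIQUE constraints on (job_id, agency_id) and (agent_id, agency_id), the sample rows, and the helper name try_sql are assumptions added for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE agencies (agency_id INTEGER PRIMARY KEY);
    CREATE TABLE jobs (
        job_id    INTEGER PRIMARY KEY,
        agency_id INTEGER NOT NULL REFERENCES agencies (agency_id),
        UNIQUE (job_id, agency_id)      -- target of the composite FK
    );
    CREATE TABLE agents (
        agent_id  INTEGER PRIMARY KEY,
        agency_id INTEGER NOT NULL REFERENCES agencies (agency_id),
        UNIQUE (agent_id, agency_id)    -- target of the composite FK
    );
    CREATE TABLE jobagent (
        job_id    INTEGER NOT NULL,
        agent_id  INTEGER NOT NULL,
        agency_id INTEGER NOT NULL,
        PRIMARY KEY (job_id, agent_id),
        -- the single agency_id value must match both the job and the agent
        FOREIGN KEY (job_id, agency_id)   REFERENCES jobs (job_id, agency_id),
        FOREIGN KEY (agent_id, agency_id) REFERENCES agents (agent_id, agency_id)
    );
""")


def try_sql(conn, sql):
    """Run one statement; return True if accepted, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


with conn:
    conn.executemany("INSERT INTO agencies VALUES (?)", [(1,), (2,)])
    conn.execute("INSERT INTO jobs VALUES (10, 1)")
    conn.executemany("INSERT INTO agents VALUES (?, ?)", [(100, 1), (200, 2)])

# Agent 100 and job 10 are both in agency 1: accepted.
same_agency_ok = try_sql(conn, "INSERT INTO jobagent VALUES (10, 100, 1)")
# Agent 200 is in agency 2 but job 10 is in agency 1: rejected.
cross_agency_ok = try_sql(conn, "INSERT INTO jobagent VALUES (10, 200, 2)")
# Moving the job to another agency while agents are attached: rejected.
move_job_ok = try_sql(conn, "UPDATE jobs SET agency_id = 2 WHERE job_id = 10")
```

With this layout the job cannot change agency until its JOBAGENT rows are removed, which is the "fun" the answer warns about.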
Update Regarding Updating
You have two choices:
1. Perform the update by hand, because ON UPDATE CASCADE etc. won't handle agency and agent updates without using triggers.
2. Have a status column in JOBS, so you can cancel a job in order to recreate the job with the new supporting records (agent, jobagent, etc.). Further cleanup can be automated, based on job status, if you desire.

The problem is that if a job moves from one agency to another (as you say, if Jobs.agencyid were to be updated...) then the corresponding records in JobAgent become meaningless: those agents can't be attached to a job that's no longer with their agency, so the JobAgent records connecting them to the jobs should therefore be deleted...
One way to enforce this is to add a JobAgent.agencyid field, and make it a foreign key on Jobs.agencyid, with ON UPDATE RESTRICT (in SQL Server, ON UPDATE NO ACTION, which is the default) to force (manual) deletion of the relevant JobAgent records before Jobs.agencyid can be changed.
Edit: the other issue, which I hadn't really considered, is that when you first associate a job to an agent (ie create a new JobAgent record) you need to ensure they both belong to the same agency... for this, I think OMG's solution works best - I'm happy to defer to the better answer.
OMG also raises the question of how to handle updates. You can either:
1. Change the Jobs.agencyid field and delete (by hand) all associated JobAgent records: in this case the old agents no longer work on this job, and you can assign someone from the new agency to work on it.
2. Change the Jobs.agencyid field and also change all associated JobAgent records (i.e. all those agents move with the job to the new agency). This is very messy, because those agents will also be associated with other jobs that are still with the original agency.
3. As OMG suggests, make a new Jobs record and mark the old one as defunct (for later deletion).
4. As in 3, but keep the defunct Jobs record to preserve historical information.
Whether you choose 3 or 4 depends a bit on what your system is for: do you just want to maintain the current state of who-has-which-jobs, or do you need to keep some kind of history? For example, if there are billing records attached to the job, that info needs to stay associated with the original agency (but this is all outside the scope of your original question).

You could use ON UPDATE CASCADE with the foreign keys. See this Wikipedia Page.
Or maybe, if agencyid is something that you expect to be mutable, you can have a unique constraint for it and use some other meaningless field for the agency id (say, an auto-increment column).

Does the following scheme answer your question?
Jobs     Agents     Agencies
  ^         ^          ^
  |         |          |
   \        |         /
    \       |        /
      AgentiatedJob
Normally, I have a single-field primary key for every table, because it makes it easier to match a record in a table and to refer to it from the tables below. So following this approach, the AgentiatedJob table would have at least the fields:
AgentiatedJobId
JobId
AgentId
AgencyId

Related

SQL Server Employee table with existing ID number

I am attempting to create an Employee table in SQL Server 2016 and I want to use EmpID as the Primary Key and Identity. Here is what I believe to be true and my question: When I create the Employee table with EmpID as the Primary Key and an Identity(100, 1) column, each time I add a new employee, SQL Server will automatically create the EmpID, starting with 100 and incrementing by 1 with each new employee. What happens if I want to import a list of existing employees from another company and those employees already have an existing EmpID? I haven't been able to figure out how I would import those employees with the existing EmpID. If there is a way to import the employee list with the existing EmpID, will SQL Server check to make sure the EmpIDs from the new list do not already exist for a current employee? Or is there some code I need to write in order to make that happen?
Thanks!
You are right about primary keys, but before importing employees from another company and merging them into your employee list, you have to consider these things:
WHY? Sure, there are ways to solve this problem, but why would you merge another company's employees into your own company's employee list?
Other company's ID structure: Most of the time, companies have different ID structures; some use 4 characters, others use only numbers, and so on. You have to know how the companies' ID structures differ.
If the merging can't be avoided, then you have to raise the concern with the higher-ups and tell them that the employees from the merging company must be given new employee IDs. With this in mind, simply appending the new data to your database is the solution.
This is an extremely normal data warehousing issue where a table has data sources from multiple places. Also comes up in migration, acquisitions, etc.
There is no way to keep the existing IDs as a primary key if there are multiple people with the same ID.
In the data warehouse world we would always create a new surrogate key, which is the primary key to the table, and include the original key and a source system identifier as two attributes.
In your scenario you will probably keep the existing keys for the original company, and create new IDs for the new employees, and save the oldID in an additional column for historical use.
Either of these choices also means that as you migrate other associated data, such as leave information imported from the old system, you can translate it to the new key by looking up the OldID in the employee table and finding the associated NewID to use when saving your leave records in the new system.
At the end of the day there is no alternative to this, as you simply can't have two employees with the same primary key.
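The surrogate-key approach described above can be sketched like this. It is only an illustration under stated assumptions: it runs on SQLite via Python's sqlite3 (where AUTOINCREMENT stands in for SQL Server's IDENTITY), and the table names, column names, sample rows and the helpers import_employees and new_id_for are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE employee (
        emp_id        INTEGER PRIMARY KEY AUTOINCREMENT,  -- new surrogate key
        source_system TEXT    NOT NULL,                   -- where the row came from
        old_emp_id    INTEGER NOT NULL,                   -- key in the source system
        name          TEXT    NOT NULL,
        UNIQUE (source_system, old_emp_id)
    );
    CREATE TABLE leave_record (
        leave_id INTEGER PRIMARY KEY AUTOINCREMENT,
        emp_id   INTEGER NOT NULL REFERENCES employee (emp_id),
        days     INTEGER NOT NULL
    );
""")


def import_employees(conn, source_system, rows):
    """Insert (old_emp_id, name) pairs; the database assigns the new emp_id."""
    with conn:
        conn.executemany(
            "INSERT INTO employee (source_system, old_emp_id, name) VALUES (?, ?, ?)",
            [(source_system, old_id, name) for old_id, name in rows],
        )


def new_id_for(conn, source_system, old_emp_id):
    """Translate an old key to the new surrogate key, or None if unknown."""
    row = conn.execute(
        "SELECT emp_id FROM employee WHERE source_system = ? AND old_emp_id = ?",
        (source_system, old_emp_id),
    ).fetchone()
    return row[0] if row else None


import_employees(conn, "our_company", [(100, "Ann"), (101, "Bob")])
# The acquired company also has an employee 100; that is no longer a clash.
import_employees(conn, "acquired_company", [(100, "Cy"), (205, "Dee")])

ann_id = new_id_for(conn, "our_company", 100)
cy_id = new_id_for(conn, "acquired_company", 100)

# Migrating associated data: look up the new key before saving the leave row.
with conn:
    conn.execute("INSERT INTO leave_record (emp_id, days) VALUES (?, ?)", (cy_id, 5))
```

The UNIQUE constraint on (source_system, old_emp_id) is what lets the database itself reject a second import of the same source row.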
I have never seen any company migrate employees from another company and keep their existing employee IDs. Usually, they give them a new ID and keep the old one in the employee file for reference purposes, but they never use the old one as an active ID.
Large companies usually use a series of special identities that are already defined in the system to distinguish employees based on field, specialty, etc.
Most companies don't do the same as the large ones; instead, they stick with one identifier and use dimensions as an identity. These dimensions specify areas of work for employees, projects, vendors, etc. They are used globally in the system and feed into the company's financial reports (which is the main point of using them).
So, what you need to do is find out the company's ID sequence requirements and then design around them, since depending on IDENTITY alone won't be enough for most companies. If you see that you can depend on identity alone, then use it; if not, then see if you can use dimensions as an identity (you could create five dimensions - Company, Project, Department, Area, Cost Center - which will be enough for any company).
If you use identity alone and want to migrate, then wrap your insert statement like this:
SET IDENTITY_INSERT tableName ON
INSERT INTO tableName (columns)
...
SET IDENTITY_INSERT tableName OFF
This allows you to insert explicit values into the identity column. However, doing this might require you to reseed the identity to a new value afterwards to avoid issues; read up on DBCC CHECKIDENT.
If you end up using dimensions, you could make the dimension and ID both primary keys, which will make sure that both are unique in the table (treated as one set).

Database design for Multi tenant application

For a multi-tenant app, we have the following database design:
which is based on the shared database approach. Since we are identifying the tenants using a company id (each company has a different set of employees, their tasks, and so on), my question is:
Do we need a companyId key in the Task table as well, so that every task record can be clearly identified using the companyId, OR should we always use a join?
I ask because if we put the companyId in Task, the database would not be properly normalized, as the Task would relate both to a company and to an employee who is also related to the company.
It is a matter of opinion. My take is to make companyId a part of the primary key and hence a mandatory field in every table.
In a multi-tenant application, we should ensure that data does not get added to a table without a company code. Unless it is part of the primary key or a non-null field, it is up to program logic to ensure that. In my opinion, the DB should also ensure that. The second issue is with the task id in the Task table. It is possible for 2 companies to have the same employee ids and the same task ids. The DB should not restrict that.
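As a rough sketch of that opinion, the tables below carry company_id in every primary key, and Task refers to Employee through a composite foreign key. This runs on SQLite through Python's sqlite3 purely so it is self-contained; the table and column names, the sample rows, and the helpers tasks_for and insert_is_rejected are assumptions, since the original diagram is not available.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE company (company_id INTEGER PRIMARY KEY);
    CREATE TABLE employee (
        company_id  INTEGER NOT NULL REFERENCES company (company_id),
        employee_id INTEGER NOT NULL,
        PRIMARY KEY (company_id, employee_id)
    );
    CREATE TABLE task (
        company_id  INTEGER NOT NULL,
        task_id     INTEGER NOT NULL,
        employee_id INTEGER NOT NULL,
        PRIMARY KEY (company_id, task_id),
        -- a task can only point at an employee of the same company
        FOREIGN KEY (company_id, employee_id)
            REFERENCES employee (company_id, employee_id)
    );
""")

with conn:
    conn.executemany("INSERT INTO company VALUES (?)", [(1,), (2,)])
    # Both companies may reuse employee id 7 and task id 1.
    conn.executemany("INSERT INTO employee VALUES (?, ?)", [(1, 7), (2, 7)])
    conn.executemany("INSERT INTO task VALUES (?, ?, ?)", [(1, 1, 7), (2, 1, 7)])


def tasks_for(conn, company_id):
    """Tenant-scoped query: no join is needed to filter by company."""
    return conn.execute(
        "SELECT task_id, employee_id FROM task WHERE company_id = ? ORDER BY task_id",
        (company_id,),
    ).fetchall()


def insert_is_rejected(conn, row):
    """Return True if the database refuses the (company_id, task_id, employee_id) row."""
    try:
        with conn:
            conn.execute("INSERT INTO task VALUES (?, ?, ?)", row)
        return False
    except sqlite3.IntegrityError:
        return True
```

For example, insert_is_rejected(conn, (None, 2, 7)) shows that a task without a company cannot be stored, and insert_is_rejected(conn, (1, 2, 99)) shows that a task cannot reference an employee outside its own company.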

One to two (1:2) relation between two tables

I'm working on an asp.net application and I got stuck on the following business rule.
Suppose we have a person who can open both types of accounts, and each account has some transaction history. The person table has some primary key, say PPK, and the customer table has some PK, say PIN.
My question is how to represent this scenario in the database. The relation of the person table to the customer table is 1:2, as no person can have more than two accounts. And what about the transaction table that holds all the transaction history for a specific account? Shall I make two transaction tables (which is really a bad idea, because as the number of account types grows, so does the number of transaction tables)?
Can I build their relation as 1:* (then I may need another table for that, holding the FKs of both tables)?
Or can I make PIN a unique key and always query the database to check the limit on accounts (i.e. two)?
I really need some suggestions.
All suggestions are welcome.
Thanks!
P.S.: If you don't understand the question, please ask me before reporting it!
You can either do something like this:
Or something like this:
The first model allows adding more accounts by just changing the CHECK (in case requirements change in the future), while the second would require adding another column in such case.
NOTE: The above assumes both account types are structurally identical, so they "fit" into same table. If that's not the case, you can use inheritance.
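The diagrams for the two models are not available here, so the following is only one plausible reading of the first model: an account number column restricted by a CHECK, combined with the person id in the primary key. It uses SQLite via Python's sqlite3 so it can run as-is; every table, column and helper name (including open_account) is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE person (person_id INTEGER PRIMARY KEY);
    CREATE TABLE account (
        person_id  INTEGER NOT NULL REFERENCES person (person_id),
        -- widen this CHECK if more accounts per person are allowed later
        account_no INTEGER NOT NULL CHECK (account_no IN (1, 2)),
        PRIMARY KEY (person_id, account_no)
    );
    -- one transaction table serves every account
    CREATE TABLE account_transaction (
        transaction_id INTEGER PRIMARY KEY,
        person_id      INTEGER NOT NULL,
        account_no     INTEGER NOT NULL,
        amount         NUMERIC NOT NULL,
        FOREIGN KEY (person_id, account_no)
            REFERENCES account (person_id, account_no)
    );
""")


def open_account(conn, person_id, account_no):
    """Return True if the account was created, False if a constraint refused it."""
    try:
        with conn:
            conn.execute("INSERT INTO account VALUES (?, ?)", (person_id, account_no))
        return True
    except sqlite3.IntegrityError:
        return False


with conn:
    conn.execute("INSERT INTO person VALUES (1)")

# Slots 1 and 2 succeed; slot 3 violates the CHECK; reusing slot 2 violates the PK.
results = [open_account(conn, 1, n) for n in (1, 2, 3, 2)]
```

Because the primary key allows each slot once per person and the CHECK allows only two slots, the two-account limit holds without a trigger.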
OK, you have a person table and an account table with a foreign key relationship between the two, which is 1 person to many accounts. Then you have a transaction table that is related to the account id in the account table, which is also 1 account to many transactions.
Now, to enforce the limit of only two accounts being allowed, you want a trigger that checks, when a record is being inserted or updated, to make sure the person currently has no more than one other record.

Database table design; link tables or rarely used foreign key?

My question relates to the best practices design of tables in a database for a specific scenario.
Say we have a Company that sells office equipment, Say Printers. This company also offers Service Contracts to Customers that have bought 1 or more of its Printers.
Given the above information, we can deduce three tables for our database:
Customer
Printer
ServiceContract
So for a given service contract, we specify which Customer the contract is created for and we assign 1 or more printers that comprise the contract agreement.
With regards to the Printers that are part of the service contract agreement, there are 2 ways we could approach the database design.
The first is to create a ServiceContractID column in the Printers table and create a basic Primary/Foreign key relationship between it and the ServiceContracts table. The only problem I see with this approach is that Printers don't have to be a part of a Service Contract, and therefore you could have hundreds or even thousands of Printer records in the database, many of which are not part of a contract, so this Foreign key column would go unused for many of the records.
The second option is to create a link table which would contain 2 foreign keys referencing both the ServiceContracts table (its primary key) and the Printers table (its primary key), combining both columns in this new table to make a unique composite primary key.
OK, here's my quandary. I don't see either option as being a really bad idea, but I am stuck on knowing which of these 2 design decisions would be the best practice.
All comments welcome.
I think, without knowing your entire issue, that the second option is preferable (i.e. more properly normalized, in general).
The second option would allow some business rule flexibility which you may not have come up with yet (or your business model may change).
For example, dates may become important. The same ServiceContract may, for instance, be used even if the business decides on some rule such as "all printers are covered for one year after purchase by this customer", where the same agreement just covers many purchases.
So option 2 gives you flexibility for adding other attributes on the relationship.
If I understand your problem domain correctly, the proper way to do this would be option 2. It sounds like a customer can have 0-many service contracts, a service contract can have 0-many printers associated with it, and a printer can be associated with 0-1 service contracts (unless contracts expire and renew with a NEW contract, in which case the relationship between printers and contracts is many-to-many).
Customers:
PK
CustomerInfo
Printers:
PK
PrinterInfo
FK on Customer PK
ServiceContracts:
PK
FK on Customer
// This creates a composite PK for the Contract_Printers Table:
Contract_Printers:
FK on Contract PK
FK on Printer PK
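A minimal runnable sketch of this layout, assuming SQLite through Python's sqlite3 rather than any particular production database. The column names and sample rows are made up for illustration, and the info columns are omitted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY);
    CREATE TABLE printers (
        printer_id  INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers (customer_id)
    );
    CREATE TABLE service_contracts (
        contract_id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers (customer_id)
    );
    -- link table: the two foreign keys together form the primary key
    CREATE TABLE contract_printers (
        contract_id INTEGER NOT NULL REFERENCES service_contracts (contract_id),
        printer_id  INTEGER NOT NULL REFERENCES printers (printer_id),
        PRIMARY KEY (contract_id, printer_id)
    );
""")

with conn:
    conn.execute("INSERT INTO customers VALUES (1)")
    conn.executemany("INSERT INTO printers VALUES (?, 1)", [(10,), (11,), (12,)])
    conn.execute("INSERT INTO service_contracts VALUES (500, 1)")
    # Only printers 10 and 11 are under contract; printer 12 needs no row at all.
    conn.executemany("INSERT INTO contract_printers VALUES (500, ?)", [(10,), (11,)])

covered = [row[0] for row in conn.execute(
    "SELECT printer_id FROM contract_printers "
    "WHERE contract_id = 500 ORDER BY printer_id")]

uncovered = [row[0] for row in conn.execute("""
    SELECT p.printer_id
    FROM printers AS p
    LEFT JOIN contract_printers AS cp ON cp.printer_id = p.printer_id
    WHERE cp.printer_id IS NULL
    ORDER BY p.printer_id""")]
```

Printers without a contract simply have no row in the link table, so there is no mostly-empty foreign key column. If a printer must never be on more than one contract at a time, a UNIQUE constraint on printer_id in the link table would be one way to express that.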
Hope that helps. I will be interested in hearing what others think . . .

cascading deletes causing multiple cascade paths

I am using SQL Server 2008, and an extract of some of the tables is displayed below:
Users
Id (PK)
UserItems
UserId (PK)
ItemId (PK) - (Compound key of 2 columns)
...
UserItemVotes
UserId (PK)
ItemId (PK)
VoterId (PK) - (Compound key of 3 columns)
I have the following relationships defined:
User.Id -> UserItems.UserId
(UserItems.UserId, UserItems.ItemId) -> (UserItemVotes.UserId, UserItemVotes.ItemId)
User.Id -> UserItemVotes.VoterId
Now, I am having a problem when turning on cascading deletes. When adding the 3rd relationship I receive the error "...may cause cycles or multiple cascade paths. Specify ON DELETE NO ACTION or ON UPDATE NO ACTION, or modify other FOREIGN KEY constraints."
I do not really want to do this, ideally if a user is deleted I would like to remove their useritem and/or their votes.
Is this a bad design? Or is there a way to get behaviour I want from SQL Server?
The approved answer is not a good answer. The scenario described is not bad design, nor is it "risky" to rely on the database to do its job.
The original question describes a perfectly valid scenario, and the design is well thought-out. Clearly, deleting a user should delete both the user's items (and any votes on them), and delete the user's votes on any item (even items belonging to other users). It is reasonable to ask the database to perform this cascading delete when the user record is deleted.
The problem is that SQL Server can't handle it. Its implementation of cascading deletes is deficient.
"UserItems.ItemId -> UserItemVotes.UserId"
This one seems extremely suspect.
I would lean toward bad design. While most DBMSs can manage cascading deletes, it is risky to use this built-in functionality. Your scenario is a perfect example of why these types of things are often managed in application code. There you can determine exactly what needs to be deleted and in what order.
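For what managing the delete in application code might look like, here is a hedged sketch that removes child rows before parent rows inside one transaction. It uses SQLite via Python's sqlite3 with no cascading actions defined; the snake_case table names, the sample data and the delete_user helper are assumptions, not code from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY);
    CREATE TABLE user_items (
        user_id INTEGER NOT NULL REFERENCES users (id),
        item_id INTEGER NOT NULL,
        PRIMARY KEY (user_id, item_id)
    );
    CREATE TABLE user_item_votes (
        user_id  INTEGER NOT NULL,
        item_id  INTEGER NOT NULL,
        voter_id INTEGER NOT NULL REFERENCES users (id),
        PRIMARY KEY (user_id, item_id, voter_id),
        FOREIGN KEY (user_id, item_id) REFERENCES user_items (user_id, item_id)
    );
""")


def delete_user(conn, user_id):
    """Delete a user, children first, atomically (all statements or none)."""
    with conn:
        # 1. votes on the user's items, plus votes the user cast on any item
        conn.execute(
            "DELETE FROM user_item_votes WHERE user_id = ? OR voter_id = ?",
            (user_id, user_id),
        )
        # 2. the user's items
        conn.execute("DELETE FROM user_items WHERE user_id = ?", (user_id,))
        # 3. the user
        conn.execute("DELETE FROM users WHERE id = ?", (user_id,))


with conn:
    conn.executemany("INSERT INTO users VALUES (?)", [(1,), (2,)])
    conn.executemany("INSERT INTO user_items VALUES (?, ?)", [(1, 1), (2, 1)])
    conn.executemany(
        "INSERT INTO user_item_votes VALUES (?, ?, ?)",
        [(1, 1, 2), (2, 1, 1), (2, 1, 2)],
    )

delete_user(conn, 1)
remaining_votes = conn.execute(
    "SELECT user_id, item_id, voter_id FROM user_item_votes").fetchall()
remaining_users = conn.execute("SELECT id FROM users").fetchall()
```

The same ordering could live in a stored procedure instead; the point is only that the delete order is explicit rather than left to cascade rules.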

Resources