EF Making Links but still getting errors - sql-server

I'm evolving here and, it appears, making progress. Last week I posted a question, and although the responses didn't give me an answer, they did make me question some of the things I had tried. I went back and reworked the configuration of my database tables a bit to see whether that would prove more conducive to the EF approach which, as I've mentioned, is entirely new to me; I'm coming back to web development after being away for many years.
I tried several different things, including some foreign key options, but because I'm dealing with larger 1-to-many relationships that was going to be problematic as well. In the end it flat out wasn't working. My last attempt was this morning, when I restructured the tables so that every field used in my associations is part of the primary key of each of the four tables.
I then re-created the EF Data Model for the project and built associations for each of the tables, linking ONLY the field values that were valid for the link in each association. This meant that in some cases, as you'll see below, some elements of a table's key were left without matches assigned to them. I wasn't sure whether I could get away with this. To my surprise, I was able to establish an association between all of the tables as I had intended, for the first time since starting this project. I'm attaching the model diagram here, though I doubt it will be helpful for this discussion. EF Data Model
Now, I may be okay or I may still be in the weeds; I will let you experts tell me where I stand. The result is two new errors, each reported twice: one set for the association between the GETT_Family_Group_List and GETT_Elements tables, and the other for the association between the GETT_Elements and GETT_Documents tables. They are virtually the same issue, so I'm only going to talk about the first one; whatever solution we come up with for it should apply to the second as well. Here are the two errors that I'm getting for each of these:
Error 111: Properties referred by the Principal Role
GETT_Family_Group_List must be exactly identical to the key of the
EntityType ARFSModel.GETT_Family_Group_List referred to by the
Principal Role in the relationship constraint for Relationship
ARFSModel.GETT_Family_Group_ListGETT_Elements. Make sure all the key
properties are specified in the Principal Role.
Running transformation: Properties referred by the Principal Role
GETT_Family_Group_List must be exactly identical to the key of the
EntityType ARFSModel.GETT_Family_Group_List referred to by the
Principal Role in the relationship constraint for Relationship
ARFSModel.GETT_Family_Group_ListGETT_Elements. Make sure all the key
properties are specified in the Principal Role.
The layouts of the two tables that are producing these errors, along with the Referential Constraint, look like this: Tables and Constraints
That's about it... I supplied you with about all I can think of. Let me know if you need more and I will try to provide it.
Thanks and regards,
Ken...
EDITED:
Properties
ARFSModel.GETT_Family_Group_ListGETT_Elements Association
- Constraints
Referential Constraint: GETT_Family_Group_List -> GETT_Elements
- General
Association Set Name: GETT_Family_Group_ListGETT_Elements
- Documentation
Long Description:
Summary:
End1 Multiplicity: 1 (One of GETT_Family_Group_List)
End1 Navigation Property: GETT_Elements
End1 OnDelete: None
End1 RoleName: GETT_Family_Group_List
End2 Multiplicity: * (Collection of GETT_Elements)
End2 Navigation Property: GETT_Family_Group_List
End2 OnDelete: None
End2 RoleName: GETT_Elements
Name: GETT_Family_Group_ListGETT_Elements
Hopefully this is more readable!
Regards,
Ken...
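For what it's worth, the rule spelled out in Error 111 (the principal side of a relationship has to use the whole key, not just part of it) has a close analogue in plain SQL foreign keys. The following is only a sketch under stated assumptions: it uses SQLite through Python's sqlite3 module as a stand-in for SQL Server, and the column names (family_id, group_id, element_id) are hypothetical, not the real GETT_ columns. It shows a child table that references part of a composite parent key being rejected, while one that references the full key is accepted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

# Hypothetical two-part key; the real GETT_ tables have other columns.
conn.execute("""
    CREATE TABLE family_group_list (
        family_id INTEGER NOT NULL,
        group_id  INTEGER NOT NULL,
        PRIMARY KEY (family_id, group_id)
    )""")

# Child whose foreign key covers only PART of the parent key.
conn.execute("""
    CREATE TABLE elements_partial (
        element_id INTEGER PRIMARY KEY,
        family_id  INTEGER NOT NULL REFERENCES family_group_list (family_id)
    )""")

# Child whose foreign key covers the WHOLE parent key.
conn.execute("""
    CREATE TABLE elements_full (
        element_id INTEGER PRIMARY KEY,
        family_id  INTEGER NOT NULL,
        group_id   INTEGER NOT NULL,
        FOREIGN KEY (family_id, group_id)
            REFERENCES family_group_list (family_id, group_id)
    )""")

conn.execute("INSERT INTO family_group_list VALUES (1, 10)")


def try_insert(sql):
    """Run one INSERT and report whether the database accepted it."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.Error as exc:
        return f"rejected: {exc}"


partial_result = try_insert("INSERT INTO elements_partial VALUES (1, 1)")
full_result = try_insert("INSERT INTO elements_full VALUES (1, 1, 10)")
print(partial_result)
print(full_result)
```

If the same reasoning carries over to the EF model, the dependent table would need a column for every part of the principal's key, or the principal's key would need to shrink to just the columns the association actually links on.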

Related

Relational Database: Reusing the same table in a different interpretation

Problem description
I am currently working on a project which requires a relational database for storage.
After thinking about the data and its relations for a while I ran into a quite repetitive problem:
I encountered a common data schema for entity A which contains some fields e.g. name, description, value. This entity is connected with entity B in multiple n-1 relations. So entity B has n entities A in relation rel1 and n entities A in relation rel2.
Now I am trying to break down this data model into a schema for a relational database (e.g. Postgres, MySQL).
After some research, I have not really found "the best" solution for this particular problem.
Some similar questions I have found so far:
Stackoverflow
DBA Stackexchange
My ideas
So I have thought about possible solutions which I am going to present here:
1. Duplicate table
The relationship from entity B to entity A has a certain meaning to it, so it is possible to create multiple tables (one per relationship). This would solve all immediate problems but essentially duplicates the tables, which means that every change (e.g. a new column) now has to be applied to multiple tables.
2. Introduce a type column
Instead of multiple relationships, I could just say "Entity B is connected with n entity A". Additionally, I would add a type column that then tells me to which relation entity A belongs. I am not exactly sure how this is represented with common ORMs like Spring-Hibernate and if this introduces additional problems that I am currently unaware of.
3. Abstract the common attributes of entity A
Another option is to create an ADetails entity, which bundles all attributes of entity A.
Then I would create two entities that represent each relationship and which are connected to the ADetails entity in a 1-to-1 relationship. This would solve the interpretation problem of the foreign key but might be too much overhead.
My Question
In the context of a medium-large-sized project, are any of these solutions viable?
Are there certain Cons that rule out one particular approach?
Are there other (better) options I haven't thought about?
I appreciate any help on this matter.
Edit 1 - PPR (Person-Party-Role)
Thanks for the suggestion from AntC. PPR Description
I think the described situation matches my problem.
Let's break it down:
Entity B is an event. There exists only one event for the given participants to make this easier. So the relationship from event to participant is 1-n.
Entity A can be described as Groups, People, Organization but given my situation they all have the same attributes. Hence, splitting them up into separate tables felt like the wrong idea.
To explain the situation with the class diagram:
An Event (Entity B) has a collection of n Groups (Entity A), n People (Entity A) and n Organizations (Entity A).
If I understand correctly the suggestion is the following:
In my case the relationship between Event and Participant is 1-n
The RefRoles table represents the ParticipantType column that describes to which relationship the Participant belongs (is it a customer or part of the service for the event for example)
Because all my Groups, People and Organizations have the same attributes the only table required at this point is the Participant table
If there are individual attributes in the future I would introduce a new table (e.g. People) that references the Participant in a 1-1 relationship.
If multiple such tables are added, the foreign keys of the 1-1 relationships are mutually exclusive (so a participant can be only one of Group/Person/Organization)
Solution suggested by AntC and Christian Beikov
Splitting up the tables does make sense while keeping the common attributes in one table.
At the moment there are no individual attributes but the type column is not required anymore because the foreign keys can be used to see which relationship the entity belongs to.
I have created a small example for this:
There exist 3 types (previously type column) of people for an event: Staff, VIP, Visitor
The common attributes are mapped in a 1-1-relationship to the person table.
To make it simple: Each Person (Staff, VIP, Visitor) can only participate in one event. (Would be n-m-relationship in a more advanced example)
The database schema would be the following:
This approach is better than the type column in my opinion.
It also avoids having to interpret the entity based on its type in the application later on. It is also possible to resolve a type column in an ORM (see this question), but this approach avoids the struggle if the ORM you are using does not support resolving it.
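The schema diagram is not reproduced here. As a rough sketch of one schema that matches the description above (common attributes in a person table, one table per relationship mapped 1-1 to it, each linked to a single event), something like the following could work. It is run through Python's sqlite3 module for convenience, and all table and column names are assumptions made for illustration.

```python
import sqlite3

ROLES = ("staff", "vip", "visitor")  # one table per relationship (assumed names)

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE event  (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    -- Common attributes live in exactly one place.
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL, description TEXT);
""")
for role in ROLES:
    # person_id is both primary key and foreign key, which makes the link 1-1.
    conn.execute(f"""
        CREATE TABLE {role} (
            person_id INTEGER PRIMARY KEY REFERENCES person (id),
            event_id  INTEGER NOT NULL REFERENCES event (id)
        )""")

conn.execute("INSERT INTO event VALUES (1, 'Launch party')")
conn.execute("INSERT INTO person VALUES (1, 'Alice', NULL)")
conn.execute("INSERT INTO person VALUES (2, 'Bob', NULL)")
conn.execute("INSERT INTO staff VALUES (1, 1)")
conn.execute("INSERT INTO visitor VALUES (2, 1)")


def roles_for_event(event_id):
    """Return (name, role) pairs; the role comes from the table, not a type column."""
    parts = [
        f"SELECT p.name AS name, '{role}' AS role "
        f"FROM {role} r JOIN person p ON p.id = r.person_id "
        f"WHERE r.event_id = :event"
        for role in ROLES
    ]
    sql = " UNION ALL ".join(parts) + " ORDER BY name"
    return conn.execute(sql, {"event": event_id}).fetchall()


print(roles_for_event(1))
```

Note that nothing in this sketch stops the same person from appearing in two of the relationship tables; if the relationships must be mutually exclusive, that needs an extra constraint or an application-level check.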
IMO, since you already use dedicated terms for these objects, they will probably diverge over time. Splitting up a table afterwards is quite some work, on the code side as well, so I would suggest you map dedicated entities/tables from the beginning.

Need Help Creating Primary Key/Foreign Key Relationships between Multiple Tables

Background:
(I'm using Microsoft SQL Server 2014)
My company receives data files (tblFile) that contain many accounts (tblAccount). For each data file, we may perform multiple "pricings" (tblPricing), and these "pricings" may contain all of the accounts in the file, or only a subset of them, but the "pricings" cannot contain any accounts not in the file on which the pricing is based. So, in summary:
We get a single file
This file can contain many accounts
We create many pricings from this single file
Each pricing can contain all or some of the accounts from the file it is linked to, but no accounts not in that file
Here is a (way) simplified database diagram as it exists today:
Problem:
What's working so far:
The 1:Many relationship between tblFile and tblPricing
The Many:Many relationship between tblFile and tblAccount (an account can exist in multiple files)
The Many:Many relationship between tblPricing and tblAccount (because many pricings can be performed, an account can exist in many pricings)
Our problem comes from trying to enforce integrity between the subset of accounts that a file has and the subset of accounts that a pricing has. With the above structure, tblPricingAccounts can contain accounts not contained in tblFileAccounts, violating our need for each pricing to contain only accounts from the file on which it is based.
I've tried changing the foreign key relationships where I broke the link between tblPricingAccounts and tblAccount, removed 'acct_id' from tblPricingAccounts, and instead linked tblPricingAccounts to tblFileAccounts (yes, I know I need a primary key in tblFileAccounts and I had one). But, then I was able to insert whatever 'pricing_id' I wanted into tblPricingAccounts. Now I could link accounts to a pricing that had nothing to do with the file that originally contained those accounts.
Need
At the end of the day, I don't care what the structure or relationships of my database look like. I simply need the following criteria met, and I can't seem to wrap my mind around it:
A file contains many accounts.
A file contains many pricings.
A pricing contains many accounts, but those accounts MUST be contained in the file that the pricing is linked to.
Any help is appreciated, and I'm open to all suggestions that can be performed within SQL Server. Ultimately I'm building a web application around this database, and I'm using Entity Framework 6 to make life easier (mostly...). I could obviously enforce the above 3 needs through my code, but I really would like the database to be the last line of defense in enforcing this integrity.
This is a situation that foreign key constraints are not intended to handle. FK constraints test for existence of values between tables; they do not enforce particular cardinality requirements.
Simple cardinality is the "one to many", "many to many" relationships mentioned in the question. Your more complex need is still essentially about cardinality though: it's a requirement that certain subsets of rows relate to certain other subsets of rows in a particular way. "Windowed cardinality" if you will. (My own coinage as far as I know.)
As suggested in comments to the question, one way to enforce this wholly within the database is via triggers. A well-crafted trigger in this case would probably test whether the new rows to be inserted are valid, and raise an error without inserting them if not. For a bulk insert, you may wish to insert the valid rows and reject the rest, or reject the whole batch if one or more rows are invalid. You can also craft logic to handle updates or deletions that could break your integrity requirements.
Be aware that triggers will negatively affect performance, especially if the table is being changed frequently.
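To make the trigger idea concrete, here is a minimal sketch. It uses SQLite through Python's sqlite3 module purely so that it can be run as-is; SQL Server trigger syntax differs, so treat it as an illustration of the check rather than T-SQL to paste in. The table names come from the question, while the file_id column and the trigger name are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE tblFile    (file_id INTEGER PRIMARY KEY);
    CREATE TABLE tblAccount (acct_id INTEGER PRIMARY KEY);
    CREATE TABLE tblFileAccounts (
        file_id INTEGER NOT NULL REFERENCES tblFile (file_id),
        acct_id INTEGER NOT NULL REFERENCES tblAccount (acct_id),
        PRIMARY KEY (file_id, acct_id)
    );
    CREATE TABLE tblPricing (
        pricing_id INTEGER PRIMARY KEY,
        file_id    INTEGER NOT NULL REFERENCES tblFile (file_id)
    );
    CREATE TABLE tblPricingAccounts (
        pricing_id INTEGER NOT NULL REFERENCES tblPricing (pricing_id),
        acct_id    INTEGER NOT NULL REFERENCES tblAccount (acct_id),
        PRIMARY KEY (pricing_id, acct_id)
    );

    -- Reject a pricing/account pair unless the account is in the pricing's file.
    CREATE TRIGGER trg_pricing_account_in_file
    BEFORE INSERT ON tblPricingAccounts
    FOR EACH ROW
    WHEN NOT EXISTS (
        SELECT 1
        FROM tblPricing p
        JOIN tblFileAccounts fa ON fa.file_id = p.file_id
        WHERE p.pricing_id = NEW.pricing_id
          AND fa.acct_id = NEW.acct_id
    )
    BEGIN
        SELECT RAISE(ABORT, 'account is not in the file this pricing is based on');
    END;

    INSERT INTO tblFile VALUES (1), (2);
    INSERT INTO tblAccount VALUES (100), (200);
    INSERT INTO tblFileAccounts VALUES (1, 100), (2, 200);
    INSERT INTO tblPricing VALUES (10, 1);  -- pricing 10 is based on file 1
""")


def add_pricing_account(pricing_id, acct_id):
    """Try to link an account to a pricing; return True if the row was accepted."""
    try:
        conn.execute(
            "INSERT INTO tblPricingAccounts (pricing_id, acct_id) VALUES (?, ?)",
            (pricing_id, acct_id),
        )
        return True
    except sqlite3.DatabaseError:
        return False


in_file = add_pricing_account(10, 100)      # account 100 is in file 1
not_in_file = add_pricing_account(10, 200)  # account 200 is only in file 2
print(in_file, not_in_file)
```

This only guards inserts. Updates to tblPricingAccounts, and deletes from tblFileAccounts that would orphan an existing pricing row, would each need their own trigger.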
Other approaches are to handle this in application logic, as suggested, and/or to allow data into the tables regardless but validate the existing data periodically. For example, a nightly process could identify data failing this requirement and pass it to a human to correct.
It sounds like tblFileAccounts might be superfluous. Try removing it altogether and inferring which accounts exist in which files through the relationships captured in tblPricingAccounts and tblPricing.
If this meets your need, and there are no attributes (columns) which rightfully belong to the tblFileAccounts object (table), then I think your problem is solved.

Complex Database Relations (Junction Tables)

My question is about the idea of combining two junction tables into one, for similarly related tables. Please read on to see what I mean. Also note that this is indeed a problem I am faced with and therefore relevant to this forum. It is just a topic of broad consequence for which I'm hoping to elicit a bit more participation from various professionals, to get a better consensus on "best practice" if you will.
I have this rather challenging database design problem. I'm hoping this will be sort of a wiki that many people can contribute to and learn from. To make this easier, I've created a set of graphics, and will break the problem down into 1) Process, and 2) Structure.
Process Steps
A request (DocRequest) for documentation (Publication) is made.
A new publication is created IF said publication does not already exist.
A running log (StatusReport) is kept for progress on fulfilling the request.
Note: For any given Publication there may be many DocRequests and StatusReports (including updates)
Database Structure
Note: Both the DocRequest and StatusReport tables have numerous fields and supporting tables not shown in the attached graphics. Furthermore, a particular Publication is the master record to which all records in those tables belong.
--Current Implementation--
Note: The major flaw with this design is that whenever you create either a new DocRequest or a new StatusReport record, you also have to create a new record in the Publications table (which acts like a junction table), and this creates a new Publication as a result. This is not the desired behavior.
--Typical Implementation-- (for this type of relationship)
Note: This is OK, and probably ideal, but it handles updates to the DocRequest and StatusReport tables separately, independently linking each to the Publication to which it belongs.
--My Preferred Implementation-- (for this special case)
Note: The idea I had here was simply to combine the dual junction tables into one. In this case the junction table would get a new record any time either the DocRequest or the StatusReport table had an insert occur. I will likely handle this with a trigger.
Discussion
Now for the discussion. I would like to know from my fellow Database Developers if you think this is a bad idea, and what issues might arise from this. I think the net number of records should be identical as with the two separate junction tables, and in fact uses slightly less space by saving an extra ID column. :)
Let me know what you guys think. I would really like to get many people involved in this discussion. Cheers! :)
I think you're hurting yourself by thinking in terms of junction tables. Just think of tables.
Since StatusReport has to do with the status of the document request,
you need a table that relates those two somehow.
"StatusReport" is an awful name for a table that stores facts about the status of a document request.
"ID" is an awful name for any column in any table.
The id number of the publication seems to have more to do with the document request than with the status of the request. (You said, "A new publication is created IF said publication does not already exist." Frankly, that's skating pretty close to the edge of not making sense.) So the publication number almost certainly belongs in the DocRequest table.
Referring to the diagram of your preferred implementation, I'd drop the table TripleJunction, and replace StatusReport with this.
-- Predicate: Document request number (doc_request_id) has status (status)
-- as of date and time (status_as_of).
create table document_request_status (
    doc_request_id integer not null references DocRequest (id),
    status_as_of timestamp not null default current_timestamp,
    status varchar(10) not null,
    -- other columns go here
    primary key (doc_request_id, status_as_of)
);

What would you do to avoid conflicting data in this database schema?

I'm working on a multi-user internet database-driven website with SQL Server 2008 / LinqToSQL / custom-made repositories as the DAL. I have run across a normalization problem which can lead to an inconsistent database state if exploited correctly and I am wondering how to deal with the problem.
The problem: Several different companies have access to my website. They should be able to track their Projects and Clients at my website. Some (but not all) of the projects should be assignable to clients.
This results in the following database schema:
**Companies:**
ID
CompanyName
**Clients:**
ID
CompanyID (not nullable)
FirstName
LastName
**Projects:**
ID
CompanyID (not nullable)
ClientID (nullable)
ProjectName
This leads to the following relationships:
Companies-Clients (1:n)
Companies-Projects (1:n)
Clients-Projects(1:n)
Now, if a user is malicious, he might for example insert a Project with his own CompanyID, but with a ClientID belonging to another user, leaving the database in an inconsistent state.
The problem occurs in a similar fashion all over my database schema, so I'd like to solve this in a generic way if at all possible. I had the following two ideas:
Check in the DAL for database writes that might lead to inconsistencies. This would be generic, but it requires some additional database queries before each update or create query is performed, so it will cost some performance.
Create an additional table for the clients-Projects relationship and make sure the relationships created this way are consistent. This also requires some additional select queries, but far less than in the first case. On the other hand it is not generic, so it is easier to miss something in the long run, especially when adding more tables / dependencies to the database.
What would you do? Is there any better solution I missed?
Edit: You might wonder why the Projects table has a CompanyID. This is because I want users to be able to add projects with and without clients. I need to keep track of which company (and therefore which website user) a clientless project belongs to, which is why a project needs a CompanyID.
I'd go with the latter, having one or more tables that define the allowable relationships between entities.
Note, there's no circularity in the references you have, so the title is misleading.
What you have is the possibility of conflicting data, that's different.
Why do you have "CompanyID" in the project table? The ID of the company involved is implicitly given by the client you link to. You don't need it.
Remove that column and you've removed your problem.
Additionally, what is the purpose of the "name" column in the client table? Can you have a client with one name, differing from the name of the company?
Or is "client" the person at that company?
Edit: OK, with the clarification about projects without clients, I would separate out the references, but you're not going to get rid of the problem you're describing without constraints that prevent multiple references being made.
A simple constraint for your existing tables would be that the CompanyID and ClientID fields of a project row cannot both be non-null at the same time.
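A sketch of what that constraint could look like, run here against SQLite via Python's sqlite3 module rather than SQL Server 2008. It assumes Projects.CompanyID is made nullable, so that a project with a client gets its company only through the client; the sample company and client rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Companies (ID INTEGER PRIMARY KEY, CompanyName TEXT NOT NULL);
    CREATE TABLE Clients (
        ID        INTEGER PRIMARY KEY,
        CompanyID INTEGER NOT NULL REFERENCES Companies (ID),
        FirstName TEXT,
        LastName  TEXT
    );
    CREATE TABLE Projects (
        ID          INTEGER PRIMARY KEY,
        CompanyID   INTEGER REFERENCES Companies (ID),  -- assumption: now nullable
        ClientID    INTEGER REFERENCES Clients (ID),
        ProjectName TEXT NOT NULL,
        -- At most one of the two references may be set on a row.
        CHECK (CompanyID IS NULL OR ClientID IS NULL)
    );
    INSERT INTO Companies VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO Clients VALUES (1, 2, 'Jane', 'Doe');  -- client 1 belongs to company 2
""")


def insert_project(company_id, client_id, name):
    """Try to insert a project; return True if the database accepted it."""
    try:
        conn.execute(
            "INSERT INTO Projects (CompanyID, ClientID, ProjectName) VALUES (?, ?, ?)",
            (company_id, client_id, name),
        )
        return True
    except sqlite3.IntegrityError:
        return False


clientless = insert_project(1, None, "Internal tool")   # company only
with_client = insert_project(None, 1, "Client site")    # client only
conflicting = insert_project(1, 1, "Mixed ownership")   # both set: rejected
print(clientless, with_client, conflicting)
```

As written, the check still allows a row where both columns are NULL; if every project must belong to someone, the condition can be tightened to require exactly one of the two.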
If you want to keep using the tables like this and avoid all the new queries, just put triggers on the table; when a user tries to insert a row with wrong data, the trigger will stop him.
Best Regards,
Iordan
My first thought would be to create a special client record for each company with the name "No client". Then eliminate the CompanyId from the Project table, and if a project has no client, use the "No client" record rather than a "normal" client record. If the processing of such no-client projects is special, add a flag to the no-client record to explicitly identify it. (I'd hate to rely on the name being "No Client" or something like that -- too fuzzy.)
Then there would be no way to store inconsistent data so the problem would go away.
In the end I implemented a completely generic solution which solves my problem without much runtime overhead and without requiring any changes to the database. I'll describe it here in case someone else has the same problem.
First off, the approach only works because the only table that other tables are referencing through multiple paths is the Companies table. Since this is the case in my database, I only have to check whether all n:1 referenced entities of each entity that is to be created / updated / deleted are referencing the same company (or no company at all).
I am enforcing this by deriving all of my Linq entities from one of the following types:
SingleReferenceEntityBase - The norm. Only checks (via reflection) if there really is only one reference (no matter if transitive or intransitive) to the Companies table. If this is the case, the references to the companies table cannot become inconsistent.
MultiReferenceEntityBase - For special cases such as the Projects table above. Asks all directly referenced entities what company ID they are referencing. Raises an exception if there is an inconsistency. This costs me a few select queries per CRUD operation, but since MultiReferenceEntities are much rarer than SingleReferenceEntities, this is negligible.
Both of these types implement a "CheckReferences" and I am calling it whenever the linq entity is written to the database by partially implementing the OnValidate(System.Data.Linq.ChangeAction action) method which is automatically generated for all Linq entities.

database design: a 'code' table that gets referenced by other entities

I am building a database as a simple exercise. It could be hosted on any database server, so I am trying to keep things as standard as possible. Basically, what I would like to have is a 'code' table that gets referenced by other entities. To explain:
xcode
    id  code
    r   role
    p   property
code
    r   admin
    r   staff
    p   title
    ....
then I would like to have some views like:
role (select * from code where xcode='r')
    r   admin
    r   staff
property (select * from code where xcode='p')
    p   title
then, suppose we have an entity
myentity
    id - 1
    role - admin (foreign key to role)
    title - title (foreign key to property)
Obviously I cannot create a foreign key to a view, but this is to convey the idea I have in mind. How can I get this behaviour using standard SQL syntax wherever possible and, as a second option, additional database features such as triggers, etc.?
Because if I declare role and title in myentity as foreign keys to 'code' instead of to the views, nothing would stop me from inserting a role in the title field.
I have worked on systems with a single table for all codes and others with one table per code. I definitely prefer the latter approach.
The advantages of a table per code are:
Foreign keys. As you have already spotted it is not possible to enforce compliance to permitted values through foreign keys with a single table. Using check constraints is an alternative approach but it has a higher maintenance cost.
Performance. Code lookups are not normally a performance bottleneck, but it undoubtedly helps the optimizer to make sensible decisions about execution paths if it knows it is retrieving records from a table with four rows rather than four hundred.
Code groups. Sometimes we want to organise a code into sub-divisions, usually to make it easier to render complex lists of values. If we have a table per code we have more flexibility when it comes to structure.
In addition, I notice that you want to be able to deploy "on any database server". In that case avoid triggers. Triggers are bad news in most scenarios anyway, and on top of that they have product-specific syntax.
What you are trying to do is in most cases an anti pattern and design mistake. Just create the different tables instead of views.
There are some rare cases where this kind of design makes sense. In such cases, include the xcode field in the primary key / foreign key, so your entity will look like this:
myentity
    id - 1
    role_xcode
    role - admin (foreign key to role)
    title_xcode
    title - title (foreign key to property)
You then can create check constraints to enforce role_xcode='r' and title_xcode='p'
(sorry I don't know if they are standard, they do exist in oracle and are so simple that I'd expect them on other rdbms's as well)
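A sketch of how that could look, run on SQLite through Python's sqlite3 module for convenience. It assumes the code table is keyed on (xcode, code), and the DEFAULT values on the two xcode columns are an addition for convenience rather than part of the suggestion. CHECK constraints and composite foreign keys are both part of standard SQL, though individual products differ in how strictly they enforce them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE code (
        xcode TEXT NOT NULL,
        code  TEXT NOT NULL,
        PRIMARY KEY (xcode, code)
    );
    CREATE TABLE myentity (
        id          INTEGER PRIMARY KEY,
        role_xcode  TEXT NOT NULL DEFAULT 'r' CHECK (role_xcode = 'r'),
        role        TEXT NOT NULL,
        title_xcode TEXT NOT NULL DEFAULT 'p' CHECK (title_xcode = 'p'),
        title       TEXT NOT NULL,
        -- Each composite key pins the column to one group of codes.
        FOREIGN KEY (role_xcode, role)   REFERENCES code (xcode, code),
        FOREIGN KEY (title_xcode, title) REFERENCES code (xcode, code)
    );
    INSERT INTO code VALUES ('r', 'admin'), ('r', 'staff'), ('p', 'title');
""")


def try_sql(sql):
    """Run one statement; return True if the database accepted it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# A role from the 'r' group and a property from the 'p' group: accepted.
valid = try_sql("INSERT INTO myentity (id, role, title) VALUES (1, 'admin', 'title')")
# A property used as a role: ('r', 'title') is not in code, so the FK fails.
wrong_group = try_sql("INSERT INTO myentity (id, role, title) VALUES (2, 'title', 'title')")
# Overriding role_xcode to make the FK match is stopped by the CHECK.
bypass = try_sql(
    "INSERT INTO myentity (id, role_xcode, role, title) VALUES (3, 'p', 'title', 'title')"
)
print(valid, wrong_group, bypass)
```

The cost is two extra columns per entity that always hold the same value, which is part of why separate tables per code are usually simpler.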

Resources