Entity Framework shows the same row multiple times - sql-server

This is weird. I am using an ASP.NET MVC application and Entity Framework to map a view from my database.
I don't know why, but the query returns the same rows multiple times (5 rows, 2 times each), while in the database the view shows me 10 distinct rows.
I don't understand what is going on.
Please help!

This is a well-known issue with views. Since a view in SQL Server (unlike an actual table) doesn't have a defined primary key, EF will use all non-nullable columns of the view as a substitute primary key. These might be strings or other datatypes, and they might simply not make up a "good" primary key.
Now when EF reads the data, it comes across the first row in question, reads it into the data set, and determines what the substitute primary key for that row is. When it then reads the next row from the database view and the non-nullable columns are all the same, EF interprets this as "this is the same row again": it will NOT actually store the values from that row, but will just reuse the row it read before - since the primary key is the same, that's a valid approach from EF's point of view.
How to solve this?
You can explicitly define an EF-side primary key for your view entity that is in fact distinct for each row read, or
you can include the primary key columns of all the tables involved in the view - that way, the unique values from each table will be present in the view, and EF will properly recognize those distinct rows as being distinct (a sketch follows below).
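A minimal sketch of both options against a made-up view (every object and column name below is an illustration, not taken from the question): a ROW_NUMBER() column gives the view a key that is guaranteed distinct per row, and including each base table's PK column achieves the same effect.

CREATE VIEW dbo.vOrderSummary
AS
SELECT
    ROW_NUMBER() OVER (ORDER BY o.OrderID) AS RowNum,  -- guaranteed distinct; map this as the entity key in EF
    o.OrderID,       -- including each base table's PK column also works
    c.CustomerID,
    c.CustomerName,
    o.OrderDate
FROM dbo.Orders AS o
INNER JOIN dbo.Customers AS c ON c.CustomerID = o.CustomerID;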

Related

SQL: Joining views on computed columns vs performance

I have some SQL tables with a primary key that includes more than one column. I created a view on these
tables and added a computed column that is a concatenation of the table's primary key columns, separated by a separator (for example: ColumnA$ColumnB$ColumnC is the concatenation of columns A, B and C, which make up the table's key).
When I use this view, I filter on the computed column to work with the primary key.
In another case I have a query that joins several such views. The foreign key on a view is computed in the same way as the primary key, and the joins are on the computed columns.
The goal of this work is to simplify the key, in order to simplify integration with other software.
Could this execution scenario significantly affect performance?
Thanks in advance
Luca
A better idea would be to keep these columns separate, just as you have them natively in your tables; then you can create your index/PK based on all 3 columns, not just a concatenated single one. For performance I would probably suggest using an indexed view here (see the sketch below). Alternatively, if we are talking about 3 string columns, you can use some hashing technique, as long as you can handle the rare hash-collision case on your application end.
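A sketch of such an indexed view, keeping the three key columns separate (ColumnA/B/C are from the question; the view, table, and data column names are assumptions):

CREATE VIEW dbo.vTableKeys
WITH SCHEMABINDING
AS
SELECT ColumnA, ColumnB, ColumnC, SomeDataColumn
FROM dbo.MyTable;
GO
-- The unique clustered index is what makes the view an "indexed view";
-- it is built on the three key columns kept separate, no concatenation needed.
CREATE UNIQUE CLUSTERED INDEX IX_vTableKeys
ON dbo.vTableKeys (ColumnA, ColumnB, ColumnC);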

MSSQL and Entity Framework: syncing PKs on multiple tables

I've been working with Entity Framework in C#, trying to figure out how to join two tables together. I found a reference here http://msdn.microsoft.com/en-us/data/jj715646.aspx on how to do this. The problem is, the two tables have PKs that are not in sync, which seems to be a requirement. I've never had to worry about syncing PKs from two tables in a database before. I know I can turn off identity insert on one table, but I see comments from numerous people that this is a very bad idea. If I'm not supposed to do this, then how do I accomplish syncing the PKs in each of the tables?
I have two tables in a database:
User
pkID (int)
FirstName (varchar)
LastName (varchar)
Email (varchar)
...
LockedFlags (flags that lock individual User fields from being edited)
pkID
fkUserID
bFirstName (bool)
bLastName (bool)
bEmail (bool)
I'm curious why people think that turning off identity insert on a table is a bad idea... if I'm relying on MSSQL to assign a PK, then I could see an instance where, under multiple concurrent writes, the second table's write ends up with a different value.
It sounds like you have orphaned rows in the LockedFlags table, like a row with a user ID that points to a user that has been deleted. Depending on how the relationship is set up, the reverse can also be true.
If you have an entity where the 2 tables are combined into a single class, loading the entity set will query both tables and require matching pairs of rows.
Your LockedFlags probably has a User property which it is trying to load and cannot find in the user table.
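A quick way to check for such orphans, sketched against the table and column names from the question (assuming the FK column is fkUserID as shown above):

-- Rows in LockedFlags whose fkUserID has no matching User row
SELECT lf.*
FROM LockedFlags AS lf
WHERE NOT EXISTS (SELECT 1 FROM [User] AS u WHERE u.pkID = lf.fkUserID);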
Table options:
Note: I'm using the MSSQL equivalent as I don't know MySQL.
Comments regarding your data model:
I don't know how MySQL handles record locking, but if it is anything like MSSQL, you do not have to worry about handling it manually.
I would strongly suggest re-examining your data model if you're going to use it as is. Just using a single table would be best if you really want to manually lock individual row fields.
Edit:
-- FK from LockedFlags.fkUserID to User.pkID, matching the schema above;
-- [User] is bracketed because USER is a reserved word in T-SQL.
ALTER TABLE LockedFlags ADD CONSTRAINT
FK_LockedFlags_User FOREIGN KEY
(
    fkUserID
) REFERENCES [User]
(
    pkID
) ON UPDATE NO ACTION
ON DELETE NO ACTION
GO

SQL Server 2008 - Database Design Query

I have to load the data shown in the image below into my database.
For a particular row, either the PartID field or the GroupID field will be NULL, and the other available columns refer to the non-NULL entity. I have the following three options:
Use one database table with a single unified column, say ID, which will hold both PartID and GroupID data. But in this case I won't be able to apply a foreign key constraint, as this column will contain both entities' data.
Use one database table with columns for both PartID and GroupID, each containing the respective data. For each row, one of them will be NULL, but in this case I will be able to apply foreign key constraints.
Use two database tables with similar structure, the only difference being the PartID vs. GroupID column. In this case I will be able to apply foreign key constraints.
One thing to note here is that the table(s) will be used in import processes to import about 30,000 rows in one go, and will also be heavily used in data retrieval operations. Also, the other columns will be used as pivot columns.
Can someone please suggest what the best approach would be?
I would use option 2 and add a constraint that exactly one of the two can be non-null and the other must be null (just to be safe) - a sketch follows below. I would not use option 1, because of the lack of an FK and the possibility of linking to the wrong table when a join does not obey the type identifier.
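A sketch of that constraint on a hypothetical table holding the imported rows (the table and constraint names are assumptions; PartID and GroupID are from the question):

ALTER TABLE dbo.ImportedRows ADD CONSTRAINT CK_ImportedRows_PartOrGroup
CHECK (
    (PartID IS NOT NULL AND GroupID IS NULL)
 OR (PartID IS NULL AND GroupID IS NOT NULL)
);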
There is a 4th option, which is to normalize them as "items" with another (surrogate) key and two link tables which link items to either parts or groups. This eliminates NULLs. There are further problems with that approach (items might be in both again or neither without any simple constraint), so unless that is necessary for other reasons, I wouldn't generally go down that path.
Option 3 could be fine - it really depends on whether these rows are a relation, i.e. data associated with a primary key. That's one huge problem I see with the data presented: the lack of a candidate key. I think you need to address that first.
IMO option 2 is the best - it's not perfectly normalized but will be the easiest to work with. 30K rows is not a lot of rows to import.
I would modify the table so it has one ID column and then add an IDType that is either "G" for Group or "P" for Part.

When is having an identity column not a good idea?

In tables where you need only one column as the key, and values in that column can be integers, when shouldn't you use an identity field?
Conversely, for the same table and column, when would you generate its values manually rather than using an autogenerated value for each record?
I guess that would be the case when there are lots of inserts and deletes on the table. Am I right? What other situations could there be?
If you have already settled on the surrogate side of the Great Primary Key Debacle, then I can't find a single reason not to use identity keys. The usual alternatives are GUIDs (they have many disadvantages, primarily their size and randomness) and application-layer generated keys. But creating a surrogate key in the application layer is a little bit harder than it seems, and it also does not cover non-application data access (i.e. batch loads, imports, other apps, etc.). The one special case is distributed applications, where GUIDs and even sequential GUIDs may offer a better alternative to site-id + identity keys.
I suppose if you are creating a many-to-many linking table, where both fields are foreign keys, you don't need an identity field.
Nowadays I imagine that most ORMs expect there to be an identity field in every table. In general, it is a good practice to provide one.
I'm not sure I understand enough about your context, but I interpret your question to be:
"If I need the database to create a unique column (for whatever reason), when shouldn't it be a monotonically increasing integer (identity) column?"
In those cases, there's no reason to use anything other than the facility provided by the DBMS for the purpose; in your case (SQL Server?) that's an identity.
Except:
If you'll ever need to merge the table with data from another source, use a GUID, which will prevent keys from colliding.
If you need to merge databases it's a lot easier if you don't have to regenerate keys.
One case of not wanting an identity field would be in a one to one relationship. The secondary table would have as its primary key the same value as the primary table. The only reason to have an identity field in that situation would seem to be to satisfy an ORM.
You cannot (normally) specify values when inserting into identity columns, so, for example, if the column "id" is an identity, the following SQL would fail:
INSERT INTO MyTable (id, name) VALUES (1, 'Smith')
In order to perform this sort of insert you need to turn IDENTITY_INSERT ON for that table - it is not intended to be on normally, and it can only be ON for at most one table per session at any point in time.
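For completeness, the explicit insert then looks like this (IDENTITY_INSERT is scoped to the current session):

SET IDENTITY_INSERT MyTable ON;
INSERT INTO MyTable (id, name) VALUES (1, 'Smith');  -- the explicit id value is now accepted
SET IDENTITY_INSERT MyTable OFF;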
If I need a surrogate, I would either use an IDENTITY column or a GUID column depending on the need for global uniqueness.
If there is a natural primary key, or the primary key is defined as a unique combination of other foreign keys, then I typically do not have an IDENTITY, nor do I use it as the primary key.
There is an exception: snapshot configuration tables which I am tracking with an audit trigger. In this case, there is usually a logical "primary key" (usually the date of the snapshot and the natural key of the row - like a cost center or GL account number for which the row is a configuration record). But instead of using that natural "primary key" as the primary key, I add an IDENTITY, make it the primary key, and put a unique index or constraint on the date and natural key. Although theoretically the date and natural key shouldn't change, if a user changes them instead of adding a new row and deleting the old one, I want the audit (which reflects a change to a row identified by its primary key) to really reflect a change in the row - not the disappearance of one key and the appearance of a new one. A sketch of this layout follows below.
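A minimal sketch of that pattern (all table and column names here are hypothetical):

CREATE TABLE dbo.ConfigSnapshot (
    SnapshotID   int IDENTITY(1,1) NOT NULL
        CONSTRAINT PK_ConfigSnapshot PRIMARY KEY,       -- surrogate key the audit trigger tracks
    SnapshotDate date         NOT NULL,
    CostCenter   varchar(20)  NOT NULL,                 -- natural key of the row
    ConfigValue  decimal(9,2) NULL,
    CONSTRAINT UQ_ConfigSnapshot_DateKey UNIQUE (SnapshotDate, CostCenter)
);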
I recently implemented a Suffix Trie in C# that could index novels, and then allow searches to be done extremely fast, in time linear in the length of the search string. Part of the requirements (this was a homework assignment) was to use offline storage, so I used MS SQL, and needed a structure to represent a Node in a table.
I ended up with the following structure: NodeID, Character, ParentID, etc., where NodeID was the primary key.
I didn't want this to be an autoincrementing identity, for two main reasons:
How do I get the value of a NodeID right after I add it to the database/data table?
I wanted more control when it came to generating my own IDs.

Foreign key referencing composite table

I've got a table structure that I'm not really certain how best to create.
Basically I have two tables, tblSystemItems and tblClientItems. I have a third table that has a column that references an 'Item'. The problem is, this column needs to reference either a system item or a client item - it does not matter which. System items have keys in the 1..2^31 range while client items have keys in the range -1..-2^31, thus there will never be any collisions.
Whenever I query the items, I'm doing it through a view that does a UNION ALL between the contents of the two tables.
Thus, optimally, I'd like to make a foreign key reference the result of the view, since the view will always be the union of the two tables - while still keeping IDs unique. But I can't do this as I can't reference a view.
Now, I can just drop the foreign key, and all is well. However, I'd really like to have some referential checking and cascading delete/set null functionality. Is there any way to do this, besides triggers?
Sorry for the late answer, I've been struck with a serious case of weekenditis.
As for utilizing a third table to include PKs from both client and system tables - I don't like that, as it just overly complicates synchronization and still requires my app to know about the third table.
Another issue that has arisen is that I have a third table that needs to reference an item - either system or client, it doesn't matter which. Having the tables separated basically means I need two columns, a ClientItemID and a SystemItemID, each with a constraint against its own table plus nullability - rather ugly.
I ended up choosing a different solution. The whole issue was with easily synchronizing new system items into the tables without messing with client items, avoiding collisions and so forth.
I ended up creating just a single table, Items. Items has a bit column named SystemItem that defines, well, the obvious. In my development/system database, the PK is an int IDENTITY(1,1). When the table is created in the client database, the identity is instead (-1,-1). That means client items go negative while system items go positive.
For synchronization I basically ignore anything with (SystemItem = 1) while synchronizing the rest using IDENTITY_INSERT ON. Thus I'm able to synchronize while completely ignoring client items and avoiding collisions. I'm also able to reference just one Items table, which covers both client and system items. The only thing to keep in mind is to make the standard clustered key descending, to avoid all kinds of page restructuring when the client inserts new items (client updates vs. system updates is like 99%/1%). A sketch of the two variants of the table follows below.
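A sketch of the table in both databases (the Name column is an assumption; note that the seed/increment of an existing identity column can't simply be altered, so the client database's table would be created with the negative identity from the start):

-- Development / system database: system items get positive IDs
CREATE TABLE dbo.Items (
    ItemID     int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    SystemItem bit NOT NULL,
    Name       nvarchar(100) NOT NULL
);

-- Client database: the same table, but client-inserted rows go negative,
-- so they can never collide with synced system rows
CREATE TABLE dbo.Items (
    ItemID     int IDENTITY(-1,-1) NOT NULL PRIMARY KEY,
    SystemItem bit NOT NULL,
    Name       nvarchar(100) NOT NULL
);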
You can create a unique id (db-generated - sequence, autoinc, etc.) for the table that references items, and create two additional columns (tblSystemItemsFK and tblClientItemsFK) where you reference the system items and client items respectively - some databases allow you to have a foreign key that is nullable.
If you're using an ORM you can even easily distinguish client items from system items based on column information alone (this way you don't need negative identifiers to prevent ID overlap).
With a little more background/context it is probably easier to determine an optimal solution.
You probably need a table, say tblItems, that simply stores all the primary keys of the two tables. Inserting items would require two steps, to ensure that when an item is entered into the tblSystemItems table its PK is also entered into the tblItems table.
The third table then has an FK to tblItems. In a way, tblItems is a parent of the other two item tables. To query for an item it would be necessary to JOIN tblItems, tblSystemItems and tblClientItems.
[EDIT - for comment below] If tblSystemItems and tblClientItems control their own PKs, you can still let them. You would probably insert into tblSystemItems first, then insert into tblItems. When you implement an inheritance structure using a tool like Hibernate, you end up with something like this.
Add a table called Items with a PK ItemID and a single column called ItemType ("System" or "Client"), then have the ClientItems table PK (named ClientItemId) and the SystemItems PK (named SystemItemId) both also be FKs to Items.ItemId. (These relationships are zero-to-one (0..1) relationships.)
Then, in your third table that references an item, just have its FK constraint reference the ItemId in this extra (Items) table...
If you are using stored procedures to implement inserts, just have the stored proc that inserts items insert a new record into the Items table first, and then, using the auto-generated (identity) PK value from that table, insert the actual data record into either SystemItems or ClientItems (depending on which it is) as part of the same stored proc call.
This is called "SubClassing". A sketch of the schema follows below.
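A sketch of that subclassing layout (the data columns are placeholders for the real ones):

CREATE TABLE Items (
    ItemID   int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    ItemType varchar(6) NOT NULL CHECK (ItemType IN ('System', 'Client'))
);

CREATE TABLE SystemItems (
    SystemItemId int NOT NULL PRIMARY KEY
        REFERENCES Items (ItemID),    -- zero-to-one: each Items row has at most one subtype row
    SystemData   nvarchar(100) NULL   -- placeholder for the real columns
);

CREATE TABLE ClientItems (
    ClientItemId int NOT NULL PRIMARY KEY
        REFERENCES Items (ItemID),
    ClientData   nvarchar(100) NULL   -- placeholder for the real columns
);

-- The third table only ever references the supertype:
--   FOREIGN KEY (ItemID) REFERENCES Items (ItemID)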
I've been puzzling over your table design. I'm not certain that it is right. I realise that the third table may just be providing detail information, but I can't help thinking that the primary key is actually the one in your ITEM table and the FOREIGN keys are the ones in your system and client item tables. You'd then just need to do right outer joins from Item to the system and client item tables, and all constraints would work fine.
I have a similar situation in a database I'm using. I have a "candidate key" on each table that I call EntityID. Then, if there's a table that needs to refer to items in more than one of the other tables, I use EntityID to refer to that row. I do have an Entity table to cross reference everything (so that EntityID is the primary key of the Entity table, and all other EntityID's are FKs), but I don't find myself using the Entity table very often.