Database Table Primary Keys - database

I was recently developing a .NET MVC web application and sent the database schema to a DBA I work with to get the database built on a production DB server. The DBA asked me if I needed primary keys in all my tables. I said yes, primarily because it is good DB design practice. When I asked why, the DBA told me that our organization prefers to minimize the number of tables with primary keys on its database servers in order to conserve resources. Is there some sort of detriment to having primary keys in data tables?

When you make a 'join table', the primary keys from each contributing table form a composite key for the join table. It is then quite possible that this composite key will be indexed.
Inefficient indexing strategies can degrade performance.
An example is the InnoDB engine for MySQL, which I work with a lot. With InnoDB, every secondary index entry also stores the value of the corresponding primary key. When a query reads a record via a secondary index, that primary key value is then used to find the record.
So the primary key can affect performance, especially if it is something big like a Java UUID (128 bits, or 36 characters when stored as text).
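To put rough numbers on the key-size point, here is a small Python sketch. It assumes a UUID stored either in binary form or as text, compared with a 4-byte integer key. The row count and the number of secondary indexes are made-up figures, and the estimate ignores page and record overhead, so treat it as an order-of-magnitude illustration only.

```python
import uuid

INT_KEY_BYTES = 4                          # a typical 32-bit integer key
sample_uuid = uuid.uuid4()
UUID_BINARY_BYTES = len(sample_uuid.bytes)  # 128 bits stored as raw bytes
UUID_TEXT_BYTES = len(str(sample_uuid))     # hyphenated text form


def pk_overhead_bytes(key_bytes, rows, secondary_indexes):
    """Bytes spent repeating the primary key inside secondary index entries."""
    return key_bytes * rows * secondary_indexes


# Hypothetical workload, chosen only for illustration.
ROWS = 10_000_000
SECONDARY_INDEXES = 3

for label, size in [
    ("INT key", INT_KEY_BYTES),
    ("UUID as binary", UUID_BINARY_BYTES),
    ("UUID as text", UUID_TEXT_BYTES),
]:
    megabytes = pk_overhead_bytes(size, ROWS, SECONDARY_INDEXES) / 1_000_000
    print(f"{label}: about {megabytes:.0f} MB of key copies in secondary indexes")
```

The point is the ratio between the rows rather than the absolute figures: a wider primary key is paid for once per row in every secondary index.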

Related

IdentityServer4 design of the PersistedGrants table

The PersistedGrants table has ClientId, SubjectId and Type columns as nvarchars. I would expect them to be foreign keys referencing Clients, Subjects and Type tables instead. I've been wondering why this pattern was chosen. Does it perform better this way despite taking more space?
Also, to keep everything in one thread: how can I configure IdentityServer4 to delete expired rows (keys)?
Thanks
I'm not the author but I'd imagine it's because the design of the framework is not wedded to having to use a relational DB. The repositories for configuration and operational data are separate and could live in physically separate databases and therefore enforcing referential integrity when you happen to be using the same DB for both doesn't really make sense.

SQL Server Foreign Keys across database boundaries - techniques for enforcement

I have two separate SQL Server 2005 databases (on the same server): a security database and a main application database.
The security database has a user table with everything needed to authenticate.
The application database has a person table with extended user details. There is a 1-1 mapping between the security database user table and the application database person table.
I want to enforce a mapping between the user and the person table. I'm assuming that foreign keys can't be mapped across databases, so I am wondering what to do to enforce the integrity of the relationship.
Cross-database foreign keys are indeed not supported:
Msg 1763, Level 16, State 0, Line 2
Cross-database foreign key references are not supported.
If you really want to enforce the referential integrity on the database side, you will have to rely on triggers (which I don't recommend).
To make your code more maintainable, you could create synonyms for the tables you want to check referential integrity on.
CREATE SYNONYM myTable FOR otherdatabase.dbo.myTable;
This only makes the "manual" checks easier, as you cannot create foreign keys on a synonym.
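One form such a "manual" check could take is a periodic query for orphaned rows. The sketch below is an illustration, not SQL Server code: it uses Python's sqlite3 module with an attached in-memory database standing in for the separate security database, and the table and column names (users, person, user_id) are hypothetical.

```python
import sqlite3

# "main" plays the application database; "security" is attached as a second database.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS security")

conn.execute(
    "CREATE TABLE security.users (user_id INTEGER PRIMARY KEY, login TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE main.person ("
    " person_id INTEGER PRIMARY KEY,"
    " user_id INTEGER NOT NULL UNIQUE,"  # UNIQUE keeps the mapping 1-1
    " full_name TEXT)"
)

conn.executemany(
    "INSERT INTO security.users (user_id, login) VALUES (?, ?)",
    [(1, "alice"), (2, "bob")],
)
conn.executemany(
    "INSERT INTO main.person (person_id, user_id, full_name) VALUES (?, ?, ?)",
    [(10, 1, "Alice A."), (11, 3, "Orphan O.")],  # user 3 does not exist
)


def find_orphaned_people(connection):
    """Return person_ids whose user_id has no match in the security database."""
    rows = connection.execute(
        """
        SELECT p.person_id
        FROM main.person AS p
        LEFT JOIN security.users AS u ON u.user_id = p.user_id
        WHERE u.user_id IS NULL
        ORDER BY p.person_id
        """
    ).fetchall()
    return [row[0] for row in rows]


orphans = find_orphaned_people(conn)
print(orphans)
```

A check like this only detects broken references after the fact. Preventing them at write time is what the triggers mentioned above would be for.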
It's a lot of work but you may think about merging those two databases into a single database. If you want a logical difference between objects within the database, you can use a schema.

Primary Keys in Oracle and SQL Server

What's the best practice for handling primary keys using an ORM over Oracle or SQL Server?
Oracle - Should I use a sequence and a trigger, or let the ORM handle this? Or is there some other way?
SQL Server - Should I use the identifier data type or something else?
If you are using any kind of ORM, I would suggest letting it handle primary key generation, in both SQL Server and Oracle.
With either database, I would use a client-generated Guid for the primary key (which would map to uniqueidentifier in SQL Server, or RAW(16) in Oracle). Despite the performance penalty on JOINs when using a Guid foreign key, I tend to work with disconnected clients and replicated databases, so being able to generate unique IDs on the client is a must. Guid IDs also have advantages when working with an ORM, as they simplify your life considerably.
It is a good idea to remember that databases tend to have a life independent from a front end application. Records can be inserted by batch processes, web services, data exchange with other databases, heck, even different applications sharing the same database.
Consequently, it is useful if a database table is in charge of its own identity, or at least has that capability. For instance, in Oracle a BEFORE INSERT trigger can check whether a value has been provided for its primary key, and if not, generate its own.
Both Oracle and SQL Server can generate GUIDs, so that is not a sufficient reason for delegating identity generation to the client.
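The idea of a table that accepts a supplied key but can also generate its own can be sketched as follows. This is not the Oracle trigger described above: as a stand-in it uses a column DEFAULT in SQLite (via Python's sqlite3 module), and the customer table and its columns are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The DEFAULT expression runs only when the INSERT supplies no customer_id,
# so the table can generate its own key but also accepts one from a client.
conn.execute(
    """
    CREATE TABLE customer (
        customer_id TEXT PRIMARY KEY DEFAULT (lower(hex(randomblob(16)))),
        name        TEXT NOT NULL
    )
    """
)

# A batch process that knows nothing about key generation.
conn.execute("INSERT INTO customer (name) VALUES (?)", ("from a batch job",))

# A front end that generated its own identifier.
conn.execute(
    "INSERT INTO customer (customer_id, name) VALUES (?, ?)",
    ("client-supplied-id", "from the front end"),
)

keys_by_name = dict(conn.execute("SELECT name, customer_id FROM customer").fetchall())
for name, key in keys_by_name.items():
    print(f"{name}: {key}")
```

Either way, every writer ends up with a valid key regardless of whether it came through the ORM.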
Sometimes, there is a natural, unique identifier for a table. For instance, each row in a User table can be uniquely identified by the UserName column. In that case, it may be best to use UserName as the primary key.
Also, consider tables used to form a many to many relationship. A UserGroupMembership table will contain UserId and GroupId columns, which should be the primary key, as the combination uniquely identifies the fact that a particular user is a member of a particular group.
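A minimal sketch of such a composite key, run against SQLite through Python's sqlite3 module for convenience. The referenced User and Group tables and their foreign keys are left out to keep it short, so this shows only the uniqueness behaviour.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The pair (UserId, GroupId) is the primary key: no surrogate ID column is needed.
conn.execute(
    """
    CREATE TABLE UserGroupMembership (
        UserId  INTEGER NOT NULL,
        GroupId INTEGER NOT NULL,
        PRIMARY KEY (UserId, GroupId)
    )
    """
)

insert_sql = "INSERT INTO UserGroupMembership (UserId, GroupId) VALUES (?, ?)"
conn.execute(insert_sql, (1, 100))
conn.execute(insert_sql, (1, 200))  # same user, different group: allowed
conn.execute(insert_sql, (2, 100))  # same group, different user: allowed

try:
    conn.execute(insert_sql, (1, 100))  # exact same membership again
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM UserGroupMembership").fetchone()[0]
print(duplicate_rejected, row_count)
```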

Transactional replication with no primary key (unique index)

I've just come across something disturbing. I was trying to implement transactional replication from a database whose design is not under our control. The replication is meant to support reporting without taxing the system too much. When I tried the replication, only some of the tables went across.
On investigation, the missing tables were not selected for replication because they don't have a primary key. I thought this could not be right: a primary key is shown if I connect through ODBC and MS Access, but not in Management Studio. Also, the queries are not ridiculously slow.
I tried inserting a duplicate record and it failed with an error about a unique index (not a primary key). It seems the tables have been implemented using a unique index as opposed to a primary key. Why, I do not know; I could scream.
Is there any way to perform transactional replication, or is there an alternative? It needs to be live (within the last minute or two). The main DB server is currently SQL 2000 SP3a and the reporting server is 2005.
The only thing I have thought of trying so far is setting the replication up as if the target were another type of database. I believe replication to, say, Oracle is possible. Would this force the use of, say, an ODBC driver, like the one I assume Access is using, and hence show a primary key? I don't know if that is accurate; I am out of my depth on this.
As MSDN states, it is not possible to create a transactional replication on tables without primary keys. You could use Merge replication (one way), that doesn't require a primary key, and it automatically creates a rowguid column if it doesn't exist:
Merge replication uses a globally unique identifier (GUID) column to identify each row during the merge replication process. If a published table does not have a uniqueidentifier column with the ROWGUIDCOL property and a unique index, replication adds one. Ensure that any SELECT and INSERT statements that reference published tables use column lists. If a table is no longer published and replication added the column, the column is removed; if the column already existed, it is not removed.
Unfortunately, you will have a performance penalty if using merge replication.
If you need to use replication for reporting only, and you don't need the data to be exactly the same as on the publisher, then you could also consider snapshot replication.

Advice Please: SQL Server Identity vs Unique Identifier keys when using Entity Framework

I'm in the process of designing a fairly complex system. One of our primary concerns is supporting SQL Server peer-to-peer replication. The idea is to support several geographically separated nodes.
A secondary concern has been using a modern ORM in the middle tier. Our first choice has always been Entity Framework, mainly because the developers like to work with it. (They love the LINQ support.)
So here's the problem:
With peer-to-peer replication in mind, I settled on using uniqueidentifier with a default value of newsequentialid() for the primary key of every table. This seemed to provide a good balance between avoiding key collisions and reducing index fragmentation.
However, it turns out that the current version of Entity Framework has a very strange limitation: if an entity's key column is a uniqueidentifier (GUID) then it cannot be configured to use the default value (newsequentialid()) provided by the database. The application layer must generate the GUID and populate the key value.
So here's the debate:
1) abandon Entity Framework and use another ORM:
   a) use NHibernate and give up LINQ support
   b) use linq2sql and give up future support (not to mention get bound to SQL Server on DB)
2) abandon GUIDs and go with another PK strategy
3) devise a method to generate sequential GUIDs (COMBs?) at the application layer
I'm leaning towards option 1 with linq2sql (my developers really like linq2[stuff]) and 3. That's mainly because I'm somewhat ignorant of alternate key strategies that support the replication scheme we're aiming for while also keeping things sane from a developer's perspective.
Any insight or opinion would be greatly appreciated.
I second Craig's suggestion - option 4.
You can always use the GUID column, populated by the middle-tier, as your PRIMARY KEY (that's a LOGICAL construct).
To avoid massive index (thus: table) fragmentation, use some other key (ideally an INT IDENTITY column) as the CLUSTERING KEY - that's a physical database construct, which CAN be separated from the primary key.
By default, the primary key is the clustering key, but it doesn't have to be that way. In fact, I improved performance and drastically lowered fragmentation by doing just that on a database I "inherited": add an INT IDENTITY column and put the clustering key on that small, ever-increasing, never-changing INT. Works like a charm!
Marc
Huh? I think your three options are a false choice. Consider option 4:
4) Use the Entity Framework with non-sequential, client-generated GUIDs.
The EF can't see DB-server-generated GUIDs for new rows inserted by the framework itself, sure, but you don't need to generate the GUIDs on the DB server. You can generate them on the client when you create your entity instances. The whole point of a GUID is it doesn't matter where you generate it. As for GUIDs generated by a replicated DB, the EF will see them just fine.
Your client-side GUIDs won't be sequential (use Guid.NewGuid()), but they will be world-wide, guaranteed unique.
We do this in shipping, production software with replication. It does work.
Another option (not available when this was posted) is to upgrade to EF 4, which supports server-generated GUIDs.
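For completeness, the question's third option (sequential GUIDs, or COMBs, generated at the application layer) could be sketched roughly like this. It is an illustration in Python rather than .NET, and comb_guid is a hypothetical helper, not a library function. It assumes the commonly described COMB layout, where a timestamp goes into the last six bytes because, as I understand it, SQL Server compares uniqueidentifier values starting with that final byte group.

```python
import time
import uuid


def comb_guid(timestamp_ms=None):
    """Return a random GUID whose last 6 bytes hold a millisecond timestamp."""
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)
    # Keep the first 10 random bytes, which include the version and variant bits.
    random_part = uuid.uuid4().bytes[:10]
    # 48-bit big-endian timestamp, so later values compare greater byte by byte.
    time_part = (timestamp_ms & 0xFFFFFFFFFFFF).to_bytes(6, "big")
    return uuid.UUID(bytes=random_part + time_part)


earlier = comb_guid(timestamp_ms=1_000)
later = comb_guid(timestamp_ms=2_000)
print(earlier)
print(later)
```

Two values generated within the same millisecond are not ordered relative to each other, so this reduces fragmentation rather than eliminating it.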
Why not use identity column? If you are doing merge replication you can have each system start at a separate seed and work in one direction (e.g. node a starts at 1 and adds 1, node b starts at 0 and subtracts one)...
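The two-node scheme from the example can be modelled in a few lines of Python. This is only a simulation of the numbering, with itertools counters standing in for identity columns; in SQL Server the seed and increment would be set per node through IDENTITY(seed, increment).

```python
from itertools import count, islice

# Node A counts up from 1; node B counts down from 0, as in the example above.
node_a_ids = count(start=1, step=1)
node_b_ids = count(start=0, step=-1)

a_sample = list(islice(node_a_ids, 5))
b_sample = list(islice(node_b_ids, 5))

print("node a:", a_sample)
print("node b:", b_sample)

# The two ranges never meet, so rows created on either node cannot collide.
overlap = set(a_sample) & set(b_sample)
print("overlap:", overlap)
```

This works neatly for exactly two nodes. With more nodes you would need some other way to partition the key space, such as distinct ranges per node.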
You can use stored procedures if you are really stuck on using NewSequentialID(). You can bind the result columns from the procedure to the appropriate property and once inserted the SQL-generated GUID will be fed back into the object.
Unfortunately you have to define SPs for all three operations (insert, update, delete) even though the other operations would complete properly using the defaults. You also need to maintain the SP code and ensure it is synchronized with your EF model as you make changes, which may make this option unattractive on account of the additional overhead.
There is a step-by-step example at http://blogs.msdn.com/bags/archive/2009/03/12/entity-framework-modeling-action-stored-procedures.aspx which is pretty straight-forward.
Use newsequentialid() with your own ORM (it's not that hard) with LINQ.
