What's the best practice for handling primary keys using an ORM over Oracle or SQL Server?
Oracle - Should I use a sequence and a trigger, or let the ORM handle this? Or is there some other way?
SQL Server - Should I use an identity column, or something else?
If you are using any kind of ORM, I would suggest letting it handle primary key generation, in both SQL Server and Oracle.
With either database, I would use a client-generated Guid for the primary key (which would map to uniqueidentifier in SQL Server, or RAW(16) in Oracle, since a Guid is 16 bytes). Despite the performance penalty on JOINs when using a Guid foreign key, I tend to work with disconnected clients and replicated databases, so being able to generate unique IDs on the client is a must. Guid IDs also simplify life considerably when working with an ORM, because the key is known before the row is saved.
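A minimal sketch of client-side key generation, in Python purely for illustration. The function names and the order/order-line records are hypothetical. Byte-order conventions for storing a GUID differ between drivers and databases, so the 16-byte form here is an assumption to check against your own stack.

```python
import uuid

def new_primary_key() -> uuid.UUID:
    """Generate a random (version 4) GUID on the client, with no database round trip."""
    return uuid.uuid4()

def to_raw16(key: uuid.UUID) -> bytes:
    """Return the 16-byte form of the key, e.g. for binding to a 16-byte binary column."""
    return key.bytes

# The parent key is known before anything is saved, so a child row can
# reference it immediately, even while the client is disconnected.
order_id = new_primary_key()
order = {"id": order_id, "customer": "ACME"}
order_line = {"id": new_primary_key(), "order_id": order_id, "qty": 3}
```

Both records can be queued locally and sent to the server later without any key fix-up.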
It is a good idea to remember that databases tend to have a life independent from a front end application. Records can be inserted by batch processes, web services, data exchange with other databases, heck, even different applications sharing the same database.
Consequently, it is useful if a database table is in charge of its own identity, or at least has that capability. For instance, in Oracle a BEFORE INSERT trigger can check whether a value has been provided for the primary key and, if not, generate its own.
Both Oracle and SQL Server can generate GUIDs, so that is not a sufficient reason for delegating identity generation to the client.
Sometimes, there is a natural, unique identifier for a table. For instance, each row in a User table can be uniquely identified by the UserName column. In that case, it may be best to use UserName as the primary key.
Also, consider tables used to form a many-to-many relationship. A UserGroupMembership table will contain UserId and GroupId columns, which together should form the primary key, as the combination uniquely identifies the fact that a particular user is a member of a particular group.
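To illustrate the composite key, here is a small sketch that uses Python's built-in sqlite3 module as a stand-in database so that it runs anywhere. The DDL is SQLite syntax, not Oracle or SQL Server syntax, and the add_membership helper is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE UserGroupMembership (
        UserId  INTEGER NOT NULL,
        GroupId INTEGER NOT NULL,
        PRIMARY KEY (UserId, GroupId)  -- the pair is the key; no surrogate ID column
    )
    """
)

def add_membership(user_id: int, group_id: int) -> bool:
    """Insert a membership row; return False if that exact pair already exists."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO UserGroupMembership (UserId, GroupId) VALUES (?, ?)",
                (user_id, group_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False

results = [
    add_membership(1, 10),  # new pair
    add_membership(1, 20),  # same user, different group
    add_membership(1, 10),  # duplicate pair, rejected by the primary key
]
```

The database itself refuses the duplicate membership, with no extra unique index or application check.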
When our IT department converts Access databases to SQL Server the relationships do not transfer over. In the past, I have provided ERDs that they can use to build the relationships. In this case, I didn't.
What are the possible consequences of defining the table relationships in the MS Access Front End versus on the SQL Server itself?
It would be ideal if I could just create the relationships in Access and avoid submitting a request to IT, but I don't want to risk performance issues now or in the future.
There may be some misconceptions.
A relationship in SQL Server enforces referential integrity (an order cannot have a customer ID that doesn't exist). It does not automatically create an index on the Foreign Key, so it has per se no impact on performance.
But in most cases it is a good idea to define an index on a foreign key, to improve performance.
A relationship that you define in Access on linked tables does neither. It cannot enforce referential integrity (that's the server's job).
It is merely a "hint" that the tables are related via the specified fields, e.g., so that the Query Builder can automatically join the tables if they are added to the query design.
So you should:
Create the relationships in SQL Server to avoid inconsistent data. ("But my application logic prevents that!", I hear you say. Well, applications have bugs.)
Create indexes on foreign keys where appropriate to avoid performance problems.
If you are working with queries in the Access frontend, additionally define the relationships there.
Ideally you should have a test server where you can define the relationships yourself, and just send the finished SQL script to IT.
I'm creating a DB using SQL Server 2008.
This DB will be used in two countries, and the two copies will be synchronized at some point every day. I'll use the Replication service to accomplish that.
Most of the tables use an Int column with Identity increment. But the tables will be empty when deployed, so both countries will have rows with identity 1, 2, and so on. I've never used replication before, so I want to know whether there will be an error when the tables are synchronized.
Should I use a GUID data type instead?
Replicate Identity Columns (MSDN):
Replication offers three identity range management options:
Automatic. Used for merge replication and transactional replication with updates at the Subscriber...
Manual. Used for snapshot and transactional replication without updates at the Subscriber...
None. This option is recommended only for backwards compatibility...
So, yes, you can continue to use IDENTITY, provided you read through the information on replication and choose an option that makes sense for you.
Under Automatic, each server grabs a range of usable identity values and hands the individual values out as needed. Provided synchronization occurs often enough that the ranges aren't completely exhausted, you'll never notice this detail.
And this allows you to scale out later as needed - as opposed to e.g. a MOD scheme where one server hands out odd values and the other even - you can't easily add a third server to such a scheme.
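The range idea can be sketched as a toy model in Python. This is not how SQL Server implements identity range management. The RangeAllocator and Server classes and the range size of 1000 are assumptions, made up only to show why disjoint ranges avoid collisions and why adding a server is cheap.

```python
from dataclasses import dataclass, field

@dataclass
class RangeAllocator:
    """Central bookkeeping: hands out non-overlapping blocks of identity values."""
    range_size: int
    next_start: int = 1

    def next_range(self) -> range:
        block = range(self.next_start, self.next_start + self.range_size)
        self.next_start += self.range_size
        return block

@dataclass
class Server:
    name: str
    allocator: RangeAllocator
    _ids: list = field(default_factory=list)  # unused values from the current block

    def new_id(self) -> int:
        if not self._ids:  # block exhausted (or never fetched): ask for a new one
            self._ids = list(self.allocator.next_range())
        return self._ids.pop(0)

allocator = RangeAllocator(range_size=1000)
country_a = Server("A", allocator)
country_b = Server("B", allocator)

ids_a = [country_a.new_id() for _ in range(3)]   # [1, 2, 3]
ids_b = [country_b.new_id() for _ in range(3)]   # [1001, 1002, 1003]

# A third server simply takes the next free block; nothing is renumbered.
country_c = Server("C", allocator)
ids_c = [country_c.new_id() for _ in range(2)]   # [2001, 2002]
```

With an odd/even scheme, by contrast, adding a third server would mean changing the step on every existing server.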
By your description, it sounds like you want to implement so-called Merge replication.
In SQL Server you would not need to change the identity column to a GUID. However, if you don't, SQL Server will automatically add another column called rowguid to each table, and you may end up with duplicate values in your original identity column. To circumvent this, you could have the servers assign mod 2 IDs (one server odd values, the other even).
In my opinion it makes most sense to use a GUID for the IDs altogether. Don't forget to set the ROWGUIDCOL property on your ID columns. Good luck.
Relevant MSDN:
http://technet.microsoft.com/en-us/library/ms152746.aspx
Consider adding a deviceID field to all tables users can update. With each device making changes using its own ID as part of the PK, there cannot be conflicts across devices.
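A small Python sketch of why a device-qualified key avoids conflicts. The merge function and the sample rows are hypothetical, and dictionaries stand in for tables.

```python
def merge(*device_tables: dict) -> dict:
    """Union rows from several devices; raise if two rows share a key."""
    merged = {}
    for table in device_tables:
        for key, row in table.items():
            if key in merged:
                raise ValueError(f"key conflict on {key}")
            merged[key] = row
    return merged

# Each device numbers its own rows from 1, as two freshly deployed databases would.
# The key is the pair (device_id, local_id), not local_id alone.
device_1 = {(1, 1): "order from site 1", (1, 2): "second order from site 1"}
device_2 = {(2, 1): "order from site 2", (2, 2): "second order from site 2"}

combined = merge(device_1, device_2)
```

If the same rows were keyed by the local number alone, merge would raise on the first row of the second device.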
I have two separate SQL Server 2005 databases (on the same server):
security database
main application database
The security database has a user table with everything needed to authenticate.
The application database has a person table with extended user details. There is a 1-1 mapping between the security database user table and the application database person table.
I want to enforce a mapping between the user and the person table. I'm assuming that foreign keys can't be mapped across databases, so I am wondering what to do to enforce the integrity of the relationship.
Cross-database foreign keys are indeed not supported:
Msg 1763, Level 16, State 0, Line 2
Cross-database foreign key references are not supported.
If you really want to enforce referential integrity on the database side, you will have to rely on triggers (which I don't recommend).
To make your code more maintainable, you could create synonyms for the tables you want to check referential integrity on.
CREATE SYNONYM myTable FOR otherdatabase.dbo.myTable;
This only makes the "manual" checks easier, as you cannot create foreign keys on a synonym.
It's a lot of work but you may think about merging those two databases into a single database. If you want a logical difference between objects within the database, you can use a schema.
I've just come across something disturbing. I was trying to implement transactional replication from a database whose design is not under our control. The replication was meant to allow reporting without taxing the system too much. When I tried it, only some of the tables went across.
On investigation, the tables were not selected for replication because they don't have a primary key. I thought this could not be right: a primary key is even shown if I look at the tables through ODBC and MS Access, but not in Management Studio. Also, the queries are not ridiculously slow.
I tried inserting a duplicate record and it failed with an error about a unique index (not a primary key). It seems the tables have been implemented using a unique index as opposed to a primary key. Why, I do not know; I could scream.
Is there any way to perform transactional replication, or an alternative? It needs to be live (no more than a minute or two behind). The main DB server is currently SQL Server 2000 SP3a and the reporting server is 2005.
The only thing I have thought of trying so far is setting the replication up as if the subscriber were another type of database. I believe replication to, say, Oracle is possible. Would this force the use of an ODBC driver, as I assume Access is using, and hence show a primary key? I don't know if that is accurate; I am out of my depth on this.
As MSDN states, it is not possible to create a transactional replication on tables without primary keys. You could use Merge replication (one way), that doesn't require a primary key, and it automatically creates a rowguid column if it doesn't exist:
Merge replication uses a globally unique identifier (GUID) column to identify each row during the merge replication process. If a published table does not have a uniqueidentifier column with the ROWGUIDCOL property and a unique index, replication adds one. Ensure that any SELECT and INSERT statements that reference published tables use column lists. If a table is no longer published and replication added the column, the column is removed; if the column already existed, it is not removed.
Unfortunately, you will have a performance penalty if using merge replication.
If you need replication for reporting only, and you don't need the data to be exactly the same as on the publisher, then you could also consider snapshot replication.
Our application architecture allows us to host multiple clients in a single database, and also host multiple databases. This allows us to scale out by distributing clients across multiple databases. For example, 20 clients can be in database A, and another 15 could be in database B. We use a ClientID field in almost every table to partition client data. All our tables' primary keys are INT identity TableID fields.
I'm looking for a tool/script that would help me extract client data from one database, and move it to a brand new database (so the PKs can stay the same). I'm hoping this exists already so we don't have to build our own. Pretty flexible in how this could work, but ideally it just generates a large .sql file with all the necessary INSERTS in the right order to move the data, and another sql file with all the necessary DELETES to erase the data from the source.
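The "right order" requirement amounts to a topological sort of the tables by their foreign keys. Below is a minimal Python sketch, not a finished tool. The four-table schema is hypothetical, and it assumes every listed table carries a ClientID column, which the question says is true of almost every table.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical schema: each table maps to the tables its foreign keys reference.
depends_on = {
    "Client": set(),
    "Project": {"Client"},
    "Invoice": {"Client", "Project"},
    "InvoiceLine": {"Invoice"},
}

# static_order() yields every table after all of the tables it depends on.
insert_order = list(TopologicalSorter(depends_on).static_order())
delete_order = list(reversed(insert_order))

def extraction_script(client_id: int) -> list[str]:
    """One SELECT per table, in an order that is safe to replay as INSERTs."""
    return [f"SELECT * FROM {t} WHERE ClientID = {int(client_id)};" for t in insert_order]

def cleanup_script(client_id: int) -> list[str]:
    """DELETEs for the source database, children before parents."""
    return [f"DELETE FROM {t} WHERE ClientID = {int(client_id)};" for t in delete_order]
```

In practice the dependency map would be read from the database's foreign key metadata rather than typed by hand. To keep the original key values, the target tables would need to accept explicit identity values (in SQL Server, SET IDENTITY_INSERT ... ON for each table).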
If it makes any difference we are on SQL Server 2008.
If you have Standard or Enterprise edition, you do have SSIS. Although it may not qualify as a "tool", it is fairly easy to implement in this scenario.
I can recommend Redgate SQL Data Compare for this. We use it for syncing data, and use their SQL Compare to sync the database schema.
Both tools can either output SQL that you execute yourself, or execute the SQL scripts themselves.
They have command-line versions of the tools too, so you could use them in a deployment script, though I haven't tried this.
They both work really well, and are no doubt worth the price.
Not the answer you may be looking for, but you should consider using a GUID as a key. This will ensure that you have some type of unique identifier for all your records and that you can avoid collisions with identity keys / integer-based indexes. It would add another degree of traceability should something go wrong when you migrate between databases.
SplendidCRM uses this technique when importing data from other DB systems.
Update:
My assumption was that transferring data between databases was not that frequent, and that you needed database architecture for that task. I would use the GUID as a lookup key specifically for validating the transfer of data, but I would NOT use it as a primary key for joins or in standard operations such as URLs. Although GUIDs are unique across databases, the trade-off is that they are slower to join on than integers.
In other words, the GUIDs would be in addition to your existing primary keys, and would act as a means of validation should something go wrong. If you need ClientID in Database A to retain the same value in Database B, then an identity column as that identifier will be an issue. You may have to create another identifier that is not "auto-generated". This could be something other than a GUID, but my instinct is that integers alone will not be enough. Maybe you can create a column that is a hash of the identity key, customer name and database name, or more simply, just concatenate those columns into a varchar column.
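A sketch of the two suggested identifiers, in Python for illustration. The function names, the "|" separator, the choice of SHA-256 and the sample values are all assumptions. If a customer or database name could itself contain the separator, a different encoding would be needed.

```python
import hashlib

def composite_key(database_name: str, customer_name: str, identity_value: int) -> str:
    """Readable identifier: the three parts joined with a separator."""
    return f"{database_name}|{customer_name}|{identity_value}"

def hashed_key(database_name: str, customer_name: str, identity_value: int) -> str:
    """Fixed-length alternative: SHA-256 hex digest of the concatenated parts."""
    raw = composite_key(database_name, customer_name, identity_value)
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()

# The same identity value in two databases yields two different identifiers.
key_a = hashed_key("DatabaseA", "Contoso", 42)
key_b = hashed_key("DatabaseB", "Contoso", 42)
```

The concatenated form is easier to debug because it can be read directly, while the hashed form has a fixed length of 64 hex characters.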