Generating unique identity column value across different SQL Server instances - sql-server

I have a WinForms application (C#) with SQL Server 2008 as the backend, which works offline at the client's location. Currently all branches are managed from a single location, so an identity column value is always unique across branches.
Now I want this application to be managed from multiple locations and to work offline as well as online. To make this possible we are putting a SQL Server on a public IP, and each branch will have a separate local SQL Server instance running. The data will be synced between the local and central servers at regular intervals.
Is this a good approach to syncing data, or is there something better I can do?
If I go with the above approach, the problem I will face is in syncing the data. There is a problem with the table structure; e.g., I have a table COURSES as follows:
COURSES ( COURSE_ID (identity column), COURSE_NAME, COURSE_BRANCH_ID)
where
COURSE_ID is an IDENTITY column
COURSE_NAME represents the name of the course
COURSE_BRANCH_ID represents the branch, where the course is taken.
Now each SQL Server will generate its own values for the COURSE_ID column, and those values might be the same on different servers.
The unique combination is COURSE_ID plus COURSE_BRANCH_ID.
Is there any way I can append COURSE_BRANCH_ID to COURSE_ID without adding a new IDENTITY column?
Approach I thought of
Remove the identity from the COURSE_ID column.
Add a new column, say ID, which will be an identity column.
Then write an AFTER INSERT trigger on the COURSES table which updates COURSE_ID to the concatenation of COURSE_BRANCH_ID and ID (see the sketch below):
CONVERT(INT, CONVERT(VARCHAR, COURSE_BRANCH_ID) + CONVERT(VARCHAR, ID))
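A minimal sketch of such a trigger, assuming COURSE_ID is an INT and the table keeps the surrogate identity column ID described above:

CREATE TRIGGER trg_Courses_SetCourseId
ON COURSES
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;
    -- Build COURSE_ID from the branch ID and the new surrogate identity value
    UPDATE c
    SET c.COURSE_ID = CONVERT(INT,
            CONVERT(VARCHAR(10), c.COURSE_BRANCH_ID)
          + CONVERT(VARCHAR(10), c.ID))
    FROM COURSES AS c
    INNER JOIN inserted AS i ON i.ID = c.ID;
END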
But this will require a lot of effort, as I have around 19 tables with this problem. Is there anything better we can do? Any suggestions are welcome. Thank you!

There are several approaches to this issue. One is the approach you mentioned, where you concatenate the branch ID with the identity value.
You can use a GUID; the probability of collision is practically zero.
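A minimal sketch of the GUID option, reusing the COURSES table from the question (column types are assumptions):

CREATE TABLE COURSES
(
    COURSE_ID        UNIQUEIDENTIFIER NOT NULL
                     CONSTRAINT DF_Courses_Id DEFAULT NEWSEQUENTIALID()
                     PRIMARY KEY,
    COURSE_NAME      NVARCHAR(100) NOT NULL,
    COURSE_BRANCH_ID INT NOT NULL
);
-- NEWSEQUENTIALID() keeps new keys roughly ordered, which fragments the
-- clustered index less than NEWID() would.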
Or you can set the identity seed and increment such that each branch has a different start value, and all are incremented by the number of branches.
For example, if you have four branches, then on Branch1 you may set the
ID INT IDENTITY(1, 4) NOT NULL -- IDs will be 1, 5, 9...etc.
On Branch2
ID INT IDENTITY(2, 4) NOT NULL -- IDs will be 2, 6, 10 ...etc
On Branch3
ID INT IDENTITY(3, 4) NOT NULL -- IDs will be 3, 7, 11 ...etc
And on Branch4
ID INT IDENTITY(4, 4) NOT NULL -- IDs will be 4, 8, 12 ...etc.

Related

SQL Server Employee table with existing ID number

I am attempting to create an Employee table in SQL Server 2016 and I want to use EmpID as the primary key and identity. Here is what I believe to be true, and my question: when I create the Employee table with EmpID as the primary key and an IDENTITY(100, 1) column, each time I add a new employee, SQL Server will auto-create the EmpID starting with 100 and increment by 1 with each new employee. What happens if I want to import a list of existing employees from another company and those employees already have an existing EmpID? I haven't been able to figure out how I would import those employees with their existing EmpIDs. If there is a way to import the employee list with the existing EmpIDs, will SQL Server check to make sure the EmpIDs from the new list do not exist for a current employee? Or is there some code I need to write in order to make that happen?
Thanks!
You are right about primary keys, but before importing employees from another company and merging them into your employee list, you have to ask these things:
Why? Sure, there are ways to solve this problem, but why would you merge another company's employees into your own employee table?
The other company's ID structure: most of the time companies have different ID structures; some use 4 characters, others only numbers, and so on. You have to know how the companies' ID structures differ.
If the merge can't be avoided, then you have to raise the concern with the higher-ups and tell them that the merging company's employees must be given new employee IDs. With this in mind, simply appending the new data to your database is the solution.
This is an extremely common data warehousing issue, where a table has data sourced from multiple places. It also comes up in migrations, acquisitions, etc.
There is no way to keep the existing IDs as a primary key if there are multiple people with the same ID.
In the data warehouse world we would always create a new surrogate key, which is the primary key to the table, and include the original key and a source system identifier as two attributes.
In your scenario you will probably keep the existing keys for the original company, and create new IDs for the new employees, and save the oldID in an additional column for historical use.
Either of these choices also means that as you migrate other associated data, such as leave information imported from the old system, you can translate it to the new key by looking up OldID in the employee table and finding the associated NewID to associate your leave records with in the new system.
At the end of the day there is no alternative to this, as you simply can't have two employees with the same primary key.
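A sketch of the surrogate-key pattern described above (table and column names are illustrative):

CREATE TABLE Employee
(
    EmpID        INT IDENTITY(100, 1) NOT NULL PRIMARY KEY, -- new surrogate key
    OldID        INT NULL,            -- original key from the source system
    SourceSystem VARCHAR(20) NULL,    -- which company/system the row came from
    EmpName      NVARCHAR(100) NOT NULL
);

-- Translating associated data (e.g. leave records) to the new key:
SELECT e.EmpID
FROM Employee AS e
WHERE e.OldID = @oldId AND e.SourceSystem = 'CompanyB';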
I have never seen any company that migrates employees from another company and keeps their existing employee IDs. Usually they give them a new ID and keep the old one in the employee file for reference, but they never use the old one as an active ID.
Large companies usually use a series of special identifiers, already defined in the system, to distinguish employees based on field, specialty, etc.
Most companies don't do the same as the large ones; instead they stick with one identifier and use dimensions alongside it. These dimensions specify areas of work for employees, projects, vendors, etc. They are used globally in the system and affect the company's financial reports (which is the main point of using them).
So what you need to do is look at the company's ID sequence requirements and play your part based on that, as depending on IDENTITY alone won't be enough for most companies. If you see that you can depend on identity alone, then use it; if not, see whether you can use dimensions alongside the identity (you could create five dimensions - Company, Project, Department, Area, Cost Center - which will be enough for any company).
If you use identity alone and want to migrate, then in your insert statement do:
SET IDENTITY_INSERT tableName ON
INSERT INTO tableName (columns)
...
This will allow you to insert values into the identity column explicitly. However, doing so might require you to reseed the identity to a new value afterwards to avoid issues; read up on DBCC CHECKIDENT.
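A fuller sketch of the import, with a hypothetical table and values:

SET IDENTITY_INSERT Employee ON;

-- The identity column must be listed explicitly while IDENTITY_INSERT is ON
INSERT INTO Employee (EmpID, EmpName)
VALUES (205, N'Imported Employee');

SET IDENTITY_INSERT Employee OFF;

-- Reseed so future inserts continue after the highest imported value
DBCC CHECKIDENT ('Employee', RESEED, 205);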
If you end up using dimensions, you could make the dimension and the ID a composite primary key, which ensures that the combination of the two is unique in the table (treated as one set).

SQL Server Alternative to reseeding identity column

I am currently working on a phone directory application. For this application I get a flat file (CSV) from corporate SAP, updated daily, which I use to update a SQL database twice a day using a Windows service. Additionally, users can add themselves to the database if they do not exist (i.e., they are not included in the SAP file). Thus a contact can be one of two types: 'SAP' or 'ECOM'.
So the Windows service downloads the file from an SAP FTP server, deletes all existing contacts of type 'SAP' from the database, and then adds all the contacts in the file to the database. To insert the contacts (some 30k), I load them into a DataTable and then use SqlBulkCopy. This works particularly well, running in only a few seconds.
The only problem is that the primary key for this table is an auto-incremented identity. This means that my contact IDs grow at a rate of 60k per day. I'm still in development and my IDs are already in the area of 20 million:
http://localhost/CityPhone/Contact/Details/21026374
I started looking into reseeding the id column, but if I were to reseed the identity to the current highest number in the database, the following scenario would pose issues:
Windows Service Loads 30 000 contacts
User creates entry for himself (id = 30 001)
Windows Service deletes all SAP contacts, reseeds column to after current highest id: 30 002
Also, I frequently query for users based on this ID, so I'm concerned that using something like a GUID instead of an auto-incremented integer would have too high a price in performance. I also tried looking into SqlBulkCopyOptions.KeepIdentity, but this won't work: I don't get any IDs from SAP in the file, and if I did, they could easily conflict with the IDs of manually entered contacts. Is there any solution other than reseeding the column that would stop the ID values growing at such a rate?
I suggest the following workflow.
Import into a brand-new staging table, such as tempSAPImport, using your current workflow.
Then add only the changed rows to your real table:
INSERT INTO ContactDetails (Detail1, Detail2)
SELECT Detail1, Detail2
FROM tempSAPImport
EXCEPT
SELECT Detail1, Detail2
FROM ContactDetails
If your SAP data has a primary key, you can use it to detect rows that were merely updated:
UPDATE ContactDetails ... (XXX: your update criteria)
This way you import your data fast and keep your existing identity values. Depending on your speed requirements, adding indexes after the import will also speed up the process. A MERGE-based sketch of the whole insert/update step follows.
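If the SAP feed carries a stable natural key, the insert and update steps above can also be expressed as a single MERGE. A sketch, assuming a hypothetical SapKey natural key and the Detail1/Detail2 attributes:

MERGE ContactDetails AS target
USING tempSAPImport AS source
    ON target.SapKey = source.SapKey
WHEN MATCHED AND (target.Detail1 <> source.Detail1
               OR target.Detail2 <> source.Detail2) THEN
    UPDATE SET target.Detail1 = source.Detail1,
               target.Detail2 = source.Detail2
WHEN NOT MATCHED BY TARGET THEN
    INSERT (SapKey, Detail1, Detail2)
    VALUES (source.SapKey, source.Detail1, source.Detail2);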
If your SQL Server version is 2012 or later, then I think the best solution for the scenario above would be to use a sequence for the PK values. This way you have control over the seeding process (you can cycle values).
More details here: http://msdn.microsoft.com/en-us/library/ff878091(v=sql.110).aspx
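A minimal sketch of the sequence approach (names and sizes are illustrative):

-- MINVALUE 1 plus CYCLE makes the sequence restart at 1 instead of overflowing
CREATE SEQUENCE dbo.ContactIdSeq
    AS INT
    START WITH 1
    INCREMENT BY 1
    MINVALUE 1
    CYCLE;

CREATE TABLE ContactDetails
(
    ContactId INT NOT NULL
        CONSTRAINT DF_ContactDetails_Id DEFAULT (NEXT VALUE FOR dbo.ContactIdSeq)
        PRIMARY KEY,
    Detail1   NVARCHAR(100) NULL,
    Detail2   NVARCHAR(100) NULL
);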

How to generate STRICTLY increasing number with no gap in Azure Sql Database

I have an MVC 5 application that accesses a SQL database through Entity Framework. I have an invoice model that looks like:
class Invoice
{
    public int Id { get; set; }
    public int InvoiceNumber { get; set; }
    // other fields ....
}
InvoiceNumber has a lot of legal constraints. It needs to be unique and incremental with no gaps. I initially thought that my Id would do the job, until I found that for some reasons I may have gaps between the IDs (1, 2, 3, ..., 7, then 1000, ...); for instance, see Windows Azure SQL Database - Identity Auto increment column skips values.
Take into account that many clients could be connected at the same time, so these inserts may be concurrent.
Does anyone know how to force EF or the database to generate this kind of invoice number?
I think the only way you're going to get this exact functionality is to drop the identity, add a unique index on the column, and then design a very lightweight process that issues a call to the table to insert the next value, such as:
INSERT INTO Invoice (InvoiceNumber)
OUTPUT inserted.InvoiceNumber
SELECT ISNULL(MAX(InvoiceNumber), 0) + 1
FROM Invoice
You would then write back an update containing the rest of the row, or alternatively set this up as a one-and-done query where you write the value back in a single transaction, such as by passing all intended inserts into a procedure that writes the value out.
I don't know how to do this in Entity Framework specifically, as I don't do much application development, but it is possible to accomplish; it just takes longer to do manually rather than trusting a table setting.
I am assuming here, of course, that you have already investigated the NOCACHE solution presented in the other question and found some sort of issue with it?
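To make the MAX-plus-one pattern safe under the concurrent inserts the question mentions, the read and the insert have to happen atomically. A sketch using locking hints (assuming the Invoice table above):

BEGIN TRANSACTION;

INSERT INTO Invoice (InvoiceNumber)
OUTPUT inserted.InvoiceNumber
SELECT ISNULL(MAX(InvoiceNumber), 0) + 1
FROM Invoice WITH (UPDLOCK, HOLDLOCK); -- serializes competing inserts

COMMIT TRANSACTION;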
Old question, but you could generate a table of all invoice numbers beforehand and mark them as used, then query for the minimum unmarked value.
The table doesn't have to contain every possible number either; just stay, say, 1,000 numbers ahead, or whatever is comfortable.
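A sketch of the pre-generated pool (names are illustrative):

CREATE TABLE InvoiceNumberPool
(
    InvoiceNumber INT NOT NULL PRIMARY KEY,
    Used          BIT NOT NULL DEFAULT 0
);

-- Claim the lowest unused number atomically
WITH nextNumber AS
(
    SELECT TOP (1) InvoiceNumber, Used
    FROM InvoiceNumberPool WITH (UPDLOCK, HOLDLOCK)
    WHERE Used = 0
    ORDER BY InvoiceNumber
)
UPDATE nextNumber
SET Used = 1
OUTPUT inserted.InvoiceNumber;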

Database Partitioning Newbie - Rudimentary Web Indexer running slow

This is a rudimentary web indexer.
I have 2 database tables:
domainList:
PK domainName
UI domainNumber
status ... start, indexing, completed.
and
domainPages
PK pageNumber
FK domainNumber
pageHTML
pageTitle
I have several "indexer" servers that load each website's HTML and store it in the database.
As the database gets bigger, it is now slowing down considerably.
INSERT INTO domainPages (domainNumber, domainPageHTML, domainPageTitle) VALUES ('" & domainNumber & "', N'" & domainPageHTML & "', N'" & domainPageTitle & "')
This is taking a long time as there are many rows. Reading from the table is taking a long time too.
I could create a new table for each set of domainPages, but I'd rather try something new: I'm looking at database partitioning to help.
All the examples on the net about partitioning use a date field, whereas here I need to partition on the domainNumber in the domainPages table (which is a logical foreign key to domainList, as I believe an actual relationship will fail with partitioning).
So I think I'm looking at a partition per unique domain? If that is correct, how would I do this? Are there any examples online that don't involve a date field, but a logical foreign key on a table?
I had no answers to this question, so I had to use separate tables for each domain, which means it takes a while to browse all the tables!
I did spot this, however: http://blog.sqlauthority.com/2008/01/25/sql-server-2005-database-table-partitioning-tutorial-how-to-horizontal-partition-database-table/ - which is what I would have needed. It is here for anyone else looking in future.
I've not tried it, but I assume that for higher IDs you would run another split, e.g. once IDs reach 500, as in this example:
-- Determine where values live before the new partition
SELECT $PARTITION.Left_Partition (501) -- should return a value of 4

-- Create the new partition
ALTER PARTITION FUNCTION Left_Partition ()
SPLIT RANGE (500);

-- Determine where values live after the new partition
SELECT $PARTITION.Left_Partition (501) -- should return a value of 5
This follows the article below; see the section that begins "Consider the table created in Figure 2": you can add a new partition to this table to contain values greater than 500, like this.
http://technet.microsoft.com/en-gb/magazine/2007.03.partitioning.aspx
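For the domainNumber case specifically, a sketch of a non-date partitioning setup (boundary values and the single-filegroup mapping are illustrative):

CREATE PARTITION FUNCTION pfDomain (INT)
AS RANGE LEFT FOR VALUES (100, 200, 300, 400, 500);

CREATE PARTITION SCHEME psDomain
AS PARTITION pfDomain ALL TO ([PRIMARY]);

-- The partitioning column must be part of the clustered key
CREATE TABLE domainPages
(
    pageNumber   INT NOT NULL,
    domainNumber INT NOT NULL,
    pageHTML     NVARCHAR(MAX) NULL,
    pageTitle    NVARCHAR(200) NULL,
    CONSTRAINT PK_domainPages PRIMARY KEY (pageNumber, domainNumber)
) ON psDomain (domainNumber);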

Change tracking -- simplest scenario

I am coding in ASP.NET C# 4. The database is SQL Server 2012.
I have a table that has 2000 rows and 10 columns. I want to load this table in memory and if the table is updated/inserted in any way, I want to refresh the in-memory copy from the DB.
I looked into SQL Server Change Tracking, and while it does what I need, it appears I would have to write quite a bit of code to select from the change functions - more coding than I want to do for the simple scenario I have.
What is the best (simplest) solution for this problem? Do I go with CacheDependency?
I currently have a similar problem: I'm implementing a REST service that returns a table with 50+ columns, and I want to cache the data on the client to reduce traffic.
I'm thinking about this implementation:
All my tables have these fields:
ID - auto-increment (primary key)
Version - ROWVERSION (a value that changes every time the record is updated)
To calculate a "fingerprint" of the table I use the following select (ROWVERSION is binary, so it has to be cast before it can be summed):
SELECT COUNT(*), MAX(ID), SUM(CAST(Version AS BIGINT)) FROM ...
Deleting records changes the first value, inserting changes the second, and updating changes the third.
So if any of the three values changes, I have to reload the table.
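A sketch of the whole approach, with an illustrative table:

CREATE TABLE MyTable
(
    ID      INT IDENTITY(1, 1) PRIMARY KEY,
    Version ROWVERSION,
    Payload NVARCHAR(100) NULL
);

-- Recompute this periodically; reload the in-memory copy whenever it differs
-- from the previously cached fingerprint
SELECT COUNT(*)                     AS RowCnt,
       MAX(ID)                      AS MaxId,
       SUM(CAST(Version AS BIGINT)) AS VersionSum
FROM MyTable;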
