The scenario:
I have a local DB and a remote public DB. They are kept in sync using SQLyog SJA job files, which make both DBs identical. It works well.
The part of the DB with the issue is a user comments table.
The local DB contains thousands of user comments, and more are always being added through various means. These are all synced to the remote DB comments table.
The remote comments table also accepts comments entered directly by users. These then sync to the local comments table.
It is a two-way sync, where neither side deletes from the other. This actually seems to work well and is automated through SJA.
The problem:
The primary key IDs for the comments are auto-incremented on both sides.
So if both tables are in exact sync and the key count is at 50, a user who makes a remote entry gets key 51. Meanwhile the local DB is also growing, and a different entry is made there under key 51. When the next sync runs, the two keys conflict.
Possible solutions:
I thought a good idea would be to add a large number to the remote comments' PK IDs as they are added. That way, when a sync is called, the primary keys will not conflict, as the local PK IDs would never get that high.
It worked well on the first sync. But the problem is that auto increment continues from the highest value in the table, even if there is a large gap between the keys. Once the high remote IDs have been synced into the local table, new local rows are numbered from that range too. So this solution does not work.
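To make the failure concrete, here is a minimal sketch of what happens after the first sync. It uses an in-memory SQLite table as a stand-in for the local MySQL comments table (an assumption made only to keep the example self-contained; SQLite's `INTEGER PRIMARY KEY` also hands out max id + 1). The table layout and the offset value are hypothetical.

```python
import sqlite3

# Stand-in for the local comments table. The real setup is MySQL synced by
# SQLyog SJA; SQLite is used here only so the sketch runs on its own.
local = sqlite3.connect(":memory:")
local.execute("CREATE TABLE comments (id INTEGER PRIMARY KEY, body TEXT)")

# Local comments get ids 1, 2, 3 from the automatic counter.
for body in ("local a", "local b", "local c"):
    local.execute("INSERT INTO comments (body) VALUES (?)", (body,))

# A sync brings in a remote comment whose id was pushed up by a large offset.
REMOTE_OFFSET = 1_000_000  # hypothetical offset
local.execute(
    "INSERT INTO comments (id, body) VALUES (?, ?)",
    (REMOTE_OFFSET + 1, "remote a"),
)

# The next local insert continues from the highest id in the table, not from 4.
cur = local.execute("INSERT INTO comments (body) VALUES (?)", ("local d",))
next_local_id = cur.lastrowid
print(next_local_id)
```

After the sync, the new local row lands right behind the remote range, so the two sides are back to handing out the same ids.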
I would like to maintain a single table for user comments and have a seamless sync, but the conflicting primary keys get in the way.
I am interested if other people have some thoughts on the matter.
I hope I described the problem clearly.
Thanks.
------ EDIT -----------
I have found a solution that works.
I changed the primary key ID to just a normal INT with auto increment. I then created a second ID INT field which holds a random integer about 10 digits long. I now use the two ID fields together to form the PK. Now the chance of a conflict between the two DBs is essentially nonexistent. The auto-incremented ID and the long random ID would both have to be the same on the same entry, which is highly unlikely with the volume I'm dealing with.
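A rough sketch of that composite key, again using in-memory SQLite in place of the two MySQL databases. The column name `rand_id` and the helper functions are hypothetical, and the auto increment is emulated with `MAX(id) + 1` because SQLite cannot auto-increment one column of a composite key.

```python
import random
import sqlite3

def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE comments ("
        " id INTEGER NOT NULL,"
        " rand_id INTEGER NOT NULL,"   # in MySQL a 10-digit value needs BIGINT,
        " body TEXT NOT NULL,"         # since signed INT tops out at 2147483647
        " PRIMARY KEY (id, rand_id))"
    )
    return conn

def add_comment(conn, body):
    # Emulate the per-database auto-increment counter.
    next_id = conn.execute(
        "SELECT COALESCE(MAX(id), 0) + 1 FROM comments"
    ).fetchone()[0]
    # Random 10-digit second half of the key.
    rand_id = random.randrange(10**9, 10**10)
    conn.execute(
        "INSERT INTO comments (id, rand_id, body) VALUES (?, ?, ?)",
        (next_id, rand_id, body),
    )
    return next_id, rand_id

local, remote = make_db(), make_db()
local_key = add_comment(local, "entered locally")
remote_key = add_comment(remote, "entered by a site visitor")

# Sync remote -> local: both rows have id 1, but the pair (id, rand_id) differs.
for row in remote.execute("SELECT id, rand_id, body FROM comments"):
    local.execute("INSERT INTO comments (id, rand_id, body) VALUES (?, ?, ?)", row)

merged = local.execute("SELECT id, body FROM comments ORDER BY body").fetchall()
print(merged)
```

Both comments share `id` 1 and still coexist in one table, because a conflict now requires the random half to match as well.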
Not the best solution but it works well.
Hope this helps someone else out.
Related
Just got a question here about a database table. If the table only has a primary key (identity) and 1 column of useful data, is it okay to be its own table or should it be in the parent table as just the data?
The table stores Security Questions that users set up when they create their account. The questions are used to reset the password when a user wants to change it or has forgotten it. The table holds the ID of the question and the question string.
The reason I have it in its own table is that the same question could be used by many users, so why store the question many times in the parent table? That's my thinking; I just wanted a few others' opinions on this.
EDIT: The Security Questions are going to be input by my team, not the user themselves. The user will pick one of the questions to use.
I would suggest this sample design using a bridge table:
A user can have multiple questions, each with that user's own answer. The same question can also be shared by multiple users.
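The answer's original diagram is not reproduced here, so the following is only one possible reading of a bridge-table design, written with Python's built-in `sqlite3`. All table and column names are assumptions, and the stored answer values are placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE users (
    user_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);
-- Each question string is stored exactly once.
CREATE TABLE security_questions (
    question_id INTEGER PRIMARY KEY,
    question    TEXT NOT NULL UNIQUE
);
-- Bridge table: one row per (user, question) pair, holding that user's answer.
CREATE TABLE user_security_answers (
    user_id     INTEGER NOT NULL REFERENCES users(user_id),
    question_id INTEGER NOT NULL REFERENCES security_questions(question_id),
    answer_hash TEXT NOT NULL,
    PRIMARY KEY (user_id, question_id)
);
""")

conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "ann"), (2, "raj")])
conn.executemany(
    "INSERT INTO security_questions VALUES (?, ?)",
    [(1, "First pet's name?"), (2, "City of birth?")],
)
# Ann picks both questions, Raj picks only the first one.
conn.executemany(
    "INSERT INTO user_security_answers VALUES (?, ?, ?)",
    [(1, 1, "<hash>"), (1, 2, "<hash>"), (2, 1, "<hash>")],
)

question_rows = conn.execute("SELECT COUNT(*) FROM security_questions").fetchone()[0]
users_per_question = dict(conn.execute(
    "SELECT question_id, COUNT(*) FROM user_security_answers GROUP BY question_id"
))
print(question_rows, users_per_question)
```

The question text lives in one place, and the bridge table carries only two integer keys plus the answer.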
You should always try to prevent duplicates; that's why your solution is the best.
It will also keep your database smaller. A foreign key with an int value is smaller than a string.
Suppose I've got two databases with the same schema, but with different data. What if I want to export one database's data into the other? This could be done without much inconvenience if there were no id conflicts. However, not only will there be id conflicts between members of the same model (for example, a user with id 1 in the first database and a completely different user with the same id in the second database), but trouble also arises with foreign key columns, which refer to ids that will have to change to avoid conflicts.
I was wondering if there was any quick and clean way to do this. Thanks
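One common way to approach this is to insert the parent rows first, record an old-id to new-id map, and rewrite the foreign keys of the child rows through that map. The sketch below shows the idea on two in-memory SQLite databases; the `users`/`posts` schema and the function names are hypothetical, and a real schema with many tables would need the tables processed in dependency order.

```python
import sqlite3

SCHEMA = """
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE posts (
    id      INTEGER PRIMARY KEY,
    user_id INTEGER REFERENCES users(id),
    title   TEXT
);
"""

def make_db(users, posts):
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO users VALUES (?, ?)", users)
    conn.executemany("INSERT INTO posts VALUES (?, ?, ?)", posts)
    return conn

def merge_into(target, source):
    """Copy source rows into target with fresh ids, rewriting foreign keys."""
    user_id_map = {}
    # Parents first: let the target assign new ids and remember the mapping.
    for old_id, name in source.execute("SELECT id, name FROM users ORDER BY id"):
        cur = target.execute("INSERT INTO users (name) VALUES (?)", (name,))
        user_id_map[old_id] = cur.lastrowid
    # Children next: translate each foreign key through the map.
    for old_user_id, title in source.execute(
        "SELECT user_id, title FROM posts ORDER BY id"
    ):
        target.execute(
            "INSERT INTO posts (user_id, title) VALUES (?, ?)",
            (user_id_map[old_user_id], title),
        )
    return user_id_map

# Both databases use id 1 for a different user.
db_a = make_db([(1, "alice")], [(1, 1, "alice's post")])
db_b = make_db([(1, "bob")], [(1, 1, "bob's post")])

id_map = merge_into(db_a, db_b)
rows = db_a.execute(
    "SELECT u.name, p.title FROM posts p JOIN users u ON u.id = p.user_id ORDER BY p.id"
).fetchall()
print(id_map, rows)
```

Bob's old id 1 becomes 2 in the merged database, and his post follows him there.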
Why use rowguid and what are the benefits?
Suppose a company has thousands of customers. Is it a good approach to split them up by gender for performance and faster queries? If not, why not?
How do large companies like Facebook handle primary keys for their comments, users and other things? For example:
Suppose there are five users with primary keys 1, 2, 3, 4, 5...
If user 3 is deleted, then 1, 2, 4, 5 are left, which leaves a gap in the otherwise continuous chain. How do they deal with it?
Don't know. Maybe you would use a non-auto-generated value so that you can keep it constant across other databases (for example, for use with 3rd-party integration).
Do not divide on a field such as gender. When you don't know the gender (or want a full list), you will have to search two tables, and any other filtering or searching you add will also have to run over multiple tables.
So what if there is a gap in the ID chain? It does not affect anything. Why would you think it is important?
Why do many applications replace the primary key of a database with a seemingly random alternative id when revealing the record to the user?
My guess is that it prevents users from guessing other rows in the table. If so, isn't that just a false sense of security?
I guess you are talking about surrogate keys here. One of the desired or supposed advantages of surrogate keys is that they aren't burdened by any external meaning or dependency on anything outside the database. So for example the surrogate key values can safely be reassigned or the key can be refactored or discarded without any consequences for users of the system.
Generally surrogate keys are kept hidden from users so that they don't acquire any such external dependencies. Being hidden from users was in fact part of the original definition of a surrogate key as proposed by E.F.Codd. If key values reside in the user's browser cache or favourites list then they aren't much use as "surrogates" any more. So that's one common reason why you will see one key used only inside the database and a different key for the same table made visible in the application.
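As an illustration of "one key inside the database, a different key shown to the application", here is a small sketch with an internal auto-assigned `id` and a separate random `public_id`. The table, the column names and the use of `secrets.token_urlsafe` are assumptions for the example, not something the answer prescribes.

```python
import secrets
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE people ("
    " id INTEGER PRIMARY KEY,"          # surrogate key, never leaves the database
    " public_id TEXT NOT NULL UNIQUE,"  # the only identifier shown to users
    " name TEXT NOT NULL)"
)

def add_person(conn, name):
    public_id = secrets.token_urlsafe(16)
    conn.execute(
        "INSERT INTO people (public_id, name) VALUES (?, ?)", (public_id, name)
    )
    return public_id

def find_by_public_id(conn, public_id):
    row = conn.execute(
        "SELECT name FROM people WHERE public_id = ?", (public_id,)
    ).fetchone()
    return row[0] if row else None

first = add_person(conn, "John Smith")
second = add_person(conn, "John Smith")
print(find_by_public_id(conn, first))
```

Because users only ever hold `public_id`, the internal `id` values can be renumbered or replaced without breaking bookmarks or saved links.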
I think it may depend on the type of application you are working with. I work with enterprise software that is only used by the company I work for and is not generally available to the outside world. In this case, it is often critical to let the user see the surrogate key for people-related records, because the information in the person table has no uniqueness. There can be two John Smiths (we actually have over 1000 of them) who are genuinely different people. They may even have the same business address and still be different people (sons are often named for fathers and work in the same medical practice, for instance). So users need to refer to the surrogate key on forms and in reporting to ensure they are using the record they intended. Otherwise, if they wanted to research further details about the John Smith they saw in a report, how would they look him up in the application without going through all 1000 to find the right one? Creating a fake id as well as the real one would be time-consuming (we import millions of records at a time) and bring no real gain, since the data is not visible outside our company application.
For a web app that is open to the general public, I can see where you might not want to show this information.
I have a web app in which I can create notes. Each time I create a new note, it is inserted into a table with an auto_increment id (quite obvious).
Now I want to develop an Android app in which I can create notes too (saving them locally in SQLite), and then synchronize those notes with the server.
The problem is that notes created on my phone get their own auto_increment ids, which will often be the same as the ids of notes on the server!
I don't mind having duplicated notes (actually I don't think there is a way to tell whether a new note is a duplicate, because notes have no physical id). The problem is that if they have the same id (primary key), I won't be able to insert them on the server.
Any suggestion?
You could use a UUID as the key for your note.
That way, each entry should have a unique id, whether it was created on the server or on the client.
To create a UUID, you can use UUID.randomUUID().
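Here is a minimal sketch of the idea, written in Python with `uuid.uuid4()` playing the role that `UUID.randomUUID()` plays on Android. The `notes` table and helper names are hypothetical, and both databases are in-memory SQLite stand-ins for the server and the phone.

```python
import sqlite3
import uuid

def make_db():
    conn = sqlite3.connect(":memory:")
    # The UUID string is the primary key, so no auto-increment counter is involved.
    conn.execute("CREATE TABLE notes (id TEXT PRIMARY KEY, text TEXT NOT NULL)")
    return conn

def new_note(conn, text):
    note_id = str(uuid.uuid4())  # random (version 4) UUID
    conn.execute("INSERT INTO notes (id, text) VALUES (?, ?)", (note_id, text))
    return note_id

server, phone = make_db(), make_db()
server_id = new_note(server, "written in the web app")
phone_id = new_note(phone, "written on the phone")

# Upload: phone notes can be inserted on the server with their ids unchanged.
for row in phone.execute("SELECT id, text FROM notes"):
    server.execute("INSERT INTO notes (id, text) VALUES (?, ?)", row)

server_count = server.execute("SELECT COUNT(*) FROM notes").fetchone()[0]
print(server_count)
```

Neither side needs to know what ids the other has already used.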
The most obvious solution would be to give each note its own unique hash or GUID in addition to the database's auto_increment_id.
You'd then use these unique values as the basis for synchronisation in conjunction with a "last synced" timestamp in each of the tables so that you know what data needs to be synced and can easily determine if the data already exists in the destination (and should be updated) or whether it's a new note.
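A sketch of how such a sync pass might look, assuming each table keeps its own local auto-increment `id`, a `guid` column shared across devices, and an `updated_at` integer timestamp. The function `push_changes`, the schema, and the "newer timestamp wins" rule are all assumptions for illustration.

```python
import sqlite3
import uuid

def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE notes ("
        " id INTEGER PRIMARY KEY,"        # local auto-increment id, never synced
        " guid TEXT NOT NULL UNIQUE,"     # identity shared by all copies of a note
        " text TEXT NOT NULL,"
        " updated_at INTEGER NOT NULL)"
    )
    return conn

def save(conn, guid, text, updated_at):
    conn.execute(
        "INSERT INTO notes (guid, text, updated_at) VALUES (?, ?, ?)",
        (guid, text, updated_at),
    )

def push_changes(source, dest, last_synced):
    """Send notes changed since last_synced; return (inserted, updated) counts."""
    inserted = updated = 0
    changed = source.execute(
        "SELECT guid, text, updated_at FROM notes WHERE updated_at > ?",
        (last_synced,),
    ).fetchall()
    for guid, text, updated_at in changed:
        existing = dest.execute(
            "SELECT updated_at FROM notes WHERE guid = ?", (guid,)
        ).fetchone()
        if existing is None:
            save(dest, guid, text, updated_at)      # new note
            inserted += 1
        elif existing[0] < updated_at:
            dest.execute(                            # existing note, newer copy
                "UPDATE notes SET text = ?, updated_at = ? WHERE guid = ?",
                (text, updated_at, guid),
            )
            updated += 1
    return inserted, updated

guid_a, guid_b = str(uuid.uuid4()), str(uuid.uuid4())
server, phone = make_db(), make_db()
save(server, str(uuid.uuid4()), "server-only note", 50)
save(server, guid_a, "draft", 100)       # note A as of the previous sync
save(phone, guid_a, "final", 250)        # note A edited on the phone since then
save(phone, guid_b, "brand new", 300)    # note B created on the phone

result = push_changes(phone, server, last_synced=200)
server_text_a = server.execute(
    "SELECT text FROM notes WHERE guid = ?", (guid_a,)
).fetchone()[0]
server_count = server.execute("SELECT COUNT(*) FROM notes").fetchone()[0]
print(result, server_text_a, server_count)
```

Note A has local id 1 on the phone and 2 on the server, which does not matter because matching is done on `guid` only.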
I'm sorry, but I think that your DB structure is wrong. You cannot use an autoincrement field this way, across different DBs in a disconnected architecture. Autoincrement values are meant for use within a single database; if you need to merge two tables like this, you have to implement different logic. Use a note_id that identifies a note uniquely, built from more data (e.g. the user id, the device id, etc.). Autoincrement will only give you a messy architecture at best in this scenario.
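One way to read that suggestion is a composite key made of the user id, a device id and a per-device sequence number. The sketch below assumes that layout; the column names, the device labels and the helper `add_note` are hypothetical.

```python
import sqlite3

def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE notes ("
        " user_id INTEGER NOT NULL,"
        " device_id TEXT NOT NULL,"
        " local_seq INTEGER NOT NULL,"   # counter kept separately per device
        " text TEXT NOT NULL,"
        " PRIMARY KEY (user_id, device_id, local_seq))"
    )
    return conn

def add_note(conn, user_id, device_id, text):
    next_seq = conn.execute(
        "SELECT COALESCE(MAX(local_seq), 0) + 1 FROM notes"
        " WHERE user_id = ? AND device_id = ?",
        (user_id, device_id),
    ).fetchone()[0]
    conn.execute(
        "INSERT INTO notes VALUES (?, ?, ?, ?)",
        (user_id, device_id, next_seq, text),
    )
    return (user_id, device_id, next_seq)

server, phone = make_db(), make_db()
server_key = add_note(server, 42, "web", "note from the web app")
phone_key = add_note(phone, 42, "phone-1", "note from the phone")

# Both notes have sequence number 1, yet the full keys differ by device.
for row in phone.execute("SELECT user_id, device_id, local_seq, text FROM notes"):
    server.execute("INSERT INTO notes VALUES (?, ?, ?, ?)", row)

all_keys = server.execute(
    "SELECT user_id, device_id, local_seq FROM notes ORDER BY device_id"
).fetchall()
print(all_keys)
```

This only stays unique as long as every device has a distinct `device_id`, which is the part the application has to guarantee.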