How do you keep two related, but separate, systems in sync with each other? - sql-server

My current development project has two aspects to it. First, there is a public website where external users can submit and update information for various purposes. This information is then saved to a local SQL Server at the colo facility.
The second aspect is an internal application which employees use to manage those same records (conceptually) and provide status updates, approvals, etc. This application is hosted within the corporate firewall with its own local SQL Server database.
The two networks are connected by a hardware VPN solution, which is decent, but obviously not the speediest thing in the world.
The two databases are similar, and share many of the same tables, but they are not 100% the same. Many of the tables on both sides are very specific to either the internal or external application.
So the question is: when a user updates their information or submits a record on the public website, how do you transfer that data to the internal application's database so it can be managed by the internal staff? And vice versa... how do you push updates made by the staff back out to the website?
It is worth mentioning that the closer to real time these updates occur, the better. Not that it has to be instant, just reasonably quick.
So far, I have thought about using the following types of approaches:
Bi-directional replication
Web service interfaces on both sides with code to sync the changes as they are made (in real time).
Web service interfaces on both sides with code to asynchronously sync the changes (using a queueing mechanism).
Any advice? Has anyone run into this problem before? Did you come up with a solution that worked well for you?

This is a pretty common integration scenario, I believe. Personally, I think an asynchronous messaging solution using a queue is ideal.
You should be able to achieve near real time synchronization without the overhead or complexity of something like replication.
Synchronous web services are not ideal because your code will have to be very sophisticated to handle failure scenarios. What happens when one system is restarted while the other continues to publish changes? Does the sending system get timeouts? What does it do with those? Unless you are prepared to lose data, you'll want some sort of transactional queue (like MSMQ) to receive the change notices and take care of making sure they get to the other system. If either system is down, the changes (passed as messages) will just accumulate, and as soon as a connection can be established the restarting server will process all the queued messages and catch up, making system integrity much, much easier to achieve.
There are some open source tools that can really make this easy for you if you are using .NET (especially if you want to use MSMQ).
nServiceBus by Udi Dahan
MassTransit by Dru Sellers and Chris Patterson
There are commercial products also, and if you are considering a commercial option see here for a list of options on .NET. Of course, WCF can do async messaging using MSMQ bindings, but a tool like nServiceBus or MassTransit will give you a very simple Send/Receive or Pub/Sub API that will make your requirement a very straightforward job.
If you're using Java, there are any number of open source service bus implementations that will make this kind of bi-directional, asynchronous messaging a snap, like Mule or maybe just ActiveMQ.
You may also want to consider reading Udi Dahan's blog and listening to some of his podcasts; they are good resources to get you started.

I'm mid-way through a similar project except I have multiple sites that need to keep in sync over slow connections (dial-up in some cases).
Firstly, you need to track changes. If you can use SQL 2008 (even the Express edition is enough if the 2 GB limit isn't a problem) this will ease the pain greatly: just turn on Change Tracking on the database and on each table. We're using SQL Server 2008 at the head office with the extended schema and SQL Express 2008 at each site with a sub-set of the data and a limited schema.
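For reference, turning Change Tracking on is just a couple of DDL statements. A minimal sketch, with made-up database and table names (Change Tracking requires a primary key on the tracked table):

    ALTER DATABASE HeadOfficeDb
    SET CHANGE_TRACKING = ON
    (CHANGE_RETENTION = 7 DAYS, AUTO_CLEANUP = ON);

    ALTER TABLE dbo.Customers
    ENABLE CHANGE_TRACKING
    WITH (TRACK_COLUMNS_UPDATED = ON);

    -- Changes made since a previously stored sync version can then be read with:
    -- (CustomerId is assumed to be the primary key of dbo.Customers)
    SELECT ct.CustomerId, ct.SYS_CHANGE_OPERATION
    FROM CHANGETABLE(CHANGES dbo.Customers, @last_sync_version) AS ct;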
Secondly, you need to sync those changes. Sync Services does the trick nicely and supports using a WCF gateway into the main database. In this scenario you will need to use the "Sync using SQL Express Client" sample as a starting point; note that it's based on SQL 2005, so you'll need to update it to take advantage of the Change Tracking features in 2008. By default Sync Services uses SQL CE on the clients, which I'm sure isn't enough in your case. You'll need a service that runs on your web server and periodically (as often as every 10 seconds if you want) calls the Synchronize() method. This will tell your main database about changes made locally and then ask the server for all changes made there. You can set up the "get" and "apply" SQL to call stored procedures, and you can add event handlers to handle conflicts (e.g. client update vs. server update) and resolve them accordingly at each end.

We have a shop as a client, with three stores connected to the same VPN.
Two of the shops have a computer running as a "server" for that shop, and the third one has the "master database".
To synchronize everything to the master we don't have the best solution, but it works: there is a dedicated PC running an application that checks the timestamp of every record in every table of the two stores and, if it is different from the last time it synchronized, copies the changes across.
Note that this works both ways. I.e. if you update a product in the master database, this change will propagate to the other two shops. If you have a new order in one of the shops, it will be transmitted to the "master".
With some optimizations you can have all the shops synchronized in around 20 minutes.
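The dedicated PC runs the comparison in an application, but per table the logic amounts to something like the following T-SQL sketch (database, table and column names are invented; it assumes each table carries a LastModified column and a shared key, and that both databases are reachable from one instance):

    DECLARE @last_sync datetime =
        (SELECT LastSyncTime FROM dbo.SyncStatus WHERE StoreId = 2);

    -- Pull rows changed at the store since the last sync into the master copy.
    MERGE master_db.dbo.Orders AS target
    USING (SELECT * FROM store2_db.dbo.Orders
           WHERE LastModified > @last_sync) AS source
       ON target.OrderId = source.OrderId
    WHEN MATCHED THEN
        UPDATE SET target.Status       = source.Status,
                   target.LastModified = source.LastModified
    WHEN NOT MATCHED THEN
        INSERT (OrderId, Status, LastModified)
        VALUES (source.OrderId, source.Status, source.LastModified);

    UPDATE dbo.SyncStatus SET LastSyncTime = GETDATE() WHERE StoreId = 2;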

Recently I have had a lot of success with SQL Server Service Broker which offers reliable, persisted asynchronous messaging out of the box with very little implementation pain.
It is quick to set up and as you learn more you can use some of the more advanced features.
Unknown to most, it is also part of the desktop editions, so it can be used as a workstation messaging system.
If you have existing T-SQL skills, they can be leveraged, as all the code to read and write messages is done in SQL.
It is blindingly fast.
It is a vastly under-hyped part of SQL Server and well worth a look.
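To give a feel for what's involved, here is a minimal Service Broker sketch (every object name below is invented) showing the moving parts on one side of the conversation: message type, contract, queues, services, and a send.

    -- Assumes the database has Service Broker enabled (ALTER DATABASE ... SET ENABLE_BROKER).
    CREATE MESSAGE TYPE [//Sync/RecordChanged] VALIDATION = WELL_FORMED_XML;
    CREATE CONTRACT [//Sync/ChangeContract]
        ([//Sync/RecordChanged] SENT BY INITIATOR);

    CREATE QUEUE dbo.PublicSiteQueue;
    CREATE QUEUE dbo.InternalAppQueue;

    CREATE SERVICE [//Sync/PublicSiteService]  ON QUEUE dbo.PublicSiteQueue  ([//Sync/ChangeContract]);
    CREATE SERVICE [//Sync/InternalAppService] ON QUEUE dbo.InternalAppQueue ([//Sync/ChangeContract]);

    -- Sending a change notice from one side to the other:
    DECLARE @h uniqueidentifier;
    BEGIN DIALOG CONVERSATION @h
        FROM SERVICE [//Sync/PublicSiteService]
        TO SERVICE   '//Sync/InternalAppService'
        ON CONTRACT  [//Sync/ChangeContract]
        WITH ENCRYPTION = OFF;

    SEND ON CONVERSATION @h
        MESSAGE TYPE [//Sync/RecordChanged]
        (N'<change table="Submissions" id="42" action="update" />');

In the two-server scenario from the question the services would live in the two databases, with endpoints, routes and dialog security configured between the instances, but the send/receive pattern stays the same.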

I'd say just have a job that copies the data from the public database's input table into a pending table in the private database. Then, once you update the data on the private side, have it replicated back to the public side. If the replicated data is never updated directly on the public side, it should be a fairly easy transactional replication solution.
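A rough sketch of what that copy job might look like, assuming (purely for illustration) a linked server named PublicColo pointing at the colo database and a Copied flag on the public input table:

    DECLARE @batch TABLE (SubmissionId int PRIMARY KEY);

    -- Snapshot the set of un-copied ids once, so the copy and the flag
    -- update operate on exactly the same rows.
    INSERT INTO @batch (SubmissionId)
    SELECT SubmissionId
    FROM   [PublicColo].PublicSite.dbo.Submissions
    WHERE  Copied = 0;

    INSERT INTO dbo.PendingSubmissions (SubmissionId, Payload, SubmittedAt)
    SELECT s.SubmissionId, s.Payload, s.SubmittedAt
    FROM   [PublicColo].PublicSite.dbo.Submissions AS s
    JOIN   @batch AS b ON b.SubmissionId = s.SubmissionId;

    UPDATE s
    SET    s.Copied = 1
    FROM   [PublicColo].PublicSite.dbo.Submissions AS s
    JOIN   @batch AS b ON b.SubmissionId = s.SubmissionId;

A real job would add error handling so a failure midway doesn't leave rows copied but unflagged (or flagged but not copied).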

Related

Transfer data between NoSQL and SQL databases on different servers

Currently, I'm working on a MERN web application that'll need to communicate with a Microsoft SQL Server database on a different server but on the same network.
Data will only be "transferred" from the Mongo database to the MSSQL one based on a user action. I think I can accomplish this by simply transforming the data to transfer into the appropriate format on my Express server and connecting to the MSSQL via the matching API.
On the flip side, data will be transferred from the MSSQL database to the Mongo one when a certain field is updated in a record. I think I can accomplish this with a Trigger, but I'm not exactly sure how.
Do either of these solutions sound reasonable, or are there better, industry-standard methods that I should be employing? Any and all help is much appreciated!
There are (in general) two ways of doing this.
If the data transfer needs to happen immediately, you may be able to use triggers to accomplish this, although be aware of your error handling.
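Since a trigger can't (and arguably shouldn't) talk to MongoDB directly, one hedged sketch of the trigger route is to have it record the change in an "outbox" table that your Express/worker code then reads and forwards; all names below are invented:

    CREATE TABLE dbo.MongoOutbox
    (
        OutboxId   int IDENTITY PRIMARY KEY,
        RecordId   int        NOT NULL,
        ChangedAt  datetime2  NOT NULL DEFAULT SYSUTCDATETIME(),
        Processed  bit        NOT NULL DEFAULT 0
    );
    GO

    CREATE TRIGGER trg_Records_StatusChanged
    ON dbo.Records
    AFTER UPDATE
    AS
    BEGIN
        SET NOCOUNT ON;
        -- Only queue rows where the watched column actually changed value.
        IF UPDATE(Status)
            INSERT INTO dbo.MongoOutbox (RecordId)
            SELECT i.RecordId
            FROM inserted AS i
            JOIN deleted  AS d ON d.RecordId = i.RecordId
            WHERE ISNULL(i.Status, '') <> ISNULL(d.Status, '');
    END;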
The other option is to develop some form of worker process in your favourite scripting language and run this on a schedule. (This would be my preferred option, as my personal familiarity with triggers is fairly limited). If option 1 isn't viable, you could set your schedule to be very frequent, say once per minute or every x seconds, as long as a new task doesn't spawn before the previous is completed.
The broader question, though, is whether you need to have data duplicated across two different sources at all. The obvious pitfall with this approach is consistency: should anything fail, you can end up with two data sources wildly out of sync with each other, and your approach will have to account for this.

Is this the right architecture for our MMORPG mobile game?

These days I am trying to design the architecture of a new MMORPG mobile game for my company. This game is similar to Mafia Wars, iMobsters, or RISK. The basic idea is to prepare an army to battle your opponents (online users).
Although I have previously worked on multiple mobile apps, this is something new to me. After a lot of struggle, I have come up with an architecture which is illustrated with the help of a high-level flow diagram:
We have decided to go with client-server model. There will be a centralized database on server. Each client will have its own local database which will remain in sync with server. This database acts as a cache for storing things that do not change frequently e.g. maps, products, inventory etc.
With this model in place, I am not sure how to tackle following issues:
What would be the best way of synchronizing server and client databases?
Should an event get saved to local DB before updating it to server? What if app terminates for some reason before saving changes to centralized DB?
Will simple HTTP requests serve the purpose of synchronization?
How to know which users are currently logged in? (One way could be to have client keep on sending a request to server after every x minutes to notify that it is active. Otherwise consider a client inactive).
Are client side validations enough? If not, how to revert an action if server does not validate something?
I am not sure if this is an efficient solution and how it will scale. I would really appreciate if people who have already worked on such apps can share their experiences which might help me to come up with something better. Thanks in advance.
Additional Info:
The client side is implemented in a C++ game engine called Marmalade. This is a cross-platform game engine, which means you can run your app on all major mobile OSes. We can certainly achieve threading, which is also illustrated in my flow diagram. I am planning to use MySQL for the server and SQLite for the client.
This is not a turn-based game, so there is not much interaction with other players. The server will provide a list of online players, and you can battle them by clicking the battle button; after some animation, the result will be announced.
For database synchronization I have two solutions in mind:
Store a timestamp for each record, and also keep track of when the local DB was last updated. When synchronizing, select only those rows that have a greater timestamp and send them to the local DB. Keep an isDeleted flag for deleted rows so every deletion simply behaves as an update. But I have serious doubts about performance, as for every sync request we would have to scan the complete DB and look for updated rows. (A rough sketch of this option appears below, after the second technique.)
Another technique might be to keep a log of each insertion or update that takes place against a user. When the client app asks for a sync, go to this table and find out which rows of which tables have been updated or inserted. Once these rows are successfully transferred to the client, remove the log entries. But then I think of what happens if a user uses another device: according to the log table, all updates have been transferred for that user, but actually that was done on another device. So we might have to keep track of the device as well. Implementing this technique is more time-consuming, but I'm not sure whether it outperforms the first one.
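As a rough illustration of the first option, the per-table query can stay cheap if the modification timestamp is indexed; table and column names here are invented, and @client_last_sync stands for the last-sync value the client sends:

    -- Each table carries LastModified and IsDeleted columns; the client sends
    -- the timestamp of its last successful sync.
    SELECT  p.ProductId, p.Name, p.Price, p.IsDeleted, p.LastModified
    FROM    Products AS p
    WHERE   p.LastModified > @client_last_sync;

    -- An index on the timestamp avoids scanning the whole table on every sync.
    CREATE INDEX idx_Products_LastModified ON Products (LastModified);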
I've actually worked on some of the titles you mentioned.
I do not recommend using MySQL; it doesn't scale up well, even if you shard. And if you do shard, you are losing most of the benefits of using a relational database.
You are probably better off using a NoSQL database. It is faster to develop against, easy to scale, and it is simple to change the document structure, which is a given for a game.
If your game data is simple you might want to try CouchDB; if you need advanced querying you are probably better off with MongoDB.
Take care of security at the start. Players will try to hack the game for sure, and if you have a number of clients released it is hard to make security changes backward compatible. SSL won't do much, as the end user is the problem, not an eavesdropper. Signing or encrypting your data will make it harder for a user to add items and gold to their accounts.
You should also define your architecture to support multiple clients without having a bunch of ifs and case statements. Read the client version and dispatch that client to the appropriate codebase.
Have a maintenance mode with flags for upgrading, maintenance, etc. It will cut you some slack if you need to re-shard your DB or any other change that might require downtime.
Client-side validations are not enough, especially if you are using in-app purchases. I agree with the above post: the server should control game logic.
As for DB sync, it's best to memcache read-only data. Typical examples are buyable items, maps, news, etc. User data is harder, as you might not be able to afford losing any modified data. The easiest setup is to cache user data for a couple of hours and write directly to the DB every time. If you are using NoSQL it will probably withstand a high load without the need for a persistence queue.
I see two potential problems hidden in the fact that you store all the state on the client and then update the state on the server using a background thread.
How can the server validate the data being posted? If someone hacked your application, they could modify the code so whenever they swing their sword (or whatever they do in your game), it is always a hit. Doing that in a single player game is not that big a deal, but doing that in an MMORPG can ruin the experience for everyone else. So the server should validate every update of data - or even better, the server should be in charge of every business rule. So when you swing your sword against an opponent, that should be a server call, and the server returns whether or not it is a hit, and how many hit points the opponent lost.
What about interaction with other players (since you say it is an MMORPG, there will be interaction with other players)? Since you update the server and get updates in a background thread, interaction will be sluggish: when you communicate with another character, you first have to wait for your background thread to sync data, and you also have to wait for the other player's background thread to sync their data.
Looks nice. But what is the client side made of? Web? Can you use threading to synchronize both DBs? I would make the game interact immediately with the local DB and let some background mechanism do the sync (something like a snapshot). This leads me to think about MySQL replication. I think it is worth trying, though I never have. It would also answer some of your other questions. But what about the load (how many customers are connected at once)?
http://dev.mysql.com/doc/refman/5.0/en/replication.html
Make your client issue commands to the server ("hit player") and have the server send (relevant) events to the client ("player was killed"). I wouldn't advise going with data synchronization; the server should be responsible for all important game decisions.

Accessing data from other databases - system architecture

I have a system that we have recently developed - a web application over a SQL server database. The SQL server database has been set up to be a 'multi-tenant' database, with many different 'installations' of our web site accessing the same database.
We have another application that runs along similar lines, the main difference being that it has many different 'installations' all accessing their own separate databases.
All these websites run on the same server and all the databases reside in the same SQL server instance.
Each of our clients would have one of each of these systems and up to this point, we have had some fairly light integration between these two systems, which has been handled via web service calls.
We now have a new change that is going to require me to return a list of data from the multi-tenant system, but filter it based on criteria stored in the databases of the other system. I can see a few ways of doing this, but was wondering if anybody had any bright ideas:
Web service again - don't like this idea, as it means taking a list of data and making a call for each individual item, which is both slow and ugly.
Writing some dynamic SQL within the database layer to do a cross-database join (e.g. on OtherClientDb.dbo.SomeTable), which is also a bit ugly, and can be hard to maintain.
Replicate the data from one database to the other. This is where I am tending towards, however there then comes a risk of the data getting out of sync.
I'd like to do something clever with views in my multi-tenant database, but I don't want to have to create a separate set of views each time we create a new database for the second system... (there is a rough sketch of this idea below).
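Since all the databases live in the same SQL Server instance, the view idea can be expressed with three-part names. A hedged sketch with invented names; the per-client database name in the FROM clause is exactly what would force either dynamic SQL or one view per client:

    CREATE VIEW dbo.vw_FilteredRecords_ClientA
    AS
    SELECT  r.RecordId, r.Title, r.Status
    FROM    dbo.Records AS r
    JOIN    ClientA_Db.dbo.FilterCriteria AS c
            ON c.RecordId = r.RecordId
    WHERE   c.IsIncluded = 1;

Synonyms (CREATE SYNONYM) can hide the database name from the view body, but they still have to be created once per client database, so the per-client maintenance doesn't go away entirely.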
Depending on business size, I'd go with #1 or #2.
#1 is more scalable and good for heterogeneous clients, but harder to implement and maintain. Since you don't have public APIs, you could go with #2.
#2 needs an expert DBA and is very error-prone.
#3 is the worst solution IMO, since it introduces redundancy that is hard to resolve later.
What I suggest is a short-term plan and a long-term plan. In the short term use #1 or #2, and at the same time redesign your database. Then you can add the new data model to the system so it can coexist with the legacy database. When you are sure of its functionality, switch to the new DB but still retain the legacy system. And finally, when the new DB has had no problems for a while, take the legacy DB out of the picture.
Don't change the data model. It's risky. Just put another abstraction wrapper over it.
You can replicate the database to another server and let this new wrapper work with the copy of the data.
If any data corruption happens, simply restore from the main copy.

Replication vs Sync Framework vs Service Broker

I've asked about each of these technologies separately, and really haven't found a suitable answer.
We have a server in our central office running SQL Server 2005 Enterprise that has several large (large in the sense that DSL is the limiting factor) databases that we need local copies of at each of our locations. We currently have a few dozen locations, and are needing to bring even more online. The total number of locations we'll need to sync these databases to will be in the several hundreds in the next 2 years.
We are trying to overcome issues with the WAN connection at each location. These are DSL lines and the wiring at the locations isn't always the best. We currently have issues with some of the locations going down as often as every hour. While we are working to resolve these issues with rewiring and assistance from the local telcos, it mainly highlights the problem at hand: we need a two-way sync that can handle being occasionally-connected.
We tried transactional replication for a while, and while it worked some of the time, it was too high maintenance for us, and it seemed to randomly error out often with no possible explanations, forcing us to reinitialize subscriptions (which could take upwards of 4 hours assuming the location would stay connected long enough to get the entire snapshot in one go). We've looked at rolling our own solution from scratch, but I don't feel this would be the best idea given the scale and reliability we are needing.
So far we've also looked at Sync Framework, and as suggested by someone else, Service Broker. Sync Framework seems a better fit, but I was told that Service Broker scales better and is more reliable? I can't find any empirical data on the overhead involved with Sync Framework or Service Broker, so it's proving impossible to compare the two in this regard.
What we really need is a two-way sync between the central office server and a remote client that can run autonomously and can report to an admin in the event of a failure that requires our intervention.
There are so many possible solutions to this problem, all involving completely different technologies, that I need a fresh eye on this.
What do you think would be the optimal solution for our situation, and why?
EDIT: Obviously, upgrading to SQL Server 2008 would solve this problem easily. However, we would like to try less expensive options first.
I don't have any hard data to offer on this, but we used the Sync Framework on a project a while ago. My experience with it is really bad. It's slow (even when synchronizing relatively small tables across a LAN), scales terribly, and requires a lot of work to manually handle error conditions (it'll happily produce larger packets than WCF can handle by default, and it is only able to split updates into batches when syncing one way, not the other). And it only works with a few select databases (the client must use MS SQL Compact Edition, as I recall), unless you're willing to write your own SyncAdapter.
Overall, a lot of work just to get a fragile and inefficient solution to your problem. I wouldn't recommend it.
You can use Sync Framework with SQL Express 2008 R1/R2 on one end and a multi-tenant SQL Server Enterprise database on the central end. Below is a sample application for n-tier sync over a secure WCF channel. You could write a Windows service to sync data from the backend:
http://www.rajneeshnoonia.com/blog/2012/03/n-tier-sync-framework/
It should be capable enough to handle a large number of clients (thousands).
I think we'll look into the SQL Server 2008 upgrade route. It seems the native change tracking support will be the easiest way to accomplish this.

Web services and database concurrency

I'm building a .NET client application (C#, WinForms) that uses a web service for interaction with the database. The client will be run from remote locations using a WAN or VPN, hence the idea of using a web service rather than direct database access.
The issue I'm grappling with right now is how to handle database concurrency. That is, if two people from different locations update the same data, how do I deal with it? I'm considering using timestamps on each database record and having that as part of the update where clauses, but that means that the timestamps have to move back and forth through the web service interface, which seems kind of ugly.
What is the best way to approach this?
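For what it's worth, the timestamp-in-the-WHERE-clause idea usually looks something like this on the database side; a sketch with invented names, assuming a rowversion-style column called RowVersion:

    -- The client read the row earlier and holds @OriginalRowVersion.
    UPDATE dbo.Customers
    SET    Name  = @Name,
           Phone = @Phone
    WHERE  CustomerId = @CustomerId
      AND  RowVersion = @OriginalRowVersion;

    -- Zero rows affected means someone else updated the row first.
    IF @@ROWCOUNT = 0
        RAISERROR('The record was modified by another user.', 16, 1);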
I don't think you want your web service to talk directly to the database. You probably want your service to interact with some type of business components, which in turn interact with a data access layer. Any concurrency exceptions can be passed from the DAL up to the business layer, where they can be handled so that the web service never has to see the timestamps.
But if you are passing something like a data table up to the client and you want to avoid timestamps, you can do concurrency checking by comparing field by field. The Table Adapter wizards generate this type of concurrency checking by default if you ask for optimistic concurrency checking.
If your collisions occur infrequently enough that they can be resolved manually, a simple solution is to add an update trigger that copies a row's pre-update values to an audit table. This way the most recent write is the "winner", but no data is ever lost to an overwrite, and an administrator can restore an earlier row state or even combine them.
This technique has its downsides, and is not a very good solution where frequent overwrites are common.
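A minimal sketch of such an audit trigger (table and column names invented):

    CREATE TRIGGER trg_Customers_Audit
    ON dbo.Customers
    AFTER UPDATE
    AS
    BEGIN
        SET NOCOUNT ON;
        -- Copy the pre-update values so an administrator can restore or merge them later.
        INSERT INTO dbo.Customers_Audit (CustomerId, Name, Phone, ReplacedAt)
        SELECT d.CustomerId, d.Name, d.Phone, SYSUTCDATETIME()
        FROM   deleted AS d;
    END;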
Also, this is slightly off-topic, but using web services isn't necessarily the way to go just because the clients will be remoting into the network. ASP.NET web services are XML-based and very verbose. If your client application can count on being always connected, you'd be better off not using web services.
