How is this problem generally solved?
To give some context, I have a single database and multiple processes connected to that database (a mobile api, a custom content management tool and a custom front end website), they all run on different servers. Sometimes it is useful to get delta changes on the database; for instance say I update the database with new data using the content management tool (which then sets some metadata on the data that was changed, specifically its updated_at field), and all my mobile apps need a local copy of the database to work. It wouldn't make much sense to redownload the whole db, so it's useful for the app to send its local last_updated timestamp and retrieve the subset of fields which were updated since that time. But the mobile api's server's last_updated and the database server's updated_at are not calculated from the same time source, each server has its own clock.
Is using a time based "freshness indicator" just a dirty mess or is there a robust and efficient way to do it? Or is there a better approach? I'm thinking something like incrementing a version number. What's the best practice for this kind of thing?
Related
We maintain a Software as a Service (SaaS) web application that sits on top of a multi-tenant SQL Server database. There are about 200 tables in the system, this biggest with just over 100 columns in it, at last look the database was about 10 gigabytes in size. We have about 25 client companies using the application every entering their data and running reports.
The single instance architecture is working very effectively for us - we're able to design and develop new features that are released to all clients every month. Each client experience can be configured through the use of feature-toggles, data dictionary customization, CSS skinning etc.
Our typical client is a corporate with several branches, one head office and sometimes their own inhouse IT software development teams.
The problem we're facing now is that a few of the clients are undertaking their own internal projects to develop reporting, data warehousing and dashboards based on the data presently stored in our multi-tenant database. We see it as likely that the number and sophistication of these projects will increase over time and we want to cater for it effectively.
At present, we have a "lite" solution whereby we expose a secured XML webservice that clients can call to get a full download of their records from a table. They specify the table, and we map that to a purpose-built stored proc that returns a fixed number of columns. Currently clients are pulling about 20 tables overnight into a local SQL database that they manage. Some clients have tens of thousands of records in a few of these tables.
This "lite" approach has several drawbacks:
1) Each client needs to develop and maintain their own data-pull mechanism, deal with all the logging, error handling etc.
2) Our database schema is constantly expanding and changing. The stored procs they are calling have a fixed number of columns, but occasionally when we expand an existing column (e.g. turn a varchar(50) into a varchar(100)) their pull will fail because it suddenly exceeds the column size in their local database.
3) We are starting to amass hundreds of different stored procs built for each client and their specific download expectations, which is a management hassle.
4) We are struggling to keep up with client requests for more data. We provide a "shell" schema (i.e. a copy of our database with no data in it) and ask them to select the tables they need to pull. They invariably say "all of them" which compounds the changing schema problem and is a heavy drain on our resources.
Sorry for the long winded question, but what I'm looking for is an approach to this problem that other teams have had success with. We want to securely expose all their data to them in a way they can most easily use it, but without getting caught in a constant process of negotiating data exchanges and cleaning up after schema changes.
What's worked for you?
Thanks,
Michael
I've worked for a SaaS company that went through a similar exercise some years back and Web Services is the probably the best solution here. incidentally, one of your "drawbacks" is actually a benefit. Customers should be encouraged to do their own data pulls because each customer's needs on timing and amount of data will be different.
Now instead of a LITE solution, you should look at building out a WSDL with separate CRUD calls for each table and good filtering capabilities. Also, make sure you have change times for records on each table. this way a customer can hit each table and immediately pull only the records that have been updated since the last time they pulled.
Will it be easy. Not a chance, but if you want scalability, it's the only route to go.
ood luck.
I've built a django/satchmo ecommerce site which is starting to get some traffic, and I am having a problem because I do not have a smart way to deal with database changes. When I develop the site on my local system, I make changes to the layout and the DB, which manages the product attributes.
When I want to push new developments to the server, I have to overwrite the server database which has information about recent shoppers and purchases.
What I want to do is "merge" the two databases together so that new purchases still stay recorded in the database, but which also allow me to push local changes to the server.
I'd appreciate any advice. Thanks.
are you using south? (if not, you should)
in particular, have a look at data migrations
These days I am trying to design architecture of a new MMORPG mobile game for my company. This game is similar to Mafia Wars, iMobsters, or RISK. Basic idea is to prepare an army to battle your opponents (online users).
Although I have previously worked on multiple mobile apps but this is something new to me. After a lot of struggle, I have come up with an architecture which is illustrated with the help of a high-level flow diagram:
We have decided to go with client-server model. There will be a centralized database on server. Each client will have its own local database which will remain in sync with server. This database acts as a cache for storing things that do not change frequently e.g. maps, products, inventory etc.
With this model in place, I am not sure how to tackle following issues:
What would be the best way of synchronizing server and client databases?
Should an event get saved to local DB before updating it to server? What if app terminates for some reason before saving changes to centralized DB?
Will simple HTTP requests serve the purpose of synchronization?
How to know which users are currently logged in? (One way could be to have client keep on sending a request to server after every x minutes to notify that it is active. Otherwise consider a client inactive).
Are client side validations enough? If not, how to revert an action if server does not validate something?
I am not sure if this is an efficient solution and how it will scale. I would really appreciate if people who have already worked on such apps can share their experiences which might help me to come up with something better. Thanks in advance.
Additional Info:
Client-side is implemented in C++ game engine called marmalade. This is a cross platform game engine which means you can run your app on all major mobile OS. We certainly can achieve threading and which is also illustrated in my flow diagram. I am planning to use MySQL for server and SQLite for client.
This is not a turn based game so there is not much interaction with other players. Server will provide a list of online players and you can battle them by clicking battle button and after some animation, result will be announced.
For database synchronization I have two solutions in mind:
Store timestamp for each record. Also keep track of when local DB
was last updated. When synchronizing, only select those rows that
have a greater timestamp and send to local DB. Keep a isDeleted flag
for deleted rows so every deletion simply behaves as an update. But
I have serious doubts about performance as for every sync request we
would have to scan the complete DB and look for updated rows.
Another technique might be to keep a log of each insertion or update
that takes place against a user. When the client app asks for sync,
go to this table and find out which rows of which table have been
updated or inserted. Once these rows are successfully transferred to
client remove this log. But then I think of what happens if a user
uses another device. According to logs table all updates have been
transferred for that user but actually that was done on another
device. So we might have to keep track of device also. Implementing
this technique is more time consuming but not sure if it out
performs the first one.
I've actually worked on some of the titles you mentioned.
I do not recommend using mysql, it doesn't scale up correctly, even if you shard. If you do you are loosing any benefits you might have in using a relational database.
You are probably better off using a no-sql database. Its is faster to develop, easy to scale and it is simple to change the document structure which is a given for a game.
If your game data is simple you might want to try couchDB, if you need advanced querying you are probably better of with MongoDB.
Take care of security at the start. They will try to hack the game for sure and if you have a number of clients released it is hard to make security changes backward compatible. SSL won't do much as the end user is the problem not an eavesdropper. Signing or encrypting your data will make it harder for a user to add items and gold to their accounts.
You should also define your architecture to support multiple clients without having a bunch of ifs and case statements. Read the client version and dispatch that client to the appropriate codebase.
Have a maintenance mode with flags for upgrading, maintenance, etc. It will cut you some slack if you need to re-shard your DB or any other change that might require downtime.
Client side validations are not enough, specially if using in app purchases. I agree with the above post. Server should control game logic.
As for DB sync, its best to memcache read only data. Typical examples are buyable items, maps, news, etc. User data is harder as you might not be able to afford loosing any modified data. The easiest setup is to cache user data for a couple of hours and write directly to the DB every time. If you are using no-sql it will probably withstand a high load without the need of using a persistence queue.
I see two potential problem hidden in the fact that you store all the state on the client, and then update the state on the server using a background thread.
How can the server validate the data being posted? If someone hacked your application, they could modify the code so whenever they swing their sword (or whatever they do in your game), it is always a hit. Doing that in a single player game is not that big a deal, but doing that in an MMORPG can ruin the experience for everyone else. So the server should validate every update of data - or even better, the server should be in charge of every business rule. So when you swing your sword against an opponent, that should be a server call, and the server returns whether or not it is a hit, and how many hit points the opponent lost.
What about interaction with other players (since you say it is an MMORP, there will be interaction with other players)? Since you say that you update the server, and get updates in a background thread, interaction will be sluggish. When you communicate with another character you have first wait for you background thread to sync data, but you also have to wait on the background thread of the other player to sync data.
Looks nice. But what is the client-side made of ? Web ? Can you use threading to synchronize both DB ? I should make the game in that way that it interacts immediately with the local DB, and let some background mechanism do the sync (something like a snapshot). This leads me to think about mysql replication. I think it is worth to be tried, but I never did. It also brings you answers to other questions. But what about the charge (how many customers are connected together) ?
http://dev.mysql.com/doc/refman/5.0/en/replication.html
Make your client issue commands to the server ("hit player"), and server send (relevant) events to client ("player was killed"). I wouldn't advice going with data synchronization. Server should be responsible for all important game decisions.
For a contact management system web app that allows tennants to upload lists of contact records (having varying field structures) and then displays these back to multiple users (within tennant) one at a time is there a good p/saas database solution to handle this
-it would need to allow uploading lists with custom fields (20K records per list)
-allow updating of fields changed when users edit them (user may update 60 records a minute)
-would need to allow running queries against the lists to determine next record to display (this part utilizes set fields)
Obviously a scalable, easy to use, hassle free as possible design is the aim here.
Will it be easier than developing a local database design?
(Prefer not to use a full paas would like to keep the application tier seperate.)
If your queries are always against the set fields and not the custom fields, then any database would do assuming you keep the "custom" fields in blob format (xml, for example).
If that is the case then your local database design is pretty simple. If you host on amazon ec2, then you can use either of their saas database solutions (mysql or SimpleDB even).
-Dave
I'm building a .NET client application (C#, WinForms) that uses a web service for interaction with the database. The client will be run from remote locations using a WAN or VPN, hence the idea of using a web service rather than direct database access.
The issue I'm grappling with right now is how to handle database concurrency. That is, if two people from different locations update the same data, how do I deal with it? I'm considering using timestamps on each database record and having that as part of the update where clauses, but that means that the timestamps have to move back and forth through the web service interface, which seems kind of ugly.
What is the best way to approach this?
I don't think you want your web service to talk directly to the database. You probably want your service to interact with some type of business components who in turn interact with a data access layer. Any concurrency exceptions can be passed from the DAL up to the business layer where they can be handled so that the web service never has to see the timestamps.
But if you are passing something like a data table up to the client and you want to avoid timestamps, you can do concurrency checking by comparing field by field. The Table Adapter wizards generate this type of concurrency checking by default if you ask for optimistic concurrency checking.
If your collisions occur infrequently enough that they can be resolved manually, a simple solution is to add an update trigger that copies a row's pre-update values to an audit table. This way the most recent write is the "winner", but no data is ever lost to an overwrite, and an administrator can restore an earlier row state or even combine them.
This technique has its downsides, and is not a very good solution where frequent overwrites are common.
Also, this is slightly off-topic, but using web services isn't necessarily the way to go just because the clients will be remoting into the network. ASP.NET web services are XML-based and very verbose. If your client application can count on being always connected, you'd be better off not using web services.