I'm building a mobile application in VB.NET (compact framework), and I'm wondering what the best way to approach the potential offline interactions on the device. Basically, the devices have cellular and 802.11, but may still be offline (where there's poor reception, etc). A driver will scan boxes as they leave his truck, and I want to update the new location - immediately if there's network signal, or queued if it's offline and handled later. It made me think, though, about how to handle offline-ness in general.
Do I cache as much data to the device as I can so that I use it if it's offline - Essentially, each device would have a copy of the (relevant) production data on it? Or is it better to disable certain functionality when it's offline, so as to avoid the headache of synchronization later? I know this is a pretty specific question that depends on my app, but I'm curious to see if others have taken this route.
Do I build the application itself to act as though it's always offline, submitting everything to a local queue of sorts that's owned by a local class (essentially abstracting away the online/offline thing), and then have the class submit things to the server as it can? What about data lookups - how can those be handled in a "Semi-live" fashion?
Or should I have the application attempt to submit requests to the server directly, in real-time, and handle it if it itself request fails? I can see a potential problem of making the user wait for the timeout, but is this the most reliable way to do it?
I'm not looking for a specific solution, but really just stories of how developers accomplish this with the smoothest user experience possible, with a link to a how-to or heres-what-to-consider or something like that. Thanks for your pointers on this!
We can't give you a definitive answer because there is no "right" answer that fits all usage scenarios. For example if you're using SQL Server on the back end and SQL CE locally, you could always set up merge replication and have the data engine handle all of this for you. That's pretty clean. Using the offline application block might solve it. Using store and forward might be an option.
You could store locally and then roll your own synchronization with a direct connection, web service of WCF service used when a network is detected. You could use MSMQ for delivery.
What you have to think about is not what the "right" way is, but how your implementation will affect application usability. If you disable features due to lack of connectivity, is the app still usable? If you have stale data, is that a problem? Maybe some critical data needs to be transferred when you have GSM/GPRS (which typically isn't free) and more would be done when you have 802.11. Maybe you can run all day with lookup tables pulled down in the morning and upload only transactions, with the device tracking what changes it's made.
Basically it really depends on how it's used, the nature of the data, the importance of data transactions between fielded devices, the effect of data latency, and probably other factors I can't think of offhand.
So the first step is to determine how the app needs to be used, then determine the infrastructure and architecture to provide the connectivity and data access required.
I haven't used it myself, but have you looked into the "store and forward" capabilities of the CF? It may suit your needs. I believe it uses an Exchange mailbox as a message queue to send SOAP packets to and from the device.
The best way to approach this is to always work offline, then use message queues to handle sending changes to and from the device. When the driver marks something as delivered, for example, update the item as delivered in your local store and also place a message in an outgoing queue to tell the server it's been delivered. When the connection is up, send any queued items back to the server and get any messages that have been queued up from the server.
Related
The scenario is needing to write high volume data, like tracking clicks or mouse movements, from a web application to a SQL database. The data doesn't need to be written right away because the analysis on the data happens on some recurring basis, like daily or weekly.
I want some feedback on a solution that comes to mind:
The click and mouse data is published to a message queue. This stores the queue items in memory so it should be fast and faster than SQL. Then on some other server a job plugs away on retrieving the next queue item and writing the data to SQL.
Does anyone know of implementations like this? What pitfalls am I failing to see? If this solution is not a good one are there other alternatives?
Regards
RabbitMQ is meant for real time message exchange and not for temporary buffering data. If you are able to consume all data as soon as it arrives in your queues, then this solution will work for you. Otherwise RabbitMQ will grow in memory and eventually die. Then you will have to configure it to throw some data away (there are a lot of options to choose rules for this).
You could possibly store data in Redis cache, you can do it as fast as you publish your events to RabbitMQ. Then you can listen to the new changes in Redis from remote server and fill up whatever database storage you use, or even use it as your data storage.
To solve a very similar problem I was considering doing exactly this. In the end we decided not to go for it because we did need access to the data very quickly. However I still like the idea.
Ive also recently learnt that under the hood this is exactly the way that Microsofft Dynamics CRM does its database updates, using message passing.
Things I think you would need to pay careful attention to.
Make sure that if your RabbitMQ instance disappeared it wouldnt have any affect on your client. Rabbit dying is bad enough, your client erroring because Rabbit is down would be terrible.
If it's truly very high volume (and its good practice for reliability anyway) clustering is something worth looking at.
Obviously paying attention to your deadletter queues is a must. But the ability to play back messages which failed for some reason is awesome, in theory at least your data should eventually always get to you database. Even if it went down for a period of time.
Make sure you can keep up with the number of messages being passed in. Of course, this should be solvable by adding more consumer to a given queue. Which leads to...
Idempotency of messages. Given that your messages relate directly to a DB write, they HAVE to be idempotent.
We have a requirement for notifying external systems of changes in data in various tables in a SQL Server database. The choice of which data to monitor is somewhat under the control of the user (gets to choose from a list of what we support). The recipients of the notifications may be on a locally connected network (i.e., in the same data center) or they may be remote.
We currently handle this by application code within our data access layer that detects changes and queues notifications on a Service Broker queue which is monitored by a Windows service that performs the actual notification. Not quite real time but close enough.
This has proven to have some maintenance problems so we are looking at using one of the change detection mechanisms that are built into SQL Server. Unfortunately none of the ones I have looked at (I think I looked at them all) seem to fit very well:
Change Data Capture and Change Tracking: Major problem is that they require polling the captured information to determine changes that are to be passed on to recipients. I suspect that will introduce too much overhead.
Notification Services: Essentially uses SQL Server as a web server, which is a horrible waste of licenses. It also requires access through at least two firewalls in the network, which is unacceptable from a security perspective.
Query Notification: Seems the most likely candidate but does not seem to lend itself particularly well to dynamically choosing the data elements to watch. The need to re-register the query after each notification is sent means that we would keep SQL Server busy with managing the registrations
Event Notification: Designed to notify on database or instance level events, not really applicable to data change detection.
About the best idea I have come up with is to use CDC and put insert triggers on the change data tables. The triggers would queue something to a Service Broker queue that would be handled by some other code to perform the notifications. This is essentially what we do now except using a SQL Server feature to do the change detection. I'm not even sure that you can add triggers to those tables but I thought I'd get feedback before spending a lot of time with a POC.
That seems like an awful roundabout way to get the job done. Is there something I have missed that will make the job easier or have I misinterpreted one of these features?
Thanks and I apologize for the length of this question.
Why don't you use update and insert triggers? A trigger can execute clr code, which is explained enter link description here
Scenario
The DB for an application has gone down. This results in any actor responsible for committing important data to the DB failing to get a connection
Preferred Behaviour
The important data is written to the db when it comes back up sometime in the future.
Current Implementation
The actor catches the DBException, wraps the data in a DBWriteFailed case class, and sends the message to its supervisor. The supervisor then schedules another write for sometime in the future (e.g. 1 minute) using system.scheduler.scheduleOnce(...) so that we don't spin in circles too much while waiting for the DB to come back up.
This implementation certainly works but I feel there might be a better way.
The protocol gets a bit messier when the committing actor has to respond to the original sender after a successful commit.
The regular flow of messages to the committing actor is not throttled in any way and the actor will happily process the new messages, likely failing to connect to the DB for each and every one of them.
If messages get caught in this retry loop for too long, the mailboxes of the committing actors will start to balloon. It is important that this data be committed, but none of it matters if the application crawls to a halt or crashes due to excessive memory usage.
I am an akka novice and I am largely inexperienced when it comes to supervisor strategies, but I feel as though I may be able to leverage one of those to handle some of this retry logic.
Is there a common approach in akka for solving a problem like this? Am I on the right track or should I be heading in a different direction?
Any help is appreciated.
You can use Akka Circuit Breaker to reduce connection attempts. Instead of using the scheduler as retry queue I would use a buffer (with max size limit) inside the actor and retry those when circuit breaker becomes closed again (onClose callback should send message to self actor). An alternative could be to combine the circuit breaker with a stashing mailbox.
If you're planning to implement full failover in your app
Don't.
Do not bubble database failover responsibility up into the app layer. As far as your app is concerned, the database should just be up and ready to accept reads and writes.
If your database goes down often, spend time making your database more robust (there's a multitude of resources on the web already for this: search the web for terms like 'replication', 'high availability', 'load-balancing' and 'clustering', and learn from the war stories of others at highscalability.com). It all really depends on what the cause of your DB outages are (e.g. I once maxed out the NIC on the DB master, and "fixed" the problem intermittently by enabling GZIP on the wire).
You'll be glad you adhered to a separation of concerns if you go down this route.
If you're planning to implement the odd sprinkling of retry logic and handling DB brown-outs
If you're not expecting your app to become a replacement database, then Patrik's answer is the best way to go.
I have a CRUD webservice, and have been tasked with trying to figure out a way to ensure that we don't lose data when the database goes down. Everyone is aware that if the database goes down we won't be able to get "reads" but for a specific subset of the operations we want to make sure that we don't lose data.
I've been given the impression that this is something that is covered by services like 0MQ, RabbitMQ, or one of the Microsoft MQ services. Although after a few days of reading and research, I'm not even certain that the messages we're talking about in MQ services include database operations. I am however 100% certain that I can queue up as many hello worlds as I could ever hope for.
If I can use a message queue for adding a layer of protection to the database, I'd lean towards Rabbit (because it appears to persist through crashes) but since the target is a Microsoft SQL server databse, perhaps one of their solutions (such as SQL Service Broker, or MSMQ) is more appropriate.
The real fundamental question that I'm not yet sure of though is whether I'm even playing with the right deck of cards (so to speak).
With the desire for a high-availablity webservice, that continues to function if the database goes down, does it make sense to put a Rabbit MQ instance "between" the webservice and the database? Maybe the right link in the chain is to have RabbitMQ send messages to the webserver?
Or is there some other solution for achieving this? There are a number of lose ideas at the moment around finding a way to roll up weblogs in the event of database outage or something... but we're still in early enough stages that (at least I) have no idea what I'm going to do.
Is message queue the right solution?
Introducing message queuing in between a service and it's database operations is certainly one way of improving service availability. Writing to a local temporary queue in a store-and-forward scenario will always be more available than writing to a remote database server, simply by being a local operation.
Additionally by using queuing you gain greater control over the volume and nature of database traffic your database has to handle at peak. Database writes can be queued, routed, and even committed in a different order.
However, in order to do this you need to be aware that when a database write is performed it is processed off-line. Even under conditions where this happens almost instantaneously, you are losing a benefit that the synchronous nature of your current service gives you, which is that your service consumers can always know if the database write operation is successful or not.
I have written about this subject before here. The user posting the question had similar concerns to you. Whether you do this or not is a decision you have to make based on whether this is something your consumers care about or not.
As for the technology stacks you are thinking of this off-line model is implementable with any of them pretty much, with the possible exception of Service broker, which doesn't integrate well with code (see my answer here: https://stackoverflow.com/a/45690344/569662).
If you're using Windows and unlikely to need to migrate, I would go for MSMQ (which supports durable messaging via transactional queues) as it's lightweight and part of Windows.
What is the best way to program an immediate reaction to an update to data in a database?
The simplest method I could think of offhand is a thread that checks the database for a particular change to some data and continually waits to check it again for some predefined length of time. This solution seems to be wasteful and suboptimal to me, so I was wondering if there is a better way.
I figure there must be some way, after all, a web application like gmail seems to be able to update my inbox almost immediately after a new email was sent to me. Surely my client isn't continually checking for updates all the time. I think the way they do this is with AJAX, but how AJAX can behave like a remote function call I don't know. I'd be curious to know how gmail does this, but what I'd most like to know is how to do this in the general case with a database.
Edit:
Please note I want to immediately react to the update in the client code, not in the database itself, so as far as I know triggers can't do this. Basically I want the USER to get a notification or have his screen updated once the change in the database has been made.
You basically have two issues here:
You want a browser to be able to receive asynchronous events from the web application server without polling in a tight loop.
You want the web application to be able to receive asynchronous events from the database without polling in a tight loop.
For Problem #1
See these wikipedia links for the type of techniques I think you are looking for:
Comet
Reverse AJAX
HTTP Server Push
EDIT: 19 Mar 2009 - Just came across ReverseHTTP which might be of interest for Problem #1.
For Problem #2
The solution is going to be specific to which database you are using and probably the database driver your server uses too. For instance, with PostgreSQL you would use LISTEN and NOTIFY. (And at the risk of being down-voted, you'd probably use database triggers to call the NOTIFY command upon changes to the table's data.)
Another possible way to do this is if the database has an interface to create stored procedures or triggers that link to a dynamic library (i.e., a DLL or .so file). Then you could write the server signalling code in C or whatever.
On the same theme, some databases allow you to write stored procedures in languages such as Java, Ruby, Python and others. You might be able to use one of these (instead of something that compiles to a machine code DLL like C does) for the signalling mechanism.
Hope that gives you enough ideas to get started.
I figure there must be some way, after
all, web application like gmail seem
to update my inbox almost immediately
after a new email was sent to me.
Surely my client isn't continually
checking for updates all the time. I
think the way they do this is with
AJAX, but how AJAX can behave like a
remote function call I don't know. I'd
be curious to know how gmail does
this, but what I'd most like to know
is how to do this in the general case
with a database.
Take a peek with wireshark sometime... there's some google traffic going on there quite regularly, it appears.
Depending on your DB, triggers might help. An app I wrote relies on triggers but I use a polling mechanism to actually 'know' that something has changed. Unless you can communicate the change out of the DB, some polling mechanism is necessary, I would say.
Just my two cents.
Well, the best way is a database trigger. Depends on the ability of your DBMS, which you haven't specified, to support them.
Re your edit: The way applications like Gmail do it is, in fact, with AJAX polling. Install the Tamper Data Firefox extension to see it in action. The trick there is to keep your polling query blindingly fast in the "no news" case.
Unfortunately there's no way to push data to a web browser - you can only ever send data as a response to a request - that's just the way HTTP works.
AJAX is what you want to use though: calling a web service once a second isn't excessive, provided you design the web service to ensure it receives a small amount of data, sends a small amount back, and can run very quickly to generate that response.