I have two PostgreSQL databases configured for replication. The master node sends new data in real time to the secondary node, and it works just fine. But there is one disadvantage: the secondary node is read-only, so it accepts no write requests.
What I need is to be able to perform write/read operations on both of the databases and still be able to have them in perfect sync, so they are identical.
What is the best solution for such requirements?
Just for the context of this question:
I have two web app instances deployed in two different locations in the world (the very high latency between them is the reason I decided to deploy one instance locally in each location). Both instances fetch the same data, but they can also generate data and write it to the DB. Having only one DB is not an option because the latency when fetching data would be too high.
Maybe my solution is not perfect. I'm open to any suggestion, really, because I'm out of ideas on how to make it work smoothly, and maybe I'm lacking some knowledge.
Thanks
I am having a problem and I need your help.
I am working with Play Framework v1.2.4 in Java, and my app is deployed on Heroku.
Everything works fine and I can access my databases, but I am experiencing trouble when I do a couple of saves to the database.
I have a method that stores data in the database several times and then sends a notification to a mobile phone. My problem is that the notification arrives before the database has finished saving the data: when it arrives, I request the updated data from the server, and it returns the data without the last update. When I try again a few seconds later, the data shows correctly, so I think there is a timing problem.
The idea would be that the server sends the notification only once the database has finished saving the data.
I don't know if this is caused by my using the free version of Heroku, but I want to be sure before purchasing a paid plan.
In general, requests to cloud databases are slower than the same requests against a database on your local machine. Even a simple query that needs just 0.0001 sec on your computer can take as long as 0.5 sec in the cloud. The reason is simple: cloud providers use shared databases plus (geo)replication, which just cannot be compared to a database accessed by a single program on the same machine.
Also keep in mind that the free Heroku DB plans don't offer ANY database cache, which means that every query is fetched from the cloud directly.
As we don't know your application, it's hard to say what the bottleneck is. Anyway, you almost certainly have at least 3 ways to solve your problem. They are not alternatives; you will probably need to use (or at least check) all of them.
Risk a basic plan and see how things change with the paid version. Maybe it will be good enough for you, maybe not.
Redesign your application to make fewer queries. For example, instead of sending 10 queries to select 10 different rows, send one query that selects all 10 records at once.
Use Play's cache API to avoid selecting the same set of data again and again. For example, if you have some categories that rarely change but you need the category tree for each article, you don't need to fetch the categories from the DB every time. Instead you can store a List of categories in the cache, so you only need one request to fetch the article's content (which can be cached for a short time as well...).
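To make the "fewer queries" point concrete, here is a minimal sketch in Python with an in-memory SQLite database. It is not Play or Heroku code: the `articles` table and the helper names `fetch_one_by_one`, `fetch_batched` and `make_demo_db` are made up for illustration. It only shows that one `IN (...)` query returns the same rows as ten separate round trips.

```python
import sqlite3


def make_demo_db():
    """Build a throwaway in-memory database (hypothetical 'articles' table)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT)")
    conn.executemany(
        "INSERT INTO articles (id, title) VALUES (?, ?)",
        [(i, f"article {i}") for i in range(1, 21)],
    )
    conn.commit()
    return conn


def fetch_one_by_one(conn, ids):
    """The pattern to avoid: one SELECT (one round trip) per id."""
    rows = []
    for row_id in ids:
        cur = conn.execute(
            "SELECT id, title FROM articles WHERE id = ?", (row_id,)
        )
        rows.extend(cur.fetchall())
    return rows


def fetch_batched(conn, ids):
    """One round trip: a single SELECT ... WHERE id IN (?, ?, ...)."""
    if not ids:
        return []
    # Only the '?' placeholders are interpolated; values stay parameterized.
    placeholders = ", ".join("?" for _ in ids)
    sql = (
        "SELECT id, title FROM articles "
        f"WHERE id IN ({placeholders}) ORDER BY id"
    )
    return conn.execute(sql, list(ids)).fetchall()
```

Against a remote database each round trip pays the network latency, so the batched form pays it once instead of once per row.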
We have an affiliate system which counts millions of banner Impressions/Clicks per day.
Currently it writes to SQL every Impression/Click that occurs in real time on each request.
Web application serves these requests.
We are facing two problems:
If we have a lot of concurrent requests per second, the SQL server starts working very hard to insert the Impressions/Clicks data, which leads to problem #2.
If the SQL server is slow at the moment, requests accumulate and wait in the queue on the web server. As a result the web application slows down and requests are not processed.
Design we thought of in high level:
We are now considering changing the design by moving the SQL-writing logic out of the web application (which would write to some local storage instead) and creating a standalone service that reads from the local storage and eventually writes the aggregated Impressions/Clicks data (not in real time) to SQL in the background.
Our constraints:
10 web servers (load balanced)
1 SQL server
What do you think of the suggested design?
Would you use NoSQL as local storage for each web server?
Suggest your alternative.
Your problem seems to be that your front-end code is synchronously blocking while waiting for the back-end code to update the database.
Decouple the front-end and back-end, e.g. by putting a queue in between, so that the front-end can write to the queue with low latency and high throughput. The back-end can then take its time to process the queued data into its destinations.
It may or may not be necessary to make the queue restartable (i.e. not losing data after a crash). Depending on this, you have various options:
In-memory queue, speedy but not crash-proof.
Database queue, makes sense if writing the raw request data to a simple data structure is faster than writing the final data into its target data structures.
Redundant queues, to cover for crashes.
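As a rough illustration of the first option (in-memory queue, fast but not crash-proof), here is a minimal Python sketch. The `EventWriter` class, the `counts` table and the `batch_size` parameter are assumptions made up for this example, and SQLite stands in for the real SQL server. The request path only calls `record()`, which is a cheap in-memory put; a background thread aggregates the events and writes them in batches.

```python
import queue
import sqlite3
import threading
from collections import Counter

_STOP = object()  # sentinel that tells the worker to flush and exit


class EventWriter:
    """Hypothetical helper: buffers events in memory, writes aggregates later."""

    def __init__(self, db_path, batch_size=100):
        self._queue = queue.Queue()
        self._db_path = db_path
        self._batch_size = batch_size
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def record(self, banner_id, kind):
        """Called from the request path; never touches the database."""
        self._queue.put((banner_id, kind))

    def stop(self):
        """Ask the worker to flush whatever is pending and wait for it."""
        self._queue.put(_STOP)
        self._thread.join()

    def _run(self):
        # The connection is created and used only inside the worker thread.
        conn = sqlite3.connect(self._db_path)
        conn.execute(
            "CREATE TABLE IF NOT EXISTS counts ("
            "banner_id INTEGER, kind TEXT, total INTEGER, "
            "PRIMARY KEY (banner_id, kind))"
        )
        pending = Counter()
        running = True
        while running:
            item = self._queue.get()
            if item is _STOP:
                running = False
            else:
                pending[item] += 1
            batch_full = sum(pending.values()) >= self._batch_size
            if pending and (batch_full or not running):
                self._flush(conn, pending)
                pending.clear()
        conn.close()

    @staticmethod
    def _flush(conn, pending):
        # One transaction per batch instead of one INSERT per event.
        with conn:
            for (banner_id, kind), n in pending.items():
                conn.execute(
                    "INSERT OR IGNORE INTO counts VALUES (?, ?, 0)",
                    (banner_id, kind),
                )
                conn.execute(
                    "UPDATE counts SET total = total + ? "
                    "WHERE banner_id = ? AND kind = ?",
                    (n, banner_id, kind),
                )
```

Anything still sitting in the queue is lost if the process crashes, which is exactly the trade-off named in the list above; the database-queue and redundant-queue options exist to close that gap.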
I'm with Bernd, but I'm not sure about using a queue specifically.
All you need is something asynchronous that you can call; that way the act of logging the impression is pretty much redundant.
I have very limited experience of database programming, and my applications that access databases are simple ones :). Until now :(. I need to create a medium-size desktop application (it's called a rich client?) that will use a database on the network to share data between multiple users. Most probably I will use C# and MSSQL/MySQL/SQLite.
I have performed a few test drives and discovered that on low-quality networks database access is not so smooth. On one company's LAN a lot of data is transferred over the network and the servers are under constant load, so it's common for a simple INSERT or SELECT SQL query to take 1-2 minutes or even fail with a timeout / network error.
Are there any best practices for handling such situations? Of course I can split my app into a GUI thread and a DB thread so network problems will not freeze the GUI. But what to do about lots of network errors? Displaying them to the user too often would not be very good :(. I'm thinking about automatically creating a local copy of the database on each computer my app is running on: updating the local database first and synchronizing it in the background, simply retrying on network errors. This would allow the app to function even if the network has great lags / problems.
Any hints and buzzwords I can look into? Maybe there are some best practices already available that I don't know about :)
Sorry, this is probably not the answer you are looking for, but you mention that a simple insert / update could take 1-2 minutes or even fail with a timeout / network error.
To me this sounds like there may be another problem rather than the network itself. If you're working on a corporate network, there would have to be insane levels of traffic for this sort of behavior. I would do everything in your power to look at improving the network before proceeding. Can you post the result of a ping to the db box?
If you're going to architect your application around this type of network, it will significantly alter the end product and possibly even result in a poor-quality product for other clients.
Depending upon the nature of the application maybe look at implementing an async persistence queue and caching data on startup or even embedding a copy of the db into your application.
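A minimal sketch of such an async persistence queue, written in Python for brevity rather than C#. The `LocalOutbox` class and the `send` callable are hypothetical: writes always land in a local SQLite table first, and `sync()` (which a background thread or timer would call) pushes pending rows upstream, leaving them in place when the network fails so that a later retry can pick them up.

```python
import sqlite3


class LocalOutbox:
    """Hypothetical local-first store: save locally now, push upstream later."""

    def __init__(self, path=":memory:"):
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS outbox ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT NOT NULL)"
        )
        self._conn.commit()

    def save(self, payload):
        """Always succeeds locally, regardless of network state."""
        with self._conn:
            self._conn.execute(
                "INSERT INTO outbox (payload) VALUES (?)", (payload,)
            )

    def pending(self):
        """Payloads not yet confirmed upstream, oldest first."""
        cur = self._conn.execute("SELECT payload FROM outbox ORDER BY id")
        return [payload for (payload,) in cur]

    def sync(self, send):
        """Push pending rows with send(payload); stop at the first network error.

        Returns the number of rows sent. Rows are deleted only after send()
        returns, so delivery is at-least-once: the server side should
        tolerate an occasional duplicate.
        """
        rows = self._conn.execute(
            "SELECT id, payload FROM outbox ORDER BY id"
        ).fetchall()
        sent = 0
        for row_id, payload in rows:
            try:
                send(payload)
            except OSError:  # includes ConnectionError and TimeoutError
                break
            with self._conn:
                self._conn.execute("DELETE FROM outbox WHERE id = ?", (row_id,))
            sent += 1
        return sent
```

With this shape the user only needs to see an error when the outbox keeps growing for a long time, not on every failed attempt.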
Even though async behaviour/queues/caching/copying the database to each local instance etc will help solve the symptoms, the problem will still remain. If the network really is that bad then I'd address it with their I.T. department, or the project manager and build some performance requirement from their side of things into the contract.
On the memcached website it says that memcached is a distributed memory cache. It implies that it can run across multiple servers and maintain some sort of consistency. When I make a request in google app engine, there is a high probability that request in the same entity group will be serviced by the same server.
My question is: say there were two servers servicing my requests, is the view of memcached from these two servers the same? That is, are things I put in memcached on one server reflected in the memcached instance on the other server, or are these two completely separate memcached instances (one for each server)?
Specifically, I want each server to actually run its own instance of memcached (no replication in other memcached instances). If it is the case that these two memcached instances update one another concerning changes made to them, is there a way to disable this?
I apologize if these questions are stupid, as I just started reading about it, but these are initial questions I have run into. Thanks.
App Engine does not really use memcached, but rather an API-compatible reimplementation (chiefly by the same guy, I believe -- in his "20% time";-).
Cached values may disappear at any time (via explicit expiration, a crash in one server, or due to memory scarcity in which case they're evicted in least-recently-used order, etc), but if they don't disappear they are consistent when viewed by different servers.
The memcached server chosen doesn't depend on the entity group that you're using (the entity group is a concept from the datastore, a different beast).
Each server runs its own instance of memcached, and each server will store a percentage of the objects that you store in memcache. The way it works is that when you use the memcached API to store something under a given key, a memcached server is chosen (based on the key).
There is no replication between memcached instances. If one of those boxes goes down, you lose 1/N of your memcached data (N being the number of memcached instances running in App Engine).
Typically, memcached does not share data between servers. The application server hashes the key to choose a memcached server, and then communicates with that server to get or set the data.
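The key-to-server selection described above can be sketched in a few lines of Python. This is a toy model, not what App Engine or any particular memcached client actually does: `pick_server` and `ShardedCache` are hypothetical names, plain dicts stand in for the servers, and a simple hash-modulo scheme is assumed (real clients often use consistent hashing instead).

```python
import hashlib


def pick_server(key, servers):
    """Map a key to exactly one server by hashing the key (modulo scheme)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:4], "big") % len(servers)
    return servers[index]


class ShardedCache:
    """Toy client: each 'server' is a dict, and there is no replication."""

    def __init__(self, server_names):
        self._names = list(server_names)
        self._stores = {name: {} for name in self._names}

    def set(self, key, value):
        self._stores[pick_server(key, self._names)][key] = value

    def get(self, key):
        return self._stores[pick_server(key, self._names)].get(key)

    def drop_server(self, name):
        """Simulate one box going down: only its share of the keys is lost."""
        self._stores[name].clear()
```

Because every client computes the same hash for the same key, all of them see the same value without the servers ever talking to each other.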
Based on what I know, there is only ONE Memcache instance for your entire application. There can be many instances of your code running, each with its own memory, and many datastores around the world, but there is only one Memcache service at a time. Keep in mind that this service is susceptible to failure, and there is not even an SLA for it.
Has anyone had any experience scaling out SQL Server in a multi-reader, single-writer fashion? If not, can anyone suggest a suitable alternative for a read-intensive web application that they have experience with?
It depends on probably 2 things:
How big is each single write?
Do readers need real time data?
A write will block readers when writing, but if each write is small and fast then readers won't notice.
If you offload, say, end-of-day reporting, then you batch your load onto a separate server because those readers do not require real-time data. This makes sense.
A write on your primary server must be synched to your offload secondary server... which will block there as part of the synch process anyway + you add an overhead load to manage the synch.
Most apps are 95%+ read anyway all the time. For example, an update or delete is a read followed by a write.
My choice would be (probably, based on the low write volume and it's a web app) to scale up and stuff as much RAM as I could in the DB server with separate disk paths for the data and log files of the database.
I don't have any experience with scaling out SQL Server for your scenario.
However for a Read-Intensive application, I would be looking at reducing the load on the database and employ a Cache Strategy using something like Memcache or MS Velocity
There are two approaches that I'm aware of:
Have the entire database loaded into the Cache and manage Adding and Updating of items in the cache.
Add items to the cache only when they are requested and remove them when a write operation is performed.
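The second approach (populate on read, invalidate on write) might look roughly like the Python sketch below. Plain dicts stand in for both the database and the cache server, and the `CacheAsideStore` class is a made-up name for illustration; it is not the Memcache or Velocity API.

```python
class CacheAsideStore:
    """Toy cache-aside wrapper: fill the cache on read, evict on write."""

    def __init__(self, database):
        self._db = database   # a dict standing in for the real database
        self._cache = {}      # a dict standing in for the cache server
        self.db_reads = 0     # counts how often a read had to hit the database

    def read(self, key):
        if key in self._cache:
            return self._cache[key]      # cache hit: no database access
        self.db_reads += 1
        value = self._db.get(key)
        if value is not None:
            self._cache[key] = value     # cache only what was requested
        return value

    def write(self, key, value):
        self._db[key] = value
        self._cache.pop(key, None)       # invalidate; the next read reloads it
```

Evicting on write rather than updating the cached value keeps the write path simple, at the cost of one extra database read after each change.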
Some kind of replication would do the trick.
http://msdn.microsoft.com/en-us/library/ms151827.aspx
You of course need to change your app code.
Some people use partitioned tables, with different row ranges stored on different servers and united with views. This would be invisible to the app. I think this practice is called federation.
By designing your database, application and server configuration (SQL particulars - location of data/log/system/sql binaries/tempdb), you should be able to handle a pretty good load. Try not to complicate things if you don't have to.