I'm interested in taking advantage of Amazon's managed database (RDS), but, at the same time, I'd like my web application to run on-premises or on another cloud provider that offers data centers near me (less latency, as my application does not always have to fetch data from the DB).
Is this scenario common? Would it make sense, or is Amazon RDS supposed to be run with instances running in Amazon?
If you're looking to reduce latency, this is probably not your best option, as DB performance is going to be pretty bad (very large latencies between the web application and the DB server, basically cancelling out any advantages of having the app server as close to your clients as possible).
I've actually had to test a similar configuration, with a DB server in Europe and an app server in the US, and the performance was much worse than having both in either of the two regions.
I've recently moved the database for our web application to RDS whilst still having our application servers hosted on-premises. Eventually the entire application will live on AWS, but it was too big a move to do all components at the same time. I'm only dealing with latency between Perth and Sydney in Australia, which is about 50ms, and this is working fine for us.
I don't recommend that you adopt this as a permanent configuration. You'll get much better performance from hosting your entire stack at AWS as opposed to keeping parts separated.
Assume we are using a microservice architecture for a product, we decide to use the 'Database per service' model, and we deploy on cloud servers from a provider like AWS.
It is convenient to have databases running as containers for development and test environments.
But can the same be done for the production environment? If so, how safe would it be?
Or is it more appropriate to go with a managed cloud solution such as AWS RDS instead?
This blog post lists some reasons why you should not run production databases in containers. It also references another blog post describing problems with updating docker and unstable storage drivers.
The main points here for me boil down to this:
Dodgy storage drivers. This may be less of a problem when you write your database state to the host system but Docker for example explicitly encourages users to use volumes for exactly that (see the docs: Citation: "Volumes are the best way to persist data in Docker"). It may just work fine under normal circumstances, but what about the edge-cases like power-failures or read-errors for example?
Managing databases in production is hard. Many companies employ full-time DBAs to ensure smooth operation of production databases. The devops paradigm (every dev creates a plethora of DB servers in containers) makes it nearly impossible for a DBA to do his job. That is if the DBA even has access to these DBs.
In conclusion: Containers are fine for certain tasks and a bad idea for others. Running production databases in containers is one of those bad ideas.
We containerise our db in production (on-premises enterprise application). Many do. It's perfectly stable and the deployment is much simplified. Of course our db is not under stress; we're dealing with hundreds of concurrent users, not tens of thousands. We just make sure that the container has enough RAM and is monitored well.
If we did need to dedicate an entire VM to the db alone, then yes I would skip docker.
According to the link below, it is not a good idea to use a database container in production.
But in my experience, if you isolate the database container from your app, update the container regularly, and manage the networking properly, there seems to be no problem.
Link: https://www.quora.com/Is-it-not-advisable-to-use-database-in-Docker-container
As you are using the database-per-service model for your microservices, a suitable solution in production can be AWS RDS for the databases. You then have two approaches:
You can create a single RDS instance and host a separate database for each service on it. This saves a lot of cost, but you need to take care of the number of database connections and the load on the instance, and choose the RDS instance type (4xlarge, for example) accordingly: the larger the instance type, the more connections it allows and the more database load it can handle effectively.
The second approach is to create one RDS instance per microservice, so that each service uses its own instance independently. This is not a cost-effective solution: it incurs a lot of cost and will under-utilize the RDS instances.
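The connection budgeting for a shared instance can be sketched roughly as follows. This is only an illustration: the service names, replica counts, pool sizes, and the connection limit are hypothetical values, not RDS defaults, and the real limit depends on the instance class and engine settings.

```python
from dataclasses import dataclass


@dataclass
class ServicePool:
    """Connection pool settings for one microservice (all values hypothetical)."""
    name: str
    replicas: int    # number of running copies of the service
    pool_size: int   # connections each copy keeps open


def total_connections(pools):
    """Worst-case number of connections the shared instance must accept."""
    return sum(p.replicas * p.pool_size for p in pools)


def fits_on_instance(pools, max_connections, headroom=0.2):
    """True if all pools fit while keeping a fraction of connections spare."""
    usable = int(max_connections * (1 - headroom))
    return total_connections(pools) <= usable


pools = [
    ServicePool("orders", replicas=4, pool_size=10),
    ServicePool("billing", replicas=2, pool_size=10),
    ServicePool("catalog", replicas=6, pool_size=5),
]

# Assumed limit for illustration only.
ASSUMED_MAX_CONNECTIONS = 150

print(total_connections(pools))
print(fits_on_instance(pools, ASSUMED_MAX_CONNECTIONS))
```

If the check fails, the options are the ones described above: a larger instance type, smaller pools, or moving the busiest service to its own instance.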
I have an application in Rails that displays a lot of information to the user.
Using New Relic, I notice that the database is working intensively and that this will probably limit my ability to scale (assume for now that the SQL is fine).
Is there a way I can have several databases which will be in sync, and the requests will be load-balanced between them?
Does Heroku provide such a system?
Maybe more importantly - Should I rely on Heroku for an app which needs to scale? (is the architecture one web server connects to one database server or can it do more?)
Look into Heroku follower databases.
https://devcenter.heroku.com/articles/heroku-postgres-follower-databases
It will keep your databases in sync; for load balancing you will need to configure Octopus.
Moreover, scalability is quite easy: at the application level you just increase the number of dynos, and on the database side there are multiple models (with different cache sizes) and it is quite easy to switch between them (with negligible downtime).
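The general idea of sending writes to the primary and spreading reads across followers can be sketched as below. This is not Octopus configuration; it is a plain Python illustration of the routing decision, and the `ReadWriteRouter` class and the URLs are hypothetical.

```python
import itertools


class ReadWriteRouter:
    """Send writes to the primary and rotate reads across followers."""

    def __init__(self, primary_url, follower_urls):
        self.primary_url = primary_url
        # Fall back to the primary when no followers are configured.
        self._readers = itertools.cycle(follower_urls or [primary_url])

    def url_for(self, sql):
        """Pick a database URL based on whether the statement is a read."""
        is_read = sql.lstrip().lower().startswith("select")
        return next(self._readers) if is_read else self.primary_url


router = ReadWriteRouter(
    "postgres://primary.example/app",  # hypothetical URLs
    ["postgres://follower-1.example/app", "postgres://follower-2.example/app"],
)

print(router.url_for("SELECT * FROM users"))
print(router.url_for("UPDATE users SET name = 'x'"))
```

Followers can lag slightly behind the primary, so a read that must see a row that was just written is usually better sent to the primary.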
thanks
We're getting ready to build a new platform for our current system. Currently we install SQL Server Express locally for all our clients and all their data is stored there. While the process works pretty well, it's still a pain to add columns/tables etc. We also want to have our data available outside of the local install. So we're moving to a central web-based SQL database and creating a web-based application. Our new application will be a Silverlight 5, WCF RIA Services, MVVM, Entity Framework application.
We've decided that either a web hosted sql server database or sql azure database are the way to go. However, I have no idea why I would choose one over the other. The limitations of azure don't seem to apply to us, but our application will be run on our current shared web host. Is it better to host the application on the same server as the database? Do we even know with shared web hosting that the server is on the same location as the app? There's also the marketing advantage of being 'in the cloud' which our clients love when we drop that word (they have no idea about anything technical, it's just a buzzword for them). I'm not too worried about the cost as I think both will ultimately be about the equivalent of each other.
I feel like I may be completely overthinking this and either will work, however I'd like to try and get the best solution for us and don't want to choose without getting some feedback.
In case it helps, our application is mostly dashboard/informational data. Mostly financial and trending data. It's almost entirely read only. Sometimes the data can get fairly large and we would be sending upwards of 50,000 rows of data to the application.
Thanks for any help/insight you can provide for me!
The main concerns I would have with using a SQL Azure DB from an application on your current shared web host would be
The effect of network latency: Depending on location, every time you do a DB round trip from your application to the SQL Azure DB you will incur a 50-100ms delay. If your application does lots of round trips, this will mount up. Often, if an application has been designed to work with a DB on the LAN (your use of local client DBs suggests this), it tends to get "chatty", since network delays are very small on the LAN. You may find your application slows down significantly.
Security: You will have to open up the SQL Azure firewall to the IP address(es) that your application presents when querying. Depending on your host, it may be that this IP address is shared between several tenants. This would be a vulnerability.
If neither of these is a problem, then SQL Azure will provide a much lower management overhead (e.g. no need to patch etc.) and will give you very high reliability, especially in terms of the risk of data loss.
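To make the latency concern concrete, here is a back-of-the-envelope calculation. The 50-100ms figures come from the point above; the number of queries per page and the LAN round-trip time are assumptions chosen for illustration.

```python
def added_latency_ms(round_trips, rtt_ms):
    """Time spent waiting on the network for sequential DB round trips."""
    return round_trips * rtt_ms


# Hypothetical "chatty" page that issues 40 sequential queries.
QUERIES_PER_PAGE = 40

for label, rtt_ms in [
    ("LAN (assumed 0.5 ms)", 0.5),
    ("remote DB, low end", 50),
    ("remote DB, high end", 100),
]:
    total = added_latency_ms(QUERIES_PER_PAGE, rtt_ms)
    print(f"{label}: {total:.0f} ms per page")
```

Under these assumptions, a page that spends about 20 ms waiting on a LAN spends 2 to 4 seconds waiting on the remote database, which is why batching queries or caching matters much more once the database is remote.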
I am planning to host my iPhone game on Amazon AWS. Basically my game just needs a database, and currently I am using MySQL (a relational database) to store user data.
I am new to Amazon AWS, and I have read some of the articles. This page: http://aws.amazon.com/running_databases/ provides some available choices for databases.
RDS (relational database services)
EC2 with Relational Database AMI (it has mysql)
simpleDB
I think I will skip SimpleDB because, having read the sample code, the database structure is rather different from a relational DB: no joins, and all data stored as strings. The game I am currently developing is already in relational form, with all the PHP code already written. Maybe I could consider it for a future project.
That leaves RDS and EC2. Which one should I use, comparing cost, performance, reliability and stability? My game server requirements:
MySQL database (as I am only familiar with this database engine and the game is already half developed, I have no time to rewrite it or learn a new language)
Easy to scale
Load balancing
Automatic backup
(if possible, less maintenance work in future)
Please give me some advice, thank you very much.
As you have already chosen MySQL on AWS, the only question is whether you want to host the database server on an EC2 instance or use the AWS RDS service.
Comparing cost, performance, reliability and stability, and given your game server requirements:
MySQL database
Easy to scale
Load balancing
Automatic backup
(if possible, less maintenance work in future)
AWS RDS would be the BEST option.
Once you scale the environment, hosting the database on an instance yourself can become complex and needs a lot of administration and maintenance, while AWS RDS makes it easy for you.
Hope It Helps.. :)
If you need EC2 instance(s) anyway (for web hosting for instance) then hosting MySQL on an EC2 instance that you are already paying for is going to be cheaper...
But as your load goes up I would definitely look towards RDS for easier scaling, reduced admin overhead, a better disaster recovery story, etc... No reason in my opinion to host MySQL on dedicated EC2 instances...
For your requirements of load balancing and easy scaling you will need a dedicated instance for the database. The EC2 instances hosting your game would sit behind a load balancer, and they would all connect to one database on the dedicated instance.
That dedicated instance hosting your database could be an RDS instance or an EC2 instance. RDS is expensive but has its benefits.
I run a very high-traffic (10m impressions a day), high-revenue web site built with .NET. The core meta data is stored on a SQL Server. My team and I have a unique caching strategy that involves querying the database for new meta data at regular intervals from a middle tier server, serializing the data to files and sending those to the web nodes. The web application uses the data in these files (some are actually serialized objects) to instantiate objects and caches those in memory to use for real time requests.
The advantage of this model is that it:
Allows the web nodes to cache all data in memory and not incur any IO overhead querying a database.
If the database ever goes down either unexpectedly or for maintenance windows, the web servers will continue to run and generate revenue. You can even fire up a web server without having to retrieve its initial data from the DB because all the data it needs are in files on its own disks.
Allows us to be completely horizontally scalable. If throughput suffers, we can just add a web server.
The disadvantage is that this caching and persistence layer adds complexity in the code that queries the database, packages the data and unpackages it on the web server. Any time our domain model requires us to add entities, more of this "plumbing" has to be coded. This architecture has been in place for four years and there are probably better ways to tackle this.
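For readers unfamiliar with the pattern, the file-based cache described above can be sketched roughly like this. It is a simplified stand-in, not our actual code: it uses JSON instead of serialized .NET objects, and the function names, file name, and sample record are hypothetical.

```python
import json
import os
import tempfile


def publish_snapshot(path, records):
    """Middle tier: write records to a temp file, then rename it into place.

    The rename means a web node never reads a half-written file.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(records, handle)
    os.replace(tmp_path, path)


def load_snapshot(path):
    """Web node: build the in-memory cache from the local file only."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


with tempfile.TemporaryDirectory() as workdir:
    snapshot_path = os.path.join(workdir, "metadata.json")
    # In the real system these rows would come from the periodic database query.
    publish_snapshot(snapshot_path, {"campaign-1": {"title": "Example"}})
    cache = load_snapshot(snapshot_path)
    print(cache["campaign-1"]["title"])
```

Because `load_snapshot` never touches the database, a web node can start or restart during a database outage, which is the second advantage listed above. The "plumbing" cost is that every new entity needs its own publish and load code.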
One strategy I have been considering is using replication to replicate our master sql server database to local database instances installed on each web server. The web server application would use normal sql/ORM techniques to instantiate objects. Here, we can still sustain a master database outage and we would not have to code up specialized caching code and could instead use nHibernate to handle the persistence.
This seems like a more elegant solution and would like to see what others think or if anyone else has any alternatives to suggest.
I think you're overthinking this. SQL Server already has mechanisms available to you to handle these kinds of things.
First, implement a SQL Server cluster to protect your main database. You can fail over from node to node in the cluster without losing data, and downtime is a matter of seconds, max.
Second, implement database mirroring to protect from a cluster failure. Depending on whether you use synchronous or asynchronous mirroring, your mirrored server will either be updated in realtime or a few minutes behind. If you do it in realtime, you can fail over to the mirror automatically inside your app - SQL Server 2005 & above support embedding the mirror server's name in the connection string, so you don't even have to lift a finger. The app just connects to whatever server's live.
Between these two things, you're protected from just about any main database failure short of a datacenter-wide power outage or network outage, and there's none of the complexity of the replication stuff. That covers your high availability issue, and lets you answer the scaling question separately.
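For the mirroring point, the connection string carries both server names. A minimal sketch, written in Python only to show how the string is assembled: `Failover Partner` is the SqlClient connection string keyword for the mirror, while the server and database names here are hypothetical.

```python
def mirrored_connection_string(principal, mirror, database):
    """Build a connection string that names both the principal and the mirror."""
    parts = {
        "Server": principal,
        "Failover Partner": mirror,  # tried when the principal is unavailable
        "Database": database,
    }
    return ";".join(f"{key}={value}" for key, value in parts.items())


# Hypothetical server and database names.
print(mirrored_connection_string("sql-a", "sql-b", "Meta"))
```

The application keeps a single connection string and the client library decides which of the two servers to talk to.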
My favorite starting point for scaling is using three separate connection strings in your application and choosing the right one based on the needs of your query:
Realtime - Points directly at the one master server. All writes go to this connection string, and only the most mission-critical reads go here.
Near-Realtime - Points at a load balanced pool of read-only SQL Servers that are getting updated by replication or log shipping. In your original design, these lived on the web servers, but that's dangerous practice and a maintenance nightmare. SQL Server needs a lot of memory (not to mention money for licensing) and you don't want to be tied into adding a database server for every single web server.
Delayed Reporting - In your environment right now, it's going to point to the same load-balanced pool of subscribers, but down the road you can use a technology like log shipping to have a pool of servers 8-24 hours behind. These scale out really well, but the data's far behind. It's great for reporting, search, long-term history, and other non-realtime needs.
If you design your app to use those 3 connection strings from the start, scaling is a lot easier, and doesn't involve any coding complexity - just pick the right connection string.
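As a rough sketch of the three-connection-string idea, written in Python for brevity rather than .NET: the `Freshness` enum, the function name, and the connection strings are hypothetical, and in the environment described the last two strings would initially point at the same pool.

```python
from enum import Enum


class Freshness(Enum):
    REALTIME = "realtime"            # writes and mission-critical reads
    NEAR_REALTIME = "near_realtime"  # reads that tolerate replication lag
    DELAYED = "delayed"              # reporting, search, long-term history


# Hypothetical connection strings.
CONNECTION_STRINGS = {
    Freshness.REALTIME: "Server=sql-master;Database=Meta",
    Freshness.NEAR_REALTIME: "Server=sql-read-pool;Database=Meta",
    Freshness.DELAYED: "Server=sql-read-pool;Database=Meta",
}


def connection_string_for(freshness, is_write=False):
    """Writes always go to the master; reads go wherever their freshness allows."""
    if is_write:
        return CONNECTION_STRINGS[Freshness.REALTIME]
    return CONNECTION_STRINGS[freshness]


print(connection_string_for(Freshness.DELAYED))
print(connection_string_for(Freshness.NEAR_REALTIME, is_write=True))
```

Moving the delayed-reporting traffic to its own pool later is then a one-line configuration change, with no change to the calling code.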
Have you considered memcached? Since it is:
in memory
can run locally
fully scalable horizontally
prevents the need to re-cache on each web server
It may fit the bill. Check out Google for lots of details and usage stories.
Just an addition to what RickNZ proposed.
Since the master data which you are currently caching won't change very frequently, and probably only during a maintenance window, here is what you should do first on the database side:
Create a SNAPSHOT replication for the master tables which you want to cache. Adding new entities will be equally easy.
On all the webservers, install SQL Express and subscribe to this Publication.
Since this data does not change frequently, you can rest assured that server resource usage will not be much of an issue, and you also avoid network trips for master data.
All the caching that was available via the previous mechanism is still available, minus all the headache that comes when you add new entities.
Next, you can leverage .NET mechanisms as suggested above. You won't face a memcached cluster failure unless your webserver itself goes down. There is a lot available in .NET which a .NET pro can point out after this stage.
It seems to me that Windows Server AppFabric (AKA "Velocity") is exactly what you are looking for. From the introductory documentation:
Windows Server AppFabric provides a distributed in-memory application cache platform for developing scalable, available, and high-performance applications.
AppFabric fuses memory across multiple computers to give a single unified cache view to applications. Applications can store any serializable CLR object without worrying about where the object gets stored. Scalability can be achieved by simply adding more computers on demand. The cache also allows for copies of data to be stored across the cluster, thus protecting data against failures. It runs as a service accessed over the network. In addition, Windows Server AppFabric provides seamless integration with ASP.NET that enables ASP.NET session objects to be stored in the distributed cache without having to write to databases. This increases both the performance and scalability of ASP.NET applications.
Have you considered using SqlDependency caching?
You could also write the data to the local disk at the web tier, if you're concerned about initial start-up time or DB outages. But at least with a SqlDependency, you shouldn't have to poll the DB to look for changes. It can also be made relatively transparent.
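The difference from polling can be sketched generically as follows. This is not the SqlDependency API; it is a plain Python stand-in in which `on_change` plays the role of the change notification, and the class name and loader are hypothetical.

```python
class NotifiedCache:
    """Cache that reloads only after it has been told the data changed."""

    def __init__(self, loader):
        self._loader = loader   # function that queries the database
        self._value = None
        self._valid = False
        self.loads = 0          # how many times the loader actually ran

    def on_change(self):
        """Called by the change notification; the next read reloads."""
        self._valid = False

    def get(self):
        if not self._valid:
            self._value = self._loader()
            self._valid = True
            self.loads += 1
        return self._value


cache = NotifiedCache(lambda: {"rate": 1.5})  # hypothetical loader
cache.get()
cache.get()          # served from memory, no second query
cache.on_change()    # notification arrives
cache.get()          # reloads once
print(cache.loads)
```

Reads between notifications never touch the database, so there is no polling interval to tune.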
In my experience, adding a DB instance on web servers generally doesn't work out too well from a scalability or performance perspective.
If you're concerned about performance and scalability, you might consider partitioning your data tier. The specifics depend on your app, but as an example, you could move read-only data onto a couple of SQL Express servers that are populated with replication.
In case it helps, I talk about this subject at length in my book (Ultra-Fast ASP.NET).