Storage issues in cloud when creating multiple instances - database

In a cloud hosting environment (Amazon, Rackspace, etc.) you can create multiple instances. Let's say I have a database server (MySQL, for example) and other persistent data.
If I create more instances, what happens to the data? For example:
1 instance -> user table (in a db)
I create another 3 instances
4 instances -> each one has its own user table
Problems: if someone adds data to the table on instance 3, how does instance 4 see it? If I merge the instances back into one, which instance's data is kept?
Thank you

I would suggest having one (or more) dedicated database servers that all the instances connect to. If you are using Amazon Web Services, check out their RDS service ( http://aws.amazon.com/rds/ ).
That way you don't need to worry about replication. If you do want each server running its own db instance, you'll have to look into replication. For MySQL this is a good guide: http://dev.mysql.com/doc/refman/5.0/en/replication.html
I would strongly recommend the former solution for the database. Replication is tricky to get right and can be a nightmare to maintain.
If you are using static data, e.g. images, I would recommend uploading it to Amazon's S3 service ( http://aws.amazon.com/s3/ ). That way all your servers get their data from a single point instead of having to replicate it across servers, which is always going to end up a less scalable solution.
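To make the difference concrete, here is a minimal sketch that uses SQLite files as a stand-in for the MySQL server (an assumption made only so the example runs without any infrastructure). The helper names connect, user_names and demo are hypothetical. Two "instances" that point at the same database see each other's writes, while two instances with private databases do not.

```python
import os
import sqlite3
import tempfile


def connect(path):
    """Open a connection and make sure the users table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY, name TEXT)"
    )
    conn.commit()
    return conn


def user_names(conn):
    """Return all user names visible through this connection."""
    return [row[0] for row in conn.execute("SELECT name FROM users ORDER BY id")]


def demo():
    with tempfile.TemporaryDirectory() as workdir:
        # Shared setup: "instance 3" and "instance 4" use the same database.
        shared_path = os.path.join(workdir, "shared.db")
        inst3 = connect(shared_path)
        inst4 = connect(shared_path)
        inst3.execute("INSERT INTO users (name) VALUES (?)", ("alice",))
        inst3.commit()
        shared_view = user_names(inst4)

        # Per-instance setup: each instance has its own private database.
        own3 = connect(os.path.join(workdir, "instance3.db"))
        own4 = connect(os.path.join(workdir, "instance4.db"))
        own3.execute("INSERT INTO users (name) VALUES (?)", ("alice",))
        own3.commit()
        isolated_view = user_names(own4)

        for conn in (inst3, inst4, own3, own4):
            conn.close()
    return shared_view, isolated_view
```

With a real setup the shared path would be replaced by the hostname of the dedicated database server, and every instance would be configured with that same hostname.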

Related

When to create new RDS Instance vs new database?

I have two AWS RDS Postgres instances. Sometimes I create new instances for applications that are only (very) vaguely related to other applications. This always leads me to the question: should I just create a new database in an existing instance, or keep things separate and create a new instance instead?
I would recommend that you use the same database server (Amazon RDS instance).
You can logically separate the data via either:
CREATE DATABASE: Full logical separation. You log in to one database and never see the other one. OR
CREATE SCHEMA: Data is kept separate, but can be referenced from the other. Quite common for staging areas, such as doing ETL in a Staging Schema, then publishing to a Production Schema.
From your description, I'd say that CREATE DATABASE would be appropriate.
The benefit is that you only need to manage one database and there is little impact on cost unless you need to increase the size of the database instance to handle the higher load (but it would still be cheaper than running two separate databases).
Just keep an eye on the CloudWatch metrics to be sure that the database is handling the increased load correctly.
Normally, the biggest reason for using a different server is because they are owned/managed by different teams. However, in your situation the same team seems to 'own' both data stores, so that wouldn't be an issue.
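As a rough illustration of the staging-then-publish pattern mentioned for CREATE SCHEMA, here is a sketch that uses SQLite's ATTACH DATABASE as a stand-in for a second Postgres schema. This is only an analogy: Postgres schemas live inside one database and are created with CREATE SCHEMA, whereas here the "staging" namespace is an attached in-memory database. The table name orders and the function publish_from_staging are hypothetical.

```python
import sqlite3


def publish_from_staging(raw_rows):
    """Load raw rows into a staging namespace, then publish the clean ones."""
    conn = sqlite3.connect(":memory:")
    # SQLite stand-in for a second schema: an attached database named "staging".
    conn.execute("ATTACH DATABASE ':memory:' AS staging")
    conn.execute("CREATE TABLE staging.orders (id INTEGER, amount TEXT)")
    conn.execute("CREATE TABLE main.orders (id INTEGER PRIMARY KEY, amount REAL)")

    # Raw data lands in staging unvalidated.
    conn.executemany("INSERT INTO staging.orders VALUES (?, ?)", raw_rows)

    # A single statement can reference both namespaces, which is the point of
    # keeping them on one server: read from staging, write to production.
    conn.execute(
        "INSERT INTO main.orders (id, amount) "
        "SELECT id, CAST(amount AS REAL) FROM staging.orders "
        "WHERE amount IS NOT NULL AND amount <> ''"
    )
    conn.commit()
    published = conn.execute(
        "SELECT id, amount FROM main.orders ORDER BY id"
    ).fetchall()
    conn.close()
    return published
```

With CREATE DATABASE in Postgres, a cross-reference like this is not possible in a plain query, which is what "full logical separation" means in practice.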

How to query multiple databases from different SQL Servers

We have about 8 SQL Servers used for different purposes, such as inserting data on one server and updating it on another (or connecting to only one database based on the user’s region).
The problem is that sometimes data needs to be queried from multiple SQL Server databases. Say I have an Id property, and based on that Id, data needs to be retrieved from several of these 8 servers (wherever there is an Id match, so basically querying all the databases).
Currently, the server the user is logged into uses the “Linked Server” functionality to connect to the other SQL Servers (with the user's current server acting as the source SQL Server), and uses “UNION” to combine all the data.
As a lot of transactions take place each day, this approach is not feasible performance-wise.
Are there any recommendations for a better approach to achieve the same functionality? I read about a concept called “Server Groups” but I'm not sure about it.
The application is built in .NET Web Forms using jQuery/Ajax/HTML/API and ADO.NET.
If you have a .NET application that runs outside these 8 servers, can't you establish individual connections and pass the Id from the .NET app to each of these servers?
As far as I know, "Server Group" is a concept in SSMS that lets you group servers and run common scripts against them at the same time.
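The idea of individual connections from the application can be sketched roughly as follows. This is Python rather than ADO.NET, and each "server" is an in-memory SQLite database, so make_server, query_server, find_everywhere and the records table are all hypothetical stand-ins. The application queries every server in parallel and concatenates the results itself, instead of having one SQL Server UNION across linked servers.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor


def make_server(name, rows):
    """Hypothetical stand-in for one SQL Server: a named connection factory."""
    def connect():
        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE records (id INTEGER, payload TEXT)")
        conn.executemany("INSERT INTO records VALUES (?, ?)", rows)
        return conn
    return name, connect


def query_server(server, record_id):
    """Open a dedicated connection to one server and look up the id there."""
    name, connect = server
    conn = connect()
    try:
        rows = conn.execute(
            "SELECT payload FROM records WHERE id = ?", (record_id,)
        ).fetchall()
    finally:
        conn.close()
    return [(name, payload) for (payload,) in rows]


def find_everywhere(servers, record_id):
    """Query all servers concurrently and merge the matches in the app."""
    with ThreadPoolExecutor(max_workers=max(1, len(servers))) as pool:
        per_server = pool.map(lambda s: query_server(s, record_id), servers)
        # pool.map yields results in the same order as the input servers.
        return [row for rows in per_server for row in rows]
```

The total latency is then roughly that of the slowest server rather than the sum of all of them, and no single database server has to carry the cross-server load.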

Syncing One Online and Multiple Offline Versions of the Same Schema DB in Rails

SETUP
I have three instances of my app deployed in three separate geographical locations, each running locally (since Internet connections are not reliable).
I have one master instance of the app running on DigitalOcean.
I would like to sync the local databases with the master database daily.
MY CURRENT APPROACH
I have a cron job scheduled to pull the data from the local databases and upload it into a database running on a DigitalOcean VPS. My concern is that the id columns of the three local dbs will conflict, resulting in an incorrect merge in the online master database.
I am running Rails 4.1 with Ruby 2.0 using Postgres as my DB.
I am open to any solutions that come up with a relatively simple way of keeping the databases in sync.
Thank you
The simplest solution would be to make every unique auto-numbering single-column ID key consist of 2 columns: a "ServerID" and an auto-numbering ID. It makes your design more complicated, but you never have to worry about non-unique keys.
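A minimal sketch of that two-column key, using SQLite from Python instead of Postgres and Rails purely so it runs standalone. The users table, its columns and the function merge_into_master are assumptions for illustration. Each local site keeps its own auto-numbered id, and the master stores it next to the site's server id.

```python
import sqlite3


def merge_into_master(local_dbs):
    """local_dbs maps a server_id to the (local_id, name) rows from that site."""
    master = sqlite3.connect(":memory:")
    master.execute(
        "CREATE TABLE users ("
        " server_id INTEGER NOT NULL,"
        " local_id INTEGER NOT NULL,"
        " name TEXT,"
        " PRIMARY KEY (server_id, local_id))"
    )
    for server_id, rows in local_dbs.items():
        # INSERT OR REPLACE makes the daily sync repeatable: re-uploading a row
        # overwrites the earlier copy instead of raising a key conflict.
        master.executemany(
            "INSERT OR REPLACE INTO users (server_id, local_id, name) "
            "VALUES (?, ?, ?)",
            [(server_id, local_id, name) for local_id, name in rows],
        )
    master.commit()
    merged = master.execute(
        "SELECT server_id, local_id, name FROM users ORDER BY server_id, local_id"
    ).fetchall()
    master.close()
    return merged
```

All three sites can have a row with id 1 and the master still keeps them apart, because uniqueness is only required for the (server_id, local_id) pair.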

Data warehousing using local DB - Beginner

I want to get an idea of how to achieve this:
I have an application that runs at 5 different geographical locations, e.g. Texas, NY, California, Boston, Washington.
This application saves data to a local database, which is located at that location.
I want to do data warehousing. So is it a must to have just one database (where all 5 applications save their data in a single database, without local DBs)?
Or is it possible to keep 5 local databases and do data warehousing by retrieving data from those local DBs into a central DB and then performing the data warehousing there?
Please give me your thoughts and references.
You have three options for this:
You use a single, centrally hosted database server. Typical relational database servers can be accessed directly over the network these days: MySQL, PostgreSQL, Oracle, ... This means you can implement an application that opens a network connection to the database server and uses that remote server to store and retrieve data as required. Multiple connections are possible at the same time.
You use a single, central database server but put a wrapper around it: a small network application layer acting as a broker. This way you can address that central instance over the network, but via standard protocols such as HTTP.
You use a decentralized approach and install a database instance at each location. Then you need some additional tool to perform the synchronization. For most modern database servers (see above) such tools exist, but the setup is not trivial.
If in doubt, and if the load is not that high, go with the first alternative.
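For the decentralized option, a very simple home-grown synchronization could look like the sketch below. It assumes the local tables are insert-only (rows are never updated or deleted) and have an increasing integer id, which may not hold for your application. SQLite stands in for the real database servers, and the names sales, fact_sales, make_local_db, make_warehouse and load_location are hypothetical.

```python
import sqlite3


def make_local_db(sales):
    """Stand-in for one location's local database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, amount REAL)")
    conn.executemany("INSERT INTO sales (id, amount) VALUES (?, ?)", sales)
    conn.commit()
    return conn


def make_warehouse():
    """Central database; rows are tagged with the location they came from."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE fact_sales ("
        " location TEXT, source_id INTEGER, amount REAL,"
        " PRIMARY KEY (location, source_id))"
    )
    return conn


def load_location(warehouse, location, local_db):
    """Copy only rows newer than the last one already loaded for this location."""
    (last_id,) = warehouse.execute(
        "SELECT COALESCE(MAX(source_id), 0) FROM fact_sales WHERE location = ?",
        (location,),
    ).fetchone()
    new_rows = local_db.execute(
        "SELECT id, amount FROM sales WHERE id > ? ORDER BY id", (last_id,)
    ).fetchall()
    warehouse.executemany(
        "INSERT INTO fact_sales VALUES (?, ?, ?)",
        [(location, source_id, amount) for source_id, amount in new_rows],
    )
    warehouse.commit()
    return len(new_rows)
```

Running load_location for each of the 5 locations on a schedule keeps the central database up to date, and running it twice in a row copies nothing the second time.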

Should i just use 1 database?

Hi, I am building a Windows retail POS app and was wondering what the best way to design the database is. Should I just use 1 database to store all my clients' data?
That is, if I have 100 clients from different businesses using my app, all of their data will be stored in 1 database.
E.g. I will store a company column in the user table to indicate which company the customer or transaction belongs to.
My current practice is to create a new database for each business and install it on their local machine (I have to manually install SQL Server + SQL Express).
Do you think it would be easier for me to design it this way, so that I can just put the database online on SQL Server? Will I get any latency? How bad will it be? I heard Windows Azure is able to handle this well. In my case I think the speed and data size per business are not really a concern.
Could you advise?
You should definitely look at other alternatives within Azure for storing data, specifically Azure Storage Tables and Blobs.
Utilizing all of the Azure Storage Options with SQL Azure will allow you to choose different data tiers depending on your application's needs and your desired cost structure. Running everything inside of SQL Azure will cost you more in the long run, but it makes a good place to tie together federated data for relational reporting, whereas you can store each tenant's data inside of Azure Tables, using PartitionKeys which keep each client's data separated from the others.
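To show what separating tenants by partition key means, here is a toy in-memory model in Python. It does not use the Azure SDK and is not how Azure Tables are implemented. The class TenantTable and its methods are hypothetical, and only the idea of addressing every entity by a (PartitionKey, RowKey) pair is taken from the service.

```python
class TenantTable:
    """Toy in-memory table addressed by (partition_key, row_key)."""

    def __init__(self):
        # One inner dict per partition, i.e. per client company.
        self._partitions = {}

    def upsert(self, partition_key, row_key, entity):
        """Insert or overwrite one entity inside a tenant's partition."""
        self._partitions.setdefault(partition_key, {})[row_key] = dict(entity)

    def query_partition(self, partition_key):
        """Return every entity of one tenant, ordered by row key."""
        rows = self._partitions.get(partition_key, {})
        return [dict(rows[key], RowKey=key) for key in sorted(rows)]


table = TenantTable()
table.upsert("acme", "txn-001", {"total": 12})
table.upsert("globex", "txn-001", {"total": 99})
table.upsert("acme", "txn-002", {"total": 3})
acme_rows = table.query_partition("acme")
```

Two companies can reuse the same row key without clashing, and a query that names one partition never touches another company's rows. This is the same role the company column would play in a single shared SQL database.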

Resources