SQL server high availability solutions in Amazon AWS - sql-server

I am in the process of migrating to Amazon AWS and need a SQL Server high availability solution. The licence I currently have is SQL Server 2016 Standard.
At this time Amazon does not support shared volumes for Windows instances, so I cannot set up a regular SQL Server failover cluster, where the standby server takes over when the active server goes down and continues writing to the same storage. My only option appears to be Always On Basic Availability Groups. As I get familiar with this feature I find it very maintenance-intensive, and I can see it becoming a problem when dealing with thousands of databases. In my case I have about 5k databases, mostly small (600 MB or less each). My questions: is Amazon not a viable hosting environment for a full SQL Server failover solution? And are Always On Basic Availability Groups, one per database, a viable solution?
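To make the per-database overhead concrete, here is a minimal Python sketch that generates one `CREATE AVAILABILITY GROUP ... WITH (BASIC)` statement per database. The server names, endpoint URLs, database names and the `ag_` prefix are hypothetical, and the helper functions are illustrative only. The sketch covers just the statement run on the primary; the secondary still has to join each group and each database still has to be seeded or restored there.

```python
def quote_ident(name: str) -> str:
    """Bracket-quote a SQL Server identifier, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def basic_ag_statement(db_name: str, replicas: list) -> str:
    """Return a CREATE AVAILABILITY GROUP ... WITH (BASIC) statement for one database.

    replicas is a list of (server_name, endpoint_url) pairs. A basic group
    holds one database and two replicas.
    """
    if len(replicas) != 2:
        raise ValueError("a basic availability group has exactly two replicas")
    replica_clauses = ",\n".join(
        f"    N'{server}' WITH (\n"
        f"        ENDPOINT_URL = N'{url}',\n"
        "        AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,\n"
        "        FAILOVER_MODE = AUTOMATIC)"
        for server, url in replicas
    )
    return (
        f"CREATE AVAILABILITY GROUP {quote_ident('ag_' + db_name)}\n"
        "WITH (BASIC)\n"
        f"FOR DATABASE {quote_ident(db_name)}\n"
        f"REPLICA ON\n{replica_clauses};"
    )


# Hypothetical replica pair and database names, for illustration only.
replicas = [
    ("SQLNODE1", "TCP://sqlnode1.example.internal:5022"),
    ("SQLNODE2", "TCP://sqlnode2.example.internal:5022"),
]
statements = [basic_ag_statement(db, replicas) for db in ["tenant_0001", "tenant_0002"]]
print(statements[0])
```

With about 5k databases this loop would emit about 5k groups, each of which then has to be monitored and failed over on its own, which is the maintenance concern described above.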

Related

Can I remotely connect to RDS sql server instance to support always on high availability, row level security feature?

My SQL Server database will be hosted on Amazon RDS. I want to implement the Always On high availability, row-level security, and horizontal partitioning features for my SaaS application, to support multi-tenancy without performance degradation. The RDS instance doesn't give me access to the underlying file system like an EC2 instance would, but if I connect remotely to the RDS DB instance through SSMS, will I be able to implement these features?
One of the value propositions (arguably the value proposition) of RDS is you're delegating some of the administrative burdens to Amazon. One of those is "how to provide HA". When last I looked, they were using Database Mirroring to provide that, but that shouldn't matter to you (or, if it does, I'd be interested in hearing more about why). Some day they'll transition to using Availability Groups, but as long as they're delivering on their RPO and RTO agreements, the actual technology they use to provide them shouldn't matter (much).
As to RLS, I don't have any experience with it but reading up on it I don't see anything that's incompatible with how RDS works that would prevent you from using it. That is, it all looks like database-scoped DDL which RDS (generally) doesn't restrict.
An RDS instance does not provide access to the underlying file system, no matter which client you use.
Several SQL Server versions and editions are available.
More details at https://aws.amazon.com/rds/sqlserver/.
Regarding HA, AWS RDS provides a so-called Multi-AZ deployment, and the minimum SQL Server edition that can be used is Standard edition. It used to rely on database mirroring, but it now also supports Always On Availability Groups. Multi-AZ deployment is enabled through the AWS Console.
https://aws.amazon.com/about-aws/whats-new/2018/11/amazon-rds-for-sql-server-now-supports-alwayson-availability-groups/
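Besides the console, the same setting can presumably be changed from a script. A minimal sketch, assuming boto3 is installed and credentials are configured; the instance identifier and region are hypothetical, and whether a particular SQL Server version, edition or region accepts the change should be checked against the RDS documentation first:

```python
def multi_az_request(instance_id: str) -> dict:
    """Build the parameters for an RDS ModifyDBInstance call that turns on Multi-AZ."""
    return {
        "DBInstanceIdentifier": instance_id,
        "MultiAZ": True,
        # Apply now rather than waiting for the next maintenance window.
        "ApplyImmediately": True,
    }


def enable_multi_az(instance_id: str, region: str) -> dict:
    """Send the request. Needs boto3 and AWS credentials, so it is not called here."""
    import boto3  # imported lazily so the builder above works without boto3

    rds = boto3.client("rds", region_name=region)
    return rds.modify_db_instance(**multi_az_request(instance_id))


# Hypothetical instance identifier, for illustration only.
params = multi_az_request("my-sqlserver-instance")
print(params)
```

Calling `enable_multi_az("my-sqlserver-instance", "us-west-2")` would send the request. RDS decides which underlying technology (mirroring or Availability Groups) backs the standby.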

Options for a secondary SQL database

I have a VM in Azure running a single SQL Server instance.
I also recently set up Power BI to refresh from this source at 1am every morning. Unfortunately, this refresh causes performance issues: all queries and operations time out under the load.
What are my options regarding a secondary DB for reporting purposes? The main requirements are ease of maintenance and cost (I don't need anything enterprise level).
Things that come to mind:
Secondary DB on the same VM. Use replication to mirror data
Another cheap VM. Use replication
Use SQL Server availability groups, connect to a read-only replica
SQL data warehouse
Can anyone provide some guidance, or ask questions that may help find my answer?
Thanks.
I think an Always On availability group with a read-only secondary replica would be best suited to your needs.
Building a separate DW for reporting purposes would be overkill, as your reporting needs are already satisfied by the current database, except for performance.
Transactional replication could also help here, but it requires a lot of knowledge to set up and maintain.
I can think of several options. In general this sounds like a canonical OLTP vs. OLAP issue, or a call for a data warehouse, but since you are on a budget, let's consider low-cost options.
Assuming the databases are small (GBs not TBs), I would separate operational and reporting instances either to be on the same machine if it is a pretty beefy machine, or better have two VMs so you can manage capacity separately.
I would consider replication from one instance to another.
Can you boost your VM resources during the period of the Power BI refresh only?
That's one of the key benefits of Azure - you can scale up and down and save money. How long does the refresh take? Who is using your DB at 1am?
I guess this is difficult to do for a VM, so you'd need to migrate to SQL Azure rather than a VM.

What's the best redundancy setup on AWS for SQL Server 2014

We're migrating our environment over to AWS from a colo facility. As part of that we are upgrading our 2 SQL Server 2005 instances to 2014. The two are currently mirrored and we'd like to keep it that way or find other ways to make the servers redundant. The number of transactions and the server load are light for our app, but it's in production and requires high availability and, as a result, some kind of failover.
We have already set up one EC2 instance and put SQL Server 2014 on it (as opposed to using RDS, for licensing reasons) and are now exploring what to do next to achieve this.
What suggestions do people have to achieve the redundancy we need?
I've seen two options thus far from here and googling around. I list them below - we're very open to other options!
First, use the RDS mirroring service, but I can't tell if that only applies when the principal server is also on RDS - it also doesn't help with licensing.
Second, use multiple availability zones. What are the pros and cons of this versus using different regions altogether (e.g., bandwidth issues)? And does multi-AZ actually give redundancy (if AWS goes down in Oregon, for example, doesn't everything go down)?
Thanks for the help!
The Multi-AZ capability of Amazon RDS (Relational Database Service) is designed to offer high-availability for a database.
From Amazon RDS Multi-AZ Deployments:
When you provision a Multi-AZ DB Instance, Amazon RDS automatically creates a primary DB Instance and synchronously replicates the data to a standby instance in a different Availability Zone (AZ). Each AZ runs on its own physically distinct, independent infrastructure, and is engineered to be highly reliable. In case of an infrastructure failure (for example, instance hardware failure, storage failure, or network disruption), Amazon RDS performs an automatic failover to the standby, so that you can resume database operations as soon as the failover is complete. Since the endpoint for your DB Instance remains the same after a failover, your application can resume database operation without the need for manual administrative intervention.
Multiple Availability Zones are recommended to improve availability of systems. Each AZ is a separate physical facility such that any disaster that should befall one AZ should not impact another AZ. This is normally considered sufficient redundancy rather than having to run across multiple Regions. It also has the benefit that data can be synchronously replicated between AZs due to low-latency connections, while this might not be possible between Regions since they are located further apart.
One final benefit... The Multi-AZ capability of Amazon RDS can be activated by simply selecting "Yes" when the database is launched. Running your own database and using mirroring services requires you to do considerably more work on an on-going basis.

Comparing MapReduce to cloud database services

Are the databases offered by cloud services such as Windows Azure SQL Database or AWS Big Data capable of distributed computing, in the sense that the query optimizer divides the work across servers which compute in parallel, similar to how MapReduce distributes computation across nodes?
I haven't found anything about any such query optimization in the Azure documentation, although PDW seems like it may do this.
AWS has EMR (Elastic Map-Reduce) which is Hadoop provisioned by AWS.
Azure has HDInsight, which is Hortonworks' data platform (Hadoop) installed on Windows VMs.
Microsoft's PDW (Parallel Data Warehouse) doesn't support map-reduce right now, as far as I know, but they are working on it (http://www.zdnet.com/microsofts-polybase-mashes-up-sql-server-and-hadoop-7000007424/). PDW is essentially a few SQL Server machines with a central management layer that partitions and distributes the data between the nodes. It can and will split a query across the PDW nodes if the data resides on more than one, but the parallelism is not map-reduce in nature.

Overcoming Windows Azure SQL Database 150 GB size limitation

SQL Azure has a database size limit of 150 GB. I have read through the documentation several times and searched online, but I'm still unclear about this: does using federations allow a developer to grow beyond a 150 GB database? For example, can I have several 150 GB federation members?
If not, how can I handle a database larger than 150 GB on Windows Azure?
Basically, how do I scale out beyond 150 GB on Windows Azure?
If there's no other way, is RDS a good alternative? (Please share any other alternatives.)
Currently it is not possible to have a single database larger than 150 GB.
The only approach is either to split the data into multiple databases (one account can have up to 149 user databases plus the master DB) or to use SQL Azure Federations. Currently, if I am not mistaken, the total number of Federations supported is Int16.MaxValue - 1. Each federation is actually a separate database, transparent to the developer, which can be up to 150 GB.
However, SQL Azure Federations has its own pros and cons, along with some data access layer re-factoring. If you are interested you may check out these cool videos on SQL Azure Federations:
Building Scalable Apps with SQL Azure
Using SQL Azure Database Federations
UPDATE
I do not completely agree with @ryancrawcour. What he explains is just the tip of the iceberg. The amount of re-factoring required really depends on how the application consumes data. I will mention just a few factors to consider (which are not the complete picture at all). Consider any of the following:
Data that is common to all federations (how do you get this data?)
A stored proc that post-processes data - you have to iterate over each and every federation member and execute that stored proc. There is no way to execute the stored proc once and process data in all the federations.
Aggregate data, which is spread across more than 1 federation member
List data from more than one federation member.
These are just a few of the operations you will need to consider, and they take more than "just a change in the connection string and one USE FEDERATION ... before each query". Actually, with SQL Azure Federations you don't need to change the connection string at all; it is the same SQL Azure connection string. The "USE FEDERATION ..." statement is what you have to execute before each query, but that is far from the only thing. And what if you are using Entity Framework (model first, code first, or whatever)? Things get even more complicated and require a real understanding of SQL Azure Federations.
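To illustrate the fan-out pattern behind the aggregate and stored-proc points, here is a minimal Python sketch. It uses in-memory `sqlite3` databases purely as stand-ins for federation members; the `orders` table, its columns and the helper names are made up for the example, and real code would instead route each call to a member with `USE FEDERATION ...` on a SQL Azure connection.

```python
import sqlite3


def make_member(rows):
    """Create an in-memory database standing in for one federation member."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    return conn


def fan_out(members, sql):
    """Run the same query on every member and collect each member's rows."""
    results = []
    for conn in members:
        results.extend(conn.execute(sql).fetchall())
    return results


# Three members, each holding a different range of customer ids.
members = [
    make_member([(1, 10.0), (2, 5.0)]),
    make_member([(101, 7.5)]),
    make_member([(201, 2.5), (202, 20.0)]),
]

# Each member returns a partial aggregate; the application combines them.
partials = fan_out(members, "SELECT COUNT(*), SUM(amount) FROM orders")
order_count = sum(count for count, _ in partials)
total_amount = sum(amount for _, amount in partials)
print(order_count, total_amount)
```

The point of the sketch is that the combining step lives in the application: no single statement sees all members, so every cross-member total, list or post-processing job needs a loop like `fan_out`.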
I would say that SQL Azure Federations is different way of thinking about data, about modelling and normalizing.
UPDATE 2 - new Database sizes announced by Microsoft
As of 3 April 2014 the maximum size for a single database has been increased to 500 GB. The only available information to date is here. Be aware that the management portal still doesn't show this option (as of today, 4 April 2014, 15:00 GMT+0:00).
I've been looking for these same answers a while ago. In addition to the answers Anton provided (which are very accurate), I found that you can make your WAVM with SQL Server installation redundant through load balancing and mirroring.
The advantage of WASD is that everything is automated. E.g. when your WAVM instance is taken out of the load balancer's rotation, you'll need to bring a new one up yourself. WASD takes care of all of this.
With WASD Federations you're able to scale to 75TB of data (if I remember correctly), while with WAVM with SQL Server you can scale to 16TB tops.
Also with WASD Federations you can more granularly divide the SQL Workloads.
Regards,
Patriek
There is also the new Azure feature of persistent VMs (currently in preview) which will allow you to migrate your on-premises applications to cloud with minimal changes.
Further reading: Infrastructure as a Service Series: Running SQL Server in a Windows Azure Virtual Machine
This guide might be helpful as well.
Edit
Here is a comparison with Sql Azure
While considering your scale options, be aware that, as of April 3 2014, Microsoft announced upcoming changes to SQL Premium, including ability to scale each SQL Database instance to 500GB (along with geo-replication, self-service restore, and higher uptime SLA). No date has been announced yet, but you can read about the announcement details here.
There is now a 1 terabyte tier available - see https://azure.microsoft.com/en-us/pricing/details/sql-database/ and look at the Premium level.

Resources