Multisite Strategy for Disaster Recovery — On–Premise and AWS

Multisite Operations


A multi–site solution runs in AWS as well as on your existing on–site infrastructure (or another AWS Region if you are running Production in AWS), in an active–active configuration. The data replication method that you employ will be determined by the recovery point that you choose.

In addition to recovery point options, there are various replication methods to choose from, such as synchronous and asynchronous replication.
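As a rough illustration of how the recovery point can drive the choice of replication method, the sketch below picks a mode from a target recovery point objective (RPO) and an observed replication lag. The function name, the parameters, and the simplifying rule (asynchronous replication can lose up to about one replication lag of writes, synchronous replication loses none) are assumptions made for this example, not guidance from AWS.

```python
def choose_replication(rpo_seconds, replication_lag_seconds):
    """Pick a replication mode for a target recovery point objective (RPO).

    Assumption for this sketch: asynchronous replication can lose up to
    roughly the replication lag worth of writes, while synchronous
    replication loses none but makes every write wait for the remote site.
    """
    if rpo_seconds < 0 or replication_lag_seconds < 0:
        raise ValueError("times must be non-negative")
    # Asynchronous replication is acceptable only if some data loss is
    # tolerated and the lag fits inside the tolerated window.
    if rpo_seconds > 0 and replication_lag_seconds <= rpo_seconds:
        return "asynchronous"
    return "synchronous"


zero_loss = choose_replication(rpo_seconds=0, replication_lag_seconds=5)
relaxed = choose_replication(rpo_seconds=300, replication_lag_seconds=5)
```

With a zero RPO the sketch always selects synchronous replication; a five-minute RPO with a few seconds of lag is satisfied by asynchronous replication.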

You can use a DNS service that supports weighted routing, such as Amazon Route 53, to route production traffic to different sites that deliver the same application or service. A proportion of traffic will go to each site.
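A minimal sketch of weighted routing follows. It builds a change batch in the shape accepted by the Route 53 `ChangeResourceRecordSets` API (for example through boto3's `change_resource_record_sets`) and computes the share of traffic each site would receive, which is the record's weight divided by the sum of all weights. The domain name, IP addresses, site identifiers, and the 70/30 split are placeholders chosen for illustration; the code only builds dictionaries and makes no AWS calls.

```python
def weighted_record(name, set_id, ip_address, weight, ttl=60):
    """Build one UPSERT change for a weighted A record (weight 0 to 255)."""
    if not 0 <= weight <= 255:
        raise ValueError("Route 53 weights must be between 0 and 255")
    return {
        "Action": "UPSERT",
        "ResourceRecordSet": {
            "Name": name,
            "Type": "A",
            "SetIdentifier": set_id,  # distinguishes records with the same name
            "Weight": weight,
            "TTL": ttl,
            "ResourceRecords": [{"Value": ip_address}],
        },
    }


def traffic_shares(weights):
    """Return the fraction of traffic each site receives for the given weights."""
    total = sum(weights.values())
    if total == 0:
        raise ValueError("at least one site needs a non-zero weight")
    return {site: weight / total for site, weight in weights.items()}


# Hypothetical sites and addresses (documentation IP ranges).
change_batch = {
    "Comment": "Split production traffic between two sites",
    "Changes": [
        weighted_record("www.example.com.", "on-site", "203.0.113.10", 70),
        weighted_record("www.example.com.", "aws", "198.51.100.20", 30),
    ],
}
shares = traffic_shares({"on-site": 70, "aws": 30})
```

Keeping the TTL short, as assumed here, means resolvers pick up a weighting change sooner during a failover.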

In a disaster situation affecting the on–site environment or an AWS Availability Zone, you can adjust the DNS weighting and send all traffic to the servers that are still active. The capacity of the Amazon EC2 service can be rapidly increased to handle the full production load, and you can use AWS Auto Scaling to automate this process. You might need some application logic to detect the failure of the primary database services and cut over to the parallel database services running in AWS.
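To give a feel for the capacity increase involved, the sketch below estimates how many instances the AWS site would need once it receives all traffic. It assumes load scales linearly with the share of traffic, which is a simplification, and the function and parameter names are hypothetical. The result could be used, for instance, as the desired capacity of an Auto Scaling group, provided the group's maximum size allows it.

```python
def full_load_capacity(current_instances, aws_weight, total_weight):
    """Estimate instances needed when the AWS site takes 100% of traffic.

    Assumes the current fleet is sized for aws_weight / total_weight of the
    traffic and that load grows linearly with traffic share.
    """
    if not 0 < aws_weight <= total_weight:
        raise ValueError("aws_weight must be positive and at most total_weight")
    # Integer ceiling division avoids floating-point rounding surprises.
    return (current_instances * total_weight + aws_weight - 1) // aws_weight


# Example: 4 instances currently serve 25 of 100 weight units.
needed = full_load_capacity(current_instances=4, aws_weight=25, total_weight=100)
```

In this example the fleet would grow from 4 to 16 instances; real sizing should be validated with load tests rather than taken from this arithmetic alone.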

The cost of this scenario is determined by how much production traffic is handled by AWS during normal operation. In the recovery phase, you pay only for what you use for the duration that the DR environment is required at full scale.

Key steps for preparation for Active/Active:

  1. Set up your AWS environment to duplicate your production environment.
  2. Set up DNS weighting, or similar traffic routing technology, to distribute incoming requests to both sites. Configure automated failover to re-route traffic away from the affected site.
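One way to approach the automated failover in step 2 is to attach a health check to each weighted record so that an unhealthy site stops receiving traffic. The sketch below builds a health check configuration in the shape used by the Route 53 `CreateHealthCheck` API and adds a `HealthCheckId` to a record set. The host name, the `/health` path, the interval and threshold values, and the health check ID are all placeholders assumed for this example; no AWS calls are made.

```python
def http_health_check_config(fqdn, path="/health", port=80):
    """Build a HealthCheckConfig dictionary for an HTTP endpoint check."""
    return {
        "Type": "HTTP",
        "FullyQualifiedDomainName": fqdn,
        "Port": port,
        "ResourcePath": path,
        "RequestInterval": 30,   # seconds between checks (assumed value)
        "FailureThreshold": 3,   # consecutive failures before unhealthy
    }


def with_health_check(record_set, health_check_id):
    """Return a copy of a record set that references a health check."""
    updated = dict(record_set)
    updated["HealthCheckId"] = health_check_id
    return updated


# Hypothetical weighted record for the AWS site.
aws_record = {
    "Name": "www.example.com.",
    "Type": "A",
    "SetIdentifier": "aws",
    "Weight": 30,
    "TTL": 60,
    "ResourceRecords": [{"Value": "198.51.100.20"}],
}
check_config = http_health_check_config("aws-site.example.com")
# In practice the ID is returned when the health check is created.
monitored_record = with_health_check(aws_record, "hypothetical-health-check-id")
```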

Multisite

In the event of a disaster, traffic is cut over to the active AWS infrastructure by updating DNS. All traffic and supporting data queries are then served by the AWS infrastructure. Recovery involves the following steps:

  1. Either manually or by using DNS failover, change the DNS weighting so that all requests are sent to the remaining AWS site(s).
  2. Ensure that the application's failover logic directs all queries to the local AWS database servers.
  3. Consider using AWS Auto Scaling to automatically right–size the AWS fleet.
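The first two recovery steps can be sketched as follows, assuming a simple dictionary of site weights and a boolean health signal for the primary database. Setting a record's weight to 0 stops traffic to that site as long as another site keeps a non-zero weight. The site names, the endpoints, and the way database health is determined are hypothetical and left to the application.

```python
def fail_over_weights(weights, failed_site):
    """Return new weights with the failed site set to 0."""
    if failed_site not in weights:
        raise KeyError(f"unknown site: {failed_site}")
    new_weights = {
        site: (0 if site == failed_site else weight)
        for site, weight in weights.items()
    }
    if sum(new_weights.values()) == 0:
        raise RuntimeError("no remaining site has a non-zero weight")
    return new_weights


def database_endpoint(primary_healthy, primary_endpoint, aws_endpoint):
    """Choose the database endpoint the application should query."""
    return primary_endpoint if primary_healthy else aws_endpoint


# Example: the on-site location fails, so AWS receives all requests
# and queries go to the AWS database servers.
new_weights = fail_over_weights({"on-site": 70, "aws": 30}, "on-site")
endpoint = database_endpoint(
    primary_healthy=False,
    primary_endpoint="db.onsite.example.internal",
    aws_endpoint="db.aws.example.internal",
)
```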

Multisite Recovery