Lately I’ve had several conversations with clients who have asked about Active/Active architectures and how they can be used to transform existing disaster recovery strategies. Since many companies can benefit from an Active/Active approach, I would like to take some time to discuss the architectures that I typically cover in those conversations. Before doing so, I would like to give a shout-out to my colleague, David Edborg, for his help in developing this summary. David is EMC’s subject matter expert on Continuous Availability. Check out his blog for more information on this transformative approach to disaster recovery and high availability.
So, now for the discussion on active/active architectures and the fundamental elements that I typically include in my discussions with clients:
What is active/active? There are two kinds of active/active architecture: active/active data centers and active/active applications. In order to have an active/active application, you must first have active/active data centers. When you have the two together, you have the ability to deliver a new level of service called Continuous Availability. We will look at both in this post.
With an “out-of-the-box” Continuous Availability solution, the two intended data centers must be close enough that the communications link between them has a round-trip latency of five milliseconds or less. Generally speaking, this works out to a distance of around 60 miles or 100 kilometers. The actual figure may be somewhat more or less, depending on equipment attenuation factors and roundabout circuit routing.
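To put the latency figure in perspective, here is a rough back-of-the-envelope sketch. It assumes the common rule of thumb that light in optical fiber covers about 200 km per millisecond; the function names and the overhead values are hypothetical and only illustrate how route length and equipment delay draw on the same five-millisecond budget.

```python
# Rule-of-thumb figure (an assumption, not a vendor number): light in optical
# fiber covers roughly 200 km per millisecond.
FIBER_KM_PER_MS = 200.0
RTT_BUDGET_MS = 5.0  # round-trip limit quoted in the text


def propagation_rtt_ms(route_km: float) -> float:
    """Round-trip propagation delay over a fiber route of the given length."""
    return 2 * route_km / FIBER_KM_PER_MS


def fits_budget(route_km: float, overhead_ms: float) -> bool:
    """True if propagation plus an assumed equipment overhead stays in budget."""
    return propagation_rtt_ms(route_km) + overhead_ms <= RTT_BUDGET_MS


if __name__ == "__main__":
    for km in (50, 100, 200):
        print(km, "km of fiber:", propagation_rtt_ms(km), "ms round trip")
```

Under these assumptions, propagation over a 100 km route uses about 1 ms of the round trip. The rest of the budget is what remains for switching and storage equipment and for circuits that take a longer path than the straight-line distance, which is one reason the practical distance is quoted conservatively.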
There are four corner posts in the overall architecture that need to be considered: spanned networking technology, a virtualized storage layer that extends across the two locations, applications that can operate in two diverse locations, and a virtualized compute platform. The first three are required; the virtualized compute platform is not strictly required, but it is very helpful.
Continuously Available or Active/Active architectures are deployed to focus on uptime rather than recovery. That is to say, we discuss the reliability of the architecture rather than how quickly it can recover from an outage. If your infrastructure has all of the tenets of the active/active data center and application architectures, so that a fault in a single location does not affect the ability of the application to continue to perform in another location, then you can have a Continuously Available solution.
Continuous Availability has the following characteristics:
- The Network Technology needs to be able to stretch between two sites. Generally you will hear network engineers talk about “layer-2” adjacency or virtualization of VLANs between sites. With layer-2 adjacency, VLANs can be intelligently stretched between sites without the drawbacks of earlier generation stretched LAN solutions.
- The Storage Platform must be virtualized across the two sites to create a data presence that is mirrored between the sites and allows concurrent write-access in both sites.
- When the compute layer is virtualized, there are features of the virtual operating system that can take advantage of workload placement. Why is this helpful? Here is an example to illustrate: If you need to take a site down for maintenance, it is relatively easy to move workloads out of one site to the other while continuing to deliver service non-disruptively during the migration and while the site is down.
- And finally, the application or application infrastructure must have locking mechanisms to allow concurrent updates to data from either site. This can be accomplished with clustering technologies such as Oracle RAC, VMware Metro Storage Cluster or mechanisms coded into the application itself.
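To make the last characteristic more concrete, here is a minimal sketch of the locking idea: before a site updates a record, it must hold the write lock for that record, so two sites never update the same record at once. This is a single-process toy, not how Oracle RAC or any other clustering product is implemented; the class name, method names, and site names are all hypothetical.

```python
import threading


class RecordLockManager:
    """Toy lock table: at most one site may hold the write lock on a record."""

    def __init__(self):
        self._guard = threading.Lock()  # protects the lock table itself
        self._owners = {}               # record id -> site holding the lock

    def acquire(self, record_id, site):
        """Grant the lock if it is free or already held by the same site."""
        with self._guard:
            holder = self._owners.get(record_id)
            if holder is None or holder == site:
                self._owners[record_id] = site
                return True
            return False

    def release(self, record_id, site):
        """Release the lock, but only if the caller is the current holder."""
        with self._guard:
            if self._owners.get(record_id) == site:
                del self._owners[record_id]
                return True
            return False


if __name__ == "__main__":
    locks = RecordLockManager()
    print(locks.acquire("order-42", "site-a"))  # site A takes the lock
    print(locks.acquire("order-42", "site-b"))  # site B must wait
    locks.release("order-42", "site-a")
    print(locks.acquire("order-42", "site-b"))  # now site B can update
```

A real deployment would need a lock service that itself spans both sites and survives the loss of either one, which is exactly the hard part that the clustering technologies take care of.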
It is also possible to create a near Continuously Available solution by distributing presentation and application servers between sites and setting up the data layer in an Active/Passive stretched cluster. If the primary site or the database server fails, only the database needs to be failed over to resume service. And with various watchdog technologies and advances in clustering technologies, the database fail-over can be triggered automatically, and the database can be restarted within a few minutes or, in some cases, seconds.
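The automatic trigger can be pictured with a small sketch. It assumes a watchdog that checks a heartbeat from the primary database at regular intervals and fails over only after several consecutive misses, so that a single dropped heartbeat does not cause an unnecessary fail-over. The class, the threshold of three, and the site labels are hypothetical and do not describe any particular product.

```python
class FailoverWatchdog:
    """Fails over to the standby after N consecutive missed heartbeats."""

    def __init__(self, max_missed=3):
        self.max_missed = max_missed
        self.missed = 0
        self.active_site = "primary"

    def observe(self, heartbeat_ok):
        """Record one heartbeat check and return the currently active site."""
        if self.active_site == "primary":
            if heartbeat_ok:
                self.missed = 0  # a healthy heartbeat resets the counter
            else:
                self.missed += 1
                if self.missed >= self.max_missed:
                    self.active_site = "standby"  # trigger the fail-over
        return self.active_site


if __name__ == "__main__":
    watchdog = FailoverWatchdog(max_missed=3)
    for ok in (True, False, False, True, False, False, False):
        print(ok, "->", watchdog.observe(ok))
```

In this sketch the watchdog never fails back on its own. Returning service to the primary site is left as a deliberate, operator-driven step.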
Continuously Available or Active/Active applications give you the ability to perform maintenance on applications without disrupting service to your user community. The architecture also provides diverse production locations, ensuring that a complete outage of one data center does not disrupt your ability to deliver “always-on and always-available” application services.
With this new, ground-breaking approach, companies can avoid disruption and achieve continuous operations by converging their high availability and disaster recovery practices. We call this new approach Continuous Availability.