Ilesfay on key principles for the resilient cloud
As cloud infrastructure matures and adoption grows, exposure to cloud outages becomes a growing risk for everyone. Even as public and private cloud vendors improve their offerings, resiliency should be top of mind for customers. Avoiding outages requires a proactive, vendor-agnostic approach that ensures continuity and performance for even the most complex IT environments.
At Ilesfay®, our cloud-based data delivery service has operated for three years without an outage. We can assure our customers of the best possible performance and experience with our application, despite the risks of operating an Internet-based business. We credit this to following these key principles for creating a resilient cloud:
Know your single points of failure. Assume that every component of your application will fail at some point. To understand this risk, walk through your application and architecture to identify every area where there is a single point of failure.
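One rough way to start that walk-through is to list each role in the architecture alongside the instances that can serve it, then flag any role with fewer than two. The sketch below assumes such an inventory already exists; the role names, host names, and the `single_points_of_failure` helper are hypothetical and used only for illustration.

```python
def single_points_of_failure(inventory):
    """Return the roles that are served by fewer than two instances.

    `inventory` maps a role name to the list of instances that can serve it.
    """
    return sorted(role for role, instances in inventory.items()
                  if len(instances) < 2)


# Hypothetical inventory; the role and host names are made up for illustration.
inventory = {
    "load_balancer": ["lb-1"],
    "web": ["web-1", "web-2", "web-3"],
    "database": ["db-1"],
    "queue": ["q-1", "q-2"],
}

print(single_points_of_failure(inventory))  # prints ['database', 'load_balancer']
```

A count like this only catches missing redundancy. Shared dependencies such as a common power feed or network link still need a manual review.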
Spread your workload over multiple server nodes using a load balancer. Balancing load over many servers allows you to scale up or down as your business ebbs and flows. Use a monitoring service to automatically start up or shut down servers as workload fluctuates. This principle also applies to network equipment and other resources. However, a lone load balancer is itself a single point of failure in your architecture. Avoid that risk by adding a second load balancer, so that traffic no longer depends on a single endpoint.
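As a minimal sketch of the idea, the class below rotates requests across server nodes and skips any node that monitoring has marked as down. It is an in-process illustration only, not a substitute for a real load balancer; the `RoundRobinBalancer` class and the node names are assumptions introduced for this example.

```python
class RoundRobinBalancer:
    """Hypothetical balancer that rotates over healthy nodes only."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.healthy = set(self.nodes)
        self._next = 0  # rotation cursor into self.nodes

    def mark_down(self, node):
        """Record that a health check failed for this node."""
        self.healthy.discard(node)

    def mark_up(self, node):
        """Return a known node to rotation after it recovers."""
        if node in self.nodes:
            self.healthy.add(node)

    def pick(self):
        """Return the next healthy node, or raise if none is available."""
        # Examine each node at most once per call, starting at the cursor.
        for _ in range(len(self.nodes)):
            node = self.nodes[self._next]
            self._next = (self._next + 1) % len(self.nodes)
            if node in self.healthy:
                return node
        raise RuntimeError("no healthy nodes available")


balancer = RoundRobinBalancer(["node-a", "node-b", "node-c"])
balancer.mark_down("node-b")  # e.g. reported by a monitoring service
print([balancer.pick() for _ in range(4)])
```

In practice the balancer itself would run as at least two instances, for the reason given above.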
Use Shared Nothing architecture. Applications are more resilient when each server node is independent and self-sufficient, containing everything needed to fulfill a request and perform its work. This contrasts with systems that store information centrally in an application server or database, which require server nodes to fetch data before they can do their work. Shared Nothing helps eliminate this single point of failure and allows for almost infinite scalability.
Implement robust retry logic. Moving applications from traditional data centers to the cloud allows for almost infinite scalability and elasticity. But the cloud can introduce connectivity gaps and latency that didn’t exist while applications ran on private infrastructure. Retry logic is crucial at these times, but it must be designed very carefully, with rules tailored to the specific operation being retried: how many attempts to make, how long to wait between them, and whether the operation is safe to repeat.
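One common pattern for this, shown here only as a sketch and not as Ilesfay's own implementation, is a bounded retry with exponential backoff and random jitter. The `retry` helper, its parameter names, and its default values are assumptions chosen for illustration. It retries only errors that are plausibly transient and gives up after a fixed number of attempts.

```python
import random
import time


def retry(operation, attempts=5, base_delay=0.5, max_delay=30.0,
          retry_on=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Call `operation` until it succeeds or `attempts` calls have failed.

    Only exceptions listed in `retry_on` are retried; anything else is
    raised immediately. The wait before each retry is a random value
    between 0 and an exponentially growing cap, limited by `max_delay`.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, cap))
```

The jitter keeps many clients from retrying in lockstep after a shared outage. An operation should only be wrapped this way if repeating it is safe, for example a read or an idempotent write.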
Replicate your application. Cloud data centers are physical facilities with electrical and electronic systems susceptible to outage, and open cloud platforms like Amazon Web Services (AWS), Windows Azure, or Rackspace can’t guarantee 100 percent uptime. Replicating your application in multiple cloud data centers across regions allows for redundancy and full business continuity.
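On the client side, replication across regions can be used roughly as follows: try the preferred region first and fall back to the next replica when it fails. This is a simplified sketch. The `call_with_failover` helper, the region labels, and the handler functions are hypothetical stand-ins for real regional endpoints.

```python
def call_with_failover(regions, request):
    """Send `request` to the first region that answers.

    `regions` is an ordered list of (name, handler) pairs, where each
    handler is a callable standing in for that region's endpoint.
    Returns (region_name, response); raises if every region fails.
    """
    errors = {}
    for name, handler in regions:
        try:
            return name, handler(request)
        except Exception as exc:  # broad on purpose: any failure triggers failover
            errors[name] = exc
    raise RuntimeError(f"all regions failed: {sorted(errors)}")


# Hypothetical handlers: the primary region is down, the secondary is healthy.
def primary(request):
    raise ConnectionError("region unavailable")

def secondary(request):
    return f"handled {request}"

regions = [("region-a", primary), ("region-b", secondary)]
print(call_with_failover(regions, "GET /status"))
# prints ('region-b', 'handled GET /status')
```

Failover like this only helps if the data each region needs has already been replicated there.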
Use managed services outside of your expertise. Save your key technologists’ time for your core products, and use off-the-shelf managed services and applications available in the cloud. Examples of these services include:
- Amazon’s Relational Database Service (RDS), available on Amazon Web Services (AWS), a cloud-based database manager. It eliminates the need for hardware, software or hands-on administrators as a database grows in size.
- Ilesfay ZoneSync®, also available within AWS. It lets companies and data centers quickly and seamlessly replicate enormous amounts of data between regions, eliminating downtime when regional or local issues occur. ZoneSync provides a user-friendly dashboard along with reporting and alert tools. Patented MatchMaking® binary differencing techniques and multi-threaded transfer help to increase data delivery speeds and reduce bandwidth requirements by 10x compared with standard methods.
Cloud computing is the most efficient way to do business in today’s world. But to realize its full benefit, build sound processes that protect your product and your customer.
Chris McLennan is founder, president and CEO of Ilesfay Technology Group, a Cincinnati-based firm offering cloud-based tools and services to improve global data replication. He has 15 years of experience in IT solutions for Fortune 100 clients across SaaS, ERP, BOM, PLM, and data acquisition.
Cincinnati | October 25, 2012 | Chris McLennan
