Growing past a Single Server
For the majority of businesses, a single cloud hosted web site running WordPress, Magento, Joomla or some other web application provides more than sufficient capacity. However, being a single instance of your website, any downtime of the website or underlying components may affect your business. So, if your online presence must be up close to 100% of the time then a better solution offering some form of High Availability (HA) is a serious requirement.
Once your business grows and traffic increases dramatically, a normal site (even with all the performance enhancements) will start to reach some hard limits. Caching technology will provide some performance improvements, but if the website data is not static and must be retrieved from a database or file system then these limits are going to be reached as user request loads grow.
The first hard limit will be the ability of the web server to serve requests. Another limit will be the response time of the database server, and finally the amount of network traffic flowing to the server will also become an issue.
Let’s address these three issues with some easy to understand technical solutions.
Scaling out the Front End Webserver
Once you are pushing the bounds of a single server and you have exhausted the numerous performance tweaks that can be performed on it, then separating out the functionality of the server components is required. At this point an additional web server (or more) will be required.
There are a number of benefits to increasing the number of web servers servicing user requests. Without delving into probability theory, simple queuing theory tells us that if the number of arriving requests exceeds the ability of the processing service then the queue of requests will grow. But if we add more processing services to service the queue, then the queue will be at close to zero most of the time, much like people standing in line at a bank queue waiting to be serviced by a teller.
If we add additional web servers we will reduce the overall load and service each web request in time.
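The bank teller analogy can be sketched in a few lines of Python. This is only a toy, deterministic model and not a measurement of any real server: the arrival rate, the per-server service rate and the function name simulate_queue are all assumptions chosen for illustration.

```python
def simulate_queue(arrivals_per_tick, servers, per_server_rate, ticks):
    """Return the queue length after each tick of a simple deterministic model."""
    queue = 0
    history = []
    for _ in range(ticks):
        queue += arrivals_per_tick                       # new requests join the line
        served = min(queue, servers * per_server_rate)   # capacity of the whole pool
        queue -= served
        history.append(queue)
    return history


# Assumed numbers: 12 requests arrive per tick, each server handles 10 per tick.
one_server = simulate_queue(arrivals_per_tick=12, servers=1, per_server_rate=10, ticks=5)
two_servers = simulate_queue(arrivals_per_tick=12, servers=2, per_server_rate=10, ticks=5)

print("one server :", one_server)
print("two servers:", two_servers)
```

With one server the backlog grows every tick, while the second server keeps the queue empty. Real traffic is bursty rather than constant, so treat this as an intuition aid only.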
Load Balancing
To direct traffic to our pool of web servers we need some mechanism in place, and there are a number of techniques that can be implemented. The first and easiest is DNS-based Load Balancing. This method uses multiple DNS entries with the same domain name, each with the IP address of a different web server. These DNS entries are served in a Round Robin fashion, whereby the multiple IPs are sent out in rotation and the sequence is then repeated.
DNS Round Robin load balancing has numerous drawbacks. Here are a few:
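As a rough illustration of the rotation, here is a minimal Python sketch. It is not a real DNS server: the RoundRobinDNS class, the www.example.com name and the 192.0.2.x documentation addresses are all made up for the example.

```python
from collections import deque


class RoundRobinDNS:
    """Toy model of a DNS name with several address records served in rotation."""

    def __init__(self, name, addresses):
        self.name = name
        self._records = deque(addresses)

    def resolve(self):
        answer = list(self._records)   # every address is returned...
        self._records.rotate(-1)       # ...but the order shifts for the next query
        return answer


# Hypothetical zone with three web servers.
zone = RoundRobinDNS("www.example.com", ["192.0.2.10", "192.0.2.11", "192.0.2.12"])

# Many clients simply use the first address in the answer.
first_choices = [zone.resolve()[0] for _ in range(4)]
print(first_choices)
```

Notice that nothing in this model knows whether a server is online or busy; it just keeps rotating.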
Not capable of detecting offline web servers
Unable to determine if a web server is overloaded
Skewing of request traffic due to locality of closest/fastest responding DNS server
Conetix does NOT recommend the use of DNS load balancing for these reasons. Let’s look at how a Smart Load Balancing solution addresses these issues.
If our two web servers are sitting behind a Load Balancer, traffic can be sent to each server in turn to maintain good response times. This is called “Round Robin”, and if all servers are online and evenly loaded it works very well. If a server goes offline then pure Round Robin is not ideal, as requests will be directed to a server that will not respond, so a mechanism is needed to determine whether each web server is operational. Most Round Robin Load Balancers implement a heartbeat monitor that polls the web server to check that it is operational. With this in hand, Round Robin can continue to be used knowing that requests only go to web servers that are responding.
However, there is still an issue that must be addressed. A heavily loaded server will still respond at some point, so simply monitoring a web server’s presence is insufficient. Ideally we need to monitor how quickly the web server responds and how much “Load” the server is experiencing to determine if it should have requests sent to it.
A Smart Load Balanced solution as used at Conetix can also monitor the load of your web servers and ensure that requests are sent to lightly loaded servers using a weighted distribution model as needed. A lightly loaded server will receive more requests. As the load on the other servers decreases they will receive more of the request traffic, and eventually the traffic will reach equilibrium until a spike causes loads to rise. The Load Balancer will be monitoring this and routing traffic accordingly. If you consider an alternative hosting provider then these are excellent topics to raise.
If your on-demand load is very high then additional servers can be added to boost the service rate of requests. From the Load Balancer’s perspective, this is just additional configuration that is quick and easy to implement.
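The combination of Round Robin and a heartbeat can be sketched as follows. This is a simplified model, not the configuration of any particular load balancer: the RoundRobinBalancer class and the web1/web2 names are hypothetical, and the heartbeat result is passed in by hand rather than polled over the network.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Round Robin selection that skips servers marked offline by a heartbeat."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = {server: True for server in self.servers}
        self._ring = cycle(self.servers)

    def heartbeat(self, server, is_up):
        # In a real balancer this would be the result of polling the server.
        self.healthy[server] = is_up

    def next_server(self):
        # Try each server at most once per request.
        for _ in range(len(self.servers)):
            candidate = next(self._ring)
            if self.healthy[candidate]:
                return candidate
        raise RuntimeError("no healthy web servers available")


lb = RoundRobinBalancer(["web1", "web2"])
before = [lb.next_server() for _ in range(4)]

lb.heartbeat("web2", False)          # web2 stops answering its heartbeat
after = [lb.next_server() for _ in range(3)]

print(before)
print(after)
```

While both servers are up the requests alternate; once web2 fails its heartbeat, every request goes to web1.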
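One simple way to picture a weighted distribution is to give each server a share of new requests in proportion to its spare capacity. The sketch below is an assumption made for illustration and is not a description of how the Conetix load balancers calculate weights. The traffic_shares function and the load percentages are hypothetical.

```python
def traffic_shares(loads_percent):
    """Map each server's load (0 to 100 percent) to a share of new requests."""
    headroom = {name: max(0, 100 - load) for name, load in loads_percent.items()}
    total = sum(headroom.values())
    if total == 0:
        # Every server is saturated, so fall back to an even split.
        return {name: 1 / len(loads_percent) for name in loads_percent}
    return {name: spare / total for name, spare in headroom.items()}


# Assumed snapshot: web1 is lightly loaded, web2 is busy.
shares = traffic_shares({"web1": 20, "web2": 80})
print(shares)

# Later the loads even out and so do the shares.
balanced = traffic_shares({"web1": 50, "web2": 50})
print(balanced)
```

The lightly loaded server receives most of the new traffic, and as the loads converge the shares move back towards an even split.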
Also consider that the typical Load Balancer Appliance will be implemented in dedicated hardware, so your hosting plan may have an additional monthly cost if you choose to use this service.
Application issues with scaling out
While the addition of more servers at first appears to be a simple solution, it introduces some additional technical issues at the Application Software level. The first issue is the need to replicate the software and data to every server and keep this in sync as updates are made. For example, the Magento eCommerce application stores product data in a database, but the product images and related media are stored in the file system. To solve this, replication of the image data can be performed when the next store instance is started.
An additional issue to be addressed is the persistent session data that represents a user’s session with the application. This is data that is dynamically generated and maintained as a visitor interacts with the application. For example, a typical default install of the Magento software configures the storage of session data to the file system while WordPress session data is stored in a database. If the load balancing is working correctly, the user will most likely connect to a different server from the initial interaction and the session data will not be present on the new server if it’s stored in the local file system.
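The problem is easy to reproduce in a small model. In this sketch each WebServer object keeps sessions in its own dictionary, standing in for a local file system, unless it is handed a shared store. The class, the session ID and the user name are all hypothetical.

```python
class WebServer:
    """Toy web server that keeps session data in a store it is given."""

    def __init__(self, name, session_store=None):
        self.name = name
        # Default: a private store, like session files on the local disk.
        self.sessions = {} if session_store is None else session_store

    def login(self, session_id, user):
        self.sessions[session_id] = {"user": user}

    def current_user(self, session_id):
        session = self.sessions.get(session_id)
        return session["user"] if session else None


# Local storage: the session created on web1 is unknown to web2.
web1, web2 = WebServer("web1"), WebServer("web2")
web1.login("abc123", "alice")
local_result = web2.current_user("abc123")

# Shared storage: both servers see the same session data.
shared = {}
web3, web4 = WebServer("web3", shared), WebServer("web4", shared)
web3.login("abc123", "alice")
shared_result = web4.current_user("abc123")

print(local_result, shared_result)
```

The second half of the example is the idea behind both the database and the clustered file system options: every server reads and writes the same session data.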
There are at least 3 ways to solve this issue:
Use Database Storage – Applications that use session data must be configured to store that data in the database.
Use a Clustered File System – The file system can be clustered so all web servers see the same files and hence they see the same session and application data.
Use Sticky Sessions – On first connecting, the user is directed to a specific server and then stays with that server, using a unique URL, for the duration of their visit.
There are more solutions but we will limit ourselves to these three and address more technical solutions in future articles.
All web applications will have one or more of these technical issues, and they must be considered and addressed prior to diving into any HA solution!
Persistent Data
Databases
Each solution outlined in the previous section introduces additional technical issues that need to be addressed. Firstly, using the Database to store sessions will increase the number of queries on the database server. Tuning the database cache will eliminate a lot of the performance issues, as most standard database installations ship with a configuration tuned for reliability, not performance.
The introduction of a caching technology such as Memcache or Redis cache will address performance but adds another layer to our architecture.
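The usual pattern with such a cache is to check it first and only query the database on a miss. The sketch below uses a plain Python dictionary in place of Memcache or Redis and a FakeDatabase class in place of a real database server, so every name in it is an assumption for illustration.

```python
class FakeDatabase:
    """Stand-in for a database server that counts how often it is queried."""

    def __init__(self, rows):
        self.rows = rows
        self.query_count = 0

    def fetch(self, key):
        self.query_count += 1
        return self.rows.get(key)


def get_with_cache(key, cache, database):
    """Return the value for key, asking the database only on a cache miss."""
    if key in cache:
        return cache[key]
    value = database.fetch(key)
    cache[key] = value
    return value


db = FakeDatabase({"product:1": "Blue Widget"})
cache = {}  # a dict stands in for Memcache or Redis here

results = [get_with_cache("product:1", cache, db) for _ in range(3)]
print(results, "database queries:", db.query_count)
```

Three requests result in a single database query. A real cache also needs an expiry policy so that changed data is eventually refreshed, which this sketch leaves out.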
If your application does not have persistent data stored in a file system and uses a database server to retrieve data and content then Load Balancing the front-end web servers is going to be achievable.
Clustered File Systems
If your application stores data in a file system, then using a clustered file system can easily solve the persistent data issue, as each server will see the same files on the clustered file system. The cluster will maintain changes between nodes, and most clustered file system implementations apply a replication policy so that “N” copies of the data exist in the cluster, based on the belief that nodes in the cluster can go offline at any time.
Using a clustered file system introduces delays in the data replication process, as data is usually transported over a network link between nodes. If the nodes are directly connected on 1 Gigabit or even 10 Gigabit technology then this will usually not be an issue, but if replication to a remote site is implemented there will be propagation delay inherent in the transport technology type.
The “GlusterFS” clustered file system is often used to provide reliable clustering of file systems between servers. It installs very easily on a large number of Linux installations and has a simple architectural design that is quite logical and reliable.
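To see why keeping “N” copies helps, consider this small sketch. It is not how any particular clustered file system places data: the place_replicas function, its simple checksum-based placement and the node names are invented purely to illustrate the idea.

```python
def place_replicas(filename, nodes, copies):
    """Pick `copies` distinct nodes for a file using a simple checksum."""
    if copies > len(nodes):
        raise ValueError("cannot keep more copies than there are nodes")
    start = sum(filename.encode("utf-8")) % len(nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(copies)]


def surviving_copies(placement, offline):
    """Return the nodes that still hold the file when some nodes are offline."""
    return [node for node in placement if node not in offline]


# Hypothetical three node cluster keeping two copies of every file.
nodes = ["node1", "node2", "node3"]
placement = place_replicas("media/product-42.jpg", nodes, copies=2)
remaining = surviving_copies(placement, offline={placement[0]})

print("stored on:", placement)
print("still available on:", remaining)
```

With two copies, the file is still readable after any single node goes offline.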
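A back-of-the-envelope calculation shows the scale of the propagation delay. The figure of roughly 200,000 km per second for light in optical fibre is an approximation, the distances are made-up examples, and the calculation ignores switching and queuing delays, so treat the results as a lower bound.

```python
# Approximate speed of light in optical fibre (about two thirds of c).
FIBRE_SPEED_KM_PER_S = 200_000


def one_way_delay_ms(distance_km):
    """Minimum one-way propagation delay in milliseconds."""
    return distance_km * 1000 / FIBRE_SPEED_KM_PER_S


def round_trip_ms(distance_km):
    """Minimum round trip, e.g. a write plus its acknowledgement."""
    return 2 * one_way_delay_ms(distance_km)


# Hypothetical distances: nodes in the same building versus a remote site.
for label, km in [("same data centre", 0.1), ("remote site", 1000)]:
    print(f"{label}: {round_trip_ms(km):.4f} ms round trip")
```

A replicated write that has to wait for a remote acknowledgement cannot complete faster than this round trip, however fast the link is.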
Sticky Sessions
The phrase “Sticky Sessions” refers to a client remaining with a specific host in a session-based transaction. Sticky Sessions are often used with HTTPS traffic, as the client-generated session key is sent to the server on the initial transaction and all data transfers after that are fully encrypted. In our example, the load balanced host will detect the SSL session and maintain a session relationship between the client and the host.
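The session relationship can be pictured as a lookup table kept by the load balancer. In this sketch the StickyBalancer class and the session identifiers are hypothetical, and a plain string stands in for whatever identifier a real balancer uses, such as an SSL session ID or a cookie.

```python
from itertools import cycle


class StickyBalancer:
    """Assign new sessions Round Robin, then keep each session on its server."""

    def __init__(self, servers):
        self._ring = cycle(list(servers))
        self._affinity = {}   # session identifier -> server

    def route(self, session_id):
        if session_id not in self._affinity:
            # First request of this session: pick the next server in turn.
            self._affinity[session_id] = next(self._ring)
        return self._affinity[session_id]


lb = StickyBalancer(["web1", "web2"])
requests = ["alice", "bob", "alice", "alice", "bob"]
routes = [lb.route(session_id) for session_id in requests]
print(list(zip(requests, routes)))
```

Each visitor always lands on the same server, so locally stored session data stays reachable. The trade-off is that a visitor's session is lost if that one server goes offline.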
Often the load balancer will present the public key for the server(s) and act as a “Man in the Middle”, presenting encrypted traffic to the client but clear text to the servers, a design which is perfectly acceptable when the servers and load balancer are directly connected and clear text HTTP traffic cannot be snooped on.
Having the SSL session terminate at the Load Balancer does raise the processing load on the Balancer, something that the hosting provider will be monitoring for. At Conetix, we use multiple Balancers that work in a High Availability and session sharing architecture. They are smart enough to share the load between themselves as needed.
Content Delivery Networks
If your web application presents Images, Javascript and CSS content then a Content Delivery Network (CDN) is an excellent way to deliver this data to your clients independent of your hosted web application.
The advantage of a CDN, especially if you use a large provider, is that it can be dispersed across many geographic locations, and if the delivery network can determine the location of the requesting client then the content can be delivered from a server close by.
Cloudflare is a very large CDN provider capable of delivering content from multiple geographic locations. To implement this functionality, the CDN server will need to cache the static content from your web application.
A separate unique URL needs to be configured for CSS, Images and Javascript content. These URLs are then configured with the CDN provider. The net effect is that only traffic specifically retrieving application functionality is now serviced via your hosting plan, which reduces both traffic and request load.
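In practice this usually means the application emits a different host name for static files. The following sketch shows the idea using only the Python standard library. The host names, the list of extensions and the to_cdn_url function are assumptions for illustration; most applications and CDN providers have their own settings or plugins for this.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit

# Assumed set of file types treated as static content.
STATIC_EXTENSIONS = {".css", ".js", ".png", ".jpg", ".jpeg", ".gif", ".svg"}


def to_cdn_url(url, cdn_host):
    """Point static asset URLs at the CDN host and leave other URLs alone."""
    parts = urlsplit(url)
    extension = posixpath.splitext(parts.path)[1].lower()
    if extension in STATIC_EXTENSIONS:
        return urlunsplit(parts._replace(netloc=cdn_host))
    return url


# Hypothetical site and CDN host names.
cdn = "static.example.com"
print(to_cdn_url("https://www.example.com/skin/site.css", cdn))
print(to_cdn_url("https://www.example.com/js/app.js?v=3", cdn))
print(to_cdn_url("https://www.example.com/checkout/cart", cdn))
```

Only the style sheet and script URLs are rewritten. The checkout page, which needs application logic, still goes to your own servers.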
The static content on the servers can be synchronised from your administration instance when changes are made and will automatically be synchronised by the CDN network as content expiry is checked.
Conetix is a partner of Cloudflare and we actively use their CDN services for the majority of our clients.
Where to from here?
Investigate the performance of your site. Once all performance tweaks have been implemented and growth continues, review the scalability of the applications you use.
Implement a test website: build a second copy of the site and a database server holding a copy of your production website. Implement a load balancing hosting plan and test all facets of your application.
Failures are most likely in shared resources and files that are altered during the course of execution of the application. If data is stored in a local file system then look at a clustered file system solution.