Some users may be aware that we suffered a network issue at one of our UK-based data centers on April 22nd 2014. The issue caused connections to sites and services on our platform to time out for a small subset of users from around 9:41am BST onwards.
Thankfully, most Freewebstore users and website visitors were unaffected and saw no disruption of service. The users who were affected reported that they were unable to view either their own website or the Freewebstore Control Panel, so we had to communicate with them via our Twitter feed. This should be your first port of call if you are ever unable to reach any of our web installations. There we were able to keep users fully informed of our progress and our continued efforts to resolve the problem. If you were not following us on Twitter before, now is a good time to start. Follow us at https://twitter.com/Freewebstore
As this issue affected only a small percentage of our users, it took longer than usual to identify. Within a couple of hours we were able to narrow it down to one UK hosting installation and a potential upstream networking issue within or close to the data center. Working further with our hosting partners, we managed to route around the problem for some of our users, and a set of corrective measures was drawn up. The problem was traced to some Cisco 7600 routers. To fix it, they had to be taken down, updated with new firmware, and rebooted with a new configuration. The work was expected to cause around 5 minutes of downtime but to resolve the issue for all users on all connections. These corrective measures were delayed until 1am BST to avoid any unnecessary disruption to users who were not affected by the issue.
Meanwhile, we kept our Twitter feed up to date and began work on creating a redundant installation in the cloud. We managed to get a second Freewebstore Control Panel online in the cloud at around 6:50pm BST on a new URL. The hope was that users having connectivity issues within the UK might be able to access the new cloud-hosted site. This worked for most of the users who were affected by the original issue. It at least gave them access to their store accounts and allowed them to process orders and communicate with their shoppers.
The corrective work at the data center was carried out on time and without any further issues. The downtime began at around 1:28am BST and lasted for around 4 to 5 minutes. The new hardware was rebooted and all sites and services came back online shortly afterwards. The work corrected the connection issues as planned and normal service resumed for all users across our platform.
We understand that any downtime is very frustrating to users, and we take every precaution to prevent these types of incidents from happening. That said, hardware failures do happen, and they affect companies of all sizes. Not even the mighty Google has 100% uptime, because of these types of problems. Thanks to the robust nature of our network architecture, a problem as unforeseen as this still affected only a very small percentage of users. We are very proud of our uptime and we continue to strive to get as close to 100% as humanly possible.
We appreciate your patience during incidents like this, and our dedicated network support team will continue working to protect you from future ones.
Remember to follow us on Twitter to be kept fully up to date on all things Freewebstore. Trolls and keyboard warriors need not apply. You know who you are!