Emergency Maintenance :: Cloud SAN :: 01/20/12
Posted 20 January 2012 - 05:24 PM
Posted 20 January 2012 - 07:22 PM
Posted 21 January 2012 - 04:53 AM
The cloud storage configuration, which is made up of a 16-disk SAN array, can tolerate one drive failure and continue to operate normally. This automatic failover provides the reliability that has made cloud platforms popular. In last night’s incident, however, the SAN array had two drive failures, which is beyond the fault tolerance of the system.
Currently, attempts are being made to rebuild the array and regain access to the data, but this is a tedious process due to the nature of the failure and the methods of recovery. At this time, we have no definitive ETA for when this will be completed.
We will update you as soon as we have more definitive information regarding the state of the drives and the array.
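For readers who want the failure tolerance described in this update spelled out, here is a minimal Python sketch. It assumes an array that survives exactly one failed drive, as stated above; the class name `ArrayState` and its fields are hypothetical, and this is not a description of the actual SAN firmware or its RAID level, which the update does not specify.

```python
from dataclasses import dataclass


@dataclass
class ArrayState:
    """Hypothetical model of a redundant disk array."""
    total_disks: int
    failed_disks: int
    # Assumption: the array tolerates a single drive failure, per the update.
    tolerated_failures: int = 1

    def status(self) -> str:
        """Classify the array as healthy, degraded, or failed."""
        if self.failed_disks == 0:
            return "healthy"
        if self.failed_disks <= self.tolerated_failures:
            # Still serving data, but with no redundancy left to spare.
            return "degraded"
        # More failures than the redundancy can absorb: data is inaccessible
        # until the array is rebuilt or recovered.
        return "failed"


if __name__ == "__main__":
    print(ArrayState(total_disks=16, failed_disks=1).status())
    print(ArrayState(total_disks=16, failed_disks=2).status())
```

Under these assumptions, one failed drive leaves the array degraded but online, while the second failure is what takes it offline.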
Posted 21 January 2012 - 11:28 AM
Thanks, everyone, for your patience.
Posted 23 January 2012 - 09:23 PM
However, nothing is ever easy:
We attempted to get all instances restored or turned up as quickly as possible, but copying three virtual machines of data onto the storage server simultaneously, along with the rush of our users to copy files, has elevated the I/O load on the new cloud far beyond its normal expected activity. Because of the elevated I/O, we are now restoring instances far more slowly than before, in order to keep the I/O load to a minimum and the service more stable for those already on the cloud.
We will continue working to get all available instances restored as quickly as possible and, again, thank you for your patience.
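The trade-off described in this update (slowing restores to keep I/O load down) can be sketched roughly as follows. This is only an illustration: the function `restores_allowed`, the 0.8 utilization ceiling, and the default of three parallel restores are assumptions chosen for the example, not the values or tooling actually used on the cloud.

```python
def restores_allowed(io_utilization: float,
                     max_parallel: int = 3,
                     ceiling: float = 0.8) -> int:
    """Return how many restores may run at once.

    io_utilization is the measured storage I/O load as a fraction (0.0 to 1.0).
    max_parallel and ceiling are hypothetical tuning values.
    """
    if not 0.0 <= io_utilization <= 1.0:
        raise ValueError("io_utilization must be between 0.0 and 1.0")
    if io_utilization >= ceiling:
        # Storage is already saturated: pause restores to protect live users.
        return 0
    # Scale the number of parallel restores by the remaining headroom,
    # but always allow at least one so the restore queue keeps moving.
    headroom = (ceiling - io_utilization) / ceiling
    return max(1, int(max_parallel * headroom))


if __name__ == "__main__":
    for load in (0.0, 0.4, 0.9):
        print(load, restores_allowed(load))
```

The design choice here mirrors the update: existing users get priority, and restore throughput is whatever the storage can spare.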