July 26, 2008
Allen Stern over at CenterNetworks reports that Amazon has published an announcement regarding last weekend's outage of Amazon S3. For those of you who were neither affected nor interested, a reminder: S3 is Amazon's service that provides hosting and storage for web applications – to a certain extent for free. Last weekend the service was down for a full 8 hours, and the company communicated very poorly throughout, while startups complained in its forums and users were perplexed because they could not use many of their favorite services.
You can read the entire explanation provided by Amazon on their blog – it has plenty of technical details to help you understand what exactly happened and how the situation was handled. Basically, the cause of the lengthy outage lay in server-to-server communication. Their wording strikes me as a little strange, though: calling a lengthy outage that affected a number of companies and a huge number of users an "availability event" is rather far from what most of us would call it.
Even if Amazon seems reluctant to admit that the problem was really terrible, it is encouraging to see them already taking action, not only to solve the existing problem but also to prevent such situations in the future. Moreover, it is obvious that their PR and marketing departments are doing something right. I guess someone told them that optimism is the best way out of such situations, so the post ends with "we know that any downtime is unacceptable and we won't be satisfied until performance is statistically indistinguishable from perfect". I'm not sure whether it sounds like a promise or more like a hope.