Amazon S3 getting attention, ROI analysis underway

I’ve been following the online data storage market very carefully (having worked at a data storage company for three years).

The new model for web companies is to minimize technical infrastructure, consuming web services as a utility, much like power or water.

Tim O’Reilly is asking clients for numbers, a case study to show how the web is an infrastructure. The benefits? Pay as you go, no need for a sysadmin or hardware purchase, no upgrading, and hopefully no data loss. Of course, the real risk is not knowing where your corporate data lives; Jeff Nolan notes some continuity issues with databases and storage.

Amazon is not alone; I have a list of quite a few enterprise IT vendors that are also entering the online data storage industry, so please don’t forget those companies (which already have a large install base).

The big picture for online data storage is the opportunity for effective marketing (there are other opportunities and disruptions to think about as well). When user data is stored in the cloud, the opportunity to understand, organize, and connect information is at hand. This is why I have a theory that Amazon S3 (or other online data storage providers) will eventually pay users to upload data.

  • Jeff Maaks

    Hi Jeremiah,

    Some personal thoughts on online data storage, with the usual disclaimer about this not representing my current employer…and yours prior. 🙂

    I think we’ll start to see a 2-tier storage model appear, whereby one purchases a local storage array for high-performance access, and/or for rapid recovery capability (like a solution near to my heart, HDS’ SplitSecond), then utilizes S3 for off-site backup purposes. Wouldn’t it even be slick if these storage solutions came S3-aware out of the box?

    Data de-duplication across users will become ever more important for storage providers as well as for the users themselves. Think: does your online storage provider really need to store “x” copies of that Pink Floyd album you ripped to MP3s? Or couldn’t their client app just generate a hash and check whether the file was already in the cloud from some other user? Benefit for the user: no actual file transfer would have to take place (the storage provider just acks the backup and transfers the file metadata), so bandwidth and time are saved.

    Anyway, just some thoughts from someone who wishes this solution existed today for my own personal backup strategy.
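Jeff’s de-dup idea above (hash the file client-side, skip the transfer if identical content is already in the cloud) can be sketched roughly like this. All names here are illustrative, not any real provider API; a real system would also need per-user metadata and a way to handle hash collisions.

```python
import hashlib

# Hypothetical sketch: the provider keeps an index of content hashes.
# A client hashes a file before uploading and asks whether those exact
# bytes are already stored by some other user.

known_hashes = set()  # stands in for the provider's content index

def backup(data: bytes) -> str:
    """Back up a file, returning 'uploaded' if the bytes had to be
    transferred or 'deduped' if the content already existed."""
    digest = hashlib.sha256(data).hexdigest()
    if digest in known_hashes:
        # Someone already stored identical bytes: the provider just
        # acks the backup and records metadata -- no file transfer.
        return "deduped"
    known_hashes.add(digest)
    return "uploaded"
```

The first user to back up a given MP3 pays the upload cost; every later user with a byte-identical copy sends only a short hash, which is where the bandwidth and time savings come from.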


  • Cloud-based storage has such great potential, but I suspect its realization is still a long way off.

    From an enterprise perspective, there’s always going to be the security issue, and the latency inherent in the cloud causes real problems with time-sensitive data. Most of the protocols used in today’s apps aren’t optimized for WAN access, so it would be interesting to see how they’d handle a cloud-based storage system.

    Jeff’s 2-tier approach seems to make sense: backup isn’t time-sensitive, so it would make sense to push that out. But I’d be interested to see case studies of how it’s being used today.


  • Thanks gents for this info.