Yes, I'm aware that it's not a good idea to build an "ordinary" filesystem on top of
Hadoop. Let's say I'm trying to build a system for my users where every user gets
500 GB of space. It seems that Hadoop can write/store 500 GB fine, but reading
and altering the data later isn't easy (at least the altering isn't — HDFS files are
essentially write-once/append-only).
How do the big players do this? E.g. the Google File System — Gmail runs on top of it,
and latency still seems fine for the remote end user. How about Amazon S3?
Do the big players implement caching layers on top of a Hadoop-like system?
My dream is a system where it's easy to add more space when needed, with all
those automatic features: balancing, recovery of data (keeping it really
there no matter what happens), etc. I guess I'm not alone in that.