Dear Hadoop community,
Today we are excited to introduce the public beta of Amazon Elastic MapReduce, a web service that enables developers to easily and cost-effectively process vast amounts of data. It uses a hosted Hadoop framework (version 0.18.3) running on the web-scale infrastructure of Amazon Elastic Compute Cloud (Amazon EC2) and Amazon Simple Storage Service (Amazon S3).
Using Amazon Elastic MapReduce, you can instantly provision as much or as little capacity as you like to perform data-intensive tasks for applications such as web indexing, data mining, log file analysis, machine learning, financial analysis, scientific simulation, and bioinformatics research. Amazon Elastic MapReduce lets you focus on crunching or analyzing your data without having to worry about time-consuming set-up, management or tuning of Hadoop clusters or the compute capacity upon which they sit.
Working with the service is easy: develop your processing application using our samples or by building your own, upload your data to Amazon S3, use the AWS Management Console or APIs to specify the number and type of instances you want, and click "Create Job Flow." We do the rest: running Hadoop on the specified number of instances, providing progress monitoring, and delivering the output to Amazon S3.
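For illustration only, here is a minimal sketch of the kind of processing logic you might develop, written as a word count in Python. The function names (map_line, reduce_pairs, word_count) are hypothetical, and the code runs locally: it does not call any Amazon Elastic MapReduce or Hadoop API, and packaging it for a job flow is not shown.

```python
from itertools import groupby


def map_line(line):
    """Map step: emit one (word, 1) pair per whitespace-separated token."""
    return [(word.lower(), 1) for word in line.split()]


def reduce_pairs(pairs):
    """Reduce step: sum the counts for each word.

    Sorting by key stands in for the shuffle/sort that Hadoop performs
    between the map and reduce phases.
    """
    ordered = sorted(pairs, key=lambda kv: kv[0])
    return {
        word: sum(count for _, count in group)
        for word, group in groupby(ordered, key=lambda kv: kv[0])
    }


def word_count(lines):
    """Run the map step over every line, then reduce the combined pairs."""
    pairs = [pair for line in lines for pair in map_line(line)]
    return reduce_pairs(pairs)


if __name__ == "__main__":
    # Hypothetical sample input; real input would come from your own data.
    sample = ["the quick brown fox", "The lazy dog"]
    for word, count in sorted(word_count(sample).items()):
        print(f"{word}\t{count}")
```

Keeping the map and reduce steps as separate functions mirrors the MapReduce model, so the same logic can be tested on a small sample before it is run on a cluster.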
We will be posting several patches to Hadoop today, and we hope to become a part of this exciting community going forward.
We hope this new service will prove a powerful tool for your data processing needs and become a great development platform for building sophisticated data processing applications. You can sign up and start using the service today at http://aws.amazon.com/elasticmapreduce.
Our forums are available for asking questions or suggesting features: http://developer.amazonwebservices.com/connect/forum.jspa?forumID=52
The Amazon Web Services Team