Grokbase Groups HBase user June 2011

On Thu, Jun 30, 2011 at 8:07 AM, Shuja Rehman wrote:
I am doing bulk insertion into HBase using MapReduce, reading from a lot of
small (approximately 10 MB) files, so the number of mappers equals the number
of files. I am also monitoring performance using Ganglia. The machines are
c1.xlarge for processing the files (task trackers + data nodes) and m1.xlarge
for the HBase cluster (region servers + data nodes). CPU usage remains at
75%-100% on almost all of the servers. RAM usage is also below 5 GB. But the
job fails because a lot of maps are killed. If I run the same job without the
insertion, then processing completes in 9-10 minutes. So the question is: why
is it killing so many maps? Any clue?

Can you figure out which region the map tasks are failing against? And
once you have the region, which regionserver was hosting this region
(grep the master log to figure this out)? Thereafter, check the
RS logs around the time of the map task timeout. See anything? A long
GC? A region split? 600 seconds is a long time for the server side
to be hung up.
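
If it helps, the "check the RS logs around the time of the map task timeout" step could be sketched roughly as below. This is only an illustration, not an HBase tool: the function name `lines_near`, the keyword list, and the assumption that each log line starts with a log4j-style timestamp such as "2011-06-30 08:07:12,345" are all mine, so adjust them to whatever your logs actually look like.

```python
import re
from datetime import datetime, timedelta

# Assumption: log lines begin with a log4j-style timestamp such as
# "2011-06-30 08:07:12,345". Change TS_RE / TS_FORMAT if your layout differs.
TS_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3}")
TS_FORMAT = "%Y-%m-%d %H:%M:%S"


def lines_near(lines, when, window_s=600, keywords=()):
    """Return log lines stamped within window_s seconds of `when`.

    If keywords are given, keep only lines containing at least one of
    them (case-insensitive). `lines_near` is a hypothetical helper name.
    """
    lo = when - timedelta(seconds=window_s)
    hi = when + timedelta(seconds=window_s)
    wanted = [k.lower() for k in keywords]
    hits = []
    for line in lines:
        m = TS_RE.match(line)
        if not m:
            continue  # lines without a timestamp (e.g. stack traces) are skipped
        ts = datetime.strptime(m.group(1), TS_FORMAT)
        if not (lo <= ts <= hi):
            continue
        text = line.lower()
        if not wanted or any(k in text for k in wanted):
            hits.append(line.rstrip("\n"))
    return hits
```

You would pass it the lines of the regionserver log, the time the map task was killed, and some words to look for (for example "split" or "gc"; which strings your HBase version actually logs is something to check in your own logs rather than take from here).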

What version of hbase?


Discussion Overview
group: user @
categories: hbase, hadoop
posted: Jun 30, '11 at 3:08p
active: Jul 1, '11 at 6:03a

2 users in discussion

Shuja Rehman: 1 post Stack: 1 post


