HBase user mailing list, June 2011

On Thu, Jun 30, 2011 at 8:07 AM, Shuja Rehman wrote:
I am doing bulk insertion into HBase using MapReduce, reading from a lot of
small (roughly 10 MB) files, so the number of mappers equals the number of
files. I am also monitoring performance with Ganglia. The machines are
c1.xlarge for processing the files (task trackers + data nodes) and m1.xlarge
for the HBase cluster (region servers + data nodes). CPU usage stays at
75%-100% on almost all of the servers, and RAM usage stays below 5 GB. But the
job fails because a lot of maps are killed. If I run the same job without the
insertion, processing completes in 9-10 minutes. So the question is: why is it
killing so many maps? Any clue?
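
For reference, a minimal sketch of what such a job might look like, assuming
one mapper per small input file with each line converted into a Put and written
through TableOutputFormat; the table name "mytable", the column family "d", and
the tab-separated record layout are illustrative assumptions, not details from
the post:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

public class BulkInsertJob {

  // Map-only job: each input line becomes one Put keyed on its first field.
  static class InsertMapper
      extends Mapper<LongWritable, Text, ImmutableBytesWritable, Put> {
    @Override
    protected void map(LongWritable offset, Text line, Context context)
        throws IOException, InterruptedException {
      String[] fields = line.toString().split("\t");  // assumed record layout
      byte[] rowKey = Bytes.toBytes(fields[0]);
      Put put = new Put(rowKey);
      put.add(Bytes.toBytes("d"), Bytes.toBytes("value"), Bytes.toBytes(fields[1]));
      context.write(new ImmutableBytesWritable(rowKey), put);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = new Job(conf, "bulk-insert");
    job.setJarByClass(BulkInsertJob.class);
    job.setMapperClass(InsertMapper.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    // Wires up TableOutputFormat against the target table; no reduce phase.
    TableMapReduceUtil.initTableReducerJob("mytable", null, job);
    job.setNumReduceTasks(0);
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

With TableOutputFormat, each context.write goes through the normal HTable write
path, so slow or splitting regions on the m1.xlarge side can stall mappers long
enough for them to be killed.
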
Can you figure out which region the map tasks are failing against? And
once you have the region, which regionserver was hosting it (grep the
master log to figure this out)? Then check the RS logs around the time
of the map task timeout. See anything? A long GC? A region split?
600 seconds is a long time for the server side to be hung up.

What version of HBase?

Thanks,
St.Ack
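
The 600 seconds mentioned above matches the default MapReduce task timeout of
that era (mapred.task.timeout = 600000 ms): a task that reports no progress for
ten minutes is killed by the TaskTracker. A minimal sketch of raising it as a
stopgap while the regionserver-side cause is tracked down; the 20-minute value
is illustrative:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.mapreduce.Job;

public class TimeoutTweak {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    // Give stalled HBase writes more headroom before the task is declared dead.
    conf.setLong("mapred.task.timeout", 1200000L);  // 20 minutes, illustrative
    Job job = new Job(conf, "bulk-insert");
    // ... mapper, input, and output configured as in the sketch above ...
  }
}

Calling context.progress() (or incrementing a counter) periodically inside
long-running map loops has the same effect without raising the global timeout.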
