How do I figure out what's going on while a job is trying to initialize? I
have a job that's importing data from a DB into HBase, and it takes very long
to initialize. The delay is long enough to make the mappers time out, which
eventually kills the job.

Amandeep


Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz


  • Philip Zeyliger at Jul 3, 2009 at 12:48 am
    You can try to run it via LocalJobRunner ("hadoop jar yourjar -jt
    local" if you're using GenericOptionsParser), and see if it exhibits
    the same behavior there. It's easy to push that into a debugger: set
    HADOOP_OPTS="-Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=8020",
    point Eclipse at it, set some breakpoints, and see what's going on.

    Cheers,

    -- Philip
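
The suggestion above can be sketched as a small wrapper script. This is only an illustration: the jar name, the driver class `com.example.ImportJob`, and the `RUN` switch are hypothetical, and the script prints the command by default because `suspend=y` makes the JVM block until a debugger attaches.

```shell
#!/bin/sh
# Sketch only: jar name and main class below are hypothetical placeholders.
JOB_JAR="yourjar.jar"                 # hypothetical job jar
MAIN_CLASS="com.example.ImportJob"    # hypothetical driver using GenericOptionsParser
DEBUG_PORT=8020                       # port from the reply above; any free port works

# suspend=y makes the JVM wait for a debugger to attach on DEBUG_PORT.
HADOOP_OPTS="-Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=${DEBUG_PORT}"
export HADOOP_OPTS

# -jt local selects LocalJobRunner, so the job runs inside this one JVM.
CMD="hadoop jar ${JOB_JAR} ${MAIN_CLASS} -jt local"

# Dry run by default; set RUN=1 to actually launch the job.
if [ "${RUN:-0}" = "1" ]; then
  $CMD
else
  echo "HADOOP_OPTS=${HADOOP_OPTS}"
  echo "would run: ${CMD}"
fi
```

With `RUN=1`, the process should pause at startup until a remote debugger (for example, an Eclipse "Remote Java Application" configuration) connects to the chosen port on the local machine.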

  • Jason hadoop at Jul 3, 2009 at 2:10 am
    A couple of things that can make a job take a long time to initialize are
    replicating distributed cache items, and unpacking those items and
    otherwise preparing the local task directory on the task trackers.
    The job jar is itself a distributed cache item.
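
Since the job jar is shipped like any other distributed cache item, one quick check is how large it is and what it bundles. The following is a rough sketch, assuming a hypothetical jar path and that `unzip` is installed; it is not a Hadoop tool, just ordinary shell.

```shell
#!/bin/sh
# Sketch only: the default jar path is a hypothetical placeholder.
JOB_JAR="${JOB_JAR:-yourjar.jar}"

# Print a file's size in bytes using POSIX tools only.
file_size_bytes() {
  wc -c < "$1" | tr -d ' '
}

if [ -f "$JOB_JAR" ]; then
  echo "job jar size: $(file_size_bytes "$JOB_JAR") bytes"
  # Sort the archive listing by entry length, largest first.
  # The first line is unzip's total; bundled dependencies tend to follow.
  unzip -l "$JOB_JAR" | sort -k1,1nr | head -n 10
else
  echo "no such jar: $JOB_JAR"
fi
```

A jar that turns out to be very large, or that contains many bundled libraries, is a plausible candidate for slow replication and unpacking on the task trackers.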



Discussion Overview
group: common-user @
category: hadoop
posted: Jul 3, '09 at 12:23a
active: Jul 3, '09 at 2:10a
posts: 3
users: 3
website: hadoop.apache.org...
irc: #hadoop