I'm looking for some help. I'm a Nutch user; everything was working fine, but
now I get the following error when indexing.
I have a single-node pseudo-distributed setup.
Some people on the Nutch list indicated to me that HDFS could be full, so I
removed many things, and HDFS is far from full.
This file & directory was perfectly OK the day before.
I did a "hadoop fsck"... the report says healthy.

What can I do?

Is it safe to do a Linux fsck, just in case?

Caused by: java.io.IOException: Could not obtain block:
blk_8851198258748412820_9031
file=/user/nutch/crawl/indexed-segments/20100111233601/part-00000/_103.frq
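
(For a more detailed check on the affected path, something along these lines
should list the blocks behind each file and which datanodes hold them; the
path below is copied from the error above, so adjust it to your layout:)

    hadoop fsck /user/nutch/crawl/indexed-segments/20100111233601 -files -blocks -locations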


--
-MilleBii-


  • MilleBii at Jan 30, 2010 at 12:06 am
    X-POST with the Nutch mailing list.

    HEEELP !!!

    I'm kind of stuck on this one.
    I backed up my HDFS data, reformatted HDFS, put the data back, tried to merge
    my segments together, and it explodes again.

    Exception in thread "Lucene Merge Thread #0"
    org.apache.lucene.index.MergePolicy$MergeException: java.io.IOException:
    Could not obtain block: blk_4670839132945043210_1585
    file=/user/nutch/crawl/indexed-segments/20100113003609/part-00000/_ym.frq
    at
    org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:309)

    If I go into the hdfs/data directory I DO find the faulty block.
    Could it be a synchronization problem in the segment-merger code?



    --
    -MilleBii-
  • Ken Goodhope at Jan 30, 2010 at 12:25 am
    "Could not obtain block" errors are often caused by running out of available
    file handles. You can confirm this by going to the shell and entering
    "ulimit -n". If it says 1024, the default, then you will want to increase
    it to about 64,000.
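
    For example, roughly like this (assuming the Hadoop daemons are started by
    the same user, from a shell that inherits the new limit):

        # check the current per-process limit on open file descriptors
        ulimit -n
        # raise the soft limit for this shell session (the hard limit must allow it)
        ulimit -n 64000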


    --
    Ken Goodhope
    Cell: 425-750-5616

    362 Bellevue Way NE Apt N415
    Bellevue WA, 98004
  • MilleBii at Jan 30, 2010 at 8:20 am
    Increased the "ulimit" to 64000 ... same problem
    stop/start-all ... same problem but on a different block which of course
    present, so it looks like there is nothing wrong with actual data in the
    hdfs.

    I use the Nutch default hadoop 0.19.x anything related ?
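
    If it helps, a quick way to see the limit that a running daemon actually got
    (assuming a Linux /proc filesystem and that jps is on the PATH) is something
    like:

        # show the open-files limit applied to the running DataNode process
        cat /proc/$(jps | awk '/DataNode/ {print $1}')/limits | grep 'open files'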



    --
    -MilleBii-
  • MilleBii at Jan 30, 2010 at 9:46 am
    Ken,

    FIXED !!! SO MUCH THANKS

    Setting ulimit at the command prompt wasn't enough; you need to set it
    permanently and reboot, as explained here:
    http://posidev.com/blog/2009/06/04/set-ulimit-parameters-on-ubuntu/
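
    For the record, the permanent setting is typically made in
    /etc/security/limits.conf (the user name "hadoop" below is just an example;
    use whichever account runs the Hadoop daemons), with pam_limits enabled,
    e.g. in /etc/pam.d/common-session:

        # /etc/security/limits.conf
        hadoop  soft  nofile  64000
        hadoop  hard  nofile  64000

        # /etc/pam.d/common-session
        session required pam_limits.so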





    --
    -MilleBii-

Discussion Overview
group: common-user
category: hadoop
posted: Jan 29, '10 at 7:46p
active: Jan 30, '10 at 9:46a
posts: 5
users: 2 (MilleBii: 4 posts, Ken Goodhope: 1 post)
website: hadoop.apache.org...
irc: #hadoop
