MultithreadMapRunner keeps consuming records even if threads are not available
------------------------------------------------------------------------------

Key: HADOOP-3104
URL: https://issues.apache.org/jira/browse/HADOOP-3104
Project: Hadoop Core
Issue Type: Bug
Affects Versions: 0.16.1
Environment: all
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
Priority: Critical
Fix For: 0.16.2


The ExecutorService execute() method does not block when no threads are available; it queues the runnables until a thread becomes free.

The problem is that all key/value pairs are read and kept in memory for the task; with large datasets this will cause an OutOfMemoryError.

Need to figure out how to use execute() in a blocking fashion.
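
To illustrate the behaviour described above, here is a minimal sketch (not code from MultithreadMapRunner; the class and method names are hypothetical). It builds a fixed-size ThreadPoolExecutor over an unbounded LinkedBlockingQueue, which is the same shape Executors.newFixedThreadPool uses, keeps the workers busy, and shows that execute() returns immediately while the pending runnables pile up in the queue.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class UnboundedQueueDemo {

  /**
   * Submits {@code tasks} runnables to a pool of {@code threads} workers and
   * returns how many of them are waiting in the queue right after submission.
   */
  public static int queuedAfterSubmit(int threads, int tasks) {
    ThreadPoolExecutor pool = new ThreadPoolExecutor(
        threads, threads, 0L, TimeUnit.MILLISECONDS,
        new LinkedBlockingQueue<Runnable>());   // unbounded: offer() never fails
    CountDownLatch release = new CountDownLatch(1);
    try {
      for (int i = 0; i < tasks; i++) {
        // execute() never blocks here, even though every worker is busy.
        pool.execute(() -> {
          try {
            release.await();                    // keep the worker occupied
          } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
          }
        });
      }
      return pool.getQueue().size();
    } finally {
      release.countDown();                      // let the workers drain the queue
      pool.shutdown();
    }
  }

  public static void main(String[] args) {
    System.out.println("queued runnables: " + queuedAfterSubmit(2, 1000));
  }
}
```

In a map runner each queued runnable would hold a key/value pair, so the queue length is effectively the number of records held in memory.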

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


  • Alejandro Abdelnur (JIRA) at Mar 27, 2008 at 3:59 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Alejandro Abdelnur updated HADOOP-3104:
    ---------------------------------------

    Attachment: patch3104.txt

    Figured out how to properly use the ThreadPoolExecutor to block execute() invocations (instead of queuing them) until threads are available.
  • Alejandro Abdelnur (JIRA) at Mar 27, 2008 at 4:01 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Alejandro Abdelnur updated HADOOP-3104:
    ---------------------------------------

    Status: Patch Available (was: Open)
  • Hadoop QA (JIRA) at Mar 27, 2008 at 6:49 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12582765#action_12582765 ]

    Hadoop QA commented on HADOOP-3104:
    -----------------------------------

    -1 overall. Here are the results of testing the latest attachment
    http://issues.apache.org/jira/secure/attachment/12378727/patch3104.txt
    against trunk revision 619744.

    @author +1. The patch does not contain any @author tags.

    tests included -1. The patch doesn't appear to include any new or modified tests.
    Please justify why no tests are needed for this patch.

    javadoc -1. The javadoc tool appears to have generated 1 warning messages.

    javac +1. The applied patch does not generate any new javac compiler warnings.

    release audit +1. The applied patch does not generate any new release audit warnings.

    findbugs -1. The patch appears to introduce 1 new Findbugs warnings.

    core tests +1. The patch passed core unit tests.

    contrib tests +1. The patch passed contrib unit tests.

    Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2079/testReport/
    Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2079/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
    Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2079/artifact/trunk/build/test/checkstyle-errors.html
    Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2079/console

    This message is automatically generated.
  • Chris Douglas (JIRA) at Mar 28, 2008 at 12:58 am
    [ https://issues.apache.org/jira/browse/HADOOP-3104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12582868#action_12582868 ]

    Chris Douglas commented on HADOOP-3104:
    ---------------------------------------

    * Instead of treating InterruptedException as a noop, it would be better to throw an IOException with the InterruptedException as its cause.
    * Does the wait between attempts need to be configurable?
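
The first suggestion above could look roughly like the following. This is a sketch only: the class name, method name, and message text are hypothetical and are not taken from the patch.

```java
import java.io.IOException;

public class InterruptWrapping {

  /**
   * Waits for the given time. An interrupt is not swallowed; it surfaces as
   * an IOException whose cause is the original InterruptedException.
   */
  public static void sleepOrFail(long millis) throws IOException {
    try {
      Thread.sleep(millis);
    } catch (InterruptedException ie) {
      // Restore the interrupt flag so callers further up can still see it.
      Thread.currentThread().interrupt();
      throw new IOException("Interrupted while waiting for a free thread", ie);
    }
  }
}
```

Keeping the InterruptedException as the cause preserves the original stack trace for whoever logs the IOException.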
  • Amar Kamat (JIRA) at Mar 28, 2008 at 5:54 am
    [ https://issues.apache.org/jira/browse/HADOOP-3104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12582920#action_12582920 ]

    Amar Kamat commented on HADOOP-3104:
    ------------------------------------

    Here are some comments
    1) The javadoc comments should not mention the default value. That might change and will require code change too. So you can keep the earlier comment as is and just add the comment about the wait parameter.
    2) I think mapred.map.multithreadedrunner.backoff seems more appropriate than mapred.map.multithreadedrunner.waitwhennothreads, comments?
    3) 10ms seems too short. What if we double it every time? Something like 10, 20, 40, 80, ...
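
The doubling wait in point 3 can be sketched as below. The upper bound on the delay, the class name, and the retry-on-rejection loop are assumptions added for illustration; the comment itself only proposes the 10, 20, 40, 80 progression.

```java
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;

public class Backoff {

  /**
   * Delay before retry number {@code attempt} (0-based): initial, 2x, 4x, ...
   * never exceeding {@code maxMillis} (the cap is an assumption of this sketch).
   */
  public static long delayMillis(long initialMillis, long maxMillis, int attempt) {
    long delay = initialMillis;
    for (int i = 0; i < attempt && delay < maxMillis; i++) {
      delay *= 2;
    }
    return Math.min(delay, maxMillis);
  }

  /** Retries execute() with a growing pause while the pool rejects the task. */
  public static void executeWithBackoff(ThreadPoolExecutor pool, Runnable task,
      long initialMillis, long maxMillis) throws InterruptedException {
    for (int attempt = 0; ; attempt++) {
      try {
        pool.execute(task);
        return;
      } catch (RejectedExecutionException e) {
        if (pool.isShutdown()) {
          throw e;                       // retrying can never succeed
        }
        Thread.sleep(delayMillis(initialMillis, maxMillis, attempt));
      }
    }
  }
}
```

The retry loop only makes sense for a pool whose queue is bounded and whose rejection policy throws, which is the default AbortPolicy.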

  • Devaraj Das (JIRA) at Mar 28, 2008 at 6:46 am
    [ https://issues.apache.org/jira/browse/HADOOP-3104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Devaraj Das updated HADOOP-3104:
    --------------------------------

    Status: Open (was: Patch Available)

    It'd be better to have a wait-notify mechanism instead of sleep.
  • Alejandro Abdelnur (JIRA) at Mar 28, 2008 at 7:14 am
    [ https://issues.apache.org/jira/browse/HADOOP-3104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Alejandro Abdelnur updated HADOOP-3104:
    ---------------------------------------

    Attachment: patch3104.txt
  • Alejandro Abdelnur (JIRA) at Mar 28, 2008 at 7:20 am
    [ https://issues.apache.org/jira/browse/HADOOP-3104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Alejandro Abdelnur updated HADOOP-3104:
    ---------------------------------------

    Attachment: (was: patch3104.txt)
  • Alejandro Abdelnur (JIRA) at Mar 28, 2008 at 7:26 am
    [ https://issues.apache.org/jira/browse/HADOOP-3104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Alejandro Abdelnur updated HADOOP-3104:
    ---------------------------------------

    Attachment: patch3104.txt
  • Alejandro Abdelnur (JIRA) at Mar 28, 2008 at 7:34 am
    [ https://issues.apache.org/jira/browse/HADOOP-3104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Alejandro Abdelnur updated HADOOP-3104:
    ---------------------------------------

    Attachment: patch3104.txt
  • Alejandro Abdelnur (JIRA) at Mar 28, 2008 at 7:34 am
    [ https://issues.apache.org/jira/browse/HADOOP-3104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Alejandro Abdelnur updated HADOOP-3104:
    ---------------------------------------

    Attachment: (was: patch3104.txt)
  • Alejandro Abdelnur (JIRA) at Mar 28, 2008 at 7:36 am
    [ https://issues.apache.org/jira/browse/HADOOP-3104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12582931#action_12582931 ]

    Alejandro Abdelnur commented on HADOOP-3104:
    --------------------------------------------

    Per Chris's suggestion, catch blocks for InterruptedException now rethrow the exception, but as a RuntimeException (instead of an IOException as he suggested), which I think is more appropriate since it is not an I/O issue.

    Per Devaraj's suggestion, refactored to use a ThreadPoolExecutor subclass that uses wait-notify to make execute() block when no threads are available. There is no need for a rejection handler anymore. Also refactored the exception check into a method and used that method in the different parts of the map runner instead of duplicating the exception check code.
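
A wait-notify based blocking execute() of the kind described above might look like this. It is a sketch under stated assumptions, not the contents of patch3104.txt: the class name, the in-flight counter, and the use of afterExecute() to release a slot are all illustrative choices.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class BlockingExecutorSketch extends ThreadPoolExecutor {

  private final Object lock = new Object();
  private final int maxInFlight;
  private int inFlight = 0;   // tasks handed to the pool and not yet finished

  public BlockingExecutorSketch(int threads) {
    super(threads, threads, 0L, TimeUnit.MILLISECONDS,
        new LinkedBlockingQueue<Runnable>());
    this.maxInFlight = threads;
  }

  @Override
  public void execute(Runnable task) {
    synchronized (lock) {
      while (inFlight >= maxInFlight) {
        try {
          lock.wait();                   // block the caller until a slot frees up
        } catch (InterruptedException ie) {
          Thread.currentThread().interrupt();
          throw new RuntimeException(ie);
        }
      }
      inFlight++;
    }
    try {
      super.execute(task);
    } catch (RuntimeException e) {
      release();                         // the task never started; free its slot
      throw e;
    }
  }

  @Override
  protected void afterExecute(Runnable r, Throwable t) {
    super.afterExecute(r, t);
    release();                           // runs whether the task succeeded or threw
  }

  private void release() {
    synchronized (lock) {
      inFlight--;
      lock.notifyAll();                  // wake callers waiting in execute()
    }
  }
}
```

Because the caller waits before handing the task over, at most as many records as there are threads are pending at any time, and no rejection handler is involved.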


  • Alejandro Abdelnur (JIRA) at Mar 28, 2008 at 7:36 am
    [ https://issues.apache.org/jira/browse/HADOOP-3104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Alejandro Abdelnur updated HADOOP-3104:
    ---------------------------------------

    Status: Patch Available (was: Open)
  • Amar Kamat (JIRA) at Mar 28, 2008 at 8:16 am
    [ https://issues.apache.org/jira/browse/HADOOP-3104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12582936#action_12582936 ]

    Amar Kamat commented on HADOOP-3104:
    ------------------------------------

    bq. Instead of treating InterruptedException as a noop, it would be better to throw an IOException with the InterruptedException as its cause.
    Shouldn't it log it and continue to busy wait?
  • Devaraj Das (JIRA) at Mar 28, 2008 at 9:14 am
    [ https://issues.apache.org/jira/browse/HADOOP-3104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12582947#action_12582947 ]

    Devaraj Das commented on HADOOP-3104:
    -------------------------------------

    bq. Shouldn't it log it and continue to busy wait?

    This should be okay given that the context is a task execution and not a daemon execution.
  • Devaraj Das (JIRA) at Mar 28, 2008 at 10:48 am
    [ https://issues.apache.org/jira/browse/HADOOP-3104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12582979#action_12582979 ]

    Devaraj Das commented on HADOOP-3104:
    -------------------------------------

    +1
  • Owen O'Malley (JIRA) at Mar 28, 2008 at 7:16 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Owen O'Malley updated HADOOP-3104:
    ----------------------------------

    Status: Patch Available (was: Open)
  • Owen O'Malley (JIRA) at Mar 28, 2008 at 7:16 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Owen O'Malley updated HADOOP-3104:
    ----------------------------------

    Status: Open (was: Patch Available)
  • Owen O'Malley (JIRA) at Mar 28, 2008 at 7:16 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Owen O'Malley updated HADOOP-3104:
    ----------------------------------

    Attachment: patch3104-2.txt

    I replaced the submitJob followed by polling with runJob. I also replaced the BlockingTaskExecutor with a TaskExecutor that uses a truly blocking queue.
  • Chris Douglas (JIRA) at Mar 28, 2008 at 7:32 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12583159#action_12583159 ]

    Chris Douglas commented on HADOOP-3104:
    ---------------------------------------

    Internally, ThreadPoolExecutor uses offer() rather than add() when the queue is saturated. With that minor change, +1
  • Nigel Daley (JIRA) at Mar 28, 2008 at 7:48 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12583166#action_12583166 ]

    Nigel Daley commented on HADOOP-3104:
    -------------------------------------

    Hudson is down right now. The machine is getting new memory.

    In the interest of time, I ran the test-patch target on this. Here's the output:

    -1 overall.

    @author +1. The patch does not contain any @author tags.

    tests included +1. The patch appears to include 3 new or modified tests.

    javadoc +1. The javadoc tool did not generate any warning messages.

    javac -1. The applied patch generated 568 javac compiler warnings (more than the trunk's current 567 warnings).

    findbugs +1. The patch does not introduce any new Findbugs warnings.

    The javac issue is that a new private class does not have a serialVersionUID, which is acceptable.
  • Owen O'Malley (JIRA) at Mar 28, 2008 at 8:22 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Owen O'Malley updated HADOOP-3104:
    ----------------------------------

    Attachment: patch3104-3.txt

    In addition to the changes in the previous patch, this one overrides offer() as well as add().
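
Editor's sketch: the thread describes making ThreadPoolExecutor.execute() block by handing it a work queue whose offer() and add() wait for space instead of returning immediately. The following is a minimal illustration of that idea, not the contents of patch3104-3.txt. The names BlockingExecutorSketch, BlockingArrayQueue, and runAll are hypothetical, and the explicit serialVersionUID is shown only because the javac warning mentioned earlier in the thread concerns a class without one.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class BlockingExecutorSketch {

    /** Hypothetical bounded queue: offer() and add() wait for space when full. */
    static class BlockingArrayQueue extends ArrayBlockingQueue<Runnable> {
        // ArrayBlockingQueue is Serializable, so javac warns if this is missing.
        private static final long serialVersionUID = 1L;

        BlockingArrayQueue(int capacity) {
            super(capacity);
        }

        @Override
        public boolean offer(Runnable r) {
            try {
                put(r); // blocks the caller until a slot is free
                return true;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt
                return false;
            }
        }

        @Override
        public boolean add(Runnable r) {
            if (!offer(r)) {
                throw new IllegalStateException("interrupted while waiting for space");
            }
            return true;
        }
    }

    /** Submits `tasks` short jobs to `threads` workers; returns how many completed. */
    static int runAll(int threads, int tasks) {
        AtomicInteger done = new AtomicInteger();
        // execute() calls offer() once all core threads are busy, so the
        // submitting thread stalls here instead of queuing every record.
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                threads, threads, 0L, TimeUnit.MILLISECONDS,
                new BlockingArrayQueue(threads));
        for (int i = 0; i < tasks; i++) {
            executor.execute(() -> {
                try {
                    Thread.sleep(5); // stand-in for the work done per record
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                done.incrementAndGet();
            });
        }
        executor.shutdown();
        try {
            executor.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return done.get();
    }

    public static void main(String[] args) {
        System.out.println("completed " + runAll(2, 20) + " tasks");
    }
}
```

With this arrangement, at most `threads` pending runnables sit in the queue at any time, so the producer cannot read the whole input into memory ahead of the workers.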
  • Chris Douglas (JIRA) at Mar 28, 2008 at 8:24 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12583182#action_12583182 ]

    Chris Douglas commented on HADOOP-3104:
    ---------------------------------------

    +1 Looks good
  • Owen O'Malley (JIRA) at Mar 28, 2008 at 9:20 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Owen O'Malley updated HADOOP-3104:
    ----------------------------------

    Resolution: Fixed
    Status: Resolved (was: Patch Available)

    I just committed this. Thanks, Alejandro!
  • Hadoop QA (JIRA) at Mar 28, 2008 at 10:16 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12583212#action_12583212 ]

    Hadoop QA commented on HADOOP-3104:
    -----------------------------------

    -1 overall. Here are the results of testing the latest attachment
    http://issues.apache.org/jira/secure/attachment/12378815/patch3104-3.txt
    against trunk revision 619744.

    @author +1. The patch does not contain any @author tags.

    tests included +1. The patch appears to include 3 new or modified tests.

    javadoc +1. The javadoc tool did not generate any warning messages.

    javac -1. The applied patch generated 568 javac compiler warnings (more than the trunk's current 567 warnings).

    release audit +1. The applied patch does not generate any new release audit warnings.

    findbugs +1. The patch does not introduce any new Findbugs warnings.

    core tests -1. The patch failed core unit tests.

    contrib tests +1. The patch passed contrib unit tests.

    Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2089/testReport/
    Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2089/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
    Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2089/artifact/trunk/build/test/checkstyle-errors.html
    Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2089/console

    This message is automatically generated.
  • Hudson (JIRA) at Mar 29, 2008 at 12:11 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12583313#action_12583313 ]

    Hudson commented on HADOOP-3104:
    --------------------------------

    Integrated in Hadoop-trunk #445 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/445/])

Discussion Overview
group: common-dev @
categories: hadoop
posted: Mar 27, '08 at 7:29a
active: Mar 29, '08 at 12:11p
posts: 27
users: 1
website: hadoop.apache.org...
irc: #hadoop
