DistributedFileSystem.close() introduces a deadlock
---------------------------------------------------

Key: HADOOP-3139
URL: https://issues.apache.org/jira/browse/HADOOP-3139
Project: Hadoop Core
Issue Type: Bug
Components: dfs
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE


Koji found the following:

My dfs -ls hang.
Ctrl-Z showed a deadlock state.

"Thread-0":
at org.apache.hadoop.dfs.DistributedFileSystem.close(DistributedFileSystem.java:190)
- waiting to lock <0xedde8788> (a org.apache.hadoop.dfs.DistributedFileSystem)
at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:1231)
- locked <0xee0baf88> (a org.apache.hadoop.fs.FileSystem$Cache)
at org.apache.hadoop.fs.FileSystem.closeAll(FileSystem.java:169)
at org.apache.hadoop.fs.FileSystem$ClientFinalizer.run(FileSystem.java:154)
- locked <0xee0bae40> (a org.apache.hadoop.fs.FileSystem$ClientFinalizer)
"main":
at org.apache.hadoop.fs.FileSystem$Cache.remove(FileSystem.java:1201)
- waiting to lock <0xee0baf88> (a org.apache.hadoop.fs.FileSystem$Cache)
at org.apache.hadoop.fs.FileSystem.close(FileSystem.java:1085)
at org.apache.hadoop.dfs.DistributedFileSystem.close(DistributedFileSystem.java:192)
- locked <0xedde8788> (a org.apache.hadoop.dfs.DistributedFileSystem)
at org.apache.hadoop.fs.FsShell.close(FsShell.java:1698)
at org.apache.hadoop.fs.FsShell.main(FsShell.java:1712)

Found 1 deadlock.
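
The trace above shows a lock-order inversion: "Thread-0" holds the FileSystem$Cache monitor and waits for the DistributedFileSystem monitor, while "main" holds them the other way around. Below is a minimal sketch of that pattern, not Hadoop code. The class name, thread names, and the two plain Object locks are hypothetical stand-ins. It uses the standard ThreadMXBean.findMonitorDeadlockedThreads() call to ask the JVM for the same kind of report that the thread dump printed.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

public class LockOrderSketch {

    /**
     * Starts two daemon threads that take two monitors in opposite order,
     * then polls the JVM for monitor deadlocks. Returns how many of the
     * two threads appear in the reported set (0 if none is seen in time).
     */
    static int countDeadlockedThreads() {
        final Object cacheLock = new Object(); // stand-in for FileSystem$Cache
        final Object fsLock = new Object();    // stand-in for DistributedFileSystem
        final CountDownLatch bothHoldFirst = new CountDownLatch(2);

        // Like "Thread-0": holds the cache monitor, then wants the file system monitor.
        Thread finalizer = new Thread(() -> {
            synchronized (cacheLock) {
                arriveAndWait(bothHoldFirst);
                synchronized (fsLock) { /* blocked: the other thread owns fsLock */ }
            }
        }, "finalizer-sketch");

        // Like "main": holds the file system monitor, then wants the cache monitor.
        Thread closer = new Thread(() -> {
            synchronized (fsLock) {
                arriveAndWait(bothHoldFirst);
                synchronized (cacheLock) { /* blocked: the other thread owns cacheLock */ }
            }
        }, "closer-sketch");

        // Daemon threads so the stuck pair does not keep the JVM alive.
        finalizer.setDaemon(true);
        closer.setDaemon(true);
        finalizer.start();
        closer.start();

        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        for (int attempt = 0; attempt < 100; attempt++) {
            long[] ids = threads.findMonitorDeadlockedThreads(); // null when no deadlock
            if (ids != null) {
                int ours = 0;
                for (long id : ids) {
                    if (id == finalizer.getId() || id == closer.getId()) {
                        ours++;
                    }
                }
                if (ours > 0) {
                    return ours;
                }
            }
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return 0;
            }
        }
        return 0;
    }

    // Signals that the caller holds its first lock, then waits for the other thread.
    private static void arriveAndWait(CountDownLatch latch) {
        latch.countDown();
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("deadlocked threads: " + countDeadlockedThreads());
    }
}
```

The latch only makes the interleaving deterministic for the sketch. In the reported hang the same interleaving happened by chance between the shell's main thread and the shutdown hook.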

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

  • Tsz Wo (Nicholas), SZE (JIRA) at Mar 31, 2008 at 8:15 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12583818#action_12583818 ]

    Tsz Wo (Nicholas), SZE commented on HADOOP-3139:
    ------------------------------------------------

    The "synchronized" in DistributedFileSystem.close() is not necessary and should be removed.
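
A rough sketch of why dropping "synchronized" from close() breaks the cycle, under stated assumptions: Cache and Fs below are hypothetical, simplified stand-ins and not the classes or the patch from this issue. With close() unsynchronized, the cache monitor is the only lock either thread ever takes, so no lock-order cycle can form.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CloseWithoutMonitorSketch {

    /** Hypothetical stand-in for FileSystem$Cache: every method takes the cache monitor. */
    static class Cache {
        private final Map<String, Fs> map = new HashMap<>();

        synchronized void put(String key, Fs fs) { map.put(key, fs); }

        synchronized void remove(String key) { map.remove(key); }

        synchronized int size() { return map.size(); }

        synchronized void closeAll() {
            // Copy first: close() calls remove(), which mutates the map.
            List<Fs> snapshot = new ArrayList<>(map.values());
            for (Fs fs : snapshot) {
                fs.close();
            }
        }
    }

    /** Hypothetical stand-in for a cached file system client. */
    static class Fs {
        private final Cache cache;
        private final String key;

        Fs(Cache cache, String key) {
            this.cache = cache;
            this.key = key;
        }

        // Deliberately NOT synchronized: the caller never holds this object's
        // monitor while waiting for the cache monitor.
        void close() {
            cache.remove(key);
        }
    }

    /** Runs closeAll() and close() concurrently; returns the cache size afterwards. */
    static int closeConcurrently() {
        Cache cache = new Cache();
        Fs fs = new Fs(cache, "hdfs");
        cache.put("hdfs", fs);

        Thread finalizer = new Thread(cache::closeAll, "finalizer-sketch");
        Thread closer = new Thread(fs::close, "closer-sketch");
        finalizer.setDaemon(true);
        closer.setDaemon(true);
        finalizer.start();
        closer.start();
        try {
            finalizer.join(5000);
            closer.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        if (finalizer.isAlive() || closer.isAlive()) {
            throw new IllegalStateException("threads did not finish");
        }
        return cache.size();
    }

    public static void main(String[] args) {
        System.out.println("entries left in cache: " + closeConcurrently());
    }
}
```

In this sketch close() may run twice for the same object, once from each thread. That is harmless here because remove() is idempotent; whether the real close() tolerates a second call is not something this thread states.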
  • Tsz Wo (Nicholas), SZE (JIRA) at Mar 31, 2008 at 8:15 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Tsz Wo (Nicholas), SZE updated HADOOP-3139:
    -------------------------------------------

    Description:
    Koji found the following:

    My dfs -ls hang.
    Ctrl-Z showed a deadlock state.
    {noformat}
    "Thread-0":
    at org.apache.hadoop.dfs.DistributedFileSystem.close(DistributedFileSystem.java:190)
    - waiting to lock <0xedde8788> (a org.apache.hadoop.dfs.DistributedFileSystem)
    at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:1231)
    - locked <0xee0baf88> (a org.apache.hadoop.fs.FileSystem$Cache)
    at org.apache.hadoop.fs.FileSystem.closeAll(FileSystem.java:169)
    at org.apache.hadoop.fs.FileSystem$ClientFinalizer.run(FileSystem.java:154)
    - locked <0xee0bae40> (a org.apache.hadoop.fs.FileSystem$ClientFinalizer)
    "main":
    at org.apache.hadoop.fs.FileSystem$Cache.remove(FileSystem.java:1201)
    - waiting to lock <0xee0baf88> (a org.apache.hadoop.fs.FileSystem$Cache)
    at org.apache.hadoop.fs.FileSystem.close(FileSystem.java:1085)
    at org.apache.hadoop.dfs.DistributedFileSystem.close(DistributedFileSystem.java:192)
    - locked <0xedde8788> (a org.apache.hadoop.dfs.DistributedFileSystem)
    at org.apache.hadoop.fs.FsShell.close(FsShell.java:1698)
    at org.apache.hadoop.fs.FsShell.main(FsShell.java:1712)

    Found 1 deadlock.
    {noformat}

    was:
    Koji found the following:

    My dfs -ls hang.
    Ctrl-Z showed a deadlock state.

    "Thread-0":
    at org.apache.hadoop.dfs.DistributedFileSystem.close(DistributedFileSystem.java:190)
    - waiting to lock <0xedde8788> (a org.apache.hadoop.dfs.DistributedFileSystem)
    at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:1231)
    - locked <0xee0baf88> (a org.apache.hadoop.fs.FileSystem$Cache)
    at org.apache.hadoop.fs.FileSystem.closeAll(FileSystem.java:169)
    at org.apache.hadoop.fs.FileSystem$ClientFinalizer.run(FileSystem.java:154)
    - locked <0xee0bae40> (a org.apache.hadoop.fs.FileSystem$ClientFinalizer)
    "main":
    at org.apache.hadoop.fs.FileSystem$Cache.remove(FileSystem.java:1201)
    - waiting to lock <0xee0baf88> (a org.apache.hadoop.fs.FileSystem$Cache)
    at org.apache.hadoop.fs.FileSystem.close(FileSystem.java:1085)
    at org.apache.hadoop.dfs.DistributedFileSystem.close(DistributedFileSystem.java:192)
    - locked <0xedde8788> (a org.apache.hadoop.dfs.DistributedFileSystem)
    at org.apache.hadoop.fs.FsShell.close(FsShell.java:1698)
    at org.apache.hadoop.fs.FsShell.main(FsShell.java:1712)

    Found 1 deadlock.

  • Tsz Wo (Nicholas), SZE (JIRA) at Mar 31, 2008 at 9:26 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Tsz Wo (Nicholas), SZE updated HADOOP-3139:
    -------------------------------------------

    Attachment: 3139_20080331.patch

    3139_20080331.patch:
    - removed synchronized in DistributedFileSystem.close().
    - also fixed a bug in HftpFileSystem. Otherwise, distcp will show an extra warning message.
  • Tsz Wo (Nicholas), SZE (JIRA) at Mar 31, 2008 at 9:30 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Tsz Wo (Nicholas), SZE updated HADOOP-3139:
    -------------------------------------------

    Status: Patch Available (was: Open)
  • Robert Chansler (JIRA) at Mar 31, 2008 at 10:02 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Robert Chansler updated HADOOP-3139:
    ------------------------------------

    Fix Version/s: 0.17.0
    Priority: Blocker (was: Major)
  • Hadoop QA (JIRA) at Apr 1, 2008 at 6:16 am
    [ https://issues.apache.org/jira/browse/HADOOP-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12584025#action_12584025 ]

    Hadoop QA commented on HADOOP-3139:
    -----------------------------------

    -1 overall. Here are the results of testing the latest attachment
    http://issues.apache.org/jira/secure/attachment/12378978/3139_20080331.patch
    against trunk revision 619744.

    @author +1. The patch does not contain any @author tags.

    tests included -1. The patch doesn't appear to include any new or modified tests.
    Please justify why no tests are needed for this patch.

    javadoc +1. The javadoc tool did not generate any warning messages.

    javac +1. The applied patch does not generate any new javac compiler warnings.

    release audit +1. The applied patch does not generate any new release audit warnings.

    findbugs +1. The patch does not introduce any new Findbugs warnings.

    core tests +1. The patch passed core unit tests.

    contrib tests +1. The patch passed contrib unit tests.

    Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2106/testReport/
    Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2106/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
    Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2106/artifact/trunk/build/test/checkstyle-errors.html
    Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2106/console

    This message is automatically generated.
  • Koji Noguchi (JIRA) at Apr 1, 2008 at 5:21 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12584220#action_12584220 ]

    Koji Noguchi commented on HADOOP-3139:
    --------------------------------------
    > also fixed a bug in HftpFileSystem. Otherwise, distcp will show an extra warning message.
    Just fyi, this is the warning message I was getting before the patch.

    {noformat}
    08/03/31 16:48:42 INFO mapred.JobClient: distcp
    08/03/31 16:48:42 INFO mapred.JobClient: Files copied=1
    08/03/31 16:48:42 INFO mapred.JobClient: Bytes copied=46
    08/03/31 16:48:42 INFO mapred.JobClient: Bytes expected=46
    08/03/31 16:48:42 INFO mapred.JobClient: Map-Reduce Framework
    08/03/31 16:48:42 INFO mapred.JobClient: Map input records=1
    08/03/31 16:48:42 INFO mapred.JobClient: Map input bytes=114
    08/03/31 16:48:42 INFO fs.FileSystem: FileSystem.closeAll() threw an exception:
    java.io.IOException: HftpFileSystem(=org.apache.hadoop.dfs.HftpFileSystem@111111) and
    Key(=null@hftp://namenode-nn:4444) do not match.
    {noformat}
  • Tsz Wo (Nicholas), SZE (JIRA) at Apr 1, 2008 at 7:18 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Tsz Wo (Nicholas), SZE updated HADOOP-3139:
    -------------------------------------------

    Description:
    Koji found the following:

    *DistributedFileSystem.close() deadlock*
    My dfs -ls hang.
    Ctrl-Z showed a deadlock state.
    {noformat}
    "Thread-0":
    at org.apache.hadoop.dfs.DistributedFileSystem.close(DistributedFileSystem.java:190)
    - waiting to lock <0xedde8788> (a org.apache.hadoop.dfs.DistributedFileSystem)
    at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:1231)
    - locked <0xee0baf88> (a org.apache.hadoop.fs.FileSystem$Cache)
    at org.apache.hadoop.fs.FileSystem.closeAll(FileSystem.java:169)
    at org.apache.hadoop.fs.FileSystem$ClientFinalizer.run(FileSystem.java:154)
    - locked <0xee0bae40> (a org.apache.hadoop.fs.FileSystem$ClientFinalizer)
    "main":
    at org.apache.hadoop.fs.FileSystem$Cache.remove(FileSystem.java:1201)
    - waiting to lock <0xee0baf88> (a org.apache.hadoop.fs.FileSystem$Cache)
    at org.apache.hadoop.fs.FileSystem.close(FileSystem.java:1085)
    at org.apache.hadoop.dfs.DistributedFileSystem.close(DistributedFileSystem.java:192)
    - locked <0xedde8788> (a org.apache.hadoop.dfs.DistributedFileSystem)
    at org.apache.hadoop.fs.FsShell.close(FsShell.java:1698)
    at org.apache.hadoop.fs.FsShell.main(FsShell.java:1712)

    Found 1 deadlock.
    {noformat}

    *FileSystem.closeAll() warning*
    {noformat}
    08/03/31 16:48:42 INFO fs.FileSystem: FileSystem.closeAll() threw an exception:
    java.io.IOException: HftpFileSystem(=org.apache.hadoop.dfs.HftpFileSystem@111111) and
    Key(=null@hftp://namenode-nn:4444) do not match.
    {noformat}

    was:
    Koji found the following:

    My dfs -ls hang.
    Ctrl-Z showed a deadlock state.
    {noformat}
    "Thread-0":
    at org.apache.hadoop.dfs.DistributedFileSystem.close(DistributedFileSystem.java:190)
    - waiting to lock <0xedde8788> (a org.apache.hadoop.dfs.DistributedFileSystem)
    at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:1231)
    - locked <0xee0baf88> (a org.apache.hadoop.fs.FileSystem$Cache)
    at org.apache.hadoop.fs.FileSystem.closeAll(FileSystem.java:169)
    at org.apache.hadoop.fs.FileSystem$ClientFinalizer.run(FileSystem.java:154)
    - locked <0xee0bae40> (a org.apache.hadoop.fs.FileSystem$ClientFinalizer)
    "main":
    at org.apache.hadoop.fs.FileSystem$Cache.remove(FileSystem.java:1201)
    - waiting to lock <0xee0baf88> (a org.apache.hadoop.fs.FileSystem$Cache)
    at org.apache.hadoop.fs.FileSystem.close(FileSystem.java:1085)
    at org.apache.hadoop.dfs.DistributedFileSystem.close(DistributedFileSystem.java:192)
    - locked <0xedde8788> (a org.apache.hadoop.dfs.DistributedFileSystem)
    at org.apache.hadoop.fs.FsShell.close(FsShell.java:1698)
    at org.apache.hadoop.fs.FsShell.main(FsShell.java:1712)

    Found 1 deadlock.
    {noformat}

    Summary: DistributedFileSystem.close() deadlock and FileSystem.closeAll() warning (was: DistributedFileSystem.close() introduces a deadlock)
  • Tsz Wo (Nicholas), SZE (JIRA) at Apr 1, 2008 at 7:20 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Tsz Wo (Nicholas), SZE updated HADOOP-3139:
    -------------------------------------------

    Status: Open (was: Patch Available)
  • Tsz Wo (Nicholas), SZE (JIRA) at Apr 3, 2008 at 12:42 am
    [ https://issues.apache.org/jira/browse/HADOOP-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Tsz Wo (Nicholas), SZE updated HADOOP-3139:
    -------------------------------------------

    Status: Patch Available (was: Open)
  • Tsz Wo (Nicholas), SZE (JIRA) at Apr 3, 2008 at 12:42 am
    [ https://issues.apache.org/jira/browse/HADOOP-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Tsz Wo (Nicholas), SZE updated HADOOP-3139:
    -------------------------------------------

    Attachment: 3139_20080402b.patch

    3139_20080402b.patch: added a test
    DistributedFileSystem.close() deadlock and FileSystem.closeAll() warning
    ------------------------------------------------------------------------

    Key: HADOOP-3139
    URL: https://issues.apache.org/jira/browse/HADOOP-3139
    Project: Hadoop Core
    Issue Type: Bug
    Components: dfs
    Reporter: Tsz Wo (Nicholas), SZE
    Assignee: Tsz Wo (Nicholas), SZE
    Priority: Blocker
    Fix For: 0.17.0

    Attachments: 3139_20080331.patch, 3139_20080402b.patch


    Koji found the following:
    *DistributedFileSystem.close() deadlock*
    My dfs -ls hang.
    Ctrl-Z showed a deadlock state.
    {noformat}
    "Thread-0":
    at org.apache.hadoop.dfs.DistributedFileSystem.close(DistributedFileSystem.java:190)
    - waiting to lock <0xedde8788> (a org.apache.hadoop.dfs.DistributedFileSystem)
    at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:1231)
    - locked <0xee0baf88> (a org.apache.hadoop.fs.FileSystem$Cache)
    at org.apache.hadoop.fs.FileSystem.closeAll(FileSystem.java:169)
    at org.apache.hadoop.fs.FileSystem$ClientFinalizer.run(FileSystem.java:154)
    - locked <0xee0bae40> (a org.apache.hadoop.fs.FileSystem$ClientFinalizer)
    "main":
    at org.apache.hadoop.fs.FileSystem$Cache.remove(FileSystem.java:1201)
    - waiting to lock <0xee0baf88> (a org.apache.hadoop.fs.FileSystem$Cache)
    at org.apache.hadoop.fs.FileSystem.close(FileSystem.java:1085)
    at org.apache.hadoop.dfs.DistributedFileSystem.close(DistributedFileSystem.java:192)
    - locked <0xedde8788> (a org.apache.hadoop.dfs.DistributedFileSystem)
    at org.apache.hadoop.fs.FsShell.close(FsShell.java:1698)
    at org.apache.hadoop.fs.FsShell.main(FsShell.java:1712)
    Found 1 deadlock.
    {noformat}
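
The trace above shows two locks taken in opposite orders: "Thread-0" holds the cache lock and waits for the file system lock, while "main" holds the file system lock and waits for the cache lock. The following is a minimal sketch of that shape, assuming simplified stand-in classes (`Cache` and `Fs` are hypothetical and are not the Hadoop classes). Here `Fs.close()` is deliberately not synchronized, so only the cache lock is involved and the inversion cannot occur.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CloseOrderingSketch {
  /** Hypothetical stand-in for a file system cache guarded by its own lock. */
  static class Cache {
    private final Map<String, Fs> map = new HashMap<>();

    synchronized void put(String key, Fs fs) { map.put(key, fs); }

    synchronized void remove(String key) { map.remove(key); }

    synchronized int size() { return map.size(); }

    /** Holds the cache lock while closing every entry, as Thread-0 does in the trace. */
    synchronized void closeAll() {
      List<Fs> snapshot = new ArrayList<>(map.values());
      map.clear();
      for (Fs fs : snapshot) {
        fs.close(); // re-enters remove() on the same thread
      }
    }
  }

  /** Hypothetical stand-in for a cached file system. */
  static class Fs {
    private final Cache cache;
    private final String key;

    Fs(Cache cache, String key) {
      this.cache = cache;
      this.key = key;
    }

    // Not synchronized: no Fs lock is held while waiting for the cache lock.
    void close() {
      cache.remove(key);
    }
  }

  /** Closes from two threads at once; true if both finish and the cache is empty. */
  static boolean closeConcurrently() {
    Cache cache = new Cache();
    Fs fs = new Fs(cache, "hdfs");
    cache.put("hdfs", fs);
    Thread finalizer = new Thread(cache::closeAll, "Thread-0");
    finalizer.start();
    fs.close(); // the "main" side of the trace
    try {
      finalizer.join(5000);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
    return !finalizer.isAlive() && cache.size() == 0;
  }

  public static void main(String[] args) {
    System.out.println("finished without deadlock: " + closeConcurrently());
  }
}
```

Declaring `Fs.close()` as `synchronized` would reintroduce the second lock and, with unlucky timing, the cycle shown in the trace.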
    *FileSystem.closeAll() warning*
    {noformat}
    08/03/31 16:48:42 INFO fs.FileSystem: FileSystem.closeAll() threw an exception:
    java.io.IOException: HftpFileSystem(=org.apache.hadoop.dfs.HftpFileSystem@111111) and
    Key(=null@hftp://namenode-nn:4444) do not match.
    {noformat}
    --
    This message is automatically generated by JIRA.
    -
    You can reply to this email to add a comment to the issue online.
  • Konstantin Shvachko (JIRA) at Apr 3, 2008 at 1:18 am
    [ https://issues.apache.org/jira/browse/HADOOP-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12584946#action_12584946 ]

    Konstantin Shvachko commented on HADOOP-3139:
    ---------------------------------------------

    +1
  • Hadoop QA (JIRA) at Apr 3, 2008 at 9:14 am
    [ https://issues.apache.org/jira/browse/HADOOP-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12585022#action_12585022 ]

    Hadoop QA commented on HADOOP-3139:
    -----------------------------------

    -1 overall. Here are the results of testing the latest attachment
    http://issues.apache.org/jira/secure/attachment/12379213/3139_20080402b.patch
    against trunk revision 643282.

    @author +1. The patch does not contain any @author tags.

    tests included +1. The patch appears to include 3 new or modified tests.

    javadoc +1. The javadoc tool did not generate any warning messages.

    javac +1. The applied patch does not generate any new javac compiler warnings.

    release audit +1. The applied patch does not generate any new release audit warnings.

    findbugs +1. The patch does not introduce any new Findbugs warnings.

    core tests -1. The patch failed core unit tests.

    contrib tests +1. The patch passed contrib unit tests.

    Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2135/testReport/
    Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2135/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
    Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2135/artifact/trunk/build/test/checkstyle-errors.html
    Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2135/console

    This message is automatically generated.
  • Tsz Wo (Nicholas), SZE (JIRA) at Apr 3, 2008 at 1:25 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Tsz Wo (Nicholas), SZE updated HADOOP-3139:
    -------------------------------------------

    Status: Open (was: Patch Available)
  • Tsz Wo (Nicholas), SZE (JIRA) at Apr 3, 2008 at 1:25 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Tsz Wo (Nicholas), SZE updated HADOOP-3139:
    -------------------------------------------

    Status: Patch Available (was: Open)
  • Tsz Wo (Nicholas), SZE (JIRA) at Apr 3, 2008 at 9:43 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Tsz Wo (Nicholas), SZE updated HADOOP-3139:
    -------------------------------------------

    Attachment: 3139_20080403.patch

    3139_20080403.patch:
    In HADOOP-3003, I added a consistency check to FileSystem.Cache.closeAll() that verifies the cache key is consistent with the stored fs (using the conf returned by fs.getConf()). However, a conf can be shared by several FileSystem objects and by other objects such as the JobTracker, so it may be modified at any time and the check makes no sense. This patch removes the consistency check. The conf should not be shared between FileSystem objects in the first place; we should fix that in a separate issue.
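
To make the failure mode concrete, here is a minimal sketch assuming a simplified key and a plain map in place of the configuration object. The `Key` class, the `user.name` property, and the example URI are hypothetical and are not the Hadoop implementation. A key is built when the file system is cached; if the conf is shared and later modified by another holder, a key rebuilt from the same conf at close time no longer matches.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class SharedConfSketch {
  /** Hypothetical cache key derived from a URI and a user name read from the conf. */
  static final class Key {
    final String user;
    final String uri;

    Key(String uri, Map<String, String> conf) {
      this.uri = uri;
      this.user = conf.get("user.name"); // hypothetical property name
    }

    @Override public boolean equals(Object o) {
      if (!(o instanceof Key)) {
        return false;
      }
      Key k = (Key) o;
      return Objects.equals(user, k.user) && Objects.equals(uri, k.uri);
    }

    @Override public int hashCode() { return Objects.hash(user, uri); }

    @Override public String toString() { return user + "@" + uri; }
  }

  /** Returns whether a key rebuilt at close time equals the key stored at creation time. */
  static boolean keyMatchesAtClose(boolean shareConf) {
    Map<String, String> callerConf = new HashMap<>();
    callerConf.put("user.name", "alice");

    // The file system keeps either the caller's conf object or a private copy.
    Map<String, String> fsConf = shareConf ? callerConf : new HashMap<>(callerConf);
    Key stored = new Key("hftp://example-host:1234", fsConf);

    // Another holder of the caller's conf changes it after the key was stored.
    callerConf.remove("user.name");

    Key rebuilt = new Key("hftp://example-host:1234", fsConf);
    return stored.equals(rebuilt);
  }

  public static void main(String[] args) {
    System.out.println("shared conf matches:  " + keyMatchesAtClose(true));
    System.out.println("private copy matches: " + keyMatchesAtClose(false));
  }
}
```

With a shared conf the rebuilt key carries a null user, much like the `null@hftp://...` key in the warning; with a private copy the keys stay equal.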
  • Hadoop QA (JIRA) at Apr 4, 2008 at 12:01 am
    [ https://issues.apache.org/jira/browse/HADOOP-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12585351#action_12585351 ]

    Hadoop QA commented on HADOOP-3139:
    -----------------------------------

    +1 overall. Here are the results of testing the latest attachment
    http://issues.apache.org/jira/secure/attachment/12379318/3139_20080403.patch
    against trunk revision 643282.

    @author +1. The patch does not contain any @author tags.

    tests included +1. The patch appears to include 3 new or modified tests.

    javadoc +1. The javadoc tool did not generate any warning messages.

    javac +1. The applied patch does not generate any new javac compiler warnings.

    release audit +1. The applied patch does not generate any new release audit warnings.

    findbugs +1. The patch does not introduce any new Findbugs warnings.

    core tests +1. The patch passed core unit tests.

    contrib tests +1. The patch passed contrib unit tests.

    Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2147/testReport/
    Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2147/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
    Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2147/artifact/trunk/build/test/checkstyle-errors.html
    Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2147/console

    This message is automatically generated.
  • Tsz Wo (Nicholas), SZE (JIRA) at Apr 4, 2008 at 12:19 am
    [ https://issues.apache.org/jira/browse/HADOOP-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Tsz Wo (Nicholas), SZE updated HADOOP-3139:
    -------------------------------------------

    Fix Version/s: 0.16.3
  • Konstantin Shvachko (JIRA) at Apr 4, 2008 at 12:34 am
    [ https://issues.apache.org/jira/browse/HADOOP-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12585356#action_12585356 ]

    Konstantin Shvachko commented on HADOOP-3139:
    ---------------------------------------------

    +1
    The important thing is that the deadlock problem is solved.
    It turned out that the only way to get rid of the warning is to remove the verification itself.
    The underlying problem is that people keep using the configuration class as a container for passing
    parameters between methods, which is bad practice.

  • Nigel Daley (JIRA) at Apr 4, 2008 at 8:34 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Nigel Daley updated HADOOP-3139:
    --------------------------------

    Fix Version/s: (was: 0.17.0)
  • Tsz Wo (Nicholas), SZE (JIRA) at Apr 4, 2008 at 11:09 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Tsz Wo (Nicholas), SZE updated HADOOP-3139:
    -------------------------------------------

    Attachment: 3139_20080403_0.16.patch

    3139_20080403_0.16.patch:
    0.16 needs a sleep() in the test. Otherwise, the test fails with (a) Java 1.5.0_13 on Mac and (b) Java 1.5.0_08 on Linux. It does not fail with Java 1.5.0_14 on Windows or with Java 1.6 on any platform. In the failing cases (a) and (b), the test passes when -Dtest.output=yes is passed to ant. It is probably a bug in Java 1.5 or JUnit.
  • Chris Douglas (JIRA) at Apr 4, 2008 at 11:19 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Chris Douglas updated HADOOP-3139:
    ----------------------------------

    Resolution: Fixed
    Hadoop Flags: [Reviewed]
    Status: Resolved (was: Patch Available)

    I just committed this. Thanks, Nicholas!
    DistributedFileSystem.close() deadlock and FileSystem.closeAll() warning
    ------------------------------------------------------------------------

    Key: HADOOP-3139
    URL: https://issues.apache.org/jira/browse/HADOOP-3139
    Project: Hadoop Core
    Issue Type: Bug
    Components: dfs
    Reporter: Tsz Wo (Nicholas), SZE
    Assignee: Tsz Wo (Nicholas), SZE
    Priority: Blocker
    Fix For: 0.16.3

    Attachments: 3139_20080331.patch, 3139_20080402b.patch, 3139_20080403.patch, 3139_20080403_0.16.patch


    Koji found the following:
    *DistributedFileSystem.close() deadlock*
    My dfs -ls hang.
    Ctrl-Z showed a deadlock state.
    {noformat}
    "Thread-0":
    at org.apache.hadoop.dfs.DistributedFileSystem.close(DistributedFileSystem.java:190)
    - waiting to lock <0xedde8788> (a org.apache.hadoop.dfs.DistributedFileSystem)
    at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:1231)
    - locked <0xee0baf88> (a org.apache.hadoop.fs.FileSystem$Cache)
    at org.apache.hadoop.fs.FileSystem.closeAll(FileSystem.java:169)
    at org.apache.hadoop.fs.FileSystem$ClientFinalizer.run(FileSystem.java:154)
    - locked <0xee0bae40> (a org.apache.hadoop.fs.FileSystem$ClientFinalizer)
    "main":
    at org.apache.hadoop.fs.FileSystem$Cache.remove(FileSystem.java:1201)
    - waiting to lock <0xee0baf88> (a org.apache.hadoop.fs.FileSystem$Cache)
    at org.apache.hadoop.fs.FileSystem.close(FileSystem.java:1085)
    at org.apache.hadoop.dfs.DistributedFileSystem.close(DistributedFileSystem.java:192)
    - locked <0xedde8788> (a org.apache.hadoop.dfs.DistributedFileSystem)
    at org.apache.hadoop.fs.FsShell.close(FsShell.java:1698)
    at org.apache.hadoop.fs.FsShell.main(FsShell.java:1712)
    Found 1 deadlock.
    {noformat}
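    The two stack traces above show a lock-order inversion: "Thread-0" holds the FileSystem$Cache monitor and waits for the DistributedFileSystem monitor, while "main" holds the DistributedFileSystem monitor and waits for the cache. Below is a minimal sketch of one way to break that cycle, along the lines suggested earlier in this thread (dropping "synchronized" from close()). The class names CloseOrderSketch, Cache and Fs are hypothetical stand-ins, not Hadoop's code, and the sketch does not claim to reproduce the committed patch.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.Set;

public class CloseOrderSketch {

    /** Hypothetical stand-in for FileSystem$Cache: every method takes the cache monitor. */
    static class Cache {
        private final Set<Fs> entries = new HashSet<>();

        synchronized void add(Fs fs) { entries.add(fs); }

        synchronized void remove(Fs fs) { entries.remove(fs); }

        synchronized int size() { return entries.size(); }

        // Holds the cache monitor while closing each entry, as closeAll() does in the trace.
        // Iterates over a copy because close() removes entries from the set.
        synchronized void closeAll() {
            for (Fs fs : new ArrayList<>(entries)) {
                fs.close();
            }
        }
    }

    /** Hypothetical stand-in for a cached file system instance. */
    static class Fs {
        private final Cache cache;

        Fs(Cache cache) { this.cache = cache; }

        // Deliberately NOT synchronized: the caller never holds the Fs monitor
        // while waiting for the cache monitor, so no lock cycle can form.
        void close() { cache.remove(this); }
    }

    /** Runs a shutdown-style closer and a direct closer concurrently. */
    public static boolean runBothClosers() {
        Cache cache = new Cache();
        Fs fs = new Fs(cache);
        cache.add(fs);

        Thread shutdownHook = new Thread(cache::closeAll, "Thread-0");
        Thread shell = new Thread(fs::close, "shell");
        shutdownHook.start();
        shell.start();
        try {
            shutdownHook.join(5000);
            shell.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        // True only if neither thread is still blocked and the cache is empty.
        return !shutdownHook.isAlive() && !shell.isAlive() && cache.size() == 0;
    }

    public static void main(String[] args) {
        System.out.println("both closers finished: " + runBothClosers());
    }
}
```

    If close() in this sketch were declared synchronized, the "shell" thread could take the Fs monitor and then block on the cache while "Thread-0" holds the cache and blocks on the Fs monitor, which is the cycle reported above.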
    *FileSystem.closeAll() warning*
    {noformat}
    08/03/31 16:48:42 INFO fs.FileSystem: FileSystem.closeAll() threw an exception:
    java.io.IOException: HftpFileSystem(=org.apache.hadoop.dfs.HftpFileSystem@111111) and
    Key(=null@hftp://namenode-nn:4444) do not match.
    {noformat}
  • Hudson (JIRA) at Apr 5, 2008 at 12:17 pm
    [ https://issues.apache.org/jira/browse/HADOOP-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12585976#action_12585976 ]

    Hudson commented on HADOOP-3139:
    --------------------------------

    Integrated in Hadoop-trunk #451 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/451/])

Discussion Overview
group: common-dev @
categories: hadoop
posted: Mar 31, '08 at 8:12p
active: Apr 5, '08 at 12:17p
posts: 24
users: 1
website: hadoop.apache.org...
irc: #hadoop

