Provide examples of using offline image viewer (oiv) to analyze hadoop file systems
-----------------------------------------------------------------------------------

Key: HADOOP-5752
URL: https://issues.apache.org/jira/browse/HADOOP-5752
Project: Hadoop Core
Issue Type: Improvement
Components: dfs
Reporter: Jakob Homan
Assignee: Jakob Homan


The offline image viewer provides the ability to generate large amounts of data about an hdfs namespace. It would be good to provide tools, examples, etc. on how to analyze this data to find useful information.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

  • Jakob Homan (JIRA) at Apr 28, 2009 at 12:57 am
    [ https://issues.apache.org/jira/browse/HADOOP-5752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Jakob Homan updated HADOOP-5752:
    --------------------------------

    Attachment: HADOOP-5752.patch

    The OIV's output data are ripe for analysis. The attached patch:
    * Creates a new image processor, Delimited, that creates a (by default) tab-delimited file of the namespace that is suitable for analysis by other tools.
    * Updates the oiv documentation to provide examples of how to analyze these files using Pig to find probable duplicate files, files that have never been accessed, and the total number of files of each user in the namespace. These are meant as examples to help ops and such build other useful scripts.
    * Provides a unit test for the new DelimitedImageVisitor
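
    The documented examples in the patch use Pig. As a rough illustration of the same kind of analysis, here is a minimal sketch in Python instead. The column positions (PATH_COL, SIZE_COL, USER_COL), the function names and the sample rows are all assumptions made up for this example; they are not the documented layout of the Delimited output, so check the oiv documentation for the real column order before reusing it.

```python
from collections import Counter, defaultdict

# Hypothetical column positions; adjust to the real Delimited output layout.
PATH_COL = 0
SIZE_COL = 1
USER_COL = 2


def parse_rows(lines, delimiter="\t"):
    """Split each non-empty line of delimited output into a list of fields."""
    for line in lines:
        line = line.rstrip("\n")
        if line:
            yield line.split(delimiter)


def files_per_user(lines, delimiter="\t"):
    """Count how many rows belong to each user name."""
    counts = Counter()
    for row in parse_rows(lines, delimiter):
        counts[row[USER_COL]] += 1
    return counts


def probable_duplicates(lines, delimiter="\t"):
    """Group paths that share both a base file name and a size."""
    groups = defaultdict(list)
    for row in parse_rows(lines, delimiter):
        path = row[PATH_COL]
        name = path.rsplit("/", 1)[-1]
        groups[(name, int(row[SIZE_COL]))].append(path)
    # Keep only the groups with more than one path.
    return {key: paths for key, paths in groups.items() if len(paths) > 1}


# Made-up sample rows in the assumed path / size / user layout.
SAMPLE = (
    "/user/alice/a.txt\t100\talice\n"
    "/user/bob/a.txt\t100\tbob\n"
    "/user/bob/b.txt\t5\tbob\n"
)

if __name__ == "__main__":
    print(files_per_user(SAMPLE.splitlines()))
    print(probable_duplicates(SAMPLE.splitlines()))
```

    Matching on name and size only flags candidates; confirming a real duplicate would still need a content comparison such as a checksum.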

    Right now the script files themselves are not included in the patch because I couldn't figure out a good place to stash them in the file structure. Konstantin suggested adding them to the wiki, which would be nice as other users could add other scripts as they are created, but I don't see where the wiki hosts files like these. If it can, can someone please point me to them?

    Santhosh from the Pig team kindly reviewed and blessed the pig scripts.
  • Jakob Homan (JIRA) at Apr 28, 2009 at 12:57 am
    [ https://issues.apache.org/jira/browse/HADOOP-5752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Jakob Homan updated HADOOP-5752:
    --------------------------------

    Status: Patch Available (was: Open)

    Submitting patch. All unit tests pass. Testpatch:
    {noformat}
    [exec] +1 overall.
    [exec]
    [exec] +1 @author. The patch does not contain any @author tags.
    [exec]
    [exec] +1 tests included. The patch appears to include 3 new or modified tests.
    [exec]
    [exec] +1 javadoc. The javadoc tool did not generate any warning messages.
    [exec]
    [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    [exec]
    [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
    [exec]
    [exec] +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
    [exec]
    [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.

    {noformat}
  • Jakob Homan (JIRA) at Apr 28, 2009 at 2:46 am
    [ https://issues.apache.org/jira/browse/HADOOP-5752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Jakob Homan updated HADOOP-5752:
    --------------------------------

    Issue Type: New Feature (was: Improvement)
  • Hadoop QA (JIRA) at Apr 28, 2009 at 9:18 am
    [ https://issues.apache.org/jira/browse/HADOOP-5752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12703546#action_12703546 ]

    Hadoop QA commented on HADOOP-5752:
    -----------------------------------

    +1 overall. Here are the results of testing the latest attachment
    http://issues.apache.org/jira/secure/attachment/12406590/HADOOP-5752.patch
    against trunk revision 769174.

    +1 @author. The patch does not contain any @author tags.

    +1 tests included. The patch appears to include 3 new or modified tests.

    +1 javadoc. The javadoc tool did not generate any warning messages.

    +1 javac. The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs. The patch does not introduce any new Findbugs warnings.

    +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

    +1 release audit. The applied patch does not increase the total number of release audit warnings.

    +1 core tests. The patch passed core unit tests.

    +1 contrib tests. The patch passed contrib unit tests.

    Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/254/testReport/
    Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/254/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
    Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/254/artifact/trunk/build/test/checkstyle-errors.html
    Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/254/console

    This message is automatically generated.
  • Tsz Wo (Nicholas), SZE (JIRA) at Apr 28, 2009 at 11:32 pm
    [ https://issues.apache.org/jira/browse/HADOOP-5752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12703879#action_12703879 ]

    Tsz Wo (Nicholas), SZE commented on HADOOP-5752:
    ------------------------------------------------

    Tried the new processor; it works well.

    A nit: for processors that do not support the -delimiter option, oiv should show an error message.
  • Jakob Homan (JIRA) at Apr 28, 2009 at 11:50 pm
    [ https://issues.apache.org/jira/browse/HADOOP-5752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Jakob Homan updated HADOOP-5752:
    --------------------------------

    Attachment: HADOOP-5752.patch

    Updated patch to implement Nicholas' suggestion. Will now give an error and exit if -delimited is specified with any processor other than Delimiter. Thanks, Nicholas. Manually tested.
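
    The check described above can be sketched as follows. This is a Python illustration only, not oiv's actual Java code. It assumes the naming that the rest of the thread settles on ("Delimited" for the processor, "-delimiter" for the option); the parse_args helper, the "-p" flag spelling and the "Ls" default are hypothetical choices for the example.

```python
import argparse


def parse_args(argv):
    """Parse options and reject -delimiter with any non-Delimited processor."""
    parser = argparse.ArgumentParser(prog="oiv-sketch")
    # Hypothetical flag names and default, chosen for this illustration.
    parser.add_argument("-p", "--processor", default="Ls")
    parser.add_argument("-delimiter", default=None)
    args = parser.parse_args(argv)
    if args.delimiter is not None and args.processor != "Delimited":
        # parser.error prints a usage message and exits with status 2.
        parser.error("-delimiter is only valid with the Delimited processor")
    return args


if __name__ == "__main__":
    print(parse_args(["-p", "Delimited", "-delimiter", ","]))
```

    Failing fast here avoids silently ignoring an option that the chosen processor cannot honor.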
  • Tsz Wo (Nicholas), SZE (JIRA) at Apr 29, 2009 at 6:14 pm
    [ https://issues.apache.org/jira/browse/HADOOP-5752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12704228#action_12704228 ]

    Tsz Wo (Nicholas), SZE commented on HADOOP-5752:
    ------------------------------------------------
    Updated patch to implement Nicholas' suggestion. Will now give an error and exit if -delimited is specified with any processor other than Delimiter. Thanks, Nicholas. Manually tested.
    Is "Delimited" the processor name and "delimiter" the option name? Could you also check the doc? It seems the two words are mixed up, e.g.
    {noformat}
    + <td>When used in conjunction with the Delimiter processor, replaces the default
    {noformat}
  • Jakob Homan (JIRA) at Apr 29, 2009 at 6:42 pm
    [ https://issues.apache.org/jira/browse/HADOOP-5752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Jakob Homan updated HADOOP-5752:
    --------------------------------

    Attachment: HADOOP-5752.patch

    Great catch, Nicholas. Fixed that and one other instance of the Delimite[dr] mix-up. Thanks. Any thoughts on where the pig scripts should be located?
  • Tsz Wo (Nicholas), SZE (JIRA) at Apr 29, 2009 at 6:51 pm
    [ https://issues.apache.org/jira/browse/HADOOP-5752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Tsz Wo (Nicholas), SZE updated HADOOP-5752:
    -------------------------------------------

    Hadoop Flags: [Reviewed]

    +1 patch looks good. Thanks, Jakob.
  • Jakob Homan (JIRA) at Apr 30, 2009 at 2:57 am
    [ https://issues.apache.org/jira/browse/HADOOP-5752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12704451#action_12704451 ]

    Jakob Homan commented on HADOOP-5752:
    -------------------------------------

    Sounds good. Tested everything after last revision.
  • Tsz Wo (Nicholas), SZE (JIRA) at Apr 30, 2009 at 3:03 am
    [ https://issues.apache.org/jira/browse/HADOOP-5752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Tsz Wo (Nicholas), SZE updated HADOOP-5752:
    -------------------------------------------

    Resolution: Fixed
    Fix Version/s: 0.21.0
    Status: Resolved (was: Patch Available)

    I have committed this. Thanks, Jakob!
  • Hudson (JIRA) at Apr 30, 2009 at 6:54 pm
    [ https://issues.apache.org/jira/browse/HADOOP-5752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12704742#action_12704742 ]

    Hudson commented on HADOOP-5752:
    --------------------------------

    Integrated in Hadoop-trunk #822 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/822/])
    Add a new hdfs image processor, Delimited, to oiv. Contributed by Jakob Homan.

  • Jakob Homan (JIRA) at Apr 30, 2009 at 6:59 pm
    [ https://issues.apache.org/jira/browse/HADOOP-5752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Jakob Homan updated HADOOP-5752:
    --------------------------------

    Release Note: Extend offline image viewer (oiv) to generate Pig-friendly data and provide examples of analyzing those data with Pig.

    Added release note.

Discussion Overview
Group: common-dev
Categories: hadoop
Posted: Apr 28, '09 at 12:49a
Active: Apr 30, '09 at 6:59p
Posts: 14
Users: 1
Website: hadoop.apache.org...
IRC: #hadoop
