How to use DFS API to travel across the directory tree and retrieve content of a DFS file?
Hi, all


As I understand it, hadoop fs -ls / can list the files and directories
under the root directory, so I am wondering: how could I write a Java
program that travels across the whole DFS directory structure?

That is, suppose the directory structure at the moment is like the following:

/
+-- home
    +-- anderson
        +-- samples.dat


Is it possible to write a Java program that starts from the / directory,
lists each subdirectory recursively, and detects when it reaches a .dat
file?

Afterwards, how could I obtain the content of samples.dat? So far, I
know the starting point is constructing a Configuration object; however,
what information needs to be included in the Configuration object?
Should I specify hadoop-default.xml and hadoop-site.xml inside it?

I'd appreciate it if a simple sample program could be provided.

BR/anderson


  • Nick Cen at Jun 16, 2009 at 5:20 am
    I think you can take a look at the following classes: FileSystem, Path,
    and FileStatus, and in particular the listStatus(Path path) method in
    FileSystem.
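
    For example, here is a minimal, hypothetical sketch of that approach,
    written against the pre-0.20 API discussed in this thread. It assumes
    the Hadoop config files on the classpath point at your cluster; the
    class name DfsWalker is my own, and /home/anderson/samples.dat is
    taken from the example tree above. The fs.open part at the end goes
    beyond listStatus and shows one way to read the file's content:

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IOUtils;

    public class DfsWalker {

        // Recursively visit every entry under 'dir'; report .dat files.
        static void walk(FileSystem fs, Path dir) throws IOException {
            FileStatus[] entries = fs.listStatus(dir);
            if (entries == null) return;           // path does not exist
            for (FileStatus entry : entries) {
                if (entry.isDir()) {
                    walk(fs, entry.getPath());     // descend into subdirectory
                } else if (entry.getPath().getName().endsWith(".dat")) {
                    System.out.println("found: " + entry.getPath());
                }
            }
        }

        public static void main(String[] args) throws IOException {
            // Reads hadoop-default.xml / hadoop-site.xml from the classpath.
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);

            walk(fs, new Path("/"));               // start at the DFS root

            // Open a file and copy its bytes to stdout.
            FSDataInputStream in =
                fs.open(new Path("/home/anderson/samples.dat"));
            try {
                IOUtils.copyBytes(in, System.out, conf, false);
            } finally {
                in.close();
            }
        }
    }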



    --
    http://daily.appspot.com/food/
  • Wenrui Guo at Jun 17, 2009 at 2:50 am
    Hi, Nick

    I think listStatus(Path) is exactly what I want.

    Meanwhile, I also asked how to set up the Configuration object when
    constructing the FileSystem object. As I understand it, in order for
    Hadoop client programs to run (like the ./hadoop fs -ls / command),
    the Hadoop configuration files, e.g. hadoop-default.xml and
    hadoop-site.xml, must be parsed to obtain information about the
    NameNode and DataNodes.

    So, if I'd like to run the directory traversal class as a standalone
    Java application on a machine that is not a node within the Hadoop
    cluster, do I need to copy the Hadoop configuration files to the
    client side and load them at runtime?

    BR/anderson

  • Nick Cen at Jun 17, 2009 at 4:40 am
    I think you can take a look at the Configuration class.
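
    For a client running outside the cluster, here is a sketch of the
    idea: either load a copy of the cluster's config file at runtime with
    addResource, or set the NameNode address directly. The file path and
    the hdfs://namenode:9000 address below are placeholders for your own
    values:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class RemoteDfsClient {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();

            // Option 1: load a copy of the cluster's hadoop-site.xml
            // at runtime (placeholder path).
            conf.addResource(new Path("/path/to/hadoop-site.xml"));

            // Option 2: point at the NameNode directly (pre-0.20 key;
            // placeholder host:port).
            conf.set("fs.default.name", "hdfs://namenode:9000");

            FileSystem fs = FileSystem.get(conf);
            System.out.println("connected to: " + fs.getUri());
        }
    }

    Either way, yes: the client JVM needs the cluster's connection
    settings available at runtime, and copying hadoop-site.xml to the
    client machine is the usual way to provide them.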



    --
    http://daily.appspot.com/food/
