Hi,

I'm trying to use FileLocalizer in a UDF to check whether a path passed in as a
parameter is a file or a directory.
I saw something like this in some of the Pig internal code:
PigContext pc = (PigContext)
    ObjectSerializer.deserialize(PigMapReduce.sJobConf.get("pig.pigContext"));
if (FileLocalizer.isFile(path, pc)) ...

But I'm getting a NullPointerException. I probably missed something.
Could someone provide an example of how to do this?

Also, is it possible to get a list of the files in a DFS directory somehow?


Thanks,
Tamir


  • Jeff Zhang at Dec 3, 2009 at 11:22 am
    Hi Tamir,

    PigMapReduce.sJobConf is null. PigMapReduce is meant to be used only
    internally by Hadoop; I'm not sure why you want to use it like that.

    If you look into the source code, you will see that sJobConf is assigned a
    value each time a mapper or reducer task is initialized.


    Jeff Zhang


  • Tamir Kamara at Dec 3, 2009 at 12:39 pm
    Hi Jeff,

    As I wrote before, I saw in the source code that Pig uses that syntax for
    accessing the DFS.
    Can you suggest a way to check whether a path is a file or a directory and,
    if it is a directory, to enumerate the files in it?

    Thanks,
    Tamir
  • Jeff Zhang at Dec 3, 2009 at 1:21 pm
    Tamir,

    You can use Hadoop's FileSystem class to determine whether a Path is a
    directory or a file, and use FileSystem.globStatus(path) to list the files
    under a directory.

    In Pig, it looks like the following:

    PigServer pig = new PigServer(ExecType.MAPREDUCE);
    FileSystem fs = FileSystem.get(
        ConfigurationUtil.toConfiguration(pig.getPigContext().getConf()));
    fs.isDirectory(new Path("/your/file"));


    Jeff Zhang
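
The FileSystem approach described above can be sketched as a small helper. This is a minimal sketch, not code from the thread: the class name DfsPathInspector and its method names are hypothetical, and it assumes the Hadoop client libraries are on the classpath. It uses FileSystem.listStatus rather than globStatus, since no glob pattern is involved when enumerating a single directory. How the Configuration is obtained inside a UDF is left to the caller; here a default Configuration is used, which resolves to the local file system when no cluster configuration is found.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical helper class; the name and methods are illustrative only.
public class DfsPathInspector {

    // Returns true if the location exists and is a regular file.
    public static boolean isFile(FileSystem fs, String location) throws IOException {
        return fs.isFile(new Path(location));
    }

    // Returns the immediate children of a directory, or an empty list if the
    // location does not exist or is a regular file.
    public static List<String> listChildren(FileSystem fs, String location) throws IOException {
        List<String> children = new ArrayList<String>();
        Path dir = new Path(location);
        if (!fs.exists(dir) || fs.isFile(dir)) {
            return children;
        }
        for (FileStatus status : fs.listStatus(dir)) {
            children.add(status.getPath().toString());
        }
        return children;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: a default Configuration; on a cluster node this would
        // pick up the site configuration found on the classpath.
        FileSystem fs = FileSystem.get(new Configuration());
        String location = args.length > 0 ? args[0] : "/your/file";
        if (isFile(fs, location)) {
            System.out.println(location + " is a file");
        } else {
            for (String child : listChildren(fs, location)) {
                System.out.println(child);
            }
        }
    }
}
```

Passing the FileSystem in as a parameter keeps the helper independent of how the configuration is obtained, which is the part that differs between the planning and execution stages.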
  • Dmitriy Ryaboy at Dec 3, 2009 at 1:25 pm
    Hi Tamir,

    sJobConf is null during the planning stage; it is defined in the
    execution stage. If you are writing a LoadFunc, you can piggyback on
    the DataStorage object that is passed in to determineSchema() to work
    with the FS at the planning stage. I am not sure at the moment how to
    work with the dfs in an EvalFunc in the planning stage. In the
    execution stage, you can use sJobConf:

    PigContext pigContext = new PigContext(ExecType.MAPREDUCE,
        ConfigurationUtil.toProperties(PigMapReduce.sJobConf));
    DataStorage dfs = pigContext.getDfs();

    For how to use DataStorage, check out what I am doing in PIG-760 in
    JsonMetadata.
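
The planning-stage versus execution-stage behavior described above is why the original snippet throws. The following stand-alone sketch mimics that lifecycle with plain Java. Everything in it is a hypothetical stand-in: the static field only imitates PigMapReduce.sJobConf, configureTask imitates task initialization, and no Pig or Hadoop classes are involved.

```java
import java.util.Properties;

// Hypothetical stand-in classes and names; this is not Pig code.
public class JobConfLifecycle {

    // Imitates a static configuration field that stays null until a task sets it.
    static Properties sJobConf = null;

    // Imitates task initialization on the backend (execution stage).
    static void configureTask(Properties conf) {
        sJobConf = conf;
    }

    // Guarded lookup: returns null instead of throwing when no task has
    // been configured yet.
    static String lookup(String key) {
        return (sJobConf == null) ? null : sJobConf.getProperty(key);
    }

    public static void main(String[] args) {
        // Planning stage: the field has not been assigned yet.
        System.out.println("planning: " + lookup("pig.pigContext"));

        // Unguarded access at this point is what raises the NullPointerException.
        try {
            sJobConf.getProperty("pig.pigContext");
        } catch (NullPointerException e) {
            System.out.println("unguarded access failed");
        }

        // Execution stage: the task has been initialized with a configuration.
        Properties conf = new Properties();
        conf.setProperty("pig.pigContext", "serialized-context");
        configureTask(conf);
        System.out.println("execution: " + lookup("pig.pigContext"));
    }
}
```

A UDF that may run in either stage can apply the same null check before touching the static field and fall back to another source of configuration when it is not set.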

Discussion Overview
group: user @
categories: pig, hadoop
posted: Dec 3, '09 at 6:50a
active: Dec 3, '09 at 1:25p
posts: 5
users: 3
website: pig.apache.org
