Hi all.

I have some data stored in properties files and need my Hadoop job to use
it.

What is the best practice for giving a Hadoop job access to resources like
properties files?
Does Hadoop provide infrastructure for working with properties files, or is
there another way to do it (like XML config files)?

Thanks in advance
Oleg.

  • Harsh J at May 25, 2010 at 2:49 pm
    Hi,

    You can use the Hadoop-provided Configuration API to place your custom
    configuration name-value pairs while creating a job (This is actually
    an XML solution in the end, as the conf gets written to the job.xml).

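    It's roughly like this:

    import org.apache.hadoop.conf.Configuration;

    Configuration conf = new Configuration();
    // ... other job setup ...
    conf.set("myproperty.myname", "Eggs and Spam");
    // Likewise, setInt/setFloat/setLong/setBoolean exist for other types.
    // ... more job setup ...
    // Submit/exec the job now.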

    Then, to use the values you set, you can implement the JobConfigurable
    interface and override its configure method to access the JobConf
    object from within your Mapper or Reducer implementation (OLD API).
    Or, in the Mapper/Reducer's setup method, get the Configuration from
    the context object, which is a JobContext (NEW API). Store the values
    in class fields if you need them to persist across map/reduce calls.

    Alternatively, you could add your global file to the
    DistributedCache and read it from within your mappers/reducers. Check
    the YDN resource on this topic to decide which way to go:
    http://developer.yahoo.com/hadoop/tutorial/module5.html#auxdata

    --
    Harsh J
    www.harshj.com
  • Oleg Ruchovets at May 25, 2010 at 3:37 pm
    Thank you for the detailed answer.

    From the example above, I understand that the values have to be set in Java
    code, which then writes them to the job.xml file. Is it possible to work with
    resources without changing the Java code? As it stands, if I need to add
    values I have to recompile the Java code, but I want to change only the
    configuration file (properties or XML).


    Thanks
    Oleg.
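
One possible way to avoid recompiling, sketched here under assumptions rather than taken from the thread: load the properties file with `java.util.Properties` when the job is submitted and copy every key/value pair into the job configuration. The class name `PropertiesToConf` and the method `copyInto` are hypothetical, and a plain `Map` stands in for the Hadoop `Configuration` so the sketch runs without Hadoop on the classpath.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;
import java.util.function.BiConsumer;

// Hypothetical helper, not part of Hadoop.
final class PropertiesToConf {

    /**
     * Parses key=value pairs from the given source and hands each pair to
     * the setter. Returns the number of pairs copied.
     */
    static int copyInto(Reader source, BiConsumer<String, String> setter) {
        Properties props = new Properties();
        try {
            props.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        int copied = 0;
        for (String name : props.stringPropertyNames()) {
            setter.accept(name, props.getProperty(name));
            copied++;
        }
        return copied;
    }

    public static void main(String[] args) {
        // Stand-in for a Hadoop Configuration (assumption for this sketch).
        Map<String, String> conf = new LinkedHashMap<>();
        // In a real job this text would come from a FileReader on the properties file.
        String text = "myproperty.myname=Eggs and Spam\nmyproperty.retries=3\n";
        int copied = copyInto(new StringReader(text), conf::put);
        System.out.println(copied + " properties copied: " + conf);
    }
}
```

In a real job driver, the same call could presumably take `conf::set` on a Hadoop `Configuration` in place of `conf::put`, since `Configuration.set(String, String)` has a matching shape. The file is read on the client at submission time, so editing it changes the job's settings without touching the Java code.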

Discussion Overview
group: common-user @
categories: hadoop
posted: May 25, '10 at 1:56p
active: May 25, '10 at 3:37p
posts: 3
users: 2
website: hadoop.apache.org...
irc: #hadoop