I am new to Hadoop, and I am trying to use the GenericOptionsParser class.
In particular, I would like to use the -libjars option to specify additional
jar files to include in the classpath. I've created a class that extends
Configured and implements Tool:
public class OptionDemo extends Configured implements Tool
{
    ...
    public int run(String[] args) throws Exception
    {
        Configuration conf = getConf();
        GenericOptionsParser opts = new GenericOptionsParser(conf, args);
        ...
    }
}
However, when I run my code, the jar files that I list after -libjars
aren't added to the classpath, and I get an error that certain classes
can't be found during the execution of my job.
The book Hadoop: The Definitive Guide states:
You don’t usually use GenericOptionsParser directly, as it’s more convenient
to implement the Tool interface and run your application with the
ToolRunner, which uses GenericOptionsParser internally:
public interface Tool extends Configurable {
int run(String [] args) throws Exception;
}
but it still isn't clear to me how the -libjars option is parsed, whether I
need to explicitly add the jars to the classpath inside my run method, or
where the option needs to be placed on the command line. Any advice or
sample code on using -libjars would be greatly appreciated.
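
To make the question concrete, the flow the book passage seems to describe
can be modeled as below. This is a simplified, Hadoop-free sketch, not
Hadoop's implementation: LibJarsSketch, SketchTool, parseGenericOptions,
runTool, and the "libjars" map key are all hypothetical stand-ins, and the
rule that generic options must come before the application arguments is an
assumption of the model.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical, Hadoop-free model of a runner that parses generic options
// before handing control to the tool. None of these names are Hadoop APIs.
public class LibJarsSketch {

    // Stand-in for a tool: its run method sees only the application arguments.
    interface SketchTool {
        int run(Map<String, String> conf, String[] remainingArgs);
    }

    // Consumes leading "-libjars <comma-separated jars>" pairs, records the
    // value in conf under the hypothetical key "libjars", and returns the rest.
    // Parsing stops at the first argument that is not "-libjars" (assumption:
    // generic options precede application arguments).
    static String[] parseGenericOptions(Map<String, String> conf, String[] args) {
        int i = 0;
        while (i + 1 < args.length && args[i].equals("-libjars")) {
            conf.put("libjars", args[i + 1]);
            i += 2;
        }
        return Arrays.copyOfRange(args, i, args.length);
    }

    // Stand-in for the runner: parse generic options first, then call the tool
    // with the populated configuration and the leftover arguments.
    static int runTool(Map<String, String> conf, SketchTool tool, String[] args) {
        String[] remaining = parseGenericOptions(conf, args);
        return tool.run(conf, remaining);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        String[] commandLine = {"-libjars", "a.jar,b.jar", "input", "output"};
        int exit = runTool(conf, (c, rest) -> {
            System.out.println("libjars = " + c.get("libjars"));
            System.out.println("remaining = " + Arrays.toString(rest));
            return 0;
        }, commandLine);
        System.out.println("exit = " + exit);
    }
}
```

If Hadoop behaves the way this model assumes, the tool would be started
through the runner rather than by constructing a parser inside run, and
-libjars would go before the job's own arguments on the command line.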
--
Aquil H. Abdullah
aquil.abdullah@gmail.com