Hey all -

I've been trying to create a workflow that executes a distcp action using a
specific fair scheduler pool but I haven't had much luck. I believe I'm
following the XML schema correctly (
https://github.com/apache/oozie/blob/trunk/client/src/main/resources/distcp-action-0.1.xsd),
but my properties aren't making it to the actual action. Any ideas? Bug?
I've glanced over the source and I don't see any obvious reason why it
wouldn't work.

cheers,
-James


<action name="distcp-proc-logs">
    <distcp xmlns="uri:oozie:distcp-action:0.1">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <prepare>
            <delete path="${nameNode}/user/hive/warehouse/logs/dt=${processDate}"/>
        </prepare>
        <configuration>
            <property>
                <name>pool.name</name>
                <value>distcp</value>
            </property>
            <property>
                <name>madethisup</name>
                <value>honestlyidid</value>
            </property>
        </configuration>
        <arg>hdfs://external:8020/user/hive/warehouse/logs/dt=${processDate}</arg>
        <arg>${nameNode}/user/hive/warehouse/logs/dt=${processDate}</arg>
    </distcp>
    <ok to="mail-report"/>
    <error to="mail-report"/>
</action>


  • James Warren at Dec 5, 2012 at 7:26 pm
    Answering my own question:

    Apparently you can configure the action by using command line arguments.
    I'm guessing the configuration element may be strictly for the launcher?

    cheers,
    -James

    <action name="distcp-proc-logs">
        <distcp xmlns="uri:oozie:distcp-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <prepare>
                <delete path="${nameNode}/user/hive/warehouse/logs/dt=${processDate}"/>
            </prepare>
            <arg>-Dpool.name=distcp</arg>
            <arg>hdfs://external:8020/user/hive/warehouse/logs/dt=${processDate}</arg>
            <arg>${nameNode}/user/hive/warehouse/logs/dt=${processDate}</arg>
        </distcp>
        <ok to="mail-report"/>
        <error to="mail-report"/>
    </action>

  • Robert Kanter at Dec 11, 2012 at 12:27 am
    Hi James,

    I'm not familiar with the fair scheduler, but typically, properties
    specified in <configuration> are made available in the jobconf that you can
    get in an MR job. Properties specified in <arg> are passed directly to the
    application as if they were command line arguments.

    - Robert
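
A minimal sketch of the two channels discussed in this thread, reusing the action from the posts above. It is an illustration only, not a statement of how every Oozie version behaves: James reported that the <configuration> property did not reach the distcp job in his setup, while the -D argument did. Setting pool.name in both places is redundant and is done here only to show the two forms side by side.

```xml
<action name="distcp-proc-logs">
    <distcp xmlns="uri:oozie:distcp-action:0.1">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <configuration>
            <!-- Channel 1: action configuration. Per Robert's reply, these
                 properties are typically made available in the jobconf. -->
            <property>
                <name>pool.name</name>
                <value>distcp</value>
            </property>
        </configuration>
        <!-- Channel 2: arguments handed to distcp as if typed on the command
             line. The -D option is placed before the source and destination
             paths, as in James's working example. -->
        <arg>-Dpool.name=distcp</arg>
        <arg>hdfs://external:8020/user/hive/warehouse/logs/dt=${processDate}</arg>
        <arg>${nameNode}/user/hive/warehouse/logs/dt=${processDate}</arg>
    </distcp>
    <ok to="mail-report"/>
    <error to="mail-report"/>
</action>
```

If the pool is still not picked up, it may be worth checking which property name the cluster's fair scheduler is configured to read; pool.name is simply the name used in this thread.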


Discussion Overview
group: cdh-user @
categories: hadoop
posted: Dec 5, '12 at 7:40a
active: Dec 11, '12 at 12:27a
posts: 3
users: 2
website: cloudera.com
irc: #hadoop
