Hi,

I've been reading and recreating the "Analyzing Twitter Data" series found
here:
http://blog.cloudera.com/blog/2012/09/analyzing-twitter-data-with-hadoop/.
I've now set up the workflow for creating new partitions using the Oozie
editor in Hue, rather than the way advised in the series. One thing that I
could not get to work was the <job-xml> element in the hive action. Instead I
had to add it to the <global> section, so that my workflow XML looks like this:

<workflow-app name="twitter-add-partition-wf" xmlns="uri:oozie:workflow:0.4">
  <global>
    <job-xml>/user/hue/oozie/workspaces/_mpo_-oozie-1/hive-site.xml</job-xml>
  </global>
  <start to="add-datehour-partition"/>
  <action name="add-datehour-partition">
    <hive xmlns="uri:oozie:hive-action:0.2">
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <configuration>
        <property>
          <name>oozie.hive.defaults</name>
          <value>/user/hue/oozie/workspaces/_mpo_-oozie-1/hive-site.xml</value>
        </property>
      </configuration>
      <script>/user/hue/oozie/workspaces/_mpo_-oozie-1/add_partition.q</script>
      <param>DATEHOUR=${dateHour}</param>
      <param>PARTITION_PATH=${partitionPath}</param>
    </hive>
    <ok to="end"/>
    <error to="kill"/>
  </action>
  <kill name="kill">
    <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
  </kill>
  <end name="end"/>
</workflow-app>

This works. However, the Job XML field in the Hive action seems to have no
effect on the generated definition. Or am I doing something wrong?

Thanks.


  • Mortenbpost at Jan 28, 2013 at 4:28 pm
    Addendum:

    There are also still limitations when defining a coordinator: time
    formulas cannot be defined as input properties to your workflows. As far
    as I can see, only dataset inputs are supported.

    Thanks.

  • Romain Rigaux at Jan 28, 2013 at 5:10 pm
    Thanks for the feedback!

    First question:
    Yes, this is a bug in the current version, and it is fixed in 2.2:
    https://groups.google.com/a/cloudera.org/group/hue-user/browse_thread/thread/336db42830339a38
    For now you need to set it as a property, but in the next version you
    will only need to fill in the job-xml field:
    https://issues.apache.org/jira/browse/OOZIE-1087
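
    For reference, here is a sketch of what the action from the question
    might look like once the job-xml field is honoured, with <job-xml> placed
    inside the hive action itself instead of in <global>. It assumes the same
    paths and parameters as the workflow above; the element order (name-node,
    then job-xml, then configuration) follows the hive action schema. Whether
    the oozie.hive.defaults property is still needed alongside it depends on
    the Oozie version, so it is kept here unchanged.

    ```xml
    <action name="add-datehour-partition">
      <hive xmlns="uri:oozie:hive-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <!-- job-xml set on the action itself rather than in <global> -->
        <job-xml>/user/hue/oozie/workspaces/_mpo_-oozie-1/hive-site.xml</job-xml>
        <configuration>
          <property>
            <name>oozie.hive.defaults</name>
            <value>/user/hue/oozie/workspaces/_mpo_-oozie-1/hive-site.xml</value>
          </property>
        </configuration>
        <script>/user/hue/oozie/workspaces/_mpo_-oozie-1/add_partition.q</script>
        <param>DATEHOUR=${dateHour}</param>
        <param>PARTITION_PATH=${partitionPath}</param>
      </hive>
      <ok to="end"/>
      <error to="kill"/>
    </action>
    ```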

    Second question:
    Yes, but this limitation is fixed in the 2.2 release, which is coming in
    about 2 weeks: https://issues.cloudera.org/browse/HUE-983 (with screenshot)
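
    To illustrate what "time formulas as input properties" means at the Oozie
    level, here is a minimal hand-written coordinator sketch that passes a
    formatted nominal time to the workflow as the dateHour property. It is not
    what the Hue editor generates. The coordinator name and the jobStart,
    jobEnd and workflowPath variables are hypothetical placeholders that would
    come from a job.properties file; coord:hours, coord:nominalTime and
    coord:formatTime are standard Oozie coordinator EL functions.

    ```xml
    <coordinator-app name="twitter-add-partition-coord"
                     frequency="${coord:hours(1)}"
                     start="${jobStart}" end="${jobEnd}" timezone="UTC"
                     xmlns="uri:oozie:coordinator:0.2">
      <action>
        <workflow>
          <!-- hypothetical variable: HDFS path of the deployed workflow -->
          <app-path>${workflowPath}</app-path>
          <configuration>
            <property>
              <!-- time formula evaluated per coordinator action,
                   then passed to the workflow as ${dateHour} -->
              <name>dateHour</name>
              <value>${coord:formatTime(coord:nominalTime(), 'yyyyMMddHH')}</value>
            </property>
          </configuration>
        </workflow>
      </action>
    </coordinator-app>
    ```

    The workflow can then reference ${dateHour} in its <param> elements, as
    the DATEHOUR parameter in the question does.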

    Feel free to ask questions if I am not clear enough. 2.2 is coming soon,
    and we are starting a new blog post series at the same time.

    Romain
  • Morten Post at Jan 28, 2013 at 6:03 pm
    Romain,

    Thanks for confirming these issues, just what I needed. The new version
    looks great (screenshot).

    -Morten


Discussion Overview
group: hue-user
categories: hadoop
posted: Jan 28, '13 at 3:36p
active: Jan 28, '13 at 6:03p
posts: 4
users: 2 (Morten Post: 3 posts, Romain Rigaux: 1 post)
website: cloudera.com
irc: #hadoop
