Are you talking about doing
https://github.com/cloudera/cdh-twitter-example/blob/master/oozie-workflows/hive-action.xml#L19
in Hue?

When you create a workflow, you can specify its workspace on the HDFS by
clicking on the advanced tab. When using the default workspace, just click
on 'Upload' in the workflow editor and upload your files. Jars should go in
a 'lib' sub-directory.

Fs and Shell actions are present as examples in the Oozie app. Did you run
or copy them?

Romain

On Tue, Jun 25, 2013 at 8:57 PM, ignorant wrote:

Wrong forum, posting to
https://groups.google.com/a/cloudera.org/forum/?fromgroups#!searchin/hue-user

On Tuesday, June 25, 2013 10:08:24 PM UTC-4, ignorant wrote:

Hi there,

I am a beginner with Oozie and am trying to implement the twitter example
- https://github.com/cloudera/cdh-twitter-example

I see that Hue has a nice interface for this. I have uploaded the package
with all relevant libraries to /user/oozie/oozie-workflows after making the
relevant edits.

I tried to use Hue to add this but was not able to do so successfully. It
gives me a GUI but I cannot point it to the uploaded configuration. I tried
to use the FS and Shell actions in the workflow through the GUI but
couldn't figure out how to configure them correctly.

I also tried to get the Oozie UI working
(http://www.cloudera.com/content/cloudera-content/cloudera-docs/CM4Ent/4.5.3/Cloudera-Manager-Enterprise-Edition-Installation-Guide/cmeeig_topic_13.html),
but that did not work either. That is a lower priority issue since I am
more interested in getting this to work from the Hue UI if possible.

I would appreciate any feedback on this.

Thanks,
--

---
You received this message because you are subscribed to the Google Groups
"CDH Users" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to cdh-user+unsubscribe@cloudera.org.
For more options, visit
https://groups.google.com/a/cloudera.org/groups/opt_out.


  • Ignorant at Jun 26, 2013 at 2:40 pm
    Hi Romain,

    I had previously uploaded the whole package to HDFS under /user/oozie and
    used that in Import, but it didn't quite take. So then I went a little more
    manual.

    Here are the steps I took:

    1) Import Workflow. Used that hive-action.xml to create a new workflow
    called "tweet management".
    2) Create a coordinator called twitterstream with workflow selected as
    "tweet management".
    3) Defined frequency in hours. Start time was a little confusing. I set it
    as the start of the previous hour in Universal time zone. This is because
    the Hue box provisioned by CDH seems to be in UTC.
    4) Went to Inputs and hit "Create dataset here". Called the new dataset
    "tweets".
    5) I set it to the same frequency and start time as the coordinator - start
    of this hour in Universal.
    6) URI - /user/flume/tweets/${YEAR}/${MONTH}/${DAY}/${HOUR} (I like the
    fact that I don't have to hardcode the HDFS instance name).
    7) Instance -> Range -> Start -> ${coord:current(coord:tzOffset() / 60)}
    8) Instance -> Range -> End -> ${coord:current(1 + (coord:tzOffset() / 60))}
    9) Timezone: Universal, then hit save.
    10) Back in the coordinator screen, tie wfInput to the tweets dataset that
    we just created.
    11) Leave outputs blank, go to the Advanced screen.
    12) In Oozie parameters, leave "oozie.use.system.libpath" with value of true.
    13) Add another Oozie parameter, dateHour, with
    value ${coord:formatTime(coord:dateOffset(coord:nominalTime(), tzOffset,
    'HOUR'), 'yyyyMMddHH')}
    14) Added a timeout of 10 minutes. Hit save.
    15) Hit Submit job. It will pop up with two parameters, dateHour already
    filled in. To set workflowRoot, go to the workflow in another window, then
    Properties -> Advanced and copy the HDFS deployment directory. There is
    probably an easier way here too.
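
For reference, the dateHour expression in step 13 can be sketched outside Oozie. This is a minimal Python model, not Oozie code: it assumes tzOffset is a whole number of hours passed in as a job parameter (the steps above do not say what value it had), and the function name date_hour is made up for illustration.

```python
from datetime import datetime, timedelta


def date_hour(nominal_time: datetime, tz_offset_hours: int) -> str:
    """Shift the nominal time by tz_offset_hours and format it as yyyyMMddHH."""
    shifted = nominal_time + timedelta(hours=tz_offset_hours)
    return shifted.strftime("%Y%m%d%H")


# Nominal time of the first action in the coordinator shown in this post.
nominal = datetime(2013, 6, 26, 14, 0)

print(date_hour(nominal, 0))   # 2013062614
print(date_hour(nominal, -8))  # 2013062606 (hypothetical offset of -8 hours)
```

With an offset of 0 the value simply follows the UTC nominal time; a non-zero offset moves it to a different hour and possibly a different day.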

    Now I see the coordinator running but throwing errors about missing files
    in HDFS. One of the files it's looking for is for the end of the hour. I am
    guessing this has to do with the time I put in, but I am not really sure.
    Here is what the coordinator definition looks like at runtime -

    <coordinator-app name="tweetstream"
       frequency="${coord:hours(1)}"
       start="2013-06-26T14:00Z" end="2013-07-31T22:19Z" timezone="Universal"
       xmlns="uri:oozie:coordinator:0.1">
       <controls>
         <timeout>10</timeout>
       </controls>
       <datasets>
         <dataset name="tweets" frequency="${coord:hours(1)}"
                  initial-instance="2013-06-26T14:00Z" timezone="Universal">
           <uri-template>${nameNode}/user/flume/tweets/${YEAR}/${MONTH}/${DAY}/${HOUR}</uri-template>
           <done-flag></done-flag>
         </dataset>
       </datasets>
       <input-events>
         <data-in name="wfInput" dataset="tweets">
           <start-instance>
             ${coord:current(coord:tzOffset() / 60)}
           </start-instance>
           <end-instance>
             ${coord:current(1 + (coord:tzOffset() / 60))}
           </end-instance>
         </data-in>
       </input-events>
       <action>
         <workflow>
           <app-path>${wf_application_path}</app-path>
           <configuration>
             <property>
               <name>wfInput</name>
               <value>${coord:dataIn('wfInput')}</value>
             </property>
           </configuration>
         </workflow>
       </action>
    </coordinator-app>
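
To see which directories this definition asks for, here is a rough Python model of how the start and end instances resolve. It assumes coord:current(n) means the latest dataset instance at or before the action's nominal time, shifted by n periods, and that coord:tzOffset() is 0 because the dataset and the coordinator are both in Universal. The helper names are hypothetical, and edge cases (such as instances earlier than initial-instance) are ignored.

```python
from datetime import datetime, timedelta


def current_instance(initial: datetime, freq: timedelta,
                     nominal: datetime, n: int) -> datetime:
    """Simplified model of coord:current(n) for a fixed-frequency dataset."""
    # Whole periods elapsed between the initial instance and the nominal time.
    periods = (nominal - initial) // freq
    return initial + (periods + n) * freq


def to_uri(instance: datetime) -> str:
    """Fill the dataset's URI template (without the nameNode prefix)."""
    return instance.strftime("/user/flume/tweets/%Y/%m/%d/%H")


initial = datetime(2013, 6, 26, 14, 0)   # initial-instance of the dataset
hour = timedelta(hours=1)                # dataset frequency
nominal = datetime(2013, 6, 26, 14, 0)   # nominal time of the first action
tz_offset_minutes = 0                    # assumption: both zones are Universal

start = current_instance(initial, hour, nominal, tz_offset_minutes // 60)
end = current_instance(initial, hour, nominal, 1 + tz_offset_minutes // 60)

print(to_uri(start))  # /user/flume/tweets/2013/06/26/14
print(to_uri(end))    # /user/flume/tweets/2013/06/26/15
```

Under these assumptions the range for the 14:00 action ends at the 15 directory, that is, the directory for the following hour.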


    Thanks


  • Ignorant at Jun 28, 2013 at 3:56 pm
    Hi,
    Any pointers on this?

  • Romain Rigaux at Jul 1, 2013 at 9:52 pm
    Could you share the exact error + 'Configuration' tab of the created
    workflow?

    If it is a missing input file, this is probably because of the Timezone
    difference. Oozie uses UTC everywhere, so all the
    Coordinator dates should be entered in UTC (yes, this is a bit misleading).
    By how many hours is the first input date missed?
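
As a small illustration of entering dates in UTC, the sketch below converts a local wall-clock time to the yyyy-MM-ddTHH:mmZ form used in the coordinator definition. The fixed UTC-4 offset is an assumption taken from the message timestamps in this thread, and to_oozie_utc is a hypothetical helper, not part of Hue or Oozie.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the local clock is at UTC-4, as the thread's timestamps suggest.
LOCAL_TZ = timezone(timedelta(hours=-4))


def to_oozie_utc(local_time: datetime) -> str:
    """Convert a timezone-aware datetime to a UTC string such as 2013-06-26T14:00Z."""
    return local_time.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%MZ")


local_start = datetime(2013, 6, 26, 10, 0, tzinfo=LOCAL_TZ)
print(to_oozie_utc(local_start))  # 2013-06-26T14:00Z
```

Comparing the converted value with the hour of the first directory Flume actually wrote gives the number of hours by which the first input is missed.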

    BTW: nameNode is automatically set in my Hue version, maybe you have an
    older one.

    Romain


Discussion Overview
group: hue-user
categories: hadoop
posted: Jun 26, '13 at 5:19a
active: Jul 1, '13 at 9:52p
posts: 4
users: 2
website: cloudera.com
irc: #hadoop