new syntax for native mapreduce operator
----------------------------------------

Key: PIG-1580
URL: https://issues.apache.org/jira/browse/PIG-1580
Project: Pig
Issue Type: Task
Reporter: Thejas M Nair
Assignee: Thejas M Nair


The mapreduce operator (PIG-506) and the stream operator have some similarities, so it makes sense to use a similar syntax for both.

Alan has proposed the following syntax for the mapreduce operator, and that we also move the stream operator to a similar syntax in a future release.

MAPREDUCE id jar
INPUT 'path' USING LoadFunc
OUTPUT 'path' USING StoreFunc
[SHIP 'path' [, 'path' ...]]
[CACHE 'dfs_path#dfs_file' [, 'dfs_path#dfs_file' ...]]


--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


  • Thejas M Nair (JIRA) at Aug 30, 2010 at 6:51 pm
    [ https://issues.apache.org/jira/browse/PIG-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Thejas M Nair updated PIG-1580:
    -------------------------------

    Fix Version/s: 0.8.0
  • Thejas M Nair (JIRA) at Aug 30, 2010 at 7:18 pm
    [ https://issues.apache.org/jira/browse/PIG-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904298#action_12904298 ]

    Thejas M Nair commented on PIG-1580:
    ------------------------------------

    Updating the syntax to include support for parameters:

    MAPREDUCE id jar 'params'
    INPUT 'path' USING LoadFunc
    OUTPUT 'path' USING StoreFunc
    [SHIP 'path' [, 'path' ...]]
    [CACHE 'dfs_path#dfs_file' [, 'dfs_path#dfs_file' ...]]
  • Thejas M Nair (JIRA) at Sep 1, 2010 at 5:40 pm
    [ https://issues.apache.org/jira/browse/PIG-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Thejas M Nair resolved PIG-1580.
    --------------------------------

    Resolution: Won't Fix

    With the 'hadoop jar' command, the files to ship to the distributed cache are specified using the -files command-line option. Since typical users would be moving an existing map-reduce job that they were running with 'hadoop jar', it is easier for them to copy the existing command-line options than to rewrite them as the SHIP/CACHE clauses of the proposed syntax.

    If we don't have the SHIP/CACHE clauses in the mapreduce operator, there is very little similarity between the streaming and mapreduce operators. It would be better to use LOAD/STORE instead of INPUT/OUTPUT in the mapreduce syntax, as they specify the load/store functions and not the streaming deserializer/serializer.

    So I think it is better to go back to the old syntax. Resolving the JIRA as Won't Fix.
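
For reference, the old syntax that this resolution returns to is the form introduced in PIG-506 and documented for Pig 0.8. The sketch below is illustrative only: the jar name, main class, paths, file names, and schema are hypothetical, and passing -files inside the backquoted parameter string assumes the job's main class parses Hadoop's generic options (for example through ToolRunner).

```
-- Hypothetical input file and relation name.
A = LOAD 'WordcountInput.txt';

-- Native mapreduce operator in its existing (pre-PIG-1580) form:
--   STORE writes relation A to the path the job reads from,
--   LOAD reads the job's output path back into relation B,
--   and the backquoted string is handed to the job as its arguments.
B = MAPREDUCE 'wordcount.jar'
    STORE A INTO 'inputDir'
    LOAD 'outputDir' AS (word:chararray, count:int)
    `org.myorg.WordCount -files stopwords.txt inputDir outputDir`;
```

The backquoted string carries the same arguments a user would have passed to 'hadoop jar', which is the point made in the resolution: existing command-line options can be copied over rather than rewritten as SHIP/CACHE clauses.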

