PERFORMANCE: removing keys from the value
-----------------------------------------

Key: PIG-465
URL: https://issues.apache.org/jira/browse/PIG-465
Project: Pig
Issue Type: Improvement
Affects Versions: types_branch
Reporter: Olga Natkovich
Fix For: types_branch


Currently, reducers get the key data twice: once in the key and once in the value. If the grouping key makes up a large part of the value, this duplicates a large amount of data and hurts performance.

The key should not be sent as part of the value. Instead, metadata should be used to assist in reconstructing the row from the key and the remaining data.
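
The idea can be illustrated with a minimal sketch. This is not the code from PIG-465.patch; the function names split_row and rebuild_row, and the choice of a list of key column positions as the "metadata", are assumptions made only for illustration. The map side emits the key and the non-key fields separately, and the reduce side uses the key positions to put the row back together.

```python
from typing import Any, Sequence, Tuple


def split_row(row: Sequence[Any], key_positions: Sequence[int]) -> Tuple[tuple, tuple]:
    """Map side (hypothetical): separate the grouping key from the rest of the row.

    key_positions lists the column indexes that form the key, in key order.
    The returned value no longer contains the key fields.
    """
    key = tuple(row[i] for i in key_positions)
    key_set = set(key_positions)
    rest = tuple(v for i, v in enumerate(row) if i not in key_set)
    return key, rest


def rebuild_row(key: Sequence[Any], rest: Sequence[Any],
                key_positions: Sequence[int]) -> tuple:
    """Reduce side (hypothetical): rebuild the full row from key + remaining data.

    key_positions plays the role of the metadata: it records where each
    key field belongs in the original row.
    """
    key_at = dict(zip(key_positions, key))
    rest_iter = iter(rest)
    width = len(key) + len(rest)
    # Fill each column from the key if it is a key column, else from the rest.
    return tuple(key_at[i] if i in key_at else next(rest_iter) for i in range(width))


row = ("alice", 30, "NYC", 2.5)
key_positions = [2, 0]  # group by (city, name)
key, rest = split_row(row, key_positions)
restored = rebuild_row(key, rest, key_positions)
```

In this sketch the key fields travel only once, in the key, and the positions list is small and fixed per job, so it does not need to be shipped with every record.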

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

  • Pradeep Kamath (JIRA) at Oct 9, 2008 at 12:34 am
    [ https://issues.apache.org/jira/browse/PIG-465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Pradeep Kamath updated PIG-465:
    -------------------------------

    Assignee: Pradeep Kamath
    Status: Patch Available (was: Open)

    Attached patch.
  • Pradeep Kamath (JIRA) at Oct 9, 2008 at 12:34 am
    [ https://issues.apache.org/jira/browse/PIG-465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Pradeep Kamath updated PIG-465:
    -------------------------------

    Attachment: PIG-465.patch
  • Olga Natkovich (JIRA) at Oct 9, 2008 at 7:25 pm
    [ https://issues.apache.org/jira/browse/PIG-465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    Olga Natkovich updated PIG-465:
    -------------------------------

    Resolution: Fixed
    Status: Resolved (was: Patch Available)

    Patch committed; thanks, Pradeep.

Discussion Overview
group: dev @
categories: pig, hadoop
posted: Sep 30, '08 at 12:01a
active: Oct 9, '08 at 7:25p
posts: 4
users: 1
website: pig.apache.org
