FAQ
Hi all,
I want to ask about the performance difference between using the Text class and using a custom class that implements the Writable interface.

Say that in the inverted index problem I emit a token and the list of IDs of the documents that contain it. With Text, we usually concatenate the document IDs with a space as the separator ("d1 d2 d3 d4", etc.). If I need the same values in a later MapReduce step, I have to split the value string to get the list of document IDs back. Wouldn't it be better to use a Writable list instead?

I ask because my project does a lot of concatenating and splitting to carry values such as a document's total token count, a token's frequency in a particular document, etc.


Thanks in advance,
Chintan




  • Aaron Kimball at Apr 23, 2009 at 8:44 am
    In general, serializing to text and then parsing it back into a different
    format will be slower than using a purpose-built class that can
    serialize itself. The tradeoff, of course, is that going to text is often
    more convenient from a developer-time perspective.

    - Aaron
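
A minimal sketch of what such a purpose-built class could look like, for illustration only. The class name `DocIdList` and its layout (a count followed by each ID) are assumptions, not something from the thread. To keep the example runnable without Hadoop on the classpath it does not import Hadoop; its `write` and `readFields` methods have the same signatures as the two methods of `org.apache.hadoop.io.Writable`, so adding `implements Writable` should be the main change needed to use it as a value type in a job.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical value class holding the document IDs for one token.
// Not part of Hadoop; write/readFields mirror the Writable interface.
class DocIdList {
    private final List<String> docIds = new ArrayList<>();

    // Hadoop instantiates Writables through a no-arg constructor.
    public DocIdList() {}

    public DocIdList(List<String> ids) {
        docIds.addAll(ids);
    }

    public List<String> getDocIds() {
        return docIds;
    }

    // Serialize as a count followed by each ID; no separator, no escaping.
    public void write(DataOutput out) throws IOException {
        out.writeInt(docIds.size());
        for (String id : docIds) {
            out.writeUTF(id);
        }
    }

    // Hadoop reuses Writable instances, so clear any old state first.
    public void readFields(DataInput in) throws IOException {
        docIds.clear();
        int count = in.readInt();
        for (int i = 0; i < count; i++) {
            docIds.add(in.readUTF());
        }
    }

    // Round trip through a byte array, standing in for the hand-off
    // between two MapReduce steps.
    public static void main(String[] args) throws IOException {
        DocIdList original = new DocIdList(Arrays.asList("d1", "d2", "d3", "d4"));
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        original.write(new DataOutputStream(buffer));

        DocIdList copy = new DocIdList();
        copy.readFields(new DataInputStream(
                new ByteArrayInputStream(buffer.toByteArray())));
        System.out.println(copy.getDocIds()); // prints [d1, d2, d3, d4]
    }
}
```

The later step then reads the list back with `getDocIds()` instead of splitting a string. Hadoop also ships `org.apache.hadoop.io.ArrayWritable`, which may be enough when the list elements are all of one Writable type.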
    On Mon, Apr 20, 2009 at 2:23 PM, chintan bhatt wrote:



Discussion Overview
group: common-user @
categories: hadoop
posted: Apr 20, '09 at 5:24a
active: Apr 23, '09 at 8:44a
posts: 2
users: 2
website: hadoop.apache.org...
irc: #hadoop
