LZO COMPRESSION support
-----------------------

Key: HBASE-1126
URL: https://issues.apache.org/jira/browse/HBASE-1126
Project: Hadoop HBase
Issue Type: New Feature
Environment: All
Reporter: Alex Newman


It would be interesting to compare the performance of LZO-compressed column families against normal zlib block compression. Based on some very preliminary performance profiling that I have done, as long as the cells are of reasonable size (> 4k), the zlib compression really dominates the overhead in random-read and scanning situations.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
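
A rough way to see why decompression can dominate a random read in a block-compressed store is sketched below. It is only an illustration under stated assumptions: the block size, the `CODECS` registry, and the function names are hypothetical and are not taken from HBase, and because LZO is not in the Python standard library the comparison is limited to zlib versus no compression. The sketch assumes that fetching a single cell requires decompressing the whole block that contains it.

```python
import os
import time
import zlib

BLOCK_SIZE = 64 * 1024  # hypothetical block size, not taken from the thread
CELL_SIZE = 4096        # matches the "> 4k" cell size mentioned in the report

# Hypothetical codec registry: name -> (compress, decompress).
CODECS = {
    "none": (lambda b: b, lambda b: b),
    "zlib": (zlib.compress, zlib.decompress),
}


def make_block():
    """Build one block of cells that are half random, half repetitive bytes."""
    half = CELL_SIZE // 2
    cells = BLOCK_SIZE // CELL_SIZE
    return b"".join(
        os.urandom(half) + b"x" * (CELL_SIZE - half) for _ in range(cells)
    )


def random_read(codec_name, stored_block, offset, length):
    """Fetch one cell; the whole block is decompressed before slicing."""
    _, decompress = CODECS[codec_name]
    return decompress(stored_block)[offset:offset + length]


def benchmark(codec_name, block, reads=200):
    """Return (compressed size / raw size, average seconds per cell read)."""
    compress, _ = CODECS[codec_name]
    stored = compress(block)
    start = time.perf_counter()
    for i in range(reads):
        offset = (i * CELL_SIZE) % len(block)
        random_read(codec_name, stored, offset, CELL_SIZE)
    elapsed = time.perf_counter() - start
    return len(stored) / len(block), elapsed / reads


if __name__ == "__main__":
    block = make_block()
    for name in CODECS:
        ratio, per_read = benchmark(name, block)
        print(f"{name:5s} ratio={ratio:.2f} seconds/read={per_read:.6f}")
```

Adding another codec to compare would mean adding one more entry to the hypothetical `CODECS` table; the timings depend heavily on the machine and the data, so no expected numbers are given here.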


  • stack (JIRA) at Jan 17, 2009 at 6:57 pm
    [ https://issues.apache.org/jira/browse/HBASE-1126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    stack updated HBASE-1126:
    -------------------------

    Summary: Enable choice of code; i.e. at a minimum enable LZO COMPRESSION support (was: LZO COMPRESSION support)

    Broadened the issue so that users can choose the codec.

    Over in http://wiki.apache.org/hadoop/Hbase/PerformanceEvaluation#0_19_0, I tested the DefaultCodec (zlib, using the native encoder/decoder) and found that random reads are horrid, writes are just a bit slower, and scans are about the same. I wonder how LZO would change this? Perhaps scans and writes would run as fast as with non-compressed data, and random reads would come close to it? I did notice that block compression made for fewer regions, about half, and this was with the PE data, which does its best to foil good compression.
  • stack (JIRA) at Jan 17, 2009 at 7:11 pm
    [ https://issues.apache.org/jira/browse/HBASE-1126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    stack updated HBASE-1126:
    -------------------------

    Summary: Enable choice of codec; i.e. at a minimum enable LZO COMPRESSION support (was: Enable choice of code; i.e. at a minimum enable LZO COMPRESSION support)
  • stack (JIRA) at Apr 15, 2009 at 4:04 pm
    [ https://issues.apache.org/jira/browse/HBASE-1126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699250#action_12699250 ]

    stack commented on HBASE-1126:
    ------------------------------

    See http://code.google.com/p/hadoop-gpl-compression/.
  • stack (JIRA) at May 20, 2009 at 6:32 pm
    [ https://issues.apache.org/jira/browse/HBASE-1126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

    stack resolved HBASE-1126.
    --------------------------

    Resolution: Fixed
    Fix Version/s: 0.20.0

    Ryan made this work and documented it: http://wiki.apache.org/hadoop/UsingLzoCompression
  • stack (JIRA) at May 20, 2009 at 6:32 pm
    [ https://issues.apache.org/jira/browse/HBASE-1126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711284#action_12711284 ]

    stack commented on HBASE-1126:
    ------------------------------

    See HBASE-1379

Discussion Overview
group: dev @
categories: hbase, hadoop
posted: Jan 14, '09 at 6:02p
active: May 20, '09 at 6:32p
posts: 6
users: 1
website: hbase.apache.org