Hi,

We've developed some code to implement fixed-length datatypes for well
known digest function output (MD5, SHA1 and the various SHA2 types).
These types have minimal overhead and are quite complete, including
btree and hash opclasses.

We're wondering about proposing them for inclusion in pgcrypto. I asked
Marko Kreen but he is not sure about it; according to him it would be
better to have general fixed-length hex types. (I guess it would be
possible to implement the digest types as domains over those.)

So basically we have sha1, sha-256, sha-512 etc on one hand, and hex8,
hex16, hex32 on the other hand. In both cases there is a single body of
code that is compiled with a macro definition that provides the data
length for every separate case. (Actually, in the digest code we
refactored the common routines so that each type has a light wrapper
calling a function that works on any length; the same could be done for
the fixed-length hex code, which is pretty grotty at the moment.)
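
The layout described above (one generic routine that works on any length, plus a light per-type wrapper generated from a macro) might look roughly like the following C sketch. It is not the code discussed in this thread: the names fixed_hex_in, fixed_cmp, DEFINE_DIGEST_TYPE and the *_digest structs are hypothetical, and the sketch deliberately leaves out the PostgreSQL type-definition machinery so that it compiles on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Value of one hex character, or -1 if it is not a hex digit. */
static inline int hex_nibble(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Generic input routine (hypothetical name): parse exactly nbytes bytes
 * from a hex string. Returns 0 on success, -1 if the string has the wrong
 * length or contains a non-hex character. */
static inline int fixed_hex_in(const char *hex, unsigned char *out, size_t nbytes)
{
    size_t i;

    assert(hex != NULL && out != NULL);
    if (strlen(hex) != 2 * nbytes)
        return -1;
    for (i = 0; i < nbytes; i++) {
        int hi = hex_nibble(hex[2 * i]);
        int lo = hex_nibble(hex[2 * i + 1]);
        if (hi < 0 || lo < 0)
            return -1;
        out[i] = (unsigned char) ((hi << 4) | lo);
    }
    return 0;
}

/* Generic comparison (hypothetical name): bytewise ordering, the kind of
 * function a btree comparison wrapper could call. */
static inline int fixed_cmp(const unsigned char *a, const unsigned char *b, size_t nbytes)
{
    return memcmp(a, b, nbytes);
}

/* One macro invocation per type: a fixed-size struct and thin wrappers
 * that only supply the length to the generic routines. */
#define DEFINE_DIGEST_TYPE(name, nbytes)                                   \
    typedef struct { unsigned char data[(nbytes)]; } name##_digest;        \
    static inline int name##_in(const char *hex, name##_digest *out)       \
    { return fixed_hex_in(hex, out->data, (nbytes)); }                     \
    static inline int name##_cmp(const name##_digest *a,                   \
                                 const name##_digest *b)                   \
    { return fixed_cmp(a->data, b->data, (nbytes)); }

DEFINE_DIGEST_TYPE(md5, 16)
DEFINE_DIGEST_TYPE(sha1, 20)
DEFINE_DIGEST_TYPE(sha256, 32)
```

In this sketch md5_in() accepts only a 32-character hex string and sha1_in() only a 40-character one, while all of the parsing and comparison logic lives in the two generic functions. Supporting another length is a single extra DEFINE_DIGEST_TYPE line.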

Of these two choices, which one is likely to have better acceptance
around here?

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

  • Tom Lane at Jul 27, 2009 at 2:20 pm

    Alvaro Herrera writes:
    > We've developed some code to implement fixed-length datatypes for well
    > known digest function output (MD5, SHA1 and the various SHA2 types).
    > These types have minimal overhead and are quite complete, including
    > btree and hash opclasses.
    > We're wondering about proposing them for inclusion in pgcrypto.

    Wasn't this proposed and rejected before? (Or more to the point,
    why'd you bother? The advantage over bytea seems negligible.)

    regards, tom lane
  • Merlin Moncure at Jul 27, 2009 at 3:13 pm

    On Mon, Jul 27, 2009 at 10:20 AM, Tom Lane wrote:
    > Alvaro Herrera <alvherre@commandprompt.com> writes:
    >> We've developed some code to implement fixed-length datatypes for well
    >> known digest function output (MD5, SHA1 and the various SHA2 types).
    >> These types have minimal overhead and are quite complete, including
    >> btree and hash opclasses.
    >> We're wondering about proposing them for inclusion in pgcrypto.
    > Wasn't this proposed and rejected before?  (Or more to the point,
    > why'd you bother?  The advantage over bytea seems negligible.)

    well, one nice thing about the fixed length types is that you can
    keep your table from needing a toast table when you have a bytea in
    it.

    merlin
  • Tom Lane at Jul 27, 2009 at 3:37 pm

    Merlin Moncure writes:
    > On Mon, Jul 27, 2009 at 10:20 AM, Tom Lane wrote:
    >> Wasn't this proposed and rejected before?  (Or more to the point,
    >> why'd you bother?  The advantage over bytea seems negligible.)
    > well, one nice thing about the fixed length types is that you can
    > keep your table from needing a toast table when you have a bytea in
    > it.

    If you don't actually use the toast table, it doesn't cost anything very
    noticeable ...

    regards, tom lane
  • Andrew Dunstan at Jul 27, 2009 at 4:03 pm

    Merlin Moncure wrote:
    > On Mon, Jul 27, 2009 at 10:20 AM, Tom Lane wrote:
    >> Alvaro Herrera <alvherre@commandprompt.com> writes:
    >>> We've developed some code to implement fixed-length datatypes for well
    >>> known digest function output (MD5, SHA1 and the various SHA2 types).
    >>> These types have minimal overhead and are quite complete, including
    >>> btree and hash opclasses.
    >>>
    >>> We're wondering about proposing them for inclusion in pgcrypto.
    >> Wasn't this proposed and rejected before? (Or more to the point,
    >> why'd you bother? The advantage over bytea seems negligible.)
    > well, one nice thing about the fixed length types is that you can
    > keep your table from needing a toast table when you have a bytea in
    > it.

    Can't you just set storage on the column to MAIN to stop it being stored
    in a toast table?

    cheers

    andrew
  • Merlin Moncure at Jul 27, 2009 at 5:55 pm

    On Mon, Jul 27, 2009 at 12:02 PM, Andrew Dunstan wrote:
    > Merlin Moncure wrote:
    >> On Mon, Jul 27, 2009 at 10:20 AM, Tom Lane wrote:
    >>> Alvaro Herrera <alvherre@commandprompt.com> writes:
    >>>> We've developed some code to implement fixed-length datatypes for well
    >>>> known digest function output (MD5, SHA1 and the various SHA2 types).
    >>>> These types have minimal overhead and are quite complete, including
    >>>> btree and hash opclasses.
    >>>> We're wondering about proposing them for inclusion in pgcrypto.
    >>> Wasn't this proposed and rejected before?  (Or more to the point,
    >>> why'd you bother?  The advantage over bytea seems negligible.)
    >> well, one nice thing about the fixed length types is that you can
    >> keep your table from needing a toast table when you have a bytea in
    >> it.
    > Can't you just set storage on the column to MAIN to stop it being stored in
    > a toast table?

    of course.

    hm. would the input/output functions for the fixed length types be
    faster? what is the advantage of the proposal?

    merlin
  • Peter Eisentraut at Jul 28, 2009 at 11:15 am

    On Monday 27 July 2009 14:50:30 Alvaro Herrera wrote:
    > We've developed some code to implement fixed-length datatypes for well
    > known digest function output (MD5, SHA1 and the various SHA2 types).
    > These types have minimal overhead and are quite complete, including
    > btree and hash opclasses.
    >
    > We're wondering about proposing them for inclusion in pgcrypto. I asked
    > Marko Kreen but he is not sure about it; according to him it would be
    > better to have general fixed-length hex types. (I guess it would be
    > possible to implement the digest types as domains over those.)

    I think equipping bytea with a length restriction would be a very natural,
    simple, and useful addition. If we ever want to move the bytea type closer to
    the SQL standard blob type, this will need to happen anyway.

    The case for separate fixed-length data types seems very dubious, unless you
    can show very impressive performance numbers. For one thing, they would make
    the whole type system more complicated, or in the alternative, would have
    little function and operator support.
  • Merlin Moncure at Jul 28, 2009 at 3:12 pm

    On Tue, Jul 28, 2009 at 7:15 AM, Peter Eisentraut wrote:
    > On Monday 27 July 2009 14:50:30 Alvaro Herrera wrote:
    >> We've developed some code to implement fixed-length datatypes for well
    >> known digest function output (MD5, SHA1 and the various SHA2 types).
    >> These types have minimal overhead and are quite complete, including
    >> btree and hash opclasses.
    > I think equipping bytea with a length restriction would be a very natural,
    > simple, and useful addition.  If we ever want to move the bytea type closer to
    > the SQL standard blob type, this will need to happen anyway.

    +1

    merlin
  • Decibel at Jul 29, 2009 at 5:16 pm

    On Jul 28, 2009, at 6:15 AM, Peter Eisentraut wrote:
    > On Monday 27 July 2009 14:50:30 Alvaro Herrera wrote:
    >> We've developed some code to implement fixed-length datatypes for well
    >> known digest function output (MD5, SHA1 and the various SHA2 types).
    >> These types have minimal overhead and are quite complete, including
    >> btree and hash opclasses.
    >>
    >> We're wondering about proposing them for inclusion in pgcrypto. I asked
    >> Marko Kreen but he is not sure about it; according to him it would be
    >> better to have general fixed-length hex types. (I guess it would be
    >> possible to implement the digest types as domains over those.)
    > I think equipping bytea with a length restriction would be a very natural,
    > simple, and useful addition. If we ever want to move the bytea type closer to
    > the SQL standard blob type, this will need to happen anyway.
    >
    > The case for separate fixed-length data types seems very dubious, unless you
    > can show very impressive performance numbers. For one thing, they would make
    > the whole type system more complicated, or in the alternative, would have
    > little function and operator support.

    bytea doesn't cast well to and from text when you're dealing with hex
    data; you end up using the same amount of space as a varchar. What
    would probably work well is a hex datatype that internally works like
    bytea but requires that the input data is hex (I know you can use
    encode/decode, but that added step is a pain). A similar argument
    could be made for base64 encoded data.
    --
    Decibel!, aka Jim C. Nasby, Database Architect decibel@decibel.org
    Give your computer some brain candy! www.distributed.net Team #1828
  • Peter Eisentraut at Jul 30, 2009 at 8:10 am

    On Wednesday 29 July 2009 20:16:48 decibel wrote:
    > bytea doesn't cast well to and from text when you're dealing with hex
    > data; you end up using the same amount of space as a varchar. What
    > would probably work well is a hex datatype that internally works like
    > bytea but requires that the input data is hex (I know you can use
    > encode/decode, but that added step is a pain). A similar argument
    > could be made for base64 encoded data.

    There is a patch in the queue that adds hex input and output to bytea.
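
The space argument raised earlier in the thread (hex kept as text takes as much room as a varchar, while the same data held as raw bytes is half the size) can be illustrated with a short C sketch. The helper names hex_text_size and bytes_to_hex are hypothetical and are not PostgreSQL functions; the conversion shown is only meant to mirror, in spirit, the extra encode step mentioned above.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes needed to hold nbytes of binary data as a NUL-terminated hex
 * string: two characters per byte plus the terminator. */
static inline size_t hex_text_size(size_t nbytes)
{
    return 2 * nbytes + 1;
}

/* Write nbytes of binary input as lowercase hex. The caller must provide
 * an output buffer of at least hex_text_size(nbytes) bytes. */
static inline void bytes_to_hex(const unsigned char *in, size_t nbytes, char *out)
{
    static const char digits[] = "0123456789abcdef";
    size_t i;

    assert(in != NULL && out != NULL);
    for (i = 0; i < nbytes; i++) {
        out[2 * i]     = digits[in[i] >> 4];    /* high nibble */
        out[2 * i + 1] = digits[in[i] & 0x0f];  /* low nibble */
    }
    out[2 * nbytes] = '\0';
}
```

A 20-byte SHA1 value therefore needs 40 characters once rendered as hex, which is the overhead a hex-aware binary type would avoid by converting only on input and output.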

Discussion Overview
group: pgsql-hackers
categories: postgresql
posted: Jul 27, '09 at 11:50a
active: Jul 30, '09 at 8:10a
posts: 10
users: 6
website: postgresql.org...
irc: #postgresql
