FAQ
People,

Severity: Serious Annoyance
Reproducable on: 7.4.1, 7.4.3, 7.5devel
Summary: Concatination of CHAR() data type field seems to result in a TEXT
value instead of a CHAR value. Is there a reason for this?

Example:
webmergers=> select '"'::char(4) || ''::char(4) || '"'::char(4);
?column?
----------
""
(1 row)

Depending on the spec, it seems to me that the above should result either in a
char(4) of " " or a char(12) of " " . But we get a text value.
Is this the SQL spec? Is there another reason for this behavior?

--
-Josh Berkus
Aglio Database Solutions
San Francisco

Search Discussions

  • Josh Berkus at Jul 20, 2004 at 6:06 pm
    Folks,

    Also:
    This behavior was different in 7.1:

    [11:02:45] <DarcyB> darcy=# select '1'::char(4) || '-'::char(1);
    [11:02:45] <DarcyB> ?column?
    [11:02:45] <DarcyB> ----------
    [11:02:45] <DarcyB> 1 -
    [11:02:45] <DarcyB> (1 row)
    [11:02:49] <DarcyB> on 7.1

    And there's apparently either an issue, or a change in behavior, in CHAR for
    7.5:

    [11:03:25] <DarcyB> darcy=# SELECT length('1'::char(4));
    [11:03:25] <DarcyB> length
    [11:03:25] <DarcyB> --------
    [11:03:25] <DarcyB> 1
    [11:03:25] <DarcyB> (1 row)
    [11:03:29] <DarcyB> is 7.5

    pg743=> select length('1'::char(4));
    length
    --------
    4
    (1 row)
    (on 7.4.3)

    Are these changes intentional, or are they bugs?

    --
    -Josh Berkus
    Aglio Database Solutions
    San Francisco
  • Tom Lane at Jul 21, 2004 at 5:56 am

    Josh Berkus writes:
    Are these changes intentional,
    Yes. We've been moving more and more steadily towards the notion that
    trailing spaces in char(n) values are insignificant noise. If you think
    that trailing spaces are significant, you shouldn't be using char(n)
    to store them.

    regards, tom lane
  • Jean-Luc Lachance at Jul 21, 2004 at 12:37 pm
    This means that there is no more difference between CHAR(N) and
    VARCHAR(N). To bad... '1 ' sould be different from '1'.


    Tom Lane wrote:
    Josh Berkus <josh@agliodbs.com> writes:
    Are these changes intentional,

    Yes. We've been moving more and more steadily towards the notion that
    trailing spaces in char(n) values are insignificant noise. If you think
    that trailing spaces are significant, you shouldn't be using char(n)
    to store them.

    regards, tom lane

    ---------------------------(end of broadcast)---------------------------
    TIP 6: Have you searched our list archives?

    http://archives.postgresql.org
  • Stephan Szabo at Jul 21, 2004 at 3:01 pm

    On Wed, 21 Jul 2004, Tom Lane wrote:

    Josh Berkus <josh@agliodbs.com> writes:
    Are these changes intentional,
    Yes. We've been moving more and more steadily towards the notion that
    trailing spaces in char(n) values are insignificant noise. If you think
    that trailing spaces are significant, you shouldn't be using char(n)
    to store them.
    Well, the problem here is that technically we're returning the wrong type.
    We should be returning a char(l1+l2) rather than a text for a char
    concatenate, but similarly to the recent complaint about numerics, we
    don't really have a fully proper way to do that and it seems non-trivial.
  • Tom Lane at Jul 21, 2004 at 3:23 pm

    Stephan Szabo writes:
    On Wed, 21 Jul 2004, Tom Lane wrote:
    Yes. We've been moving more and more steadily towards the notion that
    trailing spaces in char(n) values are insignificant noise. If you think
    that trailing spaces are significant, you shouldn't be using char(n)
    to store them.
    Well, the problem here is that technically we're returning the wrong type.
    We should be returning a char(l1+l2) rather than a text for a char
    concatenate, but similarly to the recent complaint about numerics, we
    don't really have a fully proper way to do that and it seems non-trivial.
    Well, it'd be trivial to implement a char || char yielding char
    operator; it could just point to the existing textcat function and
    you'd get what you want. (It would come out as char(-1), ie unspecified
    length, but I'm not buying into doing the kind of analysis it would take
    to predict the length.) The real question in my mind is whether that
    would be more or less consistent with the behavior in other cases.
    Food for thought: in 7.4,

    regression=# select ('X '::char) = ('X'::char);
    ?column?
    ----------
    t
    (1 row)

    regression=# select ('Y '::char) = ('Y'::char);
    ?column?
    ----------
    t
    (1 row)

    regression=# select ('X '::char || 'Y '::char) = ('X'::char || 'Y'::char);
    ?column?
    ----------
    t
    (1 row)

    If we change || as is proposed in this thread, then the last case would
    yield 'false', because the first concatenation would yield 'X Y '
    which is not equal to 'XY' no matter what you think about trailing
    spaces. I find it a bit disturbing that the concatenation of equal
    values would yield unequal values.

    IMHO the bottom line here is that the SQL-spec behavior of type char(N)
    is completely brain-dead. Practically all of the questions in this area
    would go away if people used varchar(N) or text to store their data.

    regards, tom lane
  • Stephan Szabo at Jul 21, 2004 at 3:50 pm

    On Wed, 21 Jul 2004, Tom Lane wrote:

    Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
    On Wed, 21 Jul 2004, Tom Lane wrote:
    Yes. We've been moving more and more steadily towards the notion that
    trailing spaces in char(n) values are insignificant noise. If you think
    that trailing spaces are significant, you shouldn't be using char(n)
    to store them.
    Well, the problem here is that technically we're returning the wrong type.
    We should be returning a char(l1+l2) rather than a text for a char
    concatenate, but similarly to the recent complaint about numerics, we
    don't really have a fully proper way to do that and it seems non-trivial.
    Well, it'd be trivial to implement a char || char yielding char
    operator; it could just point to the existing textcat function and
    you'd get what you want. (It would come out as char(-1), ie unspecified
    length, but I'm not buying into doing the kind of analysis it would take
    to predict the length.) The real question in my mind is whether that
    The reason that to do it completely means knowing the length comes from
    case and union afaics. Both of these need to do something consistent with
    the lengths.

    case when <blah> then 'f'::char(2) || 'g'::char(2) else
    'f'::char(3) || 'g'::char(3) end
    should return a consistent length char no matter which branch is taken on
    any given row.

    This was the basic complaint with numeric in the -sql thread, we return
    the "correct" actual numeric values with proper seeming precision and
    scale, but if you then case two of these that gave different precision and
    scale, you'd get inconsistent scale in the case output.
    Food for thought: in 7.4,

    regression=# select ('X '::char) = ('X'::char);
    ?column?
    ----------
    t
    (1 row)

    regression=# select ('Y '::char) = ('Y'::char);
    ?column?
    ----------
    t
    (1 row)

    regression=# select ('X '::char || 'Y '::char) = ('X'::char || 'Y'::char);
    ?column?
    ----------
    t
    (1 row)

    If we change || as is proposed in this thread, then the last case would
    yield 'false', because the first concatenation would yield 'X Y '
    which is not equal to 'XY' no matter what you think about trailing
    spaces. I find it a bit disturbing that the concatenation of equal
    values would yield unequal values.
    That is somewhat bad, yeah.
    IMHO the bottom line here is that the SQL-spec behavior of type char(N)
    is completely brain-dead. Practically all of the questions in this area
    would go away if people used varchar(N) or text to store their data.
    It is fairly wierd, yes. I'm not sure if the spec lets you, but a NO PAD
    default character set probably would have made this simpler, by not
    requiring that 'Y'::char(4) is equal to 'Y'::char(2), but it's really too
    late to change that now in any case.
  • Andreas Pflug at Jul 21, 2004 at 4:14 pm

    Tom Lane wrote:

    Food for thought: in 7.4,

    regression=# select ('X '::char) = ('X'::char);
    ?column?
    ----------
    t
    (1 row)

    regression=# select ('Y '::char) = ('Y'::char);
    ?column?
    ----------
    t
    (1 row)

    regression=# select ('X '::char || 'Y '::char) = ('X'::char || 'Y'::char);
    ?column?
    ----------
    t
    (1 row)

    If we change || as is proposed in this thread, then the last case would
    yield 'false', because the first concatenation would yield 'X Y '
    which is not equal to 'XY' no matter what you think about trailing
    spaces. I find it a bit disturbing that the concatenation of equal
    values would yield unequal values.
    Well this indicates that the first two examples are questionable. 'X '
    is quite-the-same as 'X', but not really-the-same.

    CREATE OR REPLACE FUNCTION toms_name() RETURNS char(50)
    as $BODY$
    DECLARE fullname char(50);
    DECLARE firstname char(50) := 'Tom';
    DECLARE secondname char(50) := 'G';
    DECLARE lastname char(50) := 'Lane';
    BEGIN
    fullname := firstname;
    IF secondname != '' THEN
    IF fullname != '' THEN
    fullname := fullname || ' ';
    END IF;
    fullname := fullname || secondname;
    END IF;
    IF fullname != '' THEN
    fullname := fullname || ' ';
    END IF;
    fullname := fullname || lastname;

    RETURN fullname;
    END;
    $BODY$ LANGUAGE 'plpgsql'


    I find the result of this function quite surprising, and certainly not
    yielding what was intended (yes, this can avoided, I know). Surprise is
    getting bigger, if fullname is declared as text...

    IMHO the bottom line here is that the SQL-spec behavior of type char(N)
    is completely brain-dead.
    Just for COBOL's sake, I suppose.

    Regards,
    Andreas
  • Tom Lane at Jul 22, 2004 at 12:11 am

    Andreas Pflug writes:
    Tom Lane wrote:
    Food for thought: in 7.4,

    regression=# select ('X '::char) = ('X'::char);
    ?column?
    ----------
    t
    (1 row)

    regression=# select ('Y '::char) = ('Y'::char);
    ?column?
    ----------
    t
    (1 row)

    regression=# select ('X '::char || 'Y '::char) = ('X'::char || 'Y'::char);
    ?column?
    ----------
    t
    (1 row)

    If we change || as is proposed in this thread, then the last case would
    yield 'false', because the first concatenation would yield 'X Y '
    which is not equal to 'XY' no matter what you think about trailing
    spaces. I find it a bit disturbing that the concatenation of equal
    values would yield unequal values.
    Well this indicates that the first two examples are questionable.
    Indeed, but AFAICS this behavior is mandated by the SQL standard.
    (Note we are interpreting char(N) as always having the PAD SPACE
    behavior, though the spec really wants us to associate that with
    a collation instead.)

    regards, tom lane
  • Darcy Buskermolen at Jul 21, 2004 at 7:06 pm

    On July 21, 2004 08:22 am, Tom Lane wrote:
    IMHO the bottom line here is that the SQL-spec behavior of type char(N)
    is completely brain-dead. Practically all of the questions in this area
    would go away if people used varchar(N) or text to store their data.

    regards, tom lane
    For reference sake oracle treats it as follows:

    <Dorm> SQL> create table dummy (value char(4));
    <Dorm> Table created.
    <Dorm> SQL> insert into dummy values ('A');
    <Dorm> 1 row created.
    <Dorm> SQL> commit;
    <Dorm> Commit complete.
    <Dorm> SQL> select value||value from dummy;
    <Dorm> VALUE||V
    <Dorm> --------
    <Dorm> A A
    <Dorm> SQL> select length(value||value) from dummy;
    <Dorm> LENGTH(VALUE||VALUE)
    <Dorm> --------------------
    <Dorm> 8



    --
    Darcy Buskermolen
    Wavefire Technologies Corp.
    ph: 250.717.0200
    fx: 250.763.1759
    http://www.wavefire.com

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
grouppgsql-bugs @
categoriespostgresql
postedJul 20, '04 at 6:00p
activeJul 22, '04 at 12:11a
posts10
users6
websitepostgresql.org
irc#postgresql

People

Translate

site design / logo © 2021 Grokbase