I'm making pretty good progress on the task of splitting input and
output collations for expression nodes. There remains one case in the
regression tests that is giving a non-expected result. It involves this
function:

CREATE FUNCTION dup (f1 anyelement, f2 out anyelement, f3 out anyarray)
AS 'select $1, array[$1,$1]' LANGUAGE sql;

which is used like this:

SELECT a, (dup(b)).* FROM collate_test3 ORDER BY 2;

where the column collate_test3.b is declared with a nondefault
collation. In HEAD, the output of dup() inherits that collation from
the b Var, and then so does the field selected from it, and so the ORDER
BY sorts according to the column collation. With my patch as it stands,
you get sorting by the default collation. The reason is that the output
of dup() is of type RECORD, which is a noncollatable type according to
pg_type, so the function result is labeled noncollatable even though it
has collatable inputs. The FieldSelect node then has to choose the
default collation since there's nothing else available.

ISTM there are basically three things we might do about this:

(1) Decide that the patch's behavior is correct and what's embodied in
the regression expected file is wrong.

(2) Allow collations to propagate up through nodes that deliver
noncollatable outputs.

(3) Decide that composite types are collatable.

I don't much care for #2, mainly because it seems likely to cause a lot
of unnecessary collation conflicts. It has a lot of flavor of action
at a distance in most cases, too, even though it might seem natural
in this particular example.

#3 has some appeal but it also seems fraught with definitional issues,
for instance:

* Something like
row('a' collate "C", 'b' collate "en_US")
doesn't have a resolvable collation, but I don't think we want it
to throw an error.

* If composites are collatable, one would logically expect a table's
rowtype to inherit the common collation of its columns (if there is
one). This is problematic because then the rowtype collation can change
after ALTER TABLE ADD COLUMN/DROP COLUMN/ALTER COLUMN TYPE, which is
something we can't support even if it seemed like a good idea.

Also, I fail to see any support for #3 in the SQL standard. So far as
I can find, the standard only considers that string types are collatable.
By the same token, it doesn't look to me like the standard is expecting
#2 to happen, since only string types can carry collation.

Thoughts?

regards, tom lane

Search Discussions

  • Tom Lane at Mar 19, 2011 at 4:45 pm

    I wrote:
    ISTM there are basically three things we might do about this:
    (1) Decide that the patch's behavior is correct and what's embodied in
    the regression expected file is wrong.
    (2) Allow collations to propagate up through nodes that deliver
    noncollatable outputs.
    (3) Decide that composite types are collatable.
    I thought of another possibility, which is to special-case field
    selection from a function-returning-composite, ie make it look
    through the function node and use the function's input collation.
    FieldSelect needs to be a special case in the collation assignment code
    anyway because of the possibility of taking the collation from the field
    declaration instead of the input, so this is not *quite* as ugly as it
    first sounds. It's still ugly, but it makes that regression test pass
    with only a very localized change. So I will do it like that for now
    until someone comes up with an argument for another choice.

    regards, tom lane
  • Martijn van Oosterhout at Mar 20, 2011 at 7:26 pm

    On Fri, Mar 18, 2011 at 10:40:53PM -0400, Tom Lane wrote:
    (2) Allow collations to propagate up through nodes that deliver
    noncollatable outputs.
    I don't think this is the goal. Only strings types are collatable, as
    you point out.
    * Something like
    row('a' collate "C", 'b' collate "en_US")
    doesn't have a resolvable collation, but I don't think we want it
    to throw an error.
    Well, it doesn't. But it returns a rowtype and the string columns in
    the row type would carry a collation and I don't think they'd have to
    be the same. row is not a function really.
    * If composites are collatable, one would logically expect a table's
    rowtype to inherit the common collation of its columns (if there is
    one). This is problematic because then the rowtype collation can change
    after ALTER TABLE ADD COLUMN/DROP COLUMN/ALTER COLUMN TYPE, which is
    something we can't support even if it seemed like a good idea.
    Indeed not.

    A rowtype has an order, determined by the fields within it. Those
    fields may be strings and so may have a collation. Doesn't seem
    particularly magical to me.

    Have a nice day,
    --
    Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
    Patriotism is when love of your own people comes first; nationalism,
    when hate for people other than your own comes first.
    - Charles de Gaulle
  • Peter Eisentraut at Mar 22, 2011 at 8:39 am

    On sön, 2011-03-20 at 20:26 +0100, Martijn van Oosterhout wrote:
    A rowtype has an order, determined by the fields within it. Those
    fields may be strings and so may have a collation. Doesn't seem
    particularly magical to me.
    Yeah, that's answer #4. The composite types themselves are not
    considered collatable, but the fields in them carry collation
    individually. That's what the test case in question represents, and I
    think it must work like that if you maintain the analogy between
    composite types and tables (which have columns that carry collation
    individually).

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
grouppgsql-hackers @
categoriespostgresql
postedMar 19, '11 at 2:41a
activeMar 22, '11 at 8:39a
posts4
users3
websitepostgresql.org...
irc#postgresql

People

Translate

site design / logo © 2021 Grokbase