Dear list,
I am trying to find out whether I can use the "record" type as a polymorphic
return type to return multiple columns, to be determined at runtime. In
fact, I'm trying to write a function that provides a generic implementation
of some deserialization of a certain field.

The prototype of the function I came up with, is as follows:
CREATE FUNCTION deserialize(the_table t1) RETURNS record << etc. etc. >>
;

It is intended to return multiple fields in an anonymous row: only at time
of invocation it is known what fields will this row consist of...

The function is used in the following statement:
SELECT * FROM (SELECT deserialize( t1 ) FROM t1) ss;
Unfortunately, this results in ONE row, with ONE column. E.g.:

MYDB=# select * from (SELECT deserialize(kvp) FROM kvp) ss;
deserialize
-----------
(1,2)
(1 row)

I guess I am seeking to prototype the anonymous row layout in the above SQL
statement?

Hope you have any idea,
regards,


Rob

Search Discussions

  • Alban Hertroys at Sep 25, 2009 at 5:59 pm

    On 25 Sep 2009, at 18:34, InterRob wrote:

    Unfortunately, this results in ONE row, with ONE column. E.g.:

    MYDB=# select * from (SELECT deserialize(kvp) FROM kvp) ss;
    deserialize
    -----------
    (1,2)
    (1 row)

    I guess I am seeking to prototype the anonymous row layout in the
    above SQL statement?

    I'm not entirely sure about the syntax in your case, but I think
    you're looking for:

    MYDB=# select * from (SELECT deserialize(kvp) FROM kvp) ss (a int, b
    int);

    If that doesn't work, it's based on how you normally select from a
    record-returning function, namely:
    MYDB=# SELECT * FROM deserialize('some string') AS ss (a int, b int);

    You may need to call it like this though:
    MYDB=# select * from (SELECT (deserialize(kvp)).a, (deserialize
    (kvp)).b FROM kvp) ss (a int, b int);

    In that case your function better not be volatile or it will be
    evaluated twice.

    Alban Hertroys

    --
    If you can't see the forest for the trees,
    cut the trees and you'll see there is no forest.


    !DSPAM:737,4abd04ce11682030514312!
  • Rob Marjot at Sep 25, 2009 at 6:40 pm
    Still no luck... To clarify a bit, take this example:

    CREATE OR REPLACE FUNCTION transpose()
    RETURNS record AS
    $BODY$ DECLARE
    output RECORD;
    BEGIN
    SELECT * INTO output FROM (VALUES(1,2)) as tbl(first,second);
    RETURN output;

    END;$BODY$
    LANGUAGE 'plpgsql' STABLE
    COST 100;

    Now, I expect to have 2 columns; named "first" and "second". However, like
    posted before, the flowing query:
    SELECT * FROM deserialize();
    produces only ONE column (in one row, as one would clearly expect from the
    function's defnition):
    deserialize
    -----------
    (1,2)
    (1 row)


    Any thoughts on how to make sure multiple columns are returned; without
    specifying this in the function's prototype return clause?

    Thanks,


    Rob

    2009/9/25 Alban Hertroys <dalroi@solfertje.student.utwente.nl>
    On 25 Sep 2009, at 18:34, InterRob wrote:

    Unfortunately, this results in ONE row, with ONE column. E.g.:
    MYDB=# select * from (SELECT deserialize(kvp) FROM kvp) ss;
    deserialize
    -----------
    (1,2)
    (1 row)

    I guess I am seeking to prototype the anonymous row layout in the above
    SQL statement?

    I'm not entirely sure about the syntax in your case, but I think you're
    looking for:

    MYDB=# select * from (SELECT deserialize(kvp) FROM kvp) ss (a int, b int);

    If that doesn't work, it's based on how you normally select from a
    record-returning function, namely:
    MYDB=# SELECT * FROM deserialize('some string') AS ss (a int, b int);

    You may need to call it like this though:
    MYDB=# select * from (SELECT (deserialize(kvp)).a, (deserialize(kvp)).b
    FROM kvp) ss (a int, b int);

    In that case your function better not be volatile or it will be evaluated
    twice.

    Alban Hertroys

    --
    If you can't see the forest for the trees,
    cut the trees and you'll see there is no forest.


    !DSPAM:968,4abd04cd11681949045486!


  • Tom Lane at Sep 25, 2009 at 10:31 pm

    Rob Marjot writes:
    Any thoughts on how to make sure multiple columns are returned; without
    specifying this in the function's prototype return clause?
    If you want "SELECT * FROM" to expand to multiple columns, the names
    and types of those columns *must* be available at parse time. You
    can either declare them in the function prototype, or you can supply
    them in the function call, a la

    select * from my_func(...) as x(a int, b int);

    It will not work to hope that the parser can predict what the function
    will do when executed.

    regards, tom lane
  • Merlin Moncure at Sep 26, 2009 at 12:28 am

    On Fri, Sep 25, 2009 at 2:40 PM, Rob Marjot wrote:
    Still no luck... To clarify a bit, take this example:
    CREATE OR REPLACE FUNCTION transpose()
    RETURNS record AS
    $BODY$ DECLARE
    output RECORD;
    BEGIN
    SELECT * INTO output FROM (VALUES(1,2)) as tbl(first,second);
    RETURN output;
    END;$BODY$
    LANGUAGE 'plpgsql' STABLE
    COST 100;
    Now, I expect to have 2 columns; named "first" and "second". However, like
    posted before, the flowing query:
    SELECT * FROM deserialize();
    produces only ONE column (in one row, as one would clearly expect from the
    function's defnition):
    deserialize
    -----------
    (1,2)
    (1 row)

    Any thoughts on how to make sure multiple columns are returned; without
    specifying this in the function's prototype return clause?
    In a sense, what you are asking is impossible. Having a function (even
    a C one) return 'record' does not get you out of having to define the
    output columns...either in the function definition with 'out' or as
    part of the calling query. Depending on what you are trying to do,
    this could either matter a little or a lot. If it matters, why don't
    you post some more details of the context of your problem and see if a
    better solution can be found?

    merlin
  • Rob Marjot at Sep 26, 2009 at 9:23 pm
    Thank you for your response; I understand the information provided was
    somewhat limited; I am happy to provide a bit more though: I notice you guys
    have quite experience modeling data...
    What I am trying to do is: building views on a base table, extended by one
    or more columns, extracted (hence the naming of the function
    "deserialize()") from a SINGLE column (XML) **that is in this same base
    table** (see below). Instructions for deserialization (that is: which
    'fields' to look for) reside in some other table. There are MULTIPLE base
    tables, they basically look like this:

    [table definition:]
    BASETABLE(ID INT, model TEXT, field1 some_type, field2 some_type, ... fieldN
    some_type, serialized_data XML)

    So, I wish to define multiple VIEWs based on a BASETABLE; one for each
    "model" (as stated in the above table definition: "model" is a property for
    each row). This QUERY would look like this (producing a VIEW for "MODEL1";
    the query below in invalid, unfortunately):
    SELECT base_t.*, deserialized.* FROM "BASETABLE" base_t,
    deserialize('MODEL1', base_t) as deserialized(fieldX some_type, fieldY
    some_type) WHERE base_t.model = 'MODEL1';

    I have no problem with the requirement to supply the table type in the
    query; infact this is logical. Still, this query is impossible, obviously,
    because "base_t" as a target is not known in the context of the FROM-clause,
    where I whish to use it in calling "deserialize(...)". Ofcourse, I could
    write a "deserialize()" function for each base table (e.g.
    "deserialize_base1(...)") but I wish it to perform it's action on only rows
    that will actually be part of the result set; thus I want the WHERE-clause
    to apply to the function's seq scan álso. When provided, I whish to
    incorporated the user's WHERE-clause as well; this is done by the PostgreSQL
    RULE system...

    Alternatively, the VIEW could be defined by the following query:
    SELECT base_t.*, (deserialize('MODEL1', base_t) as temp(fieldX some_type,
    field_Y some_type)).* FROM "BASETABLE" base_t WHERE base_t.model = 'MODEL1';

    This approach does not work either: deserialize(...) will return its set of
    fields as ONE field (comma separated, circumfixed by brackets); expressions
    within a SELECT-list seem to be only allowed to result in ONE column, except
    from the * shorthand...

    ** So, the question is: how can i feed my "deserialize()" function with a
    record subset (e.g. filter by a WHERE-clause) of the BASETABLE, while still
    returning a record?!? **

    I tried the following approach also:
    SELECT (SELECT fieldX FROM deserialize(base_t) deserialized(fieldX
    some_type, fieldY some_type)) "fieldX", (SELECT fieldY FROM
    deserialize(base_t) deserialized(fieldX some_type, fieldY some_type))
    "fieldY" FROM "BASETABLE" table_t WHERE model= 'MODEL1';

    Which infact worked, but caused the function to get invoked TWICE FOR EACH
    ROW (even when marked IMMUTABLE STRICT; or did I have to flush cached query
    plans in psql?).

    Another approach would be to put all key/value pairs into a separate table
    (as one would do when implementing a EAV-model within a RDBMS) which is then
    to be joined (and joined again... and possibly again (!); in case of
    MULTIPLE additional rows -- depending on the definition of the VIEW) onto
    the BASETABLE, rather than to deserialize from XML which is stored within
    the same record... How does this approach then actually translate in terms
    of table scans? Will they be limited by the filter on the BASETABLE, as the
    available values to join on will be limited? At any rate: this approach will
    be more difficult to implement / maintain in case of EDITABLE VIEWS
    (inserts, update, delete)...

    Hope any of you has some useful thoughts on this... It appears to me
    updating the additional (virtual) fields in the BASETABLE is much easier:
    the "serialize()"-function can be fed by a list of key/value pairs,
    producing some XML that can be stored in the xml field of "serialized_data",
    part of this same base table...
    All this needs to be implemented fully in the database back-end; client
    application will not know they are talking to VIEWS rather than tables...
    Thus: the hosted database must simulate to provide various tables, whereas
    these are in fact stored in a limited number of base tables.

    Thanks in advance, you guys out there!


    Rob



    2009/9/26 Merlin Moncure <mmoncure@gmail.com>
    On Fri, Sep 25, 2009 at 2:40 PM, Rob Marjot wrote:
    Still no luck... To clarify a bit, take this example:
    CREATE OR REPLACE FUNCTION transpose()
    RETURNS record AS
    $BODY$ DECLARE
    output RECORD;
    BEGIN
    SELECT * INTO output FROM (VALUES(1,2)) as tbl(first,second);
    RETURN output;
    END;$BODY$
    LANGUAGE 'plpgsql' STABLE
    COST 100;
    Now, I expect to have 2 columns; named "first" and "second". However, like
    posted before, the flowing query:
    SELECT * FROM deserialize();
    produces only ONE column (in one row, as one would clearly expect from the
    function's defnition):
    deserialize
    -----------
    (1,2)
    (1 row)

    Any thoughts on how to make sure multiple columns are returned; without
    specifying this in the function's prototype return clause?
    In a sense, what you are asking is impossible. Having a function (even
    a C one) return 'record' does not get you out of having to define the
    output columns...either in the function definition with 'out' or as
    part of the calling query. Depending on what you are trying to do,
    this could either matter a little or a lot. If it matters, why don't
    you post some more details of the context of your problem and see if a
    better solution can be found?

    merlin
  • InterRob at Sep 26, 2009 at 9:30 pm
    Thank you for your response; I understand the information provided was
    somewhat limited; I am happy to provide a bit more though: I notice you guys
    have quite experience modeling data...
    What I am trying to do is: building views on a base table, extended by one
    or more columns, extracted (hence the naming of the function
    "deserialize()") from a SINGLE column (XML) **that is in this same base
    table** (see below). Instructions for deserialization (that is: which
    'fields' to look for) reside in some other table. There are MULTIPLE base
    tables, they basically look like this:

    [table definition:]
    BASETABLE(ID INT, model TEXT, field1 some_type, field2 some_type, ... fieldN
    some_type, serialized_data XML)

    So, I wish to define multiple VIEWs based on a BASETABLE; one for each
    "model" (as stated in the above table definition: "model" is a property for
    each row). This QUERY would look like this (producing a VIEW for "MODEL1";
    the query below in invalid, unfortunately):
    SELECT base_t.*, deserialized.* FROM "BASETABLE" base_t,
    deserialize('MODEL1', base_t) as deserialized(fieldX some_type, fieldY
    some_type) WHERE base_t.model = 'MODEL1';

    I have no problem with the requirement to supply the table type in the
    query; infact this is logical. Still, this query is impossible, obviously,
    because "base_t" as a target is not known in the context of the FROM-clause,
    where I whish to use it in calling "deserialize(...)". Ofcourse, I could
    write a "deserialize()" function for each base table (e.g.
    "deserialize_base1(...)") but I wish it to perform it's action on only rows
    that will actually be part of the result set; thus I want the WHERE-clause
    to apply to the function's seq scan álso. When provided, I whish to
    incorporated the user's WHERE-clause as well; this is done by the PostgreSQL
    RULE system...

    Alternatively, the VIEW could be defined by the following query:
    SELECT base_t.*, (deserialize('MODEL1', base_t) as temp(fieldX some_type,
    field_Y some_type)).* FROM "BASETABLE" base_t WHERE base_t.model = 'MODEL1';

    This approach does not work either: deserialize(...) will return its set of
    fields as ONE field (comma separated, circumfixed by brackets); expressions
    within a SELECT-list seem to be only allowed to result in ONE column, except
    from the * shorthand...

    ** So, the question is: how can i feed my "deserialize()" function with a
    record subset (e.g. filter by a WHERE-clause) of the BASETABLE, while still
    returning a record?!? **

    I tried the following approach also:
    SELECT (SELECT fieldX FROM deserialize(base_t) deserialized(fieldX
    some_type, fieldY some_type)) "fieldX", (SELECT fieldY FROM
    deserialize(base_t) deserialized(fieldX some_type, fieldY some_type))
    "fieldY" FROM "BASETABLE" table_t WHERE model= 'MODEL1';

    Which infact worked, but caused the function to get invoked TWICE FOR EACH
    ROW (even when marked IMMUTABLE STRICT; or did I have to flush cached query
    plans in psql?).

    Another approach would be to put all key/value pairs into a separate table
    (as one would do when implementing a EAV-model within a RDBMS) which is then
    to be joined (and joined again... and possibly again (!); in case of
    MULTIPLE additional rows -- depending on the definition of the VIEW) onto
    the BASETABLE, rather than to deserialize from XML which is stored within
    the same record... How does this approach then actually translate in terms
    of table scans? Will they be limited by the filter on the BASETABLE, as the
    available values to join on will be limited? At any rate: this approach will
    be more difficult to implement / maintain in case of EDITABLE VIEWS
    (inserts, update, delete)...

    Hope any of you has some useful thoughts on this... It appears to me
    updating the additional (virtual) fields in the BASETABLE is much easier:
    the "serialize()"-function can be fed by a list of key/value pairs,
    producing some XML that can be stored in the xml field of "serialized_data",
    part of this same base table...
    All this needs to be implemented fully in the database back-end; client
    application will not know they are talking to VIEWS rather than tables...
    Thus: the hosted database must simulate to provide various tables, whereas
    these are in fact stored in a limited number of base tables.

    Thanks in advance, you guys out there!


    Rob

    2009/9/26 Merlin Moncure <mmoncure@gmail.com>
    On Fri, Sep 25, 2009 at 2:40 PM, Rob Marjot wrote:
    Still no luck... To clarify a bit, take this example:
    CREATE OR REPLACE FUNCTION transpose()
    RETURNS record AS
    $BODY$ DECLARE
    output RECORD;
    BEGIN
    SELECT * INTO output FROM (VALUES(1,2)) as tbl(first,second);
    RETURN output;
    END;$BODY$
    LANGUAGE 'plpgsql' STABLE
    COST 100;
    Now, I expect to have 2 columns; named "first" and "second". However, like
    posted before, the flowing query:
    SELECT * FROM deserialize();
    produces only ONE column (in one row, as one would clearly expect from the
    function's defnition):
    deserialize
    -----------
    (1,2)
    (1 row)

    Any thoughts on how to make sure multiple columns are returned; without
    specifying this in the function's prototype return clause?
    In a sense, what you are asking is impossible. Having a function (even
    a C one) return 'record' does not get you out of having to define the
    output columns...either in the function definition with 'out' or as
    part of the calling query. Depending on what you are trying to do,
    this could either matter a little or a lot. If it matters, why don't
    you post some more details of the context of your problem and see if a
    better solution can be found?

    merlin

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
grouppgsql-general @
categoriespostgresql
postedSep 25, '09 at 4:35p
activeSep 26, '09 at 9:30p
posts7
users5
websitepostgresql.org
irc#postgresql

People

Translate

site design / logo © 2022 Grokbase