FAQ
Hi,

I would like to propose support for per-column generic option, which is
defined in the SQL/MED standard. In 9.0 release, support for foreign
tables and per-table generic option have been added, but support for
per-column generic option hasn't.

Please examine the description below and attached patch
per_column_option_v1.patch. Any comments or questions are welcome.

Possible use cases
~~~~~~~~~~~~~~~~~~
Purpose of per-column generic option is passing column-specific settings
to the foreign-data wrapper.

1) For file_fdw, per-column generic option can be used to represent
per-column COPY option FORCE_NOT_NULL with boolean value (currently
file_fdw doesn't support FORCE_NOT_NULL option).

2) For postgresql_fdw (even though it has not been implemented yet),
per-column generic option could be used to represent the name of the
column on the foreign side. It is similar to per-table generic option
such as "nspname" and "relname" for namespace name/relation name,
proposed in the last development cycles. Such option would be named
"attname" after pg_attribute.attname.

Catalog design
~~~~~~~~~~~~~~
This proposal requires changing some catalogs.

1) To store per-column generic options, new attribute attfdwoptions
(text[]) was added at tail of pg_attribute. This is similar to the
generic option of other FDW objects such as FDW, server, user mapping
and foreign table. Existing attribute attoptions is not used for
generic options.

2) To conform the SQL/MED standard, an information_schema view
COLUMN_OPTIONS was added. Also underlying view
_pg_foreign_table_columns was added to show only columns which current
user has any access privilege. This fashion is same as other FDW views.

Syntax design
~~~~~~~~~~~~~
Per-column generic options can be operated via CREATE FOREIGN TABLE
statement and ALTER FOREIGN TABLE statement. Similar to other generic
options, ADD/SET/DROP can be specified for ALTER FOREIGN TABLE.

1) In CREATE FOREIGN TABLE statement, per-column generic options can be
specified in a column definition without operation qualifier such as
SET, ADD and DROP; all options are treated as ADD. Similar to other FDW
objects, multiple options can be specified for one column by separating
option-value pairs with comma.

-- multiple options can be specified for one column at once
CREATE FOREIGN TABLE foo (
c1 int OPTIONS (opt1 'value1'),
c2 text OPTIONS (opt2 'values2', opt3 'value3'),
c3 date OPTIONS (opt4 'value4) NOT NULL
) SERVER server;

To avoid syntax conflict between "OPTIONS (...)" and "DEFAULT b_expr"
(b_expr can end with a token "OPTION"), I placed OPTIONS (...) between
data type and any other column qualifier such as default values and
constraints.

The SQL/MED standard doesn't consider any column qualifier other than
data type, so it defines the syntax simply as below. I think new syntax
conforms the standard...

CREATE FOREIGN TABLE foo (
{ column_name data_type [ OPTIONS ( option 'value' [, ...] ) ] }
[, ... ]
) SERVER server [ OPTIONS (...) ]

Please note that CREATE FOREIGN TABLE shares the columnDef, a syntax
element for a column definition, with CREATE TABLE. I thought that they
should so, and I didn't introduce separated syntax for foreign tables.

2) Similar to other FDW objects' ALTER statement, ALTER FOREIGN TABLE
ALTER COLUMN accepts ADD/SET/DROP operation for each option. DROP
requires only option name.

ALTER FOREIGN TABLE foo ALTER COLUMN c1
OPTIONS (SET opt1 'VALUE1'); -- should be set in advance
ALTER FOREIGN TABLE foo ALTER COLUMN c1
OPTIONS (ADD opt2 'VALUE1', DROP opt1);

Similar to other ALTER FOREIGN TABLE commands, ALTER COLUMN ... OPTIONS
(...) can be contained in the list of ALTER commands.

ALTER FOREIGN TABLE foo
ALTER COLUMN col1 OPTIONS (opt1 'val1'),
ALTER COLUMN col2 SET NOT NULL;

psql support
~~~~~~~~~~~~
1) psql should support describing per-column generic options, so \dec
command was added. If the form \dec+ is used, generic options are also
displayed. Output sample is:

postgres=# \dec csv_branches
List of foreign table columns
Schema | Table | Column
--------+--------------+----------
public | csv_branches | bid
public | csv_branches | bbalance
public | csv_branches | filler
(3 rows)

postgres=# \dec+ csv_branches
List of foreign table columns
Schema | Table | Column | Options
--------+--------------+----------+------------------------
public | csv_branches | bid | {force_not_null=false}
public | csv_branches | bbalance | {force_not_null=true}
public | csv_branches | filler |
(3 rows)

Here I found an inconsistency about privilege to see generic options
(not only column but also FDW and server et al). The
information_schema.*_options only shows options which are associated to
objects that current user can access, but \de*+ doesn't have such
restriction. \de* commands should be fixed to hide forbidden objects?

2) psql can support tab-completion CREATE/ALTER FOREIGN TABLE statement
about OPTIONS, but the patch doesn't include this feature.

pg_dump support
~~~~~~~~~~~~~~~
Sorry, I overlooked this issue till writing this post... I'm going to
work on this and post revised patch soon. Please examine other parts first.

Documents
~~~~~~~~~
1) Is "generic options" proper term to mean FDW-specific option
associated to a FDW object? It's used in the SQL/MED standard, but
seems not popular... "FDW option" would be better than "generic option"?

Regards,
--
Shigeru Hanada

Search Discussions

  • Robert Haas at Jun 14, 2011 at 12:20 pm
    I haven't looked at the patch yet, but here are a few comments on the
    design, which overall looks good.

    2011/6/14 Shigeru Hanada <shigeru.hanada@gmail.com>:
    1) psql should support describing per-column generic options, so \dec
    command was added.  If the form \dec+ is used, generic options are also
    displayed.  Output sample is:
    I would not add a new backslash command for this - it's unlikely to be
    useful to see this information across all tables. It would be more
    helpful to somehow (not sure of the details) incorporate this into the
    output of running \d on a foreign table.
    Here I found an inconsistency about privilege to see generic options
    (not only column but also FDW and server et al).  The
    information_schema.*_options only shows options which are associated to
    objects that current user can access, but \de*+ doesn't have such
    restriction.  \de* commands should be fixed to hide forbidden objects?
    It's less important whether \de* is consistent with information_schema
    in this regard than it is whether it is consistent with other psql
    backslash commands, e.g. \dv or \db or \dC. AFAIK those commands do
    not filter by privilege.
    1) Is "generic options" proper term to mean FDW-specific option
    associated to a FDW object?  It's used in the SQL/MED standard, but
    seems not popular...  "FDW option" would be better than "generic option"?
    I think FDW option is much clearer.

    --
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
  • Shigeru Hanada at Jun 15, 2011 at 8:57 am

    (2011/06/14 21:20), Robert Haas wrote:
    I haven't looked at the patch yet, but here are a few comments on the
    design, which overall looks good.
    Thanks for the review. Please find attached a revised patch.

    In addition to responding to your comments, I also added pg_dump
    support. Now pg_dump dumps per-column generic options with ALTER
    FOREIGN TABLE ALTER COLUMN statement just after CREATE FOREIGN TABLE.
    Once I've though to dump them in each column definition of a CREATE
    FOREIGN TABLE statement, but that seems to makes the statement too complex.
    2011/6/14 Shigeru Hanada<shigeru.hanada@gmail.com>:
    1) psql should support describing per-column generic options, so \dec
    command was added. If the form \dec+ is used, generic options are also
    displayed. Output sample is:
    I would not add a new backslash command for this - it's unlikely to be
    useful to see this information across all tables. It would be more
    helpful to somehow (not sure of the details) incorporate this into the
    output of running \d on a foreign table.
    Hm, belatedly I noticed that relation-kind-specific column are added
    preceding to verbose-only columns such as expression for indexes and
    column values for sequences. It seems suitable place to show per-column
    generic options. Please see attached "desc_results.txt" as sample.

    I also noticed that relation-kind-specific information are not mentioned
    in any document (at least in the section of psql[1]), even about
    existing ones such as sequence values and index definition. I also
    added short brief of them to psql document.

    BTW, while working around \d command, I noticed that we can avoid
    variable width (# of columns) query result, which is used to fetch
    column information, with using NULL as placeholder (and it has already
    been used partially). I think that it would enhance maintainability
    little, so I've separated this fix to another patch
    avoid_variable_width_result.patch. The main patch
    per_column_option_v2.patch assumes that this fix has been applied.

    [1] http://developer.postgresql.org/pgdocs/postgres/app-psql.html
    Here I found an inconsistency about privilege to see generic options
    (not only column but also FDW and server et al). The
    information_schema.*_options only shows options which are associated to
    objects that current user can access, but \de*+ doesn't have such
    restriction. \de* commands should be fixed to hide forbidden objects?
    It's less important whether \de* is consistent with information_schema
    in this regard than it is whether it is consistent with other psql
    backslash commands, e.g. \dv or \db or \dC. AFAIK those commands do
    not filter by privilege.
    Agreed, I'll leave \de* to show results unconditionally.
    1) Is "generic options" proper term to mean FDW-specific option
    associated to a FDW object? It's used in the SQL/MED standard, but
    seems not popular... "FDW option" would be better than "generic option"?
    I think FDW option is much clearer.
    So do I, but I didn't touch them because "generic option" appears in
    many documents, source files including comments and psql's \d* output.
    Most of them have been there since 8.4. Is it acceptable to change them
    to "FDW option", at least for only documents?

    OTOH, psql's \d* commands use "Options" as column header of FDW options
    and reloptions. I also left them because I thought that this would not
    cause misunderstanding.

    Regards,
    --
    Shigeru Hanada
  • Kohei KaiGai at Jun 26, 2011 at 9:34 am
    I checked your patch.

    The backend portion seems to me OK, but I have several questions/comments.

    * This patch should be rebased.
    It conflicts with the latest bin/psql/describe.c and
    include/catalog/catversion.h.
    IIRC, we should not touch catversion.h in submission stage. (It might
    be a task of
    committer when a patch get upstreamed.)

    * It might be an option to extend attreloptions, instead of the new
    attfdwoptions.
    Although I didn't track the discussion when pg_foreign_table catalog
    that provides
    relation level fdw-options, was it impossible or unreasonable to extend existing
    design of reloptions/attoptions?
    Right now, it accepts only hard-wired options listed at reloptions.c.
    But, it seems
    to me worthwhile, if it could accept options validated by loadable modules.

    * pg_dump shall die when we run it for older postgresql version.

    This patch does not modify queries to older postgresql version at
    getTableAttrs().
    In the result, this index shall be set by -1.
    + i_attfdwoptions = PQfnumber(res, "attfdwoptions");

    Then, PGgetvalue() returns NULL for unranged column number, and strdup()
    shall cause segmentation fault.
    + tbinfo->attfdwoptions[j] = strdup(PQgetvalue(res, j,
    i_attfdwoptions));

    In fact, I tried to run the patched pg_dump towards v9.0.2
    [kaigai@vmlinux pg_dump]$ ./pg_dump postgres
    pg_dump: column number -1 is out of range 0..14
    Segmentation fault

    My recommendation is to append "NULL as attfdwoptions" on the queries to
    older versions. It eventually makes PGgetvalue() to return an empty string,
    then strdup() does not cause a problem.

    Thanks,

    2011年6月15日10:57 Shigeru Hanada (2011/06/14 21:20), Robert Haas wrote:
    I haven't looked at the patch yet, but here are a few comments on the
    design, which overall looks good.
    Thanks for the review. Please find attached a revised patch.

    In addition to responding to your comments, I also added pg_dump
    support. Now pg_dump dumps per-column generic options with ALTER
    FOREIGN TABLE ALTER COLUMN statement just after CREATE FOREIGN TABLE.
    Once I've though to dump them in each column definition of a CREATE
    FOREIGN TABLE statement, but that seems to makes the statement too complex.
    2011/6/14 Shigeru Hanada<shigeru.hanada@gmail.com>:
    1) psql should support describing per-column generic options, so \dec
    command was added. If the form \dec+ is used, generic options are also
    displayed. Output sample is:
    I would not add a new backslash command for this - it's unlikely to be
    useful to see this information across all tables. It would be more
    helpful to somehow (not sure of the details) incorporate this into the
    output of running \d on a foreign table.
    Hm, belatedly I noticed that relation-kind-specific column are added
    preceding to verbose-only columns such as expression for indexes and
    column values for sequences. It seems suitable place to show per-column
    generic options. Please see attached "desc_results.txt" as sample.

    I also noticed that relation-kind-specific information are not mentioned
    in any document (at least in the section of psql[1]), even about
    existing ones such as sequence values and index definition. I also
    added short brief of them to psql document.

    BTW, while working around \d command, I noticed that we can avoid
    variable width (# of columns) query result, which is used to fetch
    column information, with using NULL as placeholder (and it has already
    been used partially). I think that it would enhance maintainability
    little, so I've separated this fix to another patch
    avoid_variable_width_result.patch. The main patch
    per_column_option_v2.patch assumes that this fix has been applied.

    [1] http://developer.postgresql.org/pgdocs/postgres/app-psql.html
    Here I found an inconsistency about privilege to see generic options
    (not only column but also FDW and server et al). The
    information_schema.*_options only shows options which are associated to
    objects that current user can access, but \de*+ doesn't have such
    restriction. \de* commands should be fixed to hide forbidden objects?
    It's less important whether \de* is consistent with information_schema
    in this regard than it is whether it is consistent with other psql
    backslash commands, e.g. \dv or \db or \dC. AFAIK those commands do
    not filter by privilege.
    Agreed, I'll leave \de* to show results unconditionally.
    1) Is "generic options" proper term to mean FDW-specific option
    associated to a FDW object? It's used in the SQL/MED standard, but
    seems not popular... "FDW option" would be better than "generic option"?
    I think FDW option is much clearer.
    So do I, but I didn't touch them because "generic option" appears in
    many documents, source files including comments and psql's \d* output.
    Most of them have been there since 8.4. Is it acceptable to change them
    to "FDW option", at least for only documents?

    OTOH, psql's \d* commands use "Options" as column header of FDW options
    and reloptions. I also left them because I thought that this would not
    cause misunderstanding.

    Regards,
    --
    Shigeru Hanada



    --
    Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
    To make changes to your subscription:
    http://www.postgresql.org/mailpref/pgsql-hackers


    --
    KaiGai Kohei <kaigai@kaigai.gr.jp>
  • Shigeru Hanada at Jun 27, 2011 at 2:48 pm

    (2011/06/26 18:34), Kohei KaiGai wrote:
    I checked your patch.
    Thanks for the review! Please find attached a revised patch.
    The backend portion seems to me OK, but I have several questions/comments.

    * This patch should be rebased.
    It conflicts with the latest bin/psql/describe.c and
    include/catalog/catversion.h.
    IIRC, we should not touch catversion.h in submission stage. (It might
    be a task of
    committer when a patch get upstreamed.)
    I've rebased against current HEAD, and reverted catversion.h.
    * It might be an option to extend attreloptions, instead of the new
    attfdwoptions.
    Although I didn't track the discussion when pg_foreign_table catalog
    that provides
    relation level fdw-options, was it impossible or unreasonable to extend existing
    design of reloptions/attoptions?
    Right now, it accepts only hard-wired options listed at reloptions.c.
    But, it seems
    to me worthwhile, if it could accept options validated by loadable modules.
    IIRC someone has objected against storing FDW options in
    reloptions/attoptions, but I couldn't find such post. I'll follow the
    discussion again.

    IMHO, though at present I don't have clear proof, separating FDW options
    from access method options seems better than merging them, but I should
    learn more about AM mechanism to clarify this issue. Please check other
    issues first.
    * pg_dump shall die when we run it for older postgresql version.

    This patch does not modify queries to older postgresql version at
    getTableAttrs().
    In the result, this index shall be set by -1.
    + i_attfdwoptions = PQfnumber(res, "attfdwoptions");

    Then, PGgetvalue() returns NULL for unranged column number, and strdup()
    shall cause segmentation fault.
    + tbinfo->attfdwoptions[j] = strdup(PQgetvalue(res, j,
    i_attfdwoptions));

    In fact, I tried to run the patched pg_dump towards v9.0.2
    [kaigai@vmlinux pg_dump]$ ./pg_dump postgres
    pg_dump: column number -1 is out of range 0..14
    Segmentation fault

    My recommendation is to append "NULL as attfdwoptions" on the queries to
    older versions. It eventually makes PGgetvalue() to return an empty string,
    then strdup() does not cause a problem.
    Fixed in the way you've recommended, and tested against 8.4. I should
    have noticed that same technique is used in some other places...

    BTW, I also have found an unnecessary FIXME comment and removed it.
    Please see the line 2845 of src/backend/catalog/heap.c
    (InsertPgAttributeTuple) for the correction.

    Regards,
    --
    Shigeru Hanada
  • Robert Haas at Jun 27, 2011 at 6:05 pm

    2011/6/27 Shigeru Hanada <shigeru.hanada@gmail.com>:
    * It might be an option to extend attreloptions, instead of the new
    attfdwoptions.
    Although I didn't track the discussion when pg_foreign_table catalog
    that provides
    relation level fdw-options, was it impossible or unreasonable to extend existing
    design of reloptions/attoptions?
    Right now, it accepts only hard-wired options listed at reloptions.c.
    But, it seems
    to me worthwhile, if it could accept options validated by loadable modules.
    IIRC someone has objected against storing FDW options in
    reloptions/attoptions, but I couldn't find such post.  I'll follow the
    discussion again.
    I think they should definitely be separate.

    --
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
  • Kohei KaiGai at Jul 3, 2011 at 9:51 am
    Hanada-san,

    I checked the per-column generic option patch.
    Right now, I have nothing to comment on anymore.
    So, it should be reviewed by committers.

    Thanks,

    2011年6月27日16:47 Shigeru Hanada (2011/06/26 18:34), Kohei KaiGai wrote:
    I checked your patch.
    Thanks for the review! Please find attached a revised patch.
    The backend portion seems to me OK, but I have several questions/comments.

    * This patch should be rebased.
    It conflicts with the latest bin/psql/describe.c and
    include/catalog/catversion.h.
    IIRC, we should not touch catversion.h in submission stage. (It might
    be a task of
    committer when a patch get upstreamed.)
    I've rebased against current HEAD, and reverted catversion.h.
    * It might be an option to extend attreloptions, instead of the new
    attfdwoptions.
    Although I didn't track the discussion when pg_foreign_table catalog
    that provides
    relation level fdw-options, was it impossible or unreasonable to extend existing
    design of reloptions/attoptions?
    Right now, it accepts only hard-wired options listed at reloptions.c.
    But, it seems
    to me worthwhile, if it could accept options validated by loadable modules.
    IIRC someone has objected against storing FDW options in
    reloptions/attoptions, but I couldn't find such post. I'll follow the
    discussion again.

    IMHO, though at present I don't have clear proof, separating FDW options
    from access method options seems better than merging them, but I should
    learn more about AM mechanism to clarify this issue. Please check other
    issues first.
    * pg_dump shall die when we run it for older postgresql version.

    This patch does not modify queries to older postgresql version at
    getTableAttrs().
    In the result, this index shall be set by -1.
    + i_attfdwoptions = PQfnumber(res, "attfdwoptions");

    Then, PGgetvalue() returns NULL for unranged column number, and strdup()
    shall cause segmentation fault.
    + tbinfo->attfdwoptions[j] = strdup(PQgetvalue(res, j,
    i_attfdwoptions));

    In fact, I tried to run the patched pg_dump towards v9.0.2
    [kaigai@vmlinux pg_dump]$ ./pg_dump postgres
    pg_dump: column number -1 is out of range 0..14
    Segmentation fault

    My recommendation is to append "NULL as attfdwoptions" on the queries to
    older versions. It eventually makes PGgetvalue() to return an empty string,
    then strdup() does not cause a problem.
    Fixed in the way you've recommended, and tested against 8.4. I should
    have noticed that same technique is used in some other places...

    BTW, I also have found an unnecessary FIXME comment and removed it.
    Please see the line 2845 of src/backend/catalog/heap.c
    (InsertPgAttributeTuple) for the correction.

    Regards,
    --
    Shigeru Hanada


    --
    KaiGai Kohei <kaigai@kaigai.gr.jp>
  • Shigeru Hanada at Jul 4, 2011 at 1:16 am

    (2011/07/03 18:50), Kohei KaiGai wrote:
    I checked the per-column generic option patch.
    Right now, I have nothing to comment on anymore.
    So, it should be reviewed by committers.
    Thanks for the review!.

    Regards,
    --
    Shigeru Hanada
  • Shigeru Hanada at Jul 4, 2011 at 12:02 pm

    (2011/07/04 10:17), Shigeru Hanada wrote:
    (2011/07/03 18:50), Kohei KaiGai wrote:
    I checked the per-column generic option patch.
    Right now, I have nothing to comment on anymore.
    So, it should be reviewed by committers.
    Thanks for the review!.
    I would like to propose adding force_not_null support to file_fdw, as
    the first use case of per-column FDW option. Attached patch, which
    assumes that per_column_option_v3.patch has been applied, implements
    force_not_null option as per-column FDW option.

    Overview
    ========
    This option is originally supported by COPY FROM command, so I think
    it's reasonable to support it in file_fdw too. It would provides more
    flexible parsing capability. In fact, this option has been supported
    by the internal routines which are shared with COPY FROM, but currently
    we don't have any way to specify it.

    Difference between COPY
    =======================
    For COPY FROM, FORCE_NOT_NULL is specified as a list of column names
    ('*' is not allowed). For file_fdw, per-table FDW option can be used
    like other options, but then file_fdw needs parser which can identify
    valid column. I think it's too much work, so I prefer per-column FDW
    option which accepts boolean value string. The value 'true' means that
    the column doesn't be matched against NULL string, same as ones listed
    for COPY FROM.

    Example:

    If you have created a foreign table with:

    CREATE FOREIGN TABLE foo (
    c1 int OPTIONS (force_not_null 'false'),
    c2 text OPTIONS (force_not_null 'true')
    ) SERVER file OPTIONS (file '/path/to/file', format 'csv', null '');

    values which are read from the file for c1 are matched against
    null-representation-string '', but values for c2 are NOT. Empty strings
    for c2 are stored as empty strings; they don't treated as NULL value.

    Regards,
    --
    Shigeru Hanada
  • Alvaro Herrera at Jul 10, 2011 at 3:49 am

    Shigeru Hanada escribió:
    (2011/06/26 18:34), Kohei KaiGai wrote:
    I checked your patch.
    Thanks for the review! Please find attached a revised patch.
    Err, \dec seems to have a line in describe.h but nowhere else; are you
    going to introduce that command?

    The new ALTER TABLE grammar seems a bit strange -- ADD, SET, DROP. Is
    this defined by the SQL/MED standard? It seems at odds with our
    handling of attoptions (and the pg_dump query seems rather bizarre in
    comparison to the handling of attoptions there; why do we need
    pg_options_to_table when attoptions do not?).

    What's the reason for this:

    @@ -3681,7 +3691,7 @@ AlterFdwStmt: ALTER FOREIGN DATA_P WRAPPER name opt_fdw_options alter_generic_op
    /* Options definition for CREATE FDW, SERVER and USER MAPPING */
    create_generic_options:
    OPTIONS '(' generic_option_list ')' { $$ = $3; }
    - | /*EMPTY*/ { $$ = NIL; }
    + | /*EMPTY*/ { $$ = NIL }
    ;


    I think this should be removed:

    + foreach (clist, column->fdwoptions)
    + {
    + DefElem *option = (DefElem *) lfirst(clist);
    + elog(DEBUG3, "%s=%s", option->defname, strVal(option->arg));
    + }


    As for whether attfdwoptions needs to be separate from attoptions, I am
    not sure that they really need to be; but if they are, I think we should
    use a different name than attfdwoptions, because we might want to use
    them for something else. Maybe attgenoptions (and note that the alter
    table code already calls them "generic options" so I'm not sure why the
    rest of the code is calling them FDW options) ... but then I really
    start to question whether they need to be separate from attoptions.

    Currently, attoptions are used to store n_distinct and
    n_distinct_inherited. Do those options make sense for foreign tables?
    If they do make sense for some types of foreign servers, maybe we should
    decree that they need to be specifically declared as such by the FDW --
    that is, the FDW needs to provide its own attribute_reloptions routine
    (or equivalent therein) for validation that includes those core options.
    There is no saying that, even if all existing core options such as
    n_distinct apply to a FDW now, more core options that we might invent in
    the future are going to apply as well.

    In short: in my opinion, attoptions and attfdwoptions need to be one
    thing and the same.

    --
    Álvaro Herrera <alvherre@commandprompt.com>
    The PostgreSQL Company - Command Prompt, Inc.
    PostgreSQL Replication, Consulting, Custom Development, 24x7 support
  • Robert Haas at Jul 11, 2011 at 1:22 am

    On Jul 9, 2011, at 10:49 PM, Alvaro Herrera wrote:
    In short: in my opinion, attoptions and attfdwoptions need to be one
    thing and the same.
    I feel the opposite. In particular, what happens when a future release of PostgreSQL adds an attoption that happens to have the same name as somebody's per-column FDW option? Something breaks, that's what...

    Another point: We don't commingle these concepts at the table level. It doesn't make sense to have table reloptions separate from table FDW options but then go and make the opposite decision at the column level.

    ...Robert
  • Alvaro Herrera at Jul 11, 2011 at 4:11 am

    Excerpts from Robert Haas's message of dom jul 10 21:21:19 -0400 2011:
    On Jul 9, 2011, at 10:49 PM, Alvaro Herrera wrote:
    In short: in my opinion, attoptions and attfdwoptions need to be one
    thing and the same.
    I feel the opposite. In particular, what happens when a future release of PostgreSQL adds an attoption that happens to have the same name as somebody's per-column FDW option? Something breaks, that's what...
    Hmm, if you follow my proposal above, that wouldn't actually happen,
    because the core options do not apply to foreign columns.
    Another point: We don't commingle these concepts at the table level.
    It doesn't make sense to have table reloptions separate from table FDW
    options but then go and make the opposite decision at the column
    level.
    That's a point. I remember feeling uneasy at the fact that we were
    doing things like that, at the time, yes :-)

    --
    Álvaro Herrera <alvherre@commandprompt.com>
    The PostgreSQL Company - Command Prompt, Inc.
    PostgreSQL Replication, Consulting, Custom Development, 24x7 support
  • Robert Haas at Jul 18, 2011 at 7:09 pm

    On Mon, Jul 11, 2011 at 12:11 AM, Alvaro Herrera wrote:
    Excerpts from Robert Haas's message of dom jul 10 21:21:19 -0400 2011:
    On Jul 9, 2011, at 10:49 PM, Alvaro Herrera wrote:
    In short: in my opinion, attoptions and attfdwoptions need to be one
    thing and the same.
    I feel the opposite. In particular, what happens when a future release of PostgreSQL adds an attoption that happens to have the same name as somebody's per-column FDW option?  Something breaks, that's what...
    Hmm, if you follow my proposal above, that wouldn't actually happen,
    because the core options do not apply to foreign columns.
    Well, not at the moment. But I think it's altogether likely that we
    might want them to in the future. The foreign data wrapper support we
    have right now is basically a stub until we get around to improving
    it, so we don't (for example) analyze foreign tables, which means that
    n_distinct is not relevant. But that's something we presumably want
    to change at some point. Eventually, I would anticipate that we'll
    have quite a few more column options and most will apply to both
    tables and foreign tables, so I'm not keen to bake in something that
    makes that potentially problematic. I think we should understand
    attoptions as things that modify the behavior of PostgreSQL, while
    attfdw/genoptions are there solely for the foreign data wrapper to
    use. An extra nullable field in pg_attribute isn't costing us
    anything non-trivial, and the syntactic and definitional clarity seems
    entirely worth it.

    --
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
  • Tom Lane at Jul 18, 2011 at 7:26 pm

    Robert Haas writes:
    ... I think we should understand
    attoptions as things that modify the behavior of PostgreSQL, while
    attfdw/genoptions are there solely for the foreign data wrapper to
    use. An extra nullable field in pg_attribute isn't costing us
    anything non-trivial, and the syntactic and definitional clarity seems
    entirely worth it.
    +1. We paid the price of allowing nullable columns in pg_attribute long
    ago. One more isn't going to cost anything, especially since virtually
    every row in that catalog already contains at least one null.

    I'm not too thrilled with the terminology of "generic options", though.
    I think this should be understood as specifically "FDW-owned options".
    If the column isn't reserved for the use of the FDW, then you get right
    back into the problem of who's allowed to use it and what if there's a
    collision.

    regards, tom lane
  • Robert Haas at Jul 18, 2011 at 7:31 pm

    On Mon, Jul 18, 2011 at 3:26 PM, Tom Lane wrote:
    Robert Haas <robertmhaas@gmail.com> writes:
    ... I think we should understand
    attoptions as things that modify the behavior of PostgreSQL, while
    attfdw/genoptions are there solely for the foreign data wrapper to
    use.  An extra nullable field in pg_attribute isn't costing us
    anything non-trivial, and the syntactic and definitional clarity seems
    entirely worth it.
    +1.  We paid the price of allowing nullable columns in pg_attribute long
    ago.  One more isn't going to cost anything, especially since virtually
    every row in that catalog already contains at least one null.

    I'm not too thrilled with the terminology of "generic options", though.
    I think this should be understood as specifically "FDW-owned options".
    If the column isn't reserved for the use of the FDW, then you get right
    back into the problem of who's allowed to use it and what if there's a
    collision.
    I concur. The SQL/MED standard is welcome to refer to them as generic
    options, but at least FTTB they are going to be entirely for FDWs in
    our implementation, and naming them that way is therefore a Good
    Thing. If the SQL committee decides to use them in other places and
    we choose to support that in some future release for some
    as-yet-unclear purpose, well, it won't be the first time we've
    modified the system catalog schema.

    --
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
  • Shigeru Hanada at Jul 12, 2011 at 5:31 am

    (2011/07/11 10:21), Robert Haas wrote:
    On Jul 9, 2011, at 10:49 PM, Alvaro Herrerawrote:
    In short: in my opinion, attoptions and attfdwoptions need to be one
    thing and the same.
    I feel the opposite. In particular, what happens when a future release
    of PostgreSQL adds an attoption that happens to have the same name as
    somebody's per-column FDW option? Something breaks, that's what...

    Another point: We don't commingle these concepts at the table level.
    It doesn't make sense to have table reloptions separate from table FDW
    options but then go and make the opposite decision at the column
    level.
    I'm afraid that I've misunderstood the discussion. Do you mean that
    per-table options should be stored in reloptions, but per-column should
    be separated from attoptions? (I think I've misread...)

    Could you tell me little more detail why it doesn't make sense to have
    table reloptions separate from table FDW options?

    Regards,
    --
    Shigeru Hanada
  • Robert Haas at Jul 12, 2011 at 12:19 pm

    On Jul 12, 2011, at 12:31 AM, Shigeru Hanada wrote:
    (2011/07/11 10:21), Robert Haas wrote:
    On Jul 9, 2011, at 10:49 PM, Alvaro Herrerawrote:
    In short: in my opinion, attoptions and attfdwoptions need to be one
    thing and the same.
    I feel the opposite. In particular, what happens when a future release
    of PostgreSQL adds an attoption that happens to have the same name as
    somebody's per-column FDW option? Something breaks, that's what...

    Another point: We don't commingle these concepts at the table level.
    It doesn't make sense to have table reloptions separate from table FDW
    options but then go and make the opposite decision at the column
    level.
    I'm afraid that I've misunderstood the discussion. Do you mean that
    per-table options should be stored in reloptions, but per-column should
    be separated from attoptions? (I think I've misread...)
    No, I was arguing that they should both be separate.

    ...Robert
  • Shigeru Hanada at Jul 12, 2011 at 12:24 pm

    (2011/07/12 21:19), Robert Haas wrote:
    On Jul 12, 2011, at 12:31 AM, Shigeru Hanadawrote:
    I'm afraid that I've misunderstood the discussion. Do you mean that
    per-table options should be stored in reloptions, but per-column should
    be separated from attoptions? (I think I've misread...)
    No, I was arguing that they should both be separate.
    Thanks, I'm relieved. :)

    Regards,
    --
    Shigeru Hanada
  • Shigeru Hanada at Jul 11, 2011 at 2:12 pm
    Thanks for the review.

    (2011/07/10 12:49), Alvaro Herrera wrote:
    Err, \dec seems to have a line in describe.h but nowhere else; are you
    going to introduce that command?
    \dec command is obsolete, so I removed that line.
    The new ALTER TABLE grammar seems a bit strange -- ADD, SET, DROP. Is
    this defined by the SQL/MED standard?
    Yes, syntax for altering foreign table is defined by the SQL/MED
    standard as below, and <alter generic option> is common to all SQL/MED
    objects:

    <alter foreign table statement> ::=
    ALTER FOREIGN TABLE <table name> <alter foreign table action>

    <alter foreign table action> ::=
    <add basic column definition>
    <alter basic column definition>
    <drop basic column definition>
    <alter generic options>
    <alter generic options> ::=
    OPTIONS <left paren> <alter generic option list> <right paren>

    <alter generic option list> ::=
    <alter generic option> [ { <comma> <alter generic option> }... ]

    <alter generic option> ::= [ <alter operation> ] <option name> [ <option
    value> ]
    <alter operation> ::=
    ADD
    SET
    DROP
    <generic option> ::= <option name> [ <option value> ]

    <option value> ::= <character string literal>

    FYI, default for <alter operation> is ADD.
    It seems at odds with our
    handling of attoptions (and the pg_dump query seems rather bizarre in
    comparison to the handling of attoptions there; why do we need
    pg_options_to_table when attoptions do not?).
    That's because of the syntax difference between FDW options and AM
    options. AM options should be dumped as "key=value, key=value, ...",
    but FDW options should be dumped as "key 'value', key 'value', ...". The
    pg_options_to_table() is used to construct list in the latter form.
    The way used to handle per-column options in my patch is same as the way
    used for other existing FDW objects, such as FDW, server, and user mapping.
    What's the reason for this:

    @@ -3681,7 +3691,7 @@ AlterFdwStmt: ALTER FOREIGN DATA_P WRAPPER name opt_fdw_options alter_generic_op
    /* Options definition for CREATE FDW, SERVER and USER MAPPING */
    create_generic_options:
    OPTIONS '(' generic_option_list ')' { $$ = $3; }
    - | /*EMPTY*/ { $$ = NIL; }
    + | /*EMPTY*/ { $$ = NIL }
    ;
    Reverted this unintended change.
    I think this should be removed:

    + foreach (clist, column->fdwoptions)
    + {
    + DefElem *option = (DefElem *) lfirst(clist);
    + elog(DEBUG3, "%s=%s", option->defname, strVal(option->arg));
    + }
    Removed, the codes were used only for debug.
    As for whether attfdwoptions needs to be separate from attoptions, I am
    not sure that they really need to be; but if they are, I think we should
    use a different name than attfdwoptions, because we might want to use
    them for something else. Maybe attgenoptions (and note that the alter
    table code already calls them "generic options" so I'm not sure why the
    rest of the code is calling them FDW options) ... but then I really
    start to question whether they need to be separate from attoptions.
    For now I got +1 for attfdwoptions and +1 for attgenoptions for the
    naming. I prefer attgenoptions because it follows SQL/MED standard, but
    I don't have strong feeling for naming, so I've inspected usage in the
    current HEAD... Hm, "gen.*option" appears twice much as "fdw.*option"
    in the source code with case insensitive grep, and most of "fdw.*option"
    were hit "fdwoptions", per-wrapper options. ISTM that "generic option"
    would be better to mean options used by FDW for consistency, so I
    unified the wording to "generic option" from "fdw option". I hope that
    the name "generic option" doesn't confuse users who aren't familiar to
    SQL/MED standard.
    Currently, attoptions are used to store n_distinct and
    n_distinct_inherited. Do those options make sense for foreign tables?
    If they do make sense for some types of foreign servers, maybe we should
    decree that they need to be specifically declared as such by the FDW --
    that is, the FDW needs to provide its own attribute_reloptions routine
    (or equivalent therein) for validation that includes those core options.
    There is no saying that, even if all existing core options such as
    n_distinct apply to a FDW now, more core options that we might invent in
    the future are going to apply as well.

    In short: in my opinion, attoptions and attfdwoptions need to be one
    thing and the same.
    The n_distinct might make sense for every foreign tables in a sense,
    though syntax to set it is not supported. It would allow users to
    specify not-FDW-specific statistics information to control costs for the
    scan. However each FDW would be able to support such option too. I
    think that reloptions and attoptions should be used by only PG core, and
    FDW should use generic options. So I prefer separated design.

    The attached patch fixes issues other than generic options separation.

    Regards,
    --
    Shigeru Hanada
  • Peter Eisentraut at Jul 11, 2011 at 3:44 pm

    On lör, 2011-07-09 at 23:49 -0400, Alvaro Herrera wrote:
    The new ALTER TABLE grammar seems a bit strange -- ADD, SET, DROP. Is
    this defined by the SQL/MED standard? It seems at odds with our
    handling of attoptions
    Well, I believe the SQL/MED options were actually implemented first and
    the attoptions afterwards. But it's probably not unwise to keep them
    separate, even though the syntaxes could have been made more similar.
  • Shigeru Hanada at Jul 12, 2011 at 7:11 am

    (2011/07/12 0:44), Peter Eisentraut wrote:
    On lör, 2011-07-09 at 23:49 -0400, Alvaro Herrera wrote:
    The new ALTER TABLE grammar seems a bit strange -- ADD, SET, DROP. Is
    this defined by the SQL/MED standard? It seems at odds with our
    handling of attoptions
    Well, I believe the SQL/MED options were actually implemented first and
    the attoptions afterwards. But it's probably not unwise to keep them
    separate, even though the syntaxes could have been made more similar.
    As you say, syntax for attoptions/reloptions seem to satisfy the
    requirement of SQL/MED; SET for ADD/SET and RESET for DROP.

    But at this time it would break backward compatibility. I think it's
    reasonable to unify the syntax for handling SQL/MED options at every
    level to "OPTIONS (key 'value', ...)".

    Regards,
    --
    Shigeru Hanada
  • Alvaro Herrera at Jul 12, 2011 at 1:56 pm

    Excerpts from Shigeru Hanada's message of mar jul 12 03:11:54 -0400 2011:
    (2011/07/12 0:44), Peter Eisentraut wrote:
    On lör, 2011-07-09 at 23:49 -0400, Alvaro Herrera wrote:
    The new ALTER TABLE grammar seems a bit strange -- ADD, SET, DROP. Is
    this defined by the SQL/MED standard? It seems at odds with our
    handling of attoptions
    Well, I believe the SQL/MED options were actually implemented first and
    the attoptions afterwards. But it's probably not unwise to keep them
    separate, even though the syntaxes could have been made more similar.
    As you say, syntax for attoptions/reloptions seem to satisfy the
    requirement of SQL/MED; SET for ADD/SET and RESET for DROP.
    Speaking of which -- what's the difference between ADD and SET for SQL/MED
    options?

    --
    Álvaro Herrera <alvherre@commandprompt.com>
    The PostgreSQL Company - Command Prompt, Inc.
    PostgreSQL Replication, Consulting, Custom Development, 24x7 support
  • Peter Eisentraut at Jul 12, 2011 at 9:26 pm

    On tis, 2011-07-12 at 09:56 -0400, Alvaro Herrera wrote:
    Speaking of which -- what's the difference between ADD and SET for
    SQL/MED options?
    ADD add to the existing options, SET overwrites all options with what
    you specify.
  • Shigeru Hanada at Jul 12, 2011 at 9:55 pm

    (2011/07/12 22:56), Alvaro Herrera wrote:
    Speaking of which -- what's the difference between ADD and SET for SQL/MED
    options?
    ADD can only add new option; it can't overwrite existing option's value.
    To overwrite existing option's value, you need to use SET instead.

    Regards,
    --
    Shigeru Hanada
  • Josh Berkus at Jul 14, 2011 at 7:17 pm
    All,

    Is the spec for this feature still under discussion? I don't seem to
    see a consensus on this thread.

    --
    Josh Berkus
    PostgreSQL Experts Inc.
    http://pgexperts.com
  • Shigeru Hanada at Jul 15, 2011 at 2:56 am

    (2011/07/15 4:17), Josh Berkus wrote:
    All,

    Is the spec for this feature still under discussion? I don't seem to
    see a consensus on this thread.
    Yeah, a remaining concern is whether generic (FDW) options should be
    separated from existing attoptions or not.

    Indeed, reloptions/attoptions mechanism seems to be also applicable to
    generic options, since both need to store multiple key-value pairs, but
    IMHO generic options should be separated from reloptions/attoptions for
    several reasons:

    1) FDW options are handled by only FDW, but reloptions/attoptions are
    handled by PG core modules such as planner, AM and autovacuum. If we
    can separate them completely, they would be able to share one attribute,
    but I worry that some of reloptions/attoptions make sense for some FDW.
    For instance, n_distinct might be useful to control costs of a foreign
    table scan. Though attoptions can't be set via CREATE/ALTER FOREIGN
    TABLE yet.

    2) In future, newly added option might conflict somebody's FDW option.
    Robert Haas has pointed out this issue some days ago. FDW validator
    would reject unknown options, so every FDW would have to know all of
    reloptions/attoptions to avoid this issue.

    3) It would be difficult to unify syntax to set options from perspective
    of backward compatibility and syntax consistency. Other SQL/MED objects
    have the syntax such as "OPTIONS (key 'value', ...)", but
    reloptions/attoptions have the syntax such as "SET (key = value, ...)".
    Without syntax unification, some tools should care relkind before
    handling attoptions. For instance, pg_dump should choose syntax used to
    dump attoptions. It seems undesired complexity.

    Any comments/objections/questions are welcome.

    Regards,
    --
    Shigeru Hanada

    * 英語 - 自動検出
    * 英語
    * 日本語

    * 英語
    * 日本語

    <javascript:void(0);>
  • David Fetter at Jun 16, 2011 at 11:45 pm

    On Tue, Jun 14, 2011 at 05:56:05PM +0900, Shigeru Hanada wrote:
    Hi,

    I would like to propose support for per-column generic option, which
    is defined in the SQL/MED standard. In 9.0 release, support for
    foreign tables and per-table generic option have been added, but
    support for per-column generic option hasn't.

    Please examine the description below and attached patch
    per_column_option_v1.patch. Any comments or questions are welcome.
    Sorry not to respond sooner.

    First, the per-column generic options are a great thing for us to
    have. :)

    I have an idea I've been using for the next release of DBI-Link that
    has varying levels of data type mapping. In general, these mappings
    would be units of executable code, one in-bound, and one out-bound,
    for each of:

    Universe (everything, default "mapping" is the identity map, i.e. a no-op)
    Database type (e.g. MySQL)
    Instance (e.g. mysql://foo.bar.com:5432)
    Database
    Schema
    Table
    Column

    I didn't include row in the hierarchy because I couldn't think of a
    way to identify rows across DBMSs and stable over time.

    The finest-grain transformation that's been set would be the one
    actually used.

    Here's an example of a non-trivial mapping.

    Database type:
    MySQL
    Foreign data type:
    datetime
    PostgreSQL data type:
    timestamptz
    Transformation direction:
    Import
    Transformation:
    CASE
    WHEN DATA = '0000-00-00 00:00:00'
    THEN NULL
    ELSE DATA
    END

    Here, I'm making the simplifying assumption that there is a bijective
    mapping between data types.

    Is there some way to fit the per-column part of such a mapping into
    this scheme? We'd need to do some dependency tracking in order to be
    able to point to the appropriate code...

    Cheers,
    David.
    --
    David Fetter <david@fetter.org> http://fetter.org/
    Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
    Skype: davidfetter XMPP: david.fetter@gmail.com
    iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

    Remember to vote!
    Consider donating to Postgres: http://www.postgresql.org/about/donate
  • Shigeru Hanada at Jun 17, 2011 at 10:19 am

    (2011/06/17 8:44), David Fetter wrote:
    Sorry not to respond sooner.

    First, the per-column generic options are a great thing for us to
    have. :)
    Thanks for the comments. :-)
    I have an idea I've been using for the next release of DBI-Link that
    has varying levels of data type mapping. In general, these mappings
    would be units of executable code, one in-bound, and one out-bound,
    for each of:

    Universe (everything, default "mapping" is the identity map, i.e. a no-op)
    Database type (e.g. MySQL)
    Instance (e.g. mysql://foo.bar.com:5432)
    Database
    Schema
    Table
    Column
    Some of them seem to be able to be mapped to FDW object, e.g. Database
    to SERVER and Table to FOREIGN TABLE.
    I didn't include row in the hierarchy because I couldn't think of a
    way to identify rows across DBMSs and stable over time.

    The finest-grain transformation that's been set would be the one
    actually used.

    Here's an example of a non-trivial mapping.

    Database type:
    MySQL
    Foreign data type:
    datetime
    PostgreSQL data type:
    timestamptz
    Transformation direction:
    Import
    Transformation:
    CASE
    WHEN DATA = '0000-00-00 00:00:00'
    THEN NULL
    ELSE DATA
    END

    Here, I'm making the simplifying assumption that there is a bijective
    mapping between data types.

    Is there some way to fit the per-column part of such a mapping into
    this scheme? We'd need to do some dependency tracking in order to be
    able to point to the appropriate code...
    IIUC, you are talking about using FDW options as storage of data type
    mapping setting, or mapping definition itself, right? If so, a foreign
    table needs to be created to use per-column FDW options. Does it suit
    to your idea?

    BTW, I couldn't get what you mean by "dependency tracking". You mean
    the dependency between foreign column and local column? It might
    include essence of your idea... Would you explain the detail?

    Regards,
    --
    Shigeru Hanada
  • David Fetter at Jun 17, 2011 at 12:59 pm

    On Fri, Jun 17, 2011 at 07:19:39PM +0900, Shigeru Hanada wrote:
    (2011/06/17 8:44), David Fetter wrote:
    Sorry not to respond sooner.

    First, the per-column generic options are a great thing for us to
    have. :)
    Thanks for the comments. :-)
    I have an idea I've been using for the next release of DBI-Link that
    has varying levels of data type mapping. In general, these mappings
    would be units of executable code, one in-bound, and one out-bound,
    for each of:

    Universe (everything, default "mapping" is the identity map, i.e. a no-op)
    Database type (e.g. MySQL)
    Instance (e.g. mysql://foo.bar.com:5432)
    Database
    Schema
    Table
    Column
    Some of them seem to be able to be mapped to FDW object, e.g. Database
    to SERVER and Table to FOREIGN TABLE.
    Yes, I see there are a few missing. "Universe" doesn't really need
    much of anything, as far as I can tell, except if we wanted to do
    something that affected SQL/MED globally. Is that hierarchy otherwise
    OK? DB2 may have one more level between Instance and Database Type,
    that latter being the province of an individual FDW.
    I didn't include row in the hierarchy because I couldn't think of a
    way to identify rows across DBMSs and stable over time.

    The finest-grain transformation that's been set would be the one
    actually used.

    Here's an example of a non-trivial mapping.

    Database type:
    MySQL
    Foreign data type:
    datetime
    PostgreSQL data type:
    timestamptz
    Transformation direction:
    Import
    Transformation:
    CASE
    WHEN DATA = '0000-00-00 00:00:00'
    THEN NULL
    ELSE DATA
    END

    Here, I'm making the simplifying assumption that there is a bijective
    mapping between data types.

    Is there some way to fit the per-column part of such a mapping into
    this scheme? We'd need to do some dependency tracking in order to be
    able to point to the appropriate code...
    IIUC, you are talking about using FDW options as storage of data
    type mapping setting, or mapping definition itself, right? If so, a
    foreign table needs to be created to use per-column FDW options.
    Does it suit to your idea?
    Yes. The only mildly disturbing thing about how that would work is
    that "magic" key names would actually point to executable code, so
    there would be some kind of non-uniform processing of the options, and
    (possibly quite unlikely) ways to escalate privilege.
    BTW, I couldn't get what you mean by "dependency tracking". You
    mean the dependency between foreign column and local column? It
    might include essence of your idea... Would you explain the detail?
    I think the dependency between the mapping between the foreign column
    and the local one is already handled. On that subject, it's possible
    to make an argument that this mapping might need to be expanded so
    that in general, M foreign columns map to N local ones (distinct M and
    N), but that's a research topic, so let's not worry about it now.

    The dependency tracking I have in mind is of the actual executable
    code. If the inbound mapping has what amounts to a pointer to a
    function, it shouldn't be possible to drop that function without
    CASCADE, and if we're caching such functions, the cache needs to be
    refreshed any time the function changes.

    Cheers,
    David.
    --
    David Fetter <david@fetter.org> http://fetter.org/
    Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
    Skype: davidfetter XMPP: david.fetter@gmail.com
    iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

    Remember to vote!
    Consider donating to Postgres: http://www.postgresql.org/about/donate
  • David Fetter at Jun 27, 2011 at 11:42 pm

    On Fri, Jun 17, 2011 at 05:59:31AM -0700, David Fetter wrote:
    On Fri, Jun 17, 2011 at 07:19:39PM +0900, Shigeru Hanada wrote:

    Here's an example of a non-trivial mapping.

    Database type:
    MySQL
    Foreign data type:
    datetime
    PostgreSQL data type:
    timestamptz
    Transformation direction:
    Import
    Transformation:
    CASE
    WHEN DATA = '0000-00-00 00:00:00'
    THEN NULL
    ELSE DATA
    END

    Here, I'm making the simplifying assumption that there is a bijective
    mapping between data types.
    Any word on this?

    Cheers,
    David.
    --
    David Fetter <david@fetter.org> http://fetter.org/
    Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
    Skype: davidfetter XMPP: david.fetter@gmail.com
    iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

    Remember to vote!
    Consider donating to Postgres: http://www.postgresql.org/about/donate
  • Shigeru Hanada at Jun 28, 2011 at 12:06 pm
    Sorry for the long delay...

    (2011/06/17 21:59), David Fetter wrote:
    On Fri, Jun 17, 2011 at 07:19:39PM +0900, Shigeru Hanada wrote:
    (2011/06/17 8:44), David Fetter wrote:
    Sorry not to respond sooner.

    First, the per-column generic options are a great thing for us to
    have. :)
    Thanks for the comments. :-)
    I have an idea I've been using for the next release of DBI-Link that
    has varying levels of data type mapping. In general, these mappings
    would be units of executable code, one in-bound, and one out-bound,
    for each of:

    Universe (everything, default "mapping" is the identity map, i.e. a no-op)
    Database type (e.g. MySQL)
    Instance (e.g. mysql://foo.bar.com:5432)
    Database
    Schema
    Table
    Column
    Some of them seem to be able to be mapped to FDW object, e.g. Database
    to SERVER and Table to FOREIGN TABLE.
    Yes, I see there are a few missing. "Universe" doesn't really need
    much of anything, as far as I can tell, except if we wanted to do
    something that affected SQL/MED globally. Is that hierarchy otherwise
    OK?
    Yes, maybe some levels in your hierarchy can be mapped to SQL/MED
    objects, and you can store options with them.

    Universe : N/A (I'm not sure but custom GUC might suit for this)
    Database type : FOREIGN DATA WRAPPER
    Instance : N/A
    Database : SERVER
    Schema : N/A
    Table : FOREIGN TABLE
    Column : column of FOREIGN TABLE(WIP)
    DB2 may have one more level between Instance and Database Type,
    that latter being the province of an individual FDW.
    I'm not familiar with DB2, but it would be difficult to map such level
    to one of existing SQL/MED object types.
    I didn't include row in the hierarchy because I couldn't think of a
    way to identify rows across DBMSs and stable over time.

    The finest-grain transformation that's been set would be the one
    actually used.
    Yeah, I think it's generally convenient for users if a FDW allows to
    override settings which were defined for object on upper level.
    For instance, if I'm dealing many files which have same format, and if
    we could set "format" option for file_fdw on the server, all I have to
    do for each foreign table is to specify "filename". I think that that's
    usual use case.
    Here's an example of a non-trivial mapping.

    Database type:
    MySQL
    Foreign data type:
    datetime
    PostgreSQL data type:
    timestamptz
    Transformation direction:
    Import
    Transformation:
    CASE
    WHEN DATA = '0000-00-00 00:00:00'
    THEN NULL
    ELSE DATA
    END

    Here, I'm making the simplifying assumption that there is a bijective
    mapping between data types.

    Is there some way to fit the per-column part of such a mapping into
    this scheme? We'd need to do some dependency tracking in order to be
    able to point to the appropriate code...
    IIUC, you are talking about using FDW options as storage of data
    type mapping setting, or mapping definition itself, right? If so, a
    foreign table needs to be created to use per-column FDW options.
    Does it suit to your idea?
    Yes. The only mildly disturbing thing about how that would work is
    that "magic" key names would actually point to executable code, so
    there would be some kind of non-uniform processing of the options, and
    (possibly quite unlikely) ways to escalate privilege.
    How are you planning to define a mapping for an object other than
    column? ISTM that you need to combine N mappings for such object, N is
    the number of distinct types used under the level, so FDW seems to have
    to cover all kind of transformation. Maybe you need to retrieve options
    from lower level to upper level, until you find one which is suitable
    for the combination of types.
    BTW, I couldn't get what you mean by "dependency tracking". You
    mean the dependency between foreign column and local column? It
    might include essence of your idea... Would you explain the detail?
    I think the dependency between the mapping between the foreign column
    and the local one is already handled. On that subject, it's possible
    to make an argument that this mapping might need to be expanded so
    that in general, M foreign columns map to N local ones (distinct M and
    N), but that's a research topic, so let's not worry about it now.

    The dependency tracking I have in mind is of the actual executable
    code. If the inbound mapping has what amounts to a pointer to a
    function, it shouldn't be possible to drop that function without
    CASCADE, and if we're caching such functions, the cache needs to be
    refreshed any time the function changes.
    Agreed, such dependency would have to be maintained by the system.
    Dependencies from column to FDW (through foreign table and server) have
    been managed with pg_depend. Cache invalidation would be need to be
    implemented by dbi-link.

    Current dependency graph about SQL/MED objects is:

    column -> foreign table ----> server -> FDW
    user mapping _/

    VALIDATOR function might be able to be used to maintain pg_depend
    entries when options are set/changed/dropped via CREATE/ALTER, though
    it's not main purpose of VALIDATOR.

    regards,
    --
    Shigeru Hanada

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
grouppgsql-hackers @
categoriespostgresql
postedJun 14, '11 at 8:56a
activeJul 18, '11 at 7:31p
posts31
users8
websitepostgresql.org...
irc#postgresql

People

Translate

site design / logo © 2022 Grokbase