Hi All.

I just subscribed to RRReviewers (that should be pronounced with a nice
rolling r-r-reviewers, right?)

As part of my getting up to speed, I tried to build and run the tests on the
current master, 073d7cb513f5de44530f4bdbaaa4b5d4cce5f984.

Basically I did:
1) Clone into new dir
2) ./configure --enable-debug --enable-cassert --with-pgport=5499 --prefix=$(realpath ../root)
3) make -j4
4) make -j4 check

Expecting step 4 to report "all tests passed", I was surprised to see
that an index test failed.

I run (Gentoo) Linux x86_64, gcc-4.7.3.

Any ideas what might have happened?

Svenne

  • Kevin Grittner at Jun 18, 2013 at 6:18 pm

    Svenne Krap wrote:

    current master 073d7cb513f5de44530f4bdbaaa4b5d4cce5f984
    I was surprised to see that an index-test failed.
    It works for me.  Could you paste or attach some detail?


    --
    Kevin Grittner
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
  • Svenne Krap at Jun 18, 2013 at 6:37 pm

    On 18-06-2013 20:17, Kevin Grittner wrote:
    I was surprised to see that an index-test failed.
    It works for me. Could you paste or attach some detail?
    Gladly, if you tell me what would be relevant to attach :)

    I am brand new to the postgresql source code and hence have no real idea
    how to catch it..

    Svenne
  • Kevin Grittner at Jun 18, 2013 at 6:48 pm

    Svenne Krap wrote:
    On 18-06-2013 20:17, Kevin Grittner wrote:

    I was surprised to see that an index-test failed.
    It works for me.  Could you paste or attach some detail?
    Gladly, if you tell me what would be relevant to attach :)

    I am brand new to the postgresql source code and hence have no
    real idea how to catch it..
    Apologies; I somehow missed the file attached to your initial post.
    That's the sort of thing I was looking for.

    Having reviewed that, the source code comments indicate it is for
    "character-by-character (not collation order) comparison operators
    for character types".  Perhaps your collation is having an impact
    regardless of that.  Could you show us the values of your settings
    related to locale?

    --
    Kevin Grittner
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
  • Svenne Krap at Jun 18, 2013 at 7:04 pm

    On 18-06-2013 20:48, Kevin Grittner wrote:
    Apologies; I somehow missed the file attached to your initial post.
    That's the sort of thing I was looking for.
    Apology accepted... :)
    Having reviewed that, the source code comments indicate it is for
    "character-by-character (not collation order) comparison operators
    for character types". Perhaps your collation is having an impact
    regardless of that. Could you show us the values of your settings
    related to locale?
    I am not entirely sure what you mean by settings related to locale ...
    but here is the system locale output and the definition of my main
    instance (the 9.2 one I wrote about in the second test)...I don't know
    if/how to extract the same information from the temp instance made by
    "make check".

    Btw. could you point a newbie to where you found the function (so that I
    know that for later)?

    If you need other data, please be more specific about how I get them :)


    # locale
    LANG=da_DK.UTF8
    LC_CTYPE="da_DK.UTF8"
    LC_NUMERIC="da_DK.UTF8"
    LC_TIME="da_DK.UTF8"
    LC_COLLATE="da_DK.UTF8"
    LC_MONETARY="da_DK.UTF8"
    LC_MESSAGES=en_US.UTF8
    LC_PAPER="da_DK.UTF8"
    LC_NAME="da_DK.UTF8"
    LC_ADDRESS="da_DK.UTF8"
    LC_TELEPHONE="da_DK.UTF8"
    LC_MEASUREMENT="da_DK.UTF8"
    LC_IDENTIFICATION="da_DK.UTF8"
    LC_ALL=


    (sk@[local]:5432) [sk] > \l
                                     List of databases
       Name    |  Owner   | Encoding |  Collate   |   Ctype    |   Access privileges
    -----------+----------+----------+------------+------------+-----------------------
     postgres  | postgres | UTF8     | da_DK.UTF8 | da_DK.UTF8 |
     root      | root     | UTF8     | da_DK.UTF8 | da_DK.UTF8 |
     sk        | sk       | UTF8     | da_DK.UTF8 | da_DK.UTF8 |
     template0 | postgres | UTF8     | da_DK.UTF8 | da_DK.UTF8 | =c/postgres          +
               |          |          |            |            | postgres=CTc/postgres
     template1 | postgres | UTF8     | da_DK.UTF8 | da_DK.UTF8 | =c/postgres          +
               |          |          |            |            | postgres=CTc/postgres
    (5 rows)
  • Svenne Krap at Jun 18, 2013 at 7:07 pm

    On 18-06-2013 21:04, Svenne Krap wrote:
    (sk@[local]:5432) [sk] > \l

    List of databases

    Name | Owner | Encoding | Collate | Ctype | Access
    privileges

    -
    Arghh... crappy mailer... I have the information attached here instead...
  • Kevin Grittner at Jun 18, 2013 at 7:34 pm

    Svenne Krap wrote:

    I have the information attached here instead...
    I find it suspicious that the test is using an index which sorts
    first by the "f1" column, then later by "f1 text_pattern_ops"
    column.  I'm not 100% sure whether the test is bad or you have
    found a bug, although I suspect the latter.  The actual result
    should not depend on the index definition; the index should only
    affect performance and possibly the order of results where order is
    not specified.

    --
    Kevin Grittner
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
  • Svenne Krap at Jun 18, 2013 at 6:20 pm

    On 18-06-2013 18:40, Svenne Krap wrote:
    Any ideas what might have happened?
    After doing some more digging...

    My laptop (which runs PostgreSQL 9.2.4 on x86_64-pc-linux-gnu, compiled
    by x86_64-pc-linux-gnu-gcc (Gentoo 4.7.3 p1.0, pie-0.5.5) 4.7.3,
    64-bit) also returns "99", if I

    - run the CREATE TABLE tenk1 (from the git-master)
    - load data from tenk.data (from git-master)
    - run the "offending part" of the create_index.sql (also from git-master):

    The offending part is:
    CREATE TABLE dupindexcols AS
      SELECT unique1 as id, stringu2::text as f1 FROM tenk1;
    CREATE INDEX dupindexcols_i ON dupindexcols (f1, id, f1 text_pattern_ops);
    ANALYZE dupindexcols;

    EXPLAIN (COSTS OFF)
      SELECT count(*) FROM dupindexcols
        WHERE f1 > 'WA' and id < 1000 and f1 ~<~ 'YX';
    SELECT count(*) FROM dupindexcols
      WHERE f1 > 'WA' and id < 1000 and f1 ~<~ 'YX';


    I have no real idea what the "~<~" operator does (I have looked
    it up as scalarltjoinsel), and I cannot find any semantics for it in the
    docs*... So I have no way of manually checking the expected result.

    *=The term ~<~ is not exactly google-friendly and the docs site's search
    also returns empty...

    Does anyone have any idea what to look at next?

    Svenne
  • Jeff Janes at Jun 18, 2013 at 7:14 pm

    On Tue, Jun 18, 2013 at 11:20 AM, Svenne Krap wrote:
    On 18-06-2013 18:40, Svenne Krap wrote:
    Any ideas what might have happened?
    After doing some more digging...

    My laptop (which runs PostgreSQL 9.2.4 on x86_64-pc-linux-gnu, compiled
    by x86_64-pc-linux-gnu-gcc (Gentoo 4.7.3 p1.0, pie-0.5.5) 4.7.3,
    64-bit) also returns "99", if I

    - run the CREATE TABLE tenk1 (from the git-master)
    - load data from tenk.data (from git-master)
    - run the "offending part" of the create_index.sql (also from
    git-master):
    But 9.2.4 does pass "make check", and only fails if you reproduce those
    things manually?

    If so, I'm guessing that you have some language/locale settings that "make
    check" neutralizes in 9.2.4, but that neutralization is broken in HEAD.


    As I have no real idea of what "~<~" is for an operator (I have looked
    it up as scalarltjoinsel), but I cannot find any semantics for it in the
    docs*... So I have no way of manually checking the expected result.

    Yes, it does seem to be entirely undocumented. Using:
    git grep '~<~', I found the code comment "character-by-character (not
    collation order) comparison operators for character types"

    Anyway, if REL9_2_4 passes make check, but 073d7cb513f5de44530f fails, then
    you could use "git bisect" to find the exact commit that broke things.

    Cheers,

    Jeff
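
For readers unfamiliar with it, "git bisect" is essentially a binary search between a known-good and a known-bad commit. The following is a minimal Python sketch of that idea only. It assumes a linear history (real git handles branching history), and the names first_bad_commit and is_bad are hypothetical, not part of git.

```python
def first_bad_commit(commits, is_bad):
    """Return the first commit for which is_bad() is true.

    Assumptions (illustration only): `commits` is ordered oldest to
    newest, commits[0] is known good and commits[-1] is known bad.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid      # the breakage happened at mid or earlier
        else:
            good = mid     # the breakage happened after mid
    return commits[bad]


# Hypothetical history c0..c9 in which everything from c6 onward fails.
history = ["c%d" % n for n in range(10)]
culprit = first_bad_commit(history, lambda c: int(c[1:]) >= 6)
print(culprit)
```

With git itself, is_bad corresponds to building the tree and running "make check" at each step, which is why the search only helps when some older commit actually passes.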
  • Alvaro Herrera at Jun 18, 2013 at 7:23 pm

    Jeff Janes escribió:
    On Tue, Jun 18, 2013 at 11:20 AM, Svenne Krap wrote:

    As I have no real idea of what "~<~" is for an operator (I have looked
    it up as scalarltjoinsel), but I cannot find any semantics for it in the
    docs*... So I have no way of manually checking the expected result.
    Yes, it does seem to be entirely undocumented. Using:
    git grep '~<~', I found the code comment "character-by-character (not
    collation order) comparison operators for character types"
    To look up an operator you can search in pg_operator.h (where you'd also
    see the DESCR() line nearby containing a description) or the pg_operator
    catalog. The pg_operator row contains a reference to the pg_proc entry
    that implements the operator. The pg_proc row, in turn, refers to a
    (typically) C-language function that implements the function. Normally
    looking at the function and surrounding code you can figure out what the
    operator is about.

    In this case you should probably be able to find the operator referenced
    in pg_amop as well, as part of the "*_pattern_ops" opclasses.

    --
    Álvaro Herrera http://www.2ndQuadrant.com/
    PostgreSQL Development, 24x7 Support, Training & Services
  • Svenne Krap at Jun 18, 2013 at 7:41 pm

    On 18-06-2013 21:14, Jeff Janes wrote:
    But 9.2.4 does pass "make check", and only fails if you reproduce
    those things manually?
    No, I was lazy and used the (distribution-installed) 9.2....

    I have tried "make check" on REL_9_2_4 and that fails too (same sole
    failure)...
    If so, I'm guessing that you have some language/locale settings that
    "make check" neutralizes in 9.2.4, but that neutralization is broken
    in HEAD.
    Nope, just never ran "make check" on it...

    As I have no real idea of what "~<~" is for an operator (I have looked
    it up as scalarltjoinsel), but I cannot find any semantics for it in the
    docs*... So I have no way of manually checking the expected result.



    Yes, it does seem to be entirely undocumented. Using:
    git grep '~<~', I found the code comment "character-by-character (not
    collation order) comparison operators for character types"
    Anyway, if REL9_2_4 passes make check, but 073d7cb513f5de44530f fails,
    then you could use "git bisect" to find the exact commit that broke things.
    It does not..

    I will dig further and get back...

    Svenne
  • Svenne Krap at Jun 18, 2013 at 7:51 pm

    On 18-06-2013 21:41, Svenne Krap wrote:


    I will dig further and get back...
    The regression test was added in 9.2, the earliest interesting commit is
    d6d5f67b5b98b1685f9158e9d00a726afb2ae789,
    where Tom Lane changes the definition to the current.

    It still fails (which suggests that it has always failed and will always
    fail on my setup....)

    I am happy to run whatever relevant tests you can dream up, but I am
    fresh out of ideas :)

    Svenne
  • Kevin Grittner at Jun 18, 2013 at 8:16 pm

    Svenne Krap wrote:

    I am happy to run whatever relevant tests you can dream up, but I am
    fresh out of ideas :)
    psql regression
    begin;
    drop index dupindexcols_i;
    SELECT count(*) FROM dupindexcols
      WHERE f1 > 'WA' and id < 1000 and f1 ~<~ 'YX';
    rollback;
    select f1 from dupindexcols where f1 like 'W%' ORDER BY f1;

    What are the results?

    --
    Kevin Grittner
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
  • Jeff Janes at Jun 18, 2013 at 8:18 pm

    On Tue, Jun 18, 2013 at 12:51 PM, Svenne Krap wrote:
    On 18-06-2013 21:41, Svenne Krap wrote:



    I will dig further and get back...
    The regression test was added in 9.2, the earliest interesting commit is
    d6d5f67b5b98b1685f9158e9d00a726afb2ae789,
    where Tom Lane changes the definition to the current.

    I get it back to e2c2c2e8b1df7dfdb01e7, where the ability to have one index
    with the same column twice appears.


    The problem is the f1 > 'WA' part of the query. In Danish, apparently
    'AA' > 'WA', so two more rows show up.
    SELECT f1 FROM dupindexcols
      WHERE f1 > 'WA' and id < 1000 and f1 ~<~ 'YX'
    except
    SELECT f1 FROM dupindexcols
      WHERE f1 ~>~ 'WA' and id < 1000 and f1 ~<~ 'YX';

       f1
    --------
     AANAAA
     AAMAAA


    I don't know how important it is to the community for make check to pass
    under every possible LANG setting, or the best way to go about fixing it if
    it is important.

    Cheers,

    Jeff
  • Kevin Grittner at Jun 18, 2013 at 8:23 pm

    Jeff Janes wrote:

    The problem is the f1 > 'WA' part of the query.  In Danish,
    apparently 'AA' > 'WA', so two more rows show up.
    Thanks -- I didn't have the right locale installed, and wasn't
    quite sure what package to install to get it.

    So, the test is bad, rather than there being a production bug.
    I don't know how important it is to the community for make check
    to pass under every possible LANG setting, or the best way to go
    about fixing it if it is important.
    We should probably tweak the test to not use a range which fails in
    any known locale.

    --
    Kevin Grittner
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
  • Svenne Krap at Jun 19, 2013 at 6:00 am

    On 18-06-2013 22:18, Jeff Janes wrote:
    In Danish, apparently 'AA' > 'WA', so two more rows show up.
    Yes of course....

    We have three extra vowels following Z (namely Æ, Ø and Å) and for
    keyboards missing those essential keys we have an official alternate way
    to write them as AE, OE and AA.

    Which of course means that AA is larger than any other letter ;)

    Nice find :)

    Svenne
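
The difference between the two comparisons can be illustrated with a small Python sketch. It assumes a deliberately simplified model of Danish collation (uppercase letters only, an alphabet ending in Æ, Ø, Å, and the digraph AA treated as Å); the real da_DK rules in the C library are more involved. The names danish_key, collation_gt and bytewise_gt are hypothetical and only model the behaviour of > and ~>~ as discussed above.

```python
# Simplified Danish alphabet (assumption for illustration only).
DANISH_ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZÆØÅ"
RANK = {ch: i for i, ch in enumerate(DANISH_ALPHABET)}


def danish_key(s):
    """Sort key for an uppercase string under the simplified rules."""
    key = []
    i = 0
    while i < len(s):
        if s[i:i + 2] == "AA":
            key.append(RANK["Å"])  # the digraph AA collates as Å
            i += 2
        else:
            key.append(RANK[s[i]])
            i += 1
    return key


def collation_gt(a, b):
    """Models f1 > 'WA' under a Danish-like collation."""
    return danish_key(a) > danish_key(b)


def bytewise_gt(a, b):
    """Models f1 ~>~ 'WA': plain character-by-character comparison."""
    return a > b


# The two extra rows from the thread, plus two control values.
for value in ["AANAAA", "AAMAAA", "WBAAAA", "ABAAAA"]:
    print(value, collation_gt(value, "WA"), bytewise_gt(value, "WA"))
```

In this model only the two values beginning with AA give different answers for the two comparisons, which matches the two extra rows reported earlier in the thread.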
  • Kevin Grittner at Jun 19, 2013 at 1:18 pm

    Svenne Krap wrote:
    On 18-06-2013 22:18, Jeff Janes wrote:

    In Danish, apparently 'AA' > 'WA', so two more rows show up.
    Yes of course....

    We have three extra vowels following Z (namely Æ, Ø and Å) and
    for keyboard missing those essential keys we have an official
    alternate way to write them as AE , OE and AA.

    Which of course means that AA is larger than any other letter ;)
    Does anyone object to the attached change, so that regression tests
    pass when run in a Danish locale?  I think it should be
    back-patched to 9.2, where the test was introduced.

    --
    Kevin Grittner
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
  • Andres Freund at Jun 19, 2013 at 1:22 pm

    On 2013-06-19 06:18:20 -0700, Kevin Grittner wrote:
    Svenne Krap wrote:
    On 18-06-2013 22:18, Jeff Janes wrote:

    In Danish, apparently 'AA' > 'WA', so two more rows show up.
    Yes of course....

    We have three extra vowels following Z (namely Æ, Ø and Å) and
    for keyboard missing those essential keys we have an official
    alternate way to write them as AE , OE and AA.

    Which of course means that AA is larger than any other letter ;)
    Does anyone object to the attached change, so that regression tests
    pass when run in a Danish locale?  I think it should be
    back-patched to 9.2, where the test was introduced.
    Don't we actually run make check/standard pg_regress with an enforced C
    locale? In which case this would imply some bigger problem we probably
    don't want to hide.

    Greetings,

    Andres Freund

    --
      Andres Freund http://www.2ndQuadrant.com/
      PostgreSQL Development, 24x7 Support, Training & Services
  • Andres Freund at Jun 19, 2013 at 1:46 pm

    On 2013-06-19 15:23:16 +0200, Andres Freund wrote:
    On 2013-06-19 06:18:20 -0700, Kevin Grittner wrote:
    Svenne Krap wrote:
    On 18-06-2013 22:18, Jeff Janes wrote:

    In Danish, apparently 'AA' > 'WA', so two more rows show up.
    Yes of course....

    We have three extra vowels following Z (namely Æ, Ø and Å) and
    for keyboard missing those essential keys we have an official
    alternate way to write them as AE , OE and AA.

    Which of course means that AA is larger than any other letter ;)
    Does anyone object to the attached change, so that regression tests
    pass when run in a Danish locale?  I think it should be
    back-patched to 9.2, where the test was introduced.
    Don't we actually run make check/standard pg_regress with an enforced C
    locale? In which case this would imply some bigger problem we probably
    don't want to hide.
    Misremembered, we only do that optionally. So yes, seems sensible.

    Greetings,

    Andres Freund

    --
      Andres Freund http://www.2ndQuadrant.com/
      PostgreSQL Development, 24x7 Support, Training & Services
  • Peter Eisentraut at Jun 19, 2013 at 2:05 pm

    On 6/19/13 9:18 AM, Kevin Grittner wrote:
    Svenne Krap wrote:
    On 18-06-2013 22:18, Jeff Janes wrote:

    In Danish, apparently 'AA' > 'WA', so two more rows show up.
    Yes of course....

    We have three extra vowels following Z (namely Æ, Ø and Å) and
    for keyboard missing those essential keys we have an official
    alternate way to write them as AE , OE and AA.

    Which of course means that AA is larger than any other letter ;)
    Does anyone object to the attached change, so that regression tests
    pass when run in a Danish locale? I think it should be
    back-patched to 9.2, where the test was introduced.
    Yes, that should be fixed. I wouldn't put in the comment, though. A
    few releases ago, I fixed a number of other "Danish" issues, so adding
    this comment would give the impression that this is the only place.
  • Kevin Grittner at Jun 19, 2013 at 3:41 pm

    Peter Eisentraut wrote:
    On 6/19/13 9:18 AM, Kevin Grittner wrote:

    Does anyone object to the attached change, so that regression tests
    pass when run in a Danish locale?  I think it should be
    back-patched to 9.2, where the test was introduced.
    Yes, that should be fixed.  I wouldn't put in the comment, though.  A
    few releases ago, I fixed a number of other "Danish" issues, so adding
    this comment would give the impression that this the only place.
    OK, pushed without the comment.

    --
    Kevin Grittner
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
  • Svenne Krap at Jun 19, 2013 at 8:54 pm

    On 19-06-2013 17:41, Kevin Grittner wrote:
    OK, pushed without the comment.
    Works like a charm :)

    Svenne
  • Jeff Janes at Jun 19, 2013 at 9:50 pm

    On Wed, Jun 19, 2013 at 8:41 AM, Kevin Grittner wrote:

    Peter Eisentraut wrote:
    On 6/19/13 9:18 AM, Kevin Grittner wrote:

    Does anyone object to the attached change, so that regression tests
    pass when run in a Danish locale? I think it should be
    back-patched to 9.2, where the test was introduced.
    Yes, that should be fixed. I wouldn't put in the comment, though. A
    few releases ago, I fixed a number of other "Danish" issues, so adding
    this comment would give the impression that this the only place.
    OK, pushed without the comment.
    I had started this and let it run overnight:

    for LANG in `locale -a`; do make check >& /dev/null ; echo $? $LANG; done

    Of the 735 language/locales/encodings, I got 93 failures. After your
    commit I re-tested just the failures, and it fixed 25 of them.

    Of the ones I looked at, most of the problems are in create_index, some in
    matview as well.

    Lithuanian has Y coming between I and J. Estonian has Z between S and T.
    Norwegian seems to treat V and W as being equal except to break
    suffix-ties.

    Is there an infrastructure to use a different expected file depending on
    the LANG used?

    Cheers,

    Jeff
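
A loop like the one above can also be written so that it collects the failing locales instead of printing exit codes. The Python sketch below is one hedged way to do that. failing_locales is a hypothetical helper, and the demo uses a stand-in command rather than "make check" so that it runs anywhere. It removes LC_ALL and LC_COLLATE from the child environment because, under POSIX precedence, either would override LANG for collation.

```python
import os
import subprocess
import sys


def failing_locales(locales, command):
    """Run `command` once per locale; return the locales where it failed."""
    failures = []
    for name in locales:
        # Set LANG for the child process only.
        env = dict(os.environ, LANG=name)
        env.pop("LC_ALL", None)      # would override LANG entirely
        env.pop("LC_COLLATE", None)  # would override LANG for collation
        result = subprocess.run(command, env=env,
                                stdout=subprocess.DEVNULL,
                                stderr=subprocess.DEVNULL)
        if result.returncode != 0:
            failures.append(name)
    return failures


# Stand-in for ["make", "check"] (assumption for the demo): it "fails"
# for every locale except C.
fake_check = [sys.executable, "-c",
              "import os, sys; sys.exit(0 if os.environ['LANG'] == 'C' else 1)"]
print(failing_locales(["C", "da_DK.UTF-8", "lt_LT.UTF-8"], fake_check))
```

For a real run, the locale list would come from the output of "locale -a" and the command would be ["make", "check"], executed from the top of the source tree.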
  • Kevin Grittner at Jun 19, 2013 at 10:03 pm

    Jeff Janes wrote:
    Kevin Grittner wrote:
    Peter Eisentraut wrote:
    On 6/19/13 9:18 AM, Kevin Grittner wrote:
    Does anyone object to the attached change, so that regression tests
    pass when run in a Danish locale?  I think it should be
    back-patched to 9.2, where the test was introduced.
    Yes, that should be fixed.  I wouldn't put in the comment, though.  A
    few releases ago, I fixed a number of other "Danish" issues, so adding
    this comment would give the impression that this the only place.
    OK, pushed without the comment.
    I had started this and let it run overnight:

    for LANG in `locale -a`; do make check >& /dev/null ; echo $? $LANG; done

    Of the 735 language/locales/encodings, I got 93 failures.
    Ouch!
    After your commit I re-tested just the failures, and it fixed 25
    of them.
    That's more than I would have guessed.  Cool.
    Of the ones I looked at, most of the problems are in
    create_index, some in matview as well.
    So of the 68 remaining locales which fail, most are due to a couple
    scripts.
    Lithuanian has Y coming between I and J.
    Estonian has Z between S and T.
    Norwegian seems to treat V and W as being equal except to break suffix-ties.
    Is there an infrastructure to use a different expected file
    depending on the LANG used?
    Well, any one test can have alternative "expected" scripts; but in
    previous discussions we decided that that facility should not be
    used for locale issues.  It would tend to get into multiplicative
    permutations with other reasons to have alternatives.  What we have
    done is to try to create tests that don't hit those edge conditions
    when we know of them.

    Could you share your detailed information on the remaining failures?

    --
    Kevin Grittner
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
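
One way to picture the approach of choosing test ranges that avoid locale edge cases is sketched below in Python. The three orderings are simplified assumptions built only from the observations earlier in this thread (Lithuanian Y between I and J, Estonian Z between S and T), not real locale definitions, and range_is_locale_safe is a hypothetical helper, not part of the regression suite.

```python
# Simplified alphabets (assumptions for illustration only).
ORDERINGS = {
    "C": "ABCDEFGHIJKLMNOPQRSTUVWXYZ",
    "lt": "ABCDEFGHIYJKLMNOPQRSTUVWXZ",  # Y sorts between I and J
    "et": "ABCDEFGHIJKLMNOPQRSZTUVWXY",  # Z sorts between S and T
}


def make_key(alphabet):
    """Build a sort-key function for uppercase strings in this alphabet."""
    rank = {ch: i for i, ch in enumerate(alphabet)}
    return lambda s: [rank[ch] for ch in s]


def rows_in_range(values, low, high, alphabet):
    """Values v with low < v < high under the given ordering."""
    key = make_key(alphabet)
    return sorted(v for v in values if key(low) < key(v) < key(high))


def range_is_locale_safe(values, low, high):
    """True if every modelled ordering selects the same rows."""
    results = [rows_in_range(values, low, high, alphabet)
               for alphabet in ORDERINGS.values()]
    return all(r == results[0] for r in results)


sample = ["JA", "KB", "TA", "YA", "ZA"]
print(range_is_locale_safe(sample, "IA", "LA"))  # range straddles Lithuanian Y
print(range_is_locale_safe(sample, "JA", "LA"))  # range avoids the moved letters
```

The first range selects an extra row under the Lithuanian-like ordering, while the second gives the same rows under all three. A check like this covers only the orderings it models, so it can flag a known-bad range but cannot show that a range is safe in every locale.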
  • Peter Eisentraut at Jun 20, 2013 at 2:01 am

    On Wed, 2013-06-19 at 14:50 -0700, Jeff Janes wrote:
    Is there an infrastructure to use a different expected file depending
    on the LANG used?
    Not really. A couple of years ago I did the same exercise you just did,
    and we just fixed most of what was reasonable to fix by adjusting the
    test cases.

Discussion Overview
group: pgsql-hackers @
category: postgresql
posted: Jun 18, '13 at 4:40p
active: Jun 20, '13 at 2:01a
posts: 25
users: 6
website: postgresql.org...
irc: #postgresql
