Hi,

After pg_trgm extracts the trigrams as GIN index keys, generate_trgm()
removes duplicate index keys, to avoid generating redundant index entries.
Also, ginExtractEntries(), which is the caller of pg_trgm, does the same thing.
Why do we need to remove GIN index entries twice? I think that we can
get rid of the removal-of-duplicate code block from generate_trgm()
because it's useless. Comments?

Regards,

--
Fujii Masao

  • Tom Lane at Aug 27, 2012 at 7:38 pm

    Fujii Masao writes:
    > After pg_trgm extracts the trigrams as GIN index keys, generate_trgm()
    > removes duplicate index keys, to avoid generating redundant index entries.
    > Also ginExtractEntries() which is the caller of pg_trgm does the same thing.
    > Why do we need to remove GIN index entries twice? I think that we can
    > get rid of the removal-of-duplicate code block from generate_trgm()
    > because it's useless. Comments?

    I see eight different callers of generate_trgm(). It might be that
    gin_extract_value_trgm() doesn't really need this behavior, but that
    doesn't mean the other seven don't want it.

    Also, seeing that generate_trgm() is able to use relatively cheap
    trigram-specific comparison operators for this, it's not impossible
    that getting rid of duplicates internal to it is a net savings even
    for the gin_extract_value case, because it'd reduce the number of
    much-more-heavyweight comparisons done by ginExtractEntries...

    regards, tom lane
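
    The second point in the reply above can be illustrated with a small
    standalone sketch. This is not pg_trgm's code: the Trigram struct and
    the trigram_cmp() and distinct_trigrams() functions are hypothetical
    names made up for the example, and the extraction step skips the
    padding and case folding that the real module performs. It only shows
    why deduplicating with a trigram-specific comparison is cheap: each
    comparison is a 3-byte memcmp, and every duplicate dropped here is one
    fewer key that a later, more general comparison has to look at.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a trigram: three raw bytes. */
typedef struct
{
    unsigned char b[3];
} Trigram;

/* Cheap, type-specific comparison: a plain 3-byte memcmp. */
static int
trigram_cmp(const void *a, const void *b)
{
    return memcmp(a, b, sizeof(Trigram));
}

/*
 * Copy every 3-byte window of str into out, sort the windows, and squeeze
 * out duplicates in place.  Returns the number of distinct trigrams.
 * The caller must provide room for strlen(str) - 2 entries in out.
 */
static size_t
distinct_trigrams(const char *str, Trigram *out)
{
    size_t      len = strlen(str);
    size_t      n, i, kept;

    if (len < 3)
        return 0;

    n = len - 2;
    for (i = 0; i < n; i++)
        memcpy(out[i].b, str + i, 3);

    qsort(out, n, sizeof(Trigram), trigram_cmp);

    /* Sorted input: equal trigrams are adjacent, so one pass suffices. */
    kept = 1;
    for (i = 1; i < n; i++)
    {
        if (trigram_cmp(&out[i], &out[kept - 1]) != 0)
            out[kept++] = out[i];
    }
    return kept;
}
```

    For the input "abcabcabc" the sketch extracts seven windows and keeps
    three distinct ones, so a caller that deduplicates again would compare
    three keys rather than seven. Whether that is a net win in the real
    code depends on the relative cost of the two comparison paths, which
    this sketch does not measure.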

Discussion Overview
group: pgsql-hackers
categories: postgresql
posted: Aug 27, '12 at 4:46p
active: Aug 27, '12 at 7:38p
posts: 2
users: 2
website: postgresql.org...
irc: #postgresql