Hi everybody,

I have an issue with PostgreSQL. There is a limit on the size of a
column's value when there is an index over that column. In the project
I am currently working on, I have run into exactly this situation: I
have an attribute that I search over (that is, I need an index on it),
but in some cases this attribute can be very large (100 KB+).

The error message I get from Postgres when I try to insert a tuple with
such a large attribute (e.g. 10 KB) is the following:

ERROR: index row requires 15704 bytes, maximum size is 8191

(PostgreSQL 8.0.7 on Linux; the index is a btree index.)

I have thought of a possible workaround and would like to know if it
seems reasonable. The idea is to build a hash, on the client side, over
the problematic column (say, column a). I then store in the database
both the attribute a (without an index) and hash(a) (with an index).
When doing a select, I first use a sub-select to fetch all tuples with
the matching hash (fast, using the index), and then an outer select to
pick the tuple with the matching attribute a (slow, sequential scan,
but normally over only a few tuples, since collisions are rare).
Something like this:

SELECT b
FROM (
    SELECT a, b
    FROM foo
    WHERE hash = '<hash(a)>'
) AS bar
WHERE bar.a = '<a>';
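
For concreteness, a minimal sketch of the schema this workaround
implies (table and column names are illustrative; the hash would be
computed client-side, e.g. as an MD5 hex digest):

-- the large attribute stays unindexed; only its fixed-size hash is
-- indexed, so index entries never approach the 8191-byte limit
CREATE TABLE foo (
    a    text,      -- the large attribute (can be 100 KB+)
    b    text,      -- the value we actually want to retrieve
    hash char(32)   -- client-computed digest of a, e.g. MD5 in hex
);

CREATE INDEX foo_hash_idx ON foo (hash);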

(Actually, my situation is slightly more complicated because I have not
just one such attribute but two or more, which rules out some index
types. The principle is the same, though.)

Does this solution seem reasonable, or are there other (more elegant)
ways to do this?

Thank you in advance.
Cheers,

Pat


  • Martijn van Oosterhout at Jun 26, 2006 at 7:57 pm

    On Mon, Jun 26, 2006 at 02:52:56AM -0700, pajai wrote:
    I have thought of a possible workaround and would like to know if it
    seems reasonable. The idea is to build a hash, on the client side, over
    the problematic column (say, column a). I then store in the database
    both the attribute a (without an index) and hash(a) (with an index).
    When doing a select, I first use a sub-select to fetch all tuples with
    the matching hash (fast, using the index), and then an outer select to
    pick the tuple with the matching attribute a (slow, sequential scan,
    but normally over only a few tuples, since collisions are rare).
    Something like this:
    Perhaps you should look into functional indexes, i.e. indexes built
    over a function:

    CREATE INDEX foo_index ON foo( hash(a) );

    This index will automatically be used if you make a query like this:

    ... WHERE hash(a) = 'blah';
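
    For instance, a minimal sketch assuming PostgreSQL's built-in md5()
    function as the hash (the hash() above is just a placeholder; any
    immutable function can be indexed this way):

    -- expression index over the built-in md5() function
    CREATE INDEX foo_md5_idx ON foo (md5(a));

    -- the index is used when the query repeats the same expression;
    -- the second condition guards against md5 collisions:
    SELECT b FROM foo WHERE md5(a) = md5('<a>') AND a = '<a>';

    This also avoids maintaining a separate hash column by hand.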

    Hope this helps,
    --
    Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
    From each according to his ability. To each according to his ability to litigate.
  • Tom Lane at Jun 26, 2006 at 8:25 pm

    "pajai" <patrick.jayet@gmail.com> writes:
    I have an issue with PostgreSQL. There is a limit on the size of a
    column's value when there is an index over that column. In the project
    I am currently working on, I have run into exactly this situation: I
    have an attribute that I search over (that is, I need an index on it),
    but in some cases this attribute can be very large (100 KB+).
    It sounds to me a lot like you may be in need of full-text-indexing
    code; see contrib/tsearch2. If you were not trying to roll your own
    text searcher, please give more details. I can hardly imagine a
    situation in which it is useful to make a btree index on 100KB values.
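
    A rough sketch of what the tsearch2 approach can look like, assuming
    the contrib module is installed (table and column names are
    illustrative):

    -- keep a tsvector version of the large attribute, indexed with GiST
    ALTER TABLE foo ADD COLUMN a_tsv tsvector;
    UPDATE foo SET a_tsv = to_tsvector('default', a);
    CREATE INDEX foo_tsv_idx ON foo USING gist (a_tsv);

    -- word searches then use the @@ match operator
    SELECT b FROM foo WHERE a_tsv @@ to_tsquery('default', 'someword');

    -- (tsearch2 also ships a trigger function to keep a_tsv current)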

    regards, tom lane
