Some info:
PostgreSQL version: 9.1.2
Table "cache":
Rows count: 3 471 081
Column "tsv" tsvector
Index "cache_tsv" USING gin (tsv)
If i do query like THIS:
*SELECT id FROM table WHERE tsv @@ to_tsquery('test:*');*
It uses index and returns results immediately:
explain analyze
'Bitmap Heap Scan on cache (cost=1441.78..63802.63 rows=19843 width=4)
(actual time=29.309..31.518 rows=1358 loops=1)'
' Recheck Cond: (tsv @@ to_tsquery('test:*'::text))'
' -> Bitmap Index Scan on cache_tsv (cost=0.00..1436.82 rows=19843
width=0) (actual time=28.966..28.966 rows=1559 loops=1)'
' Index Cond: (tsv @@ to_tsquery('test:*'::text))'
'Total runtime: 31.789 ms'
But the performance problems starts when i do the same query specifying
LIMIT.
*SELECT id FROM cache WHERE tsv @@ to_tsquery('test:*') limit 20;*
By some reason index is not used.
explain analyze
'Limit (cost=0.00..356.23 rows=20 width=4) (actual time=7.984..765.550
rows=20 loops=1)'
' -> Seq Scan on cache (cost=0.00..353429.50 rows=19843 width=4) (actual
time=7.982..765.536 rows=20 loops=1)'
' Filter: (tsv @@ to_tsquery('test:*'::text))'
'Total runtime: 765.620 ms'
Some more debug notes:
1) If i set SET enable_seqscan=off; then query uses indexes correctly
2) Also i notified, if i use: to_tsquery('test') without wildcard search
:*, then index is used correctly in both queries, with or without LIMIT
Any ideas how to fix the problem?
Thank you
Query planner doesn't use index scan on tsvector GIN index if LIMIT is specifiedQuery planner doesn't use index scan on tsvector GIN index if LIMIT is specified
| Tweet |
|
Search Discussions
-
Tom Lane at Jan 10, 2012 at 5:04 pm ⇧
It apparently thinks there are enough matches that it might as well justdarklow writes:
But the performance problems starts when i do the same query specifying
LIMIT.
*SELECT id FROM cache WHERE tsv @@ to_tsquery('test:*') limit 20;*
By some reason index is not used.
seqscan the table and expect to find some matches at random, in less
time than using the index would take.
The estimate seems to be off quite a bit, so maybe raising the stats
target for this column would help.
regards, tom lane
-
Jesper Krogh at Jan 10, 2012 at 9:43 pm ⇧
The cost of matching ts_match_vq against a toasted columnOn 2012-01-10 18:04, Tom Lane wrote:
darklow<darklow@gmail.com> writes:But the performance problems starts when i do the same query specifyingIt apparently thinks there are enough matches that it might as well just
LIMIT.
*SELECT id FROM cache WHERE tsv @@ to_tsquery('test:*') limit 20;*
By some reason index is not used.
seqscan the table and expect to find some matches at random, in less
time than using the index would take.
The estimate seems to be off quite a bit, so maybe raising the stats
target for this column would help.
is not calculated correctly. This is completely parallel with
http://archives.postgresql.org/pgsql-hackers/2011-11/msg01754.php
Try raising the cost for ts_match_vq(tsvector,tsquery) that help a bit, but
its hard to get the cost high enough.
Raising statistics target helps too..
--
Jesper
Related Discussions
Discussion Navigation
| view | thread | post |
Discussion Overview
| group | pgsql-performance
|
| categories | postgresql |
| posted | Jan 10, '12 at 12:30p |
| active | Jan 10, '12 at 9:43p |
| posts | 3 |
| users | 3 |
| website | postgresql.org |
| irc | #postgresql |
