The following bug has been logged online:
Bug reference: 3979
Logged by: David Lee
Email address: [email protected]
PostgreSQL version: 8.2.6
Operating system: Ubuntu Feisty Server
Description: SELECT DISTINCT slow even on indexed column
Column | Type | Modifiers
a | integer | not null
b | integer | not null
time | timestamp without time zone | not null
remote_time | timestamp without time zone | not null
ip | inet | not null
The table has 20 million rows.
The table "x" has an index on ("a", "b").
I first tried:
SELECT DISTINCT a, b FROM x
but it was so slow.
I ran EXPLAIN and it showed that the path did not use the index, so I ran:
SET enable_seqscan = off;
and ran the query again.
Although it used the index, the query was still very slow.
Finally, I ran:
SELECT a, b FROM x GROUP BY a, b;
But it was still the same.
Next I created an index on ("a") and ran the query:
SELECT DISTINCT a FROM x
but the same thing happened (first didn't use the index; after turning
seq-scan off, was still slow; tried using GROUP BY, still slow).
The columns "a" and "b" are NOT NULL and has 100 distinct values each. The
indexes are all btree indexes.
Grokbase › Groups › PostgreSQL › pgsql-bugs › February 2008