The following bug has been logged online:
Bug reference: 5231
Logged by: Thomas Hamilton
Email address: email@example.com
PostgreSQL version: 8.3.8
Operating system: Ubuntu 4.2.4
Description: SELECT DISTINCT poorly implemented vs SELECT ... GROUP BY
SELECT DISTINCT does a Sort followed by Unique.
SELECT ... GROUP BY, which is logically equivalent, performs a
HashAggregate. When run against a large dataset with a small number of
distinct results, HashAggregate is an order of magnitude more efficient!
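
To illustrate the difference between the two strategies outside the
database, here is a minimal Python sketch. It is not PostgreSQL's executor
code: the function names and the sample data are made up for illustration,
and it only mimics the general shape of each plan (sort all rows and drop
adjacent duplicates, versus a single pass with a hash table).

```python
from itertools import groupby


def distinct_sort_unique(rows):
    """Analogue of Sort + Unique: sort every row, then drop adjacent duplicates.

    Cost is dominated by the sort, roughly O(n log n) in the number of input rows.
    """
    return [key for key, _ in groupby(sorted(rows))]


def distinct_hash(rows):
    """Analogue of HashAggregate: one pass, remembering the values already seen.

    Cost is roughly O(n), and memory grows with the number of distinct values,
    not with the number of input rows.
    """
    seen = set()
    result = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            result.append(row)
    return result


# Hypothetical data: many rows, very few distinct values.
rows = [i % 5 for i in range(100_000)]

sorted_result = distinct_sort_unique(rows)
hashed_result = distinct_hash(rows)

# Both return the same set of values; only the sort-based one guarantees order.
assert set(sorted_result) == set(hashed_result)
```

With few distinct values the hash table stays tiny, while the sort-based
version still has to order all of the input rows before discarding most of
them.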
Since the spec does not require DISTINCT to return sorted results, I don't
believe Sort ... Unique will ever be more efficient than HashAggregate.
Therefore, in order to maximize performance, DISTINCT should always be
implemented as HashAggregate.