Thread: SELECT DISTINCT triggers sorting operation

SELECT DISTINCT triggers sorting operation

From
Gaëtan Allart
Date:
Hi everybody,

I'm experiencing an interesting issue with PostgreSQL 8.4.2-r2.
I was running a pretty big database on another server (8.3) pretty fine.
When I moved it to this new server, behaviour changed.

On a complex SELECT DISTINCT query, pgsql now runs a "SORT" operation
although I've never requested it to do so! Considered this a millions
records table, requests usually runned within a second are now executed in
minutes...
Removing DISTINCT clause makes the request get executed instantely.

The interesting part is that "sort operation" is only added when there is
more than one field selected :


database=# EXPLAIN SELECT DISTINCT "articles_article"."id"  FROM
"articles_article"   LIMIT 8;
                                                      QUERY PLAN
----------------------------------------------------------------------------
------------------------------------------
 Limit  (cost=0.00..28.07 rows=8 width=4)
   ->  Unique  (cost=0.00..7879955.60 rows=2245904 width=4)
         ->  Index Scan using articles_article_pkey on articles_article
(cost=0.00..7874340.84 rows=2245904 width=4)
(3 rows)



database=# EXPLAIN SELECT DISTINCT "articles_article"."id",
"articles_article"."flux_id"  FROM "articles_article"   LIMIT 8;
                                         QUERY PLAN
----------------------------------------------------------------------------
-----------------
 Limit  (cost=614898.16..614898.22 rows=8 width=8)
   ->  Unique  (cost=614898.16..631742.44 rows=2245904 width=8)
         ->  Sort  (cost=614898.16..620512.92 rows=2245904 width=8)
               Sort Key: id, flux_id
               ->  Seq Scan on articles_article  (cost=0.00..316550.04
rows=2245904 width=8)
(5 rows)



Tunning the postgresql.conf with these options has not changed anything :
enable_hashagg = on, enable_sort =off.

Any idea how to disable this automatic CPU killing sorting operation?
Thanks,

Gaëtan Allart


Re: SELECT DISTINCT triggers sorting operation

From
Scott Marlowe
Date:
On Sat, Feb 6, 2010 at 3:49 PM, Gaëtan Allart <gallart@free.fr> wrote:
> Hi everybody,
>
> I'm experiencing an interesting issue with PostgreSQL 8.4.2-r2.
> I was running a pretty big database on another server (8.3) pretty fine.
> When I moved it to this new server, behaviour changed.

What does explain analyze run on both the 8.3 and 8.4 say?

Re: SELECT DISTINCT triggers sorting operation

From
Tom Lane
Date:
=?iso-8859-1?Q?Ga=EBtan_Allart?= <gallart@free.fr> writes:
> I'm experiencing an interesting issue with PostgreSQL 8.4.2-r2.
> I was running a pretty big database on another server (8.3) pretty fine.
> When I moved it to this new server, behaviour changed.

Did you ANALYZE the database after loading it?  Are you sure you
have all the parameters set similarly to what you used in the 8.3
installation?

(The described behavior sounds suspiciously like work_mem is not
large enough to allow a hashagg to be chosen.)

            regards, tom lane

Re: SELECT DISTINCT triggers sorting operation

From
Gaëtan Allart
Date:
You were absolutely right Tom.
Rising work_mem did the trick!

Many thanks :-)

What's the best value for work_mem ?

Gaëtan

Le 7 févr. 2010 à 07:38, Tom Lane a écrit :

> =?iso-8859-1?Q?Ga=EBtan_Allart?= <gallart@free.fr> writes:
>> I'm experiencing an interesting issue with PostgreSQL 8.4.2-r2.
>> I was running a pretty big database on another server (8.3) pretty fine.
>> When I moved it to this new server, behaviour changed.
>
> Did you ANALYZE the database after loading it?  Are you sure you
> have all the parameters set similarly to what you used in the 8.3
> installation?
>
> (The described behavior sounds suspiciously like work_mem is not
> large enough to allow a hashagg to be chosen.)
>
>             regards, tom lane