Re: queries with DISTINCT / GROUP BY giving different plans - Mailing list pgsql-performance

From	Tomas Vondra
Subject	Re: queries with DISTINCT / GROUP BY giving different plans
Date	August 20, 2013 22:21:29
Msg-id	5213EBE4.5030302@fuzzy.cz
In response to	Re: queries with DISTINCT / GROUP BY giving different plans (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: queries with DISTINCT / GROUP BY giving different plans
List	pgsql-performance

On 20.8.2013 23:02, Tom Lane wrote:
> Tomas Vondra <tv@fuzzy.cz> writes:
>
>> Anyway, I still don't understand why the same logic around
>> hash_agg_entry_size should not apply to choose_hashed_grouping as
>> well? Well, it would make it slower in this particular corner case,
>> but wouldn't it be more correct?

Meh, I meant it the other way around: applying the hashentrysize logic
from choose_hashed_grouping to choose_hashed_distinct, so that both use
56 B.
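
To make the difference concrete, here is a minimal sketch of the kind of
check I understand both functions to perform: estimated entry size times
estimated number of groups, compared against work_mem. This is
illustrative Python, not the planner source. The function name
fits_in_work_mem, the work_mem value, the group count and the 24 B figure
are all assumptions made up for the example; only the 56 B figure comes
from this thread.

```python
# Illustrative sketch only; this is NOT PostgreSQL source code.

def fits_in_work_mem(entry_bytes: int, num_groups: float, work_mem_kb: int) -> bool:
    """Return True if the estimated hash table fits within work_mem.

    entry_bytes  -- estimated size of one hash table entry, in bytes
    num_groups   -- planner's estimate of the number of groups
    work_mem_kb  -- work_mem setting, in kilobytes
    """
    return entry_bytes * num_groups <= work_mem_kb * 1024


WORK_MEM_KB = 1024            # hypothetical setting (1 MB)
NUM_GROUPS = 25_000           # hypothetical planner estimate

ENTRY_WITH_OVERHEAD = 56      # per-entry size quoted in this thread
ENTRY_WITHOUT_OVERHEAD = 24   # hypothetical smaller size, for contrast only

# Same group estimate, different per-entry size assumptions.
grouping_ok = fits_in_work_mem(ENTRY_WITH_OVERHEAD, NUM_GROUPS, WORK_MEM_KB)
distinct_ok = fits_in_work_mem(ENTRY_WITHOUT_OVERHEAD, NUM_GROUPS, WORK_MEM_KB)

print("hashing allowed with overhead counted:   ", grouping_ok)
print("hashing allowed with overhead not counted:", distinct_ok)
```

With the same group estimate, the smaller entry size passes the check
while the larger one fails it, which is how two otherwise equivalent
queries could end up with different plans.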

> choose_hashed_grouping has it right, or at least more nearly right.
> choose_hashed_distinct is simply failing to account for space that
> will in fact be consumed.  Not fixing that is not a good way to deal
> with inaccurate number-of-groups estimates; if that estimate is low
> rather than high, the consequences will be a lot worse than they are
> here.

Not quite sure how to parse this (not a native speaker here, sorry).
Does that mean we want to keep it as it is now (because fixing it would
cause even worse errors with low estimates)? Or do we want to fix
hashed_distinct so that it behaves like hashed_grouping?

Tomas
