Re: Back-patch change in hashed DISTINCT estimation? - Mailing list pgsql-hackers

From Jeff Janes
Subject Re: Back-patch change in hashed DISTINCT estimation?
Date
Msg-id CAMkU=1ynT3s+FHJGktG94nq-4ynRx0L++D6t+7OLp08-mMWaQw@mail.gmail.com
In response to Re: Back-patch change in hashed DISTINCT estimation?  (Andres Freund <andres@2ndquadrant.com>)
List pgsql-hackers
On Wed, Aug 21, 2013 at 4:05 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> On 2013-08-20 17:24:18 -0400, Tom Lane wrote:
>> In a thread over in pgsql-performance, Tomas Vondra pointed out that
>> choose_hashed_distinct was sometimes making different choices than
>> choose_hashed_grouping, so that queries like these:
>>
>>       select distinct x from ... ;
>>       select x from ... group by 1;
>>
>> might get different plans even though they should be equivalent.
>> After some debugging it turns out that I omitted hash_agg_entry_size()
>> from the hash table size estimate in choose_hashed_distinct --- I'm pretty
>> sure I momentarily thought that this function must yield zero if there are
>> no aggregates, but that's wrong.  So we need a patch like this:
>
>> What I'm wondering is whether to back-patch this or leave well enough
>> alone.  The risk of back-patching is that it might destabilize plan
>> choices that people like. [...]
>
>> A possible compromise is to back-patch into 9.3 (where the argument about
>> destabilizing plan choices doesn't have much force yet), but not further.
>
> +1 for 9.3 only.

I agree.  work_mem is hard to tune analytically with any great
precision.  If it has been carefully tuned, that was probably done
empirically, so changing the behavior in back branches seems bad.
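
To make the concern concrete, here is a toy sketch of how adding a
per-entry overhead to a hash table size estimate can flip a "does it
fit in work_mem?" decision.  This is not the planner's actual code: the
function name, the group count, the tuple width, and the 24-byte
overhead are all made-up numbers chosen only for illustration.

```python
def hash_table_fits(num_groups, tuple_width, entry_overhead, work_mem_kb):
    """Toy model: does the estimated hash table fit within work_mem?

    All parameters are hypothetical; the real estimate in the planner
    uses its own per-tuple and per-entry accounting.
    """
    estimated_bytes = num_groups * (tuple_width + entry_overhead)
    return estimated_bytes <= work_mem_kb * 1024


# Hypothetical workload: 100,000 distinct groups, 32-byte tuples, 4MB work_mem.
num_groups = 100_000
tuple_width = 32
work_mem_kb = 4096

# Estimate with the per-entry overhead omitted (assumed 0 bytes).
fits_without_overhead = hash_table_fits(num_groups, tuple_width, 0, work_mem_kb)

# Estimate with an assumed 24-byte per-entry overhead included.
fits_with_overhead = hash_table_fits(num_groups, tuple_width, 24, work_mem_kb)

print(fits_without_overhead, fits_with_overhead)
```

With these assumed numbers the first estimate fits and the second does
not, so a work_mem value that someone settled on by trial and error
could start producing a different plan after the change.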

Cheers,

Jeff


