On Sat, Apr 20, 2019 at 04:46:03PM -0400, Tom Lane wrote:
>Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>> I think it's really a matter of an underestimate, which convinces the planner
>> to hash the larger table. In this case, the table is 42GB, so it's
>> possible it actually works as expected. With work_mem = 4MB I've seen 32k
>> batches, and that's not that far off, I'd say. Maybe there are more common
>> values, but it does not seem like a very contrived data set.
>
>Maybe we just need to account for the per-batch buffers while estimating
>the amount of memory used during planning. That would force this case
>into a mergejoin instead, given that work_mem is set so small.
>
How would that solve the issue of underestimates like this one?
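
For a sense of scale, here is a rough back-of-the-envelope sketch of the
per-batch buffer overhead. It assumes one 8kB buffer (the default BLCKSZ) per
batch file and one file per batch on each side of the join; the function names
are made up for illustration and do not correspond to anything in the planner.

```python
# Back-of-the-envelope estimate of per-batch buffer memory in a hash join.
# Assumptions (for illustration only): one 8kB buffer per batch file
# (default BLCKSZ), and one file per batch on each side of the join.

BLCKSZ = 8192                 # assumed buffer size per batch file, in bytes
WORK_MEM = 4 * 1024 * 1024    # work_mem = 4MB, as in the case discussed above


def batch_buffer_bytes(nbatch, sides=2, buffer_size=BLCKSZ):
    """Memory held by batch-file buffers alone, ignoring the hash table."""
    return nbatch * sides * buffer_size


def max_batches_within(work_mem, sides=2, buffer_size=BLCKSZ):
    """Largest batch count whose buffers still fit within work_mem."""
    return work_mem // (sides * buffer_size)


nbatch = 32 * 1024            # the "32k batches" seen with work_mem = 4MB
buffers = batch_buffer_bytes(nbatch)
print(f"{nbatch} batches -> {buffers // (1024 * 1024)} MB of buffers")
print(f"work_mem covers buffers for at most {max_batches_within(WORK_MEM)} batches")
```

Under those assumptions the buffers alone are far larger than work_mem. But
the planner could only account for that if its batch estimate were close to
the number of batches we actually end up with at execution time.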

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services