Re: strange slow query - lost lot of time somewhere - Mailing list pgsql-hackers

From David Rowley
Subject Re: strange slow query - lost lot of time somewhere
Date
Msg-id CAApHDvp87+kCufDkiq=7n-BD8kYxgYzvQLb_GkDK0Gs9TAQYYQ@mail.gmail.com
In response to Re: strange slow query - lost lot of time somewhere  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Tue, 3 May 2022 at 11:02, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> David Rowley <dgrowleyml@gmail.com> writes:
> > On Mon, 2 May 2022 at 21:00, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> >> I found a query that is significantly slower with more memory
>
> > If it was work_mem you increased, it seems strange that the plan would
> > switch over to using a Nested Loop / Memoize plan.
>
> Yeah, there's something unexplained there.
>
> I think that the most probable explanation for the symptoms is that
> cost_memoize_rescan is computing some insane value for est_entries,
> causing ExecInitMemoize to allocate-and-zero a huge hash table,
> which ExecEndMemoize then frees again.  Neither of those steps
> gets counted into any plan node's runtime, but EXPLAIN's total
> execution time will include them.  An insane value for est_entries
> could perhaps go along with a cost misestimate that convinces the
> planner to include the memoize even though it seems pointless.

That seems pretty unlikely to me. est_entries is based on the minimum
of the expected number of total cache entries and the ndistinct
estimate. ndistinct cannot be insane here, as ndistinct is never going
to be higher than the number of calls, which is the row estimate from
the outer side of the join. That's 91 in both cases here. As far as I
can see, that's just going to make a table of 128 buckets.

See estimate_num_groups_incremental() at:

/*
* We don't ever want to return an estimate of zero groups, as that tends
* to lead to division-by-zero and other unpleasantness.  The input_rows
* estimate is usually already at least 1, but clamp it just in case it
* isn't.
*/
input_rows = clamp_row_est(input_rows);
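
To illustrate the sizing arithmetic, here's a standalone sketch (not
the actual PostgreSQL code; it just mimics what simplehash.h does with
its SH_FILLFACTOR of 0.9):

#include <stdint.h>
#include <stdio.h>

/* round up to the next power of 2, like pg_nextpower2_32() */
static uint32_t
next_power_of_2(uint32_t v)
{
	v--;
	v |= v >> 1;
	v |= v >> 2;
	v |= v >> 4;
	v |= v >> 8;
	v |= v >> 16;
	return v + 1;
}

int
main(void)
{
	double		fillfactor = 0.90;	/* simplehash.h's SH_FILLFACTOR */
	uint32_t	est_entries = 91;	/* the outer-side row estimate here */
	uint32_t	nbuckets;

	/* size the table so the fillfactor isn't exceeded */
	nbuckets = next_power_of_2(est_entries);
	if ((double) est_entries / nbuckets >= fillfactor)
		nbuckets *= 2;

	printf("%u entries -> %u buckets\n", est_entries, nbuckets);
	return 0;	/* prints: 91 entries -> 128 buckets */
}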


> I spent some time studying cost_memoize_rescan, and the only
> conclusions I arrived at were that the variable names are poorly
> chosen and the comments are unhelpful.  For instance, one would
> think that est_entry_bytes is the expected size of one cache entry,
> but it seems to actually be the total space that would be occupied
> if the whole input relation were loaded into the cache.

I think you've misunderstood: it *is* the estimated size of a single
cache entry. You might be going wrong in assuming "tuples" is the
expected number of tuples from all rescans of the inner side of the
join; it's actually the estimate for a single scan. I can add a comment
there to help make that clear.
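
To spell out how I intended those variables to relate, here's a
simplified, self-contained sketch of the relationships in
cost_memoize_rescan() (the numbers are made up and the arithmetic is
abbreviated; it's not the exact source):

#include <stdio.h>

#define Min(a, b)	((a) < (b) ? (a) : (b))

int
main(void)
{
	/* illustrative numbers only; names loosely follow costsize.c */
	double	tuples = 1000.0;	/* rows from a SINGLE inner scan */
	double	tuple_width = 48.0;	/* average row width in bytes */
	double	ndistinct = 91.0;	/* distinct cache keys, never above calls */
	double	hash_mem_bytes = 4.0 * 1024 * 1024;	/* the memory budget */

	/* bytes to cache ONE scan's result -- one entry, not the whole input */
	double	est_entry_bytes = tuples * tuple_width;

	/* how many whole entries fit in the budget at once */
	double	est_cache_entries = hash_mem_bytes / est_entry_bytes;

	/* the estimate ExecInitMemoize() sizes the hash table from */
	double	est_entries = Min(ndistinct, est_cache_entries);

	printf("est_entries = %.0f\n", est_entries);	/* 87 with these numbers */
	return 0;
}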

> And
> the est_cache_entries computation seems nonsensical; if it does
> make sense, the comment sure doesn't illuminate why.  So I am
> quite prepared to buy into the idea that cost_memoize_rescan is
> producing bogus answers, but it's hard to tell what it's coming out
> with in this example.  Too bad EXPLAIN doesn't print est_entries.

I'm now wishing I had put the initial and final hash table sizes in
EXPLAIN + EXPLAIN ANALYZE. Perhaps it's not too late to do that for
v15 so that it might help us figure things like this out in the
future.
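
I'm imagining something like this (purely hypothetical output -- the
"Hash Buckets" line doesn't exist in any released version):

->  Memoize (actual time=0.011..0.012 rows=1 loops=91)
      Cache Key: t1.id
      Hash Buckets: 128 (originally 128)
      Hits: 90  Misses: 1  Evictions: 0  Overflows: 0  Memory Usage: 1kB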

I'm open to making improvements to the comments in that area. I do
remember spending quite a bit of time trying to make things as clear
as possible, as what's going on there is fairly complex.

David


