Re: strange slow query - lost lot of time somewhere - Mailing list pgsql-hackers
From: David Rowley
Subject: Re: strange slow query - lost lot of time somewhere
Date:
Msg-id: CAApHDvp87+kCufDkiq=7n-BD8kYxgYzvQLb_GkDK0Gs9TAQYYQ@mail.gmail.com
In response to: Re: strange slow query - lost lot of time somewhere (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: strange slow query - lost lot of time somewhere
           Re: strange slow query - lost lot of time somewhere
List: pgsql-hackers
On Tue, 3 May 2022 at 11:02, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> David Rowley <dgrowleyml@gmail.com> writes:
> > On Mon, 2 May 2022 at 21:00, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> >> I found a query that is significantly slower with more memory
>
> > If it was work_mem you increased, it seems strange that the plan would
> > switch over to using a Nested Loop / Memoize plan.
>
> Yeah, there's something unexplained there.
>
> I think that the most probable explanation for the symptoms is that
> cost_memoize_rescan is computing some insane value for est_entries,
> causing ExecInitMemoize to allocate-and-zero a huge hash table,
> which ExecEndMemoize then frees again.  Neither of those steps
> gets counted into any plan node's runtime, but EXPLAIN's total
> execution time will include them.  An insane value for est_entries
> could perhaps go along with a cost misestimate that convinces the
> planner to include the memoize even though it seems pointless.

That seems pretty unlikely to me. est_entries is based on the minimum
of the expected number of total cache entries and the ndistinct value.
ndistinct cannot be insane here, as it is never going to be higher
than the number of calls, which is the row estimate from the outer
side of the join. That's 91 in both cases here. As far as I can see,
that's just going to make a table of 128 buckets (see the sketch at
the end of this mail).

See estimate_num_groups_incremental() at:

/*
 * We don't ever want to return an estimate of zero groups, as that tends
 * to lead to division-by-zero and other unpleasantness.  The input_rows
 * estimate is usually already at least 1, but clamp it just in case it
 * isn't.
 */
input_rows = clamp_row_est(input_rows);

> I spent some time studying cost_memoize_rescan, and the only
> conclusions I arrived at were that the variable names are poorly
> chosen and the comments are unhelpful.  For instance, one would
> think that est_entry_bytes is the expected size of one cache entry,
> but it seems to actually be the total space that would be occupied
> if the whole input relation were loaded into the cache.

I think you've misunderstood. It *is* the estimated size of a single
entry. I think you might be going wrong in assuming "tuples" is the
expected number of tuples from all rescans of the inner side of the
join. It's actually from a single scan. I can add a comment there to
help make that clear.

> And the est_cache_entries computation seems nonsensical; if it does
> make sense, the comment sure doesn't illuminate why.  So I am
> quite prepared to buy into the idea that cost_memoize_rescan is
> producing bogus answers, but it's hard to tell what it's coming out
> with in this example.  Too bad EXPLAIN doesn't print est_entries.

I'm wishing I'd put the initial hash table size and the final hash
table size in EXPLAIN + EXPLAIN ANALYZE now. Perhaps it's not too
late for v15 to do that so that it might help us figure things out
in the future.

I'm open to making improvements to the comments in that area. I do
remember spending quite a bit of time trying to make things as clear
as possible, as what's going on there is fairly complex.

David
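ps: For anyone wanting to see why an est_entries of 91 lands on a
128-bucket table, here's a minimal standalone sketch of the sizing
arithmetic. This is my own illustration, not the actual simplehash.h
code, and it assumes simplehash's default fill factor of 0.9; the
helper names (next_power2, bucket_count) are made up for the example.

/*
 * sketch.c -- hypothetical illustration only, not PostgreSQL source.
 * Approximates how simplehash.h picks a bucket count for a Memoize
 * hash table given est_entries, assuming a fill factor of 0.9.
 */
#include <stdint.h>
#include <stdio.h>

#define SH_FILLFACTOR 0.9

/* Round v up to the next power of two, like pg_nextpower2_64(). */
static uint64_t
next_power2(uint64_t v)
{
    uint64_t r = 1;

    while (r < v)
        r <<= 1;
    return r;
}

/* Bucket count for a table expected to hold est_entries elements. */
static uint64_t
bucket_count(uint64_t est_entries)
{
    /* Oversize the request so the table stays under the fill factor. */
    double size = (double) est_entries / SH_FILLFACTOR;

    if (size < 2)
        size = 2;
    return next_power2((uint64_t) size);
}

int
main(void)
{
    /* est_entries = 91 -> 91 / 0.9 ~= 101 -> next power of two = 128 */
    printf("91 entries -> %llu buckets\n",
           (unsigned long long) bucket_count(91));
    return 0;
}

With est_entries = 91, the request is grown to 91 / 0.9 ~= 101 and
then rounded up to the next power of two, giving the 128 buckets
mentioned above.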