Re: executor relation handling - Mailing list pgsql-hackers

From Amit Langote
Subject Re: executor relation handling
Date
Msg-id 54aeae00-51b6-05ac-91d1-296515277a3b@lab.ntt.co.jp
Whole thread Raw
In response to Re: executor relation handling  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On 2018/10/08 8:18, Tom Lane wrote:
> I wrote:
>> Still need to think a bit more about whether we want 0005 in
>> anything like its current form.
> 
> So I poked at that for a bit, and soon realized that the *main* problem
> there is that ExecFindRowMark() eats O(N^2) time due to repeated searches
> of the es_rowMarks list.

Yeah, I proposed the patch only because of stumbling across O(N^2)
behavior, but went to solve it with the planner hack, which I agree is an
inferior solution overall.

> While the patch as stated would improve that
> for cases where most of the partitions can be pruned at plan time, it
> does nothing much for cases where they can't.  However, it's pretty
> trivial to fix that: let's just use an array not a list.  Patch 0001
> attached does that.
>
> A further improvement we could consider is to avoid opening the relcache
> entries for pruned-away relations.  I could not find a way to do that
> that was less invasive than removing ExecRowMark.relation and requiring
> callers to call a new function to get the relation if they need it.
> Patch 0002 attached is a delta on top of 0001 that does that.
>
> Replicating your select-lt-for-share test case as best I can (you never
> actually specified it carefully), I find that the TPS rate on my
> workstation goes from about 250 tps with HEAD to 920 tps with patch 0001
> or 1130 tps with patch 0002.  This compares to about 1600 tps for the
> non-FOR-SHARE version of the query.
>
> However, we should keep in mind that without partitioning overhead
> (ie "select * from lt_999 where b = 999 for share"), the TPS rate
> is over 25800 tps.  Most of the overhead in the partitioned case seems
> to be from acquiring locks on rangetable entries that we won't ever
> use, and none of these patch variants are touching that problem.
> So ISTM that the *real* win for this scenario is going to come from
> teaching the system to prune unwanted relations from the query
> altogether, not just from the PlanRowMark list.

Agreed, which is why I mentioned the other patch [1], which gets us closer
to that goal.

> Keeping that comparison in mind, I'm inclined to think that 0001
> is the best thing to do for now.  The incremental win from 0002
> is not big enough to justify the API break it creates, while your
> 0005 is not really attacking the problem the right way.

I agree that 0002's improvement is only incremental and would lose its
appeal if we're able to solve the bigger problem of removing range table
and other overhead when planning with large number of partitions, but that
might take a while.

Thanks,
Amit

[1] https://commitfest.postgresql.org/20/1778/




pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: out-of-order XID insertion in KnownAssignedXids
Next
From: Amit Langote
Date:
Subject: Re: partition tree inspection functions