Re: [PoC] Reducing planning time when tables have many partitions - Mailing list pgsql-hackers

From David Rowley
Subject Re: [PoC] Reducing planning time when tables have many partitions
Date
Msg-id CAApHDvrDe6OQWOabbcaDrQ=+Y6m3pbQP1iuuU1WpE_hpymsHTg@mail.gmail.com
Whole thread Raw
In response to Re: [PoC] Reducing planning time when tables have many partitions  (Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>)
Responses Re: [PoC] Reducing planning time when tables have many partitions
Re: [PoC] Reducing planning time when tables have many partitions
List pgsql-hackers
On Tue, 8 Apr 2025 at 17:41, Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
> Thanks for listing all the patterns. Creating four different iterators
> is going to affect functionality and might require duplicate code. But
> each of the patterns is using exactly one BMS operation on em_relids
> and relids being used as search criteria. That BMS operation/function
> pointer can be passed to the iterator. I have not looked into whether
> each of those BMS functions return boolean or not OR whether all the
> functions take arguments in the same order. But, those things can be
> fixed. However, given that the feature freeze deadline is so close,
> it's fine to do it later either during the beta phase or in PG 19. The
> speed up would be small enough to be noticeable in PG 18 given this
> and other improvements that have gone in.

I'll be happy to see further speedups proposed here. I doubt we'll see
much in the same league as this one, but a % here and there are often
welcome, providing the added complexity is acceptable.

> I think the distinction between parent and child is useful. So I will
> still suggest renaming the field but it can be done
> post-feature-freeze. Tom seems to be fine with that per his last email
> on the thread. If we do earlier in the beta cycle like in April
> itself, that will give enough time for the extension authors to adjust
> their code, if necessary.

That's fine for me. I did grep the Citus codebase earlier to look for
ec_member usages and I did see one. So, let's see who complains first.

> Attached diff also brings ec_childmembers_size closer to
> ec_childmembers - usual practice of keeping the array and its size
> together.

The reason I put it where it was is because it was filling a 4-byte
hole in the struct. I'd not tested any performance with it located in
a place that would cause the struct to be enlarged. IMO, we should be
allowed some flexibility here for such small struct. I don't think
having it 1 field up is confusing. We could swap ec_members and
ec_childmembers positions if you feel strongly.

I've pushed the patch now. Thanks for all the reviews of my adjustments.

Thanks to Yuya for persisting on this for so many years. I was
impressed with that persistence and also with your very detailed and
easy to understand performance benchmark results.  This feels like
(all going well) it's making v18 is big win for the big partitioning
users of Postgres. I kind of feel the "up to a few thousand partitions
fairly well" in [1] has been abused by many or taken without the
caveat of "the query planner to prune all but a small number of
partitions" over the years and the "Never just assume that more
partitions are better than fewer partitions, nor vice-versa." has been
ignored by too many. It's good that the people pushing these limits
will no longer be getting as big an unwelcome surprise now (I hope).
Or, at least, onto the next bottleneck. Maybe relcache :)

I've attached a dump of the performance tests of v42 I did an hour or
so ago with up to 8K partitions. That's where I got the 60x numbers I
hinted at in the commit message.

David

[1] https://www.postgresql.org/docs/current/ddl-partitioning.html#DDL-PARTITIONING-DECLARATIVE-BEST-PRACTICES

Attachment

pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: [PATCH] clarify palloc comment on quote_literal_cstr
Next
From: Richard Guo
Date:
Subject: Re: An incorrect check in get_memoize_path