Re: [PoC] Reducing planning time when tables have many partitions - Mailing list pgsql-hackers

From	David Rowley
Subject	Re: [PoC] Reducing planning time when tables have many partitions
Date	August 9, 2022 07:10:00
Msg-id	CAApHDvoWaCEOA2PbsBWoF95QFo1W8-8O-QW99ZYf-3S18ja6pg@mail.gmail.com
In response to	Re: [PoC] Reducing planning time when tables have many partitions (Yuya Watari <watari.yuya@gmail.com>)
Responses	Re: [PoC] Reducing planning time when tables have many partitions
List	pgsql-hackers

On Mon, 8 Aug 2022 at 23:28, Yuya Watari <watari.yuya@gmail.com> wrote:
> If you have already applied David's patch, please start the 'git am'
> command from 0002-Fix-bugs.patch. All regression tests passed with
> this patch on my environment.

Thanks for fixing those scope bugs.

Regarding the 0002 patch, you have:

+ * TODO: "bms_add_members(ec1->ec_member_indexes, ec2->ec_member_indexes)"
+ * did not work to combine two EquivalenceClasses. This is probably because
+ * the order of the EquivalenceMembers is different from the previous
+ * implementation, which added the ec2's EquivalenceMembers to the end of
+ * the list.

As far as I can see, the reason the code that I wrote caused the
following regression test failure:

-         Index Cond: ((ff = '42'::bigint) AND (ff = '42'::bigint))
+         Index Cond: (ff = '42'::bigint)

was down to how generate_base_implied_equalities_const() marks the EC
as ec_broken = true without any regard to cleaning up the work it has
already partially completed.

Because the loop inside generate_base_implied_equalities_const() just
breaks as soon as we're unable to find a valid equality operator for
the two given types, with my version, since the order of the
EquivalenceMembers has effectively changed, we just discover the EC is broken
before we call process_implied_equality() ->
distribute_restrictinfo_to_rels(). In the code you've added, the
EquivalenceMembers are effectively still in the original order and the
process_implied_equality() -> distribute_restrictinfo_to_rels() gets
done before we discover the broken EC. The same qual is just added
again during generate_base_implied_equalities_broken(), which is why
the plan has a duplicate ff=42.
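
To make the ordering effect concrete, here is a toy model of that loop. It is
only a sketch and not PostgreSQL code: ToyMember, toy_distribute_const() and
toy_total_quals() are hypothetical names, and which member lacks an equality
operator is an assumption made for illustration. It tracks a single
"member = const" source qual.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an EquivalenceMember. It only records whether
 * an equality operator against the constant can be found for it. */
typedef struct ToyMember
{
    const char *name;
    bool        has_eq_operator;
} ToyMember;

/* Mimics the shape of the loop described above: one qual is distributed
 * per member, and the loop breaks at the first member with no operator.
 * Returns how many quals were distributed before the break. */
static int
toy_distribute_const(const ToyMember *members, int n, bool *broken)
{
    int distributed = 0;

    assert(n >= 0);
    *broken = false;
    for (int i = 0; i < n; i++)
    {
        if (!members[i].has_eq_operator)
        {
            *broken = true;     /* quals distributed so far are not undone */
            break;
        }
        distributed++;
    }
    return distributed;
}

/* If the EC turned out to be broken, the fallback path adds the source
 * quals again on top of whatever was distributed before the break. */
static int
toy_total_quals(const ToyMember *members, int n, int n_source_quals)
{
    bool broken;
    int  distributed = toy_distribute_const(members, n, &broken);

    return broken ? distributed + n_source_quals : distributed;
}
```

With the members ordered {ff, x}, where x is the one without an operator, one
qual is distributed before the break and the fallback adds it a second time.
With the order {x, ff} the loop breaks immediately and only the fallback copy
remains.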

This is all just down to the order in which the ECs are merged. If you'd
just swapped the order of the items in the query's WHERE clause to
become:

  where ec1.ff = 42::int8 and ss1.x = ec1.f1 and ec1.ff = ec1.f1;

then my version would keep the duplicate qual. With the code as you've
changed it, the planner would not have produced the duplicate ff=42
qual if you'd written the WHERE clause as follows:

  where ss1.x = ec1.f1 and ec1.ff = ec1.f1 and ec1.ff = 42::int8;

In short, I think the code I had for that was fine and it's just the
expected plan that you should be editing. If we wanted this
behaviour to be consistent then the fix should be to make
generate_base_implied_equalities_const() better at only distributing
the quals down to the relations after it has discovered that the EC is
not broken, or at least cleaning up the partial work that it's done if
it discovers a broken EC. The former seems better to me, but I doubt
that it matters too much as broken ECs should be pretty rare and it
does not seem worth spending too much effort making this work better.
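
For illustration only, the former option could look roughly like the
following two-phase toy. Again this is a sketch and not a patch: SketchMember,
sketch_ec_is_valid() and sketch_distribute_const_two_phase() are hypothetical
names, and the real function would need to do the operator lookups and
process_implied_equality() calls where this model only sets flags and counts.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an EquivalenceMember (not PostgreSQL's struct). */
typedef struct SketchMember
{
    const char *name;
    bool        has_eq_operator;
} SketchMember;

/* Phase 1: check every member before touching any relation. */
static bool
sketch_ec_is_valid(const SketchMember *members, int n)
{
    for (int i = 0; i < n; i++)
    {
        if (!members[i].has_eq_operator)
            return false;
    }
    return true;
}

/* Phase 2: distribute quals only once the whole EC is known to be fine.
 * Returns the number of quals distributed, which is 0 for a broken EC
 * regardless of where the problem member sits in the list. */
static int
sketch_distribute_const_two_phase(const SketchMember *members, int n,
                                  bool *broken)
{
    assert(n >= 0);
    *broken = !sketch_ec_is_valid(members, n);
    if (*broken)
        return 0;               /* nothing distributed, nothing to clean up */
    return n;                   /* one qual per member in this toy model */
}
```

The point of the split is that the result no longer depends on member order:
a broken EC distributes nothing, so the fallback path cannot produce a
duplicate qual.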

I've not had a chance to look at the 0003 patch yet.

David