Re: Creating foreign key on partitioned table is too slow - Mailing list pgsql-hackers

From David Rowley
Subject Re: Creating foreign key on partitioned table is too slow
Date
Msg-id CAKJS1f8v-fUG8YpaAGj309ZuALo3aEk7f6cqMHr_AVwz1fKXug@mail.gmail.com
Whole thread Raw
In response to Re: Creating foreign key on partitioned table is too slow  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: Creating foreign key on partitioned table is too slow  (Andres Freund <andres@anarazel.de>)
Re: Creating foreign key on partitioned table is too slow  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Thu, 31 Oct 2019 at 07:30, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>
> On Thu, Oct 24, 2019 at 04:28:38PM -0700, Andres Freund wrote:
> >Hi,
> >
> >On 2019-10-23 05:59:01 +0000, kato-sho@fujitsu.com wrote:
> >> To benchmark with tpcb model, I tried to create a foreign key in the partitioned history table, but backend
processkilled by OOM.
 
> >> the number of partitions is 8192. I tried in master(commit: ad4b7aeb84).
> >
> >Obviously this should be improved. But I think it's also worthwhile to
> >note that using 8k partitions is very unlikely to be a good choice for
> >anything. The metadata, partition pruning, etc overhead is just going to
> >be very substantial.
> >
>
> True. Especially with two partitioned tables, each with 8k partitions.

In Ottawa this year, Andres and I briefly talked about the possibility
of making a series of changes to how equalfuncs.c works. The idea was
to make it easy by using some pre-processor magic to allow us to
create another version of equalfuncs which would let us have an equal
comparison function that returns -1 / 0 / +1 rather than just true or
false.  This would allow us to Binary Search Trees of objects. I
identified that EquivalenceClass.ec_members would be better written as
a BST to allow much faster lookups in get_eclass_for_sort_expr().

The implementation I had in mind for the BST was a compact tree that
instead of using pointers for the left and right children, it just
uses an integer to reference the array element number.  This would
allow us to maintain very fast insert-order traversals.  Deletes would
need to decrement all child references greater than the deleted index.
This is sort of on-par with how the new List implementation in master.
i.e deletes take additional effort, but inserts are fast if there's
enough space in the array for a new element, traversals are
cache-friendly, etc.  I think trees might be better than hash tables
for this as a hash function needs to hash all fields, whereas a
comparison function can stop when it finds the first non-match.

This may also be able to help simplify the code in setrefs.c to get
rid of the complex code around indexed tlists. tlist_member() would
become O(log n) instead of O(n), so perhaps there'd be not much point
in having both search_indexed_tlist_for_var() and
search_indexed_tlist_for_non_var().

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



pgsql-hackers by date:

Previous
From: Peter Geoghegan
Date:
Subject: Re: Thoughts on nbtree with logical/varwidth table identifiers, v12on-disk representation
Next
From: Andres Freund
Date:
Subject: Re: Creating foreign key on partitioned table is too slow