Re: Table partition with primary key in 11.3 - Mailing list pgsql-general

From Alvaro Herrera
Subject Re: Table partition with primary key in 11.3
Date
Msg-id 20190607213532.GA26907@alvherre.pgsql
In response to Re: Table partition with primary key in 11.3  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: Table partition with primary key in 11.3
List pgsql-general
On 2019-Jun-07, Peter Geoghegan wrote:

> On Fri, Jun 7, 2019 at 1:22 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

> > Because you can't rely on that exclusively, and you want to reuse the
> > partition ID eventually, you still need a cleanup process that removes
> > those remaining index entries.  This cleanup process is a background
> > process, so it doesn't affect latency.  I think it's not a good idea to
> > add latency to clients in order to optimize a background process.
> 
> Ordinarily I would agree, but we're talking about something that takes
> place at the point that we're just about to split the page, that will
> probably make the page split unnecessary when we can reclaim as few as
> one or two tuples. A page split is already a very expensive thing by
> any measure, and something that can rarely be "undone", so avoiding
> them entirely is very compelling.

Sorry, I confused your argument with mine.  I agree that removing
entries to try to prevent a page split is worth doing.
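
To make the idea concrete, here is a minimal sketch in C of "reclaim
before you split".  It is not PostgreSQL code: IndexPage, IndexEntry,
PAGE_CAPACITY and the is_dropped array are all made up for illustration,
and the page just appends entries instead of keeping keys ordered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PAGE_CAPACITY 4         /* tiny on purpose; hypothetical value */

typedef struct
{
    int         partition_id;   /* which partition the entry points into */
    int         key;
} IndexEntry;

typedef struct
{
    IndexEntry  entries[PAGE_CAPACITY];
    size_t      nentries;
} IndexPage;

typedef enum { INSERT_OK, INSERT_NEEDS_SPLIT } InsertResult;

/*
 * Remove entries that belong to dropped partitions and return how many
 * were removed.  is_dropped must have a slot for every partition ID
 * that can appear on the page.
 */
static size_t
page_reclaim_dropped(IndexPage *page, const bool *is_dropped)
{
    size_t      kept = 0;
    size_t      removed;

    for (size_t i = 0; i < page->nentries; i++)
    {
        if (!is_dropped[page->entries[i].partition_id])
            page->entries[kept++] = page->entries[i];
    }
    removed = page->nentries - kept;
    page->nentries = kept;
    return removed;
}

/*
 * Insert an entry.  Only when the page is full do we pay for the
 * reclaim pass, and only if that frees nothing do we report that a
 * split is needed.
 */
static InsertResult
page_insert(IndexPage *page, IndexEntry entry, const bool *is_dropped)
{
    assert(page->nentries <= PAGE_CAPACITY);

    if (page->nentries == PAGE_CAPACITY &&
        page_reclaim_dropped(page, is_dropped) == 0)
        return INSERT_NEEDS_SPLIT;

    page->entries[page->nentries++] = entry;
    return INSERT_OK;
}
```

The reclaim pass runs only at the moment the split would otherwise
happen, so freeing even a single entry is enough to avoid it.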

> > This way, when a partition is dropped, we have to take the time to scan
> > all global indexes; when they've been scanned we can remove the catalog
> > entry, and at that point the partition ID becomes available to future
> > partitions.
> 
> It seems worth recycling partition IDs, but it should be possible to
> delay that for a very long time if necessary. Ideally, users wouldn't
> have to bother with it when they have really huge global indexes.

I envision this happening automatically: you drop the partition, a
persistent work item is registered, and autovacuum takes care of it
whenever it gets to it.  The user doesn't have to do anything about it.
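
Roughly, the life cycle of a partition ID could look like the sketch
below.  This is only an illustration in C; the state names, the fixed-size
array and the function names are all hypothetical, and nothing here is an
existing PostgreSQL API.

```c
#include <assert.h>

#define MAX_PARTITION_IDS 8     /* tiny on purpose; hypothetical value */

typedef enum
{
    PARTID_FREE,                /* available to a new partition */
    PARTID_IN_USE,              /* owned by a live partition */
    PARTID_PENDING_CLEANUP      /* dropped; global indexes not yet scanned */
} PartIdState;

/* Static storage is zero-initialized, so every ID starts out FREE. */
static PartIdState partid_state[MAX_PARTITION_IDS];

/* Hand out the lowest free ID, or -1 if none is available. */
static int
partid_allocate(void)
{
    for (int id = 0; id < MAX_PARTITION_IDS; id++)
    {
        if (partid_state[id] == PARTID_FREE)
        {
            partid_state[id] = PARTID_IN_USE;
            return id;
        }
    }
    return -1;
}

/*
 * Dropping a partition is cheap for the client: it only records the
 * pending work item.  The ID cannot be reused yet, because index
 * entries carrying it may still exist.
 */
static void
partid_drop(int id)
{
    assert(id >= 0 && id < MAX_PARTITION_IDS);
    assert(partid_state[id] == PARTID_IN_USE);
    partid_state[id] = PARTID_PENDING_CLEANUP;
}

/*
 * Called by the background process once every global index has been
 * scanned for this ID.  Only now does the ID become reusable.
 */
static void
partid_cleanup_done(int id)
{
    assert(id >= 0 && id < MAX_PARTITION_IDS);
    assert(partid_state[id] == PARTID_PENDING_CLEANUP);
    partid_state[id] = PARTID_FREE;
}
```

The point of the intermediate state is that the drop itself returns
immediately, while reuse of the ID waits for the background scan.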

> You don't have to be Claude Shannon to realize that it's kind of silly
> to reserve 16 bits for the offset number component of a
> TID/ItemPointer. We need to continue to support offset numbers that go
> that high, but the implementation would optimize for the common case
> where offset numbers are less than 512 (or maybe less than 1024).

(In many real-world cases, offset numbers on typical pages need fewer
than 7 bits, even.)
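
As an illustration only, one way to spend fewer bits on small offset
numbers while still supporting the full 16-bit range is a byte-oriented
variable-length encoding.  The sketch below is not PostgreSQL's on-disk
format and not what anyone in this thread proposed in detail;
offset_encode and offset_decode are hypothetical names.  With 7 payload
bits per byte, offsets below 128 take one byte, offsets below 16384 take
two, and anything larger takes three.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Encode a 16-bit offset number into buf (at least 3 bytes).
 * Each byte carries 7 payload bits, least significant group first;
 * a set high bit means "another byte follows".
 * Returns the number of bytes written (1, 2 or 3).
 */
static size_t
offset_encode(uint16_t off, uint8_t *buf)
{
    size_t      n = 0;

    while (off >= 0x80)
    {
        buf[n++] = (uint8_t) ((off & 0x7F) | 0x80);
        off >>= 7;
    }
    buf[n++] = (uint8_t) off;
    return n;
}

/*
 * Decode an offset number written by offset_encode.
 * Returns the number of bytes consumed.
 */
static size_t
offset_decode(const uint8_t *buf, uint16_t *off)
{
    uint32_t    value = 0;
    size_t      n = 0;
    int         shift = 0;

    for (;;)
    {
        uint8_t     b = buf[n++];

        value |= (uint32_t) (b & 0x7F) << shift;
        if ((b & 0x80) == 0)
            break;
        shift += 7;
        assert(n < 3);          /* a 16-bit value never needs a 4th byte */
    }
    *off = (uint16_t) value;
    return n;
}
```

A scheme tuned for thresholds such as 512 or 1024 would need 9 or 10
payload bits in the short form, so it would not be byte-aligned like
this one; the trade-off between the two is the same either way.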

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services