Re: PITR, checkpoint, and local relations - Mailing list pgsql-hackers

From Tom Lane
Subject Re: PITR, checkpoint, and local relations
Msg-id 6107.1028426472@sss.pgh.pa.us
In response to Re: PITR, checkpoint, and local relations  (Bruce Momjian <pgman@candle.pha.pa.us>)
Responses Re: PITR, checkpoint, and local relations  (Bruce Momjian <pgman@candle.pha.pa.us>)
Re: PITR, checkpoint, and local relations  (Greg Copeland <greg@CopelandConsulting.Net>)
Re: PITR, checkpoint, and local relations  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> There is debate on whether the local buffers are even valuable
> considering the headache they cause in other parts of the system.

More specifically, the issue is that when (if) you commit, the contents
of the new table now have to be pushed out to shared storage.  This is
moderately annoying in itself (among other things, it implies fsync'ing
those tables before commit).  But the real reason it comes up now is
that the proposed PITR scheme can't cope gracefully with tables that
are suddenly there but weren't participating in checkpoints before.

It looks to me like we should stop using local buffers for ordinary
tables that happen to be in their first transaction of existence.
But, per Vadim's suggestion, we shouldn't abandon the local buffer
manager altogether.  What we could and should use it for is TEMP tables,
which have no need to be checkpointed or WAL-logged or fsync'd or
accessible to other backends *ever*.  Also, a temp table can leave
blocks in local buffers across transactions, which makes local buffers
considerably more useful than they are now.

If temp tables didn't use the shared bufmgr, and updates to them didn't
get WAL-logged, they'd be noticeably more efficient than plain tables, which
IMHO would be a Good Thing.  Such tables would be essentially invisible
to WAL and PITR (at least their contents would be --- I assume we'd
still log file creation and deletion).  But I can't see anything wrong
with that.

In short, the proposal runs something like this:

* Regular tables that happen to be in their first transaction of
existence are not treated differently from any other regular table so
far as buffer management or WAL or PITR go.  (rd_myxactonly either goes
away or is used for much less than it is now.)

* TEMP tables use the local buffer manager for their entire existence.
(This probably means adding an "rd_istemp" flag to relcache entries, but
I can't see anything wrong with that.)

* Local bufmgr semantics are twiddled to reflect this reality --- in
particular, data in local buffers can be held across transactions, and
there is no end-of-transaction write (much less fsync).  A TEMP table
that isn't too large might never touch disk at all.

* Data operations in TEMP tables do not get WAL-logged, nor do we
WAL-log page images of local-buffer pages.
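
To make the four points above concrete, here is a rough sketch of the
decision they imply. It is not bufmgr code: RelSketch, BufferPolicy, and
buffer_policy_for are names invented for illustration, and only
rd_istemp and rd_myxactonly come from the proposal. The assumption that
file creation and deletion stay WAL-logged for temp tables is carried
over from the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a relcache entry. */
typedef struct RelSketch
{
    bool rd_istemp;     /* proposed flag: relation is a TEMP table */
    bool rd_myxactonly; /* created in the current transaction */
} RelSketch;

/* Hypothetical summary of how a relation's pages are handled. */
typedef struct BufferPolicy
{
    bool use_local_buffers;          /* local bufmgr instead of shared */
    bool wal_log_data;               /* WAL-log data changes and page images */
    bool wal_log_file_create_drop;   /* assumed to stay logged for all tables */
    bool participates_in_checkpoint; /* pages are flushed by checkpoints */
    bool write_at_xact_end;          /* forced write when the xact ends */
    bool keep_across_xacts;          /* buffers survive transaction end */
} BufferPolicy;

BufferPolicy
buffer_policy_for(const RelSketch *rel)
{
    BufferPolicy p;

    assert(rel != NULL);

    /*
     * Only rd_istemp matters.  rd_myxactonly is deliberately ignored:
     * a regular table in its first transaction of existence is treated
     * like any other regular table.
     */
    p.use_local_buffers = rel->rd_istemp;
    p.wal_log_data = !rel->rd_istemp;
    p.wal_log_file_create_drop = true;
    p.participates_in_checkpoint = !rel->rd_istemp;
    p.write_at_xact_end = false;
    p.keep_across_xacts = rel->rd_istemp;
    return p;
}
```

A brand-new regular table (rd_myxactonly set, rd_istemp clear) gets the
same policy as an old one, which is the property PITR needs.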


These changes seem very attractive to me even without regard for making
the world safer for PITR.  I'm willing to volunteer to make them happen,
if there are no objections.
        regards, tom lane