Re: POC: Cleaning up orphaned files using undo logs - Mailing list pgsql-hackers

From Robert Haas
Subject Re: POC: Cleaning up orphaned files using undo logs
Date
Msg-id CA+TgmoZD2YcWV1E+kLShbnSG7UUvybDkAyFm9K2GA9g-ai0Ajg@mail.gmail.com
Whole thread Raw
In response to Re: POC: Cleaning up orphaned files using undo logs  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: POC: Cleaning up orphaned files using undo logs
Re: POC: Cleaning up orphaned files using undo logs
List pgsql-hackers
On Tue, Jun 18, 2019 at 2:07 PM Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Jun 18, 2019 at 7:31 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > [ new patches ]
>
> I tried writing some code [ to use these patches ].

I spent some more time experimenting with this patch set today and I
think that the UndoFetchRecord interface is far too zheap-centric.  I
expected that I would be able to do this:

UnpackedUndoRecord *uur = UndoFetchRecord(urp);
// do stuff with uur
UndoRecordRelease(uur);

But I can't, because the UndoFetchRecord API requires me to pass not
only an undo record but also a block number, an offset number, an XID,
and a callback.  I think I could get the effect that I want by
defining a callback that always returns true.  Then I could do:

UndoRecPtr junk;
UnpackedUndoRecord *uur = UndoFetchRecord(urp, InvalidBlockNumber,
InvalidOffsetNumber, &junk, always_returns_true);
// do stuff with uur
UndoRecordRelease(uur);

That seems ridiculously baroque.  I think the most common thing that
an AM will want to do with an UndoRecPtr is look up that exact record;
that is, for example, what zedstore will want to do. However, even if
some AMs, like zheap, want to search backward through a chain of
records, there's no real reason to suppose that all of them will want
to search by block number + offset.  They might want to search by some
bit of data buried in the payload, for example.

I think the basic question here is whether we really need anything
more complicated than:

extern UnpackedUndoRecord *UndoFetchRecord(UndoRecPtr urp);

I mean, if you had that, the caller can implement looping easily
enough, and insert any test they want:

for (;;)
{
    UnpackedUndoRecord *uur = UndoFetchRecord(urp);
    if (i like this one)
        break;
    urp = uur->uur_blkprev; // should be renamed, since zedstore +
probably others will have tuple chains not block chains
    UndoRecordRelease(uur);
}

The question in my mind is whether there's some performance advantage
of having the undo layer manage the looping rather than the caller do
it.  If there is, then there's a lot of zheap code that ought to be
changed to use it, because it's just using the same satisfies-callback
everywhere.  If there's not, we should just simplify the undo record
lookup along the lines mentioned above and put all the looping into
the callers.  zheap could provide a wrapper around UndoFetchRecord
that does a search by block and offset, so that we don't have to
repeat that logic in multiple places.

BTW, an actually generic iterator interface would probably look more like this:

typedef bool (*SatisfyUndoRecordCallback)(void *callback_data,
UnpackedUndoRecord *uur);
extern UnpackedUndoRecord *UndoFetchRecord(UndoRecPtr urp, UndoRecPtr
*found, SatisfyUndoRecordCallback callback, void *callback_data);

Now we're not assuming anything about what parts of the record the
callback wants to examine.  It can do whatever it likes.  Typically
with this sort of interface a caller will define a file-private struct
that is known to both the callback and the caller of UndoFetchRecord,
but not elsewhere.

If we decide we need an iterator within the undo machinery itself,
then I think it should look like the above, and I think it should
accept NULL for found, callback, and callback_data, so that somebody
who wants to just look up a record, full stop, can do just:

UnpackedUndoRecord *uur = UndoFetchRecord(urp, NULL, NULL, NULL);

which seems tolerable.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Pavel Stehule
Date:
Subject: Re: idea: log_statement_sample_rate - bottom limit for sampling
Next
From: Robert Haas
Date:
Subject: Re: POC: Cleaning up orphaned files using undo logs