Re: 2pc leaks fds - Mailing list pgsql-hackers

From Andres Freund
Subject Re: 2pc leaks fds
Date
Msg-id 20200422183031.rpz7mziiyd5zvjad@alap3.anarazel.de
Whole thread Raw
In response to Re: 2pc leaks fds  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Responses Re: 2pc leaks fds  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Re: 2pc leaks fds  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List pgsql-hackers
On 2020-04-22 13:57:54 -0400, Alvaro Herrera wrote:
> Concretely, I propose to have a new struct like
> 
> typedef struct xlogReaderFuncs
> {
>     XLogPageReadCB read_page;
>     XLogSegmentOpenCB open_segment;
>     XLogSegmentCloseCB open_segment;
> } xlogReaderFuncs;
> 
> #define XLOGREADER_FUNCS(...) &(xlogReaderFuncs){__VA_ARGS__}

Not sure I quite see the point of that helper macro...

> and then invoke it something like
> 
>     xlogreader = XLogReaderAllocate(wal_segment_size, NULL,
>                                     XLOGREADER_FUNCS(.readpage = &read_local_xlog_page,
>                                               .opensegment = &wal_segment_open),
>                                               .closesegment = &wal_segment_close),
>                                     NULL);
> 
> (with suitable definitions for XLogSegmentOpenCB etc) so that the
> support functions are all available at the xlogreader level, instead of
> "open" being buried at the read-page level.  Any additional support
> functions can be added easily.
> 
> This would give xlogreader a simpler interface.

My first reaction was that this looks like it'd make it harder to read
WAL from memory. But that's not really a problem, since
opensegment/closesegment don't have to do anything.

I think reducing the levels of indirection around xlogreader would be a
good idea. The control flow currently is *really* complicated: With the
page read callback at the xlogreader level, as well as separate
callbacks set from within the page read callback and passed to
WALRead(). And even though the WALOpenSegment, WALSegmentContext are
really private to WALRead, not XLogReader as a whole, they are members
of XLogReaderState.  I think the PG13 changes made it considerably
harder to understand xlogreader / xlogreader using code.

Note that the WALOpenSegment callback currently does not have access to
XLogReaderState->private_data, which I think is a pretty significant new
restriction. Afaict it's not nicely possible anymore to have two
xlogreaders inside the the same process that read from different data
directories or other cases where opening the segment requires context
information.

> If people like this, I could make this change for pg13 and avoid
> changing the API again in pg14.

I'm in favor of doing so. Not necessarily primarily to avoid repeated
API changes, but because I don't think the v13 changes went in the quite
right direction.

ISTM that we should:
- have the three callbacks you mention above
- change WALSegmentOpen to also get the XLogReaderState
- add private state to WALOpenSegment, so it can be used even when not
  accessing data in files / when one needs more information to close the
  file.
- disambiguate between WALOpenSegment (struct describing an open
  segment) and WALSegmentOpen (callback to open a segment) (note that
  the read page callback uses a *CB naming, why not follow?)

Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: design for parallel backup
Next
From: Andres Freund
Date:
Subject: Re: More efficient RI checks - take 2