Re: Non-replayable WAL records through overflows and >MaxAllocSize lengths - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Non-replayable WAL records through overflows and >MaxAllocSize lengths
Date
Msg-id 7e44dc64-c6e3-fd4e-4c05-ad36555f57f5@iki.fi
Whole thread Raw
In response to Non-replayable WAL records through overflows and >MaxAllocSize lengths  (Matthias van de Meent <boekewurm+postgres@gmail.com>)
Responses Re: Non-replayable WAL records through overflows and >MaxAllocSize lengths
Re: Non-replayable WAL records through overflows and >MaxAllocSize lengths
List pgsql-hackers
On 11/03/2022 17:42, Matthias van de Meent wrote:
> Hi,
> 
> Xlogreader limits the size of what it considers valid xlog records to
> MaxAllocSize; but this is not currently enforced in the
> XLogRecAssemble API. This means it is possible to assemble a record
> that postgresql cannot replay.

Oops, that would be nasty.

> Similarly; it is possible to repeatedly call XlogRegisterData() so as
> to overflow rec->xl_tot_len; resulting in out-of-bounds reads and
> writes while processing record data;

And that too.

Have you been able to create a test case for that? The largest record I 
can think of is a commit record with a huge number of subtransactions, 
dropped relations, and shared inval messages. I'm not sure if you can 
overflow a uint32 with that, but exceeding MaxAllocSize seems possible.

> PFA a patch that attempts to fix both of these issues in the insertion
> API; by checking against overflows and other incorrectly large values
> in the relevant functions in xloginsert.c. In this patch, I've also
> added a comment to the XLogRecord spec to document that xl_tot_len
> should not be larger than 1GB - 1B; and why that limit exists.
> diff --git a/src/backend/access/transam/xloginsert.c b/src/backend/access/transam/xloginsert.c
> index c260310c4c..ae654177de 100644
> --- a/src/backend/access/transam/xloginsert.c
> +++ b/src/backend/access/transam/xloginsert.c
> @@ -342,6 +342,11 @@ XLogRegisterData(char *data, int len)
>  
>      if (num_rdatas >= max_rdatas)
>          elog(ERROR, "too much WAL data");
> +
> +    /* protect against overflow */
> +    if (unlikely((uint64) mainrdata_len + (uint64) len > UINT32_MAX))
> +        elog(ERROR, "too much WAL data");
> +
>      rdata = &rdatas[num_rdatas++];
>  
>      rdata->data = data;

Could check for just AllocSizeValid(mainrdata_len), if we're only 
worried about the total size of the data to exceed the limit, and assume 
that each individual piece of data is smaller.

We also don't check for negative 'len'. I think that's fine, the caller 
bears some responsibility for passing valid arguments too. But maybe 
uint32 or size_t would be more appropriate here.

I wonder if these checks hurt performance. These are very cheap, but 
then again, this codepath is very hot. It's probably fine, but it still 
worries me a little. Maybe some of these could be Asserts.

> @@ -387,6 +392,11 @@ XLogRegisterBufData(uint8 block_id, char *data, int len)
>  
>      if (num_rdatas >= max_rdatas)
>          elog(ERROR, "too much WAL data");
> +
> +    /* protect against overflow */
> +    if (unlikely((uint64) regbuf->rdata_len + (uint64) len > UINT32_MAX))
> +        elog(ERROR, "too much WAL data");
> +
>      rdata = &rdatas[num_rdatas++];
>  
>      rdata->data = data;

Could check "len > UINT16_MAX". As you noted in XLogRecordAssemble, 
that's the real limit. And if you check for that here, you don't need to 
check it in XLogRecordAssemble.

> @@ -505,7 +515,7 @@ XLogRecordAssemble(RmgrId rmid, uint8 info,
>                     XLogRecPtr *fpw_lsn, int *num_fpi, bool *topxid_included)
>  {
>      XLogRecData *rdt;
> -    uint32        total_len = 0;
> +    uint64        total_len = 0;
>      int            block_id;
>      pg_crc32c    rdata_crc;
>      registered_buffer *prev_regbuf = NULL;

I don't think the change to uint64 is necessary. If all the data blocks 
are limited to 64 kB, and the number of blocks is limited, and the 
number of blocks is limited too.

> @@ -734,6 +744,10 @@ XLogRecordAssemble(RmgrId rmid, uint8 info,
>  
>          if (needs_data)
>          {
> +            /* protect against overflow */
> +            if (unlikely(regbuf->rdata_len > UINT16_MAX))
> +                elog(ERROR, "too much WAL data for registered buffer");
> +
>              /*
>               * Link the caller-supplied rdata chain for this buffer to the
>               * overall list.
> @@ -836,6 +850,13 @@ XLogRecordAssemble(RmgrId rmid, uint8 info,
>      for (rdt = hdr_rdt.next; rdt != NULL; rdt = rdt->next)
>          COMP_CRC32C(rdata_crc, rdt->data, rdt->len);
>  
> +    /*
> +     * Ensure that xlogreader.c can read the record; and check that we don't
> +     * accidentally overflow the size of the record.
> +     * */
> +    if (unlikely(!AllocSizeIsValid(total_len) || total_len > UINT32_MAX))
> +        elog(ERROR, "too much registered data for WAL record");
> +
>      /*
>       * Fill in the fields in the record header. Prev-link is filled in later,
>       * once we know where in the WAL the record will be inserted. The CRC does

It's enough to check AllocSizeIsValid(total_len), no need to also check 
against UINT32_MAX.

- Heikki



pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: pg_walinspect - a new extension to get raw WAL data and WAL stats
Next
From: Melanie Plageman
Date:
Subject: Issue with pg_stat_subscription_stats