Re: Non-replayable WAL records through overflows and >MaxAllocSize lengths - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Non-replayable WAL records through overflows and >MaxAllocSize lengths
Date
Msg-id 20221202170913.rhhryfx5eosvjcib@awork3.anarazel.de
Whole thread Raw
In response to Re: Non-replayable WAL records through overflows and >MaxAllocSize lengths  (Matthias van de Meent <boekewurm+postgres@gmail.com>)
List pgsql-hackers
Hi,

On 2022-07-26 18:58:02 +0200, Matthias van de Meent wrote:
> - updated the MaxXLogRecordSize and XLogRecordLengthIsValid(len)
> macros (now in xlogrecord.h), with a max length of the somewhat
> arbitrary 1020MiB.
>  This leaves room for approx. 4MiB of per-record allocation overhead
> before you'd hit MaxAllocSize, and also detaches the dependency on
> memutils.h.
> 
> - Retained the check in XLogRegisterData, so that we check against
> integer overflows in the registerdata code instead of only an assert
> in XLogRecordAssemble where it might be too late.
> - Kept the inline static elog-ing function (as per Andres' suggestion
> on 2022-03-14; this decreases binary sizes)

I don't think it should be a static inline. It should to be a *non* inlined
function, so we don't include the code for the elog in the callers.


> +/*
> + * Error due to exceeding the maximum size of a WAL record, or registering
> + * more datas than are being accounted for by the XLog infrastructure.
> + */
> +static inline void
> +XLogErrorDataLimitExceeded()
> +{
> +    elog(ERROR, "too much WAL data");
> +}

I think this should be pg_noinline, as mentioned above.


>  /*
>   * Begin constructing a WAL record. This must be called before the
>   * XLogRegister* functions and XLogInsert().
> @@ -348,14 +359,29 @@ XLogRegisterBlock(uint8 block_id, RelFileLocator *rlocator, ForkNumber forknum,
>   * XLogRecGetData().
>   */
>  void
> -XLogRegisterData(char *data, int len)
> +XLogRegisterData(char *data, uint32 len)
>  {
>      XLogRecData *rdata;
>  
> -    Assert(begininsert_called);
> +    Assert(begininsert_called && XLogRecordLengthIsValid(len));
> +
> +    /*
> +     * Check against max_rdatas; and ensure we don't fill a record with
> +     * more data than can be replayed. Records are allocated in one chunk
> +     * with some overhead, so ensure XLogRecordLengthIsValid() for that
> +     * size of record.
> +     *
> +     * Additionally, check that we don't accidentally overflow the
> +     * intermediate sum value on 32-bit systems by ensuring that the
> +     * sum of the two inputs is no less than one of the inputs.
> +     */
> +    if (num_rdatas >= max_rdatas ||
> +#if SIZEOF_SIZE_T == 4
> +         mainrdata_len + len < len ||
> +#endif
> +        !XLogRecordLengthIsValid((size_t) mainrdata_len + (size_t) len))
> +        XLogErrorDataLimitExceeded();

This is quite a complicated check, and the SIZEOF_SIZE_T == 4 bit is fairly
ugly.

I think we should make mainrdata_len a uint64, then we don't have to worry
about it overflowing on 32bit systems. And TBH, we don't care about some minor
inefficiency on 32bit systems.



> @@ -399,8 +425,16 @@ XLogRegisterBufData(uint8 block_id, char *data, int len)
>          elog(ERROR, "no block with id %d registered with WAL insertion",
>               block_id);
>  
> -    if (num_rdatas >= max_rdatas)
> -        elog(ERROR, "too much WAL data");
> +    /*
> +     * Check against max_rdatas; and ensure we don't register more data per
> +     * buffer than can be handled by the physical data format; 
> +     * i.e. that regbuf->rdata_len does not grow beyond what
> +     * XLogRecordBlockHeader->data_length can hold.
> +     */
> +    if (num_rdatas >= max_rdatas ||
> +        regbuf->rdata_len + len > UINT16_MAX)
> +        XLogErrorDataLimitExceeded();
> +
>      rdata = &rdatas[num_rdatas++];
>  
>      rdata->data = data;

This partially has been applied in ffd1b6bb6f8, I think we should consider
adding XLogErrorDataLimitExceeded() separately too.


>              rdt_datas_last->next = regbuf->rdata_head;
> @@ -858,6 +907,16 @@ XLogRecordAssemble(RmgrId rmid, uint8 info,
>      for (rdt = hdr_rdt.next; rdt != NULL; rdt = rdt->next)
>          COMP_CRC32C(rdata_crc, rdt->data, rdt->len);
>  
> +    /*
> +     * Ensure that the XLogRecord is not too large.
> +     *
> +     * XLogReader machinery is only able to handle records up to a certain
> +     * size (ignoring machine resource limitations), so make sure we will
> +     * not emit records larger than those sizes we advertise we support.
> +     */
> +    if (!XLogRecordLengthIsValid(total_len))
> +        XLogErrorDataLimitExceeded();
> +
>      /*
>       * Fill in the fields in the record header. Prev-link is filled in later,
>       * once we know where in the WAL the record will be inserted. The CRC does

I think this needs to mention that it'll typically cause a PANIC.


Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: Non-replayable WAL records through overflows and >MaxAllocSize lengths
Next
From: Tom Lane
Date:
Subject: Re: Failed Assert in pgstat_assoc_relation