Re: astreamer_lz4: fix bug of output pointer advancement in decompressor - Mailing list pgsql-hackers

From Chao Li
Subject Re: astreamer_lz4: fix bug of output pointer advancement in decompressor
Date
Msg-id 7ECBFB37-0705-4A1C-AED1-746CB5E11B30@gmail.com
Whole thread Raw
In response to astreamer_lz4: fix bug of output pointer advancement in decompressor  (Chao Li <li.evan.chao@gmail.com>)
Responses Re: astreamer_lz4: fix bug of output pointer advancement in decompressor
List pgsql-hackers

> On Mar 2, 2026, at 17:17, Chao Li <li.evan.chao@gmail.com> wrote:
>
> Hi,
>
> There have been a couple of LZ4-related patches recently, so I spent some time playing with the LZ4 path and found a
bugin astreamer_lz4_decompressor_content(). 
>
> Looking at the code snippet (omitting irrelevant code):
> ```
> ret = LZ4F_decompress(mystreamer->dctx,
>  next_out, &out_size,
>  next_in, &read_size, NULL);
>
> mystreamer->bytes_written += out_size; // <== bumped bytes_written already
>
> /*
> * If output buffer is full then forward the content to next streamer
> * and update the output buffer.
> */
> if (mystreamer->bytes_written >= mystreamer->base.bbs_buffer.maxlen)
> {
> astreamer_content(mystreamer->base.bbs_next, member,
>  mystreamer->base.bbs_buffer.data,
>  mystreamer->base.bbs_buffer.maxlen,
>  context);
>
> avail_out = mystreamer->base.bbs_buffer.maxlen;
> mystreamer->bytes_written = 0;
> next_out = (uint8 *) mystreamer->base.bbs_buffer.data;
> }
> else
> {
> avail_out = mystreamer->base.bbs_buffer.maxlen - mystreamer->bytes_written;
> next_out += mystreamer->bytes_written; // <== The bug is there
> }
> ```
>
> To advance next_out, the code uses mystreamer->bytes_written. However, bytes_written has already been increased by
out_sizein the current iteration. As a result, next_out is advanced by the cumulative number of bytes written so far,
insteadof just the number of bytes produced in this iteration. Effectively, the pointer movement is double-counting the
previousprogress. 
>
> When I tried to design a test case to trigger this bug, I found it is actually not easy to hit in normal execution.
Tracinginto the function, I found that the default output buffer size is 1024 bytes, and in practice LZ4F_decompress()
tendsto fill the output buffer in one or two iterations. As a result, the problematic else branch is either not
reached,or reached in a case where bytes_written == out_size, so the incorrect pointer increment does not manifest. 
>
> To reliably trigger the bug, I used a small hack: instead of letting LZ4F_decompress() use the full available
out_size,I artificially limited out_size before the call, forcing LZ4F_decompress() to require one more iteration to
fillthe buffer. See the attached nocfbot_hack.diff for the hack. 
>
> With that hack in place, the bug can be reproduced using the following procedure:
>
> 1. initdb
> 2 Set "wal_level = replica” in postgreSQl.conf
> 3. Restart the instance
> 4. Create a database
> 5. Generate some WAL logs by psql
> ```
> CREATE TABLE t AS SELECT generate_series(1, 100000) AS id;
> CHECKPOINT;
> ```
> 6. Create a backup
> ```
> % rm -rf /tmp/bkup_lz4
> % pg_basebackup -D /tmp/bkup_lz4 -F t -Z lz4 -X stream -c fast
> ```
> 7. Verify the backup
> ```
> % pg_verifybackup -F t -n /tmp/bkup_lz4
> pg_verifybackup: error: zsh: trace trap  pg_verifybackup -F t -n /tmp/bkup_lz4
> ```
>
> With the fix applied (plus the hack), step 7 succeeds:
> ```
> % pg_verifybackup -F t -n /tmp/bkup_lz4
> backup successfully verified
> ```
> Best regards,
> --
> Chao Li (Evan)
> HighGo Software Co., Ltd.
> https://www.highgo.com/
>
>
>
>
> <nocfbot_hack.diff><v1-0001-astreamer_lz4-fix-output-pointer-advancement-in-d.patch>


Added to CF for tracking https://commitfest.postgresql.org/patch/6561/

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/







pgsql-hackers by date:

Previous
From: Chao Li
Date:
Subject: Re: doc: Clarify that empty COMMENT string removes the comment
Next
From: Andreas Karlsson
Date:
Subject: Re: [BUGFIX] Fix crash due to sizeof bug in RegisterExtensionExplainOption