astreamer_lz4: fix bug of output pointer advancement in decompressor - Mailing list pgsql-hackers

From Chao Li
Subject astreamer_lz4: fix bug of output pointer advancement in decompressor
Date
Msg-id 0594CC79-1544-45DD-8AA4-26270DE777A7@gmail.com
Whole thread Raw
Responses Re: astreamer_lz4: fix bug of output pointer advancement in decompressor
Re: astreamer_lz4: fix bug of output pointer advancement in decompressor
List pgsql-hackers
Hi,

There have been a couple of LZ4-related patches recently, so I spent some time playing with the LZ4 path and found a
bugin astreamer_lz4_decompressor_content(). 

Looking at the code snippet (omitting irrelevant code):
```
    ret = LZ4F_decompress(mystreamer->dctx,
                          next_out, &out_size,
                          next_in, &read_size, NULL);

    mystreamer->bytes_written += out_size; // <== bumped bytes_written already

    /*
     * If output buffer is full then forward the content to next streamer
     * and update the output buffer.
     */
    if (mystreamer->bytes_written >= mystreamer->base.bbs_buffer.maxlen)
    {
        astreamer_content(mystreamer->base.bbs_next, member,
                          mystreamer->base.bbs_buffer.data,
                          mystreamer->base.bbs_buffer.maxlen,
                          context);

        avail_out = mystreamer->base.bbs_buffer.maxlen;
        mystreamer->bytes_written = 0;
        next_out = (uint8 *) mystreamer->base.bbs_buffer.data;
    }
    else
    {
        avail_out = mystreamer->base.bbs_buffer.maxlen - mystreamer->bytes_written;
        next_out += mystreamer->bytes_written; // <== The bug is there
    }
```

To advance next_out, the code uses mystreamer->bytes_written. However, bytes_written has already been increased by
out_sizein the current iteration. As a result, next_out is advanced by the cumulative number of bytes written so far,
insteadof just the number of bytes produced in this iteration. Effectively, the pointer movement is double-counting the
previousprogress. 

When I tried to design a test case to trigger this bug, I found it is actually not easy to hit in normal execution.
Tracinginto the function, I found that the default output buffer size is 1024 bytes, and in practice LZ4F_decompress()
tendsto fill the output buffer in one or two iterations. As a result, the problematic else branch is either not
reached,or reached in a case where bytes_written == out_size, so the incorrect pointer increment does not manifest. 

To reliably trigger the bug, I used a small hack: instead of letting LZ4F_decompress() use the full available out_size,
Iartificially limited out_size before the call, forcing LZ4F_decompress() to require one more iteration to fill the
buffer.See the attached nocfbot_hack.diff for the hack. 

With that hack in place, the bug can be reproduced using the following procedure:

1. initdb
2 Set "wal_level = replica” in postgreSQl.conf
3. Restart the instance
4. Create a database
5. Generate some WAL logs by psql
```
CREATE TABLE t AS SELECT generate_series(1, 100000) AS id;
CHECKPOINT;
```
6. Create a backup
```
% rm -rf /tmp/bkup_lz4
% pg_basebackup -D /tmp/bkup_lz4 -F t -Z lz4 -X stream -c fast
```
7. Verify the backup
```
% pg_verifybackup -F t -n /tmp/bkup_lz4
pg_verifybackup: error: zsh: trace trap  pg_verifybackup -F t -n /tmp/bkup_lz4
```

With the fix applied (plus the hack), step 7 succeeds:
```
% pg_verifybackup -F t -n /tmp/bkup_lz4
backup successfully verified
```
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/





Attachment

pgsql-hackers by date:

Previous
From: Fujii Masao
Date:
Subject: Re: DOC: fixes multiple errors in alter table doc
Next
From: Chao Li
Date:
Subject: Re: DOC: fixes multiple errors in alter table doc