Thread: BUG #18533: pg_basebackup uses out-of-bounds memory and a segment error occurs during backup

The following bug has been logged on the website:

Bug reference:      18533
Logged by:          ji xiaohang
Email address:      1165125080@qq.com
PostgreSQL version: 16.3
Operating system:   centos 7
Description:

pg_basebackup uses out-of-bounds memory and a segment error occurs during
backup

Run the following command to back up the pg11 database using the
pg_basebackup backup tool of the pg16 version:
pg_basebackup -h 127.0.0.1 -p 56100 -U Backup -F tar -X f -z -c fast -P -v
-D - > base.tar.gz
The segmentation fault does not reproduce when I run the command manually.
It seems that this problem occurs only when the command is invoked from my
service.
When the content being processed is a BBSTREAMER_MEMBER_TRAILER, the core
dump stack information is as follows:

#0 __memmove_evex_unaligned_erms () at
../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:476
#1 0x00007f0d8b928ef1 in memcpy (__len=388, __src=0x5625f0fcffa4,
__dest=<optimized out>) at /usr/include/bits/string_fortified.h:29
#2 gz_write (state=0x5625f0f862d0, buf=0x5625f0fcffa4, len=388) at
gzwrite.c:213
#3 0x00007f0d8b929034 in gz_write (len=<optimized out>, buf=<optimized out>,
state=<optimized out>) at gzwrite.c:270
#4 gzwrite (file=<optimized out>, buf=<optimized out>, len=<optimized out>)
at gzwrite.c:270
#5 0x00005625efaa4570 in bbstreamer_gzip_writer_content
(streamer=0x5625f0f85170, member=0x5625f0f87db0, data=0x5625f0fcffa4 "",
len=388, context=BBSTREAMER_MEMBER_TRAILER) at bbstreamer_gzip.c:138
#6 0x00005625efaa57d4 in bbstreamer_content (streamer=0x5625f0f85170,
member=0x5625f0f87db0, data=0x5625f0fcffa4 "", len=388,
context=BBSTREAMER_MEMBER_TRAILER) at bbstreamer.h:145
#7 0x00005625efaa64ff in bbstreamer_tar_archiver_content
(streamer=0x5625f0f863c0, member=0x5625f0f87db0, data=0x5625f0fcffa4 "",
len=388, context=BBSTREAMER_MEMBER_TRAILER) at bbstreamer_tar.c:508
#8 0x00005625efaa57d4 in bbstreamer_content (streamer=0x5625f0f863c0,
member=0x5625f0f87db0, data=0x5625f0fcffa4 "", len=388,
context=BBSTREAMER_MEMBER_TRAILER) at bbstreamer.h:145
#9 0x00005625efaa5c00 in bbstreamer_tar_parser_content
(streamer=0x5625f0f87d80, member=0x0, data=0x5625f0fcffa4 "", len=0,
context=BBSTREAMER_UNKNOWN) at bbstreamer_tar.c:228
#10 0x00005625efa9dc85 in bbstreamer_content (streamer=0x5625f0f87d80,
member=0x0, data=0x5625f0fcfe20 "", len=388, context=BBSTREAMER_UNKNOWN) at
bbstreamer.h:145
#11 0x00005625efaa127d in ReceiveTarCopyChunk (r=388, copybuf=0x5625f0fcfe20
"", callback_data=0x7ffe9ceffef0) at pg_basebackup.c:1759
#12 0x00005625efa9fe1f in ReceiveCopyData (conn=0x5625f0f70b10,
callback=0x5625efaa123c <ReceiveTarCopyChunk>, callback_data=0x7ffe9ceffef0)
at pg_basebackup.c:1102
#13 0x00005625efaa1115 in ReceiveTarFile (conn=0x5625f0f70b10,
archive_name=0x7ffe9cf00460 "base.tar", spclocation=0x0, tablespacenum=true,
compress=0x7ffe9cf008e0) at pg_basebackup.c:1708
#14 0x00005625efaa2481 in BaseBackup (compression_algorithm=0x5625f0f6f5f0
"gzip", compression_detail=0x5625f0f6f610 "2",
compressloc=COMPRESS_LOCATION_CLIENT, client_compress=0x7ffe9cf008e0) at
pg_basebackup.c:2177
#15 0x00005625efaa3ad6 in main (argc=17, argv=0x7ffe9cf00a48) at
pg_basebackup.c:2900

According to the stack analysis, the address of copybuf (data) received from
the server changes between frame #10 (bbstreamer_content) and frame #9
(bbstreamer_tar_parser_content). The data address is moved forward by 388
bytes, that is, by the value of len (0x5625f0fcfe20 + 388 = 0x5625f0fcffa4).
After analysis, the bbstreamer_buffer_until call is what causes the data
address to change. After that call, the data pointer points to the end of
the available memory, and len = 0.
However, in bbstreamer_tar_archiver_content and
bbstreamer_gzip_writer_content, len is 388 again. When gz_write is finally
invoked, it accesses memory beyond the end of the received data, and the
segmentation fault occurs.
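
To illustrate the state described above, here is a minimal, self-contained
sketch of a buffer-until helper. It is a simplified model, not the PostgreSQL
source: DemoStreamer, DemoResult, demo_buffer_bytes, demo_buffer_until, and
demo_trailer_case are hypothetical names, and the fixed-size buffers are
assumptions made for the example. It shows that once the helper has consumed
the whole chunk, the caller's pointer sits one past the end of the chunk and
len is 0, so reading 388 bytes from that pointer would run past the chunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for a streamer's internal buffer (not PostgreSQL's type). */
typedef struct DemoStreamer
{
    char buf[1024];
    int  buflen;
} DemoStreamer;

/* What the demo reports back to the caller. */
typedef struct DemoResult
{
    bool ready;      /* did the buffer reach the target length? */
    int  len_after;  /* caller's remaining length after the call */
    long advanced;   /* how far the caller's data pointer moved */
    int  buffered;   /* bytes now held in the internal buffer */
} DemoResult;

/* Copy nbytes into the internal buffer, then advance *data and shrink *len. */
static void
demo_buffer_bytes(DemoStreamer *s, const char **data, int *len, int nbytes)
{
    assert(nbytes >= 0 && nbytes <= *len);
    assert(s->buflen + nbytes <= (int) sizeof(s->buf));
    memcpy(s->buf + s->buflen, *data, (size_t) nbytes);
    s->buflen += nbytes;
    *data += nbytes;
    *len -= nbytes;
}

/* Accumulate input until the internal buffer holds target bytes. */
static bool
demo_buffer_until(DemoStreamer *s, const char **data, int *len, int target)
{
    if (s->buflen >= target)
        return true;            /* already have enough */
    if (s->buflen + *len < target)
    {
        /* Not enough yet: take everything and wait for more input. */
        demo_buffer_bytes(s, data, len, *len);
        return false;
    }
    /* Take just enough to reach the target. */
    demo_buffer_bytes(s, data, len, target - s->buflen);
    return true;
}

/* Feed one chunk of chunk_len bytes while expecting pad_expected bytes. */
static DemoResult
demo_trailer_case(int chunk_len, int pad_expected)
{
    static char chunk[512];     /* stands in for the received copy buffer */
    DemoStreamer s;
    const char *data = chunk;
    int          len = chunk_len;
    DemoResult   r;

    assert(chunk_len >= 0 && chunk_len <= (int) sizeof(chunk));
    s.buflen = 0;
    r.ready = demo_buffer_until(&s, &data, &len, pad_expected);
    r.len_after = len;
    r.advanced = (long) (data - chunk);
    r.buffered = s.buflen;
    return r;
}
```

In this model, demo_trailer_case(388, 388) reports ready, len_after == 0,
advanced == 388, and buffered == 388. The accumulated bytes are in the
internal buffer, not at the advanced pointer, which matches the data and len
values seen in frame #9 above.
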
In addition, when a pg16.3 server is backed up, the same out-of-bounds
access occurs, but without a segmentation fault:
#0  bbstreamer_gzip_writer_content (streamer=0x5555555aa830,
member=0x5555555ab8c0, data=0x5555555aa495 "", len=260,
context=BBSTREAMER_MEMBER_TRAILER) at bbstreamer_gzip.c:132
#1  0x0000555555560631 in bbstreamer_content (streamer=0x5555555aa830,
member=0x5555555ab8c0, data=0x5555555aa495 "", len=260,
context=BBSTREAMER_MEMBER_TRAILER) at bbstreamer.h:145
#2  0x0000555555561279 in bbstreamer_tar_archiver_content
(streamer=0x5555555aaf20, member=0x5555555ab8c0, data=0x5555555aa495 "",
len=260, context=BBSTREAMER_MEMBER_TRAILER) at bbstreamer_tar.c:508
#3  0x0000555555560631 in bbstreamer_content (streamer=0x5555555aaf20,
member=0x5555555ab8c0, data=0x5555555aa495 "", len=260,
context=BBSTREAMER_MEMBER_TRAILER) at bbstreamer.h:145
#4  0x0000555555560a53 in bbstreamer_tar_parser_content
(streamer=0x5555555ab890, member=0x0, data=0x5555555aa495 "", len=0,
context=BBSTREAMER_UNKNOWN) at bbstreamer_tar.c:228
#5  0x0000555555558c84 in bbstreamer_content (streamer=0x5555555ab890,
member=0x0, data=0x5555555aa391 "", len=260, context=BBSTREAMER_UNKNOWN) at
bbstreamer.h:145
#6  0x000055555555bbed in ReceiveArchiveStreamChunk (r=261,
copybuf=0x5555555aa390 "d", callback_data=0x7fffffffd420) at
pg_basebackup.c:1528
#7  0x000055555555ad34 in ReceiveCopyData (conn=0x555555596660,
callback=0x55555555b8af <ReceiveArchiveStreamChunk>,
callback_data=0x7fffffffd420) at pg_basebackup.c:1102
#8  0x000055555555b7cd in ReceiveArchiveStream (conn=0x555555596660,
compress=0x7fffffffe220) at pg_basebackup.c:1379
#9  0x000055555555d1ff in BaseBackup (compression_algorithm=0x555555582fed
"gzip", compression_detail=0x0, compressloc=COMPRESS_LOCATION_CLIENT,
client_compress=0x7fffffffe220) at pg_basebackup.c:2147
#10 0x000055555555e94a in main (argc=19, argv=0x7fffffffe338) at
pg_basebackup.c:2900

Should we remove the bbstreamer_buffer_until call in the
bbstreamer_tar_parser_content function for the BBSTREAMER_MEMBER_TRAILER
case?

    /*
     * If we're expecting an archive member trailer, accumulate
     * the expected number of padding bytes before sending
     * anything onward.
     */
    if (!bbstreamer_buffer_until(streamer, &data, &len,
                                 mystreamer->pad_bytes_expected))
        return;


PG Bug reporting form <noreply@postgresql.org> writes:
> The following bug has been logged on the website:
> Bug reference:      18533
> Logged by:          ji xiaohang
> Email address:      1165125080@qq.com
> PostgreSQL version: 16.3
> Operating system:   centos 7
> Description:        

> pg_basebackup uses out-of-bounds memory and a segment error occurs during
> backup

> Run the following command to back up the pg11 database using the
> pg_basebackup backup tool of the pg16 version:
> pg_basebackup -h 127.0.0.1 -p 56100 -U Backup -F tar -X f -z -c fast -P -v
> -D - > base.tar.gz
> The segmentation fault does not reproduce when I run the command manually.
> It seems that this problem occurs only when the command is invoked from my
> service.

FWIW, I tried and failed to reproduce this problem.  I don't see any
segfault, nor does running pg_basebackup under Valgrind report any
invalid accesses.  Perhaps it's dependent on the contents of the
source installation?  (My test was with v11's core regression database
and not much else.)

It's going to be hard to convince people that we should change
anything if we can't duplicate the problem, so I'd suggest working
harder to make a self-contained reproducer.

            regards, tom lane