Thread: BUG #16951: pg_restore segfaults on custom format piped from a different version of PG

From: PG Bug reporting form
The following bug has been logged on the website:

Bug reference:      16951
Logged by:          Sergey Koposov
Email address:      skoposov@ed.ac.uk
PostgreSQL version: 10.16
Operating system:   Linux
Description:

Hi,

I have a reproducible case of pg_restore segfaulting when restoring a
dump produced by a different version of pg_dump. Specifically, pg_restore
from version 10 (at least) crashes on a dump from pg_dump 12.
I understand that this is not supported, but presumably it still shouldn't
segfault.
This was the command:

pg_dump12 -n SCHEMA -Fc -U dbadmin DB | pg_restore10 -U dbadmin -h localhost -1 -d DB

where pg_dump12 is pg_dump from 12.6 on one linux 64bit machine and
pg_restore10 is pg_restore from 10.16 on another linux 64bit machine

I attach the gdb bt full of the crash (see below). I also have a 512 byte
file that crashes pg_restore (the top 512 bytes from the pgdump). I can
share it if needed.

It is clear that some checks of the archive version are not done early
enough by pg_restore, leading to the segfault. I don't have time to get
to the bottom of this, but I'm seeing that ReadHead() in
pg_backup_archiver.c has not executed the checks behind
if (!AH->readHeader), which would have failed.
It also looks like the readHeader flag is set early by
_discoverArchiveFormat() when reading from stdin.
(But this is just my impression from a quick look at the code.)

Cheers,
       Sergey

 

#0  __strcmp_sse2_unaligned ()
    at ../sysdeps/x86_64/multiarch/strcmp-sse2-unaligned.S:31
#1  0x00000000004098d0 in ReadToc (AH=0xe8bb60) at pg_backup_archiver.c:2660
#2  0x000000000040f010 in InitArchiveFmt_Custom (AH=0xe8bb60)
    at pg_backup_custom.c:191
#3  0x0000000000408f57 in _allocAH (FileSpec=0x0, fmt=archUnknown, 
    compression=0, dosync=1 '\001', mode=archModeRead, 
    setupWorkerPtr=0x404528 <setupRestoreWorker>) at pg_backup_archiver.c:2400
#4  0x00000000004045d3 in OpenArchive (FileSpec=0x0, fmt=archUnknown)
    at pg_backup_archiver.c:235
#5  0x0000000000403eff in main (argc=7, argv=0x7fffb0559018)
    at pg_restore.c:400


#0  __strcmp_sse2_unaligned ()
    at ../sysdeps/x86_64/multiarch/strcmp-sse2-unaligned.S:31
No locals.
#1  0x00000000004098d0 in ReadToc (AH=0xe8bb60) at pg_backup_archiver.c:2660
        i = 0
        tmp = 0xe6d7d0 "ENCODING"
        deps = 0x7fa9b5de08e0 <_IO_2_1_stdin_>
        depIdx = 15127296
        depSize = 0
        te = 0xe91140
#2  0x000000000040f010 in InitArchiveFmt_Custom (AH=0xe8bb60)
    at pg_backup_custom.c:191
        ctx = 0xe8d100
#3  0x0000000000408f57 in _allocAH (FileSpec=0x0, fmt=archUnknown, 
    compression=0, dosync=1 '\001', mode=archModeRead, 
    setupWorkerPtr=0x404528 <setupRestoreWorker>) at pg_backup_archiver.c:2400
        AH = 0xe8bb60
#4  0x00000000004045d3 in OpenArchive (FileSpec=0x0, fmt=archUnknown)
    at pg_backup_archiver.c:235
        AH = 0xe6cf90
#5  0x0000000000403eff in main (argc=7, argv=0x7fffb0559018)
    at pg_restore.c:400
        opts = 0xe8b9e0
        c = -1
        exit_code = 32681
        numWorkers = 1
        AH = 0x7fa9b647667b <do_lookup_x+2011>
        inputFileSpec = 0x0
        disable_triggers = 0
        enable_row_security = 0
        if_exists = 0
        no_data_for_failed_tables = 0
        outputNoTablespaces = 0
        use_setsessauth = 0
        no_publications = 0
        no_security_labels = 0
        no_subscriptions = 0
        strict_names = 0
        cmdopts = {{name = 0x41edd8 "clean", has_arg = 0, flag = 0x0,
            val = 99}, {name = 0x41edde "create", has_arg = 0, flag = 0x0,
            val = 67}, {name = 0x41ede5 "data-only", has_arg = 0, flag = 0x0,
            val = 97}, {name = 0x41edef "dbname", has_arg = 1, flag = 0x0,
            val = 100}, {name = 0x41edf6 "exit-on-error", has_arg = 0,
            flag = 0x0, val = 101}, {name = 0x41ee04 "exclude-schema",
            has_arg = 1, flag = 0x0, val = 78}, {name = 0x41ee13 "file",
            has_arg = 1, flag = 0x0, val = 102}, {name = 0x41ee18 "format",
            has_arg = 1, flag = 0x0, val = 70}, {name = 0x41ee1f "function",
            has_arg = 1, flag = 0x0, val = 80}, {name = 0x41ee28 "host",
            has_arg = 1, flag = 0x0, val = 104}, {name = 0x41ee2d "index",
            has_arg = 1, flag = 0x0, val = 73}, {name = 0x41ee33 "jobs",
            has_arg = 1, flag = 0x0, val = 106}, {name = 0x41ee38 "list",
            has_arg = 0, flag = 0x0, val = 108}, {
            name = 0x41ee3d "no-privileges", has_arg = 0, flag = 0x0,
            val = 120}, {name = 0x41ee4b "no-acl", has_arg = 0, flag = 0x0,
            val = 120}, {name = 0x41ee52 "no-owner", has_arg = 0, flag = 0x0,
            val = 79}, {name = 0x41ee5b "no-reconnect", has_arg = 0,
            flag = 0x0, val = 82}, {name = 0x41ee68 "port", has_arg = 1,
            flag = 0x0, val = 112}, {name = 0x41ee6d "no-password",
            has_arg = 0, flag = 0x0, val = 119}, {name = 0x41ee79 "password",
            has_arg = 0, flag = 0x0, val = 87}, {name = 0x41ee82 "schema",
            has_arg = 1, flag = 0x0, val = 110}, {
            name = 0x41ee89 "schema-only", has_arg = 0, flag = 0x0,
            val = 115}, {name = 0x41ee95 "superuser", has_arg = 1, flag = 0x0,
            val = 83}, {name = 0x41ee9f "table", has_arg = 1, flag = 0x0,
            val = 116}, {name = 0x41eea5 "trigger", has_arg = 1, flag = 0x0,
            val = 84}, {name = 0x41eead "use-list", has_arg = 1, flag = 0x0,
            val = 76}, {name = 0x41eeb6 "username", has_arg = 1, flag = 0x0,
            val = 85}, {name = 0x41eebf "verbose", has_arg = 0, flag = 0x0,
            val = 118}, {name = 0x41eec7 "single-transaction", has_arg = 0,
            flag = 0x0, val = 49}, {name = 0x41eeda "disable-triggers",
            has_arg = 0, flag = 0x62c5ac <disable_triggers>, val = 1}, {
            name = 0x41eeeb "enable-row-security", has_arg = 0,
            flag = 0x62c5b0 <enable_row_security>, val = 1}, {
            name = 0x41eeff "if-exists", has_arg = 0,
            flag = 0x62c5cc <if_exists>, val = 1}, {
            name = 0x41ef09 "no-data-for-failed-tables", has_arg = 0,
            flag = 0x62c5b4 <no_data_for_failed_tables>, val = 1}, {
            name = 0x41ef23 "no-tablespaces", has_arg = 0,
            flag = 0x62c5b8 <outputNoTablespaces.7124>, val = 1}, {
            name = 0x41ef32 "role", has_arg = 1, flag = 0x0, val = 2}, {
            name = 0x41ef37 "section", has_arg = 1, flag = 0x0, val = 3}, {
            name = 0x41ef3f "strict-names", has_arg = 0,
            flag = 0x62c5d0 <strict_names>, val = 1}, {
            name = 0x41ef4c "use-set-session-authorization", has_arg = 0,
            flag = 0x62c5bc <use_setsessauth>, val = 1}, {
            name = 0x41ef6a "no-publications", has_arg = 0,
            flag = 0x62c5c0 <no_publications>, val = 1}, {
            name = 0x41ef7a "no-security-labels", has_arg = 0,
            flag = 0x62c5c4 <no_security_labels>, val = 1}, {
            name = 0x41ef8d "no-subscriptions", has_arg = 0,
            flag = 0x62c5c8 <no_subscriptions>, val = 1}, {name = 0x0,
            has_arg = 0, flag = 0x0, val = 0}}
quit
Detaching from program: /usr0/home/skoposov_remote/postgresql-10.16/src/bin/pg_dump/pg_restore, process 3461


PG Bug reporting form <noreply@postgresql.org> writes:
> I have a reproducible case of segfaulting pg_restore when trying to restore
> from pg_dump of a different version. Specifically at least pg_restore from
> 10 crashes from pg_dump 12. 

When I try that I get

pg_restore: [archiver] unsupported version (1.14) in file header

and that test is done first thing in ReadHead(), before the place
you identify.  I suspect you are dealing with a corrupt archive
file, not a version mismatch.

            regards, tom lane
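
To make the check behind that error message concrete, here is a minimal
Python sketch of reading the start of a custom-format archive and
rejecting an unknown version. It is an illustration only, not
pg_restore's code: the function names are hypothetical, the header layout
(the PGDMP magic string followed by one byte each for major, minor and
revision) is assumed, and the 1.13 upper limit for a version 10
pg_restore is likewise an assumption.

```python
import io

MAGIC = b"PGDMP"
# Assumption: newest archive version the older pg_restore understands.
MAX_SUPPORTED = (1, 13)


def read_version(stream):
    """Consume the magic string and return (major, minor, revision)."""
    if stream.read(len(MAGIC)) != MAGIC:
        raise ValueError("did not find magic string in file header")
    raw = stream.read(3)
    if len(raw) != 3:
        raise ValueError("unexpected end of file in header")
    return tuple(raw)  # iterating over bytes yields integers


def check_version(version):
    """Raise if the archive version is outside the supported range."""
    major, minor, _revision = version
    if (major, minor) < (1, 0) or (major, minor) > MAX_SUPPORTED:
        raise ValueError(
            f"unsupported version ({major}.{minor}) in file header"
        )


# A fake archive start claiming format version 1.14.
header = io.BytesIO(MAGIC + bytes([1, 14, 0]) + b"rest of archive")
try:
    check_version(read_version(header))
    message = "header accepted"
except ValueError as err:
    message = str(err)
print(message)
```

With the assumed limit, the fake 1.14 header is rejected with the same
wording as the message quoted above.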



On Wed, 2021-03-31 at 17:27 -0400, Tom Lane wrote:
> This email was sent to you by someone outside the University.
> You should only click on links or attachments if you are certain that the email is genuine and the content is safe.
>
> PG Bug reporting form <noreply@postgresql.org> writes:
> > I have a reproducible case of segfaulting pg_restore when trying to restore
> > from pg_dump of a different version. Specifically at least pg_restore from
> > 10 crashes from pg_dump 12.
>
> When I try that I get
>
> pg_restore: [archiver] unsupported version (1.14) in file header
>
> and that test is done first thing in ReadHead(), before the place
> you identify.  I suspect you are dealing with a corrupt archive
> file, not a version mismatch.


I am pretty sure that is not the case.

I've just done that now
skoposov_remote@wsdb:~/postgresql-10.16$ ssh -o Compression=no koposov@HOSTNAME '/opt/pgsql/bin/pg_dump -n SCHEMA -Fc -Udbadmin wsdb' | head -c 32768 > xx.short
(the crash also happens if I leave out the 'head -c ...' step)

skoposov_remote@wsdb:~/postgresql-10.16$ cat xx.short | pg_restore -U dbadmin -h localhost -d wsdb
Segmentation fault (core dumped)

I attach the file (1 kb of it)

I also noticed when I tried to run pg_restore in the debugger the crash doesn't happen.

gdb --args pg_restore -U dbadmin -h localhost -d wsdb
(gdb) run < /tmp/xx.short
Starting program: /usr0/home/skoposov_remote/postgresql-10.16/src/bin/pg_dump/pg_restore < /tmp/xx.short
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
pg_restore: [archiver] unsupported version (1.14) in file header
[Inferior 1 (process 14109) exited with code 01]

but it does if I pipe it...

      Sergey
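
One difference between the two invocations above is that a file
redirected to stdin can be seeked, while a pipe cannot. The following
Python sketch only demonstrates that difference at the file-descriptor
level; the helper name can_seek is hypothetical, and the sketch says
nothing about what pg_restore itself does with the information.

```python
import os
import tempfile


def can_seek(fd):
    """Return True if the file descriptor supports seeking."""
    try:
        os.lseek(fd, 0, os.SEEK_CUR)
        return True
    except OSError:  # a pipe fails here (ESPIPE on Linux)
        return False


# Regular file: comparable to `pg_restore < /tmp/xx.short`.
with tempfile.TemporaryFile() as f:
    file_seekable = can_seek(f.fileno())

# Pipe: comparable to `cat /tmp/xx.short | pg_restore`.
read_end, write_end = os.pipe()
try:
    pipe_seekable = can_seek(read_end)
finally:
    os.close(read_end)
    os.close(write_end)

print("file seekable:", file_seekable)
print("pipe seekable:", pipe_seekable)
```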

The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
Is e buidheann carthannais a th’ ann an Oilthigh Dhùn Èideann, clàraichte an Alba, àireamh clàraidh SC005336.
 

Sergey KOPOSOV <Sergey.Koposov@ed.ac.uk> writes:
> I also noticed when I tried to run pg_restore in the debugger the crash doesn't happen.
> but it does if I pipe it...

I wonder if your platform is helpfully inserting Windows newlines,
or perhaps removing them, when the data goes through a pipe.

            regards, tom lane



On Wed, 2021-03-31 at 18:47 -0400, Tom Lane wrote:
>
> Sergey KOPOSOV <Sergey.Koposov@ed.ac.uk> writes:
> > I also noticed when I tried to run pg_restore in the debugger the crash doesn't happen.
> > but it does if I pipe it...
>
> I wonder if your platform is helpfully inserting Windows newlines,
> or perhaps removing them, when the data goes through a pipe.

The platform doing the pg_dump (PG12) is Debian,
and the one doing the pg_restore (PG10) is Ubuntu.

Also, I know for sure that when I use pg_restore from PostgreSQL 12 it works fine in the same configuration.
(And I've regularly transferred tables this way from one system to another; it's just that I recently
migrated the Debian system from PG11 to PG12, which led to this segfault.)

      S
 

On Wed, 2021-03-31 at 23:52 +0100, Sergey Koposov wrote:
> On Wed, 2021-03-31 at 18:47 -0400, Tom Lane wrote:
> >
> > Sergey KOPOSOV <Sergey.Koposov@ed.ac.uk> writes:
> > > I also noticed when I tried to run pg_restore in the debugger the crash doesn't happen.
> > > but it does if I pipe it...
> >
> > I wonder if your platform is helpfully inserting Windows newlines,
> > or perhaps removing them, when the data goes through a pipe.
>
> The platform doing the pg_dump (PG12) is debian
> And the one doing pg_restore (PG10) is ubuntu.
>
> Also I know for sure when I use pg_restore from postgresql 12 it works fine in the same configuration.
> (And I've regularly transferred tables this way from one system to another, it's just I've recently
> migrated the debian system from PG11 to PG12 which lead to this segfault.)

I've just verified that on a different machine (Ubuntu 18.04) I can crash pg_restore with the file that I sent to the list.
Importantly, this requires running pg_restore without the '-Fc' flag, i.e. letting it autodetect the format.

$ cat /tmp/xx1.short | ./src/bin/pg_dump/pg_restore
Segmentation fault (core dumped)
$ cat /tmp/xx1.short | ./src/bin/pg_dump/pg_restore  -Fc
pg_restore: [archiver] unsupported version (1.14) in file header

      S
 

Sergey KOPOSOV <Sergey.Koposov@ed.ac.uk> writes:
> Importantly This requires running pg_restore without '-Fc' flag, i.e. to let it autodetect.

> $ cat /tmp/xx1.short | ./src/bin/pg_dump/pg_restore
> Segmentation fault (core dumped)
> $ cat /tmp/xx1.short | ./src/bin/pg_dump/pg_restore  -Fc
> pg_restore: [archiver] unsupported version (1.14) in file header

Ooooh ... the autodetect + can't-seek code path is just broken.  All of the
sanity checks on the first few fields of the file --- particularly the
version number --- just get skipped in this scenario.

I wonder why it's a good idea to read-ahead any of those fields in the
first place.  Checking the PGDMP magic string seems sufficient.

Will fix, thanks for the report!

            regards, tom lane
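
The failure mode described above can be modelled in a few lines of
Python. This is a simplified sketch under stated assumptions, not the
pg_restore source: the Archive class, the function names and the
simulated seekable flag are hypothetical, the header layout (PGDMP plus
three version bytes) and the 1.13 limit are assumed, and the
"magic only" variant is just one possible shape for the idea of
pre-reading nothing but the magic string, not the fix that was committed.

```python
import io

MAGIC = b"PGDMP"
MAX_SUPPORTED = (1, 13)  # assumed limit for the older pg_restore


class Archive:
    def __init__(self, data, seekable):
        self.stream = io.BytesIO(data)
        self.seekable = seekable     # simulated; False stands for a pipe
        self.version = None
        self.read_header = False     # plays the role of AH->readHeader
        self.magic_consumed = False


def check_version(version):
    if version < (1, 0) or version > MAX_SUPPORTED:
        raise ValueError("unsupported version (%d.%d) in file header" % version)


def read_magic(ah):
    if ah.stream.read(len(MAGIC)) != MAGIC:
        raise ValueError("did not find magic string in file header")


def read_version(ah):
    major, minor, _revision = ah.stream.read(3)
    ah.version = (major, minor)


def discover_old(ah):
    """Autodetect that pre-reads the version fields as well."""
    read_magic(ah)
    read_version(ah)
    if ah.seekable:
        ah.stream.seek(0)            # rewind; the header is read again
    else:
        ah.read_header = True        # cannot rewind: keep what was read


def read_head_old(ah):
    if not ah.read_header:
        read_magic(ah)
        read_version(ah)
        check_version(ah.version)    # skipped when read_header is set


def discover_magic_only(ah):
    """Autodetect that looks at the magic string and nothing else."""
    read_magic(ah)
    if ah.seekable:
        ah.stream.seek(0)
    else:
        ah.magic_consumed = True


def read_head_checked(ah):
    if not ah.magic_consumed:
        read_magic(ah)
    read_version(ah)
    check_version(ah.version)        # always runs


DUMP = MAGIC + bytes([1, 14, 0]) + b"remaining archive bytes"


def run(discover, read_head, seekable):
    ah = Archive(DUMP, seekable)
    discover(ah)
    try:
        read_head(ah)
        return "header accepted"
    except ValueError as err:
        return str(err)


results = {
    "old path, seekable file": run(discover_old, read_head_old, True),
    "old path, pipe": run(discover_old, read_head_old, False),
    "magic only, seekable file": run(discover_magic_only, read_head_checked, True),
    "magic only, pipe": run(discover_magic_only, read_head_checked, False),
}
for name, outcome in results.items():
    print(f"{name}: {outcome}")
```

In this model only the "old path, pipe" combination accepts the 1.14
header without complaint, which mirrors the reported behaviour: the error
appears with a redirected file or with -Fc, but not when autodetecting
from a pipe.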



Justin Pryzby <pryzby@telsasoft.com> writes:
> On Thu, Apr 01, 2021 at 11:39:33AM -0400, Tom Lane wrote:
>> Will fix, thanks for the report!

> Yes, thank you both.  I've run into this recently but for some reason I thought
> it was fixed.  It probably also explains this one from 2014.
> https://www.postgresql.org/message-id/20141206061151.GA725@telsasoft.com

Yeah, this does look suspiciously like it explains some past reports
that we failed to reproduce, likely because it didn't occur to us that
reading the file from a non-seekable source would make a difference.

Thanks to Sergey for beating me over the head till I didn't dismiss
it anymore ;-)

            regards, tom lane