Re: Patch: incorrect array offset in backend replication tar header - Mailing list pgsql-hackers

From Brian Weaver
Subject Re: Patch: incorrect array offset in backend replication tar header
Date
Msg-id CAAhXZGvTXxJjDPsR6icXcyeeR10+1KJJUoyNrN8F4sqc=UWnSA@mail.gmail.com
Whole thread Raw
In response to Re: Patch: incorrect array offset in backend replication tar header  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Patch: incorrect array offset in backend replication tar header
List pgsql-hackers
Tom,

I'm still investigating and I have been looking at various sources. I
have checked lots of pages on the web and I was just looking at the
libarchive source from github. I found an interesting sequence in
libarchive that implies that the 'ustar00\0' marks the header as GNU
Tar format.

Here are lines 321 through 329 of 'archive_read_support_format_tar.c'
from libarchive
321         /* Recognize POSIX formats. */322         if ((memcmp(header->magic, "ustar\0", 6) == 0)323             &&
(memcmp(header->version,"00", 2) == 0))324                 bid += 56;325326         /* Recognize GNU tar format. */327
      if ((memcmp(header->magic, "ustar ", 6) == 0)328             && (memcmp(header->version, " \0", 2) == 0))329
          bid += 56;
 

I'm wondering if the original committer put the 'ustar00\0' string in by design?

Regardless I'll look at it more tomorrow 'cause I'm calling it a
night. I need to send a note to the libarchive folks too because I
*think* I found a potential buffer overrun in one of their octal
conversion routines.

-- Brian

On Mon, Sep 24, 2012 at 10:07 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Brian Weaver <cmdrclueless@gmail.com> writes:
>> While researching the way streaming replication works I was examining
>> the construction of the tar file header. By comparing documentation on
>> the tar header format from various sources I certain the following
>> patch should be applied to so the group identifier is put into thee
>> header properly.
>
> Yeah, this is definitely wrong.
>
>> While I realize that wikipedia isn't always the best source of
>> information, the header offsets seem to match the other documentation
>> I've found. The format is just easier to read on wikipedia
>
> The authoritative specification can be found in the "pax" page in the
> POSIX spec, which is available here:
> http://pubs.opengroup.org/onlinepubs/9699919799/
>
> I agree that the 117 number is bogus, and also that the magic "ustar"
> string is written incorrectly.  What's more, it appears that the latter
> error has been copied from pg_dump (but the 117 seems to be just a new
> bug in pg_basebackup).  I wonder what else might be wrong hereabouts :-(
> Will sit down and take a closer look.
>
> I believe what we need to do about this is:
>
> 1. fix pg_dump and pg_basebackup output to conform to spec.
>
> 2. make sure pg_restore will accept both conformant and
>    previous-generation files.
>
> Am I right in believing that we don't have any code that's expected to
> read pg_basebackup output?  We just feed it to "tar", no?
>
> I'm a bit concerned about backwards compatibility issues.  It looks to
> me like existing versions of pg_restore will flat out reject files that
> have a spec-compliant "ustar\0" MAGIC field.  Is it going to be
> sufficient if we fix this in minor-version updates, or are we going to
> need to have a switch that tells pg_dump to emit the incorrect old
> format?  (Ick.)
>
>                         regards, tom lane



-- 

/* insert witty comment here */



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: plpgsql gram.y make rule
Next
From: Andrew Dunstan
Date:
Subject: Re: Oid registry