Re: Patch: incorrect array offset in backend replication tar header - Mailing list pgsql-hackers

From Brian Weaver
Subject Re: Patch: incorrect array offset in backend replication tar header
Date
Msg-id CAAhXZGvax_EPVRA=_0oFKAx3DSfamwmspH8wUKsVzH3ASXfrdQ@mail.gmail.com
In response to Re: Patch: incorrect array offset in backend replication tar header  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Patch: incorrect array offset in backend replication tar header  (Marko Tiikkaja <pgmail@joh.to>)
Re: Patch: incorrect array offset in backend replication tar header  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Tom,

I actually plan on doing a lot of work on the frontend pg_basebackup
for my employer. pg_basebackup is 90% of the way to a solution that I
need for doing backups of *large* databases while allowing the
database to continue to work. The problem is a lack of secondary disk
space to hold a copy of the original database cluster. I want
to modify pg_basebackup to include the WAL files in the tar output. I
have several ideas but I need to code and test them. That was the main
reason I was examining the backend code.

If you're willing to wait a bit for me to code and test my extensions
to pg_basebackup, I will try to address some of the deficiencies as
well as add new features.

I agree the checksum algorithm could definitely use some refactoring.
I was already working on that before I retired last night.
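
For reference, here is a minimal sketch of the conventional tar header
checksum, not the backend's actual code. It assumes the standard ustar
layout (512-byte header, 8-byte checksum field at offset 148). The names
tar_header_checksum and tar_header_set_checksum are hypothetical and chosen
only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define TAR_BLOCK_SIZE  512
#define TAR_CHKSUM_OFF  148     /* offset of the checksum field */
#define TAR_CHKSUM_LEN  8       /* width of the checksum field */

/*
 * Sum all 512 header bytes as unsigned values, counting the checksum
 * field itself as eight ASCII spaces regardless of its current content.
 */
static unsigned int
tar_header_checksum(const unsigned char *header)
{
    unsigned int sum = 0;
    size_t      i;

    assert(header != NULL);
    for (i = 0; i < TAR_BLOCK_SIZE; i++)
    {
        if (i >= TAR_CHKSUM_OFF && i < TAR_CHKSUM_OFF + TAR_CHKSUM_LEN)
            sum += ' ';
        else
            sum += header[i];
    }
    return sum;
}

/*
 * Store the checksum in the traditional form: six octal digits,
 * a NUL byte, then a space.
 */
static void
tar_header_set_checksum(unsigned char *header)
{
    unsigned int sum = tar_header_checksum(header);

    /* snprintf writes the six digits plus the terminating NUL */
    snprintf((char *) &header[TAR_CHKSUM_OFF], 7, "%06o", sum);
    header[TAR_CHKSUM_OFF + 7] = ' ';
}
```

Because the checksum field is always counted as spaces, the sum does not
depend on what was previously stored there, so the field can be filled in
last without a second pass over the header.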

-- Brian

On Mon, Sep 24, 2012 at 10:36 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Brian Weaver <cmdrclueless@gmail.com> writes:
>> Here are lines 321 through 329 of 'archive_read_support_format_tar.c'
>> from libarchive
>
>>  321         /* Recognize POSIX formats. */
>>  322         if ((memcmp(header->magic, "ustar\0", 6) == 0)
>>  323             && (memcmp(header->version, "00", 2) == 0))
>>  324                 bid += 56;
>>  325
>>  326         /* Recognize GNU tar format. */
>>  327         if ((memcmp(header->magic, "ustar ", 6) == 0)
>>  328             && (memcmp(header->version, " \0", 2) == 0))
>>  329                 bid += 56;
>
>> I'm wondering if the original committer put the 'ustar00\0' string in by design?
>
> The second part of that looks to me like it matches "ustar  \0",
> not "ustar00\0".  I think the pg_dump coding is just wrong.  I've
> already noticed that its code for writing the checksum is pretty
> brain-dead too :-(
>
> Note that according to the wikipedia page, tar programs typically
> accept files as pre-POSIX format if the checksum is okay, regardless of
> what is in the magic field; and the fields that were added by POSIX
> are noncritical so we'd likely never notice that they were being
> ignored.  (In fact, looking closer, pg_dump isn't even filling those
> fields anyway, so the fact that it's not producing a compliant magic
> field may be a good thing ...)
>
>                         regards, tom lane
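
To make the magic/version comparison above concrete, here is a small sketch
modeled on the quoted libarchive test. It assumes the ustar layout, with the
6-byte magic field at offset 257 and the 2-byte version field at offset 263.
The enum and function names are hypothetical, not taken from libarchive or
PostgreSQL.

```c
#include <assert.h>
#include <string.h>

#define TAR_MAGIC_OFF    257    /* 6-byte magic field */
#define TAR_VERSION_OFF  263    /* 2-byte version field */

enum tar_flavor
{
    TAR_UNKNOWN,
    TAR_POSIX_USTAR,
    TAR_GNU
};

/* Classify a 512-byte tar header by its magic and version fields. */
static enum tar_flavor
tar_classify_magic(const unsigned char *header)
{
    const unsigned char *magic;
    const unsigned char *version;

    assert(header != NULL);
    magic = header + TAR_MAGIC_OFF;
    version = header + TAR_VERSION_OFF;

    /* POSIX: "ustar" plus NUL, followed by version "00" */
    if (memcmp(magic, "ustar\0", 6) == 0 &&
        memcmp(version, "00", 2) == 0)
        return TAR_POSIX_USTAR;

    /* GNU: "ustar" plus a space, followed by a space and NUL */
    if (memcmp(magic, "ustar ", 6) == 0 &&
        memcmp(version, " \0", 2) == 0)
        return TAR_GNU;

    return TAR_UNKNOWN;
}
```

With this check, the eight bytes "ustar00\0" match neither pattern: the
sixth byte is '0' where POSIX expects NUL and GNU expects a space, so such a
header is classified as TAR_UNKNOWN.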



-- 

/* insert witty comment here */


