Re: [GENERAL] RE: pg_dump & blobs - editable dump? - Mailing list pgsql-hackers

From Giles Lean
Subject Re: [GENERAL] RE: pg_dump & blobs - editable dump?
Msg-id 17006.963439123@nemeton.com.au
In response to RE: [GENERAL] RE: pg_dump & blobs - editable dump?  (Philip Warner <pjw@rhyme.com.au>)
List pgsql-hackers
> >http://www.goice.co.jp/member/mo/formats/tar.html has a nice brief

Best is to look at one of the actual standards, accessible via:

http://www.opengroup.org

The tar and cpio formats are in the pax specification.

>     136     12 bytes  Modify time (in octal ascii)
>
>     ...do you know the format of the date (seconds since 1970?).

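Yes, seconds since 1970.  It's just 11 bytes plus \0 in tar's usual
encode-this-as-octal format:

#include <stddef.h>

/* Write the low-order bits of value as n octal digits into p[0..n-1].
   The caller adds the trailing \0. */
void
encode_octal(unsigned char *p, size_t n, unsigned long value)
{
    static const unsigned char octal[] = "01234567";

    while (n) {
        *(p + --n) = octal[value & 07];
        value >>= 3;
    }
}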

Warning: some values allowed by tar exceed the size of 'long' on a 32
bit platform, since an 11-digit octal field holds up to 33 bits.
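
To illustrate the warning, here is a minimal sketch of an encoder that
takes a 64-bit value and reports overflow instead of silently dropping
high bits.  The name tar_put_octal and its return convention are my own
assumptions for illustration, not part of any existing tar code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: write value as n-1 zero-padded octal digits
 * followed by '\0' into an n-byte tar header field.
 * Returns 0 on success, -1 if the value does not fit in n-1 digits
 * (the field then holds only the low-order digits). */
int tar_put_octal(char *field, size_t n, uint64_t value)
{
    size_t i;

    assert(n >= 2);
    i = n - 1;
    field[i] = '\0';
    while (i > 0) {
        field[--i] = (char)('0' + (value & 07));
        value >>= 3;
    }
    return value == 0 ? 0 : -1;   /* leftover bits mean overflow */
}
```

For the 12-byte modify-time field this would be called as
tar_put_octal(hdr + 136, 12, (uint64_t)mtime), using the offset from
the layout quoted above.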

>     157    100 bytes  Linkname ('\0' terminated, 99 maximum length)
>
>     ...what's this? Is it the target for symlinks?

Yes, it holds the link target.  Long pathnames are a separate matter:
they get split into two pieces on a '/' as I recall.

The code I offered you previously does this too; I appreciate that it
is quite likely not what you want, but you might consider looking at
it or other tar/pax code to help you interpret the standard.
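
As a rough sketch of the pathname split, assuming the ustar layout
(a 100-byte name field and a 155-byte prefix field) and a hypothetical
helper name ustar_split_path, it might look like this.  Check it against
the standard before relying on it:

```c
#include <stddef.h>
#include <string.h>

#define USTAR_NAME_LEN   100   /* name field size in the ustar header */
#define USTAR_PREFIX_LEN 155   /* prefix field size in the ustar header */

/* Hypothetical helper: split path at a '/' so that the left part fits
 * the prefix field and the right part fits the name field.
 * name must hold USTAR_NAME_LEN + 1 bytes, prefix USTAR_PREFIX_LEN + 1.
 * Returns 0 on success, -1 if no suitable '/' exists. */
int ustar_split_path(const char *path, char *name, char *prefix)
{
    size_t len = strlen(path);
    size_t i;

    if (len <= USTAR_NAME_LEN) {           /* short path: no split needed */
        strcpy(name, path);
        prefix[0] = '\0';
        return 0;
    }
    /* Try each '/' from the left; the '/' itself is not stored. */
    for (i = 1; i < len - 1 && i <= USTAR_PREFIX_LEN; i++) {
        if (path[i] == '/' && len - i - 1 <= USTAR_NAME_LEN) {
            memcpy(prefix, path, i);
            prefix[i] = '\0';
            strcpy(name, path + i + 1);
            return 0;
        }
    }
    return -1;
}
```

A path with no '/' in a usable position cannot be stored this way, which
is why the function can fail.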

>     329      8 bytes  Major device ID (in octal ascii)
>     337      8 bytes  Minor device ID (in octal ascii)
>     345    167 bytes  Padding
>
>     ...and what should I set these to?

Zero.
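
A minimal sketch, assuming zero is written as octal ASCII like the
other numeric fields and using the offsets from the layout quoted above
(the function name is made up for illustration):

```c
#include <string.h>

#define OFF_DEVMAJOR 329   /* offsets taken from the quoted layout */
#define OFF_DEVMINOR 337

/* Hypothetical helper: set both 8-byte device ID fields of a 512-byte
 * tar header to zero, written as seven '0' digits plus '\0'. */
void tar_zero_device_fields(unsigned char *header)
{
    memcpy(header + OFF_DEVMAJOR, "0000000", 8);
    memcpy(header + OFF_DEVMINOR, "0000000", 8);
}
```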

> If you're serious about the offer, I'd be happy. But, given how simple the
> format is, I can probably tack it into place myself.

For the very limited formats you want to create, that's probably
the easiest way.  You don't care about unpacking, GNU v. POSIX format,
device files, etc etc.

> There is a minor problem. Currently I compress the output stream as I
> receive it from PG, and send it to the archive. I don't know how big it
> will be until it is written. The custom output format can handle this, but
> in streaming a tar file to tape, I have to know the file size first. This
> means writing to /tmp. I suppose that's OK, but I've been trying to
> avoid it.

I recommend you compress the whole stream, not the pieces.  Presumably
you can determine the size of the pieces you're backing up, and ending
with a .tar.gz (or whatever) file is more convenient to manage than a
.tar file of compressed pieces unless you really expect people to be
extracting individual files from the backup very often.

Having to pass everything through /tmp would be really unfortunate.

Regards,

Giles
