Re: Forensic recovery deleted pgdump custom format file - Mailing list pgsql-hackers

From David Guimaraes
Subject Re: Forensic recovery deleted pgdump custom format file
Msg-id CAJNfudJmHJ4vL7nWopp7STzvurY-cfffRLFynG4g3iLELWDxtg@mail.gmail.com
In response to Re: Forensic recovery deleted pgdump custom format file  (Michael Paquier <michael.paquier@gmail.com>)
List pgsql-hackers
Yes Michael, I agree.

This is the _CloseArchive function in pg_backup_custom.c:

WriteHead(AH);
tpos = ftello(AH->FH);
WriteToc(AH);
ctx->dataStart = _getFilePos(AH, ctx);
WriteDataChunks(AH);

This is the WriteHead function at pg_backup_archiver.c:

(*AH->WriteBufPtr) (AH, "PGDMP", 5); /* Magic code */
(*AH->WriteBytePtr) (AH, AH->vmaj);
(*AH->WriteBytePtr) (AH, AH->vmin);
(*AH->WriteBytePtr) (AH, AH->vrev);
(*AH->WriteBytePtr) (AH, AH->intSize);
(*AH->WriteBytePtr) (AH, AH->offSize);
(*AH->WriteBytePtr) (AH, AH->format);
WriteInt(AH, AH->compression);
crtm = *localtime(&AH->createDate);
WriteInt(AH, crtm.tm_sec);
WriteInt(AH, crtm.tm_min);
WriteInt(AH, crtm.tm_hour);
WriteInt(AH, crtm.tm_mday);
WriteInt(AH, crtm.tm_mon);
WriteInt(AH, crtm.tm_year);
WriteInt(AH, crtm.tm_isdst);
WriteStr(AH, PQdb(AH->connection));
WriteStr(AH, AH->public.remoteVersionStr);
WriteStr(AH, PG_VERSION);

There is no mention of the file size, or anything like it, in the header.
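To check this against a recovered fragment, here is a minimal Python sketch that reads the fields in the order WriteHead writes them above. The byte-level encodings are assumptions that the listing does not show: I assume WriteInt emits one sign byte followed by intSize bytes, least significant byte first, and that WriteStr emits a length via WriteInt followed by the raw bytes (a negative length meaning NULL). The names read_int, read_str and parse_header are hypothetical helpers, not part of pg_dump, and other pg_dump versions may lay out the header differently.

```python
import io


def read_int(buf, int_size):
    """Read one integer: a sign byte, then int_size bytes, LSB first (assumed WriteInt layout)."""
    sign = buf.read(1)
    raw = buf.read(int_size)
    if len(sign) != 1 or len(raw) != int_size:
        raise EOFError("truncated integer")
    value = int.from_bytes(raw, "little")
    return -value if sign[0] else value


def read_str(buf, int_size):
    """Read a length-prefixed string (assumed WriteStr layout); negative length means NULL."""
    length = read_int(buf, int_size)
    if length < 0:
        return None
    data = buf.read(length)
    if len(data) != length:
        raise EOFError("truncated string")
    return data.decode("utf-8", errors="replace")


def parse_header(buf):
    """Parse the fields in the order listed in WriteHead; returns a dict."""
    if buf.read(5) != b"PGDMP":
        raise ValueError("missing PGDMP magic")
    fixed = buf.read(6)
    if len(fixed) != 6:
        raise EOFError("truncated header")
    vmaj, vmin, vrev, int_size, off_size, fmt = fixed
    header = {
        "version": (vmaj, vmin, vrev),
        "int_size": int_size,
        "off_size": off_size,
        "format": fmt,
        "compression": read_int(buf, int_size),
    }
    # Creation time, in the same order as the WriteInt calls in the listing.
    for name in ("tm_sec", "tm_min", "tm_hour", "tm_mday",
                 "tm_mon", "tm_year", "tm_isdst"):
        header[name] = read_int(buf, int_size)
    header["dbname"] = read_str(buf, int_size)
    header["remote_version"] = read_str(buf, int_size)
    header["pg_version"] = read_str(buf, int_size)
    # Offset just past the header, where WriteToc output would begin.
    header["end_offset"] = buf.tell()
    return header
```

Calling parse_header(io.BytesIO(fragment)) on a carved fragment gives the version, database name and creation time, which helps to identify a dump, but as noted it yields no total length.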

WriteToc, however, writes the number of TOC entries at the beginning:

void WriteToc(ArchiveHandle *AH) {
...
WriteInt(AH, tocCount);

But these structs are dynamic (a linked list), so there is no way to know the size of each one.

The definition of the tocEntry struct has no reference to a size or anything like that. It is a linked list with a count.

And at the end, the CloseArchive function calls WriteDataChunks to write the blob information. I don't understand what this function is doing. Does it save the size of the blob data at the beginning?

(*te->dataDumper) ((Archive *) AH, te->dataDumperArg);

What does this function do?



David


On Mon, Jul 13, 2015 at 11:00 PM, Michael Paquier <michael.paquier@gmail.com> wrote:
On Tue, Jul 14, 2015 at 11:20 AM, David Guimaraes <skysbsb@gmail.com> wrote:
> Yeah bingo

Hm. While there is a magic-code header for the custom format, by
looking at the code I am not seeing any traces of a similar thing at
the end of the dump file (_CloseArchive in pg_backup_custom.c), and I
don't recall that there is an estimation of the size of the
dump in the header either. If those files were stored close to each
other, one idea may be to look for the next header present, or to
attempt to roughly estimate the size that they would have, I am afraid.
In any case, applying reverse engineering methods seems like the most
reliable method to reconstitute an archive handler that could be used
by pg_restore or pg_dump, but perhaps others have other ideas.
--
Michael
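
The idea of looking for the next header can be sketched in a few lines of Python. This is only a rough illustration under strong assumptions: the dumps were stored contiguously and one after another, and the image (or a slice of it) fits in memory. The function names find_magic_offsets and candidate_extents are hypothetical. Each extent is an upper bound on where a dump could end, not its real size, and the byte sequence "PGDMP" can also occur inside table data, so every hit needs to be checked by hand.

```python
def find_magic_offsets(data, magic=b"PGDMP"):
    """Return every offset in data at which the magic code starts."""
    offsets = []
    pos = data.find(magic)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(magic, pos + 1)
    return offsets


def candidate_extents(data, magic=b"PGDMP"):
    """Pair each magic offset with the next one (or end of data) as a (start, end) upper bound."""
    offsets = find_magic_offsets(data, magic)
    ends = offsets[1:] + [len(data)]
    return list(zip(offsets, ends))
```

Each (start, end) pair can then be carved out to its own file and tried with pg_restore -l to see whether the table of contents is readable.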



--
David Gomes Guimarães
