Re: directory archive format for pg_dump - Mailing list pgsql-hackers

From: Robert Haas
Subject: Re: directory archive format for pg_dump
Msg-id: AANLkTimrRrhnoibwb9Z9Nz6CQ-vx22Wezm1YyZR=vR6x@mail.gmail.com
In response to: Re: directory archive format for pg_dump (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: directory archive format for pg_dump (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
           Re: directory archive format for pg_dump (Andrew Dunstan <andrew@dunslane.net>)
List: pgsql-hackers
On Thu, Dec 16, 2010 at 2:29 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
>> On 16.12.2010 20:33, Joachim Wieland wrote:
>>> How exactly would you "just split the table in chunks of roughly the
>>> same size" ?
>
>> Check pg_class.relpages, and divide that evenly across the processes.
>> That should be good enough.
>
> Not even close ... relpages could be badly out of date.  If you believe
> it, you could fail to dump data that's in further-out pages.  We'd need
> to move pg_relpages() or some equivalent into core to make this
> workable.
>
>>> Which queries should pg_dump send to the backend?
>
>> Hmm, I was thinking of "SELECT * FROM table WHERE ctid BETWEEN ? AND ?",
>> but we don't support TidScans for ranges. Perhaps we could add that.
>
> Yeah, that seems probably workable, given an up-to-date idea of the
> possible block range.
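[The splitting scheme under discussion can be sketched roughly as follows. This is an illustrative sketch only, not pg_dump code: the helper name and the even page division are assumptions, the page count is presumed to come from something authoritative like pg_relpages() in contrib/pgstattuple (per Tom's point that pg_class.relpages may be stale), and the ctid-range predicate presumes the backend can execute it efficiently, which would require the TID range scan support Heikki mentions adding.]

```python
# Sketch (not pg_dump's actual implementation): divide a table's block
# range evenly among N workers and build one ctid-range query per worker.
# The last worker's range is left open-ended so that rows living in pages
# beyond a possibly stale page count are still dumped.

def ctid_range_queries(table, total_pages, workers):
    """Return one SELECT per worker covering a contiguous block range."""
    pages_per_worker = max(1, total_pages // workers)
    queries = []
    for i in range(workers):
        first = i * pages_per_worker
        if i == workers - 1:
            # open-ended upper bound guards against an out-of-date page count
            cond = f"ctid >= '({first},0)'"
        else:
            last = (i + 1) * pages_per_worker
            cond = f"ctid >= '({first},0)' AND ctid < '({last},0)'"
        queries.append(f"SELECT * FROM {table} WHERE {cond}")
    return queries
```

Leaving the final range unbounded is one way to sidestep the failure mode Tom describes (silently skipping data in further-out pages); it trades perfectly even chunks for correctness when the count is low.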

So how bad would it be if we committed this new format without support
for splitting large relations into multiple files, or with some stub
support that never actually gets used, and fixed this later?  Because
this is starting to sound like a bigger project than I think we ought
to be requiring for this patch.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

