Re: directory archive format for pg_dump - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: directory archive format for pg_dump
Date
Msg-id 4D0A50D6.5070602@enterprisedb.com
In response to Re: directory archive format for pg_dump  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
Responses Re: directory archive format for pg_dump  (Robert Haas <robertmhaas@gmail.com>)
Re: directory archive format for pg_dump  (Joachim Wieland <joe@mcknight.de>)
List pgsql-hackers
On 16.12.2010 17:23, Heikki Linnakangas wrote:
> On 16.12.2010 12:12, Greg Smith wrote:
>> There's a number of small things that I'd like to see improved in new
>> rev of this code
>> ...
>
> In addition to those:
>...

One more thing: the motivation behind this patch is to allow parallel
pg_dump in the future, so we should make sure this patch caters well
for that.

As soon as we have parallel pg_dump, the next big thing is going to be 
parallel dump of the same table using multiple processes. Perhaps we 
should prepare for that in the directory archive format, by allowing the 
data of a single table to be split into multiple files. That way
parallel pg_dump is simple: you just split the table into chunks of
roughly the same size, say 10GB each, and launch a process for each
chunk, each writing to a separate file.
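
To make the idea concrete, here is a rough sketch of the chunk planning
step. It is only an illustration, not anything from the patch: splitting
by heap block ranges, the function name plan_chunks, and the
"<table_id>.<part>.dat" file naming are all assumptions made up for this
example.

```python
from typing import List, NamedTuple


class Chunk(NamedTuple):
    first_block: int  # first block of the chunk (inclusive)
    last_block: int   # last block of the chunk (inclusive)
    filename: str     # hypothetical per-chunk file name in the archive dir


def plan_chunks(table_id: int, total_blocks: int,
                blocks_per_chunk: int) -> List[Chunk]:
    """Split a table of total_blocks blocks into roughly equal chunks.

    Every chunk except possibly the last covers blocks_per_chunk blocks.
    Each chunk gets its own output file, so one worker process per chunk
    can write without coordinating with the others.
    """
    if blocks_per_chunk <= 0:
        raise ValueError("blocks_per_chunk must be positive")
    chunks = []
    for part, first in enumerate(range(0, total_blocks, blocks_per_chunk)):
        last = min(first + blocks_per_chunk, total_blocks) - 1
        # Assumed naming scheme: <table_id>.<part>.dat
        chunks.append(Chunk(first, last, f"{table_id}.{part}.dat"))
    return chunks
```

For example, plan_chunks(16384, 25, 10) yields three chunks covering
blocks 0-9, 10-19 and 20-24, written to 16384.0.dat, 16384.1.dat and
16384.2.dat. A restore would then have to read the parts of a table back
in some defined order, which is the piece the archive format needs to
record.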

It should be quite a simple add-on to the current patch, but it will
make life much easier for parallel pg_dump. It would also help to work
around file size limitations on some filesystems.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

