Re: directory archive format for pg_dump - Mailing list pgsql-hackers

From Joachim Wieland
Subject Re: directory archive format for pg_dump
Date
Msg-id AANLkTik+PR2HCsqS7PVAdETPyKSGn2hb-eTeg9QZpRew@mail.gmail.com
In response to Re: directory archive format for pg_dump  (José Arthur Benetasso Villanova<jose.arthur@gmail.com>)
Responses Re: directory archive format for pg_dump
List pgsql-hackers
Hi Jose,

2010/11/19 José Arthur Benetasso Villanova <jose.arthur@gmail.com>:
> The dir format generated in my database 60 files, with different
> sizes, and it looks very confusing. Is it possible to use the same
> trick as pigz and pbzip2, creating a concatenated file of streams?

What pigz parallelizes is the actual computation of the compressed
data. The directory archive format, however, is preparation for a
parallel pg_dump that dumps several tables (especially large ones, of
course) in parallel via multiple database connections and multiple
pg_dump frontends. The idea of multiplexing their output into one file
has been rejected on the grounds that it would probably slow down the
whole process.
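
To illustrate the one-file-per-table idea, here is a rough Python sketch.
It is not code from the patch: the table names, their contents, the
".dat.gz" suffix and the helper functions are all made up for the
example, and threads stand in for the separate pg_dump frontends. The
point is only that each worker writes to its own file, so the workers
never have to coordinate on a shared output.

```python
import gzip
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

# Hypothetical stand-ins for table contents; a real dump would read
# each table over its own database connection.
TABLES = {
    "accounts": b"1\talice\n2\tbob\n",
    "orders": b"10\t1\n11\t2\n",
    "items": b"100\twidget\n",
}


def dump_table(out_dir: Path, name: str, data: bytes) -> Path:
    """Compress one table into its own file; no shared output, no locking."""
    path = out_dir / f"{name}.dat.gz"
    with gzip.open(path, "wb") as f:
        f.write(data)
    return path


def dump_all(out_dir: Path, workers: int = 3) -> list[Path]:
    """Dump every table concurrently, one output file per table."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(dump_table, out_dir, n, d) for n, d in TABLES.items()]
        return [f.result() for f in futures]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for p in sorted(dump_all(Path(tmp))):
            print(p.name, len(gzip.decompress(p.read_bytes())), "bytes")
```

Writing everything into a single file instead would require the workers
to take turns on (or be funneled through) one writer, which is the
slowdown mentioned above.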

Nevertheless, pigz could be implemented as an alternative compression
algorithm, so that both the custom and the directory archive formats
could use it. Here as well, though, license and patent questions might
be in the way, even though it is based on libz.


> The md5.c and kwlookup.c reuse using a link doesn't look nice either.
> This way you need to compile twice, among others things, but I think
> that its temporary, right?

No, it isn't temporary. md5.c is used in the same way by libpq, for
example, and there are other examples of such links in core; see
src/bin/psql.

Joachim

