
From: Tom Lane
Subject: Re: Determining size of a database before dumping
Date:
Msg-id: 16895.1159833917@sss.pgh.pa.us
In response to: Re: Determining size of a database before dumping (Jeff Davis <pgsql@j-davis.com>)
List: pgsql-general
Jeff Davis <pgsql@j-davis.com> writes:
> On Tue, 2006-10-03 at 00:42 +0200, Alexander Staubo wrote:
>> Why does pg_dump serialize data less efficiently than PostgreSQL when
>> using the "custom" format?

> What you're saying is theoretical, though. If pg_dump used specialized
> compression based on the data types of the columns, and everything were
> optimal, you'd be correct: there's no situation in which the dump *must*
> be bigger. However, since there is no practical demand for such
> compression, and it would be a lot of work ...

There are several reasons for not worrying overly much about the
compactness of the pg_dump format:

* We don't have infinite manpower

* Cross-version and cross-platform portability of the dump files is
  critical

* The more complicated it is, the more chances for bugs, which you might
  not notice until you *really needed* that dump.

In practice, pushing the data through gzip gets most of the potential
win, for a very small fraction of the effort it would take to have a
smart custom compression mechanism.
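
For example, both routes are available today (a sketch; the database name
"mydb" and the output filenames are placeholders):

    # Plain-format dump piped through gzip
    pg_dump mydb | gzip > mydb.sql.gz

    # Custom-format dump; pg_dump compresses it with zlib by default
    pg_dump -Fc mydb > mydb.dump

    # Restoring each
    gunzip -c mydb.sql.gz | psql mydb
    pg_restore -d mydb mydb.dump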

            regards, tom lane
