Thread: Purpose of pg_dump tar archive format?
It doesn't support compression nor restore reordering like the custom format, so I'm having trouble seeing why it still exists (at least without a doc warning that it's obsolete).
On 2024-06-04 14:13 +0200, Ron Johnson wrote:
> It doesn't support compression nor restore reordering like the custom
> format, so I'm having trouble seeing why it still exists (at least without
> a doc warning that it's obsolete).

Maybe because of this:

> * The tar format also includes a 'restore.sql' script which is there for
> * the benefit of humans. This script is never used by pg_restore.

I've never used that restore.sql, so I can't say how useful it is, especially since it requires editing $$PATH$$. Should be mentioned in the docs if that file is still relevant.

--
Erik
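The editing Erik mentions boils down to replacing the $$PATH$$ placeholder in restore.sql with the directory the archive was unpacked into. A rough, untested sketch (the archive, directory, and database names here are made up):

mkdir dumpdir && tar -xf db.tar -C dumpdir
cd dumpdir
# point restore.sql at the extracted data files by substituting $$PATH$$
sed 's|\$\$PATH\$\$|'"$PWD"'|g' restore.sql > restore_local.sql
# in principle the edited script can then be run by hand, e.g.
# psql -d newdb -f restore_local.sql   (assumes newdb already exists)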
On 6/4/24 05:13, Ron Johnson wrote:
> It doesn't support compression nor restore reordering like the custom
> format, so I'm having trouble seeing why it still exists (at least
> without a doc warning that it's obsolete).

pg_dump -d test -U postgres -Ft | gzip --stdout > test.tgz

--
Adrian Klaver
adrian.klaver@aklaver.com
On Tue, Jun 4, 2024 at 10:43 AM Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 6/4/24 05:13, Ron Johnson wrote:
> It doesn't support compression nor restore reordering like the custom
> format, so I'm having trouble seeing why it still exists (at least
> without a doc warning that it's obsolete).
pg_dump -d test -U postgres -Ft | gzip --stdout > test.tgz
Who's got meaningful databases that small anymore?
And if you've got meaningfully sized databases, open port 5432 and move them using pg_dump.
On 6/4/24 08:25, Ron Johnson wrote:
> On Tue, Jun 4, 2024 at 10:43 AM Adrian Klaver <adrian.klaver@aklaver.com
> <mailto:adrian.klaver@aklaver.com>> wrote:
>
>     On 6/4/24 05:13, Ron Johnson wrote:
>     > It doesn't support compression nor restore reordering like the custom
>     > format, so I'm having trouble seeing why it still exists (at least
>     > without a doc warning that it's obsolete).
>
>     pg_dump -d test -U postgres -Ft | gzip --stdout > test.tgz
>
> Who's got meaningful databases that small anymore?
>
> And if you've got meaningfully sized databases, open port 5432 and move
> them using pg_dump.

If you don't need the tar format then don't use it.

--
Adrian Klaver
adrian.klaver@aklaver.com
Adrian Klaver <adrian.klaver@aklaver.com> writes:
> If you don't need the tar format then don't use it.

Indeed. FTR, I think the original motivation for tar format was to have a "standard" structured output format that could be manipulated with tools other than pg_restore. Years later, we added the directory format which could be argued to serve the same purpose. But nobody particularly wanted to remove the tar functionality, especially since the two were made to be compatible:

commit 7f508f1c6b515df66d27f860b2faa7b5761fa55d
Author: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date:   Sun Jan 23 23:10:15 2011 +0200

    Add 'directory' format to pg_dump.

    The new directory format is compatible with the 'tar' format, in
    that untarring a tar format archive produces a valid directory
    format archive.

			regards, tom lane
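For what it's worth, that compatibility can be exercised directly. A minimal sketch, assuming a source database named test and a target database test_copy that already exists (all names are placeholders):

pg_dump -U postgres -Ft -f test.tar test
# untarring the tar-format archive yields a directory-format archive
mkdir test_dir && tar -xf test.tar -C test_dir
pg_restore -U postgres -Fd -d test_copy test_dir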
Reply to list also.
Ccing list

On 6/4/24 10:03 AM, Ron Johnson wrote:
>     If you don't need the tar format then don't use it.
>
> That's neither the purpose nor the point of my question.
>
> I think that a note in the docs mentioning that it's obsolete would be
> helpful for new users who recognize "tar" so choose it.

You are assuming facts not in evidence, namely that the format is obsolete.

--
Adrian Klaver
adrian.klaver@aklaver.com
We use it. I bet lots of others do too.
On Tue, 4 Jun 2024, 18:06 Adrian Klaver, <adrian.klaver@aklaver.com> wrote:
Reply to list also.
Ccing list
On 6/4/24 10:03 AM, Ron Johnson wrote:
>
> If you don't need the tar format then don't use it.
>
>
> That's neither the purpose nor the point of my question.
>
> I think that a note in the docs mentioning that it's obsolete would be
> helpful for new users who recognize "tar" so choose it.
You are assuming facts not in evidence, namely that the format is obsolete.
--
Adrian Klaver
adrian.klaver@aklaver.com
On 6/4/24 11:40, Shaheed Haque wrote:
>
> We use it. I bet lots of others do too.
>

Of course. There are lots of small, real, useful databases in the wild.
On Tue, Jun 4, 2024 at 2:55 PM Rob Sargent <robjsargent@gmail.com> wrote:
On 6/4/24 11:40, Shaheed Haque wrote:
>
> We use it. I bet lots of others do too.
>
>
Of course. There are lots of small, real, useful databases in the wild.
But why tar instead of custom? That was part of my original question.
On 6/4/24 13:15, Ron Johnson wrote:
On Tue, Jun 4, 2024 at 2:55 PM Rob Sargent <robjsargent@gmail.com> wrote:
On 6/4/24 11:40, Shaheed Haque wrote:
>
> We use it. I bet lots of others do too.
>
>
Of course. There are lots of small, real, useful databases in the wild.
But why tar instead of custom? That was part of my original question.

That I can't say. Familiarity? Fewer keystrokes?
On Tue, Jun 4, 2024 at 3:15 PM Ron Johnson <ronljohnsonjr@gmail.com> wrote:
But why tar instead of custom? That was part of my original question.

I've found it pretty useful for programmatically accessing data in a dump for large databases outside of the normal pg_dump/pg_restore workflow. You don't have to seek through one large binary file to get to the data section to get at the data.
On Tue, 4 Jun 2024 at 20:47, Gavin Roy <gavinr@aweber.com> wrote:
On Tue, Jun 4, 2024 at 3:15 PM Ron Johnson <ronljohnsonjr@gmail.com> wrote:
But why tar instead of custom? That was part of my original question.
I've found it pretty useful for programmatically accessing data in a dump for large databases outside of the normal pg_dump/pg_restore workflow. You don't have to seek through one large binary file to get to the data section to get at the data.
This is true for us too; specifically, tar, including with compression, is very convenient for both CLI and Python ecosystems.
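As a rough illustration of that kind of access (the archive name and the numeric member name below are made up): a tar-format dump holds a toc.dat, a restore.sql, and one numbered .dat data member per table, so a single table's data can be pulled out with plain tar, no pg_restore step needed; pg_restore -l should show which dump ID goes with which table:

# see which dump ID corresponds to which table
pg_restore -l db.tar
# list the archive members
tar -tvf db.tar
# extract just the one data member of interest
tar -xf db.tar 4567.dat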
On Tue, Jun 4, 2024 at 3:47 PM Gavin Roy <gavinr@aweber.com> wrote:
On Tue, Jun 4, 2024 at 3:15 PM Ron Johnson <ronljohnsonjr@gmail.com> wrote:But why tar instead of custom? That was part of my original question.I've found it pretty useful for programmatically accessing data in a dump for large databases outside of the normal pg_dump/pg_restore workflow. You don't have to seek through one large binary file to get to the data section to get at the data.
Interesting. Please explain, though, since a big tarball _is_ "one large binary file" that you have to sequentially scan. (I don't know the internal structure of custom format files, and whether they have file pointers to each table.)
Is it because you need individual .dat "COPY" files for something other than loading into PG tables (since pg_restore --table=xxxx does that, too), and directory format archives can be inconvenient?
On Tue, Jun 4, 2024 at 7:36 PM Ron Johnson <ronljohnsonjr@gmail.com> wrote:
On Tue, Jun 4, 2024 at 3:47 PM Gavin Roy <gavinr@aweber.com> wrote:
On Tue, Jun 4, 2024 at 3:15 PM Ron Johnson <ronljohnsonjr@gmail.com> wrote:
But why tar instead of custom? That was part of my original question.
I've found it pretty useful for programmatically accessing data in a dump for large databases outside of the normal pg_dump/pg_restore workflow. You don't have to seek through one large binary file to get to the data section to get at the data.
Interesting. Please explain, though, since a big tarball _is_ "one large binary file" that you have to sequentially scan. (I don't know the internal structure of custom format files, and whether they have file pointers to each table.)
Not if you untar it first.
Is it because you need individual .dat "COPY" files for something other than loading into PG tables (since pg_restore --table=xxxx does that, too), and directory format archives can be inconvenient?
In the past I've used it for data analysis outside of Postgres.
--
Gavin M. Roy
CTO
AWeber
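For example (a sketch with made-up names), once the archive is untarred the per-table .dat members are, in my experience, plain COPY text: tab-separated columns with \N for NULLs, so ordinary command-line tools can work on them directly, along the lines Gavin describes:

tar -xf db.tar -C dumpdir
# row count and a peek at the first two columns of one table's data
wc -l dumpdir/4567.dat
cut -f1,2 dumpdir/4567.dat | head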