Thread: Changing pg_dump default file format

Changing pg_dump default file format

From
"Joshua D. Drake"
Date:
Hello,

For as long as I can remember I have heard complaints about the fact 
that pg_dump doesn't use a backup format that is supported by the 
pg_restore program. At every single talk that I give that talks about 
pg_dump, the people in the majority of the audience agrees that this is 
rather silly and a barrier to usability.

The only counter argument I have heard is that it adds an extra step for 
developers to get the SQL that generates a database. That to me is a 
weak argument as pg_dump is a production/admin too, first, not a 
developer tool.

I humbly request on behalf of those who manage production postgresql 
instances that we change the default backup file format from -Fp to -Fc.

Sincerely,

Joshua D. Drake
-- 
Command Prompt, Inc. - http://www.commandprompt.com/  509-416-6579
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc
For my dreams of your image that blossoms   a rose in the deeps of my heart. - W.B. Yeats



Re: Changing pg_dump default file format

From
Magnus Hagander
Date:
On Thu, Nov 7, 2013 at 5:02 PM, Joshua D. Drake <jd@commandprompt.com> wrote:
>
> Hello,
>
> For as long as I can remember I have heard complaints about the fact that
> pg_dump doesn't use a backup format that is supported by the pg_restore
> program. At every single talk that I give that talks about pg_dump, the
> people in the majority of the audience agrees that this is rather silly and
> a barrier to usability.
>
> The only counter argument I have heard is that it adds an extra step for
> developers to get the SQL that generates a database. That to me is a weak
> argument as pg_dump is a production/admin too, first, not a developer tool.
>
> I humbly request on behalf of those who manage production postgresql
> instances that we change the default backup file format from -Fp to -Fc.

The main reason for keeping plain I guess goes with the fact that we
sent it to stdout by default, which really doesn't lend itself to the
custom format.

I'm a bit worried about it breaking peoples scripts as well, if they
pipe the output. In particular when doing a partial dump of say a
sigle table or such, and in my experience that's mostly done in
pipe-mode, which would break.

Though I have to say I like the idea in general :)

Maybe we could consider changing the default format if -f is specified
to write it directly to a file. OTOH, hat might just make it
confusing....

-- Magnus HaganderMe: http://www.hagander.net/Work: http://www.redpill-linpro.com/



Re: Changing pg_dump default file format

From
Andrew Dunstan
Date:
On 11/07/2013 11:02 AM, Joshua D. Drake wrote:
>
> Hello,
>
> For as long as I can remember I have heard complaints about the fact 
> that pg_dump doesn't use a backup format that is supported by the 
> pg_restore program. At every single talk that I give that talks about 
> pg_dump, the people in the majority of the audience agrees that this 
> is rather silly and a barrier to usability.
>
> The only counter argument I have heard is that it adds an extra step 
> for developers to get the SQL that generates a database. That to me is 
> a weak argument as pg_dump is a production/admin too, first, not a 
> developer tool.
>
> I humbly request on behalf of those who manage production postgresql 
> instances that we change the default backup file format from -Fp to -Fc.


At least it gives you a nice message now instead of the rather cryptic 
message it used to give:
   [andrew@emma inst.5715]$ bin/pg_dump -f foo.sql   [andrew@emma inst.5715]$ bin/pg_restore foo.sql   pg_restore:
[archiver]input file appears to be a text format dump.   Please use psql.
 

cheers

andrew





Re: Changing pg_dump default file format

From
Kevin Grittner
Date:
Joshua D. Drake <jd@commandprompt.com> wrote:

> For as long as I can remember I have heard complaints about the fact
> that pg_dump doesn't use a backup format that is supported by the
> pg_restore program. At every single talk that I give that talks about
> pg_dump, the people in the majority of the audience agrees that this is
> rather silly and a barrier to usability.
>
> The only counter argument I have heard is that it adds an extra step for
> developers to get the SQL that generates a database. That to me is a
> weak argument as pg_dump is a production/admin too, first, not a
> developer tool.
>
> I humbly request on behalf of those who manage production postgresql
> instances that we change the default backup file format from -Fp to -Fc.

-1 in the case that output goes to stdout.  It might make sense for
when -f is specified.  As a DBA, 90% of my usage of pg_dump was
piped to psql, with the rest split between data-only dumps for
individual tables and schema-only dumps.  I think I used -Fc about
twice -- before I figured out how to do PITR-style backups.

This, for example, is a valid and useful construct:

pg_dump test | psql test2

... or:

pg_dump -s test | psql test3

A lot of scripts will be broken by changing the default format for
piped output.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Changing pg_dump default file format

From
Tom Lane
Date:
Magnus Hagander <magnus@hagander.net> writes:
> On Thu, Nov 7, 2013 at 5:02 PM, Joshua D. Drake <jd@commandprompt.com> wrote:
>> I humbly request on behalf of those who manage production postgresql
>> instances that we change the default backup file format from -Fp to -Fc.

> I'm a bit worried about it breaking peoples scripts as well, if they
> pipe the output.

Whether they pipe the output or not, this will break people's scripts.
I agree it's a legacy behavior that we wouldn't choose again, but you
could say that about an awful lot of things including much of the SQL
standard.  I think it's too late to consider this now.
        regards, tom lane



Re: Changing pg_dump default file format

From
"Joshua D. Drake"
Date:
On 11/07/2013 08:17 AM, Andrew Dunstan wrote:

>
> At least it gives you a nice message now instead of the rather cryptic
> message it used to give:
>
>     [andrew@emma inst.5715]$ bin/pg_dump -f foo.sql
>     [andrew@emma inst.5715]$ bin/pg_restore foo.sql
>     pg_restore: [archiver] input file appears to be a text format dump.
>     Please use psql.
>
> cheers
>

True :) but still...

> andrew
>
>


-- 
Command Prompt, Inc. - http://www.commandprompt.com/  509-416-6579
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc
For my dreams of your image that blossoms   a rose in the deeps of my heart. - W.B. Yeats



Re: Changing pg_dump default file format

From
"Joshua D. Drake"
Date:
On 11/07/2013 08:21 AM, Kevin Grittner wrote:

>> I humbly request on behalf of those who manage production postgresql
>> instances that we change the default backup file format from -Fp to -Fc.
>
> -1 in the case that output goes to stdout.  It might make sense for
> when -f is specified.  As a DBA, 90% of my usage of pg_dump was
> piped to psql, with the rest split between data-only dumps for
> individual tables and schema-only dumps.  I think I used -Fc about
> twice -- before I figured out how to do PITR-style backups.
>

PITR is wonderful, I use it all the time. It has exactly zero to do with 
what we are talking about. We are talking about pg_dump and its 
usefulness. Are you suggesting that we tell people to not use pg_dump 
and instead use PITR?


> This, for example, is a valid and useful construct:
>
> pg_dump test | psql test2
>
> ... or:
>
> pg_dump -s test | psql test3
>

pg_dump -s test -fp| psql test3

> A lot of scripts will be broken by changing the default format for
> piped output.

For the sake of sanity... really?  You just proved my point. This isn't 
about developers. Developers now how to code. They can change a simple 
script. This is about production DBA/admins, the 98% of people using 
postgresql.

Joshua D. Drake

-- 
Command Prompt, Inc. - http://www.commandprompt.com/  509-416-6579
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc
For my dreams of your image that blossoms   a rose in the deeps of my heart. - W.B. Yeats



Re: Changing pg_dump default file format

From
Harold Giménez
Date:



On Thu, Nov 7, 2013 at 8:26 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Magnus Hagander <magnus@hagander.net> writes:
> On Thu, Nov 7, 2013 at 5:02 PM, Joshua D. Drake <jd@commandprompt.com> wrote:
>> I humbly request on behalf of those who manage production postgresql
>> instances that we change the default backup file format from -Fp to -Fc.

> I'm a bit worried about it breaking peoples scripts as well, if they
> pipe the output.

Whether they pipe the output or not, this will break people's scripts.
I agree it's a legacy behavior that we wouldn't choose again, but you
could say that about an awful lot of things including much of the SQL
standard.  I think it's too late to consider this now.

Just a thought: If it's the right thing to do, why not do it in small steps? 
For example, couldn't we deprecate the behavior on 9.4, and switch over in 
9.5? By deprecate I mean a warning message on STDERR making it clear
they're relying on behavior that is scheduled to change.

I don't think there's any precedence in postgres or it's tooling of this kind of 
deprecation process, but it seems we need to define something like it.

Thoughts?

-H

Re: Changing pg_dump default file format

From
"Joshua D. Drake"
Date:
On 11/07/2013 08:26 AM, Tom Lane wrote:
> Magnus Hagander <magnus@hagander.net> writes:
>> On Thu, Nov 7, 2013 at 5:02 PM, Joshua D. Drake <jd@commandprompt.com> wrote:
>>> I humbly request on behalf of those who manage production postgresql
>>> instances that we change the default backup file format from -Fp to -Fc.
>
>> I'm a bit worried about it breaking peoples scripts as well, if they
>> pipe the output.
>
> Whether they pipe the output or not, this will break people's scripts.
> I agree it's a legacy behavior that we wouldn't choose again, but you
> could say that about an awful lot of things including much of the SQL
> standard.  I think it's too late to consider this now.

We have certainly done worse in the past. 8.3 comes to mind. I am trying 
to consider the future here, as we continue to grow. Why leave thorns 
for the sake of leaving thorns?

Note:

Perfectly happy to spit out a, "NOTICE: The default format of pg_dump 
will be changing in release X" for a release.

Joshua D. Drake

>
>             regards, tom lane
>


-- 
Command Prompt, Inc. - http://www.commandprompt.com/  509-416-6579
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc
For my dreams of your image that blossoms   a rose in the deeps of my heart. - W.B. Yeats



Re: Changing pg_dump default file format

From
Magnus Hagander
Date:
On Thu, Nov 7, 2013 at 5:53 PM, Joshua D. Drake <jd@commandprompt.com> wrote:
>
> On 11/07/2013 08:21 AM, Kevin Grittner wrote:
>
>>> I humbly request on behalf of those who manage production postgresql
>>> instances that we change the default backup file format from -Fp to -Fc.
>>
>>
>> -1 in the case that output goes to stdout.  It might make sense for
>> when -f is specified.  As a DBA, 90% of my usage of pg_dump was
>> piped to psql, with the rest split between data-only dumps for
>> individual tables and schema-only dumps.  I think I used -Fc about
>> twice -- before I figured out how to do PITR-style backups.
>>
>
> PITR is wonderful, I use it all the time. It has exactly zero to do with
> what we are talking about. We are talking about pg_dump and its usefulness.
> Are you suggesting that we tell people to not use pg_dump and instead use
> PITR?

For backups, absolutely.

That is, assuming you mean "pg_basebackup -x", and not full
archive_command/full pitr setup.

That's still a separate argument, of course.


-- Magnus HaganderMe: http://www.hagander.net/Work: http://www.redpill-linpro.com/



Re: Changing pg_dump default file format

From
"Joshua D. Drake"
Date:
On 11/07/2013 09:25 AM, Magnus Hagander wrote:

>> PITR is wonderful, I use it all the time. It has exactly zero to do with
>> what we are talking about. We are talking about pg_dump and its usefulness.
>> Are you suggesting that we tell people to not use pg_dump and instead use
>> PITR?
>
> For backups, absolutely.
>
> That is, assuming you mean "pg_basebackup -x", and not full
> archive_command/full pitr setup.
>
> That's still a separate argument, of course.

Please, for the love of God start another thread for that :P.

JD


-- 
Command Prompt, Inc. - http://www.commandprompt.com/  509-416-6579
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc
For my dreams of your image that blossoms   a rose in the deeps of my heart. - W.B. Yeats



Re: Changing pg_dump default file format

From
Josh Berkus
Date:
On 11/07/2013 08:26 AM, Tom Lane wrote:
> Magnus Hagander <magnus@hagander.net> writes:
>> On Thu, Nov 7, 2013 at 5:02 PM, Joshua D. Drake <jd@commandprompt.com> wrote:
>>> I humbly request on behalf of those who manage production postgresql
>>> instances that we change the default backup file format from -Fp to -Fc.
> 
>> I'm a bit worried about it breaking peoples scripts as well, if they
>> pipe the output.
> 
> Whether they pipe the output or not, this will break people's scripts.
> I agree it's a legacy behavior that we wouldn't choose again, but you
> could say that about an awful lot of things including much of the SQL
> standard.  I think it's too late to consider this now.

I don't agree that it's too late, personally.  However, I do agree that
it would be problematic to change the default format since the default
is to send to STDOUT.  If we wanted to change the default format, we
would need to make "filename" a required parameter.

If we wanted to change the defaults, I think it would be easier to
create a separate bin name (e.g. pg_backup) than to change the existing
parameters for pg_dump.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: Changing pg_dump default file format

From
"Joshua D. Drake"
Date:
On 11/07/2013 10:00 AM, Josh Berkus wrote:
>
> On 11/07/2013 08:26 AM, Tom Lane wrote:
>> Magnus Hagander <magnus@hagander.net> writes:
>>> On Thu, Nov 7, 2013 at 5:02 PM, Joshua D. Drake <jd@commandprompt.com> wrote:
>>>> I humbly request on behalf of those who manage production postgresql
>>>> instances that we change the default backup file format from -Fp to -Fc.
>>
>>> I'm a bit worried about it breaking peoples scripts as well, if they
>>> pipe the output.
>>
>> Whether they pipe the output or not, this will break people's scripts.
>> I agree it's a legacy behavior that we wouldn't choose again, but you
>> could say that about an awful lot of things including much of the SQL
>> standard.  I think it's too late to consider this now.
>
> I don't agree that it's too late, personally.  However, I do agree that
> it would be problematic to change the default format since the default
> is to send to STDOUT.  If we wanted to change the default format, we
> would need to make "filename" a required parameter.
>
> If we wanted to change the defaults, I think it would be easier to
> create a separate bin name (e.g. pg_backup) than to change the existing
> parameters for pg_dump.

I am not opposed to that. Allow pg_dump to be what it is, and create a 
pg_backup?

JD

>


-- 
Command Prompt, Inc. - http://www.commandprompt.com/  509-416-6579
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc
For my dreams of your image that blossoms   a rose in the deeps of my heart. - W.B. Yeats



Re: Changing pg_dump default file format

From
Magnus Hagander
Date:
On Thu, Nov 7, 2013 at 7:08 PM, Joshua D. Drake <jd@commandprompt.com> wrote:
>
> On 11/07/2013 10:00 AM, Josh Berkus wrote:
>>
>>
>> On 11/07/2013 08:26 AM, Tom Lane wrote:
>>>
>>> Magnus Hagander <magnus@hagander.net> writes:
>>>>
>>>> On Thu, Nov 7, 2013 at 5:02 PM, Joshua D. Drake <jd@commandprompt.com>
>>>> wrote:
>>>>>
>>>>> I humbly request on behalf of those who manage production postgresql
>>>>> instances that we change the default backup file format from -Fp to
>>>>> -Fc.
>>>
>>>
>>>> I'm a bit worried about it breaking peoples scripts as well, if they
>>>> pipe the output.
>>>
>>>
>>> Whether they pipe the output or not, this will break people's scripts.
>>> I agree it's a legacy behavior that we wouldn't choose again, but you
>>> could say that about an awful lot of things including much of the SQL
>>> standard.  I think it's too late to consider this now.
>>
>>
>> I don't agree that it's too late, personally.  However, I do agree that
>> it would be problematic to change the default format since the default
>> is to send to STDOUT.  If we wanted to change the default format, we
>> would need to make "filename" a required parameter.
>>
>> If we wanted to change the defaults, I think it would be easier to
>> create a separate bin name (e.g. pg_backup) than to change the existing
>> parameters for pg_dump.
>
>
> I am not opposed to that. Allow pg_dump to be what it is, and create a
> pg_backup?

If we want to go down that route, please let's pick a different name.
pg_backup is going to cause a *lot* of confusion in relation to
pg_basebackup for people.


-- Magnus HaganderMe: http://www.hagander.net/Work: http://www.redpill-linpro.com/



Re: Changing pg_dump default file format

From
Andres Freund
Date:
On 2013-11-07 10:00:53 -0800, Josh Berkus wrote:
> If we wanted to change the defaults, I think it would be easier to
> create a separate bin name (e.g. pg_backup) than to change the existing
> parameters for pg_dump.

Provided somebody is actually willing to put in the not inconsiderable
amount of work to make this smoother for users, something like pg_backup
would be a great idea. It should have an interface that allows to dump
global and local object and it should be able to dump multiple databases
in one invocation. Explaining users they have to use pg_dumpall for
global objects but that they also shouldn't dare to use it for anything
else is, well, difficult.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: Changing pg_dump default file format

From
Tom Lane
Date:
Josh Berkus <josh@agliodbs.com> writes:
> If we wanted to change the defaults, I think it would be easier to
> create a separate bin name (e.g. pg_backup) than to change the existing
> parameters for pg_dump.

That might work.  However, just to be contrary: what about the
functionality of pg_dumpall?  We've heard repeated complaints about
pg_dump's lack of ability to dump multiple databases or global settings,
and just institutionalizing the existing -Fc format as the default
will make it harder not easier to fix that.

I'd be happier about inventing a pg_backup if it were a replacement for
pg_dumpall and not just a cosmetic wrapper.
        regards, tom lane



Re: Changing pg_dump default file format

From
Jeff Janes
Date:
On Thu, Nov 7, 2013 at 8:53 AM, Joshua D. Drake <jd@commandprompt.com> wrote:

On 11/07/2013 08:21 AM, Kevin Grittner wrote:

I humbly request on behalf of those who manage production postgresql
instances that we change the default backup file format from -Fp to -Fc.

-1 in the case that output goes to stdout.  It might make sense for
when -f is specified.  As a DBA, 90% of my usage of pg_dump was
piped to psql, with the rest split between data-only dumps for
individual tables and schema-only dumps.  I think I used -Fc about
twice -- before I figured out how to do PITR-style backups.


PITR is wonderful, I use it all the time. It has exactly zero to do with what we are talking about. We are talking about pg_dump and its usefulness. Are you suggesting that we tell people to not use pg_dump and instead use PITR?

That depends on what they are using it for.  If they are using pg_dump to do backups that are better done with PITR, then they should use PITR.  When I use pg_dump, it is because I want to see what it is that is getting dumped--if I didn't want it be human readable, I'd use a binary backup instead of a binary dump.

One of the things that first attracted me to PostgreSQL was that you use the tool provided to dump the data, and what you get by default is identifiably your data, rather than something that looks like it was designed to lock you in to some proprietary tool and never let you out again.

(Another thing that attracted me to it was that it didn't run around renaming things between every release.)

 

This, for example, is a valid and useful construct:

pg_dump test | psql test2

... or:

pg_dump -s test | psql test3


pg_dump -s test -fp| psql test3


A lot of scripts will be broken by changing the default format for
piped output.

For the sake of sanity... really?  You just proved my point. This isn't about developers. Developers now how to code. They can change a simple script. This is about production DBA/admins, the 98% of people using postgresql.

I don't think that adding a command line option constitutes coding.  If they can't figure out how to do that for pg_dump, how are they ever going to use pg_restore in the first place?

Cheers,

Jeff

Re: Changing pg_dump default file format

From
David Fetter
Date:
On Thu, Nov 07, 2013 at 01:45:20PM -0500, Tom Lane wrote:
> Josh Berkus <josh@agliodbs.com> writes:
> > If we wanted to change the defaults, I think it would be easier to
> > create a separate bin name (e.g. pg_backup) than to change the
> > existing parameters for pg_dump.
> 
> That might work.  However, just to be contrary: what about the
> functionality of pg_dumpall?  We've heard repeated complaints about
> pg_dump's lack of ability to dump multiple databases or global
> settings, and just institutionalizing the existing -Fc format as the
> default will make it harder not easier to fix that.
> 
> I'd be happier about inventing a pg_backup if it were a replacement
> for pg_dumpall and not just a cosmetic wrapper.

That the two are still separate is idiotic.  If it were me waving this
wand, I'd see to it that -Fd (directory) rather than -Fc (custom), was
the default for pg_backup, as it makes the choice to use -j (>1 core)
trivial.  I'd also make sure that the directory format could handle
the entirety of the backed-up database in some comprehensible way.
Maybe a globals/ and a databases/ directory up top...

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: Changing pg_dump default file format

From
Kevin Grittner
Date:
Joshua D. Drake <jd@commandprompt.com> wrote:

> PITR is wonderful, I use it all the time. It has exactly zero to
> do with what we are talking about. We are talking about pg_dump
> and its usefulness. Are you suggesting that we tell people to not
> use pg_dump and instead use PITR?

Yes.  (Well, actually, any of the WAL-based techniques.)  pg_dump
is not a good tool for primary backups, for many reasons.  At least
on databases over 50GB or so or which have more than a few data
modifications per second.  It is OK on a tiny 10GB database with
daily (or less frequent) batch modifications, but how often do you
see something like that?

> This is about production DBA/admins, the 98% of people using
> postgresql.

That is who I am thinking of.  A DBA team may have hundreds of
databases to manage, each with many scripts which have been running
nicely for years.  A change like this is bound to break some of
those crontab scripts they may not even remember they are running
-- like ones which dump a key table to INSERT statements to feed
into some other database product, which will now choke when it gets
a custom format file instead of the text file it has been getting
for years.  Requiring the DBA team to track all these scripts down
to add -Fp would be annoying, to put it mildly.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Changing pg_dump default file format

From
Alvaro Herrera
Date:
Tom Lane wrote:
> Josh Berkus <josh@agliodbs.com> writes:
> > If we wanted to change the defaults, I think it would be easier to
> > create a separate bin name (e.g. pg_backup) than to change the existing
> > parameters for pg_dump.
> 
> That might work.  However, just to be contrary: what about the
> functionality of pg_dumpall?  We've heard repeated complaints about
> pg_dump's lack of ability to dump multiple databases or global settings,
> and just institutionalizing the existing -Fc format as the default
> will make it harder not easier to fix that.
> 
> I'd be happier about inventing a pg_backup if it were a replacement for
> pg_dumpall and not just a cosmetic wrapper.

Maybe we could provide a master controller program (pg_backup or
whatever name we agree on), which could receive commands much like
pg_ctl.

$ pg_backup --help
pg_backup is a backup handler program for PostgreSQL.

Usage: pg_backup backup [-p PATTERN] DATABASE [...] pg_backup restore [-f PATH] pg_backup backup_globals ...

The "pattern" stuff would let you specify many databases in the command
line and have it dump each DB on a separate file, and perhaps do
strftime-style replacements, etc.

Initially we could have this just call pg_dump underneath, but
eventually it might grow superpowers of its own.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: Changing pg_dump default file format

From
Jan Lentfer
Date:
Am 07.11.2013 19:08, schrieb Joshua D. Drake:>
> On 11/07/2013 10:00 AM, Josh Berkus wrote:
>> If we wanted to change the defaults, I think it would be easier to
>> create a separate bin name (e.g. pg_backup) than to change the existing
>> parameters for pg_dump.
>
> I am not opposed to that. Allow pg_dump to be what it is, and create a
> pg_backup?
>
> JD


I would definitely agree to having "one" backup utility and making -Fc
the default for SQL dumps. One could even argue if the functionality of
pg_basebackup should be part of that too. But I would be fine with
having two distinct utilities (one for file level backups and one for
logical/SQL level backups), too.

Btw, how hard would it be, to have pg_restore and now also pg_dump run
with -j option do some ordering of work by size of e.g. the tables? E.g.
if you run with -j4 it would make sense to start working on the largest
tables (and it's indexes) first and continue by descending in t´size to
keep all available "slots" filled as good as possible. Just at though.

Jan




Re: Changing pg_dump default file format

From
"Joshua D. Drake"
Date:
On 11/07/2013 11:01 AM, Alvaro Herrera wrote:

> Maybe we could provide a master controller program (pg_backup or
> whatever name we agree on), which could receive commands much like
> pg_ctl.
>
> $ pg_backup --help
> pg_backup is a backup handler program for PostgreSQL.
>
> Usage:
>    pg_backup backup [-p PATTERN] DATABASE [...]
>    pg_backup restore [-f PATH]
>    pg_backup backup_globals
>    ...
>
> The "pattern" stuff would let you specify many databases in the command
> line and have it dump each DB on a separate file, and perhaps do
> strftime-style replacements, etc.
>
> Initially we could have this just call pg_dump underneath, but
> eventually it might grow superpowers of its own.
>

+1

-- 
Command Prompt, Inc. - http://www.commandprompt.com/  509-416-6579
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc
For my dreams of your image that blossoms   a rose in the deeps of my heart. - W.B. Yeats



Re: Changing pg_dump default file format

From
Peter Eisentraut
Date:
On 11/7/13, 1:00 PM, Josh Berkus wrote:
> If we wanted to change the defaults, I think it would be easier to
> create a separate bin name (e.g. pg_backup) than to change the existing
> parameters for pg_dump.

Note the following code in pg_dump.c:
   /* Set default options based on progname */   if (strcmp(progname, "pg_backup") == 0)       format = "c";



Re: Changing pg_dump default file format

From
"Joshua D. Drake"
Date:
On 11/07/2013 10:54 AM, Kevin Grittner wrote:

> That is who I am thinking of.  A DBA team may have hundreds of
> databases to manage, each with many scripts which have been running
> nicely for years.  A change like this is bound to break some of
> those crontab scripts they may not even remember they are running
> -- like ones which dump a key table to INSERT statements to feed
> into some other database product, which will now choke when it gets
> a custom format file instead of the text file it has been getting
> for years.  Requiring the DBA team to track all these scripts down
> to add -Fp would be annoying, to put it mildly.

Only because they don't take the time to properly document or put into 
some form of automation/tracking/configfile management mechanism.

Don't get me wrong, I would be in the same boat but I don't think it is 
realistic to manage the communities expectations with the lackluster 
operations performance of fellow DBAs.

JD

-- 
Command Prompt, Inc. - http://www.commandprompt.com/  509-416-6579
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc
For my dreams of your image that blossoms   a rose in the deeps of my heart. - W.B. Yeats



Re: Changing pg_dump default file format

From
Andrew Dunstan
Date:
On 11/07/2013 02:27 PM, Jan Lentfer wrote:

>
> Btw, how hard would it be, to have pg_restore and now also pg_dump run
> with -j option do some ordering of work by size of e.g. the tables? E.g.
> if you run with -j4 it would make sense to start working on the largest
> tables (and it's indexes) first and continue by descending in t´size to
> keep all available "slots" filled as good as possible. Just at though.
>


Please start another thread on this - it's really completely separate
from the issue at hand. Hijacking threads like this is not a good idea.

cheers

andrew




Re: Changing pg_dump default file format

From
Josh Berkus
Date:
All,

I'm amused at how this has become a "let's pile on everything which has
ever been missing in pg_dump into one thread".

Agree with Tom that if we're going to create a new program name, we
should fix the pg_dumpall issue as well.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: Changing pg_dump default file format

From
"Joshua D. Drake"
Date:
On 11/07/2013 12:42 PM, Josh Berkus wrote:
>
> All,
>
> I'm amused at how this has become a "let's pile on everything which has
> ever been missing in pg_dump into one thread".

Well it has been broken longer than most of our utilities. Sorry... not 
broken but certainly not complete. It is to be expected.

>
> Agree with Tom that if we're going to create a new program name, we
> should fix the pg_dumpall issue as well.
>

All I want to start is this simple fix. I don't know who is going to 
step up to work on pg_dump/all considering it is the bastard step child 
of the project. We only work on it when we absolutely have to (and CMD 
is just as guilty as the rest).

JD


-- 
Command Prompt, Inc. - http://www.commandprompt.com/  509-416-6579
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc
For my dreams of your image that blossoms   a rose in the deeps of my heart. - W.B. Yeats



Re: Changing pg_dump default file format

From
Andrew Dunstan
Date:
On 11/07/2013 06:33 PM, Joshua D. Drake wrote:
>
> On 11/07/2013 12:42 PM, Josh Berkus wrote:
>>
>> All,
>>
>> I'm amused at how this has become a "let's pile on everything which has
>> ever been missing in pg_dump into one thread".
>
> Well it has been broken longer than most of our utilities. Sorry... 
> not broken but certainly not complete. It is to be expected.
>
>>
>> Agree with Tom that if we're going to create a new program name, we
>> should fix the pg_dumpall issue as well.
>>
>
> All I want to start is this simple fix. I don't know who is going to 
> step up to work on pg_dump/all considering it is the bastard step 
> child of the project. We only work on it when we absolutely have to 
> (and CMD is just as guilty as the rest).
>
>


What you're asking for isn't a simple fix. Calling it one doesn't make 
it so.

cheers

andrew



Re: Changing pg_dump default file format

From
Craig Ringer
Date:
On 11/08/2013 12:53 AM, Harold Giménez wrote:
> 
> 
> Just a thought: If it's the right thing to do, why not do it in small
> steps? 
> For example, couldn't we deprecate the behavior on 9.4, and switch over in 
> 9.5? By deprecate I mean a warning message on STDERR making it clear
> they're relying on behavior that is scheduled to change.

I used to like that idea too, but I've since learned the hard way that:

(a) Lots of people only upgrade every two, three, or even more major
versions. I'm dealing with clients on 8.3, and people still pop up on
Stack Overflow with 8.1 sometimes! These people don't ever see the
deprecated phase.

(b) People routinely ignore cron job output. Even important job output.
They won't see the messages, and won't act on them if they do until
something actually breaks.


-- Craig Ringer                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: Changing pg_dump default file format

From
Craig Ringer
Date:
On 11/08/2013 12:55 AM, Joshua D. Drake wrote:
> 
> We have certainly done worse in the past. 8.3 comes to mind. I am trying
> to consider the future here, as we continue to grow. Why leave thorns
> for the sake of leaving thorns?

8.3 was _bad_. It's been a major upgrade barrier, and a huge pain.

I was also very unhappy with the way that the bytea_output change worked
out. I didn't have any idea it'd work out that way, but it caused a fair
bit of data corruption for real-world users because some client drivers
liked to interpret 'hex' format data as 'escape', then re-escape it when
writing it back to the server.

PHP's `pg' driver was notably at fault here - I'm not sure it supports
'hex' even now. It's mostly deprecated in favour of PDO, but it's still
in heavy use even in new code (you know, PHP users). It also took quite
some time for Ruby's "Pg" gem to stop mangling round-tripped bytea data.

If we changed the default, it would be absolutely vital to modify the
pg_restore output and psql's behaviour so that this didn't happen:



psql:/home/craig/tmp/rt-db-2013-07-09.dump:1: ERROR:  syntax error at or
near "PGDMP"
LINE 1: PGDMP^A^L^A^A^A^AREVOKE ALL ON SCHEMA londiste FROM dbadmin;       ^
psql:/home/craig/tmp/rt-db-2013-07-09.dump:2: ERROR:  schema "londiste"
does not exist
psql:/home/craig/tmp/rt-db-2013-07-09.dump:3: ERROR:  schema "londiste"
does not exist
psql:/home/craig/tmp/rt-db-2013-07-09.dump:4: ERROR:  syntax error at or
near "^A"
LINE 1: ^A^AREVOKE ALL ON SCHEMA pgq FROM dbadmin;       ^
psql:/home/craig/tmp/rt-db-2013-07-09.dump:5: ERROR:  schema "pgq" does
not exist
psql:/home/craig/tmp/rt-db-2013-07-09.dump:6: ERROR:  schema "pgq" does
not exist
psql:/home/craig/tmp/rt-db-2013-07-09.dump:7: ERROR:  syntax error at or
near "^A"
LINE 1: ^A^AREVOKE ALL ON SCHEMA pgq_ext FROM dbadmin;       ^
psql:/home/craig/tmp/rt-db-2013-07-09.dump:8: ERROR:  schema "pgq_ext"
does not exist
psql:/home/craig/tmp/rt-db-2013-07-09.dump:9: ERROR:  schema "pgq_ext"
does not exist
psql:/home/craig/tmp/rt-db-2013-07-09.dump:10: ERROR:  syntax error at
or near "^A"
LINE 1: ^A^AREVOKE ALL ON SCHEMA pgq_node FROM dbadmin;       ^
psql:/home/craig/tmp/rt-db-2013-07-09.dump:11: ERROR:  schema "pgq_node"
does not exist
psql:/home/craig/tmp/rt-db-2013-07-09.dump:12: ERROR:  schema "pgq_node"
does not exist
psql:/home/craig/tmp/rt-db-2013-07-09.dump:13: ERROR:  syntax error at
or near "^A"
LINE 1: ^A^AREVOKE ALL ON SCHEMA public FROM postgres;       ^


-- Craig Ringer                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: Changing pg_dump default file format

From
Craig Ringer
Date:
On 11/08/2013 07:33 AM, Joshua D. Drake wrote:
>>
> 
> All I want to start is this simple fix. I don't know who is going to
> step up to work on pg_dump/all considering it is the bastard step child
> of the project. We only work on it when we absolutely have to (and CMD
> is just as guilty as the rest).

The change its self is simple; the consequences aren't. We learned that
with bytea_output.

There are two-way compat issues to deal with:

- old pg_dump with new pg_restore
- new pg_dump with old pg_restore
- old pg_dump with new psql
- new pg_dump with old psql

and all of pg_dump, pg_restore, and psql are relevant because of the way
they handle errors and consume pg_dump output.

-- Craig Ringer                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: Changing pg_dump default file format

From
Harold Giménez
Date:



On Thu, Nov 7, 2013 at 7:01 PM, Craig Ringer <craig@2ndquadrant.com> wrote:

(a) Lots of people only upgrade every two, three, or even more major
versions. I'm dealing with clients on 8.3, and people still pop up on
Stack Overflow with 8.1 sometimes! These people don't ever see the
deprecated phase.

Interesting that they'd never update their clients either. I guess if that's true
there's a mentality of if it ain't broke don't fix it going on here.

Would they read change logs before upgrading, or just cross their fingers?
 

(b) People routinely ignore cron job output. Even important job output.
They won't see the messages, and won't act on them if they do until
something actually breaks.

How common is this?

Re: Changing pg_dump default file format

From
Craig Ringer
Date:
On 11/08/2013 11:41 AM, Harold Giménez wrote:
> 
> 
> 
> On Thu, Nov 7, 2013 at 7:01 PM, Craig Ringer <craig@2ndquadrant.com
> <mailto:craig@2ndquadrant.com>> wrote:
> 
> 
>     (a) Lots of people only upgrade every two, three, or even more major
>     versions. I'm dealing with clients on 8.3, and people still pop up on
>     Stack Overflow with 8.1 sometimes! These people don't ever see the
>     deprecated phase.
> 
> 
> Interesting that they'd never update their clients either. I guess if
> that's true
> there's a mentality of if it ain't broke don't fix it going on here.

I've seen all sorts of combinations of client and server versions in the
wild. Ancient JDBC, ODBC, etc drivers are also common.

> Would they read change logs before upgrading, or just cross their fingers?
> 
>     (b) People routinely ignore cron job output. Even important job output.
>     They won't see the messages, and won't act on them if they do until
>     something actually breaks.
> 
> 
> How common is this?

I couldn't possibly say, I can only go by what I see in communication
with clients, in private mail, in the mailing lists, and on Stack Overflow.

I do see a couple of different groups:

* People who upgrade casually without looking at release notes etc, then
wonder why everything just broke. Typically people running PostgreSQL in
development environments. I'm not too fussed about this group, they do
it to themselves. On the other hand they're a *big* group (think Ruby on
Rails developers, etc) and the more of them who whine about how
PostgreSQL is painful to upgrade and always breaks, the worse it is for
general uptake of Pg.

* Those who don't upgrade because they don't know or care to. That box
has been sitting there doing its thing for ages, and until they hit some
key limitation, trigger an old bug, or finally need to do something
different they don't realise that life would've been easier if they
hadn't stayed on 7.1. These folks seem to be the most likely ones to
unthinkingly upgrade from 7.something-low or 8.x straight to 9.3 without
reading the release notes then wonder why things don't work, especially
since they'll tend to do it in a hurry as a knee-jerk to try to fix a
problem.

* People who want to upgrade but are choked by bureaucracy and change
management processes that move on a tectonic time scale. They're still
on 8.something-low because it's just too painful to change. They're the
ones who'll ask you to backport synchronous replication into 9.0 because
they're not allowed to upgrade to 9.1/9.2, despite the obvious insanity
of running custom-patched and rebuilt software instead of a well-tested
public release. When they upgrade they do it in giant leaps despite the
pain involved, just because it's so hard to upgrade at all. They're
likely to plan carefully and do it right.

* People who'd love to upgrade, but have an old database running a 24x7
mission critical system with enough data that they just can't get a
dump/reload window, and they're on something older than 8.4 so they
can't pg_upgrade. When they do upgrade they tend to plan it in detail,
test carefully first, etc, so again they're pretty OK.


In general, people are busy and the database isn't all they care about.
They probably read the release notes about as well as you read the
release notes on a new distro upgrade or something else that you care
moderately about.

If you're doing a major upgrade from an old Pg to a very new one there's
a lot to take in and a lot of rel notes to read. One of the things that
might help here is if we had (on the wiki, maybe) a single document that
kept a running list of compatibility affecting changes and upgrades that
need special action, along with a recommended upgrade path. That way
people wouldn't have to read 6 major versions of release notes, get half
way, and give up.

-- Craig Ringer                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: Changing pg_dump default file format

From
Harold Giménez
Date:
I don't want to hijack this thread any further, but Craig, thanks for your insight.

-Harold


On Thu, Nov 7, 2013 at 8:35 PM, Craig Ringer <craig@2ndquadrant.com> wrote:
On 11/08/2013 11:41 AM, Harold Giménez wrote:
>
>
>
> On Thu, Nov 7, 2013 at 7:01 PM, Craig Ringer <craig@2ndquadrant.com
> <mailto:craig@2ndquadrant.com>> wrote:
>
>
>     (a) Lots of people only upgrade every two, three, or even more major
>     versions. I'm dealing with clients on 8.3, and people still pop up on
>     Stack Overflow with 8.1 sometimes! These people don't ever see the
>     deprecated phase.
>
>
> Interesting that they'd never update their clients either. I guess if
> that's true
> there's a mentality of if it ain't broke don't fix it going on here.

I've seen all sorts of combinations of client and server versions in the
wild. Ancient JDBC, ODBC, etc drivers are also common.

> Would they read change logs before upgrading, or just cross their fingers?
>
>     (b) People routinely ignore cron job output. Even important job output.
>     They won't see the messages, and won't act on them if they do until
>     something actually breaks.
>
>
> How common is this?

I couldn't possibly say, I can only go by what I see in communication
with clients, in private mail, in the mailing lists, and on Stack Overflow.

I do see a couple of different groups:

* People who upgrade casually without looking at release notes etc, then
wonder why everything just broke. Typically people running PostgreSQL in
development environments. I'm not too fussed about this group, they do
it to themselves. On the other hand they're a *big* group (think Ruby on
Rails developers, etc) and the more of them who whine about how
PostgreSQL is painful to upgrade and always breaks, the worse it is for
general uptake of Pg.

* Those who don't upgrade because they don't know or care to. That box
has been sitting there doing its thing for ages, and until they hit some
key limitation, trigger an old bug, or finally need to do something
different they don't realise that life would've been easier if they
hadn't stayed on 7.1. These folks seem to be the most likely ones to
unthinkingly upgrade from 7.something-low or 8.x straight to 9.3 without
reading the release notes then wonder why things don't work, especially
since they'll tend to do it in a hurry as a knee-jerk to try to fix a
problem.

* People who want to upgrade but are choked by bureaucracy and change
management processes that move on a tectonic time scale. They're still
on 8.something-low because it's just too painful to change. They're the
ones who'll ask you to backport synchronous replication into 9.0 because
they're not allowed to upgrade to 9.1/9.2, despite the obvious insanity
of running custom-patched and rebuilt software instead of a well-tested
public release. When they upgrade they do it in giant leaps despite the
pain involved, just because it's so hard to upgrade at all. They're
likely to plan carefully and do it right.

* People who'd love to upgrade, but have an old database running a 24x7
mission critical system with enough data that they just can't get a
dump/reload window, and they're on something older than 8.4 so they
can't pg_upgrade. When they do upgrade they tend to plan it in detail,
test carefully first, etc, so again they're pretty OK.


In general, people are busy and the database isn't all they care about.
They probably read the release notes about as well as you read the
release notes on a new distro upgrade or something else that you care
moderately about.

If you're doing a major upgrade from an old Pg to a very new one there's
a lot to take in and a lot of rel notes to read. One of the things that
might help here is if we had (on the wiki, maybe) a single document that
kept a running list of compatibility affecting changes and upgrades that
need special action, along with a recommended upgrade path. That way
people wouldn't have to read 6 major versions of release notes, get half
way, and give up.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: Changing pg_dump default file format

From
Josh Berkus
Date:
On 11/07/2013 08:35 PM, Craig Ringer wrote:
> I've seen all sorts of combinations of client and server versions in the
> wild. Ancient JDBC, ODBC, etc drivers are also common.

There's a security compromise out there for the 8.1 JDBC driver which
was discovered last year, and folks *keep* reporting to pgsql-security.Even though the 8.1 driver had been officially
deprecatedfor 2 years
 
before the security issue was discovered.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: Changing pg_dump default file format

From
Bruce Momjian
Date:
On Thu, Nov  7, 2013 at 02:40:17PM -0500, Peter Eisentraut wrote:
> On 11/7/13, 1:00 PM, Josh Berkus wrote:
> > If we wanted to change the defaults, I think it would be easier to
> > create a separate bin name (e.g. pg_backup) than to change the existing
> > parameters for pg_dump.
> 
> Note the following code in pg_dump.c:
> 
>     /* Set default options based on progname */
>     if (strcmp(progname, "pg_backup") == 0)
>         format = "c";

Wow, when was that added?  git blame says 2002:
 9f0ae0c8 (Tom Lane           2002-05-10 22:36:27 +0000   387)   /* Set default options based on progname */ 9f0ae0c8
(TomLane           2002-05-10 22:36:27 +0000   388)   if (strcmp(progname, "pg_backup") == 0) 9f0ae0c8 (Tom Lane
  2002-05-10 22:36:27 +0000   389)       format = "c";
 

However, pggit log 9f0ae0c82060e3dcd1fa7ac8bbe35a3f9a44dbba does not
show that line being added by the diff.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + Everyone has their own god. +



Re: Changing pg_dump default file format

From
Robert Haas
Date:
On Mon, Nov 18, 2013 at 3:21 PM, Bruce Momjian <bruce@momjian.us> wrote:
> On Thu, Nov  7, 2013 at 02:40:17PM -0500, Peter Eisentraut wrote:
>> On 11/7/13, 1:00 PM, Josh Berkus wrote:
>> > If we wanted to change the defaults, I think it would be easier to
>> > create a separate bin name (e.g. pg_backup) than to change the existing
>> > parameters for pg_dump.
>>
>> Note the following code in pg_dump.c:
>>
>>     /* Set default options based on progname */
>>     if (strcmp(progname, "pg_backup") == 0)
>>         format = "c";
>
> Wow, when was that added?  git blame says 2002:
>
>   9f0ae0c8 (Tom Lane           2002-05-10 22:36:27 +0000   387)   /* Set default options based on progname */
>   9f0ae0c8 (Tom Lane           2002-05-10 22:36:27 +0000   388)   if (strcmp(progname, "pg_backup") == 0)
>   9f0ae0c8 (Tom Lane           2002-05-10 22:36:27 +0000   389)       format = "c";
>
> However, pggit log 9f0ae0c82060e3dcd1fa7ac8bbe35a3f9a44dbba does not
> show that line being added by the diff.

I dunno what your pggit script does, but "git log" doesn't normally
show the diff at all.  "git show
9f0ae0c82060e3dcd1fa7ac8bbe35a3f9a44dbba", however, does show that
line being added.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Changing pg_dump default file format

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
>> However, pggit log 9f0ae0c82060e3dcd1fa7ac8bbe35a3f9a44dbba does not
>> show that line being added by the diff.

> I dunno what your pggit script does, but "git log" doesn't normally
> show the diff at all.  "git show
> 9f0ae0c82060e3dcd1fa7ac8bbe35a3f9a44dbba", however, does show that
> line being added.

It also shows it being removed, ie this is just -u diff format being
unhelpful, as it so often is.

The actual source of the code seems to be commit
e8f69be054e9343b3c41d7e77cc142913ee55439.
        regards, tom lane



Re: Changing pg_dump default file format

From
Bruce Momjian
Date:
On Tue, Nov 19, 2013 at 11:26:34AM -0500, Tom Lane wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> >> However, pggit log 9f0ae0c82060e3dcd1fa7ac8bbe35a3f9a44dbba does not
> >> show that line being added by the diff.
> 
> > I dunno what your pggit script does, but "git log" doesn't normally
> > show the diff at all.  "git show
> > 9f0ae0c82060e3dcd1fa7ac8bbe35a3f9a44dbba", however, does show that
> > line being added.
> 
> It also shows it being removed, ie this is just -u diff format being
> unhelpful, as it so often is.
> 
> The actual source of the code seems to be commit
> e8f69be054e9343b3c41d7e77cc142913ee55439.

Thanks.  Even the commit message doesn't mention this gem.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + Everyone has their own god. +