Thread: pg_dump --comment?

pg_dump --comment?

From
Chris Campbell
Date:
I've encountered a situation where I'd like to store some information 
about the database when I do a pg_dump. For instance, the timestamp of 
the dump. And some other information that I pull from the database.

If pg_dump had a --comment flag that allowed me to pass a string that 
would be stored inside the dumpfile, that I could then retrieve in some 
way (without doing a full restore of the dump), that would meet my 
needs. In discussing this idea with other people, it sounded like a 
general-use feature that mankind as a whole could benefit from. :)

Here's what I'm envisioning:
   pg_dump --comment 'This is a comment' <more pg_dump args>

That would store the comment ('This is a comment') in the dump file 
somehow. The definition of "somehow" would vary depending on the output 
format (text, tar, or custom). Initially, since I only use the custom 
format, I would only focus on getting it to work with that. But for the 
text format, there could be a SQL comment at the top of the file with

-- COMMENT: This is a comment

or something. In the tar format, there could be a "comment" file in the 
archive that contains the text "This is a comment".

For the custom format...I haven't looked at the format specification, so 
I don't know exactly where the comment would go. It could go at the very 
top of the file, and have a special delimiter after it. pg_restore would 
just skim over the file until the delimiter is reached, and then go on 
about its business. The benefit of this scheme is that any program could 
read the comment -- just open a file and read the bytes until the delimiter.

There could also be a pg_dump or pg_restore option that prints out the 
comment stored in a given dump file, or another binary (pg_comment?) 
that does that.

Is this a desirable feature? Should I work it up like described and 
submit a patch? Any comments/suggestions?

Thanks!

- Chris

Re: pg_dump --comment?

From
Christopher Kings-Lynne
Date:
> I've encountered a situation where I'd like to store some information 
> about the database when I do a pg_dump. For instance, the timestamp of 
> the dump. And some other information that I pull from the database.

I think every dump should dump the timestamp regardless...

Chris



Re: pg_dump --comment?

From
Jon Jensen
Date:
On Fri, 28 May 2004, Christopher Kings-Lynne wrote:

> > I've encountered a situation where I'd like to store some information 
> > about the database when I do a pg_dump. For instance, the timestamp of 
> > the dump. And some other information that I pull from the database.
> 
> I think every dump should dump the timestamp regardless...

That would cause me a lot of trouble. Every night I do a pg_dump on all my
databases to a temporary file. Then I use cmp to compare that dump to last
night's dump. If they're identical I just delete the new dump so that only
the old one remains, with its original timestamp. That way rsync doesn't
see any change, and doesn't waste any time comparing it when we do
backups. It's also handy to see the last day the dump changed by looking
at the file's timestamp.

Granted, this is only of interest on databases that don't change at all, 
but on a multi-user system we have a surprising number of databases that 
don't change at all for days (alongside the ones that change all the time, 
of course).

However, I would like to see an option to include the timestamp if someone 
wants it.

Jon


Re: pg_dump --comment?

From
Andrew Dunstan
Date:
Jon Jensen wrote:

>On Fri, 28 May 2004, Christopher Kings-Lynne wrote:
>
>  
>
>>>I've encountered a situation where I'd like to store some information 
>>>about the database when I do a pg_dump. For instance, the timestamp of 
>>>the dump. And some other information that I pull from the database.
>>>      
>>>
>>I think every dump should dump the timestamp regardless...
>>    
>>
>
>That would cause me a lot of trouble. Every night I do a pg_dump on all my
>databases to a temporary file. Then I use cmp to compare that dump to last
>night's dump. If they're identical I just delete the new dump so that only
>the old one remains, with its original timestamp. That way rsync doesn't
>see any change, and doesn't waste any time comparing it when we do
>backups. It's also handy to see the last day the dump changed by looking
>at the file's timestamp.
>
>Granted, this is only of interest on databases that don't change at all, 
>but on a multi-user system we have a surprising number of databases that 
>don't change at all for days (alongside the ones that change all the time, 
>of course).
>
>However, I would like to see an option to include the timestamp if someone 
>wants it.
>
>  
>

That makes sense.

A better answer to this particular problem might be incremental dumps, 
though :-)

cheers

andrew


Re: pg_dump --comment?

From
James Robinson
Date:
On May 28, 2004, at 10:48 AM, Andrew Dunstan wrote:

> A better answer to this particular problem might be incremental dumps, 
> though :-)
>

Oh would incremental dumps be ever so hungrily accepted by ever so many 
shops. I had imagined that PITR transaction log archiving would allow 
one to perform an equivalent to "repay initial full data dump", then 
foreach transaction log, replay. If no writes have occurred, then the 
transaction log would be small / nongrowing, right?

For example, we perform a full dump hourly, scp to a backup db box who 
then imports it in its entirety, giving us, on average, a half-hour's 
worth of potential data lossage at any given moment in time if the 
primary box goes horribly bad. With current database size / machine 
speed, etc., this process takes ~8 minutes, so we're not sweating it. 
But we know that the vast majority of the data did not change in the 
past hour, so the majority of that work was not truly necessary.

With PITR log archiving, could we kick off this transfer + replay 
activity on the second box to ultimately just ship deltas? I suspect 
that we could not get such fixed-time guarantees anymore, since any 
individual transaction log file would not become full until, well, 
enough writes have occurred to push it over the edge, so our 'data 
hanging in the wind not yet on the backup box' becomes a function of 
'most recent N Kbytes of changed data'. By lowering transaction segment 
size (or whatever controls the size of individual transaction log 
files) we could force archiving to occur more / less frequently 
according to write traffic, right? Poor man's (very) async replication.

If all of this is correct, kudos to the forward-thinkers involved with 
PITR and the abstract archive transaction log protocol. If my proposed 
usage is flawed, then negative kudos to my puny mind.

----
James Robinson
Socialserve.com



Re: pg_dump --comment?

From
Philip Warner
Date:
At 12:15 AM 29/05/2004, Jon Jensen wrote:
>However, I would like to see an option to include the timestamp if someone
>wants it.

The timestamp is already in the header for custom dumps. Have a look at the 
first few lines from a 'pg_restore -l'.

In terms of general comments, there is already a 'WARNING' type TOC entry 
for custom dumps. This emits a warning message when pg_restore is called.

A similar mechanism could be used to dump SQL comments. eg. new TOC type of 
'SQL_COMMENT' with a defn starting with '--' would work with old & new 
pg_dump versions. It might be a good idea to investigate the likely uses 
for these comments, since the TOC entries may be suppressed depending on 
whether they are deemed to be part of the schema or the data. Once the 
modes of use are defined, a better mechanism may become apparent.





----------------------------------------------------------------
Philip Warner                    |     __---_____
Albatross Consulting Pty. Ltd.   |----/       -  \
(A.B.N. 75 008 659 498)          |          /(@)   ______---_
Tel: (+61) 0500 83 82 81         |                 _________  \
Fax: (+61) 03 5330 3172          |                 ___________ |
Http://www.rhyme.com.au          |                /           \|                                 |    --________--
PGP key available upon request,  |  /
and from pgp.mit.edu:11371       |/ 



Re: pg_dump --comment?

From
Jan Wieck
Date:
This can be accomplished with 10-20 lines of shell scripting. I don't 
think it's worth the trouble bloating pg_dump with it at all.


Jan

On 5/27/2004 10:10 PM, Chris Campbell wrote:
> I've encountered a situation where I'd like to store some information 
> about the database when I do a pg_dump. For instance, the timestamp of 
> the dump. And some other information that I pull from the database.
> 
> If pg_dump had a --comment flag that allowed me to pass a string that 
> would be stored inside the dumpfile, that I could then retrieve in some 
> way (without doing a full restore of the dump), that would meet my 
> needs. In discussing this idea with other people, it sounded like a 
> general-use feature that mankind as a whole could benefit from. :)
> 
> Here's what I'm envisioning:
> 
>     pg_dump --comment 'This is a comment' <more pg_dump args>
> 
> That would store the comment ('This is a comment') in the dump file 
> somehow. The definition of "somehow" would vary depending on the output 
> format (text, tar, or custom). Initially, since I only use the custom 
> format, I would only focus on getting it to work with that. But for the 
> text format, there could be a SQL comment at the top of the file with
> 
> -- COMMENT: This is a comment
> 
> or something. In the tar format, there could be a "comment" file in the 
> archive that contains the text "This is a comment".
> 
> For the custom format...I haven't looked at the format specification, so 
> I don't know exactly where the comment would go. It could go at the very 
> top of the file, and have a special delimiter after it. pg_restore would 
> just skim over the file until the delimiter is reached, and then go on 
> about its business. The benefit of this scheme is that any program could 
> read the comment -- just open a file and read the bytes until the delimiter.
> 
> There could also be a pg_dump or pg_restore option that prints out the 
> comment stored in a given dump file, or another binary (pg_comment?) 
> that does that.
> 
> Is this a desirable feature? Should I work it up like described and 
> submit a patch? Any comments/suggestions?
> 
> Thanks!
> 
> - Chris


-- 
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #



Re: pg_dump --comment?

From
Bruce Momjian
Date:
Jan Wieck wrote:
> This can be accomplished with 10-20 lines of shell scripting. I don't 
> think it's worth the trouble bloating pg_dump with it at all.

Considering how much bloat -v adds, adding a datestamp to it is trivial.-v outputs things like:---- TOC entry 1449
(class0 OID 0)-- Dependencies: 4-- Name: public; Type: ACL; Schema: -; Owner: postgres--
 

so adding a timestamp at the top is probably a good extension.

Actually, the pg_dump -v docs are incorrect because they only mention
additional stderr output, not the additional comments.  I updated the
text to:
       Specifies verbose mode.  This will cause       <application>pg_dump</application> to output detailed object
comments in the dump file, and progress messages to standard error.
 

This -v functionality controlling comments is new in 7.5.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: pg_dump --comment?

From
Harald Fuchs
Date:
In article <40B69F86.2020002@bignerdranch.com>,
Chris Campbell <chris@bignerdranch.com> writes:

> I've encountered a situation where I'd like to store some information
> about the database when I do a pg_dump. For instance, the timestamp of
> the dump. And some other information that I pull from the database.

> If pg_dump had a --comment flag that allowed me to pass a string that
> would be stored inside the dumpfile, that I could then retrieve in
> some way (without doing a full restore of the dump), that would meet
> my needs. In discussing this idea with other people, it sounded like a
> general-use feature that mankind as a whole could benefit from. :)

Why don't you just do
 ( echo "-- This is my comment"   pg_dump whatever ) > dumpfile

?



Re: pg_dump --comment?

From
Chris Campbell
Date:
Harald Fuchs wrote:

> Why don't you just do
> 
>   ( echo "-- This is my comment"
>     pg_dump whatever
>   ) > dumpfile
> 
> ?

How could I dump using the custom format, and then use "dumpfile" with 
pg_restore to restore the dump? If I just prepend the comment to the 
file, then pg_restore will choke, since the file won't be in the proper 
custom format. I would have to remove the comment before sending the 
file to pg_restore. Is there an easy way to do that? That can be easily 
automated, and not take a huge amount of time given a 4 gig dump file 
that must be modified?

Given those requirements, building a commenting mechanism into the 
custom format would work out very nicely, I think.

Thanks!

- Chris


Re: pg_dump --comment?

From
Harald Fuchs
Date:
In article <40BBBC40.3040804@bignerdranch.com>,
Chris Campbell <chris@bignerdranch.com> writes:

> Harald Fuchs wrote:
>> Why don't you just do
>> ( echo "-- This is my comment"
>> pg_dump whatever
>> ) > dumpfile
>> ?

> How could I dump using the custom format, and then use "dumpfile" with
> pg_restore to restore the dump? If I just prepend the comment to the
> file, then pg_restore will choke, since the file won't be in the
> proper custom format. I would have to remove the comment before
> sending the file to pg_restore. Is there an easy way to do that? That
> can be easily automated, and not take a huge amount of time given a 4
> gig dump file that must be modified?

Since pg_restore is able to read from standard input, that should not
be a problem: instead of
 pg_restore options dumpfile

just do
 sed 1d dumpfile | pg_restore options