Thread: Patch for pg_dump: Multiple -t options and new -T option

Patch for pg_dump: Multiple -t options and new -T option

From
"David F. Skoll"
Date:
Attached is a patch against pg_dump version 7.4.3 that permits
multiple "-t" switches so that you can select more than one table (but
less than all) to dump.

It also adds a "-T" switch (long name "--exclude-table") that says
*not* to dump a specific table.  So:

        pg_dump -t table1 -t table2 db

will dump table1 and table2 only, whereas:

        pg_dump -T table1 -T table2 db

will dump all the tables *except* table1 and table2.

Tested briefly on my system; doesn't seem to break anything.

Regards,

David.

--
David F. Skoll <dfs@roaringpenguin.com>    Roaring Penguin Software Inc.
+1 (613) 231-6599 ext. 100                 http://www.roaringpenguin.com/
For CanIt technical support, please mail: support@roaringpenguin.com

Attachment

Re: Patch for pg_dump: Multiple -t options and new -T option

From
Bruce Momjian
Date:
Does anyone have opinions on including this in 7.5?  I see it first
appeared on July 6, six days after feature freeze.

---------------------------------------------------------------------------

David F. Skoll wrote:
> Attached is a patch against pg_dump version 7.4.3 that permits
> multiple "-t" switches so that you can select more than one table (but
> less than all) to dump.
>
> It also adds a "-T" switch (long name "--exclude-table") that says
> *not* to dump a specific table.  So:
>
>         pg_dump -t table1 -t table2 db
>
> will dump table1 and table2 only, whereas:
>
>         pg_dump -T table1 -T table2 db
>
> will dump all the tables *except* table1 and table2.
>
> Tested briefly on my system; doesn't seem to break anything.
>
> Regards,
>
> David.
>
> --
> David F. Skoll <dfs@roaringpenguin.com>    Roaring Penguin Software Inc.
> +1 (613) 231-6599 ext. 100                 http://www.roaringpenguin.com/
> For CanIt technical support, please mail: support@roaringpenguin.com

Content-Description:

[ Attachment, skipping... ]

>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
>                http://archives.postgresql.org

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: Patch for pg_dump: Multiple -t options and new -T option

From
Bruce Momjian
Date:
I see one vote in favor of its inclusion on the grounds it is a bug not
to support multiple -t parameters.  However, is someone objects I will
have to hold it for 7.6.  It needs SGML doc additions which I will do
myself.

Your patch has been added to the PostgreSQL unapplied patches list at:

    http://momjian.postgresql.org/cgi-bin/pgpatches

It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.

---------------------------------------------------------------------------


David F. Skoll wrote:
> Attached is a patch against pg_dump version 7.4.3 that permits
> multiple "-t" switches so that you can select more than one table (but
> less than all) to dump.
>
> It also adds a "-T" switch (long name "--exclude-table") that says
> *not* to dump a specific table.  So:
>
>         pg_dump -t table1 -t table2 db
>
> will dump table1 and table2 only, whereas:
>
>         pg_dump -T table1 -T table2 db
>
> will dump all the tables *except* table1 and table2.
>
> Tested briefly on my system; doesn't seem to break anything.
>
> Regards,
>
> David.
>
> --
> David F. Skoll <dfs@roaringpenguin.com>    Roaring Penguin Software Inc.
> +1 (613) 231-6599 ext. 100                 http://www.roaringpenguin.com/
> For CanIt technical support, please mail: support@roaringpenguin.com

Content-Description:

[ Attachment, skipping... ]

>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
>                http://archives.postgresql.org

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: Patch for pg_dump: Multiple -t options and new -T option

From
Christopher Kings-Lynne
Date:
> I see one vote in favor of its inclusion on the grounds it is a bug not
> to support multiple -t parameters.  However, is someone objects I will
> have to hold it for 7.6.  It needs SGML doc additions which I will do
> myself.

Weeeeell, I guess I'm against it based on the rules of feature freeze,
even though it would be really useful for me :(

I don't see how it's a "bug" to not support multiple parameters thought
- that's really scraping the bottom of the barrel...

Chris


Re: Patch for pg_dump: Multiple -t options and new -T option

From
Tom Lane
Date:
Christopher Kings-Lynne <chriskl@familyhealth.com.au> writes:
> Weeeeell, I guess I'm against it based on the rules of feature freeze,
> even though it would be really useful for me :(

It would have been a lot easier to approve it if it'd arrived on June 30
rather than July 6 :-(.  However, I do believe that David originally
submitted a slightly-too-late version of this in the previous release
cycle, so maybe we could cut him a little slack and pretend this is a
mistakenly-forgotten patch that we held over from 7.4.

Note I haven't actually *read* the patch and so take no position on
whether it does what it claims to.  But if someone else will read/test
it and give it a favorable report, then I'm inclined to approve it.
I'm quite sure we'd agreed in principle to allow multiple -t values.
(A negative -T switch is another matter --- that part maybe needs
more discussion.)

            regards, tom lane

Re: Patch for pg_dump: Multiple -t options and new -T option

From
Christopher Kings-Lynne
Date:
>>Weeeeell, I guess I'm against it based on the rules of feature freeze,
>>even though it would be really useful for me :(
>
> It would have been a lot easier to approve it if it'd arrived on June 30
> rather than July 6 :-(.  However, I do believe that David originally
> submitted a slightly-too-late version of this in the previous release
> cycle, so maybe we could cut him a little slack and pretend this is a
> mistakenly-forgotten patch that we held over from 7.4.

Yes, the reason it would be nice for me is that currently if you want to
dump two specific, related tables from your db, there's no way to do it
with pg_dump within the one transactions (ie. maintaining integrity).  I
guess I'm in favour of -t -t but not -T depending on the complexity of
it.  I'll review the patch if you like.

Chris


Re: Patch for pg_dump: Multiple -t options and new -T option

From
Christopher Kings-Lynne
Date:
> Yes, the reason it would be nice for me is that currently if you want to
> dump two specific, related tables from your db, there's no way to do it
> with pg_dump within the one transactions (ie. maintaining integrity).  I
> guess I'm in favour of -t -t but not -T depending on the complexity of
> it.  I'll review the patch if you like.

One problem with this patch is that there's no way to dump multiple
tables in different schemas.  Does this matter?  It's a bit
non-orthogonal...

Chris


Re: Patch for pg_dump: Multiple -t options and new -T option

From
"Andrew Dunstan"
Date:
Tom Lane said:
> Christopher Kings-Lynne <chriskl@familyhealth.com.au> writes:
>> Weeeeell, I guess I'm against it based on the rules of feature freeze,
>>  even though it would be really useful for me :(
>
> It would have been a lot easier to approve it if it'd arrived on June
> 30 rather than July 6 :-(.  However, I do believe that David originally
> submitted a slightly-too-late version of this in the previous release
> cycle, so maybe we could cut him a little slack and pretend this is a
> mistakenly-forgotten patch that we held over from 7.4.
>
> Note I haven't actually *read* the patch and so take no position on
> whether it does what it claims to.  But if someone else will read/test
> it and give it a favorable report, then I'm inclined to approve it. I'm
> quite sure we'd agreed in principle to allow multiple -t values. (A
> negative -T switch is another matter --- that part maybe needs
> more discussion.)
>


I entirely agree. Feature freeze has been said to be slightly porous, and
this is a change with relatively low impact/risk and significant benefit.

Let's not be overly rulebound.

cheers

andrew



Re: Patch for pg_dump: Multiple -t options and new -T option

From
Tom Lane
Date:
Christopher Kings-Lynne <chriskl@familyhealth.com.au> writes:
> One problem with this patch is that there's no way to dump multiple
> tables in different schemas.  Does this matter?  It's a bit
> non-orthogonal...

Yeah.  With the combination of -n and -t you can pull a specific table,
but as soon as you allow either switch to be multiple you've got an
inexact tool.

I had thought of allowing -t to be schema.table but I'm worried about
backwards-compatibility issues.  In particular, since we don't support
SQL-style quoting in -t arguments, how could one then select a table
name that actually contains a dot?  Or should we just write off that
case as "stupidity is its own reward"?  It would also be good to not
foreclose the possibility of wild-card matching patterns in these
switches in future.

(BTW, does the patch handle multiple -n switches?)

            regards, tom lane

Re: Patch for pg_dump: Multiple -t options and new -T

From
"David F. Skoll"
Date:
On Tue, 20 Jul 2004, Tom Lane wrote:

> (BTW, does the patch handle multiple -n switches?)

No, it doesn't.  I can look into that if you like.  The patch was
entirely to satisfy a need some of our customers have.  The -T switch
does fill a real need for our customers; our product has a couple of tables
that aren't critical if they aren't backed up, but as the product evolves,
we occasionally add more tables.  So it's easier to use a -T switch to
say what *not* to back up, than multiple -t switches to say what to back up.

Regards,

David.

Re: Patch for pg_dump: Multiple -t options and new -T option

From
Bruce Momjian
Date:
Tom Lane wrote:
> Christopher Kings-Lynne <chriskl@familyhealth.com.au> writes:
> > Weeeeell, I guess I'm against it based on the rules of feature freeze,
> > even though it would be really useful for me :(
>
> It would have been a lot easier to approve it if it'd arrived on June 30
> rather than July 6 :-(.  However, I do believe that David originally
> submitted a slightly-too-late version of this in the previous release
> cycle, so maybe we could cut him a little slack and pretend this is a
> mistakenly-forgotten patch that we held over from 7.4.

Yes, I do see one from 7.4 but it was submitted by someone else:

---------------------------------------------------------------------------

Message 179/231 Andreas Joseph Krogh
                       Oct 1, 2003 04:00:08 pm +0200
Organization: OfficeNet AS
To: pgsql-hackers@postgresql.org
Subject: [HACKERS] Patch for allowing multiple -t <table-name> options
for pg_dump
Date: Wed, 1 Oct 2003 16:00:08 +0200

This si my first look at the pg-code, so it may not comply with the
coding-standards. I haven't coded in C for a while either, so if someone
finds a better way to implement this, go ahead, but this patch works for
me
with 7.4beta3.

http://home.officenet.no/~andreak/pg_dump.c.diff

comments are welcome.

If it's ok, I'll remove my debuging statements and provide a cleaner
patch.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: Patch for pg_dump: Multiple -t options and new -T option

From
Bruce Momjian
Date:
David F. Skoll wrote:
> On Tue, 20 Jul 2004, Tom Lane wrote:
>
> > (BTW, does the patch handle multiple -n switches?)
>
> No, it doesn't.  I can look into that if you like.  The patch was
> entirely to satisfy a need some of our customers have.  The -T switch
> does fill a real need for our customers; our product has a couple of tables
> that aren't critical if they aren't backed up, but as the product evolves,
> we occasionally add more tables.  So it's easier to use a -T switch to
> say what *not* to back up, than multiple -t switches to say what to back up.

Ah, I see in TODO:

    * Allow pg_dump to use multiple -t and -n switches

so the problem with lack of multiple -n parameters was already known.

Should we allow -n to affect subsequent -t parameters, so:

    -n schema1 -t tab1 -n schema2 -t tab2

does schema1.tab1 and schema2.tab2?

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: Patch for pg_dump: Multiple -t options and new -T option

From
Christopher Kings-Lynne
Date:
> No, it doesn't.  I can look into that if you like.  The patch was
> entirely to satisfy a need some of our customers have.  The -T switch
> does fill a real need for our customers; our product has a couple of tables
> that aren't critical if they aren't backed up, but as the product evolves,
> we occasionally add more tables.  So it's easier to use a -T switch to
> say what *not* to back up, than multiple -t switches to say what to back up.

Well, since you wrote the patch, you'd be better off munging it.  Read
Tom's comments and see what you can come up with.  There's been no
decision made yet though on what changes to make however.

Chris


Re: Patch for pg_dump: Multiple -t options and new -T

From
Gavin Sherry
Date:
On Wed, 21 Jul 2004, Christopher Kings-Lynne wrote:

> > No, it doesn't.  I can look into that if you like.  The patch was
> > entirely to satisfy a need some of our customers have.  The -T switch
> > does fill a real need for our customers; our product has a couple of tables
> > that aren't critical if they aren't backed up, but as the product evolves,
> > we occasionally add more tables.  So it's easier to use a -T switch to
> > say what *not* to back up, than multiple -t switches to say what to back up.
>
> Well, since you wrote the patch, you'd be better off munging it.  Read
> Tom's comments and see what you can come up with.  There's been no
> decision made yet though on what changes to make however.

I'd also move the should_dump.c file into an existing file and make sure
the patch is against CVS HEAD, not 7.4.3.

Also, there's a copyright statement at the top, retaining copyright with
the author. Does anyone have an issue with that?

Gavin

Re: Patch for pg_dump: Multiple -t options and new -T option

From
Bruce Momjian
Date:
Gavin Sherry wrote:
> On Wed, 21 Jul 2004, Christopher Kings-Lynne wrote:
>
> > > No, it doesn't.  I can look into that if you like.  The patch was
> > > entirely to satisfy a need some of our customers have.  The -T switch
> > > does fill a real need for our customers; our product has a couple of tables
> > > that aren't critical if they aren't backed up, but as the product evolves,
> > > we occasionally add more tables.  So it's easier to use a -T switch to
> > > say what *not* to back up, than multiple -t switches to say what to back up.
> >
> > Well, since you wrote the patch, you'd be better off munging it.  Read
> > Tom's comments and see what you can come up with.  There's been no
> > decision made yet though on what changes to make however.
>
> I'd also move the should_dump.c file into an existing file and make sure
> the patch is against CVS HEAD, not 7.4.3.

Agreed.

> Also, there's a copyright statement at the top, retaining copyright with
> the author. Does anyone have an issue with that?

Yes, no need for it.  We discourage that.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: Patch for pg_dump: Multiple -t options and new

From
Philip Warner
Date:
At 12:59 PM 21/07/2004, Bruce Momjian wrote:
>Yes, no need for it.  We discourage that.

Might be polite, not to mention legally required, to check with the author
of the patch first.


----------------------------------------------------------------
Philip Warner                    |     __---_____
Albatross Consulting Pty. Ltd.   |----/       -  \
(A.B.N. 75 008 659 498)          |          /(@)   ______---_
Tel: (+61) 0500 83 82 81         |                 _________  \
Fax: (+61) 03 5330 3172          |                 ___________ |
Http://www.rhyme.com.au          |                /           \|
                                  |    --________--
PGP key available upon request,  |  /
and from pgp.mit.edu:11371       |/


Re: Patch for pg_dump: Multiple -t options and new -T

From
"David F. Skoll"
Date:
Hi,

This is a response to several messages:

1) Copyright notice:  I have no problem having this removed, although it
would be nice to credit me somewhere in a comment.

2) I put most of the code in a separate file so that if the patch is
rejected, it's easy for me to maintain a forked copy.  If the patch is
accepted, obviously it can be integrated into an existing file.

3) Multiple -n options:  We need to figure out how this would work, and make
it non-surprising.  Some ideas:

    pg_dump -t t1 -n s2 -t t2 -t t3 -n s4 -t t5

What does that do?  My guess is:

    - Dump table t1 in any schema
    - Dump tables t2 and t3 in schema s2
    - Dump table t5 in schema s4

So now the position of the options matters!  That might surprise people,
because:

    pg_dump -s s1 -t t2

is no longer the same as:

    pg_dump -t t2 -n s1

What about:

    pg_dump -t t1 -n s2

Should that dump table t1 in any schema, and any table in schema s2?

If we can nail down the semantics, I can implement the patch.  The
code is very simple.

4) The -T option (and, one assumes, a corresponding -N option)

If the -T option is considered unknown/risky and would prevent the patch
from going in, we can drop it for now.

Regards,

David.

Re: Patch for pg_dump: Multiple -t options and new -T option

From
Bruce Momjian
Date:
David F. Skoll wrote:
> Hi,
>
> This is a response to several messages:
>
> 1) Copyright notice:  I have no problem having this removed, although it
> would be nice to credit me somewhere in a comment.

We credit in the commit message, and in the release notes so it will
always be seen.

> 2) I put most of the code in a separate file so that if the patch is
> rejected, it's easy for me to maintain a forked copy.  If the patch is
> accepted, obviously it can be integrated into an existing file.

OK.  Makes sense.

> 3) Multiple -n options:  We need to figure out how this would work, and make
> it non-surprising.  Some ideas:
>
>     pg_dump -t t1 -n s2 -t t2 -t t3 -n s4 -t t5
>
> What does that do?  My guess is:
>
>     - Dump table t1 in any schema
>     - Dump tables t2 and t3 in schema s2
>     - Dump table t5 in schema s4
>
> So now the position of the options matters!  That might surprise people,
> because:
>
>     pg_dump -s s1 -t t2
>
> is no longer the same as:
>
>     pg_dump -t t2 -n s1
>
> What about:
>
>     pg_dump -t t1 -n s2
>
> Should that dump table t1 in any schema, and any table in schema s2?
>
> If we can nail down the semantics, I can implement the patch.  The
> code is very simple.

Even though I suggested it, I am afraid this is just too confusing an API.

And I can't think of another one.  :-(


--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: Patch for pg_dump: Multiple -t options and new -T

From
"David F. Skoll"
Date:
On Wed, 21 Jul 2004, Bruce Momjian wrote:

> Even though I suggested it, I am afraid this is just too confusing an API.

How about this:

pg_dump -t t1                          -- Dump table t1 in any schema
pg_dump -n s1                          -- Dump all of schema s1
pg_dump -t t1 -n s1                    -- Dump t1 in s1
pg_dump -t t1 -t t2 -n s1              -- Dump s1.t1 and s1.t2
pg_dump -t t1 -t t2 -n s1 -n s2        -- Dump s1.t1, s1.t2, s2.t1 and s2.t2

Basically, no "-t" option means dump all tables.  No "-n" option
means dump all schemas.  If any "-t" or "-n" options are present,
then we only dump the specified tables/schemas.  We also probably
should not warn about missing tables, because it's likely that the
full cartesian product of schemas and tables won't exist.

And we nuke the -T and -N options.

Regards,

David.

Re: Patch for pg_dump: Multiple -t options and new -T option

From
Tom Lane
Date:
"David F. Skoll" <dfs@roaringpenguin.com> writes:
> How about this:

> pg_dump -t t1                          -- Dump table t1 in any schema
> pg_dump -n s1                          -- Dump all of schema s1
> pg_dump -t t1 -n s1                    -- Dump t1 in s1
> pg_dump -t t1 -t t2 -n s1              -- Dump s1.t1 and s1.t2
> pg_dump -t t1 -t t2 -n s1 -n s2        -- Dump s1.t1, s1.t2, s2.t1 and s2.t2

Why not

pg_dump -t t1                          -- Dump table t1 in any schema
pg_dump -n s1                          -- Dump all of schema s1
pg_dump -t s1.t1                       -- Dump t1 in s1
pg_dump -t s1.t1 -t s2.t2              -- Dump s1.t1 and s2.t2
pg_dump -t t1 -t t2 -n s1 -n s2        -- Dump s1.t1, s1.t2, s2.t1 and s2.t2

That is, the rules are:
    - if any -t switches appear, only tables matching (any one of)
      those switches are dumped
    - if any -n switches appear, only objects in (any one of)
      those schemas are dumped
    - a -t switch can be name only or schema.name

The cross-product semantics you're proposing can't implement my fourth
example.

I really dislike the idea of switch ordering making a difference...

> We also probably should not warn about missing tables, because it's
> likely that the full cartesian product of schemas and tables won't
> exist.

Agreed.  If any -t or -n switches appear, then warn only if *no* objects
get selected.

            regards, tom lane

Re: Patch for pg_dump: Multiple -t options and new -T

From
"David F. Skoll"
Date:
On Wed, 21 Jul 2004, Tom Lane wrote:

> pg_dump -t s1.t1 -t s2.t2              -- Dump s1.t1 and s2.t2

That's a good idea, but then it's questionable whether we need the -n
switch at all.  It might be simpler to extend the -t switch to
accept:

    pg-dump -t 's1.*'

rather than using a -n switch.  Of course, that breaks
backward-compatibility.

Regards,

David.

Re: Patch for pg_dump: Multiple -t options and new -T option

From
Tom Lane
Date:
"David F. Skoll" <dfs@roaringpenguin.com> writes:
> On Wed, 21 Jul 2004, Tom Lane wrote:
>> pg_dump -t s1.t1 -t s2.t2              -- Dump s1.t1 and s2.t2

> That's a good idea, but then it's questionable whether we need the -n
> switch at all.

Sure we do --- for backwards compatibility if nothing else.

> It might be simpler to extend the -t switch to accept:
>     pg-dump -t 's1.*'

That would not be the same thing --- that would mean to dump *only tables*
from s1, rather than objects of all types.  Anyway, I think it's a bit
late in this cycle to be proposing to implement wild-card matching.
Maybe for next time someone can do that, but for 7.5 I think we should
limit ourselves to cleaning up any design flaws of the already-submitted
patch.

            regards, tom lane

Re: Patch for pg_dump: Multiple -t options and new -T option

From
Bruce Momjian
Date:
This has been saved for the 7.6 release:

    http:/momjian.postgresql.org/cgi-bin/pgpatches2

---------------------------------------------------------------------------

Tom Lane wrote:
> "David F. Skoll" <dfs@roaringpenguin.com> writes:
> > On Wed, 21 Jul 2004, Tom Lane wrote:
> >> pg_dump -t s1.t1 -t s2.t2              -- Dump s1.t1 and s2.t2
>
> > That's a good idea, but then it's questionable whether we need the -n
> > switch at all.
>
> Sure we do --- for backwards compatibility if nothing else.
>
> > It might be simpler to extend the -t switch to accept:
> >     pg-dump -t 's1.*'
>
> That would not be the same thing --- that would mean to dump *only tables*
> from s1, rather than objects of all types.  Anyway, I think it's a bit
> late in this cycle to be proposing to implement wild-card matching.
> Maybe for next time someone can do that, but for 7.5 I think we should
> limit ourselves to cleaning up any design flaws of the already-submitted
> patch.
>
>             regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 7: don't forget to increase your free space map settings
>

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073