Re: Feature: give pg_dump a WHERE clause expression - Mailing list pgsql-patches

From Davy Durham
Subject Re: Feature: give pg_dump a WHERE clause expression
Date
Msg-id 1212347327.17810.52.camel@ubuntu
Whole thread Raw
In response to Re: Feature: give pg_dump a WHERE clause expression  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Feature: give pg_dump a WHERE clause expression  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Feature: give pg_dump a WHERE clause expression  (David Fetter <david@fetter.org>)
List pgsql-patches
On Sun, 2008-06-01 at 10:43 -0400, Tom Lane wrote:
> Davy Durham <pubaddr5@davyandbeth.com> writes:
> >   I have added a new parameter, -w/--where=EXPR
> >   This lets you specify an expression that will be used in a WHERE
> > clause when the data is dumped.
>
> This seems pretty poorly thought out.  It can hardly work in a dump
> of more than one table, which means that there's not any real reason
> to use pg_dump at all.  Just do a COPY (SELECT ...) TO somefile.
>
>             regards, tom lane
>

Well, my primary reason for writing the patch was to have a standard SQL
file using INSERT statements in order to load the some of a table's data
into a database other than postgresql which does not support the COPY
statement.

I'll admit that the single where clause would often not be applicable
across all tables in a database, but when pg_dump is told specific
tables to dump (a nice existing feature of pg_dump for doing something
specialized other than a simple entire database backup), then it can be
useful.

My particular case is that I have several tables that are simple event
logs.  Each table has a timestamp column.  I'm periodically bringing
these tables into sync on another database and I only want to pull rows
newer than since the last sync..  So, a where-clause of..
        'ts > $last_sync'
..works for me.  However, I'm sure there are other uses too..


== Thinking Further ==

Beyond serving my own needs, I'm trying to generically extend the
general idea that pg_dump already supports:
        1) pg_dump can be made to dump an entire database
        2) pg_dump can be made to dump only requested tables
        3) [my addition] pg_dump can be made to dump only requested rows
        from requested tables

However, it's no SO generic in that the where clause applies to all
tables.

So, if this patch is not acceptable as-is, what would you feel about
this:
        I could enhance the -t/--table=NAME option to accept more than a
        simple NAME.  Rather it could accept something in the form:

                --table=<table_name>:<where-clause expression>

                For example, pg_dump --table='foo:col1 > 10 AND f2 < 14'

        Currently, the user can specify -t/--table multiple times to
        have more than one table dumped.  Or the user can use a pattern
        to a single -t option to request multiple tabes.

        This way, a user could specify a WHERE clause per table he has
        requested to dump.  Granted, the WHERE clause may then apply to
        multiple tables if a pattern was used, but that may very well be
        desirable to the user.

Unless you disagree, this is a more generic solution (than what my patch
contains) to allowing the user of pg_dump to further refine what they
wish to dump.


Thoughts?

Thanks for the feedback


pgsql-patches by date:

Previous
From: Joe Conway
Date:
Subject: Re: [BUGS] BUG #4203: perform dblink() in begin/exception returns wrong SQLSTATE code
Next
From: Tom Lane
Date:
Subject: Re: [BUGS] BUG #4203: perform dblink() in begin/exception returns wrong SQLSTATE code