Thread: Re: [HACKERS] Patch for Prevent pg_dump/pg_restore from being affected by statement_timeout

On Wed, Apr 16, 2008 at 4:54 PM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:
> Joshua D. Drake escribió:
>
> > That is an interesting idea. Something like:
>  >
>  > pg_restore -E "SET STATEMENT_TIMEOUT=0; SET MAINTENANCE_WORK_MEM=1G" ?
>
>  We already have it -- it's called PGOPTIONS.
>

Ok but is not the purpose of the patch to turn off statement_timeout
by *default* in pg_restore/pg_dump?

Here is an updated patch for I posted above (with the command line
option --use-statement-timeout) for pg_dump and pg_restore.

(sorry If I hijacked your patch Josh :) )

Attachment
Alex Hunsaker wrote:
> On Wed, Apr 16, 2008 at 4:54 PM, Alvaro Herrera
> <alvherre@commandprompt.com> wrote:
> > Joshua D. Drake escribi?:
> >
> > > That is an interesting idea. Something like:
> >  >
> >  > pg_restore -E "SET STATEMENT_TIMEOUT=0; SET MAINTENANCE_WORK_MEM=1G" ?
> >
> >  We already have it -- it's called PGOPTIONS.
> >
>
> Ok but is not the purpose of the patch to turn off statement_timeout
> by *default* in pg_restore/pg_dump?
>
> Here is an updated patch for I posted above (with the command line
> option --use-statement-timeout) for pg_dump and pg_restore.

I would like to get do this without adding a new --use-statement-timeout
flag.  Is anyone going to want to honor statement_timeout during
pg_dump/pg_restore?  I thought we were just going to disable it.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

On Mon, Jun 23, 2008 at 06:51:28PM -0400, Bruce Momjian wrote:
> Alex Hunsaker wrote:
> > On Wed, Apr 16, 2008 at 4:54 PM, Alvaro Herrera
> > <alvherre@commandprompt.com> wrote:
> > > Joshua D. Drake escribi?:
> > >
> > > > That is an interesting idea. Something like:
> > >  >
> > >  > pg_restore -E "SET STATEMENT_TIMEOUT=0; SET MAINTENANCE_WORK_MEM=1G" ?
> > >
> > >  We already have it -- it's called PGOPTIONS.
> > >
> >
> > Ok but is not the purpose of the patch to turn off statement_timeout
> > by *default* in pg_restore/pg_dump?
> >
> > Here is an updated patch for I posted above (with the command line
> > option --use-statement-timeout) for pg_dump and pg_restore.
>
> I would like to get do this without adding a new --use-statement-timeout
> flag.  Is anyone going to want to honor statement_timeout during
> pg_dump/pg_restore?  I thought we were just going to disable it.

I have a patch in the queue to use set statement timeout while pg_dump is
taking locks to avoid pg_dump hanging for other long running transactions
that may have done ddl. Do I need to repost for discussion now?

-dg

--
David Gould       daveg@sonic.net      510 536 1443    510 282 0869
If simplicity worked, the world would be overrun with insects.

daveg wrote:
> On Mon, Jun 23, 2008 at 06:51:28PM -0400, Bruce Momjian wrote:
> > Alex Hunsaker wrote:
> > > On Wed, Apr 16, 2008 at 4:54 PM, Alvaro Herrera
> > > <alvherre@commandprompt.com> wrote:
> > > > Joshua D. Drake escribi?:
> > > >
> > > > > That is an interesting idea. Something like:
> > > >  >
> > > >  > pg_restore -E "SET STATEMENT_TIMEOUT=0; SET MAINTENANCE_WORK_MEM=1G" ?
> > > >
> > > >  We already have it -- it's called PGOPTIONS.
> > > >
> > >
> > > Ok but is not the purpose of the patch to turn off statement_timeout
> > > by *default* in pg_restore/pg_dump?
> > >
> > > Here is an updated patch for I posted above (with the command line
> > > option --use-statement-timeout) for pg_dump and pg_restore.
> >
> > I would like to get do this without adding a new --use-statement-timeout
> > flag.  Is anyone going to want to honor statement_timeout during
> > pg_dump/pg_restore?  I thought we were just going to disable it.
>
> I have a patch in the queue to use set statement timeout while pg_dump is
> taking locks to avoid pg_dump hanging for other long running transactions
> that may have done ddl. Do I need to repost for discussion now?

I see it now, but I forgot how it would interact with this patch.  We
would have to prevent --use-statement-timeout when lock timeout was
being used, but my point is that I see no value in having
--use-statement-timeout.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

On Mon, Jun 23, 2008 at 4:51 PM, Bruce Momjian <bruce@momjian.us> wrote:
> I would like to get do this without adding a new --use-statement-timeout
> flag.  Is anyone going to want to honor statement_timeout during
> pg_dump/pg_restore?  I thought we were just going to disable it.

I believe so.  This was when not everyone was convinced.  Im fairly
certain Josh original patch is in the commit fest. So feel free to
drop this one.

Alex Hunsaker wrote:
> On Mon, Jun 23, 2008 at 4:51 PM, Bruce Momjian <bruce@momjian.us> wrote:
> > I would like to get do this without adding a new --use-statement-timeout
> > flag.  Is anyone going to want to honor statement_timeout during
> > pg_dump/pg_restore?  I thought we were just going to disable it.
>
> I believe so.  This was when not everyone was convinced.  Im fairly
> certain Josh original patch is in the commit fest. So feel free to
> drop this one.

I certainly don't see any version of Drake's patch in the July
commitfest:

    http://wiki.postgresql.org/wiki/CommitFest

I am thinking I will just remove the option and commit it.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Alex Hunsaker wrote:
> On Mon, Jun 23, 2008 at 4:51 PM, Bruce Momjian <bruce@momjian.us> wrote:
>> I would like to get do this without adding a new --use-statement-timeout
>> flag.  Is anyone going to want to honor statement_timeout during
>> pg_dump/pg_restore?  I thought we were just going to disable it.
>
> I believe so.  This was when not everyone was convinced.  Im fairly
> certain Josh original patch is in the commit fest. So feel free to
> drop this one.
>

My patch has been committed.

Joshua D. Drake

Joshua D. Drake wrote:
> Alex Hunsaker wrote:
> > On Mon, Jun 23, 2008 at 4:51 PM, Bruce Momjian <bruce@momjian.us> wrote:
> >> I would like to get do this without adding a new --use-statement-timeout
> >> flag.  Is anyone going to want to honor statement_timeout during
> >> pg_dump/pg_restore?  I thought we were just going to disable it.
> >
> > I believe so.  This was when not everyone was convinced.  Im fairly
> > certain Josh original patch is in the commit fest. So feel free to
> > drop this one.
> >
>
> My patch has been committed.

Ah, I see, but with no switch.  Thanks.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

On Mon, Jun 23, 2008 at 07:30:53PM -0400, Bruce Momjian wrote:
> daveg wrote:
> > On Mon, Jun 23, 2008 at 06:51:28PM -0400, Bruce Momjian wrote:
> > > Alex Hunsaker wrote:
> > > > On Wed, Apr 16, 2008 at 4:54 PM, Alvaro Herrera
> > > > <alvherre@commandprompt.com> wrote:
> > > > > Joshua D. Drake escribi?:
> > > > >
> > > > > > That is an interesting idea. Something like:
> > > > >  >
> > > > >  > pg_restore -E "SET STATEMENT_TIMEOUT=0; SET MAINTENANCE_WORK_MEM=1G" ?
> > > > >
> > > > >  We already have it -- it's called PGOPTIONS.
> > > > >
> > > >
> > > > Ok but is not the purpose of the patch to turn off statement_timeout
> > > > by *default* in pg_restore/pg_dump?
> > > >
> > > > Here is an updated patch for I posted above (with the command line
> > > > option --use-statement-timeout) for pg_dump and pg_restore.
> > >
> > > I would like to get do this without adding a new --use-statement-timeout
> > > flag.  Is anyone going to want to honor statement_timeout during
> > > pg_dump/pg_restore?  I thought we were just going to disable it.
> >
> > I have a patch in the queue to use set statement timeout while pg_dump is
> > taking locks to avoid pg_dump hanging for other long running transactions
> > that may have done ddl. Do I need to repost for discussion now?
>
> I see it now, but I forgot how it would interact with this patch.  We
> would have to prevent --use-statement-timeout when lock timeout was
> being used, but my point is that I see no value in having
> --use-statement-timeout.

lock-timeout sets statement_timeout to a small value while locks are being
taken on all the tables. Then it resets it to default. So it could reset it
to whatever the new default is.

Do I need to adjust my patch or something?

-dg

--
David Gould       daveg@sonic.net      510 536 1443    510 282 0869
If simplicity worked, the world would be overrun with insects.

daveg <daveg@sonic.net> writes:
> lock-timeout sets statement_timeout to a small value while locks are being
> taken on all the tables. Then it resets it to default. So it could reset it
> to whatever the new default is.

"reset to default" is *surely* not the right behavior; resetting to the
setting that had been in effect might be a bit sane.  But the whole
design sounds pretty broken to start with.  The timer management code
already understands the concept of scheduling the interrupt for the
nearest of a couple of potentially active timeouts.  ISTM any patch
intended to add a feature like this ought to extend that logic rather
than hack around changing the values of global variables.

            regards, tom lane

On Tue, Jun 24, 2008 at 05:34:50PM -0400, Tom Lane wrote:
> daveg <daveg@sonic.net> writes:
> > lock-timeout sets statement_timeout to a small value while locks are being
> > taken on all the tables. Then it resets it to default. So it could reset it
> > to whatever the new default is.
>
> "reset to default" is *surely* not the right behavior; resetting to the
> setting that had been in effect might be a bit sane.  But the whole
> design sounds pretty broken to start with.  The timer management code
> already understands the concept of scheduling the interrupt for the
> nearest of a couple of potentially active timeouts.  ISTM any patch
> intended to add a feature like this ought to extend that logic rather
> than hack around changing the values of global variables.

Are we talking about the same patch? Because I don't know what you are
refering to with "timer management code" and "scheduling the interrupt" in
the context of pg_dump.

-dg


--
David Gould       daveg@sonic.net      510 536 1443    510 282 0869
If simplicity worked, the world would be overrun with insects.

daveg <daveg@sonic.net> writes:
> Are we talking about the same patch?

Maybe not --- I thought you were talking about a backend-side behavioral
change.

> Because I don't know what you are
> refering to with "timer management code" and "scheduling the interrupt" in
> the context of pg_dump.

I'm not sure that I see a good argument for making pg_dump deliberately
fail, if that's what you're proposing.  Maybe I'm just too old-school,
but there simply is not any other higher priority for a database than
safeguarding your data.

            regards, tom lane

On Tue, Jun 24, 2008 at 10:41:07PM -0400, Tom Lane wrote:
> daveg <daveg@sonic.net> writes:
> > Are we talking about the same patch?
>
> Maybe not --- I thought you were talking about a backend-side behavioral
> change.
>
> > Because I don't know what you are
> > refering to with "timer management code" and "scheduling the interrupt" in
> > the context of pg_dump.
>
> I'm not sure that I see a good argument for making pg_dump deliberately
> fail, if that's what you're proposing.  Maybe I'm just too old-school,
> but there simply is not any other higher priority for a database than
> safeguarding your data.

We agree about that. The intent of my patch it to give the user a chance to
take corrective action in a case where pg_dump cannot be relied on to succeed.

The problem is that pg_dump can get blocked behind locks and then fail hours
later when the locks are released because some table it had not locked yet
changed. In the worst case:

  - no backup,
  - no notice until too late to restart the backup,
  - lost production due to other processes waiting on locks pg_dump holds.

So the intent of the patch is to optionally allow pg_dump to fail quickly
if it cannot get all the access share locks it needs. This gives the user
an opportunity to notice and retry in a timely way.

Please see http://archives.postgresql.org/pgsql-patches/2008-05/msg00351.php
for the orginal patch and problem description.

A sample failure instance from a very heavy batch environment with a lot of
materialized views being maintained concurrently with pg_dump. DB size
is about 300 GB:

---
20080410 14:53:49 dumpdb c04_20080410_public: dumping c04 to /backups/c04_20080410_public
pg_dump: SQL command failed
pg_dump: Error message from server: ERROR:  cache lookup failed for index 22619852
pg_dump: The command was: SELECT t.tableoid, t.oid, t.relname as indexname, pg_catalog.pg_get_indexdef(i.indexrelid) as
indexdef,t.relnatts as indnkeys, i.indkey, i.indisclustered, c.contype, c.conname, c.tableoid as contableoid, c.oid as
conoid,(SELECT spcname FROM pg_catalog.pg_tablespace s WHERE s.oid = t.reltablespace) as tablespace,
array_to_string(t.reloptions,', ') as options FROM pg_catalog.pg_index i JOIN pg_catalog.pg_class t ON (t.oid =
i.indexrelid)LEFT JOIN pg_catalog.pg_depend d ON (d.classid = t.tableoid AND d.objid = t.oid AND d.deptype = 'i') LEFT
JOINpg_catalog.pg_constraint c ON (d.refclassid = c.tableoid AND d.refobjid = c.oid) WHERE i.indrelid =
'22615005'::pg_catalog.oidORDER BY indexname 
20080411 06:12:17 dumpdb FATAL: c04_20080410_public: dump failed
---

Note that the dump started at 14:53, but did not fail until 06:12 the next day,
and it never got to the actual copy out phase. Meanwhile other DDL using
processes were hung on the access share locks aready held by pg_dump.

Regards

-dg

--
David Gould       daveg@sonic.net      510 536 1443    510 282 0869
If simplicity worked, the world would be overrun with insects.