Re: pg_dump --snapshot - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: pg_dump --snapshot
Msg-id CA+U5nM+weu-perLOGD7pqkZGFwgKJM6X5SXJeYE8H7Vc4eVvZg@mail.gmail.com
In response to Re: pg_dump --snapshot  (Stephen Frost <sfrost@snowman.net>)
Responses Re: pg_dump --snapshot  (Stephen Frost <sfrost@snowman.net>)
List pgsql-hackers
On 7 May 2013 01:18, Stephen Frost <sfrost@snowman.net> wrote:

> * Simon Riggs (simon@2ndQuadrant.com) wrote:
>> If anybody really wanted to fix pg_dump, they could do. If that was so
>> important, why block this patch, but allow parallel pg_dump to be
>> committed without it?
>
> Because parallel pg_dump didn't make the problem any *worse*..?  This
> does.

Sorry, not accurate. Patch makes nothing *worse*.

The existing API *can* be misused in the way you say, and so also
could pg_dump if the patch is allowed.

However, there is no reason to suppose that such misuse would be
common; no reason why a timing gap would *necessarily* occur in the
way your previous example showed, or if it did why it would
necessarily present a problem for the user. Especially if we put
something in the docs.

> pg_dump uses it already and uses it as best it can.  Users could use it
> also, provided they understand the constraints around it.

Snapshots have no WARNING on them. There is no guarantee in any
transaction that the table you want will not be dropped before you try
to access it. *Any* program that dynamically assembles a list of
objects and then acts on them is at risk of having an out-of-date list
of objects as the database moves forward. This applies to any form of
snapshot, not just this patch, nor even just exported snapshots.
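
To make the stale-list point concrete, here is a toy in-memory model in
Python. It is only a sketch: the function name act_on_stale_list and the
table names are made up for illustration, and nothing in it talks to a real
database. It lists objects first, lets a concurrent drop happen, and then
tries to act on the list.

```python
def act_on_stale_list(catalog, dropped_meanwhile):
    """Toy model: `catalog` stands in for the set of tables in a database."""
    # Step 1: dynamically assemble the list of objects to work on.
    todo = sorted(catalog)
    # Step 2: time passes and another session drops some of those objects.
    live = set(catalog) - set(dropped_meanwhile)
    # Step 3: act on the list; entries that no longer exist would fail.
    done = [name for name in todo if name in live]
    failed = [name for name in todo if name not in live]
    return done, failed


done, failed = act_on_stale_list(
    {"orders", "customers", "audit_log"}, {"audit_log"}
)
print("processed:", done)
print("gone before we reached them:", failed)
```

The same gap exists whether the list was built under an ordinary snapshot,
an exported one, or none at all.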

> However,
> there really isn't a way for users to use this new option correctly-

Not accurate.

> they would need to intuit what pg_dump will want to lock, lock it
> immediately after their transaction is created, and only *then* get the
> snapshot ID and pass it to pg_dump, hoping against hope that pg_dump
> will actually need the locks that they decided to acquire..

The argument against this is essentially that we don't trust the user
to use it well, so we won't let them have it at all. Which makes no
sense since they already have this API and don't need our permission
to use it. All that blocking this patch does is to remove any chance
the user has of coordinating pg_dump with other actions; preventing
that causes more issues for the user, so doing nothing is not safe or
correct either. A balanced viewpoint needs to include the same
level of analysis on both sides, not just a deep look at the worst
case on one side and a claim that everything is rosy with the current
situation.
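
As a rough sketch of the kind of coordination under discussion, the Python
below exports a snapshot from one session and builds a pg_dump command line
that would reuse it. Assumptions: the option is spelled --snapshot as
proposed in this thread; `cur` is a DB-API cursor on a connection in
autocommit mode, so the explicit BEGIN is what opens the transaction; the
helper names export_snapshot and pg_dump_command are hypothetical.
pg_export_snapshot() is the existing server function.

```python
def export_snapshot(cur):
    """Open a REPEATABLE READ transaction and export its snapshot.

    The exporting transaction must stay open until every consumer
    (here, pg_dump) has imported the snapshot.
    """
    cur.execute("BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ")
    cur.execute("SELECT pg_export_snapshot()")
    return cur.fetchone()[0]


def pg_dump_command(dbname, snapshot_id, outfile):
    """Build the argument list for pg_dump using an exported snapshot.

    Assumes the --snapshot option proposed in this thread.
    """
    return [
        "pg_dump",
        "--snapshot=" + snapshot_id,
        "--file=" + outfile,
        dbname,
    ]
```

A caller would run export_snapshot(cur), do its own work under that
snapshot, pass the returned ID to pg_dump_command(), and run the result
with subprocess while the exporting transaction is still open. Note that
the snapshot is taken before pg_dump acquires its own locks, which is
exactly the window being argued about above.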

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services