Re: pg_dump --snapshot - Mailing list pgsql-hackers

From Andres Freund
Subject Re: pg_dump --snapshot
Date
Msg-id 20130507005315.GA9222@awork2.anarazel.de
In response to Re: pg_dump --snapshot  (Stephen Frost <sfrost@snowman.net>)
Responses Re: pg_dump --snapshot  (Andres Freund <andres@2ndquadrant.com>)
Re: pg_dump --snapshot  (Stephen Frost <sfrost@snowman.net>)
List pgsql-hackers
On 2013-05-06 20:18:26 -0400, Stephen Frost wrote:
> Simon,
> 
> * Simon Riggs (simon@2ndQuadrant.com) wrote:
> > If anybody really wanted to fix pg_dump, they could do. If that was so
> > important, why block this patch, but allow parallel pg_dump to be
> > committed without it?

> Because parallel pg_dump didn't make the problem any *worse*..?  This
> does.  The problem existed before parallel pg_dump.

Yes, it did.

> > There is no risk that is larger than the one already exposed by the
> > existing user API.

> The API exposes it, yes, but *pg_dump* isn't any worse than it was
> before.

No, but it's still broken. pg_dump without the parameter being passed
isn't any worse off after the patch has been applied. With the parameter
the window gets a bit bigger, sure...

> > If you do see a risk in the existing API, please deprecate it and
> > remove it from the docs, or mark it not-for-use-by-users. I hope you
> > don't, but if you do, do it now - I'll be telling lots of people about
> > all the useful things you can do with it over the next few years,
> > hopefully in pg_dump as well.

> pg_dump uses it already and uses it as best it can.  Users could use it
> also, provided they understand the constraints around it.  However,
> there really isn't a way for users to use this new option correctly-
> they would need to intuit what pg_dump will want to lock, lock it
> immediately after their transaction is created, and only *then* get the
> snapshot ID and pass it to pg_dump, hoping against hope that pg_dump
> will actually need the locks that they decided to acquire..

Given that we don't have all that many types of objects we can lock,
that task isn't all that complicated. But I'd guess a very common usage
is to start the snapshot and immediately fork pg_dump. In that case the
window between snapshot acquisition and reading the object list is
probably smaller than the one between reading the object list and
locking.
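
A minimal sketch of the "start the snapshot and immediately fork pg_dump"
usage, with the optional up-front locking Stephen describes. It assumes the
option is spelled `--snapshot=<id>` as in this thread's subject, that the
snapshot is exported with `pg_export_snapshot()`, and that `cursor` is any
DB-API style cursor already inside an open REPEATABLE READ transaction. The
helper names `export_snapshot` and `pg_dump_command` are hypothetical, as is
the choice of tables to lock.

```python
def export_snapshot(cursor, tables_to_lock=()):
    """Lock the given tables, then export the transaction's snapshot.

    Assumes `cursor` belongs to an open REPEATABLE READ transaction.
    The exported snapshot is only usable while that transaction stays open,
    so the caller must not commit or roll back until pg_dump has finished.
    """
    for table in tables_to_lock:
        # Table names are assumed to be trusted; a real script should
        # quote identifiers instead of interpolating them directly.
        cursor.execute(f"LOCK TABLE {table} IN ACCESS SHARE MODE")
    cursor.execute("SELECT pg_export_snapshot()")
    return cursor.fetchone()[0]


def pg_dump_command(dbname, snapshot_id, outfile):
    """Build the pg_dump argument list that reuses the exported snapshot."""
    return ["pg_dump", f"--snapshot={snapshot_id}", f"--file={outfile}", dbname]
```

The caller would pass the resulting list to something like `subprocess.run`
right after `export_snapshot` returns, which keeps the window between
acquiring the snapshot and pg_dump reading the object list as short as the
script allows.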

This all reads like a textbook case of "perfect is the enemy of good" to
me.

A rather useful feature has to fix a bug in pg_dump which a) has existed for
ages, b) has yet to be reported to the lists, and c) is rather complicated to
fix and quite possibly requires proper snapshots for internals?

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


