Re: Commits 8de72b and 5457a1 (COPY FREEZE) - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Commits 8de72b and 5457a1 (COPY FREEZE)
Date
Msg-id CA+TgmobQ7g5rYGs3DNFLGxyo2hnCzg9FGkrMcKwZWF8brLRo0A@mail.gmail.com
In response to Re: Commits 8de72b and 5457a1 (COPY FREEZE)  (Noah Misch <noah@leadboat.com>)
Responses Re: Commits 8de72b and 5457a1 (COPY FREEZE)  (Stephen Frost <sfrost@snowman.net>)
Re: Commits 8de72b and 5457a1 (COPY FREEZE)  (Noah Misch <noah@leadboat.com>)
Re: Commits 8de72b and 5457a1 (COPY FREEZE)  (Bruce Momjian <bruce@momjian.us>)
List pgsql-hackers
On Sun, Dec 9, 2012 at 3:06 PM, Noah Misch <noah@leadboat.com> wrote:
> I favor[1] unconditionally letting older snapshots see the new rows after the
> CREATE+COPY transaction commits.  To recap, making affected scans see an empty
> table is as wrong as making them see those rows.  Robert also listed[2] that
> as a credible option, and I don't recall anyone opining against it in previous
> discussions.  I did perceive an undercurrent preference, all other things
> being equal, for an optimization free from semantic side-effects.  I shared
> that preference, but investigations showed that we must compromise something.

You know, I hadn't been taking that option terribly seriously, but
maybe we ought to reconsider it.  It would certainly be simpler, and
as you point out, it's not really any worse from an MVCC point of view
than anything else we do.  Moreover, it would make this available to
clients like pg_dump without further hackery.

I think the current behavior, where we treat FREEZE as a hint, is just
awful.  Regardless of whether the behavior is automatic or manually
requested, the idea that you might get the optimization or not
depending on the timing of relcache flushes seems very much
undesirable.  I mean, if the optimization is actually important for
performance, then you want to get it when you ask for it.  If it
isn't, then why bother having it at all?  Let's say that COPY FREEZE
normally doubles performance on a data load that therefore takes 8
hours - somebody who suddenly loses that benefit because of a relcache
flush that they can't prevent or control and ends up with a 16-hour
data load is going to pop a gasket.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


