Make relation_openrv atomic wrt DDL - Mailing list pgsql-hackers

From Noah Misch
Subject Make relation_openrv atomic wrt DDL
Date
Msg-id 20110612191843.GF21098@tornado.leadboat.com
Whole thread Raw
In response to Re: ALTER TABLE ... REPLACE WITH  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Make relation_openrv atomic wrt DDL
Re: Make relation_openrv atomic wrt DDL
List pgsql-hackers
On Thu, Jan 20, 2011 at 09:48:12PM -0500, Robert Haas wrote:
> On Thu, Jan 20, 2011 at 6:19 PM, Noah Misch <noah@leadboat.com> wrote:
> > On Thu, Jan 20, 2011 at 09:36:11PM +0000, Simon Riggs wrote:
> >> I agree that the DDL behaviour is wrong and should be fixed. Thank you
> >> for championing that alternative view.
> >>
> >> Swapping based upon names only works and is very flexible, much more so
> >> than EXCHANGE could be.
> >>
> >> A separate utility might be worth it, but the feature set of that should
> >> be defined in terms of correctly-working DDL behaviour. It's possible
> >> that no further requirement exists. I remove my own patch from
> >> consideration for this release.
> >>
> >> I'll review your patch and commit it, problems or objections excepted. I
> >> haven't looked at it in any detail.
> >
> > Thanks. ?I wouldn't be very surprised if that patch is even the wrong way to
> > achieve these semantics, but it's great that we're on the same page as to which
> > semantics they are.
>
> I think Noah's patch is a not a good idea, because it will result in
> calling RangeVarGetRelid twice even in the overwhelmingly common case
> where no relevant invalidation occurs.  That'll add several syscache
> lookups per table to very common operations.

Valid concern.

[Refresher: this was a patch to improve behavior for this test case:

    psql -c "CREATE TABLE t AS VALUES ('old'); CREATE TABLE new_t AS VALUES ('new')"
    psql -c 'SELECT pg_sleep(2) FROM t' & # block the ALTER or DROP briefly
    sleep 1                                  # give prev time to take AccessShareLock

    # Do it this way, and the next SELECT gets data from the old table.
    psql -c 'ALTER TABLE t RENAME TO old_t; ALTER TABLE new_t RENAME TO t' &
    # Do it this way, and get: ERROR:  could not open relation with OID 41380
    #psql -c 'DROP TABLE t; ALTER TABLE new_t RENAME TO t' &

    psql -c 'SELECT * FROM t'        # I get 'old' or an error, never 'new'.
    psql -c 'DROP TABLE IF EXISTS t, old_t, new_t'

It did so by rechecking the RangeVar->oid resolution after locking the found
relation, by which time concurrent DDL could no longer be interfering.]

I benchmarked the original patch with this function:

    Datum
    nmtest(PG_FUNCTION_ARGS)
    {
        int32        n = PG_GETARG_INT32(0);
        int            i;
        RangeVar   *rv = makeRangeVar(NULL, "pg_am", 0);

        for (i = 0; i < n; ++i)
        {
            Relation    r = relation_openrv(rv, AccessShareLock);
            relation_close(r, AccessShareLock);
        }
        PG_RETURN_INT32(4);
    }

(Releasing the lock before transaction end makes for an unrealistic benchmark,
but so would opening the same relation millions of times in a single
transaction.  I'm trying to isolate the cost that would be spread over
millions of transactions opening relations a handful of times.  See attached
shar archive for a complete extension wrapping that test function.)

Indeed, the original patch slowed it by about 50%.  I improved the patch,
adding a global SharedInvalidMessageCounter to increment as we process
messages.  If this counter does not change between the RangeVarGetRelid() call
and the post-lock AcceptInvalidationMessages() call, we can skip the second
RangeVarGetRelid() call.  With the updated patch, I get these timings (in ms)
for runs of "SELECT nmtest(10000000)":

master: 19697.642, 20087.477, 19748.995
patched: 19723.716, 19879.538, 20257.671

In other words, no significant difference.  Since the patch removes the
no-longer-needed pre-lock call to AcceptInvalidationMessages(), changing to
"relation_close(r, NoLock)" in the test case actually reveals a 15%
performance improvement.  Granted, nothing to get excited about in light of
the artificiality.

This semantic improvement would be hard to test with the current pg_regress
suite, so I do not include any test case addition in the patch.  If
sufficiently important, it could be done with isolationtester.

Thanks,
nm

Attachment

pgsql-hackers by date:

Previous
From: Noah Misch
Date:
Subject: Re: On-the-fly index tuple deletion vs. hot_standby
Next
From: Florian Pflug
Date:
Subject: Re: Detailed documentation for external calls (threading, shared resources etc)