Re: Optimize LISTEN/NOTIFY - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Optimize LISTEN/NOTIFY
Date
Msg-id 1843261.1767817768@sss.pgh.pa.us
Whole thread Raw
In response to Re: Optimize LISTEN/NOTIFY  ("Joel Jacobson" <joel@compiler.org>)
List pgsql-hackers
"Joel Jacobson" <joel@compiler.org> writes:
> I've reworked the staging mechanism for LISTEN/UNLISTEN. The new design
> tracks LISTEN state at three levels:
> * pendingListenChannels: per-transaction pending changes
> * listenChannelsHash: per-backend committed state cache
> * channelHash: cluster-wide shared state

I spent some time looking through this version.

> The main benefit is that pendingListenChannels is now a hash table
> instead of a simple List.  In the old design, LISTEN foo; UNLISTEN foo;
> LISTEN foo would create three list entries that all had to be processed
> at commit.  The new design collapses this to one hash entry storing the
> final state, which we just apply at commit.

That seems a little weird: surely such usage is not something we need
to optimize.  Maybe it can be justified because it makes the code
simpler, but that's not immediately obvious to me.

I found the terminology pretty confusing, in particular there's too
little cognitive distance between "channel hash table" and "local
listen channels hash table" and "pending listen channels hash table".
I think you could improve matters by renaming, perhaps along the
lines of
    channelHash -> globalChannelTable
    listenChannelsHash -> localChannelTable
    pendingListenChannels -> pendingListenActions

... and I just noticed that there's yet a fourth data structure
that's described as a "channel hash table", namely
pendingNotifies->channelSet.  That's another name that could be
better, maybe uniqueChannelNames?

I noted while looking at code coverage results that AtAbort_Notify
seems almost entirely unreached, and soon realized that that's because
none of it runs unless we are cleaning up a transaction that fails
after Exec_ListenPreCommit has created pendingListenChannels.
That's obviously extremely difficult to reach in a test, so it's a bit
sad that we have that big chunk of unreachable code, especially when
the coverage elsewhere in this file is pretty stellar.  I wonder if
there is a way to merge that chunk with the code in AtCommit_Notify
that also has to scan pendingListenChannels, so as to reduce the
amount of untested code.  Also, the comments for AtAbort_Notify
don't really make it clear that there's nothing to do unless we're
reverting an almost-committed transaction.

Speaking of AtCommit_Notify:

            entry = dshash_find(channelHash, &key, true);
            if (entry == NULL)
                continue;

How can it be okay to just silently "continue" here?  Certainly
this case should be unreachable, but that doesn't make it okay
to silently ignore the failure if one were to occur.  Given
that this is post-commit processing, probably we can do no
better than

            if (entry == NULL)
                elog(PANIC, "could not find channel hash entry when expected");

Also, a bit further down:

                        /* UNLISTEN being committed: remove from channelHash */
                        entry->numListeners--;
                        if (i < entry->numListeners)
                            memmove(&listeners[i], &listeners[i + 1],
                                    sizeof(ListenerEntry) * (entry->numListeners - i));

                        /* Remove from local cache */
                        (void) hash_search(listenChannelsHash, pending->channel,
                                           HASH_REMOVE, NULL);

                        if (entry->numListeners == 0)
                        {
                            dsa_free(channelDSA, entry->listenersArray);
                            dshash_delete_entry(channelHash, entry);
                            entry = NULL;
                        }

This seems like a weird order of operations: why didn't you finish
dealing with the channelHash entry before dealing with
listenChannelsHash?

The separation between (for example) Exec_ListenPreCommit
and Exec_ListenPreCommitStage is extremely nonobvious;
readers will wonder why you didn't fold them into one routine.
I see that you did that because Exec_ListenPreCommit falls
out early if amRegisteredListener, but that doesn't make it
less confusing.  I wonder if it'd help to rename
Exec_ListenPreCommit to something that better reflects what
it actually does, say BecomeRegisteredListener.  The new
function's name is not good either.  I think what's bugging me
about it is that you're using "stage" as a verb, but English
does not put the verb at the end of the sentence, so readers
will think "stage" means "a step of the process" and then
wonder why you couldn't be clearer about which step.
Maybe better, "PrepareTableEntriesForListen"?  Similar
comments apply to the other xxxStage functions.

            regards, tom lane



pgsql-hackers by date:

Previous
From: Masahiko Sawada
Date:
Subject: Re: pg_upgrade: optimize replication slot caught-up check
Next
From: Daniil Davydov
Date:
Subject: Re: POC: Parallel processing of indexes in autovacuum