Re: [patch] demote - Mailing list pgsql-hackers

From Jehan-Guillaume de Rorthais
Subject Re: [patch] demote
Date
Msg-id 20200805000453.229fbefc@firost
Whole thread Raw
In response to Re: [patch] demote  (Jehan-Guillaume de Rorthais <jgdr@dalibo.com>)
Responses Re: [patch] demote  (Jehan-Guillaume de Rorthais <jgdr@dalibo.com>)
List pgsql-hackers
Hi,

Yet another summary + patch + tests.

Demote now keeps backends with no active xid alive. Smart mode keeps all
backends: it waits for them to finish their xact and enter read-only. Fast
mode terminate backends wit an active xid and keeps all other ones.
Backends enters "read-only" using LocalXLogInsertAllowed=0 and flip it to -1
(check recovery state) once demoted.
During demote, no new session is allowed.

As backends with no active xid survive, a new SQL admin function
"pg_demote(fast bool, wait bool, wait_seconds int)" had been added.

Demote now relies on sigusr1 instead of hijacking sigterm/sigint and pmdie().
The resulting refactoring makes the code much simpler, cleaner, with better
isolation of actions from the code point of view.

Thanks to the refactoring, the patch now only adds one state to the state
machine: PM_DEMOTING. A second one could be use to replace:

    /* Demoting: start the Startup Process */
    if (DemoteSignal && pmState == PM_SHUTDOWN && CheckpointerPID == 0)

with eg.:

    if (pmState == PM_DEMOTED)

I believe it might be a bit simpler to understand, but the existing comment
might be good enough as well. The full state machine path for demote is:

 PM_DEMOTING  /* wait for active xid backend to finish */
 PM_SHUTDOWN  /* wait for checkpoint shutdown and its 
                 various shutdown tasks */
 PM_SHUTDOWN && !CheckpointerPID /* aka PM_DEMOTED: start Startup process */
 PM_STARTUP

Tests in "recovery/t/021_promote-demote.pl" grows from 13 to 24 tests,
adding tests on backend behaviors during demote and new function pg_demote().

On my todo:

* cancel running checkpoint for fast demote ?
* forbid demote when PITR backup is in progress
* user documentation
* Robert's concern about snapshot during hot standby
* anything else reported to me

Plus, I might be able to split the backend part and their signals of the patch
0002 in its own patch if it helps the review. It would apply after 0001 and
before actual 0002.

As there was no consensus and the discussions seemed to conclude this patch set
should keep growing to see were it goes, I wonder if/when I should add it to
the commitfest. Advice? Opinion?

Regards,

Attachment

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: 13dev failed assert: comparetup_index_btree(): ItemPointer values should never be equal
Next
From: Tom Lane
Date:
Subject: Re: Confusing behavior of create table like