Re: pg_upgrade: optimize replication slot caught-up check - Mailing list pgsql-hackers
| From | Masahiko Sawada |
|---|---|
| Subject | Re: pg_upgrade: optimize replication slot caught-up check |
| Date | |
| Msg-id | CAD21AoCgCE9kTTB=QKQkDTdhnb9cczFnYAU0P20qqKEQszG-2g@mail.gmail.com |
| In response to | Re: pg_upgrade: optimize replication slot caught-up check (Chao Li <li.evan.chao@gmail.com>) |
| Responses | Re: pg_upgrade: optimize replication slot caught-up check |
| List | pgsql-hackers |
On Mon, Jan 5, 2026 at 6:55 PM Chao Li <li.evan.chao@gmail.com> wrote:
>
> > On Jan 6, 2026, at 02:02, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > Hi all,
> >
> > Commit 29d0a77fa6 improved pg_upgrade to allow migrating logical
> > slots. Currently, to check if the slots are ready to be migrated, we
> > call binary_upgrade_logical_slot_has_caught_up() for every single
> > slot. This checks if there are any unconsumed WAL records. However, we
> > noticed a performance issue. If there are many slots (e.g., 100 or
> > more) or if there is a WAL backlog, checking all slots one by one
> > takes a long time.
> >
> > Here are some test results from my environment:
> > With an empty cluster: 1.55s
> > With 200 slots and 30MB backlog: 15.51s
> >
> > Commit 6d3d2e8e5 introduced parallel checks per database, but a single
> > job might still have to check too many slots, causing delays.
> >
> > Since binary_upgrade_logical_slot_has_caught_up() essentially checks
> > if any decodable record exists in the database, IIUC it is not
> > necessary to check every slot. We can optimize this by checking only
> > the slot with the minimum confirmed_flush_lsn. If that slot is caught
> > up, we can assume the others are too. The attached patch implements this
> > optimization. With the patch, the test with 200 slots finished in
> > 2.512s. The execution time is now stable regardless of the number of
> > slots.
> >
> > One thing to note is that DecodeTXNNeedSkip() also considers
> > replication origin filters. Theoretically, a plugin could filter out
> > specific origins, which might lead to different results. However, this
> > is a very rare case. Even if it happens, it would just result in a
> > false positive (the upgrade fails safely), so the impact is minimal.
> > Therefore, the patch simplifies the check to be per-database instead
> > of per-slot.
> >
> > Feedback is very welcome.
> >
> > Regards,
> >
> > --
> > Masahiko Sawada
> > Amazon Web Services: https://aws.amazon.com
> > <v1-0001-pg_upgrade-Optimize-replication-slot-caught-up-ch.patch>
>
> Hi Masahiko-san,
>
> Basically I think the optimization idea makes sense to me. Reducing the
> check from O(slots × WAL) to O(databases × WAL) sounds like a win. Just a
> few small comments:

Thank you for reviewing the patch!

> 1
> ```
> +static void process_old_cluter_logical_slot_catchup_infos(DbInfo *dbinfo, PGresult *res, void *arg);
> ```
>
> Typo in the function name: cluter -> cluster
>
> 2
> ```
> - logical_slot_infos_query = get_old_cluster_logical_slot_infos_query();
> + const char *slot_info_query = "SELECT slot_name, plugin, two_phase, failover, "
> ```
>
> logical_slot_infos_query is no longer used, should be removed.

Fixed.

> 3
> ```
> + " ORDER BY confirmed_flush_lsn ASC "
> ```
>
> I am thinking, can confirmed_flush_lsn be NULL? If so, we may want to add
> "NULLS LAST".

Given that the query is not executed during a live check, I think
confirmed_flush cannot be NULL. Similarly, temporary should not be true, but
the check is kept for consistency with slot_info_query.

> 4
> ```
> + Assert(num_slots == dbinfo->slot_arr.nslots);
> +
> + for (int i = 0; i < num_slots; i++)
> + {
> +     char *slotname;
> +     bool caught_up;
> +
> +     slotname = PQgetvalue(res, i, PQfnumber(res, "slot_name"));
> +     caught_up = (strcmp(PQgetvalue(res, i, PQfnumber(res, "caught_up")), "t") == 0);
> +
> +     for (int slotnum = 0; slotnum < dbinfo->slot_arr.nslots; slotnum++)
> +     {
> ```
>
> As num_slots == dbinfo->slot_arr.nslots, this is still an O(num_slots^2)
> check. Given this patch's goal is to eliminate the burden from a large
> number of slots, maybe we can build a hash table to make the check faster.
> Or if we sort both queries in the same order we can simply update the slots
> sequentially.

One problem I can see in this approach is that we could wrongly report
invalid slots in invalid_logical_slots.txt. Even if the slot with the oldest
confirmed_flush_lsn has unconsumed decodable records, other slots might have
already caught up. I'll think about how to deal with this problem.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
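[Editor's illustration, not part of the thread] The core claim above is that if the slot with the minimum confirmed_flush_lsn has no unconsumed decodable records, none of the other slots do either, since (ignoring origin filters) decodability depends only on the database, not on the slot. A minimal Python sketch of that equivalence, with made-up LSN values:

```python
# Sketch: per-slot check vs. min-LSN check. All names and values here are
# hypothetical; decodability is modeled as depending only on the WAL record
# position, per the per-database simplification described in the mail.

def has_decodable_record_after(lsn, decodable_lsns):
    """True if any decodable WAL record lies beyond the given LSN."""
    return any(rec > lsn for rec in decodable_lsns)

def all_caught_up_naive(slot_lsns, decodable_lsns):
    # The pre-patch behavior: check every slot, O(slots x WAL).
    return all(not has_decodable_record_after(lsn, decodable_lsns)
               for lsn in slot_lsns)

def all_caught_up_optimized(slot_lsns, decodable_lsns):
    # The patch's idea: check only the minimum confirmed_flush_lsn, O(WAL).
    return not has_decodable_record_after(min(slot_lsns), decodable_lsns)

slots = [120, 150, 200]
# All decodable records are at or below every slot's LSN: both agree (True).
print(all_caught_up_naive(slots, [100, 110, 115]),
      all_caught_up_optimized(slots, [100, 110, 115]))
# A record at 130 is unconsumed by the slot at 120: both agree (False).
print(all_caught_up_naive(slots, [100, 130]),
      all_caught_up_optimized(slots, [100, 130]))
```

The two functions agree on every input because any record beyond some slot's confirmed_flush_lsn is necessarily beyond the minimum one; this is also why, as noted above, a false positive can only make the upgrade fail safely, never pass incorrectly.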
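[Editor's illustration, not part of the thread] Review comment 4 suggests replacing the nested name-matching loop with a hash table. A Python sketch of that shape (the row and slot structures are hypothetical stand-ins for the PGresult rows and dbinfo->slot_arr):

```python
# Sketch: O(n^2) nested matching replaced by a dict lookup, O(n) total.
# Data shapes are invented for illustration only.

# Rows as an assumed query might return them: (slot_name, caught_up).
query_rows = [("slot_a", True), ("slot_b", False), ("slot_c", True)]

# Slots gathered earlier, in a different order, caught_up not yet filled in.
slot_arr = [{"name": n, "caught_up": None}
            for n in ("slot_b", "slot_c", "slot_a")]

# Build the lookup once over the result rows...
caught_up_by_name = dict(query_rows)

# ...then a single pass over the slot array instead of a nested loop.
for slot in slot_arr:
    slot["caught_up"] = caught_up_by_name[slot["name"]]

print([(s["name"], s["caught_up"]) for s in slot_arr])
```

The alternative mentioned in the review, sorting both sides identically and walking them in lockstep, avoids the table entirely but couples the two queries' ORDER BY clauses, which is why a hash lookup is often the more robust choice.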