Re: pg_upgrade: optimize replication slot caught-up check - Mailing list pgsql-hackers
| From | Masahiko Sawada |
|---|---|
| Subject | Re: pg_upgrade: optimize replication slot caught-up check |
| Date | |
| Msg-id | CAD21AoCgCE9kTTB=QKQkDTdhnb9cczFnYAU0P20qqKEQszG-2g@mail.gmail.com |
| In response to | Re: pg_upgrade: optimize replication slot caught-up check (Chao Li <li.evan.chao@gmail.com>) |
| Responses | Re: pg_upgrade: optimize replication slot caught-up check |
| List | pgsql-hackers |
On Mon, Jan 5, 2026 at 6:55 PM Chao Li <li.evan.chao@gmail.com> wrote:
>
> > On Jan 6, 2026, at 02:02, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > Hi all,
> >
> > Commit 29d0a77fa6 improved pg_upgrade to allow migrating logical
> > slots. Currently, to check if the slots are ready to be migrated, we
> > call binary_upgrade_logical_slot_has_caught_up() for every single
> > slot. This checks if there are any unconsumed WAL records. However, we
> > noticed a performance issue. If there are many slots (e.g., 100 or
> > more) or if there is a WAL backlog, checking all slots one by one
> > takes a long time.
> >
> > Here are some test results from my environment:
> > With an empty cluster: 1.55s
> > With 200 slots and 30MB backlog: 15.51s
> >
> > Commit 6d3d2e8e5 introduced parallel checks per database, but a single
> > job might still have to check too many slots, causing delays.
> >
> > Since binary_upgrade_logical_slot_has_caught_up() essentially checks
> > if any decodable record exists in the database, IIUC it is not
> > necessary to check every slot. We can optimize this by checking only
> > the slot with the minimum confirmed_flush_lsn. If that slot is caught
> > up, we can assume the others are too. The attached patch implements this
> > optimization. With the patch, the test with 200 slots finished in
> > 2.512s. The execution time is now stable regardless of the number of
> > slots.
> >
> > One thing to note is that DecodeTXNNeedSkip() also considers
> > replication origin filters. Theoretically, a plugin could filter out
> > specific origins, which might lead to different results. However, this
> > is a very rare case. Even if it happens, it would just result in a
> > false positive (the upgrade fails safely), so the impact is minimal.
> > Therefore, the patch simplifies the check to be per-database instead
> > of per-slot.
> >
> > Feedback is very welcome.
> >
> > Regards,
> >
> > --
> > Masahiko Sawada
> > Amazon Web Services: https://aws.amazon.com
> > <v1-0001-pg_upgrade-Optimize-replication-slot-caught-up-ch.patch>
>
> Hi Masahiko-san,
>
> Basically I think the optimization idea makes sense to me. Reducing the
> check from O(slots × WAL) to O(databases × WAL) sounds like a win. Just a
> few small comments:

Thank you for reviewing the patch!

> 1
> ```
> +static void process_old_cluter_logical_slot_catchup_infos(DbInfo *dbinfo, PGresult *res, void *arg);
> ```
>
> Typo in the function name: cluter -> cluster
>
> 2
> ```
> - logical_slot_infos_query = get_old_cluster_logical_slot_infos_query();
> + const char *slot_info_query = "SELECT slot_name, plugin, two_phase, failover, "
> ```
>
> logical_slot_infos_query is no longer used, should be removed.

Fixed.

> 3
> ```
> + " ORDER BY confirmed_flush_lsn ASC "
> ```
>
> I am thinking, can confirmed_flush_lsn be NULL? If so, we may want to add
> "NULLS LAST".

Given that the query is not executed during a live check, I think
confirmed_flush cannot be NULL. Similarly, temporary should not be true, but
the check is kept for consistency with slot_info_query.

> 4
> ```
> + Assert(num_slots == dbinfo->slot_arr.nslots);
> +
> + for (int i = 0; i < num_slots; i++)
> + {
> +     char *slotname;
> +     bool caught_up;
> +
> +     slotname = PQgetvalue(res, i, PQfnumber(res, "slot_name"));
> +     caught_up = (strcmp(PQgetvalue(res, i, PQfnumber(res, "caught_up")), "t") == 0);
> +
> +     for (int slotnum = 0; slotnum < dbinfo->slot_arr.nslots; slotnum++)
> +     {
> ```
>
> As num_slots == dbinfo->slot_arr.nslots, this is still an O(num_slots^2)
> check. Given this patch's goal is to eliminate the burden from a large
> number of slots, maybe we can build a hash table to make the check faster.
> Or if we sort both queries in the same order we can simply update the slots
> sequentially.

One problem I can see in this approach is that we could wrongly report
invalid slots in invalid_logical_slots.txt. Even if the slot with the oldest
confirmed_flush_lsn has unconsumed decodable records, other slots might have
already caught up. I'll think about how to deal with this problem.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
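[Editor's illustration, not part of the thread] The core claim above is that if the slot with the minimum confirmed_flush_lsn has no unconsumed decodable records, none of the other slots do either, since (ignoring origin filters) decodability depends only on the database, not on the slot. A minimal Python sketch of that equivalence, with made-up LSN values:

```python
# Sketch: per-slot check vs. min-LSN check. All names and values here are
# hypothetical; decodability is modeled as depending only on the WAL record
# position, per the per-database simplification described in the mail.

def has_decodable_record_after(lsn, decodable_lsns):
    """True if any decodable WAL record lies beyond the given LSN."""
    return any(rec > lsn for rec in decodable_lsns)

def all_caught_up_naive(slot_lsns, decodable_lsns):
    # The pre-patch behavior: check every slot, O(slots x WAL).
    return all(not has_decodable_record_after(lsn, decodable_lsns)
               for lsn in slot_lsns)

def all_caught_up_optimized(slot_lsns, decodable_lsns):
    # The patch's idea: check only the minimum confirmed_flush_lsn, O(WAL).
    return not has_decodable_record_after(min(slot_lsns), decodable_lsns)

slots = [120, 150, 200]
# All decodable records are at or below every slot's LSN: both agree (True).
print(all_caught_up_naive(slots, [100, 110, 115]),
      all_caught_up_optimized(slots, [100, 110, 115]))
# A record at 130 is unconsumed by the slot at 120: both agree (False).
print(all_caught_up_naive(slots, [100, 130]),
      all_caught_up_optimized(slots, [100, 130]))
```

The two functions agree on every input because any record beyond some slot's confirmed_flush_lsn is necessarily beyond the minimum one; this is also why, as noted above, a false positive can only make the upgrade fail safely, never pass incorrectly.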
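[Editor's illustration, not part of the thread] Review comment 4 suggests replacing the nested name-matching loop with a hash table. A Python sketch of that shape (the row and slot structures are hypothetical stand-ins for the PGresult rows and dbinfo->slot_arr):

```python
# Sketch: O(n^2) nested matching replaced by a dict lookup, O(n) total.
# Data shapes are invented for illustration only.

# Rows as an assumed query might return them: (slot_name, caught_up).
query_rows = [("slot_a", True), ("slot_b", False), ("slot_c", True)]

# Slots gathered earlier, in a different order, caught_up not yet filled in.
slot_arr = [{"name": n, "caught_up": None}
            for n in ("slot_b", "slot_c", "slot_a")]

# Build the lookup once over the result rows...
caught_up_by_name = dict(query_rows)

# ...then a single pass over the slot array instead of a nested loop.
for slot in slot_arr:
    slot["caught_up"] = caught_up_by_name[slot["name"]]

print([(s["name"], s["caught_up"]) for s in slot_arr])
```

The alternative mentioned in the review, sorting both sides identically and walking them in lockstep, avoids the table entirely but couples the two queries' ORDER BY clauses, which is why a hash lookup is often the more robust choice.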