Home > mailing lists

Re: Initial COPY of Logical Replication is too slow - Mailing list pgsql-hackers

From	Masahiko Sawada
Subject	Re: Initial COPY of Logical Replication is too slow
Date	December 20 04:58:59
Msg-id	CAD21AoA6i2ui8FMZeuU_KxX4t-fM8G==zTW2Dp6-goujttrpew@mail.gmail.com Whole thread Raw
In response to	Initial COPY of Logical Replication is too slow (Marcos Pegoraro <marcos@f10.com.br>)
List	pgsql-hackers

Tree view

Hi,

On Sat, Dec 6, 2025 at 4:19 AM Marcos Pegoraro <marcos@f10.com.br> wrote:
>
> Subscriber needs to ask publisher about tables and fields to COPY and it uses pg_get_publication_tables for that, and
itis too slow when the number of tables is high because on every table it's subscribed it has to run this select. 

Yeah, if we pass a publication that a lot of tables belong to to
pg_get_publication_tables(), it could take a long time to return as it
needs to construct many entries.

> We can get the same result with a join on pg_publication_rel.

You changed the query not to use pg_get_publication_tables():

-                                                "SELECT DISTINCT"
-                                                "  (CASE WHEN
(array_length(gpt.attrs, 1) = c.relnatts)"
-                                                "   THEN NULL ELSE
gpt.attrs END)"
-                                                "  FROM pg_publication p,"
-                                                "  LATERAL
pg_get_publication_tables(p.pubname) gpt,"
-                                                "  pg_class c"
-                                                " WHERE gpt.relid =
%u AND c.oid = gpt.relid"
-                                                "   AND p.pubname IN ( %s )",
-                                                lrel->remoteid,
-                                                pub_names->data);
+                       "SELECT CASE WHEN cardinality(r.prattrs) <>
relnatts THEN "
+                                               "r.prattrs END FROM pg_class c "
+                                               "LEFT JOIN LATERAL
(SELECT DISTINCT prattrs FROM "
+                                               "pg_publication_rel r
INNER JOIN pg_publication p "
+                                               "ON p.oid = r.prpubid
WHERE c.oid = r.prrelid AND "
+                                               "pubname in ( %s )) r
ON TRUE WHERE c.oid = %u",
+                                                pub_names->data,
+                                                lrel->remoteid);

Simply replacing pg_get_publication_tables() with joining on
pg_publication_rel doesn't work since pg_get_publication_tables()
cares for several cases, for example where the specified columns are
generated columns and the specified table is a partitioned table etc.
Therefore the patch doesn't pass the regression tests.

I think it would make more sense to introduce a dedicated SQL function
that takes the reloid as well as the list of publications and returns
the relation's the column list and row filter expression while
filtering unnecessary rows inside the function.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

pgsql-hackers by date:

From: John Naylor
Date: 20 December, 04:37:35
Subject: Re: Fix typo 586/686 in atomics/arch-x86.h

From: Tom Lane
Date: 20 December, 05:01:23
Subject: Re: Fix typo 586/686 in atomics/arch-x86.h

Re: Initial COPY of Logical Replication is too slow - Mailing list pgsql-hackers

Previous

Next