Re: Initial COPY of Logical Replication is too slow - Mailing list pgsql-hackers
| From | Chao Li |
|---|---|
| Subject | Re: Initial COPY of Logical Replication is too slow |
| Date | |
| Msg-id | DDAEF936-A1F2-4619-B272-6290219B6BEC@gmail.com |
| In response to | Re: Initial COPY of Logical Replication is too slow (Masahiko Sawada <sawada.mshk@gmail.com>) |
| List | pgsql-hackers |
> On Feb 26, 2026, at 03:03, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Mon, Jan 26, 2026 at 12:30 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>>
>> On Mon, Jan 19, 2026 at 9:44 AM Marcos Pegoraro <marcos@f10.com.br> wrote:
>>>
>>> On Fri, Dec 19, 2025 at 22:59, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>>>>
>>>> Yeah, if we pass a publication that a lot of tables belong to to
>>>> pg_get_publication_tables(), it could take a long time to return as it
>>>> needs to construct many entries.
>>>
>>> Well, I don't know how to help, but I'm sure it's working badly.
>>> Today I added some fields on my server, and looking at the logs I could see how slow this process is.
>>>
>>> duration: 2213.872 ms statement: SELECT DISTINCT (CASE WHEN (array_length(gpt.attrs, 1) = c.relnatts) THEN NULL ELSE gpt.attrs END)
>>> FROM pg_publication p, LATERAL pg_get_publication_tables(p.pubname) gpt, pg_class c
>>> WHERE gpt.relid = 274376788 AND c.oid = gpt.relid AND p.pubname IN ( 'mypub' )
>>>
>>> 2 seconds to get the list of fields of a table is really too slow.
>>> How can we solve this?
>>
>> After more investigation of the slowness, it seems that the
>> list_concat_unique_oid() called below is quite slow when the database
>> has a lot of tables to publish:
>>
>> relids = GetPublicationRelations(pub_elem->oid,
>>                                  pub_elem->pubviaroot ?
>>                                  PUBLICATION_PART_ROOT :
>>                                  PUBLICATION_PART_LEAF);
>> schemarelids = GetAllSchemaPublicationRelations(pub_elem->oid,
>>                                                 pub_elem->pubviaroot ?
>>                                                 PUBLICATION_PART_ROOT :
>>                                                 PUBLICATION_PART_LEAF);
>> pub_elem_tables = list_concat_unique_oid(relids, schemarelids);
>>
>> This is simply because it's O(n^2), where n is the number of oids in
>> schemarelids in the test case. A simple change would be to do sort &
>> dedup instead. With the attached experimental patch, the
>> pg_get_publication_tables() execution time gets halved in my
>> environment (796ms -> 430ms with 50k tables).
>> If the number of tables
>> is not large, this method might be slower than today, but it's not a
>> huge regression.
>>
>> In the initial tablesync cases, it could be optimized further by
>> introducing a new SQL function that gets the column list and row
>> filter expression of the specific table. This way, we can filter the
>> result by relid at an early stage instead of getting all information
>> and filtering by relid as the tablesync worker does today, avoiding
>> the overhead of gathering system catalog scan results.
>
> I've drafted this idea and I find it looks like a better approach. The
> patch introduces the pg_get_publication_table_info() SQL function,
> which returns the column list and row filter expression like
> pg_get_publication_tables() does, but checks only the specific table,
> unlike pg_get_publication_tables(). On my env, the tablesync worker's
> query in question goes from 288 ms to 0.6 ms with 50k tables in one
> publication. Feedback is very welcome.
>
> Regards,
>
> --
> Masahiko Sawada
> Amazon Web Services: https://aws.amazon.com
> <0001-Add-pg_get_publication_table_info-to-optimize-logica.patch>

A few comments:

1. pg_publication.c needs a pg_indent run. When I ran pg_indent against it, the patch code got a lot of format changes.

2.
```
+		values[0] = ObjectIdGetDatum(pub->oid);
+		values[1] = ObjectIdGetDatum(relid);
+
+		values[0] = ObjectIdGetDatum(pub->oid);
+		values[1] = ObjectIdGetDatum(relid);
```
Looks like a copy-pasto.

3. I think we can optimize pg_get_publication_table_info() a little bit for non-published tables by setting funcctx->max_calls = 0.
I tried this code locally, and the tests still passed:

```
		if (publish)
		{
			pubrel = palloc_object(published_rel);
			pubrel->relid = relid;
			pubrel->pubid = pub->oid;
			funcctx->tuple_desc = BlessTupleDesc(tupdesc);
			funcctx->user_fctx = pubrel;
			funcctx->max_calls = 1;	/* return one row */
		}
		else
			funcctx->max_calls = 0;	/* return no rows */

		MemoryContextSwitchTo(oldcontext);
	}

	/* stuff done on every call of the function */
	funcctx = SRF_PERCALL_SETUP();

	if (funcctx->call_cntr < funcctx->max_calls)
	{
		HeapTuple	rettuple;

		table_info = (published_rel *) funcctx->user_fctx;
		rettuple = construct_published_rel_tuple(table_info, funcctx->tuple_desc);
		SRF_RETURN_NEXT(funcctx, HeapTupleGetDatum(rettuple));
	}

	SRF_RETURN_DONE(funcctx);
```

4.
```
+	int			i;
+
+	attnums = palloc_array(int16, desc->natts);
+
+	for (i = 0; i < desc->natts; i++)
```
Nit: Peter (E) once did some cleanup changing loop variables declared outside the loop into C99-style declarations inside the for statement, so I think that style is now preferred:
```
	for (int i = 0; i < desc->natts; i++)
```

5.
```
+	attnums = palloc_array(int16, desc->natts);
```
Nit: the attnums array is never freed; is that intentional? Actually, I'm kind of lost about when memory should be freed and when not.

Best regards,

--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/