Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication - Mailing list pgsql-hackers
From | shveta malik |
---|---|
Subject | Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication |
Date | |
Msg-id | CAJpy0uAo+AWTMU9s7bJUXJuyd-NvMM290Au2BMtQMfpmviXLNA@mail.gmail.com Whole thread Raw |
In response to | Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication (Melih Mutlu <m.melihmutlu@gmail.com>) |
Responses |
Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication
Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication |
List | pgsql-hackers |
On Thu, Jan 26, 2023 at 7:53 PM Melih Mutlu <m.melihmutlu@gmail.com> wrote: > > If a relation is currently being synced by a tablesync worker and uses a replication slot/origin for that operation, thensrrelslotname and srreloriginname fields will have values. > When a relation is done with its replication slot/origin, their info gets removed from related catalog row, so that slot/origincan be reused for another table or dropped if not needed anymore. > In your case, all relations are in READY state so it's expected that srrelslotname and srreloriginname are empty. READYrelations do not need a replication slot/origin anymore. > > Tables are probably synced so quickly that you're missing the moments when a tablesync worker copies a relation and storesits rep. slot/origin in the catalog. > If initial sync is long enough, then you should be able to see the columns get updated. I follow [1] to make it longerand test if the patch really updates the catalog. > Thank You for the details. It is clear now. > > > Rebased and resolved conflicts. Please check the new version > Please find my suggestions on v9: 1. --Can we please add a few more points to the Summary to make it more clear. a) something telling that reusability of workers is for tables under one subscription and not across multiple subscriptions. b) Since we are reusing both workers and slots, can we add: --when do we actually end up spawning a new worker --when do we actually end up creating a new slot in a worker rather than using existing one. --if we reuse existing slots, what happens to the snapshot? 2. + The last used ID for tablesync workers. This ID is used to + create replication slots. The last used ID needs to be stored + to make logical replication can safely proceed after any interruption. + If sublastusedid is 0, then no table has been synced yet. --typo: to make logical replication can safely proceed ==> to make logical replication safely proceed --Also, does it sound better: The last used ID for tablesync workers. It acts as an unique identifier for replication slots which are created by table-sync workers. The last used ID needs to be persisted... 3. is_first_run; move_to_next_rel; --Do you think one variable is enough here as we do not get any extra info by using 2 variables? Can we have 1 which is more generic like 'ready_to_reuse'. Otherwise, please let me know if we must use 2. 4. /* missin_ok = true, since the origin might be already dropped. */ typo: missing_ok 5. GetReplicationSlotNamesBySubId: errmsg("not tuple returned.")); Can we have a better error msg: ereport(ERROR, errmsg("could not receive list of slots associated with subscription %d, error: %s", subid, res->err)); 6. static void clean_sync_worker(void) --can we please add introductory comment for this function. 7. /* * Pick the table for the next run if there is not another worker * already picked that table. */ Pick the table for the next run if it is not already picked up by another worker. 8. process_syncing_tables_for_sync() /* Cleanup before next run or ending the worker. */ --can we please improve this comment: if there is no more work left for this worker, stop the worker gracefully, else do clean-up and make it ready for the next relation/run. 9. LogicalRepSyncTableStart: * Read previous slot name from the catalog, if exists. */ prev_slotname = (char *) palloc0(NAMEDATALEN); Do we need to free this at the end? 10. if (strlen(prev_slotname) == 0) { elog(ERROR, "Replication slot could not be found for relation %u", MyLogicalRepWorker->relid); } shall we mention subid also in error msg. I am reviewing further... thanks Shveta
pgsql-hackers by date: