Re: Skipping logical replication transactions on subscriber side - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: Skipping logical replication transactions on subscriber side |
Date | |
Msg-id | CAD21AoAZ76=YB_QyQuDNc-NBdGfQ_zbiee3aw7MUVFFmTZPB6A@mail.gmail.com |
In response to | Re: Skipping logical replication transactions on subscriber side (Amit Kapila <amit.kapila16@gmail.com>) |
Responses | Re: Skipping logical replication transactions on subscriber side |
List | pgsql-hackers |
On Sat, Sep 25, 2021 at 4:23 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Sep 24, 2021 at 6:44 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Fri, Sep 24, 2021 at 8:01 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > 6.
> > > +typedef struct PgStat_StatSubEntry
> > > +{
> > > +    Oid subid; /* hash table key */
> > > +
> > > +    /*
> > > +     * Statistics of errors that occurred during logical replication. While
> > > +     * having the hash table for table sync errors we have a separate
> > > +     * statistics value for apply error (apply_error), because we can avoid
> > > +     * building a nested hash table for table sync errors in the case where
> > > +     * there is no table sync error, which is the common case in practice.
> > > +     *
> > >
> > > The above comment is not clear to me. Why do you need to have a
> > > separate hash table for table sync errors? And what makes it avoid
> > > building nested hash table?
> >
> > In the previous patch, a subscription stats entry
> > (PgStat_StatSubEntry) had one hash table that had error entries of
> > both apply and table sync. Since a subscription can have one apply
> > worker and multiple table sync workers it makes sense to me to have
> > the subscription entry have a hash table for them.
> >
>
> Sure, but each tablesync worker must have a separate relid. Why can't
> we have a single hash table for both apply and table sync workers
> which are hashed by sub_id + rel_id? For apply worker, the rel_id will
> always be zero (InvalidOid) and tablesync workers will have a unique
> OID for rel_id, so we should be able to uniquely identify each of
> apply and table sync workers.

What I imagined is to extend the subscription statistics, for
instance, transaction stats[1]. By having a hash table for
subscriptions, we can store those statistics into an entry of the hash
table and we can think of subscription errors as also statistics of
the subscription. So we can have another hash table for errors in an
entry of the subscription hash table. For example, the subscription
entry struct will be something like:

typedef struct PgStat_StatSubEntry
{
    Oid subid;              /* hash key */

    HTAB *errors;           /* apply and table sync errors */

    /* transaction stats of subscription */
    PgStat_Counter xact_commit;
    PgStat_Counter xact_commit_bytes;
    PgStat_Counter xact_error;
    PgStat_Counter xact_error_bytes;
    PgStat_Counter xact_abort;
    PgStat_Counter xact_abort_bytes;
    PgStat_Counter failure_count;
} PgStat_StatSubEntry;

When a subscription is dropped, we can easily drop the subscription
entry along with those statistics including the errors from the hash
table.

Regards,

[1] https://www.postgresql.org/message-id/OSBPR01MB48887CA8F40C8D984A6DC00CED199%40OSBPR01MB4888.jpnprd01.prod.outlook.com

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
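To make the nested layout above concrete, here is a minimal standalone sketch. It is not the patch's code: plain linked lists stand in for the two HTABs so that it compiles outside the PostgreSQL tree, and the names SubStats, SubEntry, SubErrorEntry, sub_lookup, sub_report_error, sub_error_entries and sub_drop, as well as the Oid/InvalidOid/PgStat_Counter stand-in definitions, are hypothetical and only for illustration. An error entry with relid == InvalidOid represents the apply worker, and any other relid represents a table sync worker.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-ins so the sketch builds without PostgreSQL headers (assumption). */
typedef uint32_t Oid;
#define InvalidOid ((Oid) 0)
typedef int64_t PgStat_Counter;

/* One error record; relid == InvalidOid denotes the apply worker. */
typedef struct SubErrorEntry
{
    Oid relid;
    PgStat_Counter count;           /* hypothetical per-worker error count */
    struct SubErrorEntry *next;
} SubErrorEntry;

/* One subscription; the list of errors stands in for the nested HTAB. */
typedef struct SubEntry
{
    Oid subid;
    SubErrorEntry *errors;          /* apply and table sync errors */
    PgStat_Counter xact_commit;     /* example of a non-error statistic */
    struct SubEntry *next;
} SubEntry;

/* Outer container; stands in for the subscription hash table. */
typedef struct SubStats
{
    SubEntry *head;
} SubStats;

/* Find a subscription entry, optionally creating it. */
SubEntry *sub_lookup(SubStats *stats, Oid subid, int create)
{
    SubEntry *e;

    for (e = stats->head; e != NULL; e = e->next)
        if (e->subid == subid)
            return e;
    if (!create)
        return NULL;
    e = calloc(1, sizeof(SubEntry));
    assert(e != NULL);
    e->subid = subid;
    e->next = stats->head;
    stats->head = e;
    return e;
}

/* Record one error for the worker identified by (subid, relid). */
void sub_report_error(SubStats *stats, Oid subid, Oid relid)
{
    SubEntry *sub = sub_lookup(stats, subid, 1);
    SubErrorEntry *err;

    for (err = sub->errors; err != NULL; err = err->next)
        if (err->relid == relid)
            break;
    if (err == NULL)
    {
        err = calloc(1, sizeof(SubErrorEntry));
        assert(err != NULL);
        err->relid = relid;
        err->next = sub->errors;
        sub->errors = err;
    }
    err->count++;
}

/* Number of distinct workers with recorded errors for a subscription. */
int sub_error_entries(SubStats *stats, Oid subid)
{
    SubEntry *sub = sub_lookup(stats, subid, 0);
    SubErrorEntry *err;
    int n = 0;

    if (sub != NULL)
        for (err = sub->errors; err != NULL; err = err->next)
            n++;
    return n;
}

/* Drop a subscription: its statistics and all nested errors go together. */
void sub_drop(SubStats *stats, Oid subid)
{
    SubEntry **link = &stats->head;
    SubEntry *victim;

    while (*link != NULL && (*link)->subid != subid)
        link = &(*link)->next;
    if (*link == NULL)
        return;
    victim = *link;
    *link = victim->next;
    while (victim->errors != NULL)
    {
        SubErrorEntry *next = victim->errors->next;

        free(victim->errors);
        victim->errors = next;
    }
    free(victim);
}
```

With this shape, dropping a subscription needs only one lookup by subid and then releases everything hanging off that entry. With a single table keyed by sub_id + rel_id, the drop path would presumably have to find every entry whose key carries that sub_id.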