Re: Fix race condition in pg_get_publication_tables with concurrent DROP TABLE - Mailing list pgsql-hackers
| From | shveta malik |
|---|---|
| Subject | Re: Fix race condition in pg_get_publication_tables with concurrent DROP TABLE |
| Date | |
| Msg-id | CAJpy0uCMuZ8BOvJqTb4m29h9K2LyJR+zXtEWu6NpyO3wqM=tZA@mail.gmail.com Whole thread |
| In response to | Re: Fix race condition in pg_get_publication_tables with concurrent DROP TABLE (Chao Li <li.evan.chao@gmail.com>) |
| List | pgsql-hackers |
On Tue, May 5, 2026 at 8:40 AM Chao Li <li.evan.chao@gmail.com> wrote:
>
>
>
> > On May 4, 2026, at 13:08, shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Fri, May 1, 2026 at 7:00 AM Bharath Rupireddy
> > <bharath.rupireddyforpostgres@gmail.com> wrote:
> >>
> >> Hi,
> >>
> >> On Wed, Apr 29, 2026 at 8:41 PM Chao Li <li.evan.chao@gmail.com> wrote:
> >>>
> >>>>
<v5-0001-PG16-Fix-pg_get_publication_tables-race-with-conc.txt><v5-0001-Fix-pg_get_publication_tables-race-with-concurren.patch><v5-0001-PG18-Fix-pg_get_publication_tables-race-with-conc.txt><v5-0001-PG17-Fix-pg_get_publication_tables-race-with-conc.txt>
> >>>
> >>> I am afraid this is only a partial fix.
> >>
> >> Thanks for reviewing it. Please find my responses below.
> >>
> >>> ```
> >>> @@ -1599,12 +1621,18 @@ pg_get_publication_tables(FunctionCallInfo fcinfo, ArrayType *pubnames,
> >>> /* Show all columns when the column list is not specified. */
> >>> if (nulls[2])
> >>> {
> >>> - Relation rel = table_open(relid, AccessShareLock);
> >>> + Relation rel = try_table_open(relid, AccessShareLock);
> >>> int nattnums = 0;
> >>> int16 *attnums;
> >>> - TupleDesc desc = RelationGetDescr(rel);
> >>> + TupleDesc desc;
> >>> int i;
> >>>
> >>> + /* Skip if the relation has been concurrently dropped. */
> >>> + if (rel == NULL)
> >>> + continue;
> >>> ```
> >>>
> >>> This change uses try_table_open() to detect whether a table has been dropped, but try_table_open() is only called
whenthe column list is not specified. If a table is included in the publication with an explicit column list, then even
ifit is dropped concurrently, it may still be returned by pg_get_publication_tables().
> >>
> >> Right. The try_table_open() is only needed there because that's the
> >> only code path that actually opens the relation (to enumerate its
> >> columns). The column list path reads from the pg_publication_rel
> >> catalog - it never calls table_open(), so it cannot hit the ERROR.
> >>
> >>> So this patch removes the “could not open relation with OID” error, but it does not fully ensure the accuracy of
thereturned table list.
> >>
> >> IMO, no function returning table OIDs can guarantee they remain valid
> >> - a drop can happen right after we return the OID, and accuracy is in
> >> the caller's hands. All the callers of pg_get_publication_tables()
> >> already handle this by JOINing with pg_class.
> >>
> >
> > I agree with Bharath. Also I would like to add one more point. We do have this:
> >
> > + /* Skip if the relation has been concurrently dropped. */
> > + if (!OidIsValid(schemaid))
> > + continue;
> >
>
> Actually, this is the other comment I have. Why the comment says “if the relation has been dropped”, but the actual
checkis on schema id?
>
Okay, I see your point. Shall we tweak it to the below to make it more
understandable?
Oid schemaid;
/*
* get_rel_namespace() returns InvalidOid if the relation no longer exists
* (e.g., dropped concurrently). Skip such entries.
*/
if (!OidIsValid(schemaid = get_rel_namespace(relid)))
continue;
thanks
Shveta
pgsql-hackers by date: