Re: Fix a server crash problem from pg_get_database_ddl - Mailing list pgsql-hackers
| From | Andrew Dunstan |
|---|---|
| Subject | Re: Fix a server crash problem from pg_get_database_ddl |
| Date | |
| Msg-id | CAD5tBcLmK90R2UvUkXPtXZS7f6u08PgrvMtbtNVXdxJbwCzGhg@mail.gmail.com Whole thread |
| In response to | Re: Fix a server crash problem from pg_get_database_ddl (Chao Li <li.evan.chao@gmail.com>) |
| List | pgsql-hackers |
On Thu, Apr 30, 2026 at 2:41 AM Chao Li <li.evan.chao@gmail.com> wrote:
> On Apr 27, 2026, at 12:56, SATYANARAYANA NARLAPURAM <satyanarlapuram@gmail.com> wrote:
>
> Hi Chao,
>
> On Sun, Apr 26, 2026 at 7:03 PM Chao Li <li.evan.chao@gmail.com> wrote:
>
>
> > On Apr 26, 2026, at 22:50, Andrew Dunstan <andrew@dunslane.net> wrote:
> >
> >
> > On 2026-04-23 Th 2:47 AM, SATYANARAYANA NARLAPURAM wrote:
> >>
> >>
> >> Thanks for printing out that. Yes, they are similar.
> >>
> >> I agree with what Tom said in [2]:
> >> ```
> >> This is not a bug. This is a superuser intentionally breaking
> >> the system by corrupting the catalogs. There are any number
> >> of ways to cause trouble with ill-advised manual updates to a
> >> catalog table. Try, eg, "DELETE FROM pg_proc" (... but not in
> >> a database you care about).
> >> ```
> >>
> >> So, let me take back this patch.
> >>
> >> [2] https://www.postgresql.org/message-id/1538113.1768921841@sss.pgh.pa.us
> >> In this case, it is a very corner case but not something superuser intentionally breaks.
> >> For example, a concurrent tablespace drop + database ddl to assign a different tablespace or default.
> >> We aren't acquiring Access Share lock on the DB in this function (intentional) so it is a good practice
> >> to do the null checks. Of course, it makes more sense to add this comment while doing a code review.
> >> I will let Tom and others chime in with their thoughts on fixing this.
> >>
> >> Attached an injection point test to show the race. Not intended to commit.
> >>
> >>
> >
> > I agree if there's a race condition we should protect against it. I don't much like the idea of silently ignoring it, though. Raising an error seems more like the right thing to do.
> >
> > cheers
> >
> > andrew
> > --
> > Andrew Dunstan
> > EDB: https://www.enterprisedb.com
> >
>
> The v1 patch raises an error when the tablespace name is NULL.
>
> PFA v2: removed hint from the error message, because I now consider the hint might not be necessary.
>
> + if (spcname == NULL)
> + ereport(ERROR,
> + errcode(ERRCODE_UNDEFINED_OBJECT),
> + errmsg("tablespace with OID %u does not exist",
> + dbform->dattablespace));
> +
>
> A message with error detail that says a concurrent DDL could have dropped a tablespace could be better?
> System catalog is optional.
> Something like:
>
> errdetail("The tablespace may have been dropped concurrently, or the system catalog is inconsistent.")));
>
> Thanks,
> Satya
Hi Satya,
Thanks for your review. I hesitate to add a detail message here because we do not actually know the root cause. A concurrent DROP TABLESPACE could be one cause, but some unusual user operation could also lead to the same result, so I am not sure the detail message would help much.
The main purpose of this patch is to avoid passing NULL to pg_strcasecmp() and to report the missing object clearly. So I think the errmsg itself is enough.
Thanks, pushed.
I don't think it hurts to give some information on why this might happen, so I did add an errdetail.
cheers
andrew
pgsql-hackers by date: