Re: Fix a server crash problem from pg_get_database_ddl - Mailing list pgsql-hackers

From SATYANARAYANA NARLAPURAM
Subject Re: Fix a server crash problem from pg_get_database_ddl
Msg-id CAHg+QDcNyJ94cCD+9ZRfz==hDnghjE5BaR4+BiSWXt82hpgDtA@mail.gmail.com
In response to Re: Fix a server crash problem from pg_get_database_ddl  (Chao Li <li.evan.chao@gmail.com>)
List pgsql-hackers
Hi,

Adding Tom to the thread explicitly to seek his opinion.

On Wed, Apr 15, 2026 at 6:36 PM Chao Li <li.evan.chao@gmail.com> wrote:


> On Apr 16, 2026, at 09:23, Japin Li <japinli@hotmail.com> wrote:
>
> On Wed, 15 Apr 2026 at 20:44, "Jack Bonatakis" <jack@bonatak.is> wrote:
>> I have reproduced this error against the current master:
>>
>> ```
>> CREATE TABLESPACE ts1 LOCATION '/workspace/tablespaces/pg_bug_ts1';
>> CREATE DATABASE db1 TABLESPACE ts1;
>> DELETE FROM pg_tablespace WHERE spcname = 'ts1';
>> SELECT * FROM pg_get_database_ddl('db1'::regdatabase);
>>
>> server closed the connection unexpectedly
>> This probably means the server terminated abnormally
>> before or while processing the request.
>> The connection to the server was lost. Attempting reset: Failed.
>> ```
>> Backend logs show:
>>
>> ```
>> [1] LOG:  client backend (PID 15420) was terminated by signal 11: Segmentation fault
>> [1] DETAIL:  Failed process was running: SELECT * FROM pg_get_database_ddl('db1'::regdatabase);
>> [1] LOG:  terminating any other active server processes
>> ```
>> After applying the patch:
>>
>> ```
>> SELECT * FROM pg_get_database_ddl('db1'::regdatabase);
>> ERROR:  tablespace with OID 16393 does not exist
>> HINT:  To recover, try ALTER DATABASE ... SET TABLESPACE ... to a valid tablespace.
>> ```
>> and backend logs show:
>>
>> ```
>> [56] ERROR:  tablespace with OID 16393 does not exist
>> [56] HINT:  To recover, try ALTER DATABASE ... SET TABLESPACE ... to a valid tablespace.
>> [56] STATEMENT:  SELECT * FROM pg_get_database_ddl('db1'::regdatabase);
>> ```
>> All tests pass.
>>
>> The only note I'd have on the code change is that there is no accompanying test. It seems like a TAP test would be
>> reasonable, but I am quite new and will defer to whether you think that's the right call or even necessary.
>>
>> Jack
>
> This seems similar to [1]. Could you please confirm?
>
> [1] https://www.postgresql.org/message-id/CAJTYsWXcd324VELk%3D9KdsfTsua9So3Yexqv7N3B23h9zAUD40g%40mail.gmail.com.
>
> --
> Regards,
> Japin Li
> ChengDu WenWu Information Technology Co., Ltd.
>
>

Thanks for pointing that out. Yes, they are similar.

I agree with what Tom said in [2]:
```
This is not a bug. This is a superuser intentionally breaking
the system by corrupting the catalogs. There are any number
of ways to cause trouble with ill-advised manual updates to a
catalog table. Try, eg, "DELETE FROM pg_proc" (... but not in
a database you care about).
```

So, let me withdraw this patch.

[2] https://www.postgresql.org/message-id/1538113.1768921841@sss.pgh.pa.us

This is a rare corner case, but it does not require a superuser to intentionally break the catalogs.
For example, a concurrent tablespace drop can race with database DDL that assigns a different (or the default) tablespace.
We intentionally do not acquire an ACCESS SHARE lock on the database in this function, so it is good practice
to do the NULL checks. Admittedly, this comment would have been better raised during code review.
I will let Tom and others chime in with their thoughts on fixing this.
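
To make the NULL-check point concrete, here is a minimal standalone sketch of the pattern. It is not the patch and does not use PostgreSQL headers: lookup_tablespace_name() is a hypothetical stand-in for a lookup such as get_tablespace_name(), which returns NULL when no such tablespace exists, and append_tablespace_clause() is an illustrative name for the caller. The OID table is hard-coded for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for a catalog lookup. Like get_tablespace_name(),
 * it returns NULL when the OID does not match any tablespace.
 * The hard-coded table is an assumption made for this sketch.
 */
const char *
lookup_tablespace_name(unsigned int spc_oid)
{
    switch (spc_oid)
    {
        case 1663:
            return "pg_default";
        case 1664:
            return "pg_global";
        default:
            return NULL;        /* dropped concurrently, or never existed */
    }
}

/*
 * Illustrative caller: writes a TABLESPACE clause into buf.
 * Returns 0 on success. If the lookup returns NULL, writes an error
 * message instead and returns -1, rather than dereferencing NULL.
 */
int
append_tablespace_clause(char *buf, size_t buflen, unsigned int spc_oid)
{
    const char *spcname;

    assert(buf != NULL && buflen > 0);

    spcname = lookup_tablespace_name(spc_oid);
    if (spcname == NULL)
    {
        snprintf(buf, buflen,
                 "tablespace with OID %u does not exist", spc_oid);
        return -1;
    }

    snprintf(buf, buflen, " TABLESPACE = %s", spcname);
    return 0;
}
```

In the backend, the error branch would presumably raise an error via ereport() instead of returning -1; the sketch only shows that the result is checked before it is used.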

I have attached an injection point test that demonstrates the race. It is not intended for commit.

Thanks,
Satya
