Fix a server crash problem from pg_get_database_ddl

Started by Chao Li, 2 days ago, 4 messages, hackers
#1 Chao Li
li.evan.chao@gmail.com

Hi,

While doing some testing, I hit a server crash:
```
2026-04-15 11:30:17.377 CST [98179] LOG: client backend (PID 41260) was terminated by signal 11: Segmentation fault: 11
2026-04-15 11:30:17.377 CST [98179] DETAIL: Failed process was running: SELECT * FROM pg_get_database_ddl('db1'::regdatabase);
2026-04-15 11:30:17.377 CST [98179] LOG: terminating any other active server processes
2026-04-15 11:30:17.380 CST [44361] FATAL: the database system is in recovery mode
```

After debugging it, I found that the crash happened because I had mistakenly deleted the tablespace entry directly from pg_tablespace, and pg_get_database_ddl_internal() calls get_tablespace_name() without checking whether the return value is NULL.

So this doesn't seem like a bug a normal user could hit. It is more like a superuser-only mistake that creates an invalid catalog state. I think that even in such an edge case, we should raise a proper error instead of crashing the backend.

BTW, I have verified that in this case, ALTER DATABASE ... SET TABLESPACE can move the database to a valid tablespace and recover from the issue.

This patch fixes that by checking for a NULL result and throwing an error.
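
To illustrate the shape of the check, here is a minimal standalone sketch. It is not the attached patch and does not use PostgreSQL headers: `lookup_tablespace_name()` and `append_tablespace_clause()` are hypothetical stand-ins for `get_tablespace_name()` and the DDL-building code, the `TABLESPACE %s` output format is an assumption, and the error is returned through a buffer rather than raised with the server's error machinery.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

typedef unsigned int Oid;

/*
 * Hypothetical stand-in for a catalog lookup by OID. Like the real lookup
 * described above, it returns NULL when no matching row exists.
 */
const char *
lookup_tablespace_name(Oid spcid)
{
    switch (spcid)
    {
        case 1663:
            return "pg_default";
        case 1664:
            return "pg_global";
        default:
            return NULL;        /* e.g. the row was deleted by hand */
    }
}

/*
 * Write a TABLESPACE clause into buf.
 *
 * Returns 0 on success. Returns -1 and fills errmsg when the OID has no
 * name, instead of passing a NULL pointer on to string formatting.
 */
int
append_tablespace_clause(char *buf, size_t buflen, Oid spcid,
                         char *errmsg, size_t errlen)
{
    const char *spcname;

    assert(buf != NULL && errmsg != NULL);

    spcname = lookup_tablespace_name(spcid);
    if (spcname == NULL)
    {
        /* The guard: report the dangling OID rather than dereference NULL. */
        snprintf(errmsg, errlen,
                 "tablespace with OID %u does not exist", spcid);
        return -1;
    }

    snprintf(buf, buflen, "TABLESPACE %s", spcname);
    return 0;
}
```

In the server itself the failure branch would raise an error rather than return a code; the point of the sketch is only that the lookup result is tested before it is used.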

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/

Attachments:

v1-0001-ddlutils-error-out-when-pg_get_database_ddl-sees-.patch (application/octet-stream, +7 -1)
#2 Jack Bonatakis
jack@bonatak.is
In reply to: Chao Li (#1)
Re: Fix a server crash problem from pg_get_database_ddl

I have reproduced this error against the current master:

```
CREATE TABLESPACE ts1 LOCATION '/workspace/tablespaces/pg_bug_ts1';
CREATE DATABASE db1 TABLESPACE ts1;
DELETE FROM pg_tablespace WHERE spcname = 'ts1';
SELECT * FROM pg_get_database_ddl('db1'::regdatabase);

server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
```
Backend logs show:

```
[1] LOG: client backend (PID 15420) was terminated by signal 11: Segmentation fault
[1] DETAIL: Failed process was running: SELECT * FROM pg_get_database_ddl('db1'::regdatabase);
[1] LOG: terminating any other active server processes
```
After applying the patch:

```
SELECT * FROM pg_get_database_ddl('db1'::regdatabase);
ERROR: tablespace with OID 16393 does not exist
HINT: To recover, try ALTER DATABASE ... SET TABLESPACE ... to a valid tablespace.
```
and backend logs show:

```
[56] ERROR: tablespace with OID 16393 does not exist
[56] HINT: To recover, try ALTER DATABASE ... SET TABLESPACE ... to a valid tablespace.
[56] STATEMENT: SELECT * FROM pg_get_database_ddl('db1'::regdatabase);
```
All tests pass.

The only note I'd have on the code change is that there is no accompanying test. It seems like a TAP test would be reasonable, but I am quite new and will defer to whether you think that's the right call or even necessary.

Jack

#3 Japin Li
japinli@hotmail.com
In reply to: Jack Bonatakis (#2)
Re: Fix a server crash problem from pg_get_database_ddl

On Wed, 15 Apr 2026 at 20:44, "Jack Bonatakis" <jack@bonatak.is> wrote:

I have reproduced this error against the current master:

```
CREATE TABLESPACE ts1 LOCATION '/workspace/tablespaces/pg_bug_ts1';
CREATE DATABASE db1 TABLESPACE ts1;
DELETE FROM pg_tablespace WHERE spcname = 'ts1';
SELECT * FROM pg_get_database_ddl('db1'::regdatabase);

server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
```
Backend logs show:

```
[1] LOG: client backend (PID 15420) was terminated by signal 11: Segmentation fault
[1] DETAIL: Failed process was running: SELECT * FROM pg_get_database_ddl('db1'::regdatabase);
[1] LOG: terminating any other active server processes
```
After applying the patch:

```
SELECT * FROM pg_get_database_ddl('db1'::regdatabase);
ERROR: tablespace with OID 16393 does not exist
HINT: To recover, try ALTER DATABASE ... SET TABLESPACE ... to a valid tablespace.
```
and backend logs show:

```
[56] ERROR: tablespace with OID 16393 does not exist
[56] HINT: To recover, try ALTER DATABASE ... SET TABLESPACE ... to a valid tablespace.
[56] STATEMENT: SELECT * FROM pg_get_database_ddl('db1'::regdatabase);
```
All tests pass.

The only note I'd have on the code change is that there is no accompanying test. It seems like a TAP test would be
reasonable, but I am quite new and will defer to whether you think that's the right call or even necessary.

Jack

This seems similar to [1]. Could you please confirm?

[1]: /messages/by-id/CAJTYsWXcd324VELk=9KdsfTsua9So3Yexqv7N3B23h9zAUD40g@mail.gmail.com.

--
Regards,
Japin Li
ChengDu WenWu Information Technology Co., Ltd.

#4 Chao Li
li.evan.chao@gmail.com
In reply to: Japin Li (#3)
Re: Fix a server crash problem from pg_get_database_ddl

On Apr 16, 2026, at 09:23, Japin Li <japinli@hotmail.com> wrote:

On Wed, 15 Apr 2026 at 20:44, "Jack Bonatakis" <jack@bonatak.is> wrote:

I have reproduced this error against the current master:

```
CREATE TABLESPACE ts1 LOCATION '/workspace/tablespaces/pg_bug_ts1';
CREATE DATABASE db1 TABLESPACE ts1;
DELETE FROM pg_tablespace WHERE spcname = 'ts1';
SELECT * FROM pg_get_database_ddl('db1'::regdatabase);

server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
```
Backend logs show:

```
[1] LOG: client backend (PID 15420) was terminated by signal 11: Segmentation fault
[1] DETAIL: Failed process was running: SELECT * FROM pg_get_database_ddl('db1'::regdatabase);
[1] LOG: terminating any other active server processes
```
After applying the patch:

```
SELECT * FROM pg_get_database_ddl('db1'::regdatabase);
ERROR: tablespace with OID 16393 does not exist
HINT: To recover, try ALTER DATABASE ... SET TABLESPACE ... to a valid tablespace.
```
and backend logs show:

```
[56] ERROR: tablespace with OID 16393 does not exist
[56] HINT: To recover, try ALTER DATABASE ... SET TABLESPACE ... to a valid tablespace.
[56] STATEMENT: SELECT * FROM pg_get_database_ddl('db1'::regdatabase);
```
All tests pass.

The only note I'd have on the code change is that there is no accompanying test. It seems like a TAP test would be
reasonable, but I am quite new and will defer to whether you think that's the right call or even necessary.

Jack

This seems similar to [1]. Could you please confirm?

[1] /messages/by-id/CAJTYsWXcd324VELk=9KdsfTsua9So3Yexqv7N3B23h9zAUD40g@mail.gmail.com.

--
Regards,
Japin Li
ChengDu WenWu Information Technology Co., Ltd.

Thanks for pointing that out. Yes, they are similar.

I agree with what Tom said in [2]:
```
This is not a bug. This is a superuser intentionally breaking
the system by corrupting the catalogs. There are any number
of ways to cause trouble with ill-advised manual updates to a
catalog table. Try, eg, "DELETE FROM pg_proc" (... but not in
a database you care about).
```

So, let me withdraw this patch.

[2]: /messages/by-id/1538113.1768921841@sss.pgh.pa.us

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/