BUG #18866: Running pg_freespace() on views triggers an Abort

Started by PG Bug reporting formabout 1 year ago14 messagesbugs
Jump to latest
#1PG Bug reporting form
noreply@postgresql.org

The following bug has been logged on the website:

Bug reference: 18866
Logged by: Robins Tharakan
Email address: tharakan@gmail.com
PostgreSQL version: Unsupported/Unknown
Operating system: Ubuntu
Description:

Hi,

Passing a view to pg_freespace() triggers an Abort on HEAD. This has been so
since the beginning (049ef3398d05c9dc8f48aa9a6d68440661cfeb87). Given that
this is just an assert, feel free to skip - but thought I'd bring it up, in
case this needs a review.

SQL / Backtrace / Error Log excerpt given below:

SQL
===
$ cat crashing.sql
CREATE EXTENSION pg_freespacemap;
SELECT pg_freespace('pg_roles', 0);

SQL Output
==========
$ psql -f crashing.sql
CREATE EXTENSION
psql:crashing.sql:2: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
psql:crashing.sql:2: error: connection to server was lost

Backtrace
=========
#5 0x00005b073976f7e0 in ExceptionalCondition (conditionName=0x5b07399bb930
"RelFileNumberIsValid(rlocator.relNumber)", fileName=0x5b07399bb90c
"smgr.c", lineNumber=228)
at assert.c:66
No locals.
#6 0x00005b0739556b62 in smgropen (rlocator=..., backend=-1) at
smgr.c:228
brlocator = {locator = {spcOid = 1101590320, dbOid = 23303,
relNumber = 1615569600}, backend = 31264}
reln = 0x5b0741ab40a8
found = 65
#7 0x00005b07395163cb in RelationGetSmgr (rel=0x7a2060b99bf8) at
../../../../src/include/utils/rel.h:579
No locals.
#8 0x00005b0739516dc2 in fsm_readbuf (rel=0x7a2060b99bf8, addr=...,
extend=false) at freespace.c:558
blkno = 2
buf = -998073056
reln = 0x0
#9 0x00005b0739516705 in GetRecordedFreeSpace (rel=0x7a2060b99bf8,
heapBlk=0) at freespace.c:254
addr = {level = 0, logpageno = 0}
slot = 0
buf = 23303
cat = 22 '\026'
#10 0x00007a2061ab630c in pg_freespace (fcinfo=0x5b0741b07928) at
pg_freespacemap.c:38
relid = 12000
blkno = 0
freespace = -17281
rel = 0x7a2060b99bf8
__func__ = "pg_freespace"
#11 0x00005b0739259dc4 in ExecInterpExpr (state=0x5b0741b077d0,
econtext=0x5b0741b074b8, isnull=0x0) at execExprInterp.c:1001
d = 100086724982736
fcinfo = 0x5b0741b07928
args = 0x5b0741b07948
op = 0x5b0741b07b50
resultslot = 0x5b0741b07708
innerslot = 0x0
outerslot = 0x0
scanslot = 0x0
oldslot = 0x0
newslot = 0x0

Surfacing Commit
================
Checking (62f36d6924c~0) - 62f36d6924 - fail
Checking (62f36d6924c~10) - 023fb51275 - fail
Checking (62f36d6924c~30) - e215166c9c - fail
.
.
.
Checking (62f36d6924c~3560) - b31ba5310b - fail
Checking (62f36d6924c~3565) - e5b8c4f68f - pass
Checking (62f36d6924c~3562) - 049ef3398d - fail
Checking (62f36d6924c~3563) - cd7f19da34 - pass
Surfacing commit is 049ef3398d05c9dc8f48aa9a6d68440661cfeb87

Postgres Error Logs
===================
TRAP: failed Assert("RelFileNumberIsValid(rlocator.relNumber)"), File:
"smgr.c", Line: 228, PID: 2372149
postgres: smith: smith postgres 127.0.0.1(51026)
SELECT(ExceptionalCondition+0xbb)[0x5b073976f7ad]
postgres: smith: smith postgres 127.0.0.1(51026)
SELECT(smgropen+0x5e)[0x5b0739556b62]
postgres: smith: smith postgres 127.0.0.1(51026)
SELECT(+0x6b83cb)[0x5b07395163cb]
postgres: smith: smith postgres 127.0.0.1(51026)
SELECT(+0x6b8dc2)[0x5b0739516dc2]
postgres: smith: smith postgres 127.0.0.1(51026)
SELECT(GetRecordedFreeSpace+0x50)[0x5b0739516705]
/home/smith/proj/blamepg/lib/postgresql/pg_freespacemap.so(pg_freespace+0xc4)[0x7a2061ab730c]
postgres: smith: smith postgres 127.0.0.1(51026)
SELECT(+0x3fbdc4)[0x5b0739259dc4]
postgres: smith: smith postgres 127.0.0.1(51026)
SELECT(ExecInterpExprStillValid+0x5b)[0x5b073925c94a]
postgres: smith: smith postgres 127.0.0.1(51026)
SELECT(+0x461f27)[0x5b07392bff27]
postgres: smith: smith postgres 127.0.0.1(51026)
SELECT(+0x461fe5)[0x5b07392bffe5]
postgres: smith: smith postgres 127.0.0.1(51026)
SELECT(+0x462046)[0x5b07392c0046]
postgres: smith: smith postgres 127.0.0.1(51026)
SELECT(+0x462269)[0x5b07392c0269]
postgres: smith: smith postgres 127.0.0.1(51026)
SELECT(+0x418aba)[0x5b0739276aba]
postgres: smith: smith postgres 127.0.0.1(51026)
SELECT(+0x40aeaf)[0x5b0739268eaf]
postgres: smith: smith postgres 127.0.0.1(51026)
SELECT(+0x40df9e)[0x5b073926bf9e]
postgres: smith: smith postgres 127.0.0.1(51026)
SELECT(standard_ExecutorRun+0x1cc)[0x5b07392696eb]
postgres: smith: smith postgres 127.0.0.1(51026)
SELECT(ExecutorRun+0x54)[0x5b073926951c]
postgres: smith: smith postgres 127.0.0.1(51026)
SELECT(+0x707af2)[0x5b0739565af2]
postgres: smith: smith postgres 127.0.0.1(51026)
SELECT(PortalRun+0x274)[0x5b0739565720]
postgres: smith: smith postgres 127.0.0.1(51026)
SELECT(+0x6ffc69)[0x5b073955dc69]
postgres: smith: smith postgres 127.0.0.1(51026)
SELECT(PostgresMain+0xb8d)[0x5b0739563461]
postgres: smith: smith postgres 127.0.0.1(51026)
SELECT(+0x6fafe7)[0x5b0739558fe7]
postgres: smith: smith postgres 127.0.0.1(51026)
SELECT(postmaster_child_launch+0x174)[0x5b073945cac4]
postgres: smith: smith postgres 127.0.0.1(51026)
SELECT(+0x605664)[0x5b0739463664]
postgres: smith: smith postgres 127.0.0.1(51026)
SELECT(+0x602a7e)[0x5b0739460a7e]
postgres: smith: smith postgres 127.0.0.1(51026)
SELECT(PostmasterMain+0x1546)[0x5b0739460374]
postgres: smith: smith postgres 127.0.0.1(51026)
SELECT(main+0x38c)[0x5b07392ff7db]
/lib/x86_64-linux-gnu/libc.so.6(+0x2a1ca)[0x7a206042a1ca]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x8b)[0x7a206042a28b]
postgres: smith: smith postgres 127.0.0.1(51026)
SELECT(_start+0x25)[0x5b0738f43eb5]
2025-03-25 21:32:25.437 ACDT [2372118] LOG: client backend (PID 2372149)
was terminated by signal 6: Aborted
2025-03-25 21:32:25.437 ACDT [2372118] DETAIL: Failed process was running:
SELECT public.pg_freespace('pg_roles', 0);
2025-03-25 21:32:25.437 ACDT [2372118] LOG: terminating any other active
server processes

Thanks to SQLSmith / SQLReduce.
-
Robins

#2Tender Wang
tndrwang@gmail.com
In reply to: PG Bug reporting form (#1)
Re: BUG #18866: Running pg_freespace() on views triggers an Abort

PG Bug reporting form <noreply@postgresql.org> 于2025年3月25日周二 22:42写道:

The following bug has been logged on the website:

Bug reference: 18866
Logged by: Robins Tharakan
Email address: tharakan@gmail.com
PostgreSQL version: Unsupported/Unknown
Operating system: Ubuntu
Description:

Hi,

Passing a view to pg_freespace() triggers an Abort on HEAD. This has been
so
since the beginning (049ef3398d05c9dc8f48aa9a6d68440661cfeb87). Given that
this is just an assert, feel free to skip - but thought I'd bring it up, in
case this needs a review.

SQL / Backtrace / Error Log excerpt given below:

SQL
===
$ cat crashing.sql
CREATE EXTENSION pg_freespacemap;
SELECT pg_freespace('pg_roles', 0);

Yeah, it would crash if you input a foreign table, for example:
create extension postgres_fdw;
CREATE SERVER d FOREIGN DATA WRAPPER postgres_fdw;
CREATE FOREIGN TABLE f (g text) SERVER d;
SELECT pg_freespace('f', 0); -- will crash too

I think we can remove the Assert in smgropen().
Any thoughts?

--
Thanks,
Tender Wang

Attachments:

0001-Remove-a-wrong-Assert.patchtext/plain; charset=US-ASCII; name=0001-Remove-a-wrong-Assert.patchDownload+0-2
#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tender Wang (#2)
Re: BUG #18866: Running pg_freespace() on views triggers an Abort

Tender Wang <tndrwang@gmail.com> writes:

PG Bug reporting form <noreply@postgresql.org> 于2025年3月25日周二 22:42写道:

Passing a view to pg_freespace() triggers an Abort on HEAD.

I think we can remove the Assert in smgropen().
Any thoughts?

That seems like a "solution" that will allow other bugs. pg_freespace
should be fixed to verify RELKIND_HAS_STORAGE() before it tries to
access said storage.

regards, tom lane

#4Tender Wang
tndrwang@gmail.com
In reply to: Tom Lane (#3)
Re: BUG #18866: Running pg_freespace() on views triggers an Abort

Tom Lane <tgl@sss.pgh.pa.us> 于2025年3月26日周三 00:37写道:

Tender Wang <tndrwang@gmail.com> writes:

PG Bug reporting form <noreply@postgresql.org> 于2025年3月25日周二 22:42写道:

Passing a view to pg_freespace() triggers an Abort on HEAD.

I think we can remove the Assert in smgropen().
Any thoughts?

That seems like a "solution" that will allow other bugs. pg_freespace
should be fixed to verify RELKIND_HAS_STORAGE() before it tries to
access said storage.

Thanks for the advice. Please see the attached patch.

--
Thanks,
Tender Wang

Attachments:

v2-0001-Don-t-allow-no-storage-relation-to-get-FSM-info.patchtext/plain; charset=US-ASCII; name=v2-0001-Don-t-allow-no-storage-relation-to-get-FSM-info.patchDownload+9-1
#5Euler Taveira
euler@eulerto.com
In reply to: Tender Wang (#4)
Re: BUG #18866: Running pg_freespace() on views triggers an Abort

On Tue, Mar 25, 2025, at 10:18 PM, Tender Wang wrote:

Thanks for the advice. Please see the attached patch.

Your patch needs some adjustments. There is no need to include pg_class.h. I
don't like the proposed error message. I prefer saying the relation cannot be
opened because that's what will happen if it reaches this code path. This is an
existing message so no need for translation. The output will be like:

postgres=# SELECT pg_freespace('f', 0);
ERROR: cannot open relation "f"
DETAIL: This operation is not supported for foreign tables.
postgres=# SELECT pg_freespace('pg_roles', 0);
ERROR: cannot open relation "pg_roles"
DETAIL: This operation is not supported for views.
postgres=# SELECT pg_freespace('pg_class', 0);
pg_freespace
--------------
7520
(1 row)

I'm attaching a patch that includes these modifications.

--
Euler Taveira
EDB https://www.enterprisedb.com/

Attachments:

fsm.patchtext/x-patch; name=fsm.patchDownload+8-0
#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Euler Taveira (#5)
Re: BUG #18866: Running pg_freespace() on views triggers an Abort

"Euler Taveira" <euler@eulerto.com> writes:

Your patch needs some adjustments. There is no need to include pg_class.h. I
don't like the proposed error message. I prefer saying the relation cannot be
opened because that's what will happen if it reaches this code path.

I don't care for that proposal either: we just did open the relation ;-)

Looking at other places where we throw an error for !RELKIND_HAS_STORAGE:

rawpage.c:
(errcode(ERRCODE_WRONG_OBJECT_TYPE),
errmsg("cannot get raw page from relation \"%s\"",

pgstatindex.c:

(errcode(ERRCODE_WRONG_OBJECT_TYPE),
errmsg("cannot get page count of relation \"%s\"",

tid.c:
errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("cannot look at latest visible tid for relation \"%s.%s\"",

relcache.c:
elog(ERROR, "relation \"%s\" does not have storage",
RelationGetRelationName(relation));

So the previous proposal was evidently modeled on the first two of
these precedents. Personally I prefer messages that say *why*
something failed, so I'd go with something more like "relation \"%s\"
does not have storage". Use of errdetail_relkind_not_supported is
fine though.

regards, tom lane

#7Euler Taveira
euler@eulerto.com
In reply to: Tom Lane (#6)
Re: BUG #18866: Running pg_freespace() on views triggers an Abort

On Wed, Mar 26, 2025, at 1:00 PM, Tom Lane wrote:

"Euler Taveira" <euler@eulerto.com> writes:

Your patch needs some adjustments. There is no need to include pg_class.h. I
don't like the proposed error message. I prefer saying the relation cannot be
opened because that's what will happen if it reaches this code path.

I don't care for that proposal either: we just did open the relation ;-)

Fair point.

relcache.c:
elog(ERROR, "relation \"%s\" does not have storage",
RelationGetRelationName(relation));

So the previous proposal was evidently modeled on the first two of
these precedents. Personally I prefer messages that say *why*
something failed, so I'd go with something more like "relation \"%s\"
does not have storage". Use of errdetail_relkind_not_supported is
fine though.

I thought about saying "no storage" but don't know why was convinced by that
proposal. Your suggestion works for me.

--
Euler Taveira
EDB https://www.enterprisedb.com/

#8Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Euler Taveira (#5)
Re: BUG #18866: Running pg_freespace() on views triggers an Abort

On 2025-Mar-25, Euler Taveira wrote:

On Tue, Mar 25, 2025, at 10:18 PM, Tender Wang wrote:

Thanks for the advice. Please see the attached patch.

Your patch needs some adjustments. There is no need to include pg_class.h.

I think including pg_class.h (which is where RELKIND_HAS_STORAGE() is
defined and where errdetail_relkind_not_supported is declared) is better
than including utils/rel.h, which brings in a whole bunch of additional
stuff in addition to pg_class.h :-)

--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
"Thou shalt not follow the NULL pointer, for chaos and madness await
thee at its end." (2nd Commandment for C programmers)

#9Euler Taveira
euler@eulerto.com
In reply to: Alvaro Herrera (#8)
Re: BUG #18866: Running pg_freespace() on views triggers an Abort

On Wed, Mar 26, 2025, at 2:33 PM, Álvaro Herrera wrote:

On 2025-Mar-25, Euler Taveira wrote:

On Tue, Mar 25, 2025, at 10:18 PM, Tender Wang wrote:

Thanks for the advice. Please see the attached patch.

Your patch needs some adjustments. There is no need to include pg_class.h.

I think including pg_class.h (which is where RELKIND_HAS_STORAGE() is
defined and where errdetail_relkind_not_supported is declared) is better
than including utils/rel.h, which brings in a whole bunch of additional
stuff in addition to pg_class.h :-)

It doesn't work because of RelationGetRelationName() that is defined into
utils/rel.h.

--
Euler Taveira
EDB https://www.enterprisedb.com/

#10Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Euler Taveira (#9)
Re: BUG #18866: Running pg_freespace() on views triggers an Abort

On 2025-Mar-26, Euler Taveira wrote:

It doesn't work because of RelationGetRelationName() that is defined into
utils/rel.h.

Doh, of course.

--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
"I can't go to a restaurant and order food because I keep looking at the
fonts on the menu. Five minutes later I realize that it's also talking
about food" (Donald Knuth)

#11Tender Wang
tndrwang@gmail.com
In reply to: Tom Lane (#6)
Re: BUG #18866: Running pg_freespace() on views triggers an Abort

Tom Lane <tgl@sss.pgh.pa.us> 于2025年3月27日周四 00:01写道:

"Euler Taveira" <euler@eulerto.com> writes:

Your patch needs some adjustments. There is no need to include

pg_class.h. I

don't like the proposed error message. I prefer saying the relation

cannot be

opened because that's what will happen if it reaches this code path.

I don't care for that proposal either: we just did open the relation ;-)

Looking at other places where we throw an error for !RELKIND_HAS_STORAGE:

rawpage.c:
(errcode(ERRCODE_WRONG_OBJECT_TYPE),
errmsg("cannot get raw page from relation \"%s\"",

pgstatindex.c:

(errcode(ERRCODE_WRONG_OBJECT_TYPE),
errmsg("cannot get page count of relation \"%s\"",

tid.c:
errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("cannot look at latest visible tid for relation
\"%s.%s\"",

relcache.c:
elog(ERROR, "relation \"%s\" does not have storage",
RelationGetRelationName(relation));

So the previous proposal was evidently modeled on the first two of
these precedents.

Yes, I referred to the rawpage.c.

Personally I prefer messages that say *why*
something failed, so I'd go with something more like "relation \"%s\"
does not have storage". Use of errdetail_relkind_not_supported is
fine though.

PFA for the updated patch file.

--
Thanks,
Tender Wang

Attachments:

v3-0001-Don-t-allow-no-storage-relation-to-get-FSM-info.patchtext/plain; charset=US-ASCII; name=v3-0001-Don-t-allow-no-storage-relation-to-get-FSM-info.patchDownload+8-1
#12Richard Guo
guofenglinux@gmail.com
In reply to: Tender Wang (#11)
Re: BUG #18866: Running pg_freespace() on views triggers an Abort

On Thu, Mar 27, 2025 at 4:14 PM Tender Wang <tndrwang@gmail.com> wrote:

PFA for the updated patch file.

Not related to this issue but I wonder why blkno is verified after
the relation is opened. It can be verified beforehand, no?

Thanks
Richard

#13Tom Lane
tgl@sss.pgh.pa.us
In reply to: Richard Guo (#12)
Re: BUG #18866: Running pg_freespace() on views triggers an Abort

Richard Guo <guofenglinux@gmail.com> writes:

Not related to this issue but I wonder why blkno is verified after
the relation is opened. It can be verified beforehand, no?

It makes sense to me to check the parameters left-to-right, so
I think verifying blkno after relation is fine. It's not like
there's value in optimizing the failure case.

This does, however, suggest that we ought to check the relkind
immediately after opening the rel, before the blkno check.
I'll adjust the patch that way and push.

regards, tom lane

#14Tender Wang
tndrwang@gmail.com
In reply to: Tom Lane (#13)
Re: BUG #18866: Running pg_freespace() on views triggers an Abort

Tom Lane <tgl@sss.pgh.pa.us> 于2025年3月28日周五 00:01写道:

Richard Guo <guofenglinux@gmail.com> writes:

Not related to this issue but I wonder why blkno is verified after
the relation is opened. It can be verified beforehand, no?

It makes sense to me to check the parameters left-to-right, so
I think verifying blkno after relation is fine. It's not like
there's value in optimizing the failure case.

This does, however, suggest that we ought to check the relkind
immediately after opening the rel, before the blkno check.
I'll adjust the patch that way and push.

Thanks for pushing.

--
Thanks,
Tender Wang