Thread: BUG #18866: Running pg_freespace() on views triggers an Abort
The following bug has been logged on the website: Bug reference: 18866 Logged by: Robins Tharakan Email address: tharakan@gmail.com PostgreSQL version: Unsupported/Unknown Operating system: Ubuntu Description: Hi, Passing a view to pg_freespace() triggers an Abort on HEAD. This has been so since the beginning (049ef3398d05c9dc8f48aa9a6d68440661cfeb87). Given that this is just an assert, feel free to skip - but thought I'd bring it up, in case this needs a review. SQL / Backtrace / Error Log excerpt given below: SQL === $ cat crashing.sql CREATE EXTENSION pg_freespacemap; SELECT pg_freespace('pg_roles', 0); SQL Output ========== $ psql -f crashing.sql CREATE EXTENSION psql:crashing.sql:2: server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. psql:crashing.sql:2: error: connection to server was lost Backtrace ========= #5 0x00005b073976f7e0 in ExceptionalCondition (conditionName=0x5b07399bb930 "RelFileNumberIsValid(rlocator.relNumber)", fileName=0x5b07399bb90c "smgr.c", lineNumber=228) at assert.c:66 No locals. #6 0x00005b0739556b62 in smgropen (rlocator=..., backend=-1) at smgr.c:228 brlocator = {locator = {spcOid = 1101590320, dbOid = 23303, relNumber = 1615569600}, backend = 31264} reln = 0x5b0741ab40a8 found = 65 #7 0x00005b07395163cb in RelationGetSmgr (rel=0x7a2060b99bf8) at ../../../../src/include/utils/rel.h:579 No locals. #8 0x00005b0739516dc2 in fsm_readbuf (rel=0x7a2060b99bf8, addr=..., extend=false) at freespace.c:558 blkno = 2 buf = -998073056 reln = 0x0 #9 0x00005b0739516705 in GetRecordedFreeSpace (rel=0x7a2060b99bf8, heapBlk=0) at freespace.c:254 addr = {level = 0, logpageno = 0} slot = 0 buf = 23303 cat = 22 '\026' #10 0x00007a2061ab630c in pg_freespace (fcinfo=0x5b0741b07928) at pg_freespacemap.c:38 relid = 12000 blkno = 0 freespace = -17281 rel = 0x7a2060b99bf8 __func__ = "pg_freespace" #11 0x00005b0739259dc4 in ExecInterpExpr (state=0x5b0741b077d0, econtext=0x5b0741b074b8, isnull=0x0) at execExprInterp.c:1001 d = 100086724982736 fcinfo = 0x5b0741b07928 args = 0x5b0741b07948 op = 0x5b0741b07b50 resultslot = 0x5b0741b07708 innerslot = 0x0 outerslot = 0x0 scanslot = 0x0 oldslot = 0x0 newslot = 0x0 Surfacing Commit ================ Checking (62f36d6924c~0) - 62f36d6924 - fail Checking (62f36d6924c~10) - 023fb51275 - fail Checking (62f36d6924c~30) - e215166c9c - fail . . . Checking (62f36d6924c~3560) - b31ba5310b - fail Checking (62f36d6924c~3565) - e5b8c4f68f - pass Checking (62f36d6924c~3562) - 049ef3398d - fail Checking (62f36d6924c~3563) - cd7f19da34 - pass Surfacing commit is 049ef3398d05c9dc8f48aa9a6d68440661cfeb87 Postgres Error Logs =================== TRAP: failed Assert("RelFileNumberIsValid(rlocator.relNumber)"), File: "smgr.c", Line: 228, PID: 2372149 postgres: smith: smith postgres 127.0.0.1(51026) SELECT(ExceptionalCondition+0xbb)[0x5b073976f7ad] postgres: smith: smith postgres 127.0.0.1(51026) SELECT(smgropen+0x5e)[0x5b0739556b62] postgres: smith: smith postgres 127.0.0.1(51026) SELECT(+0x6b83cb)[0x5b07395163cb] postgres: smith: smith postgres 127.0.0.1(51026) SELECT(+0x6b8dc2)[0x5b0739516dc2] postgres: smith: smith postgres 127.0.0.1(51026) SELECT(GetRecordedFreeSpace+0x50)[0x5b0739516705] /home/smith/proj/blamepg/lib/postgresql/pg_freespacemap.so(pg_freespace+0xc4)[0x7a2061ab730c] postgres: smith: smith postgres 127.0.0.1(51026) SELECT(+0x3fbdc4)[0x5b0739259dc4] postgres: smith: smith postgres 127.0.0.1(51026) SELECT(ExecInterpExprStillValid+0x5b)[0x5b073925c94a] postgres: smith: smith postgres 127.0.0.1(51026) SELECT(+0x461f27)[0x5b07392bff27] postgres: smith: smith postgres 127.0.0.1(51026) SELECT(+0x461fe5)[0x5b07392bffe5] postgres: smith: smith postgres 127.0.0.1(51026) SELECT(+0x462046)[0x5b07392c0046] postgres: smith: smith postgres 127.0.0.1(51026) SELECT(+0x462269)[0x5b07392c0269] postgres: smith: smith postgres 127.0.0.1(51026) SELECT(+0x418aba)[0x5b0739276aba] postgres: smith: smith postgres 127.0.0.1(51026) SELECT(+0x40aeaf)[0x5b0739268eaf] postgres: smith: smith postgres 127.0.0.1(51026) SELECT(+0x40df9e)[0x5b073926bf9e] postgres: smith: smith postgres 127.0.0.1(51026) SELECT(standard_ExecutorRun+0x1cc)[0x5b07392696eb] postgres: smith: smith postgres 127.0.0.1(51026) SELECT(ExecutorRun+0x54)[0x5b073926951c] postgres: smith: smith postgres 127.0.0.1(51026) SELECT(+0x707af2)[0x5b0739565af2] postgres: smith: smith postgres 127.0.0.1(51026) SELECT(PortalRun+0x274)[0x5b0739565720] postgres: smith: smith postgres 127.0.0.1(51026) SELECT(+0x6ffc69)[0x5b073955dc69] postgres: smith: smith postgres 127.0.0.1(51026) SELECT(PostgresMain+0xb8d)[0x5b0739563461] postgres: smith: smith postgres 127.0.0.1(51026) SELECT(+0x6fafe7)[0x5b0739558fe7] postgres: smith: smith postgres 127.0.0.1(51026) SELECT(postmaster_child_launch+0x174)[0x5b073945cac4] postgres: smith: smith postgres 127.0.0.1(51026) SELECT(+0x605664)[0x5b0739463664] postgres: smith: smith postgres 127.0.0.1(51026) SELECT(+0x602a7e)[0x5b0739460a7e] postgres: smith: smith postgres 127.0.0.1(51026) SELECT(PostmasterMain+0x1546)[0x5b0739460374] postgres: smith: smith postgres 127.0.0.1(51026) SELECT(main+0x38c)[0x5b07392ff7db] /lib/x86_64-linux-gnu/libc.so.6(+0x2a1ca)[0x7a206042a1ca] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x8b)[0x7a206042a28b] postgres: smith: smith postgres 127.0.0.1(51026) SELECT(_start+0x25)[0x5b0738f43eb5] 2025-03-25 21:32:25.437 ACDT [2372118] LOG: client backend (PID 2372149) was terminated by signal 6: Aborted 2025-03-25 21:32:25.437 ACDT [2372118] DETAIL: Failed process was running: SELECT public.pg_freespace('pg_roles', 0); 2025-03-25 21:32:25.437 ACDT [2372118] LOG: terminating any other active server processes Thanks to SQLSmith / SQLReduce. - Robins
PG Bug reporting form <noreply@postgresql.org> 于2025年3月25日周二 22:42写道:
The following bug has been logged on the website:
Bug reference: 18866
Logged by: Robins Tharakan
Email address: tharakan@gmail.com
PostgreSQL version: Unsupported/Unknown
Operating system: Ubuntu
Description:
Hi,
Passing a view to pg_freespace() triggers an Abort on HEAD. This has been so
since the beginning (049ef3398d05c9dc8f48aa9a6d68440661cfeb87). Given that
this is just an assert, feel free to skip - but thought I'd bring it up, in
case this needs a review.
SQL / Backtrace / Error Log excerpt given below:
SQL
===
$ cat crashing.sql
CREATE EXTENSION pg_freespacemap;
SELECT pg_freespace('pg_roles', 0);
Yeah, it would crash if you input a foreign table, for example:
create extension postgres_fdw;
CREATE SERVER d FOREIGN DATA WRAPPER postgres_fdw;
CREATE FOREIGN TABLE f (g text) SERVER d;
SELECT pg_freespace('f', 0); -- will crash too
I think we can remove the Assert in smgropen().
Any thoughts?
Thanks,
Tender Wang
Attachment
Tender Wang <tndrwang@gmail.com> writes: > PG Bug reporting form <noreply@postgresql.org> 于2025年3月25日周二 22:42写道: >> Passing a view to pg_freespace() triggers an Abort on HEAD. > I think we can remove the Assert in smgropen(). > Any thoughts? That seems like a "solution" that will allow other bugs. pg_freespace should be fixed to verify RELKIND_HAS_STORAGE() before it tries to access said storage. regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> 于2025年3月26日周三 00:37写道:
Tender Wang <tndrwang@gmail.com> writes:
> PG Bug reporting form <noreply@postgresql.org> 于2025年3月25日周二 22:42写道:
>> Passing a view to pg_freespace() triggers an Abort on HEAD.
> I think we can remove the Assert in smgropen().
> Any thoughts?
That seems like a "solution" that will allow other bugs. pg_freespace
should be fixed to verify RELKIND_HAS_STORAGE() before it tries to
access said storage.
Thanks for the advice. Please see the attached patch.
Thanks,
Tender Wang
Attachment
On Tue, Mar 25, 2025, at 10:18 PM, Tender Wang wrote:
Thanks for the advice. Please see the attached patch.
Your patch needs some adjustments. There is no need to include pg_class.h. I
don't like the proposed error message. I prefer saying the relation cannot be
opened because that's what will happen if it reaches this code path. This is an
existing message so no need for translation. The output will be like:
postgres=# SELECT pg_freespace('f', 0);
ERROR: cannot open relation "f"
DETAIL: This operation is not supported for foreign tables.
postgres=# SELECT pg_freespace('pg_roles', 0);
ERROR: cannot open relation "pg_roles"
DETAIL: This operation is not supported for views.
postgres=# SELECT pg_freespace('pg_class', 0);
pg_freespace
--------------
7520
(1 row)
I'm attaching a patch that includes these modifications.
Attachment
"Euler Taveira" <euler@eulerto.com> writes: > Your patch needs some adjustments. There is no need to include pg_class.h. I > don't like the proposed error message. I prefer saying the relation cannot be > opened because that's what will happen if it reaches this code path. I don't care for that proposal either: we just did open the relation ;-) Looking at other places where we throw an error for !RELKIND_HAS_STORAGE: rawpage.c: (errcode(ERRCODE_WRONG_OBJECT_TYPE), errmsg("cannot get raw page from relation \"%s\"", pgstatindex.c: (errcode(ERRCODE_WRONG_OBJECT_TYPE), errmsg("cannot get page count of relation \"%s\"", tid.c: errcode(ERRCODE_FEATURE_NOT_SUPPORTED), errmsg("cannot look at latest visible tid for relation \"%s.%s\"", relcache.c: elog(ERROR, "relation \"%s\" does not have storage", RelationGetRelationName(relation)); So the previous proposal was evidently modeled on the first two of these precedents. Personally I prefer messages that say *why* something failed, so I'd go with something more like "relation \"%s\" does not have storage". Use of errdetail_relkind_not_supported is fine though. regards, tom lane
On Wed, Mar 26, 2025, at 1:00 PM, Tom Lane wrote:
"Euler Taveira" <euler@eulerto.com> writes:> Your patch needs some adjustments. There is no need to include pg_class.h. I> don't like the proposed error message. I prefer saying the relation cannot be> opened because that's what will happen if it reaches this code path.I don't care for that proposal either: we just did open the relation ;-)
Fair point.
relcache.c:elog(ERROR, "relation \"%s\" does not have storage",RelationGetRelationName(relation));So the previous proposal was evidently modeled on the first two ofthese precedents. Personally I prefer messages that say *why*something failed, so I'd go with something more like "relation \"%s\"does not have storage". Use of errdetail_relkind_not_supported isfine though.
I thought about saying "no storage" but don't know why was convinced by that
proposal. Your suggestion works for me.
On 2025-Mar-25, Euler Taveira wrote: > On Tue, Mar 25, 2025, at 10:18 PM, Tender Wang wrote: > > Thanks for the advice. Please see the attached patch. > > Your patch needs some adjustments. There is no need to include pg_class.h. I think including pg_class.h (which is where RELKIND_HAS_STORAGE() is defined and where errdetail_relkind_not_supported is declared) is better than including utils/rel.h, which brings in a whole bunch of additional stuff in addition to pg_class.h :-) -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ "Thou shalt not follow the NULL pointer, for chaos and madness await thee at its end." (2nd Commandment for C programmers)
On Wed, Mar 26, 2025, at 2:33 PM, Álvaro Herrera wrote:
On 2025-Mar-25, Euler Taveira wrote:> On Tue, Mar 25, 2025, at 10:18 PM, Tender Wang wrote:> > Thanks for the advice. Please see the attached patch.>> Your patch needs some adjustments. There is no need to include pg_class.h.I think including pg_class.h (which is where RELKIND_HAS_STORAGE() isdefined and where errdetail_relkind_not_supported is declared) is betterthan including utils/rel.h, which brings in a whole bunch of additionalstuff in addition to pg_class.h :-)
It doesn't work because of RelationGetRelationName() that is defined into
utils/rel.h.
On 2025-Mar-26, Euler Taveira wrote: > It doesn't work because of RelationGetRelationName() that is defined into > utils/rel.h. Doh, of course. -- Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/ "I can't go to a restaurant and order food because I keep looking at the fonts on the menu. Five minutes later I realize that it's also talking about food" (Donald Knuth)
Tom Lane <tgl@sss.pgh.pa.us> 于2025年3月27日周四 00:01写道:
"Euler Taveira" <euler@eulerto.com> writes:
> Your patch needs some adjustments. There is no need to include pg_class.h. I
> don't like the proposed error message. I prefer saying the relation cannot be
> opened because that's what will happen if it reaches this code path.
I don't care for that proposal either: we just did open the relation ;-)
Looking at other places where we throw an error for !RELKIND_HAS_STORAGE:
rawpage.c:
(errcode(ERRCODE_WRONG_OBJECT_TYPE),
errmsg("cannot get raw page from relation \"%s\"",
pgstatindex.c:
(errcode(ERRCODE_WRONG_OBJECT_TYPE),
errmsg("cannot get page count of relation \"%s\"",
tid.c:
errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("cannot look at latest visible tid for relation \"%s.%s\"",
relcache.c:
elog(ERROR, "relation \"%s\" does not have storage",
RelationGetRelationName(relation));
So the previous proposal was evidently modeled on the first two of
these precedents.
Yes, I referred to the rawpage.c.
Personally I prefer messages that say *why*
something failed, so I'd go with something more like "relation \"%s\"
does not have storage". Use of errdetail_relkind_not_supported is
fine though.
PFA for the updated patch file.
Thanks,
Tender Wang
Attachment
On Thu, Mar 27, 2025 at 4:14 PM Tender Wang <tndrwang@gmail.com> wrote: > PFA for the updated patch file. Not related to this issue but I wonder why blkno is verified after the relation is opened. It can be verified beforehand, no? Thanks Richard
Richard Guo <guofenglinux@gmail.com> writes: > Not related to this issue but I wonder why blkno is verified after > the relation is opened. It can be verified beforehand, no? It makes sense to me to check the parameters left-to-right, so I think verifying blkno after relation is fine. It's not like there's value in optimizing the failure case. This does, however, suggest that we ought to check the relkind immediately after opening the rel, before the blkno check. I'll adjust the patch that way and push. regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> 于2025年3月28日周五 00:01写道:
Richard Guo <guofenglinux@gmail.com> writes:
> Not related to this issue but I wonder why blkno is verified after
> the relation is opened. It can be verified beforehand, no?
It makes sense to me to check the parameters left-to-right, so
I think verifying blkno after relation is fine. It's not like
there's value in optimizing the failure case.
This does, however, suggest that we ought to check the relkind
immediately after opening the rel, before the blkno check.
I'll adjust the patch that way and push.
Thanks for pushing.
Thanks,
Tender Wang