Re: pg15b1: FailedAssertion("val > base", File: "...src/include/utils/relptr.h", Line: 67, PID: 30485) - Mailing list pgsql-hackers

From Kyotaro Horiguchi
Subject Re: pg15b1: FailedAssertion("val > base", File: "...src/include/utils/relptr.h", Line: 67, PID: 30485)
Date
Msg-id 20220601.114201.1536897005260401301.horikyota.ntt@gmail.com
Whole thread Raw
In response to Re: pg15b1: FailedAssertion("val > base", File: "...src/include/utils/relptr.h", Line: 67, PID: 30485)  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: pg15b1: FailedAssertion("val > base", File: "...src/include/utils/relptr.h", Line: 67, PID: 30485)
List pgsql-hackers
At Tue, 31 May 2022 16:10:05 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote in 
tgl> Robert Haas <robertmhaas@gmail.com> writes:
tgl> > Yeah, so when I created this stuff in the first place, I figured that
tgl> > it wasn't a problem if we reserved relptr == 0 to mean a NULL pointer,
tgl> > because you would never have a relative pointer pointing to the
tgl> > beginning of a DSM, because it would probably always start with a
tgl> > dsm_toc. But when Thomas made it so that DSM allocations could happen
tgl> > in the main shared memory segment, that ceased to be true. This
tgl> > example happened not to break because we never use relptr_access() on
tgl> > fpm->self. We do use fpm_segment_base(), but that accidentally fails
tgl> > to break, because instead of using relptr_access() it drills right
tgl> > through the abstraction and doesn't have any kind of special case for
tgl> > 0.
tgl> 
tgl> Seems like that in itself is a a lousy idea.  Either the code should
tgl> respect the abstraction, or it shouldn't be declaring the variable
tgl> as a relptr in the first place.
tgl> 
tgl> > So we can fix this by:
tgl> > 1. Using a relative pointer value other than 0 to represent a null
tgl> > pointer. Andres suggested (Size) -1.
tgl> > 2. Not storing the free page manager for the DSM in the main shared
tgl> > memory segment at byte offset 0.
tgl> > 3. Dropping the assertion while loudly singing "la la la la la la".
tgl> 
tgl> I'm definitely down on #3, because that just leaves the ambiguity
tgl> in place to bite somewhere else in future.  #1 would work as long
tgl> as nobody expects memset-to-zero to produce null relptrs, but that
tgl> doesn't seem very nice either.
tgl> 
tgl> On the whole, wasting MAXALIGN worth of memory seems like the least bad
tgl> alternative, but I wonder if we ought to do it right here as opposed
tgl> to somewhere in the DSM code proper.  Why is this DSM space not like
tgl> other DSM spaces in starting with a TOC?
tgl> 
tgl>             regards, tom lane



pgsql-hackers by date:

Previous
From: "osumi.takamichi@fujitsu.com"
Date:
Subject: RE: Build-farm - intermittent error in 031_column_list.pl
Next
From: Kyotaro Horiguchi
Date:
Subject: Re: pg15b1: FailedAssertion("val > base", File: "...src/include/utils/relptr.h", Line: 67, PID: 30485)