Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements - Mailing list pgsql-hackers

From Mihail Nikalayeu
Subject Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements
Date
Msg-id CADzfLwUrodAcOggK+3j3LbPLaSXemgHxa-n=LhZTwRAsaakL2g@mail.gmail.com
Whole thread Raw
In response to Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements  (Sergey Sargsyan <sergey.sargsyan.2001@gmail.com>)
List pgsql-hackers
Hello, Sergey!

> Today I encountered a segmentation fault caused by the patch v20-0007-Add-Datum-storage-support-to-tuplestore.patch.
Duringthe merge phase, I inserted some tuples into the table so that STIR would have data for the validation phase. The
segfaultoccurred during a call to tuplestore_end(). 
>
> The root cause is that a few functions in the tuplestore code still assume that all stored data is a pointer and thus
attemptto pfree it. This assumption breaks when datumByVal is used, as the data is stored directly and not as a
pointer.In particular, tuplestore_end(), tuplestore_trim(), and tuplestore_clear() incorrectly try to free such values. 
>
> When addressing this, please also ensure that context memory accounting is handled properly: we should not increment
ordecrement the remaining context memory size when cleaning or trimming datumByVal entries, since no actual memory was
allocatedfor them. 
>
> Interestingly, I’m surprised you haven’t hit this segfault yourself. Are you perhaps testing on an older system where
INT8OIDis passed by reference? Or is your STIR always empty during the validation phase? 

Thanks for pointing that out. It looks like tuplestore_trim and
tuplestore_clear are broken, while tuplestore_end seems to be correct
but fails due to previous heap corruption.
In my case, tuplestore_trim and tuplestore_clear aren't called at all
- that's why the issue wasn't detected. I'm not sure why; perhaps some
recent changes in your codebase are affecting that?

Please run a stress test (if you've already applied the in-place fix
for the tuplestore):
         ninja && meson test --suite setup && meson test
--print-errorlogs --suite pg_amcheck *006*

This will help ensure everything else is working correctly on your system.

> One more point: I noticed you modified the index_create() function signature. You added the relpersistence parameter,
whichseems unnecessary— 
> this can be determined internally by checking if it’s an auxiliary index, in which case the index should be marked as
unlogged.You also added an 
> auxiliaryIndexOfOid argument (do not remember exact naming, but was used for dependency). It might be cleaner to pass
thisvia the IndexInfo structure 
> instead. index_create() already has dozens of mouthful arguments, and external extensions
> (like pg_squeeze) still rely on the old signature, so minimizing changes to the function interface would improve
compatibility.

Yes, that’s probably a good idea. I was trying to keep it simple from
the perspective of parameters to avoid dealing with some of the tricky
internal logic.
But you're right - it’s better to stick with the old signature.

Best regards,
Mikhail.



pgsql-hackers by date:

Previous
From: Peter Eisentraut
Date:
Subject: Re: pg18: Virtual generated columns are not (yet) safe when superuser selects from them
Next
From: Thomas Munro
Date:
Subject: Re: Non-reproducible AIO failure