Thread: Avoid possible dereference null pointer (src/backend/utils/cache/relcache.c)

Hi.

The function *ScanPgRelation* can return a invalid tuple.
It is necessary to check the function's return, 
as is already done in other parts of the source.

patch attached.

Best regards,
Ranier Vilela

Attachment
Hi,

> The function *ScanPgRelation* can return a invalid tuple.
> It is necessary to check the function's return,
> as is already done in other parts of the source.

Good find although I'm not certain if I agree with the patch.

We would be in serious trouble if RelationReloadNailed() could be
called for a non-existing relation. Wouldn't Assert() be more
appropriate?

-- 
Best regards,
Aleksander Alekseev



Hi,

>> We would be in serious trouble if RelationReloadNailed() could be
>> called for a non-existing relation. Wouldn't Assert() be more
>> appropriate?
>
> IMO, I think no.
>
> In all paths, this check is done, why would this be the only exception?

We use Asserts() for cases that shouldn't happen in practice for
performance reasons. Since this code doesn't crash I suspect this is
one of such cases. Unless you are aware of a specific scenario that
makes the code crash of course.

-- 
Best regards,
Aleksander Alekseev



On 6/17/25 15:09, Aleksander Alekseev wrote:
> Hi,
> 
>>> We would be in serious trouble if RelationReloadNailed() could be
>>> called for a non-existing relation. Wouldn't Assert() be more
>>> appropriate?
>>
>> IMO, I think no.
>>
>> In all paths, this check is done, why would this be the only exception?
> 
> We use Asserts() for cases that shouldn't happen in practice for
> performance reasons. Since this code doesn't crash I suspect this is
> one of such cases. Unless you are aware of a specific scenario that
> makes the code crash of course.
> 

Right. Assert() are for checks that we don't expect to ever fail, so we
don't want to keep them in production builds. I believe this is exactly
such case, because the code is about "nailed" catalogs, that we need to
access very early (e.g. before being able to access pg_class).

There's only ~10 of such catalogs (look for formrdesc calls), and if we
get a failure here, it's not clear to me we can really keep the system
running anyway. It's not just the usual "relcache miss" and if the user
retries it will probably work fine. The catalog is borked, and who knows
in what way.

My opinion is that adding a "elog(ERROR)" here would be misleading, as
it implies it's something we expect. And mostly pointless. I can imagine
adding an Assert, but I don't quite see how is that better than just
hitting a segfault a couple lines later.


regards

-- 
Tomas Vondra





Em ter., 17 de jun. de 2025 às 10:43, Tomas Vondra <tomas@vondra.me> escreveu:
On 6/17/25 15:09, Aleksander Alekseev wrote:
> Hi,
>
>>> We would be in serious trouble if RelationReloadNailed() could be
>>> called for a non-existing relation. Wouldn't Assert() be more
>>> appropriate?
>>
>> IMO, I think no.
>>
>> In all paths, this check is done, why would this be the only exception?
>
> We use Asserts() for cases that shouldn't happen in practice for
> performance reasons. Since this code doesn't crash I suspect this is
> one of such cases. Unless you are aware of a specific scenario that
> makes the code crash of course.
>

Right. Assert() are for checks that we don't expect to ever fail, so we
don't want to keep them in production builds. I believe this is exactly
such case, because the code is about "nailed" catalogs, that we need to
access very early (e.g. before being able to access pg_class).

There's only ~10 of such catalogs (look for formrdesc calls), and if we
get a failure here, it's not clear to me we can really keep the system
running anyway. It's not just the usual "relcache miss" and if the user
retries it will probably work fine. The catalog is borked, and who knows
in what way.

My opinion is that adding a "elog(ERROR)" here would be misleading, as
it implies it's something we expect. And mostly pointless. I can imagine
adding an Assert, but I don't quite see how is that better than just
hitting a segfault a couple lines later.
To me this is a contradiction, whether you consider waiting for a segfault or consider adding an Assert.
For the user it is better to have a log, where he can quickly find the problem, rather than having to investigate on his own.

best regards,
Ranier Vilela
Hi Ranier,

> To me this is a contradiction, whether you consider waiting for a segfault or consider adding an Assert.
> For the user it is better to have a log, where he can quickly find the problem, rather than having to investigate on
hisown.
 

That's the point. User's shouldn't encounter this. Thus there is no
reason to break branch prediction for everyone here.

I agree with Tomas that adding Assert() in fact will not change much.
It can serve as a little piece of documentation maybe: "yes we thought
about it", that's all.

-- 
Best regards,
Aleksander Alekseev



Em ter., 17 de jun. de 2025 às 11:10, Aleksander Alekseev <aleksander@timescale.com> escreveu:
Hi Ranier,

> To me this is a contradiction, whether you consider waiting for a segfault or consider adding an Assert.
> For the user it is better to have a log, where he can quickly find the problem, rather than having to investigate on his own.

That's the point. User's shouldn't encounter this.
It shouldn't.
 
Thus there is no
reason to break branch prediction for everyone here.
For now there is no consensus that there will not be a segfault.
If  a segfault will never occur, ok.
IMO, this is not a query-dependent issue.
But the table in question is never corrupted or invalid.

best regards,
Ranier Vilela
> On 17 Jun 2025, at 15:43, Tomas Vondra <tomas@vondra.me> wrote:

> .. if we
> get a failure here, it's not clear to me we can really keep the system
> running anyway. It's not just the usual "relcache miss" and if the user
> retries it will probably work fine. The catalog is borked, and who knows
> in what way.

Agreed.

> My opinion is that adding a "elog(ERROR)" here would be misleading, as
> it implies it's something we expect. And mostly pointless. I can imagine
> adding an Assert, but I don't quite see how is that better than just
> hitting a segfault a couple lines later.

If I break something here while hacking I'd probably prefer a segfault to an
Assert.

--
Daniel Gustafsson