Thread: BUG #14928: Unchecked SearchSysCacheCopy1() return value

BUG #14928: Unchecked SearchSysCacheCopy1() return value

From
bianpan2016@163.com
Date:
The following bug has been logged on the website:

Bug reference:      14928
Logged by:          Pan Bian
Email address:      bianpan2016@163.com
PostgreSQL version: 10.1
Operating system:   Linux
Description:

File: postgresql-10.1/src/backend/commands/tablecmds.c
Function: ATExecDetachPartition
Line: 13816

Function SearchSysCacheCopy1() may return a NULL pointer if there is no
enough memory. But in function ATExecDetachPartition(), its return value is
not checked, which may result in NULL dereference (see line 13818).

For your convenience, I copy and paste related codes as follows.

13815     classRel = heap_open(RelationRelationId, RowExclusiveLock);
13816     tuple = SearchSysCacheCopy1(RELOID,
13817
ObjectIdGetDatum(RelationGetRelid(partRel)));
13818     Assert(((Form_pg_class) GETSTRUCT(tuple))->relispartition);
13819 
13820     (void) SysCacheGetAttr(RELOID, tuple,
Anum_pg_class_relpartbound,
13821                            &isnull);
13822     Assert(!isnull);

Thank you!

Pan Bian


Re: BUG #14928: Unchecked SearchSysCacheCopy1() return value

From
Amit Langote
Date:
On 2017/11/27 18:13, bianpan2016@163.com wrote:
> The following bug has been logged on the website:
> 
> Bug reference:      14928
> Logged by:          Pan Bian
> Email address:      bianpan2016@163.com
> PostgreSQL version: 10.1
> Operating system:   Linux
> Description:        
> 
> File: postgresql-10.1/src/backend/commands/tablecmds.c
> Function: ATExecDetachPartition
> Line: 13816
> 
> Function SearchSysCacheCopy1() may return a NULL pointer if there is no
> enough memory. But in function ATExecDetachPartition(), its return value is
> not checked, which may result in NULL dereference (see line 13818).
> 
> For your convenience, I copy and paste related codes as follows.
> 
> 13815     classRel = heap_open(RelationRelationId, RowExclusiveLock);
> 13816     tuple = SearchSysCacheCopy1(RELOID,
> 13817                                
> ObjectIdGetDatum(RelationGetRelid(partRel)));
> 13818     Assert(((Form_pg_class) GETSTRUCT(tuple))->relispartition);
> 13819 
> 13820     (void) SysCacheGetAttr(RELOID, tuple,
> Anum_pg_class_relpartbound,
> 13821                            &isnull);
> 13822     Assert(!isnull);

Thanks for the report.  Attached a patch that adds a check that tuple is
valid before trying to dereference it.

Thanks,
Amit


Attachment

Re: BUG #14928: Unchecked SearchSysCacheCopy1() return value

From
PanBian
Date:
On Mon, Nov 27, 2017 at 07:21:32PM +0900, Amit Langote wrote:
> On 2017/11/27 18:13, bianpan2016@163.com wrote:
> > The following bug has been logged on the website:
> > 
> > Bug reference:      14928
> > Logged by:          Pan Bian
> > Email address:      bianpan2016@163.com
> > PostgreSQL version: 10.1
> > Operating system:   Linux
> > Description:        
> > 
> 
> Thanks for the report.  Attached a patch that adds a check that tuple is
> valid before trying to dereference it.
> 
> Thanks,
> Amit
> 

Got it.

Thanks a lot,
Pan Bian

> diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
> index d19846d005..8cd6c65111 100644
> --- a/src/backend/commands/tablecmds.c
> +++ b/src/backend/commands/tablecmds.c
> @@ -14111,6 +14111,9 @@ ATExecDetachPartition(Relation rel, RangeVar *name)
>      classRel = heap_open(RelationRelationId, RowExclusiveLock);
>      tuple = SearchSysCacheCopy1(RELOID,
>                                  ObjectIdGetDatum(RelationGetRelid(partRel)));
> +    if (!HeapTupleIsValid(tuple))
> +        elog(ERROR, "cache lookup failed for relation %u",
> +                    RelationGetRelid(partRel));
>      Assert(((Form_pg_class) GETSTRUCT(tuple))->relispartition);
>  
>      (void) SysCacheGetAttr(RELOID, tuple, Anum_pg_class_relpartbound,




Re: BUG #14928: Unchecked SearchSysCacheCopy1() return value

From
Tom Lane
Date:
Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes:
> On 2017/11/27 18:13, bianpan2016@163.com wrote:
>> Function SearchSysCacheCopy1() may return a NULL pointer if there is no
>> enough memory. But in function ATExecDetachPartition(), its return value is
>> not checked, which may result in NULL dereference (see line 13818).

> Thanks for the report.  Attached a patch that adds a check that tuple is
> valid before trying to dereference it.

Pushed.  Checking other SearchSysCache calls in these two files, I noted
a third instance of the same problem in StorePartitionKey().  However,
looking closer, StorePartitionKey never does anything at all with the
result of its SearchSysCache1(PARTRELID, ...) lookup, including never
releasing the syscache reference.  How is it that we don't get refcount
leak warnings?  I guess that must prove that that lookup always fails,
which is not too surprising since it seems to be against the partition
key info that we haven't stored yet.  Anyway, I just diked that one
out, since it's clearly useless.
        regards, tom lane


Re: BUG #14928: Unchecked SearchSysCacheCopy1() return value

From
Amit Langote
Date:
On 2017/11/28 9:25, Tom Lane wrote:
> Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes:
>> On 2017/11/27 18:13, bianpan2016@163.com wrote:
>>> Function SearchSysCacheCopy1() may return a NULL pointer if there is no
>>> enough memory. But in function ATExecDetachPartition(), its return value is
>>> not checked, which may result in NULL dereference (see line 13818).
> 
>> Thanks for the report.  Attached a patch that adds a check that tuple is
>> valid before trying to dereference it.
> 
> Pushed.

Thank you.

> Checking other SearchSysCache calls in these two files, I noted
> a third instance of the same problem in StorePartitionKey().  However,
> looking closer, StorePartitionKey never does anything at all with the
> result of its SearchSysCache1(PARTRELID, ...) lookup, including never
> releasing the syscache reference.  How is it that we don't get refcount
> leak warnings?  I guess that must prove that that lookup always fails,
> which is not too surprising since it seems to be against the partition
> key info that we haven't stored yet.  Anyway, I just diked that one
> out, since it's clearly useless.

Thank you, too.  I guess that SearchSysCache was from one of the earliest
versions of the patch, whereby we'd error out if the tuple we got out was
valid; that is, error for trying to set the partition of key of a table
that already had one.

Thanks,
Amit