Re: fix for BUG #3720: wrong results at using ltree - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: fix for BUG #3720: wrong results at using ltree
Date
Msg-id 20200124182915.zz7hzdkijzsb4ic4@development
Whole thread Raw
In response to Re: fix for BUG #3720: wrong results at using ltree  (Alexander Korotkov <a.korotkov@postgrespro.ru>)
Responses Re: fix for BUG #3720: wrong results at using ltree
Re: fix for BUG #3720: wrong results at using ltree
List pgsql-hackers
Hi Nikita,

This patch seems inactive / stuck in "waiting on author" since November.
It's marked as bugfix, so it'd be good to get it committed instead of
just punting it to the next CF.

I did a quick review, and I came mostly with the same two complaints as
Alexander ...

On Wed, Jul 17, 2019 at 09:33:46PM +0300, Alexander Korotkov wrote:
>Hi Nikita,
>
>On Tue, Jul 16, 2019 at 6:52 PM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:
>> I looked at "ltree syntax improvement" patch and found two more very
>> old bugs in ltree/lquery (fixes are attached):
>
>Thank you for the fixes.  I've couple notes on them.
>
>0001-Fix-max-size-checking-for-ltree-and-lquery.patch
>
>+#define LTREE_MAX_LEVELS Min(PG_UINT16_MAX, MaxAllocSize / sizeof(nodeitem))
>+#define LQUERY_MAX_LEVELS Min(PG_UINT16_MAX, MaxAllocSize / ITEMSIZE)
>
>Looks over caution.  PG_UINT16_MAX is not even close to MaxAllocSize /
>sizeof(nodeitem) or MaxAllocSize / ITEMSIZE.
>

Yeah, I'm also puzzled by the usage of PG_UINT16_MAX here. It's so much
lower than the other values we could jut use the constant directly, but
let's say the structs could grow from the ~16B to chnge this.

The main question is why we need PG_UINT16_MAX at all? It kinda implies
we need to squish the value into a 2B counter or something, but is that
actually true? I don't see anything like that in ltree_io.c.

So it seems more like an arbitrary value considered "sane" - which is
fine, but then a comment saying so would be nice, and we could pick a
value that is "nicer" for humans. Or just use value computed from the
MaxAllocSize limit, without the Min().

>0002-Fix-successive-lquery-ops.patch
>
>diff --git a/contrib/ltree/lquery_op.c b/contrib/ltree/lquery_op.c
>index 62172d5..d4f4941 100644
>--- a/contrib/ltree/lquery_op.c
>+++ b/contrib/ltree/lquery_op.c
>@@ -255,8 +255,8 @@ checkCond(lquery_level *curq, int query_numlevel,
>ltree_level *curt, int tree_nu
>  }
>  else
>  {
>- low_pos = cur_tpos + curq->low;
>- high_pos = cur_tpos + curq->high;
>+ low_pos = Min(low_pos + curq->low, PG_UINT16_MAX);
>+ high_pos = Min(high_pos + curq->high, PG_UINT16_MAX);
>  if (ptr && ptr->q)
>  {
>  ptr->nq++;
>@@ -282,8 +282,8 @@ checkCond(lquery_level *curq, int query_numlevel,
>ltree_level *curt, int tree_nu
>  }
>  else
>  {
>- low_pos = cur_tpos + curq->low;
>- high_pos = cur_tpos + curq->high;
>+ low_pos = Min(low_pos + curq->low, PG_UINT16_MAX);
>+ high_pos = Min(high_pos + curq->high, PG_UINT16_MAX);
>  }
>
>  curq = LQL_NEXT(curq);
>
>I'm not sure what do these checks do.  Code around is uncommented and
>puzzled.  But could we guarantee the same invariant on the stage of
>ltree/lquery parsing?
>

Unfortunately, the current code is somewhat undercommented :-(

Anyway, I don't quite understand why we need these caps. It kinda seems
like a band-aid for potential overflow.

Why should it be OK for the values to even get past the maximum, with
sane input data? And isn't there a better upper limit (e.g. based on
how much space we actually allocated)?


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



pgsql-hackers by date:

Previous
From: Mark Dilger
Date:
Subject: Re: making the backend's json parser work in frontend code
Next
From: Robert Haas
Date:
Subject: Re: making the backend's json parser work in frontend code