Thread: jsonb crash

jsonb crash

From
Jaime Casanova
Date:
Hi,

I found a crash (segmentation fault) on jsonb.
This is the best I could do to reduce the query:

"""
select  
  75 as c1
from 
  public.pagg_tab_ml as ref_0,
  lateral (select  
        ref_0.a as c5 
      from generate_series(1, 300) as sample_0
      fetch first 78 rows only
      ) as subq_0
where case when (subq_0.c5 < 2) 
           then cast(null as jsonb) 
       else cast(null as jsonb) 
      end ? ref_0.c
"""

And because it needs pagg_tab_ml it should be run a regression database.
This affects at least 14 and 15.

Attached is the backtrace.

-- 
Jaime Casanova
Director de Servicios Profesionales
SystemGuards - Consultores de PostgreSQL

Attachment

Re: jsonb crash

From
Ranier Vilela
Date:
Em qua., 29 de set. de 2021 às 15:55, Jaime Casanova <jcasanov@systemguards.com.ec> escreveu:
Hi,

I found a crash (segmentation fault) on jsonb.
This is the best I could do to reduce the query:

"""
select 
  75 as c1
from
  public.pagg_tab_ml as ref_0,
  lateral (select 
        ref_0.a as c5
      from generate_series(1, 300) as sample_0
      fetch first 78 rows only
      ) as subq_0
where case when (subq_0.c5 < 2)
           then cast(null as jsonb)
           else cast(null as jsonb)
      end ? ref_0.c
"""

And because it needs pagg_tab_ml it should be run a regression database.
This affects at least 14 and 15.

Attached is the backtrace.
Yeah, Coverity has a report about this at function:

JsonbValue *
pushJsonbValue(JsonbParseState **pstateJsonbIteratorToken seq,
                           JsonbValue *jbval)

1. CID undefined: Dereference after null check (FORWARD_NULL)
return pushJsonbValueScalar(pstateseqjbval);

2. CID undefined (#1 of 1): Dereference after null check (FORWARD_NULL)16. var_deref_model: 
Passing pstate to pushJsonbValueScalar, which dereferences null *pstate

res = pushJsonbValueScalar(pstatetok,
                                                                   tok < WJB_BEGIN_ARRAY ||
                                                                   (tok == WJB_BEGIN_ARRAY &&
                                                                        v.val.array.rawScalar) ? &v : NULL);

regards,
Ranier Vilela

Re: jsonb crash

From
Tom Lane
Date:
Jaime Casanova <jcasanov@systemguards.com.ec> writes:
> I found a crash (segmentation fault) on jsonb.
> This is the best I could do to reduce the query:

> """
> select  
>   75 as c1
> from 
>   public.pagg_tab_ml as ref_0,
>   lateral (select  
>         ref_0.a as c5 
>       from generate_series(1, 300) as sample_0
>       fetch first 78 rows only
>       ) as subq_0
> where case when (subq_0.c5 < 2) 
>            then cast(null as jsonb) 
>        else cast(null as jsonb) 
>       end ? ref_0.c
> """

I think this must be a memoize bug.  AFAICS, nowhere in this query
can we be processing a non-null JSONB value, so what are we doing
in jsonb_hash?  Something down-stack must have lost the information
that the Datum is actually null.

            regards, tom lane



Re: jsonb crash

From
Tom Lane
Date:
I wrote:
> I think this must be a memoize bug.  AFAICS, nowhere in this query
> can we be processing a non-null JSONB value, so what are we doing
> in jsonb_hash?  Something down-stack must have lost the information
> that the Datum is actually null.

After further inspection, "what are we doing in jsonb_hash?" is
indeed a relevant question, but it seems like it's a type mismatch
not a nullness issue.  EXPLAIN VERBOSE shows

   ->  Memoize  (cost=0.01..1.96 rows=1 width=4)
         Output: subq_0.c5
         Cache Key: ref_0.c, ref_0.a
         ->  Subquery Scan on subq_0  (cost=0.00..1.95 rows=1 width=4)
               Output: subq_0.c5
               Filter: (CASE WHEN (subq_0.c5 < 2) THEN NULL::jsonb ELSE NULL::jsonb END ? ref_0.c)
               ->  Limit  (cost=0.00..0.78 rows=78 width=4)
                     Output: (ref_0.a)
                     ->  Function Scan on pg_catalog.generate_series sample_0  (cost=0.00..3.00 rows=300 width=4)
                           Output: ref_0.a
                           Function Call: generate_series(1, 300)

so unless the "Cache Key" output is a complete lie, the cache key
types we should be concerned with are text and integer.  The Datum
that's being passed to jsonb_hash looks suspiciously like it is a
text value '0000', too, which matches the "c" value from the first
row of pagg_tab_ml.  I now think some part of Memoize is looking in
completely the wrong place to discover the cache key datatypes.

            regards, tom lane



Re: jsonb crash

From
Andrew Dunstan
Date:
On 9/29/21 4:00 PM, Tom Lane wrote:
> Jaime Casanova <jcasanov@systemguards.com.ec> writes:
>> I found a crash (segmentation fault) on jsonb.
>> This is the best I could do to reduce the query:
>> """
>> select  
>>   75 as c1
>> from 
>>   public.pagg_tab_ml as ref_0,
>>   lateral (select  
>>         ref_0.a as c5 
>>       from generate_series(1, 300) as sample_0
>>       fetch first 78 rows only
>>       ) as subq_0
>> where case when (subq_0.c5 < 2) 
>>            then cast(null as jsonb) 
>>        else cast(null as jsonb) 
>>       end ? ref_0.c
>> """
> I think this must be a memoize bug.  AFAICS, nowhere in this query
> can we be processing a non-null JSONB value, so what are we doing
> in jsonb_hash?  Something down-stack must have lost the information
> that the Datum is actually null.


Yeah, confirmed that this is not failing in release 13.


cheers


andrew


-- 

Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: jsonb crash

From
David Rowley
Date:
On Thu, 30 Sept 2021 at 09:24, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> After further inspection, "what are we doing in jsonb_hash?" is
> indeed a relevant question, but it seems like it's a type mismatch
> not a nullness issue.  EXPLAIN VERBOSE shows

I think you're right here. It should be hashing text.  That seems to
be going wrong in check_memoizable() because it assumes it's always
fine to use the left side's type of the OpExpr to figure out the hash
function to use.

Maybe we can cache the left and the right type's hash function and use
the correct one in paraminfo_get_equal_hashops().

David



Re: jsonb crash

From
Tom Lane
Date:
David Rowley <dgrowleyml@gmail.com> writes:
> On Thu, 30 Sept 2021 at 09:24, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> After further inspection, "what are we doing in jsonb_hash?" is
>> indeed a relevant question, but it seems like it's a type mismatch
>> not a nullness issue.  EXPLAIN VERBOSE shows

> I think you're right here. It should be hashing text.  That seems to
> be going wrong in check_memoizable() because it assumes it's always
> fine to use the left side's type of the OpExpr to figure out the hash
> function to use.

> Maybe we can cache the left and the right type's hash function and use
> the correct one in paraminfo_get_equal_hashops().

Um ... it seems to have correctly identified the cache key expressions,
so why isn't it just doing exprType on those?  The jsonb_exists operator
seems entirely irrelevant here.

            regards, tom lane



Re: jsonb crash

From
David Rowley
Date:
On Thu, 30 Sept 2021 at 10:09, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> David Rowley <dgrowleyml@gmail.com> writes:
> > Maybe we can cache the left and the right type's hash function and use
> > the correct one in paraminfo_get_equal_hashops().
>
> Um ... it seems to have correctly identified the cache key expressions,
> so why isn't it just doing exprType on those?  The jsonb_exists operator
> seems entirely irrelevant here.

This is down to the caching stuff I added to RestrictInfo to minimise
the amount of work done during the join search. I cached the hash
equal function in RestrictInfo so I didn't have to check what that was
each time we consider a join.  The problem is, that I did a bad job of
taking inspiration from check_hashjoinable() which just looks at the
left type.

David



Re: jsonb crash

From
Tom Lane
Date:
David Rowley <dgrowleyml@gmail.com> writes:
> On Thu, 30 Sept 2021 at 10:09, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Um ... it seems to have correctly identified the cache key expressions,
>> so why isn't it just doing exprType on those?  The jsonb_exists operator
>> seems entirely irrelevant here.

> This is down to the caching stuff I added to RestrictInfo to minimise
> the amount of work done during the join search. I cached the hash
> equal function in RestrictInfo so I didn't have to check what that was
> each time we consider a join.  The problem is, that I did a bad job of
> taking inspiration from check_hashjoinable() which just looks at the
> left type.

I'm still confused.  AFAICS, the top-level operator of the qual clause has
exactly nada to do with the cache keys, as this example makes plain.

            regards, tom lane



Re: jsonb crash

From
David Rowley
Date:
On Thu, 30 Sept 2021 at 10:20, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> David Rowley <dgrowleyml@gmail.com> writes:
> > On Thu, 30 Sept 2021 at 10:09, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> Um ... it seems to have correctly identified the cache key expressions,
> >> so why isn't it just doing exprType on those?  The jsonb_exists operator
> >> seems entirely irrelevant here.
>
> > This is down to the caching stuff I added to RestrictInfo to minimise
> > the amount of work done during the join search. I cached the hash
> > equal function in RestrictInfo so I didn't have to check what that was
> > each time we consider a join.  The problem is, that I did a bad job of
> > taking inspiration from check_hashjoinable() which just looks at the
> > left type.
>
> I'm still confused.  AFAICS, the top-level operator of the qual clause has
> exactly nada to do with the cache keys, as this example makes plain.

You're right that it does not. The lateral join condition could be
anything.  We just need to figure out the hash function and which
equality function so that we can properly find any cached tuples when
we're probing the hash table.  We need the equal function too as we
can't just return any old cache tuples that match the same hash value.

Maybe recording the operator is not the best thing to do. Maybe I
should have just recorded the regproc's Oid for the equal function.
That would save from calling get_opcode() in ExecInitMemoize().

David



Re: jsonb crash

From
David Rowley
Date:
On Thu, 30 Sept 2021 at 09:48, David Rowley <dgrowleyml@gmail.com> wrote:
> Maybe we can cache the left and the right type's hash function and use
> the correct one in paraminfo_get_equal_hashops().

Here's a patch of what I had in mind for the fix.  It's just hot off
the press, so really only intended to assist discussion at this stage.

David

Attachment

Re: jsonb crash

From
Tom Lane
Date:
David Rowley <dgrowley@gmail.com> writes:
> On Thu, 30 Sept 2021 at 10:20, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I'm still confused.  AFAICS, the top-level operator of the qual clause has
>> exactly nada to do with the cache keys, as this example makes plain.

> You're right that it does not. The lateral join condition could be
> anything.

Actually, the more I look at this the more unhappy I get, because
it's becoming clear that you have made unfounded semantic
assumptions.  The hash functions generally only promise that they
will distinguish values that are distinguishable by the associated
equality operator.  We have plenty of data types in which that does
not map to bitwise equality ... you need not look further than
float8 for an example.  And in turn, that means that there are lots
of functions/operators that *can* distinguish hash-equal values.
The fact that you're willing to treat this example as cacheable
means that memoize will fail on such clauses.

So I'm now thinking you weren't that far wrong to be looking at
hashability of the top-level qual operator.  What is missing is
that you have to restrict candidate cache keys to be the *direct*
arguments of such an operator.  Looking any further down in the
expression introduces untenable assumptions.

            regards, tom lane



Re: jsonb crash

From
Tom Lane
Date:
I wrote:
> So I'm now thinking you weren't that far wrong to be looking at
> hashability of the top-level qual operator.  What is missing is
> that you have to restrict candidate cache keys to be the *direct*
> arguments of such an operator.  Looking any further down in the
> expression introduces untenable assumptions.

Hmm ... I think that actually, a correct statement of the semantic
restriction is

    To be eligible for memoization, the inside of a join can use the
    passed-in parameters *only* as direct arguments of hashable equality
    operators.

In order to exploit RestrictInfo-based caching, you could make the
further restriction that all such equality operators appear at the
top level of RestrictInfo clauses.  But that's not semantically
necessary.

As an example, assuming p1 and p2 are the path parameters,

    (p1 = x) xor (p2 = y)

is semantically safe to memoize, despite the upper-level xor
operator.  But the example we started with, with a parameter
used as an argument of jsonb_exists, is not safe to memoize
because we have no grounds to suppose that two hash-equal values
will act the same in jsonb_exists.

            regards, tom lane



Re: jsonb crash

From
David Rowley
Date:
On Thu, 30 Sept 2021 at 11:20, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> I wrote:
> > So I'm now thinking you weren't that far wrong to be looking at
> > hashability of the top-level qual operator.  What is missing is
> > that you have to restrict candidate cache keys to be the *direct*
> > arguments of such an operator.  Looking any further down in the
> > expression introduces untenable assumptions.
>
> Hmm ... I think that actually, a correct statement of the semantic
> restriction is
>
>     To be eligible for memoization, the inside of a join can use the
>     passed-in parameters *only* as direct arguments of hashable equality
>     operators.
>
> In order to exploit RestrictInfo-based caching, you could make the
> further restriction that all such equality operators appear at the
> top level of RestrictInfo clauses.  But that's not semantically
> necessary.
>
> As an example, assuming p1 and p2 are the path parameters,
>
>         (p1 = x) xor (p2 = y)
>
> is semantically safe to memoize, despite the upper-level xor
> operator.  But the example we started with, with a parameter
> used as an argument of jsonb_exists, is not safe to memoize
> because we have no grounds to suppose that two hash-equal values
> will act the same in jsonb_exists.

I'm not really sure if I follow your comment about the top-level qual
operator. I'm not really sure why that has anything to do with it.
Remember that we *never* do any hashing of any values from the inner
side of the join.  If we're doing a parameterized nested loop and say
our parameter has the value of 1, the first time through we don't find
any cached tuples, so we run the plan from the inner side of the
nested loop join and cache all the tuples that we get from it. When
the parameter changes, we check if the current value of the parameter
has any tuples cached.  This is what the hashing and equality
comparison does. If the new parameter value is 2, then we'll hash that
and probe the hash table. Since we've only seen value 1 so far, we
won't get a cache hit.  If at some later point in time we see the
parameter value of 1 again, we hash that, find something in the hash
bucket for that value then do an equality test to ensure the values
are actually the same and not just the same hash bucket or hash value.

At no point do we do any hashing on the actual cached tuples.

This allows us to memoize any join expression, not just equality
expressions. e.g if the SQL is: SELECT * FROM t1 INNER JOIN t2 on t1.a
> t2.a;  assuming t2 is on the inner side of the nested loop join,
then the only thing we hash is the t1.a parameter and the only thing
we do an equality comparison on is the current value of t1.a vs some
previous value of t1.a that is stored in the hash table. You can see
here that if t1.a and t2.a are not the same data type then that's of
no relevance as we *never* hash or do any equality comparisons on t2.a
in the memoize code.

The whole thing just hangs together by the assumption that parameters
with the same value will always yield the same tuples. If that's
somehow a wrong assumption, then we have a problem.

I'm not sure if this helps explain how it's meant to work, or if I
just misunderstood you.

David



Re: jsonb crash

From
Tom Lane
Date:
David Rowley <dgrowleyml@gmail.com> writes:
> On Thu, 30 Sept 2021 at 11:20, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Hmm ... I think that actually, a correct statement of the semantic
>> restriction is
>> To be eligible for memoization, the inside of a join can use the
>> passed-in parameters *only* as direct arguments of hashable equality
>> operators.

> I'm not really sure if I follow your comment about the top-level qual
> operator. I'm not really sure why that has anything to do with it.
> Remember that we *never* do any hashing of any values from the inner
> side of the join.  If we're doing a parameterized nested loop and say
> our parameter has the value of 1, the first time through we don't find
> any cached tuples, so we run the plan from the inner side of the
> nested loop join and cache all the tuples that we get from it. When
> the parameter changes, we check if the current value of the parameter
> has any tuples cached.

Right, and the point is that if you *do* get a hit, you are assuming
that the inner side would return the same values as it returned for
the previous hash-equal value.  You are doing yourself no good by
thinking about simple cases like integers.  Think about float8,
and ask yourself whether, if you cached a result for +0, that result
is still good for -0.  In general we can only assume that for applications
of the hash equality operator itself (or members of its hash opfamily).
Anything involving a cast to text, for example, would fail on such a case.

> This allows us to memoize any join expression, not just equality
> expressions.

I am clearly failing to get through to you.  Do I need to build
an example?

            regards, tom lane



Re: jsonb crash

From
David Rowley
Date:
On Thu, 30 Sept 2021 at 11:54, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> David Rowley <dgrowleyml@gmail.com> writes:
> > This allows us to memoize any join expression, not just equality
> > expressions.
>
> I am clearly failing to get through to you.  Do I need to build
> an example?

Taking your -0.0 / +0.0 float example, If I understand correctly due
to -0.0 and +0.0 hashing to the same value and being classed as equal,
we're really only guaranteed to get the same rows if the join
condition uses the float value (in this example) directly.

If for example there was something like a function we could pass that
float value through that was able to distinguish -0.0 from +0.0, then
that could cause issues as the return value of that function could be
anything and have completely different join partners on the other side
of the join.

A function something like:

create or replace function text_sign(f float) returns text as
$$
begin
    if f::text like '-%' then
        return 'neg';
    else
        return 'pos';
    end if;
end;
$$ language plpgsql;

would be enough to do it.  If the join condition was text_sign(t1.f) =
 t2.col and the cache key was t1.f rather than text_sign(t1.f)

On Thu, 30 Sept 2021 at 10:54, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> So I'm now thinking you weren't that far wrong to be looking at
> hashability of the top-level qual operator.  What is missing is
> that you have to restrict candidate cache keys to be the *direct*
> arguments of such an operator.  Looking any further down in the
> expression introduces untenable assumptions.

I think I also follow you now with the above.  The problem is that if
the join operator is able to distinguish something that the equality
operator and hash function are not then we have the same problem.
Restricting the join operator to hash equality ensures that the join
condition cannot distinguish anything that we cannot distinguish in
the cache hash table.

David



Re: jsonb crash

From
David Rowley
Date:
On Thu, 30 Sept 2021 at 07:55, Jaime Casanova
<jcasanov@systemguards.com.ec> wrote:
> """
> select
>   75 as c1
> from
>   public.pagg_tab_ml as ref_0,
>   lateral (select
>         ref_0.a as c5
>       from generate_series(1, 300) as sample_0
>       fetch first 78 rows only
>       ) as subq_0
> where case when (subq_0.c5 < 2)
>            then cast(null as jsonb)
>            else cast(null as jsonb)
>       end ? ref_0.c
> """

I've attached 2 patches that aim to fix this bug. One for master and
one for pg14.  Unfortunately, for pg14, RestrictInfo lacks a field to
store the righthand type's hash equality operator. I don't think it's
going to be possible as [1] shows me that there's at least one project
outside of core that does makeNode(RestrictInfo).  The next best thing
I can think to do for pg14 is just to limit Memoization for
parameterizations where the RestrictInfo has the same types on the
left and right of an OpExpr.  For pg14, the above query just does not
use Memoize anymore.

In theory, since this field is just caching the hash equality
operator, it might be possible to look up the hash equality function
each time when the left and right types don't match and we need to
know the hash equality operator of the right type.  That'll probably
not be a super common case, but I wouldn't go as far as to say that
it'll be rare.  I'd be a bit worried about the additional planning
time if we went that route.  The extra lookup would have to be done
during the join search, so could happen thousands of times given a bit
more than a handful of tables in the join search.

For master, we can just add a new field to RestrictInfo.  The master
version of the patch does that.

Does anyone have any thoughts on the proposed fixes?

David

[1] https://codesearch.debian.net/search?q=makeNode%28RestrictInfo%29&literal=1

Attachment

Re: jsonb crash

From
Justin Pryzby
Date:
On Tue, Oct 26, 2021 at 07:07:01PM +1300, David Rowley wrote:
> Does anyone have any thoughts on the proposed fixes?

I don't have any thoughts, but I want to be sure it isn't forgotten.

-- 
Justin



Re: jsonb crash

From
David Rowley
Date:
On Sat, 6 Nov 2021 at 11:38, Justin Pryzby <pryzby@telsasoft.com> wrote:
>
> On Tue, Oct 26, 2021 at 07:07:01PM +1300, David Rowley wrote:
> > Does anyone have any thoughts on the proposed fixes?
>
> I don't have any thoughts, but I want to be sure it isn't forgotten.

Not forgotten. I was just hoping to get some feedback.

I've now pushed the fix to restrict v14 to only allow Memoize when the
left and right types are the same.  For master, since it's possible to
add a field to RestrictInfo, I've changed that to cache the left and
right hash equality operators.

This does not fix the binary / logical issue mentioned by Tom.  I have
ideas about allowing Memoize to operate in a binary equality mode or
logical equality mode.  I'll need to run in binary mode when there are
lateral vars or when any join operator is not hashable.

David



Re: jsonb crash

From
Tom Lane
Date:
David Rowley <dgrowleyml@gmail.com> writes:
> I've now pushed the fix to restrict v14 to only allow Memoize when the
> left and right types are the same.  For master, since it's possible to
> add a field to RestrictInfo, I've changed that to cache the left and
> right hash equality operators.

If you were going to push this fix before 14.1, you should have done it
days ago.  At this point it's not possible to get a full set of buildfarm
results before the wrap.  The valgrind and clobber-cache animals, in
particular, are unlikely to report back in time.

            regards, tom lane



Re: jsonb crash

From
David Rowley
Date:
On Mon, 8 Nov 2021 at 15:38, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> David Rowley <dgrowleyml@gmail.com> writes:
> > I've now pushed the fix to restrict v14 to only allow Memoize when the
> > left and right types are the same.  For master, since it's possible to
> > add a field to RestrictInfo, I've changed that to cache the left and
> > right hash equality operators.
>
> If you were going to push this fix before 14.1, you should have done it
> days ago.  At this point it's not possible to get a full set of buildfarm
> results before the wrap.  The valgrind and clobber-cache animals, in
> particular, are unlikely to report back in time.

Sorry, I was under the impression that it was ok until you'd done the
stamp for the release. As far as I can see, that's not done yet.

Do you want me to back out the change I made to 14?

David



Re: jsonb crash

From
David Rowley
Date:
On Thu, 30 Sept 2021 at 10:54, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Actually, the more I look at this the more unhappy I get, because
> it's becoming clear that you have made unfounded semantic
> assumptions.  The hash functions generally only promise that they
> will distinguish values that are distinguishable by the associated
> equality operator.  We have plenty of data types in which that does
> not map to bitwise equality ... you need not look further than
> float8 for an example.

I think this part might be best solved by allowing Memoize to work in
a binary mode.  We already have datum_image_eq() for performing a
binary comparison on a Datum. We'll also need to supplement that with
a function that generates a hash value based on the binary value too.

If we do that and put Memoize in binary mode when join operators are
not hashable or when we're doing LATERAL joins, I think it should fix
this.

It might be possible to work a bit harder and allow the logical mode
for some LATERAL joins.  e.g. something like: SELECT * FROM a, LATERAL
(SELECT * FROM b WHERE a.a = b.b LIMIT 1) b; could use the logical
mode (assuming the join operator is hashable), however, we really only
know the lateral_vars. We don't really collect their context
currently, or the full Expr that they're contained in.  That case gets
more complex if the join condition had contained a mix of lateral and
non-lateral vars on one side of the qual, e.g WHERE a.a = b.b + a.z.

Certainly if the lateral part of the query was a function call, then
we'd be forced into binary mode as we'd have no idea what the function
is doing with the lateral vars being passed to it.

I've my proposed patch.

An easier way out of this would be to disable Memoize for lateral
joins completely and only allow it for normal joins when the join
operators are hashable. I don't want to do this as people are already
seeing good wins in PG14 with Memoize and lateral joins [1]. I think
quite a few people would be upset if we removed that ability.

David

[1] https://twitter.com/RPorsager/status/1455660236375826436

Attachment

Re: jsonb crash

From
David Rowley
Date:
On Thu, 11 Nov 2021 at 18:08, David Rowley <dgrowleyml@gmail.com> wrote:
>
> On Thu, 30 Sept 2021 at 10:54, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Actually, the more I look at this the more unhappy I get, because
> > it's becoming clear that you have made unfounded semantic
> > assumptions.  The hash functions generally only promise that they
> > will distinguish values that are distinguishable by the associated
> > equality operator.  We have plenty of data types in which that does
> > not map to bitwise equality ... you need not look further than
> > float8 for an example.
>
> I think this part might be best solved by allowing Memoize to work in
> a binary mode.  We already have datum_image_eq() for performing a
> binary comparison on a Datum. We'll also need to supplement that with
> a function that generates a hash value based on the binary value too.

Since I really don't want to stop Memoize from working with LATERAL
joined function calls, I see no other way other than to make use of a
binary key'd cache for cases where we can't be certain that it's safe
to make work in non-binary mode.

I've had thoughts about if we should just make it work in binary mode
all the time, but my thoughts are that that's not exactly a great idea
since it could have a negative effect on cache hits due to there being
the possibility that some types such as case insensitive text where
the number of variations in the binary representation might be vast.
The reason this could be bad is that the estimated cache hit ratio is
calculated by looking at n_distinct, which is obviously not looking
for distinctions in the binary representation. So in binary mode, we
may get a lower cache hit ratio than we might think we'll get, even
with accurate statistics. I'd like to minimise those times by only
using binary mode when we can't be certain that logical mode is safe.

The patch does add new fields to the Memoize plan node type, the
MemoizeState executor node and also MemoizePath.  The new fields do
fit inside the padding of the existing structs.  I've also obviously
had to modify the read/write/copy functions for Memoize to add the new
field there too. My understanding is that this should be ok since
those are only used for parallel query to send plans to the working
and to deserialise it on the worker side. There should never be any
version mismatches there.

If anyone wants to chime in about my proposed patch for this, then
please do so soon. I'm planning to look at this in my Tuesday morning
(UTC+13).

David