Thread: Re: BUG #16863: Assert failed in set_plain_rel_size() on processing ~* with a long prefix

12.02.2021 22:00, PG Bug reporting form wrote:
> SELECT * FROM test WHERE t ~* ('^' || repeat('-', 500));
> ...
> #2  0x0000559da7963ff8 in ExceptionalCondition (
>     conditionName=conditionName@entry=0x559da7ab4690 "rel->rows > 0 ||
> IS_DUMMY_REL(rel)",
>     errorType=errorType@entry=0x559da79bf028 "FailedAssertion",
> fileName=fileName@entry=0x559da7ab43b1 "allpaths.c",
>     lineNumber=lineNumber@entry=462) at assert.c:67
With the attached debugging patch applied I see that rel->rows there is
NaN. At the end of the walk by the following calls:
set_plain_rel_size -> set_baserel_size_estimates ->
clauselist_selectivity -> clauselist_selectivity_simple ->
clause_selectivity -> restriction_selectivity -> icregexeqsel ->
patternsel -> patternsel_common -> pattern_fixed_prefix ->
regex_fixed_prefix -> regex_selectivity
I've found a division that produces NaN:
sel /= pow(FIXED_CHAR_SEL, fixed_prefix_len);

The complete output with the debugging code:
psql:500.sql:2: INFO:  sel:

0.0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000,
FIXED_CHAR_SEL: 0.200000, fixed_prefix_len: 500, pow:

0.0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
psql:500.sql:2: INFO:  sel: NaN
psql:500.sql:2: INFO:  rel->rows: NaN; nrows: NaN
psql:500.sql:2: INFO:  rel->rows: NaN
psql:500.sql:2: server closed the connection unexpectedly

With the prefix length 400 the output is different:
psql:400.sql:2: INFO:  sel:

0.0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000002582249878086965166223199366465225696632326517795503118666931124080740300000000000000000000000000000000000000000000000000,
FIXED_CHAR_SEL: 0.200000, fixed_prefix_len: 400, pow:

0.0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000002582249878086966144821031402096464952544460966116815567078211594740854000000000000000000000000000000000000000000000000000
psql:400.sql:2: INFO:  sel: 1.000000
psql:400.sql:2: INFO:  rel->rows: 7.000000; nrows: 6.800000
psql:400.sql:2: INFO:  rel->rows: 7.000000

On the master with the length 500 there is no assertion failure but the
row count is strange:
psql:master500.sql:3: INFO:  sel:

0.0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000,
FIXED_CHAR_SEL: 0.200000, fixed_prefix_len: 500, pow:

0.0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
psql:master500.sql:3: INFO:  sel: NaN
psql:master500.sql:3: INFO:  rel->rows:
10000000000000000159028911097599180468360808563945281389781327557747838772170381060813469985856815104.000000;
nrows: NaN
psql:master500.sql:3: INFO:  rel->rows:
10000000000000000159028911097599180468360808563945281389781327557747838772170381060813469985856815104.000000

By the way, if the following check in restriction_selectivity() is
intended to throw error on an invalid selectivity, shouldn't the
isnan(result) test be appended here?:
    if (result < 0.0 || result > 1.0)
        elog(ERROR, "invalid restriction selectivity: %f", result);

Best regards,
Alexander

Attachment
Alexander Lakhin <exclusion@gmail.com> writes:
> I've found a division that produces NaN:
> sel /= pow(FIXED_CHAR_SEL, fixed_prefix_len);

Hmm.  I'm not getting a NaN AFAICT, but I am getting pretty darn weird
estimates.  I agree this needs some kind of clamp.

regression=# create table test (t text);
CREATE TABLE
regression=# explain SELECT * FROM test WHERE t ~* ('^' || repeat('-', 500));
...
 Seq Scan on test  (cost=0.00..27.00
rows=10000000000000000159028911097599180468360808563945281389781327557747838772170381060813469985856815104width=32) 
...

BTW, the message you're answering hasn't shown up here, nor is it
in the mailing list archives.  Odd.

            regards, tom lane



Hello Tom,
12.02.2021 23:37, Tom Lane wrote:
> Alexander Lakhin <exclusion@gmail.com> writes:
>> I've found a division that produces NaN:
>> sel /= pow(FIXED_CHAR_SEL, fixed_prefix_len);
> Hmm.  I'm not getting a NaN AFAICT, but I am getting pretty darn weird
> estimates.  I agree this needs some kind of clamp.
>
> regression=# create table test (t text);
> CREATE TABLE
> regression=# explain SELECT * FROM test WHERE t ~* ('^' || repeat('-', 500));
> ...
>  Seq Scan on test  (cost=0.00..27.00
rows=10000000000000000159028911097599180468360808563945281389781327557747838772170381060813469985856815104width=32)
 
The same number I've seen on the master branch. But on REL_13_STABLE I
get a NaN and then the assertion failure. (I've chosen the version 13.2
in the bug report, but it's really delayed somewhere.)
Though that division produced a NaN for me on both branches.

Best regards,
Alexander



Alexander Lakhin <exclusion@gmail.com> writes:
> 12.02.2021 23:37, Tom Lane wrote:
>> Alexander Lakhin <exclusion@gmail.com> writes:
>>> I've found a division that produces NaN:
>>> sel /= pow(FIXED_CHAR_SEL, fixed_prefix_len);

>> Hmm.  I'm not getting a NaN AFAICT, but I am getting pretty darn weird
>> estimates.  I agree this needs some kind of clamp.
>>
>> regression=# create table test (t text);
>> CREATE TABLE
>> regression=# explain SELECT * FROM test WHERE t ~* ('^' || repeat('-', 500));
>> ...
>> Seq Scan on test  (cost=0.00..27.00
rows=10000000000000000159028911097599180468360808563945281389781327557747838772170381060813469985856815104width=32) 

> The same number I've seen on the master branch. But on REL_13_STABLE I
> get a NaN and then the assertion failure.

The difference in behavior is evidently explained by commit a90c950fc,
which has decided to insert a random number in place of a NaN estimate.

Well, it's not really random, it's supposed to be 1e100 ... but EXPLAIN
doesn't know there's only 16 or so significant digits there.  Maybe we
need to work a bit harder on making that print nicely.  And maybe we
should rethink the idea that it's okay to gloss over a NaN estimate
that way.  I certainly don't see a defensible reason for assuming that
NaN means "a large value".

            regards, tom lane