Thread: Failing assertions in indxpath.c, placeholder.c and brin_minmax.c

Failing assertions in indxpath.c, placeholder.c and brin_minmax.c

From
Andreas Seltenreich
Date:
Hi,

when running my random query generator contraption[1] against the
regression database of 9.5 or master, it occasionally triggers one of
the following three assertions.  Someone more knowledgeable might want
to take a look at them...

-- FailedAssertion("!(outer_rel->rows > 0)", File: "indxpath.c", Line: 1911)
-- sample query:   select     rel1925354.loid as c0,     rel1925353.version as c1   from     (select
rel1925352.aaas c0,           rel1925352.aa as c1         from           public.b as rel1925352         where
(rel1925352.bbis NULL)           and (rel1925352.bb < rel1925352.bb)) as subq_303136         inner join
pg_catalog.pg_stat_sslas rel1925353         on (subq_303136.c0 = rel1925353.pid )       right join
pg_catalog.pg_largeobjectas rel1925354       on (subq_303136.c0 = rel1925354.pageno )   where (rel1925353.clientdn !~
rel1925353.clientdn)    and (rel1925353.cipher <= rel1925353.clientdn);
 

,----[ git bisect ]
|   first bad commit: [3f8c23c4d31d4a0e801041733deb2c7cfa577b32] Improve
|   predtest.c's ability to reason about operator expressions.
`----

-- FailedAssertion("!(!bms_is_empty(phinfo->ph_eval_at))", File: "placeholder.c", Line: 109)
-- sample query:   select     rel1600276.viewowner as c0,     rel1600274.maxwritten_clean as c1,
rel1600275.n_tup_hot_updas c2   from     pg_catalog.pg_stat_bgwriter as rel1600274         inner join
pg_catalog.pg_stat_xact_all_tablesas rel1600275         on (rel1600274.maxwritten_clean = rel1600275.seq_scan )
rightjoin pg_catalog.pg_views as rel1600276         right join pg_catalog.pg_operator as rel1600277         on
(rel1600276.viewname= rel1600277.oprname )       on (rel1600275.relname = rel1600277.oprname )   where 3 is not NULL;
 

,----[ git bisect ]
|   first bad commit: [f4abd0241de20d5d6a79b84992b9e88603d44134] Support
|   flattening of empty-FROM subqueries and one-row VALUES tables.
`----

-- FailedAssertion("!(key->sk_flags & 0x0080)", File: "brin_minmax.c", Line: 177)
-- sample query:   select     rel167978.namecol as c0   from     information_schema.parameters as rel167972       left
joinpublic.student as rel167977         inner join public.brintest as rel167978         on (rel167977.age =
rel167978.int4col)       on (rel167972.interval_precision = rel167977.age )   where rel167977.name <> rel167977.name;
 

regards,
andreas

Footnotes: 
[1]  https://github.com/anse1/sqlsmith



Re: Failing assertions in indxpath.c, placeholder.c and brin_minmax.c

From
Michael Paquier
Date:
On Sun, Jul 26, 2015 at 9:55 PM, Andreas Seltenreich <seltenreich@gmx.de> wrote:
> when running my random query generator contraption[1] against the
> regression database of 9.5 or master, it occasionally triggers one of
> the following three assertions.  Someone more knowledgeable might want
> to take a look at them...
>
> -- FailedAssertion("!(outer_rel->rows > 0)", File: "indxpath.c", Line: 1911)
> -- sample query:
>     select
>       rel1925354.loid as c0,
>       rel1925353.version as c1
>     from
>       (select
>             rel1925352.aa as c0,
>             rel1925352.aa as c1
>           from
>             public.b as rel1925352
>           where (rel1925352.bb is NULL)
>             and (rel1925352.bb < rel1925352.bb)) as subq_303136
>           inner join pg_catalog.pg_stat_ssl as rel1925353
>           on (subq_303136.c0 = rel1925353.pid )
>         right join pg_catalog.pg_largeobject as rel1925354
>         on (subq_303136.c0 = rel1925354.pageno )
>     where (rel1925353.clientdn !~ rel1925353.clientdn)
>       and (rel1925353.cipher <= rel1925353.clientdn);
>
> ,----[ git bisect ]
> |   first bad commit: [3f8c23c4d31d4a0e801041733deb2c7cfa577b32] Improve
> |   predtest.c's ability to reason about operator expressions.
> `----
>
> -- FailedAssertion("!(!bms_is_empty(phinfo->ph_eval_at))", File: "placeholder.c", Line: 109)
> -- sample query:
>     select
>       rel1600276.viewowner as c0,
>       rel1600274.maxwritten_clean as c1,
>       rel1600275.n_tup_hot_upd as c2
>     from
>       pg_catalog.pg_stat_bgwriter as rel1600274
>           inner join pg_catalog.pg_stat_xact_all_tables as rel1600275
>           on (rel1600274.maxwritten_clean = rel1600275.seq_scan )
>         right join pg_catalog.pg_views as rel1600276
>           right join pg_catalog.pg_operator as rel1600277
>           on (rel1600276.viewname = rel1600277.oprname )
>         on (rel1600275.relname = rel1600277.oprname )
>     where 3 is not NULL;
>
> ,----[ git bisect ]
> |   first bad commit: [f4abd0241de20d5d6a79b84992b9e88603d44134] Support
> |   flattening of empty-FROM subqueries and one-row VALUES tables.
> `----
>
> -- FailedAssertion("!(key->sk_flags & 0x0080)", File: "brin_minmax.c", Line: 177)
> -- sample query:
>     select
>       rel167978.namecol as c0
>     from
>       information_schema.parameters as rel167972
>         left join public.student as rel167977
>           inner join public.brintest as rel167978
>           on (rel167977.age = rel167978.int4col )
>         on (rel167972.interval_precision = rel167977.age )
>     where rel167977.name <> rel167977.name;
> Footnotes:
> [1]  https://github.com/anse1/sqlsmith

This is really interesting stuff. I think that it would be possible to
extract self-contained test cases from your tool and those queries to
reproduce the failures. It is written that this tools connects to a
database to retrieve the schema, what is it exactly in the case of
those failures?
-- 
Michael



Re: Failing assertions in indxpath.c, placeholder.c and brin_minmax.c

From
Andreas Seltenreich
Date:
Michael Paquier writes:

>> Footnotes:
>> [1]  https://github.com/anse1/sqlsmith
>
> This is really interesting stuff. I think that it would be possible to
> extract self-contained test cases from your tool and those queries to
> reproduce the failures. It is written that this tools connects to a
> database to retrieve the schema, what is it exactly in the case of
> those failures?

I used the database "regression" that pg_regress leaves behind when you
remove the --temp-install from it's default invocation through make
check.  Sorry about not being explicit about that.

So, dropping one of the queries into src/test/regress/sql/smith.sql and
invoking
   make check EXTRA_TESTS=smith

was all that was needed to integrate them.  I was then able to perform
"git bisect run" on this command.  Er, plus consing the expected output
file.

I'm using the regression db a lot when hacking on sqlsmith, as it
contains much more nasty things than your average database.

regards
andreas



Re: Failing assertions in indxpath.c, placeholder.c and brin_minmax.c

From
Tom Lane
Date:
Andreas Seltenreich <seltenreich@gmx.de> writes:
> when running my random query generator contraption[1] against the
> regression database of 9.5 or master, it occasionally triggers one of
> the following three assertions.

Very very cool tool!  Please keep doing that testing.

The first two seem to be planner problems, so I'll take responsibility for
digging into those.  But the third appears to be plain old brain fade in
the BRIN code.  It can be reproduced by

regression=# explain select * from brintest where int4col = NULL::integer::information_schema.cardinal_number;
                               QUERY PLAN                                          
 
--------------------------------------------------------------------------------
----------------Bitmap Heap Scan on brintest  (cost=52.01..56.02 rows=1 width=339)  Recheck Cond: (int4col =
((NULL::integer)::information_schema.cardinal_number
)::integer)  ->  Bitmap Index Scan on brinidx  (cost=0.00..52.01 rows=1 width=0)        Index Cond: (int4col =
((NULL::integer)::information_schema.cardinal_nu
mber)::integer)
(4 rows)

regression=# select * from brintest where int4col = NULL::integer::information_schema.cardinal_number;
server closed the connection unexpectedly

or you can do it like this:

regression=# select * from brintest where int4col = (select NULL::integer);
server closed the connection unexpectedly

or you could do it with a join to a table containing some null values.

You need some complication because if you just write a plain null literal:

regression=# explain select * from brintest where int4col = NULL::integer;                                QUERY PLAN
                       
 
------------------------------------------------------------------Result  (cost=0.00..106.30 rows=1 width=339)
One-TimeFilter: NULL::boolean  ->  Seq Scan on brintest  (cost=0.00..106.30 rows=1 width=339)
 
(3 rows)

the planner knows that int4eq is strict so it reduces the WHERE clause
to constant NULL and doesn't bother with an indexscan.

Bottom line is that somebody failed to consider the possibility of a
null comparison value reaching the BRIN index lookup machinery.
The code stanza that's failing supposes that only IS NULL or IS NOT NULL
tests could have SK_ISNULL set, but that's just wrong.
        regards, tom lane



Re: Failing assertions in indxpath.c, placeholder.c and brin_minmax.c

From
Tom Lane
Date:
Andreas Seltenreich <seltenreich@gmx.de> writes:
> when running my random query generator contraption[1] against the
> regression database of 9.5 or master, it occasionally triggers one of
> the following three assertions.

I've fixed the first two of these --- thanks for the report!

> ,----[ git bisect ]
> |   first bad commit: [3f8c23c4d31d4a0e801041733deb2c7cfa577b32] Improve
> |   predtest.c's ability to reason about operator expressions.
> `----

I'm a bit confused about this aspect of your report though, because in
my hands that example fails clear back to 9.2.  It doesn't seem to require
the predtest.c improvement to expose the fault.
        regards, tom lane



Re: Failing assertions in indxpath.c, placeholder.c and brin_minmax.c

From
Peter Geoghegan
Date:
On Sun, Jul 26, 2015 at 7:07 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andreas Seltenreich <seltenreich@gmx.de> writes:
>> when running my random query generator contraption[1] against the
>> regression database of 9.5 or master, it occasionally triggers one of
>> the following three assertions.
>
> Very very cool tool!  Please keep doing that testing.

The SQLite people have been using a tool like this for some time.
They've also had luck finding bugs with a generic fuzz-testing tool
called "american fuzzy lop" (yes, seriously, that's what it's called),
which apparently is the state of the art.

I myself ran that tool against Postgres. I didn't spend enough time to
tweak it in a way that might have been effective. I also didn't figure
out a way to make iterations fast enough for the tool to be effective,
because I was invoking Postgres in single-user mode. I might pick it
up again in the future, but probably for a more targeted case.

-- 
Peter Geoghegan



Re: Failing assertions in indxpath.c, placeholder.c and brin_minmax.c

From
Michael Paquier
Date:
<div dir="ltr"><br /><div class="gmail_extra"><br /><div class="gmail_quote">On Sun, Jul 26, 2015 at 10:32 PM, Andreas
Seltenreich<span dir="ltr"><<a href="mailto:seltenreich@gmx.de" target="_blank">seltenreich@gmx.de</a>></span>
wrote:<br/><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span
class="">MichaelPaquier writes:<br /><br /> >> Footnotes:<br /> >> [1]  <a
href="https://github.com/anse1/sqlsmith"rel="noreferrer" target="_blank">https://github.com/anse1/sqlsmith</a><br />
><br/> > This is really interesting stuff. I think that it would be possible to<br /> > extract self-contained
testcases from your tool and those queries to<br /> > reproduce the failures. It is written that this tools connects
toa<br /> > database to retrieve the schema, what is it exactly in the case of<br /> > those failures?<br /><br
/></span>Iused the database "regression" that pg_regress leaves behind when you<br /> remove the --temp-install from
it'sdefault invocation through make<br /> check.  Sorry about not being explicit about that.<br /><br /> So, dropping
oneof the queries into src/test/regress/sql/smith.sql and<br /> invoking<br /><br />     make check
EXTRA_TESTS=smith<br/><br /> was all that was needed to integrate them.  I was then able to perform<br /> "git bisect
run"on this command.  Er, plus consing the expected output<br /> file.<br /><br /> I'm using the regression db a lot
whenhacking on sqlsmith, as it<br /> contains much more nasty things than your average database.<span
class="HOEnZb"><fontcolor="#888888"><br /></font></span></blockquote></div><br /></div><div class="gmail_extra">Ah, OK.
Thanks.The code is licensed as GPL, has a dependency on libpqxx and is written in C++, so it cannot be integrated into
coreas a test module in this state, but I think that it would be definitely worth having something like that in the
codetree that runs on the buildfarm. We could have caught up those problems earlier.  Now I imagine that this is a
costlyrun, so we had better have a switch to control if it is run or not, like a configure option or a flag.
Thoughts?<br/>-- <br /><div class="gmail_signature">Michael<br /></div></div></div> 

Re: Failing assertions in indxpath.c, placeholder.c and brin_minmax.c

From
Andreas Seltenreich
Date:
Tom Lane writes:

> Andreas Seltenreich <seltenreich@gmx.de> writes:
>> when running my random query generator contraption[1] against the
>> regression database of 9.5 or master, it occasionally triggers one of
>> the following three assertions.
>
> I've fixed the first two of these --- thanks for the report!

I let sqlsmith run during the night, and it did no longer trigger the
first two.  During roughly a million random queries it triggered the
already mentioned brin one 10 times, but there was also one instance of
this new one in the log:

TRAP: FailedAssertion("!(join_clause_is_movable_into(rinfo, joinrel->relids, join_and_req))", File: "relnode.c", Line:
987)
LOG:  server process (PID 12851) was terminated by signal 6: Aborted
DETAIL:  Failed process was running: select    rel65543066.tmplname as c0,   rel65543064.umuser as c1from   public.dept
asrel65543059    inner join pg_catalog.pg_user_mappings as rel65543064        left join pg_catalog.pg_enum as
rel65543065       on (rel65543064.srvname = rel65543065.enumlabel )      inner join pg_catalog.pg_ts_template as
rel65543066     on (rel65543065.enumtypid = rel65543066.tmplnamespace )    on (rel65543059.dname = rel65543064.srvname
)where((rel65543059.mgrname = rel65543059.mgrname)     and (rel65543064.usename = rel65543066.tmplname))   and
(rel65543059.mgrname~~ rel65543059.mgrname)fetch first 128 rows only;
 

>> ,----[ git bisect ]
>> |   first bad commit: [3f8c23c4d31d4a0e801041733deb2c7cfa577b32] Improve
>> |   predtest.c's ability to reason about operator expressions.
>> `----
>
> I'm a bit confused about this aspect of your report though, because in
> my hands that example fails clear back to 9.2.  It doesn't seem to require
> the predtest.c improvement to expose the fault.

Hmm, I actually used a different, uglier query to trigger this assertion
for the bisection run.  I'll attach it[1] along with the complete git
bisect log[2].

regards,
andreas

Footnotes: 
[1]  select  subq_717608.c3 as c0, rel4551421.inhrelid as c1, rel4551421.inhrelid as c2, subq_717608.c3 as c3 from
information_schema.foreign_tablesas rel4551363 right join public.hash_f8_heap as rel4551366 inner join
pg_catalog.pg_constraintas rel4551419 inner join (select  rel4551420.bb as c0, rel4551420.aa as c1, rel4551420.aa as
c2,rel4551420.aa as c3 from public.b as rel4551420 where ( rel4551420.bb>rel4551420.bb ) and (
rel4551420.bb<rel4551420.bb) ) as subq_717608 on (rel4551419.coninhcount = subq_717608.c1 ) left join
pg_catalog.pg_inheritsas rel4551421 on (subq_717608.c1 = rel4551421.inhseqno ) on (rel4551366.seqno = subq_717608.c1 )
on(rel4551363.foreign_table_schema = rel4551419.conname ) where ( ( rel4551419.contypid<>rel4551419.connamespace ) and
(rel4551419.connamespace>=rel4551419.conrelid ) ) and ( rel4551421.inhparent<rel4551419.contypid )  fetch first 9 rows
only;
 

[2] git bisect start
# bad: [3b5a89c4820fb11c337838c1ad71e8e93f2937d1] Fix resource leak pointed out by Coverity.
git bisect bad 3b5a89c4820fb11c337838c1ad71e8e93f2937d1
# good: [e6df2e1be6330660ba4d81daa726ae4a71535aa9] Stamp 9.4beta1.
git bisect good e6df2e1be6330660ba4d81daa726ae4a71535aa9
# bad: [68e66923ff629c324e219090860dc9e0e0a6f5d6] Add missing volatile qualifier.
git bisect bad 68e66923ff629c324e219090860dc9e0e0a6f5d6
# bad: [a16bac36eca8158cbf78987e95376f610095f792] Remove dependency on wsock32.lib in favor of ws2_32
git bisect bad a16bac36eca8158cbf78987e95376f610095f792
# good: [55d5b3c08279b487cfa44d4b6e6eea67a0af89e4] Remove unnecessary output expressions from unflattened subqueries.
git bisect good 55d5b3c08279b487cfa44d4b6e6eea67a0af89e4
# bad: [1cbc9480106241aaa8db112331e93d0a265b6db0] Check interrupts during logical decoding more frequently.
git bisect bad 1cbc9480106241aaa8db112331e93d0a265b6db0
# bad: [686f362bee126e50280bcd3b35807b02f18a8966] Fix contrib/pg_upgrade/test.sh for $PWD containing spaces.
git bisect bad 686f362bee126e50280bcd3b35807b02f18a8966
# bad: [be76a6d39e2832d4b88c0e1cc381aa44a7f86881] Secure Unix-domain sockets of "make check" temporary clusters.
git bisect bad be76a6d39e2832d4b88c0e1cc381aa44a7f86881
# good: [6554656ea2043c5bb877b427237dc5ddd7c5e5c8] Improve tuplestore's error messages for I/O failures.
git bisect good 6554656ea2043c5bb877b427237dc5ddd7c5e5c8
# bad: [a7205d81573cb0c979f2d463a1d9edb6f97c94aa] Adjust 9.4 release notes.
git bisect bad a7205d81573cb0c979f2d463a1d9edb6f97c94aa
# bad: [3f8c23c4d31d4a0e801041733deb2c7cfa577b32] Improve predtest.c's ability to reason about operator expressions.
git bisect bad 3f8c23c4d31d4a0e801041733deb2c7cfa577b32
# good: [c81e63d85f0c2c39d3fdfd8b95fc1ead6fdcb89f] Fix pg_restore's processing of old-style BLOB COMMENTS data.
git bisect good c81e63d85f0c2c39d3fdfd8b95fc1ead6fdcb89f
# first bad commit: [3f8c23c4d31d4a0e801041733deb2c7cfa577b32] Improve predtest.c's ability to reason about operator
expressions.




Re: Failing assertions in indxpath.c, placeholder.c and brin_minmax.c

From
Alvaro Herrera
Date:
Tom Lane wrote:

> Bottom line is that somebody failed to consider the possibility of a
> null comparison value reaching the BRIN index lookup machinery.
> The code stanza that's failing supposes that only IS NULL or IS NOT NULL
> tests could have SK_ISNULL set, but that's just wrong.

Hm, okay, will look into that.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Failing assertions in indxpath.c, placeholder.c and brin_minmax.c

From
Tom Lane
Date:
Andreas Seltenreich <seltenreich@gmx.de> writes:
> Tom Lane writes:
>> I've fixed the first two of these --- thanks for the report!

> I let sqlsmith run during the night, and it did no longer trigger the
> first two.  During roughly a million random queries it triggered the
> already mentioned brin one 10 times, but there was also one instance of
> this new one in the log:

Oh goody, more fun.  I'll take a look.

>> I'm a bit confused about this aspect of your report though, because in
>> my hands that example fails clear back to 9.2.  It doesn't seem to require
>> the predtest.c improvement to expose the fault.

> Hmm, I actually used a different, uglier query to trigger this assertion
> for the bisection run.

Ah, okay.  The triggering condition for both those cases is
provably-contradictory restriction clauses on an inheritance relation.
In what you showed yesterday, that was something like "x < x AND x IS
NULL", which the planner has been able to recognize as contradictory
for a long time because "<" is strict.  (It did not, and still doesn't,
notice that "x < x" all by itself is contradictory...).  But here it
looks like the trigger is

from public.b as rel4551420
where ( rel4551420.bb>rel4551420.bb ) and ( rel4551420.bb<rel4551420.bb )

It was the recent predtest improvements that allowed recognition that
bb < bb contradicts bb > bb.  So that's why this run started to fail
there, even though the bug it was tickling is much older.
        regards, tom lane



Re: Failing assertions in indxpath.c, placeholder.c and brin_minmax.c

From
Tom Lane
Date:
Andreas Seltenreich <seltenreich@gmx.de> writes:
> I let sqlsmith run during the night, and it did no longer trigger the
> first two.  During roughly a million random queries it triggered the
> already mentioned brin one 10 times, but there was also one instance of
> this new one in the log:

> TRAP: FailedAssertion("!(join_clause_is_movable_into(rinfo, joinrel->relids, join_and_req))", File: "relnode.c",
Line:987)
 
> LOG:  server process (PID 12851) was terminated by signal 6: Aborted
> DETAIL:  Failed process was running: select  
>       rel65543066.tmplname as c0, 
>       rel65543064.umuser as c1
>     from 
>       public.dept as rel65543059
>         inner join pg_catalog.pg_user_mappings as rel65543064
>             left join pg_catalog.pg_enum as rel65543065
>             on (rel65543064.srvname = rel65543065.enumlabel )
>           inner join pg_catalog.pg_ts_template as rel65543066
>           on (rel65543065.enumtypid = rel65543066.tmplnamespace )
>         on (rel65543059.dname = rel65543064.srvname )
>     where ((rel65543059.mgrname = rel65543059.mgrname) 
>         and (rel65543064.usename = rel65543066.tmplname)) 
>       and (rel65543059.mgrname ~~ rel65543059.mgrname)
>     fetch first 128 rows only;

I looked into this one.  What seems to be the story is that
join_clause_is_movable_into() is approximate in the conservative
direction, that is it sometimes can return "false" when a strict analysis
would conclude "true" (a fact that is undocumented in its comments;
I shall fix that).  This is acceptable, as long as the answers are
consistent across different jointree levels, since the worst consequence
is that we might fail to push a join clause as far down the jointree as it
could be pushed.  However, the Assert that's failing here supposes that
the answer is exact.  I think a sufficient fix in the near term is to
disable that Assert and add suitable commentary.  In the long run it might
be nice if the answers were exact, but that would require more information
than is currently passed to join_clause_is_movable_into(), which would
imply non-back-patchable API changes.

> ,----[ git bisect ]
> |   first bad commit: [3f8c23c4d31d4a0e801041733deb2c7cfa577b32] Improve
> |   predtest.c's ability to reason about operator expressions.
> `----

There seems to be something odd going on with your bisection tests,
because once again the query fails clear back to 9.2 for me, which is what
I'd expect based on the above analysis --- the movable-join-clause logic
all came in with parameterized paths in 9.2.
        regards, tom lane



Re: Failing assertions in indxpath.c, placeholder.c and brin_minmax.c

From
Alvaro Herrera
Date:
Tom Lane wrote:

> Bottom line is that somebody failed to consider the possibility of a
> null comparison value reaching the BRIN index lookup machinery.
> The code stanza that's failing supposes that only IS NULL or IS NOT NULL
> tests could have SK_ISNULL set, but that's just wrong.

I think the easiest way to solve this is to consider that all indexable
operators are strict, and have the function return false in that case.
The attached patch implements that.  (In a quick check, the only
non-strict operator in the regression database is <%(point,widget),
which seems okay to ignore given that the type itself is only part of
pg_regress.  I wonder what would happen if the regression tests defined
an index using that operator.)

What btree actually does is precompute a "qual_ok" property at
scan-restart time, which seems pretty clever (maybe too much).  I think
something like _bt_preprocess_keys should probably be applied to BRIN
scans someday.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: Failing assertions in indxpath.c, placeholder.c and brin_minmax.c

From
Tom Lane
Date:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> Tom Lane wrote:
>> Bottom line is that somebody failed to consider the possibility of a
>> null comparison value reaching the BRIN index lookup machinery.
>> The code stanza that's failing supposes that only IS NULL or IS NOT NULL
>> tests could have SK_ISNULL set, but that's just wrong.

> I think the easiest way to solve this is to consider that all indexable
> operators are strict, and have the function return false in that case.
> The attached patch implements that.

This looks fine to me as a localized fix.  I was wondering whether we
could short-circuit the index lookup further upstream, but I take it from
your comment about _bt_preprocess_keys that BRIN has no convenient place
for that today.  (Even if it did, I'd still vote for making this change,
for safety's sake.)
        regards, tom lane



Re: Failing assertions in indxpath.c, placeholder.c and brin_minmax.c

From
Alvaro Herrera
Date:
Tom Lane wrote:
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> > Tom Lane wrote:
> >> Bottom line is that somebody failed to consider the possibility of a
> >> null comparison value reaching the BRIN index lookup machinery.
> >> The code stanza that's failing supposes that only IS NULL or IS NOT NULL
> >> tests could have SK_ISNULL set, but that's just wrong.
> 
> > I think the easiest way to solve this is to consider that all indexable
> > operators are strict, and have the function return false in that case.
> > The attached patch implements that.
> 
> This looks fine to me as a localized fix.  I was wondering whether we
> could short-circuit the index lookup further upstream, but I take it from
> your comment about _bt_preprocess_keys that BRIN has no convenient place
> for that today.  (Even if it did, I'd still vote for making this change,
> for safety's sake.)

Yeah, it doesn't currently.  Hopefully we will improve that in the
future.  Particularly with regards to array keys I think there's a lot
to be done there.

I pushed this as is.  I hesitated about adding a regression test, but it
didn't seem worthwhile in the end, because if in the future we improve
scan key analysis, we will need much more extensive testing, and this
doesn't look like the type of bug that we could reintroduce in a whim.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Failing assertions in indxpath.c, placeholder.c and brin_minmax.c

From
Greg Stark
Date:
On Sun, Jul 26, 2015 at 11:15 PM, Peter Geoghegan <pg@heroku.com> wrote:
> The SQLite people have been using a tool like this for some time.
> They've also had luck finding bugs with a generic fuzz-testing tool
> called "american fuzzy lop" (yes, seriously, that's what it's called),
> which apparently is the state of the art.
>
> I myself ran that tool against Postgres. I didn't spend enough time to
> tweak it in a way that might have been effective. I also didn't figure
> out a way to make iterations fast enough for the tool to be effective,
> because I was invoking Postgres in single-user mode. I might pick it
> up again in the future, but probably for a more targeted case.

I've been poking at this. Like you I ran AFL against Postgres in
single-user mode. It was slowed down by the Postgres startup time and
worse, it was very limiting being in single-user-mode.

What I think holds more promise is Libfuzzer from LLVM. It's not as
clever about generating test cases and it doesn't have a pretty curses
interface, but it's much more amenable to integrating into a
client/server architecture.

In fact this is the interface I have now:

stark=> SELECT fuzz(1000000, 'select $1::timestamptz')

Pretty slick, eh? :)

It's also blindingly fast. Even with a subtransaction around each
fuzzing call it's still getting up to 5-8k/s executions for simpler
functions. The main disadvantage is that it's more fragile because
it's not isolated from the program it's testing. So a bug that causes
infinite recursion or consumes lots of memory can cause it to crash
and it can take a while to do so.

I hope to package it up as a contrib module but it does require the
Libfuzzer library which is built as part of LLVM but only when you
build LLVM itself with the coverage sanitizer so it doesn't end up in
any llvm binary packages. And of course building LLVM takes forever
and a day... Maybe I'll include a copy of the llvm source files in the
module -- the license looks to be compatible.

Anyways, first non-security trophy it found:

stark=# SELECT 'doy'::timestamptz;
ERROR:  unexpected dtype 33 while parsing timestamptz "doy"
LINE 1: SELECT 'doy'::timestamptz;              ^
stark=# SELECT 'dow'::timestamptz;
ERROR:  unexpected dtype 32 while parsing timestamptz "dow"
LINE 1: SELECT 'dow'::timestamptz;              ^

This looks to be a quite an old bug. It dates to 2000 and the code
hasn't been touched since Tom wrote it -- not that Tom but Tom
Lockhart. I haven't quite grokked the ParseDateTime code but it looks
to me like the "dow" and "doy" keywords are miscategorized and should
be type UNITS not type RESERV. The corresponding code in extract_date
would then have to move to the other branch leaving "epoch" as the
only reserved word which serves double duty as a unit in extract date.

-- 
greg