Re: Hash-based MCV matching for large IN-lists - Mailing list pgsql-hackers

From Zsolt Parragi
Subject Re: Hash-based MCV matching for large IN-lists
Date
Msg-id CAN4CZFO3Y25iCqqP_zS1ipgbrBXvAkxeLK2hPuamddyW9ouAzQ@mail.gmail.com
In response to Re: Hash-based MCV matching for large IN-lists  (Ilia Evdokimov <ilya.evdokimov@tantorlabs.com>)
Responses Re: Hash-based MCV matching for large IN-lists
List pgsql-hackers
Hello!

+    if (vardata.isunique && vardata.rel && vardata.rel->tuples >= 1.0)
+    {
+        s2 = 1.0 / vardata.rel->tuples;
+        if (HeapTupleIsValid(vardata.statsTuple))
+        {
+            Form_pg_statistic stats = (Form_pg_statistic) GETSTRUCT(vardata.statsTuple);
+            if (isInequality)
+                s2 = 1.0 - s2 - stats->stanullfrac;
+        }
+    }


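Isn't there a corner case where this ordering of the if conditions
returns an incorrect estimate (a regression)? Without a statistics
tuple the isInequality adjustment is skipped, so s2 stays at
1.0 / tuples even for <>.
See the following test: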

CREATE TABLE test AS SELECT generate_series(1, 1000) AS id;
CREATE UNIQUE INDEX ON test(id);
-- no ANALYZE

EXPLAIN SELECT * FROM test WHERE id <> ALL(ARRAY[1, 2, 3]);
-- Actual:   rows=1
-- Expected: rows=997

ANALYZE test;
EXPLAIN SELECT * FROM test WHERE id <> ALL(ARRAY[1, 2, 3]);
-- Correct: rows=997

DROP TABLE test;
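
To make the arithmetic behind the two row counts explicit, here is a small standalone sketch in plain C. It is not PostgreSQL code: the function names are made up for illustration, the row estimate of 1000 is assumed, and the way the per-element selectivities are combined for <> ALL (product, replaced by the "disjoint" sum when that lands in [0, 1]) is my simplified reading of scalararraysel(). The reordered variant, which treats the null fraction as 0 when there are no statistics, is only one possible fix and not necessarily what the patch should do.

```c
#include <stdbool.h>

/*
 * Per-element selectivity as computed by the quoted hunk.
 * have_stats stands in for HeapTupleIsValid(vardata.statsTuple);
 * nullfrac stands in for stats->stanullfrac.
 */
static double
per_element_sel_as_posted(double tuples, bool have_stats,
                          double nullfrac, bool is_inequality)
{
    double s2 = 1.0 / tuples;

    /* The inversion for <> only happens inside the stats check. */
    if (have_stats)
    {
        if (is_inequality)
            s2 = 1.0 - s2 - nullfrac;
    }
    return s2;
}

/*
 * Hypothetical reordering: invert for <> even without statistics,
 * assuming a null fraction of 0 in that case.
 */
static double
per_element_sel_reordered(double tuples, bool have_stats,
                          double nullfrac, bool is_inequality)
{
    double s2 = 1.0 / tuples;

    if (is_inequality)
        s2 = 1.0 - s2 - (have_stats ? nullfrac : 0.0);
    return s2;
}

/*
 * Simplified model of combining nelems identical per-element
 * selectivities for "<> ALL": AND them together, but prefer the
 * disjoint-probability estimate when it is within [0, 1].
 */
static double
not_equal_all_sel(double s2, int nelems)
{
    double s1 = 1.0;
    double s1disjoint = 1.0;

    for (int i = 0; i < nelems; i++)
    {
        s1 *= s2;
        s1disjoint += s2 - 1.0;
    }
    if (s1disjoint >= 0.0 && s1disjoint <= 1.0)
        s1 = s1disjoint;
    return s1;
}
```

With tuples = 1000 and three array elements, the as-posted logic without statistics gives a per-element selectivity of 0.001 and a combined selectivity of about 1e-9, which the planner would clamp to one row. With the inversion applied (either because statistics exist or via the reordering), the per-element selectivity is 0.999 and the combined selectivity is 0.997, i.e. 997 rows.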


