Re: Partial hash index is not used for implied qual. - Mailing list pgsql-hackers

From Sergei Glukhov
Subject Re: Partial hash index is not used for implied qual.
Date
Msg-id cb0b9982-ca1a-4f03-9d7f-f9ca678be35c@postgrespro.ru
Whole thread Raw
In response to Re: Partial hash index is not used for implied qual.  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Hi,

On 11/25/25 6:01 AM, Tom Lane wrote:
> I wrote:
>> Wouldn't it be better to handle it more like the is_target_rel logic
>> a few lines further up?
> Actually, after thinking a bit longer, it'd be better to do something
> like the attached so that we don't keep redundant quals unless they'd
> *all* be excluded.
>
> There's definitely something fishy about the costing though.
> I experimented with this variant of Sergei's example:
>
> regression=# CREATE TABLE hash_partial(x) AS SELECT x % 100 as y from generate_series(1, 1000) as x;
> SELECT 1000
> regression=# ANALYZE hash_partial;
> ANALYZE
> regression=# CREATE INDEX partial_idx ON hash_partial USING hash(x) WHERE x = 1;
> CREATE INDEX
> regression=# set enable_seqscan TO 0;  -- else we'll go for a seqscan
> SET
> regression=# EXPLAIN SELECT x FROM hash_partial WHERE x = 1;
>                                   QUERY PLAN
> ----------------------------------------------------------------------------
>   Bitmap Heap Scan on hash_partial  (cost=24.08..32.56 rows=10 width=4)
>     Recheck Cond: (x = 1)
>     ->  Bitmap Index Scan on partial_idx  (cost=0.00..24.07 rows=10 width=0)
>           Index Cond: (x = 1)
> (4 rows)
>
> regression=# drop index partial_idx;
> DROP INDEX
> regression=# CREATE INDEX ON hash_partial USING hash(x);
> CREATE INDEX
> regression=# EXPLAIN SELECT x FROM hash_partial WHERE x = 1;
>                                      QUERY PLAN
> ----------------------------------------------------------------------------------
>   Bitmap Heap Scan on hash_partial  (cost=4.08..12.56 rows=10 width=4)
>     Recheck Cond: (x = 1)
>     ->  Bitmap Index Scan on hash_partial_x_idx  (cost=0.00..4.08 rows=10 width=0)
>           Index Cond: (x = 1)
> (4 rows)
>
> Why are we thinking that a non-partial index would be substantially
> cheaper to scan?  That seems surely wrong, and it runs counter to my
> intuition about why this fix is incomplete.  (I expected an unfair
> bias towards the partial index, not against it.)
>
>             regards, tom lane
>

Thanks for the fix. It seems there is another case for investigation:

DROP TABLE hash_partial;
CREATE TABLE hash_partial(x, y) AS
SELECT x, x + x as y from generate_series(1, 1000) as x;
ANALYZE hash_partial;
CREATE INDEX partial_idx  ON hash_partial USING hash(x) WHERE x = 1;
SET enable_seqscan TO 0;
EXPLAIN SELECT x FROM hash_partial WHERE x = 1 and y < 0;
--------------------------------------------------------------------------------
Seq Scan on hash_partial  (cost=0.00..23.00 rows=1 width=4)
    Disabled: true
    Filter: ((y < 0) AND (x = 1))
(3 rows)


  Regarding strangeness of the cost,
  cost is depends on numIndexPages and
  in genericcostestimate() we calulate numIndexPages:

  numIndexPages = ceil(numIndexTuples * index->pages / index->tuples);

  For non-partial index index->pages = 6 and index->tuples = 1000
  and for partial index index->pages = 6 and index->tuples = 10.
  Number of pages depends on index relation size and
  initial size is 6 * BLCKSZ for both, partial and non-partial hash indexes
  Initial size of the hash index relation, in turn,
  depends on total number of tuples in the table.

  Regards,
  Sergei Glukhov





pgsql-hackers by date:

Previous
From: "Zhijie Hou (Fujitsu)"
Date:
Subject: RE: How can end users know the cause of LR slot sync delays?
Next
From: Xuneng Zhou
Date:
Subject: Re: Implement waiting for wal lsn replay: reloaded