Re: Why does the query planner use two full indexes, when a dedicated partial index exists? - Mailing list pgsql-performance

From Jeff Janes
Subject Re: Why does the query planner use two full indexes, when a dedicated partial index exists?
Date
Msg-id CAMkU=1yymr-iMMGtu_nCoQ_GpYUzYwjT2OFQ8ftkEONvdH1srg@mail.gmail.com
Whole thread Raw
In response to Re: Why does the query planner use two full indexes, when a dedicated partial index exists?  (Jeff Janes <jeff.janes@gmail.com>)
List pgsql-performance
On Thursday, December 20, 2012, Jeff Janes wrote:
On Thursday, December 20, 2012, Tom Lane wrote:

What I did to try to duplicate Richard's situation was to create a test
table in which all the exit_state values were NULL, then build the
index, then UPDATE all but a small random fraction of the rows to 1,
then vacuum.  This results in a rather bloated partial index, but I
think that's probably what he's got given that every record initially
enters the table with NULL exit_state.  It would take extremely frequent
vacuuming to keep the partial index from accumulating a lot of dead
entries.


Once I cranked up default_statistics_target, I could start reproducing the very high estimates (5000) for the partial index in 9.1.

As you say, switching to 9.2 or above lowers it quite a bit, I still get some pretty high estimates, ~100 when 8 would be more accurate.

The problem is in genericcostestimate

    if (index->pages > 1 && index->tuples > 1)
        numIndexPages = ceil(numIndexTuples * index->pages / index->tuples);

The index->pages should probably not include index pages which are empty.  Even with aggressive vacuuming, most of the pages in the partial index seem to be empty at any given time.

However, I don't know if that number is exposed readily.  And it seems to be updated too slowly to be useful, if pg_freespace is to be believed.

But I wonder if it couldn't be clamped to so that we there can be no more pages than there are tuples.

        numIndexPages = ceil(numIndexTuples * Min(1,index->pages / index->tuples));


 Cheers,

Jeff

pgsql-performance by date:

Previous
From: Jeff Janes
Date:
Subject: Re: backend suddenly becomes slow, then remains slow
Next
From: Vitalii Tymchyshyn
Date:
Subject: Re: Performance on Bulk Insert to Partitioned Table