Thread: BUG #7510: Very bad costing estimation on gin vs gist with FTS

BUG #7510: Very bad costing estimation on gin vs gist with FTS

From
daniel@heroku.com
Date:
The following bug has been logged on the website:

Bug reference:      7510
Logged by:          Daniel Farina
Email address:      daniel@heroku.com
PostgreSQL version: 9.1.4
Operating system:   Ubuntu 10.04
Description:        =


Summary: Planner chooses GiST even if GIN is much better.

We have a table that we decided to use GiST-based full text search on, but
received terrible performance.  It's not a very big table, nor are the
tsvectors very large -- we FTS a tiny bit of text and throw in a few
identifiers at our choosing to enable a search in an application of ours.

The root cause of that is that GiST is terrible when using prefix matching
operators on tsvectors (the ":*" operator in the search language), which is
what one nominally wants for incremental search.

  Upon doing "EXPLAIN (ANALYZE, BUFFERS, FORMAT YAML)", this is what the the
plan looks like:


---------------------------------------------------------------------------=
----------
 - Plan:                                                                    =

       +
     Node Type: "Limit"                                                     =

       +
     Startup Cost: 0.00                                                     =

       +
     Total Cost: 14.48                                                      =

       +
     Plan Rows: 6                                                           =

       +
     Plan Width: 240                                                        =

       +
     Actual Startup Time: 499.515                                           =

       +
     Actual Total Time: 499.515                                             =

       +
     Actual Rows: 0                                                         =

       +
     Actual Loops: 1                                                        =

       +
     Shared Hit Blocks: 269246                                              =

       +
     Shared Read Blocks: 0                                                  =

       +
     Shared Written Blocks: 0                                               =

       +
     Local Hit Blocks: 0                                                    =

       +
     Local Read Blocks: 0                                                   =

       +
     Local Written Blocks: 0                                                =

       +
     Temp Read Blocks: 0                                                    =

       +
     Temp Written Blocks: 0                                                 =

       +
     Plans:                                                                 =

       +
       - Node Type: "Index Scan"                                            =

       +
         Parent Relationship: "Outer"                                       =

       +
         Scan Direction: "NoMovement"                                       =

       +
         Index Name: "resources_text_searchable_idx"                        =

       +
         Relation Name: "resources"                                         =

       +
         Alias: "resources"                                                 =

       +
         Startup Cost: 0.00                                                 =

       +
         Total Cost: 14.48                                                  =

       +
         Plan Rows: 6                                                       =

       +
         Plan Width: 240                                                    =

       +
         Actual Startup Time: 499.511                                       =

       +
         Actual Total Time: 499.511                                         =

       +
         Actual Rows: 0                                                     =

       +
         Actual Loops: 1                                                    =

       +
         Index Cond: "(search_document @@ '''daniel'':* &
''heroku.c'':*'::tsquery)"+
         Shared Hit Blocks: 269246                                          =

       +
         Shared Read Blocks: 0                                              =

       +
         Shared Written Blocks: 0                                           =

       +
         Local Hit Blocks: 0                                                =

       +
         Local Read Blocks: 0                                               =

       +
         Local Written Blocks: 0                                            =

       +
         Temp Read Blocks: 0                                                =

       +
         Temp Written Blocks: 0                                             =

       +
   Triggers:                                                                =

       +
   Total Runtime: 499.571

What's notable here is that the shared hit blocks is about 2GB worth of
data.  The index per \di+ is only about 350MB, and the table per \dt+ is
only ~400MB, so I'm not sure precisely what the deal there is, but
regardless, this GiST index might be no better for I/O than a sequential
scan.  It is probably worse.

If one adds a GIN index and does a fresh analyze, the planner still produce
a plan for the GiST index.  Because there is no way to disable particular
indexes in a session, it's impossible to quickly experiment with a new
hypothetical situation with only the GIN index without possibly painting
yourself into a corner where you've dropped a needed index.  We forked the
database and did our experiment there, removing the GiST index as to remove
its access pattern.  Here's the new EXPLAIN, which, to get to the punch,
only scans 4 pages, or 32K, and is very fast as a result.


                                    QUERY PLAN                              =

      =

---------------------------------------------------------------------------=
--------
 - Plan:                                                                    =

     +
     Node Type: "Limit"                                                     =

     +
     Startup Cost: 13.98                                                    =

     +
     Total Cost: 105.60                                                     =

     +
     Plan Rows: 50                                                          =

     +
     Plan Width: 240                                                        =

     +
     Actual Startup Time: 0.092                                             =

     +
     Actual Total Time: 0.303                                               =

     +
     Actual Rows: 50                                                        =

     +
     Actual Loops: 1                                                        =

     +
     Shared Hit Blocks: 51                                                  =

     +
     Shared Read Blocks: 0                                                  =

     +
     Shared Written Blocks: 0                                               =

     +
     Local Hit Blocks: 0                                                    =

     +
     Local Read Blocks: 0                                                   =

     +
     Local Written Blocks: 0                                                =

     +
     Temp Read Blocks: 0                                                    =

     +
     Temp Written Blocks: 0                                                 =

     +
     Plans:                                                                 =

     +
       - Node Type: "Bitmap Heap Scan"                                      =

     +
         Parent Relationship: "Outer"                                       =

     +
         Relation Name: "resources"                                         =

     +
         Alias: "resources"                                                 =

     +
         Startup Cost: 13.98                                                =

     +
         Total Cost: 2352.21                                                =

     +
         Plan Rows: 1276                                                    =

     +
         Plan Width: 240                                                    =

     +
         Actual Startup Time: 0.089                                         =

     +
         Actual Total Time: 0.238                                           =

     +
         Actual Rows: 50                                                    =

     +
         Actual Loops: 1                                                    =

     +
         Recheck Cond: "(search_document @@
'''daniel@heroku.com'':*'::tsquery)"  +
         Shared Hit Blocks: 51                                              =

     +
         Shared Read Blocks: 0                                              =

     +
         Shared Written Blocks: 0                                           =

     +
         Local Hit Blocks: 0                                                =

     +
         Local Read Blocks: 0                                               =

     +
         Local Written Blocks: 0                                            =

     +
         Temp Read Blocks: 0                                                =

     +
         Temp Written Blocks: 0                                             =

     +
         Plans:                                                             =

     +
           - Node Type: "Bitmap Index Scan"                                 =

     +
             Parent Relationship: "Outer"                                   =

     +
             Index Name: "experiment_ginnish"                               =

     +
             Startup Cost: 0.00                                             =

     +
             Total Cost: 13.91                                              =

     +
             Plan Rows: 1276                                                =

     +
             Plan Width: 0                                                  =

     +
             Actual Startup Time: 0.071                                     =

     +
             Actual Total Time: 0.071                                       =

     +
             Actual Rows: 75                                                =

     +
             Actual Loops: 1                                                =

     +
             Index Cond: "(search_document @@
'''daniel@heroku.com'':*'::tsquery)"+
             Shared Hit Blocks: 4                                           =

     +
             Shared Read Blocks: 0                                          =

     +
             Shared Written Blocks: 0                                       =

     +
             Local Hit Blocks: 0                                            =

     +
             Local Read Blocks: 0                                           =

     +
             Local Written Blocks: 0                                        =

     +
             Temp Read Blocks: 0                                            =

     +
             Temp Written Blocks: 0                                         =

     +
   Triggers:                                                                =

     +
   Total Runtime: 0.395
(1 row)

So this is an interesting result.

Ideally the planner could ask the FTS functions to cost the query against
the index before choosing it, but more pressingly it shows that anyone who
wants incremental search is in for *much* worse performance than the
"roughly 3x" performance difference advertised as a rule of thumb in the
docs, and without knowing how to dissect a fairly advanced form of EXPLAIN
they'd be very hard pressed to find out, because the index appears to be in
use (and in fact, it is, it's just terrible).

Re: BUG #7510: Very bad costing estimation on gin vs gist with FTS

From
Tom Lane
Date:
daniel@heroku.com writes:
> If one adds a GIN index and does a fresh analyze, the planner still produce
> a plan for the GiST index.  Because there is no way to disable particular
> indexes in a session, it's impossible to quickly experiment with a new
> hypothetical situation with only the GIN index without possibly painting
> yourself into a corner where you've dropped a needed index.

FWIW, there is a pretty standard workaround for that:

    begin;
    drop index unwanted_index;
    explain ...;
    rollback;

This isn't ideal in a production database because it requires exclusive
lock on the table for long enough to run the EXPLAIN.  But it's not true
that there's no way to handle this at all.  If you want something more
flexible, I'd suggest working on improving the "index advisor" plugin
that was getting batted around a couple years ago.  There are sufficient
hooks in the planner to let it be given an arbitrary hypothetical set of
indexes.  But I digress...

Anyway, the meat of your complaint is that the planner is overestimating
the cost of a GIN scan relative to a GIST scan.  I believe the reason
for this is that gincostestimate() is trying to make a fairly honest
estimate of the work involved, whereas gistcostestimate() is just a stub
around genericcostestimate(), which computes an estimate that's more or
less suitable for btree-equivalent index operations.  There's certainly
not any intelligence in the latter that would be capable of dealing with
issues like how much a tsvector "*" operator is going to hurt.  We could
stand to have less bogus estimates for GIST (not to mention SPGIST), but
I'm really not familiar enough with either to write better code for that.

It also seems possible that you've tripped over a plain old performance
bug in the GIST code, ie, the fault is not with the estimate but the
reality.  It's hard to tell about that though.  Do you want to try to
make up a smaller self-contained test case?

            regards, tom lane