Re: Fix gin index cost estimation - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Fix gin index cost estimation
Date
Msg-id 4157141.1662676376@sss.pgh.pa.us
Whole thread Raw
In response to Fix gin index cost estimation  (Ronan Dunklau <ronan.dunklau@aiven.io>)
Responses Re: Fix gin index cost estimation
List pgsql-hackers
Ronan Dunklau <ronan.dunklau@aiven.io> writes:
> The problem I'm trying to solve is that, contrary to btree, gist and sp-gist
> indexes, gin indexes do not charge any cpu-cost for descending the entry tree.

I looked this over briefly.  I think you are correct to charge an
initial-search cost per searchEntries count, but don't we also need to
scale up by arrayScans, similar to the "corrections for cache effects"?

+     * We model index descent costs similarly to those for btree, but we also
+     * need an idea of the tree_height.
+     * We use numEntries / numEntryPages as the fanout factor.

I'm not following that calculation?  It seems like it'd be correct
only for a tree height of 1, although maybe I'm just misunderstanding
this (overly terse, perhaps) comment.

+     * We charge descentCost once for every entry
+     */
+    if (numTuples > 1)
+    {
+        descentCost = ceil(log(numTuples) / log(2.0)) * cpu_operator_cost;
+        *indexStartupCost += descentCost * counts.searchEntries;
+    }

I had to read this twice before absorbing the point of the numTuples
test.  Maybe help the reader a bit:

+    if (numTuples > 1)                /* ensure positive log() */

Personally I'd duplicate the comments from nbtreecostestimate rather
than just assuming the reader will go consult them.  For that matter,
why didn't you duplicate nbtree's logic for charging for SA scans?
This bit seems just as relevant for GIN:

     * If there are ScalarArrayOpExprs, charge this once per SA scan.  The
     * ones after the first one are not startup cost so far as the overall
     * plan is concerned, so add them only to "total" cost.

Keep in mind also that pgindent will have its own opinions about how to
format these comments, and it can out-stubborn you.  Either run the
comments into single paragraphs, or if you really want them to be two
paras then leave an empty comment line between.  Another formatting
nitpick is that you seem to have added a number of unnecessary blank
lines.

            regards, tom lane



pgsql-hackers by date:

Previous
From: Jacob Champion
Date:
Subject: Re: [PATCH] Log details for client certificate failures
Next
From: Tom Lane
Date:
Subject: Re: Reducing the chunk header sizes on all memory context types