Re: BRIN cost estimate - Mailing list pgsql-hackers

From Emre Hasegeli
Subject Re: BRIN cost estimate
Date
Msg-id CAE2gYzz4FhRV17090-mbFXx8XekPnCHk1qD71q5swC_hX6ePZw@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] BRIN cost estimate  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List pgsql-hackers
> Good point. That's wrong, but I'm confused at why you kept the:
> + *indexTotalCost += selec * numTuples * cpu_index_tuple_cost;
>
> at all if that's the case. All the BRIN scan is going to do is build a
> bitmap of the matching ranges found.

My mind was not clear when I was working on it a year ago.

I've ended up with:

+ /*
+ * Charge a small amount per range tuple which we expect to match to. This
+ * is meant to reflect the costs of manipulating the bitmap. The BRIN scan
+ * will set a bit for each page in the range when we find a matching
+ * range, so we must multiply the charge by the number of pages in the
+ * range.
+ */
+ *indexTotalCost += 0.1 * cpu_operator_cost * estimatedRanges *
+ pagesPerRange;

Isn't BRIN executes the operator only once per range?  I think we can just multiply cpu_operator_cost and estimatedRanges.

I also noticed that you were doing:

+ if (get_index_stats_hook &&
+ (*get_index_stats_hook) (root, index->indexoid, 1, &vardata))

and

+ vardata.statsTuple = SearchSysCache3(STATRELATTINH,
+ ObjectIdGetDatum(index->indexoid),
+ Int16GetDatum(1),

I wondered why you picked to hardcode that as 1, and I thought that's
surely a bug. But then looking deeper it seems to be copied from
btcostestimate(), which also seems to be a long-standing bug
introduced in a536ed53bca40cb0d199824e358a86fcfd5db7f2. I'll go write
a patch for that one now.

Yes, I copy-pasted from btcostestimate().

I do still have concerns about how the correlation value is used, and
the fact that we use the highest value, but then the whole use of this
value is far from perfect, and is just a rough indication of how
things are.

I agree.  I tried to document how incomplete it is on the comments I wrote to help future developers improve the situation.

pgsql-hackers by date:

Previous
From: Kyotaro HORIGUCHI
Date:
Subject: Re: [COMMITTERS] pgsql: Collect and use multi-columndependency stats
Next
From: Kyotaro HORIGUCHI
Date:
Subject: Re: Interval for launching the table sync worker