AW: AW: More Performance - Mailing list pgsql-hackers

From Zeugswetter Andreas SB
Subject AW: AW: More Performance
Date
Msg-id 219F68D65015D011A8E000006F8590C604AF7D9C@sdexcsrv1.f000.d0188.sd.spardat.at
Whole thread Raw
Responses Re: AW: AW: More Performance  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
> We could ensure that the planner still picks an indexscan as the known
> table size grows by reducing the default selectivity estimate a little
> bit (I experimented and determined that 0.005 would do the trick, for
> the current cost model parameters).  That's pretty ad-hoc, but then
> the 0.01 number is pretty ad-hoc too.  It's probably better to be able
> to say "if you haven't done VACUUM ANALYZE, you will get an indexscan
> from WHERE col = const" than to have to say "it depends".  Comments?

Imho the initial goal why we said 0.01, was to make it use the index,
so reducing it to 0.005 would be ok. I would actually try to calculate 
the value with the current costs for index vs seq scan, so that it
guarantees
use of the index regardless of table size.
But, it probably shows a problem with the chosen metric for selectivity
itself.
Imho the chances are better, that an = restriction will return an equal
amount 
of rows while the table grows than that it will return a percentage of total
table size.

> 
> Of course the real issue here is that the true selectivity of '=' is
> much smaller in this table, because the column being looked at is
> unique.  But the planner doesn't know that without VACUUM stats.
> 
> A hack I have been thinking about adding is that CREATE UNIQUE INDEX
> for a single-column index should immediately force the attdisbursion
> value for that column to "unique", so that the planner would know the
> column is unique even without VACUUM ANALYZE.  That would help not
> at all for the MySQL benchmark (it has a two-column unique index,
> but you can't conclude anything about the properties of either column
> individually from that :-().  But it'd help in a lot of real-world
> scenarios.

Yes, that would imho be a real winner. 
For the multi column index we would need some magic that actually notices 
that all index columns are restricted with =. 

Andreas


pgsql-hackers by date:

Previous
From: Tatsuo Ishii
Date:
Subject: Re: Solaris 2.6 problems
Next
From: Zeugswetter Andreas SB
Date:
Subject: AW: More Performance