AW: AW: AW: More Performance - Mailing list pgsql-hackers

From Zeugswetter Andreas SB
Subject AW: AW: AW: More Performance
Date
Msg-id 219F68D65015D011A8E000006F8590C604AF7DA9@sdexcsrv1.f000.d0188.sd.spardat.at
List pgsql-hackers
> Zeugswetter Andreas SB <ZeugswetterA@Wien.Spardat.at> writes:
> > But, it probably shows a problem with the chosen metric for
> > selectivity itself.  Imho the chances are better, that an =
> > restriction will return an equal amount of rows while the table grows
> > than that it will return a percentage of total table size.
> 
> Unfortunately you are allowing your thinking to be driven by a single
> example.  Consider queries like
>     select * from employees where dept = 'accounting'; 
> It's perfectly possible that the column being tested with '=' has only
> a small number of distinct values, in which case the number of retrieved
> rows probably *is* proportional to the table size.
> 
> I am not willing to change the planner so that it "guarantees" to choose
> an indexscan no matter what, because then it would be broken for cases
> like this.  We have to look at the statistics we have, inadequate though
> they are.

Yes, this would not be good. But imho it would be good to force the index
if we lack disbursion information (no analyze) but do have table size and
index size info, and the index is small, since, as Vadim said, analyze is
very time consuming.
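
The rule proposed above could be sketched roughly as follows. This is only
an illustration in Python, not planner code: the function name, its
parameters, and the 10% cutoff for a "small" index are all hypothetical,
since the message does not say what "small" should mean.

```python
from typing import Optional


def should_force_index(
    has_dispersion_stats: bool,
    table_pages: Optional[int],
    index_pages: Optional[int],
    small_index_fraction: float = 0.1,  # hypothetical cutoff, not from the message
) -> bool:
    """Return True when the proposed fallback would force an index scan.

    The fallback applies only when no dispersion statistics exist
    (no analyze has been run) but both sizes are known.
    """
    if has_dispersion_stats:
        # Statistics exist, so the normal selectivity estimate is used.
        return False
    if table_pages is None or index_pages is None:
        # Without both sizes there is nothing to base the fallback on.
        return False
    if table_pages <= 0:
        return False
    # "Small" is taken here to mean small relative to the table.
    return index_pages <= small_index_fraction * table_pages
```

With these assumed defaults, a 50-page index on a 1000-page table without
statistics would be forced, while a 500-page index on the same table would
not.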

Actually, could index size compared with colsize*rowcount be an indicator
for disbursion? At least for fixed-length columns?
big index --> very unique
small index --> many duplicates
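
The comparison asked about above amounts to a single ratio. A minimal
sketch in Python, with hypothetical names (index_bytes, col_size,
row_count), and assuming an index that stores duplicate keys more
compactly than distinct ones, since otherwise the ratio carries no
information about duplicates:

```python
def index_size_ratio(index_bytes: int, col_size: int, row_count: int) -> float:
    """Compare index size with colsize * rowcount for a fixed-length column.

    Under the assumption stated above, a ratio near or above 1.0 would
    suggest mostly distinct values, and a ratio well below 1.0 would
    suggest many duplicates. Where the line between the two lies is not
    specified in the message.
    """
    if col_size <= 0 or row_count <= 0:
        raise ValueError("col_size and row_count must be positive")
    if index_bytes < 0:
        raise ValueError("index_bytes must not be negative")
    return index_bytes / (col_size * row_count)
```

For a 4-byte column over 1000 rows, a 4000-byte index gives a ratio of
1.0 and a 400-byte index gives 0.1. Real indexes also carry per-entry and
per-page overhead, which this sketch ignores.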

Andreas

