Re: anti-join chosen even when slower than old plan - Mailing list pgsql-performance

From Tom Lane
Subject Re: anti-join chosen even when slower than old plan
Date
Msg-id 19796.1289504156@sss.pgh.pa.us
Whole thread Raw
In response to Re: anti-join chosen even when slower than old plan  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: anti-join chosen even when slower than old plan  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
Re: anti-join chosen even when slower than old plan  (Robert Haas <robertmhaas@gmail.com>)
Re: anti-join chosen even when slower than old plan  (Cédric Villemain <cedric.villemain.debian@gmail.com>)
List pgsql-performance
Robert Haas <robertmhaas@gmail.com> writes:
> Yeah.  For Kevin's case, it seems like we want the caching percentage
> to vary not so much based on which table we're hitting at the moment
> but on how much of it we're actually reading.

Well, we could certainly take the expected number of pages to read and
compare that to effective_cache_size.  The thing that's missing in that
equation is how much other stuff is competing for cache space.  I've
tried to avoid having the planner need to know the total size of the
database cluster, but it's kind of hard to avoid that if you want to
model this honestly.

Would it be at all workable to have an estimate that so many megs of a
table are in cache (independently of any other table), and then we could
scale the cost based on the expected number of pages to read versus that
number?  The trick here is that DBAs really aren't going to want to set
such a per-table number (at least, most of the time) so we need a
formula to get to a default estimate for that number based on some simple
system-wide parameters.  I'm not sure if that's any easier.

BTW, it seems that all these variants have an implicit assumption that
if you're reading a small part of the table it's probably part of the
working set; which is an assumption that could be 100% wrong.  I don't
see a way around it without trying to characterize the data access at
an unworkably fine level, though.

            regards, tom lane

pgsql-performance by date:

Previous
From: "Kevin Grittner"
Date:
Subject: Re: anti-join chosen even when slower than old plan
Next
From: Tom Lane
Date:
Subject: Re: anti-join chosen even when slower than old plan