Re: Where does the time go? - Mailing list pgsql-hackers

From Martijn van Oosterhout
Subject Re: Where does the time go?
Msg-id 20060325175526.GE1695@svana.org
In response to Re: Where does the time go?  (Simon Riggs <simon@2ndquadrant.com>)
List pgsql-hackers
On Sat, Mar 25, 2006 at 05:38:26PM +0000, Simon Riggs wrote:
> On Sat, 2006-03-25 at 16:24 +0100, Martijn van Oosterhout wrote:
>
> > I agree. However, if it's the overhead of calling gettimeofday() that
> > slows everything down, perhaps we should tackle that end. For example,
> > have a sampling mode that only times say 5% of the executed nodes.
> >
> > EXPLAIN ANALYZE SAMPLE blah;
>
> I like this idea. Why not do this all the time? I'd say we don't need
> the SAMPLE clause at all, just do this for all EXPLAIN ANALYZEs.

I was wondering about that. But then you may run into weird results if
a subselect takes a long time for just a few values. But maybe it should
be the default, with a FULL mode to say you want to measure
everything.
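
To make the "weird results" worry concrete, here is a minimal sketch in
plain C, not PostgreSQL code. It assumes the simplest possible sampler,
one that times every Nth execution and scales the result up; the function
names and the cost array are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Estimate the total cost of n executions by "timing" only every
 * stride-th one and scaling the sampled mean up to n.
 * (Hypothetical helper; stride = 20 corresponds to a 5% sample.)
 */
static double
estimate_total(const double *cost, size_t n, size_t stride)
{
    double      sum = 0.0;
    size_t      sampled = 0;

    assert(stride > 0);
    for (size_t i = 0; i < n; i += stride)
    {
        sum += cost[i];
        sampled++;
    }
    return sampled ? (sum / (double) sampled) * (double) n : 0.0;
}

/* The figure a FULL mode would report: every execution is measured. */
static double
true_total(const double *cost, size_t n)
{
    double      sum = 0.0;

    for (size_t i = 0; i < n; i++)
        sum += cost[i];
    return sum;
}
```

If 1000 executions cost 1 unit each except for three that cost 1000, and
none of the three falls on a sampled position, the 5% estimate comes out
at 1000 units while the real total is 3997. A FULL mode would not have
that blind spot.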

> Something even simpler? First 40 plus 5% random sample after that? I'd
> prefer a random sample so we have the highest level of trust in the
> numbers produced. Otherwise we might accidentally introduce bias from
> systematic effects such as nested loops queries speeding up towards the
> end of their run. (I know we would do that at the start, but we are
> stuck because we don't know the population size ahead of time and we
> know we need a reasonable number of data points).

Well, I was wondering if a fixed percentage was appropriate. 5% of 10
million is still a lot for possibly not a lot of benefit. The followup
email suggested sampling that happens less often as the number of
tuples increases, in a logarithmic way. But if we could add some
randomness, that'd be cool. The question is, what's the overhead of
calling random()?
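
A rough sketch of how those ideas might combine, in plain C rather than
actual executor code: always time the first 40 executions, then time
execution number n with probability 40/n, so the number of timed
executions grows roughly logarithmically with the tuple count. To sidestep
the cost of random(), the sketch uses an inline xorshift32 generator.
That substitution is my assumption, as are all the names (SampleState,
sample_should_time, ALWAYS_TIME_FIRST); nothing like this exists in the
tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ALWAYS_TIME_FIRST 40    /* hypothetical; from the "first 40" idea */

typedef struct SampleState
{
    uint64_t    ntuples;        /* executions seen so far */
    uint64_t    nsampled;       /* executions we chose to time */
    uint32_t    rng;            /* xorshift32 state, must be nonzero */
} SampleState;

static void
sample_init(SampleState *st, uint32_t seed)
{
    st->ntuples = 0;
    st->nsampled = 0;
    st->rng = seed ? seed : 1;  /* xorshift gets stuck on a zero state */
}

/* Marsaglia's xorshift32: three shifts and xors, no function call. */
static uint32_t
xorshift32(uint32_t *state)
{
    uint32_t    x = *state;

    assert(x != 0);
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/*
 * Decide whether the next execution should be timed.
 * The modulo has a slight bias and assumes counts well below 2^32;
 * good enough for a sketch.
 */
static bool
sample_should_time(SampleState *st)
{
    bool        take;

    st->ntuples++;
    if (st->ntuples <= ALWAYS_TIME_FIRST)
        take = true;
    else
        take = (xorshift32(&st->rng) % st->ntuples) < ALWAYS_TIME_FIRST;

    if (take)
        st->nsampled++;
    return take;
}
```

With this rule a million executions would be timed on the order of a few
hundred times, instead of the 50,000 a flat 5% would give. Since the
sampling probability is not constant, any scaled-up total would
presumably have to weight each later sample by ntuples / 40 at the time
it was taken.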

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.
