Re: A costing analysis tool - Mailing list pgsql-hackers

From Martijn van Oosterhout
Subject Re: A costing analysis tool
Date
Msg-id 20051015223417.GF14113@svana.org
Whole thread Raw
In response to Re: A costing analysis tool  (Greg Stark <gsstark@mit.edu>)
Responses Re: A costing analysis tool
List pgsql-hackers
On Sat, Oct 15, 2005 at 05:53:45PM -0400, Greg Stark wrote:
> Martijn van Oosterhout <kleptog@svana.org> writes:
>
> > This is unfortunate because EXPLAIN ANALYZE is an immensly useful tool,
> > as far as it goes. I've pondered if some kind of userspace timing
> > mechanism could be introduced (possibly using builtin CPU cycle
> > counters) to reduce the cost. It does, however, remain a cost.
>
> I wonder if there's a good case for a version of explain analyze that runs the
> query and outputs the plan along with row counts but not timing for each row.
> You would still be able to see if the estimates are correct. And it would have
> basically no overhead so you could use it under a production environment.

That's an interesting thought. Just counting tuples and loops. Wouldn't
be too hard to implement. BTW, for those people thinking of using
hardware CPU cycle counters, it won't work. Think multiple CPUs,
systems with variable CPU clock sppeds and CPUs idling when there's
nothing to do. There is no useful relationship between a CPU cycle
counter and wall clock time over the sort of time intervals we're
interested in.

Interestingly, I notice the windows port of PostgreSQL uses the
QueryPerformanceCounter() function. I tried playing with it under linux
and found that Linux suspends the CPU while waiting for things to
happen. So:

sleep(1)                ~ 20 million cycles
busy loop for 1 second  ~ 800 million cycles (CPU speed)

So, what's good for battery and power usage is bad for accurate
timings. Basically, on Linux it would seriously underestimate the time
for blocking system calls on an otherwise idle system. So, it works for
Windows because they don't do this...

> > Given that you can see how many times gettimeday() was called, you may
> > be able to correct the error. I havn't tried that though.
>
> I tried, it seems like it should be trivial but I got bogged down in details.
> Removing profiling overhead is pretty standard for profilers to do though. It
> really seems to me like it ought to be done. It still wouldn't let you use
> EXPLAIN ANALYZE under production without a lot of overhead though.

It occured to me as I wrote it to try and build it into the system. But
I have no idea what kind of issues you might have run into. Any
references?

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

pgsql-hackers by date:

Previous
From: Greg Stark
Date:
Subject: Re: slow IN() clause for many cases
Next
From: Greg Stark
Date:
Subject: Re: A costing analysis tool