Re: [PATCH] Improve EXPLAIN ANALYZE overhead by sampling - Mailing list pgsql-patches

From Bruce Momjian
Subject Re: [PATCH] Improve EXPLAIN ANALYZE overhead by sampling
Date
Msg-id 200605301401.k4UE1nO08251@candle.pha.pa.us
In response to [PATCH] Improve EXPLAIN ANALYZE overhead by sampling  (Martijn van Oosterhout <kleptog@svana.org>)
Responses Re: [PATCH] Improve EXPLAIN ANALYZE overhead by sampling  (Martijn van Oosterhout <kleptog@svana.org>)
List pgsql-patches
Patch applied.  Thanks.

---------------------------------------------------------------------------


Martijn van Oosterhout wrote:
-- Start of PGP signed section.
> This was a suggestion made back in March that would dramatically reduce
> the overhead of EXPLAIN ANALYZE on queries that loop continuously over
> the same nodes.
>
> http://archives.postgresql.org/pgsql-hackers/2006-03/msg01114.php
>
> What it does is behave normally for the first 50 tuples of any node,
> but after that it starts sampling at ever-increasing intervals, with
> the intervals controlled by an exponential function. So for a node
> iterating over 1 million tuples it takes around 15,000 samples. The
> result is that EXPLAIN ANALYZE has a much reduced effect on the total
> execution time.
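
For illustration, here is a minimal C sketch of the sampling schedule described
above. The threshold, growth factor, and names are assumptions made for this
sketch, not the patch's actual code; the idea is that timings for unsampled
tuples are later extrapolated from the sampled ones.

    #include <stdbool.h>

    #define SAMPLE_THRESHOLD 50         /* always time the first 50 tuples */

    typedef struct SamplingState
    {
        double      tuplecount;         /* tuples seen so far in this loop */
        double      nextsample;         /* tuple number to sample next */
    } SamplingState;

    /* Decide whether the timing code should run for the current tuple. */
    static bool
    sample_this_tuple(SamplingState *s)
    {
        s->tuplecount += 1;

        if (s->tuplecount <= SAMPLE_THRESHOLD)
            return true;                /* startup phase: measure everything */

        if (s->tuplecount >= s->nextsample)
        {
            /* exponentially growing gap until the next sampled tuple */
            s->nextsample = s->tuplecount * 1.001;
            return true;
        }
        return false;
    }

With a growth factor in that neighborhood, a node producing a million tuples is
sampled on the order of ten thousand times, which is the ballpark of the 15,000
samples mentioned above.
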
>
> Without EXPLAIN ANALYZE:
>
> postgres=# select count(*) from generate_series(1,1000000);
>   count
> ---------
>  1000000
> (1 row)
>
> Time: 2303.599 ms
>
> EXPLAIN ANALYZE without patch:
>
> postgres=# explain analyze select count(*) from generate_series(1,1000000);
>                                                              QUERY PLAN
> ------------------------------------------------------------------------------------------------------------------------------------
>  Aggregate  (cost=15.00..15.01 rows=1 width=0) (actual time=8022.070..8022.073 rows=1 loops=1)
>    ->  Function Scan on generate_series  (cost=0.00..12.50 rows=1000 width=0) (actual time=1381.762..4873.369 rows=1000000 loops=1)
>  Total runtime: 8042.472 ms
> (3 rows)
>
> Time: 8043.401 ms
>
> EXPLAIN ANALYZE with patch:
>
> postgres=# explain analyze select count(*) from generate_series(1,1000000);
>                                                              QUERY PLAN
> ------------------------------------------------------------------------------------------------------------------------------------
>  Aggregate  (cost=15.00..15.01 rows=1 width=0) (actual time=2469.491..2469.494 rows=1 loops=1)
>    ->  Function Scan on generate_series  (cost=0.00..12.50 rows=1000 width=0) (actual time=1405.002..2187.417 rows=1000000 loops=1)
>  Total runtime: 2496.529 ms
> (3 rows)
>
> Time: 2497.488 ms
>
> As you can see, the overhead goes from 5.7 seconds to 0.2 seconds.
> Obviously this is an extreme case, but it will probably help in a lot
> of other cases people have been complaining about.
>
> - To get this close, it needs an estimate of the sampling overhead. It
> gets this from a small calibration loop that is run once per backend.
> Without it, you end up assuming that unsampled tuples take the same
> time as the sampled (instrumented) ones, which makes nodes appear to
> take longer than their parent nodes. Incidentally, I measured the
> overhead to be about 3.6us per tuple per node on my (admittedly
> slightly old) machine.
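
A rough sketch of how such a once-per-backend calibration loop could work; the
function name and iteration count here are hypothetical, not the patch's code.
The result is what lets the extrapolation for unsampled tuples exclude the
instrumentation cost, even though the overhead actually incurred stays in the
reported times (see the next point).

    #include <sys/time.h>

    #define CALIBRATION_LOOPS 100000    /* illustrative iteration count */

    /* Estimate the cost of one timing call, in microseconds. */
    static double
    estimate_sampling_overhead_us(void)
    {
        struct timeval  start, scratch, end;
        double          elapsed_us;
        int             i;

        gettimeofday(&start, NULL);
        for (i = 0; i < CALIBRATION_LOOPS; i++)
            gettimeofday(&scratch, NULL);   /* the cost being measured */
        gettimeofday(&end, NULL);

        elapsed_us = (end.tv_sec - start.tv_sec) * 1000000.0 +
                     (end.tv_usec - start.tv_usec);
        return elapsed_us / CALIBRATION_LOOPS;
    }
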
>
> Note that the resulting times still include the overhead actually
> incurred, I didn't filter it out. I want the times to remain reflecting
> reality as closely as possible.
>
> - I also removed InstrStopNodeMulti and made InstrStopNode take a tuple
> count parameter instead. This is much clearer all round.
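
For reference, the consolidated call described above is used roughly as follows
(the signature and call sites shown are approximate, reconstructed from the
description rather than taken from the patch):

    /* one stop function, always taking a tuple count */
    void InstrStopNode(Instrumentation *instr, double nTuples);

    /* a node that returned a single tuple */
    InstrStopNode(node->instrument, 1);

    /* a node that returned a whole batch at once */
    InstrStopNode(node->instrument, ntuples);
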
>
> - I also didn't make it optional. I'm unsure whether it should be
> optional or not, given that the number of cases where it would make a
> difference is likely to be very few.
>
> - The tuple counter for sampling restarts every loop. Thus a node that
> is called repeatedly but returns only one value each time will still
> measure every tuple, though its parent node won't. We'll need some
> field testing to see whether that remains a significant effect.
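
Continuing the hypothetical SamplingState sketch from earlier, the per-loop
restart described here could simply be a reset of the counters at the end of
each loop (names are illustrative):

    /* Called when a loop over the node finishes (e.g. alongside
     * InstrEndLoop), so sampling starts from scratch on the next rescan. */
    static void
    reset_sampling_for_next_loop(SamplingState *s)
    {
        s->tuplecount = 0;
        s->nextsample = 0;
    }
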
>
> - I don't let the user know anywhere how many samples it took. Is this
> something users should care about?
>
> Any comments?
> --
> Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

[ Attachment, skipping... ]
-- End of PGP section, PGP failed!

--
  Bruce Momjian   http://candle.pha.pa.us
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
