Thread: Can the V7.3 EXPLAIN ANALYZE be trusted?
While working on a previous question I posed to this group, I ran a number of EXPLAIN ANALYZE's to provide as examples. After sending up my last email, I ran the same query *without* EXPLAIN ANALYZE. The runtimes were vastly different. In the following example, I ran two identical queries one right after the other. The runtimes for both was very close (44.77 sec). I then immediately ran the exact same query, but without EXPLAIN ANALYZE. The same number of rows was returned, but the runtime was only 8.7 sec. I don't think EXPLAIN ANALYZE puts that much overhead on a query. Does anyone have any idea what is going on here? --- Steve vsa=# explain analyze SELECT id,win_patch_scan_id FROM vsa.tbl_win_patch_scan_item WHERE win_patch_scan_id IN (SELECT id FROM vsa.tbl_win_patch_scan WHERE scan_datetime < '2004-09-18 00:00:00'); QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------ Seq Scan on tbl_win_patch_scan_item (cost=0.00..382335670.62 rows=376033 width=8) (actual time=10.18..44773.22 rows=2045 loops=1) Filter: (subplan) SubPlan -> Materialize (cost=505.06..511.38 rows=632 width=4) (actual time=0.00..0.02 rows=43 loops=752066) -> Seq Scan on tbl_win_patch_scan (cost=0.00..505.06 rows=632 width=4) (actual time=0.02..10.09 rows=43 loops=1) Filter: (scan_datetime < '2004-09-18 00:00:00'::timestamp without time zone) Total runtime: 44774.49 msec (7 rows) Time: 44775.62 ms vsa=# explain analyze SELECT id,win_patch_scan_id FROM vsa.tbl_win_patch_scan_item WHERE win_patch_scan_id IN (SELECT id FROM vsa.tbl_win_patch_scan WHERE scan_datetime < '2004-09-18 00:00:00'); QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------ Seq Scan on tbl_win_patch_scan_item (cost=0.00..382335670.62 rows=376033 width=8) (actual time=10.18..44765.36 rows=2045 loops=1) Filter: (subplan) SubPlan -> Materialize (cost=505.06..511.38 rows=632 width=4) (actual time=0.00..0.02 rows=43 loops=752066) -> Seq Scan on tbl_win_patch_scan (cost=0.00..505.06 rows=632 width=4) (actual time=0.02..10.10 rows=43 loops=1) Filter: (scan_datetime < '2004-09-18 00:00:00'::timestamp without time zone) Total runtime: 44766.62 msec (7 rows) Time: 44767.71 ms vsa=# SELECT id,win_patch_scan_id FROM vsa.tbl_win_patch_scan_item WHERE win_patch_scan_id IN (SELECT id FROM vsa.tbl_win_patch_scan WHERE scan_datetime < '2004-09-18 00:00:00'); id | win_patch_scan_id --------+------------------- 1 | 1 2 | 1 3 | 1 4 | 1 5 | 1 ----------8< SNIP -------------- 211 | 7 212 | 7 213 | 7 214 | 7 215 | 7 216 | 7 217 | 7 692344 | 9276 692345 | 9276 692346 | 9276 692347 | 9276 692348 | 9276 ----------8< SNIP -------------- 694167 | 9311 694168 | 9311 694169 | 9311 694170 | 9311 694171 | 9311 (2045 rows) Time: 8703.56 ms vsa=# ___________________________________________________________________________________ Steven Rosenstein IT Architect/Developer | IBM Virtual Server Administration Voice/FAX: 845-689-2064 | Cell: 646-345-6978 | Tieline: 930-6001 Text Messaging: 6463456978 @ mobile.mycingular.com Email: srosenst @ us.ibm.com "Learn from the mistakes of others because you can't live long enough to make them all yourself." -- Eleanor Roosevelt
-----Original Message----- From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org]On Behalf Of Steven Rosenstein Sent: Sunday, February 06, 2005 8:51 PM To: pgsql-performance@postgresql.org Subject: [PERFORM] Can the V7.3 EXPLAIN ANALYZE be trusted? While working on a previous question I posed to this group, I ran a number of EXPLAIN ANALYZE's to provide as examples. After sending up my last email, I ran the same query *without* EXPLAIN ANALYZE. The runtimes were vastly different. In the following example, I ran two identical queries one right after the other. The runtimes for both was very close (44.77 sec). I then immediately ran the exact same query, but without EXPLAIN ANALYZE. The same number of rows was returned, but the runtime was only 8.7 sec. I don't think EXPLAIN ANALYZE puts that much overhead on a query. Does anyone have any idea what is going on here? --- Steve Caching by the OS? (Did you try to *first* run the query w/o EXPLAIN ANALYZE, and then with? What's the timing if you do that?) --Tim
"Leeuw van der, Tim" <tim.leeuwvander@nl.unisys.com> writes: > I don't think EXPLAIN ANALYZE puts that much overhead on a query. EXPLAIN ANALYZE does indeed impose a significant overhead. What percentage of the time is overhead depends heavily on how much i/o the query is doing. For queries that are primarily cpu bound because they're processing data from the cache it can be substantial. If all the data is in the shared buffers then the gettimeofday calls for explain analyze can be just about the only syscalls being executed and they're executed a lot. It would be interesting to try to subtract out the profiling overhead from the data like most profilers do. But it's not an easy thing to do since the times are nested. -- greg
> From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org]On Behalf Of Steven Rosenstein > >> I don't think EXPLAIN ANALYZE puts that much overhead on a query. I think you're being overly optimistic. The explain shows that the Materialize subnode is being entered upwards of 32 million times: -> Materialize (cost=505.06..511.38 rows=632 width=4) (actual time=0.00..0.02 rows=43 loops=752066) 43 * 752066 = 32338838. The instrumentation overhead is basically two gettimeofday() kernel calls per node entry. Doing the math shows that your machine is able to do gettimeofday() in about half a microsecond, which isn't stellar but it's not all that slow for a kernel call. (What's the platform here, anyway?) Nonetheless it's a couple of times larger than the actual time needed to pull a row from a materialized array ... The real answer to your question is "IN (subselect) sucks before PG 7.4; get a newer release". regards, tom lane
You're probably right about my being overly optimistic about the load imposed by EXPLAIN ANALYZE. It was just that in my previous experience with it, I'd never seen such a large runtime discrepancy before. I even allowed for a "caching effect" by making sure the server was all but quiescent, and then running the three queries as quickly after one another as I could. The server itself is an IBM x345 with dual Xeon 3ghz CPU's (hyperthreading turned off) and 2.5gb of RAM. O/S is RHEL3 Update 4. Disks are a ServeRAID of some flavor, I'm not sure what. Thanks for the heads-up about the performance of IN in 7.3. We're looking to migrate to 8.0 or 8.0.1 when they become GA, but some of our databases are in excess of 200gb-300gb, and we need to make sure we have a good migration plan in place (space to store the dump out of the 7.3 db) before we start. ___________________________________________________________________________________ Steven Rosenstein IT Architect/Developer | IBM Virtual Server Administration Voice/FAX: 845-689-2064 | Cell: 646-345-6978 | Tieline: 930-6001 Text Messaging: 6463456978 @ mobile.mycingular.com Email: srosenst @ us.ibm.com "Learn from the mistakes of others because you can't live long enough to make them all yourself." -- Eleanor Roosevelt Tom Lane <tgl@sss.pgh.pa.u s> To Steven Rosenstein/New 02/06/2005 05:46 York/IBM@IBMUS PM cc pgsql-performance@postgresql.org Subject Re: [PERFORM] Can the V7.3 EXPLAIN ANALYZE be trusted? > From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org]On Behalf Of Steven Rosenstein > >> I don't think EXPLAIN ANALYZE puts that much overhead on a query. I think you're being overly optimistic. The explain shows that the Materialize subnode is being entered upwards of 32 million times: -> Materialize (cost=505.06..511.38 rows=632 width=4) (actual time=0.00..0.02 rows=43 loops=752066) 43 * 752066 = 32338838. The instrumentation overhead is basically two gettimeofday() kernel calls per node entry. Doing the math shows that your machine is able to do gettimeofday() in about half a microsecond, which isn't stellar but it's not all that slow for a kernel call. (What's the platform here, anyway?) Nonetheless it's a couple of times larger than the actual time needed to pull a row from a materialized array ... The real answer to your question is "IN (subselect) sucks before PG 7.4; get a newer release". regards, tom lane
Hi, @all, Greg Stark schrieb: > "Leeuw van der, Tim" <tim.leeuwvander@nl.unisys.com> writes: > >>I don't think EXPLAIN ANALYZE puts that much overhead on a query. > > EXPLAIN ANALYZE does indeed impose a significant overhead. Additional note: In some rare cases, you can experience just the opposite effect, explain analyze can be quicker then the actual query. This is the case for rather expensive send/output functions, like the PostGIS ones: lwgeom=# \timing Zeitmessung ist an. lwgeom=# explain analyze select setsrid(geom,4326) from adminbndy1; QUERY PLAN ----------------------------------------------------------------------------------------------------------- Seq Scan on adminbndy1 (cost=0.00..4.04 rows=83 width=89) (actual time=11.793..2170.184 rows=83 loops=1) Total runtime: 2170.834 ms (2 Zeilen) Zeit: 2171,688 ms lwgeom=# \o /dev/null lwgeom=# select setsrid(geom,4326) from adminbndy1; Zeit: 9681,001 ms BTW: I use the cheap setsrid(geom,4326) to force deTOASTing of the geometry column. Not using it seems to ignore TOASTed columns in sequential scan simulation.) lwgeom=# explain analyze select geom from adminbndy1; QUERY PLAN ------------------------------------------------------------------------------------------------------- Seq Scan on adminbndy1 (cost=0.00..3.83 rows=83 width=89) (actual time=0.089..0.499 rows=83 loops=1) Total runtime: 0.820 ms (2 Zeilen) Markus -- markus schaber | dipl. informatiker logi-track ag | rennweg 14-16 | ch 8001 zürich phone +41-43-888 62 52 | fax +41-43-888 62 53 mailto:schabios@logi-track.com | www.logi-track.com