Re: Postgresql 9.0.2 explain analyze very slow (10x), compared to actual query time - Mailing list pgsql-admin

From Achilleas Mantzios
Subject Re: Postgresql 9.0.2 explain analyze very slow (10x), compared to actual query time
Date
Msg-id 201101131413.21757.achill@matrix.gatewaynet.com
Whole thread Raw
In response to Re: Postgresql 9.0.2 explain analyze very slow (10x), compared to actual query time  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Postgresql 9.0.2 explain analyze very slow (10x), compared to actual query time
List pgsql-admin
Στις Wednesday 12 January 2011 18:45:20 ο/η Tom Lane έγραψε:
> Achilleas Mantzios <achill@matrix.gatewaynet.com> writes:
> > Στις Wednesday 12 January 2011 17:07:53 ο/η Tom Lane έγραψε:
> >> Right offhand I'd wonder whether that was more bound by gettimeofday or
> >> by printf.  Please try it without printf in the loop.
>
> > Changed that to smth like: micros_total = micros_total + (double) micros;
> > instead of the printf to beat any compiler optimization, and still linux runs at light speed:
> > FBSD_TEST : user 0.089s, sys 1.4s
> > FBSD_DEV : user 0.183s, sys 3.8s
> > LINUX_PROD : user 0.168s, sys 0s
>
> Well, there's your problem all right: the FBSD_DEV system takes 22X
> longer to execute gettimeofday() than the LINUX_PROD system.  The
> particular plan that 9.0 is choosing is especially vulnerable to this
> because it involves a whole lot of calls of the Materialize plan node:
>
> >                                    ->  Nested Loop  (cost=3390.49..9185.73 rows=1 width=4) (actual
time=279.916..93280.499rows=1824 loops=1) 
> >                                          Join Filter: (msold.marinerid = mold.id)
> >                                          ->  Index Scan using mariner_pkey on mariner mold  (cost=0.00..1957.89
rows=15914width=4) (actual time=0.009..38.449 rows=15914 loops=1) 
> >                                                Filter: ((marinertype)::text = 'Mariner'::text)
> >                                          ->  Materialize  (cost=3390.49..6989.13 rows=1 width=8) (actual
time=0.013..2.881rows=1888 loops=15914) 
>
> You've got 1888 * 15914 * 2 gettimeofday() calls needed to count up the
> time spent in Materialize, which at 3.8usec apiece adds up to ... hmm
> ... I get 228 seconds, which is more than the reported runtime.  So I
> think there's something wrong with your measurement of the gettimeofday
> cost.  But I wouldn't be a bit surprised to hear that the true cost of
> gettimeofday on that machine is a microsecond or so, in which case all
> of the EXPLAIN ANALYZE time bloat is indeed due to that.
>
> This problem has been discussed repeatedly on the pgsql-performance
> list, which is where your question really belongs, not here.  The short
> of it is that cheap motherboards tend to provide cheap clock hardware
> that takes about a microsecond to read.  It also helps to be running
> an OS that is able to read the clock in userspace without invoking a
> kernel call.  It looks like your Linux box is winning on both counts
> compared to your BSD boxes.

Just comparing the two FBSD boxes on equal terms gives crazy results, so
I'll try to leave LINUX out of the equation for a while, and also i'll try to leave the
difference in DB sizes out of the equation so what i did was make a dump of the small DB
on the small (slow) FBSD box (FBSD_DEV) and load it on the bigger (fast) FBSD box (FBSD_TEST).
I set : sysctl kern.timecounter.hardware=TSC on both systems
and now the below program

#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
int main(int argc,char** argv) {
        struct timeval *tp=calloc(1,sizeof(struct timeval));
        int runna;
        long double micros_total=0;
        for (runna=0;runna<1000000;runna++) {
                int rc=gettimeofday(tp,NULL);
                long micros = tp->tv_sec * 1000000 + tp->tv_usec;
                micros_total = micros_total + (long double) micros;
        }
        printf("micros_total=%LE\n",micros_total);
}

% time ./gettimeofday_test gives

FBSD_DEV (slow) : 0.127u 0.231s 0:00.35 100.0%    5+192k 0+0io 0pf+0w
FBSD_TEST (fast) : 0.002u 0.014s 0:00.15 6.6%      84+24192k 0+0io 0pf+0w

Database on the fast system, as i said is exactly the same as in the slow one.

The query "SELECT avg(md.perioddue) from status st,items it,machdefs md WHERE st.apptblidval=it.id AND
it.defid=md.defid;"
gives with \timing

FBSD_DEV (slow) : Time: 28747,374 ms, EXPLAIN ANALYZE: Total runtime: 31816.648 ms
FBSD_TEST (fast) : Time: 3153,372 ms, EXPLAIN ANALYZE: Total runtime: 5701.942 ms

(the fast system is indeed a lot faster on this one (by 10x))

However the query (known from the initial mail as well)
SELECT distinct m.id,coalesce(m.givenname,''),coalesce(m.midname,''),m.surname from marinerstates ms,vessels
vsl,marinerm  
where m.id=ms.marinerid and ms.vslid=vsl.id and ms.state='Active' and coalesce(ms.endtime,now())::date >= '2006-07-15'
and 
ms.starttime::date <= '2007-01-11'  and m.marinertype='Mariner'  and m.id not in
   (SELECT distinct mold.id from marinerstates msold,vessels vslold,mariner mold  where mold.id=msold.marinerid and
msold.vslid=vslold.id 
      and msold.state='Active' and coalesce(msold.endtime,now())::date >= '2006-07-15' and msold.starttime::date <=
'2007-01-11'and exists  
         (select 1 from marinerstates msold2 where msold2.marinerid=msold.marinerid and msold2.state='Active' and
msold2.id<> msold.id and  
           msold2.starttime<msold.starttime AND (msold.starttime-msold2.endtime)<='18 months')  and
mold.marinertype='Mariner')  
       order by m.surname,coalesce(m.givenname,''),coalesce(m.midname,'')
gives:

FBSD_DEV (slow) : Time : 246,032 ms  EXPLAIN ANALYZE: Total runtime: 274.425 ms
FBSD_TEST (fast) : Time: 2314,914 ms EXPLAIN ANALYZE: Total runtime: 10438.061 ms

This query runs much much slower in the case of FBSD_TEST.

The explain analyze for this query are:

FBSD_DEV:

Unique  (cost=633677.56..633700.48 rows=1834 width=23) (actual time=270.184..271.042 rows=603 loops=1)
   ->  Sort  (cost=633677.56..633682.14 rows=1834 width=23) (actual time=270.182..270.440 rows=603 loops=1)
         Sort Key: m.surname, (COALESCE(m.givenname, ''::character varying)), (COALESCE(m.midname, ''::character
varying)),m.id 
         Sort Method:  quicksort  Memory: 53kB
         ->  Hash Join  (cost=630601.65..633578.15 rows=1834 width=23) (actual time=194.815..269.294 rows=603 loops=1)
               Hash Cond: (ms.vslid = vsl.id)
               ->  Hash Join  (cost=630580.33..633530.01 rows=2261 width=27) (actual time=194.444..268.140 rows=603
loops=1)
                     Hash Cond: (ms.marinerid = m.id)
                     ->  Seq Scan on marinerstates ms  (cost=0.00..2875.32 rows=4599 width=8) (actual
time=0.012..70.281rows=2625 loops=1) 
                           Filter: (((state)::text = 'Active'::text) AND ((starttime)::date <= '2007-01-11'::date) AND
((COALESCE(endtime,now()))::date >= ' 
2006-07-15'::date))
                     ->  Hash  (cost=630491.54..630491.54 rows=7103 width=23) (actual time=194.391..194.391 rows=12832
loops=1)
                           ->  Index Scan using mariner_pkey on mariner m  (cost=628776.89..630491.54 rows=7103
width=23)(actual time=162.607..184.282 rows= 
12832 loops=1)
                                 Filter: ((NOT (hashed subplan)) AND ((marinertype)::text = 'Mariner'::text))
                                 SubPlan
                                   ->  Unique  (cost=0.00..628772.30 rows=1834 width=4) (actual time=0.064..161.099
rows=1454loops=1) 
                                         ->  Nested Loop  (cost=0.00..628767.72 rows=1834 width=4) (actual
time=0.062..159.091rows=1835 loops=1) 
                                               ->  Nested Loop  (cost=0.00..627027.98 rows=1865 width=4) (actual
time=0.056..147.464rows=1876 loops=1) 
                                                     ->  Index Scan using marinerstates_marinerid on marinerstates
msold (cost=0.00..626316.07 rows=2299 wid 
th=8) (actual time=0.050..138.100 rows=1876 loops=1)
                                                           Filter: (((state)::text = 'Active'::text) AND
((starttime)::date<= '2007-01-11'::date) AND ((COAL 
ESCE(endtime, now()))::date >= '2006-07-15'::date) AND (subplan))
                                                           SubPlan
                                                             ->  Bitmap Heap Scan on marinerstates msold2
(cost=4.28..12.11rows=1 width=0) (actual time=0.0 
08..0.008 rows=1 loops=2625)
                                                                   Recheck Cond: ((marinerid = $0) AND (starttime <
$2))
                                                                   Filter: ((id <> $1) AND ((state)::text =
'Active'::text)AND (($2 - endtime) <= '1 year 6 
mons'::interval))
                                                                   ->  Bitmap Index Scan on
marinerstates_marinerid_starttime (cost=0.00..4.28 rows=2 width= 
0) (actual time=0.005..0.005 rows=6 loops=2625)
                                                                         Index Cond: ((marinerid = $0) AND (starttime <
$2))
                                                     ->  Index Scan using vessels_pkey on vessels vslold
(cost=0.00..0.30rows=1 width=4) (actual time=0.002 
..0.003 rows=1 loops=1876)
                                                           Index Cond: (vslold.id = msold.vslid)
                                               ->  Index Scan using mariner_pkey on mariner mold  (cost=0.00..0.92
rows=1width=4) (actual time=0.004..0.004 
rows=1 loops=1876)
                                                     Index Cond: (mold.id = msold.marinerid)
                                                     Filter: ((mold.marinertype)::text = 'Mariner'::text)
               ->  Hash  (cost=17.81..17.81 rows=281 width=4) (actual time=0.363..0.363 rows=272 loops=1)
                     ->  Seq Scan on vessels vsl  (cost=0.00..17.81 rows=281 width=4) (actual time=0.009..0.199
rows=272loops=1) 
 Total runtime: 271.466 ms

FBSD_TEST:

Unique  (cost=11525.09..11571.55 rows=3717 width=23) (actual time=10455.286..10455.635 rows=603 loops=1)
   ->  Sort  (cost=11525.09..11534.38 rows=3717 width=23) (actual time=10455.285..10455.390 rows=603 loops=1)
         Sort Key: m.surname, (COALESCE(m.givenname, ''::character varying)), (COALESCE(m.midname, ''::character
varying)),m.id 
         Sort Method:  quicksort  Memory: 71kB
         ->  Hash Join  (cost=8281.98..11304.67 rows=3717 width=23) (actual time=10418.032..10454.399 rows=603 loops=1)
               Hash Cond: (ms.marinerid = m.id)
               ->  Hash Join  (cost=20.12..2963.83 rows=3717 width=4) (actual time=0.199..34.986 rows=2625 loops=1)
                     Hash Cond: (ms.vslid = vsl.id)
                     ->  Seq Scan on marinerstates ms  (cost=0.00..2889.32 rows=4590 width=8) (actual
time=0.015..33.507rows=2625 loops=1) 
                           Filter: (((state)::text = 'Active'::text) AND ((starttime)::date <= '2007-01-11'::date) AND
((COALESCE(endtime,now()))::date >= ' 
2006-07-15'::date))
                     ->  Hash  (cost=16.72..16.72 rows=272 width=4) (actual time=0.179..0.179 rows=272 loops=1)
                           Buckets: 1024  Batches: 1  Memory Usage: 10kB
                           ->  Seq Scan on vessels vsl  (cost=0.00..16.72 rows=272 width=4) (actual time=0.003..0.099
rows=272loops=1) 
               ->  Hash  (cost=8172.57..8172.57 rows=7143 width=23) (actual time=10417.803..10417.803 rows=12832
loops=1)
                     Buckets: 1024  Batches: 1  Memory Usage: 702kB
                     ->  Seq Scan on mariner m  (cost=7614.86..8172.57 rows=7143 width=23) (actual
time=10402.282..10412.840rows=12832 loops=1) 
                           Filter: ((NOT (hashed SubPlan 1)) AND ((marinertype)::text = 'Mariner'::text))
                           SubPlan 1
                             ->  Unique  (cost=2768.00..7614.86 rows=1 width=4) (actual time=85.291..10401.289
rows=1454loops=1) 
                                   ->  Nested Loop  (cost=2768.00..7614.86 rows=1 width=4) (actual
time=85.290..10400.399rows=1835 loops=1) 
                                         Join Filter: (msold.marinerid = mold.id)
                                         ->  Index Scan using mariner_pkey on mariner mold  (cost=0.00..1728.60
rows=14286width=4) (actual time=0.007..13.53 
1 rows=14286 loops=1)
                                               Filter: ((marinertype)::text = 'Mariner'::text)
                                         ->  Materialize  (cost=2768.00..5671.97 rows=1 width=8) (actual
time=0.003..0.329rows=1876 loops=14286) 
                                               ->  Nested Loop  (cost=2768.00..5671.96 rows=1 width=8) (actual
time=37.502..83.326rows=1876 loops=1) 
                                                     ->  Hash Semi Join  (cost=2768.00..5671.67 rows=1 width=12)
(actualtime=37.491..79.434 rows=1876 loops= 
1)
                                                           Hash Cond: (msold.marinerid = msold2.marinerid)
                                                           Join Filter: ((msold2.id <> msold.id) AND (msold2.starttime
<msold.starttime) AND ((msold.startti 
me - msold2.endtime) <= '1 year 6 mons'::interval))
                                                           ->  Seq Scan on marinerstates msold  (cost=0.00..2889.32
rows=4590width=20) (actual time=0.004..3 
4.281 rows=2625 loops=1)
                                                                 Filter: (((state)::text = 'Active'::text) AND
((starttime)::date<= '2007-01-11'::date) AND 
((COALESCE(endtime, now()))::date >= '2006-07-15'::date))
                                                           ->  Hash  (cost=2251.66..2251.66 rows=41307 width=24)
(actualtime=37.446..37.446 rows=41250 loops=1) 
                                                                 Buckets: 8192  Batches: 1  Memory Usage: 2246kB
->  Seq Scan on marinerstates msold2  (cost=0.00..2251.66 rows=41307 width=24) (actual time=0.002..23.350 rows=41250
loops=1)
                                                                       Filter: ((state)::text = 'Active'::text)
                                                     ->  Index Scan using vessels_pkey on vessels vslold
(cost=0.00..0.28rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=1876) 
                                                           Index Cond: (vslold.id = msold.vslid)
 Total runtime: 10456.092 ms

So the questions remain,
a) Why do we have "EXPLAIN ANALYZE" in FBSD_TEST (fast) taking an order of magnitude slower than the actual query.
b) Why this particular query is so slow on this generally much faster system.

PS

I agree that this is a lot "performance" related, but the "EXPLAIN ANALYZE" part just dont seem right.
Thats why i posted on admin in the first place.

>
>             regards, tom lane
>



--
Achilleas Mantzios

pgsql-admin by date:

Previous
From: Achilleas Mantzios
Date:
Subject: Re: Postgresql 9.0.2 explain analyze very slow (10x), compared to actual query time
Next
From: Achilleas Mantzios
Date:
Subject: Re: Postgresql 9.0.2 explain analyze very slow (10x), compared to actual query time