Re: planner/optimizer question - Mailing list pgsql-performance

From Gary Doades
Subject Re: planner/optimizer question
Date
Msg-id 40919034.20317.153A0B6F@localhost
Whole thread Raw
In response to Re: planner/optimizer question  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
On 29 Apr 2004 at 17:54, Tom Lane wrote:

> "Gary Doades" <gpd@gpdnet.co.uk> writes:
> > In this example the statistics don't matter.
>
> Don't they?
>
> A prior poster mentioned that he thought MSSQL tries to keep all its
> indexes in memory.  I wonder whether you are giving Postgres a fair
> chance to do the same.  What postgresql.conf settings are you using?
>
>                                  regards, tom lane

As far as I understand it the statistics only contribute to determining the query plan. Once the access methods are determined, the stats don't matter during the running of the query.

I believe I have given Postgres exactly the same chance. The data is small enough to fit into RAM (all the tables in the query add up to around 50meg) and I executed the query several times to get a consistent figure for the explain analyze.

Having picked out an index scan as being the highest time user I concentrated on that in this case and compared the same index scan on MSSQL. At least MSSQL reported it as an index scan on the same index for the same number of rows.

There was nothing wrong with the query plan that Postgres used. As far as I could see it was probably the best one to use, it just physically took longer than the same access plan on MSSQL.

The query and plan are included below, the main thing I was looking at was the index scan on staff_booking_pkey being 676ms long.

The only postgresql.conf parameters changed from the default are:

shared_buffers = 3000   
sort_mem = 4096
effective_cache_size = 15000
default_statistics_target = 100

There was no disk IO (above the small background IO) during the final run of the query as reported by vmstat (Task Mangler on Windows).

SELECT B.CONTRACT_ID,SUM(R.DURATION+1)/60.0 AS SUMDUR FROM SEARCH_REQT_RESULT TSR
JOIN STAFF_BOOKING B ON (B.STAFF_ID = TSR.STAFF_ID)
JOIN ORDER_REQT R ON (R.REQT_ID = B.REQT_ID)
JOIN BOOKING_PLAN BP ON (BP.BOOKING_ID = B.BOOKING_ID) AND BP.BOOKING_DATE BETWEEN '2004-04-12' AND '2004-04-18' AND TSR.SEARCH_ID = 8 GROUP BY B.CONTRACT_ID

QUERY PLAN
HashAggregate  (cost=11205.80..11209.81 rows=401 width=6) (actual time=1179.729..1179.980 rows=50 loops=1)
  ->  Nested Loop  (cost=326.47..11203.79 rows=401 width=6) (actual time=39.700..1177.149 rows=652 loops=1)
        ->  Hash Join  (cost=326.47..9990.37 rows=401 width=8) (actual time=39.537..1154.807 rows=652 loops=1)
              Hash Cond: ("outer".staff_id = "inner".staff_id)
              ->  Merge Join  (cost=320.39..9885.06 rows=3809 width=12) (actual time=38.316..1143.953 rows=4079 loops=1)
                    Merge Cond: ("outer".booking_id = "inner".booking_id)
                    ->  Index Scan using staff_booking_pkey on staff_booking b (cost=0.00..8951.94 rows=222612 width=16) (actual time=0.218..676.219 rows=222609 loops=1)
                    ->  Sort  (cost=320.39..329.91 rows=3808 width=4) (actual time=26.225..32.754 rows=4079 loops=1)
                          Sort Key: bp.booking_id
                          ->  Index Scan using booking_plan_idx2 on booking_plan bp (cost=0.00..93.92 rows=3808 width=4) (actual time=0.223..14.186 rows=4079 loops=1)
                                Index Cond: ((booking_date >= '2004-04-12'::date) AND (booking_date <= '2004-04-18'::date))
              ->  Hash  (cost=5.59..5.59 rows=193 width=4) (actual time=1.139..1.139 rows=0 loops=1)
                    ->  Index Scan using fk_idx_search_reqt_result on search_reqt_result tsr (cost=0.00..5.59 rows=193 width=4) (actual time=0.213..0.764 rows=192 loops=1)
                          Index Cond: (search_id = 8)
        ->  Index Scan using order_reqt_pkey on order_reqt r  (cost=0.00..3.01 rows=1 width=6) (actual time=0.023..0.025 rows=1 loops=652)
              Index Cond: (r.reqt_id = "outer".reqt_id)
Total runtime: 1181.239 ms


Cheers,
Gary.

pgsql-performance by date:

Previous
From: Josh Berkus
Date:
Subject: Re: Simply join in PostrgeSQL takes too long
Next
From: "Gary Doades"
Date:
Subject: Re: planner/optimizer question