Re: sub select performance due to seq scans - Mailing list pgsql-performance

From H Hale
Subject Re: sub select performance due to seq scans
Date
Msg-id 20060802121732.94126.qmail@web88001.mail.re2.yahoo.com
Whole thread Raw
In response to Re: sub select performance due to seq scans  (H Hale <hhale21@rogers.com>)
Responses Re: sub select performance due to seq scans  (Scott Marlowe <smarlowe@g2switchworks.com>)
List pgsql-performance

Initial testing was with data that essentially looks like a single collection with many items. 
I then changed this to have 60 collections of 50 items.
The result, much better (but not optimum) use of indexs, but a seq scan still
used.

Turning seq scan off, all indexes where used.
Query was much faster (1.5ms vs 300ms).

I have tried to increase stats collection...

alter table capsa.flatommemberrelation column srcobj set statistics 1000;
alter table capsa.flatommemberrelation column dstobj set statistics 1000;
alter table capsa.flatommemberrelation column objectid set statistics 1000;
alter table capsa.flatomfilesysentry column objectid set statistics 1000;
vacuum full analyze;
Experimented with many postgres memory parameters.
No difference.

Is seq scan off the solution here?
My tests are with a relatively small number of records.
My concern here is what happens with 100,000'sof records and seq scan off?
I will find out shortly...

Does anyone know of of any know issues with the query planner?

Explain analyze results below.

capsa=# explain analyze select count(*) from capsa.flatomfilesysentry where
objectid in (select dstobj from capsa.flatommemberrelation where
srcobj='5bdef74c-21d3-11db-9a20-001143214409');

QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=742380.16..742380.17 rows=1 width=0) (actual
time=1520.269..1520.270 rows=1 loops=1)
-> Nested Loop (cost=878.91..742355.41 rows=9899 width=0) (actual
time=41.516..1520.076 rows=56 loops=1)
JoinFilter: ("inner".objectid = "outer".dstobj)
-> Unique (cost=437.03..453.67 rows=3329 width=16) (actual
time=0.241..0.624 rows=56 loops=1)
-> Sort (cost=437.03..445.35 rows=3329 width=16) (actual
time=0.237..0.346 rows=56 loops=1)
Sort Key: flatommemberrelation.dstobj
-> Bitmap Heap Scan on flatommemberrelation
(cost=30.65..242.26 rows=3329 width=16) (actual time=0.053..0.135 rows=56
loops=1)
Recheck Cond: (srcobj =
'5bdef74c-21d3-11db-9a20-001143214409'::capsa_sys.uuid)
-> Bitmap Index Scan on
capsa_flatommemberrelation_srcobj_idx (cost=0.00..30.65 rows=3329 width=0)
(actual time=0.044..0.044 rows=56 loops=1)
Index Cond: (srcobj =
'5bdef74c-21d3-11db-9a20-001143214409'::capsa_sys.uuid)
-> Materialize (cost=441.89..540.88 rows=9899 width=16)(actual
time=0.011..14.918 rows=9899 loops=56)
-> Seq Scan on flatomfilesysentry (cost=0.00..431.99 rows=9899
width=16) (actual time=0.005..19.601 rows=9899 loops=1)
Total runtime: 1521.040 ms
(13 rows)

capsa=# explain analyze select count(*) from capsa.flatomfilesysentry where
objectid in (select dstobj from capsa.flatommemberrelation where
srcobj='5bdef74c-21d3-11db-9a20-001143214409');

QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=1486472.45..1486472.46 rows=1 width=0) (actual
time=2.112..2.113 rows=1 loops=1)
-> Nested Loop (cost=439.03..1486447.70 rows=9899 width=0) (actual
time=0.307..2.019rows=56 loops=1)
-> Unique (cost=437.03..453.67 rows=3329 width=16) (actual
time=0.236..0.482 rows=56 loops=1)
-> Sort (cost=437.03..445.35 rows=3329 width=16) (actual
time=0.233..0.306 rows=56 loops=1)
Sort Key: flatommemberrelation.dstobj
-> Bitmap Heap Scan on flatommemberrelation
(cost=30.65..242.26 rows=3329 width=16) (actual time=0.047..0.132 rows=56
loops=1)
Recheck Cond: (srcobj =
'5bdef74c-21d3-11db-9a20-001143214409'::capsa_sys.uuid)
-> Bitmap Index Scan on
capsa_flatommemberrelation_srcobj_idx (cost=0.00..30.65 rows=3329 width=0)
(actual time=0.038..0.038 rows=56 loops=1)
Index Cond: (srcobj =
'5bdef74c-21d3-11db-9a20-001143214409'::capsa_sys.uuid)
-> Bitmap Heap Scan on flatomfilesysentry (cost=2.00..384.50
rows=4950 width=16)(actual time=0.019..0.020 rows=1 loops=56)
Recheck Cond: (flatomfilesysentry.objectid = "outer".dstobj)
-> Bitmap Index Scan on flatomfilesysentry_pkey
(cost=0.00..2.00 rows=4950 width=0) (actual time=0.014..0.014 rows=1 loops=56)
Index Cond: (flatomfilesysentry.objectid = "outer".dstobj)
Total runtime: 2.258 ms
(14 rows)


H Hale <hhale21@rogers.com> wrote:
Not sure if this helps solve the problem but...
(see below)

As new records are added Indexes are used for awhile  and then at some point postgres switches to seq scan. It is repeatable. 

Any suggestions/comments to try and solve this are welcome.  Thanks

Data is as follows:
capsa.flatommemberrelation 1458 records
capsa.flatommemberrelation(srcobj)  3 distinct
capsa.flatommemberrelation(dstobj)  730 distinct
capsa.flatomfilesysentry 732 records
capsa.flatommemberrelation(objectid)  732 distinct

capsa=# set enable_seqscan=on;
SET
Time: 0.599 ms
capsa=# explain analyze select count(*) from capsa.flatomfilesysentry where objectid in (select dstobj from capsa.flatommemberrelation where srcobj='9e5943e0-219f-11db-8504-001143214409');
                                                            QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=196.01..196.02 rows=1 width=0) (actual time=965.420..965.422 rows=1 loops=1)
   ->  Nested Loop IN Join  (cost=0.00..194.19 rows=728 width=0) (actual time=3.373..964.371 rows=729 loops=1)
         Join Filter: ("outer".objectid = "inner".dstobj)
         ->  Seq Scan on flatomfilesysentry  (cost=0.00..65.28 rows=728 width=16) (actual time=0.007..1.505 rows=732 loops=1)
         ->  Seq Scan on flatommemberrelation  (cost=0.00..55.12 rows=725 width=16) (actual time=0.004..0.848 rows=366 loops=732)
               Filter: (srcobj = '9e5943e0-219f-11db-8504-001143214409'::capsa_sys.uuid)
 Total runtime: 965.492 ms
(7 rows)

Time: 966.806 ms

-----------------------------------------------------------------------------------------------------------
capsa=# set enable_seqscan=off;
SET
Time: 0.419 ms
capsa=# explain analyze select count(*) from capsa.flatomfilesysentry where objectid in (select dstobj from capsa.flatommemberrelation where srcobj='9e5943e0-219f-11db-8504-001143214409');
                                                                                QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=24847.73..24847.74 rows=1 width=0) (actual time=24.859..24.860 rows=1 loops=1)
   ->  Nested Loop  (cost=90.05..24845.91 rows=728 width=0) (actual time=2.946..23.640 rows=729 loops=1)
         ->  Unique  (cost=88.04..91.67 rows=363 width=16) (actual time=2.917..6.671 rows=729 loops=1)
               ->  Sort  (cost=88.04..89.86 rows=725 width=16) (actual time=2.914..3.998 rows=729 loops=1)
                     Sort Key: flatommemberrelation.dstobj
                     ->  Bitmap Heap Scan on flatommemberrelation  (cost=7.54..53.60 rows=725 width=16) (actual time=0.260..1.411 rows=729 loops=1)
                           Recheck Cond: (srcobj = '9e5943e0-219f-11db-8504-001143214409'::capsa_sys.uuid)
                           ->  Bitmap Index Scan on capsa_flatommemberrelation_srcobj_idx  (cost=0.00..7.54 rows=725 width=0) (actual time=0.244..0.244 rows=729 loops=1)
                                 Index Cond: (srcobj = '9e5943e0-219f-11db-8504-001143214409'::capsa_sys.uuid)
         ->  Bitmap Heap Scan on flatomfilesysentry  (cost=2.00..63.64 rows=364 width=16) (actual time=0.014..0.015 rows=1 loops=729)
               Recheck Cond: (flatomfilesysentry.objectid = "outer".dstobj)
               ->  Bitmap Index Scan on flatomfilesysentry_pkey  (cost=0.00..2.00 rows=364 width=0) (actual time=0.009..0.009 rows=1 loops=729)
                     Index Cond: (flatomfilesysentry.objectid = "outer".dstobj)
 Total runtime: 25.101 ms
(14 rows)

Time: 26.878 ms






H Hale <hhale21@rogers.com> wrote:
Tom,

It is unique.

Indexes:
    "flatomfilesysentry_pkey" PRIMARY KEY, btree (objectid)
    "capsa_flatomfilesysentry_name_idx" btree (name)
Foreign-key constraints:
    "objectid" FOREIGN KEY (objectid) REFERENCES capsa_sys.master(objectid) ON DELETE CASCADE


Tom Lane <tgl@sss.pgh.pa.us> wrote:
H Hale writes:
> -> Bitmap Heap Scan on flatomfilesysentry (cost=2.00..274.38 rows=3238 width=30) (actual time=0.011..0.013 rows=1 loops=6473)
> Recheck Cond: (flatomfilesysentry.objectid = "outer".dstobj)
> -> Bitmap Index Scan on flatomfilesysentry_pkey (cost=0.00..2.00 rows=3238 width=0) (actual time=0.007..0.007 rows=1 loops=6473)
> Index Cond: (flatomfilesysentry.objectid = "outer".dstobj)

Well, there's our estimation failure: 3238 rows expected, one row
actual.

What is the data distribution of flatomfilesysentry.objectid?
It looks from this example like it is unique or nearly so,
but the planner evidently does not think that.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings




pgsql-performance by date:

Previous
From: Eugeny N Dzhurinsky
Date:
Subject: Re: Query/database optimization
Next
From: Scott Marlowe
Date:
Subject: Re: sub select performance due to seq scans