does the planner "learn"? - Mailing list pgsql-general

From TJ O'Donnell
Subject does the planner "learn"?
Date
Msg-id 4207BC64.4060304@acm.org
Whole thread Raw
Responses Re: does the planner "learn"?
List pgsql-general
I understand the value of indexes and of ANALYZE for the efficient use of them.
In the following statement, you can see that the index scan is being used.
Even though it takes 80 seconds (for a 1.25 million row table), it is
much faster than without the index.
But, if I repeat this search, it speeds up by more than a factor of 2!
I love it, but I'd sure like to understand why.  When I do it a third time,
it speeds up again.  A fourth try does not speed it up more.
Is this speedup due to some memory/disk buffering from which I'm
benefiting?  I'm using linux (actually under VMware on WinXP, so it's even less
efficient that it could be on it's own).  Or is the planner learning
something from previous runs of this search?  It appears not, since the
rows it thinks it needs to search are the same in the EXPLAIN ANALYZE
outputs below.  Can someone help me understand why my searches are speeding
up so I can make it happen the first time, if possible?

Thanks,
TJ


Try #1:
Select distinct smiles from structure where (_c, _o, _arom_c, _c_double, _o_double, _n, _ring_c, _ring_hetero,
_halogen, 
_n_double, _arom_n, _s, _s_double, _other_atoms, _c_triple, _n_triple, _p, _arom_s, _p_double, _arom_o) >=
(4,2,6,2,2,1,4,1,0,0,0,0,0,0,0,0,0,0,0,0) and oe_matches(smiles,'c1ccc(cc1)C2CC(=O)NC2=O') limit 500

Limit  (cost=92649.53..92652.35 rows=500 width=49) (actual time=81544.566..81545.522 rows=117 loops=1)
   ->  Unique  (cost=92649.53..92688.60 rows=6924 width=49) (actual time=81544.561..81545.174 rows=117 loops=1)
         ->  Sort  (cost=92649.53..92669.06 rows=7813 width=49) (actual time=81544.553..81544.726 rows=117 loops=1)
               Sort Key: smiles
               ->  Index Scan using fingerprint on structure  (cost=0.00..92144.36 rows=7813 width=49) (actual
time=36.179..81533.872 rows=117 loops=1)
                     Index Cond: ((_c >= 4) AND (_o >= 2) AND (_arom_c >= 6) AND (_c_double >= 2) AND (_o_double >= 2)
AND (_n >= 1) AND (_ring_c >= 4) AND (_ring_hetero >= 1) AND (_halogen >= 0) AND (_n_double >= 0) AND (_arom_n >= 0)
AND 
(_s >= 0) AND (_s_double >= 0) AND (_other_atoms >= 0) AND (_c_triple >= 0) AND (_n_triple >= 0) AND (_p >= 0) AND
(_arom_s >= 0) AND (_p_double >= 0) AND (_arom_o >= 0))
                     Filter: oe_matches(smiles, 'c1ccc(cc1)C2CC(=O)NC2=O'::character varying)
Total runtime: 81545.903 ms

Try #2:
Limit  (cost=92649.53..92652.35 rows=500 width=49) (actual time=36924.436..36925.450 rows=117 loops=1)
   ->  Unique  (cost=92649.53..92688.60 rows=6924 width=49) (actual time=36924.431..36925.051 rows=117 loops=1)
         ->  Sort  (cost=92649.53..92669.06 rows=7813 width=49) (actual time=36924.423..36924.596 rows=117 loops=1)
               Sort Key: smiles
               ->  Index Scan using fingerprint on structure  (cost=0.00..92144.36 rows=7813 width=49) (actual
time=14.591..36891.589 rows=117 loops=1)
                     Index Cond: ((_c >= 4) AND (_o >= 2) AND (_arom_c >= 6) AND (_c_double >= 2) AND (_o_double >= 2)
AND (_n >= 1) AND (_ring_c >= 4) AND (_ring_hetero >= 1) AND (_halogen >= 0) AND (_n_double >= 0) AND (_arom_n >= 0)
AND 
(_s >= 0) AND (_s_double >= 0) AND (_other_atoms >= 0) AND (_c_triple >= 0) AND (_n_triple >= 0) AND (_p >= 0) AND
(_arom_s >= 0) AND (_p_double >= 0) AND (_arom_o >= 0))
                     Filter: oe_matches(smiles, 'c1ccc(cc1)C2CC(=O)NC2=O'::character varying)
Total runtime: 36925.820 ms

Try #3:
Limit  (cost=92649.53..92652.35 rows=500 width=49) (actual time=23712.435..23713.394 rows=117 loops=1)
   ->  Unique  (cost=92649.53..92688.60 rows=6924 width=49) (actual time=23712.430..23713.046 rows=117 loops=1)
         ->  Sort  (cost=92649.53..92669.06 rows=7813 width=49) (actual time=23712.422..23712.599 rows=117 loops=1)
               Sort Key: smiles
               ->  Index Scan using fingerprint on structure  (cost=0.00..92144.36 rows=7813 width=49) (actual
time=17.548..23631.915 rows=117 loops=1)
                     Index Cond: ((_c >= 4) AND (_o >= 2) AND (_arom_c >= 6) AND (_c_double >= 2) AND (_o_double >= 2)
AND (_n >= 1) AND (_ring_c >= 4) AND (_ring_hetero >= 1) AND (_halogen >= 0) AND (_n_double >= 0) AND (_arom_n >= 0)
AND 
(_s >= 0) AND (_s_double >= 0) AND (_other_atoms >= 0) AND (_c_triple >= 0) AND (_n_triple >= 0) AND (_p >= 0) AND
(_arom_s >= 0) AND (_p_double >= 0) AND (_arom_o >= 0))
                     Filter: oe_matches(smiles, 'c1ccc(cc1)C2CC(=O)NC2=O'::character varying)
Total runtime: 23713.765 ms


pgsql-general by date:

Previous
From: Christopher Petrilli
Date:
Subject: Use of indexes with table inheritance
Next
From: Martijn van Oosterhout
Date:
Subject: Re: does the planner "learn"?