Thread: Optimizer degradation since 8.0

Optimizer degradation since 8.0

From
Teodor Sigaev
Date:
Table with one million rows:

wow=# \d _document83              Table "public._document83"   Column   |            Type             | Modifiers
------------+-----------------------------+----------- _idrref    | bytea                       | not null _marked    |
boolean                    | not null _date_time | timestamp without time zone | not null _number    | character(10)
          | not null _posted    | boolean                     | not null
 
Indexes:    "_document83ng_pkey" PRIMARY KEY, btree (_idrref)    "_document83_bydocdate_tr" btree (_date_time, _idrref)
  "qq" btree (_date_time)
 


Query:
SELECT    _Date_Time,    _IDRRef,
FROM    _Document83
WHERE    _Date_Time = '2006-06-21 11:24:56'::timestamp AND    _IDRRef >
'\\234?PPTP00\\021\\333\\000\\367\\004:v:'::bytea   OR _Date_Time > '2006-06-21 11:24:56'::timestamp
 
LIMIT 20;

Explain analyze in postgres 8.0: Limit  (cost=0.00..0.63 rows=20 width=44) (actual time=0.250..0.485 rows=20 
loops=1)   ->  Index Scan using qq, qq on _document83  (cost=0.00..6679.90 rows=211427 
width=44) (actual time=0.238..0.416 rows=20 loops=1)         Index Cond: ((_date_time = '2006-06-21
11:24:56'::timestampwithout 
 
time zone) OR (_date_time > '2006-06-21 11:24:56'::timestamp without time zone))         Filter: (((_date_time =
'2006-06-2111:24:56'::timestamp without time 
 
zone) AND (_idrref > '\\234?PPTP00\\021\\333\\000\\367\\004:v:'::bytea)) OR 
(_date_time > '2006-06-21 11:24:56'::timestamp without time zone)) Total runtime: 2.313 ms


Explain analyze in postgres 8.1: Limit  (cost=0.00..2.82 rows=20 width=44) (actual time=1448.897..1610.386 
rows=20 loops=1)   ->  Seq Scan on _document83  (cost=0.00..29729.04 rows=210782 width=44) 
(actual time=1448.889..1610.314 rows=20 loops=1)         Filter: ((_date_time > '2006-06-21 11:24:56'::timestamp
withouttime 
 
zone) OR ((_date_time = '2006-06-21 11:24:56'::timestamp without time zone) AND 
(_idrref > E'\\234?PPTP00\\021\\333\\000\\367\\004:v:'::bytea))) Total runtime: 1610.524 ms
(4 rows)

With ENABLE_SEQSCAN=OFF: Limit  (cost=1319.83..1321.23 rows=20 width=44) (actual time=193.261..193.382 
rows=20 loops=1)   ->  Bitmap Heap Scan on _document83  (cost=1319.83..16029.62 rows=210782 
width=44) (actual time=193.253..193.314 rows=20 loops=1)         Recheck Cond: (((_date_time = '2006-06-21
11:24:56'::timestampwithout 
 
time zone) AND (_idrref > E'\\234?PPTP00\\021\\333\\000\\367\\004:v:'::bytea)) 
OR (_date_time > '2006-06-21 11:24:56'::timestamp without time zone))         ->  BitmapOr  (cost=1319.83..1319.83
rows=210788width=0) (actual 
 
time=191.203..191.203 rows=0 loops=1)               ->  Bitmap Index Scan on _document83_bydocdate_tr 
(cost=0.00..2.18 rows=30 width=0) (actual time=2.470..2.470 rows=43 loops=1)                     Index Cond:
((_date_time= '2006-06-21 
 
11:24:56'::timestamp without time zone) AND (_idrref > 
E'\\234?PPTP00\\021\\333\\000\\367\\004:v:'::bytea))               ->  Bitmap Index Scan on qq  (cost=0.00..1317.65
rows=210758
 
width=0) (actual time=188.720..188.720 rows=275800 loops=1)                     Index Cond: (_date_time > '2006-06-21
11:24:56'::timestamp
 
without time zone) Total runtime: 193.872 ms

So, 8.0 is better at least in 100 times. Expected number of rows is close to 
real value ( ~270000 ). Rewriting query with UNION ALL makes better performance 
(about 1 ms): Limit  (cost=0.00..0.73 rows=20 width=44) (actual time=0.654..0.851 rows=20 
loops=1)   ->  Append  (cost=0.00..7712.53 rows=210788 width=44) (actual 
time=0.648..0.791 rows=20 loops=1)         ->  Index Scan using qq on _document83  (cost=0.00..6.42 rows=30 
width=44) (actual time=0.645..0.733 rows=20 loops=1)               Index Cond: (_date_time = '2006-06-21
11:24:56'::timestamp
 
without time zone)               Filter: (_idrref > 
E'\\234?PPTP00\\021\\333\\000\\367\\004:v:'::bytea)         ->  Index Scan using qq on _document83  (cost=0.00..5598.23

rows=210758 width=44) (never executed)               Index Cond: (_date_time > '2006-06-21 11:24:56'::timestamp 
without time zone) Total runtime: 1.059 ms
But it's not always possible to rewrite automatically generated query...

After adding 'order by', postgres uses index but plan becomes the same as 
before, with seqscan=off.

Can I tweak something in 8.1 or it's a bug?



-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
  WWW: http://www.sigaev.ru/
 


Re: Optimizer degradation since 8.0

From
Tom Lane
Date:
Teodor Sigaev <teodor@sigaev.ru> writes:
> Can I tweak something in 8.1 or it's a bug?

It's not a bug, it's an intentional change:

2005-04-24 21:30  tgl
* src/: backend/commands/explain.c,backend/executor/nodeBitmapIndexscan.c,backend/executor/nodeIndexscan.c,
backend/nodes/copyfuncs.c,backend/nodes/outfuncs.c,
backend/optimizer/path/allpaths.c,backend/optimizer/path/indxpath.c,backend/optimizer/path/orindxpath.c,backend/optimizer/plan/createplan.c,backend/optimizer/plan/setrefs.c,backend/optimizer/plan/subselect.c,backend/optimizer/util/pathnode.c,backend/optimizer/util/restrictinfo.c,backend/utils/adt/selfuncs.c,
include/executor/nodeIndexscan.h,include/nodes/execnodes.h,include/nodes/plannodes.h,include/nodes/relation.h,
include/optimizer/paths.h,include/optimizer/planmain.h,include/optimizer/restrictinfo.h:Remove support for OR'd
indexscansinternal to a single IndexScanplan node, as this behavior is now better done as a bitmap ORindexscan.  This
allowsconsiderable simplification innodeIndexscan.c itself as well as several planner modules concernedwith indexscan
plangeneration.  Also we can improve the sharing ofcode between regular and bitmap indexscans, since they are
nowworkingwith nigh-identical Plan nodes.
 

Your example shows a case where a plain indexscan's zero startup cost
is very useful.  I'm disinclined to revert the above patch completely,
because the OR'd-indexscan code was a mess, but perhaps we could
consider ways to make bitmap scans start delivering tuples sooner
than after finishing the whole index scan.  Not for 8.2 though :-(
        regards, tom lane