Thread: Re: [PATCHES] Bundle of patches

Re: [PATCHES] Bundle of patches

From
Tom Lane
Date:
Teodor Sigaev <teodor@sigaev.ru> writes:
> 4) OR clauses optimizations
>    http://www.sigaev.ru/misc/OR_82-0.6.gz

This seems kinda ugly --- it looks very expensive and unlikely to find
useful optimizations most of the time.  Have you done any benchmarking
to find out what the cost is when the optimizations don't succeed?

I guess my big problem with it is that it's a huge expansion of the
single ugliest, most ad-hoc part of the planner, ie, orindxqual.c.
And the new code is just as ad-hoc as what was there before ...

Also, what's with the pull_tlist bit?

            regards, tom lane

Re: [PATCHES] Bundle of patches

From
Teodor Sigaev
Date:
> This seems kinda ugly --- it looks very expensive and unlikely to find
> useful optimizations most of the time.  Have you done any benchmarking
> to find out what the cost is when the optimizations don't succeed?

Yep, it's a big win with order and limit specified.
Let table (a,b) has index over (a,b), so,
queries with ((a=3 AND b>10) OR a>3) ORDER BY a,b may be done with one pass of
index scan with condition a>=3 and filter ((a=3 and b>10) or a>3). And scan out
is already sorted. The single way to execute it without patch is bitmap or scan
over two index scans and following ordering. If limit is small enough then there
is a lot  unnecessary work for executor. Thats case may be found by
find_common_quals() which is fast enough.

Simplest case of second kind is ( a<3 or a>5 ). If it's possible to prove that
set of rows for first condition and set for second are not intersected then
output of correlated index scans can be simply joined/appended. In this case, we
broaden applicability of Append node. What is more, it's possible to order index
scans by conditions, which allows do not use sort node.

I understand, that proving of non-intersecting and ordering by conditions is an
expensive operation because of using predicate_implied_by/predicate_refuted_by,
but in most cases they aren't even called. For using this optimization all
conditions should on single index - and it's first step.

Suggested plan's is very effective when a or b is large values like varchar, not
a just integers.

> Also, what's with the pull_tlist bit?
pull target list from subplan.

I've add it because query
select b,a from foo where a<3 or a>5 order by a limit 1;
with plan
Result
    Append
        Index Scan
        Index Scan
fails because Result node thinks that it gets (b,a) tuple, but in fact it gets
(a,b). So, if pull_tlist is TRUE, then create_append_plan takes target list from
first subplan. Currently, that bit set only with OR-optimized plan. And
optimizer of OR-clause guarantee that target lists are the same. Sorry, but I
didn't find clearer solution...


--
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/

Re: [PATCHES] Bundle of patches

From
Teodor Sigaev
Date:
> useful optimizations most of the time.  Have you done any benchmarking
> to find out what the cost is when the optimizations don't succeed?
Test runs on my notebook with PIII/512Mb, FreeBSD 6.1, postgres was compiled
with -O0 and --enable-debug --enable-cassert

Test results (ms)
                          |    [1]    |    [2]    |    [3]
-----------------------------------------------------------
[I]   8.3                | 88490.275 | 46684.972 |  21.563
[II]  8.3 splited query  |  0.492    |   0.560   |  0.635
[III] 8.3 rewrited query |  0.523    |   0.952   |  1.004
[IV]  8.3+patch          |  0.444    |   0.410   |  0.679

Query description
All queries are variants of sinple OR clause with index support. In tests
for 8.3 and 8.3+patch usial queries are used. '8.2 splited query' test executes
separate queries for each OR-ed clause (in supposition that apllication should
concatenate results). '8.3 rewrited query' test uses rewritten queries with
the same result of execution and queries are fast as I could find.

Test result shows a big advantage of using such kind of optimization and
doesn't demonstrate performance gap on optimization stage.

Test suite:
# select * into foo from generate_series(1,100000) as f1, generate_series(1,100)
as f2;
# create index idx on foo (f1,f2);
# vacuum analyze foo;

Queries and plans in details:
==================[I] Without patch=======================
[1]
# explain analyze select * from foo where (f1=70000 and f2>95) or f1>70000 order
by f1, f2 limit 10;
Limit  (cost=0.00..45.94 rows=10 width=8) (actual time=88490.044..88490.109
rows=10 loops=1)
      ->  Index Scan using idx on foo  (cost=0.00..14113503.78 rows=3071985
width=8) (actual time=88490.035..88490.072 rows=10 loops=1)
           Filter: (((f1 = 70000) AND (f2 > 95)) OR (f1 > 70000))
Total runtime: 88490.275 ms

[2]
# explain analyze select * from foo where (f1>40000 and f1<50000) or (f1>60000
and f1<70000) order by f1, f2 limit 10;
Limit  (cost=0.00..69.91 rows=10 width=8) (actual time=46684.725..46684.813
rows=10 loops=1)
     ->  Index Scan using idx on foo  (cost=0.00..14138503.78 rows=2022419
width=8) (actual time=46684.717..46684.768 rows=10 loops=1)
         Filter: (((f1 > 40000) AND (f1 < 50000)) OR ((f1 > 60000) AND (f1 <
70000)))
Total runtime: 46684.972 ms

[3]
# explain  analyze select * from foo where (f1>49980 and f1<50000) or (f1>69980
and f1<70000) order by f1, f2 limit 10;
Limit  (cost=13581.04..13581.07 rows=10 width=8) (actual time=20.767..20.809
rows=10 loops=1)
     ->  Sort  (cost=13581.04..13592.03 rows=4393 width=8) (actual
time=20.759..20.773 rows=10 loops=1)
         Sort Key: f1, f2
         ->  Bitmap Heap Scan on foo  (cost=102.12..13315.25 rows=4393 width=8)
(actual time=3.495..11.911 rows=3800 loops=1)
             Recheck Cond: (((f1 > 49980) AND (f1 < 50000)) OR ((f1 > 69980) AND
(f1 < 70000)))
             ->  BitmapOr  (cost=102.12..102.12 rows=4393 width=0) (actual
time=3.415..3.415 rows=0 loops=1)
                 ->  Bitmap Index Scan on idx  (cost=0.00..51.01 rows=2191
width=0) (actual time=1.887..1.887 rows=1900 loops=1)
                     Index Cond: ((f1 > 49980) AND (f1 < 50000))
                 ->  Bitmap Index Scan on idx  (cost=0.00..51.12 rows=2202
width=0) (actual time=1.519..1.519 rows=1900 loops=1)
                     Index Cond: ((f1 > 69980) AND (f1 < 70000))
Total runtime: 21.563 ms

=================[II] Without patch, Two queries=======================
Here plans are omitted because they are simple index scans.
[1]
# explain analyze select * from foo where f1=70000 and f2>95 order by f1, f2
limit 10;
Total runtime: 0.219 ms
# explain analyze select * from foo where f1>70000 order by f1, f2 limit 10;
Total runtime: 0.273 ms

[2]
#  explain analyze select * from foo where  f1>40000 and f1<50000 order by f1,
f2 limit 10;
Total runtime: 0.245 ms
# explain analyze select * from foo where  f1>60000 and f1<70000 order by f1, f2
limit 10;
Total runtime: 0.315 ms

[3]
# explain analyze select * from foo where f1>49980 and f1<50000 order by f1, f2
limit 10;
Total runtime: 0.292 ms
# explain analyze seleCt * from foo where f1>69980 and f1<70000 order by f1, f2
limit 10;
Total runtime: 0.343 ms


==================[III]Without patch, rewrited query===================
[1]
# explain analyze select * from foo where ((f1=70000 and f2>95) or f1>70000) and
f1>=70000 order by f1, f2 limit 10;
Limit  (cost=0.00..50.33 rows=10 width=8) (actual time=0.320..0.387 rows=10 loops=1)
     ->  Index Scan using idx on foo  (cost=0.00..3974489.10 rows=789613
width=8) (actual time=0.313..0.348 rows=10 loops=1)
         Index Cond: (f1 >= 70000)
         Filter: (((f1 = 70000) AND (f2 > 95)) OR (f1 > 70000))
Total runtime: 0.523 ms

[2]
# explain  analyze
   select * from (
     select * from (select * from foo where f1>40000 and f1<50000 order by f1,
f2 limit 10) as t1
     union all
     select * from (select * from foo where f1>60000 and f1<70000 order by f1,
f2 limit 10) as t2
   ) as t
   order by f1, f2 limit 10;
Limit  (cost=28.85..28.87 rows=10 width=8) (actual time=0.586..0.627 rows=10
loops=1)
     ->  Sort  (cost=28.85..28.90 rows=20 width=8) (actual time=0.581..0.594
rows=10 loops=1)
         Sort Key: t.f1, t.f2
         ->  Result  (cost=0.00..28.42 rows=20 width=8) (actual
time=0.085..0.431 rows=20 loops=1)
             ->  Append  (cost=0.00..28.42 rows=20 width=8) (actual
time=0.076..0.345 rows=20 loops=1)
                 ->  Limit  (cost=0.00..14.11 rows=10 width=8) (actual
time=0.074..0.127 rows=10 loops=1)
                     ->  Index Scan using idx on foo  (cost=0.00..1358815.69
rows=963067 width=8) (actual time=0.069..0.098 rows=10 loops=1)
                         Index Cond: ((f1 > 40000) AND (f1 < 50000))
                 ->  Limit  (cost=0.00..14.11 rows=10 width=8) (actual
time=0.101..0.154 rows=10 loops=1)
                     ->  Index Scan using idx on foo  (cost=0.00..1527039.50
rows=1082489 width=8) (actual time=0.097..0.121 rows=10 loops=1)
                         Index Cond: ((f1 > 60000) AND (f1 < 70000))
Total runtime: 0.952 ms

[3]
# explain  analyze
   select * from (
     select * from (select * from foo where f1>49980 and f1<50000 order by f1,
f2 limit 10) as t1
     union all
     select * from (select * from foo where f1>69980 and f1<70000 order by f1,
f2 limit 10) as t2
   ) as t
   order by f1, f2 limit 10;
Limit  (cost=35.61..35.64 rows=10 width=8) (actual time=0.630..0.673 rows=10
loops=1)
     ->  Sort  (cost=35.61..35.66 rows=20 width=8) (actual time=0.626..0.641
rows=10 loops=1)
         Sort Key: t.f1, t.f2
         ->  Result  (cost=0.00..35.18 rows=20 width=8) (actual
time=0.114..0.476 rows=20 loops=1)
             ->  Append  (cost=0.00..35.18 rows=20 width=8) (actual
time=0.107..0.401 rows=20 loops=1)
                 ->  Limit  (cost=0.00..17.51 rows=10 width=8) (actual
time=0.103..0.156 rows=10 loops=1)
                     ->  Index Scan using idx on foo  (cost=0.00..3550.22
rows=2028 width=8) (actual time=0.099..0.126 rows=10 loops=1)
                         Index Cond: ((f1 > 49980) AND (f1 < 50000))
                 ->  Limit  (cost=0.00..17.47 rows=10 width=8) (actual
time=0.129..0.181 rows=10 loops=1)
                     ->  Index Scan using idx on foo  (cost=0.00..3804.01
rows=2177 width=8) (actual time=0.125..0.152 rows=10 loops=1)
                         Index Cond: ((f1 > 69980) AND (f1 < 70000))
Total runtime: 1.004 ms

=================[IV]With patch===================
[1]
# explain analyze select * from foo where (f1=70000 and f2>95) or f1>70000 order
by f1, f2 limit 1
Limit  (cost=0.00..1.41 rows=1 width=8) (actual time=0.314..0.316 rows=1 loops=1)
     ->  Index Scan using idx on foo  (cost=0.00..4210762.24 rows=2982440
width=8) (actual time=0.309..0.309 rows=1 loops=1)
         Index Cond: (f1 >= 70000)
         Filter: (((f1 = 70000) AND (f2 > 95)) OR (f1 > 70000))
Total runtime: 0.444 ms

[2]
# explain analyze select * from foo where (f1>40000 and f1<50000) or (f1>60000
and f1<70000) order by f1, f2 limit 10;
Limit  (cost=0.00..14.16 rows=10 width=8) (actual time=0.087..0.192 rows=10 loops=1)
     ->  Result  (cost=0.00..2932816.48 rows=2071539 width=8) (actual
time=0.081..0.160 rows=10 loops=1)
         ->  Append  (cost=0.00..2932816.48 rows=2071539 width=8) (actual
time=0.074..0.123 rows=10 loops=1)
             ->  Index Scan using idx on foo  (cost=0.00..1382937.86 rows=976723
width=8) (actual time=0.069..0.092 rows=10 loops=1)
                 Index Cond: ((f1 > 40000) AND (f1 < 50000))
             ->  Index Scan using idx on foo  (cost=0.00..1549878.62
rows=1094816 width=8) (never executed)
                 Index Cond: ((f1 > 60000) AND (f1 < 70000))
Total runtime: 0.410 ms

[3]
# explain  analyze select * from foo where (f1>49980 and f1<50000) or (f1>69980
and f1<70000) order by f1, f2 limit 10
Limit  (cost=0.00..17.57 rows=10 width=8) (actual time=0.115..0.226 rows=10 loops=1)
     ->  Result  (cost=0.00..6902.27 rows=3928 width=8) (actual
time=0.111..0.197 rows=10 loops=1)
         ->  Append  (cost=0.00..6902.27 rows=3928 width=8) (actual
time=0.102..0.156 rows=10 loops=1)
             ->  Index Scan using idx on foo  (cost=0.00..3427.16 rows=1950
width=8) (actual time=0.099..0.125 rows=10 loops=1)
                 Index Cond: ((f1 > 49980) AND (f1 < 50000))
             ->  Index Scan using idx on foo  (cost=0.00..3475.11 rows=1978
width=8) (never executed)
                 Index Cond: ((f1 > 69980) AND (f1 < 70000))
Total runtime: 0.679 ms


--
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/

Re: [PATCHES] Bundle of patches

From
Tom Lane
Date:
Teodor Sigaev <teodor@sigaev.ru> writes:
>> useful optimizations most of the time.  Have you done any benchmarking
>> to find out what the cost is when the optimizations don't succeed?

> Test runs on my notebook with PIII/512Mb, FreeBSD 6.1, postgres was compiled
> with -O0 and --enable-debug --enable-cassert

This is not responding to my concern.  What you presented was an
advertisement for the cases where the patch is able to find a better
plan.  What I want to know about is how much planning time is added
for queries that it's *not* able to improve.  EXPLAIN ANALYZE output
doesn't address that point because it doesn't show planning time.

            regards, tom lane

Re: [PATCHES] Bundle of patches

From
Teodor Sigaev
Date:
> This is not responding to my concern.  What you presented was an
Sorry, I see your point now.

Timing is with the same test suite, the same notebook, the same compiling
options and patch (attached) which measures total time of
keybased_rewrite_index_paths() call.

Time in table is measured in microseconds (not milliseconds!):

   Number of clauses:  |    2    |    3    |    4
----------------------------------------------------
not improve           | 123-333 | 209-566 | 359-876
improve               | 138-486 |   962   |  1588

Details:
a) Patch isn't able to improve
# select * from foo where (f1>40000 and f1<50000) or (f1>45000 and f1<70000)
order by f1, f2 limit 10;
NOTICE:  Elapsed time 0.000333 sec
# explain select * from foo where (f1>40000 and f1<50000) or (f1>45000 and
f1<70000) or (f1>65000 and f1<80000) order by f1, f2 limit 10;
NOTICE:  Elapsed time 0.000566 sec
# explain select * from foo where (f1>40000 and f1<50000) or (f1>45000 and
f1<70000) or (f1>65000 and f1<80000) or ( f1>75000 and f1<90000 ) order by f1,
f2 limit 10;
NOTICE:  Elapsed time 0.000876 sec
# explain select * from foo where (f1>40000 and f1<50000) or (f2>45 and f2<46)
order by f1, f2 limit 10;
NOTICE:  Elapsed time 0.000123 sec
# explain select * from foo where (f1>40000 and f1<50000) or (f2>45 and f2<46)
or (f1<30000 and f2<36) order by f1, f2 limit 10;
NOTICE:  Elapsed time 0.000209 sec
# explain select * from foo where (f1>40000 and f1<50000) or (f2>45 and f2<46)
or (f1<30000 and f2<36) or (f1>45000 and f2>5) order by f1, f2 limit 10;
NOTICE:  Elapsed time 0.000359 sec


b) Successful improving
# select * from foo where (f1=70000 and f2>95) or f1>70000 order by f1, f2 limit 10;
NOTICE:  Elapsed time 0.000414 sec
# select * from foo where (f1>40000 and f1<50000) or (f1>60000 and f1<70000)
order by f1, f2 limit 10;
NOTICE:  Elapsed time 0.000486 sec
# select * from foo where (f1>49980 and f1<50000) or (f1>69980 and f1<70000)
order by f1, f2 limit 10;
NOTICE:  Elapsed time 0.000478 sec
#  select * from foo where (f1=70000 and f2>95) or f1>40000 order by f1, f2
limit 10;
NOTICE:  Elapsed time 0.000138 sec
2) tree clause
# select * from foo where (f1>40000 and f1<50000) or (f1>60000 and f1<70000) or
(f1>80000 and f1<90000)order by f1, f2 limit 10;
NOTICE:  Elapsed time 0.000962 sec
3) 4 clauses
# select * from foo where (f1>40000 and f1<50000) or (f1>60000 and f1<70000) or
(f1>80000 and f1<90000) or (f1>95000 and f1<96000) order by f1, f2 limit 10;
NOTICE:  Elapsed time 0.001588 sec



Unsuccessful



--
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/
*** ./src/backend/optimizer/path/allpaths.c.orig    Wed Dec  6 20:48:56 2006
--- ./src/backend/optimizer/path/allpaths.c    Wed Dec  6 21:41:05 2006
***************
*** 32,37 ****
--- 32,38 ----
  #include "parser/parsetree.h"
  #include "rewrite/rewriteManip.h"

+ #include <sys/time.h>

  /* These parameters are set by GUC */
  bool        enable_geqo = false;    /* just in case GUC doesn't set it */
***************
*** 189,194 ****
--- 190,200 ----
  #endif
  }

+ static double
+ timediff(struct timeval *begin, struct timeval *end) {
+     return ((double)( end->tv_sec - begin->tv_sec )) + ( (double)( end->tv_usec-begin->tv_usec ) ) / 1.0e+6;
+ }
+
  /*
   * set_plain_rel_pathlist
   *      Build access paths for a plain relation (no subquery, no inheritance)
***************
*** 196,201 ****
--- 202,209 ----
  static void
  set_plain_rel_pathlist(PlannerInfo *root, RelOptInfo *rel, RangeTblEntry *rte)
  {
+     struct timeval begin, end;
+
      /* Mark rel with estimated output rows, width, etc */
      set_baserel_size_estimates(root, rel);

***************
*** 243,249 ****
--- 251,260 ----
      create_index_paths(root, rel);

      /* Consider index scans with rewrited quals */
+     gettimeofday(&begin,NULL);
      keybased_rewrite_index_paths(root, rel);
+     gettimeofday(&end,NULL);
+     elog(NOTICE,"Elapsed time %g sec", timediff(&begin, &end));

      /* Consider TID scans */
      create_tidscan_paths(root, rel);

Re: [PATCHES] Bundle of patches

From
Teodor Sigaev
Date:
>> This is not responding to my concern.  What you presented was an
 > Sorry, I see your point now.

Is that test enough? Or I should make more?

--
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/