Thread: Bundle of patches
The 1C (http://www.1c.ru/) company kindly permits to publish a set of patches we (Oleg and me) developed during our work on porting of the '1C:Enterprise' system to support the PostgreSQL database. We would like to suggest to commit they to HEAD. 1) Typmod for user-defined types http://www.sigaev.ru/misc/user_defined_typmod-0.7.gz Patch is based on ideas from http://archives.postgresql.org/pgsql-hackers/2004-06/msg00932.php http://archives.postgresql.org/pgsql-hackers/2005-08/msg01007.php Patch adds to type definition two optional function: typmodinput and typmodoutput. That allows to develop user-defined types with type's modificators. Built-in types use typmod input/output functions too. Typmod internally is represented as non-negative int4 value, but patch allows to point list of integer in type definition. So, NUMERIC type works with a help of typmodin/typmodout function. 2) ORDER BY .. [ NULLS ( FIRST | LAST ) ] http://www.sigaev.ru/misc/NULLS_82-0.5.gz Allow to sort NULLs as greater or lesser than any value. The goal was to simplificate migrations from MySQL/MS SQL which think that NULL is less. Also, syntax conforms to SQL2003. It operate on gram.y level, and adds 'field is [not] null' qualification to sortClause. Note, to allow queries like 'select .. union .. order by f nulls first' pgsql now can rewrite that query to 'select * from (select .. union ..) order by f nulls first'. This solves the problem with 'resjunk' column in SelectStmt->sortClause. 3) Allow to use index for IS [NOT] NULL http://www.sigaev.ru/misc/indexnulls_82-0.6.gz Initially patch was developed by Martijn van Oosterhout <kleptog@svana.org>. But it's reworked and support of searching NULLS to GiST too. Patch adds new column named amsearchnull to pg_am. To recognize IS NULL clause ScanKey->sk_flags contains (SK_ISNULL & SK_INDEXFINDNULL) and ScanKey->sk_strategy == BTEqualStrategyNumber. For IS NOT NULL, ScanKey->sk_strategy == BTLessStrategyNumber. Thats because NULLs are treated greater than any value. It might be look some odd that for IS [NOT] NULL clauses we use Btree strategy numbers even for GiST, but if sk_flags contains SK_ISNULL then we never call user-defined functions. 4) OR clauses optimizations http://www.sigaev.ru/misc/OR_82-0.6.gz Patch can suggest new indexpaths to optimizer for ORed clauses. Patch uses generate_bitmapscan and predicate_implied_by/predicate_refuted_by machineries 4.1) Allow any useful common restriction clauses to be extracted from OR-of-AND quals. Also, it allows to combine several different operations to one which can be used in index scan. SELECT a, b FROM tst WHERE ( a = 50000 ) OR ( a > 50000 AND b > 50000 ) ORDER BY a, b LIMIT 20; Limit (cost=0.00..2.95 rows=20 width=8) (actual time=0.271..0.677 rows=20 loops=1) -> Index Scan using abidx on tst (cost=0.00..3671.26 rows=24878 width=8) (actual time=0.265..0.611 rows=20 loops=1) Index Cond: (a >= 50000) Filter: ((a = 50000) OR ((a > 50000) AND (b > 50000))) 4.2) When OR clauses aren't intersect and use the same index, it's possible to just concatenate results of indexscans. For that, now postgres may use Append node. Append node is modified to have a pathkeys. SELECT a FROM tst WHERE ( a > 60000 AND a < 61000 ) OR ( a > 20000 AND a < 21000 ) ORDER BY ASC LIMIT 20; Limit (cost=0.00..39.86 rows=20 width=4) (actual time=0.364..0.883 rows=20 loops=1) -> Result (cost=0.00..4001.55 rows=2008 width=4) (actual time=0.359..0.824 rows=20 loops=1) -> Append (cost=0.00..4001.55 rows=2008 width=4) (actual time=0.349..0.742 rows=20 loops=1) -> Index Scan using aidx on tst (cost=0.00..2000.42 rows=990 width=4) (actual time=0.346..0.684 rows=20 loops=1) Index Cond: ((a > 20000) AND (a < 21000)) -> Index Scan using aidx on tst (cost=0.00..2001.12 rows=1018 width=4) (never executed) Index Cond: ((a > 60000) AND (a < 61000)) Also, if there is a 'ORDER BY' clause, childs nodes may be ordered by theys ranges (compare plan with previous one). SELECT a FROM tst WHERE ( a > 60000 AND a < 61000 ) OR ( a > 20000 AND a < 21000 ) ORDER BY a DESC LIMIT 20; Limit (cost=0.00..39.86 rows=20 width=4) (actual time=0.162..0.651 rows=20 loops=1) -> Result (cost=0.00..4001.55 rows=2008 width=4) (actual time=0.157..0.589 rows=20 loops=1) -> Append (cost=0.00..4001.55 rows=2008 width=4) (actual time=0.149..0.511 rows=20 loops=1) -> Index Scan Backward using aidx on tst (cost=0.00..2001.12 rows=1018 width=4) (actual time=0.145..0.450 rows=20 loops=1) Index Cond: ((a > 60000) AND (a < 61000)) -> Index Scan Backward using aidx on tst (cost=0.00..2000.42 rows=990 width=4) (never executed) Index Cond: ((a > 20000) AND (a < 21000)) 4.3) As side effect of previous point, overlapped clauses can be eliminated: SELECT a FROM tst WHERE ( a > 50000 AND a < 61000 ) OR ( a > 60000 AND a < 60100 ) ORDER BY a LIMIT 20; Limit (cost=0.00..4.14 rows=20 width=4) (actual time=0.168..1.001 rows=20 loops=1) -> Index Scan using aidx on tst (cost=0.00..2344.85 rows=11338 width=4) (actual time=0.162..0.935 rows=20 loops=1) Index Cond: ((a > 50000) AND (a < 61000)) -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
Teodor Sigaev <teodor@sigaev.ru> writes: > 2) ORDER BY .. [ NULLS ( FIRST | LAST ) ] > http://www.sigaev.ru/misc/NULLS_82-0.5.gz i haven't looked at the other patches yet, but this one is just horrid :-( Instead of creating a proper parse-tree representation of the NULLS FIRST/LAST option, you've kluged it to convert the syntax into a double-column ordering using "foo IS [NOT] NULL" as the first column. This has obvious semantic disdvantages (what if foo is an expensive function?); but the real problem is that there's no way for the planner to reason about ordering in this representation. This patch would guarantee that an ORDER BY with the NULLS option couldn't use an indexscan, even if the index sorts nulls at the correct end. I think a reasonable implementation requires introducing an explicit concept of nulls-first-or-last into the planner's model of sort order, and probably into btree index opclasses as well. There is already some discussion about this in the archives with respect to the problem of handling descending-sort opclasses nicely. (If you just switch the operators to make a descending-sort opclass, then nulls end up at the "other end" from where they would otherwise, meaning that a backwards index scan does something unexpected. We have to fix that or else descending-sort opclasses can break mergejoins...) regards, tom lane
Teodor Sigaev <teodor@sigaev.ru> writes: > 3) Allow to use index for IS [NOT] NULL > http://www.sigaev.ru/misc/indexnulls_82-0.6.gz > Initially patch was developed by Martijn van Oosterhout <kleptog@svana.org>. > But it's reworked and support of searching NULLS to GiST too. Patch > adds new column named amsearchnull to pg_am. To recognize IS NULL clause > ScanKey->sk_flags contains (SK_ISNULL & SK_INDEXFINDNULL) and > ScanKey->sk_strategy == BTEqualStrategyNumber. For IS NOT NULL, > ScanKey->sk_strategy == BTLessStrategyNumber. Thats because NULLs are > treated greater than any value. And what happens when we implement NULLS FIRST/LAST correctly? This is really a poor choice of representation. One thing I find questionable about this is the assumption that indexes can support "foo IS NULL" and "foo IS NOT NULL" searches equally conveniently. This is demonstrably false for, say, hash. (Hash could store null keys by assigning them a fixed hashcode, say 0. Then it would be able to handle IS NULL searches, but not IS NOT NULL, because it can't do full-index scans.) I am not real sure that there is any point in making IS NOT NULL an indexable condition. We don't support <> as an indexable condition, and no one's yelled about that. It might be best just to simplify the patch to do IS NULL only. But if we are going to support both, we probably have to have two pg_am flags not one. regards, tom lane
Teodor Sigaev <teodor@sigaev.ru> writes: > 4) OR clauses optimizations > http://www.sigaev.ru/misc/OR_82-0.6.gz This seems kinda ugly --- it looks very expensive and unlikely to find useful optimizations most of the time. Have you done any benchmarking to find out what the cost is when the optimizations don't succeed? I guess my big problem with it is that it's a huge expansion of the single ugliest, most ad-hoc part of the planner, ie, orindxqual.c. And the new code is just as ad-hoc as what was there before ... Also, what's with the pull_tlist bit? regards, tom lane
> This has obvious semantic disdvantages (what if foo is an expensive > function?); Agree. > but the real problem is that there's no way for the planner > to reason about ordering in this representation. This patch would > guarantee that an ORDER BY with the NULLS option couldn't use an > indexscan, even if the index sorts nulls at the correct end. create table foo ( i int); insert into foo values (1), (5), (NULL); create index fooidx on foo (i); set enable_seqscan=off; set enable_bitmapscan=off; explain select i from foo order by i asc nulls last; QUERY PLAN ------------------------------------------------------------------- Index Scan using fooidx on foo (cost=0.00..12.05 rows=3 width=4) explain select i from foo order by i desc nulls first; QUERY PLAN ---------------------------------------------------------------------------- Index Scan Backward using fooidx on foo (cost=0.00..12.05 rows=3 width=4) Patch is smart enough about "native" NULL's ordering, so it adds quals only if it needed. Index support of non-"native" NULL's ordering, IMHO, has some correlation with suggested OR-patch. Sorting by ASC NULLS FIRST may done by two index scan with append node: Append Index Scan Cond: foo IS NULL Index Scan Cond: foo IS NOT NULL > I think a reasonable implementation requires introducing an explicit > concept of nulls-first-or-last into the planner's model of sort order, Agree, but I tried to keep patches independent as possible... If we will have agreement about ways to resolve, I'll will time to work further in foreseeable future. -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
Teodor Sigaev <teodor@sigaev.ru> writes: > 1) Typmod for user-defined types > http://www.sigaev.ru/misc/user_defined_typmod-0.7.gz > Patch is based on ideas from > http://archives.postgresql.org/pgsql-hackers/2004-06/msg00932.php > http://archives.postgresql.org/pgsql-hackers/2005-08/msg01007.php This one seems generally workable, but I really dislike the approach that's been used for passing typmod arguments to the typmod_in function. Representing them with an "internal" parameter means it'll be forever impossible to write typmod functions in anything but C, which seems an ugly restriction. Perhaps an array of int4 would be better? How much flexibility do we really want to provide for typmod arguments? Allowing full "expr_list" in the grammar seems less than sane, considering the result is still going to have to pack into 32 bits. The patch needs more cleanup before applying, too, eg make comments match code, get rid of unused keywords added to gram.y. regards, tom lane
Teodor Sigaev <teodor@sigaev.ru> writes: >> but the real problem is that there's no way for the planner >> to reason about ordering in this representation. This patch would >> guarantee that an ORDER BY with the NULLS option couldn't use an >> indexscan, even if the index sorts nulls at the correct end. > Patch is smart enough about "native" NULL's ordering, so it adds quals > only if it needed. [ looks again... ] Oh, so you've hard-wired the assumption that ASC/DESC correlate to NULLS FIRST/LAST right into the grammar :-( This is not a workable base for future extension. To be able to do anything interesting with descending-order opclasses, we really have to separate ASC/DESC from NULLS FIRST/LAST, not bind them permanently together. BTW, another sufficient reason to reject this approach is that a query entered as "ORDER BY foo NULLS FIRST" would come out as something quite different, if reverse-listed by ruleutils.c. We've paid the price of that sort of shortsightedness too many times in the past: implementation details should not get exposed in the reverse-listing. As a rule of thumb, if you are putting a transformation involving semantic knowledge into gram.y, you are probably putting it in the wrong place. (One of the reasons I like the typmod patch is that it helps pull a lot of inappropriate knowledge out of gram.y ...) regards, tom lane
> This seems kinda ugly --- it looks very expensive and unlikely to find > useful optimizations most of the time. Have you done any benchmarking > to find out what the cost is when the optimizations don't succeed? Yep, it's a big win with order and limit specified. Let table (a,b) has index over (a,b), so, queries with ((a=3 AND b>10) OR a>3) ORDER BY a,b may be done with one pass of index scan with condition a>=3 and filter ((a=3 and b>10) or a>3). And scan out is already sorted. The single way to execute it without patch is bitmap or scan over two index scans and following ordering. If limit is small enough then there is a lot unnecessary work for executor. Thats case may be found by find_common_quals() which is fast enough. Simplest case of second kind is ( a<3 or a>5 ). If it's possible to prove that set of rows for first condition and set for second are not intersected then output of correlated index scans can be simply joined/appended. In this case, we broaden applicability of Append node. What is more, it's possible to order index scans by conditions, which allows do not use sort node. I understand, that proving of non-intersecting and ordering by conditions is an expensive operation because of using predicate_implied_by/predicate_refuted_by, but in most cases they aren't even called. For using this optimization all conditions should on single index - and it's first step. Suggested plan's is very effective when a or b is large values like varchar, not a just integers. > Also, what's with the pull_tlist bit? pull target list from subplan. I've add it because query select b,a from foo where a<3 or a>5 order by a limit 1; with plan Result Append Index Scan Index Scan fails because Result node thinks that it gets (b,a) tuple, but in fact it gets (a,b). So, if pull_tlist is TRUE, then create_append_plan takes target list from first subplan. Currently, that bit set only with OR-optimized plan. And optimizer of OR-clause guarantee that target lists are the same. Sorry, but I didn't find clearer solution... -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
> This one seems generally workable, but I really dislike the approach > that's been used for passing typmod arguments to the typmod_in function. > Representing them with an "internal" parameter means it'll be forever > impossible to write typmod functions in anything but C, which seems an > ugly restriction. Perhaps an array of int4 would be better? How much I don't think that is a problem - I'll change that > flexibility do we really want to provide for typmod arguments? Allowing > full "expr_list" in the grammar seems less than sane, considering the > result is still going to have to pack into 32 bits. As I remember, I tried to use some thing else but, I've got a lot conflicts with AexprConst: func_name '(' expr_list ')' Sconst > > The patch needs more cleanup before applying, too, eg make comments > match code, get rid of unused keywords added to gram.y. Ok. -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
> And what happens when we implement NULLS FIRST/LAST correctly? This is > really a poor choice of representation. If it's just appending of indexscan's it's not a problem... > > One thing I find questionable about this is the assumption that indexes > can support "foo IS NULL" and "foo IS NOT NULL" searches equally > conveniently. This is demonstrably false for, say, hash. (Hash could > store null keys by assigning them a fixed hashcode, say 0. Then it > would be able to handle IS NULL searches, but not IS NOT NULL, because > it can't do full-index scans.) Is there a guarantee that hash value of some not-null keys doesn't equal to special hash code? > > the patch to do IS NULL only. But if we are going areto support both, > we probably have to have two pg_am flags not one. GiST isn't effective with single NOT NULL condition ... So, using two flags may be useful. -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
On Mon, Dec 04, 2006 at 01:35:21PM -0500, Tom Lane wrote: > > 3) Allow to use index for IS [NOT] NULL > > http://www.sigaev.ru/misc/indexnulls_82-0.6.gz > > Initially patch was developed by Martijn van Oosterhout <kleptog@svana.org>. > > But it's reworked and support of searching NULLS to GiST too. Patch > > adds new column named amsearchnull to pg_am. To recognize IS NULL clause > > ScanKey->sk_flags contains (SK_ISNULL & SK_INDEXFINDNULL) and > > ScanKey->sk_strategy == BTEqualStrategyNumber. For IS NOT NULL, > > ScanKey->sk_strategy == BTLessStrategyNumber. Thats because NULLs are > > treated greater than any value. > > I am not real sure that there is any point in making IS NOT NULL an > indexable condition. We don't support <> as an indexable condition, > and no one's yelled about that. It might be best just to simplify > the patch to do IS NULL only. But if we are going to support both, > we probably have to have two pg_am flags not one. Originally I didn't have IS NOT NULL but added it because it was easy and someone suggested a use case: for indexed columns that have a lot of nulls, it allows you to create an index scan that stops as soon as it reaches the first null entry. This is useful for the NULL FIRST/LAST optimisation for example. You're right, it doesn't work for hash indexes, but you can't do full scans on them anyway, so it's not terribly important. I'd say that ordered indexes like b-tree are the only ones that would get any benefit from IS NOT NULL. Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to litigate.
Attachment
On Mon, Dec 04, 2006 at 02:04:26PM -0500, Tom Lane wrote: > Teodor Sigaev <teodor@sigaev.ru> writes: > > 1) Typmod for user-defined types > > http://www.sigaev.ru/misc/user_defined_typmod-0.7.gz > > Patch is based on ideas from > > http://archives.postgresql.org/pgsql-hackers/2004-06/msg00932.php > > http://archives.postgresql.org/pgsql-hackers/2005-08/msg01007.php > > This one seems generally workable, but I really dislike the approach > that's been used for passing typmod arguments to the typmod_in function. > Representing them with an "internal" parameter means it'll be forever > impossible to write typmod functions in anything but C, which seems an > ugly restriction. Perhaps an array of int4 would be better? How much > flexibility do we really want to provide for typmod arguments? Allowing > full "expr_list" in the grammar seems less than sane, considering the > result is still going to have to pack into 32 bits. People have been discussion passing character set names as typmod parameters, so restricting them to int4 seems too tight. I'd favour the approach where the arguments to the typmod_in function determine the types required. This allows the system to do proper checking and casting and most important of all, good error messages, eg: ERROR: Invalid argument to type: must be one of numeric(), numeric(integer), numeric(integer, integer) Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to litigate.
> useful optimizations most of the time. Have you done any benchmarking > to find out what the cost is when the optimizations don't succeed? Test runs on my notebook with PIII/512Mb, FreeBSD 6.1, postgres was compiled with -O0 and --enable-debug --enable-cassert Test results (ms) | [1] | [2] | [3] ----------------------------------------------------------- [I] 8.3 | 88490.275 | 46684.972 | 21.563 [II] 8.3 splited query | 0.492 | 0.560 | 0.635 [III] 8.3 rewrited query | 0.523 | 0.952 | 1.004 [IV] 8.3+patch | 0.444 | 0.410 | 0.679 Query description All queries are variants of sinple OR clause with index support. In tests for 8.3 and 8.3+patch usial queries are used. '8.2 splited query' test executes separate queries for each OR-ed clause (in supposition that apllication should concatenate results). '8.3 rewrited query' test uses rewritten queries with the same result of execution and queries are fast as I could find. Test result shows a big advantage of using such kind of optimization and doesn't demonstrate performance gap on optimization stage. Test suite: # select * into foo from generate_series(1,100000) as f1, generate_series(1,100) as f2; # create index idx on foo (f1,f2); # vacuum analyze foo; Queries and plans in details: ==================[I] Without patch======================= [1] # explain analyze select * from foo where (f1=70000 and f2>95) or f1>70000 order by f1, f2 limit 10; Limit (cost=0.00..45.94 rows=10 width=8) (actual time=88490.044..88490.109 rows=10 loops=1) -> Index Scan using idx on foo (cost=0.00..14113503.78 rows=3071985 width=8) (actual time=88490.035..88490.072 rows=10 loops=1) Filter: (((f1 = 70000) AND (f2 > 95)) OR (f1 > 70000)) Total runtime: 88490.275 ms [2] # explain analyze select * from foo where (f1>40000 and f1<50000) or (f1>60000 and f1<70000) order by f1, f2 limit 10; Limit (cost=0.00..69.91 rows=10 width=8) (actual time=46684.725..46684.813 rows=10 loops=1) -> Index Scan using idx on foo (cost=0.00..14138503.78 rows=2022419 width=8) (actual time=46684.717..46684.768 rows=10 loops=1) Filter: (((f1 > 40000) AND (f1 < 50000)) OR ((f1 > 60000) AND (f1 < 70000))) Total runtime: 46684.972 ms [3] # explain analyze select * from foo where (f1>49980 and f1<50000) or (f1>69980 and f1<70000) order by f1, f2 limit 10; Limit (cost=13581.04..13581.07 rows=10 width=8) (actual time=20.767..20.809 rows=10 loops=1) -> Sort (cost=13581.04..13592.03 rows=4393 width=8) (actual time=20.759..20.773 rows=10 loops=1) Sort Key: f1, f2 -> Bitmap Heap Scan on foo (cost=102.12..13315.25 rows=4393 width=8) (actual time=3.495..11.911 rows=3800 loops=1) Recheck Cond: (((f1 > 49980) AND (f1 < 50000)) OR ((f1 > 69980) AND (f1 < 70000))) -> BitmapOr (cost=102.12..102.12 rows=4393 width=0) (actual time=3.415..3.415 rows=0 loops=1) -> Bitmap Index Scan on idx (cost=0.00..51.01 rows=2191 width=0) (actual time=1.887..1.887 rows=1900 loops=1) Index Cond: ((f1 > 49980) AND (f1 < 50000)) -> Bitmap Index Scan on idx (cost=0.00..51.12 rows=2202 width=0) (actual time=1.519..1.519 rows=1900 loops=1) Index Cond: ((f1 > 69980) AND (f1 < 70000)) Total runtime: 21.563 ms =================[II] Without patch, Two queries======================= Here plans are omitted because they are simple index scans. [1] # explain analyze select * from foo where f1=70000 and f2>95 order by f1, f2 limit 10; Total runtime: 0.219 ms # explain analyze select * from foo where f1>70000 order by f1, f2 limit 10; Total runtime: 0.273 ms [2] # explain analyze select * from foo where f1>40000 and f1<50000 order by f1, f2 limit 10; Total runtime: 0.245 ms # explain analyze select * from foo where f1>60000 and f1<70000 order by f1, f2 limit 10; Total runtime: 0.315 ms [3] # explain analyze select * from foo where f1>49980 and f1<50000 order by f1, f2 limit 10; Total runtime: 0.292 ms # explain analyze seleCt * from foo where f1>69980 and f1<70000 order by f1, f2 limit 10; Total runtime: 0.343 ms ==================[III]Without patch, rewrited query=================== [1] # explain analyze select * from foo where ((f1=70000 and f2>95) or f1>70000) and f1>=70000 order by f1, f2 limit 10; Limit (cost=0.00..50.33 rows=10 width=8) (actual time=0.320..0.387 rows=10 loops=1) -> Index Scan using idx on foo (cost=0.00..3974489.10 rows=789613 width=8) (actual time=0.313..0.348 rows=10 loops=1) Index Cond: (f1 >= 70000) Filter: (((f1 = 70000) AND (f2 > 95)) OR (f1 > 70000)) Total runtime: 0.523 ms [2] # explain analyze select * from ( select * from (select * from foo where f1>40000 and f1<50000 order by f1, f2 limit 10) as t1 union all select * from (select * from foo where f1>60000 and f1<70000 order by f1, f2 limit 10) as t2 ) as t order by f1, f2 limit 10; Limit (cost=28.85..28.87 rows=10 width=8) (actual time=0.586..0.627 rows=10 loops=1) -> Sort (cost=28.85..28.90 rows=20 width=8) (actual time=0.581..0.594 rows=10 loops=1) Sort Key: t.f1, t.f2 -> Result (cost=0.00..28.42 rows=20 width=8) (actual time=0.085..0.431 rows=20 loops=1) -> Append (cost=0.00..28.42 rows=20 width=8) (actual time=0.076..0.345 rows=20 loops=1) -> Limit (cost=0.00..14.11 rows=10 width=8) (actual time=0.074..0.127 rows=10 loops=1) -> Index Scan using idx on foo (cost=0.00..1358815.69 rows=963067 width=8) (actual time=0.069..0.098 rows=10 loops=1) Index Cond: ((f1 > 40000) AND (f1 < 50000)) -> Limit (cost=0.00..14.11 rows=10 width=8) (actual time=0.101..0.154 rows=10 loops=1) -> Index Scan using idx on foo (cost=0.00..1527039.50 rows=1082489 width=8) (actual time=0.097..0.121 rows=10 loops=1) Index Cond: ((f1 > 60000) AND (f1 < 70000)) Total runtime: 0.952 ms [3] # explain analyze select * from ( select * from (select * from foo where f1>49980 and f1<50000 order by f1, f2 limit 10) as t1 union all select * from (select * from foo where f1>69980 and f1<70000 order by f1, f2 limit 10) as t2 ) as t order by f1, f2 limit 10; Limit (cost=35.61..35.64 rows=10 width=8) (actual time=0.630..0.673 rows=10 loops=1) -> Sort (cost=35.61..35.66 rows=20 width=8) (actual time=0.626..0.641 rows=10 loops=1) Sort Key: t.f1, t.f2 -> Result (cost=0.00..35.18 rows=20 width=8) (actual time=0.114..0.476 rows=20 loops=1) -> Append (cost=0.00..35.18 rows=20 width=8) (actual time=0.107..0.401 rows=20 loops=1) -> Limit (cost=0.00..17.51 rows=10 width=8) (actual time=0.103..0.156 rows=10 loops=1) -> Index Scan using idx on foo (cost=0.00..3550.22 rows=2028 width=8) (actual time=0.099..0.126 rows=10 loops=1) Index Cond: ((f1 > 49980) AND (f1 < 50000)) -> Limit (cost=0.00..17.47 rows=10 width=8) (actual time=0.129..0.181 rows=10 loops=1) -> Index Scan using idx on foo (cost=0.00..3804.01 rows=2177 width=8) (actual time=0.125..0.152 rows=10 loops=1) Index Cond: ((f1 > 69980) AND (f1 < 70000)) Total runtime: 1.004 ms =================[IV]With patch=================== [1] # explain analyze select * from foo where (f1=70000 and f2>95) or f1>70000 order by f1, f2 limit 1 Limit (cost=0.00..1.41 rows=1 width=8) (actual time=0.314..0.316 rows=1 loops=1) -> Index Scan using idx on foo (cost=0.00..4210762.24 rows=2982440 width=8) (actual time=0.309..0.309 rows=1 loops=1) Index Cond: (f1 >= 70000) Filter: (((f1 = 70000) AND (f2 > 95)) OR (f1 > 70000)) Total runtime: 0.444 ms [2] # explain analyze select * from foo where (f1>40000 and f1<50000) or (f1>60000 and f1<70000) order by f1, f2 limit 10; Limit (cost=0.00..14.16 rows=10 width=8) (actual time=0.087..0.192 rows=10 loops=1) -> Result (cost=0.00..2932816.48 rows=2071539 width=8) (actual time=0.081..0.160 rows=10 loops=1) -> Append (cost=0.00..2932816.48 rows=2071539 width=8) (actual time=0.074..0.123 rows=10 loops=1) -> Index Scan using idx on foo (cost=0.00..1382937.86 rows=976723 width=8) (actual time=0.069..0.092 rows=10 loops=1) Index Cond: ((f1 > 40000) AND (f1 < 50000)) -> Index Scan using idx on foo (cost=0.00..1549878.62 rows=1094816 width=8) (never executed) Index Cond: ((f1 > 60000) AND (f1 < 70000)) Total runtime: 0.410 ms [3] # explain analyze select * from foo where (f1>49980 and f1<50000) or (f1>69980 and f1<70000) order by f1, f2 limit 10 Limit (cost=0.00..17.57 rows=10 width=8) (actual time=0.115..0.226 rows=10 loops=1) -> Result (cost=0.00..6902.27 rows=3928 width=8) (actual time=0.111..0.197 rows=10 loops=1) -> Append (cost=0.00..6902.27 rows=3928 width=8) (actual time=0.102..0.156 rows=10 loops=1) -> Index Scan using idx on foo (cost=0.00..3427.16 rows=1950 width=8) (actual time=0.099..0.125 rows=10 loops=1) Index Cond: ((f1 > 49980) AND (f1 < 50000)) -> Index Scan using idx on foo (cost=0.00..3475.11 rows=1978 width=8) (never executed) Index Cond: ((f1 > 69980) AND (f1 < 70000)) Total runtime: 0.679 ms -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
Teodor Sigaev <teodor@sigaev.ru> writes: >> useful optimizations most of the time. Have you done any benchmarking >> to find out what the cost is when the optimizations don't succeed? > Test runs on my notebook with PIII/512Mb, FreeBSD 6.1, postgres was compiled > with -O0 and --enable-debug --enable-cassert This is not responding to my concern. What you presented was an advertisement for the cases where the patch is able to find a better plan. What I want to know about is how much planning time is added for queries that it's *not* able to improve. EXPLAIN ANALYZE output doesn't address that point because it doesn't show planning time. regards, tom lane
> This is not responding to my concern. What you presented was an Sorry, I see your point now. Timing is with the same test suite, the same notebook, the same compiling options and patch (attached) which measures total time of keybased_rewrite_index_paths() call. Time in table is measured in microseconds (not milliseconds!): Number of clauses: | 2 | 3 | 4 ---------------------------------------------------- not improve | 123-333 | 209-566 | 359-876 improve | 138-486 | 962 | 1588 Details: a) Patch isn't able to improve # select * from foo where (f1>40000 and f1<50000) or (f1>45000 and f1<70000) order by f1, f2 limit 10; NOTICE: Elapsed time 0.000333 sec # explain select * from foo where (f1>40000 and f1<50000) or (f1>45000 and f1<70000) or (f1>65000 and f1<80000) order by f1, f2 limit 10; NOTICE: Elapsed time 0.000566 sec # explain select * from foo where (f1>40000 and f1<50000) or (f1>45000 and f1<70000) or (f1>65000 and f1<80000) or ( f1>75000 and f1<90000 ) order by f1, f2 limit 10; NOTICE: Elapsed time 0.000876 sec # explain select * from foo where (f1>40000 and f1<50000) or (f2>45 and f2<46) order by f1, f2 limit 10; NOTICE: Elapsed time 0.000123 sec # explain select * from foo where (f1>40000 and f1<50000) or (f2>45 and f2<46) or (f1<30000 and f2<36) order by f1, f2 limit 10; NOTICE: Elapsed time 0.000209 sec # explain select * from foo where (f1>40000 and f1<50000) or (f2>45 and f2<46) or (f1<30000 and f2<36) or (f1>45000 and f2>5) order by f1, f2 limit 10; NOTICE: Elapsed time 0.000359 sec b) Successful improving # select * from foo where (f1=70000 and f2>95) or f1>70000 order by f1, f2 limit 10; NOTICE: Elapsed time 0.000414 sec # select * from foo where (f1>40000 and f1<50000) or (f1>60000 and f1<70000) order by f1, f2 limit 10; NOTICE: Elapsed time 0.000486 sec # select * from foo where (f1>49980 and f1<50000) or (f1>69980 and f1<70000) order by f1, f2 limit 10; NOTICE: Elapsed time 0.000478 sec # select * from foo where (f1=70000 and f2>95) or f1>40000 order by f1, f2 limit 10; NOTICE: Elapsed time 0.000138 sec 2) tree clause # select * from foo where (f1>40000 and f1<50000) or (f1>60000 and f1<70000) or (f1>80000 and f1<90000)order by f1, f2 limit 10; NOTICE: Elapsed time 0.000962 sec 3) 4 clauses # select * from foo where (f1>40000 and f1<50000) or (f1>60000 and f1<70000) or (f1>80000 and f1<90000) or (f1>95000 and f1<96000) order by f1, f2 limit 10; NOTICE: Elapsed time 0.001588 sec Unsuccessful -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/ *** ./src/backend/optimizer/path/allpaths.c.orig Wed Dec 6 20:48:56 2006 --- ./src/backend/optimizer/path/allpaths.c Wed Dec 6 21:41:05 2006 *************** *** 32,37 **** --- 32,38 ---- #include "parser/parsetree.h" #include "rewrite/rewriteManip.h" + #include <sys/time.h> /* These parameters are set by GUC */ bool enable_geqo = false; /* just in case GUC doesn't set it */ *************** *** 189,194 **** --- 190,200 ---- #endif } + static double + timediff(struct timeval *begin, struct timeval *end) { + return ((double)( end->tv_sec - begin->tv_sec )) + ( (double)( end->tv_usec-begin->tv_usec ) ) / 1.0e+6; + } + /* * set_plain_rel_pathlist * Build access paths for a plain relation (no subquery, no inheritance) *************** *** 196,201 **** --- 202,209 ---- static void set_plain_rel_pathlist(PlannerInfo *root, RelOptInfo *rel, RangeTblEntry *rte) { + struct timeval begin, end; + /* Mark rel with estimated output rows, width, etc */ set_baserel_size_estimates(root, rel); *************** *** 243,249 **** --- 251,260 ---- create_index_paths(root, rel); /* Consider index scans with rewrited quals */ + gettimeofday(&begin,NULL); keybased_rewrite_index_paths(root, rel); + gettimeofday(&end,NULL); + elog(NOTICE,"Elapsed time %g sec", timediff(&begin, &end)); /* Consider TID scans */ create_tidscan_paths(root, rel);
>> Perhaps an array of int4 would be better? How much Done http://www.sigaev.ru/misc/user_defined_typmod-0.9.gz >> The patch needs more cleanup before applying, too, eg make comments >> match code, get rid of unused keywords added to gram.y. Cleaned. -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
Teodor Sigaev <teodor@sigaev.ru> writes: >>> Perhaps an array of int4 would be better? How much > Done > http://www.sigaev.ru/misc/user_defined_typmod-0.9.gz >>> The patch needs more cleanup before applying, too, eg make comments >>> match code, get rid of unused keywords added to gram.y. > Cleaned. OK. I'm up to my rear in the opclass/opfamily rewrite, but will take another look at the typmod code as soon as I have a working HEAD again ;-) regards, tom lane
0.9 doesn't apply cleanly after Peter's changes, so, new version http://www.sigaev.ru/misc/user_defined_typmod-0.10.gz Teodor Sigaev wrote: > >> Perhaps an array of int4 would be better? How much > > Done > http://www.sigaev.ru/misc/user_defined_typmod-0.9.gz > >>> The patch needs more cleanup before applying, too, eg make comments >>> match code, get rid of unused keywords added to gram.y. > > Cleaned. > > -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
Just a freshing for clean applying.. http://www.sigaev.ru/misc/user_defined_typmod-0.11.gz Is any objections to commit? -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
>> This is not responding to my concern. What you presented was an > Sorry, I see your point now. Is that test enough? Or I should make more? -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
Teodor Sigaev <teodor@sigaev.ru> writes: > Just a freshing for clean applying.. > http://www.sigaev.ru/misc/user_defined_typmod-0.11.gz > Is any objections to commit? There's still a lot I don't particularly care for here (lack of documentation being the biggest), but I'll make a pass at cleaning it up. One thought I had after a quick reading of the patch is that there doesn't seem to be a whole lot of percentage in trying to move the typmod handling for time/timestamp/interval types into callable functions, because we still have to special-case the darn things thanks to the SQL spec's oddball syntax requirements. For instance in format_type_be() you've got case TIMETZOID: if (with_typemod) ! buf = psnprintf(50, "time(%d) with time zone", ! typemod); else buf = pstrdup("time with time zone"); break; changed to case TIMETZOID: if (with_typemod) ! buf = printTypmod("time", typemod, typeform->typmodoutput); else buf = pstrdup("time with time zone"); break; which hardly seems like much of an improvement. If we could have gotten rid of the switch() branch for TIMETZOID entirely, then we'd be somewhere, but there's no chance because the typmod-less case still has to be special. So I'm sort of inclined to leave the processing of these datatypes as-is, and only use the typmodin/typmodout infrastructure for datatypes that follow the "canonical" syntax of type_name(int[,int ...]). Thoughts? regards, tom lane
Teodor Sigaev <teodor@sigaev.ru> writes: > Just a freshing for clean applying.. > http://www.sigaev.ru/misc/user_defined_typmod-0.11.gz Applied with some revisions, and pg_dump support and regression tests added. regards, tom lane
Nice, thanks a lot. Tom Lane wrote: > Teodor Sigaev <teodor@sigaev.ru> writes: >> Just a freshing for clean applying.. >> http://www.sigaev.ru/misc/user_defined_typmod-0.11.gz > > Applied with some revisions, and pg_dump support and regression tests > added. > > regards, tom lane > > ---------------------------(end of broadcast)--------------------------- > TIP 3: Have you checked our extensive FAQ? > > http://www.postgresql.org/docs/faq -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
I assume we have taken all the patches from here we want. --------------------------------------------------------------------------- Teodor Sigaev wrote: > The 1C (http://www.1c.ru/) company kindly permits to publish a set of patches > we (Oleg and me) developed during our work on porting of the '1C:Enterprise' > system to support the PostgreSQL database. > > We would like to suggest to commit they to HEAD. > > > 1) Typmod for user-defined types > http://www.sigaev.ru/misc/user_defined_typmod-0.7.gz > Patch is based on ideas from > http://archives.postgresql.org/pgsql-hackers/2004-06/msg00932.php > http://archives.postgresql.org/pgsql-hackers/2005-08/msg01007.php > > Patch adds to type definition two optional function: typmodinput and > typmodoutput. That allows to develop user-defined types with type's > modificators. Built-in types use typmod input/output functions too. > Typmod internally is represented as non-negative int4 value, > but patch allows to point list of integer in type definition. So, > NUMERIC type works with a help of typmodin/typmodout function. > > > 2) ORDER BY .. [ NULLS ( FIRST | LAST ) ] > http://www.sigaev.ru/misc/NULLS_82-0.5.gz > Allow to sort NULLs as greater or lesser than any value. The goal was to > simplificate migrations from MySQL/MS SQL which think that NULL is less. > Also, syntax conforms to SQL2003. It operate on gram.y level, and > adds 'field is [not] null' qualification to sortClause. > Note, to allow queries like 'select .. union .. order by f nulls first' > pgsql now can rewrite that query to > 'select * from (select .. union ..) order by f nulls first'. This solves the > problem with 'resjunk' column in SelectStmt->sortClause. > > 3) Allow to use index for IS [NOT] NULL > http://www.sigaev.ru/misc/indexnulls_82-0.6.gz > Initially patch was developed by Martijn van Oosterhout <kleptog@svana.org>. > But it's reworked and support of searching NULLS to GiST too. Patch > adds new column named amsearchnull to pg_am. To recognize IS NULL clause > ScanKey->sk_flags contains (SK_ISNULL & SK_INDEXFINDNULL) and > ScanKey->sk_strategy == BTEqualStrategyNumber. For IS NOT NULL, > ScanKey->sk_strategy == BTLessStrategyNumber. Thats because NULLs are > treated greater than any value. It might be look some odd that > for IS [NOT] NULL clauses we use Btree strategy numbers even for GiST, > but if sk_flags contains SK_ISNULL then we never call user-defined functions. > > 4) OR clauses optimizations > http://www.sigaev.ru/misc/OR_82-0.6.gz > Patch can suggest new indexpaths to optimizer for ORed clauses. Patch uses > generate_bitmapscan and predicate_implied_by/predicate_refuted_by machineries > > 4.1) Allow any useful common restriction clauses to be extracted from > OR-of-AND quals. Also, it allows to combine several different > operations to one which can be used in index scan. > SELECT > a, b > FROM > tst > WHERE ( a = 50000 ) OR ( a > 50000 AND b > 50000 ) > ORDER BY a, b > LIMIT 20; > Limit (cost=0.00..2.95 rows=20 width=8) (actual time=0.271..0.677 rows=20 > loops=1) > -> Index Scan using abidx on tst (cost=0.00..3671.26 rows=24878 width=8) > (actual time=0.265..0.611 rows=20 loops=1) > Index Cond: (a >= 50000) > Filter: ((a = 50000) OR ((a > 50000) AND (b > 50000))) > 4.2) When OR clauses aren't intersect and use the same index, it's possible > to just concatenate results of indexscans. For that, now postgres may use > Append node. Append node is modified to have a pathkeys. > > SELECT > a > FROM > tst > WHERE ( a > 60000 AND a < 61000 ) OR ( a > 20000 AND a < 21000 ) > ORDER BY ASC > LIMIT 20; > Limit (cost=0.00..39.86 rows=20 width=4) (actual time=0.364..0.883 rows=20 > loops=1) > -> Result (cost=0.00..4001.55 rows=2008 width=4) (actual time=0.359..0.824 > rows=20 loops=1) > -> Append (cost=0.00..4001.55 rows=2008 width=4) (actual > time=0.349..0.742 rows=20 loops=1) > -> Index Scan using aidx on tst (cost=0.00..2000.42 rows=990 > width=4) (actual time=0.346..0.684 rows=20 loops=1) > Index Cond: ((a > 20000) AND (a < 21000)) > -> Index Scan using aidx on tst (cost=0.00..2001.12 rows=1018 > width=4) (never executed) > Index Cond: ((a > 60000) AND (a < 61000)) > > Also, if there is a 'ORDER BY' clause, childs nodes may be ordered by theys > ranges (compare plan with previous one). > SELECT > a > FROM > tst > WHERE ( a > 60000 AND a < 61000 ) OR ( a > 20000 AND a < 21000 ) > ORDER BY a DESC > LIMIT 20; > Limit (cost=0.00..39.86 rows=20 width=4) (actual time=0.162..0.651 rows=20 > loops=1) > -> Result (cost=0.00..4001.55 rows=2008 width=4) (actual time=0.157..0.589 > rows=20 loops=1) > -> Append (cost=0.00..4001.55 rows=2008 width=4) (actual > time=0.149..0.511 rows=20 loops=1) > -> Index Scan Backward using aidx on tst (cost=0.00..2001.12 > rows=1018 width=4) (actual time=0.145..0.450 rows=20 loops=1) > Index Cond: ((a > 60000) AND (a < 61000)) > -> Index Scan Backward using aidx on tst (cost=0.00..2000.42 > rows=990 width=4) (never executed) > Index Cond: ((a > 20000) AND (a < 21000)) > > 4.3) As side effect of previous point, overlapped clauses can be eliminated: > > SELECT > a > FROM > tst > WHERE ( a > 50000 AND a < 61000 ) OR ( a > 60000 AND a < 60100 ) > ORDER BY a > LIMIT 20; > Limit (cost=0.00..4.14 rows=20 width=4) (actual time=0.168..1.001 rows=20 > loops=1) > -> Index Scan using aidx on tst (cost=0.00..2344.85 rows=11338 width=4) > (actual time=0.162..0.935 rows=20 loops=1) > Index Cond: ((a > 50000) AND (a < 61000)) > > > > -- > Teodor Sigaev E-mail: teodor@sigaev.ru > WWW: http://www.sigaev.ru/ > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Have you searched our list archives? > > http://archives.postgresql.org -- Bruce Momjian bruce@momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +