Thread: [HACKERS] Secondary index access optimizations
Hi hackers,

I am trying to compare different ways of optimizing work with huge append-only tables in PostgreSQL where the primary key is something like a timestamp and queries usually access the most recent data using some secondary keys. The size of the secondary index is one of the most critical factors limiting insert/search performance. As far as data is inserted in timestamp ascending order, access to the primary key is well localized and the accessed pages are present in memory. But if we create a secondary key for the whole table, then access to it will require random reads from the disk and significantly decrease performance.

There are two well known solutions to the problem:
1. Table partitioning
2. Partial indexes

These are the approaches I want to compare. First of all I want to check whether the optimizer is able to generate an efficient query execution plan covering different time intervals.
Unfortunately in both cases the generated plan is not optimal.

1. Table partitioning:

create table base (k integer primary key, v integer);
create table part1 (check (k between 1 and 10000)) inherits (base);
create table part2 (check (k between 10001 and 20000)) inherits (base);
create index pi1 on part1(v);
create index pi2 on part2(v);
insert into part1 values (generate_series(1,10000), random());
insert into part2 values (generate_series(10001,20000), random());
explain select * from base where k between 1 and 20000 and v = 100;
QUERY PLAN
-----------------------------------------------------------------------
Append (cost=0.00..15.65 rows=3 width=8)
-> Seq Scan on base (cost=0.00..0.00 rows=1 width=8)
Filter: ((k >= 1) AND (k <= 20000) AND (v = 100))
-> Index Scan using pi1 on part1 (cost=0.29..8.31 rows=1 width=8)
Index Cond: (v = 100)
Filter: ((k >= 1) AND (k <= 20000))
-> Index Scan using pi2 on part2 (cost=0.29..7.34 rows=1 width=8)
Index Cond: (v = 100)
Filter: ((k >= 1) AND (k <= 20000))

Questions:
- Is there some way to avoid the sequential scan of the parent table? Yes, it is empty and so the sequential scan will not take much time, but ... it still requires some additional actions and so increases query execution time.
- Why does the index scan of the partition indexes include the filter condition if the check constraint guarantees that all records of the partition match the search predicate?

2. Partial indexes:

create table t (k integer primary key, v integer);
insert into t values (generate_series(1,20000), random());
create index i1 on t(v) where k between 1 and 10000;
create index i2 on t(v) where k between 10001 and 20000;
postgres=# explain select * from t where k between 1 and 10000 and v = 100;
QUERY PLAN
------------------------------------------------------------
Index Scan using i1 on t (cost=0.29..7.28 rows=1 width=8)
Index Cond: (v = 100)
(2 rows)

Here we get the perfect plan. Let's try to extend the search interval:

postgres=# explain select * from t where k between 1 and 20000 and v = 100;
QUERY PLAN
------------------------------------------------------------------
Index Scan using t_pkey on t (cost=0.29..760.43 rows=1 width=8)
Index Cond: ((k >= 1) AND (k <= 20000))
Filter: (v = 100)
(3 rows)

Unfortunately in this case Postgres is not able to apply the partial indexes.

And this is what I expected to get:

postgres=# explain select * from t where k between 1 and 10000 and v = 100 union all select * from t where k between 10001 and 20000 and v = 100;
QUERY PLAN
----------------------------------------------------------------------
Append (cost=0.29..14.58 rows=2 width=8)
-> Index Scan using i1 on t (cost=0.29..7.28 rows=1 width=8)
Index Cond: (v = 100)
-> Index Scan using i2 on t t_1 (cost=0.29..7.28 rows=1 width=8)
Index Cond: (v = 100)

I wonder if there are some fundamental problems with supporting these two things in the optimizer:
1. Remove a search condition on the primary key if it is fully satisfied by the derived table's check constraint.
2. Append index scans of several partial indexes if the specified interval is covered by their conditions.

I wonder if someone who is familiar with this part of the optimizer can easily fix it.
Otherwise I am going to spend some time on solving these problems (if the community thinks that such optimizations will be useful).

-- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
On 14.08.2017 12:37, Konstantin Knizhnik wrote:
> Hi hackers,
>
> I am trying to compare different ways of optimizing work with huge
> append-only tables in PostgreSQL where primary key is something like
> timestamp and queries are usually accessing most recent data using
> some secondary keys. Size of secondary index is one of the most
> critical factors limiting insert/search performance. As far as data
> is inserted in timestamp ascending order, access to primary key is
> well localized and accessed tables are present in memory. But if we
> create secondary key for the whole table, then access to it will
> require random reads from the disk and significantly decrease
> performance.
>
> There are two well known solutions of the problem:
> 1. Table partitioning
> 2. Partial indexes
>
> This approaches I want to compare. First of all I want to check if
> optimizer is able to generate efficient query execution plan covering
> different time intervals.
> Unfortunately in both cases generated plan is not optimal.
>
> 1. Table partitioning:
>
> create table base (k integer primary key, v integer);
> create table part1 (check (k between 1 and 10000)) inherits (base);
> create table part2 (check (k between 10001 and 20000)) inherits (base);
> create index pi1 on part1(v);
> create index pi2 on part2(v);
> insert int part1 values (generate series(1,10000), random());
> insert into part2 values (generate_series(10001,20000), random());
> explain select * from base where k between 1 and 20000 and v = 100;
> QUERY PLAN
> -----------------------------------------------------------------------
> Append (cost=0.00..15.65 rows=3 width=8)
> -> Seq Scan on base (cost=0.00..0.00 rows=1 width=8)
> Filter: ((k >= 1) AND (k <= 20000) AND (v = 100))
> -> Index Scan using pi1 on part1 (cost=0.29..8.31 rows=1 width=8)
> Index Cond: (v = 100)
> Filter: ((k >= 1) AND (k <= 20000))
> -> Index Scan using pi2 on part2 (cost=0.29..7.34 rows=1 width=8)
> Index Cond: (v = 100)
> Filter: ((k >= 1) AND (k <= 20000))
>
> Questions:
> - Is there some way to avoid sequential scan of parent table? Yes, it
> is empty and so sequential scan will not take much time, but ... it
> still requires some additional actions and so increasing query
> execution time.
> - Why index scan of partition indexes includes filter condition if it
> is guaranteed by check constraint that all records of this partition
> match search predicate?
>
>
> 2. Partial indexes:
>
> create table t (k integer primary key, v integer);
> insert into t values (generate_series(1,20000),random());
> create index i1 on t(v) where k between 1 and 10000;
> create index i2 on t(v) where k between 10001 and 20000;
> postgres=# explain select * from t where k between 1 and 10000 and v =
> 100;
> QUERY PLAN
> ------------------------------------------------------------
> Index Scan using i1 on t (cost=0.29..7.28 rows=1 width=8)
> Index Cond: (v = 100)
> (2 rows)
>
>
> Here we get perfect plan. Let's try to extend search interval:
>
>
> postgres=# explain select * from t where k between 1 and 20000 and v =
> 100;
> QUERY PLAN
> ------------------------------------------------------------------
> Index Scan using t_pkey on t (cost=0.29..760.43 rows=1 width=8)
> Index Cond: ((k >= 1) AND (k <= 20000))
> Filter: (v = 100)
> (3 rows)
>
> Unfortunately in this case Postgres is not able to apply partial indexes.
> And this is what I expected to get:
>
> postgres=# explain select * from t where k between 1 and 10000 and v =
> 100 union all select * from t where k between 10001 and 20000 and v =
> 100;
> QUERY PLAN
> ----------------------------------------------------------------------
> Append (cost=0.29..14.58 rows=2 width=8)
> -> Index Scan using i1 on t (cost=0.29..7.28 rows=1 width=8)
> Index Cond: (v = 100)
> -> Index Scan using i2 on t t_1 (cost=0.29..7.28 rows=1 width=8)
> Index Cond: (v = 100)
>
>
> I wonder if there are some principle problems in supporting this two
> things in optimizer:
> 1. Remove search condition for primary key if it is fully satisfied by
> derived table check constraint.
> 2. Append index scans of several partial indexes if specified interval
> is covered by their conditions.
>
> I wonder if someone is familiar with this part of optimizer ad can
> easily fix it.
> Otherwise I am going to spend some time on solving this problems (if
> community think that such optimizations will be useful).
>

Replying to myself: the following small patch removes redundant checks from index scans for derived tables:

diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 939045d..1f7c9cf 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -1441,6 +1441,20 @@ relation_excluded_by_constraints(PlannerInfo *root,
 if (predicate_refuted_by(safe_constraints, rel->baserestrictinfo, false))
 return true;
+ /*
+ * Remove from restriction list items implied by table constraints
+ */
+ safe_restrictions = NULL;
+ foreach(lc, rel->baserestrictinfo)
+ {
+ RestrictInfo *rinfo = (RestrictInfo *) lfirst(lc);
+ if (!predicate_implied_by(list_make1(rinfo->clause), safe_constraints, false)) {
+ safe_restrictions = lappend(safe_restrictions, rinfo);
+ }
+ }
+ rel->baserestrictinfo = safe_restrictions;
+
+
 return false;
 }
=================================================

I am not sure if this is the right place to adjust the restriction list, or whether such a transformation is always correct.
But for the queries I have tested, the produced plans are correct:

postgres=# explain select * from base where k between 1 and 20000 and v = 100;
QUERY PLAN
-----------------------------------------------------------------------
Append (cost=0.00..15.64 rows=3 width=8)
-> Seq Scan on base (cost=0.00..0.00 rows=1 width=8)
Filter: ((k >= 1) AND (k <= 20000) AND (v = 100))
-> Index Scan using pi1 on part1 (cost=0.29..8.30 rows=1 width=8)
Index Cond: (v = 100)
-> Index Scan using pi2 on part2 (cost=0.29..7.34 rows=1 width=8)
Index Cond: (v = 100)
(7 rows)

postgres=# explain select * from base where k between 1 and 15000 and v = 100;
QUERY PLAN
-----------------------------------------------------------------------
Append (cost=0.00..15.64 rows=3 width=8)
-> Seq Scan on base (cost=0.00..0.00 rows=1 width=8)
Filter: ((k >= 1) AND (k <= 15000) AND (v = 100))
-> Index Scan using pi1 on part1 (cost=0.29..8.30 rows=1 width=8)
Index Cond: (v = 100)
-> Index Scan using pi2 on part2 (cost=0.29..7.34 rows=1 width=8)
Index Cond: (v = 100)
Filter: (k <= 15000)
(8 rows)

-- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
On 14.08.2017 19:33, Konstantin Knizhnik wrote:
There is one more problem with the predicate_implied_by function and standard Postgres partitions.
Right now PostgreSQL specifies constraints for partitions using intervals with an open high boundary:
postgres=# create table bt (k integer, v integer) partition by range (k);
postgres=# create table dt1 partition of bt for values from (1) to (10001);
postgres=# create table dt2 partition of bt for values from (10001) to (20001);
postgres=# create index dti1 on dt1(v);
postgres=# create index dti2 on dt2(v);
postgres=# insert into bt values (generate_series(1,20000), random());
postgres=# \d+ dt2
Table "public.dt2"
Column | Type | Collation | Nullable | Default | Storage | Stats target | De
scription
--------+---------+-----------+----------+---------+---------+--------------+---
----------
k | integer | | | | plain | |
v | integer | | | | plain | |
Partition of: bt FOR VALUES FROM (10001) TO (20001)
Partition constraint: ((k IS NOT NULL) AND (k >= 10001) AND (k < 20001))
Indexes:
"dti2" btree (v)
If I now run the query with the predicate "between 1 and 20000", I still see an extra check in the plan:
postgres=# explain select * from bt where k between 1 and 20000 and v = 100;
QUERY PLAN
----------------------------------------------------------------------
Append (cost=0.29..15.63 rows=2 width=8)
-> Index Scan using dti1 on dt1 (cost=0.29..8.30 rows=1 width=8)
Index Cond: (v = 100)
-> Index Scan using dti2 on dt2 (cost=0.29..7.33 rows=1 width=8)
Index Cond: (v = 100)
Filter: (k <= 20000)
(6 rows)
This is because operator_predicate_proof is not able to understand that k < 20001 and k <= 20000 are equivalent for an integer type.
If I rewrite the query as (k >= 1 and k < 20001), then the plan is optimal:
postgres=# explain select * from bt where k >= 1 and k < 20001 and v = 100;
QUERY PLAN
----------------------------------------------------------------------
Append (cost=0.29..15.63 rows=2 width=8)
-> Index Scan using dti1 on dt1 (cost=0.29..8.30 rows=1 width=8)
Index Cond: (v = 100)
-> Index Scan using dti2 on dt2 (cost=0.29..7.33 rows=1 width=8)
Index Cond: (v = 100)
(5 rows)
It is fixed by one more patch, this time to predtest.c:
diff --git a/src/backend/optimizer/util/predtest.c b/src/backend/optimizer/util
index 06fce84..c318d00 100644
--- a/src/backend/optimizer/util/predtest.c
+++ b/src/backend/optimizer/util/predtest.c
@@ -17,6 +17,7 @@
#include "catalog/pg_proc.h"
#include "catalog/pg_type.h"
+#include "catalog/pg_operator.h"
#include "executor/executor.h"
#include "miscadmin.h"
#include "nodes/nodeFuncs.h"
@@ -1407,6 +1408,11 @@ static const StrategyNumber BT_refute_table[6][6] = {
{none, none, BTEQ, none, none, none} /* NE */
};
+#define Int2LessOperator 95
+#define Int2LessOrEqualOperator 522
+#define Int4LessOrEqualOperator 523
+#define Int8LessOrEqualOperator 414
+
/*
* operator_predicate_proof
if (clause_const->constisnull)
return false;
+ if (!refute_it
+ && ((pred_op == Int4LessOrEqualOperator && clause_op == Int4LessOperator)
+ || (pred_op == Int8LessOrEqualOperator && clause_op == Int8LessOperator)
+ || (pred_op == Int2LessOrEqualOperator && clause_op == Int2LessOperator))
+ && pred_const->constbyval && clause_const->constbyval
+ && pred_const->constvalue + 1 == clause_const->constvalue)
+ {
+ return true;
+ }
+
/*
* Lookup the constant-comparison operator using the system catalogs and
* the operator implication tables.
===============================
Now the plan is perfect once again:
postgres=# explain select * from bt where k between 1 and 20000 and v = 100;
QUERY PLAN
----------------------------------------------------------------------
Append (cost=0.29..15.63 rows=2 width=8)
-> Index Scan using dti1 on dt1 (cost=0.29..8.30 rows=1 width=8)
Index Cond: (v = 100)
-> Index Scan using dti2 on dt2 (cost=0.29..7.33 rows=1 width=8)
Index Cond: (v = 100)
(5 rows)
postgres=# explain select * from bt where k between 1 and 15000 and v = 100;
QUERY PLAN
----------------------------------------------------------------------
Append (cost=0.29..15.63 rows=2 width=8)
-> Index Scan using dti1 on dt1 (cost=0.29..8.30 rows=1 width=8)
Index Cond: (v = 100)
-> Index Scan using dti2 on dt2 (cost=0.29..7.33 rows=1 width=8)
Index Cond: (v = 100)
Filter: (k <= 15000)
(6 rows)
I am not sure that the proposed approach is general enough, but I do not know how to implement this in a more elegant way.
The combined patch is attached to this mail.
-- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
On Wed, Aug 16, 2017 at 9:23 PM, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
> postgres=# explain select * from bt where k between 1 and 20000 and v = 100;
> QUERY PLAN
> ----------------------------------------------------------------------
> Append (cost=0.29..15.63 rows=2 width=8)
> -> Index Scan using dti1 on dt1 (cost=0.29..8.30 rows=1 width=8)
> Index Cond: (v = 100)
> -> Index Scan using dti2 on dt2 (cost=0.29..7.33 rows=1 width=8)
> Index Cond: (v = 100)
> Filter: (k <= 20000)
> (6 rows)

+1

This seems like a good feature to me: filtering stuff that is obviously true is a waste of CPU cycles and may even require people to add redundant stuff to indexes. I was pondering something related to this over in the partition-wise join thread (join quals that are implied by partition constraints and should be discarded).

It'd be interesting to get Amit Langote's feedback, so I CC'd him. I'd be surprised if he and others haven't got a plan or a patch for this down the back of the sofa.

I might be missing some higher level architectural problems with the patch, but for what it's worth here's some feedback after a first read through:

--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -1441,6 +1441,20 @@ relation_excluded_by_constraints(PlannerInfo *root,
 if (predicate_refuted_by(safe_constraints, rel->baserestrictinfo, false))
 return true;
+ /*
+ * Remove from restrictions list items implied by table constraints
+ */
+ safe_restrictions = NULL;
+ foreach(lc, rel->baserestrictinfo)
+ {
+ RestrictInfo *rinfo = (RestrictInfo *) lfirst(lc);

I think the new way to write this is "RestrictInfo *rinfo = lfirst_node(RestrictInfo, lc)".

+ if (!predicate_implied_by(list_make1(rinfo->clause), safe_constraints, false)) {
+ safe_restrictions = lappend(safe_restrictions, rinfo);
+ }
+ }
+ rel->baserestrictinfo = safe_restrictions;

It feels wrong to modify rel->baserestrictinfo in relation_excluded_by_constraints(). I think there should probably be a function with a name that more clearly indicates that it mutates its input, and you should call that separately after relation_excluded_by_constraints(). Something like remove_restrictions_implied_by_constraints()?

> It is because operator_predicate_proof is not able to understand that k <
> 20001 and k <= 20000 are equivalent for integer type.
>
> [...]
>
> /*
> * operator_predicate_proof
> if (clause_const->constisnull)
> return false;
>
> + if (!refute_it
> + && ((pred_op == Int4LessOrEqualOperator && clause_op == Int4LessOperator)
> + || (pred_op == Int8LessOrEqualOperator && clause_op == Int8LessOperator)
> + || (pred_op == Int2LessOrEqualOperator && clause_op == Int2LessOperator))
> + && pred_const->constbyval && clause_const->constbyval
> + && pred_const->constvalue + 1 == clause_const->constvalue)
> + {
> + return true;
> + }
> +

I'm less sure about this part. It seems like a slippery slope.

A couple of regression test failures:

 inherit ... FAILED
 rowsecurity ... FAILED
========================
 2 of 179 tests failed.
========================

I didn't try to understand the rowsecurity one, but at first glance all the differences reported in the inherit test are in fact cases where your patch is doing the right thing and removing redundant filters from scans. Nice!

-- Thomas Munro http://www.enterprisedb.com
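For reference, here is a minimal sketch of what such a separate helper could look like, assuming it is called right after relation_excluded_by_constraints() with the same list of "safe" constraint clauses; the function name follows the suggestion above, but the exact signature and placement are hypothetical and not taken from any posted patch:

/*
 * Illustrative sketch only; assumes the headers already included by
 * plancat.c (e.g. optimizer/predtest.h and nodes/relation.h).
 * Drop base restrictions that are already guaranteed by the relation's
 * constraints, so they do not show up as useless filters in the plan.
 */
static void
remove_restrictions_implied_by_constraints(RelOptInfo *rel,
                                           List *safe_constraints)
{
    List       *safe_restrictions = NIL;
    ListCell   *lc;

    foreach(lc, rel->baserestrictinfo)
    {
        RestrictInfo *rinfo = lfirst_node(RestrictInfo, lc);

        /* Keep only clauses that the constraints do not already imply */
        if (!predicate_implied_by(list_make1(rinfo->clause),
                                  safe_constraints, false))
            safe_restrictions = lappend(safe_restrictions, rinfo);
    }

    rel->baserestrictinfo = safe_restrictions;
}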
On 09/02/2017 06:44 AM, Thomas Munro wrote: > On Wed, Aug 16, 2017 at 9:23 PM, Konstantin Knizhnik > <k.knizhnik@postgrespro.ru> wrote: >> postgres=# explain select * from bt where k between 1 and 20000 and v = 100; >> QUERY PLAN >> ---------------------------------------------------------------------- >> Append (cost=0.29..15.63 rows=2 width=8) >> -> Index Scan using dti1 on dt1 (cost=0.29..8.30 rows=1 width=8) >> Index Cond: (v = 100) >> -> Index Scan using dti2 on dt2 (cost=0.29..7.33 rows=1 width=8) >> Index Cond: (v = 100) >> Filter: (k <= 20000) >> (6 rows) > +1 > > This seems like a good feature to me: filtering stuff that is > obviously true is a waste of CPU cycles and may even require people to > add redundant stuff to indexes. I was pondering something related to > this over in the partition-wise join thread (join quals that are > implied by partition constraints and should be discarded). > > It'd be interesting to get Amit Langote's feedback, so I CC'd him. > I'd be surprised if he and others haven't got a plan or a patch for > this down the back of the sofa. > > I might be missing some higher level architectural problems with the > patch, but for what it's worth here's some feedback after a first read > through: > > --- a/src/backend/optimizer/util/plancat.c > +++ b/src/backend/optimizer/util/plancat.c > @@ -1441,6 +1441,20 @@ relation_excluded_by_constraints(PlannerInfo *root, > if (predicate_refuted_by(safe_constraints, rel->baserestrictinfo, false)) > return true; > > + /* > + * Remove from restrictions list items implied by table constraints > + */ > + safe_restrictions = NULL; > + foreach(lc, rel->baserestrictinfo) > + { > + RestrictInfo *rinfo = (RestrictInfo *) lfirst(lc); > > I think the new way to write this is "RestrictInfo *rinfo = > lfirst_node(RestrictInfo, lc)". > > + if (!predicate_implied_by(list_make1(rinfo->clause), > safe_constraints, false)) { > + safe_restrictions = lappend(safe_restrictions, rinfo); > + } > + } > + rel->baserestrictinfo = safe_restrictions; > > It feels wrong to modify rel->baserestrictinfo in > relation_excluded_by_constraints(). I think there should probably be > a function with a name that more clearly indicates that it mutates its > input, and you should call that separately after > relation_excluded_by_constraints(). Something like > remove_restrictions_implied_by_constraints()? > >> It is because operator_predicate_proof is not able to understand that k < >> 20001 and k <= 20000 are equivalent for integer type. >> >> [...] >> >> /* >> * operator_predicate_proof >> if (clause_const->constisnull) >> return false; >> >> + if (!refute_it >> + && ((pred_op == Int4LessOrEqualOperator && clause_op == >> Int4LessOperator) >> + || (pred_op == Int8LessOrEqualOperator && clause_op == >> Int8LessOperator) >> + || (pred_op == Int2LessOrEqualOperator && clause_op == >> Int2LessOperator)) >> + && pred_const->constbyval && clause_const->constbyval >> + && pred_const->constvalue + 1 == clause_const->constvalue) >> + { >> + return true; >> + } >> + > I'm less sure about this part. It seems like a slippery slope. > > A couple of regression test failures: > > inherit ... FAILED > rowsecurity ... FAILED > ======================== > 2 of 179 tests failed. > ======================== > > I didn't try to understand the rowsecurity one, but at first glance > all the differences reported in the inherit test are in fact cases > where your patch is doing the right thing and removing redundant > filters from scans. Nice! > Thank you for review. 
I attached a new version of the patch with the remove_restrictions_implied_by_constraints() function. Concerning the failed tests - this is actually the result of this optimization: extra filter conditions are removed from the query plans. Sorry, I had not included the updated versions of the expected test output files in the patch. Now I have. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
On 2017/09/02 12:44, Thomas Munro wrote: > On Wed, Aug 16, 2017 at 9:23 PM, Konstantin Knizhnik > <k.knizhnik@postgrespro.ru> wrote: >> postgres=# explain select * from bt where k between 1 and 20000 and v = 100; >> QUERY PLAN >> ---------------------------------------------------------------------- >> Append (cost=0.29..15.63 rows=2 width=8) >> -> Index Scan using dti1 on dt1 (cost=0.29..8.30 rows=1 width=8) >> Index Cond: (v = 100) >> -> Index Scan using dti2 on dt2 (cost=0.29..7.33 rows=1 width=8) >> Index Cond: (v = 100) >> Filter: (k <= 20000) >> (6 rows) > > +1 > > This seems like a good feature to me: filtering stuff that is > obviously true is a waste of CPU cycles and may even require people to > add redundant stuff to indexes. I was pondering something related to > this over in the partition-wise join thread (join quals that are > implied by partition constraints and should be discarded). > > It'd be interesting to get Amit Langote's feedback, so I CC'd him. > I'd be surprised if he and others haven't got a plan or a patch for > this down the back of the sofa. I agree that that's a good optimization in the cases it's correct. Given that check_index_predicates() already applies the same optimization when considering using a partial index, it might make sense to try to do the same even earlier for the table itself using its CHECK / NOT NULL constraints as predicates (I said *earlier* because relation_excluded_by_constrains happens for a relation before we look at its indexes). Also, at the end of relation_excluded_by_constraints() may not be such a bad place to do this. By the way, I read in check_index_predicates() that we should not apply this optimization if the relation in question is a target of UPDATE / DELETE / SELECT FOR UPDATE. Thanks, Amit
On 04.09.2017 05:38, Amit Langote wrote: > On 2017/09/02 12:44, Thomas Munro wrote: >> On Wed, Aug 16, 2017 at 9:23 PM, Konstantin Knizhnik >> <k.knizhnik@postgrespro.ru> wrote: >>> postgres=# explain select * from bt where k between 1 and 20000 and v = 100; >>> QUERY PLAN >>> ---------------------------------------------------------------------- >>> Append (cost=0.29..15.63 rows=2 width=8) >>> -> Index Scan using dti1 on dt1 (cost=0.29..8.30 rows=1 width=8) >>> Index Cond: (v = 100) >>> -> Index Scan using dti2 on dt2 (cost=0.29..7.33 rows=1 width=8) >>> Index Cond: (v = 100) >>> Filter: (k <= 20000) >>> (6 rows) >> +1 >> >> This seems like a good feature to me: filtering stuff that is >> obviously true is a waste of CPU cycles and may even require people to >> add redundant stuff to indexes. I was pondering something related to >> this over in the partition-wise join thread (join quals that are >> implied by partition constraints and should be discarded). >> >> It'd be interesting to get Amit Langote's feedback, so I CC'd him. >> I'd be surprised if he and others haven't got a plan or a patch for >> this down the back of the sofa. > I agree that that's a good optimization in the cases it's correct. Given > that check_index_predicates() already applies the same optimization when > considering using a partial index, it might make sense to try to do the > same even earlier for the table itself using its CHECK / NOT NULL > constraints as predicates (I said *earlier* because > relation_excluded_by_constrains happens for a relation before we look at > its indexes). Also, at the end of relation_excluded_by_constraints() may > not be such a bad place to do this. > > By the way, I read in check_index_predicates() that we should not apply > this optimization if the relation in question is a target of UPDATE / > DELETE / SELECT FOR UPDATE. Please correct me if I wrong, but it seems to me that in case of table constraints it is not necessary to specially handle update case. As far as I understand we need to leave predicate in the plan in case of partial indexes because due to "read committed" isolation policy we may need to recheck that tuple still satisfies update condition (tuple can be changed by some other committed transaction while we are waiting for it and not satisfying this condition any more). But no transaction can change tuple in such way that it violates table constraints, right? So we do not need to recheck it. Concerning your suggestion to merge check_index_predicates() and remove_restrictions_implied_by_constraints() functions: may be it can be done, but frankly speaking I do not see much sense in it - there are too much differences between this functions and too few code reusing. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Hi Konstantin, On 2017/09/04 18:19, Konstantin Knizhnik wrote: > On 04.09.2017 05:38, Amit Langote wrote: >> On 2017/09/02 12:44, Thomas Munro wrote: >>> On Wed, Aug 16, 2017 at 9:23 PM, Konstantin Knizhnik >>> <k.knizhnik@postgrespro.ru> wrote: >>>> postgres=# explain select * from bt where k between 1 and 20000 and v >>>> = 100; >>>> QUERY PLAN >>>> ---------------------------------------------------------------------- >>>> Append (cost=0.29..15.63 rows=2 width=8) >>>> -> Index Scan using dti1 on dt1 (cost=0.29..8.30 rows=1 width=8) >>>> Index Cond: (v = 100) >>>> -> Index Scan using dti2 on dt2 (cost=0.29..7.33 rows=1 width=8) >>>> Index Cond: (v = 100) >>>> Filter: (k <= 20000) >>>> (6 rows) >>> +1 >>> >>> This seems like a good feature to me: filtering stuff that is >>> obviously true is a waste of CPU cycles and may even require people to >>> add redundant stuff to indexes. I was pondering something related to >>> this over in the partition-wise join thread (join quals that are >>> implied by partition constraints and should be discarded). >>> >>> It'd be interesting to get Amit Langote's feedback, so I CC'd him. >>> I'd be surprised if he and others haven't got a plan or a patch for >>> this down the back of the sofa. >> I agree that that's a good optimization in the cases it's correct. Given >> that check_index_predicates() already applies the same optimization when >> considering using a partial index, it might make sense to try to do the >> same even earlier for the table itself using its CHECK / NOT NULL >> constraints as predicates (I said *earlier* because >> relation_excluded_by_constrains happens for a relation before we look at >> its indexes). Also, at the end of relation_excluded_by_constraints() may >> not be such a bad place to do this. >> >> By the way, I read in check_index_predicates() that we should not apply >> this optimization if the relation in question is a target of UPDATE / >> DELETE / SELECT FOR UPDATE. > > Please correct me if I wrong, but it seems to me that in case of table > constraints it is not necessary to specially handle update case. > As far as I understand we need to leave predicate in the plan in case of > partial indexes because due to "read committed" isolation policy > we may need to recheck that tuple still satisfies update condition (tuple > can be changed by some other committed transaction while we are waiting > for it and not satisfying this condition any more). > But no transaction can change tuple in such way that it violates table > constraints, right? So we do not need to recheck it. Actually, I don't really know why check_index_predicates() skips this optimization in the target relation case, just wanted to point out that that's so. Thinking a bit from what you wrote, maybe we need not worry about EvalPlanQual in the context of your proposed optimization based on the table's constraints. > Concerning your suggestion to merge check_index_predicates() and > remove_restrictions_implied_by_constraints() functions: may be it can be > done, but frankly speaking I do not see much sense in it - there are too > much differences between this functions and too few code reusing. Maybe, you meant to address Thomas here. :) Reading his comment again, I too am a bit concerned about destructively modifying the input rel's baserestrictinfo. There should at least be a comment that that's being done. Thanks, Amit
On 04.09.2017 12:59, Amit Langote wrote: > Hi Konstantin, > > On 2017/09/04 18:19, Konstantin Knizhnik wrote: >> On 04.09.2017 05:38, Amit Langote wrote: >>> On 2017/09/02 12:44, Thomas Munro wrote: >>>> On Wed, Aug 16, 2017 at 9:23 PM, Konstantin Knizhnik >>>> <k.knizhnik@postgrespro.ru> wrote: >>>>> postgres=# explain select * from bt where k between 1 and 20000 and v >>>>> = 100; >>>>> QUERY PLAN >>>>> ---------------------------------------------------------------------- >>>>> Append (cost=0.29..15.63 rows=2 width=8) >>>>> -> Index Scan using dti1 on dt1 (cost=0.29..8.30 rows=1 width=8) >>>>> Index Cond: (v = 100) >>>>> -> Index Scan using dti2 on dt2 (cost=0.29..7.33 rows=1 width=8) >>>>> Index Cond: (v = 100) >>>>> Filter: (k <= 20000) >>>>> (6 rows) >>>> +1 >>>> >>>> This seems like a good feature to me: filtering stuff that is >>>> obviously true is a waste of CPU cycles and may even require people to >>>> add redundant stuff to indexes. I was pondering something related to >>>> this over in the partition-wise join thread (join quals that are >>>> implied by partition constraints and should be discarded). >>>> >>>> It'd be interesting to get Amit Langote's feedback, so I CC'd him. >>>> I'd be surprised if he and others haven't got a plan or a patch for >>>> this down the back of the sofa. >>> I agree that that's a good optimization in the cases it's correct. Given >>> that check_index_predicates() already applies the same optimization when >>> considering using a partial index, it might make sense to try to do the >>> same even earlier for the table itself using its CHECK / NOT NULL >>> constraints as predicates (I said *earlier* because >>> relation_excluded_by_constrains happens for a relation before we look at >>> its indexes). Also, at the end of relation_excluded_by_constraints() may >>> not be such a bad place to do this. >>> >>> By the way, I read in check_index_predicates() that we should not apply >>> this optimization if the relation in question is a target of UPDATE / >>> DELETE / SELECT FOR UPDATE. >> Please correct me if I wrong, but it seems to me that in case of table >> constraints it is not necessary to specially handle update case. >> As far as I understand we need to leave predicate in the plan in case of >> partial indexes because due to "read committed" isolation policy >> we may need to recheck that tuple still satisfies update condition (tuple >> can be changed by some other committed transaction while we are waiting >> for it and not satisfying this condition any more). >> But no transaction can change tuple in such way that it violates table >> constraints, right? So we do not need to recheck it. > Actually, I don't really know why check_index_predicates() skips this > optimization in the target relation case, just wanted to point out that > that's so. > > Thinking a bit from what you wrote, maybe we need not worry about > EvalPlanQual in the context of your proposed optimization based on the > table's constraints. > >> Concerning your suggestion to merge check_index_predicates() and >> remove_restrictions_implied_by_constraints() functions: may be it can be >> done, but frankly speaking I do not see much sense in it - there are too >> much differences between this functions and too few code reusing. > Maybe, you meant to address Thomas here. :) Reading his comment again, I > too am a bit concerned about destructively modifying the input rel's > baserestrictinfo. There should at least be a comment that that's being done. 
But I have considered Thomas's comment and extracted the code that updates the relation's baserestrictinfo from relation_excluded_by_constraints() into a separate remove_restrictions_implied_by_constraints() function. It is included in the new version of the patch. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
On 2017/09/04 20:46, Konstantin Knizhnik wrote: > On 04.09.2017 12:59, Amit Langote wrote: >> On 2017/09/04 18:19, Konstantin Knizhnik wrote: >>> Concerning your suggestion to merge check_index_predicates() and >>> remove_restrictions_implied_by_constraints() functions: may be it can be >>> done, but frankly speaking I do not see much sense in it - there are too >>> much differences between this functions and too few code reusing. >> Maybe, you meant to address Thomas here. :) Reading his comment again, I >> too am a bit concerned about destructively modifying the input rel's >> baserestrictinfo. There should at least be a comment that that's being >> done. > > But I have considered Thomas comment and extracted code updating > relation's baserestrictinfo from > relation_excluded_by_constraints() to > remove_restrictions_implied_by_constraints() function. It was included in > new version of the patch. Sorry, I hadn't noticed the new patch. I was confused because I didn't suggest "to merge check_index_predicates() and remove_restrictions_implied_by_constraints() functions". Perhaps, the wording in my previous message wasn't clear. Looking at the new patch, the changes regarding remove_restrictions_implied_by_constraints() look fine. Like Thomas, I'm not so sure about the whole predtest.c patch. The core logic in operator_predicate_proof() should be able to conclude that, say, k < 21 is implied by k <= 20, which you are trying to address with some special case code. If there is still problem you think need to be fixed here, a better place to look at would be somewhere around get_btree_test_op(). Thanks, Amit
On 05.09.2017 04:02, Amit Langote wrote:
Like Thomas, I'm not so sure about the whole predtest.c patch. The core logic in operator_predicate_proof() should be able to conclude that, say, k < 21 is implied by k <= 20, which you are trying to address with some special case code. If there is still problem you think need to be fixed here, a better place to look at would be somewhere around get_btree_test_op().
Frankly speaking I also do not like this part of my patch.
I will be pleased if you or somebody else can propose a better solution.
I do not understand how get_btree_test_op() can help here.
Yes, k < 21 is implied by k <= 20. That follows from the generic properties of the < and <= operators.
But I need to prove something different: having the table partition constraint (k < 21), I want to remove the predicate (k <= 20) from the query.
In other words, operator_predicate_proof() should be able to conclude that (k <= 20) is implied by (k < 21).
But that is true only for integer types, not for floating point types. And the Postgres operator definition
doesn't provide a way to determine that a user-defined type is an integer type, i.e. has discrete integer values for which such a conclusion holds.
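To make this concrete (an illustrative query, not from the original mail): for a type with fractional values the implication does not hold, because there are values strictly between 20 and 21:

postgres=# select 20.5::numeric < 21 as lt_21, 20.5::numeric <= 20 as le_20;
 lt_21 | le_20
-------+-------
 t     | f
(1 row)

For an integer k, however, no value can satisfy k < 21 without also satisfying k <= 20, which is exactly the fact the patch hard-codes via its "constvalue + 1" comparison.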
Why do I think that this is important? Certainly, it is possible to rewrite the query as (k < 21), and then no changes in operator_predicate_proof() are needed.
But assume the most natural use case: I have some positive integer key and I want to range partition the table by this key, for example with an interval of 10000.
Currently the standard PostgreSQL partitioning mechanism requires specifying intervals with an open high boundary.
So if I want the first partition to contain the interval [1,10000], the second [10001,20000], and so on, I have to create the partitions this way:
create table bt (k integer, v integer) partition by range (k);
create table dt1 partition of bt for values from (1) to (10001);
create table dt2 partition of bt for values from (10001) to (20001);
...
If I want to write a query inspecting the data of a particular partition, then most likely I will use the BETWEEN operator:
SELECT * FROM t WHERE k BETWEEN 1 and 10000;
But right now operator_predicate_proof() is not able to conclude that the predicate (k BETWEEN 1 AND 10000), transformed to (k >= 1 AND k <= 10000), is equivalent to (k >= 1 AND k < 10001),
which is used as the partition constraint.
Another very popular use case (mentioned, for example, in the PostgreSQL partitioning documentation: https://www.postgresql.org/docs/10/static/ddl-partitioning.html)
is using a date as the partition key:
CREATE TABLE measurement (
    city_id    int not null,
    logdate    date not null,
    peaktemp   int,
    unitsales  int
) PARTITION BY RANGE (logdate);

CREATE TABLE measurement_y2006m03 PARTITION OF measurement
    FOR VALUES FROM ('2006-03-01') TO ('2006-04-01')
Assume that now I want to get measurements for March:
There are three ways to write this query:
select * from measurement where extract(month from logdate) = 3;
select * from measurement where logdate between '2006-03-01' AND '2006-03-31';
select * from measurement where logdate >= '2006-03-01' AND logdate < '2006-04-01';
Right now an optimal query plan will be constructed only for the last of these queries.
Unfortunately, my patch does not cover the date type.
-- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
On Sun, Sep 3, 2017 at 4:34 AM, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote: > Thank you for review. > I attached new version of the patch with > remove_restrictions_implied_by_constraints() function. > Concerning failed tests - this is actually result of this optimization: > extra filter conditions are removed from query plans. > Sorry, I have not included updated version of expected test output files to > the patch. > Now I did it. A regression test under contrib/postgres_fdw now fails: - Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NOT NULL)) + Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" (("C 1" IS NOT NULL)) is indeed redundant in that case, because column "C 1" was declared to be NOT NULL. But: 1. Do we want to go this far? Note that this is not involving inheritance and constraint exclusion. I don't immediately see any reason why not, but I'm not sure. 2. If yes, then this postgres_fdw test should be changed, because I think it was trying to demonstrate that IS NOT NULL expressions are sent to remote databases -- it would need to be changed so that it tries that with a column that is actually nullable. -- Thomas Munro http://www.enterprisedb.com
On 07.09.2017 13:00, Thomas Munro wrote: > On Sun, Sep 3, 2017 at 4:34 AM, Konstantin Knizhnik > <k.knizhnik@postgrespro.ru> wrote: >> Thank you for review. >> I attached new version of the patch with >> remove_restrictions_implied_by_constraints() function. >> Concerning failed tests - this is actually result of this optimization: >> extra filter conditions are removed from query plans. >> Sorry, I have not included updated version of expected test output files to >> the patch. >> Now I did it. > A regression test under contrib/postgres_fdw now fails: > > - Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T > 1" WHERE (("C 1" IS NOT NULL)) > + Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" > > (("C 1" IS NOT NULL)) is indeed redundant in that case, because column > "C 1" was declared to be NOT NULL. But: > > 1. Do we want to go this far? Note that this is not involving > inheritance and constraint exclusion. I don't immediately see any > reason why not, but I'm not sure. > > 2. If yes, then this postgres_fdw test should be changed, because I > think it was trying to demonstrate that IS NOT NULL expressions are > sent to remote databases -- it would need to be changed so that it > tries that with a column that is actually nullable. > I do not see any reasons why we should disable this optimization in case of FDW. And disabling it requires some extra efforts... So I have updated test for postgres_fdw, replacing query SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL; with SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL and c3 is not null; Now it checks two things: 1. That not null check is passed to foreign server for nullable column (c3) 2. That not null check is excluded from query execution plan when it can be omitted because column is not nullable. Updated version of the patch is attached to this mail. Also I added support of date type to operator_predicate_proof to be able to imply (logdate <= '2017-03-31') from (logdate < '2017-04-01') . -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
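A quick illustration of why that implication is safe for the date type (an illustrative query only, not part of the patch or its regression tests): date has day granularity, so the day immediately before the exclusive bound is exactly the inclusive bound, and the two predicates accept the same set of values:

postgres=# select date '2017-04-01' - 1;
  ?column?
------------
 2017-03-31
(1 row)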
On Fri, Sep 8, 2017 at 3:58 AM, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote: > Updated version of the patch is attached to this mail. > Also I added support of date type to operator_predicate_proof to be able to > imply (logdate <= '2017-03-31') from (logdate < '2017-04-01') . Hi Konstantin, Is there any reason why you don't want to split this into two separate proposals? One for remove_restrictions_implied_by_constraints() and one for the operator_predicate_proof() changes. Your v3 patch breaks the new partition_join test (the recently committed partition-wise join stuff), as far as I can tell in a good way. Can you please double check those changes and post an updated patch? -- Thomas Munro http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
On 11/06/2017 04:27 AM, Thomas Munro wrote: > On Fri, Sep 8, 2017 at 3:58 AM, Konstantin Knizhnik > <k.knizhnik@postgrespro.ru> wrote: >> Updated version of the patch is attached to this mail. >> Also I added support of date type to operator_predicate_proof to be able to >> imply (logdate <= '2017-03-31') from (logdate < '2017-04-01') . > Hi Konstantin, > > Is there any reason why you don't want to split this into two separate > proposals? One for remove_restrictions_implied_by_constraints() and > one for the operator_predicate_proof() changes. > > Your v3 patch breaks the new partition_join test (the recently > committed partition-wise join stuff), as far as I can tell in a good > way. Can you please double check those changes and post an updated > patch? > Hi Thomas. The primary idea of this patch was to provide more efficient plans for queries on partitioned tables. So remove_restrictions_implied_by_constraints() removes redundant predicate checks. But it doesn't work for standard Postgres 10 partitioning, because here constraints are set using intervals with open highboundary and original version of operator_predicate_proof() is not able to handle this case. I have explained this problem in my previous e-mails in this thread. This is why I have changed operator_predicate_proof() to correctly handle this case. If you think this patch should be splitted into two, ok: I can do it. I just want to notice that without patching operator_predicate_proof() it may give not positive effect for standard partitioning, which I expect to be the most popular use case where this optimization may have an effect. Concerning broken partition_join test: it is "expected" failure: my patch removes from the plans redundant checks. So the only required action is to update expected file with results. Attached please find updated patch. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
On Mon, Nov 6, 2017 at 10:13 PM, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote: > Concerning broken partition_join test: it is "expected" failure: my patch > removes from the plans redundant checks. > So the only required action is to update expected file with results. > Attached please find updated patch. The last patch version has conflicts in regression tests and did not get any reviews. Please provide a rebased version. I am moving the patch to next CF with waiting on author as status. Thanks! -- Michael
On 30.11.2017 05:16, Michael Paquier wrote: > On Mon, Nov 6, 2017 at 10:13 PM, Konstantin Knizhnik > <k.knizhnik@postgrespro.ru> wrote: >> Concerning broken partition_join test: it is "expected" failure: my patch >> removes from the plans redundant checks. >> So the only required action is to update expected file with results. >> Attached please find updated patch. > The last patch version has conflicts in regression tests and did not > get any reviews. Please provide a rebased version. I am moving the > patch to next CF with waiting on author as status. Thanks! Rebased patch is attached. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Attachment
Konstantin Knizhnik wrote: > > > On 30.11.2017 05:16, Michael Paquier wrote: > > On Mon, Nov 6, 2017 at 10:13 PM, Konstantin Knizhnik > > <k.knizhnik@postgrespro.ru> wrote: > > > Concerning broken partition_join test: it is "expected" failure: my patch > > > removes from the plans redundant checks. > > > So the only required action is to update expected file with results. > > > Attached please find updated patch. > > The last patch version has conflicts in regression tests and did not > > get any reviews. Please provide a rebased version. I am moving the > > patch to next CF with waiting on author as status. Thanks! > > Rebased patch is attached. I don't think this is a rebase on the previously posted patch ... it's about 10x as big and appears to be a thorough rewrite of the entire optimizer. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 04.12.2017 19:44, Alvaro Herrera wrote: > Konstantin Knizhnik wrote: >> >> On 30.11.2017 05:16, Michael Paquier wrote: >>> On Mon, Nov 6, 2017 at 10:13 PM, Konstantin Knizhnik >>> <k.knizhnik@postgrespro.ru> wrote: >>>> Concerning broken partition_join test: it is "expected" failure: my patch >>>> removes from the plans redundant checks. >>>> So the only required action is to update expected file with results. >>>> Attached please find updated patch. >>> The last patch version has conflicts in regression tests and did not >>> get any reviews. Please provide a rebased version. I am moving the >>> patch to next CF with waiting on author as status. Thanks! >> Rebased patch is attached. > I don't think this is a rebase on the previously posted patch ... it's > about 10x as big and appears to be a thorough rewrite of the entire > optimizer. > Oh, sorry. I accidentally committed the aggregate push-down patch in this branch. The correct rebased patch is attached. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Attachment
On Tue, Sep 5, 2017 at 9:10 AM, Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> wrote:
>
>
> On 05.09.2017 04:02, Amit Langote wrote:
>
> Like Thomas, I'm not so sure about the whole predtest.c patch. The core
> logic in operator_predicate_proof() should be able to conclude that, say,
> k < 21 is implied by k <= 20, which you are trying to address with some
> special case code. If there is still problem you think need to be fixed
> here, a better place to look at would be somewhere around get_btree_test_op().
>
>
> Frankly speaking I also do not like this part of my patch.
> I will be pleased if you or somebody else can propose better solution.
> I do not understand how get_btree_test_op() can help here.
>
> Yes, k < 21 is implied by k <= 20. It is based on generic properties of <
> and <= operators.
> But I need to proof something different: having table partition constraint
> (k < 21) I want to remove predicate (k <= 20) from query.
> In other words, operator_predicate_proof() should be able to conclude
> that (k <= 20) is implied by (k < 21).
> But it is true only for integer types, not for floating point types. And
> Postgres operator definition
> doesn't provide some way to determine that user defined type is integer
> type: has integer values for which such conclusion is true.
>
> Why I think that it is important? Certainly, it is possible to rewrite
> query as (k < 21) and no changes in operator_predicate_proof() are needed.
> Assume the most natural use case: I have some positive integer key and I
> want to range partition the table by such key, for example with interval 10000.
> Currently standard PostgreSQL partitioning mechanism requires to specify
> intervals with open high boundary.
> So if I want first partition to contain interval [1,10000], second -
> [10001,20001],... I have to create partitions in such way:
>
> create table bt (k integer, v integer) partition by range (k);
> create table dt1 partition of bt for values from (1) to (10001);
> create table dt2 partition of bt for values from (10001) to (20001);
> ...
>
> If I want to write query inspecting data of the particular partition, then
> most likely I will use BETWEEN operator:
>
> SELECT * FROM t WHERE k BETWEEN 1 and 10000;
>
> But right now operator_predicate_proof() is not able to conclude that
> predicate (k BETWEEN 1 and 10000) transformed to (k >= 1 AND k <= 10000) is
> equivalent to (k >= 1 AND k < 10001)
> which is used as partition constraint.
>
> Another very popular use case (for example mentioned in PostgreSQL
> documentation of partitioning: https://www.postgresql.org/
> docs/10/static/ddl-partitioning.html)
> is using date as partition key:
>
> CREATE TABLE measurement (
> city_id int not null,
> logdate date not null,
> peaktemp int,
> unitsales int
> ) PARTITION BY RANGE (logdate);
>
>
> CREATE TABLE measurement_y2006m03 PARTITION OF measurement
> FOR VALUES FROM ('2006-03-01') TO ('2006-04-01')
>
>
> Assume that now I want to get measurements for March:
>
> There are three ways to write this query:
>
> select * from measurement where extract(month from logdate) = 3;
> select * from measurement where logdate between '2006-03-01' AND
> '2006-03-31';
> select * from measurement where logdate >= '2006-03-01' AND logdate <
> '2006-04-01';
>
> Right now only for the last query optimal query plan will be constructed.
>
Perhaps the relevant pages (about partitioning and optimization) should
mention avoiding BETWEEN and using closed-open checks instead, as in the last query.
Dates are a perfect example to demonstrate that BETWEEN shouldn't be used,
in my opinion. Dates (and timestamps) are not like integers as they are
often used with different levels of precision: day, month, year, hour,
minute, second, etc. (month in your example). Constructing the correct
expressions for the different precisions can be a nightmare with BETWEEN
but very simple with >= and < (in the example: take the start date,
'2006-03-01', and add one month).
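For example (just a sketch, using the measurement table above), the March query composes naturally from the start date and an interval:

select * from measurement
  where logdate >= date '2006-03-01'
    and logdate <  date '2006-03-01' + interval '1 month';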
So, just my 2c, is it worth the trouble to implement this feature
(conversion of (k<21) to (k<=20) and vice versa) and how much work would it
need for all data types that are commonly used for partitioning?
> Unfortunately my patch is not covering date type.
>
> --
> Konstantin Knizhnik
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
>
>
Greetings, * Konstantin Knizhnik (k.knizhnik@postgrespro.ru) wrote: > On 04.12.2017 19:44, Alvaro Herrera wrote: > >Konstantin Knizhnik wrote: > >> > >>On 30.11.2017 05:16, Michael Paquier wrote: > >>>On Mon, Nov 6, 2017 at 10:13 PM, Konstantin Knizhnik > >>><k.knizhnik@postgrespro.ru> wrote: > >>>>Concerning broken partition_join test: it is "expected" failure: my patch > >>>>removes from the plans redundant checks. > >>>>So the only required action is to update expected file with results. > >>>>Attached please find updated patch. > >>>The last patch version has conflicts in regression tests and did not > >>>get any reviews. Please provide a rebased version. I am moving the > >>>patch to next CF with waiting on author as status. Thanks! > >>Rebased patch is attached. > >I don't think this is a rebase on the previously posted patch ... it's > >about 10x as big and appears to be a thorough rewrite of the entire > >optimizer. > > Or, sorry. I really occasionally committed in this branch patch for > aggregate push down. > Correct reabased patch is attached. This patch applies, builds and passes 'make check-world', with no real review posted of it, so I don't believe it should be 'Waiting on Author' but really in 'Needs Review' status, so I've gone ahead and updated the CF with that status. Thanks! Stephen
Attachment
Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote: > On 14.08.2017 19:33, Konstantin Knizhnik wrote: > > On 14.08.2017 12:37, Konstantin Knizhnik wrote: > > Hi hackers, > > I am trying to compare different ways of optimizing work with huge append-only tables in PostgreSQL where primary keyis something like timestamp and queries are usually accessing most recent data using some secondary keys. Size of secondaryindex is one of the most critical > factors limiting insert/search performance. As far as data is inserted in timestamp ascending order, access to primarykey is well localized and accessed tables are present in memory. But if we create secondary key for the whole table,then access to it will require random reads from > the disk and significantly decrease performance. > > There are two well known solutions of the problem: > 1. Table partitioning > 2. Partial indexes > > This approaches I want to compare. First of all I want to check if optimizer is able to generate efficient query executionplan covering different time intervals. > Unfortunately in both cases generated plan is not optimal. > > 1. Table partitioning: > > create table base (k integer primary key, v integer); > create table part1 (check (k between 1 and 10000)) inherits (base); > create table part2 (check (k between 10001 and 20000)) inherits (base); > create index pi1 on part1(v); > create index pi2 on part2(v); > insert int part1 values (generate series(1,10000), random()); > insert into part2 values (generate_series(10001,20000), random()); > explain select * from base where k between 1 and 20000 and v = 100; > QUERY PLAN > ----------------------------------------------------------------------- > Append (cost=0.00..15.65 rows=3 width=8) > -> Seq Scan on base (cost=0.00..0.00 rows=1 width=8) > Filter: ((k >= 1) AND (k <= 20000) AND (v = 100)) > -> Index Scan using pi1 on part1 (cost=0.29..8.31 rows=1 width=8) > Index Cond: (v = 100) > Filter: ((k >= 1) AND (k <= 20000)) > -> Index Scan using pi2 on part2 (cost=0.29..7.34 rows=1 width=8) > Index Cond: (v = 100) > Filter: ((k >= 1) AND (k <= 20000)) > > Questions: > - Is there some way to avoid sequential scan of parent table? Yes, it is empty and so sequential scan will not take muchtime, but ... it still requires some additional actions and so increasing query execution time. > - Why index scan of partition indexes includes filter condition if it is guaranteed by check constraint that all recordsof this partition match search predicate? > > 2. Partial indexes: > > create table t (k integer primary key, v integer); > insert into t values (generate_series(1,20000),random()); > create index i1 on t(v) where k between 1 and 10000; > create index i2 on t(v) where k between 10001 and 20000; > postgres=# explain select * from t where k between 1 and 10000 and v = 100; > QUERY PLAN > ------------------------------------------------------------ > Index Scan using i1 on t (cost=0.29..7.28 rows=1 width=8) > Index Cond: (v = 100) > (2 rows) > > Here we get perfect plan. Let's try to extend search interval: > > postgres=# explain select * from t where k between 1 and 20000 and v = 100; > QUERY PLAN > ------------------------------------------------------------------ > Index Scan using t_pkey on t (cost=0.29..760.43 rows=1 width=8) > Index Cond: ((k >= 1) AND (k <= 20000)) > Filter: (v = 100) > (3 rows) > > Unfortunately in this case Postgres is not able to apply partial indexes. 
> And this is what I expected to get: > > postgres=# explain select * from t where k between 1 and 10000 and v = 100 union all select * from t where k between 10001and 20000 and v = 100; > QUERY PLAN > ---------------------------------------------------------------------- > Append (cost=0.29..14.58 rows=2 width=8) > -> Index Scan using i1 on t (cost=0.29..7.28 rows=1 width=8) > Index Cond: (v = 100) > -> Index Scan using i2 on t t_1 (cost=0.29..7.28 rows=1 width=8) > Index Cond: (v = 100) > > I wonder if there are some principle problems in supporting this two things in optimizer: > 1. Remove search condition for primary key if it is fully satisfied by derived table check constraint. > 2. Append index scans of several partial indexes if specified interval is covered by their conditions. > > I wonder if someone is familiar with this part of optimizer ad can easily fix it. > Otherwise I am going to spend some time on solving this problems (if community think that such optimizations will be useful). > > Replying to myself: the following small patch removes redundant checks from index scans for derived tables: > > diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c > index 939045d..1f7c9cf 100644 > --- a/src/backend/optimizer/util/plancat.c > +++ b/src/backend/optimizer/util/plancat.c > @@ -1441,6 +1441,20 @@ relation_excluded_by_constraints(PlannerInfo *root, > if (predicate_refuted_by(safe_constraints, rel->baserestrictinfo, false)) > return true; > > + /* > + * Remove from restriction list items implied by table constraints > + */ > + safe_restrictions = NULL; > + foreach(lc, rel->baserestrictinfo) > + { > + RestrictInfo *rinfo = (RestrictInfo *) lfirst(lc); > + if (!predicate_implied_by(list_make1(rinfo->clause), safe_constraints, false)) { > + safe_restrictions = lappend(safe_restrictions, rinfo); > + } > + } > + rel->baserestrictinfo = safe_restrictions; > + > + > return false; > } > > ================================================= > > I am not sure if this is the right place of adjusting restriction list and is such transformation always correct. > But for the queries I have tested produced plans are correct: > > postgres=# explain select * from base where k between 1 and 20000 and v = 100; > QUERY PLAN > ----------------------------------------------------------------------- > Append (cost=0.00..15.64 rows=3 width=8) > -> Seq Scan on base (cost=0.00..0.00 rows=1 width=8) > Filter: ((k >= 1) AND (k <= 20000) AND (v = 100)) > -> Index Scan using pi1 on part1 (cost=0.29..8.30 rows=1 width=8) > Index Cond: (v = 100) > -> Index Scan using pi2 on part2 (cost=0.29..7.34 rows=1 width=8) > Index Cond: (v = 100) > (7 rows) > > postgres=# explain select * from base where k between 1 and 15000 and v = 100; > QUERY PLAN > ----------------------------------------------------------------------- > Append (cost=0.00..15.64 rows=3 width=8) > -> Seq Scan on base (cost=0.00..0.00 rows=1 width=8) > Filter: ((k >= 1) AND (k <= 15000) AND (v = 100)) > -> Index Scan using pi1 on part1 (cost=0.29..8.30 rows=1 width=8) > Index Cond: (v = 100) > -> Index Scan using pi2 on part2 (cost=0.29..7.34 rows=1 width=8) > Index Cond: (v = 100) > Filter: (k <= 15000) > (8 rows) > > There is one more problem with predicate_implied_by function and standard Postgres partitions. 
> Right now it specifies constrains for partitions using intervals with open high boundary: > > postgres=# create table bt (k integer, v integer) partition by range (k); > postgres=# create table dt1 partition of bt for values from (1) to (10001); > postgres=# create table dt2 partition of bt for values from (10001) to (20001); > postgres=# create index dti1 on dt1(v); > postgres=# create index dti2 on dt2(v); > postgres=# insert into bt values (generate_series(1,20000), random()); > > postgres=# \d+ dt2 > Table "public.dt2" > Column | Type | Collation | Nullable | Default | Storage | Stats target | De > scription > --------+---------+-----------+----------+---------+---------+--------------+--- > ---------- > k | integer | | | | plain | | > v | integer | | | | plain | | > Partition of: bt FOR VALUES FROM (10001) TO (20001) > Partition constraint: ((k IS NOT NULL) AND (k >= 10001) AND (k < 20001)) > Indexes: > "dti2" btree (v) > > If now I run the query with predicate "between 1 and 20000", I still see extra check in the plan: > > postgres=# explain select * from bt where k between 1 and 20000 and v = 100; > QUERY PLAN > ---------------------------------------------------------------------- > Append (cost=0.29..15.63 rows=2 width=8) > -> Index Scan using dti1 on dt1 (cost=0.29..8.30 rows=1 width=8) > Index Cond: (v = 100) > -> Index Scan using dti2 on dt2 (cost=0.29..7.33 rows=1 width=8) > Index Cond: (v = 100) > Filter: (k <= 20000) > (6 rows) > > It is because operator_predicate_proof is not able to understand that k < 20001 and k <= 20000 are equivalent for integertype. > If I rewrite query as (k >= 1 and k < 20001) then plan is optimal: > > postgres=# explain select * from bt where k >= 1 and k < 20001 and v = 100; > QUERY PLAN > ---------------------------------------------------------------------- > Append (cost=0.29..15.63 rows=2 width=8) > -> Index Scan using dti1 on dt1 (cost=0.29..8.30 rows=1 width=8) > Index Cond: (v = 100) > -> Index Scan using dti2 on dt2 (cost=0.29..7.33 rows=1 width=8) > Index Cond: (v = 100) > (5 rows) > > It is fixed by one more patch of predtest.c: Have you considered using the range types (internally in operator_predicate_proof()) instead of hard-coding operator OIDs? The range types do have the knowledge that k < 20001 and k <= 20000 are equivalent for integer type: postgres=# SELECT int4range '(, 20001)' = int4range '(, 20000]'; ?column? ---------- t (1 row) -- Antonin Houska Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt Web: http://www.postgresql-support.de, http://www.cybertec.at
On 09.01.2018 19:48, Antonin Houska wrote:
> Have you considered using the range types (internally in
> operator_predicate_proof()) instead of hard-coding operator OIDs? The range
> types do have the knowledge that k < 20001 and k <= 20000 are equivalent for
> integer type:
>
> postgres=# SELECT int4range '(, 20001)' = int4range '(, 20000]';
> ?column?
> ----------
> t
> (1 row)
It is a bright idea, but it is not quite clear to me how to implement it.
There are several builtin range types in Postgres: int4range, int8range, numrange, tsrange, tstzrange, daterange.
Among them int4range, int8range and daterange are discrete types having a canonical function, for which these transformation rules are applicable. Now I perform checks for all these types. So the challenge is to support user defined range types with a canonical function.
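(Which builtin range types are discrete, i.e. have a canonical function, can be seen directly in the catalog, for example:

select rngtypid::regtype, rngsubtype::regtype, rngcanonical from pg_range;

rngcanonical is shown as '-' for the continuous types such as numrange and tsrange.)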
As input, the operator_predicate_proof function has the Oid of the comparison operator and a Const * expression representing the literal value.
So I see the following generic way of checking equivalence of ranges:
1. Get name of operator. If it is '<=' or '>=' then it is closed interval, if it is '<' or '>' then it is open interval.
2. Convert Const to text (using type's out function) and construct interval: '(,"$value"]' for '<=', '["$value",)' for '>=', '(,"$value")' for '<' and '("$value",)' for '>'.
3. Find range type from type of the constant:
select * from pg_range where rngsubtype=?;
4. Try to cast constructed above string to this range type (using type's in function).
5. Compare the two produced ranges and, if they are equal, then operator_predicate_proof should return true.
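Roughly, steps 3-5 correspond to something like the following (just a sketch of the idea in SQL; the patch would of course use the equivalent C APIs):

select rngtypid::regtype from pg_range where rngsubtype = 'integer'::regtype;  -- step 3: int4range
select '(,20000]'::int4range = '(,20001)'::int4range;                          -- steps 4-5: returns t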
But I am not sure that these steps will work correctly for all possible user defined discrete range types:
1. Is it correct to assume that (col <= XXX) corresponds to '(,XXX]' interval?
Is there some better way to determine type of interval based on operator rather than looking at operator's name?
2. Should each range type accept this interval format?
For builtin range types it is true, but is there any guarantee that an arbitrary user defined range type will recognize the string '(,"XXX"]'?
3. Is it always possible to find a corresponding range type for the specified literal type?
For example, for int2 type there is no range type.
4. Range type input function may throw an error, so we have to catch and ignore it.
So, I completely agree with you that using hardcoded operator/type IDs is a bad smell.
But it is not clear whether it is possible to perform this check in a general way.
At least I have doubts about my approach explained above...
-- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote: > On 09.01.2018 19:48, Antonin Houska wrote: > >> Have you considered using the range types (internally in >> operator_predicate_proof()) instead of hard-coding operator OIDs? The range >> types do have the knowledge that k < 20001 and k <= 20000 are equivalent for >> integer type: >> >> postgres=# SELECT int4range '(, 20001)' = int4range '(, 20000]'; >> ?column? >> ---------- >> t >> (1 row) > It is bright idea, but it is not quit clear to me how to implement it. > There are several builtin ranges types in Postgres: int4range, int8range, numrange, tsrange, tstzrange, daterange. > > Among them int4range, int8range and daterange are discrete types having canonical function, for which this transformationrules are applicable. > Now I perform checks for all this types. So the challenge is to support user defined range types with canonical function. > As input operator_predicate_proof function has Oid of comparison operator and Const * expression representing literal value. > So I see the following generic way of checking equivalence of ranges: > > 1. Get name of operator. If it is '<=' or '>=' then it is closed interval, if it is '<' or '>' then it is open interval. > 2. Convert Const to text (using type's out function) and construct interval: '(,"$value"]' for '<=', '["$value",)' for'>=', '(,"$value")' for '<' and '("$value",)' for '>'. > 3. Find range type from type of the constant: > select * from pg_range where rngsubtype=?; > 4. Try to cast constructed above string to this range type (using type's in function). > 5. Compare two produced ranges and if them are equal, then operator_predicate_proof should return true. I haven't thought that much about details, so just one comment: you shouldn't need the conversion to text and back to binary form. utils/adt/rangetypes.c contains constructors that accept the binary values. -- Antonin Houska Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26, A-2700 Wiener Neustadt Web: https://www.cybertec-postgresql.com
On 11.01.2018 12:34, Antonin Houska wrote: > Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote: > >> On 09.01.2018 19:48, Antonin Houska wrote: >> >>> Have you considered using the range types (internally in >>> operator_predicate_proof()) instead of hard-coding operator OIDs? The range >>> types do have the knowledge that k < 20001 and k <= 20000 are equivalent for >>> integer type: >>> >>> postgres=# SELECT int4range '(, 20001)' = int4range '(, 20000]'; >>> ?column? >>> ---------- >>> t >>> (1 row) >> It is bright idea, but it is not quit clear to me how to implement it. >> There are several builtin ranges types in Postgres: int4range, int8range, numrange, tsrange, tstzrange, daterange. >> >> Among them int4range, int8range and daterange are discrete types having canonical function, for which this transformationrules are applicable. >> Now I perform checks for all this types. So the challenge is to support user defined range types with canonical function. >> As input operator_predicate_proof function has Oid of comparison operator and Const * expression representing literalvalue. >> So I see the following generic way of checking equivalence of ranges: >> >> 1. Get name of operator. If it is '<=' or '>=' then it is closed interval, if it is '<' or '>' then it is open interval. >> 2. Convert Const to text (using type's out function) and construct interval: '(,"$value"]' for '<=', '["$value",)' for'>=', '(,"$value")' for '<' and '("$value",)' for '>'. >> 3. Find range type from type of the constant: >> select * from pg_range where rngsubtype=?; >> 4. Try to cast constructed above string to this range type (using type's in function). >> 5. Compare two produced ranges and if them are equal, then operator_predicate_proof should return true. > I haven't thought that much about details, so just one comment: you shouldn't > need the conversion to text and back to binary form. utils/adt/rangetypes.c > contains constructors that accept the binary values. > Attached please find new version of the patch which uses range type to interval matching in operator_predicate_proof function. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Attachment
Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote: > On 11.01.2018 12:34, Antonin Houska wrote: > > Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote: > > I haven't thought that much about details, so just one comment: you shouldn't > > need the conversion to text and back to binary form. utils/adt/rangetypes.c > > contains constructors that accept the binary values. > > > Attached please find new version of the patch which uses range type to > interval matching in operator_predicate_proof function. I think that instead of checking the operator name, e.g. strcmp(oprname, "<") you should test the operator B-tree strategy: BTLessStrategyNumber, BTLessEqualStrategyNumber, etc. The operator string alone does not tell enough about the operator semantics. The strategy can be found in the pg_amop catalog. get_op_btree_interpretation() function may be useful, but there may be something better in utils/cache/lsyscache.c. -- Antonin Houska Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26, A-2700 Wiener Neustadt Web: https://www.cybertec-postgresql.com
On 19.01.2018 16:14, Antonin Houska wrote: > Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote: > >> On 11.01.2018 12:34, Antonin Houska wrote: >>> Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote: >>> I haven't thought that much about details, so just one comment: you shouldn't >>> need the conversion to text and back to binary form. utils/adt/rangetypes.c >>> contains constructors that accept the binary values. >>> >> Attached please find new version of the patch which uses range type to >> interval matching in operator_predicate_proof function. > I think that instead of checking the operator name, e.g. > > strcmp(oprname, "<") > > you should test the operator B-tree strategy: BTLessStrategyNumber, > BTLessEqualStrategyNumber, etc. The operator string alone does not tell enough > about the operator semantics. > > The strategy can be found in the pg_amop catalog. > get_op_btree_interpretation() function may be useful, but there may be > something better in utils/cache/lsyscache.c. > Thank you very much. Shame on me that I didn't notice such solution myself - such checking of B-tree strategy is done in the same predtest.c file! Now checking of predicate clauses compatibility is done in much more elegant and general way. Attached please find new version of the patch. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Attachment
On Sat, Jan 20, 2018 at 5:41 AM, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote: > On 19.01.2018 16:14, Antonin Houska wrote: >> you should test the operator B-tree strategy: BTLessStrategyNumber, >> BTLessEqualStrategyNumber, etc. The operator string alone does not tell >> enough >> about the operator semantics. >> >> The strategy can be found in the pg_amop catalog. >> get_op_btree_interpretation() function may be useful, but there may be >> something better in utils/cache/lsyscache.c. >> > Thank you very much. > Shame on me that I didn't notice such solution myself - such checking of > B-tree strategy is done in the same predtest.c file! > Now checking of predicate clauses compatibility is done in much more elegant > and general way. > Attached please find new version of the patch. I wonder if you should create a new index and SysCache entry for looking up range types by subtype. I'll be interested to see what others have to say about this range type-based technique -- it seems clever to me, but I'm not familiar enough with this stuff to say if there is some other approach you should be using instead. Some superficial project style comments (maybe these could be fixed automatically with pgindent?): +static TypeCacheEntry* lookup_range_type(Oid type) +static RangeType* operator_to_range(TypeCacheEntry *typcache, Oid oper, Const* literal) ... these should be like this: static RangeType * operator_to_range(... I think the idea is that you can always search for a function definition with by looking for "name(" at the start of a line, so we put a newline there. Then there is the whitespace before "*", not after it. + if (pred_range && clause_range && range_eq_internal(typcache, pred_range, clause_range)) + { + return true; + } Unnecessary braces. +/* + * Get range type for the corresprent scalar type. + * Returns NULl if such range type is not found. + * This function performs sequential scan in pg_range table, + * but since number of range rtype is not expected to be large (6 builtin range types), + * it should not be a problem. + */ Typos. -- Thomas Munro http://www.enterprisedb.com
On 29.01.2018 07:34, Thomas Munro wrote: > On Sat, Jan 20, 2018 at 5:41 AM, Konstantin Knizhnik > <k.knizhnik@postgrespro.ru> wrote: >> On 19.01.2018 16:14, Antonin Houska wrote: >>> you should test the operator B-tree strategy: BTLessStrategyNumber, >>> BTLessEqualStrategyNumber, etc. The operator string alone does not tell >>> enough >>> about the operator semantics. >>> >>> The strategy can be found in the pg_amop catalog. >>> get_op_btree_interpretation() function may be useful, but there may be >>> something better in utils/cache/lsyscache.c. >>> >> Thank you very much. >> Shame on me that I didn't notice such solution myself - such checking of >> B-tree strategy is done in the same predtest.c file! >> Now checking of predicate clauses compatibility is done in much more elegant >> and general way. >> Attached please find new version of the patch. > I wonder if you should create a new index and SysCache entry for > looking up range types by subtype. I'll be interested to see what > others have to say about this range type-based technique -- it seems > clever to me, but I'm not familiar enough with this stuff to say if > there is some other approach you should be using instead. I think that it is good idea to add caching for range type lookup. If community think that it will be useful I can try to add such mechanism. But it seems to be not so trivial, especially properly handle invalidations. > Some superficial project style comments (maybe these could be fixed > automatically with pgindent?): > > +static TypeCacheEntry* lookup_range_type(Oid type) > > +static RangeType* operator_to_range(TypeCacheEntry *typcache, Oid > oper, Const* literal) > > ... these should be like this: > > static RangeType * > operator_to_range(... > > I think the idea is that you can always search for a function > definition with by looking for "name(" at the start of a line, so we > put a newline there. Then there is the whitespace before "*", not > after it. > > + if (pred_range && clause_range && range_eq_internal(typcache, > pred_range, clause_range)) > + { > + return true; > + } > > Unnecessary braces. > > +/* > + * Get range type for the corresprent scalar type. > + * Returns NULl if such range type is not found. > + * This function performs sequential scan in pg_range table, > + * but since number of range rtype is not expected to be large (6 > builtin range types), > + * it should not be a problem. > + */ > > Typos. > Thank you. I fixed this issues. New patch is attached. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Attachment
On 29.01.2018 16:24, Konstantin Knizhnik wrote: > On 29.01.2018 07:34, Thomas Munro wrote: >> On Sat, Jan 20, 2018 at 5:41 AM, Konstantin Knizhnik >> <k.knizhnik@postgrespro.ru> wrote: >>> On 19.01.2018 16:14, Antonin Houska wrote: >>>> you should test the operator B-tree strategy: BTLessStrategyNumber, >>>> BTLessEqualStrategyNumber, etc. The operator string alone does not >>>> tell >>>> enough >>>> about the operator semantics. >>>> >>>> The strategy can be found in the pg_amop catalog. >>>> get_op_btree_interpretation() function may be useful, but there may be >>>> something better in utils/cache/lsyscache.c. >>>> >>> Thank you very much. >>> Shame on me that I didn't notice such solution myself - such >>> checking of >>> B-tree strategy is done in the same predtest.c file! >>> Now checking of predicate clauses compatibility is done in much more >>> elegant >>> and general way. >>> Attached please find new version of the patch. >> I wonder if you should create a new index and SysCache entry for >> looking up range types by subtype. I'll be interested to see what >> others have to say about this range type-based technique -- it seems >> clever to me, but I'm not familiar enough with this stuff to say if >> there is some other approach you should be using instead. > I think that it is good idea to add caching for range type lookup. > If community think that it will be useful I can try to add such > mechanism. > But it seems to be not so trivial, especially properly handle > invalidations. > I have added caching of element_type->range_type mapping to typcache.c. But it is not invalidated now if pg_range relation is changed. Also if there are more than one range types for the specified element type then first of them is used. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Attachment
Hi, This patch seems like quite a good improvement. One thing I've not really looked at but am slightly concerned in passing: Are there cases where we now would do a lot of predicate pruning work even though the overhead of just evaluating the qual is trivial, e.g. because there's only one row due to a pkey? The predtest code is many things but lightning fast is not one of them. Obviously that won't matter for analytics queries, but I could see it hurt in OLTPish workloads... Greetings, Andres Freund
On 01.03.2018 23:15, Andres Freund wrote:
> Are there cases where we now would do a lot of predicate pruning work
> even though the overhead of just evaluating the qual is trivial, e.g.
> because there's only one row due to a pkey?
Hi,
I am sorry for the delay with my answer.
I understand your concern and did the following experiment.
I have initialized the database in the following way:
create table base (k integer primary key, v integer);
create table part1 (check (k between 1 and 10000)) inherits (base);
create table part2 (check (k between 10001 and 20000)) inherits (base);
create index pi1 on part1(v);
create index pi2 on part2(v);
insert into part1 values (generate_series(1,10000), random()*100000);
insert into part2 values (generate_series(10001,20000), random()*100000);
vacuum analyze part1;
vacuum analyze part2;
Vanilla Postgres uses the following plan:
explain select * from base where k between 1 and 20000 and v = 100;
QUERY PLAN
-----------------------------------------------------------------------
Append (cost=0.00..16.63 rows=3 width=8)
-> Seq Scan on base (cost=0.00..0.00 rows=1 width=8)
Filter: ((k >= 1) AND (k <= 20000) AND (v = 100))
-> Index Scan using pi1 on part1 (cost=0.29..8.31 rows=1 width=8)
Index Cond: (v = 100)
Filter: ((k >= 1) AND (k <= 20000))
-> Index Scan using pi2 on part2 (cost=0.29..8.31 rows=1 width=8)
Index Cond: (v = 100)
Filter: ((k >= 1) AND (k <= 20000))
(9 rows)
Executing this query 100000 times takes 12 seconds.
With applied patch query plan is changed to:
QUERY PLAN
-----------------------------------------------------------------------
Append (cost=0.00..16.62 rows=3 width=8)
-> Seq Scan on base (cost=0.00..0.00 rows=1 width=8)
Filter: ((k >= 1) AND (k <= 20000) AND (v = 100))
-> Index Scan using pi1 on part1 (cost=0.29..8.30 rows=1 width=8)
Index Cond: (v = 100)
-> Index Scan using pi2 on part2 (cost=0.29..8.30 rows=1 width=8)
Index Cond: (v = 100)
(7 rows)
Elapsed time of 100000 query executions is 13 seconds.
So you were right that the increased query optimization time exceeds the advantage of eliminating the extra checks.
But this is true only if we are not using prepared statements.
With prepared statements results are the following:
Vanilla: 0m3.915s
This patch: 0m3.563s
So the improvement is not so large, but it exists.
If somebody wants to repeat my experiments, I attached to this mail a small shell script which I used to run the query several times.
Non-prepared query is launched using the following command:
time ./repeat-query.sh "select * from base where k between 1 and 20000 and v = 100" 100000
and prepared query:
time ./repeat-query.sh "execute select100" 100000 "prepare select100 as select * from base where k between 1 and 20000 and v = 100"
And once again I want to note that using prepared statements can increase performance almost 3 times!
Since using prepared statements is not always possible, I want to recall the autoprepare patch waiting in the commitfest:
https://www.postgresql.org/message-id/flat/8eed9c23-19ba-5404-7a9e-0584b836b3f3%40postgrespro.ru#8eed9c23-19ba-5404-7a9e-0584b836b3f3@postgrespro.ru
--
Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Attachment
On 21.03.2018 20:30, Konstantin Knizhnik wrote:
Attached please find rebased version of the patch.
Also I did more testing, now using pgbench.
The database initialization script and the custom pgbench script are attached.
Results on my computer are the following:
 pgbench options   | Vanilla (TPS) | Patch (TPS)
-------------------+---------------+-------------
 -c 1              |          9208 |        8289
 -c 1 -M prepared  |         38503 |       41206
 -c 10             |         39224 |       34040
 -c 10 -M prepared |        165465 |      172874
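For reference, the custom pgbench script was along these lines (a hypothetical sketch; the actual attached scripts may differ):

-- hypothetical reconstruction of the custom script, not the attached file itself
\set v random(1, 100000)
select * from base where k between 1 and 20000 and v = :v;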
-- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Attachment
On 22 March 2018 at 22:38, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote: > Attached please find rebased version of the patch. Hi, I started looking over this patch and have a few comments: I don't think this range type idea is a great one. I don't think it's going to ever perform very well. I also see you're not checking the collation of the type anywhere. As of now, no range types have collation support, but you can't really go and code this with that assumption. I also don't really like the sequence scan over pg_range. Probably a better way to do this would be to add a new bt proc, like what was done in 0a459cec for 2 new functions. Something like BTISNEXTVAL_PROC and BTISPREVVAL_PROC. You'd then need to define functions for all the types based on integers, making functions which take 2 parameters of the type, and an additional collation param. The functions should return bool. int4_isnextval(2, 3, InvalidOid) would return true. You'd need to return false on wraparound. I also think that the patch is worth doing without the additional predicate_implied_by() smarts. In fact, I think strongly that the two should be considered as two independent patches. Partial indexes suffer from the same issue you're trying to address here and that would be resolved by any patch which makes changes to predicate_implied_by(). Probably the best place to put the code to skip the redundant quals is inside set_append_rel_size(). There's already code there that skips quals that are seen as const TRUE. This applies for UNION ALL targets with expressions that can be folded to constants once the qual is passed through adjust_appendrel_attrs() For example: explain select * from (select 1 as a from pg_class union all select 2 from pg_class) t where a = 1; I've attached a patch to show what I had in mind. I had to change how partition_qual is populated, which I was surprised to see only gets populated for sub-partitioned tables (the top-level parent won't have a qual since it's not partitioned) I didn't update the partition pruning code that assumes this is only populated for sub-partitioned table. That will need to be done. The patch comes complete with all the failing regression tests where the redundant quals have been removed by the new code. If you want to make this work for CHECK constraints too, then I think the code for that can be added to the same location as the code I added in the attached patch. You'd just need to fetch the check constraint quals and just add some extra code to check if the qual is redundant. Some new performance benchmarks would then be useful in order to find out how much overhead there is. We might learn that we don't want to enable it by default if it's too expensive. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
On 9 July 2018 at 13:26, David Rowley <david.rowley@2ndquadrant.com> wrote: > I started looking over this patch and have a few comments: Hi Konstantin, Wondering, are you going to be submitting an updated patch for this commitfest? If not then I think we can mark this as returned with feedback as it's been waiting on author for quite a while now. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Hi David, On 11.09.2018 06:49, David Rowley wrote: > On 9 July 2018 at 13:26, David Rowley <david.rowley@2ndquadrant.com> wrote: >> I started looking over this patch and have a few comments: > Hi Konstantin, > > Wondering, are you going to be submitting an updated patch for this > commitfest? If not then I think we can mark this as returned with > feedback as it's been waiting on author for quite a while now. > First of all thank you for the review. I am very sorry for the delay with my answer: I was on vacation in July and just forgot about this mail. I have to agree with you that it is better to split this patch into two and that using range types for matching open and closed intervals is not such a good idea. Also the patch proposed by you is much simpler and does mostly the same. Yes, it is not covering CHECK constraints, but since declarative partitioning is now standard in Postgres, I do not think that many people will use the old inheritance mechanism and CHECK constraints. In any case, there are now many optimizations which work only for partitions, but not for inherited tables. I attach to this mail your patch combined with corrected test outputs. Unfortunately the performance gain is not as large as I expected (even without the additional predicate_implied_by() smarts). On the following database: create table bt (k integer, v integer) partition by range (k); create table dt1 partition of bt for values from (1) to (10001); create table dt2 partition of bt for values from (10001) to (20001); create index dti1 on dt1(v); create index dti2 on dt2(v); insert into bt values (generate_series(1,20000), generate_series(1,20000)); analyze bt; and pgbench script: \set x random(1, 10000) select * from bt where k between 1 and 20001 and v=:x; I got ~170k TPS with this patch and about ~160k TPS without it. My hypothesis is that the redundant predicate has to be evaluated only once (for the single selected record), so it adds not so much overhead compared with the index search cost. So I tried another version of the query which selects a large number of records: select count(*) from bt where k between 1 and 20001 and v between :x and :x + 1000; Now the patched version shows 23k TPS vs. 19k for Vanilla.
Attachment
On 12 September 2018 at 08:32, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote: > Also the patch proposed by you is much simple and does mostly the same. Yes, > it is not covering CHECK constraints, > but as far as partitioning becomes now standard in Postgres, I do not think > that much people will use old inheritance mechanism and CHECK constraints. > In any case, there are now many optimizations which works only for > partitions, but not for inherited tables. I've not had time to look at your updated patch yet, but one thing I thought about after my initial review, imagine you have a setup like: create table listp (a int, b int) partition by list(a); create table listp1 partition of listp for values in(1); create index listp_a_b_idx on listp (a,b); and a query: select * from listp where a = 1 order by b; if we remove the "a = 1" qual, then listp_a_b_idx can't be used. I didn't test this in your patch, but I guess since the additional quals are not applied to the children in set_append_rel_size() that by the time set_append_rel_pathlist() is called, then when we go generating the paths, the (a,b) index won't be any good. Perhaps there's some workaround like inventing some sort of "no-op" qual that exists in planning but never makes it way down to scans. Although I admit to not having fully thought that idea through. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On 12.09.2018 08:14, David Rowley wrote:
> On 12 September 2018 at 08:32, Konstantin Knizhnik
> <k.knizhnik@postgrespro.ru> wrote:
>> Also, the patch you proposed is much simpler and does mostly the same. Yes,
>> it does not cover CHECK constraints, but since partitioning is now standard
>> in Postgres, I do not think that many people will keep using the old
>> inheritance mechanism and CHECK constraints. In any case, there are already
>> many optimizations which work only for partitions, but not for inherited
>> tables.
> I've not had time to look at your updated patch yet, but one thing I
> thought about after my initial review: imagine you have a setup like:
>
> create table listp (a int, b int) partition by list(a);
> create table listp1 partition of listp for values in(1);
> create index listp_a_b_idx on listp (a,b);
>
> and a query:
>
> select * from listp where a = 1 order by b;
>
> If we remove the "a = 1" qual, then listp_a_b_idx can't be used.

It looks like this qual is considered when choosing the optimal path before it is removed from the list of quals in set_append_rel_size(). At least this patch does not break the plan in this case:

create table listp (a int, b int) partition by list(a);
create table listp1 partition of listp for values in(1);
create table listp2 partition of listp for values in(2);
create index listp_a_b_idx on listp (a,b);
insert into listp values (1,generate_series(1,100000));
insert into listp values (2,generate_series(100001,200000));

explain select * from listp where a = 1 order by b;
                                            QUERY PLAN
------------------------------------------------------------------------------------------------
 Merge Append  (cost=0.30..4630.43 rows=100000 width=8)
   Sort Key: listp1.b
   ->  Index Only Scan using listp1_a_b_idx on listp1  (cost=0.29..3630.42 rows=100000 width=8)
(3 rows)

> I didn't test this with your patch, but I guess that since the additional
> quals are not applied to the children in set_append_rel_size(), by the time
> set_append_rel_pathlist() is called and we go generating the paths, the
> (a,b) index won't be any good.
>
> Perhaps there's some workaround like inventing some sort of "no-op" qual
> that exists in planning but never makes its way down to scans. Although I
> admit to not having fully thought that idea through.

-- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
On Wed, Sep 12, 2018 at 11:43:09AM +0300, Konstantin Knizhnik wrote: > Looks like this qual is considered for choosing optimal path before it is > removed from list of quals in set_append_rel_size. Hm... The latest reviews have not been addressed yet, so I have marked this as returned with feedback. -- Michael
On 12 September 2018 at 08:32, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
> Also, the patch you proposed is much simpler and does mostly the same. Yes,
> it does not cover CHECK constraints,

I started to look at this and found a problem in regards to varno during the predicate_implied_by() test. The problem is that the partition bound is always stored as varno=1 (for example, see how get_qual_for_list() calls makeVar()). This causes the patch to fail in cases where the partitioned table is not varno=1. You're also incorrectly using rinfo->clause to pass to predicate_implied_by(). This is a problem because the quals stored there have not been translated to have the child varattnos. childqual is the correct thing to use, as that has just been translated. You may not have used it because the varnos will have been converted to the child's varno, which will never be varno=1, so you might have found that not to work due to the missing code to change the varnos to 1.

I've attached the diff for allpaths.c (only) that I ended up with to make it work. This causes the output of many other regression tests to change, so you'll need to go over these and verify everything is correct again.

Please, can you also add a test which exercises this code and which has a partition with columns in a different order than its parent. Having an INT and a TEXT column is best, as if the translations are done incorrectly it's likely to result in a crash, which will alert us to the issue. It would be good to also verify the test causes a crash if you temporarily put the code back to using the untranslated qual.

Thanks for working on this.

-- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
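For readers following along, here is a minimal sketch of the kind of varno adjustment being described. It assumes it sits inside the loop over child quals in set_append_rel_size(), that childqual is the clause already translated to the child by adjust_appendrel_attrs(), and that partition_constraint is the list of partition constraint clauses for the child; the names and the exact call site are illustrative, not taken from the attached diff:

/*
 * Sketch only: the partition constraint stores its Vars with varno = 1,
 * so renumber a copy of the already-translated child qual to varno 1
 * before asking the predicate prover whether the constraint implies it.
 */
Node   *qual_copy = (Node *) copyObject(childqual);

ChangeVarNodes(qual_copy, childRTindex, 1, 0);	/* child varno -> 1 */

if (predicate_implied_by(list_make1(qual_copy),
						 partition_constraint,	/* illustrative name */
						 false))
{
	/* The constraint already guarantees this qual: skip enforcing it. */
	continue;
}

Renumbering in the opposite direction (translating the constraint's Vars to the child's varno instead) would presumably work just as well; the point is only that both sides of the implication test must agree on the varno.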
On 04.10.2018 06:19, David Rowley wrote:
> On 12 September 2018 at 08:32, Konstantin Knizhnik
> <k.knizhnik@postgrespro.ru> wrote:
>> Also, the patch you proposed is much simpler and does mostly the same. Yes,
>> it does not cover CHECK constraints,
> I started to look at this and found a problem in regards to varno
> during the predicate_implied_by() test. The problem is that the
> partition bound is always stored as varno=1 (for example, see how
> get_qual_for_list() calls makeVar()). This causes the patch to fail in
> cases where the partitioned table is not varno=1. You're also
> incorrectly using rinfo->clause to pass to predicate_implied_by().
> This is a problem because the quals stored there have not been translated
> to have the child varattnos. childqual is the correct thing to use, as
> that has just been translated. You may not have used it because the varnos
> will have been converted to the child's varno, which will never be varno=1,
> so you might have found that not to work due to the missing code to
> change the varnos to 1.
>
> I've attached the diff for allpaths.c (only) that I ended up with to
> make it work. This causes the output of many other regression tests to
> change, so you'll need to go over these and verify everything is
> correct again.
>
> Please, can you also add a test which exercises this code and which has a
> partition with columns in a different order than its parent. Having
> an INT and a TEXT column is best, as if the translations are done
> incorrectly it's likely to result in a crash, which will alert us to
> the issue. It would be good to also verify the test causes a crash if
> you temporarily put the code back to using the untranslated qual.
>
> Thanks for working on this.

Thank you very much for detecting and fixing this problem. I have checked that all plan changes caused by this fix are correct. An updated version of the patch is attached.

-- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
On 4 October 2018 at 22:11, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
> On 04.10.2018 06:19, David Rowley wrote:
>> Please, can you also add a test which exercises this code and which has a
>> partition with columns in a different order than its parent. Having
>> an INT and a TEXT column is best, as if the translations are done
>> incorrectly it's likely to result in a crash, which will alert us to
>> the issue. It would be good to also verify the test causes a crash if
>> you temporarily put the code back to using the untranslated qual.
>>
>> Thanks for working on this.
>>
> Thank you very much for detecting and fixing this problem.
> I have checked that all plan changes caused by this fix are correct.
> An updated version of the patch is attached.

Can you add the test that I mentioned above?

-- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On 04.10.2018 12:19, David Rowley wrote:
> On 4 October 2018 at 22:11, Konstantin Knizhnik
> <k.knizhnik@postgrespro.ru> wrote:
>> On 04.10.2018 06:19, David Rowley wrote:
>>> Please, can you also add a test which exercises this code and which has a
>>> partition with columns in a different order than its parent. Having
>>> an INT and a TEXT column is best, as if the translations are done
>>> incorrectly it's likely to result in a crash, which will alert us to
>>> the issue. It would be good to also verify the test causes a crash if
>>> you temporarily put the code back to using the untranslated qual.
>>>
>>> Thanks for working on this.
>>>
>> Thank you very much for detecting and fixing this problem.
>> I have checked that all plan changes caused by this fix are correct.
>> An updated version of the patch is attached.
> Can you add the test that I mentioned above?

Will the following test be enough?

-- check that columns of the parent table are correctly mapped to the child
-- partition if their order doesn't match
create table paren (a int, b text) partition by range(a);
create table child_1 partition of paren for values from (0) to (10);
create table child_2 (b text, a int);
alter table paren attach partition child_2 for values from (10) to (20);
insert into paren values (generate_series(0,19), generate_series(100,119));

explain (costs off) select * from paren where a between 0 and 9;
explain (costs off) select * from paren where a between 10 and 20;
explain (costs off) select * from paren where a >= 5;
explain (costs off) select * from paren where a <= 15;

select count(*) from paren where a >= 5;
select count(*) from paren where a < 15;

drop table paren cascade;
--------------------------------------

If so, then the updated patch is attached.

-- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
On 5 October 2018 at 04:45, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
> Will the following test be enough?
>
> -- check that columns of the parent table are correctly mapped to the child
> -- partition if their order doesn't match
> create table paren (a int, b text) partition by range(a);
> create table child_1 partition of paren for values from (0) to (10);
> create table child_2 (b text, a int);
> alter table paren attach partition child_2 for values from (10) to (20);
> insert into paren values (generate_series(0,19), generate_series(100,119));
>
> explain (costs off) select * from paren where a between 0 and 9;
> explain (costs off) select * from paren where a between 10 and 20;
> explain (costs off) select * from paren where a >= 5;
> explain (costs off) select * from paren where a <= 15;
>
> select count(*) from paren where a >= 5;
> select count(*) from paren where a < 15;
>
> drop table paren cascade;

I started looking at this to see if this test would cause a crash with the original code, but it does not. The closest I can get is:

drop table parent;
create table parent (a bytea, b int) partition by range(a);
create table child_1 (b int, a bytea);
alter table parent attach partition child_1 for values from ('a') to ('z');
explain (costs off) select * from parent where b = 1;

But despite the varattnos of the bytea and int accidentally matching, there's no crash, due to the way operator_predicate_proof() requires more than just the varno and varattno to match. It requires the Vars to be equal(), which includes vartype, and those are not the same. So the proof just fails.

In short, this test probably does nothing in addition to what various other tests are doing. So, given the test is unable to crash the unfixed code, I think it's likely not a worthwhile test to add.

I wrote:
> create table listp (a int, b int) partition by list(a);
> create table listp1 partition of listp for values in(1);
> create index listp_a_b_idx on listp (a,b);
>
> and a query:
>
> select * from listp where a = 1 order by b;
>
> if we remove the "a = 1" qual, then listp_a_b_idx can't be used.

I had a look at what allows this query still to use the index, and it's down to pathkey_is_redundant() returning true because there's still an equivalence class containing {a,1}. I don't quite see any reason why it would not be okay to rely on that working, but that only works for pathkeys. If you have a case like:

set max_parallel_workers_per_gather=0;
create table listp (a int, b int) partition by list(a);
create table listp1 partition of listp for values in(1);
insert into listp select 1,x from generate_series(1,1000000) x;
create index listp_a_b_idx on listp (a,b);
vacuum analyze listp;
explain analyze select * from listp where a = 1 and b = 1;

the "a = 1" will be dropped and the index on (a,b) does not get used.
Patched results in:

postgres=# explain analyze select * from listp where a = 1 and b = 1;
                                                  QUERY PLAN
------------------------------------------------------------------------------------------------------------
 Append  (cost=0.00..16925.01 rows=1 width=8) (actual time=0.019..169.231 rows=1 loops=1)
   ->  Seq Scan on listp1  (cost=0.00..16925.00 rows=1 width=8) (actual time=0.017..169.228 rows=1 loops=1)
         Filter: (b = 1)
         Rows Removed by Filter: 999999
 Planning Time: 0.351 ms
 Execution Time: 169.257 ms
(6 rows)

Whereas unpatched gets:

postgres=# explain analyze select * from listp where a = 1 and b = 1;
                                                             QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
 Append  (cost=0.42..4.45 rows=1 width=8) (actual time=0.657..0.660 rows=1 loops=1)
   ->  Index Only Scan using listp1_a_b_idx on listp1  (cost=0.42..4.44 rows=1 width=8) (actual time=0.653..0.655 rows=1 loops=1)
         Index Cond: ((a = 1) AND (b = 1))
         Heap Fetches: 0
 Planning Time: 32.303 ms
 Execution Time: 0.826 ms
(6 rows)

So I was probably wrong about suggesting set_append_rel_size() as a good place to remove these quals. It should perhaps be done later, or maybe we can add some sort of marker to the qual to say it does not need to be enforced during execution. Probably the former would be best, as we don't want to show these in EXPLAIN.

-- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
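As a purely illustrative sketch of that "marker" idea (none of these fields or helpers exist in PostgreSQL today; the names are invented here only to show the shape of the approach), it could look roughly like this:

/*
 * Hypothetical sketch only: instead of deleting a qual proven redundant by
 * the partition constraint, set_append_rel_size() could tag it, so that the
 * planner still sees it (equivalence classes, index path generation) while
 * createplan.c skips it when building the scan quals and EXPLAIN never
 * shows it.  Neither the helper nor the field below exists today.
 */
ListCell   *lc;

foreach(lc, childrel->baserestrictinfo)
{
	RestrictInfo *rinfo = (RestrictInfo *) lfirst(lc);

	if (constraint_implies_qual(childrel, rinfo))	/* hypothetical helper */
		rinfo->redundant_at_runtime = true;			/* hypothetical field */
}

Whether that, or David's preferred option of removing the quals at a later planning stage, is the better design is exactly the open question in this part of the thread.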
On 08.10.2018 00:16, David Rowley wrote:
> If you have a case like:
>
> set max_parallel_workers_per_gather=0;
> create table listp (a int, b int) partition by list(a);
> create table listp1 partition of listp for values in(1);
> insert into listp select 1,x from generate_series(1,1000000) x;
> create index listp_a_b_idx on listp (a,b);
> vacuum analyze listp;
> explain analyze select * from listp where a = 1 and b = 1;
>
> the "a = 1" will be dropped and the index on (a,b) does not get used.
> [...]
> So I was probably wrong about suggesting set_append_rel_size() as a good
> place to remove these quals. It should perhaps be done later, or maybe we
> can add some sort of marker to the qual to say it does not need to be
> enforced during execution. Probably the former would be best, as we don't
> want to show these in EXPLAIN.

Well, I made a different conclusion from this problem (the inability to use a compound index because of redundant qual elimination).
Is it really a good idea to define a compound index whose first key is the partitioning key?
A restriction on this key will in any case lead to partition pruning; we do not need an index for it...
In your case, if we create the index listp_b_idx:
create index listp_b_idx on listp (b);
then the right plan will be generated:
Append (cost=0.42..8.45 rows=1 width=8) (actual time=0.046..0.047 rows=1 loops=1)
-> Index Scan using listp1_b_idx on listp1 (cost=0.42..8.44 rows=1 width=8) (actual time=0.046..0.046 rows=1 loops=1)
Index Cond: (b = 1)
and it is definitely more efficient than the original plan with unpatched Postgres:
 Append  (cost=0.42..4.45 rows=1 width=8) (actual time=0.657..0.660 rows=1 loops=1)
   ->  Index Only Scan using listp1_a_b_idx on listp1  (cost=0.42..4.44 rows=1 width=8) (actual time=0.653..0.655 rows=1 loops=1)
         Index Cond: ((a = 1) AND (b = 1))
         Heap Fetches: 0
In any case, I have attached yet another version of the patch: it is my original patch with the predicate proof logic removed.
Unlike your patch, it also works for CHECK constraints, not only for declarative partitions.
-- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company