Re: Get more from indices. - Mailing list pgsql-hackers

From Etsuro Fujita
Subject Re: Get more from indices.
Date
Msg-id 004a01ceea57$cbab0440$63010cc0$@etsuro@lab.ntt.co.jp
Whole thread Raw
In response to Re: Get more from indices.  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
Responses Re: Get more from indices.  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
List pgsql-hackers
Kyotaro HORIGUCHI wrote:
> the attched pathkey_and_uniqueindx_v4_20131122.patch is changed as
> follows.

The patch is compiled successfully and passes all regression tests.  Also,
the patch works well for the tests shown in an earlier email from
Horiguchi-san.  But in this version I found an incorrect behavior.

postgres=# CREATE TABLE t (a int not null, b int not null, c int, d text);
CREATE TABLE
postgres=# CREATE UNIQUE INDEX i_t_ab ON t (a, b);
CREATE INDEX
postgres=# INSERT INTO t (SELECT a / 5, 4 - (a % 5), a, 't' FROM
generate_series(000000, 099999) a);
INSERT 0 100000
postgres=# ANALYZE t;
ANALYZE
postgres=# CREATE TABLE t2 (e text, f int);
CREATE TABLE
postgres=# INSERT INTO t2 VALUES ('t', 2);
INSERT 0 1
postgres=# INSERT INTO t2 VALUES ('t', 1);
INSERT 0 1
postgres=# ANALYZE t2;
ANALYZE
postgres=# EXPLAIN SELECT * FROM t, t2 WHERE t.a < 10 AND t.d = t2.e ORDER
BY t.a, t.b, t.c, t.d, t2.f LIMIT 10;                                  QUERY PLAN
----------------------------------------------------------------------------
----Limit  (cost=0.29..9.30 rows=10 width=20)  ->  Nested Loop  (cost=0.29..129.99 rows=144 width=20)        Join
Filter:(t.d = t2.e)        ->  Index Scan using i_t_ab on t  (cost=0.29..126.80 rows=72
 
width=14)              Index Cond: (a < 10)        ->  Materialize  (cost=0.00..1.03 rows=2 width=6)              ->
SeqScan on t2  (cost=0.00..1.02 rows=2 width=6)
 
(7 rows)

postgres=# SELECT * FROM t, t2 WHERE t.a < 10 AND t.d = t2.e ORDER BY t.a,
t.b, t.c, t.d, t2.f LIMIT 10;a | b | c | d | e | f
---+---+---+---+---+---0 | 0 | 4 | t | t | 20 | 0 | 4 | t | t | 10 | 1 | 3 | t | t | 20 | 1 | 3 | t | t | 10 | 2 | 2 |
t| t | 20 | 2 | 2 | t | t | 10 | 3 | 1 | t | t | 20 | 3 | 1 | t | t | 10 | 4 | 0 | t | t | 20 | 4 | 0 | t | t | 1
 
(10 rows)

(Note the column f is sorted in the descending order.)

ISTM this was brought by the following change.

> In truncate_useless_pathkeys, if query_pathkeys is longer than pathkeys
> made from index columns old patch picked up the latter as IndexPath's
> pathkeys. But the former is more suitable according to the context here.

>  - truncate_useless_pathkeys returns root->query_pathkeys when
>    the index is fully-ordered and query_pathkeys contains the
>    pathkeys made from index columns.

I think it would be required to modify the patch so that the transformation
of the pathkeys is performed only when each pathkey of query_pathkeys
references the indexed relation.  (The above change might have been made
according to my comment in an earlier email I sent.  But I have to admit my
explanation there was not adequate.  I'm sorry for that.)

Here are random comments.

* In grouping_planner(), the patch resets the pathkeys of the cheapest
presorted index-path to query_pathkeys through
get_cheapest_fractional_path_for_pathkeys().  Is this necessary?  ISTM the
transformation of the pathkeys after the scan/join optimization would be no
longer necessary once it has been done at the index-path creation time, ie,
build_index_paths().  Why does the patch do that thing?

* In get_relation_info(), the patch determines the branch condition
"keyattno != ObjectIdAttributeNumber".  I fail to understand the meaning of
this branch condition.  Could you explain about it?

Thanks,

Best regards,
Etsuro Fujita




pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: Suggestion: Issue warning when calling SET TRANSACTION outside transaction block
Next
From: Rodolfo Campero
Date:
Subject: Re: PL/Python: domain over array support