Re: Get more from indices. - Mailing list pgsql-hackers

From Kyotaro HORIGUCHI
Subject Re: Get more from indices.
Date
Msg-id 20131115.155716.184031283.horiguchi.kyotaro@lab.ntt.co.jp
Whole thread Raw
In response to Re: Get more from indices.  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
Responses Re: Get more from indices.  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
List pgsql-hackers
Hello, I might made insufficient explanation. Surely it is
overdone for the example.

>> uniquetest=# create table t (a int, b int, c int, d text);
>> uniquetest=# create unique index i_t_pkey on t(a, b);
>> uniquetest=# insert into t
>> (select a % 10, a / 10, a, 't' from generate_series(0, 100000) a);
>> uniquetest=# analyze;
>> 
>> uniquetest=# explain (costs off, analyze on) select distinct * from t;

This is the simplest example for this patch.

The main intention of this patch is to widen application of my
another 'Index for UNION' patch.

https://commitfest.postgresql.org/action/patch_view?id=1279

> ISTM the right way to deal with this is not what you've done
> here, but just to deem the DISTINCT a no-op if there's a unique
> index on some subset of the distinct-ed columns.

It is not sufficient only to deem DISTINCT no-op in order to get
more utility of MergeAppends on IndexScans for UNION. The
following example (omitting data insertion..)

| create table cu11 (a int not null, b int not null, c int, d text);
| create unique index i_cu11_pk on cu11 (a, b);
| create table cu12 (like cu11 including all);
| explain select a, b from cu11 union select a, b from cu12
|    order by a, b limit 100;

yields following plan without this patch.

|  Limit
|   ->  Sort (Sort Key: cu11.a, cu11.b)
|    ->  HashAggregate
|     ->  Append
|      ->  Limit
|       ->  Unique
|        ->  Index Only Scan using cu11_a_b_idx on cu11
|      ->  Limit
|       ->  Unique
|        ->  Index Only Scan using cu12_a_b_idx on cu12

But, this could be far more effecient when the plan were as
follows if the rows are fully-ordered on the sort key.

|  Limit
|   ->  Unique 
|    ->  Merge Append (Sort Key: cu11.a, cu11.b)
|     ->  Index Only Scan using cu11_a_b_idx on cu11
|     ->  Index Only Scan using cu12_a_b_idx on cu12

Propagation of uniqueness on MergeAppend is required to do this
but not for the first expample on this thread.

>  This query is actually legally satisfied by a simple seqscan,
> which would be faster than either of the plans you mention.  In
> any case, it seems like a bad idea to me to conflate
> distinct-ness with ordering, so I don't like what you did to
> PathKeys.

Umm. Reconsidering on that and the discussion of the purose above
and the last patch of this thread, I've been wrong in where to
split the patches. I'll repost the reorganized patches on both
threads.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



pgsql-hackers by date:

Previous
From: David Rowley
Date:
Subject: Re: Improve code in tidbitmap.c
Next
From: Kyotaro HORIGUCHI
Date:
Subject: Re: Using indices for UNION.