Re: Index only scan - Mailing list pgsql-general

From Gavin Flower
Subject Re: Index only scan
Date
Msg-id 50760406.5090603@archidevsys.co.nz
Whole thread Raw
In response to Index only scan  (Lars Helge Øverland <larshelge@gmail.com>)
Responses Re: Index only scan  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
On 11/10/12 01:03, Lars Helge Øverland wrote:
Hi all,

first of all thanks for the great new "index only scan" feature in
9.2. We have managed to adapt our app (dhis2.org) to take advantage of
it and it really speeds up several queries significantly.

We are now in the process of designing a new component for analytics
and this feature got me thinking we could utilize postgres over other
alternatives like column-oriented databases. Basically we will have a
wide, denormalized table with 20+ columns with relatively low
cardinality. Typically we will have queries which sums a fact column
based on where/group by clauses on several dimension columns (standard
data warehouse stuff). An example would be "select a, b, c, sum(d)
from analytics where a=1 and b=2 group by a,b,c";

Now my initial idea was to simply put a single index on all of those
columns, in the hope that "index only scans" would kick in. It seems
this is not the case, as strictly one (single or multi-column) index
is required to enable index only scans for a query.

My question is: Would it be feasible and/or possible to implement
index only scans in a way that it could take advantage of several,
single-column indexes? For example, a query spanning columns a, b, c
could take advantage of 3 single-column indexes put on columns a, b,
c.

Finally, is there anyone else who are using postgres for this purpose
and have some good tips to share in order to achieve good performance,
including index strategies, beyond the standard config best practices?


best regards,

Lars Helge Øverland



Index only scans do use multiple indexes of single fields where appropriate.  Here the planner determined it only needed to scan 2 of the 3 relevant single field indexes.


Cheers,
Gavin

-- index_only_scan_001.sql


DROP TABLE IF EXISTS iostab;

CREATE TABLE iostab
(
    id  int PRIMARY KEY,
    a   int,
    b   int,
    c   int,
    d   int,
    z   text
);


INSERT INTO iostab (id, a, b, c, d, z) VALUES
    (generate_series(1, 1000000),
    1000 * random(),
    1000 * random(),
    1000 * random(),
    1000 * random(),
    'qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq' || random());
   
CREATE INDEX ON iostab (a);
CREATE INDEX ON iostab (b);
CREATE INDEX ON iostab (c);
CREATE INDEX ON iostab (d);
   
ANALYZE VERBOSE iostab;

EXPLAIN
SELECT
    i.*
FROM
    iostab i
WHERE
        i.a = 2
    AND i.b = 7
    AND i.c = 4
/**/;/**/


//////////////


DROP TABLE
psql:index_only_scan_001.sql:14: NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "iostab_pkey" for table "iostab"
CREATE TABLE
INSERT 0 1000000
CREATE INDEX
CREATE INDEX
CREATE INDEX
CREATE INDEX
psql:index_only_scan_001.sql:30: INFO:  analyzing "public.iostab"
psql:index_only_scan_001.sql:30: INFO:  "iostab": scanned 15385 of 15385 pages, containing 1000000 live rows and 0 dead rows; 30000 rows in sample, 1000000 estimated total rows
ANALYZE
                                     QUERY PLAN                                    
------------------------------------------------------------------------------------
 Bitmap Heap Scan on iostab i  (cost=41.37..45.39 rows=1 width=90)
   Recheck Cond: ((b = 7) AND (a = 2))
   Filter: (c = 4)
   ->  BitmapAnd  (cost=41.37..41.37 rows=1 width=0)
         ->  Bitmap Index Scan on iostab_b_idx  (cost=0.00..20.55 rows=960 width=0)
               Index Cond: (b = 7)
         ->  Bitmap Index Scan on iostab_a_idx  (cost=0.00..20.57 rows=963 width=0)
               Index Cond: (a = 2)
(8 rows)

gavin=>

pgsql-general by date:

Previous
From: Sergey Konoplev
Date:
Subject: Re: Shorthand syntax for triggers
Next
From: Ondrej Ivanič
Date:
Subject: Re: Index only scan