Thread: not using index for select min(...)

not using index for select min(...)

From: Don Bowman
I have a table which is very large (~65K rows). I have
a column in it which is indexed and which I wish to use
for a join. I'm finding that I'm using a sequential scan
for this when selecting a MIN.

I've boiled this down to something like this:

=> create table X( value int primary key );
=> explain select min(value) from x;
 Aggregate  (cost=22.50..22.50 rows=1 width=4)
   ->  Seq Scan on x  (cost=0.00..20.00 rows=1000 width=4)
=> \d x
       Table "public.x"
 Column |  Type   | Modifiers
--------+---------+-----------
 value  | integer | not null
Indexes: x_pkey primary key btree (value)

Why wouldn't I be doing an index scan on this table?

--don

Re: not using index for select min(...)

From: Josh Berkus
Don,

> I have a table which is very large (~65K rows). I have
> a column in it which is indexed and which I wish to use
> for a join. I'm finding that I'm using a sequential scan
> for this when selecting a MIN.

Due to Postgres' system of extensible aggregates (i.e. you can write your own
aggregates), the planner has no special knowledge of what any given aggregate
does, so all aggregates trigger a Seq Scan in a query.  It's a known drawback
that nobody has yet found a good way around.
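
For instance, max() on the same indexed column plans out the same way.
On the toy table from your mail it should look roughly like this (the
costs are illustrative):

=> explain select max(value) from x;
 Aggregate  (cost=22.50..22.50 rows=1 width=4)
   ->  Seq Scan on x  (cost=0.00..20.00 rows=1000 width=4)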

--
-Josh Berkus
 Aglio Database Solutions
 San Francisco


Re: not using index for select min(...)

From: Andrew Sullivan
On Fri, Jan 31, 2003 at 04:12:38PM -0500, Don Bowman wrote:
> Why wouldn't I be doing an index scan on this table?

Because you're using the aggregate function min().  See

<http://www.ca.postgresql.org/docs/faq-english.html#4.8>

A

--
----
Andrew Sullivan                         204-4141 Yonge Street
Liberty RMS                           Toronto, Ontario Canada
<andrew@libertyrms.info>                              M2P 2A8
                                         +1 416 646 3304 x110


Re: not using index for select min(...)

From: Bruno Wolff III
On Fri, Jan 31, 2003 at 16:12:38 -0500,
  Don Bowman <don@sandvine.com> wrote:
> I have a table which is very large (~65K rows). I have
> a column in it which is indexed and which I wish to use
> for a join. I'm finding that I'm using a sequential scan
> for this when selecting a MIN.
>
> I've boiled this down to something like this:
>
> => create table X( value int primary key );
> => explain select min(value) from x;

Use the following instead:
select value from x order by value limit 1;
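
With the index on value, the planner can satisfy that with a single probe
of the btree instead of scanning the whole table; the plan should come out
something like this (costs are illustrative):

=> explain select value from x order by value limit 1;
 Limit  (cost=0.00..0.04 rows=1 width=4)
   ->  Index Scan using x_pkey on x  (cost=0.00..35.50 rows=1000 width=4)

The max() equivalent is the same query with ORDER BY value DESC (and, if
the column were nullable, a WHERE value IS NOT NULL so a NULL doesn't sort
first).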

Re: not using index for select min(...)

From: Sean Chittenden
> > I have a table which is very large (~65K rows). I have
> > a column in it which is indexed and which I wish to use
> > for a join. I'm finding that I'm using a sequential scan
> > for this when selecting a MIN.
>
> Due to Postgres' system of extensible aggregates (i.e. you can write
> your own aggregates), all aggregates will trigger a Seq Scan in a
> query.  It's a known drawback that nobody has yet found a good way
> around.

I've spent some time in the past thinking about this, and here's the
best idea that I can come up with:

Part one: set up an ALTER TABLE directive that allows for the
addition/removal of cached aggregates.  Ex:

  ALTER TABLE tab1 ADD AGGREGATE CACHE ON count(*);
  ALTER TABLE tab1 ADD AGGREGATE CACHE ON sum(col2);
  ALTER TABLE tab1 ADD AGGREGATE CACHE ON sum(col2) WHERE col2 > 100;
  ALTER TABLE tab1 ADD AGGREGATE CACHE ON sum(col2) WHERE col2 <= 100;


Which would translate into some kind of action on a pg_aggregate_cache
catalog:

aggregate_cache_oid   OID     -- OID for the aggregate cache
aggregate_table_oid   OID     -- table OID
ins_aggfn_oid         OID     -- aggregate function id for inserts
upd_aggfn_oid         OID     -- aggregate function id for updates
del_aggfn_oid         OID     -- aggregate function id for deletes
cache_value           INT     -- the value of the cache
private_data          INT[4]  -- temporary space for data needed to
                              -- calculate cache_value; four is just a
                              -- guesstimate of how much space the most
                              -- complex aggregates would need
where_clause          ???     -- I haven't the faintest idea how to
                              -- express some kind of conditional like this


Part two: set up a RULE or TRIGGER that runs on INSERT, UPDATE, or
DELETE.  For the count(*) exercise, the ON UPDATE would be a no-op.
For ON INSERT, the count(*) rule would have to do something like:

UPDATE pg_catalog.pg_aggregate_cache SET cache_value = (cache_value + 1)
       WHERE aggregate_cache_oid = 1111111;
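
Expressed as a rewrite rule against the hypothetical pg_aggregate_cache
catalog above (the OID is just the placeholder from the proposal), that
might look roughly like:

CREATE RULE tab1_count_ins AS ON INSERT TO tab1
    DO UPDATE pg_catalog.pg_aggregate_cache
          SET cache_value = cache_value + 1
        WHERE aggregate_cache_oid = 1111111;

A per-row trigger is probably the safer vehicle, though: a rule is a query
rewrite, so a single multi-row INSERT ... SELECT would fire that UPDATE
once, not once per inserted row.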


For the sum(col2) aggregate cache, the insert case is just cache_value +
NEW.col2; for something like avg(col2), the math is a little more complex
(private_data[0] below holds the running row count), but I think it's
quite reasonable given that it obviates a full table scan.  For an insert:

UPDATE pg_catalog.pg_aggregate_cache SET cache_value =
       ((cache_value * private_data[0] + NEW.col2) / (private_data[0] + 1))
       WHERE aggregate_cache_oid = 1111112;
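
A delete would presumably run the same computation in reverse; in both
directions private_data[0] (the running row count) would also have to be
adjusted in the same operation, or the cached value will drift:

UPDATE pg_catalog.pg_aggregate_cache SET cache_value =
       ((cache_value * private_data[0] - OLD.col2) / (private_data[0] - 1))
       WHERE aggregate_cache_oid = 1111112;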


Now, there are some obvious problems:

1) avg requires a floating point return value, so an INT may not be an
   appropriate data type for cache_value or private_data.

2) aggregate caching wouldn't speed up anything but full table
   aggregates or regions of a column that are frequently needed.

3) all of the existing aggregates would have to be updated to include
   an insert, update, delete procedure (total of 60 aggregates, but
   only 7 by name).

4) the planner would have to be taught how to use/return values from
   the cache.

5) Each aggregate type makes use of the private_data column
   differently.  It's up to the cached aggregate function authors to
   not jumble up their private data space.

6) I don't know of a way to handle mixing of floating point numbers
   and integers.  That said, there's some margin of error that could
   creep into the floating point calculations such as avg.


And some benefits:

1) You only get caching for aggregates that you frequently use
   (sum(col2), count(*), etc.).

2) Aggregate function authors can write their own caching routines.

3) For tens of millions of rows, it can be very time consuming to
   sum() fifty million rows, but it's easy to amortize the cost of
   updating the cache on insert, update, delete over the course of a
   month.

4) If an aggregate cache definition isn't set up, it should be easy for
   the planner to fall back to a full table scan, as it does now.


This would definitely be a performance boost, and something that would
only be taken advantage of by DBAs who are intentionally performance
tuning their databases, but for those who do, it could be a massive
win.  Thoughts?  -sc

--
Sean Chittenden

Re: not using index for select min(...)

From: Josh Berkus
Sean,

> I've spent some time in the past thinking about this, and here's the
> best idea that I can come up with:
>
> Part one: set up an ALTER TABLE directive that allows for the
> addition/removal of cached aggregates.  Ex:

Actually, Joe Conway and I may be working on something like this for a client.
Joe's idea is to use a hacked version of the statistics collector to cache
selected aggregate values in memory.  These aggregates would be
non-persistent, but the main concern for us is having aggregate values that
are instantly accessible and that don't increase the cost of INSERTs and
UPDATEs by more than 10%.

This is to satisfy the needs of a particular client, though, so it may never
make it into the general PostgreSQL source.  We'll post it somewhere if it
works, though.

We already implemented caching aggregates to tables, which is trivially easy
to do with triggers.  The problem with this approach is the
UPDATE/INSERT/DELETE overhead; even with an SPI-optimized C trigger, it's
costing us up to 40% additional time under heavy write activity ...
which is exactly when we can't afford delays.

For a database which has a low level of UPDATE activity, though, you can
already implement cached aggregates as tables without inventing any new
Postgres extensions.
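
For the simple count(*) case, a minimal sketch of that trigger approach
might look like the following (table, function, and trigger names are made
up for illustration, and the syntax is per newer PostgreSQL releases):

-- summary table holding the cached row count for x
create table x_counts (n_rows bigint not null);
insert into x_counts select count(*) from x;

-- keep the cache current on every insert/delete
create function x_count_trig() returns trigger as $$
begin
    if tg_op = 'INSERT' then
        update x_counts set n_rows = n_rows + 1;
    else  -- DELETE
        update x_counts set n_rows = n_rows - 1;
    end if;
    return null;  -- AFTER trigger, so the return value is ignored
end;
$$ language plpgsql;

create trigger x_count_maintain
    after insert or delete on x
    for each row execute procedure x_count_trig();

Reading "select n_rows from x_counts" then replaces "select count(*) from x";
the cost is the extra UPDATE on every write, plus every writer contending
on the same summary row, which is where the overhead mentioned above comes
from.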

--
Josh Berkus
Aglio Database Solutions
San Francisco