Thread: Partial indexes ... any good theoretical discussion?

Partial indexes ... any good theoretical discussion?

From
Jeff Boes
Date:
I've only just now noticed that CREATE INDEX accepts a 'WHERE' clause.
This is used to create something called a "partial index".  Hmm, ever
being one who sees the world as made of nails when first given a hammer ...

One of our tables, with a couple hundred thousand rows) has a
date-column index. We expire things out of the table every day (the vast
majority, but not exclusively, stuff that's a couple days old). We're
frequently running queries against the table, looking for "everything
since this time yesterday"; we hardly ever look back more than 24 hours.

If I created the index as something like:

    CREATE INDEX ix_foo ON foo(the_date)
      WHERE the_date >= now() - interval '24 hours';

what might I expect as the impact?  Do index values older than 24 hours
drop out? Or must I "refresh" the index from time to time (in our
application, probably a couple dozen times a day)?

And, absent pat answers to this, is there anything out there in PG-land
that documents partial indexes, and when to use them?

--
Jeff Boes                                      vox 269.226.9550 ext 24
Database Engineer                                     fax 269.349.9076
Nexcerpt, Inc.                                 http://www.nexcerpt.com
            ...Nexcerpt... Extend your Expertise


Re: Partial indexes ... any good theoretical discussion?

From
Tom Lane
Date:
Jeff Boes <jboes@nexcerpt.com> writes:
> If I created the index as something like:
>     CREATE INDEX ix_foo ON foo(the_date)
>       WHERE the_date >= now() - interval '24 hours';
> what might I expect as the impact?

You won't be allowed to, because now() isn't immutable, and only
immutable functions are allowed in index expressions and predicates.

You could imagine a batch job every night creating a new index

    CREATE INDEX ix_foo_20031003 ON foo(the_date)
      WHERE the_date >= '2003-10-03'

and then dropping the prior index.  Dunno if this would be worth the trouble,
but it might be.  The CREATE INDEX should run quite quickly if it only
has to pick up a few rows, which it would if you run it shortly after
the chosen boundary.

> And, absent pat answers to this, is there anything out there in PG-land
> that documents partial indexes, and when to use them?

http://developer.postgresql.org/docs/postgres/indexes-partial.html
The links at the bottom of the page point to various academic
papers (I hope the links all still work...)

            regards, tom lane

Re: Partial indexes ... any good theoretical discussion?

From
Christopher Browne
Date:
In an attempt to throw the authorities off his trail, Jeff Boes <jboes@nexcerpt.com> transmitted:
> I've only just now noticed that CREATE INDEX accepts a 'WHERE'
> clause. This is used to create something called a "partial index".
> Hmm, ever being one who sees the world as made of nails when first
> given a hammer ...
>
> One of our tables, with a couple hundred thousand rows) has a
> date-column index. We expire things out of the table every day (the
> vast majority, but not exclusively, stuff that's a couple days
> old). We're frequently running queries against the table, looking
> for "everything since this time yesterday"; we hardly ever look back
> more than 24 hours.
>
> If I created the index as something like:
>
>     CREATE INDEX ix_foo ON foo(the_date)
>       WHERE the_date >= now() - interval '24 hours';
>
> what might I expect as the impact?  Do index values older than 24
> hours drop out? Or must I "refresh" the index from time to time (in
> our application, probably a couple dozen times a day)?

That won't work, unfortunately.

[somedatabase]=# create index partial on transaction_log(trans_on) where
trans_on > now() - '5 days'::interval;
ERROR:  functions in index predicate must be marked IMMUTABLE
[somedatabase]=#

You can't have now() (which is certainly *not* immutable) as part of
the index.

A better idea would be to have a set of several tables, one for each
day, UNION ALL them together to generate a view to support queries,
and use a sequence to control which table is inserted to on any given
day, alongside some rules for insert/deletes.
--
(format nil "~S@~S" "aa454" "freenet.carleton.ca")
http://cbbrowne.com/info/linuxxian.html
"Waving away a cloud of smoke, I  look up, and am blinded by a bright,
white light.  It's God. No,  not Richard Stallman, or  Linus Torvalds,
but God. In a booming voice, He  says: "THIS IS A SIGN. USE LINUX, THE
FREE Unix SYSTEM FOR THE 386." -- Matt Welsh

Re: Partial indexes ... any good theoretical discussion?

From
"Matt Clark"
Date:
It won't work.

You could instead have a separate boolean attribute called 'expired' for
each row.  Set this to true whenever you expire the row, and create the
partial index using that attr.

Matt

> -----Original Message-----
> From: pgsql-admin-owner@postgresql.org
> [mailto:pgsql-admin-owner@postgresql.org]On Behalf Of Jeff Boes
> Sent: 03 October 2003 17:35
> To: pgsql-admin@postgresql.org
> Subject: [ADMIN] Partial indexes ... any good theoretical discussion?
>
>
> I've only just now noticed that CREATE INDEX accepts a 'WHERE' clause.
> This is used to create something called a "partial index".  Hmm, ever
> being one who sees the world as made of nails when first given a
> hammer ...
>
> One of our tables, with a couple hundred thousand rows) has a
> date-column index. We expire things out of the table every day (the vast
> majority, but not exclusively, stuff that's a couple days old). We're
> frequently running queries against the table, looking for "everything
> since this time yesterday"; we hardly ever look back more than 24 hours.
>
> If I created the index as something like:
>
>     CREATE INDEX ix_foo ON foo(the_date)
>       WHERE the_date >= now() - interval '24 hours';
>
> what might I expect as the impact?  Do index values older than 24 hours
> drop out? Or must I "refresh" the index from time to time (in our
> application, probably a couple dozen times a day)?
>
> And, absent pat answers to this, is there anything out there in PG-land
> that documents partial indexes, and when to use them?
>
> --
> Jeff Boes                                      vox 269.226.9550 ext 24
> Database Engineer                                     fax 269.349.9076
> Nexcerpt, Inc.                                 http://www.nexcerpt.com
>             ...Nexcerpt... Extend your Expertise
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 7: don't forget to increase your free space map settings
>