Thread: Partial indexes ... any good theoretical discussion?
I've only just now noticed that CREATE INDEX accepts a 'WHERE' clause. This is used to create something called a "partial index". Hmm, ever being one who sees the world as made of nails when first given a hammer ... One of our tables, with a couple hundred thousand rows) has a date-column index. We expire things out of the table every day (the vast majority, but not exclusively, stuff that's a couple days old). We're frequently running queries against the table, looking for "everything since this time yesterday"; we hardly ever look back more than 24 hours. If I created the index as something like: CREATE INDEX ix_foo ON foo(the_date) WHERE the_date >= now() - interval '24 hours'; what might I expect as the impact? Do index values older than 24 hours drop out? Or must I "refresh" the index from time to time (in our application, probably a couple dozen times a day)? And, absent pat answers to this, is there anything out there in PG-land that documents partial indexes, and when to use them? -- Jeff Boes vox 269.226.9550 ext 24 Database Engineer fax 269.349.9076 Nexcerpt, Inc. http://www.nexcerpt.com ...Nexcerpt... Extend your Expertise
Jeff Boes <jboes@nexcerpt.com> writes: > If I created the index as something like: > CREATE INDEX ix_foo ON foo(the_date) > WHERE the_date >= now() - interval '24 hours'; > what might I expect as the impact? You won't be allowed to, because now() isn't immutable, and only immutable functions are allowed in index expressions and predicates. You could imagine a batch job every night creating a new index CREATE INDEX ix_foo_20031003 ON foo(the_date) WHERE the_date >= '2003-10-03' and then dropping the prior index. Dunno if this would be worth the trouble, but it might be. The CREATE INDEX should run quite quickly if it only has to pick up a few rows, which it would if you run it shortly after the chosen boundary. > And, absent pat answers to this, is there anything out there in PG-land > that documents partial indexes, and when to use them? http://developer.postgresql.org/docs/postgres/indexes-partial.html The links at the bottom of the page point to various academic papers (I hope the links all still work...) regards, tom lane
In an attempt to throw the authorities off his trail, Jeff Boes <jboes@nexcerpt.com> transmitted: > I've only just now noticed that CREATE INDEX accepts a 'WHERE' > clause. This is used to create something called a "partial index". > Hmm, ever being one who sees the world as made of nails when first > given a hammer ... > > One of our tables, with a couple hundred thousand rows) has a > date-column index. We expire things out of the table every day (the > vast majority, but not exclusively, stuff that's a couple days > old). We're frequently running queries against the table, looking > for "everything since this time yesterday"; we hardly ever look back > more than 24 hours. > > If I created the index as something like: > > CREATE INDEX ix_foo ON foo(the_date) > WHERE the_date >= now() - interval '24 hours'; > > what might I expect as the impact? Do index values older than 24 > hours drop out? Or must I "refresh" the index from time to time (in > our application, probably a couple dozen times a day)? That won't work, unfortunately. [somedatabase]=# create index partial on transaction_log(trans_on) where trans_on > now() - '5 days'::interval; ERROR: functions in index predicate must be marked IMMUTABLE [somedatabase]=# You can't have now() (which is certainly *not* immutable) as part of the index. A better idea would be to have a set of several tables, one for each day, UNION ALL them together to generate a view to support queries, and use a sequence to control which table is inserted to on any given day, alongside some rules for insert/deletes. -- (format nil "~S@~S" "aa454" "freenet.carleton.ca") http://cbbrowne.com/info/linuxxian.html "Waving away a cloud of smoke, I look up, and am blinded by a bright, white light. It's God. No, not Richard Stallman, or Linus Torvalds, but God. In a booming voice, He says: "THIS IS A SIGN. USE LINUX, THE FREE Unix SYSTEM FOR THE 386." -- Matt Welsh
It won't work. You could instead have a separate boolean attribute called 'expired' for each row. Set this to true whenever you expire the row, and create the partial index using that attr. Matt > -----Original Message----- > From: pgsql-admin-owner@postgresql.org > [mailto:pgsql-admin-owner@postgresql.org]On Behalf Of Jeff Boes > Sent: 03 October 2003 17:35 > To: pgsql-admin@postgresql.org > Subject: [ADMIN] Partial indexes ... any good theoretical discussion? > > > I've only just now noticed that CREATE INDEX accepts a 'WHERE' clause. > This is used to create something called a "partial index". Hmm, ever > being one who sees the world as made of nails when first given a > hammer ... > > One of our tables, with a couple hundred thousand rows) has a > date-column index. We expire things out of the table every day (the vast > majority, but not exclusively, stuff that's a couple days old). We're > frequently running queries against the table, looking for "everything > since this time yesterday"; we hardly ever look back more than 24 hours. > > If I created the index as something like: > > CREATE INDEX ix_foo ON foo(the_date) > WHERE the_date >= now() - interval '24 hours'; > > what might I expect as the impact? Do index values older than 24 hours > drop out? Or must I "refresh" the index from time to time (in our > application, probably a couple dozen times a day)? > > And, absent pat answers to this, is there anything out there in PG-land > that documents partial indexes, and when to use them? > > -- > Jeff Boes vox 269.226.9550 ext 24 > Database Engineer fax 269.349.9076 > Nexcerpt, Inc. http://www.nexcerpt.com > ...Nexcerpt... Extend your Expertise > > > ---------------------------(end of broadcast)--------------------------- > TIP 7: don't forget to increase your free space map settings >