Jeff Janes <jeff.janes@gmail.com> writes:
> shouldn't it need a DBA to declare it? How is the system supposed to
> anticipate that at some point years in the future I will want to run
> the command sequence "create foo_archive as select from foo where
> year<2009; delete from foo where year<2009", or its partition-based
> equivalent, and have it operate on several billion rows cleanly and
> quickly? I don't think we can expect the system to anticipate what it
> has never before experienced. This is the DBA's job.
Well, the not-fully spelled out proposal would be to still work it out
from a list of columns picked by the DBA. I though that an existing
index would be best, but maybe just columns would be good.
I guess it's already time to play loose and invent some SQL syntax to
make it easier talking about the same thing:
ALTER TABLE foo SEGMENT ON (year, stamp);
Now the aim would be to be able to implement the operation you describe
by using the new segment map, which is an index pointing to sequential
ranges of on-disk blocks where the data is known to share a common key
range over the columns you're segmenting on. I would imagine this SQL:
TRUNCATE foo WHERE year < 2009;
As the on-disk location of the data that qualify this WHERE clause is
known, it could be possible to (predicate) lock it and bulk remove it,
unlinking whole segments (1GB) at a time when relevant.
> While automatic clustering would be nice, it isn't the same thing as
> partitioning.
That has been my initial reaction to that kind of ideas too. After some
more time brewing the ideas, I'm not convinced that the use cases that
usually drives you to the latter can't be solved with the former.
Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support