On 2/9/07, Simon Riggs <simon@2ndquadrant.com> wrote:
On Wed, 2007-02-07 at 14:17 -0500, Tom Lane wrote:
> ISTM we could fix that by extending the index VACUUM interface to
> include two concepts: aside from "remove these TIDs when you find them",
> there could be "replace these TIDs with those TIDs when you find them".
> This would allow pointer-swinging to one of the child tuples, after
> which the old root could be removed. This has got the same atomicity
> problem as for CREATE INDEX, because it's the same thing: you're
> de-HOT-ifying the child. So if you can solve the former, I think you
> can make this work too.
This is looking like the best option out of the many, since it doesn't
have any serious restrictions or penalties. Let's see what Pavan thinks,
since he's been working on this aspect.
ISTM that there are two related issues that we need to solve to make
progress.
- We must make de-HOTifying or CHILLing crash safe
- Concurrent index scans should work correctly with CHILLing operations
I think the first issue can be addressed along the lines of what Heikki suggested.
We can CHILL one tuple at a time. I am thinking of a two-step process.
In the first step, the root-tuple and the heap-only tuple (which needs CHILLing)
are marked with a special flag, CHILL_IN_PROGRESS. This operation is
WAL logged. We then insert appropriate index entries for the tuple under
consideration.
In the second step, the HEAP_UPDATED_ROOT and HEAP_ONLY_TUPLE
flags on the heap tuples are adjusted and CHILL_IN_PROGRESS flags are cleared.
During normal operations, if the CHILL_IN_PROGRESS flag is found set, we might
need to do some more work to figure out whether the index insert operations
were successful or not. If we find that index entries are missing for the tuple
being CHILLed, they can be added at that point and the flags set/reset
appropriately.
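
To make the two steps and the repair path concrete, here is a minimal
sketch in C. It is only a toy model: the flag names are the ones used
above, but the bit values, the ToyTuple struct, and all function names are
hypothetical, and "adjusting" the HOT flags is assumed to mean clearing
them. WAL logging and real index insertion are reduced to comments and a
boolean.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag names come from the mail; the bit values are made up for this sketch. */
#define HEAP_UPDATED_ROOT  0x01u
#define HEAP_ONLY_TUPLE    0x02u
#define CHILL_IN_PROGRESS  0x04u

/* Toy stand-in for a heap tuple header (hypothetical, not a PostgreSQL struct). */
typedef struct ToyTuple {
    uint16_t flags;
    bool     has_index_entry;   /* models "index entries point at this tuple" */
} ToyTuple;

/* Step 1: mark both tuples. The real operation would be WAL-logged here. */
static void chill_begin(ToyTuple *root, ToyTuple *child)
{
    assert(root->flags & HEAP_UPDATED_ROOT);
    assert(child->flags & HEAP_ONLY_TUPLE);
    root->flags  |= CHILL_IN_PROGRESS;
    child->flags |= CHILL_IN_PROGRESS;
}

/* Between the steps: insert index entries for the heap-only tuple. */
static void chill_insert_index_entries(ToyTuple *child)
{
    child->has_index_entry = true;
}

/* Step 2: adjust (here: clear) the HOT flags and the in-progress marker. */
static void chill_finish(ToyTuple *root, ToyTuple *child)
{
    assert(child->has_index_entry);
    root->flags  &= (uint16_t) ~(HEAP_UPDATED_ROOT | CHILL_IN_PROGRESS);
    child->flags &= (uint16_t) ~(HEAP_ONLY_TUPLE | CHILL_IN_PROGRESS);
}

/*
 * Repair path for normal operation: if a CHILL was interrupted, add any
 * missing index entries and complete step 2. Returns true if repair work
 * was done, false if the tuple was not in the middle of a CHILL.
 */
static bool chill_repair_if_needed(ToyTuple *root, ToyTuple *child)
{
    if (!(child->flags & CHILL_IN_PROGRESS))
        return false;
    if (!child->has_index_entry)
        chill_insert_index_entries(child);
    chill_finish(root, child);
    return true;
}
```

A crash between chill_begin() and chill_insert_index_entries() leaves
CHILL_IN_PROGRESS set with no index entry; a later call to
chill_repair_if_needed() then inserts the entry and finishes step 2.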
The second problem of concurrent index scans seems a bit more complex.
We need a mechanism so that no tuples are missed and no tuple is
returned twice. Since CHILLing of a tuple adds a new access path to the
tuple from the index, a concurrent index scan may return a tuple twice.
How about grabbing an AccessExclusiveLock during the CHILLing
operation? This would prevent any concurrent index scans. Since CHILLing
of a large table can take a long time, the operation can be spread across
time with periodic acquire/release of the lock. This would prevent starvation
of other backends. Since CHILLing is required only for CREATE INDEX
and stub-cleanup, I am assuming that it's OK for it to be lazy in nature.
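
The periodic acquire/release could look roughly like the following C
sketch. Everything here is hypothetical: ToyLock stands in for the
table-level AccessExclusiveLock, the per-tuple CHILL work is a comment,
and batch_size is an assumed tuning knob that the proposal does not
specify.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for the table lock (hypothetical, not the lock manager API). */
typedef struct ToyLock {
    int held;           /* 1 while the exclusive lock is held */
    int acquisitions;   /* how many times the lock has been taken */
} ToyLock;

static void toy_lock_acquire(ToyLock *lock)
{
    assert(!lock->held);
    lock->held = 1;
    lock->acquisitions++;
}

static void toy_lock_release(ToyLock *lock)
{
    assert(lock->held);
    lock->held = 0;
}

/*
 * CHILL ntuples tuples, holding the lock for at most batch_size tuples at
 * a time. Returns the number of tuples processed.
 */
static size_t chill_table_in_batches(ToyLock *lock, size_t ntuples,
                                     size_t batch_size)
{
    size_t done = 0;

    assert(batch_size > 0);
    while (done < ntuples) {
        size_t end = done + batch_size;

        if (end > ntuples)
            end = ntuples;

        toy_lock_acquire(lock);
        for (; done < end; done++) {
            /* CHILL tuple number `done` here; no index scan can run now. */
        }
        toy_lock_release(lock);
        /* Lock released: waiting backends can run their index scans. */
    }
    return done;
}
```

With 10 tuples and a batch size of 4, the lock is taken three times and is
never held for more than four tuples' worth of work.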
Thanks,
Pavan
--
EnterpriseDB
http://www.enterprisedb.com