On 08/15/2014 02:02 AM, Alvaro Herrera wrote:
> Alvaro Herrera wrote:
>> Heikki Linnakangas wrote:
>
>>> I'm sure this still needs some cleanup, but here's the patch, based
>>> on your v14. Now that I know what this approach looks like, I still
>>> like it much better. The insert and update code is somewhat more
>>> complicated, because you have to be careful to lock the old page,
>>> new page, and revmap page in the right order. But it's not too bad,
>>> and it gets rid of all the complexity in vacuum.
>>
>> It seems there is some issue here, because pageinspect tells me the
>> index is not growing properly for some reason. minmax_revmap_data gives
>> me this array of TIDs after a bunch of insert/vacuum/delete/ etc:
>
> I fixed this issue, and did a lot more rework and bugfixing. Here's
> v15, based on v14-heikki2.
So, the other design change I've been advocating is to store the revmap
in the first N blocks, instead of having the two-level structure with
array pages and revmap pages.
Attached is a patch for that, to be applied after v15. When the revmap
needs to be expanded, all the tuples on it are moved elsewhere
one-by-one. That adds some latency to the unfortunate guy who needs to
do that, but as the patch stands, the revmap is only ever extended by
VACUUM or CREATE INDEX, so I think that's fine. Like with my previous
patch, the point is to demonstrate how much simpler the code becomes
this way; I'm sure there are bugs and cleanup still necessary.
PS. Spotted one oversight in patch v15: callers of mm_doupdate must
check the return value, and retry the operation if it returns false.
- Heikki