Re: Page at a time index scan - Mailing list pgsql-patches

From Tom Lane
Subject Re: Page at a time index scan
Date
Msg-id 28697.1147113962@sss.pgh.pa.us
Whole thread Raw
In response to Re: Page at a time index scan  (Simon Riggs <simon@2ndquadrant.com>)
Responses Re: Page at a time index scan
Re: Page at a time index scan
List pgsql-patches
Simon Riggs <simon@2ndquadrant.com> writes:
> So we just optimised for slowly-but-continually churning tables (i.e.
> DELETEs match INSERTs, or just UPDATEs). i.e. we just improved VACUUM
> performance for those that don't need it that often. That might be the
> common case, but it isn't the one thats hurting most.

No, we just improved VACUUM performance for those that need it most.
I was just doing some desultory experiments with today's code versus
yesterday's (it's easy to make a direct comparison because they're
initdb-compatible ... just stop one postmaster executable and start
another on the same database).  I made a table of 16M rows with an
index over a random-data integer column.  With a thoroughly disordered
index (built on-the-fly as the random data was inserted), the time to
VACUUM after deleting a small number of rows was 615 seconds with
yesterday's code, 31 seconds today.  With a perfectly-ordered index
(identical table, but CREATE INDEX after all the data is in place), the
times were about 28 and 26 seconds respectively.  (I wouldn't put a
*whole* lot of faith in these numbers, since they're from a Dell machine
with a toy ATA drive, but anyway they do show that sequential access to
the index makes a huge difference.)  But perfectly-ordered indexes are
impossible to maintain unless your data is either static or insert-at-
the-end-only.

Anyway, while the extra LWLock and shared-memory access during a split
is annoying, I think it's really negligible (and so does pgbench).
A page split is going to involve many times that much work --- you've
got to acquire lock on at least four different buffers, visit the FSM,
possibly ask the kernel for another disk page, do XLogInsert twice, etc.
Any one of those things involves significantly more CPU effort and
contention than _bt_vacuum_cycleid().  So while I'd be happy to get rid
of it, not at the price of slowing down VACUUM again.

            regards, tom lane

pgsql-patches by date:

Previous
From: Martijn van Oosterhout
Date:
Subject: [PATCH] Round 2: Magic block for modules
Next
From: Alvaro Herrera
Date:
Subject: Re: [WIP] The relminxid addition, try 3