Re: [PATCHES] GIN improvements - Mailing list pgsql-hackers

From	Tom Lane
Subject	Re: [PATCHES] GIN improvements
Date	July 23, 2008 18:26:56
Msg-id	13239.1216848409@sss.pgh.pa.us
In response to	Re: [PATCHES] GIN improvements (Alvaro Herrera <alvherre@commandprompt.com>)
List	pgsql-hackers

Alvaro Herrera <alvherre@commandprompt.com> writes:
> Tom Lane wrote:
>> It's a mess:

> These are rather severe problems.  Maybe there's a better solution, but
> perhaps it would be good enough to lock out concurrent access to the
> index while the bulkinsert procedure is working.

Ugh...

The idea I was toying with was to not allow GIN scans to "stop" on
pending-insertion pages; rather, they should suck out all the matching
tuple IDs into backend-local memory as fast as they can, and then return
the TIDs to the caller one at a time from that internal array.  Then,
when the scan is later visiting the main part of the index, it could
check each matching TID against that array to see if it'd already
returned the TID.  (So it might be an idea to sort the TID array after
gathering it, to make those subsequent checks fast via binary search.)
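
A minimal sketch of the gather, sort, and recheck idea described above. It is not the GIN code: Tid, PendingMatches, pending_collect and pending_already_returned are hypothetical names invented for illustration, and the sketch assumes the matching TIDs have already been read out of the pending-insertion pages into a plain array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a heap tuple ID (block number + line offset). */
typedef struct {
    uint32_t block;
    uint16_t offset;
} Tid;

/* Hypothetical scan-local state: TIDs gathered from the pending list. */
typedef struct {
    Tid   *tids;
    size_t count;
} PendingMatches;

/* Order TIDs by block number, then by offset within the block. */
static int tid_cmp(const void *a, const void *b)
{
    const Tid *x = a;
    const Tid *y = b;

    if (x->block != y->block)
        return x->block < y->block ? -1 : 1;
    if (x->offset != y->offset)
        return x->offset < y->offset ? -1 : 1;
    return 0;
}

/*
 * Copy every matching TID into local memory in one pass, so the scan never
 * has to "stop" on a pending-insertion page, then sort the array so later
 * membership checks can use binary search.
 */
static PendingMatches pending_collect(const Tid *matches, size_t n)
{
    PendingMatches pm = { NULL, 0 };

    if (n > 0) {
        pm.tids = malloc(n * sizeof(Tid));
        assert(pm.tids != NULL);
        for (size_t i = 0; i < n; i++)
            pm.tids[i] = matches[i];
        qsort(pm.tids, n, sizeof(Tid), tid_cmp);
        pm.count = n;
    }
    return pm;
}

/*
 * While scanning the main index: was this TID already returned from the
 * pending list?  If so, the caller should skip it.
 */
static bool pending_already_returned(const PendingMatches *pm, Tid tid)
{
    if (pm->count == 0)
        return false;
    return bsearch(&tid, pm->tids, pm->count, sizeof(Tid), tid_cmp) != NULL;
}
```

The scan would hand out pm.tids[0..count-1] one at a time first, then call pending_already_returned() for each match found in the main index. The lookup costs O(log n) per TID, at the price of n * sizeof(Tid) bytes of backend-local memory.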

This would cost in backend-local memory, of course, but hopefully not
very much.  The advantages are the elimination of the deadlock risk
from scan-blocks-insertcleanup-blocks-insert, and fixing the race
condition when a TID previously seen in the pending list is moved to
the main index.  There are still a number of locking issues to fix
but I think they're all relatively easy to deal with.
        regards, tom lane
