Re: GIN fast insert - Mailing list pgsql-hackers
| From | Robert Haas |
|---|---|
| Subject | Re: GIN fast insert |
| Date | |
| Msg-id | 603c8f070902230650p40657826x854f9bcb25196ee9@mail.gmail.com |
| In response to | Re: GIN fast insert (Simon Riggs <simon@2ndQuadrant.com>) |
| Responses | Re: GIN fast insert |
| List | pgsql-hackers |
On Mon, Feb 23, 2009 at 4:56 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> It would be helpful if Heikki or Simon could jump in here, but my
>> understanding is that cleaning up the pending list is a read-write
>> operation.  I don't think we can do that on a hot standby server.
>
> From reading the docs with the patch, the pending list is merged into the
> main index when a VACUUM is performed. I (think I) can see that
> additions to the pending list are WAL logged, so that will work in Hot
> Standby. I also see ginEntryInsert() calls during the move from the pending
> list to the main index, which means that is WAL logged also. AFAICS this
> *could* work during Hot Standby mode.
>
> Best check here http://wiki.postgresql.org/wiki/Hot_Standby#Usage
> rather than attempt to read the patch.
>
> Teodor, can you confirm
> * we WAL log the insert into the pending list
> * we WAL log the move from the pending list to the main index
> * that we maintain the pending list correctly during redo so that it can
>   be accessed by index scans
>
> The main thing with Hot Standby is that we can't do any writes. So a
> pending list cannot change solely because of a gingettuple call on the
> *standby*.

That's what I thought.  Thanks for confirming.

> But that's easy to disable. If all the inserts happened on
> the primary node and all the reads happened on the standby, then the
> pending list would never be cleaned up if the cleanup is triggered only
> by reads.

No, because the inserts would trigger VACUUM on the primary.

> I would suggest that we trigger cleanup by read at threshold size X and
> trigger cleanup by insert at threshold size 5X. That avoids the strange
> case mentioned, but generally ensures only reads trigger cleanup. (But
> why do we want that??)

I think that's actually not what we want.  What we want is for VACUUM
to deal with it.  Unfortunately that's hard to guarantee since, for
example, someone might turn autovacuum off.
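For anyone following along, here is a toy model (my own sketch, not the
PostgreSQL source; all class and method names are hypothetical) of the
two-threshold scheme Simon describes: index scans trigger pending-list
cleanup at size X, while inserts act as a backstop at 5X, so an
insert-only workload still gets cleaned up eventually.

```python
# Illustrative model of the "cleanup by read at X, by insert at 5X" idea.
# This is a sketch of the scheme under discussion, not PostgreSQL code.

class PendingList:
    def __init__(self, read_threshold, insert_factor=5):
        self.entries = []          # batched, not-yet-merged index entries
        self.merged = []           # stand-in for the main GIN index
        self.read_threshold = read_threshold
        self.insert_threshold = insert_factor * read_threshold

    def cleanup(self):
        # Merge the pending list into the main index. This is a write
        # operation, which is why it cannot run on a Hot Standby server.
        self.merged.extend(self.entries)
        self.entries.clear()

    def insert(self, entry):
        self.entries.append(entry)
        if len(self.entries) >= self.insert_threshold:
            self.cleanup()         # backstop: fires even if reads never do

    def scan(self):
        if len(self.entries) >= self.read_threshold:
            self.cleanup()         # normally reads hit their threshold first
        return self.merged + self.entries
```

Under this model, the insert-only-on-primary case Simon raises is covered:
even with no reads at all, the list can never grow past 5X.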
So the issue is what we do when we're in the midst of an index scan and
our TIDBitmap has become lossy.  Right now, the answer is that we clean
up the pending list from inside the index scan and then retry the index
scan.  I don't think that's going to work.  I'm starting to think that
the right thing to do here is to create a non-lossy option for
TIDBitmap.  Tom has been advocating losing the index scan AM
altogether, but that risks losing performance in cases where a LIMIT
would have stopped the scan well prior to completion.

> I found many parts of the patch and docs quite confusing because of the
> way things are named. For me, this is a deferred or delayed insert
> technique to allow batching. I would prefer it if everything used one
> description, rather than "fast", "pending", "delayed", etc.

I mentioned this in my previous review (perhaps not quite so
articulately), and I completely agree with you.  It's clear enough
reading the patch, because you know that all the changes in the patch
must be related to each other, but once it's applied it's going to be
tough to figure out.

> Personally, I see ginInsertCleanup() as a scheduled task unrelated to
> vacuum. Making the deferred tasks happen at vacuum time is just a
> convenient way of having a background task occur regularly. That's OK
> for now, but I would like to be able to request a background task
> without having to hook into AV.

This has been discussed previously, and I assume you will be submitting
a patch at some point, since no one else has volunteered to implement
it.  I think autovacuum is the right way to handle this particular
case, because it is a cleanup operation that depends not on time but on
write activity and hooks into more or less the same stats
infrastructure, but I don't deny the existence of other cases that
would benefit from a scheduler.

...Robert
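To make the lossiness problem concrete, here is a toy sketch (my own
illustration, with hypothetical names, not the PostgreSQL tidbitmap
implementation) of the behavior at issue: when the set of matching
tuple IDs exceeds a memory budget, the bitmap stops tracking individual
(page, offset) pairs and remembers only whole pages, which must then be
rechecked.

```python
# Toy illustration of why a TIDBitmap goes "lossy" under a memory budget.
# A sketch of the general technique, not the PostgreSQL data structure.

class TinyTIDBitmap:
    def __init__(self, max_exact_tids):
        self.max_exact_tids = max_exact_tids
        self.exact = set()         # exact (page, offset) pairs
        self.lossy_pages = set()   # pages whose offsets were discarded

    def add(self, page, offset):
        if (page, offset) in self.exact or page in self.lossy_pages:
            return
        self.exact.add((page, offset))
        if len(self.exact) > self.max_exact_tids:
            self._lossify()

    def _lossify(self):
        # Collapse exact TIDs into whole-page entries to stay in budget;
        # every tuple on a lossy page must later be rechecked.
        self.lossy_pages.update(page for page, _ in self.exact)
        self.exact.clear()

    @property
    def is_lossy(self):
        return bool(self.lossy_pages)
```

A "non-lossy option" in these terms would be a mode where `_lossify` is
never invoked (the bitmap errors out or spills instead), so a caller
like the pending-list scan never has to retry after the bitmap degrades.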