Re: log_newpage header comment - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: log_newpage header comment |
Date | |
Msg-id | CA+Tgmob5-7D4RA23ErhTP9piTRanswFjhnutoXtZBTuXJE_4uQ@mail.gmail.com Whole thread Raw |
In response to | Re: log_newpage header comment (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: log_newpage header comment
|
List | pgsql-hackers |
On Thu, Jun 7, 2012 at 8:06 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> It seems that in implementing ginbuildempty(), I falsified the first >> "note" in the header comment for log_newpage(): > >> * Note: all current callers build pages in private memory and write them >> * directly to smgr, rather than using bufmgr. Therefore there is no need >> * to pass a buffer ID to XLogInsert, nor to perform MarkBufferDirty within >> * the critical section. > >> 1. Considering that we're logging the entire page, is it necessary (or >> even desirable) to include the buffer ID in the rdata structure? If >> so, why? To put that another way, does my abuse of log_newpage >> constitute a bug in gistbuildempty()? > > AFAICS, not passing the buffer ID to XLogInsert is not an issue, since > we are logging the whole page in any case. However, failing to perform > MarkBufferDirty within the critical section definitely is an issue. However, I'm not failing to do that: there's an enclosing critical section. >> 2. Should we add a new function that does the same thing as >> log_newpage for a shared buffer? I'm imagining that the signature >> would be: > > Either that or rethink building this data in shared buffers. What's the > point of that, exactly, for a page that we are most certainly not going > to use in normal operation? Well, right. I mean, I think the current implementation is mostly design-by-accident, and my first thought was the same as yours: don't clutter shared_buffers with pages we don't actually need for anything.But there is a down side to doing it the other way,too. Look at what btbuildempty does: /* Write the page. If archiving/streaming, XLOG it. */ smgrwrite(index->rd_smgr, INIT_FORKNUM, BTREE_METAPAGE, (char *) metapage, true); if (XLogIsNeeded()) log_newpage(&index->rd_smgr->smgr_rnode.node,INIT_FORKNUM, BTREE_METAPAGE, metapage); /* * An immediate sync is require even if we xlog'd the page, because the * write did not go throughshared_buffers and therefore a concurrent * checkpoint may have move the redo pointer past our xlog record. */ smgrimmedsync(index->rd_smgr, INIT_FORKNUM); So we have to write the page out immediately, then we have to XLOG it, and then even though we've XLOG'd it, we still have to fsync it immediately. It might be better to go through shared_buffers, which would allow the write and fsync to happen in the background. The cache-poisoning effect is probably trivial - even on a system with 32MB of shared_buffers there are thousands of pages, and init forks are never going to consume more than a handful unless you're creating an awful lot of unlogged relations very quickly - in which case maybe avoiding the immediate fsyncs is a more pressing concern. On the other hand, we haven't had any complaints about the way it works now, either. What's your thought? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: