Re: Group commit, revised - Mailing list pgsql-hackers
From | Simon Riggs |
---|---|
Subject | Re: Group commit, revised |
Date | |
Msg-id | CA+U5nM+Od94quNOsObo7gfiCRijwtJSWko8Gz9tyorHY5e96Sw@mail.gmail.com |
In response to | Re: Group commit, revised (Robert Haas <robertmhaas@gmail.com>) |
Responses | Re: Group commit, revised |
List | pgsql-hackers |
On Wed, Jan 18, 2012 at 1:23 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Jan 17, 2012 at 12:37 PM, Heikki Linnakangas
> <heikki.linnakangas@enterprisedb.com> wrote:
>> I found it very helpful to reduce wal_writer_delay in pgbench tests, when
>> running with synchronous_commit=off. The reason is that hint bits don't get
>> set until the commit record is flushed to disk, so making the flushes more
>> frequent reduces the contention on the clog. However, Simon made async
>> commits nudge WAL writer if the WAL page fills up, so I'm not sure how
>> relevant that experience is anymore.

Still completely relevant and orthogonal to this discussion. The patch
retains multi-modal behaviour.

> There's still a small but measurable effect there in master. I think
> we might be able to make it fully auto-tuning, but I don't think we're
> fully there yet (not sure how much this patch changes that equation).
>
> I suggested a design similar to the one you just proposed to Simon
> when he originally suggested this feature. It seems that if the WAL
> writer is the only one doing WAL flushes, then there must be some IPC
> overhead - and context switching - involved whenever WAL is flushed.
> But clearly we're saving something somewhere else, on the basis of
> Peter's results, so maybe it's not worth worrying about. It does seem
> pretty odd to have all the regular backends going through the WAL
> writer and the auxiliary processes using a different mechanism,
> though. If we got rid of that, maybe WAL writer wouldn't even require
> a lock, if there's only one process that can be doing it at a time.

When we did sync rep it made sense to have the WALSender do the work
and for others to just wait. It would be quite strange to require a
different design for essentially the same thing for normal commits and
WAL flushes to local disk.
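To make the "one process does the work, the others just wait" arrangement concrete, here is a toy sketch in Python. It is not PostgreSQL code and does not reflect the patch's actual implementation: the class `WalFlusher`, its methods, the helper `run_demo`, and the short sleep standing in for fsync are all hypothetical names and assumptions chosen for illustration. It only shows the general shape: committers append a record and sleep until a single flusher thread reports that their position is durable, so one simulated flush can cover many waiting commits.

```python
import threading
import time


class WalFlusher:
    """Toy model (hypothetical): one dedicated thread flushes, committers only wait."""

    def __init__(self):
        self._cond = threading.Condition()
        self._inserted = 0    # highest position appended so far
        self._flushed = 0     # highest position known to be durable
        self._stop = False
        self.flush_calls = 0  # number of simulated fsync calls
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def commit(self):
        """Append a commit record, then block until the flusher has covered it."""
        with self._cond:
            self._inserted += 1
            lsn = self._inserted
            self._cond.notify_all()  # nudge the flusher
            while self._flushed < lsn:
                self._cond.wait()
        return lsn

    def _run(self):
        while True:
            with self._cond:
                while self._inserted == self._flushed and not self._stop:
                    self._cond.wait()
                if self._inserted == self._flushed:
                    return  # stop requested and nothing left to flush
                target = self._inserted
            # Simulated fsync, done without holding the lock so that
            # further commits can queue up behind this flush.
            time.sleep(0.001)
            with self._cond:
                self._flushed = target
                self.flush_calls += 1
                self._cond.notify_all()  # wake every committer at or below target

    def close(self):
        with self._cond:
            self._stop = True
            self._cond.notify_all()
        self._thread.join()


def run_demo(n_backends=20):
    """Run n_backends concurrent commits; return their positions and the flush count."""
    wal = WalFlusher()
    lsns = []
    lsns_lock = threading.Lock()

    def backend():
        lsn = wal.commit()
        with lsns_lock:
            lsns.append(lsn)

    threads = [threading.Thread(target=backend) for _ in range(n_backends)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    wal.close()
    return lsns, wal.flush_calls
```

Calling `run_demo(20)` returns the 20 commit positions and the number of simulated flushes. The flush count can never exceed the number of commits, and how far below it falls depends on thread timing, so the sketch says nothing about real-world gains.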
I should mention the original proposal for streaming replication had
each backend send data to standby independently and that was
recognised as a bad idea after some time. Same for sync rep also.

The gain is that previously there was measurable contention for the
WALWriteLock, now there is none. Plus the gang effect continues to
work even when the database gets busy, which isn't true of piggyback
commits as we use now.

Not sure why it's odd to have backends do one thing and auxiliaries do
another. The whole point of auxiliary processes is that they do a
specific thing different to normal backends. Anyway, the important
thing is to have auxiliary processes be independent of each other as
much as possible, which simplifies error handling and state logic in
the postmaster.

With regard to context switching, we're making a kernel call to fsync,
so we'll get a context switch anyway. The whole process is similar to
the way lwlock wake up works.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services