Re: Analysis of ganged WAL writes - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: Analysis of ganged WAL writes |
Date | |
Msg-id | 24856.1034012573@sss.pgh.pa.us |
In response to | Re: Analysis of ganged WAL writes (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: Analysis of ganged WAL writes
|
List | pgsql-hackers |
I wrote:
> That says that the best possible throughput on this test scenario is 5
> transactions per disk rotation --- the CPU is just not capable of doing
> more. I am actually getting about 4 xact/rotation for 10 or more
> clients (in fact it seems to reach that plateau at 8 clients, and be
> close to it at 7).

After further thought I understand why it takes 8 clients to reach full throughput in this scenario. Assume that we have enough CPU oomph so that we can process four transactions, but not five, in the time needed for one revolution of the WAL disk. If we have five active clients then the behavior will be like this:

1. Backend A becomes ready to commit. It locks WALWriteLock and issues a write/flush that will only cover its own commit record. Assume that it has to wait one full disk revolution for the write to complete (this will be the steady-state situation).

2. While A is waiting, there is enough time for B, C, D, and E to run their transactions and become ready to commit. All eventually block on WALWriteLock.

3. When A finishes its write and releases WALWriteLock, B will acquire the lock and initiate a write that (with my patch) will cover C, D, and E's commit records as well as its own.

4. While B is waiting for the disk to spin, A receives a new transaction from its client, processes it, and becomes ready to commit. It blocks on WALWriteLock.

5. When B releases the lock, C, D, E acquire it and quickly fall through, seeing that they need do no work. Then A acquires the lock. GOTO step 1.

So with five active threads, we alternate between committing one transaction and four transactions on odd and even disk revolutions.

It's pretty easy to see that with six or seven active threads, we will alternate between committing two or three transactions and committing four. Only when we get to eight threads do we have enough backends to ensure that four transactions are available to commit on every disk revolution. This must be so because the backends that are released at the end of any given disk revolution will not be able to participate in the next group commit, if there is already at least one backend ready to commit.

So this solution isn't perfect; it would still be nice to have a way to delay initiation of the WAL write until "just before" the disk is ready to accept it. I dunno any good way to do that, though.

I went ahead and committed the patch for 7.3, since it's simple and does offer some performance improvement. But maybe we can think of something better later on...

regards, tom lane
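
The revolution-by-revolution behavior described in the message can be sketched as a small simulation. This is only a toy model, not PostgreSQL code: the function name commits_per_revolution, the cpu_limit parameter, and the starting state (one backend ready to commit, all others running) are assumptions made for illustration. It assumes every write takes exactly one disk revolution, that the CPU can finish at most cpu_limit transactions per revolution, and that backends released at the end of a revolution cannot join the write that starts at that same moment.

```python
def commits_per_revolution(clients, cpu_limit=4, revolutions=9):
    """Toy model of ganged WAL writes (illustrative assumption, not PostgreSQL code).

    Returns a list with the number of transactions committed on each disk
    revolution. 'ready' counts the backends blocked waiting to commit when a
    write starts; that write covers all of them.
    """
    if clients < 2 or cpu_limit < 1:
        raise ValueError("model assumes at least 2 clients and cpu_limit >= 1")

    ready = 1          # step 1: backend A alone is ready to commit
    history = []
    for _ in range(revolutions):
        history.append(ready)              # this write commits 'ready' transactions
        running = clients - ready          # everyone not covered by the write
        ready = min(cpu_limit, running)    # CPU-bound: at most cpu_limit finish in time
    return history


for n in (5, 6, 7, 8, 9):
    per_rev = commits_per_revolution(n)
    steady = per_rev[1:]                   # drop the warm-up revolution
    print(n, per_rev, sum(steady) / len(steady))
```

Under these assumptions, five clients alternate between 1 and 4 commits per revolution (2.5 on average), six and seven clients alternate 2/4 and 3/4, and only at eight clients does every revolution commit four transactions.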