Re: Proposal: "Causal reads" mode for load balancing reads without stale data - Mailing list pgsql-hackers
From | Thomas Munro |
---|---|
Subject | Re: Proposal: "Causal reads" mode for load balancing reads without stale data |
Date | |
Msg-id | CAEepm=2cvEJveJhNqsNZgWN=S1ZAhMXyATr6fYhibFtpQRrDKA@mail.gmail.com Whole thread Raw |
In response to | Re: Proposal: "Causal reads" mode for load balancing reads without stale data (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: Proposal: "Causal reads" mode for load balancing reads
without stale data
Re: Proposal: "Causal reads" mode for load balancing reads without stale data |
List | pgsql-hackers |
On Wed, Mar 23, 2016 at 12:37 PM, Robert Haas <robertmhaas@gmail.com> wrote: > On Mon, Mar 14, 2016 at 2:38 PM, Thomas Munro > <thomas.munro@enterprisedb.com> wrote: >>> What's with the extra block? >> >> Yeah, that's silly, thanks. Tidied up for the next version. > > Some more comments on 0001: > > + <literal>remote_write</>, <literal> remote_apply</>, > <literal>local</>, and <literal>off</>. > > Extra space. Fixed. > + * apply when the transaction eventuallly commits or aborts. > > Spelling. Fixed. > + if (synchronous_commit == SYNCHRONOUS_COMMIT_REMOTE_APPLY) > + assign_synchronous_commit(SYNCHRONOUS_COMMIT_REMOTE_FLUSH, NULL); > + > + SyncRepWaitForLSN(gxact->prepare_end_lsn); > + > + if (synchronous_commit == SYNCHRONOUS_COMMIT_REMOTE_APPLY) > + assign_synchronous_commit(SYNCHRONOUS_COMMIT_REMOTE_APPLY, NULL); > > You can't do this. Directly changing the value of a variable that is > backing a GUC is verboten, and doing it through the thin veneer of > calling the assign-hook will not avoid the terrible wrath of the > powers that dwell in the outer dark, and/or Pittsburgh. You probably > need a dance here similar to the way forcePageWrites/fullPageWrites > work. Yeah, that was terrible. Instead of that I have now made this interface change: -SyncRepWaitForLSN(XLogRecPtr XactCommitLSN) +SyncRepWaitForLSN(XLogRecPtr lsn, bool commit) If you want to wait for non-commit records (prepare being the only case of that), you pass false in the second argument, and then the wait level is capped at remote flush inside that function. There is no way to wait for non-commit records to be applied and there is no point in making it so that you can (they don't have any user-visible effect). > /* > + * Check if the caller would like to ask standbys for immediate feedback > + * once this commit is applied. > + */ > > Whitespace. Fixed. > + /* > + * Check if the caller would like to ask standbys for immediate feedback > + * once this abort is applied. > + */ > > Whitespace again. Fixed. > /* > + * doRequestWalReceiverReply is used by recovery code to ask the main recovery > + * loop to trigger a walreceiver reply. > + */ > +static bool doRequestWalReceiverReply; > > This is the sort of comment that leads me to ask "why even bother > writing a comment?". Try to say something that's not obvious from the > variable name. The comment for XLogRequestWalReceiverReply has a > similar issue. Changed. > +static void WalRcvBlockSigUsr2(void) > > Style - newline after void. Fixed. > +static void WalRcvUnblockSigUsr2(void) > > And again here. Fixed. > + WalRcvUnblockSigUsr2(); > len = walrcv_receive(NAPTIME_PER_CYCLE, &buf); > + WalRcvBlockSigUsr2(); > > This does not seem like it will be cheap on all operating systems. I > think you should try to rejigger this somehow so that it can just set > the process latch and the wal receiver figures it out from looking at > shared memory. Like maybe a flag in WalRcvData? An advantage of this > is that it should cut down on the number of signals significantly, > because it won't need to send SIGUSR1 when the latch is already set. Still experimenting with a latch here. I will come back on this point soon. > + * Although only "on", "off", "remote_apply", "remote_write", and "local" are > + * documented, we accept all the likely variants of "on" and "off". > > Maybe switch to listing the undocumented values. It follows a pattern used by several nearby bits of code, so it doesn't look like it should be different, and besides you can see what the undocumented values are, they're right below. Here are some test results run on a bunch of Amazon EC2 "m3.large" under Ubuntu Trusty in the Oregon zone, all in the same subnet. Defaults except 1GB shared_buffers. 1. Simple sequential updates (using 'test-causal-reads.c', already posted up-thread): synchronous_commit TPS ==================== ==== off 9234 local 1223 remote_write 907 on 587 remote_apply 555 causal_reads TPS ==================== ==== 0 cr standbys 1112 1 cr standbys 541 2 cr standbys 487 3 cr standbys 467 2. Some pgbench -c4 -j2 -N bench2 runs: synchronous_commit TPS ==================== ==== off 3937 local 1984 remote_write 1701 on 1373 remote_apply 1349 causal_reads TPS ==================== ==== 0 cr standbys 1973 1 cr standbys 1413 2 cr standbys 1282 3 cr standbys 1163 -- Thomas Munro http://www.enterprisedb.com
Attachment
pgsql-hackers by date: