Re: Hot Standby 0.2.1 - Mailing list pgsql-hackers
From | Simon Riggs |
---|---|
Subject | Re: Hot Standby 0.2.1 |
Date | |
Msg-id | 1253604161.4449.138.camel@ebony.2ndQuadrant Whole thread Raw |
In response to | Re: Hot Standby 0.2.1 (Jeff Janes <jeff.janes@gmail.com>) |
Responses |
Re: Hot Standby 0.2.1
|
List | pgsql-hackers |
On Mon, 2009-09-21 at 19:42 -0700, Jeff Janes wrote: > On Tue, Sep 15, 2009 at 2:41 PM, Simon Riggs <simon@2ndquadrant.com> wrote: > > > > OK, here is the latest version of the Hot Standby patchset. This is > > about version 30+ by now, but we should regard this as 0.2.1 > > Patch against CVS HEAD (now): clean apply, compile, no known bugs. > > > > OVERVIEW > > > > You can download PDF versions of the fine manual is here > > http://wiki.postgresql.org/images/0/01/Hot_Standby_main.pdf > > > >From this doc: > > "In recovery, transactions will not be permitted to take any lock > higher other than > AccessShareLock or AccessExclusiveLock. In addition, transactions may never > assign a TransactionId and may never write WAL. The LOCK TABLE command by > default applies an AccessExclusiveLock. Any LOCK TABLE command that runs on > the standby and requests a specific lock type other than AccessShareLock will be > rejected." > > The first sentence seems to say that clients on the stand-by can take > ACCESS EXCLUSIVE, while the last sentence seems to say that they > cannot do so. You are right to pick up that discrepancy, as Heikki did also. The root cause of that discrepancy is a change in patch behaviour between January and now that I will use your post to highlight and discuss, if you don't mind. (and yes, the docs need to be corrected) Initially, it seemed that it was certain that a read-only backend could not take an AccessExclusiveLock. On further thought, there is no particular reason to block AccessExclusiveLocks themselves, just that most things you might do while holding one are banned. But the lock itself is fine. (Any challenge on that?) AccessExclusiveLocks can be used to serialize the actions of other backends. That is a very common use case, so my concern was that LOCK TABLE would be a no-op unless we allowed AccessExclusiveLock, so the patch does allow it. > I did a little experiment on a hot standby instance. I expected that > either I would be denied the lock altogether, or the lock would cause > WAL replay to be paused until either I committed or was forcibly > canceled. But neither happened, I was granted the lock but WAL replay > continued anyway. > > jjanes=# begin; > BEGIN > jjanes=# lock table pgbench_history in access exclusive mode; > LOCK TABLE > jjanes=# select count(*) from pgbench_history; > count > -------- > 519104 > (1 row) > > jjanes=# select count(*) from pgbench_history; > count > -------- > 527814 > (1 row) > > Is this the expected behavior? By me, yes. WAL replay does not require a table lock to progress. Any changes are protected with block-level locks. It does acquire a table lock and cancel conflicting queries when it is about to replay something that would cause a query to explode, such as dropping a table, as explained in docs. So this is not a bug. The explanation of how the above sequence of events occurs is that the backend acquires AccessExclusiveLock - please check on other session in pg_locks. WAL replay continues by the Startup process, inserting further rows into the pgbench_history table as a series of transactions. The second select takes a later snapshot than the first and so sees more data than the first select, hence a larger count. (And I am pleased to see that recovery is progressing quickly even while your queries run). So not a bug, but just one of many possible behaviours we could enforce. 1. Allow AccessExclusiveLocks yet they do not interrupt WAL replay 2. Allow AccessExclusiveLocks but have them pause WAL replay 3. Disallow AccessExclusiveLocks (and so LOCK TABLE is effectively a no-op because it will not be able to serialize anything) So the patch originally implemented (3) but now implements (1). I would say that (2) is very undesirable because it puts WAL replay in the control of non-superusers. That could mean LOCK TABLE implicitly alters the high availability of the standby, and might even be used maliciously to do that. I'm open to views on whether we should use (1) or (3). Comments? Implementing either is no problem and we have a straight choice. We may even wish to review that again later from additional feedback. (Jeff, you have also helped me understand that there is a bug in the way serializable transactions are cancelled, which is easily corrected. Thanks for that unexpected windfall, but it is otherwise unrelated to your comments.) -- Simon Riggs www.2ndQuadrant.com
pgsql-hackers by date: