On Fri, 2009-11-20 at 11:14 +0900, Josh Berkus wrote:
> On 11/15/09 11:07 PM, Heikki Linnakangas wrote:
> > - When replaying b-tree deletions, we currently wait out/cancel all
> > running (read-only) transactions. We take the ultra-conservative stance
> > because we don't know how recent the tuples being deleted are. If we
> > could store a better estimate for latestRemovedXid in the WAL record, we
> > could make that less conservative.
>
> Simon was explaining this issue here at JPUGCon; now that I understand
> it, this specific issue seems like the worst usability issue in HS now.
> Bad enough to kill its usefulness for users, or even our ability to get
> useful testing data; in an OLTP production database with several hundred
> inserts per second it would result in pretty much never being able to
> get any query which takes longer than a few seconds to complete on the
> slave.

<sigh> This post isn't really very helpful. You aren't providing the
second part of the discussion, nor even requesting that this issue be
fixed. I can see such comments being taken up by people with a clear
interest in dissing HS.

The case of several hundred inserts per second would not generate any
cleanup records at all. So it's not completely accurate, nor is it
acceptable to generalise. There is nothing about the HS architecture
that will prevent it from being used by high traffic sites, or for long
standby queries. The specific action that will cause problems is a
workload that generates high-volume inserts and deletes. A solution is
possible.

Heikki and I had mentioned that solving this need not be part of the
initial patch, since it wouldn't affect all users. I specifically
removed my solution in July/Aug, to allow the patch to be slimmed down.

In any case, the problem does have a simple workaround that is
documented as part of the current patch. Conflict resolution is
explained in detail with the patch.

From my side, the purpose of discussing this was to highlight something
which is not technically a bug, yet clearly still needs work before
close. And it also needs to be on the table, to allow further discussion
and generate the impetus to allow work on it in this release.

--
 Simon Riggs           www.2ndQuadrant.com