On Mon, 26 Oct 2020 at 21:15, Peter Geoghegan <pg@bowt.ie> wrote:
> Now for the not-so-good news. The TPS numbers looked like this
> (results in original chronological order of the runs, which I've
> interleaved):
While it is important that we investigate the worst cases, I don't see
this as necessarily bad.
HOT was difficult to measure, but on a 2+ hour run on a larger table,
the latency graph was what showed it was a winner. Short runs and
in-memory data masked the benefits in our early analyses.
So I suggest looking not at the totals and averages but at the peaks
and the long-term trend. Showing that in graphical form is best.
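As a rough sketch of what I mean, assuming you already have one latency
sample per reporting interval from the run (the function names and the
sample numbers below are made up purely for illustration, not taken from
any benchmark in this thread), two runs with the same average can look
very different once you take per-window peaks and a trend line:

```python
from statistics import mean


def window_peaks(latencies_ms, window):
    """Return the maximum latency seen in each consecutive window of samples."""
    return [max(latencies_ms[i:i + window])
            for i in range(0, len(latencies_ms), window)]


def trend_slope(values):
    """Least-squares slope of values against their index (units per sample)."""
    n = len(values)
    if n < 2:
        return 0.0
    x_mean = (n - 1) / 2
    y_mean = mean(values)
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


# Invented samples: both runs average 10 ms, but only one has a late spike.
flat = [10, 10, 10, 10, 10, 10, 10, 10]
spiky = [5, 5, 5, 5, 5, 5, 5, 45]

print(mean(flat), mean(spiky))                          # averages are identical
print(window_peaks(flat, 4), window_peaks(spiky, 4))    # peaks are not
print(trend_slope(flat), trend_slope(spiky))            # nor is the trend
```

Plotting the per-window peaks over the whole run is the graphical form
I have in mind.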
> The patch adds a backstop. It seems to me that that's really what we
> need here. Predictability over time and under a large variety of
> different conditions. Real workloads constantly fluctuate.
Yes, agreed. This is looking like a winner now, but let's check.
> Even if people end up not buying my argument that it's worth it for
> workloads like this, there are various options. And, I bet I can
> further improve the high contention cases without losing the valuable
> part -- there are a number of ways in which I can get the CPU costs
> down further that haven't been fully explored (yes, it really does
> seem to be CPU costs, especially due to TID sorting). Again, this
> patch is all about extreme pathological workloads, system stability,
> and system efficiency over time -- it is not simply about increasing
> system throughput. There are some aspects of this design (that come up
> with extreme workloads) that may in the end come down to value
> judgments. I'm not going to tell somebody that they're wrong for
> prioritizing different things (within reason, at least). In my opinion
> almost all of the problems we have with VACUUM are ultimately
> stability problems, not performance problems per se. And, I suspect
> that we do very well with stupid benchmarks like this compared to
> other DB systems precisely because we currently allow non-HOT updaters
> to "live beyond their means" (which could in theory be great if you
> frame it a certain way that seems pretty absurd to me). This suggests
> we can "afford" to go a bit slower here as far as the competitive
> pressures determine what we should do (notice that this is a distinct
> argument to my favorite argument, which is that we cannot afford to
> *not* go a bit slower in certain extreme cases).
>
> I welcome debate about this.
Agreed, we can trade initial speed for long-term consistency. I guess
there are some heuristics to work out on that tradeoff.
--
Simon Riggs http://www.EnterpriseDB.com/