Re: Using quicksort for every external sort run - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: Using quicksort for every external sort run
Msg-id CAM3SWZQy0YH8p8TrqOytbLOmfLhXrTz4qmQA6GDFB6j7vPOG9w@mail.gmail.com
In response to Re: Using quicksort for every external sort run  (Greg Stark <stark@mit.edu>)
Responses Re: Using quicksort for every external sort run  (Greg Stark <stark@mit.edu>)
List pgsql-hackers
On Wed, Nov 18, 2015 at 5:22 PM, Greg Stark <stark@mit.edu> wrote:
> On Wed, Nov 18, 2015 at 11:29 PM, Peter Geoghegan <pg@heroku.com> wrote:
>> Other systems expose this explicitly, and, as I said, say in an
>> unqualified way that a multi-pass merge should be avoided. Maybe the
>> warning isn't the right way of communicating that message to the DBA
>> in detail, but I am confident that it ought to be communicated to the
>> DBA fairly clearly.
>
> I'm pretty convinced warnings from DML are a categorically bad idea.
> In any OLTP load they're effectively fatal errors since they'll fill
> up log files or client output or cause other havoc. Or they'll cause
> no problem because nothing is reading them. Neither behaviour is
> useful.

To be clear, this is a LOG level message, not a WARNING.

I think that if the DBA ever sees the multipass_warning message, he or
she does not have an OLTP workload. If you experience what might be
considered log spam due to multipass_warning, then the log spam is the
least of your problems. Besides, log_temp_files is a very similar
setting (albeit one that is not enabled by default), so I doubt that
many people share your view that this style of log message is
categorically bad. Having said that, I'm not especially attached to
the idea of communicating the concern to the DBA using the mechanism
of a checkpoint_warning-style LOG message (multipass_warning).

Yes, I really do mean it when I say that the DBA is not supposed to
see this message, no matter how much or how little memory or data is
involved. There is no nuance intended here; it isn't sensible to allow
a multi-pass sort, just as it isn't sensible to allow checkpoints
every 5 seconds. Both of those things can be thought of as thrashing.
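
To make the "multi-pass" condition concrete, here is a rough sketch of how
the number of merge passes follows from the number of sorted runs and the
merge fan-in (how many runs can be merged at once with the memory
available). This is not code from the patch or from tuplesort.c;
merge_passes() and its parameters are hypothetical names used only for
illustration, and it assumes a plain balanced merge in which every pass
reduces the run count by a factor of fan_in.

```c
#include <assert.h>

/*
 * Hypothetical helper (not PostgreSQL code): count the merge passes needed
 * to combine nruns sorted runs when at most fan_in runs can be merged at
 * once. Assumes a simple balanced merge.
 */
static int
merge_passes(long nruns, long fan_in)
{
    int         passes = 0;

    assert(fan_in >= 2);        /* merging needs at least two inputs */

    while (nruns > 1)
    {
        /* each pass leaves ceil(nruns / fan_in) longer runs */
        nruns = (nruns + fan_in - 1) / fan_in;
        passes++;
    }

    return passes;
}
```

Under these assumptions, any result greater than 1 is the multi-pass case
discussed above: with a fan-in of 8, for example, 8 runs merge in a single
pass, while 9 runs already need a second pass over the data.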

> Perhaps the right thing to do is report a statistic to pg_stats so
> DBAs can see how often sorts are in memory, how often they're on disk,
> and how often the on disk sort requires n passes.

That might be better than what I came up with, but I hesitate to track
more things using the statistics collector in the absence of a clear
consensus to do so. I'd be more worried about the overhead of what you
suggest than the overhead of a LOG message, seen only in the case of
something that's really not supposed to happen.

-- 
Peter Geoghegan


