Re: cost_sort() improvements - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: cost_sort() improvements
Date
Msg-id CAH2-Wzm17qTRO71UToUqu9Lu64Jx+4rt09Ux8eBcG-_RgQP45A@mail.gmail.com
In response to cost_sort() improvements  (Teodor Sigaev <teodor@sigaev.ru>)
Responses Re: cost_sort() improvements
List pgsql-hackers
On Thu, Jun 28, 2018 at 9:47 AM, Teodor Sigaev <teodor@sigaev.ru> wrote:
> Current estimation of sort cost has following issues:
>  - it doesn't differ one and many columns sort
>  - it doesn't pay attention to comparison function cost and column width
>  - it doesn't try to count number of calls of comparison function on per
>    column basis

For a long time, I've been suspicious of the arbitrary way in which
cost_sort() costs I/O for external sorts. I'm not 100% sure
about how we should think about this question, but I am sure that it
needs to be improved in *some* way. It's really not difficult to show
that external sorts are now often faster than internal sorts, because
they're able to be completed on-the-fly, which can have very good CPU
cache characteristics, and because the I/O latency can be hidden
fairly well much of the time. Of course, as memory is taken away,
external sorts will eventually get slower and slower, but it's
surprising how little difference it makes. (This makes me tempted to
look into a sort_mem GUC, even though I suspect that that will be
controversial.)
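
To make the concern concrete, here is a rough sketch of a cost model in
the style of cost_sort(). It is a simplification written for
illustration, not the planner's actual code: the function name
sort_startup_cost and the fixed merge_order parameter are assumptions
(the real planner derives the merge order from available memory), and
the 75%/25% sequential/random split is the kind of fixed guess being
questioned here. The constants are the documented defaults of the
corresponding planner settings.

```python
import math

# Planner cost constants, set to PostgreSQL's documented defaults.
SEQ_PAGE_COST = 1.0
RANDOM_PAGE_COST = 4.0
CPU_OPERATOR_COST = 0.0025
BLCKSZ = 8192  # default block size in bytes


def sort_startup_cost(tuples, width, sort_mem_bytes, merge_order=6):
    """Simplified startup cost of a full sort (hypothetical helper).

    merge_order is an assumed fixed value for this sketch.
    """
    tuples = max(tuples, 2.0)  # avoid log2(1) == 0 for tiny inputs
    input_bytes = tuples * width
    comparison_cost = 2.0 * CPU_OPERATOR_COST

    # CPU term: about N * log2(N) comparisons, whatever the sort method.
    cost = comparison_cost * tuples * math.log2(tuples)

    if input_bytes > sort_mem_bytes:
        # External sort: add an I/O term on top of the CPU term.
        npages = math.ceil(input_bytes / BLCKSZ)
        nruns = input_bytes / sort_mem_bytes
        if nruns > merge_order:
            log_runs = math.ceil(math.log(nruns) / math.log(merge_order))
        else:
            log_runs = 1.0
        # Every merge pass writes and then reads each page once.
        npage_accesses = 2.0 * npages * log_runs
        # Assume 3/4 of page accesses are sequential, 1/4 random.
        cost += npage_accesses * (SEQ_PAGE_COST * 0.75
                                  + RANDOM_PAGE_COST * 0.25)
    return cost


if __name__ == "__main__":
    # Same input (1M tuples, 100 bytes wide), two memory budgets.
    internal = sort_startup_cost(1_000_000, 100, 1024 ** 3)     # fits in memory
    external = sort_startup_cost(1_000_000, 100, 4 * 1024 ** 2)  # spills to disk
    print(f"internal: {internal:.1f}  external: {external:.1f}")
```

In a model of this shape the external sort can never be estimated as
cheaper than the internal sort of the same input, because the I/O term
is purely additive and the CPU term ignores cache effects.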

Clearly there is a cost to doing I/O even when an external sort is
faster than an internal sort "in isolation"; I/O does not magically
become something that we don't have to worry about. However, the I/O
cost seems more and more like a distributed cost. We don't really have
a way of thinking about that at all. I'm not sure if that much bigger
problem needs to be addressed before this specific problem with
cost_sort() can be addressed.

-- 
Peter Geoghegan

