Re: [Question] Similar Cost but variable execution time in sort - Mailing list pgsql-hackers

From	Tom Lane
Subject	Re: [Question] Similar Cost but variable execution time in sort
Date	March 5, 2023 19:51:05
Msg-id	3165518.1678035065@sss.pgh.pa.us
In response to	[Question] Similar Cost but variable execution time in sort (Ankit Kumar Pandey <itsankitkp@gmail.com>)
Responses	Re: [Question] Similar Cost but variable execution time in sort
List	pgsql-hackers

Ankit Kumar Pandey <itsankitkp@gmail.com> writes:
> From my observation, we only account for data in cost computation but 
> not number of columns sorted.
> Should we not account for number of columns in sort as well?

I'm not sure whether simply charging more for 2 sort columns than 1
would help much.  The traditional reasoning for not caring was that
data and I/O costs would swamp comparison costs anyway, but maybe with
ever-increasing memory sizes we're getting to the point where it is
worth refining the model for in-memory sorts.  But see the header
comment for cost_sort().
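
[Illustration, not part of the original message.] The in-memory case
described in that header comment can be sketched roughly as below. This is
a simplified model under stated assumptions, not the actual C code: it
assumes about t*log2(t) comparisons for t tuples, a default per-comparison
charge of 2 * cpu_operator_cost, and the default cpu_operator_cost of
0.0025. It ignores the input path's cost, disk-based sorts, and LIMIT
handling. The function name and the n_sort_columns parameter are
hypothetical; the parameter exists only to show that the number of sort
columns never enters the estimate.

```python
import math

# Assumed default value of the cpu_operator_cost setting.
CPU_OPERATOR_COST = 0.0025


def in_memory_sort_cost(tuples, n_sort_columns=1):
    """Simplified sketch of an in-memory sort cost estimate.

    n_sort_columns is a hypothetical parameter that is deliberately
    unused: the estimate depends only on the number of tuples.
    """
    # Avoid log2(0) or log2(1) producing a zero or undefined cost.
    tuples = max(float(tuples), 2.0)

    # Default charge per comparison, independent of datatype and of
    # how many columns take part in the comparison.
    comparison_cost = 2.0 * CPU_OPERATOR_COST

    # About t * log2(t) comparisons to sort t tuples in memory.
    startup_cost = comparison_cost * tuples * math.log2(tuples)

    # A per-tuple charge for handing back the sorted output.
    run_cost = CPU_OPERATOR_COST * tuples

    return startup_cost + run_cost


if __name__ == "__main__":
    one_col = in_memory_sort_cost(1024, n_sort_columns=1)
    two_col = in_memory_sort_cost(1024, n_sort_columns=2)
    print(one_col, two_col)  # identical estimates in this model
```

Under these assumptions, sorting the same rows on one column or on two
yields the same estimate, which matches the observation of similar cost
but different execution time.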

Also ... not too long ago we tried and failed to install more-complex
sort cost estimates for GROUP BY.  The commit log message for f4c7c410e
gives some of the reasons why that failed, but what it boils down to
is that useful estimates would require information we don't have, such
as a pretty concrete idea of the relative costs of different datatypes'
comparison functions.

In short, maybe there's something to be done here, but I'm afraid
there is a lot of infrastructure slogging needed first, if you want
estimates that are better than garbage-in-garbage-out.

            regards, tom lane
