Re: [Question] Similar Cost but variable execution time in sort - Mailing list pgsql-hackers

From Ankit Kumar Pandey
Subject Re: [Question] Similar Cost but variable execution time in sort
Msg-id d3f3b586-3627-35b1-1129-542be0295eb9@gmail.com
In response to Re: [Question] Similar Cost but variable execution time in sort  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On 05/03/23 22:21, Tom Lane wrote:

> Ankit Kumar Pandey <itsankitkp@gmail.com> writes:
> > From my observation, we account only for the amount of data in the cost
> > computation, not the number of columns sorted.
> > Should we not account for the number of sort columns as well?
>
> I'm not sure whether simply charging more for 2 sort columns than 1
> would help much.  The traditional reasoning for not caring was that
> data and I/O costs would swamp comparison costs anyway, but maybe with
> ever-increasing memory sizes we're getting to the point where it is
> worth refining the model for in-memory sorts.  But see the header
> comment for cost_sort().
> 
> Also ... not too long ago we tried and failed to install more-complex
> sort cost estimates for GROUP BY.  The commit log message for f4c7c410e
> gives some of the reasons why that failed, but what it boils down to
> is that useful estimates would require information we don't have, such
> as a pretty concrete idea of the relative costs of different datatypes'
> comparison functions.
>
> In short, maybe there's something to be done here, but I'm afraid
> there is a lot of infrastructure slogging needed first, if you want
> estimates that are better than garbage-in-garbage-out.
>
>            regards, tom lane

Thanks, I can see the challenges in this.
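
To make the point concrete, here is a minimal sketch of the comparison term for an in-memory sort. It assumes the term has the form comparison_cost * N * log2(N), with a per-comparison charge of 2 * cpu_operator_cost and cpu_operator_cost at its documented default of 0.0025. The function name and the num_sort_columns parameter are hypothetical and exist only for illustration; the real logic lives in cost_sort() and has more branches (disk-based and bounded sorts) than this shows.

```python
import math

# Documented default of the cpu_operator_cost planner setting.
CPU_OPERATOR_COST = 0.0025


def in_memory_sort_comparison_cost(tuples, num_sort_columns=1):
    """Simplified comparison-cost term for an in-memory sort.

    num_sort_columns is a hypothetical parameter, accepted only to show
    that this simplified model ignores it: the estimate depends on the
    row count alone.
    """
    tuples = max(float(tuples), 2.0)  # keep log2() well-behaved for tiny inputs
    comparison_cost = 2.0 * CPU_OPERATOR_COST  # flat charge per comparison
    return comparison_cost * tuples * math.log2(tuples)


one_col = in_memory_sort_comparison_cost(1_000_000, num_sort_columns=1)
two_col = in_memory_sort_comparison_cost(1_000_000, num_sort_columns=2)
print(one_col == two_col)  # same estimate regardless of column count
```

Under these assumptions the two estimates are identical, which matches the observation that plans sorting on one or two columns get a similar cost even when their execution times differ.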

Regards,
Ankit