Re: On disable_cost - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: On disable_cost
Date
Msg-id e95fe321-d55e-477b-aa7f-aafc3257ac1f@iki.fi
Whole thread Raw
In response to Re: On disable_cost  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On 23/08/2024 22:05, Robert Haas wrote:
> On Fri, Aug 23, 2024 at 2:48 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> If we're going to do this, I'd prefer a solution that doesn't force
>> API changes onto the vast majority of index AMs that don't have a
>> problem here.
> 
> That's a fair concern.

Yeah, although I don't think it's too bad. There are not that many 
out-of-tree index AM implementations to begin with, and we do change 
things often enough that any interesting AM implementation will likely 
need a few #ifdef PG_VERSION blocks for each PostgreSQL major version 
anyway. pgvector certainly does.

>> One way could be to formalize the hack we were just discussing:
>> "To refuse a proposed path, amcostestimate can set the path's
>> disabled_nodes value to anything larger than 1".  I suspect that
>> that would actually be sufficient, since the path would then lose
>> to the seqscan path in add_path even if that were disabled; but
>> we could put in a hack to prevent it from getting add_path'd at all.
>>
>> Another way could be to bless what hnsw is already doing:
>> "To refuse a proposed path, amcostestimate can return an
>> indexTotalCost of DBL_MAX" (or maybe insisting on +Inf would
>> be better).  That would still require changes comparable to
>> what you specify above, but only in the core-code call path
>> not in every AM.
> 
> If just setting disabled_nodes to a value larger than one works, I'd
> be inclined to not do anything here at all, except possibly document
> that you can do that. Otherwise, we should probably change the code
> somehow.

Modifying the passed-in Path feels hacky. amcostestimate currently 
returns all the estimates in *output parameters, it doesn't modify the 
Path at all.

> I find both of your proposed solutions above to be pretty inelegant,
> and I think if this problem occurred with a core AM, I'd push for an
> API break rather than accept the ugliness. "This path is not valid
> because the AM cannot support it", "this path is crazy expensive", and
> "the user told us not to do it this way" are three different things,
> and signalling two or more of them in the same way muddies the water
> in a way that I don't like. API breaks aren't free, though, so I
> certainly understand why you're not very keen to introduce one where
> it can reasonably be avoided.

The +Inf approach seems fine to me. Or perhaps NaN. Your proposal would 
certainly be the cleanest interface if we don't mind incurring churn to 
AM implementations.

-- 
Heikki Linnakangas
Neon (https://neon.tech)




pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: On disable_cost
Next
From: Tom Lane
Date:
Subject: Re: On disable_cost