Re: Modest proposal to extend TableAM API for controlling cluster commands - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: Modest proposal to extend TableAM API for controlling cluster commands |
Date | |
Msg-id | 20220616031850.phyyk2qbazhwcusy@alap3.anarazel.de Whole thread Raw |
In response to | Re: Modest proposal to extend TableAM API for controlling cluster commands (Mark Dilger <mark.dilger@enterprisedb.com>) |
Responses |
Re: Modest proposal to extend TableAM API for controlling cluster commands
Re: Modest proposal to extend TableAM API for controlling cluster commands |
List | pgsql-hackers |
Hi, On 2022-06-15 20:10:30 -0700, Mark Dilger wrote: > > On Jun 15, 2022, at 7:30 PM, Andres Freund <andres@anarazel.de> wrote: > >> But it's up to the TAM what it wants to do with that boolean, if in fact it > >> does anything at all based on that. A TAM could decide to do the exact > >> opposite of what Heap-AM does and instead sort on VACUUM FULL but not sort > >> on CLUSTER, or perhaps perform a randomized shuffle, or <insert your weird > >> behavior here>. > > > > That's bogus. Yes, an AM can do stupid stuff in a callback. But so what, > > that's possible with all extension APIs. > > I don't think it's "stupid stuff". A major motivation, perhaps the only > useful motivation, for implementing a TAM is to get a non-trivial > performance boost (relative to heap) for some target workload, almost > certainly at the expense of worse performance for another workload. To > optimize any particular workload sufficiently to make it worth the bother, > you've pretty much got to do something meaningfully different than what heap > does. Sure. I just don't see what that has to do with doing something widely differing in VACUUM FULL. Which AM can't support that? I guess there's some where implementing the full visibility semantics isn't feasible, but that's afaics OK. > >> From the point-of-view of a TAM implementor, VACUUM FULL and CLUSTER are > >> synonyms. Or am I missing something? > > > > The callback gets passed use_sort. So just implement it use_sort = false and > > error out if use_sort = true? > > I'm not going to say that your idea is unreasonable for a TAM that you might > choose to implement, but I don't see why that should be required of all TAMs > anybody might ever implement. > The callback that gets use_sort is called from copy_table_data(). By that time, it's too late to avoid the > > /* > * Open the relations we need. > */ > NewHeap = table_open(OIDNewHeap, AccessExclusiveLock); > OldHeap = table_open(OIDOldHeap, AccessExclusiveLock); > > code that has already happened in cluster.c's copy_table_data() function, > and unless I raise an error, after returning from my TAM's callback, the > cluster code will replace the old table with the new one. I'm left no > choices but to copy my data over, loose my data, or abort the command. None > of those are OK options for me. I think you need to do a bit more explaining of what you're actually trying to achieve here. You're just saying "I don't want to", which doesn't really help me to understand the set of useful options. > I'm open to different solutions. If a simple callback like > relation_supports_cluster(Relation rel) is too simplistic, and more > parameters with more context information is wanted, then fine, let's do > that. FWIW, I want to go *simpler* if anything, not more complicated. I.e. make the relation_copy_for_cluster optional. I still think it's a terrible idea to silently ignore tables in CLUSTER or VACUUM FULL. > But I can't really complete my work with the interface as it stands > now. Since you've not described that work to a meaningful degree... Greetings, Andres Freund
pgsql-hackers by date: