Re: POC: converting Lists into arrays - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: POC: converting Lists into arrays |
Date | |
Msg-id | 4555.1551128115@sss.pgh.pa.us Whole thread Raw |
In response to | Re: POC: converting Lists into arrays (Peter Geoghegan <pg@bowt.ie>) |
List | pgsql-hackers |
Peter Geoghegan <pg@bowt.ie> writes: > If we knew that the list code was the bottleneck in a handful of > cases, then I'd come down on Robert's side here. It would then be > possible to update the relevant bottlenecked code in an isolated > fashion, while getting the lion's share of the benefit. However, I > don't think that that's actually possible. The costs of using Lists > everywhere are real and measurable, but it's also highly distributed. > At least, that's my recollection from previous discussion from several > years back. I remember talking about this with Andres in early 2016. Yeah, that's exactly the point. If we could replace some small number of places with a vector-ish data structure and get all/most of the win, then that would be the way to go about it. But I'm pretty sure that we aren't going to make much of an improvement without wholesale changes. Nor is it really all that attractive to have some pieces of the parse/plan/execution tree data structures using one kind of list while other places use another. If we're to attack this at all, I think we should go for a wholesale replacement. Another way of looking at this is that if we expected that extensions had a lot of private Lists, unrelated to these core data structures, it might be worth preserving the List implementation so as not to cause problems for such usage. But I doubt that that's the case; or that any such private lists are more likely to be broken by these API changes than the core usage is (600 changes in however many lines we've got is not a lot); or that people would really want to deal with two independent list implementations with different behaviors just to avoid revisiting some portions of their code while they're being forced to revisit others anyway. > If you want to stop third party code from compiling, you can find a > way to do that without really changing your approach. Nothing stops > you from changing some symbol names minimally, and then making > corresponding mechanical changes to all of the client code within the > tree. Third party code authors would then follow this example, with > the expectation that it's probably going to be a totally mechanical > process. Yeah, if we expected that only mechanical changes would be needed, and forcing certain syntax changes would be a good guide to what had to be done, then this'd be a helpful way to proceed. The lnext changes in my proposed patch do indeed work like that. But the part that's actually hard is finding/fixing the places where you can't safely use lnext anymore, and there's nothing very mechanical about that. (Unless you want to just forbid lnext altogether, which maybe is a reasonable thing to contemplate, but I judged it overkill.) BTW, I neglected to respond to Robert's earlier point that >>> It is also perhaps worth mentioning that reimplementing a List as an >>> array means that it is... not a list. That is a pretty odd state of >>> affairs I think the reason we have Lisp-style lists all over the place has little to do with whether those are ideal data structures, and a lot to do with the fact that chunks of Postgres were originally written in Lisp, and in that language using lists for everything is just How It's Done. I don't have any problem with regarding that nomenclature as being mostly a legacy thing, which is how I documented it in the proposed revision to pg_list.h's header comment. regards, tom lane
pgsql-hackers by date: