Re: POC: converting Lists into arrays - Mailing list pgsql-hackers
From | David Rowley |
---|---|
Subject | Re: POC: converting Lists into arrays |
Date | |
Msg-id | CAKJS1f-pNfoUJrmU8vgD1WrKLqs9WxOVLQ0RpSWu9h6W9RnypA@mail.gmail.com Whole thread Raw |
In response to | Re: POC: converting Lists into arrays (Andres Freund <andres@anarazel.de>) |
Responses |
Re: POC: converting Lists into arrays
Re: POC: converting Lists into arrays |
List | pgsql-hackers |
On Tue, 5 Mar 2019 at 11:11, Andres Freund <andres@anarazel.de> wrote: > On 2019-03-04 16:28:40 -0500, Tom Lane wrote: > > Andres Freund <andres@anarazel.de> writes: > > > I don't buy this. I think e.g. redisgning the way we represent > > > targetlists would be good (it's e.g. insane that we recompute > > > descriptors out of them all the time), and would reduce their allocator > > > costs. > > > > Maybe we're not on the same page here, but it seems to me that that'd be > > addressable with pretty localized changes (eg, adding more fields to > > TargetEntry, or keeping a pre-instantiated output tupdesc in each Plan > > node). But if the concern is about the amount of palloc bandwidth going > > into List cells, we're not going to be able to improve that with localized > > data structure changes; it'll take something like the patch I've proposed. > > What I'm saying is that it'd be reasonable to replace the use of list > for targetlists with 'list2' without a wholesale replacement of all the > list code, and it'd give us benefits. So you think targetlists are the only case to benefit from an array based list? (Ignoring the fact that I already showed another case) When we discover the next thing to benefit, then the replacement will be piecemeal, just the way Tom would rather not do it. I personally don't want to be up against huge resistance when I discover that turning a single usage of a List into List2 is better. We'll need to consider backpatching pain / API breakage *every single time*. A while ago I did have a go at changing some List implementations for my then proposed ArrayList and it was beyond a nightmare, as each time I changed one I realised I needed to change another. In the end, I just gave up. Think of all the places we have forboth() and forthree(), we'll need to either provide a set of macros that take various combinations of List and List2 or do some conversion beforehand. With respect, if you don't believe me, please take my ArrayList patch [1] and have a go at changing targetlists to use ArrayLists all the way from the parser through to the executor. I'll be interested in the diff stat once you're done. It's true that linked lists are certainly better for some stuff; list_concat() is going to get slower, lcons() too, but likely we can have a bonus lcons() elimination round at some point. I see quite a few of them that look like they could be changed to lappend(). I also just feel that if we insist on more here then we'll get about nothing. I'm also blocked on my partition performance improvement goals on list_nth() being O(N), so I'm keen to see progress here and do what I can to help with that. With list_concat() I find that pretty scary anyway. Using it means we can have a valid list that does not get it's length updated when someone appends a new item. Most users of that do list_copy() to sidestep that and other issues... which likely is something we'd want to rip out with Tom's patch. [1] https://www.postgresql.org/message-id/CAKJS1f_2SnXhPVa6eWjzy2O9A=ocwgd0Cj-LQeWpGtrWqbUSDA@mail.gmail.com -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
pgsql-hackers by date: