Re: POC: converting Lists into arrays - Mailing list pgsql-hackers
From | David Rowley |
---|---|
Subject | Re: POC: converting Lists into arrays |
Date | |
Msg-id | CAKJS1f9y9+8HPhYM0Z1SBoF2f0pKkdCGBXRX49zW00SLvRqg4g@mail.gmail.com |
In response to | Re: POC: converting Lists into arrays (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: POC: converting Lists into arrays |
List | pgsql-hackers |
On Tue, 26 Feb 2019 at 18:34, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> David Rowley <david.rowley@2ndquadrant.com> writes:
> > Using the attached patch (as text file so as not to upset the CFbot),
> > which basically just measures and logs the time taken to run
> > pg_plan_query. ...
> > Surprisingly it took 1.13% longer. I did these tests on an AWS
> > md5.large instance.
>
> Interesting. Seems to suggest that maybe the cases I discounted
> as being infrequent aren't so infrequent? Another possibility
> is that the new coding adds more cycles to foreach() loops than
> I'd hoped for.

I went and had a few adventures with this patch to see if I could figure out why the small ~1% regression exists. Profiling did not prove very useful, as I saw none of the list functions come up.

I had suspected it was the lcons() calls being expensive, because they need to push the elements up one place each time, which isn't something that'll scale well with larger lists (see the sketch of that cost further down). After changing things so that a new "first" element index in the List would allow new_head_cell() to just move everything to the end of the list and mark the start of the actual data, I discovered that slowed things down further, likely due to all the additional arithmetic required to find the first element.

I then tried hacking at the foreach() macro, wondering if the lnext() call was somehow making it difficult for the compiler to predict which cell would come next. I experimented with the following monstrosity:

for ((cell) = list_head(l); ((cell) && (cell) < &((List *) l)->elements[((List *) l)->first + ((List *) l)->length]) || (cell = NULL) != NULL; cell++)

It made things worse again. It also ended up much uglier than I thought it would, as I had to account for an empty list being NIL and for the fact that cell must be set to NULL after the loop is over.

I tried a few other things too. I didn't agree with your memmove() in list_concat(). I think memcpy() is fine, even when the two list pointers are the same, since we never overwrite any live cell values. Strangely, I found memcpy() slower than memmove()...?

The only thing I did that managed to speed the patch up was to ditch the additional NULL test in lnext(). I don't see why that's required, since lnext(NULL) would have crashed with the old implementation. Removing it changed the 1.13% regression into a ~0.8% regression, which at least shows that the foreach() implementation can have an effect on performance.

> Anyway, it's just a POC; the main point at this stage is to be
> able to make such comparisons at all. If it turns out that we
> *can't* make this into a win, then all that bellyaching about
> how inefficient Lists are was misinformed ...

My primary concern is how much we bend over backwards because list_nth() performance is not O(1). I know from my work on partitioning that ExecInitRangeTable()'s building of es_range_table_array has a huge impact on PREPAREd plans for simple PK-lookup SELECT queries against partitioned tables with a large number of partitions, of which only one will survive run-time pruning. I could get the execution speed of such a query with 300 partitions to within 94% of the non-partitioned version if the rtable could be looked up O(1) in the executor natively (that, plus some changes to ExecCheckRTPerms() to have it skip rtable entries that don't require permission checks).
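To make that last point a bit more concrete, here's roughly what I mean by the executor getting O(1) rtable lookups natively once List is array-based. It's only a sketch with made-up field and function names, not the actual code from your POC:

#include <assert.h>
#include <stddef.h>

/* Illustrative layout only; the field names here are mine, not the POC's. */
typedef struct ListCell
{
    void       *ptr_value;
} ListCell;

typedef struct List
{
    int         length;     /* number of cells in use */
    ListCell   *elements;   /* contiguous cell storage */
} List;

/*
 * With the cells stored contiguously, fetching the nth element is a
 * single bounds-checked array index rather than a walk down a chain of
 * cons cells.  Something like rt_fetch() on the rtable could then be
 * O(1) without the executor needing to build es_range_table_array first.
 */
static inline void *
list_nth_sketch(const List *list, int n)
{
    assert(list != NULL && n >= 0 && n < list->length);
    return list->elements[n].ptr_value;
}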
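And coming back to the lcons() point above, this is the shifting cost I was referring to. Again, just a sketch with made-up field names rather than the POC's actual definitions:

#include <string.h>

/* Illustrative layout only, as in the previous sketch, with an added
 * max_length field for the allocated capacity. */
typedef struct ListCell
{
    void       *ptr_value;
} ListCell;

typedef struct List
{
    int         length;     /* number of cells in use */
    int         max_length; /* allocated size of elements[] */
    ListCell   *elements;   /* contiguous cell storage */
} List;

/*
 * Prepending to an array-based list must slide every existing cell up
 * one slot before the new datum can go in at index 0, so it's O(n) in
 * the list length, whereas lcons() on the old cons-cell implementation
 * was O(1).  My "first" index experiment avoided this memmove() by
 * leaving slack at the front, but paid for it with extra arithmetic on
 * every element access, which turned out to be worse overall.
 */
static void
lcons_sketch(void *datum, List *list)
{
    /* assumes the caller has already ensured max_length > length */
    memmove(&list->elements[1], &list->elements[0],
            list->length * sizeof(ListCell));
    list->elements[0].ptr_value = datum;
    list->length++;
}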
Perhaps if we're not going to see gains from the patch alone, then we'll need to tag on some of the additional stuff that will take advantage of list_nth() being fast and test the performance of it all again.

Attached is the (mostly worthless) series of hacks I made to your patch. It might save someone some time if they happened to wonder the same thing as I did.

--
David Rowley                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services