Re: POC: converting Lists into arrays - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: POC: converting Lists into arrays
Date
Msg-id CAH2-Wz=xPuBfXezetjmrdH4HN0T8j7pzALtvu0+wnqs=xDKsSQ@mail.gmail.com
Whole thread Raw
In response to Re: POC: converting Lists into arrays  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: POC: converting Lists into arrays
Re: POC: converting Lists into arrays
List pgsql-hackers
On Mon, Feb 25, 2019 at 10:59 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Because it involves touching ten times more code (and that's a very
> conservative estimate).  Excluding changes in pg_list.h + list.c,
> what I posted touches approximately 600 lines of code (520 insertions,
> 644 deletions to be exact).  For comparison's sake, there are about
> 1800 uses of foreach in the tree, each of which would require at least
> 3 changes to replace (the foreach itself, the ListCell variable
> declaration, and at least one lfirst() reference in the loop body).

If we knew that the list code was the bottleneck in a handful of
cases, then I'd come down on Robert's side here. It would then be
possible to update the relevant bottlenecked code in an isolated
fashion, while getting the lion's share of the benefit. However, I
don't think that that's actually possible. The costs of using Lists
everywhere are real and measurable, but it's also highly distributed.
At least, that's my recollection from previous discussion from several
years back. I remember talking about this with Andres in early 2016.

> So we've already blown past 5000 lines worth of changes if we want to
> do it another way ... and that's just *one* component of the List API.

If you want to stop third party code from compiling, you can find a
way to do that without really changing your approach. Nothing stops
you from changing some symbol names minimally, and then making
corresponding mechanical changes to all of the client code within the
tree. Third party code authors would then follow this example, with
the expectation that it's probably going to be a totally mechanical
process.

I'm not necessarily advocating that approach. I'm simply pointing out
that a compromise is quite possible.

> Nor is there any reason to think the changes would be any more mechanical
> than what I had to do here.  (No fair saying that I already found the
> trouble spots, either.  A different implementation would likely break
> assumptions in different ways.)

The idea of making a new vector/array implementation that is a more or
less drop in replacement for List seems okay to me. C++ has both a
std::list and a std::vector, and they support almost the same
interface. Obviously the situation is different here, since you're
retrofitting a new implementation with different performance
characteristics, rather than implementing both in a green field
situation. But it's not that different.

-- 
Peter Geoghegan


pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: \describe*
Next
From: Peter Eisentraut
Date:
Subject: Re: [HACKERS] generated columns