Re: next draft of list rewrite - Mailing list pgsql-patches

From Tom Lane
Subject Re: next draft of list rewrite
Date
Msg-id 28870.1075069041@sss.pgh.pa.us
Whole thread Raw
In response to next draft of list rewrite  (Neil Conway <neilc@samurai.com>)
Responses Re: next draft of list rewrite  (Neil Conway <neilc@samurai.com>)
List pgsql-patches
Neil Conway <neilc@samurai.com> writes:
> I've attached the latest header and implementation files (pg_list.h
> and list.c, respectively) for the new linked list implementation. I'm
> pretty satisfied with the new API: now would be the ideal time to
> suggest any changes you think I should make to it. I decided to use a
> new naming convention: all public functions and macros are prefixed
> with list_ or cell_, as appropriate, and type-specific functions are
> suffixed with _int or _oid, as appropriate.

I'm fairly unhappy with that idea.  I thought the plan was to keep the
API exactly the same as before, with the single exception of having to
use ListCell* rather than List* as the type of pointers used to iterate
through a list.  Your apparent intention to rename every single list
routine makes the change far more invasive than it has any need to be.
It will undoubtedly break a fair amount of user-written code that we
don't have control over; and to what benefit?  Everyone who works on
the backend is already accustomed to the existing names, which are
mostly derived from ancient Lisp tradition.  I don't think we should
change them.  If you want to push forward on it, we had better have
a vote in pghackers, rather than assuming everyone concerned will see
this discussion in -patches.

Other than the naming issues, the code looks reasonable.  I noticed one
trivial bug, which is that the Asserts in list_member_int and
list_member_oid seem backwards.  A bigger problem is that list_length,
list_head, and list_tail must *not* be macros --- we cannot afford
double-evaluation bugs associated with operations as widely used as these.
Since these functions are not used within loop bodies, I don't think
that avoiding function call overhead for them is conceivably worth the
risks involved.

> ISTM that the "allocate the list itself and the head node in one
> palloc()" idea that Tom suggested actually won't work: since the head
> node can change over the life of the list, we need to be able to
> deallocate a list cell via pfree() -- which would not be the case if a
> particular node was allocated via the same palloc() as the list
> itself.

No, you'd simply leave that cell-space wasted if the head member were to
change.  It's doable, though it would complicate the deletion routine
and probably nconc as well.  It's entirely likely that it'd be more
complexity than it's worth, though, so if you don't want to do it that's
fine with me.  Certainly it makes sense not to do it in the first pass.

> BTW, one nice thing about the new List API is that it lets us get rid
> of the WRITE_INTLIST_FIELD() and related cruft in nodes/, because each
> type of list (T_List, T_IntList, or T_OidList) is now a legitimate
> Node in its own right. So we can now use WRITE_NODE_FIELD() to output
> a list of integers, for example.

Right, that was a benefit I was expecting ...

> (The implementation is a little "off the cuff", and I haven't reviewed
> it; also, since the tree is totally broken with this patch applied, I
> haven't tested it either.

This is one reason I'd suggest refraining from renaming everything.
Without the rename, it'd be possible to change the implementation with
little or no change to the rest of the tree.  The idea I had was to
temporarily place "(ListCell*)" casts in foreach, lfirst, and lnext,
so that they would work without complaint if the iteration variable is
misdeclared as List* rather than ListCell*.  Then it should be possible
to test and commit the reimplemented list.c with few if any source-code
changes elsewhere.  After that, you can incrementally go through the
rest of the tree and change iteration variables to ListCell*, removing
the casts in the macros when all is done.  This is a lot more pleasant
way to proceed than a "big bang" changeover.  (Removal of the FastList
code could also be done incrementally.)

            regards, tom lane

pgsql-patches by date:

Previous
From: "Magnus Hagander"
Date:
Subject: Win32 signals patch
Next
From: Tom Lane
Date:
Subject: Re: ANALYZE patch for review