Re: embedded list v2 - Mailing list pgsql-hackers

From Andres Freund
Subject Re: embedded list v2
Date
Msg-id 201209161623.14340.andres@2ndquadrant.com
Whole thread Raw
In response to Re: embedded list v2  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: embedded list v2  (Andres Freund <andres@2ndquadrant.com>)
List pgsql-hackers
On Saturday, September 15, 2012 07:21:44 PM Tom Lane wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
> > On Saturday, September 15, 2012 07:32:45 AM Tom Lane wrote:
> >> Well, actually, that just brings us to the main point which is: I do not
> >> believe that circular links are a good design choice here.
> > 
> > I think I have talked about the reasoning on the list before, but here it
> > goes: The cases where I primarily used them are cases where the list
> > *manipulation* is a considerable part of the expense. In those situations
> > having less branches is beneficial and, to my knowledge, cannot be done
> > in normal flat lists.
> > For the initial user of those lists - the slab allocator for postgres
> > which I intend to finish at some point - the move to circular lists
> > instead of classical lists was an improvement in the 12-15% range.
> 
> Let me make my position clear: I will not accept performance as the sole
> figure of merit for this datatype.  It also has to be easy and reliable
> to use, and the current design is a failure on those dimensions.
> This question of ease and reliability of use isn't just academic, since
> you guys had trouble converting catcache.c without introducing bugs.
> That isn't exactly a positive showing for this design.
Uhm. I had introduced a bug there, true (Maybe Alvaro as well, I can't remember 
anything right now). But it was something like getting the tail instead of the 
head element due to copy paste. Nothing will be able to protect the code from 
me.

> Furthermore, one datapoint for one use-case (probably measured on only
> one CPU architecture) isn't even a very convincing case for the
> performance being better.  To take a handy example, if we were to
> convert dynahash to use dlists, having to initialize each hash bucket
> header this way would probably be a pretty substantial cost for a lot
> of hashtable use-cases.  We have a lot of short-lived dynahash tables.

What do you think about doing something like:

#define DLIST_INIT(name) {{&name.head, &name.head}}
static dlist_head DatabaseList = DLIST_INIT(DatabaseList);

That works, but obviously the initialization will have to be performed at some 
point.

> This also ties into the other problem, since it's hard to code such
> loop control as a macro without multiple evaluation of the list-head
> argument.  To me that's a show stopper of the first order.  I'm not
> going to accept a replacement for foreach() that introduces bug hazards
> that aren't in foreach().
What do you think about something like:

typedef struct dlist_iter
{/* * Use a union with equivalent storage as dlist_node to make it possible to * initialize the struct inside a macro
withoutmultiple evaluation. */union {    struct {        dlist_node *cur;        dlist_node *end;    };    dlist_node
init;};
} dlist_iter;

typedef struct dlist_mutable_iter
{union {    struct {        dlist_node *cur;        dlist_node *end;    };    dlist_node init;};dlist_node *next;
} dlist_mutable_iter;

#define dlist_iter_foreach(iter, ptr)                                         \for (iter.init = (ptr)->head; iter.cur
!=iter.end;                      \     iter.cur = iter.cur->next)
 

#define dlist_iter_foreach_modify(iter, ptr)                                 \for (iter.init = (ptr)->head, iter.next =
iter.cur->next;               \     iter.cur != iter.end                                                \     iter.cur
=iter.next, iter.next = iter.cur->next)
 

With that and some trivial changes *all* multiple evaluation possibilities are 
gone.

(_iter_ in there would go, thats just so I can have both in the same file for 
now).

> There are certainly some multiple-evaluation macros, but I don't think
> they are associated with core data types.  You will not find any in
> pg_list.h for instance.  I think it's important that these new forms
> of list be as easy and reliable to use as List is.  I'm willing to trade
> off some micro-performance to have that.
Just from what I had in open emacs frames without switching buffers when I read 
that email:
Min/Max in c.h and about half of the macros in htup.h (I had the 9.1 tree open 
at that point)...

Greetings,

Andres
-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



pgsql-hackers by date:

Previous
From: Gurjeet Singh
Date:
Subject: Patch to include c.h
Next
From: Andrew Dunstan
Date:
Subject: Re: _FORTIFY_SOURCE by default?