Re: embedded list v2 - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: embedded list v2 |
Date | |
Msg-id | 201209151008.51251.andres@2ndquadrant.com Whole thread Raw |
In response to | Re: embedded list v2 (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: embedded list v2
|
List | pgsql-hackers |
Hi Tom, On Saturday, September 15, 2012 07:32:45 AM Tom Lane wrote: > Andres Freund <andres@2ndquadrant.com> writes: > > On Friday, September 14, 2012 10:48:35 PM Tom Lane wrote: > >> Instead let's provide a macro for an empty list value, so that you can > >> do something like > >> static ilist_d_head DatabaseList = EMPTY_DLIST; > > > > I don't think that can work with dlists because they are circular and > > need their pointers to be adjusted. > > Well, actually, that just brings us to the main point which is: I do not > believe that circular links are a good design choice here. They prevent > the possibility of trivial list initialization, they make plain > iteration over the list more expensive, and you have provided no > evidence that there's any meaningful savings in other operations, much > less enough to justify those disadvantages. I think I have talked about the reasoning on the list before, but here it goes: The cases where I primarily used them are cases where the list *manipulation* is a considerable part of the expense. In those situations having less branches is beneficial and, to my knowledge, cannot be done in normal flat lists. For the initial user of those lists - the slab allocator for postgres which I intend to finish at some point - the move to circular lists instead of classical lists was an improvement in the 12-15% range. Inhowfar do they make iteration more expensive? ptr != head shouldn't be more expensive than !ptr. Imo, that leaves initialization where I admit that you have a case. I don't find it a big problem in practise though. > >> The apparently random decisions to make some things inline functions > >> and other things macros (even though they evaluate their args multiple > >> times) will come back to bite us. I am much more interested in > >> unsurprising behavior of these constructs than I am in squeezing out > >> an instruction or two, so I'm very much against the unsafe macros. > > > > At the moment the only thing that are macros are things that cannot be > > expressed as inline functions because they return the actual contained > > type and/or because they contain a for() loop. > > I don't really care. If you can't build it without multiple-evaluation > macros, it's too dangerous for a fundamental construct that's meant to > be used all over the place. Time to redesign. Its not like pg doesn't use any other popularly used macros that have multiple evaluation hazarards. Even in cases where they *could* be replaced by inline functions without that danger. > One possible option, though it's a bit restrictive, is to insist that > the list node be the first element of whatever it's embedded in, > eliminating the need for ilist_container altogether. This would not > work if we meant to put these kinds of list links into Node-derived > structs, but I suspect we don't need that. I already had list elements which are in multiple lists at the same time and I think replacing some of the pg_list.h usages might be a good idea even for Node derived structures given the List manipulation overhead we have seen in several profiles. > Then for instance (given the additional choice to not use circular links) > dlist_get_head would degenerate to ((type *) (ptr)->head.next), which > eliminates its multi-evaluation hazard and saves a few instructions as well. I still fail to see why not using cirular lists removes instructions in such a situation: Get the first element in a circular list: (ptr)->head.next ->head.next is at a constant offset from the start of *ptr, just as a ->first pointer is. In iterations like: for(name = (ptr)->head.next;name != &(ptr)->head;name = name->next) { } you have one potentially additional indexed memory access (&ptr->head) for the whole loop to an address which will normally lie on the same cacheline as the already accessed ->next pointer. > Or if you don't want to do that, dlist_get_head(type, membername, ptr) > could be defined as > ((type *) dlist_do_get_head(ptr, offsetof(type, membername))) > where dlist_do_get_head is an inline'able function, eliminating the > multi-evaluation-of-ptr hazard. Thats a rather neat idea. Let me play with it for a second, it might make some of the macros safe, although I don't see how it could work for for() loops. Just to make that clear, purely accessing the first node can be done without any macros at al, its just the combination of returning the first node and getting the contained element that needs to be a macro because of the variadic type issues (at times, I really wish for c++ style templates...) I will take a stab at trying to improve Alvaro's current patch wrt to those issues. Greetings, Andres -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
pgsql-hackers by date: