Thread: Double linked list with one pointer
If I'm not wrong Neil Conway is working on reimplement a double linked list. Looking around I found this post of "Herb Sutter" on comp.lang.c++: ======================================================================== In particular, a motivation behind two-way pointers is that you can have a more space-efficient doubly linked list if you store only one (not two) pointer's worth of storage in each node. But how can the list still be traversable in both directions? The idea is that each node stores, not a pointer to one other node, but a pointer to the previous node XOR'd with a pointer to the next node. To traverse the list in either direction, at each node you get a pointer to the next node by simply XORing the current node's two-way pointer value with the address of the last node you visited, which yields the address of the next node you want to visit. For more details, see: "Running Circles Round You, Logically" by Steve Dewhurst C/C++ Users Journal (20, 6), June 2002 I don't think the article is available online, alas, but you can find some related source code demonstrating the technique at: http://www.semantics.org/tyr/tyr0_5/list.h ========================================================================= In this way we are going to save a pointer for each node, what do you think ? Regards Gaetano Mendola
Gaetano Mendola <mendola@bigfoot.com> writes: > If I'm not wrong Neil Conway is working on > reimplement a double linked list. No, he's working on keeping track of the list tail element (and length, but the tail element is the important part). There was nothing about double linking. > In particular, a motivation behind two-way pointers is that you > can have a more space-efficient doubly linked list if you store only one > (not two) pointer's worth of storage in each node. But how can the list > still be traversable in both directions? The idea is that each node > stores, not a pointer to one other node, but a pointer to the previous > node XOR'd with a pointer to the next node. This is way too weird for my taste. We do not need two-way list traversal in 99.9% of the backend code (note the near complete lack of use of Dllist). Also, the described scheme is slower to traverse than a standard list since you have to remember two words of state (prev and cur pointer not just cur) to traverse the list; that bookkeeping, plus the cost of the XOR itself, adds up. Another cost that would be significant from my point of view is loss of readability of list structures in the debugger. I don't want to pay that cost to buy a feature we don't need. regards, tom lane
Gaetano Mendola wrote: > If I'm not wrong Neil Conway is working on > reimplement a double linked list. > Looking around I found this post of > "Herb Sutter" on comp.lang.c++: > > ======================================================================== > In particular, a motivation behind two-way pointers is that you > can have a more space-efficient doubly linked list if you store only one > (not two) pointer's worth of storage in each node. But how can the list > still be traversable in both directions? The idea is that each node > stores, not a pointer to one other node, but a pointer to the previous > node XOR'd with a pointer to the next node. To traverse the list in either > direction, at each node you get a pointer to the next node by simply > XORing the current node's two-way pointer value with the address of the > last node you visited, which yields the address of the next node you want > to visit. For more details, see: > > "Running Circles Round You, Logically" > by Steve Dewhurst > C/C++ Users Journal (20, 6), June 2002 > > I don't think the article is available online, alas, but you can find some > related source code demonstrating the technique at: > > http://www.semantics.org/tyr/tyr0_5/list.h That certainly is an amazing idea. You know the pointer you are coming from so you can XOR to find the next pointer. I agree with a Tom that we don't have much use for double-link lists, but is a nice idea. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
> -----Original Message----- > From: Bruce Momjian [mailto:pgman@candle.pha.pa.us] > Sent: Saturday, December 06, 2003 5:02 PM > To: Gaetano Mendola > Cc: pgsql-hackers@postgresql.org > Subject: Re: [HACKERS] Double linked list with one pointer > > > Gaetano Mendola wrote: > > If I'm not wrong Neil Conway is working on > > reimplement a double linked list. > > Looking around I found this post of > > "Herb Sutter" on comp.lang.c++: > > > > > ====================================================================== > > == > > In particular, a motivation behind two-way pointers is that you > > can have a more space-efficient doubly linked list if you > store only one > > (not two) pointer's worth of storage in each node. But how > can the list > > still be traversable in both directions? The idea is that each node > > stores, not a pointer to one other node, but a pointer to > the previous > > node XOR'd with a pointer to the next node. To traverse the > list in either > > direction, at each node you get a pointer to the next node by simply > > XORing the current node's two-way pointer value with the > address of the > > last node you visited, which yields the address of the next > node you want > > to visit. For more details, see: > > > > "Running Circles Round You, Logically" > > by Steve Dewhurst > > C/C++ Users Journal (20, 6), June 2002 > > > > I don't think the article is available online, alas, but > you can find > > some related source code demonstrating the technique at: > > > > http://www.semantics.org/tyr/tyr0_5/list.h > > That certainly is an amazing idea. You know the pointer you > are coming from so you can XOR to find the next pointer. > > I agree with a Tom that we don't have much use for > double-link lists, but is a nice idea. Except when it causes undefined behavior. The subtraction trick also suffers from the same evil flaw. http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&oe=UTF-8&threadm=SErn 3.289%24O17.9552%40client&rnum=3&prev=/groups%3Fq%3Dxor%2Bhack%2Bgroup:c omp.lang.c.*%2Bauthor:corbit%26hl%3Den%26lr%3D%26ie%3DUTF-8%26oe%3DUTF-8 %26selm%3DSErn3.289%2524O17.9552%2540client%26rnum%3D3 From the C-FAQ: 3.3b: Here's a slick expression: a ^= b ^= a ^= b It swaps a and b without using a temporary. A: Not portably, it doesn't. It attempts to modify the variable atwice between sequence points, so its behavior is undefined. For example, it has been reported that when given the code int a = 123, b = 7654; a ^= b ^= a ^= b; the SCO Optimizing C compiler (icc) sets b to 123 and a to 0. See also questions 3.1, 3.8, 10.3, and 20.15c. 10.3: How can I write a generic macro to swap two values? A: There is no good answer to this question. If the values areintegers, a well-known trick using exclusive-OR could perhapsbeused, but it will not work for floating-point values orpointers, or if the two values are the same variable. (Seequestions3.3b and 20.15c.) If the macro is intended to beused on values of arbitrary type (the usual goal), it cannotusea temporary, since it does not know what type of temporaryit needs (and would have a hard time picking a name forit ifit did), and standard C does not provide a typeof operator. The best all-around solution is probably to forget about using amacro, unless you're willing to pass in the type as a thirdargument. 20.15c: How can I swap two values without using a temporary? A: The standard hoary old assembly language programmer's trick is: a ^= b; b ^= a; a ^= b; But this sort of code has little place in modern, HLLprogramming. Temporary variables are essentially free,and the idiomaticcode using three assignments, namely int t = a; a = b; b = t; is not only clearer to the human reader, it is more likely to berecognized by the compiler and turned into the most-efficientcode(e.g. using a swap instruction, if available). The lattercode is obviously also amenable to use with pointersandfloating-point values, unlike the XOR trick. See also questions3.3b and 10.3.
Bruce Momjian wrote: >Gaetano Mendola wrote: > > >>I don't think the article is available online, alas, but you can find some >>related source code demonstrating the technique at: >> >> http://www.semantics.org/tyr/tyr0_5/list.h >> >> > >That certainly is an amazing idea. You know the pointer you are coming >from so you can XOR to find the next pointer. > >I agree with a Tom that we don't have much use for double-link lists, >but is a nice idea. > > I must confess that it strikes me as a really really horrid and ugly hack - very likely to be error-prone and non-portable and undebuggable, and for almost no saving worth having. But maybe that's just me. In general I've been impressed with the quality of Pg code over the last 6 months or so - I'd hate to see it polluted by something like this. cheers andrew
Andrew Dunstan wrote: > > > Bruce Momjian wrote: > > >Gaetano Mendola wrote: > > > > > >>I don't think the article is available online, alas, but you can find some > >>related source code demonstrating the technique at: > >> > >> http://www.semantics.org/tyr/tyr0_5/list.h > >> > >> > > > >That certainly is an amazing idea. You know the pointer you are coming > >from so you can XOR to find the next pointer. > > > >I agree with a Tom that we don't have much use for double-link lists, > >but is a nice idea. > > > > > > > I must confess that it strikes me as a really really horrid and ugly > hack - very likely to be error-prone and non-portable and undebuggable, > and for almost no saving worth having. But maybe that's just me. > > In general I've been impressed with the quality of Pg code over the last > 6 months or so - I'd hate to see it polluted by something like this. Agreed, I just thought it was neat and I hadn't realized it was possible. Also, I don't think the problems for doing this for swaping values without a temp table is relevant. In those cases, you are doing the swap without a temp variable, and this is the problem, more than the XOR itself, which is what we would use and use real temp table. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Andrew Dunstan <andrew@dunslane.net> writes: > I must confess that it strikes me as a really really horrid and ugly > hack - very likely to be error-prone and non-portable and undebuggable, > and for almost no saving worth having. But maybe that's just me. No, that was exactly my reaction too. I'd be willing to buy into it if we had a lot of lists that needed to be traversable in both directions and we were concerned about the storage overhead for such lists. But we don't, we aren't, and so I'm not ... regards, tom lane
"Dann Corbit" <DCorbit@connx.com> writes: > From the C-FAQ: > > A: Not portably, it doesn't. It attempts to modify the variable a > twice between sequence points, so its behavior is undefined. > 10.3: How can I write a generic macro to swap two values? Neither of these are really relevant, a) he's not talking about swapping anyways, and b) there's no reason to try to write the linked list code in a single statement or macro. > but it will not work for floating-point values or pointers Treating pointers as integers is technically nonportable but realistically you would be pretty hard pressed to find any architecture anyone runs postgres on where there isn't some integer datatype that you can cast both directions from pointers safely. Presumably if you wanted to do this you would implement it all in an abstraction that hid all the details from calling code. You could even implement gdb functions to help debugging. But that's a lot of effort for something postgres didn't even need in the first place. -- greg
Greg Stark <gsstark@mit.edu> writes: > Treating pointers as integers is technically nonportable but > realistically you would be pretty hard pressed to find any > architecture anyone runs postgres on where there isn't some integer > datatype that you can cast both directions from pointers safely. ... like, say, Datum. We already make that assumption, so there's no new portability risk involved. > Presumably if you wanted to do this you would implement it all in an > abstraction that hid all the details from calling code. The hard part would be to make such an abstraction. For instance, I don't think you could hide the fact that two state variables are required not one. (The obvious solution of making the state variable be a two-component struct is not a good answer, because few if any compilers would be able to figure out that they could/should put the struct fields in registers; but keeping 'em in memory would absolutely kill performance.) There are worse problems too, having to do with lists that are modified while they are traversed. There are many places that assume they can lremove() any element other than the one they are currently stopped on (for example, lnext() off an element and immediately delete it). That will absolutely not work in the XOR scenario, and I see no way to make it work without exposing some kind of interaction between lremove and the list traversal macros. Likewise for insertion of elements into lists. In short the abstraction would be pretty leaky and correspondingly hard to use. (If you've not read Joel Spolsky's article about leaky abstractions, you should; I think it should be required reading for every software designer. http://www.joelonsoftware.com/articles/LeakyAbstractions.html ) Have we beaten this idea to death yet? regards, tom lane
Tom Lane wrote: >Greg Stark <gsstark@mit.edu> writes: > > >>Treating pointers as integers is technically nonportable but >>realistically you would be pretty hard pressed to find any >>architecture anyone runs postgres on where there isn't some integer >>datatype that you can cast both directions from pointers safely. >> >> > >... like, say, Datum. We already make that assumption, so there's no >new portability risk involved. > > There is a new type in C99 for "integer that can hold a pointer value". I think it's called intptr_t resp. uintptr_t, but I don't have the standard around. It will be necessary for a 64-bit Windows port: Microsoft decided that pointer are 64-bit on WIN64, int&long remain 32-bit. Microsoft's own typedefs are UINT_PTR, DWORD_PTR, INT_PTR. -- Manfred
At 2003-12-07 18:19:26 +0100, manfred@colorfullife.com wrote: > > There is a new type in C99 for "integer that can hold a pointer > value". I think it's called intptr_t resp. uintptr_t, but I don't have > the standard around. Yes, they're called intptr_t and uintptr_t (§7.18.1.4), but they're both optional types. I note in passing that the xor-trick is one of the pointer-hiding tricks that are the bane of garbage collectors for C (the other common example being writing a pointer to a file and reading it back). -- ams