Thread: Double linked list with one pointer

Double linked list with one pointer

From

Gaetano Mendola

Date:

06 December 2003, 12:04:54

If I'm not wrong Neil Conway is working on
reimplement a double linked list.
Looking around I found this post of
"Herb Sutter" on comp.lang.c++:

========================================================================
In particular, a motivation behind two-way pointers is that you
can have a more space-efficient doubly linked list if you store only one
(not two) pointer's worth of storage in each node. But how can the list
still be traversable in both directions? The idea is that each node
stores, not a pointer to one other node, but a pointer to the previous
node XOR'd with a pointer to the next node. To traverse the list in either
direction, at each node you get a pointer to the next node by simply
XORing the current node's two-way pointer value with the address of the
last node you visited, which yields the address of the next node you want
to visit. For more details, see:
   "Running Circles Round You, Logically"   by Steve Dewhurst   C/C++ Users Journal (20, 6), June 2002

I don't think the article is available online, alas, but you can find some
related source code demonstrating the technique at:
   http://www.semantics.org/tyr/tyr0_5/list.h
=========================================================================


In this way we are going to save a pointer for each node,
what do you think ?



Regards
Gaetano Mendola

Re: Double linked list with one pointer

From

Tom Lane

Date:

06 December 2003, 14:44:40

Gaetano Mendola <mendola@bigfoot.com> writes:
> If I'm not wrong Neil Conway is working on
> reimplement a double linked list.

No, he's working on keeping track of the list tail element (and length,
but the tail element is the important part).  There was nothing about
double linking.

> In particular, a motivation behind two-way pointers is that you
> can have a more space-efficient doubly linked list if you store only one
> (not two) pointer's worth of storage in each node. But how can the list
> still be traversable in both directions? The idea is that each node
> stores, not a pointer to one other node, but a pointer to the previous
> node XOR'd with a pointer to the next node.

This is way too weird for my taste.  We do not need two-way list
traversal in 99.9% of the backend code (note the near complete lack of
use of Dllist).  Also, the described scheme is slower to traverse than a
standard list since you have to remember two words of state (prev and
cur pointer not just cur) to traverse the list; that bookkeeping, plus
the cost of the XOR itself, adds up.  Another cost that would be
significant from my point of view is loss of readability of list
structures in the debugger.  I don't want to pay that cost to buy a
feature we don't need.
        regards, tom lane

Re: Double linked list with one pointer

From

Bruce Momjian

Date:

06 December 2003, 21:04:02

Gaetano Mendola wrote:
> If I'm not wrong Neil Conway is working on
> reimplement a double linked list.
> Looking around I found this post of
> "Herb Sutter" on comp.lang.c++:
> 
> ========================================================================
> In particular, a motivation behind two-way pointers is that you
> can have a more space-efficient doubly linked list if you store only one
> (not two) pointer's worth of storage in each node. But how can the list
> still be traversable in both directions? The idea is that each node
> stores, not a pointer to one other node, but a pointer to the previous
> node XOR'd with a pointer to the next node. To traverse the list in either
> direction, at each node you get a pointer to the next node by simply
> XORing the current node's two-way pointer value with the address of the
> last node you visited, which yields the address of the next node you want
> to visit. For more details, see:
> 
>     "Running Circles Round You, Logically"
>     by Steve Dewhurst
>     C/C++ Users Journal (20, 6), June 2002
> 
> I don't think the article is available online, alas, but you can find some
> related source code demonstrating the technique at:
> 
>     http://www.semantics.org/tyr/tyr0_5/list.h

That certainly is an amazing idea.  You know the pointer you are coming
from so you can XOR to find the next pointer.

I agree with a Tom that we don't have much use for double-link lists,
but is a nice idea.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073

Re: Double linked list with one pointer

From

"Dann Corbit"

Date:

06 December 2003, 21:28:40

> -----Original Message-----
> From: Bruce Momjian [mailto:pgman@candle.pha.pa.us]
> Sent: Saturday, December 06, 2003 5:02 PM
> To: Gaetano Mendola
> Cc: pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] Double linked list with one pointer
>
>
> Gaetano Mendola wrote:
> > If I'm not wrong Neil Conway is working on
> > reimplement a double linked list.
> > Looking around I found this post of
> > "Herb Sutter" on comp.lang.c++:
> >
> >
> ======================================================================
> > ==
> > In particular, a motivation behind two-way pointers is that you
> > can have a more space-efficient doubly linked list if you
> store only one
> > (not two) pointer's worth of storage in each node. But how
> can the list
> > still be traversable in both directions? The idea is that each node
> > stores, not a pointer to one other node, but a pointer to
> the previous
> > node XOR'd with a pointer to the next node. To traverse the
> list in either
> > direction, at each node you get a pointer to the next node by simply
> > XORing the current node's two-way pointer value with the
> address of the
> > last node you visited, which yields the address of the next
> node you want
> > to visit. For more details, see:
> >
> >     "Running Circles Round You, Logically"
> >     by Steve Dewhurst
> >     C/C++ Users Journal (20, 6), June 2002
> >
> > I don't think the article is available online, alas, but
> you can find
> > some related source code demonstrating the technique at:
> >
> >     http://www.semantics.org/tyr/tyr0_5/list.h
>
> That certainly is an amazing idea.  You know the pointer you
> are coming from so you can XOR to find the next pointer.
>
> I agree with a Tom that we don't have much use for
> double-link lists, but is a nice idea.

Except when it causes undefined behavior.  The subtraction trick also
suffers from the same evil flaw.

http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&oe=UTF-8&threadm=SErn
3.289%24O17.9552%40client&rnum=3&prev=/groups%3Fq%3Dxor%2Bhack%2Bgroup:c
omp.lang.c.*%2Bauthor:corbit%26hl%3Den%26lr%3D%26ie%3DUTF-8%26oe%3DUTF-8
%26selm%3DSErn3.289%2524O17.9552%2540client%26rnum%3D3

From the C-FAQ:

3.3b:    Here's a slick expression:
    a ^= b ^= a ^= b
It swaps a and b without using a temporary.

A:    Not portably, it doesn't.  It attempts to modify the variable atwice between sequence points, so its behavior is
undefined.
For example, it has been reported that when given the code
    int a = 123, b = 7654;    a ^= b ^= a ^= b;
the SCO Optimizing C compiler (icc) sets b to 123 and a to 0.
See also questions 3.1, 3.8, 10.3, and 20.15c.

10.3:    How can I write a generic macro to swap two values?

A:    There is no good answer to this question.  If the values areintegers, a well-known trick using exclusive-OR could
perhapsbeused, but it will not work for floating-point values orpointers, or if the two values are the same variable.
(Seequestions3.3b and 20.15c.)  If the macro is intended to beused on values of arbitrary type (the usual goal), it
cannotusea temporary, since it does not know what type of temporaryit needs (and would have a hard time picking a name
forit ifit did), and standard C does not provide a typeof operator. 
The best all-around solution is probably to forget about using amacro, unless you're willing to pass in the type as a
thirdargument.

20.15c:    How can I swap two values without using a temporary?

A:    The standard hoary old assembly language programmer's trick is:
    a ^= b;    b ^= a;    a ^= b;
But this sort of code has little place in modern, HLLprogramming.  Temporary variables are essentially free,and the
idiomaticcode using three assignments, namely 
    int t = a;    a = b;    b = t;
is not only clearer to the human reader, it is more likely to berecognized by the compiler and turned into the
most-efficientcode(e.g. using a swap instruction, if available).  The lattercode is obviously also amenable to use with
pointersandfloating-point values, unlike the XOR trick.  See also questions3.3b and 10.3.

Re: Double linked list with one pointer

From

Andrew Dunstan

Date:

06 December 2003, 23:28:33

Bruce Momjian wrote:

>Gaetano Mendola wrote:
>  
>
>>I don't think the article is available online, alas, but you can find some
>>related source code demonstrating the technique at:
>>
>>    http://www.semantics.org/tyr/tyr0_5/list.h
>>    
>>
>
>That certainly is an amazing idea.  You know the pointer you are coming
>from so you can XOR to find the next pointer.
>
>I agree with a Tom that we don't have much use for double-link lists,
>but is a nice idea.
>  
>

I must confess that it strikes me as a really really horrid and ugly 
hack - very likely to be error-prone and non-portable and undebuggable, 
and for almost no saving worth having. But maybe that's just me.

In general I've been impressed with the quality of Pg code over the last 
6 months or so - I'd hate to see it polluted by something like this.

cheers

andrew

Re: Double linked list with one pointer

From

Bruce Momjian

Date:

07 December 2003, 01:21:46

Andrew Dunstan wrote:
> 
> 
> Bruce Momjian wrote:
> 
> >Gaetano Mendola wrote:
> >  
> >
> >>I don't think the article is available online, alas, but you can find some
> >>related source code demonstrating the technique at:
> >>
> >>    http://www.semantics.org/tyr/tyr0_5/list.h
> >>    
> >>
> >
> >That certainly is an amazing idea.  You know the pointer you are coming
> >from so you can XOR to find the next pointer.
> >
> >I agree with a Tom that we don't have much use for double-link lists,
> >but is a nice idea.
> >  
> >
> 
> 
> I must confess that it strikes me as a really really horrid and ugly 
> hack - very likely to be error-prone and non-portable and undebuggable, 
> and for almost no saving worth having. But maybe that's just me.
> 
> In general I've been impressed with the quality of Pg code over the last 
> 6 months or so - I'd hate to see it polluted by something like this.

Agreed, I just thought it was neat and I hadn't realized it was
possible.

Also, I don't think the problems for doing this for swaping values
without a temp table is relevant.  In those cases, you are doing the
swap without a temp variable, and this is the problem, more than the XOR
itself, which is what we would use and use  real temp table.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073

Re: Double linked list with one pointer

From

Tom Lane

Date:

07 December 2003, 01:51:01

Andrew Dunstan <andrew@dunslane.net> writes:
> I must confess that it strikes me as a really really horrid and ugly 
> hack - very likely to be error-prone and non-portable and undebuggable, 
> and for almost no saving worth having. But maybe that's just me.

No, that was exactly my reaction too.  I'd be willing to buy into it
if we had a lot of lists that needed to be traversable in both
directions and we were concerned about the storage overhead for such
lists.  But we don't, we aren't, and so I'm not ...
        regards, tom lane

Re: Double linked list with one pointer

From

Greg Stark

Date:

07 December 2003, 05:12:23

"Dann Corbit" <DCorbit@connx.com> writes:

> From the C-FAQ:
> 
> A:    Not portably, it doesn't.  It attempts to modify the variable a
>     twice between sequence points, so its behavior is undefined.

> 10.3:    How can I write a generic macro to swap two values?

Neither of these are really relevant, a) he's not talking about swapping
anyways, and b) there's no reason to try to write the linked list code in a
single statement or macro.

> but it will not work for floating-point values or pointers

Treating pointers as integers is technically nonportable but realistically you
would be pretty hard pressed to find any architecture anyone runs postgres on
where there isn't some integer datatype that you can cast both directions from
pointers safely.

Presumably if you wanted to do this you would implement it all in an
abstraction that hid all the details from calling code. You could even
implement gdb functions to help debugging. 

But that's a lot of effort for something postgres didn't even need in the
first place.

-- 
greg

Re: Double linked list with one pointer

From

Tom Lane

Date:

07 December 2003, 12:41:29

Greg Stark <gsstark@mit.edu> writes:
> Treating pointers as integers is technically nonportable but
> realistically you would be pretty hard pressed to find any
> architecture anyone runs postgres on where there isn't some integer
> datatype that you can cast both directions from pointers safely.

... like, say, Datum.  We already make that assumption, so there's no
new portability risk involved.

> Presumably if you wanted to do this you would implement it all in an
> abstraction that hid all the details from calling code.

The hard part would be to make such an abstraction.  For instance,
I don't think you could hide the fact that two state variables are
required not one.  (The obvious solution of making the state variable
be a two-component struct is not a good answer, because few if any
compilers would be able to figure out that they could/should put the
struct fields in registers; but keeping 'em in memory would absolutely
kill performance.)

There are worse problems too, having to do with lists that are modified
while they are traversed.  There are many places that assume they can
lremove() any element other than the one they are currently stopped on
(for example, lnext() off an element and immediately delete it).  That
will absolutely not work in the XOR scenario, and I see no way to make
it work without exposing some kind of interaction between lremove and
the list traversal macros.  Likewise for insertion of elements into
lists.

In short the abstraction would be pretty leaky and correspondingly hard
to use.  (If you've not read Joel Spolsky's article about leaky
abstractions, you should; I think it should be required reading for
every software designer.
http://www.joelonsoftware.com/articles/LeakyAbstractions.html )

Have we beaten this idea to death yet?
        regards, tom lane

Re: Double linked list with one pointer

From

Manfred Spraul

Date:

07 December 2003, 13:20:26

Tom Lane wrote:

>Greg Stark <gsstark@mit.edu> writes:
>  
>
>>Treating pointers as integers is technically nonportable but
>>realistically you would be pretty hard pressed to find any
>>architecture anyone runs postgres on where there isn't some integer
>>datatype that you can cast both directions from pointers safely.
>>    
>>
>
>... like, say, Datum.  We already make that assumption, so there's no
>new portability risk involved.
>  
>
There is a new type in C99 for "integer that can hold a pointer value". 
I think it's called intptr_t resp. uintptr_t, but I don't have the 
standard around.
It will be necessary for a 64-bit Windows port: Microsoft decided that 
pointer are 64-bit on WIN64, int&long remain 32-bit. Microsoft's own 
typedefs are UINT_PTR, DWORD_PTR, INT_PTR.

--   Manfred

Re: Double linked list with one pointer

From

Abhijit Menon-Sen

Date:

08 December 2003, 07:07:28

At 2003-12-07 18:19:26 +0100, manfred@colorfullife.com wrote:
>
> There is a new type in C99 for "integer that can hold a pointer
> value". I think it's called intptr_t resp. uintptr_t, but I don't have
> the standard around.

Yes, they're called intptr_t and uintptr_t (§7.18.1.4), but they're both
optional types.

I note in passing that the xor-trick is one of the pointer-hiding tricks
that are the bane of garbage collectors for C (the other common example
being writing a pointer to a file and reading it back).

-- ams