Double linked list with one pointer

Started by Gaetano Mendolaabout 22 years ago12 messages
#1Gaetano Mendola
mendola@bigfoot.com

If I'm not wrong Neil Conway is working on
reimplement a double linked list.
Looking around I found this post of
"Herb Sutter" on comp.lang.c++:

========================================================================
In particular, a motivation behind two-way pointers is that you
can have a more space-efficient doubly linked list if you store only one
(not two) pointer's worth of storage in each node. But how can the list
still be traversable in both directions? The idea is that each node
stores, not a pointer to one other node, but a pointer to the previous
node XOR'd with a pointer to the next node. To traverse the list in either
direction, at each node you get a pointer to the next node by simply
XORing the current node's two-way pointer value with the address of the
last node you visited, which yields the address of the next node you want
to visit. For more details, see:

"Running Circles Round You, Logically"
by Steve Dewhurst
C/C++ Users Journal (20, 6), June 2002

I don't think the article is available online, alas, but you can find some
related source code demonstrating the technique at:

http://www.semantics.org/tyr/tyr0_5/list.h
=========================================================================

In this way we are going to save a pointer for each node,
what do you think ?

Regards
Gaetano Mendola

#2Richard Schilling
rschilling@nationalinformatics.net
In reply to: Gaetano Mendola (#1)
Fwd: Double linked list with one pointer [mendola@bigfoot.com]

I could see how this would work if you always had a reference to one of the nodes.

The problem with the approach I can see is that you *have* to always know the value of at least one pointer, and maintaining that will ultimately require more coding than just having two pointers.
Assume CurrentNode is a pointer to on element in the list:

CurrentNode = <some node in the list> // This node knows what one of the pointers is.
List *TempNode = CurrentNode // This node does not know what one of the pointers is because it was never told.

So, TempNode can't traverse the list. Like I say, passing the known pointer value around would take more coding than maintaining two pointers in the list.

Richard Schilling

----- Begin Forwarded Message -----
Date: 2003.12.06 08:03
Subject: [HACKERS] Double linked list with one pointer
From: Gaetano Mendola <mendola@bigfoot.com>
To: "pgsql-hackers@postgresql.org" <pgsql-hackers@postgresql.org>
Message-ID: <3FD1FDE4.9070503@bigfoot.com>

If I'm not wrong Neil Conway is working on
reimplement a double linked list.
Looking around I found this post of
"Herb Sutter" on comp.lang.c++:

========================================================================
In particular, a motivation behind two-way pointers is that you
can have a more space-efficient doubly linked list if you store only one
(not two) pointer's worth of storage in each node. But how can the list
still be traversable in both directions? The idea is that each node
stores, not a pointer to one other node, but a pointer to the previous
node XOR'd with a pointer to the next node. To traverse the list in either
direction, at each node you get a pointer to the next node by simply
XORing the current node's two-way pointer value with the address of the
last node you visited, which yields the address of the next node you want
to visit. For more details, see:

"Running Circles Round You, Logically"
by Steve Dewhurst
C/C++ Users Journal (20, 6), June 2002

I don't think the article is available online, alas, but you can find some
related source code demonstrating the technique at:

http://www.semantics.org/tyr/tyr0_5/list.h
=========================================================================

In this way we are going to save a pointer for each node,
what do you think ?

Regards
Gaetano Mendola

---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings
----- End Forwarded Message -----

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Gaetano Mendola (#1)
Re: Double linked list with one pointer

Gaetano Mendola <mendola@bigfoot.com> writes:

If I'm not wrong Neil Conway is working on
reimplement a double linked list.

No, he's working on keeping track of the list tail element (and length,
but the tail element is the important part). There was nothing about
double linking.

In particular, a motivation behind two-way pointers is that you
can have a more space-efficient doubly linked list if you store only one
(not two) pointer's worth of storage in each node. But how can the list
still be traversable in both directions? The idea is that each node
stores, not a pointer to one other node, but a pointer to the previous
node XOR'd with a pointer to the next node.

This is way too weird for my taste. We do not need two-way list
traversal in 99.9% of the backend code (note the near complete lack of
use of Dllist). Also, the described scheme is slower to traverse than a
standard list since you have to remember two words of state (prev and
cur pointer not just cur) to traverse the list; that bookkeeping, plus
the cost of the XOR itself, adds up. Another cost that would be
significant from my point of view is loss of readability of list
structures in the debugger. I don't want to pay that cost to buy a
feature we don't need.

regards, tom lane

#4Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Gaetano Mendola (#1)
Re: Double linked list with one pointer

Gaetano Mendola wrote:

If I'm not wrong Neil Conway is working on
reimplement a double linked list.
Looking around I found this post of
"Herb Sutter" on comp.lang.c++:

========================================================================
In particular, a motivation behind two-way pointers is that you
can have a more space-efficient doubly linked list if you store only one
(not two) pointer's worth of storage in each node. But how can the list
still be traversable in both directions? The idea is that each node
stores, not a pointer to one other node, but a pointer to the previous
node XOR'd with a pointer to the next node. To traverse the list in either
direction, at each node you get a pointer to the next node by simply
XORing the current node's two-way pointer value with the address of the
last node you visited, which yields the address of the next node you want
to visit. For more details, see:

"Running Circles Round You, Logically"
by Steve Dewhurst
C/C++ Users Journal (20, 6), June 2002

I don't think the article is available online, alas, but you can find some
related source code demonstrating the technique at:

http://www.semantics.org/tyr/tyr0_5/list.h

That certainly is an amazing idea. You know the pointer you are coming
from so you can XOR to find the next pointer.

I agree with a Tom that we don't have much use for double-link lists,
but is a nice idea.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#5Dann Corbit
DCorbit@connx.com
In reply to: Bruce Momjian (#4)
Re: Double linked list with one pointer

-----Original Message-----
From: Bruce Momjian [mailto:pgman@candle.pha.pa.us]
Sent: Saturday, December 06, 2003 5:02 PM
To: Gaetano Mendola
Cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Double linked list with one pointer

Gaetano Mendola wrote:

If I'm not wrong Neil Conway is working on
reimplement a double linked list.
Looking around I found this post of
"Herb Sutter" on comp.lang.c++:

======================================================================

==
In particular, a motivation behind two-way pointers is that you
can have a more space-efficient doubly linked list if you

store only one

(not two) pointer's worth of storage in each node. But how

can the list

still be traversable in both directions? The idea is that each node
stores, not a pointer to one other node, but a pointer to

the previous

node XOR'd with a pointer to the next node. To traverse the

list in either

direction, at each node you get a pointer to the next node by simply
XORing the current node's two-way pointer value with the

address of the

last node you visited, which yields the address of the next

node you want

to visit. For more details, see:

"Running Circles Round You, Logically"
by Steve Dewhurst
C/C++ Users Journal (20, 6), June 2002

I don't think the article is available online, alas, but

you can find

some related source code demonstrating the technique at:

http://www.semantics.org/tyr/tyr0_5/list.h

That certainly is an amazing idea. You know the pointer you
are coming from so you can XOR to find the next pointer.

I agree with a Tom that we don't have much use for
double-link lists, but is a nice idea.

Except when it causes undefined behavior. The subtraction trick also
suffers from the same evil flaw.

http://groups.google.com/groups?hl=en&amp;lr=&amp;ie=UTF-8&amp;oe=UTF-8&amp;threadm=SErn
3.289%24O17.9552%40client&rnum=3&prev=/groups%3Fq%3Dxor%2Bhack%2Bgroup:c
omp.lang.c.*%2Bauthor:corbit%26hl%3Den%26lr%3D%26ie%3DUTF-8%26oe%3DUTF-8
%26selm%3DSErn3.289%2524O17.9552%2540client%26rnum%3D3

From the C-FAQ:

3.3b: Here's a slick expression:

a ^= b ^= a ^= b

It swaps a and b without using a temporary.

A: Not portably, it doesn't. It attempts to modify the variable a
twice between sequence points, so its behavior is undefined.

For example, it has been reported that when given the code

int a = 123, b = 7654;
a ^= b ^= a ^= b;

the SCO Optimizing C compiler (icc) sets b to 123 and a to 0.

See also questions 3.1, 3.8, 10.3, and 20.15c.

10.3: How can I write a generic macro to swap two values?

A: There is no good answer to this question. If the values are
integers, a well-known trick using exclusive-OR could perhaps
be used, but it will not work for floating-point values or
pointers, or if the two values are the same variable. (See
questions 3.3b and 20.15c.) If the macro is intended to be
used on values of arbitrary type (the usual goal), it cannot
use a temporary, since it does not know what type of temporary
it needs (and would have a hard time picking a name for it if
it did), and standard C does not provide a typeof operator.

The best all-around solution is probably to forget about using a
macro, unless you're willing to pass in the type as a third
argument.

20.15c: How can I swap two values without using a temporary?

A: The standard hoary old assembly language programmer's trick is:

a ^= b;
b ^= a;
a ^= b;

But this sort of code has little place in modern, HLL
programming. Temporary variables are essentially free,
and the idiomatic code using three assignments, namely

int t = a;
a = b;
b = t;

is not only clearer to the human reader, it is more likely to be
recognized by the compiler and turned into the most-efficient
code (e.g. using a swap instruction, if available). The latter
code is obviously also amenable to use with pointers and
floating-point values, unlike the XOR trick. See also questions
3.3b and 10.3.

#6Andrew Dunstan
andrew@dunslane.net
In reply to: Bruce Momjian (#4)
Re: Double linked list with one pointer

Bruce Momjian wrote:

Gaetano Mendola wrote:

I don't think the article is available online, alas, but you can find some
related source code demonstrating the technique at:

http://www.semantics.org/tyr/tyr0_5/list.h

That certainly is an amazing idea. You know the pointer you are coming
from so you can XOR to find the next pointer.

I agree with a Tom that we don't have much use for double-link lists,
but is a nice idea.

I must confess that it strikes me as a really really horrid and ugly
hack - very likely to be error-prone and non-portable and undebuggable,
and for almost no saving worth having. But maybe that's just me.

In general I've been impressed with the quality of Pg code over the last
6 months or so - I'd hate to see it polluted by something like this.

cheers

andrew

#7Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Andrew Dunstan (#6)
Re: Double linked list with one pointer

Andrew Dunstan wrote:

Bruce Momjian wrote:

Gaetano Mendola wrote:

I don't think the article is available online, alas, but you can find some
related source code demonstrating the technique at:

http://www.semantics.org/tyr/tyr0_5/list.h

That certainly is an amazing idea. You know the pointer you are coming
from so you can XOR to find the next pointer.

I agree with a Tom that we don't have much use for double-link lists,
but is a nice idea.

I must confess that it strikes me as a really really horrid and ugly
hack - very likely to be error-prone and non-portable and undebuggable,
and for almost no saving worth having. But maybe that's just me.

In general I've been impressed with the quality of Pg code over the last
6 months or so - I'd hate to see it polluted by something like this.

Agreed, I just thought it was neat and I hadn't realized it was
possible.

Also, I don't think the problems for doing this for swaping values
without a temp table is relevant. In those cases, you are doing the
swap without a temp variable, and this is the problem, more than the XOR
itself, which is what we would use and use real temp table.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#6)
Re: Double linked list with one pointer

Andrew Dunstan <andrew@dunslane.net> writes:

I must confess that it strikes me as a really really horrid and ugly
hack - very likely to be error-prone and non-portable and undebuggable,
and for almost no saving worth having. But maybe that's just me.

No, that was exactly my reaction too. I'd be willing to buy into it
if we had a lot of lists that needed to be traversable in both
directions and we were concerned about the storage overhead for such
lists. But we don't, we aren't, and so I'm not ...

regards, tom lane

#9Greg Stark
gsstark@mit.edu
In reply to: Dann Corbit (#5)
Re: Double linked list with one pointer

"Dann Corbit" <DCorbit@connx.com> writes:

From the C-FAQ:

A: Not portably, it doesn't. It attempts to modify the variable a
twice between sequence points, so its behavior is undefined.

10.3: How can I write a generic macro to swap two values?

Neither of these are really relevant, a) he's not talking about swapping
anyways, and b) there's no reason to try to write the linked list code in a
single statement or macro.

but it will not work for floating-point values or pointers

Treating pointers as integers is technically nonportable but realistically you
would be pretty hard pressed to find any architecture anyone runs postgres on
where there isn't some integer datatype that you can cast both directions from
pointers safely.

Presumably if you wanted to do this you would implement it all in an
abstraction that hid all the details from calling code. You could even
implement gdb functions to help debugging.

But that's a lot of effort for something postgres didn't even need in the
first place.

--
greg

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Greg Stark (#9)
Re: Double linked list with one pointer

Greg Stark <gsstark@mit.edu> writes:

Treating pointers as integers is technically nonportable but
realistically you would be pretty hard pressed to find any
architecture anyone runs postgres on where there isn't some integer
datatype that you can cast both directions from pointers safely.

... like, say, Datum. We already make that assumption, so there's no
new portability risk involved.

Presumably if you wanted to do this you would implement it all in an
abstraction that hid all the details from calling code.

The hard part would be to make such an abstraction. For instance,
I don't think you could hide the fact that two state variables are
required not one. (The obvious solution of making the state variable
be a two-component struct is not a good answer, because few if any
compilers would be able to figure out that they could/should put the
struct fields in registers; but keeping 'em in memory would absolutely
kill performance.)

There are worse problems too, having to do with lists that are modified
while they are traversed. There are many places that assume they can
lremove() any element other than the one they are currently stopped on
(for example, lnext() off an element and immediately delete it). That
will absolutely not work in the XOR scenario, and I see no way to make
it work without exposing some kind of interaction between lremove and
the list traversal macros. Likewise for insertion of elements into
lists.

In short the abstraction would be pretty leaky and correspondingly hard
to use. (If you've not read Joel Spolsky's article about leaky
abstractions, you should; I think it should be required reading for
every software designer.
http://www.joelonsoftware.com/articles/LeakyAbstractions.html )

Have we beaten this idea to death yet?

regards, tom lane

#11Manfred Spraul
manfred@colorfullife.com
In reply to: Tom Lane (#10)
Re: Double linked list with one pointer

Tom Lane wrote:

Greg Stark <gsstark@mit.edu> writes:

Treating pointers as integers is technically nonportable but
realistically you would be pretty hard pressed to find any
architecture anyone runs postgres on where there isn't some integer
datatype that you can cast both directions from pointers safely.

... like, say, Datum. We already make that assumption, so there's no
new portability risk involved.

There is a new type in C99 for "integer that can hold a pointer value".
I think it's called intptr_t resp. uintptr_t, but I don't have the
standard around.
It will be necessary for a 64-bit Windows port: Microsoft decided that
pointer are 64-bit on WIN64, int&long remain 32-bit. Microsoft's own
typedefs are UINT_PTR, DWORD_PTR, INT_PTR.

--
Manfred

In reply to: Manfred Spraul (#11)
Re: Double linked list with one pointer

At 2003-12-07 18:19:26 +0100, manfred@colorfullife.com wrote:

There is a new type in C99 for "integer that can hold a pointer
value". I think it's called intptr_t resp. uintptr_t, but I don't have
the standard around.

Yes, they're called intptr_t and uintptr_t (�7.18.1.4), but they're both
optional types.

I note in passing that the xor-trick is one of the pointer-hiding tricks
that are the bane of garbage collectors for C (the other common example
being writing a pointer to a file and reading it back).

-- ams