Homogeneous, Reference-Counted Lists
Introduction
Here's an example of the canonical form of linked list, in which all the
items have the same type, and each node is automatically returned to
storage via a reference-counting scheme.
If you could get through the brain teaser that was my last post
(heterogeneous lists), then this one will be a breeze.
I also briefly explore list iteration issues, including a discussion of
how to visit list items in reverse order.
Discussion
The fact that the items are homogeneous allows us to implement the list
using traditional techniques: as an abstract data type exported by a
generic package, which imports the item type as a generic formal
parameter.
Here's the elided spec:
generic
type Item_Type is private;
package Lists is
type List_Type is private;
Null_List : constant List_Type;
function Cons (Item : Item_Type;
List : List_Type) return List_Type;
function Get_Head (List : List_Type) return Item_Type;
function Get_Tail (List : List_Type) return List_Type;
...
end Lists;
Cons is the operation by which all lists are ultimately constructed.
Since this is a homogeneous list, we can declare Cons right there in the
Lists package.
Selector Get_Head and constructor Get_Tail are operations that undo the
effect of Cons. Get_Head is returns the item at the head of the list,
and Get_Tail is returns the list that follows the head.
A aside about operation names: the concatenation operator might be a
better choice than "Cons", since it would allow us to write expressions
like:
List := 6 & List;
instead of
List := Cons (6, List);
I chose Cons partly because it's short, but mostly because lately I've
been writing tools for navigation of Ada source using Emacs Lisp. I've
got lists on the brain these days.
The names "get"_head and "get"_tail were chosen to work around certain
name-space issues in Ada. In a declaration, the object and the type
share a name-space, which requires that they have different names.
If the operation were named just
function Head (List : List_Type) return Item_Type;
then the declaration
declare
Head : constant Item_Type := Head (List);
begin
would be illegal. Ditto for this:
declare
Tail : constant List_Type := Tail (List);
begin
As a writer of a reusable module, I want to put as few constraints on
its use as possible. As a writer of a list abstraction, I don't want to
restrict a client's ability to use Head and Tail as names of objects.
Which means I have to come up with a different name for those
operations.
In the software community we generally refer to modifiers and selectors
as "set and get operations." Since the operations for retrieving the
head or tail of a list have the sense of a query, the names Get_Head and
Get_Tail seemed like the natural choice.
OK, back to our discussion. Another useful operation is to change the
value of the item at the head of the list:
procedure Set_Head (List : in List_Type;
Item : in Item_Type);
Set_Head and Get_Head are therefore symmetrical operations.
The operation
procedure Set_Tail (List : in List_Type;
Tail : in List_Type);
is analogous to Set_Head.
That covers the interesting public operations. Let's move on now to the
implementation issues.
We're going to use reference-counting to keep track of how many
references there are to each node. When the count goes to zero, that
means there are no more references to that node, and we can safely
return the node to storage. This eliminates memory leaks.
A node in the list looks like this:
type Node_Type;
type Node_Access is access all Node_Type;
type Node_Type is
record
Item : Item_Type;
Count : Natural;
Next : Node_Access;
end record;
The component Count is the reference count, and Next is the pointer
which links one node in the list to the next node. Note that here
Node_Type is just a simple record type. It doesn't need to be tagged,
because all nodes in a homogeneous list are identical.
List_Type is implemented as a private derivation of Controlled,
extending that type with a pointer to a node:
type List_Type is
new Ada.Finalization.Controlled with record
Head : Node_Access;
end record;
Note that the tagged-ness of List_Type isn't stated in the public view
of the type:
type List_Type is private;
because it's being used here strictly as an implementation technique.
Automatic incrementing and decrementing of reference counts is effected
by overriding Controlled operations Adjust and Finalize.
A controlled object gets adjusted during an assignment, immediately
after making a bit-wise copy from the value on the right hand side to
the object on the left hand side.
When we assign one list object to another, we have to increment the
reference count, because there is now one more object referring to that
node:
procedure Adjust (List : in out List_Type) is
begin
if List.Head /= null then
List.Head.Count := List.Head.Count + 1;
end if;
end Adjust;
A controlled object gets finalized immediately prior to the bit-wise
copy, and when the scope in which the object is declared ends. When a
list object is finalized, it means there is one less list object
pointing to that node, so you have to decrement the reference count:
procedure Finalize (Node : Node_Access) is
begin
if Node = null then
return;
end if;
Node.Count := Node.Count - 1;
if Node.Count = 0 then
Finalize (Node.Next);
end if;
end Finalize;
As I mentioned in my previous post, finalization of one node can cause a
chain reaction of node finalizations to occur. When the reference count
for a node drops to zero, you have to return the node to storage. That
means there's one less reference to the next node, so that next node
needs to have its reference count decremented. But if its count goes to
zero, then you have to finalize its next node, and so on.
The recursion terminates when you decrement a node whose reference count
is greater than one, or you fall off the end of the list (because the
reference counts of all the nodes in the list were one).
In this example, the storage pool is implemented as a simple list of
unused nodes. "Return the node to storage" just means inserting it at
the head of the free list:
Node.Next := Free_List;
Free_List := Node;
Other schemes are possible, such as using Unchecked_Deallocation to
return the memory to heap.
While Finalize puts nodes on the Free_List, Cons takes nodes off. If
the Free_List is empty, Cons will allocate a fresh node from the heap;
otherwise, it will just remove a node from the Free_List:
if Free_List = null then
Head := ;
else
Head := Free_List;
Free_List := Free_List.Next;
;
end if;
...
end Cons;
The implementation of other List_Type operations is straightforward.
Most of the operations were described in my last post (on heterogeneous
lists) anyway, so I won't repeat that information here.
An aside on exception handling: I haven't bothered to do any manual
error checking, mostly because it would duplicate the checks that are
already being performed automatically by Ada. For example, if you try to
get the head of a null list:
function Get_Head (List : List_Type) return Item_Type is
begin
return List.Head.Item;
end;
then Ada will raise Constraint_Error, because you're trying to
dereference a null pointer. How much more checking do you really need
to do?
My personal philosophy is to let the language do as much work for me as
possible. The only time I explicitly check for a precondition violation
is if there's no convenient way to let the language do the check, and
only then if the violation would corrupt the state of the abstraction.
However, this behavior should of course be documented. The interesting
pre- and post-conditions of each operation should be stated explicitly,
as well as the exception behavior. Something like:
function Get_Head (List : List_Type) return Item_Type;
--
-- Preconditions:
-- List /= Null_List
--
-- Exceptions:
-- Constraint_Error, when the precondition isn't satisfied
My other reason for not explicitly performing a check (or internally
handling the predefined exception, and then propagating another) is for
reasons of efficiency. A client who obeys the pre-conditions shouldn't
have to pay any run-time penalty for others' bad behavior.
This has practical consequences, because any potential client of a
reusable component who perceives that the component isn't as efficient
as it could be isn't going to use it. He'll just write his own, more
efficient version from scratch.
So give the people what they want.
A compromise position is to use pragma Assert to check preconditions,
something like:
function Get_Head (List : List_Type) return Item_Type is
begin
pragma Assert (List.Head /= null,
"trying to get head of empty list");
return List.Head.Item;
end;
The benefit of this approach is that there should be a compiler switch
you can use to turn the assertion checks on or off (the switch for gnat
is called "-gnata"). You can enable the checks during initial
development, then later disable them when you're satisfied everyone's
being good.
The other benefit of using pragma Assert is that if the assertion check
does fail, then (depending on how good your compiler is) you'll get an
error message with very specific debugging information, such as name of
the module in which the error occurred, and on what line.
Iteration
There are two techniques for iteration, "active" vs "passive", which I
discussed in one of my previous posts. A list is interesting because it
is its own active iterator; no separate iterator type is necessary for
list traversal.
You'd typically structure active iteration over a list as follows:
declare
Index : List_Type := List;
begin
while Index /= Null_List loop
... Get_Head (Index) ...
Index := Get_Tail (Index);
end loop;
end;
This is fine, although somewhat error prone, because you can forget to
advance the index (a mistake I often make), or worse, forget to declare
an index and just use the list:
while List /= Null_List loop
... Get_Head (List) ...
List := Get_Tail (List);
end loop;
This destroys the list (gulp!), which may not be the, um, desired
behavior. This of course has never happened to anyone you know. Ahem.
An alternate approach is to use a passive iterator, which is more or
less bullet-proof, because it hides all the index manipulation. It may
also be more efficient. Here's why.
Because the list object is controlled, adjustment and finalization occur
every time an assignment occurs. This can happen like crazy if lots of
temporaries are created. In the example above, all this activity
happens every pass through the loop.
A passive iterator may be more efficient because it traverses the list
using the underlying access objects instead of list objects, thus
avoiding the overhead associated with controlled assignment.
The implementation of a passive iterator for forward iteration looks
just like the active iteration we showed above:
procedure Iterate_Forward (List : in List_Type) is
Done : Boolean := False;
Node : Node_Access := List.Head;
begin
while Node /= null loop
Process (Node.Item, Done);
exit when Done;
Node := Node.Next;
end loop;
end Iterate_Forward;
The only difference is that we have to check a Done flag, to see if the
client wants to terminate the iteration. (You need this to implement
efficient search schemes and the like.)
There's also a passive iterator for backwards traversal. Huh? Over a
singly-linked list?
Recursion is your friend. What we do is use a little helper routine to
recursively traverse all the nodes in the list. When we fall off the
end, the recursion terminates, and we process the item at each node
(from foot to head) as we unwind the stack:
procedure Iterate_Backward (List : in List_Type) is
Done : Boolean := False;
procedure Visit (Node : in Node_Access) is
begin
if Node /= null then
Visit (Node.Next);
if not Done then
Process (Node.Item, Done);
end if;
end if;
end Visit;
begin
Visit (List.Head);
end Iterate_Backward;
Obviously, this isn't the most efficient mechanism for traversing a
linked list in foot-to-head order, and if you're serious about backwards
iteration, then you should probably be using a doubly-linked list. But
it'll do in a pinch, eh?
Matt
The source below is in a format suitable for use with gnatchop.
--STX
with Ada.Finalization;
generic
type Item_Type is private;
with function Get_Image (Item : Item_Type) return String is <>;
package Lists is
type List_Type is private;
Null_List : constant List_Type;
function Get_Head (List : List_Type) return Item_Type;
procedure Set_Head (List : in List_Type;
Item : in Item_Type);
function Get_Tail (List : List_Type) return List_Type;
procedure Set_Tail (List : in List_Type;
Tail : in List_Type);
function Cons (Item : Item_Type;
List : List_Type) return List_Type;
type Item_Array is
array (Positive range <>) of Item_Type;
function To_List (Item : Item_Type) return List_Type;
function To_List (Items : Item_Array) return List_Type;
function Get_Image (List : List_Type) return String;
generic
with procedure Process
(Item : in out Item_Type;
Done : in out Boolean);
procedure Iterate_Forward (List : in List_Type);
generic
with procedure Process
(Item : in out Item_Type;
Done : in out Boolean);
procedure Iterate_Backward (List : in List_Type);
private
type Node_Type;
type Node_Access is access all Node_Type;
type Node_Type is
record
Item : Item_Type;
Count : Natural;
Next : Node_Access;
end record;
type List_Type is
new Ada.Finalization.Controlled with record
Head : Node_Access;
end record;
procedure Adjust (List : in out List_Type);
procedure Finalize (List : in out List_Type);
use Ada.Finalization;
Null_List : constant List_Type :=
(Controlled with Head => null);
end Lists;
package body Lists is
Free_List : Node_Access;
procedure Finalize (Node : Node_Access);
function Get_Head (List : List_Type) return Item_Type is
begin
return List.Head.Item;
end;
procedure Set_Head (List : in List_Type;
Item : in Item_Type) is
begin
List.Head.Item := Item;
end;
function Get_Tail (List : List_Type) return List_Type is
Tail : constant Node_Access := List.Head.Next;
begin
if Tail /= null then
Tail.Count := Tail.Count + 1;
end if;
return List_Type'(Controlled with Head => Tail);
end Get_Tail;
procedure Set_Tail (List : in List_Type;
Tail : in List_Type) is
begin
Finalize (List.Head.Next);
List.Head.Next := Tail.Head;
if Tail.Head /= null then
Tail.Head.Count := Tail.Head.Count + 1;
end if;
end Set_Tail;
function Cons (Item : Item_Type;
List : List_Type) return List_Type is
Head : Node_Access;
begin
if Free_List = null then
Head := new Node_Type'(Count => 1,
Next => List.Head,
Item => Item);
else
Head := Free_List;
Free_List := Free_List.Next;
Head.all := Node_Type'(Count => 1,
Next => List.Head,
Item => Item);
end if;
if List.Head /= null then
List.Head.Count := List.Head.Count + 1;
end if;
return List_Type'(Controlled with Head);
end Cons;
function To_List (Item : Item_Type) return List_Type is
begin
return Cons (Item, Null_List);
end;
function To_List (Items : Item_Array) return List_Type is
List : List_Type;
begin
for I in reverse Items'Range loop
List := Cons (Items (I), List);
end loop;
return List;
end To_List;
function Get_Image (List : List_Type) return String is
function Count_Image (Count : Natural) return String is
Image : constant String := Integer'Image (Count);
begin
return "(" & Image (2 .. Image'Last) & ")";
end;
function Get_Image (Node : Node_Access) return String is
begin
if Node = null then
return "";
else
return
" " &
Get_Image (Node.Item) &
Count_Image (Node.Count) &
Get_Image (Node.Next);
end if;
end Get_Image;
begin
if List.Head = null then
return "()";
else
return
"(" &
Get_Image (List.Head.Item) &
Count_Image (List.Head.Count) &
Get_Image (List.Head.Next) &
")";
end if;
end Get_Image;
procedure Iterate_Forward (List : in List_Type) is
Done : Boolean := False;
Node : Node_Access := List.Head;
begin
while Node /= null loop
Process (Node.Item, Done);
exit when Done;
Node := Node.Next;
end loop;
end Iterate_Forward;
procedure Iterate_Backward (List : in List_Type) is
Done : Boolean := False;
procedure Visit (Node : in Node_Access) is
begin
if Node /= null then
Visit (Node.Next);
if not Done then
Process (Node.Item, Done);
end if;
end if;
end Visit;
begin
Visit (List.Head);
end Iterate_Backward;
procedure Adjust (List : in out List_Type) is
begin
if List.Head /= null then
List.Head.Count := List.Head.Count + 1;
end if;
end Adjust;
procedure Finalize (List : in out List_Type) is
begin
Finalize (List.Head);
end;
procedure Finalize (Node : Node_Access) is
begin
if Node = null then
return;
end if;
pragma Assert (Node.Count > 0,
"finalizing node with ref count = 0");
Node.Count := Node.Count - 1;
if Node.Count = 0 then
Finalize (Node.Next);
Node.Next := Free_List;
Free_List := Node;
end if;
end Finalize;
end Lists;
with Lists;
package Integer_Lists is
new Lists (Integer, Get_Image => Integer'Image);
with Integer_Lists; use Integer_Lists;
with Ada.Text_IO; use Ada.Text_IO;
with Ada.Integer_Text_IO; use Ada.Integer_Text_IO;
procedure Test_Lists is
List : List_Type;
L1, L2 : List_Type;
begin
Put_Line (Get_Image (List));
New_Line;
List := Cons (6, List);
Put_Line (Get_Image (List));
New_Line;
-- declare
-- Temp : constant List_Type :=
-- Cons (57, Cons (58, Null_List));
-- begin
-- Put_Line (Get_Image (Temp));
-- List := Cons (Temp, List);
-- Put ("head: "); Put_Line (Get_Image (List_Type'(Get_Item (List))));
-- Put ("tail: "); Put_Line (Get_Image (Get_Tail (List)));
-- Put_Line (Get_Image (List));
-- New_Line;
-- end;
List := Cons (2, List);
Put_Line (Get_Image (List));
L1 := List;
Put_Line (Get_Image (List));
L2 := List;
Put_Line (Get_Image (List));
L1 := Null_List;
Put_Line (Get_Image (List));
L1 := Get_Tail (List);
Put_Line (Get_Image (List));
L2 := Null_List;
Put_Line (Get_Image (List));
L2 := L1;
Put_Line (Get_Image (List));
New_Line;
-- declare
-- Temp : constant List_Type :=
-- Cons (99, Cons (98, Cons (97, Null_List)));
-- begin
-- Put_Line (Get_Image (Temp));
-- List := Cons (Temp, List);
-- Put_Line (Get_Image (List));
-- end;
Put_Line (Get_Image (List));
New_Line;
L1 := Null_List;
Put_Line (Get_Image (List));
New_Line;
L2 := Null_List;
Put_Line (Get_Image (List));
New_Line;
while List /= Null_List loop
List := Get_Tail (List);
Put_Line (Get_Image (List));
New_Line;
end loop;
List := To_List (Items => (1, 2, 3, 4, 5, 6, 7, 8, 9));
Put_Line (Get_Image (List));
New_Line;
declare
Index : List_Type := List;
begin
while Index /= Null_List loop
Put (Integer'Image (Get_Head (Index)));
-- Put (" "); Put_Line (Get_Image (List));
Index := Get_Tail (Index);
end loop;
New_Line;
end;
declare
procedure Put_Item
(Item : in out Integer;
Done : in out Boolean) is
begin
Put (Integer'Image (Item));
end;
procedure Put_Items is
new Iterate_Forward (Put_Item);
begin
Put_Items (List);
New_Line;
end;
declare
procedure Put_Item
(Item : in out Integer;
Done : in out Boolean) is
begin
Put (Integer'Image (Item));
end;
procedure Put_Items is
new Iterate_Backward (Put_Item);
begin
Put_Items (List);
New_Line;
end;
New_Line (2);
L1 := To_List ((10, 20, 30, 40, 50));
Put_Line (Get_Image (List));
Put_Line (Get_Image (L1));
New_Line;
Set_Tail (List, L1);
Put_Line (Get_Image (List));
Put_Line (Get_Image (L1));
New_Line;
Set_Tail (List, Null_List);
Put_Line (Get_Image (List));
Put_Line (Get_Image (L1));
New_Line (2);
end Test_Lists;
Contributed by: Matthew Heaney
Contributed on: March 8, 1999
License: Public Domain
Back