Homogeneous, Reference-Counted Lists

Introduction

Here's an example of the canonical form of linked list, in which all the
items have the same type, and each node is automatically returned to
storage via a reference-counting scheme.

If you could get through the brain teaser that was my last post
(heterogeneous lists), then this one will be a breeze.

I also briefly explore list iteration issues, including a discussion of
how to visit list items in reverse order.


Discussion

The fact that the items are homogeneous allows us to implement the list
using traditional techniques: as an abstract data type exported by a
generic package, which imports the item type as a generic formal
parameter.

Here's the elided spec:


generic
   type Item_Type is private;
package Lists is

   type List_Type is private;

   Null_List : constant List_Type;

   function Cons (Item : Item_Type;
                  List : List_Type) return List_Type;

   function Get_Head (List : List_Type) return Item_Type;

   function Get_Tail (List : List_Type) return List_Type;
...
end Lists;


Cons is the operation by which all lists are ultimately constructed.
Since this is a homogeneous list, we can declare Cons right there in the
Lists package.

Selector Get_Head and constructor Get_Tail are operations that undo the
effect of Cons.  Get_Head is returns the item at the head of the list,
and Get_Tail is returns the list that follows the head.

A aside about operation names: the concatenation operator might be a
better choice than "Cons", since it would allow us to write expressions
like:

  List := 6 & List;

instead of

  List := Cons (6, List);

I chose Cons partly because it's short, but mostly because lately I've
been writing tools for navigation of Ada source using Emacs Lisp.  I've
got lists on the brain these days.

The names "get"_head and "get"_tail were chosen to work around certain
name-space issues in Ada.  In a declaration, the object and the type
share a name-space, which requires that they have different names.

If the operation were named just

  function Head (List : List_Type) return Item_Type;

then the declaration

  declare
     Head : constant Item_Type := Head (List);
  begin

would be illegal.  Ditto for this:

  declare
     Tail : constant List_Type := Tail (List);
  begin

As a writer of a reusable module, I want to put as few constraints on
its use as possible.  As a writer of a list abstraction, I don't want to
restrict a client's ability to use Head and Tail as names of objects.
Which means I have to come up with a different name for those
operations.

In the software community we generally refer to modifiers and selectors
as "set and get operations."  Since the operations for retrieving the
head or tail of a list have the sense of a query, the names Get_Head and
Get_Tail seemed like the natural choice.

OK, back to our discussion.  Another useful operation is to change the
value of the item at the head of the list:

   procedure Set_Head (List : in List_Type;
                       Item : in Item_Type);

Set_Head and Get_Head are therefore symmetrical operations.

The operation

   procedure Set_Tail (List : in List_Type;
                       Tail : in List_Type);

is analogous to Set_Head.


That covers the interesting public operations.  Let's move on now to the
implementation issues.

We're going to use reference-counting to keep track of how many
references there are to each node.  When the count goes to zero, that
means there are no more references to that node, and we can safely
return the node to storage.  This eliminates memory leaks.

A node in the list looks like this:

   type Node_Type;
   type Node_Access is access all Node_Type;

   type Node_Type is
      record
         Item  : Item_Type;
         Count : Natural;
         Next  : Node_Access;
      end record;

The component Count is the reference count, and Next is the pointer
which links one node in the list to the next node.  Note that here
Node_Type is just a simple record type.  It doesn't need to be tagged,
because all nodes in a homogeneous list are identical.

List_Type is implemented as a private derivation of Controlled,
extending that type with a pointer to a node:

   type List_Type is
     new Ada.Finalization.Controlled with record
        Head : Node_Access;
     end record;

Note that the tagged-ness of List_Type isn't stated in the public view
of the type:

  type List_Type is private;

because it's being used here strictly as an implementation technique.

Automatic incrementing and decrementing of reference counts is effected
by overriding Controlled operations Adjust and Finalize.

A controlled object gets adjusted during an assignment, immediately
after making a bit-wise copy from the value on the right hand side to
the object on the left hand side.

When we assign one list object to another, we have to increment the
reference count, because there is now one more object referring to that
node:

   procedure Adjust (List : in out List_Type) is
   begin
      if List.Head /= null then
         List.Head.Count := List.Head.Count + 1;
      end if;
   end Adjust;

A controlled object gets finalized immediately prior to the bit-wise
copy, and when the scope in which the object is declared ends.  When a
list object is finalized, it means there is one less list object
pointing to that node, so you have to decrement the reference count:

   procedure Finalize (Node : Node_Access) is
   begin

      if Node = null then
         return;
      end if;

      Node.Count := Node.Count - 1;

      if Node.Count = 0 then

         Finalize (Node.Next);

         

      end if;

   end Finalize;


As I mentioned in my previous post, finalization of one node can cause a
chain reaction of node finalizations to occur.  When the reference count
for a node drops to zero, you have to return the node to storage.  That
means there's one less reference to the next node, so that next node
needs to have its reference count decremented.  But if its count goes to
zero, then you have to finalize its next node, and so on.

The recursion terminates when you decrement a node whose reference count
is greater than one, or you fall off the end of the list (because the
reference counts of all the nodes in the list were one).

In this example, the storage pool is implemented as a simple list of
unused nodes.  "Return the node to storage" just means inserting it at
the head of the free list:

      Node.Next := Free_List;
      Free_List := Node;

Other schemes are possible, such as using Unchecked_Deallocation to
return the memory to heap.

While Finalize puts nodes on the Free_List, Cons takes nodes off.  If
the Free_List is empty, Cons will allocate a fresh node from the heap;
otherwise, it will just remove a node from the Free_List:

     if Free_List = null then

        Head := ;

     else

        Head := Free_List;
        Free_List := Free_List.Next;

        ;

     end if;
     ...
   end Cons;


The implementation of other List_Type operations is straightforward.
Most of the operations were described in my last post (on heterogeneous
lists) anyway, so I won't repeat that information here.

An aside on exception handling: I haven't bothered to do any manual
error checking, mostly because it would duplicate the checks that are
already being performed automatically by Ada.  For example, if you try to
get the head of a null list:

   function Get_Head (List : List_Type) return Item_Type is
   begin
      return List.Head.Item;
   end;

then Ada will raise Constraint_Error, because you're trying to
dereference a null pointer.  How much more checking do you really need
to do?

My personal philosophy is to let the language do as much work for me as
possible.  The only time I explicitly check for a precondition violation
is if there's no convenient way to let the language do the check, and
only then if the violation would corrupt the state of the abstraction.

However, this behavior should of course be documented.  The interesting
pre- and post-conditions of each operation should be stated explicitly,
as well as the exception behavior.  Something like:

  function Get_Head (List : List_Type) return Item_Type;
  --
  -- Preconditions:
  --   List /= Null_List
  --
  -- Exceptions:
  --   Constraint_Error, when the precondition isn't satisfied


My other reason for not explicitly performing a check (or internally
handling the predefined exception, and then propagating another) is for
reasons of efficiency.  A client who obeys the pre-conditions shouldn't
have to pay any run-time penalty for others' bad behavior.

This has practical consequences, because any potential client of a
reusable component who perceives that the component isn't as efficient
as it could be isn't going to use it.  He'll just write his own, more
efficient version from scratch.

So give the people what they want.

A compromise position is to use pragma Assert to check preconditions,
something like:

   function Get_Head (List : List_Type) return Item_Type is
   begin
      pragma Assert (List.Head /= null,
                     "trying to get head of empty list");
      return List.Head.Item;
   end;

The benefit of this approach is that there should be a compiler switch
you can use to turn the assertion checks on or off (the switch for gnat
is called "-gnata").  You can enable the checks during initial
development, then later disable them when you're satisfied everyone's
being good.

The other benefit of using pragma Assert is that if the assertion check
does fail, then (depending on how good your compiler is) you'll get an
error message with very specific debugging information, such as name of
the module in which the error occurred, and on what line.


Iteration

There are two techniques for iteration, "active" vs "passive", which I
discussed in one of my previous posts.  A list is interesting because it
is its own active iterator; no separate iterator type is necessary for
list traversal.

You'd typically structure active iteration over a list as follows:

  declare
    Index : List_Type := List;
  begin
    while Index /= Null_List loop
       ... Get_Head (Index) ...
       Index := Get_Tail (Index);
    end loop;
  end;


This is fine, although somewhat error prone, because you can forget to
advance the index (a mistake I often make), or worse, forget to declare
an index and just use the list:

  while List /= Null_List loop
     ... Get_Head (List) ...
     List := Get_Tail (List);
  end loop;

This destroys the list (gulp!), which may not be the, um, desired
behavior.  This of course has never happened to anyone you know.  Ahem.

An alternate approach is to use a passive iterator, which is more or
less bullet-proof, because it hides all the index manipulation.  It may
also be more efficient.  Here's why.

Because the list object is controlled, adjustment and finalization occur
every time an assignment occurs.  This can happen like crazy if lots of
temporaries are created.  In the example above, all this activity
happens every pass through the loop.

A passive iterator may be more efficient because it traverses the list
using the underlying access objects instead of list objects, thus
avoiding the overhead associated with controlled assignment.

The implementation of a passive iterator for forward iteration looks
just like the active iteration we showed above:

   procedure Iterate_Forward (List : in List_Type) is

      Done : Boolean := False;
      Node : Node_Access := List.Head;
   begin
      while Node /= null loop
         Process (Node.Item, Done);
         exit when Done;
         Node := Node.Next;
      end loop;
   end Iterate_Forward;

The only difference is that we have to check a Done flag, to see if the
client wants to terminate the iteration.  (You need this to implement
efficient search schemes and the like.)

There's also a passive iterator for backwards traversal.  Huh?  Over a
singly-linked list?

Recursion is your friend.  What we do is use a little helper routine to
recursively traverse all the nodes in the list.  When we fall off the
end, the recursion terminates, and we process the item at each node
(from foot to head) as we unwind the stack:

   procedure Iterate_Backward (List : in List_Type) is

      Done : Boolean := False;

      procedure Visit (Node : in Node_Access) is
      begin
         if Node /= null then
            Visit (Node.Next);

            if not Done then
               Process (Node.Item, Done);
            end if;
         end if;
      end Visit;

   begin

      Visit (List.Head);

   end Iterate_Backward;


Obviously, this isn't the most efficient mechanism for traversing a
linked list in foot-to-head order, and if you're serious about backwards
iteration, then you should probably be using a doubly-linked list.  But
it'll do in a pinch, eh?

Matt


The source below is in a format suitable for use with gnatchop.


--STX
with Ada.Finalization;

generic
   type Item_Type is private;
   with function Get_Image (Item : Item_Type) return String is <>;
package Lists is

   type List_Type is private;

   Null_List : constant List_Type;

   function Get_Head (List : List_Type) return Item_Type;

   procedure Set_Head (List : in List_Type;
                       Item : in Item_Type);

   function Get_Tail (List : List_Type) return List_Type;

   procedure Set_Tail (List : in List_Type;
                       Tail : in List_Type);

   function Cons (Item : Item_Type;
                  List : List_Type) return List_Type;

   type Item_Array is
      array (Positive range <>) of Item_Type;

   function To_List (Item : Item_Type) return List_Type;

   function To_List (Items : Item_Array) return List_Type;

   function Get_Image (List : List_Type) return String;

   generic
      with procedure Process
        (Item : in out Item_Type;
         Done : in out Boolean);
   procedure Iterate_Forward (List : in List_Type);

   generic
      with procedure Process
        (Item : in out Item_Type;
         Done : in out Boolean);
   procedure Iterate_Backward (List : in List_Type);


private

   type Node_Type;
   type Node_Access is access all Node_Type;

   type Node_Type is
      record
         Item  : Item_Type;
         Count : Natural;
         Next  : Node_Access;
      end record;

   type List_Type is
     new Ada.Finalization.Controlled with record
        Head : Node_Access;
     end record;

   procedure Adjust (List : in out List_Type);

   procedure Finalize (List : in out List_Type);

   use Ada.Finalization;

   Null_List : constant List_Type :=
     (Controlled with Head => null);

end Lists;


package body Lists is

   Free_List : Node_Access;

   procedure Finalize (Node : Node_Access);


   function Get_Head (List : List_Type) return Item_Type is
   begin
      return List.Head.Item;
   end;

   procedure Set_Head (List : in List_Type;
                       Item : in Item_Type) is
   begin
      List.Head.Item := Item;
   end;


   function Get_Tail (List : List_Type) return List_Type is

      Tail : constant Node_Access := List.Head.Next;
   begin
      if Tail /= null then
         Tail.Count := Tail.Count + 1;
      end if;

      return List_Type'(Controlled with Head => Tail);
   end Get_Tail;



   procedure Set_Tail (List : in List_Type;
                       Tail : in List_Type) is
   begin

      Finalize (List.Head.Next);
      List.Head.Next := Tail.Head;

      if Tail.Head /= null then
         Tail.Head.Count := Tail.Head.Count + 1;
      end if;

   end Set_Tail;



   function Cons (Item : Item_Type;
                  List : List_Type) return List_Type is

      Head : Node_Access;

   begin

      if Free_List = null then

         Head := new Node_Type'(Count => 1,
                                Next  => List.Head,
                                Item  => Item);

      else

         Head := Free_List;
         Free_List := Free_List.Next;

         Head.all := Node_Type'(Count => 1,
                                Next  => List.Head,
                                Item  => Item);

      end if;

      if List.Head /= null then
         List.Head.Count := List.Head.Count + 1;
      end if;

      return List_Type'(Controlled with Head);

   end Cons;


   function To_List (Item : Item_Type) return List_Type is
   begin
      return Cons (Item, Null_List);
   end;


   function To_List (Items : Item_Array) return List_Type is

      List : List_Type;

   begin

      for I in reverse Items'Range loop
         List := Cons (Items (I), List);
      end loop;

      return List;

   end To_List;



   function Get_Image (List : List_Type) return String is

      function Count_Image (Count : Natural) return String is
         Image : constant String := Integer'Image (Count);
      begin
         return "(" & Image (2 .. Image'Last) & ")";
      end;

      function Get_Image (Node : Node_Access) return String is
      begin
         if Node = null then
            return "";
         else
            return
              " " &
              Get_Image (Node.Item) &
              Count_Image (Node.Count) &
              Get_Image (Node.Next);
         end if;
      end Get_Image;

   begin

      if List.Head = null then
         return "()";
      else
         return
           "(" &
           Get_Image (List.Head.Item) &
           Count_Image (List.Head.Count) &
           Get_Image (List.Head.Next) &
           ")";
      end if;

   end Get_Image;


   procedure Iterate_Forward (List : in List_Type) is

      Done : Boolean := False;
      Node : Node_Access := List.Head;
   begin
      while Node /= null loop
         Process (Node.Item, Done);
         exit when Done;
         Node := Node.Next;
      end loop;
   end Iterate_Forward;


   procedure Iterate_Backward (List : in List_Type) is

      Done : Boolean := False;

      procedure Visit (Node : in Node_Access) is
      begin
         if Node /= null then
            Visit (Node.Next);

            if not Done then
               Process (Node.Item, Done);
            end if;
         end if;
      end Visit;

   begin

      Visit (List.Head);

   end Iterate_Backward;



   procedure Adjust (List : in out List_Type) is
   begin
      if List.Head /= null then
         List.Head.Count := List.Head.Count + 1;
      end if;
   end Adjust;



   procedure Finalize (List : in out List_Type) is
   begin
      Finalize (List.Head);
   end;



   procedure Finalize (Node : Node_Access) is
   begin

      if Node = null then
         return;
      end if;

      pragma Assert (Node.Count > 0,
                     "finalizing node with ref count = 0");

      Node.Count := Node.Count - 1;

      if Node.Count = 0 then

         Finalize (Node.Next);

         Node.Next := Free_List;
         Free_List := Node;

      end if;

   end Finalize;


end Lists;


with Lists;

package Integer_Lists is
  new Lists (Integer, Get_Image => Integer'Image);


with Integer_Lists; use Integer_Lists;

with Ada.Text_IO;         use Ada.Text_IO;
with Ada.Integer_Text_IO; use Ada.Integer_Text_IO;

procedure Test_Lists is

   List : List_Type;

   L1, L2 : List_Type;

begin

   Put_Line (Get_Image (List));
   New_Line;

   List := Cons (6, List);
   Put_Line (Get_Image (List));
   New_Line;

--    declare
--       Temp : constant List_Type :=
--         Cons (57, Cons (58, Null_List));
--    begin
--       Put_Line (Get_Image (Temp));
--       List := Cons (Temp, List);
--       Put ("head: "); Put_Line (Get_Image (List_Type'(Get_Item (List))));
--       Put ("tail: "); Put_Line (Get_Image (Get_Tail (List)));
--       Put_Line (Get_Image (List));
--       New_Line;
--    end;

   List := Cons (2, List);
   Put_Line (Get_Image (List));
   L1 := List;
   Put_Line (Get_Image (List));
   L2 := List;
   Put_Line (Get_Image (List));
   L1 := Null_List;
   Put_Line (Get_Image (List));
   L1 := Get_Tail (List);
   Put_Line (Get_Image (List));
   L2 := Null_List;
   Put_Line (Get_Image (List));
   L2 := L1;
   Put_Line (Get_Image (List));
   New_Line;


--    declare
--       Temp : constant List_Type :=
--         Cons (99, Cons (98, Cons (97, Null_List)));
--    begin
--       Put_Line (Get_Image (Temp));
--       List := Cons (Temp, List);
--       Put_Line (Get_Image (List));
--    end;

   Put_Line (Get_Image (List));
   New_Line;

   L1 := Null_List;
   Put_Line (Get_Image (List));
   New_Line;

   L2 := Null_List;
   Put_Line (Get_Image (List));
   New_Line;

   while List /= Null_List loop

      List := Get_Tail (List);
      Put_Line (Get_Image (List));
      New_Line;

   end loop;

   List := To_List (Items => (1, 2, 3, 4, 5, 6, 7, 8, 9));
   Put_Line (Get_Image (List));
   New_Line;

   declare
      Index : List_Type := List;
   begin
      while Index /= Null_List loop
         Put (Integer'Image (Get_Head (Index)));
--       Put ("  ");  Put_Line (Get_Image (List));
         Index := Get_Tail (Index);
      end loop;
      New_Line;
   end;

   declare
      procedure Put_Item
        (Item : in out Integer;
         Done : in out Boolean) is
      begin
         Put (Integer'Image (Item));
      end;

      procedure Put_Items is
        new Iterate_Forward (Put_Item);
   begin
      Put_Items (List);
      New_Line;
   end;

   declare
      procedure Put_Item
        (Item : in out Integer;
         Done : in out Boolean) is
      begin
         Put (Integer'Image (Item));
      end;

      procedure Put_Items is
        new Iterate_Backward (Put_Item);
   begin
      Put_Items (List);
      New_Line;
   end;

   New_Line (2);

   L1 := To_List ((10, 20, 30, 40, 50));

   Put_Line (Get_Image (List));
   Put_Line (Get_Image (L1));
   New_Line;
   Set_Tail (List, L1);
   Put_Line (Get_Image (List));
   Put_Line (Get_Image (L1));
   New_Line;
   Set_Tail (List, Null_List);
   Put_Line (Get_Image (List));
   Put_Line (Get_Image (L1));
   New_Line (2);

end Test_Lists;
Contributed by: Matthew Heaney
Contributed on: March 8, 1999
License: Public Domain
Back