Skip to content

Latest commit

 

History

History
1972 lines (1284 loc) · 76.4 KB

gc.textile

File metadata and controls

1972 lines (1284 loc) · 76.4 KB
layout
default

chapter 5
Garbage Collection

Execution time program image

This chapter will abruptly subject the reader to a discussion
of various low level issues. First, the Ruby ‘C’ program
memory structure, followed by the Memory Structure supporting
the actual Ruby Language execution. This may be tough going for some readers, but this chapter is important to understanding later chapters.

C Program Memory Structure

A general a ‘C’ program has the following basic memory structure.

1. Program Code Space
2. Constants, Static variables, and global variables
3. Machine stack
4. Heap, ;also refered to a <i>Free Store</i>.

Items 1 and 2 are generally contained in the load image. Items 3 an 4 are setup by program initialization.

Once a ‘C’ program is initialized, it contains at least three seperate
memory segments. Program Code, Globals, and static variables are
generally in the load image stored in Text Memory. Arguments and local
variables of functions are stored on the Machine Stack. Lastly, the program can also
acquire additional memory from the Free Store (also refered to as the Heap).

The third item should to well known to most programmers. The Machine Stack
is implemented by the processor hosting Ruby.
In Figure 1a. the stack is shown growing toward higher addresses. This is not universal, the X86 Memory Stack, Figure 1b, grows toward the lower addresses.

(Macstack)

Figure 1: Machine Stacks

The Ruby language program uses the

the Machine Stack to handle it’s stack operations. The small items, like Integers and arguments are pushed onto the Machine Stack by the underlying code generated by the
‘C’ Compiler.

As we will see, the ruby interpreter pushes larger blocks of data onto the Machine Stack. These blocks are called a Stack Frames. The Stack Frame is a predefine memory block that allows Ruby push structured data onto the Machine Stack. Ruby also pushes various other structured and unstructured data onto the stack.

The Stack Frame corresponds to a Ruby Function Call. When the function returns, the Stack Frame is poped from the Machine Stack! The left side of Figure 2 shows the stack after the Ruby Function ‘D’ is called. The right shows the stack after returning from the functions ‘D’ & ‘C’.

(Macstack)

Figure 2: Machine Stack – Using native alloca

Alloca() vs Malloc()

The malloc() function returns a pointer to a block of memory of the
size requested, of a Null pointer to indicate an error. The memory block
is obtained from the Heap. Memory Blocks acquired using malloc() must eventually be returned to the Heap via Free() function.

The Native alloca() function is a different animal. This function is a very low level function without many safeguards! This function is
generally implemented as in-line function generated by the complier. The requested memory block is allocated on the Machine Stack. Thus, when the function that called alloca() returns the memory block evaporates. The inlined code is generally a single instruction adjusting the stack pointer,
it does not check for stack overflow, and there is no NULL error return.

The use of Native alloca() is good in the respect that is simple and
can improve execution times when used appropriately.

Now for the “fly in the ointment”! The sad fact is that Native alloc() is not supported by all host systems and compilers used to execute the Ruby Language Program (i.e. the Ruby Interpreter). The parts of the
Ruby interpreter that use alloca() determine which implementation of
alloca() to use.

The Alloca() Emulation code is provided by missing/alloca.c.

The alloca emulation function uses a linked list to manage the allocation and release of the acquired memory. Each entry in the linked list consists of a header and a block of memory. Each Header contains a pointer to the next entry (if any) and the current Machine Stack location. A List Element is made up of a Header (described above) followed by a Size block of memory.

(Calloca)
Figure 3: Alloca Emulation linked list diagram

This list and it’s operations are controlled by one function, alloca() This is necessary to maintain compatibility with Native Alloca.

When a user calls alloca(), the function performs Garbage Collection. The Garbage Collector steps backward through the linked List removing all List Elements where the recorded depth is deeper in stack than the current Machine Stack pointer. After garbage collection is finished, a new List Element is created and placed a the top of the linked list.

If the size is zero, the function returns after the garbage collection is perfomed and does not create a List Element

This is important to remember that until the next alloca() call the linked
list remains untouched. While a List Element may exist beyond the point
where calling function returns, it will evaporate the next time the alloca() function is called.

The Figure below shows the results after the return of ‘D’ and ‘C’ and the next garbage collection cycle caused by a alloca() call. The figure below does not show creation of the next List Element, if any.

(Calloca)
Figure 4: Alloca Emulation linked list diagram

The missing/alloca.c function emulates the Native Alloca() on host systems or compilers that do not support the native alloca().

GC … Explanation

We have discussed the underlying support for stack resident memory allocation and it’s emulation for those host systems and compilers that do not have Native alloca support. The balance of this chapter will be the
management of the Heap.


h3. GC

Normally, objects reside in memory and at the conclusion of the program these memory resident objects are released along the other memory resources used by the program. As long as the memory space is large enough to provide all the memory requested, things work fine!

In actuallity, memory is always a limited resource. Even if the user has access to all them memory available, even small amounts of discarded memory add up to very large values for long running programs. Almost all programs will acquire blocks of heap memory. When these resources are not released properly, the accumlation of this memory debris is called a Memory Leak.

The control of the memory resources is always a available to the programmer by using malloc() , and free(). In Object Oriented programs knowing when to release objects from memory can become
difficult due to complex object references.
Ruby relieves the programmer of much of this complexity by managing object memory allocation and release automatically. Ruby handles this task, as many other languages do, by using a Garbage Collection Scheme.

Ruby uses a modified version of the Boehm-Demers-Weiser garbage collector , generally known as simply Boehm GC. With this function there is no need release the memory allocated for Ruby objects and other automatically allocated memory. The Ruby Garbage Collector, hereafter refered to as just GC, manages both the allocation and release of these memory resources.

The issue of Heap Memory Compaction arises in many discussions of the various implementations of Boehm GC. Many programs that the use a
Garbage Collector, often face issues of memory fragmentation. Ruby does not implement memory compaction, but does not generally suffer from crippling heap fragmentation.

One reason for this is that the basic object structures are all the same size. It is also true that some objects, strings leap to mind, allocate additional memory to hold the string value.
The free store or heap storage algorithm is controll at a low level by the
malloc functiion. This, and other issues concernint the Ruby GC will be covered in a later section.

The Boehm GC has been used successfully in many systems. More information on Boehm GC is available at: http://www.hpl.hp.com/personal/Hans_Boehm/gc.

Recent languages such as Java and Perl, Python, C#, and Eiffel use Garbage Collection. The rest of this chapter will be devoted to the details of Ruby’s Garbage Collector.

What is Garbage Collection?

The first question is what is GC trying accomplish. The GC allocates memory from the
heap for objects and memory controlled by Objects. Additionally, and
most importantly, it manages this resource and reclaims the allocated memory from the program when it is
no longer in use. This releaves the programmer of the task of keeping track of allocated memory and releasing it when no longer needed.

Ruby’s GC is, as are most Garbage Collectors,
is composed of three main sections. The first is Memory
Allocation
, the second is Garbage Detection, and the last is Garbage Reclarmation.

Memory Allocation functions control access to the memory set aside for GC functions.
This memory is initially allocated from Heap Memory at program initialization. When this
initial allocation of memory is exhausted, Ruby will perform Garbage Collection. If there is
still not enougth memory to satisfy the allocation request, another block of memory is requested and added to the memory controlled by the Garbage Collector.

The various memory
allocation functions return blocks of memory of the requested sizes. These functions also
maintain lists of memory that is allocated and memory that is available for allocation. In addition to being used by memory allocation functions, these lists are
also updated by the Garbage Reclarmation processing.

Garbage Detection is the process of determining which objects are still in use. These objects
are Live Objects and will be preserved. Conversely, all other objects are considered “Dead” and no longer
viable. These objects, called garbage, are processed by the Garbage Reclarmation section.

Garbage Reclarmation section of GC attempts to find all the memory being used by the so
called Garbage Objects and returns it
to the Memory Allocation functions by updating the lists of available memory.
The memory returned is not just the object itself, but any addtional memory attached to the basic
object. In Ruby for example, Fixnum, true, and false are all completely contained the basic object structure.
Strings, on the other hand, allocate additional memory to hold the actual
character string.

Survey of garbage collector types

The following sections discuss several general methods for performingGarbage Collection.
We briefing discuss the major types, their weaknesses and strengths. Bear in mind the Mark and Sweep is the type used by Ruby.

Mark & sweep

The Mark phase finds live objects and marks them to prevent the
GC from deallocating the associated memory. It performs this task by defining a Root Set of known viable objects and tracing their connections to other objects.

The Root Set are objects referenced by:

  1. Ruby Global Variables and Constants
  2. Ruby live Local Variables
  3. The machine stack
  4. Objects referenced by ‘C’ Global Variables registered by rb_gc_register_address()

(Gcimage)
Figure 5: Root Set trace of live objects

The set of objects found by the sweep phase that are not marked are considered dead objects,
also called garbage.

Mark and Sweep Merits:

  1. While Mark & Sweep GC’s normally have problems with memory fragmentation, Ruby GC is somewhat
    immune to this problem because the common size of Ruby Objects.
  2. One other advantage is that Mark & Sweep can release a cycle. See the section on Reference Counting for a discussion of this problem.

Mark and Sweep Deficiencies

  1. The minimum time it takes to run one Mark and Sweep Cycle is directly proportional to the
    number of Managed Objects. Each object must be ‘touched’ at least once. All live objects must be marked and all garbage objects must be collected.
  2. Because the Mark and Sweep cycle locks out other processing while it runs, the GC can make a noticable dent in a programs performace.

Stop & copy

This method is a modification of the Mark and Sweep Garbage Collector. The managed heap space is divided into two sections. One is the current active heap and the other is empty, call them A and B. Any any one time, only one these heaps will be active! The GC allocates new objects in the active heap.

(Stop2)
Figure 6: Stop & copy (1)

When the GC Runs, it transverses all active objects in the same way as Mark and Sweep. However, the live objects are transferred to the inactive heap instead marking them. When the Mark phase is complete, the inactive heap is made the active heap. The freshly inactivated heap only contains garbage objects and thus they are implicitly collected. This is, the inactive area is considered empty!

(Stop3)
Figure 7: Stop & copy (2)

Stop and Copy Merits:

  1. The live objects are essentially compacted during the Garbage Collection cycle.
  2. This additonally improves the locality of live objects. Since they are all located near each other, they reduce memory paging in a virtual memory system. There is a much higher probabilty that referneces will be in the memory cache.
  3. Like it’s cousin Mark and Sweep, this method also is able to collect cycles.

Stop and Copy Deficiencies

  1. The time to copy objects to the inactive heap takes longer than simply marking a live object.
  2. The position of the objects change. The relocation of references may take a signigicant amount of processing.

Reference counting

A Reference Counting Garbage Collector is different from the other types of GC’s we have discussed so far. In this method, each object has a reference counter associate with it. This counter is incremented by one each tiem a new reference pointer is created that points to a particular object. For example, when a pointer is copied from one place to another by an assignment statement.

When a referece to an object is removed, the counter is decremented by one. When the reference counter becomes zero, the object is collected, e.g., the memory allocated for the object is reclaimed.

(Refcnt)
Figure 8: Reference counting example

Reference Counting Merits:

  1. The processing load imposed by the Garbage Collector is distributed across the whole execution of the program.
  2. Objects are, theorically, released the moment they can no longer be reached by the program.

Reference Counting Deficiencies

  1. The most serious problem with reference counting is that is it not always effective. The inability to handle cycles is discussed below.
  2. Reference counting is very hard to make efficient. Since every time a refernce to an object is created or destroyed, a reference counter must be incremented or decremented. This adds more overhead to the processing of objects and references.
  3. Additionally, if the reference counter for a large complex object structure becomes zero, it can cascade through references finding other referneces becoming zero. In some cases this may cause a noticable performace hit.

As mentioned above, a major deficiency of reference counting is handling cycles. These are a group of objects that form a circular structure. A simple example is shown below:

(Cycle)
Figure 9: Cycle

As you can see, even if such a structure loses the last reference from an external object, the counters will never reach zero.

Management of Ruby objects

The Ruby GC only manages Ruby Objects and other memory blocks controlled by Ruby Objects. Ruby Objects often acquire addtional memory to provide storage for a variety of purposes. These include buffers to hold character strings, memory to hold numeric values, memory to hold arrays and hashes.

While memory can be allocated directly from the heap by calling ‘C’ language functions, this memory will not be managed by the Ruby GC. It is the Users resposibility to release any memory acquired in this manner. Failure to release memory will result in a memory leak , as shown below:


Void not_ok ()
{
Malloc (1024); /* Possible Memory Leak! */
}

In the following case, the user allocates memory for a Ruby object using the Ruby API. Ruby will manage this block on memory automatically and release it when it is no longer in use.

Void this_is_ok ()
{
    Rb_ary_new ();    /* Object managed by the Ruby GC */
away, *
}

Struct RVALUE

As we stated the Ruby GC only manages Ruby Objects. This control is accomplished
using the basic Ruby object structure. Ruby objects are always addressed with a
reference pointer of type RVALUE.

The struct RVALUE a union of Object Structures and System Structures. There is one object structure for each ruby object type( object, klass, string, …), and additonal objects to support the Ruby interpreter operations.

typedef struct RVALUE {
    union {
  struct {
      unsigned long flags;  /* always 0 for freed obj */
      struct RVALUE *next;
  } free;
  struct RBasic  basic;
  struct RObject object;
  struct RClass  klass;
  struct RFloat  flonum;
  struct RString string;
  struct RArray  array;
  struct RRegexp regexp;
  struct RHash   hash;
  struct RData   data;
  struct RStruct rstruct;
  struct RBignum bignum;
  struct RFile   file;
  struct RNode   node;
  struct RMatch  match;
  struct RVarmap varmap;
  struct SCOPE   scope;
    } as;
#ifdef GC_DEBUG
    char *file;            /* Internal Debugging Feature */
    int   line;
#endif
} RVALUE;

(Gc.C)

Ruby GC managed memory contains two distinct types of data. The first is the actual Ruby Objects, which are always exactly 20 Bytes(40 Bytes on 64 bit Processors). The other type is the additional memory allocated and referenced by these objects. This memory is used to store data and table references.

Of the RVALUE Structures listed above, only the ones itemized below represent the structures that implement Ruby Objects.

<i>Struct</i> <b>RBasic </b>   - Preamble for all Structures(flags & Klass ptr)<br>
<i>Struct</i> <b>RObject</b>   - General Object - For things not applicable below
<i>Struct</i> <b>RClass </b>   - Class objects
<i>Struct</i> <b>RFloat </b>   - Integer Decimals (small & medium)
<i>Struct</i> <b>RString</b>   - Character strings
<i>Struct</i> <b>RArray </b>   - Arrays
<i>Struct</i> <b>RRegexp</b>   - Regular expression
<i>Struct</i> <b>RHash  </b>   - Hash table
<i>Struct</i> <b>RFile  </b>   - IO File,  Socket,  And so on
<i>Struct</i> <b>RData  </b>   - Class for describing everything for 'C' level
<i>Struct</i> <b>RStruct</b>   - The structure of Ruby Struct Class
<i>Struct</i> <b>RBignum</b>   - Big integer

The importance of the RVALUE Structure can not by over emphasized. This is the core of the Ruby Interpreter. It not only expresses all Ruby objects, but it also contains structures used by the Interpter itself. All of these structures are same length and share a common preamble, the RBasic Structure.

The Structure Definition of RBasic is as follows:

  struct RBasic {
  unsigned long flags;
  VALUE klass;
  };

(ruby.h)

The two structures most important to the operation of the Ruby GC are:


1) Struct RBasic
2) Struct as.free

As you will remember, most of the RVALUE structures contain RBasic, The only exceptions are RNode and SCOPE. In the case of as.free, it reinterprets RBasic enteries only for the purpose of linking RVALUE blocks into the *freelst.

<b>as.free</b>
unsigned long flags;   -- Always zero when added to the *freelist
struct RVALUE *next;   -- Points at next entry in the *freelist

The construction of the *freelist ;will be discussed shortly.

The flags word in RBasic, as explained in Chapter 02, contains all the state information for a Ruby Object.

Free Store Mapping

Due to the way the malloc() functions, the free store memory is not always allocated in contiguous fashion.
This is because malloc allocates memory in various locations depending on the size
of the requested block. Exactly where memory allocations will take place is beyond the scope of the document and is dependent on the exact implementation of malloc()! However, since Ruby make one or more very large allocations (Ruby heap’s) and many smaller allocations during object creation and Interpreter processing, this generally results in only two primary locations for memory allocation.

The following is a plausible memory map for many processors running Ruby.

(Memory Allocation Map)

Figure 10: Memory allocation map

Ruby object heaps

The Ruby Interpreter implements a specialized memory management system for Ruby Objects
on top of the more general Free Store management scheme provided by malloc(), realloc(), and free() functions. Since Ruby Objects are fixed in size, the Ruby GC can optimize their management.

The Ruby Object Heap, hereafter called an Object Heap, is implemented using two interrelated arrays. The array heaps[ ] is the root structure. Each element contains a heaps_slot structure. Initially only the first elment of heaps[ ] is instantiated.

The following code fragment show the important defines, arrays, and structures that support Object Heap contruction and management.

#define HEAPS_INCREMENT 10

#define HEAP_MIN_SLOTS 10000
static int heap_slots = HEAP_MIN_SLOTS;

static struct heaps_slot {
    void *membase;
    RVALUE *slot;
    int limit;
} *heaps;

static int heaps_length = 0;
static int heaps_used = 0;

(gc.c)

The heaps_slot structure contains three elements that describe an allocated array. This array is often refered to as the heap.
It is made up of heap_slots number of elements, refered to as a slot. Each of these slots is RVALUE sized. In other words, each slot is an empty object-sized block of memory that has been set to zero.

The last sentence above is important. These slots will eventully become Ruby Objects!

The three elements the heap_slots structure are:

1) *membase - Points to the origin of the allocated memory block.
2) *slot    - The first properly "aligned" slot location in the heap.
              (*membase and *slot MAY or MAY NOT be the same address)
3) limit    - The total number of available "slots" in the heap.

(Heapitems)

Figure 11: Relationship between the heaps_slot structure and the heap it describes

The slot value is necessary because the starting address of the heap must be evenly divisible by the size of RVALUE. When this is not the case, the nearest evenly divisible address thats greater than membase is used.

The heaps structure is a array of heaps_slot elements. The variables
that control it’s usage are:

1) *heaps       - Base address  of an array of <b>heaps_slot</b> elements.
2) heaps_length - The number of elements in the <b>heaps</b> array.
3) heaps_used   - The number of elements in <b>heaps</b> array being used.

(Heapitems)

Figure 12: The complete heaps structure

Freelist

The freelist is the repository of object-sized memory blocks that will be
used to create new objects. The list is created or extended by the add_heap()
function by linking all the elements in a new heap into the
freelist.

Also remember, as objects are reclaimed during the sweep phase these objects are also added to the *freelist.

This accomplished by steping through each element in the new heap and perfoming the following actions until the array of new elements is exhausted.


1) The heap element is referenced through the as.free structure.
2) The as.flags word is zero indicating it is a empty object, as.next
is set to the contents of the
freelist.
3) The *freelist is set to the address of the current heap element.

(*freelist linking)

Figure 13: Freelist construction order

When the Ruby GC sweeps memory each object that is not marked is added to the free list. Any memory referenced by the object is simply freed and returned to free stote. The as.free entry is used to construct the freelist. The as.flags word is cleared to indicate that the object is “free”, and the as.next entry is used to link the object into the *freelist as shown above.

Add_heap ()

▼ Add_heap () (Concise edition)


Static void
add_heap()
{
RVALUE *p, *pend;

// Section 1 — Extend the length of Heaps Array (if necessary)
//
if (heaps_used == heaps_length) {
/* Realloc heaps */
struct heaps_slot *p;
int length;

heaps_length += HEAPS_INCREMENT; length = heaps_length*sizeof(struct heaps_slot); RUBY_CRITICAL( if (heaps_used > 0) { p = (struct heaps_slot *)realloc(heaps, length); if (p) heaps = p; } else { p = heaps = (struct heaps_slot *)malloc(length); }); if (p == 0) rb_memerror(); }

// Section 2 – Create new Heap Array and compute Heap_slots values
//
for (;;) {
RUBY_CRITICAL(p =
(RVALUE*)malloc(sizeof(RVALUE)(heap_slots+1)));
if (p == 0) {
if (heap_slots == HEAP_MIN_SLOTS) {
rb_memerror();
}
heap_slots = HEAP_MIN_SLOTS;
continue;
}
heaps[heaps_used].membase = p;
if ((VALUE)p % sizeof(RVALUE) == 0)
heap_slots += 1;
else
p = (RVALUE
)((VALUE)p + sizeof(RVALUE) -
((VALUE)p % sizeof(RVALUE)));
heaps[heaps_used].slot = p;
heaps[heaps_used].limit = heap_slots;
break;
}

// Section 3 – Compute Begin/End of Heap and Adjust Parameters
//
pend = p + heap_slots;
if (lomem == 0 || lomem > p) lomem = p;
if (himem < pend) himem = pend;
heaps_used++;
heap_slots *= 1.8;
if (heap_slots <= 0) heap_slots = HEAP_MIN_SLOTS;

// Section 4 – Add all new Slots to *freelist
//
while (p < pend) {
p→as.free.flags = 0;
p→as.free.next = freelist;
freelist = p;
p++;
}
}

(gc.c)

The following discussion covers the sections noted in the code block above.

Section 1:

When the initial allotted elements in the heaps array has been exhausted, the
array must be expanded to hold additional elements. The special case of initialization
is handled by the same code, since the size heaps_used and heaps_length are both initially zero.

The length of the heaps array is initially set to the value of the HEAP_MIN_SLOTS constant. Thereafter, it is exapanded by the same value.

Section 2:

This section attempts to allocate a new heap. If allocation fails and the heap_slots value is greater than the HEAP_MIN_SLOTS constant value, an attempt is made to to allocate just HEAP_MIN_SLOTS. If this also fails, Ruby aborts with memory failure.

If memory allocation is sucessful, the values for the heaps_slot element are computed. The membase value is the address returned by the malloc function. If this value can not be evenly divided by the size of RVALUE, the next heighest address that is evenly divisable is used for the slot value. Otherwise, the membase value is used for slot.

Section 3:

This section computes the beginning and end addresses for the new heap. Next, the number of heaps_used is incremented. Finally, heap_slots variable is multiplied by the constant ‘1.8’. This means that each new heap will be ‘1.8’ times larger that the last. This is done to reduce the number times that the GC is called for programs with large numbers of objects.

Section 4:

The last step is to link each element of the heap into the *freelist. This provides Ruby with a large number of new “empty” objects to be used.

Rb_newobj ()

▼ rb_newobj ()

VALUE
rb_newobj()
{
VALUE obj;

if (!freelist) garbage_collect(); obj = (VALUE)freelist; freelist = freelist→as.free.next; MEMZEROobj, RVALUE, 1); return obj;

}

(gc.c)

When Ruby needs to create a new object, the rb_newobj() function is called. This function performs the following basic operations:

1) If the <b>*freelist</b> is empty, garbage collection is performed.
2) A new object is extracted from the <b>*freelist</b>.
3) The new object is cleared and returned to the caller.

Marking

This process is performed by marking that following objects and their descendants:

1) global variables and constants in Ruby.
2) live local variables in Ruby
3) the machine stack
4) objects referenced by C global variables
   registerd by rb_gc_register_address()

Rb_gc_mark ()

rb_gc_mark() – Simply Calls gc_mark().
void
rb_gc_mark(ptr)
    VALUE ptr;
{
    gc_mark(ptr, 0);
}
The separation of   rb_gc_mark()  into two functions is a result of
<i>refactoring</i>. This allows the core processing functions of gc_mark() to
be modified for reuse in different contexts.  This is example of the
continuing work being done to not only add new functionality,  but to
improve the underlying code of the Ruby Interpreter.
gc_mark() – Recursively marks all live objects
static void
gc_mark(ptr, lev)
    VALUE ptr;
    int lev;
{
    register RVALUE *obj;

    obj = RANY(ptr);
    if (rb_special_const_p(ptr)) return; /* special const not marked */
    if (obj->as.basic.flags == 0) return;       /* free cell */
    if (obj->as.basic.flags & FL_MARK) return;  /* already marked */
    obj->as.basic.flags |= FL_MARK;

    if (lev > GC_LEVEL_MAX || (lev == 0 && ruby_stack_check())) {
      if (!mark_stack_overflow) {
        if (mark_stack_ptr - mark_stack < MARK_STACK_MAX) {
          *mark_stack_ptr = ptr;
          mark_stack_ptr++;
        }
        else {
         mark_stack_overflow = 1;
        }
      }
      return;
   }
   gc_mark_children(ptr, lev+1);
}

The RANY definition allows the reference of any object. For this particular function, it’s purpose to reference the flags word of each object processed. It’s definition
is shown below:

    #define RANY(o) ((RVALUE*)(o))

(gc.c)

If the object is a special constant, already marked, or a ‘free’ object no further processing is performed. Otherwise, the object is marked and further processing is required.

After the object being marked, a bit of special processing takes place. For purposes of program stability, the depth of recursion during object tracing is limited. When this depth is reached, an attempt is made to place the current object pointer onto a special axuillary stack, the mark_stack.

If the mark stack is full, the the mark_stack_overflow marker is checked. If it is set, when the marking phase is finished, all viable objects in all the heaps are marked. This is necessary to insure that all ‘live’ objects will be marked!

If the mark_stack_overflow marker is not set, then all the objects on the mark_stack are passed to gc_mark_children. When complete, the Ruby GC starts the sweep phase.

gc_mark_children ()

The following is a abbreviated version of the source code for Gc_mark_children(). The code contains two main sections. The first processes the T_NODE Objects. The second processes the standard Object Nodes.

▼ gc_mark_children ()

static void
gc_mark_children(ptr, lev)
VALUE ptr;
int lev;
{
#ifdef ENABLE_DBG
if (mute_flag != 0) {
printf(“ gc_mark_children(…) at (923)\n”);
}
#endif
register RVALUE *obj = RANY;

goto marking; /* skip */ again: obj = RANY; if (rb_special_const_p(ptr)) return; /* special const not marked */ if (obj→as.basic.flags == 0) return; /* free cell */ if (obj→as.basic.flags & FL_MARK) return; /* already marked */ obj→as.basic.flags |= FL_MARK; marking: if (FL_TEST(obj, FL_EXIVAR)) { rb_mark_generic_ivar(ptr); } switch (obj→as.basic.flags & T_MASK) { case T_NIL: case T_FIXNUM: rb_bug(“rb_gc_mark() called for broken object”); break; case T_NODE: mark_source_filename(obj→as.node.nd_file); switch (nd_type(obj)) { case NODE_IF: /* 1,2,3 */ case NODE_FOR: case NODE_ITER: /*………… abbreviation………… */ } return; /* no need to mark class. */ } gc_mark(obj→as.basic.klass, lev); switch (obj→as.basic.flags & T_MASK) { case T_ICLASS: case T_CLASS: case T_MODULE: mark_tbl(obj→as.klass.m_tbl, lev); mark_tbl(obj→as.klass.iv_tbl, lev); ptr = obj→as.klass.super; goto again; case T_ARRAY: if (FL_TEST(obj, ELTS_SHARED)) { ptr = obj→as.array.aux.shared; goto again; } else { long i, len = obj→as.array.len; VALUE *ptr = obj→as.array.ptr; for (i=0; i < len; i++) { gc_mark(*ptr++, lev); } } break; /*………… abbreviation………… */ default: rb_bug(“rb_gc_mark(): unknown data type 0x%x(0x%x) %s”, obj→as.basic.flags & T_MASK, obj, is_pointer_to_heap(obj) ? “corrupted object” : “non object”); }

}

(gc.c)

Gc_mark_children() tries to find all the objects that can be reached from the input object. As we will see in this and following sections, this is a endeavor that involves detailed processing. What do we mean? Consider the following simple array: “ary = [ 1, 2]”

(node tree)

Figure 14: Objects for “ary = [ 1 ;, 2 ]”

The Gray structures(See Below) indicate the extent of the basic input object, that is the part of the object that has already been marked. This is the object that was processed by gc_mark(). It has been passed to this routine to do what? To mark all of it’s childern! The array structure itself is not marked, it is considered part of the input object and it will deleted when it’s object is destroyed. This is true of all auxillary memory used by objects.

The blue structures are the children of the input object, They are the reachable objects. These are the objects that gc_mark_children() will Mark if they are not already marked.

As you can see, these are the constructs that represent the objects and data created by the programmer. However, there are another set of objects supported by the Ruby Interpreter. These are T_NODES. These are the objects that represent the program code inside Ruby. Each source line is converted into a tree of T_NODES.

T_NODES will be discussed in some detail in Secton II. For now, T_NODES are shown just to illustrate the processing in first section of gc_mark_children(). The following T_NODE Tree is created for the source line “x = 5 + 1”:

(tnode tree)

Figure 15: T_NODES Generated for “x = 5 + 1”

As before the gray structures illustrate the input object processed by gc_mark() and the Blue structures represent the objects that are “reachable” by gc_mark_children().

This pairing of gc_mark and gc_mark_children() work recursively to trace
the objects. The trace depth variable lev and the mark_stack work to
prevent run-away recursion. These features breaks very deep traces into smaller chucks to
ease processing. It is important to know that while these features sometimes fail,
the program does not crash. It simply marks ALL of the ojbects. While this mary appear
crude, it will allow Ruby to continue to run until the processor memory is exhausted.

In the case of T_DATA Objects, if a Marking Function is present, it is executed. Ruby has no way of knowing what the structure of a T_DATA Object represents, thus we can not process this object any further.

 case T_DATA:
   if (obj-> As.Data.Dmark) (*obj-> As.Data.Dmark) (DATA_PTR (obj));
 break;

(gc.c)

Rb_gc ()

The function rb_gc() simply calls the function garbage_collect() and on completion calls rb_gc_finalize_deferred().



void
rb_gc()
{
garbage_collect();
rb_gc_finalize_deferred();
}

(gc.c)

The function garbage_collect() attempts to reclaim the space allocated to non-live objects before adding another heap. The reclaimation process is performed by two main tasks. The first task is performed by calling gc_mark() for the root set of objects. These are:

  1. global variables and constants in Ruby.
  2. live local variables in Ruby
  3. the machine stack
  4. objects referenced by C global variables registerd by rb_gc_register_address()

Second, the function gc_sweep is then called called to reclaim all objects that were not marked by the marking functions. That is, all objects that were not reachable from the root set of objects.

The function rb_gc_finalize_deferred() is called to perform any actions that have be performed before an object is destroyed. There three instances when this must be done.

  1. When an object was placed on the deferred_final_list by calling the method ObjectSpace.define_finalizer, This method stores a proc that rb_gc_finalize_deferred() will execute before deleting the target object.
  2. When a T_DATA Object has a dfree function specified.
  3. When a T_FILE Object is destroyed, any open I/O channels must be closed.

Ruby stack and object lists

The following section of code, from garbage_collect(), traces objects found in Ruby stack structures and certain lists of objects. These include Local Variables, Nodes in Stack Frames, Scope Objects, in-block Variables, and any finalizer objects.

   gc_mark((VALUE)ruby_current_node, 0);

    /* mark frame stack */
    for (frame = ruby_frame; frame; frame = frame->prev) {
  rb_gc_mark_frame(frame);
  if (frame->tmp) {
      struct FRAME *tmp = frame->tmp;
      while (tmp) {
    rb_gc_mark_frame(tmp);
    tmp = tmp->prev;
      }
  }
    }
    gc_mark((VALUE)ruby_scope, 0);
    gc_mark((VALUE)ruby_dyna_vars, 0);
    if (finalizer_table) {
  mark_tbl(finalizer_table, 0);
    }

(gc.c)

The variable ruby_current_node points to the current node on the Abstract Syntax Tree(AST). As usual, marking this node also marks all nodes reachable from this node.

Each Stack Frame contains a pointer to an AST Node, which is marked! Additionally, a Stack Frame may contain a sub-list of Stack Frames, which also have their Nodes marked.

Marking ruby_scope marks all local variables in the current Scope.

Marking ruby_dyna_vars marks all in-block Variables in the current Scope.

Finally, if a finalizer_table is present, all objects contained in the table are marked.

Registers and the Stack

The GC also has to worry about objects referenced from machine registers and the stack. For registers, this presents a problem for the interpreter. When you attempt to examine register contents, knowledge of the underlying architecture is required. This presents some difficulties. This will be obvious when examining the code. For the purposes of this document, will examine only the X86 Processor.

The following code example is for the X86 Processor ( both 32 & 64 bit). The FLUSH_REGISTER WINDOWS definition is not used by the X86. A setjmp() function is used to store the machine registers in a known location. Then function mark_locationsarray() then marks any objects referenced from the saved registers. This function will be discussed shortly.

Now the registers have been marked, attempt is made to find all apparent object references on the full stack. These references unlike others are more tentative in nature. All we know about the references on the stack is that they point an object in an active heap. When found, these objects are marked.

The only problem we have is that the object may be a “dead” object. We just don’t know for certain! The reference may never be used. This is yet another example of why GC is called a conservative garbage collector. When we are uncertain about whether an object is alive or dead, we err on the side of caution and mark it!

    FLUSH_REGISTER_WINDOWS;     /* "((void)0)" for X86 Architecture */

    setjmp(save_regs_gc_mark);  /* Save registers in local array */

    mark_locations_array((VALUE*)save_regs_gc_mark,
                     sizeof(save_regs_gc_mark) / sizeof(VALUE *));

    #if STACK_GROW_DIRECTION < 0
       rb_gc_mark_locations((VALUE*)STACK_END, rb_gc_stack_start);  ??

    ( code for other architectures )

(gc.c)
#ifdef __GNUC__
#if defined(__human68k__) || defined(DJGPP)
#if defined(__human68k__)
typedef unsigned long rb_jmp_buf[8];
__asm__ (".even\n\
_rb_setjmp:\n\
	move.l	4(sp),a0\n\
	movem.l	d3-d7/a3-a5,(a0)\n\
	moveq.l	#0,d0\n\
	rts");
#ifdef setjmp
#undef setjmp
#endif
#else
#if defined(DJGPP)
typedef unsigned long rb_jmp_buf[6];
__asm__ (".align 4\n\
_rb_setjmp:\n\
	pushl	%ebp\n\
	movl	%esp,%ebp\n\
	movl	8(%ebp),%ebp\n\
	movl	%eax,(%ebp)\n\
	movl	%ebx,4(%ebp)\n\
	movl	%ecx,8(%ebp)\n\
	movl	%edx,12(%ebp)\n\
	movl	%esi,16(%ebp)\n\
	movl	%edi,20(%ebp)\n\
	popl	%ebp\n\
	xorl	%eax,%eax\n\
	ret");
#endif
#endif
int rb_setjmp (rb_jmp_buf);
#define jmp_buf rb_jmp_buf
#define setjmp rb_setjmp
#endif /* __human68k__ or DJGPP */
#endif /* __GNUC__ */

(gc.c)</a>

Mark_locations_array ()

This function purpose is to step through an array a possible VALUE words and determine
if they point to a object in a valid heap. Each possible VALUE is presented to the function is_pointer_to_heap(). If the function returns true the object is marked by calling gc_mark().

As discussed in previous sections, these arrays have been stored registers and the stack.

mark_locations_array(x, n)
    register VALUE *x;
    register long n;
{
    VALUE v;
    while (n--) {
        v = *x;
        if (is_pointer_to_heap((void *)v)) {
            gc_mark(v, 0);
        }
    x++;
    }
}

(gc.c)

Is_pointer_to_heap ()

This function given a possible object address, determines if it is a valid address. We know that all objects are allocated from one or more heaps in memory. If the given pointer falls within the address range of a valid heap, then it is an object.

However, just falling with the proper range does not make is a “live” object. It simply indicates that it maybe a valid object. The object could easily be on the freelist, for example. This function only determines that it is valid object address.

static inline int
is_pointer_to_heap(ptr)
    void *ptr;
{
    register RVALUE *p = RANY(ptr);
    register RVALUE *heap_org;
    register long i;

    if (p < lomem || p > himem) return Qfalse;
    if ((VALUE)p % sizeof(RVALUE) != 0) return Qfalse;

    /* check if p looks like a pointer */
    for (i=0; i < heaps_used; i++) {
        heap_org = heaps[i].slot;
        if (heap_org <= p && p < heap_org + heaps[i].limit)
          return Qtrue;
    }
    return Qfalse;
}

(gc.c)

Register window

#if defined (sparc) || defined (sparc)

  1. if defined (linux) || defined (linux)
    #define FLUSH_REGISTER_WINDOWS asm (“ta 0×83”)
  2. else /* Solaris and not sparc linux *
    #define FLUSH_REGISTER_WINDOWS asm (“ta 0×03”)
  3. endif
    #else /* Not a sparc *
    #define FLUSH_REGISTER_WINDOWS
    #endif

(defines.h)

Machine stack

rb_gc_mark_locations (rb_gc_stack_start, (VALUE*) STACK_END);
#if defined (human68k)
rb_gc_mark_locations ((VALUE*) ((char*) rb_gc_stack_start

(VALUE*) ((char*) STACK_END + 2)); #endif

(gc.c)

Init_stack ()

This function initializes the stack parameters. As with some previous sections that involved machine level interactions, we will discuss specifics for the X86 processor only.

First, the starting address of the stack must be computed. First, if the specified parameter address contains zero, it will be filled with it’s own address.

Now the macro STACK_UPPER(x, y, z) is a piece of tricky business. Since stack direction and usage varies, this macro adjusts for it. For X86 processors the third parameter, z, is always used. Thus, the first usage of STACK_UPPER() executes the instruction ++addr. The second, executes the conditional rb_gc_stack_start &lt addr.

If the symbol HAVE_GETRLIMIT is defined, the code following will compute a “reasonable” stack size for the current processor. This is done by asking the processor for the current Stack Limit and using it to compute the STACK_LEVEL_MAX value. For the X86 Processor the initial value is set to 655300.

void
Init_stack(addr)
    VALUE *addr;
{
#ifdef __ia64
  ::       ::          ::
#endif

#if defined(_WIN32) || defined(__CYGWIN__)
  ::       ::          ::
#elif defined(STACK_END_ADDRESS)
  ::       ::          ::
#else
    if (!addr) addr = (VALUE *)&addr;   /* X86 Processing */
    STACK_UPPER(&addr, addr, ++addr);
    if (rb_gc_stack_start) {
  if (STACK_UPPER(&addr,
      rb_gc_stack_start > addr,
      rb_gc_stack_start < addr))
      rb_gc_stack_start = addr;
  return;
    }
    rb_gc_stack_start = addr;
#endif

#ifdef HAVE_GETRLIMIT                  /* X86 Processing */
    {
  struct rlimit rlim;

  if (getrlimit(RLIMIT_STACK, &rlim) == 0) {
      unsigned int space = rlim.rlim_cur/5;

      if (space > 1024*1024) space = 1024*1024;
      STACK_LEVEL_MAX = (rlim.rlim_cur - space) / sizeof(VALUE);
  }
    }
#endif
}


(gc.c)

STACK_END

Again, as we did previously, we have isolated the defines for the X86 processor. Other processors take slightly different tacks. The important issue here is that the two defines that “set” and “return” the Stack End Address are created properly for the current processor.

#ifdef C_ALLOCA
   ::         ::       ::
#else
# if defined(__GNUC__) &&
         defined(USE_BUILTIN_FRAME_ADDRESS) && !defined(__ia64)
   ::         ::       ::
   ::         ::       ::
# else                       /* Starting X86 Defines */
#  define  SET_STACK_END    VALUE *stack_end = alloca(1)
# endif
# define STACK_END (stack_end)
#endif

(gc.c)

The SET_STACK_END operation creates the pointer to the Stack End Address by allocating one word on the current stack. The returned address is stored in the pointer location, stack_end.

    VALUE *stack_end = alloca(1)

When ever the define STACK_END is used, the value (stack_end) is inserted.

These methods of creating and returning the Stack End Address isolates all the processor differences in static defines rather than coding executable methods of processor differentiation. This is a common technique used throughout the Ruby Interpreter.

Rb_gc_mark_locations ()

This function searches the stack for pointers into active heaps. Checks are made to assure that the possible object appears to be an appropriate size and resides at valid heap location. At this point, we assume it is a “live” object. There is some chance that the object is already “dead”, but as a conservative collector we accept the occasional uncollected “dead” object.

The actual traversing of the stack and marking objects is performed by mark_locations_array(). The function rb_gc_mark_locations () only coverts the input parameters into a form that is acceptable to mark_locations_array().

void
rb_gc_mark_locations(start, end)
    VALUE *start, *end;
{
    long n;

    n = end - start;
    mark_locations_array(start,n);
}
(gc.c)

While traversing a stack is generally a processor specific function, rb_gc_mark_locations() avoids this problem by requiring the caller make adjustments for processor differences. The following code encompasses the various processor adjustments for calls to rb_gc_mark_locations().

 <b><i>garbage_collect():</i></b>  (continued)
    ::       ::       ::
    ::       ::       ::
#if STACK_GROW_DIRECTION < 0      /* X86 Processors */
    rb_gc_mark_locations((VALUE*)STACK_END, rb_gc_stack_start);

#elif STACK_GROW_DIRECTION > 0
    rb_gc_mark_locations(rb_gc_stack_start, (VALUE*)STACK_END + 1);

#else
    if ((VALUE*)STACK_END < rb_gc_stack_start)
       rb_gc_mark_locations((VALUE*)STACK_END, rb_gc_stack_start);
    else
       rb_gc_mark_locations(rb_gc_stack_start, (VALUE*)STACK_END + 1);
#endif

#ifdef __ia64
    /* mark backing store (flushed register window on the stack) */
    /* the basic idea from guile GC code                         */
    rb_gc_mark_locations(rb_gc_register_stack_start,
          (VALUE*)rb_ia64_bsp());
#endif

#if defined(__human68k__) || defined(__mc68000__)
    rb_gc_mark_locations((VALUE*)((char*)STACK_END + 2),
       (VALUE*)((char*)rb_gc_stack_start + 2));
#endif

(gc.c)

As an example the first two lines are for calling this rb_gc_mark_locations() on X86 Processors.

Notice that the “start” and “end” specifications are not necessarily the actual start and end addresses. These values are calculated for each processor so the stack is traversed in a positive direction, regardless of the actual stack orientation.

Other route objects

Lastly, there are a number of system tables and miscellaneous objects that are left to be marked. We will cover them in this section. As you can see below, this is the final marking phase before the sweep begins!



garbage_collect(): (continued)
:: :: ::
:: :: ::
rb_gc_mark_threads();

/* mark protected global variables */ for (list = global_List; list; list = list→next) { rb_gc_mark_maybe(*list→varptr); } rb_mark_end_proc(); rb_gc_mark_global_tbl(); rb_mark_tbl(rb_class_tbl); rb_gc_mark_trap_list(); /* mark generic instance variables for special constants */ rb_mark_generic_ivar_tbl(); rb_gc_mark_parser(); /* gc_mark objects whose marking are not completed*/ do { while (!MARK_STACK_EMPTY) { if (mark_stack_overflow){ gc_mark_all(); } else { gc_mark_rest(); } } rb_gc_abort_threads(); } while (!MARK_STACK_EMPTY); gc_sweep(); /* Performs the Sweep Phase of the Garbage Collector */ }

(gc.c)

rb_gc_mark_threads()
If threads are being used, this function will mark all active threads.

Global List and rb_gc_mark_maybe()
The global list contains pointers to both simple global address and objects. The global_list is scanned and each address is passed to rb_gc_mark_maybe(). This function determines if the pointer resides in an active heap, which means the address points at an object. All objects found on the global_list are marked.

rb_gc_mark_end_proc()
The handling of proc’s by eval.c evolves keeping various lists of proc objects. This function marks all the objects recorded in these lists.

rb_gc_mark_global_tbl()
Ruby maintains an internal hash table of all define global variables. This function marks each of these variables.

rb_gc_mark_trap_list()
Ruby keeps a list of signal traps for handling threads. This function marks the objects associated with these thread signals.

rb_gc_mark_generic_ivar_tbl()
When instance variables are created for special constants, they are stored in the generic_ivar_tbl. This is necessary because special constants exist only in the VALUE word and do not instantiate an actual object. All variable objects in this table are marked!

rb_gc_mark_parser()
This function marks the objects that hold the state of the parser.

Marking Objects with incomplete marking
The last marking sequence to complete is the marking of objects on the mark_stack. As stated previously, if the mark_stack has overflowed all objects will be marked. Otherwise, just the objects on the mark_stack are fed to gc_mark_children().

rb_gc_abort_threads()
At the completion of all marking, this function will terminate all unmarked threads.

Sweep

Special treatment of NODE Objects

For the sweep phase we will search for objects that are not marked and release them! However, there is a small problem with objects of type NODE if the parser’s semantic stack is not allocated on the machine stack. For configurations were the machine stack is not used, the following code is provided to mark those objects. Notice that even for these configurations, this is only a problem if the parser in compiling!

static void
gc_sweep()
{
 RVALUE *p, *pend, *final_list;
 int freed = 0;
 int i;
 unsigned long live = 0;
 unsigned long free_min = 0;
   ::       ::       ::    ::
   ::       ::       ::    ::
/* We should not reclaim nodes during compilation             */
/* if yacc's semantic stack is not allocated on machine stack */

 if (ruby_in_compile && ruby_parser_stack_on_heap()) {
   for (i = 0; i < heaps_used; i++) {
     p = heaps[i].slot; pend = p + heaps[i].limit;
     while (p < pend) {
       if (!(p->as.basic.flags&FL_MARK) && BUILTIN_TYPE(p) == T_NODE)
         gc_mark((VALUE)p, 0);
        p++;
        }
     }
   }
   ::       ::       ::    ::
   ::       ::       ::    ::

(gc.c)

Reclaiming “dead” Objects

The following section of gc_sweep() examines the active heaps for objects that are not marked! Unmarked, or “dead”, objects are first processed by obj_free(). This function releases any auxiliary memory that has been allocated for an object. These are things like the memory that holds the string for a string object. Other objects, such a float, are self-contained objects and are not modified.

This does not actually “free” an object, it simply reduces it to a bare, and probably an incomplete, object. It becomes free only when it is placed on the freelist. However, before it can legitimately be placed on that list, it must be checked for a finalizer.

The rest of the section is concerned with insuring that unnecessary heaps are reclaimed and that there are sufficient "free’ objects available. Care is taken not to reclaim an empty heap if the number of “free” objects is less than the computed value free_min. This value never less than the constant FREE_MIN (currently 4096). If however, after all reclamation has been complete the number of “free” objects is less than free_min, the function add_heap() performed to increase their number.

  <b><i>gc_sweep():</i></b>  (continued)
   ::       ::       ::    ::
   ::       ::       ::    ::
  freelist = 0;
  final_list = deferred_final_list;
  deferred_final_list = 0;

  for (i = 0; i < heaps_used; i++) {
    int n = 0;
    RVALUE *free = freelist;
    RVALUE *final = final_list;

    p = heaps[i].slot; pend = p + heaps[i].limit;
    while (p < pend) {
      if (!(p->as.basic.flags & FL_MARK)) {
        if (p->as.basic.flags) {
            obj_free((VALUE)p);
        }
        if (need_call_final && FL_TEST(p, FL_FINALIZE)) {
          p->as.free.flags = FL_MARK; /* remain marked */
          p->as.free.next = final_list;
          final_list = p;
        }
        else {
          p->as.free.flags = 0;
          p->as.free.next = freelist;
          freelist = p;
        }
        n++;
      }
      else if (RBASIC(p)->flags == FL_MARK) {
        /* objects to be finalized */
        /* do nothing remain marked */
      }
      else {
        RBASIC(p)->flags &= ~FL_MARK;
        live++;
      }
      p++;
    }
    if (n == heaps[i].limit && freed > free_min) {
      RVALUE *pp;

      heaps[i].limit = 0;
      for (pp = final_list; pp != final; pp = pp->as.free.next) {
          pp->as.free.flags |= FL_SINGLETON; /* freeing page mark */
      }
      freelist = free;  /* cancel this page from freelist */
    }
    else {
      freed += n;
    }
  }
  if (malloc_increase > malloc_limit) {
      malloc_limit += (malloc_increase - malloc_limit) * (double)live / (live + freed);
      if (malloc_limit < GC_MALLOC_LIMIT) malloc_limit = GC_MALLOC_LIMIT;
  }
  malloc_increase = 0;
  if (freed < free_min) {
      add_heap();
  }
   ::       ::       ::    ::
   ::       ::       ::    ::

(gc.c)

Finalizers

Ruby supplies two methods for managing finalization processing for objects. They are:
1) ObjectSpace.define_finalizer
2) ObjectSpace.undefine_finalizer

Defined finalizers are maintained in the hash table finalizer_table. This table is indexed by the Object’s VALUE and contains a reference to a proc block.

Unfortunately, two more depreciated methods must also be handled. They are:
1) ObjectSpace.add_finalizer
2) ObjectSpace.remove_finalizer

These finalizers are held in the table finalizers. This table is indexed by the Object’s ID and contains a reference to a proc block. As noted these are depreciated methods and are meant to be eventually removed.

Let us look closer at the section above responsible for finalizer processing(shown below). When the first define_finalizer() method is executed, the variable need_call_final is set to one(1). This enables the checking for finalizers. Surprisingly, the flag is never cleared once set. Even if all finalizers are undefined.
   ::       ::       ::    ::
   if (need_call_final && FL_TEST(p, FL_FINALIZE)) {
     p->as.free.flags = FL_MARK; /* remain marked */
     p->as.free.next = final_list;
     final_list = p;
   }
   else {
     p->as.free.flags = 0;
     p->as.free.next = freelist;
     freelist = p;
   }
   ::       ::       ::    ::

Remember at this point in the processing the current object is unmarked!. If the FL_FINALIZE Flag is set for the current object, the object is placed on the final_list and marked. For those that are wondering why the object is being marked again, it must now persist in it’s new state until the proc associated with the finalizer is actually executed.

If FL_FINALIZE is not set, the object is placed on the free_list.

Finalizers are only executed at the completion of garbage collection and at program exit.

Run_final

The function run_final(), as shown below, handles executing the finalizer proc’s.

static void
run_final(obj)
    VALUE obj;
{
    long i;
    int status, critical_save = rb_thread_critical;
    VALUE args[3], table, objid;

    objid = rb_obj_id(obj); /* make obj into id */
       ::       ::       ::    ::
       ::       ::       ::    ::
     /* Code for Depreciated 'add_final' method */
       ::       ::       ::    ::
       ::       ::       ::    ::
    if (finalizer_table && st_delete(finalizer_table,
                  (st_data_t*)&obj, &table)) {
      for (i=0; i<RARRAY(table)->len; i++) {
        VALUE final = RARRAY(table)->ptr[i];
        args[0] = RARRAY(final)->ptr[1];
        if (!args[1]) args[1] = rb_ary_new3(1, objid);
        args[2] = FIX2INT(RARRAY(final)->ptr[0]);
        rb_protect((VALUE(*)_((VALUE)))run_single_final,
                   (VALUE)args, &status);
      }
    }
    rb_thread_critical = critical_save;
}

(gc.c)

This function processes the finalizer_table created by the define_finalizer() method. The finalizer parameters are referenced through the hash table finalizer_table. The hash key is the target Object’s VALUE Pointer and hash value is a VALUE pointer to the Block Array.

The Block Array contains one VALUE Pointer to a Command Array for each define_finalizer() method call applied to a specific object. If define_finalizer() was called twice for the same object, the Block Array will contain two entries.

Each VALUE pointer in the Block Array references an array containing three entries, cmd, args, and safe_level. The proc object is executed by feeding these three entries to the rb_eval_cmd() function.

(finalizer structure)

Figure 16: Finalizer Structure

Rb_gc_force_recycle ()

There comes a time when you wish to kill an object with extreme prejudice! Not at some undetermined time later, now!. While this facility is not available the user, it is available to internal routines where the consequences are determinate. In any case, rb_gc_force_recycle() will clear the objects flags(i.e. effectively deleting the object) and places the object on the freelist!.

A word of warning though, rb_gc_force_recycle() does not release any memory allocated by the object. Failure to take this into account will lead to memory leaks!

void
rb_gc_force_recycle(p)
    VALUE p;
{
    RANY(p)->as.free.flags = 0;
    RANY(p)->as.free.next = freelist;
    freelist = RANY(p);
}

(gc.c)

Timing of starting

Inside gc.c

Where does Garbage Collection start? There are three primary places:

1.  Ruby_xmalloc()
2.  Ruby_xrealloc()
3.  Rb_newobj()

These are the functions that allocate space for objects, both the object itself and any axillary space needed. Axillary space consists of things like space for strings and other data structures.

The functions Ruby_xmalloc() and Ruby_xrealloc() are used by a variety of internal functions to process requests for additional memory. When a memory allocation request fails, these functions call garbage_collect() to free up memory. A second failure will cause Ruby to generate a memory allocation error and exit.

The function rb_newobj() is used to create new objects. When there are no objects on the freelist, the function garbage_collect() is run to free up the space allocated to “dead” objects. When this does not create enough free objects, a new heap is created. As above, failure to allocate memory will cause Ruby to print an error and exit.

Inside interpreter

Inside the interpreter there are numerous places where memory must be allocated, such as opening files. In these and other cases when memory is required, the function rb_gc() is called and this function calls garbage_collect().

The Ruby statements GC.start, gc.garbage_collect, and ObjectSpace.garbage_collect all call rb_gc() directly.

Formation of object

The story of Garbage Collection has ended. We have discussed the reclamation of dead objects. We now need to further discuss the formation of objects. While this was done to some degree during the previous chapter for Classes, we will continue the discussion in the following sections.

Allocation framework

We have covered the formation of several types of objects. Here we tackle the creation of a class instance with the ‘new’ method.

    ::     ::   ::
  class C
  end

  C.new()
    ::     ::   ::

(<somefile&gt.c)

We can guess that C.new() will create an object.

But how is this accomplished? With most languages, this sort of operation would be handled internally by the interpreter. In Ruby, this and many other operations are handled by predefined methods. As part of the initialization of the Ruby interpreter, all the built-in objects and their methods are created before interpretation begins. The following is a portion for the initialization code for Objects.


void

Init_Object()
{
:: :: ::
rb_define_alloc_func(rb_cObject, rb_class_allocate_instance);
:: :: ::
rb_define_method(rb_cClass, “new”, rb_class_new_instance, -1);
:: :: ::
}

(object.c)

Let’s set aside the first statement for minute and consider the second statement. This statement calls a function that adds the method new() to the class Class. This means when a new() method is specified for a class, the function rb_class_new_instance() is executed.
VALUE
rb_class_new_instance(argc, argv, klass)
    int argc;
    VALUE *argv;
    VALUE klass;
{
    VALUE obj;

    obj = rb_obj_alloc(klass);
    rb_obj_call_init(obj, argc, argv);

    return obj;
}

(object.c)

How does the interpreter “know” which allocation function to use when creating an object? Since different objects can be allocated differently, then each type of object should have a way to define the allocation function. That method is called appropriately rb_define_alloc_func(). Notice that this is the first statement highlighted in Init_Object().

For class allocation, the allocation function rb_class_allocate_instance() is called by rb_obj_alloc()

static VALUE
rb_class_allocate_instance(klass)
VALUE klass;
{
NEWOBJ;
OBJSETUP;
return (VALUE)obj;
}

(object.c)

This allocation function, and many others, makes use of the macro’s NEWOBJ and OBJSETUP to create an appropriate object structure and install the specified values.

 #define NEWOBJ (obj and type) type *obj = (type*) rb_newobj ()

 #define OBJSETUP (obj, c, t) do {\
   RBASIC (obj) -> Flags = (t); \
   RBASIC (obj) -> Klass = (c); \
   if (rb_safe_level () > = 3) FL_SET (obj and FL_TAINT);\
  While (0)

(ruby.h)

The NEWOBJ macro calls rb_newobj(), which fetches the next RVALUE slot off the
freelist. Additionally, it sets the entire RVALUE block to zero.

The OBJSETUP macro sets the Klass and Flags words in the object. It also sets the FL_TAINT flag if necessary.

Returning to the discussion of rb_class_new_instance() above. When control returns from rb_obj_alloc(), the new object is initialized by calling rb_obj_call_init(). For a new class object this entails no further action and the function rb_obj_dummy() is called.

Creating a class object using the new() command involves only the following generic steps:

SomeClass.New = Class#new (rb_class_new_instance)
    SomeClass.Allocate = Class#allocate (rb_class_allocate_instance)
    SomeClass#initialize = Object#initialize (rb_obj_dummy)

Formation of user definition objects

This section is concerned with the formation of user defined objects with the extended library. The extended library API contains three functions for handling these type of objects.

1. Data_Wrap_Struct - Wraps a value/structure in a Ruby RData Object
2. Data_Make_Struct - Allocates Memory for a 'c-type' and wraps it
3. Data_Get_Struct  - Returns Pointer to the wrapped structure
</pre

<img src="images/ch_rdata_object.png" alt="(RData Structure)"><br>
<br>Figure 17: RData Structure


The <b>RDATA</b>  structure is the ruby object structure used to 'wrap' <i>user defined</i>  objects.   Other than the standard <b>basic</b>  structure, the three other items are used as follows:


<b>*dmark</b>  points to a function for <i>marking</i>  Ruby Object references.   This is only necessary if your structure contains references to Ruby objects.   If so,  you must supply a function (pointed to by <b>*dmark</b>)  that will call <b>rb_gc_mark()</b>  for each reference to a Ruby Object.   If no function is needed,   then <b>*dmark</b>  may be set to zero.<br><br>
<b>*free</b>  points to a user function to free the user defined objects,  the system function <i>free()</i>,   or a '-1'  to just to free the memory pointed to by <b>*data_ptr</b>.   Note that both specifying <i>free()</i>  and '-1' accomplish the same thing,  the choice is up to the user.<BR><BR>
<b>*data_ptr</b>  points at the user defined structure 'wrapped' by the <b>RDATA</b>  Object.


h4.  Data_Wrap_Struct()   and   Data_Make_Struct()
<BR>
The <b>Data_Wrap_Struct()</b>  function is called as follows:
<pre class="graylist">
<font size=-1>Wrap_Data_Struct( VALUE class, void (*dmark)(), void (*free)(), *ptr )</font>

The purpose of Data_Wrap_Struct() is to encapsolate a user defined value or structure in a Ruby RData Object. Let us look at a simple example:

typedef struct adot {
  int horz;
  int vert;
} ADot;

ADot *ptr;
VALUE info

ptr = ALLOC(ADot);
ptr->horz = 0;
ptr->vert = 0;

info = Data_Wrap_Struct( cDots,  0,  free,  ptr);

Though not shown here, this code refers on a user defined class named cDots. This class is not particularly important beyond the need for a specific class to be specified.

Since the encapsulated structure does not refer to any other Ruby Objects, a *dmark function pointer does not need to be specified. Further, because the structure does not have references to other user defined allocations, the method free is all that is required for freeing the ADot structure. The value ‘-1’ could have been used instead of specifying free and produced the same result.

As you can guess, the ptr value points a the starting address of a ADot structure.

In conclusion, it returns a VALUE pointer to a RData object.

The Data_Make_Struct() function is called as follows:


Data_Make_Struct( VALUE class, c-type, void (dmark)(), void (free)(), (c-type) *ptr )

The primary difference between these functions is that Data_Make_Struct() performs the allocation of the ‘c’ data structure for you. Presented here is another version of the previous example. It uses Data_Make_Struct() instead of Data_Wrap Struct().

typedef struct adot {
  int horz;
  int vert;
} ADot;

ADot *ptr;
VALUE info

info = Data_Wrap_Struct( cDots,  ADot,   0,  free,  ptr);

ptr->horz = 0;
ptr->vert = 0;

While the use of ptr in this case to initialize the structure is valid, you should generally use the method Data_Get_Struct() to fetch the address of the encapsulated structure.

The definitions for Data_Wrap_Struct and Data_Make_Structare defined as macros.

<font size=-1>
/*
#define RUBY_DATA_FUNC(func) ((void (*)_((void*)))func)
*/
typedef void (*RUBY_DATA_FUNC) _((void*));

VALUE rb_data_object_alloc _((VALUE,void*,RUBY_DATA_FUNC,RUBY_DATA_FUNC));

#define Data_Wrap_Struct(klass,mark,free,sval)\
    rb_data_object_alloc(klass,sval,(RUBY_DATA_FUNC)mark,(RUBY_DATA_FUNC)free)

#define Data_Make_Struct(klass,type,mark,free,sval) (\
    sval = ALLOC(type),\
    memset(sval, 0, sizeof(type)),\
    Data_Wrap_Struct(klass,mark,free,sval)\
)

(ruby.h)
</font>

The function Data_Make_Struct uses Data_Wrap_Struct to actually create the desired RData Object. So in both cases that actual creation of the object is accomplished with the function rb_data_object_alloc()

VALUE rb_data_object_alloc(klass, datap, dmark, dfree)
    VALUE klass;
    void *datap;
    RUBY_DATA_FUNC dmark;
    RUBY_DATA_FUNC dfree;
{
    NEWOBJ(data, struct RData);
    if (klass) Check_Type(klass, T_CLASS);
    OBJSETUP(data, klass, T_DATA);
    data->data = datap;
    data->dfree = dfree;
    data->dmark = dmark;

    return (VALUE)data;
}

(gc.c)

Data_Get_Struct ()

The Data_Get_Struct() function is called as follows:


Data_Get_Struct( VALUE object, c-type, (c-type) *ptr )

This function returns a c-type pointer to the value or structure embedded in a RData Object. The definition for Data_Get_Struct is a macro and it’s definition is as follows:

#define Data_Get_Struct(obj,type,sval) do {\
    Check_Type(obj, T_DATA); \
    sval = (type*)DATA_PTR(obj);\
} while (0)

(ruby.h)

The following is an example of how to retrieve data from a RData object. In this case we examine the data stored in the examples above:

typedef struct adot {
  int horz;
  int vert;
} ADot;

ADot *sval;
VALUE info;

int HORZ;
int VERT;

Data_Get_Struct( info, ADot, sval);

HORZ = sval->horz;
VERT = sval->vert;

The original work is Copyright © 2002 – 2004 Minero AOKI.
Translations, additions, and graphics by C.E. Thornton
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike2.5 License.