
Conservative Garbage Collector Porting Directions

The collector is designed to be relatively easy to port, but is not portable code per se. The collector inherently has to perform operations, such as scanning the stack(s), that are not possible in portable C code.

All of the following assumes that the collector is being ported to a byte-addressable 32- or 64-bit machine. Currently all successful ports to 64-bit machines involve LP64 targets. The code base includes some provisions for P64 targets (notably Win64), but that has not been tested. You are hereby discouraged from attempting a port to non-byte-addressable, or 8-bit, or 16-bit machines.
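
A port can check these assumptions before anything else is attempted. The following is only a minimal sketch, not code from the collector; the enum and function names are hypothetical, and "P64" is used here in the same sense as above (64-bit pointers with a narrower long).

```c
#include <limits.h>

/* Hypothetical names, for illustration only. */
enum sketch_data_model {
    SKETCH_MODEL_32,          /* 32-bit pointers */
    SKETCH_MODEL_LP64,        /* 64-bit pointers, 64-bit long */
    SKETCH_MODEL_P64,         /* 64-bit pointers, narrower long (e.g. Win64) */
    SKETCH_MODEL_UNSUPPORTED  /* anything else, e.g. 16-bit targets */
};

/* Classify the target by the width of pointers and of long. */
enum sketch_data_model sketch_classify_data_model(void)
{
    const unsigned ptr_bits = (unsigned)(sizeof(void *) * CHAR_BIT);
    const unsigned long_bits = (unsigned)(sizeof(long) * CHAR_BIT);

    if (ptr_bits == 32)
        return SKETCH_MODEL_32;
    if (ptr_bits == 64)
        return long_bits == 64 ? SKETCH_MODEL_LP64 : SKETCH_MODEL_P64;
    return SKETCH_MODEL_UNSUPPORTED;
}
```

A result of SKETCH_MODEL_UNSUPPORTED would suggest that the target falls outside what this document covers.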

The difficulty of porting the collector varies greatly depending on the needed functionality. In the simplest case, only some small additions are needed for the include/private/gcconfig.h file. This is described in the following section. Later sections discuss some of the optional features, which typically involve more porting effort.

Note that the collector makes heavy use of ifdefs. Unlike some other software projects, we have concluded repeatedly that this is preferable to system dependent files, with code duplicated between the files. However, to keep this manageable, we do strongly believe in indenting ifdefs correctly (for historical reasons usually without the leading sharp sign). (Separate source files are of course fine if they do not result in code duplication.)

Adding Platforms to gcconfig.h

If neither thread support nor tracing of dynamic library data is required, these are often the only changes you will need to make.

The gcconfig.h file consists of three sections:

  1. A section that defines GC-internal macros that identify the architecture (e.g. IA64 or I386) and operating system (e.g. LINUX or MSWIN32). This is usually done by testing predefined macros. By defining our own macros instead of using the predefined ones directly, we can impose a bit more consistency, and somewhat isolate ourselves from compiler differences. It is relatively straightforward to add a new entry here. But please try to be consistent with the existing code. In particular, 64-bit variants of 32-bit architectures generally are not treated as a new architecture. Instead we explicitly test for 64-bit-ness in the few places in which it matters. (The notable exception here is I386 and X86_64. This is partially historical, and partially justified by the fact that there are arguably more substantial architecture and ABI differences here than for RISC variants.) On GNU-based systems, cpp -dM empty_source_file.c seems to generate a set of predefined macros. On some other systems, the "verbose" compiler option may do so, or the manual page may list them.
  2. A section that defines a small number of platform-specific macros, which are then used directly by the collector. For simple ports, this is where most of the effort is required. We describe the macros below. This section contains a subsection for each architecture (enclosed in a suitable ifdef). Each subsection usually contains some architecture-dependent defines, followed by several sets of OS-dependent defines, again enclosed in ifdefs.

  3. A section that fills in defaults for some macros left undefined in the preceding section, and defines some other macros that rarely need adjustment for new platforms. You will typically not have to touch these. If you are porting to an OS that was previously completely unsupported, it is likely that you will need to add another clause to the definition of GET_MEM.
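
The three-section layout described above might look roughly like the following. This is a simplified sketch and not an excerpt from gcconfig.h: the SKETCH_ macros and the helper function are hypothetical, and the fallback values in the last section are an assumption made only so that the fragment compiles everywhere.

```c
/* Section 1: map compiler-predefined macros onto our own names. */
#if defined(__x86_64__) || defined(_M_X64)
# define X86_64
#elif defined(__i386__) || defined(_M_IX86)
# define I386
#endif
#if defined(__linux__)
# define LINUX
#endif

/* Section 2: one subsection per architecture, with nested, indented */
/* OS-dependent parts.                                               */
#ifdef X86_64
# define SKETCH_WORDSZ 64              /* hypothetical macro name */
# ifdef LINUX
#   define SKETCH_OS_NAME "LINUX"      /* hypothetical macro name */
# endif
#endif
#ifdef I386
# define SKETCH_WORDSZ 32
# ifdef LINUX
#   define SKETCH_OS_NAME "LINUX"
# endif
#endif

/* Section 3: defaults for anything left undefined above. */
#ifndef SKETCH_WORDSZ
# define SKETCH_WORDSZ ((int)(sizeof(void *) * 8))
#endif
#ifndef SKETCH_OS_NAME
# define SKETCH_OS_NAME "UNKNOWN"
#endif

/* Small accessors so the result of the configuration can be inspected. */
int sketch_word_size(void) { return SKETCH_WORDSZ; }
const char *sketch_os_name(void) { return SKETCH_OS_NAME; }
```

Adding a platform in this scheme means one new test in the first section and one new, suitably nested subsection in the second; the rest of the code only ever sees the collector's own macro names.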

The following macros must be defined correctly for each architecture and operating system:

Additional requirements for a basic port

In some cases, you may have to add additional platform-specific code to other files. A likely candidate is the implementation of GC_with_callee_saves_pushed in mach_dep.c. This ensures that register contents that the collector must trace from are copied to the stack. Typically this can be done portably, but on some platforms it may require assembly code, or just tweaking of conditional compilation tests.

For GC v7, if your platform supports getcontext, then defining the macro UNIX_LIKE for your OS in gcconfig.h (if it is not defined there yet) is likely to solve the problem. Otherwise, if you are using gcc, __builtin_unwind_init will be used, and should work fine. If that is not applicable either, the implementation will try to use setjmp. This will work if your setjmp implementation saves all possibly pointer-valued registers into the buffer, as opposed to trying to unwind the stack at longjmp time. The setjmp_test test tries to determine this, but often does not get it right.
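
The last two strategies can be sketched as follows. This is an illustration of the idea only, not the code in mach_dep.c; the function, callback type and variable names are hypothetical, and the getcontext variant is omitted.

```c
#include <setjmp.h>
#include <string.h>

/* Hypothetical callback type: receives an address near the current top */
/* of the stack, below which the saved registers can be found.          */
typedef void (*sketch_stack_fn)(void *approx_sp, void *client_data);

/* Written after the callback returns, so that the call cannot become a */
/* tail call and the frame holding the saved registers stays alive.     */
static void *volatile sketch_sink;

void sketch_with_callee_saves_pushed(sketch_stack_fn fn, void *client_data)
{
#if defined(__GNUC__)
    volatile int marker = 0;

    /* GCC builtin: forces callee-save registers to be spilled into */
    /* this frame.                                                  */
    __builtin_unwind_init();
    fn((void *)&marker, client_data);
    sketch_sink = (void *)&marker;
#else
    jmp_buf regs;

    /* Fallback: relies on setjmp storing every possibly pointer-valued */
    /* register in the buffer, which not all implementations do.        */
    memset(&regs, 0, sizeof(regs));
    (void)setjmp(regs);
    fn((void *)&regs, client_data);
    sketch_sink = (void *)&regs;
#endif
}

/* Example callback: simply records the address it was given. */
void sketch_record_sp(void *approx_sp, void *client_data)
{
    *(void **)client_data = approx_sp;
}
```

A stack scanner would be passed in place of sketch_record_sp and would scan from approx_sp towards the stack base.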

In GC v6.x versions of the collector, tracing of registers was more commonly handled with assembly code. In GC v7, this is generally to be avoided.

Most commonly os_dep.c will not require attention, but see below.

Thread support

Supporting threads requires that the collector be able to find and suspend all threads potentially accessing the garbage-collected heap, and locate any state associated with each thread that must be traced.

The functionality needed for thread support is generally implemented in one or more files specific to the particular thread interface. For example, somewhat portable pthread support is implemented in pthread_support.c and pthread_stop_world.c. The essential functionality consists of:

These very often require that the garbage collector maintain its own data structures to track active threads.

In addition, LOCK and UNLOCK must be implemented in gc_locks.h.

The easiest case is probably a new pthreads platform on which threads can be stopped with signals. In this case, the changes involve:

  1. Introducing a suitable GC_xxx_THREADS macro, which should be automatically defined by gc_config_macros.h in the right cases. It should also result in a definition of GC_PTHREADS, as for the existing cases.
  2. For GC v7, ensuring that the atomic_ops package at least minimally supports the platform. If incremental GC is needed, or if pthread locks do not perform adequately as the allocation lock, you will probably need to ensure that a sufficient atomic_ops port exists for the platform to provide an atomic test and set operation. The latest GC code can use GCC atomic intrinsics instead of the atomic_ops package (see include/private/gc_atomic_ops.h).
  3. Making any needed adjustments to pthread_stop_world.c and pthread_support.c. Ideally none should be needed. In fact, not all of this is as well standardized as one would like, and outright bugs requiring workarounds are common. Non-preemptive threads packages will probably require further work. Similarly thread-local allocation and parallel marking require further work in pthread_support.c, and may require better atomic_ops support.
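
To illustrate why an atomic test and set operation is enough to build an allocation lock, here is a minimal spin-lock sketch. It uses C11 <stdatomic.h> as a stand-in and is not the atomic_ops API or the contents of gc_locks.h; all SKETCH_ and sketch_ names are hypothetical.

```c
#include <stdatomic.h>

/* The lock itself: clear means free, set means held. */
static atomic_flag sketch_alloc_lock = ATOMIC_FLAG_INIT;

/* Spin until the test-and-set observes the flag as previously clear. */
#define SKETCH_LOCK() \
    do { \
        while (atomic_flag_test_and_set_explicit(&sketch_alloc_lock, \
                                                 memory_order_acquire)) { \
            /* busy-wait; a real lock would back off or yield here */ \
        } \
    } while (0)

#define SKETCH_UNLOCK() \
    atomic_flag_clear_explicit(&sketch_alloc_lock, memory_order_release)

/* Returns 1 if the lock was acquired, 0 if it was already held. */
int sketch_try_lock(void)
{
    return !atomic_flag_test_and_set_explicit(&sketch_alloc_lock,
                                              memory_order_acquire);
}

/* Example of a critical section protected by the lock. */
static long sketch_counter;

long sketch_locked_increment(void)
{
    long v;

    SKETCH_LOCK();
    v = ++sketch_counter;
    SKETCH_UNLOCK();
    return v;
}
```

A pure spin lock like this is only a starting point; whether it outperforms a pthread mutex depends heavily on the platform and on contention.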

Dynamic library support

So long as DATASTART and DATAEND are defined correctly, the collector will trace memory reachable from file scope or static variables defined as part of the main executable. This is sufficient if either the program is statically linked, or if pointers to the garbage-collected heap are never stored in non-stack variables defined in dynamic libraries.

If dynamic library data sections must also be traced, then:

Implementations that scan for writable data segments are error prone, particularly in the presence of threads. They frequently result in race conditions when threads exit and stacks disappear. They may also accidentally trace large regions of graphics memory, or mapped files. On at least one occasion they have been known to try to trace device memory that could not safely be read in the manner the GC wanted to read it.

It is usually safer to walk the dynamic linker data structure, especially if the linker exports an interface to do so. But beware of poorly documented locking behavior in this case.
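
As one example of such an exported interface, glibc-based systems provide dl_iterate_phdr in <link.h>. The sketch below assumes that environment and merely counts writable loadable segments; the sketch_ names are hypothetical, and a real port would register each segment as a root rather than count it.

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE   /* dl_iterate_phdr is a GNU extension in glibc */
#endif
#include <link.h>
#include <stddef.h>

/* Hypothetical accumulator for the walk. */
struct sketch_seg_stats {
    size_t n_segments;   /* number of writable PT_LOAD segments seen */
    size_t total_bytes;  /* sum of their in-memory sizes */
};

/* Called once per loaded object (main executable and each library). */
static int sketch_visit_object(struct dl_phdr_info *info, size_t size,
                               void *data)
{
    struct sketch_seg_stats *stats = data;
    int i;

    (void)size;
    for (i = 0; i < (int)info->dlpi_phnum; i++) {
        const ElfW(Phdr) *p = &info->dlpi_phdr[i];

        if (p->p_type == PT_LOAD && (p->p_flags & PF_W) != 0) {
            /* The segment occupies p_memsz bytes starting at       */
            /* info->dlpi_addr + p->p_vaddr; a collector would add  */
            /* that range to its root set here.                     */
            stats->n_segments++;
            stats->total_bytes += p->p_memsz;
        }
    }
    return 0;  /* zero tells dl_iterate_phdr to continue */
}

size_t sketch_count_writable_segments(void)
{
    struct sketch_seg_stats stats = { 0, 0 };

    dl_iterate_phdr(sketch_visit_object, &stats);
    return stats.n_segments;
}
```

The locking caveat above still applies: the callback runs while the dynamic linker holds its own lock, so it should not call anything that might load or unload a library.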

Incremental GC support

For incremental and generational collection to work, os_dep.c must contain a suitable virtual dirty bit implementation, which allows the collector to track which heap pages (assumed to be a multiple of the collector's block size) have been written during a certain time interval. The collector provides several implementations, which might be adapted. The default (DEFAULT_VDB) is a placeholder which treats all pages as having been written. This ensures correctness, but renders incremental and generational collection essentially useless.
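
The shape of such an interface can be sketched with a toy heap and explicit write notifications. This is not the code in os_dep.c: every name is hypothetical, the page size is an assumed 4 KiB, and a real implementation would learn about writes from the OS or from page protection faults rather than from an explicit call.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_LOG_PAGE 12   /* assumed 4 KiB pages */
#define SKETCH_N_PAGES  64

static unsigned char sketch_heap[SKETCH_N_PAGES << SKETCH_LOG_PAGE];
static unsigned char sketch_written[SKETCH_N_PAGES];   /* current interval */
static unsigned char sketch_snapshot[SKETCH_N_PAGES];  /* previous interval */

/* Address of the i-th page of the toy heap. */
void *sketch_heap_page(size_t i)
{
    return sketch_heap + (i << SKETCH_LOG_PAGE);
}

static size_t sketch_page_index(const void *p)
{
    return (size_t)((const unsigned char *)p - sketch_heap) >> SKETCH_LOG_PAGE;
}

/* Stand-in for whatever mechanism detects a write to a heap page. */
void sketch_note_write(const void *p)
{
    sketch_written[sketch_page_index(p)] = 1;
}

/* End the current interval: publish its dirty bits and start afresh. */
void sketch_read_dirty(void)
{
    memcpy(sketch_snapshot, sketch_written, sizeof(sketch_snapshot));
    memset(sketch_written, 0, sizeof(sketch_written));
}

/* Was the page containing p written during the last completed interval? */
int sketch_page_was_dirty(const void *p)
{
#ifdef SKETCH_DEFAULT_VDB
    (void)p;
    return 1;   /* placeholder behaviour: every page counts as written */
#else
    return sketch_snapshot[sketch_page_index(p)];
#endif
}
```

Compiling with SKETCH_DEFAULT_VDB defined shows why the placeholder is correct but unhelpful: the collector then has to rescan every page on each cycle.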

Stack traces for debug support

If stack traces in objects are needed for debug support, GC_save_callers and GC_print_callers must be implemented.
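
Where the C library offers backtrace and backtrace_symbols_fd (glibc's <execinfo.h> does), a pair of routines of this kind could be approximated as follows. The sketch_ names, the struct and the frame count are hypothetical and do not reproduce the collector's actual declarations.

```c
#include <execinfo.h>
#include <unistd.h>

#define SKETCH_NFRAMES 8   /* assumed number of frames kept per object */

/* Hypothetical record stored alongside an allocated object. */
struct sketch_callinfo {
    void *pc[SKETCH_NFRAMES];  /* return addresses, innermost first */
    int n;                     /* number of valid entries in pc */
};

/* Capture the current call chain into info. */
void sketch_save_callers(struct sketch_callinfo *info)
{
    int i;

    info->n = backtrace(info->pc, SKETCH_NFRAMES);
    for (i = info->n; i < SKETCH_NFRAMES; i++)
        info->pc[i] = 0;       /* clear unused slots */
}

/* Print a previously captured call chain to standard error. */
void sketch_print_callers(const struct sketch_callinfo *info)
{
    backtrace_symbols_fd(info->pc, info->n, STDERR_FILENO);
}
```

On platforms without such a facility, the capture step generally has to walk the frame chain by hand, which is where most of the porting effort goes.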

Disclaimer

This is an initial pass at porting guidelines. Some things have no doubt been overlooked.


