Garbage Collection & Memory Management
Summer School

20-21 July 2004, Canterbury, UK

Programme

Tuesday 20 July

08:00-09:00

Breakfast

 

09:00-09:15

Introduction to the Summer School

 

09:15-10:30

Richard Jones

Richard Jones

A Rapid Introduction to Garbage Collection

The aim of this talk is to bring all participants up to speed with garbage collection in the twenty-first century. I shall describe basic garbage collection techniques, point out their strengths and limitations, and indicate how they may be improved. I shall discuss the metrics by which GC algorithms may be characterised and introduce frameworks for discussion of these algorithms. By the end of this talk, all participants should be well-equipped for the following talks.

Slides

 

10:30-11:00

 

Break

 

11:00-12:30

Hans Boehm

Hans-J.Boehm

Engineering a Conservative Mark-Sweep Garbage Collector

Non-moving tracing garbage collectors have several characteristics that often make them desirable, either by themselves, or in combination with other techniques:

  • Since they don't move objects, they touch each object only once per collection, usually requiring less cache traffic.
  • They can completely avoid touching objects which don't contain pointers. These may constitute a large fraction of the heap. Hence this may substantially further reduce the amount of touched memory.
  • They can easily deal with ambiguous pointers.
  • Both concurrent (collector runs in parallel with client) and parallel (multiple collector threads) variants are relatively easy to implement.

We have spent an embarrassingly long time evolving a conservative garbage collector which is now used by a number of language runtime systems and C/C++ applications. We discuss the implementation techniques that we use to obtain good performance, and to realize the above advantages in our context.

Slides

 

12:30-13:30

 

Lunch

 

13:30-15:00

David Detlefs

Dave Detlefs

Garbage-First Garbage Collection

In the first half of this presentation, I describe our "Garbage-First" garbage collector. Garbage-First is a server-style garbage collector, targeted for multiprocessors with large memories, that meets a soft real-time goal with high probability, while achieving high throughput. Whole-heap operations, such as global marking, are performed concurrently with mutation, to prevent interruptions proportional to heap or live-data size. Concurrent marking both provides collection "completeness" and identifies regions ripe for reclamation via compacting evacuation. This evacuation is performed in parallel on multiprocessors, to increase throughput.

The second half of the presentation will be more interactive. I will describe a difficulty we faced (and still face) in the garbage-first design: the minimization of remembered set space overhead. I hope to lead the class on brainstorming of possible explanations for our problems, experiments that could distinguish between them, and possible solutions (including the one we're working on).

Slides

 

15:00-15:30

 

Break

 

15:30-17:00

Rick Hudson

Rick Hudson

Prefetch Injection Based on Hardware Monitoring and Object Metadata

In this talk, we present Mississippi Delta, a novel technique for prefetching linked data structures that closely integrates the hardware performance monitor, the garbage collector's global view of heap and object layout, the type-level metadata inherent in type-safe programs, and JIT compiler analysis. We show how Mississippi Delta's dynamic closed loop system abstracts raw addresses and instruction pointers delivered by the hardware up into a concise metadata graph where reasoning can be done at the type level instead of at the raw address
level. We show how Mississippi Delta guides the JIT analysis in inserting timely prefetches by finding delinquent paths through the metadata graph and calculating deltas. Finally, we show that these
prefetch techniques result in a 11-14% speedup on the cache miss intensive SPEC JBB2000 benchmark.

Mississippi Delta further expands the garbage collector's role to include observing memory system performance and guiding memory optimizations using global heap properties related to object placement and connectivity. Mississippi Delta demonstrates that researchers should view the garbage collector as an integral part of the dynamic profile feedback loop that produces highly optimized code.

Slides

 

17:00-17:15

 

Break

 

17:15-1800

PANEL

Garbage Collection for Very Large Heaps

As managed-runtime languages like Java and C# become more popular choices for server-side application platforms. These applications re usually long-running, heavily multi-threaded, require very large heaps, executed on large multiprocessors, load classes dynamically and make stringent demands of garbage collector performance. Will current approaches to GC scale to multi-gigabyte heaps or are new approaches needed?

Slides: EB HB EM

 

19:30

 

Summer School Dinner, The Goods Shed, Station Road West

 

Wednesday 21 July

08:00-09:00

Breakfast

 

09:00-09:45

Emery Berger

Emery Berger

Explicit Memory Management for High-Performance Applications

Fast and effective memory management is crucial for many applications, including web servers, database managers, and scientific codes. However, current memory managers do not provide adequate support for these
applications on modern architectures, severely limiting their performance, scalability, and robustness.
In this talk, I describe how to design explicit memory managers that support high-performance applications. I first address the software engineering challenges of building efficient memory managers. I then
show how current general-purpose memory managers do not scale on multiprocessors, cause false sharing of heap objects, and systematically leak memory. I describe a fast, provably scalable general-purpose memory
manager that solves these problems, improving performance by up to a factor of 60. Finally, I show how to support server applications that must tear down memory associated with terminated connections or
transactions. I present a generalization of regions and heaps called reaps that supports teardown with higher performance and lower memory consumption than current approaches.

Slides

09:45-10:30

Eliot Moss

Eliot Moss

Sapphire: Copying GC Without Stopping the World

work of: Richard Hudson and Eliot Moss

The growing use in concurrent systems built in languages that require garbage collection (GC), such as Java, is raising practical interest in concurrent GC. Sapphire is a new algorithm for concurrent copying GC for
Java. It stresses minimizing the amount of time any given application thread may need to block to support the collector. In particular, Sapphire is intended to work well in the presence of a large number of application
threads, on small- to medium-scale shared memory multiprocessors. Sapphire extends previous concurrent copying algorithms, and is most closely related to replicating copying collection, a GC technique in which
application threads observe and update primarily the old copies of objects. The key innovations of Sapphire are: (1) the ability to ``flip'' one thread at a time (changing the thread's view from the old copies of objects to the
new copies), as opposed to needing to stop all threads and flip them at the same time; (2) exploiting Java semantics and assuming any data races occur on volatile fields, to avoid a barrier on reads of non-volatile fields; and (3) working in concert with virtually any underlying (non-concurrent) copying collection algorithm.

Slides

 

10:30-11:00

 

Break

 

11:00-11:45

Emery Berger

Emery Berger

Hippocratic Garbage Collection

Programs written in garbage-collected languages like Java often have large working sets and poor locality. Under memory pressure, a full garbage collection is likely to touch pages that are no longer resident, triggering paging. The result is a pronounced drop in throughput and pauses lasting tens of seconds. We present the Hippocratic collector, a generational, page-oriented compacting collector that cooperates with the virtual memory manager to eliminate paging during collections. This cooperation requires only modest modifications to existing virtual memory managers. We have implemented this collector in the Jikes RVM and the Linux kernel. We show that the Hippocratic collector performs competitively in the absence of memory pressure. Under severe memory pressure, it improves the throughput of the SPECjbb benchmark by a factor of two over the best garbage collector we tested, while reducing pause times by an order of magnitude.

Slides

11:45-12:30

Eliot Moss

Eliot Moss

MC^2: High-Performance Garbage Collection for Memory-Constrained Environments

Java is becoming an important platform for memory-constrained consumer devices such as PDAs and cellular phones, because it provides safety and portability. Since Java uses garbage collection, efficient garbage
collectors that run in constrained memory are essential. Typical collection techniques used on these devices are mark-sweep and mark-compact. Mark-sweep collectors can provide good throughput and pause
times but suffer from fragmentation. In order to to run in memory-constrained environments they need to use additional compaction techniques. Mark-compact collectors prevent fragmentation and have low space overheads. Generational mark-compact collectors also provide good throughput. However, they can suffer from occasional long pauses.

Copying collectors can provide higher throughput than either of these techniques, but because of their high space overhead, they previously were unsuitable for memory-constrained devices. This paper presents MC^2
(Memory-Constrained Copying), a copying, generational garbage collector that meets the demands of memory-constrained devices with soft real-time requirements. MC^2 has low space overhead and tight space bounds, prevents fragmentation, provides good throughput, and yields short pause times. These qualities make MC^2 also attractive for other environments, including desktop and server systems.

Slides MOS-collector MOS-performance MC^2

 

12:30-13:30

 

Lunch

 

13:30-15:00

David Bacon

David Bacon

A (Grand?) Unified Theory of Storage Reclamation

The two basic forms of automatic storage reclamation, tracing and reference counting, were invented at the dawn of the high-level language era over 40 years ago. Since then there have been many improvements and optimizations, but all systems are based on one or the other of these methods, which are uniformly viewed as being fundamentally different and possessing very distinct performance properties. We have implemented high-performance collectors of both types, and in the process observed that the more we optimized them, the more similarly they behaved -- that they seem to share some deep structure.

We present a formulation of the two algorithms that shows that they are in fact duals of each other. Intuitively, the difference is that tracing operates on live objects, or "matter", while reference counting operates on dead objects, or "anti-matter". For every operation by the tracing collector, there is a precisely corresponding anti-operation by the reference counting collector.

Viewed in this light, we show that all high-performance collectors (for example, deferred reference counting and generational collection) are in fact hybrids of tracing and reference counting. We are developing a uniform cost-model for the collectors to quantify the tradeoffs that result from choosing different hybridizations of tracing and reference counting. This will allow the correct scheme to be selected based on system performance requirements and the expected properties of the target application.

 

15:00-15:30

 

Break

 

15:30-17:00

Robert Berry

Ryan Sciampacone

A production world view of garbage collection for the Java ME and SE spaces

The J9 Java Virtual Machine (VM) from IBM is a compliant, scaling VM that can be configured for embedded (J2ME), workstation and server (J2SE) level hardware and workloads. A key component in the configuration of J9 is the garbage collector. The issues involved in selecting a particular garbage collection strategy include the target environment for the VM, specifically the hardware and OS, as well as customer requirements, such as minimum throughput or guaranteed pause times. These restrictions and requirements result in a number of different collection strategies that need to be maintained in an efficient and cost effective manner. Modron is a garbage collection framework and suite of collectors available in J9 that is intended to provide a maintainable code base that delivers world-class garbage collection technology in both the J2ME and J2SE spaces. This talk is split into two parts. The first part focuses on how the Modron framework evolved from an existing code base and customer requirements. The second part describes how the J2ME and J2SE spaces affected decisions made during the implementation of both the infrastructure and collectors in Modron.

Slides

 

17:00

 

Finish