Learning Theory of Garbage Collection

I want to learn the theory behind garbage collection. How should I do it? The obvious answer is a compiler tutorial ... The question is, do you need to study lexical analysis, parsing and other things that usually precede garbage collection in the text?

In short, what are the prerequisites for studying the theory of garbage collection?

PS - I know the purpose of parsing, lexical analysis, etc. Just not how they are implemented.

+10
garbage-collection language-agnostic theory
Aug 23 '09 at 2:00
source share
7 answers

Read these documents in order. They are in a progressive order of subject / complexity (not chronologically).

The list is taken directly from prof. Katherine McKinley on memory management is here where you will find links to all articles.

I took the course last semester, so I read it all, and I have to say that I learned what I intended to study!

Please note that links to freely available copies of most of the articles presented below are included in the garbage-collection wiki tag at https://stackoverflow.com/a/a/3/9/ .

  • Real-time list processing on a serial computer , Baker, CACM, 21 (4) 280-294, 1978.
  • Non-recursive list compression algorithm , Cheney, CACM, 13 (11): 677--678, 1970.
  • Real-time garbage collection based on object lifetime , Lieberman & Hewitt, CACM, 26 (6): 419-429, 1983.
  • Generation Cleanup: A Non-Destructive High-Performance Storage Recovery Algorithm , Ungar, Proceedings of the first ACM SIGSOFT / SIGPLAN Software Development Symposium on Practical Software Development Environments, 1984, pp. 157-167.
  • Easy Generation of Generation of Garbage and Quick Disposal , Appel, Software - Practice and Experience 19 (2): 171-183, February 1989
  • Aged garbage collection , D. Stefanovich, C. S. Mackinley, J. B. Moss, ACM Conference on Object Oriented Programming Systems, Languages ​​and Applications. (OOPSLA), p. 370--381. Denver, November 1999
  • Garbage collection of earlier versions in practice: Java virtual machine assessment , D. Stefanovich, M. Herz, S. M. Blackburn, K. S. Mackinley and D. E. Moss, Memory system performance, Berlin, Germany, p. 175-184, June 2002,
  • Beltway: Gridlock Garbage Bypass , SM Blackburn, R. Jones, KS McKinley and JEB Moss, ACM Conference on Designing and Implementing Programming Languages, Berlin, Germany, p. 153-164, June 2002
  • An efficient machine-independent garbage collection procedure in various list structures , Schorr & Waite, CACM, 10 (8): 501-506, 1967.
  • Comparison of Compression Algorithms for Garbage Collection, Cohen & Nicolau, ACM Transactions in Programming Languages ​​and Systems (TOPLAS), Volume 5, Issue 4, Pages 532--553, October 1983.
  • MC2: high-performance garbage collection for cramped environments , Sachindran, Berger & Moss, ACM Conference on object-oriented programming systems, languages ​​and applications, p. 81-96, Vancouver, British Columbia, October 2004
  • Immix: a garbage collector in the Mark region with spatial efficiency, fast collection and performance of mutators , Blackburn & McKinley, ACM Conference on the Development and Implementation of Programming Languages, p. 22–32, Tucson, AZ, June 2008
  • Efficient incremental automatic garbage collector , Deutsch and Bobrow, CACM, 19 (9): 522-526, September 1976.
  • External link counting: quick garbage collection without waiting , S. M. Blackburn and C. S. McKinley, ACM 2003 SIGPLAN Conference Materials on Object Oriented Programming Systems, Languages ​​and Applications, p. 344-359, Annehiem, CA, October 2003
  • Cycle Tracing: An Effective Parallel Cycle Collection by Mark-Sweep , Frampton & Blackburn, 2009. (Presented by ISMM.)
  • Multiprocessor compact garbage collection , Guy L. Steel, Jr., CACM 18 (9): 495-508, 1975.
  • On-the-fly garbage collection: joint exercises , EW Dijkstra, L. Lamport, AJ Martin, CS Scholten and EFM Steffens, Communications of ACM, 21 (11): 966-975, November 1978.
  • Correctly Conclusion Concurrent Garbage Collection Algorithms, Vechev, Yahav and Bacon, ACM Conference on the Development and Implementation of Programming Languages, Ottawa, Ontario, pp. 341-353, 2006.
  • Real-time, low-cost, consistent-use garbage collector , Bacon, Cheng, and Rajan, ACM Symposium on the Principles of Programming Languages, New Orleans, Louisiana, pp. 285-298, 2003.
  • Taxes and Costs: Democratic Planning for Real-Time Garbage Collection , Auerbach, Bacon, Cheng, Grove, Biron, Gracie, McCloskey, Mitch and Skiampakone, ACM International Firmware Conference, Atlanta, GA, pp. 245-254, 2008 .
  • Garbage collection in an unsuitable environment , H. Boehm and M. Weiser, Practice and experience in software development, 18 (9): 807-820, 1988.
  • Hoard: Scalable Memory Allocator for Multithreaded Applications , ED Berger, KS McKinley, RD Blumofe, and PR Wilson, Ninth International Conference on Architectural Support for Programming Languages ​​and Operating Systems, Cambridge, MA, p. 117-128, November 2000,
  • Cork: Dynamic memory leak detection for languages ​​with a garbage collector , Jump & McKinley, Presented to ACM Transaction at Software Practice & Experience, 2009. (A shortened version appears at the ACM Programming Languages ​​Conference, Nice, France, January 2009).
  • Leak Pruning , Bond & McKinley, ACM Architecture Support Conference for Programming Languages ​​and Operating Systems, Washington, DC, March 2009 (Appears.)
  • Free-me: Static Analysis for Restoring Individual Objects , Guyer & McKinley, ACM Conference on the Development and Implementation of Programming Languages, Ottawa, Canada, p. 364-375, June 2006
  • Garbage collection can be faster than stack allocation , Appel, Information Processing Letters 25 (4): 275-279, June 17, 1987.
  • The advantage of garbage collection: improved localization of the program Huang, Blackburn, McKinley, Moss, Wang and Cheng, ACM Conference on Object Oriented Programming Systems, Languages ​​and Applications, Vancouver, British Columbia, pp. 69-80, October 2004
  • Demystifying Magic: High Level Low Level Programming , Daniel Frampton, Stephen M. Blackburn, Perry Cheng, Robin Garner, David P. Grove, J. Eliot B. Moss and Sergey I. Salishev. ACM International Virtual Runtime Conference, Washington, DC March 2009
  • Myths and Reality: Impact of Garbage Collection on Performance , S. M. Blackburn, P. Cheng, and C. S. Mackinley, ACM SIGMETRICS Conference on Computer Measurement and Modeling Systems, p. 25--36, New York, New York, June 2004
  • Unified Garbage Collection Theory , Bacon, Cheng and Rajan, ACM Conference on Object Oriented Programming, Systems, Languages ​​and Applications, Vancouver, British Columbia, Canada, p. 50-68, 2004.
+19
Aug 23 '09 at 17:23
source share

I want to learn the theory of garbage collection. How should I do it?

I am also interested in garbage collection (to the point that I wrote my own garbage collection virtual machine called HLVM ). I learned to read as many research papers on garbage collection as I could, and play with ideas myself, both in my virtual machine and in writing safe high-level simulations.

The obvious answer is a compiler tutorial ... The question is, do you need to study lexical analysis, analysis and other things that usually precede garbage collection in the text?

Lexical analysis, parsing and other things are not related to garbage collection. You can get an outdated overview of garbage collection from the compilers book, but you need to read research papers to get a modern look, for example, regarding multicore.

In short, what are the prerequisites for studying the theory of garbage collection?

You should be aware of basic graph theory, pointers, stacks, threads, and (if you are interested in multithreading) low-level concurrency primitives such as locks.

Garbage collection is all about determining reachability. When a program can no longer get a reference to a value because that value has become unavailable, then the GC can restart the memory that this value occupies. Reachability is determined by traversing the heap, starting with a set of "global roots" (global variables and pointers in stream stacks and in general registers)

GC design has many aspects, but you can start with four basic garbage collection algorithms:

  • Mark-and-sweep (McCarthy, 1960)
  • Mark and the Compact (Haddon and Waite, 1967)
  • Stop-and-copy (Cheney, 1970)
  • Mark Region (McKinley et al., 2007)

Perhaps the most noticeable evolution of these basic ideas is the generational garbage collection, which has been the de facto standard for many years.

According to my personal feelings, some of the little-known garbage collection works contain as much useful information that I also highly recommend:

  • Baker treadmill (beautiful GC in real time).
  • VCGC (a completely different scheme of three-color marking).

You can also study the three types of recording barriers (Dijkstra, Steel and Yuasy) and take a look at the marking of cards and memorable tricks of the set.

Then you may also need to study the actual design decisions that some developers have chosen for language implementations, such as Java and .NET, as well as the SML / NJ compiler for Standard ML, the OCaml compiler, the Glasgow Haskell compiler, and others. The differences between the accepted methods are as great as the similarities between them!

There are also some great tangential link articles, such as Henderson Accurate Garbage Collection in a non-working environment . I used this technique to avoid writing a stack for HLVM.

Memorymanagement.org is an invaluable resource, especially a glossary of GC related terms.

+14
Dec 23 '10 at 22:23
source share

There is a whole book on garbage collection and not bad if I can add:

Richard Jones and Rafael Lins, Garbage Collection, Wiley and Sons (1996), ISBN 0471941484

Richard Jones also maintains a good garbage collection site.

The earliest garbage cans are very readable. You can start with Paul Wilson’s survey, “Uniprocessor Garbage Collection Methods,” (1992, LNCS Volume 637), and then dive into the original literature on topics that sounds interesting.

+11
Aug 23 '09 at 14:09
source share

I don’t know any compiler tutorial that also explains garbage collection, because, as you said yourself, the two are completely unrelated.

Actually, I really like the Wikipedia article as an introductory explanation with many good pointers. Definitely one of the best CS Wikipedia articles.

+3
Aug 23 '09 at 14:06
source share

Chapter 9, “Memory Management,” on object-oriented software development, the second edition of Bertrand Meyer, is pretty informative.

+2
Aug 23 '09 at 14:17
source share

I would start by reading this article: Joint Garbage Theory, Bacon, Cheng, and Rajan, ACM Conference on Object Oriented Programming, Systems, Languages, and Applications, Vancouver, British Columbia, Canada, pp. 50-68, 2004.

+2
Sep 30 '09 at 18:34
source share

You can also take a look at Squeak: Open Personal Computing , which covers the Squeak Smalltalk garbage collector, among other ST projects. You should also take a look at the “Violin” - it is almost entirely written in Smalltalk, and all sources, including the GC, are freely accessible and easy to learn using Smalltalk browsers.

-2
Aug 23 '09 at 14:22
source share



All Articles