Static Chain

last modified: November 13, 2014

An implementation technique for LexicalScoping.

Extracted from GlobalVariablesConsideredHarmful:

Here's an example, in a hypothetical language very similar to (but not the same as) CeeLanguage, but extended to have complete lexical scoping:

typedef (*FuncPtr)();
FuncPtr   /* f returns pointer to function */
f() {
   FuncPtr ret;
   int i = 0;
   /* here is a lexically scoped func that accesses i */
   void g() { printf("%d\n", i); },
   g();         /* ==> 0 */
   ++i;
   g();         /* ==> 1 */
   return g;    /* return ptr to nested function g from f */
},
main() {
   FuncPtr h;
   h = f();     /* prints 0, then 1 */
   (*h)();      /* indirect call to g(), prints 1 */
},

When f() is called via the Func**Ptr h in main(), g() still has valid access to the stack variable i in f(), which means that the C compiler is not allowed to throw away the stack frame(s) created by f() when f() returns. Also, if f() is recursive, the i that g() references must be the one in the most recent stack frame created by f().

In an implementation using StaticChain**s, both calls to g() from within f(), in addition to whatever else, would get an implicit pointer to f()'s ActivationRecord passed in; the reference to i within g() goes through this pointer - this is what a StaticChain is. (A "display" uses an external array to contain the stack of back pointers up the lexical structure).

Additional nastiness - a CactusStack, heap-allocated ActivationRecord**s, or other hacks, are needed to handle h() - which is a FirstClass LexicalClosure.

The StaticChain/display is required for any sort of lexical scoping, even in the absence of FirstClass LexicalClosure**s.


Also see CactusStack.


After changed the printf, I cannot get the result: "indirect call to g(), prints 1", is this a suitable example?

Here is my test:

[root@test tmp]# ./f
0
1
1
[root@test tmp]# ./g
0 0xbfffe2c8
1 0xbfffe2c8
-1073749304 0xbfffe2c8
[root@test tmp]# diff f.c g.c
7c7
<      void g() { printf("%d\n", i); },
---
>      void g() { printf("%d %p\n", i, &i); },
[root@test tmp]# 

No C compilers support this; the C standard does not include nested procedures.

gcc provides nested procedures anyway but even there the above fails for the value of i is in the stack frame that disappeared as f returned. The languages Scheme and JavaScript makes programs like this work. I know no high performance language where this works.

CommonLisp is fairly high-performance nowadays - benchmarks show speeds within 40% of optimized C and well ahead of Java and CeeSharp. ObjectiveCaml also provides FirstClass LexicalClosures, and it can often beat C in speed.

(I was going to add Algol and Pascal too, but the original comment is referring to 'h = f()', which is a downward lexical closure. Algol, AFAIK, does not support those.)

Can ObjectiveCaml still beat C when using downward lexical closures? (Assuming they are explicitly created in C, which means they arn't real, of course. (Can one do it without malloc?)) I would think that ObjectiveCaml gets a lot of it's speed from recognizing when it can generate code in a C style (whatever that means)


Loading...