FAILED Issue PROCLAIM-LEXICAL

This is the writeup of failed issue PROCLAIM-LEXICAL. Because it did not pass, it has no official standing other than as a historical document.
Click here to see my personal notes on this issue.
--Kent Pitman (10-Mar-2001)

Issue:        PROCLAIM-LEXICAL
References:   variables (p55), scope/extent (p37), global variables (p68),
	      declaration specifiers (p157)
Category:     CLARIFICATION/ADDITION
Edit history: Version 2 by Rees 28-Apr-87
              Version 3 by Moon 16-May-87
              Version 4 by Masinter 27-Oct-87
              Version 5 by Masinter 14-Nov-87
              Version 6 by Pitman 15-Sep-88
	       (major revision, for review by Jonathan Rees and Jeff Dalton)
	      Version 7 by Pitman 24-Sep-88
   	       (minor revisions based on comments from Rees and Dalton)
	      Version 8 by Pitman 06-Oct-88 (merge recent discussion)
	      Version 9 by Masinter  8-Dec-88 (make JonL's changes)

Problem Description:

  Although local variables in Common Lisp may be `special' or `lexical,'
  global variables (with the exception of named constants) may currently
  only be `special.'

  The Scheme language permits free variable references to refer to global
  bindings. Their experience suggests that such usage would be useful to
  the Common Lisp community. The absence of such a facility in Common Lisp
  is a barrier both culturally (to the sharing of ideas) and technically
  (to the sharing of code).

  SPECIAL proclamations are uncontrollably pervasive. There is no way
  to locally override or globally undo a SPECIAL proclamation.

Background/Analysis:

  Variable evaluation may be viewed in Common Lisp as a search through
  a set of environments to find a binding, and then the dereferencing of
  that binding. The environments with which Common Lisp deals are
  Lexical (L), Dynamic (D), and Global (G).

  A SPECIAL declaration for a variable amounts to a request that the
  variable be resolved by searching first the Dynamic and then the Global
  environment (DG).

  As currently described in CLtL, lexical variable reference searches
  only the Lexical environment (L).

  Because undeclared free variables in the interpreter are implicitly 
  declared SPECIAL by most (perhaps all) implementations, this amounts
  to a search of Lexical, Dynamic, and Global (LDG). However, the 
  accompanying warnings in many implementations make it clear that this
  behavior is not intended to be taken seriously.

  Constants are looked up solely in the Global environment (G). They
  have other properties as well, of course.

  In the Scheme language, the default lookup is first Lexical, then
  Global (LG). Providing compatibility for Scheme code is, and more
  generally for a Scheme working style is therefore difficult because
  Common Lisp does not provide the LG search style.

  The issue of whether a variable can be assigned is orthogonal.

  The issue of whether a variable can be bound and, if it can be, which
  environment is used for the new binding is orthogonal.

Proposal (PROCLAIM-LEXICAL:LG):

  Provide a new declaration (and proclamation) called LEXICAL which implies
  LG lookup. That is, variables declared LEXICAL would be looked up first
  in the lexical environment (L) and then in the global environment (G)
  if not found in the lexical.

  Clarify that a dynamic binding of a variable creates a new binding
  in the dynamic environment (D) leaving the global environment (G)
  unaffected.

  Clarify that special variable access does DG lookup.  That is, 
  variables declared SPECIAL would be looked up first in the dynamic
  environment (D) and then in the global environment (G) if not found
  in the dynamic one. Further clarify that SYMBOL-VALUE does DG lookup.

  Define that a lexical binding of a variable creates a new binding
  in the lexical environment (L), leaving the global environment (G)
  and the dynamic environment (D) unaffected.

  Note that an assignment to a variable which is bound in the global 
  environment (G) will affect lexical (LG) lookups for which there is
  no lexical (L) binding and dynamic (DG) lookups for which there is
  no dynamic (D) binding.

  Note that these restrictions describe an abstract model, not a
  concrete implementation. An implementation may still choose to
  implement dynamic binding as either deep or shallow, but some
  searching may be necessary to find the global cell in shallow bound
  implementations [unless dynamic binding has been forbidden for
  that variable].

  Like SPECIAL declarations (and unlike type declarations),
  compilers and interpreters would be required to notice and 
  respect LEXICAL declarations.

Examples:

 #1: (proclaim '(lexical x))
     (setq x 1)
     (defun f (fn) (list x (funcall fn)))
     (defun g (fn)
       (let ((x 2))
         (declare (special x))
	 (funcall fn #'(lambda () x))))
     (g #'f) => (1 2)

 #2: ; Warning: It is unlikely that any serious program would 
     ;  be written in so obscure a manner as this example.
     ;  This just tests the fringe cases.
     (proclaim '(lexical x))
     (proclaim '(special y))
     (setq x 1 y 2)
     (defun tst ()
       (let ((x 3) (y 4))
	 (locally (declare (special x) (lexical y))
		  (list x y
			(funcall (let ((x 5) (y 6))
				   #'(lambda () (list x y))))))))
     (tst) => (1 2 (5 4))

    If the results of this example confuse you, keep in mind
    that the results of code like this would be somewhat
    confusing no matter what the chosen semantics because the
    code itself is far from perspicuous.

    An explanation of this behavior, which some people find less
    than intuitive because of the bizarre choice of constructs:
    
      X gets bound lexically to 3 because X is [pervasively]
       proclaimed LEXICAL.
      Y gets bound specially to 4 because Y is [pervasively]
       proclaimed SPECIAL.
      Reference style for name X is changed to SPECIAL, making
       lexical X=3 invisible.
      Reference style for name Y is changed to LEXICAL, making
       dynamic Y=4 invisible.
      Global X=1 and global Y=2 are first two elements of list.
      X gets bound lexically to 5 because X is [pervasively]
       proclaimed LEXICAL.
      Y gets bound specially to 6 because Y is [pervasively]
       proclaimed SPECIAL.
      Closure is returned, capturing [lexical] X=5 but not
       [special] Y=6.
      Dynamic binding of Y to 6 disappears, dynamic binding
       of Y to 4 reverts.
      Closure is funcalled, returning captured X=5 and dynamically
       active Y=4 in a list which becomes third list element.

Rationale:

  This mechanism provides a simple and straightforward answer to
  the problems stated above.

Current Practice:

  Probably no one implements this.

Cost to Implementors:

  A fair amount compiler work would probably be needed. Some compilers
  may have hooks for most of this already laying around, but some may not.

  Note well that this proposal does not require separate global lexical
  and dynamic cells, so the data storage layout of Lisp need not change.

  Moon says...
   I have now thought of an efficient way to do this on Lisp machines,
   using invisible pointers, and another efficient way to do it on
   stock hardware, using one extra instruction on every global
   reference of one or the other sort, plus a few extra instructions
   in SPECIAL binding and unbinding.  Given that, I no longer object
   to the proposal as unimplementable.

   It doesn't just require a few compiler changes, it requires some
   reimplementation of the representation of global variables, with
   concomitant changes to the compiler, the loader, the interpreter,
   and probably the debugger. Every symbol now potentially has two
   values accessible from the interpreter (the current SPECIAL and
   the global LEXICAL) and you need the corresponding new data
   structure to keep track of that.

  Rees suggests...
   In shallow-bound implementations, implementors may have to add a
   small run-time routine that searches the dynamic saved-binding
   stack to look for the global value in the case where the variable
   has been dynamically bound.  One might want a bit (or a count)
   somewhere (perhaps in the symbol itself) to speed up the common
   case of access to a global binding of a variable that hasn't been
   dynamically bound; without some kind of optimization, you have to
   search the whole saved-binding stack on every reference to a 
   free [lexical] variable.
 
   While naively you might think you'd incur the cost of clearing the
   valid bit on every dynamic binding (not acceptable), in actuality
   the bit is a static property of programs (PROGV excepted).  So the
   only places you ever need to clear FOO's valid bit are in PROGV,
   in the interpreter, and when FASLOADing code that contains a compiled
   dynamic binding of FOO.

Cost to Users:

  For the most part, this change is upward compatible.
  Some code-walking tools would have to change.

Cost of Non-Adoption:

  It would continue to be difficult to share code with Scheme.

  New CL users coming from the Scheme community would be confused by
  their sometimes inability to map what they know about variable binding
  into the CL model of variable binding.

  Some interesting native CL applications would be impossible to write
  in a syntactically convenient style.

Benefits:

  Enhanced flexibility of expression.
  Rationalization of the semantics of dynamic variables.

Aesthetics:

  Improved appeal to a certain sector of the programming community.

Discussion:

  Rees points that it is an oversimplification to describe Scheme's
  binding simply as LG since they have no Dynamic environment and
  there is no way to distinguish LG and LDG. However, the reasons he
  prefers LG are:
   1. It's nice for readability and understandability to have a
      declaration which tells you that a variable will not be
      dynamically bound.
   2. It's nice for performance in deep-bound implementations to have a
      declaration that says that no search will be needed.
  Of course, he notes, there could be a counter-argument to item 2
  (in favor of LDG) in order to prefer shallow bound implementations,
  but that still would not defeat the argument in item 1. Rees believes
  that LG is slightly preferrable, but that LDG would be essentially
  adequate for most of his needs.

  Pitman supports PROCLAIM-LEXICAL:LG and believes that giving LDG the
  name LEXICAL would be a serious mistake, leaving open the door for
  program bugs due to accidental binding of variables presumed by the
  programmer not to be bound. If someone (Moon?) seriously wanted LDG
  type variables in addition to LG variables (under a name other than
  LEXICAL), Pitman would not object.

  Dalton expressed support for PROCLAIM-LEXICAL:LG (Version 6).
  He observes that another reason for opposing LDG is that it suggests
  the possibility that someone might want DLG. LG is simpler and still
  accomplishes the stated purpose. He adds ``I would like to be able
  to explain the global environment as a sort of giant, extensible
  LET abound everything.  This proposal seems to get fairly close.''

  It would be possible to submit a proposal for a GLOBAL (G) declaration
  under separate cover if anyone (Xerox?) was interested. Pitman thinks
  this would be an interesting idea. Dalton points out, however, that
  already with this proposal there is enough power to at least deal with
  globals -- albeit circuitously. For example, to reference a global
  variable X, one could write subroutines such as:
   (defun     global-x ()      (declare (lexical x)) x)
   (defun set-global-x (value) (declare (lexical x)) (setq x value))
  Eg, consider:
   (defun f (x) (+ (global-x) x))

  In principle, we could imaging saying that free variables should be
  lexical by default, but that would only reduce error checking to no
  good end. To be really useful, this proposal will need to be followed
  by a proposal for primitives analogous to DEFVAR and/or DEFPARAMETER
  but for lexical variables. However, since arguments over syntax are
  likely to have plenty of issues of their own, we've separated this
  proposal for primitive functionality from issues of syntax which
  can be dealt with separately once this is passed.

  Moon expressed concerns about the efficiency issues but after
  thinking about it for a while convinced himself that this is
  efficiently implementable both on stock and special purpose hardware.

  JonL expressed concerns about the last-minute nature of this change,
  which he sees as untested. This concern applies to the mixin of
  the dynamic environment implicit in the LDG proposal.

  Dalton suggests that an alternative solution to the speed issue
  might be possible to obtain by restricting a particular variable to
  be either LEXICAL or SPECIAL but not both.

  Dalton points that even if people don't like the details here, there
  must be a better fallback solution than "do nothing". Pitman agrees
  heartily.