Reflections on the Future History of Arming Bears

1 Reflections on the Future History of Arming Bears

Mark Evenson
Created: 15-JUN-2020
Revised: <2020-06-17 Wed 09:33>

A presentation in the Online Lisp Meetings Series

https://www.reddit.com/r/lisp/comments/h9h4mh/reflections_on_the_future_history_of_arming_bears/

2 The prehistory of Arming Bears

2.1 Who I was

I discovered Common Lisp in 2005. An elisp user since 1987, I would have quit programming if I hadn't read Practical Common Lisp.

http://www.gigamonkeys.com/book/

What I am in 2020: a much more experienced coder, perhaps armed with enough lambdas to be dangerous(?).

2.2 What I found in Armed Bear Common Lisp

ABCL originated as the extension language for the J Editor, an Emacs-like environment written entirely in Java, started sometime in the late 1990s and apparently written entirely by Peter Graves.

http://armedbear-j.sourceforge.net/

By 2005, Peter Graves announced to the mailing list that he would like to have J work well enough to concentrate on XCL. At this point ABCL was largely the work of Peter and Andras Simon, and very close to a conforming implementation. It was missing the long form of CLOS method combination and a manual to count as accompanying documentation.

Those who then picked up the work: Erik Huelsmann, Ville Voutilainen, Alessio Stalla, myself, and later, for a real MOP, Rudolf Schlatte.

Initially, Erik did most of the heavy lifting fixing major gaps in the implementation as well as the release engineering. Alessio added support for extensible sequences. Ville provided numerous fixes. I sort of hung on for dear life, learning what was broken by reading the Common Lisp HyperSpec. Eventually, I was able to understand things well enough to take over the release engineering, which I continue to do to the present day.

The implementation was close to conforming, for suitable values of "close". It wasn't until 2011 that the community was able to declare the implementation ANSI conforming, in Amsterdam at ECLM 2011.

3 abcl-1.6.0: platform stabilization

In 2017, pushing marketing to the max, ORCL announced that a major version of the JVM would be released on a new six-month cadence.

The openjdk8 platform is the current EOL'd base commonly used in industry; openjdk11 is the current long-term support release. Most Linux distributions have switched to using openjdk11 as the base.

The major change in going from openjdk8 to openjdk11 was that the classloader strategy has been deprecated in favor of the new modules abstraction.

https://openjdk.java.net/projects/jigsaw/spec/sotms/ https://bugs.openjdk.java.net/browse/JDK-8061971

The method that abcl-1.5.0's implementation of CL:LOAD thunks down to no longer existed, as the system classloader no longer extended URLClassLoader.

https://discuss.gradle.org/t/gradle-is-broken-by-jdk9-application-class-loader/9206/4

We implemented a new strategy that works with modules to "find" things for CL:LOAD. The changes necessary for openjdk11 were surprisingly few.

We retain the old strategy for jar files when running on openjdk{6,7,8}.

https://abcl.org/trac/changeset/15133 https://abcl.org/trac/changeset/15134
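
The classloader change can be observed from plain Java. The following is a minimal sketch, not ABCL's actual code: the class `LoaderStrategy` and the two strategy names are hypothetical, and only `ClassLoader.getSystemClassLoader()` and `java.net.URLClassLoader` are real JDK APIs. It chooses a lookup strategy depending on whether the system classloader still extends URLClassLoader.

```java
import java.net.URLClassLoader;

class LoaderStrategy {
    /** True when the system classloader still extends URLClassLoader (the pre-modules behavior). */
    static boolean systemLoaderIsUrlBased() {
        return ClassLoader.getSystemClassLoader() instanceof URLClassLoader;
    }

    /** Hypothetical strategy names, used only for illustration. */
    static String chooseStrategy(boolean urlBased) {
        return urlBased ? "url-classloader" : "module-aware";
    }

    public static void main(String[] args) {
        String version = System.getProperty("java.specification.version");
        System.out.println(version + " -> " + chooseStrategy(systemLoaderIsUrlBased()));
    }
}
```

Probing the loader at runtime, rather than parsing the JVM version string, keeps a single jar usable on both the old and the new platforms.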

4 abcl-1.7.0: The Road to Rigetti

A perhaps little known feature of ABCL is that it is quite capable of using the Java Native Access library to call into code loaded into the JVM process space via dlopen().

https://github.com/java-native-access/jna

With abcl-1.7.0, I've been able to extend the implementation's creation of arrays specialized on commonly used byte types to use memory allocated outside of the JVM's heap via system interfaces like malloc().

To do this, ABCL has extended CL:MAKE-ARRAY to take an additional keyword, viz. :nio-buffer, whose value follows the generic contract for byte buffers (java.nio.ByteBuffer). I can then use JNA to supply an implementation of that contract backed by malloc()'d memory.

The version of CFFI distributed with the latest (2020-06-10) Quicklisp now contains an implementation of CFFI-SYS:MAKE-SHAREABLE-BYTE-VECTOR which utilizes this extension if present in the ABCL runtime.

https://github.com/cffi/cffi/commit/47136ad9a97c2df98dbcd13a068e14489ced5b03

#+NIO
(defun make-shareable-vector (length &key (element-type '(unsigned-byte 8)))
  "Use memory on the heap for storing a vector of LENGTH with ELEMENT-TYPE

Returns the allocated vector as the first value, and the pointer to
the heap memory as the second.

Only works for 8, 16, 32 bit bytes.
"
  (let* ((type
           (first element-type))
         (bits-per-byte
           (second element-type))
         (bytes-per-element  
           (ceiling bits-per-byte 8)))
    (unless (subtypep element-type
                      '(or (unsigned-byte 8) (unsigned-byte 16) (unsigned-byte 32)))
      (signal 'type-error :datum element-type
                          :expected-type '(or
                                           (unsigned-byte 8)
                                           (unsigned-byte 16)
                                           (unsigned-byte 32))))
    (let* ((bytes
             (* length bytes-per-element))
           (heap-pointer
             (jss:new "com.sun.jna.Memory" bytes))
           (bytebuffer
             (#"getByteBuffer" heap-pointer 0 bytes))
           (static-vector
             (make-array length :element-type element-type :nio-buffer bytebuffer)))
      (setf (gethash static-vector *static-vector-pointer*)
            heap-pointer)
      (values
       static-vector
       heap-pointer))))
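
On the JVM side, the byte buffer contract is `java.nio.ByteBuffer`. A minimal sketch of the idea follows, under one loud assumption: it uses `ByteBuffer.allocateDirect` from the JDK as a stand-in for the JNA-backed buffer, so that the example needs no external library. The class `OffHeapBytes` and its method names are hypothetical.

```java
import java.nio.ByteBuffer;

class OffHeapBytes {
    /** Bytes needed for `length` elements, mirroring (ceiling bits-per-byte 8) in the Lisp code. */
    static int byteCount(int length, int bitsPerElement) {
        return length * ((bitsPerElement + 7) / 8);
    }

    /** Allocates memory outside the Java heap (stand-in for a JNA Memory buffer). */
    static ByteBuffer allocate(int bytes) {
        return ByteBuffer.allocateDirect(bytes);
    }

    /** Stores an (unsigned-byte 8) value at the given index. */
    static void setUnsigned8(ByteBuffer buffer, int index, int value) {
        if (value < 0 || value > 255) {
            throw new IllegalArgumentException("not an (unsigned-byte 8): " + value);
        }
        buffer.put(index, (byte) value);
    }

    /** Reads the byte at the given index back as an unsigned value. */
    static int getUnsigned8(ByteBuffer buffer, int index) {
        return buffer.get(index) & 0xFF;
    }

    public static void main(String[] args) {
        ByteBuffer buffer = allocate(byteCount(16, 8));
        setUnsigned8(buffer, 3, 200);
        System.out.println(getUnsigned8(buffer, 3));
    }
}
```

Because Java bytes are signed, the mask with 0xFF on the way out is what makes the buffer behave like a vector of (unsigned-byte 8).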

5 ABCL 2

5.1 What is ABCL1?

An amazing piece of code that runs in more places than you would think. In addition to compatibility with a large number of Java platform implementations (recent forks in the Valley, in China), it has an interpreter expressed in Java code that has been successfully booted on the .NET CLR runtime, and on other platforms that implement enough of the java.lang.* contracts.

And ABCL1 is slow. Slow to start up. Slow to compile. Slow to load things. But once it gets going, due to the ability to have many different GC implementations "pluggable" via the hosting JVM, it can be remarkably efficient. And it can address currently "exotic" processor architectures, such as those with terabytes of really, really fast memory and hardware support for many simultaneous threads of execution.

5.2 What can't we do with openjdk6/openjdk7?

  1. Implement atomic compare and swap operations.
  2. Utilize improvements in JVM bytecode that have been added, notably support for dynamic method invocation in JSR-292 "Supporting Dynamically Typed Languages".

https://jcp.org/en/jsr/detail?id=292

6 Goals for ABCL2

6.1 1. Speed (startup, compiling, loading)

6.2 2. Second compiler targeting openjdk8

Re-entrant, generating code for dynamic method sites via the invokedynamic instruction.

6.3 3. Use APIs available in openjdk8

Adding atomic compare and swap operations will "complete" the usability of ABCL on terabyte heaps.
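
For readers unfamiliar with the primitive, here is a minimal sketch of a compare and swap retry loop in Java. It is not ABCL code, and how the operation will be surfaced to Lisp is not shown here: the class `CasCounter` is hypothetical, and `java.util.concurrent.atomic.AtomicLong` is used only as one JDK facility that provides the operation.

```java
import java.util.concurrent.atomic.AtomicLong;

class CasCounter {
    private final AtomicLong cell = new AtomicLong(0);

    /** Lock-free increment: retry until no other thread changed the cell in between. */
    long increment() {
        while (true) {
            long seen = cell.get();
            long next = seen + 1;
            if (cell.compareAndSet(seen, next)) {
                return next;
            }
        }
    }

    long value() {
        return cell.get();
    }

    /** Runs `threads` workers that each increment a shared counter `perThread` times. */
    static long runThreads(int threads, int perThread) {
        CasCounter counter = new CasCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.value();
    }

    public static void main(String[] args) {
        System.out.println(runThreads(4, 1000));
    }
}
```

No update is lost even though no lock is taken, which is the property that matters when many hardware threads share one very large heap.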

7 What does the ABCL1 compiler produce?

Zipped archives that have a set of Lisp forms to be evaluated to CL:LOAD the artifact.

7.1 How does CL:LOAD work on the ABCL1 compiler artifacts?

  1. Given a CL:PATHNAME, open a stream of bytes from its CL:TRUENAME.

    Such a resource will either contain one or more top-level forms, or be a zip archive. Confusingly, in both of these cases, the CL:PATHNAME will have a PATHNAME-TYPE of "abcl". The case of containing one or more top-level forms is currently only used to package the system in 'abcl.jar'.

    […]
      -rw-r--r--      3660   4-Jun-2020  15:21:30  org/armedbear/lisp/time.abcl
      -rw-r--r--      3448   4-Jun-2020  15:21:30  org/armedbear/lisp/time_1.cls
      -rw-r--r--      1150   4-Jun-2020  15:21:30  org/armedbear/lisp/time_2.cls
      -rw-r--r--      1536   4-Jun-2020  15:21:30  org/armedbear/lisp/time_3.cls
      -rw-r--r--      1353   4-Jun-2020  15:21:30  org/armedbear/lisp/time_4.cls
      -rw-r--r--      3231   4-Jun-2020  15:21:30  org/armedbear/lisp/time_5.cls
      -rw-r--r--       883   4-Jun-2020  15:21:30  org/armedbear/lisp/time_6.cls
      -rw-r--r--     16115   4-Jun-2020  15:21:30  org/armedbear/lisp/top-level.abcl
      -rw-r--r--      1565   4-Jun-2020  15:21:30  org/armedbear/lisp/top_level_1.cls
      -rw-r--r--      1385   4-Jun-2020  15:21:30  org/armedbear/lisp/top_level_10.cls
      -rw-r--r--      3402   4-Jun-2020  15:21:30  org/armedbear/lisp/top_level_11.cls
      -rw-r--r--      2861   4-Jun-2020  15:21:30  org/armedbear/lisp/top_level_12.cls
    […]
    
    M Filemode      Length  Date         Time      File
    - ----------  --------  -----------  --------  -------------------------
      -rw-rw-rw-      3229  11-Jun-2020  09:01:44  __loader__._
      -rw-rw-rw-      3714  11-Jun-2020  09:01:44  binding_tmp1U1MRERI_1.cls
      -rw-rw-rw-      2445  11-Jun-2020  09:01:44  binding_tmp1U1MRERI_2.cls
      -rw-rw-rw-      2174  11-Jun-2020  09:01:44  binding_tmp1U1MRERI_3.cls
      -rw-rw-rw-      1433  11-Jun-2020  09:01:44  binding_tmp1U1MRERI_4.cls
    - ----------  --------  -----------  --------  -------------------------
                     12995                         5 files
    
  2. If the resource is a zip archive, open it, access the forms contained within it.

    (merge-pathnames "__loader__._" zip-archive-pathname)
    
  3. Read and evaluate the form(s). They perform the actual loading of the fasl.

    The forms present in the loader artifact will be a bunch of SETQs to set up source properties and one or more SYSTEM::GET-FASL-FUNCTION calls which actually load the classes emitted by the ABCL1 compiler.

    To load a given class, an array of primitive bytes ("byte[]") is loaded into memory from the archive. This array is passed to the java.lang.ClassLoader.defineClass() function to actually instantiate the class in the JVM.

    "; -*- Mode: Lisp -*-"
    (SYSTEM:INIT-FASL :VERSION 43)
    (SETQ SYSTEM:*SOURCE* #P"/Users/evenson/quicklisp/dists/quicklisp/software/alexandria-20200427-git/alexandria-1/arrays.lisp")
    
    (SETQ SYSTEM:*FASL-LOADER* (SYSTEM::MAKE-FASL-CLASS-LOADER "org.armedbear.lisp.arrays_tmpOKDK4UI"))
    (SYSTEM:%IN-PACKAGE "ALEXANDRIA")
    (PROGN (SYSTEM:PUT '#1=COPY-ARRAY '#2=SYSTEM::SOURCE (CONS '((:FUNCTION #1# ) "/Users/evenson/quicklisp/dists/quicklisp/software/alexandria-20200427-git/alexandria-1/arrays.lisp" 
    #3=25) (GET '#1#  '#2#  #4=NIL))) (SYSTEM:FSET '#1#  (SYSTEM::GET-FASL-FUNCTION SYSTEM:*FASL-LOADER* 
    0) #3#  '(#5=ARRAY &KEY (ELEMENT-TYPE (ARRAY-ELEMENT-TYPE #5# )) (#6=FILL-POINTER 
    (AND (ARRAY-HAS-FILL-POINTER-P #5# ) (#6#  #5# ))) (ADJUSTABLE (ADJUSTABLE-ARRAY-P 
    #5# ))) "Returns an undisplaced copy of ARRAY, with same fill-pointer and
    adjustability (if any) as the original, unless overridden by the keyword
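
The archive handling in steps 2 and 3 can be sketched in Java with `java.util.zip`. This is an illustration under stated assumptions, not ABCL's loader: the class `FaslArchive`, the entry name `example_1.cls`, and the in-memory archive are all hypothetical; only the entry name `__loader__._` comes from the listing above. The byte arrays read this way are what would then be handed to `defineClass()`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

class FaslArchive {
    static final String LOADER_ENTRY = "__loader__._";

    /** Builds a small in-memory archive shaped like a fasl: loader forms plus one class entry. */
    static byte[] build(String loaderForms, byte[] classBytes) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            zip.putNextEntry(new ZipEntry(LOADER_ENTRY));
            zip.write(loaderForms.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
            zip.putNextEntry(new ZipEntry("example_1.cls")); // hypothetical entry name
            zip.write(classBytes);
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    /** Scans the archive from its start and returns the named entry's bytes, or null if absent. */
    static byte[] readEntry(byte[] archive, String name) {
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(archive))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.getName().equals(name)) {
                    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
                    byte[] chunk = new byte[4096];
                    int n;
                    while ((n = zip.read(chunk)) != -1) {
                        buffer.write(chunk, 0, n);
                    }
                    return buffer.toByteArray();
                }
            }
            return null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] archive = build("(SYSTEM:INIT-FASL :VERSION 43)", new byte[] {1, 2, 3});
        System.out.println(new String(readEntry(archive, LOADER_ENTRY), StandardCharsets.UTF_8));
    }
}
```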
    

8 Strategies to increase speed

8.1 Use a single, valid Java class file instead of current zip archive

The current compiler emits an entire class for each top-level DEFUN form in a given source unit. This simplified constructing the compiler, but to load each class, one has to seek() from the beginning of the archive.

With suitable abstractions, we should be able to instrument all of these top-level forms into a single Java class which can be injected into the JVM in one fell swoop.

8.2 Implement a verified version of our classes

openjdk6 onwards allows verification by "type checking", but the Java class files must contain the information it needs. ABCL1's class files don't.

ABCL1 emits class files whose version is 49.3 (Java 5).

https://docs.oracle.com/javase/specs/jvms/se8/html/jvms-4.html#jvms-4.10


1. Verification by type checking must be used to verify class files
whose version number is greater than or equal to 50.0.

2. Verification by type inference must be supported by all Java Virtual
Machine implementations, except those conforming to the Java ME CLDC
and Java Card profiles, in order to verify class files whose version
number is less than 50.0.

https://docs.oracle.com/javase/specs/jvms/se8/html/jvms-4.html#jvms-4.10

If, and only if, a class file's version number equals 50.0, then if
the type checking fails, a Java Virtual Machine implementation may
choose to attempt to perform verification by type inference (§4.10.2).
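
The version threshold in the quoted rules can be checked directly against a class file header. The following is a minimal sketch: the class `ClassFileVersion` and its method names are hypothetical, while the header layout (a 4-byte magic number 0xCAFEBABE, then a 2-byte minor version, then a 2-byte major version, all big-endian) is the class file format itself.

```java
import java.nio.ByteBuffer;

class ClassFileVersion {
    /** Returns {major, minor} read from the first 8 bytes of a class file. */
    static int[] parse(byte[] header) {
        ByteBuffer buffer = ByteBuffer.wrap(header); // ByteBuffer is big-endian by default
        if (header.length < 8 || buffer.getInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a class file header");
        }
        int minor = buffer.getShort() & 0xFFFF;
        int major = buffer.getShort() & 0xFFFF;
        return new int[] {major, minor};
    }

    /** Per the rules quoted above: version 50.0 and later is verified by type checking. */
    static boolean verifiedByTypeChecking(int major) {
        return major >= 50;
    }

    public static void main(String[] args) {
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 49};
        int[] version = parse(header);
        System.out.println(version[0] + "." + version[1]
                + " type checking: " + verifiedByTypeChecking(version[0]));
    }
}
```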

ABCL1 class files can therefore only be verified by "type inference". Current JVMs are largely optimized for "type checking". At some point, loading classes via "type inference" will no longer be possible on future JVMs.

Should be easier if we have a single class in our emitted fasl representation?

8.3 Speed: Rewrite current compiler to be re-entrant

Making the ABCL1 compiler re-entrant involves untangling all the lexical and dynamic scope variables that keep its state.

If we rewrite it "not to use specials", then we should be able to use multiple parallel threads to perform the loading.

9 Implement a new compiler targeting openjdk8 compliant bytecode

Keep the current compiler! Build her younger sister…

I intend to use as much of the work of SICL as possible, in particular the tooling present in Cleavir.

https://github.com/robert-strandh/SICL

We should be able to expose dynamic method dispatch to JVM optimizations. By emitting them as JSR-292 invokedynamic operations, we can easily switch call sites in a way that should be amenable to optimization by JVM byte-code tooling and the JVM implementations they optimize for.
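
Switching a call site can be sketched with the `java.lang.invoke` API that JSR-292 introduced. This is an assumption-laden illustration, not the planned compiler: javac does not emit invokedynamic for ordinary calls, so the sketch emulates the call site with a `MutableCallSite` and its `dynamicInvoker()`, where a compiler would instead emit the instruction with a bootstrap method returning such a call site. The class `SwitchableCallSite` and the methods `slowPath` and `fastPath` are hypothetical.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

class SwitchableCallSite {
    static String slowPath(String arg) { return "generic:" + arg; }
    static String fastPath(String arg) { return "specialized:" + arg; }

    private final MutableCallSite site;
    private final MethodHandle invoker;

    SwitchableCallSite() {
        site = new MutableCallSite(lookup("slowPath")); // start on the generic target
        invoker = site.dynamicInvoker();                // always calls the current target
    }

    private static MethodHandle lookup(String name) {
        try {
            return MethodHandles.lookup().findStatic(
                    SwitchableCallSite.class, name,
                    MethodType.methodType(String.class, String.class));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Relinks the call site; existing callers pick up the new target. */
    void specialize() {
        site.setTarget(lookup("fastPath"));
    }

    String call(String arg) {
        try {
            return (String) invoker.invokeExact(arg);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        SwitchableCallSite callSite = new SwitchableCallSite();
        System.out.println(callSite.call("x"));
        callSite.specialize();
        System.out.println(callSite.call("x"));
    }
}
```

The caller holds one invoker for the lifetime of the site; only the target behind it changes, which is the shape a CLOS dispatch cache could take.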

10 Roadmap to ABCL2

ABCL2 is currently scheduled to be released in September 2020.

As of now (June 2020), this is honestly way too ambitious given the resources (mainly just me) and the amount of work needed.

10.1 What I can plausibly do by September

  1. Implement atomic compare and swap operations
  2. Fork the current compiler, re-writing it to be re-entrant
  3. Implement writing type verification byte code

This would form a decent platform to start figuring out how to use invokedynamic to make switching call sites more efficient. Currently, the biggest win here would be for CLOS method dispatch.

11 Conclusion

Thanks for your attention to this future history of the Armed Bear Common Lisp implementation.

To support the development of ABCL2, please consider a financial donation to the Armed Bear appreciation campaign being run by the Common Lisp Foundation.

https://payments.common-lisp.net/project/abcl/

Author: Mark Evenson

Created: 2020-06-17 Wed 09:34
