Namespace Load Optimization

Tims Gardner edited this page Oct 19, 2018 · 7 revisions

Arcadia's version of the ClojureCLR compiler incorporates important optimizations for loading namespaces fast. This article describes the rationale for and implementation of these optimizations.

Background

Clojure is both a live, dynamic language and a compiled one, which lets it run much faster than an interpreted language. Compiling from source can be relatively slow, however: as much as 30 seconds for the entire Clojure and Arcadia codebase. Clojure solves this by supporting ahead-of-time (AOT) compilation, which pre-compiles Clojure source into CIL bytecode stored in a DLL file on disk.

AOT greatly reduces load times by eliminating the compilation step. Even with AOT, however, users have reported load times of as much as 9 seconds due to certain inefficiencies in the namespace initialization procedure. The problem is exacerbated in Unity, where the virtual machine is reset every time the user plays their game or edits a C# file, triggering the startup procedure multiple times during a single development session.

The optimizations described here eliminate these inefficiencies. They reduce our total startup time, including the overhead of starting Unity, by about a factor of three, from 9 s to 3 s on a 2017 Razer Blade laptop with a 2.8 GHz Core i7 CPU.

The primary culprit is JIT time. Clojure is a completely dynamic language, and namespaces are not sets of static definitions that can be loaded directly by the VM like C# namespaces. Rather, they are processes that need to run to completion. These processes manifest in the bytecode primarily as an Initialize method for each namespace, with each expression in the namespace compiled into the Initialize method's body. For namespaces with many expressions, this method can become very large. The Mono VM must JIT compile every method to native code before it can execute it, and we've observed that the time spent JIT compiling Clojure namespace initialization methods dominates our startup time.

In many environments this JIT step can be skipped by compiling to native code ahead of time, but this is not an option in Unity3D, so this optimization is designed to minimize the cost of JIT compiling the bodies of namespaces. Three techniques work together to achieve this.

Avoid Top-Level Static Constant Initialization

By default, ClojureCLR moves every constant expression into a statically initialized field. This avoids interning vars or keywords every time a function is invoked. However, this optimization does not make sense for code at the top level of a namespace, which only ever runs once, and the resulting static initializer method can be large enough to slow down the JIT. We introduce the dynamic var clojure.lang.Compiler.RegisterConstants, which is true by default but can be bound to false to prevent statically initializing constants. We bind it to false when compiling the namespace initialization code.

Load Clojure Metadata Lazily

A significant portion of namespace initialization code is the construction of Clojure metadata hashmaps (with documentation strings, argument lists, etc.) and their assignment to vars. Almost none of this information is used to boot the language, so this code and the associated JIT cost are wasteful. In the common case where the metadata hashmap is constant data, we can move its construction and assignment bytecode out of the namespace initialization logic and into its own type, to be loaded lazily when needed. We store the bytecode in the type initializer of a generated container type, and store a reference to that type in the var being compiled. This happens in two places, to deal with optimized defns (described below) and with all other expressions.
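The lazy-loading idea can be illustrated with a minimal C# sketch. It is not the ClojureCLR implementation: LazyMetaVar, its Interned table, and FooMetaContainer are hypothetical names invented for this example, and a plain Dictionary stands in for a Clojure hashmap. The only real API relied on is RuntimeHelpers.RunClassConstructor, which runs a type's initializer if it has not already run.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

// Hypothetical stand-in for clojure.lang.Var; only lazy metadata is modeled.
public sealed class LazyMetaVar
{
    // Toy substitute for var interning, so a container can find its var by name.
    public static readonly Dictionary<string, LazyMetaVar> Interned =
        new Dictionary<string, LazyMetaVar>();

    private readonly Type metaContainer;          // type whose initializer builds the metadata
    private Dictionary<string, object> meta;      // stays null until first requested

    public LazyMetaVar(string name, Type metaContainer)
    {
        this.metaContainer = metaContainer;
        Interned[name] = this;
    }

    public bool MetaLoaded { get { return meta != null; } }

    // Called from the container's type initializer.
    public void SetMeta(Dictionary<string, object> m) { meta = m; }

    public Dictionary<string, object> Meta
    {
        get
        {
            // Force the container's type initializer on first access only.
            if (meta == null && metaContainer != null)
                RuntimeHelpers.RunClassConstructor(metaContainer.TypeHandle);
            return meta;
        }
    }
}

// Stand-in for a compiler-generated container type. Its static constructor
// plays the role of the metadata construction and assignment bytecode.
public static class FooMetaContainer
{
    static FooMetaContainer()
    {
        LazyMetaVar.Interned["user/foo"].SetMeta(new Dictionary<string, object>
        {
            { "doc", "Adds one to x." },
            { "arglists", "([x])" }
        });
    }
}
```

Constructing new LazyMetaVar("user/foo", typeof(FooMetaContainer)) does not run the container's initializer, so none of that code is JIT compiled or executed until something reads Meta.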

Three elements of Clojure metadata are actually used by the runtime during initialization, and receive special treatment: :dynamic, :macro, and :private. Our optimization implements these "static metadata flags" as instance fields on the Var class, so accessing them does not need to construct and load the whole Clojure metadata hashmap.
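A rough sketch of how such flags might look in C# follows. The enum name StaticMetadataFlags and its None member appear in this article; the other member names, the numeric values, and the FlaggedVarSketch class are assumptions made for illustration and may not match the actual Var class.

```csharp
using System;

// Name and None come from the article; the remaining member names and all
// numeric values are assumed for this example.
[Flags]
public enum StaticMetadataFlags
{
    None    = 0,
    Dynamic = 1 << 0,
    Macro   = 1 << 1,
    Private = 1 << 2
}

// Hypothetical stand-in for clojure.lang.Var, modeling only the flag fields.
public sealed class FlaggedVarSketch
{
    public readonly bool IsDynamic;
    public readonly bool IsMacro;
    public readonly bool IsPrivate;

    public FlaggedVarSketch(StaticMetadataFlags flags)
    {
        // Unpack the bitmask once; later checks are plain field reads and
        // never touch the metadata hashmap.
        IsDynamic = (flags & StaticMetadataFlags.Dynamic) != 0;
        IsMacro   = (flags & StaticMetadataFlags.Macro) != 0;
        IsPrivate = (flags & StaticMetadataFlags.Private) != 0;
    }
}
```

For example, a private macro would be constructed with StaticMetadataFlags.Macro | StaticMetadataFlags.Private, after which IsMacro and IsPrivate are true and IsDynamic is false.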

Initialize Namespaces From CLR Metadata

With constants and metadata out of the namespace initialization method, what remains is the compilation of every top-level expression in the namespace. This optimization considers them in two categories: defns with constant metadata, and everything else. defns with constant metadata receive special treatment because their initialization bytecode is identical and does not require any per-expression JIT compilation. All other expressions are treated as arbitrary code that needs to be compiled normally and run. Most well-written namespaces consist primarily of expressions in the former category, so this technique is quite effective.

Every namespace's initialization type gets a NamespaceBodyAttribute CLR metadata attribute. This data is effectively baked into the compiled binary and loaded very efficiently by the VM. Namespace loading at runtime is reduced to feeding data from this attribute into the Compiler.InitializeNamespace method.

NamespaceBodyAttribute contains four fields, all of which are arrays. The lengths of the arrays are guaranteed to be equal, and each index into the arrays corresponds to an expression at the top level of the namespace. The fields are:

  • string[] names: if expression i is a defn with constant metadata, names[i] is the name of the var. Otherwise, names[i] is null.
  • Type[] types: if expression i is a defn with constant metadata, types[i] is the IFn subclass to associate with the var. Otherwise, types[i] is a container type that stores the compiled expression in its static initializer.
  • StaticMetadataFlags[] metadataFlags: if expression i is a defn with constant metadata, metadataFlags[i] is a bitmask of the static metadata flags to set when constructing the var, as described by the StaticMetadataFlags enum type. Otherwise, metadataFlags[i] is StaticMetadataFlags.None and is ignored.
  • Type[] metadataTypes: if expression i is a defn with constant metadata, metadataTypes[i] is the container type that stores the metadata construction and assignment bytecode in its static initializer, to be loaded lazily by the var. Otherwise, metadataTypes[i] is null.
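The loading loop implied by these fields can be sketched in C# as follows. This is a simplified illustration under stated assumptions, not the code in Compiler.InitializeNamespace: every type name here is hypothetical (NamespaceBodySketchAttribute, NamespaceLoaderSketch, IncFn, AnswerExprContainer, DemoNamespaceInit), only the names and types arrays are modeled, a Dictionary stands in for the var table, and a plain class stands in for an IFn subclass. The flags and metadata arrays would be consumed in the same loop.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

// Hypothetical two-field version of the attribute described above.
[AttributeUsage(AttributeTargets.Class)]
public sealed class NamespaceBodySketchAttribute : Attribute
{
    public string[] Names { get; private set; }
    public Type[] Types { get; private set; }

    public NamespaceBodySketchAttribute(string[] names, Type[] types)
    {
        Names = names;
        Types = types;
    }
}

public static class NamespaceLoaderSketch
{
    // Toy var table: var name -> root value.
    public static readonly Dictionary<string, object> Vars =
        new Dictionary<string, object>();

    public static void InitializeNamespace(Type initType)
    {
        var body = (NamespaceBodySketchAttribute)Attribute.GetCustomAttribute(
            initType, typeof(NamespaceBodySketchAttribute));

        for (int i = 0; i < body.Names.Length; i++)
        {
            if (body.Names[i] != null)
                // defn with constant metadata: the same generic code path for
                // every such expression, so nothing new is JIT compiled here.
                Vars[body.Names[i]] = Activator.CreateInstance(body.Types[i]);
            else
                // Arbitrary expression: run the container's static initializer.
                RuntimeHelpers.RunClassConstructor(body.Types[i].TypeHandle);
        }
    }
}

// Stand-in for a compiled function class (an IFn subclass in ClojureCLR).
public sealed class IncFn
{
    public long Invoke(long x) { return x + 1; }
}

// Stand-in for a container holding an arbitrary top-level expression.
public static class AnswerExprContainer
{
    static AnswerExprContainer() { NamespaceLoaderSketch.Vars["answer"] = 42L; }
}

// Stand-in for a namespace's initialization type.
[NamespaceBodySketch(
    new string[] { "inc", null },
    new Type[] { typeof(IncFn), typeof(AnswerExprContainer) })]
public static class DemoNamespaceInit { }
```

Calling NamespaceLoaderSketch.InitializeNamespace(typeof(DemoNamespaceInit)) binds "inc" to a fresh IncFn instance and then runs the second expression's container, which sets "answer". Parallel arrays fit attribute data well because custom attribute arguments are limited to simple values, types, enums, and one-dimensional arrays of these.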

Notes

  • We store arbitrary code in container types' static initializers, both for lazy metadata loading and for top-level expressions. This is a bit of a hack: it lets us avoid looking up a method by name reflectively at runtime and use the .NET Framework's RuntimeHelpers.RunClassConstructor method instead, as we do here and here.