How Narwhal Works
This document explains how to use bin/narwhal through its command line options, environment variables, and configuration files, then descends into the exact, maddening details of how it goes about bootstrapping and configuring itself.
Glossary
-
module: a JavaScript file that gets its own local scope and certain free variables so that it may export and import APIs.
-
library: a directory that contains additional top-level modules.
-
package: a downloadable and installable component that may include a library of additional modules, as well as executables, source code, or other resources.
-
sandbox: a system of module instances. Sandboxes are not necessarily secure in our parlance, but are the finest security boundary Narwhal can support. All modules in a sandbox are mutually vulnerable to each other and to their containing sandbox. By injecting frozen modules into a sandbox, or through dependency injection using the system variable, it will eventually be possible to construct secure sandboxes. In a secure sandbox, monkey patching globals will not be possible, and strict mode will be enforced. However, all secure sandboxes will be able to share the same primordial objects, particularly Array, so managed communication among sandboxes will be possible.
-
sea: a sea for Narwhal is like a virtual environment. For simplicity, the directory schema of a package, a sea, and Narwhal itself are all the same. They all have their own configuration and libraries, but Narwhal always searches for packages and modules in the current sea before searching the main (system) Narwhal installation.
Command Line Options
-
-e -c --command COMMAND
evaluate command (final option)
-
-r --require MODULE
pre-load a module
-
-m --module MAIN
run a module as a script (final option)
-
-I --include LIB
add a library path to the loader in the position of highest precedence
-
-p --package PACKAGEPREFIXES
add a package prefix directory
-
-d --debug
set debug mode, system.debug = true
-
-P --no-packages
do not load packages automatically
-
-v --verbose
verbose mode: trace ‘require’ calls.
-
-l --log LEVEL
set the log level (critical, error, warn, info, debug)
-
-: --path DELIMITER
prints an augmented PATH with all package bins/
-
-V --version
print Narwhal version number and exit.
Environment Variables
-
NARWHAL_DEFAULT_ENGINE may be set in narwhal.conf to an engine name like rhino, v8, or xulrunner. Use tusk engines for a complete list, and consult the README in that engine directory for details about its function and readiness for use.
-
NARWHAL_ENGINE may be set at the command line, but is otherwise set to NARWHAL_DEFAULT_ENGINE by bin/narwhal and exposed in JavaScript as system.engine. This is the name of the JavaScript engine in use.
-
NARWHAL_HOME is the path to the narwhal directory and is available in JavaScript as system.prefix.
-
NARWHAL_ENGINE_HOME is the path to the narwhal engine directory, where bootstrap.js may be found, and is set by bin/narwhal.
-
NARWHAL_PATH and JS_PATH can be used to add high priority library directories to the module search path. These values are accessible in most sandboxes as the require.loader.paths variable, and may be edited in place with methods like shift, unshift, and splice. Replacing require.loader.paths with a new Array may not have any effect. In secure sandboxes, paths are not available.
-
NARWHAL_DEBUG is an informational variable that can also be set with the -d and --debug command line options, and accessed or changed from within a JavaScript module as system.debug. NARWHAL_DEBUG gets coerced to a Number, and the options stack, so js -ddd -e 'print(system.debug)' will print 3.
-
NARWHAL_VERBOSE instructs the module loader to report when modules have started and finished loading. This environment variable must be used to catalog modules that are loaded in the bootstrapping process. Otherwise, you can use the -v and --verbose options for the same effect for modules that are loaded after the command line arguments have been parsed, which happens before packages are loaded.
-
NARWHAL_DEBUGGER starts Narwhal with a debugger GUI if one is available for the engine. For the Rhino-Java engine, this activates the Rhino Java AWT-based debugger.
-
SEA is an environment variable set by sea that notifies narwhal to search the given virtual environment for packages first. This function can be approximated by using the -p or --package options to the narwhal or js command, and is inspectable from within a module as the variable system.packagePrefixes[0].
-
SEALVL (sea level) is an informational environment variable provided by the sea command, analogous to SHLVL (shell level), that counts the instances of sea the present shell is running in.
-
NARWHAL_JS_VERSION refers to the JavaScript version, which defaults to "170" for “1.7.0”, and is used by Rhino on Java to determine the valid JavaScript syntax.
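The advice about editing require.loader.paths in place can be illustrated with a minimal sketch. The makeLoader function and the directory names below are hypothetical, not Narwhal's loader. The sketch only assumes that a loader captures a reference to its paths Array when it is created, which is one plausible reason that replacing the Array may have no effect.

```javascript
// Hypothetical stand-in for a module loader; not Narwhal's implementation.
// Assumption: the loader captures its paths Array once, at construction.
function makeLoader(paths) {
    return {
        paths: paths,
        // Return the candidate file names for a module identifier,
        // in search order, using the captured Array.
        candidates: function (id) {
            return paths.map(function (path) {
                return path + "/" + id + ".js";
            });
        }
    };
}

var loader = makeLoader(["/narwhal/lib"]);

// Editing in place: the loader sees the new high-priority directory.
loader.paths.unshift("/my/sea/lib");
var afterUnshift = loader.candidates("foo");

// Replacing the property: the loader still searches the captured Array.
loader.paths = ["/somewhere/else"];
var afterReplace = loader.candidates("foo");
```

In this sketch both afterUnshift and afterReplace list /my/sea/lib/foo.js ahead of /narwhal/lib/foo.js, because the assignment only rebinds the property and leaves the captured Array alone.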
Configuration Files
-
narwhal.conf may be provided to configure site-specific or virtual-environment (sea) specific environment variables like NARWHAL_DEFAULT_ENGINE. You can also opt to specify NARWHAL_ENGINE, but that prevents the user from overriding the narwhal engine at the command line. narwhal.conf follows the BSD convention of using shell scripts as configuration files, so you may use any bash syntax in this file. A narwhal.conf.template exists for illustration.
-
package.json describes the Narwhal package. Narwhal itself is laid out as a package, so it might be used as a standard library package for other engines that might host module systems independently. package.json names the package, its metadata, and its dependencies. package.json should not be edited.
-
local.json may be created to override the values provided in package.json for site-specific configurations. A local.json.template exists to illustrate how this might be used to tell Narwhal that the parent directory contains packages, as this is a common development scenario.
-
sources.json contains data for Tusk on where to find package.json files and package.zip archives so that it can create a catalog of all installable packages, their descriptions, and dependencies. This file should not be edited unless the intention is to update the defaults provided for everyone.
-
.tusk/sources.json may be created for site-specific package sources and overrides the normal sources.json.
-
catalog.json is meant to be maintained as a centrally managed catalog that may be downloaded from GitHub to .tusk/catalog.json using tusk update.
-
.tusk/catalog.json is where tusk looks for information about packages that can be downloaded and installed. It may be downloaded with tusk update or built from sources.json or .tusk/sources.json using tusk create-catalog.
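The relationship between package.json and local.json can be sketched as a merge. This is only an illustration: it assumes that local.json overrides package.json one top-level key at a time, and the applyLocalOverrides name and the file contents shown are hypothetical. Narwhal's actual override rules may differ.

```javascript
// Sketch only: assumes a shallow, key-by-key override of package.json
// by local.json. The function name and file contents are hypothetical.
function applyLocalOverrides(packageJson, localJson) {
    var merged = {};
    var key;
    for (key in packageJson) {
        if (Object.prototype.hasOwnProperty.call(packageJson, key)) {
            merged[key] = packageJson[key];
        }
    }
    // Keys from local.json win over keys from package.json.
    for (key in localJson) {
        if (Object.prototype.hasOwnProperty.call(localJson, key)) {
            merged[key] = localJson[key];
        }
    }
    return merged;
}

// Illustrative contents: local.json adds the parent directory as a
// place to look for packages, the development scenario described above.
var packageJson = JSON.parse('{"name": "narwhal", "lib": "lib"}');
var localJson = JSON.parse('{"packages": ["packages", ".."]}');
var effective = applyLocalOverrides(packageJson, localJson);
```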
Bootstrapping Narwhal
Narwhal launches in stages. On UNIX-like systems, Narwhal starts with a bash script, then runs an engine-specific bash script, then an engine-specific JavaScript, then the common JavaScript.
-
bin/narwhal
At this stage, Narwhal uses only environment variables for configuration. This script discovers its own location on the file system and sources narwhal.conf as a shell script to load any system-level configuration variables like NARWHAL_DEFAULT_ENGINE. From there, it discerns and exports the NARWHAL_ENGINE and NARWHAL_ENGINE_HOME environment variables. It then executes the engine-specific script, $NARWHAL_ENGINE_HOME/bin/narwhal-$NARWHAL_ENGINE.
-
engines/{engine}/bin/narwhal-{engine}
This bash script performs some engine-specific configuration, like augmenting the Java CLASSPATH for the Rhino engine, and executes the engine-specific bootstrap JavaScript using that engine's JavaScript interpreter.
Some engines, like k7, require the JavaScript engine to be on the PATH. The Rhino engine just expects Java to be on the PATH, and uses the js.jar included in the repository.
-
engines/{engine}/bootstrap.js
This engine-specific JavaScript uses whatever minimal mechanisms the JavaScript engine provides for reading files and environment variables to read and evaluate narwhal.js. narwhal.js evaluates to a function expression that accepts a zygotic system Object, to be replaced later by loading the system module proper. bootstrap.js provides a system object with global, evalGlobal, engine, an engines Array, print, fs.read, fs.isFile, prefix, packagePrefixes, and optionally evaluate, debug, or verbose.
-
global is the global Object. This is passed explicitly in anticipation of times when it will be much harder to grab this object in engines where its name varies (like window, or this) and where it will be unsafe to assume that this defaults to global for functions called anonymously.
-
evalGlobal is a function that calls eval in a scope where no global variables are masked by local variables, but var declarations are localized. This is passed explicitly in anticipation of situations down the line where it will be harder to call eval in a pristine scope chain.
-
engine is a synonym for the NARWHAL_ENGINE environment variable, the name of the engine. This variable is informational.
-
prefix is a synonym for the NARWHAL_HOME environment variable, the path leading to the narwhal package containing bin/narwhal.
-
packagePrefixes is a prioritized Array of all of the package directories to search for packages when that time comes. The first package prefix should be the SEA environment variable, if it exists and has a path. This is the first place that the packages module will look for packages to load. The last package prefix is simply the prefix, NARWHAL_HOME. The SEA prefix appears first so that virtual environments can load their own package versions.
-
engines is an Array of engine names, used to extend the module search path at various stages to include engine specific libraries. There will usually be more than one engine in this Array. For Rhino, it is ['rhino', 'default']. The default engine contains many “catch-all” modules that, while being engine-specific, are also general enough to be shared among almost all engines. Other engines are likely to share dynamically linked C modules in a “c” engine, and the “rhino” engine itself is useful for the “helma” engine.
-
print is a temporary shortcut for writing a line to a logging console or standard output, favoring the latter if it is available.
-
fs is a primitive duck-type of the file module, which will be loaded later. The module loader uses read and isFile to load the initial modules.
-
evaluate is a module evaluator. If the engine does not provide an evaluator, the sandbox module has a suitable default, but some engines provide their own. For example, the “secure” engine injects a safe, hermetic evaluator. evaluate accepts a module as a String, and optionally a file name and line number for debugging purposes. evaluate returns a module factory Function that accepts require, exports, module, system, and print, the module-specific free variables for getting the exported APIs of other modules, providing their own exports, reading their meta data, and conveniently accessing the system module and print function respectively.
-
debug is informational, may be used anywhere, is read from the NARWHAL_DEBUG environment variable, and may be set later by the -d or --debug command options.
-
verbose instructs the module loader to log when modules start and finish loading, is read from the NARWHAL_VERBOSE environment variable, and may be set later by the -v or --verbose command options. To log the coming and going of modules as they occur before the packages and program modules get loaded, you must use the environment variable.
-
narwhal.js
This is the common script that creates a module loader, makes the global scope consistent across engines, finishes the system module, parses command line arguments, loads packages, executes the desired program, and finally calls the unload event for cleanup or running a daemon event loop.
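Two pieces of the zygotic system object described above, packagePrefixes and evaluate, can be sketched in a few lines. This is an illustration under stated assumptions, not the code of any engine's bootstrap.js: makePackagePrefixes is a hypothetical helper, the env argument stands in for the process environment, the paths are made up, and the evaluator is a naive one built on the Function constructor.

```javascript
// SEA first (if it is set and non-empty), NARWHAL_HOME last.
// makePackagePrefixes is a hypothetical helper name.
function makePackagePrefixes(env) {
    var prefixes = [];
    if (env.SEA) {
        prefixes.push(env.SEA);
    }
    prefixes.push(env.NARWHAL_HOME);
    return prefixes;
}

// A naive evaluator: wraps module text in a factory Function that takes
// the module-specific free variables. fileName and lineNo are accepted
// but unused here; an engine could use them for debugging.
function evaluate(text, fileName, lineNo) {
    return new Function(
        "require", "exports", "module", "system", "print",
        text
    );
}

var prefixes = makePackagePrefixes({
    SEA: "/home/me/sea",          // made-up paths
    NARWHAL_HOME: "/opt/narwhal"
});

// Instantiate a one-line module from its text.
var factory = evaluate("exports.answer = 6 * 7;", "answer.js", 1);
var exportsObject = {};
factory(function () {}, exportsObject, {id: "answer"}, {}, function () {});
```

After this runs, prefixes lists the sea before the Narwhal installation, and exportsObject.answer holds the value the module text assigned.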
When Narwhal is embedded, the recommended practice is to load the bootstrap.js engine script directly, skipping the shell script phases.
Some engines, like Helma or GPSEE, may provide their own module loader implementation. In that case, they may bypass all of this bootstrapping business and simply include Narwhal as if it were a mere package.
No equivalent bootstrapping system has been constructed for Windows yet.
Narwhal Script
The narwhal.js script is the next layer of blubber.
- sandbox module (loaded manually from lib/sandbox.js) provides the means to construct a require function so all other modules can be loaded.
- global module monkey patches the transitive globals so that every engine receives the same ServerJS and EcmaScript 5 global object, or as near to that as possible.
- system module, including the file and logger modules, is provided for convenience as a free variable in all modules.
- narwhal module parses arguments.
- packages module loads packages.
  - packages-engine loads jars for Java/Rhino.
- run command
- unload module sends an unload signal to any observers, usually for cleanup or to kick off event loops.
Sandbox Module
The sandbox module provides a basic module Loader for module files on disk, a MultiLoader for pluggable module factory loaders (for things like Objective-J modules and dynamically linked C modules), and a Sandbox for creating and memoizing module instances from the module factories. The sandbox module is useful for creating new sandboxes from within the main sandbox, which in turn is useful for creating cheap module system reloaders that instantiate fresh modules but only go to disk when the underlying module text has changed.
Global Module
The global module is engine-specific, and there is a sharable version in the default engine. The purpose of the global module is to load modules like “json”, “string”, “array”, and “binary”, that monkey patch the globals if necessary to bring every engine up to speed with EcmaScript 5 and the ServerJS standard.
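The phrase "if necessary" is the important part: a patch should only fill a gap and should leave a native implementation alone. A minimal sketch, assuming a hypothetical patchIsArray helper and a stand-in global object for an older engine:

```javascript
// Sketch: add Array.isArray (an EcmaScript 5 function) only when the
// given global object lacks it. patchIsArray is a hypothetical name.
function patchIsArray(global) {
    if (typeof global.Array.isArray !== "function") {
        global.Array.isArray = function (value) {
            return Object.prototype.toString.call(value) === "[object Array]";
        };
    }
}

// A stand-in for an engine whose Array has no isArray.
var oldEngine = {Array: function () {}};
patchIsArray(oldEngine);

// On an engine that already provides it, the native function is kept.
var nativeIsArray = Array.isArray;
patchIsArray({Array: Array});
```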
System Module
The system module provides the ServerJS System module standard, for standard IO streams, arguments, and environment variables. The system module goes beyond spec by being a free variable available in all modules, and by providing print, fs, and log variables (at the time of this writing). print is a late-bound alias for system.stdout.print, which is to say that replacing system.stdout will cause print to redirect to the new output stream. fs is an alias for the file module, while log is a Logger instance from the logger module that prints time-stamped log messages to system.stderr.
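Late binding here means that print looks up system.stdout each time it is called instead of capturing the stream once. A minimal sketch, in which the system object and the makeStream helper are stand-ins rather than Narwhal's own objects:

```javascript
// Stand-in stream that records printed lines in an Array.
function makeStream(lines) {
    return {
        print: function () {
            lines.push(Array.prototype.join.call(arguments, " "));
            return this;
        }
    };
}

var firstLines = [];
var secondLines = [];
var system = {stdout: makeStream(firstLines)};

// Late-bound alias: system.stdout is resolved at call time.
system.print = function () {
    return system.stdout.print.apply(system.stdout, arguments);
};

system.print("hello");
system.stdout = makeStream(secondLines); // replace the output stream
system.print("redirected", "output");
```

The first call lands in firstLines and the second in secondLines, even though system.print itself was never reassigned.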
Narwhal Module
The Narwhal module contains the command line parser declarations for Narwhal, and an Easter egg.
Packages Module
The packages module analyzes and installs packages, such that their libraries are available in the module search path, and also installs some engine-specific package components like Java archives at run-time. The package loader uses a five pass algorithm:
- Find and read package.json for every accessible package, collating them into a catalog. This involves a breadth first topological search of the packages/ directory of each package in the system.packagePrefixes Array. This guarantees that the packages installed in the Sea (virtual environment) can override the versions installed with the system.
- Verify that the catalog is internally consistent, dropping any package that depends on another package that is not installed.
- Sort the libraries from packages so that libraries that “depend” on other packages get higher precedence in the module search path.
- “Analyze” the packages in order. This involves finding the library directories in each package, including engine-specific libraries for all of the system.engines, and performing engine-specific analysis like finding the Java archives (jars) installed in each package.
- “Synthesize” a configuration from the analysis. This involves setting the module search path, and performing engine-specific synthesis, like installing a Java class loader for the Java archives, and creating a new, global Packages object.
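The verification and sorting passes can be sketched on an in-memory catalog. This is an illustration, not the packages module's code: the verify and sortByDependency names, the catalog shape, and the package names are hypothetical, and the sketch assumes that dropping one package can in turn invalidate the packages that depend on it.

```javascript
// Verification sketch: repeatedly drop packages with a missing
// dependency until the catalog stops changing.
function verify(catalog) {
    var changed = true;
    while (changed) {
        changed = false;
        Object.keys(catalog).forEach(function (name) {
            var missing = catalog[name].dependencies.some(function (dep) {
                return !Object.prototype.hasOwnProperty.call(catalog, dep);
            });
            if (missing) {
                delete catalog[name];
                changed = true;
            }
        });
    }
    return catalog;
}

// Sorting sketch: order package names so that a package precedes
// everything it depends on (dependents get higher precedence).
function sortByDependency(catalog) {
    var visited = Object.create(null);
    var order = [];
    function visit(name) {
        if (visited[name]) {
            return;
        }
        visited[name] = true;
        catalog[name].dependencies.forEach(visit);
        order.push(name); // dependencies are pushed before dependents
    }
    Object.keys(catalog).forEach(visit);
    return order.reverse();
}

var catalog = verify({
    "narwhal": {dependencies: []},
    "jack": {dependencies: ["narwhal"]},
    "app": {dependencies: ["jack", "narwhal"]},
    "broken": {dependencies: ["not-installed"]},
    "also-broken": {dependencies: ["broken"]}
});
var order = sortByDependency(catalog);
```

Here broken and also-broken are dropped, and the remaining packages sort as app, jack, narwhal, so a library in app would shadow a same-named library in the packages it depends on.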
Much of the weight of code in the packages module concerns not only using the conventional locations for libraries and whatnot, but also handling overridden configuration values, gracefully accepting both single Strings and Arrays of multiple options for all directories. For example, packages assumes that each package has a lib directory. However, the package may provide a package.json that states that lib has been put somewhere else, like {"lib": "lib/js"}, or even multiple locations like {"lib": ["lib/js", "usr/lib/js"]}. This applies to “packages” and “jars” as well.
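Accepting a missing value, a single String, or an Array comes down to one small normalization step. A sketch, with normalizeDirectories as a hypothetical helper name:

```javascript
// Sketch: accept a missing value, a single String, or an Array of
// Strings, and always return an Array of directory names.
function normalizeDirectories(descriptor, key, conventional) {
    var value = descriptor[key];
    if (value === undefined) {
        return [conventional]; // fall back to the conventional location
    }
    if (typeof value === "string") {
        return [value];
    }
    return value.slice(); // copy so callers cannot mutate the descriptor
}

var defaults = normalizeDirectories({}, "lib", "lib");
var single = normalizeDirectories({"lib": "lib/js"}, "lib", "lib");
var multiple = normalizeDirectories(
    {"lib": ["lib/js", "usr/lib/js"]}, "lib", "lib"
);
```

With every value normalized to an Array, the rest of the loader only needs to handle one shape.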
Unload Module
When the program is finished, Narwhal checks whether the “unload” module has been used. If so, it calls the “send” function exported by that module, so that any observers attached with the “when” method get called in first-on, first-off order. This is handy for modules like “reactor” that initiate an event loop.
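The observer pattern described here can be sketched in a few lines. This is a hypothetical miniature, not the unload module itself. It borrows the when and send names from the text above and assumes that observers run in the order they were attached.

```javascript
// Hypothetical miniature of the "unload" module.
var observers = [];
var unload = {
    // Attach an observer to be called when the program finishes.
    when: function (observer) {
        observers.push(observer);
    },
    // Call every observer in the order it was attached.
    send: function () {
        observers.forEach(function (observer) {
            observer();
        });
    }
};

var calls = [];
unload.when(function () { calls.push("flush logs"); });
unload.when(function () { calls.push("start event loop"); });

// Narwhal would call send once the main program has finished.
unload.send();
```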