
%%
%% This is file `sample-acmsmall.tex',
%% generated with the docstrip utility.
%%
%% The original source files were:
%%
%% samples.dtx  (with options: `acmsmall')
%%
%% IMPORTANT NOTICE:
%%
%% For the copyright see the source file.
%%
%% Any modified versions of this file must be renamed
%% with new filenames distinct from sample-acmsmall.tex.
%%
%% For distribution of the original source see the terms
%% for copying and modification in the file samples.dtx.
%%
%% This generated file may be distributed as long as the
%% original source files, as listed above, are part of the
%% same distribution. (The sources need not necessarily be
%% in the same archive or directory.)
%%
%%
%% Commands for TeXCount
%TC:macro \cite [option:text,text]
%TC:macro \citep [option:text,text]
%TC:macro \citet [option:text,text]
%TC:envir table 0 1
%TC:envir table* 0 1
%TC:envir tabular [ignore] word
%TC:envir displaymath 0 word
%TC:envir math 0 word
%TC:envir comment 0 0
%%
%%
%% The first command in your LaTeX source must be the \documentclass command.
\documentclass[acmsmall]{acmart}

%%
%% \BibTeX command to typeset BibTeX logo in the docs
\AtBeginDocument{%
  \providecommand\BibTeX{{%
    \normalfont B\kern-0.5em{\scshape i\kern-0.25em b}\kern-0.8em\TeX}}}

%% Rights management information.  This information is sent to you
%% when you complete the rights form.  These commands have SAMPLE
%% values in them; it is your responsibility as an author to replace
%% the commands and values with those provided to you when you
%% complete the rights form.
\setcopyright{acmcopyright}
\copyrightyear{2022}
\acmYear{2022}
\acmDOI{10.1145/1122445.1122456}


%%
%% These commands are for a JOURNAL article.
\acmJournal{JACM}
\acmVolume{37}
\acmNumber{4}
\acmArticle{111}
\acmMonth{8}

%%
%% Submission ID.
%% Use this when submitting an article to a sponsored event. You'll
%% receive a unique submission ID from the organizers
%% of the event, and this ID should be used as the parameter to this command.
\acmSubmissionID{123-A56-BU3}

%%
%% The majority of ACM publications use numbered citations and
%% references.  The command \citestyle{authoryear} switches to the
%% "author year" style.
%%
%% If you are preparing content for an event
%% sponsored by ACM SIGGRAPH, you must use the "author year" style of
%% citations and references.
%% Uncommenting
%% the next command will enable that style.
%%\citestyle{acmauthoryear}

%%
%% end of the preamble, start of the body of the document source.
\begin{document}

%%
%% The "title" command has an optional parameter,
%% allowing the author to define a "short title" to be used in page headers.
\title{Efficient compilation of a functional Lisp based on Vau calculus}

%%
%% The "author" command and its associated commands are used to define
%% the authors and their affiliations.
%% Of note is the shared affiliation of the first two authors, and the
%% "authornote" and "authornotemark" commands
%% used to denote shared contribution to the research.
\author{Nathan Braswell}
\email{nathan.braswell@gtri.gatech.edu}
%%\orcid{1234-5678-9012}
%%\author{G.K.M. Tobin}
%%\authornotemark[1]
%%\email{webmaster@marysville-ohio.com}
\affiliation{%
  \institution{Georgia Tech}
  %%\streetaddress{P.O. Box 1212}
  \city{Atlanta}
  \state{GA}
  \country{USA}
  %%\postcode{43017-6221}
}

%%
%% By default, the full list of authors will be used in the page
%% headers. Often, this list is too long, and will overlap
%% other information printed in the page headers. This command allows
%% the author to define a more concise list
%% of authors' names for this purpose.
%%\renewcommand{\shortauthors}{Trovato and Tobin, et al.}

%%
%% The abstract is a short summary of the work to be presented in the
%% article.
\begin{abstract}
  Vau and Fexprs, as formulated by John Shutt \cite{shutt2010fexprs}, provide a first-class and more powerful alternative to
  macros. On the other hand, naively executing a language that uses Vau and Fexprs instead of macros is exceedingly slow,
  because the code of a fexpr (analogous to a macro invocation) is re-executed at runtime every time it is encountered.
  Additionally, because it is unclear which code will be evaluated as a parameter to a function call and which code
  must be passed unevaluated to a combiner, little optimization can be done. We address this problem with what is, to our knowledge,
  the first partial evaluation system that can completely optimize away fexprs that are used and written in the style of macros,
  as well as some other more naturally written combiners.
\end{abstract}

%%
%% The code below is generated by the tool at http://dl.acm.org/ccs.cfm.
%% Please copy and paste the code instead of the example below.
%%
%%\begin{CCSXML}
%%<ccs2012>
%% <concept>
%%  <concept_id>10010520.10010553.10010562</concept_id>
%%  <concept_desc>Computer systems organization~Embedded systems</concept_desc>
%%  <concept_significance>500</concept_significance>
%% </concept>
%% <concept>
%%  <concept_id>10010520.10010575.10010755</concept_id>
%%  <concept_desc>Computer systems organization~Redundancy</concept_desc>
%%  <concept_significance>300</concept_significance>
%% </concept>
%% <concept>
%%  <concept_id>10010520.10010553.10010554</concept_id>
%%  <concept_desc>Computer systems organization~Robotics</concept_desc>
%%  <concept_significance>100</concept_significance>
%% </concept>
%% <concept>
%%  <concept_id>10003033.10003083.10003095</concept_id>
%%  <concept_desc>Networks~Network reliability</concept_desc>
%%  <concept_significance>100</concept_significance>
%% </concept>
%%</ccs2012>
%%\end{CCSXML}

%%\ccsdesc[500]{Computer systems organization~Embedded systems}
%%\ccsdesc[300]{Computer systems organization~Redundancy}
%%\ccsdesc{Computer systems organization~Robotics}
%%\ccsdesc[100]{Networks~Network reliability}

%%
%% Keywords. The author(s) should pick words that accurately describe
%% the work being presented. Separate the keywords with commas.
\keywords{partial evaluation, vau, fexprs, WebAssembly}


%%
%% This command processes the author and affiliation and title
%% information and builds the first part of the formatted document.
\maketitle

\section{Introduction and Motivation}

  Lisps generally have two different abstraction methods: functions and macros. Functions operate at runtime and always
  evaluate their parameters, while macros operate at compile time and do not evaluate their parameters. This
  splits the language to a degree: macros cannot be used at runtime, though functions can generally
  be used inside macros, with various restrictions. Macro systems generally attempt to be hygienic, either preventing
  manipulation of the environment of the code that a macro invocation expands to, or making it difficult. Such manipulation is often
  needed, however, so various escape hatches are provided.

  Creating a powerful, safe, and easy-to-use macro system is quite difficult, and the resulting systems are often quite complex,
  generally more complex than the base language in which they reside. Macros are also not first class: they cannot be passed
  around as values and do not exist at all at runtime.

  Vau and Fexprs, as formulated by John Shutt \cite{shutt2010fexprs} (available at \url{https://web.wpi.edu/Pubs/ETD/Available/etd-090110-124904/unrestricted/jshutt.pdf}),
  provide a first-class and more powerful alternative to macros, unifying functions, macros, and built-in language forms
  into a single concept called a combiner. A combiner may evaluate its arguments zero or more times,
  and receives the calling environment as an additional parameter. There is also an \texttt{eval} function, which takes an expression to evaluate
  and an environment in which to do the evaluation. Functions, macros, and even built-in language constructs such as \texttt{if}, \texttt{cond}, and \texttt{let} can all be implemented
  as either user-defined or built-in combiners, making both macros and what were previously Lisp special forms first class. They can be named,
  passed to higher-order combiners, put into data structures, etc.
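
  As a rough sketch of these rules, write $\mathit{eval}(x, e)$ for evaluating expression $x$ in environment $e$, and
  $\langle p, s, b, e_0 \rangle$ for an unwrapped combiner with parameter list $p$, environment parameter $s$, body $b$, and definition-time environment $e_0$.
  The notation and the helper name $\mathit{combine}$ are ours, introduced only for illustration; they are not taken from \cite{shutt2010fexprs} or from our implementation.
  Assuming Kernel-style semantics, a call is evaluated roughly as follows:
\[
\begin{array}{rcl}
\mathit{eval}((f\ a_1 \ldots a_n),\ e) & = & \mathit{combine}(\mathit{eval}(f, e),\ (a_1 \ldots a_n),\ e) \\
\mathit{combine}(\langle p, s, b, e_0 \rangle,\ a,\ e) & = & \mathit{eval}(b,\ e_0[p \mapsto a,\ s \mapsto e]) \\
\mathit{combine}(\mathit{wrap}(c),\ (a_1 \ldots a_n),\ e) & = & \mathit{combine}(c,\ (\mathit{eval}(a_1, e) \ldots \mathit{eval}(a_n, e)),\ e)
\end{array}
\]
  The second rule binds the operands unevaluated and makes the caller's environment $e$ available under the name $s$.
  The third rule is what makes a wrapped combiner behave like an ordinary function: its operands are evaluated once in $e$ before the underlying combiner is called.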

  On the other hand, naively executing a language that uses combiners instead of macros is exceedingly slow,
  because the code of a fexpr (analogous to a macro invocation) is re-executed at runtime every time it is encountered.
  Additionally, because it is unclear which code will be evaluated as a parameter to a function call and which code
  must be passed unevaluated to the combiner, little optimization can be done. We address this problem with what is, to our knowledge,
  the first partial evaluation system that can completely optimize away fexprs that are used and written in the style of macros,
  as well as some other more naturally written combiners. Our language is more restricted than Shutt's Kernel language: it is
  purely functional and allows no mutation, which makes tracking environments and optimizing access to them tractable.

  All code is available at \url{https://github.com/limvot/kraken}.

\section{Prior Work}
\begin{itemize}
    \item{} Axis of Eval's list of 22 attempted implementations: \url{https://axisofeval.blogspot.com/2011/09/kernel-underground.html} \\
      None perform partial evaluation, to our knowledge. We believe all are abandoned or link-rotted, with the seeming exception of \url{https://github.com/rocketnia/fexpress},
      which takes a very different approach (Lisp-2, explicit apply form, etc.) in Racket.
    \item{} A short Lambda The Ultimate discussion of partial evaluation for Vau/Kernel: \url{http://lambda-the-ultimate.org/node/4346}
    \item{} Implementing a Vau-based Language With Multiple Evaluation Strategies \cite{kearsleyimplementing} \\
      Discusses how partial evaluation could make such a language efficient, but does not implement it.
    \item{} A Google Groups email thread by Andres Navarro: \url{https://groups.google.com/g/klisp/c/Dva-Le8Hr-g/m/pyl1Ufu-vksJ} \\
      Navarro describes his experimental fklisp, a ``very simple functional dialect of Kernel'' with no mutation or first-class continuations.
      It does not compile anything, but prints out the partially evaluated expression. It was a work in progress that ran into performance problems and seems abandoned.
\end{itemize}

\subsection{Issues}
  As described in the introduction, the main issue with basing a language on Vau and combiners is slowness.
  This comes from two main problems. First, the combiners taking the place of macros are re-executed every time
  they are encountered, instead of being expanded once into intermediate code as in a macro system. Second, because
  the compiler in general cannot tell whether the combiner being called takes its parameters by value, it can
  do practically no optimization.

\section{Solution}
  We partially evaluate a purely functional version of this language in a nearly single pass over the entire program.
  We allow environment chains consisting of both ``real'' environments, in which every contained symbol is mapped to a value, and ``fake'' environments
  that hold only placeholder values. Since the language is purely functional, we know that if a symbol evaluates to a value anywhere, it
  will always evaluate to that value there at runtime. With this, we can identify most calls to combiners that do not evaluate their parameters
  (since if we can resolve the combiner to a value, we know that it will always be the combiner called at that location), perform
  inlining, and continue partial evaluation.
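
  As a hand-worked illustration (our own example, not output from the implementation), consider partially evaluating the body of
  \texttt{(lambda (n) (let1 x (* n 2) (* x x)))}, using the \texttt{let1} defined at the end of this section. We assume that \texttt{n} lives in a
  fake environment with only a placeholder value, while \texttt{let1}, \texttt{vau}, \texttt{eval}, \texttt{array}, and \texttt{*} resolve to known values
  through the enclosing real environments. The exact residual code produced by the implementation may differ from this sketch.
\begin{verbatim}
; 1. let1 resolves to a known combiner that does not evaluate its
;    operands, so its body can be inlined with
;    s = x, v = (* n 2), b = (* x x)
; 2. (array vau (array s) b) is fully known:  (vau (x) (* x x))
; 3. (eval v de) depends on the placeholder n, so it remains
;    as residual code:  (* n 2)
; 4. what is left is a plain call that binds x to the value of
;    (* n 2); written in Scheme terms:
((lambda (x) (* x x)) (* n 2))
\end{verbatim}
  No call to \texttt{let1} and no use of \texttt{eval} remains in the residual code, which is the sense in which a macro-style combiner is optimized away.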

  If the resulting partially evaluated program contains only static references to a subset of built-in combiners and functions (combiners that
  evaluate their parameters exactly once), the program can be compiled just as if it were a normal Scheme program.
  If some call sites are indeterminate, they can still be compiled, but a runtime check must be inserted that splits
  evaluation based on whether the combiner evaluates its parameters, and \texttt{eval} and all builtins must be compiled into the resulting executable.

  A Vau/combiner-based language allows most of the language to be built up in the language itself, as combiners. For instance, even \texttt{lambda} and \texttt{let}
  are derived rather than primitive. Below, we define both \texttt{let1} (a simple version of \texttt{let} that binds only one variable) and \texttt{lambda}.
\begin{verbatim}
((wrap (vau (let1)

(let1 lambda (vau se (p b1) (wrap (eval (array vau p b1) se)))
  (lambda (n) (* n 2))
)

; impl of let1
)) (vau de (s v b) (eval (array (array vau (array s) b) (eval v de)) de)))
\end{verbatim}
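
  To see how the \texttt{let1} definition works, consider the hypothetical call \texttt{(let1 x 3 (* x 2))} evaluated in some environment \texttt{de}
  in which \texttt{*} is bound. The trace below is a hand-worked sketch, not output from the implementation. It assumes Kernel-style semantics:
  \texttt{array} evaluates its arguments, and a combiner value in head position evaluates to itself.
\begin{verbatim}
; operands arrive unevaluated: s = x, v = 3, b = (* x 2)
; step 1: (array vau (array s) b)  builds  (vau (x) (* x 2))
; step 2: (eval v de)              gives   3
; step 3: the outer array          builds  ((vau (x) (* x 2)) 3)
; step 4: the outer eval in de binds x to 3 and
;         evaluates (* x 2), giving 6
\end{verbatim}
  The inner \texttt{vau} is left unwrapped because the value has already been computed by the explicit \texttt{(eval v de)}; wrapping it would evaluate that value a second time.
  The \texttt{lambda} definition follows the same pattern: it builds a \texttt{vau} form from the parameter list \texttt{p} and body \texttt{b1}, evaluates it in the caller's environment \texttt{se},
  and applies \texttt{wrap} so that the resulting combiner evaluates its arguments.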

\bibliographystyle{ACM-Reference-Format}
\bibliography{cited-paper}

\end{document}
\endinput
%%
%% End of file `sample-acmsmall.tex'.