Key Contributions to look out for that make this work in practical time:
1. First-class environments that:
    a. Have IDs
    b. Can either be "real", in which case they map symbols to values,
       or "fake", in which case they map symbols to themselves, but tagged with the env ID as their needed-for-progress ID
    c. Chain up to an upper environment that may be fake or real
2. AST nodes that maintain on-node:
    a. The IDs of environments that, if "real", can be used to make progress in this subtree
    b. The hashes of infinite recursive calls that were detected and stopped - if such a hash isn't in the current call chain, this subtree can make progress
    c. Extra IDs of environments that are "real" but have "fake" environments in their chain - this is used to make return value checking fast (O(1) or O(log n), depending)
3. Combiners, both user-defined and built-in, that maintain a "wrap level" that:
    a. Is a property of this function value, *not* the function itself
        * meaning that if wrap_level > 0, you can evaluate each parameter and decrement wrap_level, even if you can't execute the call
4. The return value of a combiner is checked for:
    a. If it is a value, in which case it is good to be returned if it doesn't contain a reference to the envID of the function it is being returned from
    b. If it is (veval something env) where env doesn't contain a reference to the envID of the function it is being returned from
    c. If it is a call to a function (func params...) and func doesn't take in a dynamic environment and params... are all good to be returned
   This makes it so that combiner calls can return partially-evaluated code - any macro-like combiner would calculate the new code and return
   (eval <constructed-code> dynamic_env), which would do what partial evaluation it could and either become a value or a call like case "b" above.
   Case "b" allows this code, essentially "tagged" with the environment it should be evaluated in, to be returned out of "macro-like" combiners,
   and this dovetails with the next point.
5. The (veval something env) form essentially "tags" a piece of code with the environment it should be evaluated in. At each stage where
   it is possible, the system checks for redundant constructions like these, where the env in (veval something env) is the currently active env.
   In this case, it unwraps it to just "something" and continues on - this completes the second half of the macro-like combiner evaluation where,
   after being returned to the calling function, the code is essentially spliced in.
6. The compiler can emit if/else branches on the wrap_level of combiners and in each branch further compile/partial eval if appropriate, allowing
   dynamic calls to either functions or combiners with the overhead of a single branch.

Note that points 4 & 5 make it so that any macro written as a combiner in "macro-style" will be expanded just like a macro would and cause no runtime overhead!
Additionally, point 6 makes it so that functions (wrap level 1 combiners) and non-parameter-evaluating (wrap level 0) combiners can be dynamically passed around and called with very minimal overhead.
Combine them together and you get a simpler but more flexible semantics than macro-based (pure functional) languages with little-to-no overhead.

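As a rough illustration of point 1, here is a minimal Python sketch of real and fake environments. The names Env, Later and lookup are made up for this sketch and are not from the actual implementation; it only shows the idea that a fake environment answers a lookup with a placeholder carrying its own env ID.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Later:
    """Placeholder for a symbol whose value is not known yet (hypothetical name)."""
    symbol: str
    needed_env_id: int  # the env that must become real before progress is possible


@dataclass
class Env:
    env_id: int
    is_real: bool
    # real env: symbol -> value; fake env: symbol -> the symbol itself
    bindings: dict = field(default_factory=dict)
    upper: Optional["Env"] = None

    def lookup(self, symbol):
        """Walk up the chain; the first env that binds the symbol decides the answer."""
        env = self
        while env is not None:
            if symbol in env.bindings:
                if env.is_real:
                    return env.bindings[symbol]
                return Later(symbol, env.env_id)
            env = env.upper
        raise KeyError(symbol)


root = Env(env_id=0, is_real=True, bindings={"x": 1})
fake = Env(env_id=1, is_real=False, bindings={"y": "y"}, upper=root)
```

Looking up x through the fake environment still reaches the real value in root, while looking up y yields a Later tagged with env ID 1.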
Additional tricky spots to look out for:
1. If you don't do the needed-for-progress tracking, you have exponential runtime
2. If you aren't careful to store analysis information on the AST node itself or to memoize, a naive tree traversal of the DAG has exponential runtime
3. Infinite recursion can hide in sneaky places, including the interplay between the partial evaluator and the compiler, and careful use of multiple recursion blockers / memoization is needed to prevent all cases
4. The invariants needed to prevent mis-evaluation are non-trivial to get right. Our invariants:
    a. All calls to user-combiners have the parameters as total values, thus not moving something that needs a particular environment underneath a different environment
    b. All return values from functions must not depend on the function's environment (there are a couple of interesting cases here, see combiner_return_ok(func_result, env_id))
    c. All array values are made up of total values
    d. Some primitive combiners don't obey "a", but they must be written with extreme care, and often partially evaluate only some of their parameters and have to keep track of which.



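Tricky spot 2 is easy to reproduce in miniature. The following Python sketch (all names invented for the example, nothing here is from the real code) builds a DAG in which every level shares one child twice, then compares a naive traversal against one that remembers which nodes it has already seen.

```python
def shared_dag(depth):
    """Build a DAG where each level references the level below twice."""
    node = ("leaf",)
    for _ in range(depth):
        node = ("pair", node, node)  # both children are the same object
    return node


def count_naive(node):
    """Visit every path: the work doubles at each level."""
    if node[0] == "leaf":
        return 1
    return 1 + count_naive(node[1]) + count_naive(node[2])


def count_memo(node, seen=None):
    """Visit each distinct node once by remembering object identity."""
    seen = set() if seen is None else seen
    if id(node) in seen:
        return 0
    seen.add(id(node))
    if node[0] == "leaf":
        return 1
    return 1 + count_memo(node[1], seen) + count_memo(node[2], seen)


dag = shared_dag(10)
naive_visits = count_naive(dag)  # 2**11 - 1 visits
memo_visits = count_memo(dag)    # 11 visits, one per distinct node
```

Storing the analysis result on the node itself has the same effect as the seen set here: each distinct node is analysed once no matter how many parents share it.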
Everything operates on AST nodes, an ADT:
    * val - integers, strings, booleans
    * marked_array
    * marked_symbol
    * comb
    * prim_comb
    * marked_env

Each AST node contains a hash representing it and its subtree.

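A possible shape for a few of these node types, sketched in Python under the assumption that the hash is computed once at construction and stored on the node. Only val, marked_symbol and marked_array are shown; the field names, the node_hash attribute and the use of SHA-256 are choices made for this sketch, not details taken from the implementation.

```python
import hashlib
from dataclasses import dataclass, field


def _digest(*parts) -> str:
    """Hash a node tag plus its fields (or its children's hashes)."""
    return hashlib.sha256("\x00".join(map(str, parts)).encode()).hexdigest()


@dataclass(frozen=True)
class Val:
    value: object
    node_hash: str = field(init=False, compare=False)

    def __post_init__(self):
        # Include the type name so that 1 and "1" hash differently.
        object.__setattr__(
            self, "node_hash", _digest("val", type(self.value).__name__, self.value))


@dataclass(frozen=True)
class MarkedSymbol:
    name: str
    is_val: bool
    node_hash: str = field(init=False, compare=False)

    def __post_init__(self):
        object.__setattr__(self, "node_hash", _digest("sym", self.name, self.is_val))


@dataclass(frozen=True)
class MarkedArray:
    is_val: bool
    values: tuple
    node_hash: str = field(init=False, compare=False)

    def __post_init__(self):
        # The subtree hash is built from the already-stored child hashes,
        # so constructing a parent never re-walks its children.
        child_hashes = [v.node_hash for v in self.values]
        object.__setattr__(
            self, "node_hash", _digest("arr", self.is_val, *child_hashes))


a = MarkedArray(False, (MarkedSymbol("+", False), Val(1), Val(2)))
b = MarkedArray(False, (MarkedSymbol("+", False), Val(1), Val(2)))
c = MarkedArray(False, (MarkedSymbol("+", False), Val(1), Val(3)))
```

Here a and b are built separately but get the same node_hash, while c differs, which is the property the recursion-stopping hashes rely on.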
fun needed_for_progress(ast_node) -> (progress_IDs, rec_stopping_hashes, extra_IDs):
    returns
    - environment IDs (stored in each AST node for it and its children)
      that must have real values if the partial evaluation of the subtree rooted at
      this node is going to make progress.

      progress_IDs is either true (meaning it will make progress no matter what), an
      intset of env IDs (the ones that will cause it to make progress), or an empty
      set, meaning it can't make forward progress no matter what
    - hashes of stopped recursive calls: if you're not inside the evaluation of one of them, the subtree could make progress
    - extra IDs for envs it contains that don't count as forward progress IDs because the
      env does have values, but envs in its parent chain don't have values.

The calculation for needed_for_progress is straightforward-ish, with some tricky bits at comb and array.

Under these definitions, we call an AST subtree a "total val" if it is either a val or its needed-for-progress IDs are nil.

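To make the three possible shapes of progress_IDs concrete, here is a small Python sketch. It assumes a parent simply unions its children's ID sets, which deliberately ignores the tricky comb and array cases mentioned above; ALWAYS, union_progress, can_make_progress and is_total_val are names invented for the sketch.

```python
ALWAYS = True  # stands in for progress_IDs == true


def union_progress(child_ids):
    """Combine children's progress IDs; ALWAYS absorbs everything."""
    combined = frozenset()
    for ids in child_ids:
        if ids is ALWAYS:
            return ALWAYS
        combined |= ids
    return combined


def can_make_progress(progress_ids, real_env_ids):
    """True if any env the subtree is waiting on is real right now."""
    if progress_ids is ALWAYS:
        return True
    return bool(progress_ids & real_env_ids)


def is_total_val(node_is_val, progress_ids):
    """A total val is a val, or a subtree with no needed-for-progress IDs."""
    return node_is_val or (progress_ids is not ALWAYS and len(progress_ids) == 0)
```

The can_make_progress check mirrors the set-intersection test that partial_eval_helper performs before doing any work on a subtree.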
fun mark(x, eval_pos):
    x is env      -> error
    x is combiner -> error
    x is symbol   -> if x == true then MarkedVal(true)
                     else if x == false then MarkedVal(false)
                     else MarkedSymbol(x, needed_IDs=if eval_pos true else nil)
    x is array    ->
        MarkedArray(is_val=!eval_pos, attempted=false, resume_hashes=nil,
                    values = [mark(x[0], eval_pos)] + [mark(xi, false) for xi in x[1:]])
    true          -> MarkedVal(x)

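The mark pseudocode above translates fairly directly into Python. This sketch uses plain Python lists for arrays and a small Symbol class for symbols, both assumptions of the example; the env and combiner error cases are left out because the sketch has no such input types, the attempted and resume_hashes fields are dropped, and the empty-array branch is a guess since the pseudocode does not say what happens there.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Symbol:
    name: str


@dataclass(frozen=True)
class MarkedVal:
    value: object


@dataclass(frozen=True)
class MarkedSymbol:
    name: str
    needed_ids: object  # True when in eval position, None otherwise


@dataclass(frozen=True)
class MarkedArray:
    is_val: bool
    values: tuple


def mark(x, eval_pos):
    if isinstance(x, Symbol):
        if x.name == "true":
            return MarkedVal(True)
        if x.name == "false":
            return MarkedVal(False)
        return MarkedSymbol(x.name, True if eval_pos else None)
    if isinstance(x, list):
        if not x:  # assumption: an empty array has no head to mark
            return MarkedArray(not eval_pos, ())
        head = mark(x[0], eval_pos)                 # only the head inherits eval_pos
        rest = tuple(mark(xi, False) for xi in x[1:])
        return MarkedArray(not eval_pos, (head,) + rest)
    return MarkedVal(x)


marked = mark([Symbol("+"), 1, Symbol("x")], True)
```

Note that only the head of a call keeps eval_pos; the parameters are always marked as values and are unvaled later by the call machinery according to the wrap level.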
fun strip(x) -> value:
    If x is an AST node representing a value, it returns the value.
    May strip recursively in the case of an array value, etc.
    Errors on env, comb (but not prim_comb!), and non-value symbols or arrays

fun try_unval(x) -> Result<ASTNode>:
    // Removes one level of "value-ness".
    x is Array  -> if !x.is_val Error()
                   else Ok(MarkedArray(is_val=false,
                                       values = [try_unval(x.values[0])] + x.values[1:]))
    x is Symbol -> if !x.is_val Error()
                   else Ok(MarkedSymbol(symbol=x.symbol, is_val=false))
    true        -> Ok(x)

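A Python sketch of try_unval follows, with two liberties that should be read as assumptions: it raises an exception (UnvalError, a made-up name) instead of returning a Result, and it treats an empty array value as unval-able to an empty non-value array.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MarkedVal:
    value: object


@dataclass(frozen=True)
class MarkedSymbol:
    name: str
    is_val: bool


@dataclass(frozen=True)
class MarkedArray:
    is_val: bool
    values: tuple


class UnvalError(Exception):
    """Raised when a node is already code and has no value-ness left to remove."""


def try_unval(x):
    if isinstance(x, MarkedArray):
        if not x.is_val:
            raise UnvalError("array is not a value")
        if not x.values:  # assumption: nothing to unval in the head position
            return MarkedArray(False, ())
        # Only the head is unvaled recursively; the parameters stay values.
        return MarkedArray(False, (try_unval(x.values[0]),) + x.values[1:])
    if isinstance(x, MarkedSymbol):
        if not x.is_val:
            raise UnvalError("symbol is not a value")
        return MarkedSymbol(x.name, False)
    return x


quoted = MarkedArray(True, (MarkedSymbol("+", True), MarkedVal(1), MarkedSymbol("x", True)))
code = try_unval(quoted)
```

After the call, code is a non-value array whose head symbol is ready to be looked up, while its parameters are still values waiting for the wrap-level loop to decide whether to unval them.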
fun check_for_env_id_in_result(env_id, x):
    return env_id in <either progress_IDs or extra_IDs in needed_for_progress(x)>
    If either progress_IDs or extra_IDs is true, then we have a fallback, but
    that doesn't get called even on large testcases, so it's either rare or impossible.
    The fallback is slow though, whereas this is just a check for set membership.

// We only allow returning a value out of a combiner if the return value
// doesn't reference the environment of the combiner
fun combiner_return_ok(func_result, env_id):
    func_result isn't later -> !check_for_env_id_in_result(env_id, func_result)
    // special cases now
    (veval body {env}) => combiner_return_ok({env}, env_id)
        // The reason we don't have to check body is that this form is only creatable in ways where body was originally a value and only needs {env}.
        // Either it's created by eval, in which case it's fine, or it's created by something like (eval (array veval x de) de2) and the array has checked it,
        // or it's created via literal vau invocation, in which case the body is a value.
    (func ...params) => func doesn't take dynamic env && all params are combiner_return_ok
    otherwise -> false

// We may end up in situations where the value/code we care about is wrapped up in
// a redundant call to veval, namely after successfully returning based on combiner_return_ok above.
// This call may prevent other optimizations though, so we should unwrap the redundant call if possible,
// and if it causes a change we should re-partially-evaluate to make further progress if we can
fun drop_redundent_veval(x, dynamic_env, env_stack, memostuff):
    (veval node env) if env.id == dynamic_env.id -> drop_redundent_veval(node, dynamic_env, env_stack, memostuff)
    (comb params...) if comb.wrap_level != -1 -> map drop_redundent_veval over params and if any change: partial_eval( (comb new_params...), dynamic_env, env_stack, memostuff)
                                                 else: x
    else -> x

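The first case of drop_redundent_veval can be shown in isolation. This Python sketch represents (veval node env) as a 3-tuple ("veval", node, env_id), which is purely a convention of the example, and it omits the second case entirely (mapping over call parameters and re-running partial evaluation when something changed).

```python
def drop_redundant_veval(x, dynamic_env_id):
    """Strip veval wrappers whose env is the env we are already evaluating in."""
    if isinstance(x, tuple) and len(x) == 3 and x[0] == "veval":
        _, node, env_id = x
        if env_id == dynamic_env_id:
            # The tag says "evaluate in env 7" and we are in env 7: splice the code in.
            return drop_redundant_veval(node, dynamic_env_id)
    return x


nested = ("veval", ("veval", "some-code", 7), 7)
other_env = ("veval", "some-code", 3)
```

With the current env ID 7, nested unwraps all the way to "some-code", whereas other_env is left alone because its tag still carries information.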
fun make_tmp_inner_env(params, de?, ue, env_id):
    ...


fun partial_eval_helper(x, only_head, env, env_stack, memostuff, force):
    needed, hashes, _extra = needed_for_progress(x)
    if force || one of hashes is not in memostuff || needed == true || set_intersection(needed, env_stack.set_of_ids_that_are_vals) != empty_set:
        x is MarkedVal      -> x
        x is MarkedEnv      -> find(x.env_id == it.env_id, env_stack) ?: x
        x is MarkedComb     -> if    !env.is_real && !x.se.is_real   // both aren't real, re-evaluation of closure creation site
                                  || env.is_real && !x.se.is_real    // new env real, but se isn't - the creation of the closure!
                               then let inner_env = make_tmp_inner_env(x.params, x.de?, env, x.env_id)
                                    in MarkedComb(se=env, body=partial_eval_helper(x.body, false, inner_env, <add inner_env to env_stack>, memostuff, false))
        x is MarkedPrimComb -> x
        x is MarkedSymbol   -> if x.is_val then x
                               else env_lookup_helper(x, env)
        x is MarkedArray    -> if x.is_val then x
                               else let
            comb = partial_eval_helper(x.values[0], only_head=true, env, env_stack, memostuff, false)
            params = x.values[1:]
            if later_head?(comb) return MarkedArray(values=[comb]+params)
            if comb.needed_for_progress == true:
                comb = partial_eval_helper(comb, only_head=false, ...)

            // If not -1, we always partial eval; if >0 we also unval/partial eval to do one full round of eval
            wrap_level = comb.wrap_level
            while wrap_level >= 0:
                if wrap_level >= 1:
                    params = map(unval, map(\x. partial_eval_helper(x, ...), params))
                params = map(\x. partial_eval_helper(x, ...), params)
                wrap_level -= 1
            if <any of the above error, or couldn't be unvaled yet>:
                return MarkedArray(values=[comb.with_wrap_level(wrap_level)] + <params at whatever level they were successfully evaluated to>)

            if comb is MarkedPrimComb:
                result = comb.impl(params)
                if result == 'LATER:
                    return MarkedArray(values=[comb.with_wrap_level(wrap_level)] + params)
                else:
                    return result

            if comb.is_variadic:
                params = params[:comb.params.len-1] + [ params[comb.params.len-1:] ]

            inner_env = MarkedEnv(id=comb.env_id, possible_de_symbol=comb.de?, possible_de=env, symbols=comb.params, values=params, upper=comb.se)

            rec_stop_hash = combine_hash(inner_env.hash, comb.body.hash)
            if rec_stop_hash in memostuff:
                return MarkedArray(values=[comb] + params, transient_needed_env_id=true, rec_stopping_hash=rec_stop_hash)

            memostuff.add(rec_stop_hash)
            result = partial_eval_helper(comb.body, false, inner_env, <add inner_env to env_stack>, memostuff, false)
            memostuff.remove(rec_stop_hash)

            if !combiner_return_ok(result, comb.env_id):
                transiently_needed = if comb.de? != nil then env.id else nil
                return MarkedArray(values=[comb] + params, transient_needed_env_id=transiently_needed, rec_stopping_hash=rec_stop_hash)

            return drop_redundent_veval(result, env, env_stack, memostuff)

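The add / evaluate / remove dance around rec_stop_hash in partial_eval_helper is a standard in-progress guard. A stripped-down Python version, with the hash replaced by an arbitrary key and the residual call replaced by a ("later", key) tuple (both simplifications made for this sketch), looks like this:

```python
def guarded_call(key, in_progress, body):
    """Run body() unless a call with the same key is already on the stack."""
    if key in in_progress:
        return ("later", key)  # stop here and leave a residual marker
    in_progress.add(key)
    try:
        return body()
    finally:
        in_progress.discard(key)  # the key only blocks calls nested inside this one


in_progress = set()


def loop():
    # A "function" whose body calls itself with exactly the same inputs.
    return guarded_call("loop-body+env", in_progress, loop)


result = loop()
```

The outer call enters, the nested call sees its own key and returns the marker, and the set is empty again afterwards so the same call can be retried later from a different context.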
And then we define a root_env with PrimComb versions of all of the standard functions.
The ones that are most interesting and interact the most with partial evaluation are
    vau eval cond
The other key is that array only takes in values, that is, an array value never hides something that isn't a total value and needs more partial evaluation
(this makes a lot of things simpler in other places since we can treat array values as values no matter what and know things aren't hiding in sneaky places)

fun needs_params_prim(...):
    ...
fun give_up_params_prim(...):
    ...

fun veval_inner(only_head, de, env_stack, memostuff, params):
    body = params[0]
    implicit_env = len(params) != 2
    eval_env = if implicit_env { de } else { partial_eval_helper(params[1], only_head, de, env_stack, memostuff, false) }
    evaled_body = partial_eval_helper(body, only_head, eval_env, env_stack, memostuff, false)
    if implicit_env or combiner_return_ok(evaled_body, eval_env.id):
        return drop_redundent_veval(evaled_body, de, env_stack, memostuff)
    else:
        return drop_redundent_veval(MarkedArray(values=[MarkedPrimComb('veval, wrap_level=-1, val_head_ok=true, handler=veval_inner), evaled_body, eval_env]), de, env_stack, memostuff)

root_env = {
    eval: MarkedPrimComb('eval, wrap_level=1, val_head_ok=true, handler=lambda(only_head, de, env_stack, memostuff, params):
        let
            body = params[0]
            implicit_env = len(params) != 2
        return veval_inner(only_head, de, env_stack, memostuff, if implicit_env { [try_unval(body)] } else { [try_unval(body), params[1]] })
    )
    vapply: MarkedPrimComb('vapply, wrap_level=1, val_head_ok=true, handler=lambda(only_head, de, env_stack, memostuff, [func params env]):
        return veval_inner(only_head, de, env_stack, memostuff, [MarkedArray(values=[func]+params), env])
    )
    lapply: MarkedPrimComb('lapply, wrap_level=1, val_head_ok=true, handler=lambda(only_head, de, env_stack, memostuff, [func params env]):
        return veval_inner(only_head, de, env_stack, memostuff, [MarkedArray(values=[func.offset_wrap_level(-1)]+params), env])
    )
    vau: MarkedPrimComb('vau, wrap_level=0, val_head_ok=true, handler=lambda(only_head, de, env_stack, memostuff, params):
        let
            de? = if len(params) == 3 { params[0].symbol_value } else { nil }
            // kept under a separate name so that params still refers to the raw arguments below
            param_syms = map(lambda(x): x.symbol_value, if de? { params[1] } else { params[0] })
            variadic = '& in param_syms
            param_syms.remove('&)
            implicit_env = len(params) != 2
            body = try_unval(if de? { params[2] } else { params[1] })
            env_id = <new_id>
        if !only_head:
            inner_env = make_tmp_inner_env(param_syms, de?, upper=de, id=env_id)
            body = partial_eval_helper(body, false, inner_env, <add inner_env to env_stack>, memostuff, false)
        return MarkedComb(wrap_level=0, id=env_id, de?=de?, static_env=de, variadic=variadic, params=param_syms, body=body)
    )
    wrap: ...<returns new MarkedPrimComb/MarkedComb with incremented wrap_level>...
    unwrap: ...<returns new MarkedPrimComb/MarkedComb with decremented wrap_level>...
    cond: ...
        ...Oddly tricky - is wrap_level 0, but...
        ... 1. unvals & partially evaluates starting from the first condition
        ... 2. if this condition is true, return the unvaled & partially evaluated corresponding arm
        ... 3. if this condition is false, drop the arm and return to 1
        ... 4. otherwise, the condition is unknown (neither known true nor known false)
        ... 5. check to see if combine_hash(x.hash, env.hash) is in memostuff (prevent infinite recursion blocked on a cond guard!)
        ... 6. if the hash was in memostuff, return MarkedArray(later_hash=the_hash,
        ...        values=[MarkedPrimComb('vcond, wrap_level=-1, ...)] + map(unval, <remaining preds/arms>))
        ... 7. else new_preds_arms = map(partial_eval..., map(unval, <remaining preds/arms>))
        ... <TODO: 8. remove arms/preds now guaranteed to be false, remove all arms/preds after first true>
        ... 9. return MarkedArray(values=[MarkedPrimComb('vcond, wrap_level=-1, ...)] + new_preds_arms)
        ...
        ...The vcond is like cond but doesn't do any unvaling (as it's already been done) (and wrap_level is set to -1 so the function call machinery doesn't touch the params either)
        ...
    symbol?: needs_params_prim(symbol?)
    int?: needs_params_prim(int?)
    string?: needs_params_prim(string?)
    combiner?: ...
    env?: ...
    nil?: needs_params_prim(nil?)
    bool?: needs_params_prim(bool?)
    str-to-symbol: needs_params_prim(str-to-symbol)
    get-text: needs_params_prim(get-text)
    array?: ...
    array: ...
    len: ...
    idx: ...
    slice: ...
    concat: ...
    +: needs_params_prim(+)
    -: needs_params_prim(-)
    *: needs_params_prim(*)
    /: needs_params_prim(/)
    %: needs_params_prim(%)
    band: needs_params_prim(band)
    bor: needs_params_prim(bor)
    bnot: needs_params_prim(bnot)
    bxor: needs_params_prim(bxor)
    <<: needs_params_prim(<<)
    >>: needs_params_prim(>>)
    =: needs_params_prim(=)
    !=: needs_params_prim(!=)
    <: needs_params_prim(<)
    <=: needs_params_prim(<=)
    >: needs_params_prim(>)
    >=: needs_params_prim(>=)
    str: needs_params_prim(true_str)
    log: give_up_params_prim(log)
    error: give_up_params_prim(error)
    read-string: needs_params_prim(read-string)
    empty_env: MarkedEnv()
}

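The cond entry in root_env above is the least obvious of the primitives, so here is a simplified Python sketch of steps 1-4, 7 and 9. It skips the memostuff recursion check (steps 5-6) and the TODO pruning in step 8. The UNKNOWN sentinel, the (pred, arm) pair layout, the ("vcond", ...) tuple, and the behaviour when every predicate is known false are all assumptions of the sketch.

```python
UNKNOWN = object()  # a predicate that partial evaluation could not decide


def partial_eval_cond(pairs, eval_pred, eval_arm):
    """pairs is a list of (pred, arm); eval_pred returns True, False or UNKNOWN."""
    for i, (pred, arm) in enumerate(pairs):
        verdict = eval_pred(pred)
        if verdict is True:
            return eval_arm(arm)   # step 2: the whole cond collapses to this arm
        if verdict is False:
            continue               # step 3: drop the arm, try the next predicate
        # Steps 4, 7, 9: first undecidable predicate. Partially evaluate the
        # remaining arms but keep all of them, since step 8 is not done.
        rest = [(p, eval_arm(a)) for p, a in pairs[i:]]
        return ("vcond", rest)
    return ("vcond", [])           # assumption: no arm left, residual with no arms


identity = lambda v: v
collapsed = partial_eval_cond([(False, "a"), (True, "b"), (UNKNOWN, "c")], identity, identity)
residual = partial_eval_cond([(False, "a"), (UNKNOWN, "b"), (True, "c")], identity, identity)
```

In the first call the known-false arm is dropped and the cond collapses to "b"; in the second the unknown predicate forces a residual vcond that still carries the later known-true arm.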
fun compile(...):
    ...
    ... tagged words, etc
    ... eval
    ... vau / vau helper closure
    ...
    Note that when it's compiling a call, it compiles an if/else chain on the wrap level of the combiner being called.
        in the 0 branch, it emits the parameters as constant data
        in the 1 branch, it unvals and partial evals all of the parameters before compiling them.
            - note that this must be robust to partial-eval errors, as this branch might not ever happen at runtime and be nonsense code!
            - if the partial evaluation errors, it emits a value that will cause an error at runtime into the compiled code
        in the > 1 branch, it errors
    ...
    ...
    Must be careful about infinite recursion, including tricky cases that infinitely ping back and forth between
    partial eval and compile even though both have individual internal recursion checks
    ...
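The wrap-level branch emitted for a call can be mimicked with closures. In this Python sketch, "compiled code" is just a function of the runtime environment; Combiner, compile_call and the environment-as-dict layout are inventions of the example and say nothing about the real code generator.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Combiner:
    wrap_level: int
    impl: Callable


def compile_call(get_comb, raw_params, compiled_params):
    """get_comb: env -> Combiner. raw_params: the unevaluated parameter code.
    compiled_params: one env -> value function per parameter."""
    def run(env):
        comb = get_comb(env)
        if comb.wrap_level == 0:
            # Operative: hand over the parameters as constant data.
            return comb.impl(*raw_params)
        if comb.wrap_level == 1:
            # Applicative: run the compiled parameter code first.
            return comb.impl(*(p(env) for p in compiled_params))
        raise RuntimeError("wrap_level > 1 is not supported in compiled calls")
    return run


add = Combiner(1, lambda a, b: a + b)
quote = Combiner(0, lambda *code: code)
call = compile_call(lambda env: env["f"], ("x", 3), [lambda env: env["x"], lambda env: 3])
```

The same compiled call site works whether f turns out to be a function or an operative at runtime, at the cost of the single branch on wrap_level.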