Nathan Braswell
8b3cab7a2f
Fix multiple cond/slice bugs revealed by LotusRonin's new find testcase
2022-04-24 20:39:51 -04:00
Nathan Braswell
6c51639c6e
Thread through inline_symbols and inline_level to prep for inlining impl
2022-04-23 01:41:52 -04:00
Nathan Braswell
18250e716f
Ah, the remaining calls were to =. Added 'inlining' the = and comp_helper loop into repeated calls to comp_helper_helper, eliminating the param array overhead. Now fib only allocates 10 times (instead of 4 million), and runs in .107s, finally beating Python handily and becoming about 2x as slow as Chez. Feels like a decent spot for now, and that was most all of the low hanging fruit. The only thing left now is inlining of user functions to get fib_let performing as well - it looks glacial now at .4s because of the 2 remaining closure calls that the let expands to
2022-04-21 01:09:10 -04:00
Nathan Braswell
0cb52eb0b4
Add inlining of add and subtract, and now might be beating Python, though not by a statistically significant amount with the number of tests. Fib is still allocating 4 million times or so, which is weird, since +&- should have been the last calls to do so. Time to track that down
2022-04-20 23:47:36 -04:00
Nathan Braswell
ec9f8d9d10
Implement unwrapped static calls! Modest speedup of 0.50 -> 0.43, I believe because calls to + and - still create the arrays. Still less than expected, though
2022-04-20 02:27:22 -04:00
Nathan Braswell
c2dbac67f5
Add, and move setup to, wrapper func for each user func. Next need to actually call the non-wrapper version if applicable...
2022-04-19 02:00:56 -04:00
Nathan Braswell
5cdaafebe2
Change lapply to optionally take in an explicit env, make it optional for vapply so they match, then tweak Y such that it threads the dynamic env through, then implement eta-reduction in the compiler backend. This provides about the same speedup again from the Y elimination, as it's kinda the other half for fully getting rid of Y such that there's just static recursive calls. fib.kp went from 1.7 -> 1.1 -> 0.5, and fib_let similarly. fib.kp is now faster than fib_manual, but just by a bit.
2022-04-17 01:52:01 -04:00
Nathan Braswell
3009b62f5e
Mostly eliminating Y combinator at compilation time by putting function values in memo early if we have env_val and we put in the anti-recursion hash from the partially evaled call that returned this comb, and then compiling calls also looks for its recursion-stopped hashes in memo. To finish the transformation, I need to perform an Eta-reduction as well, but we've already got over half of the speedup from eliminating the Y part and just leaving (lambda (& y) (lapply <now_const_func!> y)).
2022-04-14 02:49:00 -04:00
Nathan Braswell
8b21a6c55e
Use hyperfine to benchmark, add builtin_fib as a comparison for how fast we could try to be
2022-04-13 00:25:53 -04:00
Nathan Braswell
645b9f7172
Perhaps over-complicated attempt to only reify envs when actually necessary. Also got a speedup from simplifying params creation when neither variadic nor uses de, which is really the main speedup here. Hopefully this is still a step forwards that will become more apparent with the removal of reifying params too, and inlining. Might be being foiled by the recursive call going through Y or something. Did see a reduction in allocations with the no-reifying thing, but only from 35mil to 34mil. Seems like it should be more with the number of leaf calls in fib, not sure what's up. Maybe there's more overhead going through Y than I thought and it's all of that?
2022-04-11 02:17:17 -04:00
Nathan Braswell
d92f774c33
Add help message based on Marcus's suggestion
2022-04-10 10:45:52 -04:00
Nathan Braswell
1149363e62
Add debug_levels and turn off stack_traces by default, but save enough info about the last interaction with the top-level loop to enable re-running to problem spot with debugging on if it happens, and it works! This is the first step towards the opt/non-opt-wrap work while maintaining debuggability
2022-04-09 00:45:58 -04:00
Nathan Braswell
7116012be1
Better debug parameter message
2022-04-06 00:24:34 -04:00
Nathan Braswell
dcc81ac2eb
make prints of top level strings not include the quotes
2022-04-06 00:13:46 -04:00
Nathan Braswell
db7c258d39
Make printing stack/env nicer
2022-04-05 23:59:18 -04:00
Nathan Braswell
29f02810f8
More debug work, including adding the code tracking through marked_array for the stack traces, calling into debug when eval has a symbol not defined error (just the first error spot to do this, we can add them all gradually), allowing abort for debug, and adding (exit val) for debug that resumes execution
2022-04-05 00:30:03 -04:00
Nathan Braswell
99e24ac6a0
Add rough stack trace
2022-04-04 01:35:06 -04:00
Nathan Braswell
00299a8d3a
Fixed read, started in on a debug function with a repl and ability to exit. Haven't actually added any other debug functionality, but thought about how to do stack traces (linked list of env functional val pairs).
2022-04-02 01:01:34 -04:00
Nathan Braswell
d5b11ca037
compile static calls to static wasm calls
2022-04-01 01:06:40 -04:00
Nathan Braswell
b87afc6a12
Round allocated blocks up to the nearest 8 words, and split blocks that are >= 2x needed number of words. Now only allocates 1 wasm page for both compiled and interpreted versions at fib 30, a 269-538x improvement!
2022-03-30 21:27:01 -04:00
Nathan Braswell
b85873b240
Fixed a terrible bug where turns out I used the same name for a block and a parameter in the comparison meta-function -
they share the same namespace in the wasm DSL, so when I used it like a parameter in a loop it resolved to the number of scopes between the statement and the block'th parameter
which had the same type and the calculation worked fine, but it overwrote the parameter I thought wasn't being used and called a function later with.
Also, that seemed like the last of the really bad leaks & corruption, so re-enabling the freelist and am able to run some less-trivial benchmarks, namely (fib 30)!
The compiled version is between 4.7x and 65x slower than Python, depending on if you're using wasmer, wasmtime, wasm3. Would like to try WAVM as well.
A solid place to start, I think, and hopefully we'll be faster than Python once I implement a variety of dont-be-dumb optimizations (real-er malloc, static calls of statically known functions, etc)
2022-03-29 23:49:51 -04:00
Nathan Braswell
cb41fe3fc8
Added a singly-linked list with the value-created 2 word header to malloced blocks that tracks everything ever malloced, and an assertion on free that the refcount is 0. Found what I think was a (the?) key source of corruption - drop moving the ptr forwards to drop subs, but then freeing that moved ptr. Now fixed, things look much less weird, but there are remaining memory leaks to track down
2022-03-28 23:57:38 -04:00
Nathan Braswell
f5ba367096
Debugging refcounting, fixed 3 mem leaks so far
2022-03-24 01:58:49 -04:00
Nathan Braswell
5c1473d32c
Continue brain-dumping pseudocode and notes. I think I've got most everything critical now
2022-03-21 23:52:07 -04:00
Nathan Braswell
6fa2c44619
Add no_compile option to test more straight dynamic eval with a fib and fact test. Compiled is faster, though only 2x on fib - I imagine the hot inner loop isn't actually doing a lot that can be partial evaled, it's the outside. Will need tests that exercise more
2022-03-19 01:48:58 -04:00
Nathan Braswell
f0d68c3efe
Finally implemented runtime vau with variadic, which involved a half-rewrite
2022-03-17 23:20:22 -04:00
Nathan Braswell
f10be4511f
Add support for de to runtime vaus as well as parameter length checking. Do need to add support for variadic functions...
2022-03-17 00:35:21 -04:00
Nathan Braswell
1a2ecd65b0
Implemented runtime vau, but still need to add support for functions taking in the dynamic env (gotta shift those env arrays around)
2022-03-16 02:10:29 -04:00
Nathan Braswell
67ba716003
Add runtime version of cond
2022-03-15 02:13:42 -04:00
Nathan Braswell
31ee20be7b
Implemented wasm eval and fixed slice
2022-03-13 22:56:04 -04:00
Nathan Braswell
1b220023bc
Fix cond to not die on guarded errors, implement a new if in macro-style, port some more over to_compile.kp. Stopped just before 'vau, which seems to loop forever or somesuch
2022-03-13 15:11:30 -04:00
Nathan Braswell
947d854ebb
Implement array functions (len idx slice concat) for strings in wasm versions. All work - note I think slice is broken (or at least exposes brokenness) for arrays (not the newly added strings)!
2022-03-13 01:44:38 -05:00
Nathan Braswell
d1b6e520f9
Added support for strings to array functions for evaluator (compiled is next)
2022-03-12 20:19:00 -05:00
Nathan Braswell
d87f292c1c
Additional optimization using intset for env_stack, some small bugfixes regarding not making a marked_array out of components that errored, moved over a lot of code to to_compile.kp.
2022-03-10 01:06:44 -05:00
Nathan Braswell
a08415e1e6
Cleanup & some demo code for presentation
2022-03-08 15:55:59 -05:00
Nathan Braswell
7fed3a58f5
Added more to to_compile.kp and runtime started growing again - main bottleneck was the silly using lists as sets thing, changed the small-int uses of these to a new custom bitset and brought the time from 47s back down to 6s. There is a remaining hotspot where partial_eval_helper matches needed_for_progress vs the bitset, but that'll have to wait for tomorrow. Thinking of maintaining an env_stack bitset and adding a bitset_union_nonempty function. Note the new bitset does use some cons/car/cdr operations that'll be a bit different in Kraken, which I'll need to look at. Maybe when porting I can just use indexing if there's not a great way to unify them.
2022-03-08 02:54:26 -05:00
Nathan Braswell
90fe8e1bfa
Bunch of optimization that took us from 3:50 to 0:04 for the current to_compile.kp. Mainly pulling len out of hot loops and using a naive binary tree instead of alists for maps
2022-03-07 02:10:42 -05:00
Nathan Braswell
c8c9bba429
Have nodes carry around information about the additional non-real envs that aren't real because of a non-real env in their chain. These envs don't show up in needed partial idxs, since it's the up the chain env that actually needs progressing, but allow us to do check-for-env-id normally in essentially O(1). This made the function much more efficient by number of invocations and cut some of the other hottest functions by nearly an order of magnitude, but only took 15-20 seconds off of a 4 minute compile. This is unfortunate (Chez profile only shows invocation numbers, not time numbers, so this is hard to tell) but at least this part is better now.
2022-03-06 03:22:35 -05:00
Nathan Braswell
8cdf41826b
Starting to port over & self-host!
2022-03-03 00:33:25 -05:00
Nathan Braswell
4a273c9ba2
Bugfix error infinite recursion, error printing, wrap_level not being in hash_comb, extend to_compile.kp a bit
2022-03-02 01:44:20 -05:00
Nathan Braswell
dd0463d059
Comment out generated debugging and other log based code for large speedup - tried several other optimizations but they counterintuitively made things worse
2022-02-28 23:47:02 -05:00
Nathan Braswell
3f26a3ad7d
Finish porting mif and fixing up other inconsistencies. Fix bug for emitting signed numbers as hex in compile. Runs correctly in both Chez and Chicken interpreter now, with Chez being about 3x faster
2022-02-28 00:27:19 -05:00
Nathan Braswell
ea15f48d6f
Implement dlambda and correct dlet. More attempt at Gambit
2022-02-23 16:43:03 -05:00
Nathan Braswell
54097ac074
Port the let+ macro from http://www.phyast.pitt.edu/~micheles/scheme/scheme15.html over mostly, and it works in both Chez and Chicken! Will massage some more to get it to be the same as our previous dlet, but it is working!
2022-02-23 00:56:46 -05:00
Nathan Braswell
f8bab2ada5
I caught the Chicken compiler red handed, its compiled version has zip change behavior part way through, caught in the act with some prints. Where it does so changes based on optimization level, which is a bad sign. Starting a (hopefully quick) port to more standard scheme - looking to support Chez and Gambit in addition to Chicken, with at least some commented out code if not some sort of conditional compilation. We're off to a roaring start with define-syntax broken in Gambit 4.9.3, from 2019, but there was a new version released last month that I think should fix it.
2022-02-22 02:19:17 -05:00