summaryrefslogtreecommitdiff
path: root/static/src/_posts
diff options
context:
space:
mode:
Diffstat (limited to 'static/src/_posts')
-rw-r--r--static/src/_posts/2022-04-03-ginger-a-small-vm-update.md222
1 files changed, 222 insertions, 0 deletions
diff --git a/static/src/_posts/2022-04-03-ginger-a-small-vm-update.md b/static/src/_posts/2022-04-03-ginger-a-small-vm-update.md
new file mode 100644
index 0000000..74395f0
--- /dev/null
+++ b/static/src/_posts/2022-04-03-ginger-a-small-vm-update.md
@@ -0,0 +1,222 @@
+---
+title: >-
+ Ginger: A Small VM Update
+description: >-
+ It works gooder now.
+tags: tech
+series: ginger
+---
+
+During some recent traveling I had to be pulled away from cryptic-net work for a
+while. Instead I managed to spend a few free hours, and the odd international
+plane ride, to fix the ginger vm.
+
+The problem, as it stood, was that it only functioned "correctly" in a very
+accidental sense. I knew from the moment that I published it that it would get
+mostly rewritten immediately.
+
+And so here we are, with a rewritten vm and some new realizations.
+
+## Operation
+
+The `Operation` type was previously defined like so:
+
+```
+type Operation interface {
+ Perform([]Thunk, Operation) (Thunk, error)
+}
+```
+
+I'm not going to explain it, because it's both confusing and wrong.
+
+One thing that is helpful in a refactor, especially in a strongly typed
+language, is to tag certain interfaces as being axiomatic, and conforming the
+rest of your changes around those. If those interfaces are simple enough to
+apply broadly _and_ accurately describe desired behavior, they will help
+pre-decide many difficult decisions you'd otherwise have to make.
+
+So with that mind, I tagged `Operation` as being an axiomatic interface, given
+that ginger is aiming to be a functional language (and I'm wondering if I should
+just rename `Operation` to `Function`, while I'm at it). The new definition of
+the interface is:
+
+```
+type Operation interface {
+ Perform(Value) Value
+}
+```
+
+`Operation` takes and argument and returns a result, it could not possibly be
+boiled down any further. By holding `Operation` to this definition and making
+decisions from there, it was pretty clear what the next point of attack was.
+
+## If/Recur
+
+The reason that `Operation` had previously been defined in such a fucked up way
+was to support the `if` and `recur` `Operation`s, as if they weren't different
+than any other `Operation`s. But truthfully they are different, as they are
+actually control flow constructs, and so require capabilities that no other
+`Operation` would be allowed to use anyway.
+
+The new implementation reflects this. `if` and `recur` are now both handled
+directly by the compiler, while global `Operation`s like `tupEl` are
+implementations of the `Operation` interface.
+
+## Compile Step
+
+The previous iteration of the vm hadn't distinguished between a compile step and
+a run step. In a way it did both at the same time, by abusing the `Thunk` type.
+Separating the two steps, and ditching the `Thunk` type in the process, was the
+next major change in the refactoring.
+
+The compile step can be modeled as a function which takes a `Graph` and returns
+an `Operation`, where the `Graph`'s `in` and `out` names correspond to the
+`Operation`'s argument and return, respectively. The run step then reads an
+input from the user, calls the compiled `Operation` with that input, and outputs
+the result back to the user.
+
+As an example, given the following program:
+
+```
+* six-or-more.gg
+
+max = {
+ a = tupEl < (in, 0)
+ b = tupEl < (in, 1)
+ out = if < (gt < (a, b), a, b)
+}
+
+out = max < (in, 6)
+```
+
+we want to compile an `Operation` which accepts a number and returns the greater
+of that number and 6. I'm going to use anonymous go functions to demonstrate the
+anatomy of the compiled `Operation`, as that's what's happening in the current
+compiler anyway.
+
+```
+// After compilation, this function will be in-memory and usable as an
+// Operation.
+
+sixOrMore := func(in Value) Value {
+
+ max := func(in Value) Value {
+
+ a := tupEl(in, 0)
+ b := tupEl(in, 1)
+
+ if a > b {
+ return a
+ }
+
+ return b
+ }
+
+ return max(in, 6)
+}
+```
+
+Or at least, this is what I tried for _initially_. What I found was that it was
+easier, in the context of how `graph.MapReduce` works, to make even the leaf
+values, e.g. `in`, `0`, `1`, and `6`, map to `Operations` as well. `in` is
+replaced with an anonymous function which returns its argument, and the numbers
+are replaced with anonymous functions which ignore their argument and always
+return their respective number.
+
+So the compiled `Operation` looks more like this:
+
+```
+// After compilation, this function will be in-memory and usable as an
+// Operation.
+
+sixOrMore := func(in Value) Value {
+
+ max := func(in Value) Value {
+
+ a := tupEl(
+ func(in Value) Value { return in }(in),
+ func(_ Value) Value { return 0}(in),
+ )
+
+ b := tupEl(
+ func(in Value) Value { return in }(in),
+ func(_ Value) Value { return 1}(in),
+ )
+
+ if a > b {
+ return a
+ }
+
+ return b
+ }
+
+ return max(
+ func(in Value) Value { return in }(in),
+ func(_ Value) Value { return 6}(in),
+ )
+}
+```
+
+This added layer of indirection for all leaf values is not great for
+performance, and there's probably further refactoring which could be done to
+make the result look more like the original ideal.
+
+To make things a bit messier, even that representation isn't quite accurate to
+the current result. The compiler doesn't properly de-duplicate work when
+following name values. In other words, everytime `a` is referenced within `max`,
+the `Operation` which the compiler produces will recompute `a` via `tupEl`.
+
+So the _actual_ compiled `Operation` looks more like this:
+
+```
+// After compilation, this function will be in-memory and usable as an
+// Operation.
+
+sixOrMore := func(in Value) Value {
+
+ return func(in Value) Value {
+
+ if tupEl(func(in Value) Value { return in }(in), func(_ Value) Value { return 0}(in)) >
+ tupEl(func(in Value) Value { return in }(in), func(_ Value) Value { return 1}(in)) {
+
+ return tupEl(func(in Value) Value { return in }(in), func(_ Value) Value { return 0}(in))
+ }
+
+ return tupEl(func(in Value) Value { return in }(in), func(_ Value) Value { return 1}(in))
+ }(
+ func(in Value) Value { return in }(in),
+ func(_ Value) Value { return 6}(in),
+ )
+}
+```
+
+Clearly, there's some optimization to be done still.
+
+## Results
+
+While it's still not perfect, the new implementation is far and away better than
+the old. This can be seen just in the performance for the fibonacci program:
+
+```
+# Previous VM
+
+$ time ./eval "$(cat examples/fib.gg)" 10
+55
+
+real 0m8.737s
+user 0m9.871s
+sys 0m0.309s
+```
+
+```
+# New VM
+
+$ time ./eval "$(cat examples/fib.gg)" 50
+12586269025
+
+real 0m0.003s
+user 0m0.003s
+sys 0m0.000s
+```
+
+They're not even comparable.