summaryrefslogtreecommitdiff
path: root/static/src/_posts/2021-12-31-ginger-its-alive.md
diff options
context:
space:
mode:
Diffstat (limited to 'static/src/_posts/2021-12-31-ginger-its-alive.md')
-rw-r--r--static/src/_posts/2021-12-31-ginger-its-alive.md422
1 files changed, 0 insertions, 422 deletions
diff --git a/static/src/_posts/2021-12-31-ginger-its-alive.md b/static/src/_posts/2021-12-31-ginger-its-alive.md
deleted file mode 100644
index efd6bd0..0000000
--- a/static/src/_posts/2021-12-31-ginger-its-alive.md
+++ /dev/null
@@ -1,422 +0,0 @@
----
-title: >-
- Ginger: It's Alive!
-description: >-
- The new best language for computing fibonacci numbers.
-series: ginger
-tags: tech
----
-
-As a kind of Christmas present to myself I took a whole week off of work
-specifically to dedicate myself to working on ginger.
-
-My concrete goal was to be able to run a ginger program to compute any Nth
-fibonacci number, a goal I chose because it would require the implementation of
-conditionals, some kind of looping or recursion, and basic addition/subtraction.
-In other words, it would require all the elements which comprise a Turing
-complete language.
-
-And you know what? I actually succeeded!
-
-The implementation can be found [here][impl]. At this point ginger is an
-interpreted language running in a golang-based VM. The dream is for it to be
-self-hosted on LLVM (and other platforms after), but as an intermediate step to
-that I decided on sticking to what I know (golang) rather than having to learn
-two things at once.
-
-In this post I'm going to describe the components of this VM at a high level,
-show a quick demo of it working, and finally talk about the roadmap going
-forward.
-
-[impl]: https://github.com/mediocregopher/ginger/tree/ebf57591a8ac08da8a312855fc3a6d9c1ee6dcb2
-
-## Graph
-
-The core package of the whole project is the [`graph`][graph] package. This
-package implements a generic directed graph datastructure.
-
-The generic part is worth noting; I was able to take advantage of go's new
-generics which are currently [in beta][go118]. I'd read quite a bit on how the
-generic system would work even before the beta was announced, so I was able to
-hit the ground running and start using them without much issue.
-
-Ginger's unique graph datastructure has been discussed in previous posts in this
-series quite a bit, and this latest implementation doesn't deviate much at a
-high level. Below are the most up-to-date core datatypes and functions which are
-used to construct ginger graphs:
-
-```go
-
-// Value is any value which can be stored within a Graph. Values should be
-// considered immutable, ie once used with the graph package their internal
-// value does not change.
-type Value interface {
- Equal(Value) bool
- String() string
-}
-
-// OpenEdge consists of the edge value (E) and source vertex value (V) of an
-// edge in a Graph. When passed into the AddValueIn method a full edge is
-// created. An OpenEdge can also be sourced from a tuple vertex, whose value is
-// an ordered set of OpenEdges of this same type.
-type OpenEdge[E, V Value] struct { ... }
-
-// ValueOut creates a OpenEdge which, when used to construct a Graph, represents
-// an edge (with edgeVal attached to it) coming from the vertex containing val.
-func ValueOut[E, V Value](edgeVal E, val V) *OpenEdge[E, V]
-
-// TupleOut creates an OpenEdge which, when used to construct a Graph,
-// represents an edge (with edgeVal attached to it) coming from the vertex
-// comprised of the given ordered-set of input edges.
-func TupleOut[E, V Value](edgeVal E, ins ...*OpenEdge[E, V]) *OpenEdge[E, V]
-
-// Graph is an immutable container of a set of vertices. The Graph keeps track
-// of all Values which terminate an OpenEdge. E indicates the type of edge
-// values, while V indicates the type of vertex values.
-type Graph[E, V Value] struct { ... }
-
-// AddValueIn takes a OpenEdge and connects it to the Value vertex containing
-// val, returning the new Graph which reflects that connection.
-func (*Graph[E, V]) AddValueIn(val V, oe *OpenEdge[E, V]) *Graph[E, V]
-
-// ValueIns returns, if any, all OpenEdges which lead to the given Value in the
-// Graph (ie, all those added via AddValueIn).
-func (*Graph[E, V]) ValueIns(val Value) []*OpenEdge[E, V]
-
-```
-
-The current `Graph` implementation is _incredibly_ inefficient, it does a lot of
-copying, looping, and equality checks which could be optimized out one day.
-That's going to be a recurring theme of this post, as I had to perform a
-balancing act between actually reaching my goal for the week while not incurring
-too much tech debt for myself.
-
-[graph]: https://github.com/mediocregopher/ginger/blob/ebf57591a8ac08da8a312855fc3a6d9c1ee6dcb2/graph/graph.go
-[go118]: https://go.dev/blog/go1.18beta1
-
-### MapReduce
-
-There's a final operation I implemented as part of the `graph` package:
-[MapReduce][mapreduce]. It's a difficult operation to describe, but I'm going to
-do my best in this section for those who are interested. If you don't understand
-it, or don't care, just know that `MapReduce` is a generic tool for transforming
-graphs.
-
-For a description of `MapReduce` we need to present an example graph:
-
-```
- +<--b---
- + \
-X <--a--+<--c----+<--f-- A
- + /
- + +<---g---
- +<--d--+
- +<---h---
- \
-Y <---------e----------- B
-```
-
-Plus signs indicate tuples, and lowercase letters are edge values while upper
-case letters are vertex values. The pseudo-code to construct this graph in go
-might look like:
-
-```go
- g := new(Graph)
-
- fA := ValueOut("f", "A")
-
- g = g.AddValueIn(
- "X",
- TupleOut(
- "a",
- TupleOut("b", fA),
- TupleOut("c", fA),
- TupleOut(
- "d",
- ValueOut("g", "A"),
- ValueOut("h", "B"),
- ),
- ),
- )
-
- g = g.AddValueIn("e", "B")
-```
-
-As can be seen in the [code][mapreduce], `MapReduce`'s first argument is an
-`OpenEdge`, _not_ a `Graph`. Fundamentally `MapReduce` is a reduction of the
-_dependencies_ of a particular value into a new value; to reduce the
-dependencies of multiple values at the same time would be equivalent to looping
-over those values and calling `MapReduce` on each individually. Having
-`MapReduce` only deal with one edge at a time is more flexible.
-
-So let's focus on a particular `OpenEdge`, the one leading into `X` (returned by
-`TupleOut("a", etc...)`. `MapReduce` is going to descend into this `OpenEdge`
-recursively, in order to first find all value vertices (ie the leaf vertices,
-those without any children of their own).
-
-At this point `MapReduce` will use its second argument, the `mapVal` function,
-which accepts a value of one type and returns a value of another type. This
-function is called on each value from every value vertex encountered. In this
-case both `A` and `B` are connectable from `X`, so `mapVal` will be called on
-each _only once_. This is the case even though `A` is connected to multiple
-times (once with an edge value of `f`, another with an edge value of `b`).
-`mapVal` only gets called once per vertex, not per connection.
-
-With all values mapped, `MapReduce` will begin reducing. For each edge leaving
-each value vertex, the `reduceEdge` function is called. `reduceEdge` accepts as
-arguments the edge value of the edge and the _mapped value_ (not the original
-value) of the vertex, and returns a new value of the same type that `mapVal`
-returned. Like `mapVal`, `reduceEdge` will only be called once per edge. In our
-example, `<--f--A` is used twice (`b` and `c`), but `reduceEdge` will only be
-called on it once.
-
-With each value vertex edge having been reduced, `reduceEdge` is called again on
-each edge leaving _those_ edges, which must be tuple edges. An array of the
-values returned from the previous `reduceEdge` calls for each of the tuples'
-input edges is used as the value argument in the next call. This is done until
-the `OpenEdge` is fully reduced into a single value.
-
-To flesh out our example, let's imagine a `mapVal` which returns the input
-string repeated twice, and a `reduceEdge` which returns the input values joined
-with the edge value, and then wrapped with the edge value (eg `reduceEdge(a, [B,
-C]) -> aBaCa`).
-
-Calling `MapReduce` on the edge leading into `X` will then give us the following
-calls:
-
-```
-# Map the value vertices
-
-mapVal(A) -> AA
-mapVal(B) -> BB
-
-# Reduce the value vertex edges
-
-reduceEdge(f, [AA]) -> fAAf
-reduceEdge(g, [AA]) -> gAAg
-reduceEdge(h, [BB]) -> hBBh
-
-# Reduce tuple vertex edges
-
-reduceEdge(b, [fAAf]) -> bfAAfb
-reduceEdge(c, [fAAf]) -> cfAAfc
-reduceEdge(d, [gAAg, hBBh]) -> dgAAgdhBBhd
-
-reduceEdge(a, [bfAAfb, cfAAfc, dgAAgdhBBhd]) -> abfAAfbacfAAfcadgAAgdhBBhda
-```
-
-Beautiful, exactly what we wanted.
-
-`MapReduce` will prove extremely useful when it comes time for the VM to execute
-the graph. It enables the VM to evaluate only the values which are needed to
-produce an output, and to only evaluate each value once no matter how many times
-it's used. `MapReduce` also takes care of the recursive traversal of the
-`Graph`, which simplifies the VM code significantly.
-
-[mapreduce]: https://github.com/mediocregopher/ginger/blob/ebf57591a8ac08da8a312855fc3a6d9c1ee6dcb2/graph/graph.go#L338
-
-## gg
-
-With a generic graph implementation out of the way, it was then required to
-define a specific implementation which could be parsed from a file and later
-used for execution in the VM.
-
-The file extension used for ginger code is `.gg`, as in "ginger graph" (of
-course). The package name for decoding this file format is, therefore, also
-called `gg`.
-
-The core datatype for the `gg` package is the [`Value`][ggvalue], since the
-`graph` package takes care of essentially everything else in the realm of graph
-construction and manipulation. The type definition is:
-
-```go
-// Value represents a value which can be serialized by the gg text format.
-type Value struct {
-
- // Only one of these fields may be set
- Name *string
- Number *int64
- Graph *Graph
-
- // Optional fields indicating the token which was used to construct this
- // Value, if any.
- LexerToken *LexerToken
-}
-
-type Graph = graph.Graph[Value, Value] // type alias for convenience
-```
-
-Note that it's currently only possible to describe three different types in a
-`gg` file, and one of them is the `Graph`! These are the only ones needed to
-implement a fibonacci function, so they're all I implemented.
-
-The lexing/parsing of `gg` files is not super interesting, you can check out the
-package code for more details. The only other thing worth noting is that, for
-now, all statements are required to end with a `;`. I had originally wanted to
-be less strict with this, and allow newlines and other tokens to indicate the
-end of statements, but it was complicating the code and I wanted to move on.
-
-Another small thing worth noting is that I decided to make each entire `.gg`
-file implicitly define a graph. So you can imagine each file's contents wrapped
-in curly braces.
-
-With the `gg` package out of the way I was able to finally parse ginger
-programs! The following is the actual, real-life implementation of the fibonacci
-function (though at this point it didn't actually work, because the VM was still
-not implemented:
-
-```
-out = {
-
- decr = { out = add < (in; -1;); };
-
- n = tupEl < (in; 0;);
- a = tupEl < (in; 1;);
- b = tupEl < (in; 2;);
-
- out = if < (
- isZero < n;
- a;
- recur < (
- decr < n;
- b;
- add < (a;b;);
- );
- );
-
-} < (in; 0; 1;);
-```
-
-[ggvalue]: https://github.com/mediocregopher/ginger/blob/ebf57591a8ac08da8a312855fc3a6d9c1ee6dcb2/gg/gg.go#L14
-
-## VM
-
-Finally, the meat of all this. If the `graph` and `gg` packages are the sturdy,
-well constructed foundations of a tall building, then the `vm` package is the
-extremely long, flimsy stick someone propped up vertically so they could say
-they built a structure of impressive height.
-
-In other words, it's very likely that the current iteration of the VM will not
-be long for this world, and so I won't waste time describing it in super detail.
-
-What I will say about it is that within the `vm` package I've defined a [new
-`Value` type][vmvalue], which extends the one defined in `gg`. The necessity of
-this was that there are types which cannot be represented syntactically in a
-`.gg` file, but which _can_ be used as values within a program being run.
-
-The first of these is the `Operation`, which is essentially a first-class
-function. The VM will automatically interpret a graph as an `Operation` when it
-is used as an edge value, as has been discussed in previous posts, but there are
-also built-in operations (like `if` and `recur`) which cannot be represented as
-datastructures, and so it was necessary to introduce a new in-memory type to
-properly represent operations.
-
-The second is the `Tuple` type. This may seem strange, as ginger graphs already
-have a concept of a tuple. But the ginger graph tuple is a _vertex type_, not a
-value type. The distinction is small, but important. Essentially the graph tuple
-is a structural element which describes how to create a tuple value, but it is
-not yet that value. So we need a new Value type to hold the tuple once it _has_
-been created during runtime.
-
-Another thing worth describing about the `vm` package, even though I think they
-might change drastically, are [`Thunk`s][thunk]:
-
-```go
-// Thunk is returned from the performance of an Operation. When called it will
-// return the result of that Operation having been called with the particular
-// arguments which were passed in.
-type Thunk func() (Value, error)
-```
-
-The term "thunk" is borrowed from Haskell, which I don't actually know so I'm
-probably using it wrong, but anyway...
-
-A thunk is essentially a value which has yet to be evaluated; the VM knows
-exactly _how_ to evaluate it, but it hasn't done so yet. The primary reason for
-their existence within ginger is to account for conditionals, ie the `if`
-operation. The VM can't evaluate each of an `if`'s arguments all at once, it
-must only evaluate the first argument (to obtain a boolean), and then based on
-that evaluate the second or third argument.
-
-This is where `graph.MapReduce` comes in. The VM uses `graph.MapReduce` to
-reduce each edge in a graph to a `Thunk`, where the `Thunk`'s value is based on
-the operation (the edge's value) and the inputs to the edge (which will
-themselves be `Thunk`s). Because each `Thunk` represents a potential value, not
-an actual one, the VM is able to completely parse the program to be executed
-(using `graph.MapReduce`) while allowing conditionals to still work correctly.
-
-[EvaluateEdge][evaledge] is where all that happens, if you're interested, but be
-warned that the code is a hot mess right now and it's probably not worth
-spending a ton of time understanding it as it will change a lot.
-
-A final thing I'll mention is that the `recur` operation is, I think, broken. Or
-probably more accurately, the entire VM is broken in a way which prevents
-`recur` from working correctly. It _does_ produce the correct output, so I
-haven't prioritized debugging it, but for any large number of iterations it
-takes a very long time to run.
-
-[vmvalue]: https://github.com/mediocregopher/ginger/blob/ebf57591a8ac08da8a312855fc3a6d9c1ee6dcb2/vm/vm.go#L18
-[thunk]: https://github.com/mediocregopher/ginger/blob/ebf57591a8ac08da8a312855fc3a6d9c1ee6dcb2/vm/op.go#L11
-[evaledge]: https://github.com/mediocregopher/ginger/blob/ebf57591a8ac08da8a312855fc3a6d9c1ee6dcb2/vm/scope.go#L29
-
-## Demo
-
-Finally, to show it off! I put together a super stupid `eval` binary which takes
-two arguments: a graph to be used as an operation, and a value to be used as an
-argument to that operation. It doesn't even read the code from a file, you have
-to `cat` it in.
-
-The [README][readme] documents how to run the demo, so if you'd like to do so
-then please clone the repo and give it a shot! It should look like this when you
-do:
-
-```
-# go run ./cmd/eval/main.go "$(cat examples/fib.gg)" 8
-21
-```
-
-You can put any number you like instead of `8`, but as mentioned, `recur` is
-broken so it can take a while for larger numbers.
-
-[readme]: https://github.com/mediocregopher/ginger/blob/ebf57591a8ac08da8a312855fc3a6d9c1ee6dcb2/README.md
-
-## Next Steps
-
-The following are all the things I'd like to address the next time I work on
-ginger:
-
-* `gg`
-
- * Allow for newlines (and `)` and `}`) to terminate statements, not just
- `;`.
-
- * Allow names to have punctuation characters in them (maybe?).
-
- * Don't read all tokens into memory prior to parsing.
-
-* `vm`
-
- * Fix `recur`.
-
- * Implement tail call optimization.
-
-* General
-
- * A bit of polish on the `eval` tool.
-
- * Expose graph creation, traversal, and transformation functions as
- builtins.
-
- * Create plan (if not actually implement it yet) for how code will be
- imported from one file to another. Namespacing in general will fall into
- this bucket.
-
- * Create plan (if not actually implement it yet) for how users can
- extend/replace the lexer/parser.
-
-I don't know _when_ I'll get to work on these next, ginger will come back up in
-my rotation of projects eventually. It could be a few months. In the meantime I
-hope you're as excited about this progress as I am, and if you have any feedback
-I'd love to hear it.
-
-Thanks for reading!