summaryrefslogtreecommitdiff
path: root/static
diff options
context:
space:
mode:
authorBrian Picciano <mediocregopher@gmail.com>2021-08-27 17:51:55 -0600
committerBrian Picciano <mediocregopher@gmail.com>2021-08-27 17:51:55 -0600
commita1f1044b488ddb0f5eda1ab5d5e1c481e80be996 (patch)
tree8820efa0f146aec23375e7672b1c5662c281eee6 /static
parent38fdd7725da4be8e6657f15f2511a32454a6dff6 (diff)
ginger syntax
Diffstat (limited to 'static')
-rw-r--r--static/src/_posts/2021-08-26-the-syntax-of-ginger.md480
1 files changed, 480 insertions, 0 deletions
diff --git a/static/src/_posts/2021-08-26-the-syntax-of-ginger.md b/static/src/_posts/2021-08-26-the-syntax-of-ginger.md
new file mode 100644
index 0000000..4ff64d8
--- /dev/null
+++ b/static/src/_posts/2021-08-26-the-syntax-of-ginger.md
@@ -0,0 +1,480 @@
+---
+title: >-
+ The Syntax of Ginger
+description: >-
+ Oh man, this got real fun real quick.
+series: ginger
+tags: tech
+---
+
+Finally I have a syntax for ginger that I'm happy with. This has actually been a
+huge roadblock for me up till this point. There's a bit of a chicken-and-the-egg
+problem with the syntax: without pinning down the structures underlying the
+syntax it's difficult to develop one, but without an idea of syntax it's
+difficult to know what structures will be ergonomic to use.
+
+I've been focusing on the structures so far, and have only now pinned down the
+syntax. Let's see what it looks like.
+
+## Preface: Conditionals
+
+I've so far written [two][cond1] [posts][cond2] regarding conditionals in
+ginger. After more reflection, I think I'm going to stick with my _original_
+gut, which was to only have value and tuple vertices (no forks), and to use a
+function which accepts both a boolean and two input edges: the first being the
+one to take if the boolean is true, and the second being the one to take if it's
+false.
+
+Aka, the very first proposal in the [first post][cond1]. It's hard to justify
+up-front, but I think once you see it in action with a clean syntax you'll agree
+it just kind of works.
+
+[cond1]: {% post_url 2021-03-01-conditionals-in-ginger %}
+[cond2]: {% post_url 2021-03-04-conditionals-in-ginger-errata %}
+
+## Designing a Syntax
+
+Ginger is a bit of a strange language. It uses strange datastructures in strange
+ways. But approaching the building of a syntax for any language is actually
+straightforward: you're designing a serialization protocol.
+
+To pull back a bit, consider a list of words. How would you encode this list in
+order to write it to a file? To answer this, let's flip the question: how would
+you design a sequence of characters (ie the contents of the file) such that the
+reader could reconstruct the list?
+
+Well, constructing the list from a sequence of characters requires being able to
+construct it _at all_, so in what ways is the list constructed? For this list,
+let's say there's only an append operation, which accepts a list and a value to
+append to it, and returns the result.
+
+If we say that append is encoded via wrapping parenthesis around its two
+arguments, and that `()` encodes the empty list, then we get a syntax like...
+
+```
+(((() foo) bar) baz)
+```
+
+...which, in this instance, decodes to a list containing the words, "foo", "bar",
+and "baz", in that order.
+
+It's not a pretty syntax, but it demonstrates the method. If you know how the
+datastructure is constructed via code, you know what capabilities the syntax must
+have and how it needs to fit together.
+
+## gg
+
+All of this amounted to me needing to implement the ginger graph in some other
+language, in order to see what features the syntax must have.
+
+A few years ago I had begun an implementation of a graph datastructure in go, to
+use as the base (or at least a reference) for ginger. I had called this
+implementation `gg` (ginger graph), with the intention that this would also be
+the file extension used to hold ginger code (how clever).
+
+The basic qualities I wanted in a graph datastructure for ginger were, and still
+are:
+
+* Immutability, ie all operations which modify the structure should return a
+ copy, leaving the original intact.
+
+* Support for tuples.
+
+* The property that it should be impossible to construct an invalid graph. An
+ invalid graph might be, for example, one where there is a single node with no
+ edges.
+
+* Well tested, and reasonably performant.
+
+Coming back to all this after a few years I had expected to have a graph
+datastructure implemented, possibly with immutability, but lacking in tuples and
+tests. As it turns out I completely underestimated my past self, because as far
+as I can tell I had already finished the damn thing, tuples, tests and all.
+
+It looks like that's the point where I stopped, probably for being unsure about
+some other aspect of the language, and my motivation fell off. The fact that
+I've come back to ginger, after all these years, and essentially rederived the
+same language all over again, gives me a lot of confidence that I'm on the right
+track (and a lot of respect for my past self for having done all this work!)
+
+The basic API I came up with for building ginger graphs (ggs) looks like this:
+
+```go
+package gg
+
+// OpenEdge represents an edge with a source value but no destination value,
+// with an optional value on it. On its own an OpenEdge has no meaning, but is
+// used as a building block for making Graphs.
+type OpenEdge struct{ ... }
+
+// TupleOut constructs an OpenEdge leading from a tuple, which is comprised of
+// the given OpenEdges leading into it, with an optional edge value.
+func TupleOut(ins []OpenEdge, edgeVal Value) OpenEdge
+
+// ValueOut constructs an OpenEdge leading from a non-tuple value, with an
+// optional edge value.
+func ValueOut(val, edgeVal Value) OpenEdge
+
+// ZeroGraph is an empty Graph, from which all Graphs are constructed via calls
+// to AddValueIn.
+var ZeroGraph = &Graph{ ... }
+
+// Graph is an immutable graph structure, formed from a collection of edges
+// between values and tuples.
+type Graph struct{ ... }
+
+// AddValueIn returns a new Graph which is a copy of the original, with the
+// addition of a new edge. The new edge's source and edge value come from the
+// given OpenEdge, and the edge's destination value is the given value.
+func (g *Graph) AddValueIn(oe OpenEdge, val Value) *Graph
+```
+
+The actual API is larger than this, and includes methods to remove edges,
+iterate over edges and values, and perform unions and disjoins of ggs. But the
+above are the elements which are required only for _making_ ggs, which is all
+that a syntax is concerned with.
+
+As a demonstration, here is how one would construct the `min` operation, which
+takes two numbers and returns the smaller, using the `gg` package:
+
+```go
+// a, b, in, out, if, etc.. are Values which represent the respective symbol.
+
+// a is the result of passing in to the 0 operation, ie a is the 0th element of
+// the in tuple.
+min := gg.ZeroGraph.AddValueIn(gg.ValueOut(in, 0), a)
+
+// b is the 1st element of the in tuple
+min = min.AddValueIn(gg.ValueOut(in, 1), b)
+
+// out is the result of an if which compares a and b together, and which returns
+// the lesser.
+min = min.AddValueIn(out, gg.TupleOut([]gg.OpenEdge{
+ gg.TupleOut([]gg.OpenEdge{a, b}, lt),
+ a,
+ b,
+}, if)
+```
+
+And here's a demonstration of how this `min` would be used:
+
+```go
+// out is the result of passing 1 and 5 to the min operation.
+gg.ZeroGraph.AddValueIn(gg.TupleOut([]gg.OpenEdge{1, 5}, min), out)
+```
+
+## Make it Nice
+
+_Technically_ we're done. We have an implementation of the language's underlying
+structure, and a syntax which encodes it (ie the ugly ass go syntax above). But
+obviously I'm not proposing anyone actually use that.
+
+Another thing I found when digging around in the old ginger repo was a text
+file, tucked away in a directory called "sandbox", which had a primitive syntax
+which _almost_ worked. I won't copy it here, but you can find it if you care to.
+But with that as a foundation I came up with a crude, rough draft spec, which
+maps the go syntax to the new syntax.
+
+```
+ValueOut(val, edgeVal) : -edgeVal-val
+ValueOut(val, null) : -val
+TupleOut([]val, edgeVal) : -edgeVal-(val, ...)
+TupleOut([]val, null) : -(val, ...)
+Graph(openEdge->val, ...) : { val openEdge, ... }
+```
+
+A couple things to note about this spec:
+
+* `null` is used to indicate absence of value on an edge. The details of `null`
+ are yet to be worked out, but we can use this placeholder for now.
+
+* `Graph` is cheating a bit. In the original `gg` implementation a Graph gains
+ its OpenEdge/Value pairs via successive calls to `AddValueIn`. However, such a
+ pattern doesn't translate well to text, and since we're dealing purely with
+ constructing an entire Graph at once we can instead have our Graph syntax
+ declare all OpenEdge/Value pairs at once.
+
+* It's backwards! Eg where the go syntax does `ValueOut(val, edgeVal)`, the
+ proposed spec puts the values in the opposite order: `-edgeVal-val`. The
+ former results in code which is read from input to output, while the latter
+ results in the opposite: output to input.
+
+ This was a tip I picked up from the old text file I found, and the result is
+ code which is more familiar to an existing programmer. I _think_ (but am
+ not sure) that it's also more in line with how programming is done mentally,
+ ie we start with a result and work backwards to figure out what it takes to
+ get there.
+
+ It's possible, though, that I'm wrong, so at this end of this post I'm going
+ to put some examples of the same code both "forwards" and "backwards" and see
+ how I feel about it.
+
+With all that said, let's see it in action! Here's `min` implemented in our shiny new syntax:
+
+```
+min -{
+ a -0-in,
+ b -1-in,
+ out -if-(
+ -lt-(-a,-b),
+ -a,
+ -b
+ )
+}
+```
+
+and then here's it being used:
+
+```
+out -min-(-1,-5)
+```
+
+## Make it _Nicer_
+
+The most striking feature of this rough draft spec is all the prefix dashes,
+such as in the `-min-(-1,-5)` statement. These dashes were included as they make
+sense in the context of what the intended human interpretation of the structure
+is: two values, `1`, and `5`, are being _piped into_ the two slots of a 2-tuple,
+and that 2-tuple is being _piped into_ the `min` operation, the output of which
+is being _piped into_ something `out`.
+
+The "piping into" is what the dash represents, which is why the top level values
+in the graph, `a`, `b`, and `out`, don't have a preceding dash; they are the
+ultimate destinations of the pipes leading to them. But these pipes are
+ultimately ugly, and also introduce odd questions like "how do we represent
+-1?", so they need to go.
+
+So I've made a second draft, which is only a few changes away from the rough,
+but oh man do those changes make a world of difference. Here's the cleaned up
+spec:
+
+```
+ValueOut(val, edgeVal) : edgeVal(val)
+ValueOut(val, null) : val
+TupleOut([]val, edgeVal) : edgeVal(val, ...)
+TupleOut([]val, null) : (val, ...)
+Graph(openEdge->val, ...) : { val = openEdge, ... }
+```
+
+The dashes were simply removed, and the `edgeVal` and `val` concatted together.
+For `ValueOut(val, edgeVal)` wrapping parenthesis were put around `val`, to
+delineate it and `edgeVal`. This conflicts with the syntax for `TupleOut([]val,
+edgeVal)`, but that conflict is easy to remedy: when parenthesis wrap only a
+single `val` then that is a `ValueOut`, otherwise it's a `TupleOut`.
+
+Another change is to add an `=` between the `val` and `openEdge` in the `Graph`
+constructor. This is a purely aesthetic change, but as you'll see it works well.
+
+So let's see it! `min` implemented with this cleaned up syntax:
+
+```
+min = {
+ a = 0(in),
+ b = 1(in),
+ out = if(
+ lt(a,b),
+ a,
+ b
+ )
+}
+```
+
+And then its use:
+
+```
+min(1,5)
+```
+
+Well well well, look what we have here: a conventional programming language
+syntax! `{`/`}` wrap a scope, and `(`/`)` wrap function arguments and
+(optionally) single values. It's a lot clearer now that `0` and `1` are being
+used as operations themselves when instantiating `a` and `b`, and `if` is much
+more readable.
+
+I was extremely surprised at how well this actually worked out. Despite having
+drastically different underpinnings than most languages it ends up looking both
+familiar and obvious. How cool!
+
+## Examples Examples Examples
+
+Here's a collection of example programs written in this new syntax. The base
+structure of these are borrowed from previous posts, I'm merely translating that
+structure into a new form:
+
+```
+// decr outputs one less than the input.
+decr = { out = add(in, -1) }
+
+// fib accepts a number i, and outputs the ith fibonacci number.
+fib = {
+
+ inner = {
+ n = 0(in),
+ a = 1(in),
+ b = 2(in),
+
+ out = if(zero?(n),
+ a,
+ inner(decr(n), b, add(a,b))
+ )
+
+ },
+
+ out = inner(in, 0, 1)
+}
+
+// map accepts a sequence and a function, and returns a sequence consisting of
+// the result of applying the function to each of the elements in the given
+// sequence.
+map = {
+ inner = {
+ mapped-seq = 0(in),
+ orig-seq = 1(in),
+ op = 2(in),
+
+ i = len(mapped-seq),
+
+ // graphs provide an inherent laziness to the language. Just because
+ // next-el is _defined_ here doesn't mean it's evaluated here at runtime.
+ // In reality it will only be evaluated if/when evaluating out requires
+ // evaluating next-el.
+ next-el = op(i(orig-seq)),
+ next-mapped-seq = append(mapped-seq, next-el),
+
+ out = if(
+ eq(len(mapped-seq), len(orig-seq)),
+ mapped-seq,
+ inner(next-mapped-seq, orig-seq, op)
+ )
+ }
+
+ // zero-seq returns an empty sequence
+ out = inner(zero-seq(), 0(in), 1(in))
+}
+```
+
+## Selpmexa Selpmexa Selpmexa
+
+Our syntax encodes a graph, and a graph doesn't really care if the syntax was
+encoded in an input-to-output vs an output-to-input direction. So, as promised,
+here's all the above examples, but "backwards":
+
+```
+// min returns the lesser of the two numbers it is given
+{
+ (in)0 = a,
+ (in)1 = b,
+
+ (
+ (a,b)lt,
+ a,
+ b
+ )if = out
+
+} = min
+
+// decr outputs one less than the input.
+{ (in, -1)add = out } = decr
+
+// fib accepts a number i, and outputs the ith fibonacci number.
+{
+ {
+ (in)0 = n,
+ (in)1 = a,
+ (in)2 = b,
+
+ (
+ (n)zero?
+ a,
+ ((n)decr, b, (a,b)add)inner
+ )if = out
+
+ } = inner,
+
+ (in, 0, 1)inner = out
+
+} = fib
+
+// map accepts a sequence and a function, and returns a sequence consisting of
+// the result of applying the function to each of the elements in the given
+// sequence.
+{
+ {
+ (in)0 = mapped-seq,
+ (in)1 = orig-seq,
+ (in)2 = op,
+
+ (mapped-seq)len = i,
+
+ ((orig-seq)i)op = next-el,
+ (mapped-seq, next-el)append = next-mapped-seq,
+
+ (
+ ((mapped-seq)len, (orig-seq)len)eq,
+ mapped-seq,
+ (next-mapped-seq, orig-seq, op)inner
+ )if = out
+
+ } = inner,
+
+ (()zero-seq, (in)0, (in)1)inner = out
+} = map
+```
+
+Do these make you itchy? They kind of make me itchy. But... parts of them also
+appeal to me.
+
+The obvious reason why these feel wrong to me is the placement of `if`:
+
+```
+(
+ (a,b)lt,
+ a,
+ b
+)if = out
+```
+
+The tuple which is being passed to `if` here is confusing unless you already
+know that it's going to be passed to `if`. But on your first readthrough you
+won't know that till you get to the end, so you'll be in the dark until then.
+For more complex programs I'm sure this problem will compound.
+
+On the other hand, pretty much everything else looks _better_, imo. For example:
+
+```
+// copied and slightly modified from the original to make it even more complex
+
+(mapped-seq, ((orig-seq)i)op)append = next-mapped-seq
+```
+
+Something like this reads very clearly to me, and requires a lot less mental
+backtracking to comprehend. The main difficulty I have is tracking the
+parenthesis, but the overall "flow" of data and the order of events is plain to
+read.
+
+## Next Steps
+
+The syntax here is not done yet, not by a long shot. If my record with past
+posts about ginger (wherein I've "decided" on something and then completely
+backtracked in later posts every single time) is any indication then this syntax
+won't even look remotely familiar in a very short while. But it's a great
+starting point, I think, and raises a lot of good questions.
+
+* Can I make parenthesis chains, a la the last example, more palatable in some
+ way?
+
+* Should I go with the "backwards" syntax afterall? In a functional style of
+ programming `if` statements _should_ be in the minority, and so the syntax
+ which better represents the flow of data in that style might be the way.
+
+* Destructuring of tuples seems to be wanted, as evidenced by all the `a =
+ 0(in)` lines. Should this be reflected in the structure or solely be
+ syntactical sugar?
+
+* Should the commas be replaced with any whitespace (and make commas count as
+ whitespace, as clojure has done)? If this is possible then I think they should
+ be, but I won't know for sure until I begin implementing the parser.
+
+And, surely, many more! I've felt a bit lost with ginger for a _long_ time, but
+seeing a real, usable syntax emerge has really invigorated me, and I'll be
+tackling it again in earnest soon (fingers crossed).