author: Brian Picciano <mediocregopher@gmail.com>, 2020-11-16 22:41:02 -0700
committer: Brian Picciano <mediocregopher@gmail.com>, 2020-11-18 22:13:53 -0700
commit: dbf6ebdeee558fee27f97b7b0f69ff25f87244e8
tree: d5b66fcf627d5a3a22aec5298af40c53dc4de4d2 /_posts
parent: 8d1de40ef7f0210db7e0fcbeaaf8f5f4e2855f73
component oriented programming post
Diffstat (limited to '_posts')
-rw-r--r-- _posts/2020-11-16-component-oriented-programming.md | 568
1 files changed, 568 insertions, 0 deletions
diff --git a/_posts/2020-11-16-component-oriented-programming.md b/_posts/2020-11-16-component-oriented-programming.md
new file mode 100644
index 0000000..c17b78d
--- /dev/null
+++ b/_posts/2020-11-16-component-oriented-programming.md
@@ -0,0 +1,568 @@
+---
+title: >-
+ Component Oriented Programming
+description: >-
+ A concise description of the component oriented programming pattern.
+---
+
+[A previous post in this
+blog](2019-08-02-program-structure-and-composability.html) focused on a
+framework developed to make designing component-based programs easier. In
+retrospect, the pattern/framework proposed there was over-engineered; this post
+attempts to present the same ideas in a more distilled form, as a simple
+programming pattern without the unnecessary framework.
+
+Nothing in this post will be revelatory; it's surely all been said before. But
+hopefully the form it takes here will be useful to someone, as it would have
+been useful to me when I first learned to program.
+
+## Axioms
+
+For the sake of brevity let's assume that, within the context of
+single-process (_not_ the same as single-threaded), non-graphical programs, the
+following may be said:
+
+1. A program may be thought of as a black-box with certain input and output
+   methods. It is the programmer's task to construct a program such that
+   specific inputs yield specific desired outputs.
+
+2. A program is not complete without sufficient testing to prove it's complete.
+
+3. Global state and global impure functions make testing more difficult. These
+   can include singletons and system calls.
+
+Any of these may be argued, but that will be left for other posts. Any of these
+may be said of other types of programs as well, but that can also be left for
+other posts.
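
To make the third axiom concrete, here is a minimal sketch in Go. The names `isWeekendGlobal`, `isWeekend` and `fixedClock` are hypothetical and exist only for this illustration. The first function reads the clock through the impure global function `time.Now`, so its result depends on the day the test happens to run; the second takes the clock as a parameter, so a test can hand it a fixed one.

```go
package main

import (
	"fmt"
	"time"
)

// isWeekendGlobal reads the clock through the impure global function time.Now.
// A test cannot control what it returns without waiting for the weekend.
func isWeekendGlobal() bool {
	day := time.Now().Weekday()
	return day == time.Saturday || day == time.Sunday
}

// isWeekend is given its clock as a parameter, so its result depends only on
// its input.
func isWeekend(now func() time.Time) bool {
	day := now().Weekday()
	return day == time.Saturday || day == time.Sunday
}

// fixedClock returns a mock clock which always reports noon UTC on the given
// date.
func fixedClock(year int, month time.Month, day int) func() time.Time {
	t := time.Date(year, month, day, 12, 0, 0, 0, time.UTC)
	return func() time.Time { return t }
}

func main() {
	fmt.Println(isWeekend(fixedClock(2020, time.November, 21))) // true (a Saturday)
	fmt.Println(isWeekend(fixedClock(2020, time.November, 16))) // false (a Monday)

	// The real clock can still be passed in during actual runtime.
	_ = isWeekend(time.Now)
	_ = isWeekendGlobal()
}
```

The same idea, passing the impure thing in rather than reaching out for it, is what the rest of this post builds on.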
+
+## Components
+
+Properties of components include:
+
+1. *Creatable*: An instance of a component, given some defined set of
+   parameters, can be created independently of any other instance of that or any
+   other component.
+
+2. *Composable*: A component may be used as a parameter of another component's
+   instantiation. This would make it a child component of the one being
+   instantiated (i.e. the parent).
+
+3. *Abstract*: A component is an interface consisting of one or more methods.
+   Being an interface, a component may have one or more implementations, but
+   generally will have a primary implementation, which is used during a
+   program's runtime, and secondary "mock" implementations, which are only used
+   when testing other components.
+
+4. *Isolated*: A component may not use mutable global variables (i.e.
+   singletons) or impure global functions (e.g. system calls). It may only use
+   constants and variables/components given to it during instantiation.
+
+5. *Ephemeral*: A component may have a specific method used to clean up all
+   resources that it's holding (e.g. network connections, file handles,
+   language-specific lightweight threads, etc).
+
+   5a. This cleanup method should _not_ clean up any child components given as
+       instantiation parameters.
+
+   5b. This cleanup method should not return until the component's cleanup is
+       complete.
+
+Components are composed together to create programs. This is done by passing
+components as parameters to other components during instantiation. The `main`
+process of the program is responsible for instantiating and composing most, if
+not all, components in the program.
+
+A component oriented program is one which primarily, if not entirely, uses
+components for its functionality. Components generally have the quality of being
+able to interact with code written in other patterns without any toes being
+stepped on.
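
As a rough illustration of the five properties, here is a minimal sketch in Go. `Store`, `memStore` and `Recorder` are hypothetical names invented for this sketch, and `Recorder` is left as a concrete type rather than an interface purely to keep it short. `Recorder` owns a goroutine (a resource it must clean up) and is given a `Store` as a child component.

```go
package main

import (
	"fmt"
	"sync"
)

// Store is a hypothetical leaf component. Being an interface (property 3), it
// could be swapped for a mock when testing components which use it.
type Store interface {
	Append(event string)
	All() []string
}

// memStore is the primary implementation of Store in this sketch.
type memStore struct {
	mu     sync.Mutex
	events []string
}

// NewMemStore needs no other component in order to be created (property 1).
func NewMemStore() Store { return &memStore{} }

func (s *memStore) Append(event string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.events = append(s.events, event)
}

func (s *memStore) All() []string {
	s.mu.Lock()
	defer s.mu.Unlock()
	return append([]string(nil), s.events...)
}

// Recorder is a hypothetical parent component which copies events into its
// Store from a background goroutine.
type Recorder struct {
	store  Store
	events chan string
	done   chan struct{}
}

// NewRecorder takes its child component as a parameter (property 2) and uses
// nothing but what it is given (property 4).
func NewRecorder(store Store) *Recorder {
	r := &Recorder{
		store:  store,
		events: make(chan string),
		done:   make(chan struct{}),
	}
	go func() {
		defer close(r.done)
		for event := range r.events {
			r.store.Append(event)
		}
	}()
	return r
}

// Record hands an event to the background goroutine.
func (r *Recorder) Record(event string) { r.events <- event }

// Close stops the goroutine and waits for it to exit (property 5b). The Store
// is left untouched, since it was given as a parameter (property 5a).
func (r *Recorder) Close() {
	close(r.events)
	<-r.done
}

func main() {
	store := NewMemStore()
	recorder := NewRecorder(store)

	recorder.Record("a")
	recorder.Record("b")
	recorder.Close()

	fmt.Println(store.All()) // [a b]
}
```

Because `Close` only returns once the goroutine has exited, `main` can safely read from the store afterwards.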
+
+## Example
+
+Let's start with an example: suppose a program is desired which accepts a string
+over stdin, hashes it, then writes the string to a file whose name is the hash.
+
+A naive implementation of this program in Go might look like:
+
+```go
+package main
+
+import (
+	"crypto/sha1"
+	"encoding/hex"
+	"io"
+	"io/ioutil"
+	"os"
+)
+
+func hashFileWriter() error {
+	h := sha1.New()
+	r := io.TeeReader(os.Stdin, h)
+	body, err := ioutil.ReadAll(r)
+	if err != nil {
+		return err
+	}
+
+	fileName := hex.EncodeToString(h.Sum(nil))
+	return ioutil.WriteFile(fileName, body, 0644)
+}
+
+func main() {
+	if err := hashFileWriter(); err != nil {
+		panic(err) // consider the error handled
+	}
+}
+```
+
+Notice that there's not a clear separation here between different components;
+`hashFileWriter` _might_ be considered a one method component, except that it
+breaks component property 4, which says that a component may not use mutable
+global variables (`os.Stdin`) or impure global functions (`ioutil.WriteFile`).
+
+Notice also that the program can only be tested with integration tests; it
+cannot be unit tested (because there are no units, i.e. components). For a
+trivial program like this one, writing both unit and integration tests would be
+redundant, but for larger programs it may not be. Unit tests are important
+because they are fast to run, (usually) easy to formulate, and yield consistent
+results.
+
+This program could instead be written as being composed of three components:
+
+* `stdin`, a construct given by the runtime which outputs a stream of bytes.
+
+* `disk`, which accepts a file name and file contents as input, writes the file
+  contents to a file of the given name, and potentially returns an error.
+
+* `hashFileWriter`, which reads a stream of bytes off of `stdin`, collects the
+  stream into a string, hashes that string to generate a file name, and uses
+  `disk` to create a corresponding file with the string as its contents. If
+  `disk` returns an error then `hashFileWriter` returns that error.
+
+Sprucing up our previous example to use these more clearly defined components
+might look like:
+
+```go
+package main
+
+import (
+	"crypto/sha1"
+	"encoding/hex"
+	"fmt"
+	"io"
+	"io/ioutil"
+	"os"
+)
+
+// Disk defines the methods of the disk component.
+type Disk interface {
+	WriteFile(fileName string, fileContents []byte) error
+}
+
+// disk is the primary implementation of Disk. It implements the methods of
+// Disk (WriteFile) by performing actual system calls.
+type disk struct{}
+
+func NewDisk() Disk { return disk{} }
+
+func (disk) WriteFile(fileName string, fileContents []byte) error {
+	return ioutil.WriteFile(fileName, fileContents, 0644)
+}
+
+func hashFileWriter(stdin io.Reader, disk Disk) error {
+	h := sha1.New()
+	r := io.TeeReader(stdin, h)
+	body, err := ioutil.ReadAll(r)
+	if err != nil {
+		return fmt.Errorf("reading input: %w", err)
+	}
+
+	fileName := hex.EncodeToString(h.Sum(nil))
+
+	if err := disk.WriteFile(fileName, body); err != nil {
+		return fmt.Errorf("writing to file %q: %w", fileName, err)
+	}
+	return nil
+}
+
+func main() {
+	if err := hashFileWriter(os.Stdin, NewDisk()); err != nil {
+		panic(err) // consider the error handled
+	}
+}
+```
+
+`hashFileWriter` no longer directly uses `os.Stdin` and `ioutil.WriteFile`, but
+instead takes in components wrapping them; `io.Reader` is a built-in interface
+which `os.Stdin` inherently implements, and `Disk` is a simple interface defined
+just for this program.
+
+At first glance this would seem to have doubled the line-count for very little
+gain. This is because we have not yet written tests.
+
+## Testing
+
+As has already been firmly established, testing is important.
+
+In the second form of the program we can test the core functionality of the
+`hashFileWriter` component without resorting to using the actual `stdin` and
+`disk` components. Instead we use mocks of those components. A mock component
+implements the same inputs/outputs that the "real" component does, but in a way
+which makes testing a particular component possible without reaching outside the
+process. These are unit tests.
+
+Tests for the latest form of the program might look like this:
+
+```go
+package main
+
+import (
+	"strings"
+	"testing"
+)
+
+// mockDisk implements the Disk interface. When WriteFile is called mockDisk
+// will pretend to write the file, but instead will simply store what arguments
+// WriteFile was called with.
+type mockDisk struct {
+	fileName     string
+	fileContents []byte
+}
+
+func (d *mockDisk) WriteFile(fileName string, fileContents []byte) error {
+	d.fileName = fileName
+	d.fileContents = fileContents
+	return nil
+}
+
+func TestHashFileWriter(t *testing.T) {
+	type test struct {
+		in          string
+		expFileName string
+		// expFileContents can be inferred from in
+	}
+
+	tests := []test{
+		{
+			in:          "",
+			expFileName: "da39a3ee5e6b4b0d3255bfef95601890afd80709",
+		},
+		{
+			in:          "hello",
+			expFileName: "aaf4c61ddcc5e8a2dabede0f3b482cd9aea9434d",
+		},
+		{
+			in:          "hello\nworld", // make sure newlines don't break things
+			expFileName: "7db827c10afc1719863502cf95397731b23b8bae",
+		},
+	}
+
+	for _, test := range tests {
+		// stdin is mocked via a strings.Reader, which outputs the string it was
+		// initialized with as a stream of bytes.
+		in := strings.NewReader(test.in)
+
+		// Disk is mocked by mockDisk, go figure.
+		disk := new(mockDisk)
+
+		if err := hashFileWriter(in, disk); err != nil {
+			t.Errorf("in:%q got err:%v", test.in, err)
+		} else if string(disk.fileContents) != test.in {
+			t.Errorf("in:%q got contents:%q", test.in, disk.fileContents)
+		} else if disk.fileName != test.expFileName {
+			t.Errorf("in:%q got fileName:%q", test.in, disk.fileName)
+		}
+	}
+}
+```
+
+Notice that these tests do not _completely_ cover the desired functionality of
+the program: if `disk` returns an error that error should be returned from
+`hashFileWriter`. Whether or not this must be tested as well, and indeed the
+pedantry level of tests overall, is a matter of taste. I believe these to be
+sufficient.
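
For those who do want the error path covered, a check might be sketched as follows. `errDisk` and `diskErrorIsReturned` are hypothetical names introduced for this sketch. `Disk` and `hashFileWriter` are repeated from the program above so that the block compiles on its own, and the check lives in a plain function with a `main` rather than a `testing.T` test for the same reason.

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"errors"
	"fmt"
	"io"
	"io/ioutil"
	"strings"
)

// Disk and hashFileWriter are repeated from the program above.
type Disk interface {
	WriteFile(fileName string, fileContents []byte) error
}

func hashFileWriter(stdin io.Reader, disk Disk) error {
	h := sha1.New()
	r := io.TeeReader(stdin, h)
	body, err := ioutil.ReadAll(r)
	if err != nil {
		return fmt.Errorf("reading input: %w", err)
	}

	fileName := hex.EncodeToString(h.Sum(nil))

	if err := disk.WriteFile(fileName, body); err != nil {
		return fmt.Errorf("writing to file %q: %w", fileName, err)
	}
	return nil
}

// errDisk is a mock of Disk whose WriteFile always fails with the error it
// was created with.
type errDisk struct{ err error }

func (d errDisk) WriteFile(string, []byte) error { return d.err }

// diskErrorIsReturned reports whether an error coming from Disk is returned
// by hashFileWriter. errors.Is is used because hashFileWriter wraps the error
// with %w rather than returning it as-is.
func diskErrorIsReturned() bool {
	errDiskFull := errors.New("disk full")
	err := hashFileWriter(strings.NewReader("hello"), errDisk{err: errDiskFull})
	return errors.Is(err, errDiskFull)
}

func main() {
	fmt.Println(diskErrorIsReturned()) // true
}
```

In a real test file the body of `diskErrorIsReturned` would more naturally sit inside a `TestHashFileWriterDiskError(t *testing.T)` function alongside the tests above.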
+
+## Configuration
+
+Practically all programs require some level of runtime configuration. This may
+take the form of command-line arguments, environment variables, configuration
+files, etc. Almost all configuration methods will require some system call, and
+so any component accessing configuration directly would likely break component
+property 4.
+
+Instead each component should take in whatever configuration parameters it needs
+during instantiation, and let `main` handle collecting all configuration from
+outside of the process and instantiating the components appropriately.
+
+Let's take our previous program, but add in two new desired behaviors: first,
+there should be a command-line parameter which allows for specifying the string
+on the command-line, rather than reading from stdin, and second, there should be
+a command-line parameter declaring which directory to write files into. The new
+implementation looks like:
+
+```go
+package main
+
+import (
+	"crypto/sha1"
+	"encoding/hex"
+	"flag"
+	"fmt"
+	"io"
+	"io/ioutil"
+	"os"
+	"path/filepath"
+	"strings"
+)
+
+// Disk defines the methods of the disk component.
+type Disk interface {
+	WriteFile(fileName string, fileContents []byte) error
+}
+
+// disk is the concrete implementation of Disk. It implements the methods of
+// Disk (WriteFile) by performing actual OS calls.
+type disk struct {
+	dir string
+}
+
+func NewDisk(dir string) Disk { return disk{dir: dir} }
+
+func (d disk) WriteFile(fileName string, fileContents []byte) error {
+	fileName = filepath.Join(d.dir, fileName)
+	return ioutil.WriteFile(fileName, fileContents, 0644)
+}
+
+func hashFileWriter(in io.Reader, disk Disk) error {
+	h := sha1.New()
+	r := io.TeeReader(in, h)
+	body, err := ioutil.ReadAll(r)
+	if err != nil {
+		return fmt.Errorf("reading input: %w", err)
+	}
+
+	fileName := hex.EncodeToString(h.Sum(nil))
+
+	if err := disk.WriteFile(fileName, body); err != nil {
+		return fmt.Errorf("writing to file %q: %w", fileName, err)
+	}
+	return nil
+}
+
+func main() {
+	str := flag.String("str", "", "If set, hash and write this string instead of stdin")
+	dir := flag.String("dir", ".", "Directory which files should be written to")
+	flag.Parse()
+
+	var in io.Reader
+	if *str == "" {
+		in = os.Stdin
+	} else {
+		in = strings.NewReader(*str)
+	}
+
+	disk := NewDisk(*dir)
+
+	if err := hashFileWriter(in, disk); err != nil {
+		panic(err) // consider the error handled
+	}
+}
+```
+
+Very little has changed, and in fact `hashFileWriter` was not touched at all,
+meaning all unit tests remained valid.
+
+## Setup/Runtime/Cleanup
+
+A program can be split into three stages: setup, runtime, and cleanup. Setup
+is the stage during which internal state is assembled in order to make runtime
+possible. Runtime is the stage during which a program's actual function is being
+performed. Cleanup is the stage during which runtime stops and internal state is
+disassembled.
+
+A graceful (i.e. reliably correct) setup is quite natural to accomplish, but
+unfortunately a graceful cleanup is not a programmer's first concern, and
+frequently is not a concern at all. However, when building reliable and correct
+programs, a graceful cleanup is as important as a graceful setup and runtime. A
+program is still running while it is being cleaned up, and it's possibly even
+acting on the outside world still. Shouldn't it behave correctly during that
+time?
+
+Achieving a graceful setup and cleanup with components is quite simple:
+
+During setup a single-threaded process (usually `main`) will construct the
+"leaf" components (those which have no child components of their own) first,
+then the components which take those leaves as parameters, then the components
+which take _those_ as parameters, and so on, until all are constructed. The
+components end up assembled into a directed acyclic graph.
+
+At this point the program will begin runtime.
+
+Once runtime is over and it is time for the program to exit, it's only necessary
+to call each component's cleanup method(s) in the reverse of the order in which
+the components were instantiated. A component's cleanup method should not be
+called until all of its parent components have been cleaned up.
+
+Inherent to the pattern is the fact that each component will certainly be
+cleaned up before any of its child components, since its child components must
+have been instantiated first and a component will not clean up child components
+given as parameters (as per component property 5a).
+
+With Go this pattern can be achieved easily using `defer`, but writing it out
+manually is not so hard, as in this toy example:
+
+```go
+package main
+
+import (
+	"fmt"
+	"io"
+	"os"
+	"time"
+)
+
+// sleeper is a component which prints itself and its children, and sleeps when
+// it's time to clean up.
+type sleeper struct {
+	children []*sleeper
+	toSleep  time.Duration
+
+	// The builtin time.Sleep is an impure global function, a component can't
+	// use it, so the component must be instantiated with it as a parameter.
+	sleep func(time.Duration)
+
+	// likewise os.Stdout is a global singleton, and so must also be a
+	// parameter.
+	stdout io.Writer
+}
+
+func (s *sleeper) print() {
+	fmt.Fprintf(s.stdout, "I will sleep for %v\n", s.toSleep)
+	for _, child := range s.children {
+		child.print()
+	}
+}
+
+func (s *sleeper) cleanup() {
+	s.sleep(s.toSleep)
+	fmt.Fprintf(s.stdout, "I slept for %v\n", s.toSleep)
+}
+
+func main() {
+
+	// Within main we make a helper function to easily construct sleepers. For a
+	// toy like this it's not worth the effort of giving sleeper a real
+	// initialization function.
+	newSleeper := func(toSleep time.Duration, children ...*sleeper) *sleeper {
+		return &sleeper{
+			children: children,
+			toSleep:  toSleep,
+			sleep:    time.Sleep,
+			stdout:   os.Stdout,
+		}
+	}
+
+	aa := newSleeper(250 * time.Millisecond)
+	ab := newSleeper(250 * time.Millisecond)
+
+	// A's children are AA and AB
+	a := newSleeper(500*time.Millisecond, aa, ab)
+
+	b := newSleeper(750 * time.Millisecond)
+
+	// root's children are A and B
+	root := newSleeper(1*time.Second, a, b)
+
+	// All components are now instantiated and runtime begins.
+	root.print()
+	// ... and just like that, runtime ends.
+	fmt.Println("--- Alright, fun is over, time for bed ---")
+
+	// Now to clean up, cleanup methods are called in the reverse order of the
+	// components' instantiation.
+	root.cleanup()
+	b.cleanup()
+	a.cleanup()
+	ab.cleanup()
+	aa.cleanup()
+
+	// Expected output is:
+	//
+	// I will sleep for 1s
+	// I will sleep for 500ms
+	// I will sleep for 250ms
+	// I will sleep for 250ms
+	// I will sleep for 750ms
+	// --- Alright, fun is over, time for bed ---
+	// I slept for 1s
+	// I slept for 750ms
+	// I slept for 500ms
+	// I slept for 250ms
+	// I slept for 250ms
+}
+```
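
The `defer` approach mentioned earlier can be sketched as follows. Deferred calls run in last-in-first-out order when the surrounding function returns, so deferring each cleanup immediately after instantiation yields the reverse-of-instantiation order without it being written out by hand. The `component` type and `run` function are hypothetical stand-ins made up for this sketch; cleanup here only records the component's name so that the order can be inspected.

```go
package main

import "fmt"

// component is a stand-in for a real component. Its cleanup method only
// records that it ran, by appending its name to a shared slice.
type component struct {
	name     string
	children []*component
	cleaned  *[]string
}

func (c *component) cleanup() { *c.cleaned = append(*c.cleaned, c.name) }

// run instantiates components leaf-first and defers each cleanup right after
// instantiation. When run returns, the deferred calls execute in the reverse
// of the order they were deferred in.
func run(cleaned *[]string) {
	newComponent := func(name string, children ...*component) *component {
		return &component{name: name, children: children, cleaned: cleaned}
	}

	aa := newComponent("aa")
	defer aa.cleanup()

	ab := newComponent("ab")
	defer ab.cleanup()

	a := newComponent("a", aa, ab)
	defer a.cleanup()

	b := newComponent("b")
	defer b.cleanup()

	root := newComponent("root", a, b)
	defer root.cleanup()

	// Runtime would happen here, using root.
}

func main() {
	var cleaned []string
	run(&cleaned)
	fmt.Println(cleaned) // [root b a ab aa]
}
```

One thing to keep in mind with this approach is that a deferred cleanup should not also be called manually, otherwise the component is cleaned up twice.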
+
+## Criticisms
+
+In lieu of a FAQ I will attempt to preempt criticisms of the component
+oriented pattern laid out in this post:
+
+*This seems like a lot of extra work.*
+
+Building reliable programs is a lot of work, just as building reliable-anything
+is a lot of work. Many of us work in an industry which likes to balance
+reliability (sometimes referred to by the more specious "quality") with
+malleability and deliverability, which naturally leads to skepticism of any
+suggestions which require more time spent on reliability. This is not
+necessarily a bad thing; it's just how the industry functions.
+
+All that said, a pattern need not be followed perfectly to be worthwhile, and
+the amount of extra work incurred by it can be decided based on practical
+considerations. I merely maintain that when it comes time to revisit some
+existing code, either to fix or augment it, that the job will be notably easier
+if the code _mostly_ follows this pattern.
+
+*My language makes this difficult.*
+
+I don't know of any language which makes this pattern particularly easy, so
+unfortunately we're all in the same boat to some extent (though I recognize that
+some languages, or their ecosystems, make it more difficult than others). It
+seems to me that this pattern shouldn't be unbearably difficult for anyone to
+implement in any language either, however, as the only language feature needed
+is abstract typing.
+
+It would be nice to one day see a language which explicitly supported this
+pattern by baking the component properties in as compiler checked rules.
+
+*This will result in over-abstraction.*
+
+Abstraction is a necessary tool in a programmer's toolkit; there is simply no
+way around it. The only questions are "how much?" and "where?".
+
+The use of this pattern does not affect how those questions are answered, but
+instead aims to more clearly delineate the relationships and interactions
+between the different abstracted types once they've been established using other
+methods. Over-abstraction is the fault of the programmer, not the language or
+pattern or framework.
+
+*The acronym is CoP.*
+
+Why do you think I've just been awkwardly using "this pattern" instead of the
+acronym for the whole post? Better names are welcome.
+
+## Conclusion
+
+The component oriented pattern helps make our code more reliable with only a
+small amount of extra effort incurred. In fact most of the pattern has to do
+with establishing sensible abstractions around global functionality and
+remembering certain idioms for how those abstractions should be composed
+together, something most of us do to some extent already anyway.
+
+While beneficial in many ways, component oriented programming is merely a tool
+which can be applied in many cases. It is certain that there are cases where it
+is not the right tool for the job. I've found these cases to be
+few-and-far-between, however. It's a solid pattern that I've gotten good use out
+of, and hopefully you'll find it, or some parts of it, to be useful as well.