summaryrefslogtreecommitdiff
path: root/_posts/2020-11-16-component-oriented-programming.md
blob: c17b78d4109643485b2862da3fb86e2c31a2f15d (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
---
title: >-
    Component Oriented Programming
description: >-
    A concise description of.
---

[A previous post in this
blog](2019-08-02-program-structure-and-composability.html) focused on a
framework developed to make designing component-based programs easier. In
retrospect pattern/framework proposed was over-engineered; this post attempts to
present the same ideas but in a more distilled form, as a simple programming
pattern and without the unnecessary framework.

Nothing in this post will be revelatory; it's surely all been said before. But
hopefully the form it takes here will be useful to someone, as it would have
been useful to myself when I first learned to program.

## Axioms

For the sake of brevity let's assume the following: within the context of
single-process (_not_ the same as single-threaded), non-graphical programs the
following may be said:

1. A program may be thought of as a black-box with certain input and output
   methods. It is the programmer's task to construct a program such that
   specific inputs yield specific desired outputs.

2. A program is not complete without sufficient testing to prove it's complete.

3. Global state and global impure functions makes testing more difficult. This
   can include singletons and system calls.

Any of these may be argued, but that will be left for other posts. Any of these
may be said of other types of programs as well, but that can also be left for
other posts.

## Components

Properties of components include:

1. *Creatable*: An instance of a component, given some defined set of
   parameters, can be created independently of any other instance of that or any
   other component.

2. *Composable*: A component may be used as a parameter of another component's
   instantiation. This would make it a child component of the one being
   instantiated (i.e. the parent).

3. *Abstract*: A component is an interface consisting of one or more methods.
   Being an interface, a component may have one or more implementations, but
   generally will have a primary implementation, which is used during a
   program's runtime, and secondary "mock" implementations, which are only used
   when testing other components.

4. *Isolated*: A component may not use mutable global variables (i.e.
   singletons) or impure global functions (e.g. system calls). It may only use
   constants and variables/components given to it during instantiation.

5. *Ephemeral*: A component may have a specific method used to clean up all
   resources that it's holding (e.g. network connections, file handles,
   language-specific lightweight threads, etc).

   5a. This cleanup method should _not_ clean up any child components given as
   instantiation parameters.

   5b. This cleanup method should not return until the component's cleanup is
   complete.

Components are composed together to create programs. This is done by passing
components as parameters to other components during instantiation. The `main`
process of the program is responsible for instantiating and composing most, if
not all, components in the program.

A component oriented program is one which primarily, if not entirely, uses
components for its functionality. Components generally have the quality of being
able to interact with code written in other patterns without any toes being
stepped on.

## Example

Let's start with an example: suppose a program is desired which accepts a string
over stdin, hashes it, then writes the string to a file whose name is the hash.

A naive implementation of this program in go might look like:

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"io"
	"io/ioutil"
	"os"
)

func hashFileWriter() error {
	h := sha1.New()
	r := io.TeeReader(os.Stdin, h)
	body, _ := ioutil.ReadAll(r)
	fileName := hex.EncodeToString(h.Sum(nil))

	if err := ioutil.WriteFile(fileName, body, 0644); err != nil {
		return err
	}

	return nil
}

func main() {
	if err := hashFileWriter(); err != nil {
		panic(err) // consider the error handled
	}
}
```

Notice that there's not a clear separation here between different components;
`hashFileWriter` _might_ be considered a one method component, except that it
breaks component property 4, which says that a component may not use mutable
global variables (`os.Stdin`) or impure global functions (`ioutil.WriteFile`).

Notice also that testing the program would require integration tests, and could
not be unit tested (because there are no units, i.e. components). For a trivial
program like this one writing unit and integration tests would be redundant, but
for larger programs it may not be. Unit tests are important because they are
fast to run, (usually) easy to formulate, and yield consistent results.

This program could instead be written as being composed of three components:

* `stdin`, a construct given by the runtime which outputs a stream of bytes.

* `disk`, accepts a file name and file contents as input, writes the file
  contents to a file of the given name, and potentially returns an error back.

* `hashFileWriter`, reads a stream of bytes off a `stdin`, collects the stream
  into a string, hashes that string to generate a file name, and uses `disk` to
  create a corresponding file with the string as its contents. If `disk` returns
  an error then `hashFileWriter` returns that error.

Sprucing up our previous example to use these more clearly defined components
might look like:

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
	"io"
	"io/ioutil"
	"os"
)

// Disk defines the methods of the disk component.
type Disk interface {
	WriteFile(fileName string, fileContents []byte) error
}

// disk is the primary implementation of Disk. It implements the methods of
// Disk (WriteFile) by performing actual system calls.
type disk struct{}

func NewDisk() Disk { return disk{} }

func (disk) WriteFile(fileName string, fileContents []byte) error {
	return ioutil.WriteFile(fileName, fileContents, 0644)
}

func hashFileWriter(stdin io.Reader, disk Disk) error {
	h := sha1.New()
	r := io.TeeReader(stdin, h)
	body, err := ioutil.ReadAll(r)
	if err != nil {
		return fmt.Errorf("reading input: %w", err)
	}

	fileName := hex.EncodeToString(h.Sum(nil))

	if err := disk.WriteFile(fileName, body); err != nil {
		return fmt.Errorf("writing to file %q: %w", fileName, err)
	}
	return nil
}

func main() {
	if err := hashFileWriter(os.Stdin, NewDisk()); err != nil {
		panic(err) // consider the error handled
	}
}
```

`hashFileWriter` no longer directly uses `os.Stdin` and `ioutil.WriteFile`, but
instead takes in components wrapping them; `io.Reader` is a built-in interface
which `os.Stdin` inherently implements, and `Disk` is a simple interface defined
just for this program.

At first glance this would seem to have doubled the line-count for very little
gain. This is because we have not yet written tests.

## Testing

As has already been firmly established, testing is important.

In the second form of the program we can test the core-functionality of the
`hashFileWriter` component without resorting to using the actual `stdin` and
`disk` components. Instead we use mocks of those components. A mock component
implements the same input/outputs that the "real" component does, but in a way
which makes testing a particular component possible without reaching outside the
process. These are unit tests.

Tests for the latest form of the program might look like this:

```go
package main

import (
	"strings"
	"testing"
)

// mockDisk implements the Disk interface. When WriteFile is called mockDisk
// will pretend to write the file, but instead will simply store what arguments
// WriteFile was called with.
type mockDisk struct {
	fileName     string
	fileContents []byte
}

func (d *mockDisk) WriteFile(fileName string, fileContents []byte) error {
	d.fileName = fileName
	d.fileContents = fileContents
	return nil
}

func TestHashFileWriter(t *testing.T) {
	type test struct {
		in          string
		expFileName string
		// expFileContents can be inferred from in
	}

	tests := []test{
		{
			in:          "",
			expFileName: "da39a3ee5e6b4b0d3255bfef95601890afd80709",
		},
		{
			in:          "hello",
			expFileName: "aaf4c61ddcc5e8a2dabede0f3b482cd9aea9434d",
		},
		{
			in:          "hello\nworld", // make sure newlines don't break things
			expFileName: "7db827c10afc1719863502cf95397731b23b8bae",
		},
	}

	for _, test := range tests {
		// stdin is mocked via a strings.Reader, which outputs the string it was
		// initialized with as a stream of bytes.
		in := strings.NewReader(test.in)

		// Disk is mocked by mockDisk, go figure.
		disk := new(mockDisk)

		if err := hashFileWriter(in, disk); err != nil {
			t.Errorf("in:%q got err:%v", test.in, err)
		} else if string(disk.fileContents) != test.in {
			t.Errorf("in:%q got contents:%q", test.in, disk.fileContents)
		} else if string(disk.fileName) != test.expFileName {
			t.Errorf("in:%q got fileName:%q", test.in, disk.fileName)
		}
	}
}
```

Notice that these tests do not _completely_ cover the desired functionality of
the program: if `disk` returns an error that error should be returned from
`hashFileWriter`. Whether or not this must be tested as well, and indeed the
pedantry level of tests overall, is a matter of taste. I believe these to be
sufficient.

## Configuration

Practically all programs require some level of runtime configuration. This may
take the form of command-line arguments, environment variables, configuration
files, etc. Almost all configuration methods will require some system call, and
so any component accessing configuration directly would likely break component
property 4.

Instead each component should take in whatever configuration parameters it needs
during instantiation, and let `main` handle collecting all configuration from
outside of the process and instantiating the components appropriately.

Let's take our previous program, but add in two new desired behaviors: first,
there should be a command-line parameter which allows for specifying the string
on the command-line, rather than reading from stdin, and second, there should be
a command-line parameter declaring which directory to write files into. The new
implementation looks like:

```
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"flag"
	"fmt"
	"io"
	"io/ioutil"
	"os"
	"path/filepath"
	"strings"
)

// Disk defines the methods of the disk component.
type Disk interface {
	WriteFile(fileName string, fileContents []byte) error
}

// disk is the concrete implementation of Disk. It implements the methods of
// Disk (WriteFile) by performing actual OS calls.
type disk struct {
	dir string
}

func NewDisk(dir string) Disk { return disk{dir: dir} }

func (d disk) WriteFile(fileName string, fileContents []byte) error {
	fileName = filepath.Join(d.dir, fileName)
	return ioutil.WriteFile(fileName, fileContents, 0644)
}

func hashFileWriter(in io.Reader, disk Disk) error {
	h := sha1.New()
	r := io.TeeReader(in, h)
	body, err := ioutil.ReadAll(r)
	if err != nil {
		return fmt.Errorf("reading input: %w", err)
	}

	fileName := hex.EncodeToString(h.Sum(nil))

	if err := disk.WriteFile(fileName, body); err != nil {
		return fmt.Errorf("writing to file %q: %w", fileName, err)
	}
	return nil
}

func main() {
	str := flag.String("str", "", "If set, hash and write this string instead of stdin")
	dir := flag.String("dir", ".", "Directory which files should be written to")
	flag.Parse()

	var in io.Reader
	if *str == "" {
		in = os.Stdin
	} else {
		in = strings.NewReader(*str)
	}

	disk := NewDisk(*dir)

	if err := hashFileWriter(in, disk); err != nil {
		panic(err) // consider the error handled
	}
}
```

Very little has changed, and in fact `hashFileWriter` was not touched at all,
meaning all unit tests remained valid. 

## Setup/Runtime/Cleanup

A program can be split into three stages: setup, runtime, and cleanup. Setup
is the stage during which internal state is assembled in order to make runtime
possible. Runtime is the stage during which a program's actual function is being
performed. Cleanup is the stage during which runtime stop and internal state is
disassembled.

A graceful (i.e. reliably correct) setup is quite natural to accomplish, but
unfortunately a graceful cleanup is not a programmer's first concern, and
frequently is not a concern at all. However, when building reliable and correct
programs, a graceful cleanup is as important as a graceful setup and runtime. A
program is still running while it is being cleaned up, and it's possibly even
acting on the outside world still. Shouldn't it behave correctly during that
time?

Achieving a graceful setup and cleanup with components is quite simple:

During setup a single-threaded process (usually `main`) will construct the
"leaf" components (those which have no child components of their own) first,
then the components which take those leaves as parameters, then the components
which take _those_ as parameters, and so on, until all are constructed. The
components end up assembled into a directed acyclic graph.

At this point the program will begin runtime.

Once runtime is over and it is time for the program to exit it's only necessary
to call each component's cleanup method(s) in the reverse of the order the
components were instantiated in. A component's cleanup method should not be
called until all of its parent components have been cleaned up.

Inherent to the pattern is the fact that each component will certainly be
cleaned up before any of its child components, since its child components must
have been instantiated first and a component will not clean up child components
given as parameters (as-per component property 5a).

With go this pattern can be achieved easily using `defer`, but writing it out
manually is not so hard, as in this toy example:

```
package main

import (
	"fmt"
	"time"
)

// sleeper is a component which prints its children and sleeps when it's time to
// cleanup.
type sleeper struct {
	children []*sleeper
	toSleep  time.Duration

	// The builtin time.Sleep is an impure global function, a component can't
	// use it, so the component must be instantiated with it as a parameter.
	sleep func(time.Duration)

	// likewise os.Stdout is a global singleton, and so must also be a
	parameter.
	stdout io.Writer
}

func (s *sleeper) print() {
	fmt.Fprintf(s.stdout, "I will sleep for %v\n", s.toSleep)
	for _, child := range s.children {
		child.print()
	}
}

func (s *sleeper) cleanup() {
	s.sleep(s.toSleep)
	fmt.Fprintf(s.stdout, "I slept for %v\n", s.toSleep)
}

func main() {

	// Within main we make a helper function to easily construct sleepers. for a
	// toy like this it's not worth the effort of giving sleeper a real
	// initialization function.
	newSleeper := func(toSleep time.Duration, children ...*sleeper) *sleeper {
		return &sleeper{
			children: children,
			toSleep:  toSleep,
			sleep:    time.Sleep,
			stdout:   os.Stdout,
		}
	}

	aa := newSleeper(250 * time.Millisecond)
	defer aa.cleanup()

	ab := newSleeper(250 * time.Millisecond)
	defer ab.cleanup()

	// A's children are AA and AB
	a := newSleeper(500*time.Millisecond, aa, ab)
	defer a.cleanup()

	b := newSleeper(750 * time.Millisecond)
	defer b.cleanup()

	// root's children are A and B
	root := newSleeper(1*time.Second, a, b)
	defer root.cleanup()

	// All components are now instantiated and runtime begins.
	root.print()
    // ... and just like that, runtime ends.
	fmt.Println("--- Alright, fun is over, time for bed ---")

	// Now to clean up, cleanup methods are called in the reverse order of the
	// component's instantiation.
	root.cleanup()
	b.cleanup()
	a.cleanup()
	ab.cleanup()
	aa.cleanup()

	// Expected output is:
	//
	// I will sleep for 1s
	// I will sleep for 500ms
	// I will sleep for 250ms
	// I will sleep for 250ms
	// I will sleep for 750ms
	// --- Alright, fun is over, time for bed ---
	// I slept for 1s
	// I slept for 750ms
	// I slept for 500ms
	// I slept for 250ms
	// I slept for 250ms
}
```

## Criticisms

In lieu of a FAQ I will attempt to premeditate criticisms of the component
oriented pattern laid out in this post:

*This seems like a lot of extra work.*

Building reliable programs is a lot of work, just as building reliable-anything
is a lot of work. Many of us work in an industry which likes to balance
reliability (sometimes referred to by the more specious "quality") with
maleability and deliverability, which naturally leads to skepticism of any
suggestions which require more time spent on reliability. This is not
necessarily a bad thing, it's just how the industry functions.

All that said, a pattern need not be followed perfectly to be worthwhile, and
the amount of extra work incurred by it can be decided based on practical
considerations. I merely maintain that when it comes time to revisit some
existing code, either to fix or augment it, that the job will be notably easier
if the code _mostly_ follows this pattern.

*My language makes this difficult.*

I don't know of any language which makes this pattern particularly easy, so
unfortunately we're all in the same boat to some extent (though I recognize that
some languages, or their ecosystems, make it more difficult than others). It
seems to me that this pattern shouldn't be unbearably difficult for anyone to
implement in any language either, however, as the only language feature needed
is abstract typing.

It would be nice to one day see a language which explicitly supported this
pattern by baking the component properties in as compiler checked rules.

*This will result in over-abstraction.*

Abstraction is a necessary tool in a programmer's toolkit, there is simply no
way around it. The only questions are "how much?" and "where?".

The use of this pattern does not effect how those questions are answered, but
instead aims to more clearly delineate the relationships and interactions
between the different abstracted types once they've been established using other
methods. Over-abstraction is the fault of the programmer, not the language or
pattern or framework.

*The acronymn is CoP.*

Why do you think I've just been ackwardly using "this pattern" instead of the
acronymn for the whole post? Better names are welcome.

## Conclusion

The component oriented pattern helps make our code more reliable with only a
small amount of extra effort incurred. In fact most of the pattern has to do
establishing sensible abstractions around global functionality and remembering
certain idioms for how those abstractions should be composed together, something
most of us do to some extent already anyway.

While beneficial in many ways, component oriented programming is merely a tool
which can be applied in many cases. It is certain that there are cases where it
is not the right tool for the job. I've found these cases to be
few-and-far-between, however. It's a solid pattern that I've gotten good use out
of, and hopefully you'll find it, or some parts of it, to be useful as well.