Diffstat (limited to '_posts')
21 files changed, 0 insertions, 4764 deletions
diff --git a/_posts/2013-04-09-erlang-tcp-socket-pull-pattern.md b/_posts/2013-04-09-erlang-tcp-socket-pull-pattern.md deleted file mode 100644 index 3e5f0af..0000000 --- a/_posts/2013-04-09-erlang-tcp-socket-pull-pattern.md +++ /dev/null @@ -1,256 +0,0 @@ ---- -title: "Erlang, tcp sockets, and active true" -description: >- - Using `{active:once}` isn't always the best way to handle connections. ---- - -If you don't know erlang then [you're missing out][0]. If you do know erlang, -you've probably at some point done something with tcp sockets. Erlang's highly -concurrent model of execution lends itself well to server programs where a high -number of active connections is desired. Each thread can autonomously handle its -single client, greatly simplifying the logic of the whole application while -still retaining [great performance characteristics][1]. - -## Background - -For an erlang thread which owns a single socket there are three different ways -to receive data off of that socket. These all revolve around the `active` -[setopts][2] flag. A socket can be set to one of: - -* `{active,false}` - All data must be obtained through [recv/2][3] calls. This - amounts to synchronous socket reading. - -* `{active,true}` - All data on the socket gets sent to the controlling thread - as a normal erlang message. It is the thread's - responsibility to keep up with the buffered data in the - message queue. This amounts to asynchronous socket reading. - -* `{active,once}` - When set the socket is placed in `{active,true}` for a - single packet. That is, once set the thread can expect a - single message to be sent to it when data comes in. To receive - any more data off of the socket the socket must either be - read from using [recv/2][3] or be put in `{active,once}` or - `{active,true}`. - -## Which to use? - -Many (most?) tutorials advocate using `{active,once}` in your application -\[0]\[1]\[2]. This has to do with usability and security. 
When in `{active,true}` -it's possible for a client to flood the connection faster than the receiving -process can process those messages, potentially eating up a lot of memory in -the VM. However, if you want to be able to receive both tcp data messages as -well as other messages from other erlang processes at the same time you can't -use `{active,false}`. So `{active,once}` is generally preferred because it -deals with both of these problems quite well. - -## Why not to use `{active,once}` - -Here's what your classic `{active,once}` enabled tcp socket implementation will -probably look like: - -```erlang --module(tcp_test). --compile(export_all). - --define(TCP_OPTS, [ - binary, - {packet, raw}, - {nodelay,true}, - {active, false}, - {reuseaddr, true}, - {keepalive,true}, - {backlog,500} -]). - -%Start listening -listen(Port) -> - {ok, L} = gen_tcp:listen(Port, ?TCP_OPTS), - ?MODULE:accept(L). - -%Accept a connection -accept(L) -> - {ok, Socket} = gen_tcp:accept(L), - ?MODULE:read_loop(Socket), - io:fwrite("Done reading, connection was closed\n"), - ?MODULE:accept(L). - -%Read everything it sends us -read_loop(Socket) -> - inet:setopts(Socket, [{active, once}]), - receive - {tcp, _, _} -> - do_stuff_here, - ?MODULE:read_loop(Socket); - {tcp_closed, _}-> donezo; - {tcp_error, _, _} -> donezo - end. -``` - -This code isn't actually usable for a production system; it doesn't even spawn a -new process for the new socket. But that's not the point I'm making. If I run it -with `tcp_test:listen(8000)`, and in another window do: - -```bash -while [ 1 ]; do echo "aloha"; done | nc localhost 8000 -``` - -We'll be flooding the server with data pretty well. Using [eprof][4] we can -get an idea of how our code performs, and where the hang-ups are: - -```erlang -1> eprof:start(). -{ok,<0.34.0>} - -2> P = spawn(tcp_test,listen,[8000]). -<0.36.0> - -3> eprof:start_profiling([P]). -profiling - -4> running_the_while_loop. -running_the_while_loop - -5> eprof:stop_profiling(). 
-profiling_stopped - -6> eprof:analyze(procs,[{sort,time}]). - -****** Process <0.36.0> -- 100.00 % of profiled time *** -FUNCTION CALLS % TIME [uS / CALLS] --------- ----- --- ---- [----------] -prim_inet:type_value_2/2 6 0.00 0 [ 0.00] - -....snip.... - -prim_inet:enc_opts/2 6 0.00 8 [ 1.33] -prim_inet:setopts/2 12303599 1.85 1466319 [ 0.12] -tcp_test:read_loop/1 12303598 2.22 1761775 [ 0.14] -prim_inet:encode_opt_val/1 12303599 3.50 2769285 [ 0.23] -prim_inet:ctl_cmd/3 12303600 4.29 3399333 [ 0.28] -prim_inet:enc_opt_val/2 24607203 5.28 4184818 [ 0.17] -inet:setopts/2 12303598 5.72 4533863 [ 0.37] -erlang:port_control/3 12303600 77.13 61085040 [ 4.96] -``` - -eprof shows us where our process is spending the majority of its time. The `%` -column indicates percentage of time the process spent during profiling inside -any function. We can pretty clearly see that the vast majority of time was spent -inside `erlang:port_control/3`, the BIF that `inet:setopts/2` uses to switch the -socket to `{active,once}` mode. Amongst the calls which were called on every -loop, it takes up by far the most amount of time. In addition all of those other -calls are also related to `inet:setopts/2`. - -I'm gonna rewrite our little listen server to use `{active,true}`, and we'll do -it all again: - -```erlang --module(tcp_test). --compile(export_all). - --define(TCP_OPTS, [ - binary, - {packet, raw}, - {nodelay,true}, - {active, false}, - {reuseaddr, true}, - {keepalive,true}, - {backlog,500} -]). - -%Start listening -listen(Port) -> - {ok, L} = gen_tcp:listen(Port, ?TCP_OPTS), - ?MODULE:accept(L). - -%Accept a connection -accept(L) -> - {ok, Socket} = gen_tcp:accept(L), - inet:setopts(Socket, [{active, true}]), %Well this is new - ?MODULE:read_loop(Socket), - io:fwrite("Done reading, connection was closed\n"), - ?MODULE:accept(L). 
- -%Read everything it sends us -read_loop(Socket) -> - %inet:setopts(Socket, [{active, once}]), - receive - {tcp, _, _} -> - do_stuff_here, - ?MODULE:read_loop(Socket); - {tcp_closed, _}-> donezo; - {tcp_error, _, _} -> donezo - end. -``` - -And the profiling results: - -```erlang -1> eprof:start(). -{ok,<0.34.0>} - -2> P = spawn(tcp_test,listen,[8000]). -<0.36.0> - -3> eprof:start_profiling([P]). -profiling - -4> running_the_while_loop. -running_the_while_loop - -5> eprof:stop_profiling(). -profiling_stopped - -6> eprof:analyze(procs,[{sort,time}]). - -****** Process <0.36.0> -- 100.00 % of profiled time *** -FUNCTION CALLS % TIME [uS / CALLS] --------- ----- --- ---- [----------] -prim_inet:enc_value_1/3 7 0.00 1 [ 0.14] -prim_inet:decode_opt_val/1 1 0.00 1 [ 1.00] -inet:setopts/2 1 0.00 2 [ 2.00] -prim_inet:setopts/2 2 0.00 2 [ 1.00] -prim_inet:enum_name/2 1 0.00 2 [ 2.00] -erlang:port_set_data/2 1 0.00 2 [ 2.00] -inet_db:register_socket/2 1 0.00 3 [ 3.00] -prim_inet:type_value_1/3 7 0.00 3 [ 0.43] - -.... snip .... - -prim_inet:type_opt_1/1 19 0.00 7 [ 0.37] -prim_inet:enc_value/3 7 0.00 7 [ 1.00] -prim_inet:enum_val/2 6 0.00 7 [ 1.17] -prim_inet:dec_opt_val/1 7 0.00 7 [ 1.00] -prim_inet:dec_value/2 6 0.00 10 [ 1.67] -prim_inet:enc_opt/1 13 0.00 12 [ 0.92] -prim_inet:type_opt/2 19 0.00 33 [ 1.74] -erlang:port_control/3 3 0.00 59 [ 19.67] -tcp_test:read_loop/1 20716370 100.00 12187488 [ 0.59] -``` - -This time our process spent almost no time at all (according to eprof, 0%) -fiddling with the socket opts. Instead it spent all of its time in the -read_loop doing the work we actually want to be doing. - -## So what does this mean? - -I'm by no means advocating never using `{active,once}`. The security concern is -still a completely valid concern and one that `{active,once}` mitigates quite -well. 
I'm simply pointing out that this mitigation has some fairly serious -performance implications which have the potential to bite you if you're not -careful, especially in cases where a socket is going to be receiving a large -amount of traffic. - -## Meta - -These tests were done using R15B03, but I've done similar ones in R14 and found -similar results. I have not tested R16. - -* \[0] http://learnyousomeerlang.com/buckets-of-sockets -* \[1] http://www.erlang.org/doc/man/gen_tcp.html#examples -* \[2] http://erlycoder.com/25/erlang-tcp-server-tcp-client-sockets-with-gen_tcp - -[0]: http://learnyousomeerlang.com/content -[1]: http://www.metabrew.com/article/a-million-user-comet-application-with-mochiweb-part-1 -[2]: http://www.erlang.org/doc/man/inet.html#setopts-2 -[3]: http://www.erlang.org/doc/man/gen_tcp.html#recv-2 -[4]: http://www.erlang.org/doc/man/eprof.html diff --git a/_posts/2013-07-11-goplus.md b/_posts/2013-07-11-goplus.md deleted file mode 100644 index 5ee121e..0000000 --- a/_posts/2013-07-11-goplus.md +++ /dev/null @@ -1,77 +0,0 @@ ---- -title: Go+ -description: >- - A simple proof-of-concept script for doing go dependency management. ---- - -Compared to other languages go has some strange behavior regarding its project -root settings. If you import a library called `somelib`, go will look for a -`src/somelib` folder in all of the folders in the `$GOPATH` environment -variable. This works nicely for globally installed packages, but it makes -encapsulating a project with a specific version, or modified version, rather -tedious. Whenever you go to work on this project you'll have to add its path to -your `$GOPATH`, or add the path permanently, which could break other projects -which may use a different version of `somelib`. - -My solution is in the form of a simple script I'm calling go+. go+ will search -in the current directory and all of its parents for a file called `GOPROJROOT`. 
If -it finds that file in a directory, it prepends that directory's absolute path to -your `$GOPATH` and stops the search. Regardless of whether or not `GOPROJROOT` -was found go+ will passthrough all arguments to the actual go call. The -modification to `$GOPATH` will only last the duration of the call. - -As an example, consider the following: -``` -/tmp - /hello - GOPROJROOT - /src - /somelib/somelib.go - /hello.go -``` - -If `hello.go` depends on `somelib`, as long as you run go+ from `/tmp/hello` or -one of its children your project will still compile - -Here is the source code for go+: - -```bash -#!/bin/sh - -SEARCHING_FOR=GOPROJROOT -ORIG_DIR=$(pwd) - -STOPSEARCH=0 -SEARCH_DIR=$ORIG_DIR -while [ $STOPSEARCH = 0 ]; do - - RES=$( find $SEARCH_DIR -maxdepth 1 -type f -name $SEARCHING_FOR | \ - grep -P "$SEARCHING_FOR$" | \ - head -n1 ) - - if [ "$RES" = "" ]; then - if [ "$SEARCH_DIR" = "/" ]; then - STOPSEARCH=1 - fi - cd .. - SEARCH_DIR=$(pwd) - else - export GOPATH=$SEARCH_DIR:$GOPATH - STOPSEARCH=1 - fi -done - -cd "$ORIG_DIR" -exec go $@ -``` - -## UPDATE: Goat - -I'm leaving this post for posterity, but go+ has some serious flaws in it. For -one, it doesn't allow for specifying the version of a dependency you want to -use. To this end, I wrote [goat][0] which does all the things go+ does, plus -real dependency management, PLUS it is built in a way that if you've been -following go's best-practices for code organization you shouldn't have to change -any of your existing code AT ALL. It's cool, check it out. - -[0]: http://github.com/mediocregopher/goat diff --git a/_posts/2013-10-08-generations.md b/_posts/2013-10-08-generations.md deleted file mode 100644 index c1c433d..0000000 --- a/_posts/2013-10-08-generations.md +++ /dev/null @@ -1,100 +0,0 @@ ---- -title: Generations -description: >- - A simple file distribution strategy for very large scale, high-availability - file-services. 
---- - -## The problem - -At [cryptic.io][cryptic] we plan on having millions of different -files, any of which could be arbitrarily chosen to be served at any given time. -These files are uploaded by users at arbitrary times. - -Scaling such a system is no easy task. The solution I've seen implemented in the -past involves shuffling files around on a nearly constant basis, making sure -that files which are more "popular" are on fast drives, while at the same time -making sure that no drives are at capacity and at the same time that all files, -even newly uploaded ones, are stored redundantly. - -The problem with this solution is one of coordination. At any given moment the -app needs to be able to "find" a file so it can give the client a link to -download the file from one of the servers that it's on. Fulfilling this simple -requirement means that all datastores/caches where information about where a -file lives need to be up-to-date at all times, and even then there are -race-conditions and network failures to contend with, while at all times the -requirements of the app evolve and change. - -## A simpler solution - -Let's say you want all files which get uploaded to be replicated in triplicate -in some capacity. You buy three identical hard-disks, and put each on a separate -server. As files get uploaded by clients, each file gets put on each drive -immediately. When the drives are filled (which should be at around the same -time), you stop uploading to them. - -That was generation 0. - -You buy three more drives, and start putting all files on them instead. This is -going to be generation 1. Repeat until you run out of money. - -That's it. - -### That's it? - -It seems simple and obvious, and maybe it's the standard thing which is done, -but as far as I can tell no-one has written about it (though I'm probably not -searching for the right thing, let me know if this is the case!). 
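The scheme above can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the post: drives are modelled as in-memory dicts, capacity is counted in bytes, and the names `Generation`, `Store`, `upload` and `fetch` are hypothetical.

```python
class Generation:
    """A fixed set of identical drives that all hold the same files."""

    def __init__(self, replicas, capacity):
        self.capacity = capacity  # bytes each drive can hold (assumed unit)
        self.used = 0
        self.drives = [{} for _ in range(replicas)]  # one dict per drive

    def has_room(self, size):
        return self.used + size <= self.capacity

    def put(self, name, data):
        # Every file goes onto every drive in the generation immediately.
        for drive in self.drives:
            drive[name] = data
        self.used += len(data)


class Store:
    """Uploads go to the newest generation; a full one is never written again."""

    def __init__(self, replicas=3, capacity=10):
        self.replicas = replicas
        self.capacity = capacity
        self.generations = [Generation(replicas, capacity)]
        self.index = {}  # file name -> generation number

    def upload(self, name, data):
        if len(data) > self.capacity:
            raise ValueError("file is larger than a whole drive")
        if not self.generations[-1].has_room(len(data)):
            # Current drives are full: "buy" new ones and start a generation.
            self.generations.append(Generation(self.replicas, self.capacity))
        number = len(self.generations) - 1
        self.generations[number].put(name, data)
        self.index[name] = number
        return number

    def fetch(self, name, replica=0):
        # Knowing the generation is enough; any drive in it has the file.
        generation = self.generations[self.index[name]]
        return generation.drives[replica][name]


store = Store(replicas=3, capacity=10)
first = store.upload("a.txt", b"12345")
second = store.upload("b.txt", b"123456")  # does not fit, opens generation 1
```

The only bookkeeping is the mapping from a file to its generation number; nothing ever moves between drives after upload.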
- -### Advantages - -* It's so simple to implement, you could probably do it in a day if you're -starting a project from scratch - -* By definition of the scheme all files are replicated in multiple places. - -* Minimal information about where a file "is" needs to be stored. When a file is -uploaded all that's needed is to know what generation it is in, and then what -nodes/drives are in that generation. If the file's name is generated -server-side, then the file's generation could be *part* of its name, making -lookup even faster. - -* Drives don't need to "know" about each other. What I mean by this is that -whatever is running as the receive point for file-uploads on each drive doesn't -have to coordinate with its siblings running on the other drives in the -generation. In fact it doesn't need to coordinate with anyone. You could -literally rsync files onto your drives if you wanted to. I would recommend using -[marlin][0] though :) - -* Scaling is easy. When you run out of space you can simply start a new -generation. If you don't like playing that close to the chest there's nothing to -say you can't have two generations active at the same time. - -* Upgrading is easy. As long as a generation is not marked-for-upload, you can -easily copy all files in the generation into a new set of bigger, badder drives, -add those drives into the generation in your code, remove the old ones, then -mark the generation as uploadable again. - -* Distribution is easy. You just copy a generation's files onto a new drive in -Europe or wherever you're getting an uptick in traffic from and you're good to -go. - -* Management is easy. It's trivial to find out how many times a file has been -replicated, or how many countries it's in, or what hardware it's being served -from (given you have easy access to information about specific drives). - -### Caveats - -The big caveat here is that this is just an idea. It has NOT been tested in -production. 
But we have enough faith in it that we're going to give it a shot at -[cryptic.io][cryptic]. I'll keep this page updated. - -The second caveat is that this scheme does not inherently support caching. If a -file suddenly becomes super popular the world over your hard-disks might not be -able to keep up, and it's probably not feasible to have an FIO drive in *every* -generation. I think that [groupcache][1] may be the answer to this problem, -assuming your files are reasonably small, but again I haven't tested it yet. - -[cryptic]: https://cryptic.io -[0]: https://github.com/cryptic-io/marlin -[1]: https://github.com/golang/groupcache diff --git a/_posts/2013-10-25-namecoind-ssl.md b/_posts/2013-10-25-namecoind-ssl.md deleted file mode 100644 index 2711a92..0000000 --- a/_posts/2013-10-25-namecoind-ssl.md +++ /dev/null @@ -1,248 +0,0 @@ ---- -title: Namecoin, A Replacement For SSL -description: >- - If we use the namecoin chain as a DNS service we get security almost for - free, along with lots of other benefits. ---- - -At [cryptic.io][cryptic] we are creating a client-side, in-browser encryption -system where a user can upload their already encrypted content to our storage -system and be 100% confident that their data can never be decrypted by anyone -but them. - -One of the main problems with this approach is that the client has to be sure -that the code that's being run in their browser is the correct code; that is, -that they aren't the subject of a man-in-the-middle attack where an attacker is -turning our strong encryption into weak encryption that they could later break. - -A component of our current solution is to deliver the site's javascript (and all -other assets, for that matter) using SSL encryption. This protects the files -from tampering in-between leaving our servers and being received by the client. -Unfortunately, SSL isn't 100% foolproof. This post aims to show why SSL is -faulty, and propose a solution. 
- -## SSL - -SSL is the mechanism by which web-browsers establish an encrypted connection to -web-servers. The goal of this connection is that only the destination -web-browser and the server know what data is passing between them. Anyone spying -on the connection would only see gibberish. To do this a secret key is first -established between the client and the server, and used to encrypt/decrypt all -data. As long as no-one but those parties knows that key, that data will never -be decrypted by anyone else. - -SSL is what's used to establish that secret key on a per-session basis, so that -a key isn't ever re-used and so only the client and the server know it. - -### Public-Private Key Cryptography - -SSL is based around public-private key cryptography. In a public-private key -system, you have both a public key which is generated from a private key. The -public key can be given to anyone, but the private key must remain hidden. There -are two main uses for these two keys: - -* Someone can encrypt a message with your public key, and only you (with the - private key) can decrypt it. - -* You can sign a message with your private key, and anyone with your public key - can verify that it was you and not someone else who signed it. - -These are both extremely useful functions, not just for internet traffic but for -any kind of communication form. Unfortunately, there remains a fundamental flaw. -At some point you must give your public key to the other person in an insecure -way. If an attacker was to intercept your message containing your public key and -swap it for their own, then all future communications could be compromised. That -attacker could create messages the other person would think are from you, and -the other person would encrypt messages meant for you but which would be -decrypt-able by the attacker. - -### How does SSL work? - -SSL is at its heart a public-private key system, but its aim is to be more -secure against the attack described above. 
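To make the signing property and the key-swap attack concrete, here is a toy sketch in Python. It uses textbook RSA with small, well-known Mersenne primes purely for illustration: it is not secure, it is not how SSL itself is implemented, and the helper names (`make_keypair`, `sign`, `verify`) and the parties "alice" and "mallory" are hypothetical.

```python
import hashlib


def make_keypair(p, q, e=65537):
    """Build a toy RSA key pair from two primes (illustration only)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)  # modular inverse of e; needs Python 3.8+
    return (n, e), (n, d)  # (public key, private key)


def digest(message, n):
    # Reduce a SHA-256 hash into the range the toy modulus can handle.
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n


def sign(private_key, message):
    n, d = private_key
    return pow(digest(message, n), d, n)


def verify(public_key, message, signature):
    n, e = public_key
    return pow(signature, e, n) == digest(message, n)


# Mersenne primes, chosen only because they are easy to write down.
alice_pub, alice_priv = make_keypair(2**61 - 1, 2**89 - 1)
mallory_pub, mallory_priv = make_keypair(2**107 - 1, 2**127 - 1)

message = b"meet at noon"
genuine = sign(alice_priv, message)
forged = sign(mallory_priv, message)

# With the real public key, only the real signature checks out.
genuine_ok = verify(alice_pub, message, genuine)
forged_ok = verify(alice_pub, message, forged)

# If the attacker swapped the public key in transit, the forgery "verifies".
swapped_ok = verify(mallory_pub, message, forged)
```

The last check is the flaw described above: the math works perfectly, but it only proves who signed relative to whichever public key the verifier happened to receive.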
- -SSL uses a trust-chain to verify that a public key is the intended one. Your web -browser has a built-in set of public keys, called the root certificates, that it -implicitly trusts. These root certificates are managed by a small number of -companies designated by some agency who decides on these things. - -When you receive a server's SSL certificate (its public key) that certificate -will be signed by a root certificate. You can verify that signature since you -have the root certificate's public key built into your browser. If the signature -checks out then you know a certificate authority trusts the public key the site -gave you, which means you can trust it too. - -There's a bit (a lot!) more to SSL than this, but this is enough to understand -the fundamental problems with it. - -### How SSL doesn't work - -SSL has a few glaring problems. One, it implies we trust the companies holding -the root certificates to not be compromised. If some malicious agency was to get -ahold of a root certificate they could listen in on any connection on the -internet by swapping a site's real certificate with one they generate on the -fly. They could trivially steal any data we send on the internet. - -The second problem is that it's expensive. Really expensive. If you're running a -business you'll have to shell out about $200 a year to keep your SSL certificate -signed (those signatures have an expiration date attached). Since there's very -few root authorities there's an effective monopoly on signatures, and there's -nothing we can do about it. For 200 bucks I know most people simply say "no -thanks" and go unencrypted. The solution is creating a bigger problem. - -## Bitcoins - -Time to switch gears, and propose a solution to the above issues: namecoins. I'm -going to first talk about what namecoins are, how they work, and why we need -them. To start with, namecoins are based on bitcoins. 
- -If you haven't yet checked out bitcoins, [I highly encourage you to do -so][bitcoins]. They're awesome, and I think they have a chance of really -changing the way we think of and use money in the future. At the moment they're -still a bit of a novelty in the tech realm, but they're growing in popularity. - -The rest of this post assumes you know more or less what bitcoins are, and how -they work. - -## Namecoins - -Few people actually know about bitcoins. Even fewer know that there's other -crypto-currencies besides bitcoins. Basically, developers of these alternative -currencies (altcoins, in the parlance of our times) took the original bitcoin -source code and modified it to produce a new, separate blockchain from the -original bitcoin one. The altcoins are based on the same idea as bitcoins -(namely, a chain of blocks representing all the transactions ever made), but -have slightly different characteristics. - -One of these altcoins is called namecoin. Where other altcoins aim to be digital -currencies, and used as such (like bitcoins), namecoin has a different goal. The -point of namecoin is to create a global, distributed, secure key-value store. -You spend namecoins to claim arbitrary keys (once you've claimed it, you own it -for a set period of time) and to give those keys arbitrary values. Anyone else -with namecoind running can see these values. - -### Why use it? - -A blockchain based on a digital currency seems like a weird idea at first. I -know when I first read about it I was less than thrilled. How is this better -than a DHT? It's a key-value store, why is there a currency involved? - -#### DHT - -DHT stands for Distributed Hash-Table. I'm not going to go too into how they -work, but suffice it to say that they are essentially a distributed key-value -store. Like namecoin. The difference is in the operation. DHTs operate by -spreading and replicating keys and their values across nodes in a P2P mesh. 
They -have [lots of issues][dht] as far as security goes, the main one being that it's -fairly easy for an attacker to forge the value for a given key, and very -difficult to stop them from doing so or even to detect that it's happened. - -Namecoins don't have this problem. To forge a particular key an attacker would -essentially have to create a new blockchain from a certain point in the existing -chain, and then replicate all the work put into the existing chain into that new -compromised one so that the new one is longer and other clients in the network -will accept it. This is extremely non-trivial. - -#### Why a currency? - -To answer why a currency needs to be involved, we need to first look at how -bitcoin/namecoin work. When you take an action (send someone money, set a value -to a key) that action gets broadcast to the network. Nodes on the network -collect these actions into a block, which is just a collection of multiple -actions. Their goal is to find a hash of this new block, combined with some data -from the top-most block in the existing chain, combined with some arbitrary -data, such that the first n characters in the resulting hash are zeros (with n -constantly increasing). When they find one they broadcast it out on the network. -Assuming the block is legitimate they receive some number of coins as -compensation. - -That compensation is what keeps a blockchain based currency going. If there -were no compensation there would be no reason to mine except out of goodwill, so -far fewer people would do it. Since the chain can be compromised if a malicious -group has more computing power than all legitimate miners combined, having few -legitimate miners is a serious problem. - -In the case of namecoins, there's even more reason to involve a currency. Since -you have to spend money to make changes to the chain there's a disincentive for -attackers (read: idiots) to spam the chain with frivolous changes to keys. - -#### Why a *new* currency? 
- -I'll admit, it's a bit annoying to see all these altcoins popping up. I'm sure -many of them have some solid ideas backing them, but it also makes things -confusing for newcomers and dilutes the "market" of cryptocoin users; the more -users a particular chain has, the stronger it is. If we have many chains, all we -have are a bunch of weak chains. - -The exception to this gripe, for me, is namecoin. When I was first thinking -about this problem my instinct was to just use the existing bitcoin blockchain -as a key-value storage. However, the maintainers of the bitcoin clients -(who are, in effect, the maintainers of the chain) don't want the bitcoin -blockchain polluted with non-commerce related data. At first I disagreed; it's a -P2P network, no-one gets to say what I can or can't use the chain for! And -that's true. But things work out better for everyone involved if there's two -chains. - -Bitcoin is a currency. Namecoin is a key-value store (with a currency as its -driving force). Those are two completely different use-cases, with two -completely different usage characteristics. And we don't know yet what those -characteristics are, or if they'll change. If the chain-maintainers have to deal -with a mingled chain we could very well be tying their hands with regards to -what they can or can't change with regards to the behavior of the chain, since -improving performance for one use-case may hurt the performance of the other. -With two separate chains the maintainers of each are free to do what they see -fit to keep their respective chains operating as smoothly as possible. -Additionally, if for some reason bitcoins fall by the wayside, namecoin will -still have a shot at continuing operation since it isn't tied to the former. -Tldr: separation of concerns. - -## Namecoin as an alternative to SSL - -And now to tie it all together. 
- -There are already a number of proposed formats for standardizing how we store -data on the namecoin chain so that we can start building tools around it. I'm -not hugely concerned with the particulars of those standards, only that we can, -in some way, standardize on attaching a public key (or a fingerprint of one) to -some key on the namecoin blockchain. When you visit a website, the server -would then send both its public key and the namecoin chain key to be checked -against to the browser, and the browser would validate that the public key it -received is the same as the one on the namecoin chain. - -The main issue with this is that it requires another round-trip when visiting a -website: One for DNS, and one to check the namecoin chain. And where would this -chain even be hosted? - -My proposition is there would exist a number of publicly available servers -hosting a namecoind process that anyone in the world could send requests for -values on the chain. Browsers could then be made with a couple of these -hardwired in. ISPs could also run their own copies at various points in their -network to improve response-rates and decrease load on the globally public -servers. Furthermore, the paranoid could host their own and be absolutely sure -that the data they're receiving is valid. - -If the above scheme sounds a lot like what we currently use for DNS, that's -because it is. In fact, one of namecoin's major goals is that it be used as a -replacement for DNS, and most of the talk around it is focused on this subject. -DNS has many of the same problems as SSL, namely single-point-of-failure and -that it's run by a centralized agency that we have to pay arbitrarily high fees -to. By switching our DNS and SSL infrastructure to use namecoin we could kill -two horribly annoying, monopolized, expensive birds with a single stone. - -That's it. If we use the namecoin chain as a DNS service we get security almost -for free, along with lots of other benefits. 
To make this happen we need -cooperation from browser makers, and to standardize on a simple way of -retrieving DNS information from the chain that the browsers can use. The -protocol doesn't need to be very complex, I think HTTP/REST should suffice, -since the meat of the data will be embedded in the JSON value on the namecoin -chain. - -If you want to contribute or learn more please check out [namecoin][nmc] and -specifically the [d namespace proposal][dns] for it. - -[cryptic]: http://cryptic.io -[bitcoins]: http://vimeo.com/63502573 -[dht]: http://www.globule.org/publi/SDST_acmcs2009.pdf -[nsa]: https://www.schneier.com/blog/archives/2013/09/new_nsa_leak_sh.html -[nmc]: http://dot-bit.org/Main_Page -[dns]: http://dot-bit.org/Namespace:Domain_names_v2.0 diff --git a/_posts/2014-01-11-diamond-square.md b/_posts/2014-01-11-diamond-square.md deleted file mode 100644 index 665e07c..0000000 --- a/_posts/2014-01-11-diamond-square.md +++ /dev/null @@ -1,494 +0,0 @@ ---- -title: Diamond Square -description: >- - Tackling the problem of semi-realistic looking terrain generation in - clojure. -updated: 2018-09-06 ---- - -![terrain][terrain] - -I recently started looking into the diamond-square algorithm (you can find a -great article on it [here][diamondsquare]). The following is a short-ish -walkthrough of how I tackled the problem in clojure and the results. You can -find the [leiningen][lein] repo [here][repo] and follow along within that, or -simply read the code below to get an idea. - -Also, Marco ported my code into clojurescript, so you can get random terrain -in your browser. [Check it out!][marco] - -```clojure -(ns diamond-square.core) - -; == The Goal == -; Create a fractal terrain generator using clojure - -; == The Algorithm == -; Diamond-Square. We start with a grid of points, each with a height of 0. -; -; 1. Take each corner point of the square, average the heights, and assign that -; to be the height of the midpoint of the square. 
Apply some random error to -; the midpoint. -; -; 2. Creating a line from the midpoint to each corner we get four half-diamonds. -; Average the heights of the points (with some random error) and assign the -; heights to the midpoints of the diamonds. -; -; 3. We now have four square sections, start at 1 for each of them (with -; decreasing amount of error for each iteration). -; -; This picture explains it better than I can: -; https://blog.mediocregopher.com/img/diamond-square/dsalg.png -; (http://nbickford.wordpress.com/2012/12/21/creating-fake-landscapes/dsalg/) -; -; == The Strategy == -; We begin with a vector of vectors of numbers, and iterate over it, filling in -; spots as they become available. Our grid will have the top-left being (0,0), -; y being pointing down and x going to the right. The outermost vector -; indicating row number (y) and the inner vectors indicate the column number (x) -; -; = Utility = -; First we create some utility functions for dealing with vectors of vectors. - -(defn print-m - "Prints a grid in a nice way" - [m] - (doseq [n m] - (println n))) - -(defn get-m - "Gets a value at the given x,y coordinate of the grid, with [0,0] being in the - top left" - [m x y] - ((m y) x)) - -(defn set-m - "Sets a value at the given x,y coordinate of the grid, with [0,0] being in the - top left" - [m x y v] - (assoc m y - (assoc (m y) x v))) - -(defn add-m - "Like set-m, but adds the given value to the current one instead of overwriting - it" - [m x y v] - (set-m m x y - (+ (get-m m x y) v))) - -(defn avg - "Returns the truncated average of all the given arguments" - [& l] - (int (/ (reduce + l) (count l)))) - -; = Grid size = -; Since we're starting with a blank grid we need to find out what sizes the -; grids can be. For convenience the size (height and width) should be odd, so we -; easily get a midpoint. 
And on each iteration we'll be halving the grid, so -; whenever we do that the two resultant grids should be odd and halvable as -; well, and so on. -; -; The formula that fits this is size = 2^n + 1, where 1 <= n. For the rest of -; this guide I'll be referring to n as the "degree" of the grid. - - -(def exp2-pre-compute - (vec (map #(int (Math/pow 2 %)) (range 31)))) - -(defn exp2 - "Returns 2^n as an integer. Uses pre-computed values since we end up doing - this so much" - [n] - (exp2-pre-compute n)) - -(def grid-sizes - (vec (map #(inc (exp2 %)) (range 1 31)))) - -(defn grid-size [degree] - (inc (exp2 degree))) - -; Available grid heights/widths are as follows: -;[3 5 9 17 33 65 129 257 513 1025 2049 4097 8193 16385 32769 65537 131073 -;262145 524289 1048577 2097153 4194305 8388609 16777217 33554433 67108865 -;134217729 268435457 536870913 1073741825]) - -(defn blank-grid - "Generates a grid of the given degree, filled in with zeros" - [degree] - (let [gsize (grid-size degree)] - (vec (repeat gsize - (vec (repeat gsize 0)))))) - -(comment - (print-m (blank-grid 3)) -) - -; = Coordinate Pattern (The Tricky Part) = -; We now have to figure out which coordinates need to be filled in on each pass. -; A pass is defined as a square step followed by a diamond step. The next pass -; will be the square/diamond steps on all the smaller squares generated in the -; previous pass. It works out that the number of passes required to fill in the grid is -; the same as the degree of the grid, where the first pass is 1. -; -; So that we can easily find patterns in the coordinates for a given degree/pass, -; I've laid out below all the coordinates for each pass for a 3rd degree grid -; (which is 9x9). - -; Degree 3 Pass 1 Square -; [. . . . . . . . .] -; [. . . . . . . . .] -; [. . . . . . . . .] -; [. . . . . . . . .] -; [. . . . 1 . . . .] (4,4) -; [. . . . . . . . .] -; [. . . . . . . . .] -; [. . . . . . . . .] -; [. . . . . . . . .] - -; Degree 3 Pass 1 Diamond -; [. . . . 2 . . . .]
(4,0) -; [. . . . . . . . .] -; [. . . . . . . . .] -; [. . . . . . . . .] -; [2 . . . . . . . 2] (0,4) (8,4) -; [. . . . . . . . .] -; [. . . . . . . . .] -; [. . . . . . . . .] -; [. . . . 2 . . . .] (4,8) - -; Degree 3 Pass 2 Square -; [. . . . . . . . .] -; [. . . . . . . . .] -; [. . 3 . . . 3 . .] (2,2) (6,2) -; [. . . . . . . . .] -; [. . . . . . . . .] -; [. . . . . . . . .] -; [. . 3 . . . 3 . .] (2,6) (6,6) -; [. . . . . . . . .] -; [. . . . . . . . .] - -; Degree 3 Pass 2 Diamond -; [. . 4 . . . 4 . .] (2,0) (6,0) -; [. . . . . . . . .] -; [4 . . . 4 . . . 4] (0,2) (4,2) (8,2) -; [. . . . . . . . .] -; [. . 4 . . . 4 . .] (2,4) (6,4) -; [. . . . . . . . .] -; [4 . . . 4 . . . 4] (0,6) (4,6) (8,6) -; [. . . . . . . . .] -; [. . 4 . . . 4 . .] (2,8) (6,8) - -; Degree 3 Pass 3 Square -; [. . . . . . . . .] -; [. 5 . 5 . 5 . 5 .] (1,1) (3,1) (5,1) (7,1) -; [. . . . . . . . .] -; [. 5 . 5 . 5 . 5 .] (1,3) (3,3) (5,3) (7,3) -; [. . . . . . . . .] -; [. 5 . 5 . 5 . 5 .] (1,5) (3,5) (5,5) (7,5) -; [. . . . . . . . .] -; [. 5 . 5 . 5 . 5 .] (1,7) (3,7) (5,7) (7,7) -; [. . . . . . . . .] - -; Degree 3 Pass 3 Diamond -; [. 6 . 6 . 6 . 6 .] (1,0) (3,0) (5,0) (7,0) -; [6 . 6 . 6 . 6 . 6] (0,1) (2,1) (4,1) (6,1) (8,1) -; [. 6 . 6 . 6 . 6 .] (1,2) (3,2) (5,2) (7,2) -; [6 . 6 . 6 . 6 . 6] (0,3) (2,3) (4,3) (6,3) (8,3) -; [. 6 . 6 . 6 . 6 .] (1,4) (3,4) (5,4) (7,4) -; [6 . 6 . 6 . 6 . 6] (0,5) (2,5) (4,5) (6,5) (8,5) -; [. 6 . 6 . 6 . 6 .] (1,6) (3,6) (5,6) (7,6) -; [6 . 6 . 6 . 6 . 6] (0,7) (2,7) (4,7) (6,7) (8,7) -; [. 6 . 6 . 6 . 6 .] (1,8) (3,8) (5,8) (7,8) -; -; I make two different functions, one to give the coordinates for the square -; portion of each pass and one for the diamond portion of each pass. To find the -; actual patterns it was useful to first look only at the pattern in the -; y-coordinates, and figure out how that translated into the pattern for the -; x-coordinates.
- -(defn grid-square-coords - "Given a grid degree and pass number, returns all the coordinates which need - to be computed for the square step of that pass" - [degree pass] - (let [gsize (grid-size degree) - start (exp2 (- degree pass)) - interval (* 2 start) - coords (map #(+ start (* interval %)) - (range (exp2 (dec pass))))] - (mapcat (fn [y] - (map #(vector % y) coords)) - coords))) -; -; (grid-square-coords 3 2) -; => ([2 2] [6 2] [2 6] [6 6]) - -(defn grid-diamond-coords - "Given a grid degree and a pass number, returns all the coordinates which need - to be computed for the diamond step of that pass" - [degree pass] - (let [gsize (grid-size degree) - interval (exp2 (- degree pass)) - num-coords (grid-size pass) - coords (map #(* interval %) (range 0 num-coords))] - (mapcat (fn [y] - (if (even? (/ y interval)) - (map #(vector % y) (take-nth 2 (drop 1 coords))) - (map #(vector % y) (take-nth 2 coords)))) - coords))) - -; (grid-diamond-coords 3 2) -; => ([2 0] [6 0] [0 2] [4 2] [8 2] [2 4] [6 4] [0 6] [4 6] [8 6] [2 8] [6 8]) - -; = Height Generation = -; We now work on functions which, given a coordinate, will return what value -; coordinate will have. - -(defn avg-points - "Given a grid and an arbitrary number of points (of the form [x y]) returns - the average of all the given points that are on the map. Any points which are - off the map are ignored" - [m & coords] - (let [grid-size (count m)] - (apply avg - (map #(apply get-m m %) - (filter - (fn [[x y]] - (and (< -1 x) (> grid-size x) - (< -1 y) (> grid-size y))) - coords))))) - -(defn error - "Returns a number between -e and e, inclusive" - [e] - (- (rand-int (inc (* 2 e))) e)) - -; The next function is a little weird. It primarily takes in a point, then -; figures out the distance from that point to the points we'll take the average -; of. The locf (locator function) is used to return back the actual points to -; use. 
For the square portion it'll be the points diagonal from the given one, -; for the diamond portion it'll be the points to the top/bottom/left/right from -; the given one. -; -; Once it has those points, it finds the average and applies the error. The -; error function is nothing more than a number between -interval and +interval, -; where interval is the distance between the given point and one of the averaged -; points. It is important that the error decreases the more passes you do, which -; is why the interval is used. -; -; The error function is what should be messed with primarily if you want to -; change what kind of terrain you generate (a giant mountain instead of -; hills/valleys, for example). The one we use is uniform for all intervals, so -; it generates a uniform terrain. - -(defn- grid-fill-point - [locf m degree pass x y] - (let [interval (exp2 (- degree pass)) - leftx (- x interval) - rightx (+ x interval) - upy (- y interval) - downy (+ y interval) - v (apply avg-points m - (locf x y leftx rightx upy downy))] - (add-m m x y (+ v (error interval))))) - -(def grid-fill-point-square - "Given a grid, the grid's degree, the current pass number, and a point on the - grid, fills in that point with the average (plus some error) of the - appropriate corner points, and returns the resultant grid" - (partial grid-fill-point - (fn [_ _ leftx rightx upy downy] - [[leftx upy] - [rightx upy] - [leftx downy] - [rightx downy]]))) - -(def grid-fill-point-diamond - "Given a grid, the grid's degree, the current pass number, and a point on the - grid, fills in that point with the average (plus some error) of the - appropriate edge points, and returns the resultant grid" - (partial grid-fill-point - (fn [x y leftx rightx upy downy] - [[leftx y] - [rightx y] - [x upy] - [x downy]]))) - -; = Filling in the Grid = -; We finally compose the functions we've been creating to fill in the entire -; grid - -(defn- grid-fill-point-passes - "Given a grid, a function to fill in 
coordinates, and a function to generate - those coordinates, fills in all coordinates for a given pass, returning the - resultant grid" - [m fill-f coord-f degree pass] - (reduce - (fn [macc [x y]] (fill-f macc degree pass x y)) - m - (coord-f degree pass))) - -(defn grid-pass - "Given a grid and a pass number, does the square then the diamond portion of - the pass" - [m degree pass] - (-> m - (grid-fill-point-passes - grid-fill-point-square grid-square-coords degree pass) - (grid-fill-point-passes - grid-fill-point-diamond grid-diamond-coords degree pass))) - -; The most important function in this guide, does all the work -(defn terrain - "Given a grid degree, generates a uniformly random terrain on a grid of that - degree" - ([degree] - (terrain (blank-grid degree) degree)) - ([m degree] - (reduce - #(grid-pass %1 degree %2) - m - (range 1 (inc degree))))) - -(comment - (print-m - (terrain 5)) -) - -; == The Results == -; We now have a generated terrain, probably. We should check it. First we'll -; create an ASCII representation. But to do that we'll need some utility -; functions. - -(defn max-terrain-height - "Returns the maximum height found in the given terrain grid" - [m] - (reduce max - (map #(reduce max %) m))) - -(defn min-terrain-height - "Returns the minimum height found in the given terrain grid" - [m] - (reduce min - (map #(reduce min %) m))) - -(defn norm - "Given x in the range (A,B), normalizes it into the range (0,new-height)" - [A B new-height x] - (int (/ (* (- x A) new-height) (- B A)))) - -(defn normalize-terrain - "Given a terrain map and a number of \"steps\", normalizes the terrain so all - heights in it are in the range (0,steps)" - [m steps] - (let [max-height (max-terrain-height m) - min-height (min-terrain-height m) - norm-f (partial norm min-height max-height steps)] - (vec (map #(vec (map norm-f %)) m)))) - -; We now define which ASCII characters we want to use for which heights. 
The -; vector starts with the character for the lowest height and ends with the -; character for the heighest height. - -(def tiles - [\~ \~ \" \" \x \x \X \$ \% \# \@]) - -(defn tile-terrain - "Given a terrain map, converts it into an ASCII tile map" - [m] - (vec (map #(vec (map tiles %)) - (normalize-terrain m (dec (count tiles)))))) - -(comment - (print-m - (tile-terrain - (terrain 5))) - -; [~ ~ " " x x x X % $ $ $ X X X X X X $ x x x X X X x x x x " " " ~] -; [" ~ " " x x X X $ $ $ X X X X X X X X X X X X X X x x x x " " " "] -; [" " " x x x X X % $ % $ % $ $ X X X X $ $ $ X X X X x x x x " " "] -; [" " " x x X $ % % % % % $ % $ $ X X $ $ $ $ X X x x x x x x " " x] -; [" x x x x X $ $ # % % % % % % $ X $ X X % $ % X X x x x x x x x x] -; [x x x X $ $ $ % % % % % $ % $ $ $ % % $ $ $ $ X X x x x x x x x x] -; [X X X $ % $ % % # % % $ $ % % % % $ % $ $ X $ X $ X X x x x X x x] -; [$ $ X $ $ % $ % % % % $ $ $ % # % % % X X X $ $ $ X X X x x x x x] -; [% X X % % $ % % % $ % $ % % % # @ % $ $ X $ X X $ X x X X x x x x] -; [$ $ % % $ $ % % $ $ X $ $ % % % % $ $ X $ $ X X X X X X x x x x x] -; [% % % X $ $ % $ $ X X $ $ $ $ % % $ $ X X X $ X X X x x X x x X X] -; [$ $ $ X $ $ X $ X X X $ $ $ $ % $ $ $ $ $ X $ X x X X X X X x X X] -; [$ $ $ $ X X $ X X X X X $ % % % % % $ X $ $ $ X x X X X $ X X $ $] -; [X $ $ $ $ $ X X X X X X X % $ % $ $ $ X X X X X x x X X x X X $ $] -; [$ $ X X $ X X x X $ $ X X $ % X X X X X X X X X x X X x x X X X X] -; [$ $ X X X X X X X $ $ $ $ $ X $ X X X X X X X x x x x x x x X X X] -; [% % % $ $ X $ X % X X X % $ $ X X X X X X x x x x x x x x x X X $] -; [$ % % $ $ $ X X $ $ $ $ $ $ X X X X x X x x x x " x x x " x x x x] -; [$ X % $ $ $ $ $ X X X X X $ $ X X X X X X x x " " " " " " " " x x] -; [$ X $ $ % % $ X X X $ X X X x x X X x x x x x " " " " " ~ " " " "] -; [$ $ X X % $ % X X X X X X X X x x X X X x x x " " " " " " ~ " " "] -; [$ $ X $ % $ $ X X X X X X x x x x x x x x x " " " " " " " " " ~ ~] -; [$ $ $ $ $ X X $ X X X X X x x x x x x 
x x " " " " " " " ~ " " " ~] -; [$ % X X $ $ $ $ X X X X x x x x x x x x x x " " " " ~ " " ~ " " ~] -; [% $ $ X $ X $ X $ X $ X x x x x x x x x x x " " " " ~ ~ ~ " ~ " ~] -; [$ X X X X $ $ $ $ $ X x x x x x x x x x x " " " " ~ ~ ~ ~ ~ ~ ~ ~] -; [X x X X x X X X X X X X X x x x x x x x x x " " " ~ ~ " " ~ ~ ~ ~] -; [x x x x x x X x X X x X X X x x x x x x x " x " " " " " ~ ~ ~ ~ ~] -; [x x x x x x x x X X X X $ X X x X x x x x x x x x " ~ ~ ~ ~ ~ ~ ~] -; [" x x x x x X x X X X X X X X X X x x x x x x " " " " ~ ~ ~ ~ ~ ~] -; [" " " x x x X X X X $ $ $ X X X X X X x x x x x x x x " " ~ ~ ~ ~] -; [" " " " x x x X X X X X $ $ X X x X X x x x x x x x " " " " " ~ ~] -; [~ " " x x x x X $ X $ X $ $ X x X x x x x x x x x x x x x " " " ~] -) - -; = Pictures! = -; ASCII is cool, but pictures are better. First we import some java libraries -; that we'll need, then define the colors for each level just like we did tiles -; for the ascii representation. - -(import - 'java.awt.image.BufferedImage - 'javax.imageio.ImageIO - 'java.io.File) - -(def colors - [0x1437AD 0x04859D 0x007D1C 0x007D1C 0x24913C - 0x00C12B 0x38E05D 0xA3A3A4 0x757575 0xFFFFFF]) - -; Finally we reduce over a BufferedImage instance to output every tile as a -; single pixel on it. - -(defn img-terrain - "Given a terrain map and a file name, outputs a png representation of the - terrain map to that file" - [m file] - (let [img (BufferedImage. (count m) (count m) BufferedImage/TYPE_INT_RGB)] - (reduce - (fn [rown row] - (reduce - (fn [coln tile] - (.setRGB img coln rown (colors tile)) - (inc coln)) - 0 row) - (inc rown)) - 0 (normalize-terrain m (dec (count colors)))) - (ImageIO/write img "png" (File. file)))) - -(comment - (img-terrain - (terrain 10) - "resources/terrain.png") - - ; https://blog.mediocregopher.com/img/diamond-square/terrain.png -) - -; == Conclusion == -; There's still a lot of work to be done. 
The algorithm starts taking a -; non-trivial amount of time around the 10th degree, which is only a 1025x1025px -; image. I need to profile the code and find out where the bottlenecks are. It's -; possible re-organizing the code to use pmaps instead of reduces in some places -; could help. -``` - -[marco]: http://marcopolo.io/diamond-square/ -[terrain]: /img/diamond-square/terrain.png -[diamondsquare]: http://www.gameprogrammer.com/fractal.html -[lein]: https://github.com/technomancy/leiningen -[repo]: https://github.com/mediocregopher/diamond-square diff --git a/_posts/2014-10-29-erlang-pitfalls.md b/_posts/2014-10-29-erlang-pitfalls.md deleted file mode 100644 index 32a8095..0000000 --- a/_posts/2014-10-29-erlang-pitfalls.md +++ /dev/null @@ -1,192 +0,0 @@ ---- -title: Erlang Pitfalls -description: >- - Common pitfalls that people may run into when designing and writing - large-scale erlang applications. ---- - -I've been involved with a large-ish scale erlang project at Grooveshark since -sometime around 2011. I started this project knowing absolutely nothing about -erlang, but now I feel I have accumulated enough knowledge over time that I could -conceivably give some back. Specifically, common pitfalls that people may run -into when designing and writing a large-scale erlang application. Some of these -may show up when searching for them, but some of them you may not even know you -need to search for. - -## now() vs timestamp() - -The canonical way of getting the current timestamp in erlang is to use -`erlang:now()`. This works great at small loads, but if you find your -application slowing down greatly at highly parallel loads and you're calling -`erlang:now()` a lot, it may be the culprit. - -A property of this method you may not realize is that it is monotonically -increasing, meaning even if two processes call it at the *exact* same time they -will both receive different output.
This is done through some locking on the -low-level, as well as a bit of math to balance out the time getting out of sync -in that scenario. - -There are situations where fetching always unique timestamps is useful, such as -seeding RNGs and generating unique identifiers for things, but usually when -people fetch a timestamp they just want a timestamp. For these cases, -`os:timestamp()` can be used. It is not blocked by any locks, it simply returns -the time. - -## The rpc module is slow - -The built-in `rpc` module is slower than you'd think. This mostly stems from it -doing a lot of extra work for every `call` and `cast` that you do, ensuring that -certain conditions are accounted for. If, however, it's sufficient for the -calling side to know that a call timed-out on them and not worry about it any -further you may benefit from simply writing your own rpc module. Alternatively, -use [one which already exists](https://github.com/cloudant/rexi). - -## Don't send anonymous functions between nodes - -One of erlang's niceties is transparent message sending between two physical -erlang nodes. Once nodes are connected, a process on one can send any message to -a process on the other exactly as if they existed on the same node. This is fine -for many data-types, but for anonymous functions it should be avoided. - -For example: - -```erlang -RemotePid ! {fn, fun(I) -> I + 1 end}. -``` - -Would be better written as - -```erlang -incr(I) -> - I + 1. - -RemotePid ! {fn, ?MODULE, incr}. -``` - -and then using an `apply` on the RemotePid to actually execute the function. - -This is because hot-swapping code messes with anonymous functions quite a bit. -Erlang isn't actually sending a function definition across the wire; it's simply -sending a reference to a function. If you've changed the code within the -anonymous function on a node, that reference changes.
The sending node is -sending a reference to a function which may not exist anymore on the receiving -node, and you'll get a weird error which Google doesn't return many results for. - -Alternatively, if you simply send atoms across the wire and use `apply` on the -other side, only atoms are sent and the two nodes involved can have totally -different ideas of what the function itself does without any problems. - -## Hot-swapping code is a convenience, not a crutch - -Hot swapping code is the bees-knees. It lets you not have to worry about -rolling-restarts for trivial code changes, and so adds stability to your -cluster. My warning is that you should not rely on it. If your cluster can't -survive a node being restarted for a code change, then it can't survive if that -node fails completely, or fails and comes back up. Design your system pretending -that hot-swapping does not exist, and only once you've done that allow yourself -to use it. - -## GC sometimes needs a boost - -Erlang garbage collection (GC) acts on a per-erlang-process basis, meaning that -each process decides on its own to garbage collect itself. This is nice because -it means stop-the-world isn't a problem, but it does have some interesting -effects. - -We had a problem with our node memory graphs looking like an upwards facing -line, instead of a nice sinusoid relative to the number of connections during -the day. We couldn't find a memory leak *anywhere*, and so started profiling. We -found that the memory seemed to be composed mostly of binary data in process -heaps. On a hunch my coworker Mike Cugini (who gets all the credit for this) ran -the following on a node: - -```erlang -lists:foreach(fun erlang:garbage_collect/1, erlang:processes()). -``` - -and saw memory drop in a huge way. We made that code run every 10 minutes or so -and suddenly our memory problem went away.
- -The problem is that we had a lot of processes which individually didn't have -much heap data, but all-together were crushing the box. Each didn't think it had -enough to garbage collect very often, so memory just kept going up. Calling the -above forces all processes to garbage collect, and thus throw away all those -little binary bits they were hoarding. - -## These aren't the solutions you are looking for - -The `erl` process has tons of command-line options which allow you to tweak all -kinds of knobs. We've had tons of performance problems with our application, as -of yet not a single one has been solved with turning one of these knobs. They've -all been design issues or just run-of-the-mill bugs. I'm not saying the knobs -are *never* useful, but I haven't seen it yet. - -## Erlang processes are great, except when they're not - -The erlang model of allowing processes to manage global state works really well -in many cases. Possibly even most cases. There are, however, times when it -becomes a performance problem. This became apparent in the project I was working -on for Grooveshark, which was, at its heart, a pubsub server. - -The architecture was very simple: each channel was managed by a process, client -connection processes subscribed to that channel and received publishes from it. -Easy right? The problem was that extremely high volume channels were simply not -able to keep up with the load. The channel process could do certain things very -fast, but there were some operations which simply took time and slowed -everything down. For example, channels could have arbitrary properties set on -them by their owners. Retrieving an arbitrary property from a channel was a -fairly fast operation: client `call`s the channel process, channel process -immediately responds with the property value. No blocking involved. 
- -But as soon as there was any kind of call which required the channel process to -talk to yet *another* process (unfortunately necessary), things got hairy. On -high volume channels publishes/gets/set operations would get massively backed up -in the message queue while the process was blocked on another process. We tried -many things, but ultimately gave up on the process-per-channel approach. - -We instead decided on keeping *all* channel state in a transactional database. -When client processes "called" operations on a channel, they really are just -acting on the database data inline, no message passing involved. This means that -read-only operations are super-fast because there is minimal blocking, and if -some random other process is being slow it only affects the one client making -the call which is causing it to be slow, and not holding up a whole host of -other clients. - -## Mnesia might not be what you want - -This one is probably a bit controversial, and definitely subject to use-cases. -Do your own testing and profiling, find out what's right for you. - -Mnesia is erlang's solution for global state. It's an in-memory transactional -database which can scale to N nodes and persist to disk. It is hosted -directly in the erlang processes memory so you interact with it in erlang -directly in your code; no calling out to database drivers and such. Sounds great -right? - -Unfortunately mnesia is not a very full-featured database. It is essentially a -key-value store which can hold arbitrary erlang data-types, albeit in a set -schema which you lay out for it during startup. This means that more complex -types like sorted sets and hash maps (although this was addressed with the -introduction of the map data-type in R17) are difficult to work with within -mnesia. Additionally, erlang's data model of immutability, while awesome -usually, can bite you here because it's difficult (impossible?) 
to pull out -chunks of data within a record without accessing the whole record. - -For example, when retrieving the list of processes subscribed to a channel our -application doesn't simply pull the full list and iterate over it. This is too -slow, and in some cases the subscriber list was so large it wasn't actually -feasible. The channel process wasn't cleaning up its heap fast enough, so -multiple publishes would end up with multiple copies of the giant list in -memory. This became a problem. Instead we chain spawned processes, each of which -pull a set chunk of the subscriber list, and iterate over that. This is very -difficult to implement in mnesia without pulling the full subscriber list into -the process' memory at some point in the process. - -It is, however, fairly trivial to implement in redis using sorted sets. For this -case, and many other cases after, the motto for performance improvements became -"stick it in redis". The application is at the point where *all* state which -isn't directly tied to a specific connection is kept in redis, encoded using -`term_to_binary`. The performance hit of going to an outside process for data -was actually much less than we'd originally thought, and ended up being a plus -since we had much more freedom to do interesting hacks to speed up our -accesses. diff --git a/_posts/2015-03-11-rabbit-hole.md b/_posts/2015-03-11-rabbit-hole.md deleted file mode 100644 index 97c2b80..0000000 --- a/_posts/2015-03-11-rabbit-hole.md +++ /dev/null @@ -1,165 +0,0 @@ ---- -title: Rabbit Hole -description: >- - Complex systems sometimes require complex debugging. ---- - -We've begun rolling out [SkyDNS][skydns] at my job, which has been pretty neat. -We're basing a couple future projects around being able to use it, and it's made -dynamic configuration and service discovery nice and easy. - -This post chronicles catching a bug because of our switch to SkyDNS, and how we -discovered its root cause.
I like to call these kinds of bugs "rabbit holes"; they -look shallow at first, but anytime you make a little progress forward a little -more is always required, until you discover the ending somewhere totally -unrelated to the start. - -## The Bug - -We are seeing *tons* of these in the SkyDNS log: - -``` -[skydns] Feb 20 17:21:15.168 INFO | no nameservers defined or name too short, can not forward -``` - -I fire up tcpdump to see if I can see anything interesting, and sure enough run -across a bunch of these: - -``` -# tcpdump -vvv -s 0 -l -n port 53 -tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes - ... - $fen_ip.50257 > $skydns_ip.domain: [udp sum ok] 16218+ A? unknown. (25) - $fen_ip.27372 > $skydns_ip.domain: [udp sum ok] 16218+ A? unknown. (25) - $fen_ip.35634 > $skydns_ip.domain: [udp sum ok] 59227+ A? unknown. (25) - $fen_ip.64363 > $skydns_ip.domain: [udp sum ok] 59227+ A? unknown. (25) -``` - -It appears that some of our front end nodes (FENs) are making tons of DNS -requests trying to find the A record of `unknown`. Something on our FENs is -doing something insane and is breaking. - -## The FENs - -Hopping over to my favorite FEN we're able to see the packets in question -leaving on a tcpdump as well, but that's not helpful for finding the root cause. -We have lots of processes running on the FENs and any number of them could be -doing something crazy. - -We fire up sysdig, which is similar to systemtap and strace in that it allows -you to hook into the kernel and view various kernel activities in real time, but -it's easier to use than both. The following command dumps all UDP packets being -sent and what process is sending them: - -``` -# sysdig fd.l4proto=udp -...
-2528950 22:17:35.260606188 0 php-fpm (21477) < connect res=0 tuple=$fen_ip:61173->$skydns_ip:53 -2528961 22:17:35.260611327 0 php-fpm (21477) > sendto fd=102(<4u>$fen_ip:61173->$skydns_ip:53) size=25 tuple=NULL -2528991 22:17:35.260631917 0 php-fpm (21477) < sendto res=25 data=.r...........unknown..... -2530470 22:17:35.261879032 0 php-fpm (21477) > ioctl fd=102(<4u>$fen_ip:61173->$skydns_ip:53) request=541B argument=7FFF82DC8728 -2530472 22:17:35.261880574 0 php-fpm (21477) < ioctl res=0 -2530474 22:17:35.261881226 0 php-fpm (21477) > recvfrom fd=102(<4u>$fen_ip:61173->$skydns_ip:53) size=1024 -2530476 22:17:35.261883424 0 php-fpm (21477) < recvfrom res=25 data=.r...........unknown..... tuple=$skydns_ip:53->$fen_ip:61173 -2530485 22:17:35.261888997 0 php-fpm (21477) > close fd=102(<4u>$fen_ip:61173->$skydns_ip:53) -2530488 22:17:35.261892626 0 php-fpm (21477) < close res=0 -``` - -Aha! We can see php-fpm is requesting something over udp with the string -`unknown` in it. We've now narrowed down the guilty process, the rest should be -easy right? - -## Which PHP? - -Unfortunately we're a PHP shop; knowing that php-fpm is doing something on a FEN -narrows down the guilty codebase little. Taking the FEN out of our load-balancer -stops the requests for `unknown`, so we *can* say that it's some user-facing -code that is the culprit. Our setup on the FENs involves users hitting nginx -for static content and nginx proxying PHP requests back to php-fpm. Since all -our virtual domains are defined in nginx, we are able to do something horrible. - -On the particular FEN we're on we make a guess about which virtual domain the -problem is likely coming from (our main app), and proxy all traffic from all -other domains to a different FEN. We still see requests for `unknown` leaving -the box, so we've narrowed the problem down a little more. 
- -## The Despair - -Nothing in our code is doing any direct DNS calls as far as we can find, and we -don't see any places PHP might be doing it for us. We have lots of PHP -extensions in place, all written in C and all black boxes; any of them could be -the culprit. Grepping through the likely candidates' source code for the string -`unknown` proves fruitless. - -We try xdebug at this point. xdebug is a profiler for php which will create -cachegrind files for the running code. With cachegrind you can see every -function which was ever called, how long was spent within each function, a full -call-graph, and lots more. Unfortunately xdebug outputs cachegrind files on a -per-php-fpm-process basis, and overwrites the previous file on each new request. -So xdebug is pretty much useless, since what is in the cachegrind file isn't -necessarily what spawned the DNS request. - -## Gotcha (sorta) - -We turn back to the tried and true method of dumping all the traffic using -tcpdump and perusing through that manually. - -What we find is that nearly every time there is a DNS request for `unknown`, if -we scroll up a bit there is (usually) a particular request to memcache. The -requested key is always in the style of `function-name:someid:otherstuff`. When -looking in the code around that function name we find this ominous looking call: - -```php -$ipAddress = getIPAddress(); -$geoipInfo = getCountryInfoFromIP($ipAddress); -``` - -This points us in the right direction. On a hunch we add some debug -logging to print out the `$ipAddress` variable, and sure enough it comes back as -`unknown`. AHA! - -So what we surmise is happening is that for some reason our geoip extension, -which we use to get the location data of an IP address and which -`getCountryInfoFromIP` calls, is seeing something which is *not* an IP address -and trying to resolve it. - -## Gotcha (for real) - -So the question becomes: why are we getting the string `unknown` as an IP -address?
- -Adding some debug logging around the area we found before showed that -`$_SERVER['REMOTE_ADDR']`, which is the variable populated with the IP address -of the client, is sometimes `unknown`. We guess that this has something to do -with some magic we are doing on nginx's side to populate `REMOTE_ADDR` with the -real IP address of the client in the case of them going through a proxy. - -Many proxies send along the header `X-Forwarded-For` to indicate the real IP of -the client they're proxying for, otherwise the server would only see the proxy's -IP. In our setup I decided that in those cases we should set the `REMOTE_ADDR` -to the real client IP so our application logic doesn't even have to worry about -it. There are a couple problems with this which render it a bad decision, one -being that if some misbehaving proxy were to, say, start sending -`X-Forwarded-For: unknown` then some applications might mistake that to -mean the client's IP is `unknown`. - -## The Fix - -The fix here was two-fold: - -1) We now always set `$_SERVER['REMOTE_ADDR']` to be the remote address of the -requests, regardless of if it's a proxy, and also send the application the -`X-Forwarded-For` header to do with as it pleases. - -2) Inside our app we look at all the headers sent and do some processing to -decide what the actual client IP is. PHP can handle a lot more complex logic -than nginx can, so we can do things like check to make sure the IP is an IP, and -also that it's not some NAT'd internal ip, and so forth. - -And that's it. From some weird log messages on our DNS servers to an nginx -mis-configuration on an almost unrelated set of servers, this is one of those -strange bugs that never has a nice solution and goes unsolved for a long time. -Spending the time to dive down the rabbit hole and find the answer is often -tedious, but also often very rewarding.
- -[skydns]: https://github.com/skynetservices/skydns diff --git a/_posts/2015-07-15-go-http.md b/_posts/2015-07-15-go-http.md deleted file mode 100644 index 7da7d6b..0000000 --- a/_posts/2015-07-15-go-http.md +++ /dev/null @@ -1,547 +0,0 @@ ---- -title: Go's http package by example -description: >- - The basics of using, testing, and composing apps built using go's net/http - package. ---- - -Go's [http](http://golang.org/pkg/net/http/) package has turned into one of my -favorite things about the Go programming language. Initially it appears to be -somewhat complex, but in reality it can be broken down into a couple of simple -components that are extremely flexible in how they can be used. This guide will -cover the basic ideas behind the http package, as well as examples in using, -testing, and composing apps built with it. - -This guide assumes you have some basic knowledge of what an interface in Go is, -and some idea of how HTTP works and what it can do. - -## Handler - -The building block of the entire http package is the `http.Handler` interface, -which is defined as follows: - -```go -type Handler interface { - ServeHTTP(ResponseWriter, *Request) -} -``` - -Once implemented the `http.Handler` can be passed to `http.ListenAndServe`, -which will call the `ServeHTTP` method on every incoming request. - -`http.Request` contains all relevant information about an incoming http request -which is being served by your `http.Handler`. - -The `http.ResponseWriter` is the interface through which you can respond to the -request. It implements the `io.Writer` interface, so you can use methods like -`fmt.Fprintf` to write a formatted string as the response body, or ones like -`io.Copy` to write out the contents of a file (or any other `io.Reader`). The -response code can be set before you begin writing data using the `WriteHeader` -method. 
- -Here's an example of an extremely simple http server: - -```go -package main - -import ( - "fmt" - "log" - "net/http" -) - -type helloHandler struct{} - -func (h helloHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) { - fmt.Fprintf(w, "hello, you've hit %s\n", r.URL.Path) -} - -func main() { - err := http.ListenAndServe(":9999", helloHandler{}) - log.Fatal(err) -} -``` - -`http.ListenAndServe` serves requests using the handler, listening on the given -address:port. It blocks until it encounters an error, in which case we -`log.Fatal`. - -Here's an example of using this handler with curl: - -``` - ~ $ curl localhost:9999/foo/bar - hello, you've hit /foo/bar -``` - - -## HandlerFunc - -Often defining a full type to implement the `http.Handler` interface is a bit -overkill, especially for extremely simple `ServeHTTP` functions like the one -above. The `http` package provides a helper type, `http.HandlerFunc`, which -wraps a function which has the signature -`func(w http.ResponseWriter, r *http.Request)`, giving you an `http.Handler` -which will call it in all cases. - -The following behaves exactly like the previous example, but uses -`http.HandlerFunc` instead of defining a new type. - -```go -package main - -import ( - "fmt" - "log" - "net/http" -) - -func main() { - h := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { - fmt.Fprintf(w, "hello, you've hit %s\n", r.URL.Path) - }) - - err := http.ListenAndServe(":9999", h) - log.Fatal(err) -} -``` - -## ServeMux - -On their own, the previous examples don't seem all that useful. If we wanted to -have different behavior for different endpoints we would end up with having to -parse path strings as well as numerous `if` or `switch` statements. Luckily -we're provided with `http.ServeMux`, which does all of that for us. 
Here's an -example of it being used: - -```go -package main - -import ( - "fmt" - "log" - "net/http" -) - -func main() { - h := http.NewServeMux() - - h.HandleFunc("/foo", func(w http.ResponseWriter, r *http.Request) { - fmt.Fprintln(w, "Hello, you hit foo!") - }) - - h.HandleFunc("/bar", func(w http.ResponseWriter, r *http.Request) { - fmt.Fprintln(w, "Hello, you hit bar!") - }) - - h.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) { - w.WriteHeader(404) - fmt.Fprintln(w, "You're lost, go home") - }) - - err := http.ListenAndServe(":9999", h) - log.Fatal(err) -} -``` - -The `http.ServeMux` is itself an `http.Handler`, so it can be passed into -`http.ListenAndServe`. When it receives a request it matches the request's path -against its registered patterns. A pattern ending in a slash, like `/`, matches -any path it is a prefix of, with the longest matching pattern winning, while a -pattern without a trailing slash, like `/foo`, only matches that exact path. We -use the `/` endpoint as a catch-all to catch any requests to unknown endpoints. -Here are some examples of it being used: - -``` - ~ $ curl localhost:9999/foo -Hello, you hit foo! - - ~ $ curl localhost:9999/bar -Hello, you hit bar! - - ~ $ curl localhost:9999/baz -You're lost, go home -``` - -`http.ServeMux` has both `Handle` and `HandleFunc` methods. These do the same -thing, except that `Handle` takes in an `http.Handler` while `HandleFunc` merely -takes in a function, implicitly wrapping it just as `http.HandlerFunc` does. - -### Other muxes - -There are numerous replacements for `http.ServeMux` like -[gorilla/mux](http://www.gorillatoolkit.org/pkg/mux) which give you things like -automatically pulling variables out of paths, easily asserting what http methods -are allowed on an endpoint, and more. Most of these replacements will implement -`http.Handler` like `http.ServeMux` does, and accept `http.Handler`s as -arguments, and so are easy to use in conjunction with the rest of the things -I'm going to talk about in this post. 
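
Coming back to the standard library's `http.ServeMux` for a moment, here is a small sketch of how its pattern matching plays out. The `/static/` endpoint and the `newMux` and `hit` helpers are made up for this illustration, and `httptest` (covered in the Testing section) is used so nothing has to listen on a port.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// newMux registers one exact pattern, one subtree pattern, and a catch-all.
func newMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/foo", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "exact /foo")
	})
	mux.HandleFunc("/static/", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "subtree /static/")
	})
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "catch-all")
	})
	return mux
}

// hit sends a GET for path straight into the handler and returns the body.
func hit(h http.Handler, path string) string {
	w := httptest.NewRecorder()
	h.ServeHTTP(w, httptest.NewRequest("GET", path, nil))
	return w.Body.String()
}

func main() {
	mux := newMux()
	for _, p := range []string{"/foo", "/foo/bar", "/static/css/site.css", "/baz"} {
		fmt.Printf("%-22s -> %s\n", p, hit(mux, p))
	}
}
```

Only the exact path `/foo` reaches the `/foo` handler; `/foo/bar` falls through to the catch-all, while anything under `/static/` is served by the subtree pattern.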
- -## Composability - -When I say that the `http` package is composable I mean that it is very easy to -create re-usable pieces of code and glue them together into a new working -application. The `http.Handler` interface is the way all pieces communicate with -each other. Here's an example of where we use the same `http.Handler` to handle -multiple endpoints, each slightly differently: - -```go -package main - -import ( - "fmt" - "log" - "net/http" -) - -type numberDumper int - -func (n numberDumper) ServeHTTP(w http.ResponseWriter, r *http.Request) { - fmt.Fprintf(w, "Here's your number: %d\n", n) -} - -func main() { - h := http.NewServeMux() - - h.Handle("/one", numberDumper(1)) - h.Handle("/two", numberDumper(2)) - h.Handle("/three", numberDumper(3)) - h.Handle("/four", numberDumper(4)) - h.Handle("/five", numberDumper(5)) - - h.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) { - w.WriteHeader(404) - fmt.Fprintln(w, "That's not a supported number!") - }) - - err := http.ListenAndServe(":9999", h) - log.Fatal(err) -} -``` - -`numberDumper` implements `http.Handler`, and can be passed into the -`http.ServeMux` multiple times to serve multiple endpoints. Here's it in action: - -``` - ~ $ curl localhost:9999/one -Here's your number: 1 - ~ $ curl localhost:9999/five -Here's your number: 5 - ~ $ curl localhost:9999/bazillion -That's not a supported number! -``` - -## Testing - -Testing http endpoints is extremely easy in Go, and doesn't even require you to -actually listen on any ports! The `httptest` package provides a few handy -utilities, including `NewRecorder` which implements `http.ResponseWriter` and -allows you to effectively make an http request by calling `ServeHTTP` directly. -Here's an example of a test for our previously implemented `numberDumper`, -commented with what exactly is happening: - -```go -package main - -import ( - "fmt" - "net/http" - "net/http/httptest" - . 
"testing" -) - -func TestNumberDumper(t *T) { - // We first create the http.Handler we wish to test - n := numberDumper(1) - - // We create an http.Request object to test with. The http.Request is - // totally customizable in every way that a real-life http request is, so - // even the most intricate behavior can be tested - r, _ := http.NewRequest("GET", "/one", nil) - - // httptest.Recorder implements the http.ResponseWriter interface, and as - // such can be passed into ServeHTTP to receive the response. It will act as - // if all data being given to it is being sent to a real client, when in - // reality it's being buffered for later observation - w := httptest.NewRecorder() - - // Pass in our httptest.Recorder and http.Request to our numberDumper. At - // this point the numberDumper will act just as if it was responding to a - // real request - n.ServeHTTP(w, r) - - // httptest.Recorder gives a number of fields and methods which can be used - // to observe the response made to our request. Here we check the response - // code - if w.Code != 200 { - t.Fatalf("wrong code returned: %d", w.Code) - } - - // We can also get the full body out of the httptest.Recorder, and check - // that its contents are what we expect - body := w.Body.String() - if body != fmt.Sprintf("Here's your number: 1\n") { - t.Fatalf("wrong body returned: %s", body) - } - -} -``` - -In this way it's easy to create tests for your individual components that you -are using to build your application, keeping the tests near to the functionality -they're testing. - -Note: if you ever do need to spin up a test server in your tests, `httptest` -also provides a way to create a server listening on a random open port for use -in tests as well. - -## Middleware - -Serving endpoints is nice, but often there's functionality you need to run for -*every* request before the actual endpoint's handler is run. For example, access -logging. 
A middleware component is one which implements `http.Handler`, but will -actually pass the request off to another `http.Handler` after doing some set of -actions. The `http.ServeMux` we looked at earlier is actually an example of -middleware, since it passes the request off to another `http.Handler` for actual -processing. Here's an example of our previous example with some logging -middleware: - -```go -package main - -import ( - "fmt" - "log" - "net/http" -) - -type numberDumper int - -func (n numberDumper) ServeHTTP(w http.ResponseWriter, r *http.Request) { - fmt.Fprintf(w, "Here's your number: %d\n", n) -} - -func logger(h http.Handler) http.Handler { - return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { - log.Printf("%s requested %s", r.RemoteAddr, r.URL) - h.ServeHTTP(w, r) - }) -} - -func main() { - h := http.NewServeMux() - - h.Handle("/one", numberDumper(1)) - h.Handle("/two", numberDumper(2)) - h.Handle("/three", numberDumper(3)) - h.Handle("/four", numberDumper(4)) - h.Handle("/five", numberDumper(5)) - - h.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) { - w.WriteHeader(404) - fmt.Fprintln(w, "That's not a supported number!") - }) - - hl := logger(h) - - err := http.ListenAndServe(":9999", hl) - log.Fatal(err) -} -``` - -`logger` is a function which takes in an `http.Handler` called `h`, and returns -a new `http.Handler` which, when called, will log the request it was called with -and then pass off its arguments to `h`. To use it we pass in our -`http.ServeMux`, so all incoming requests will first be handled by the logging -middleware before being passed to the `http.ServeMux`. - -Here's an example log entry which is output when the `/five` endpoint is hit: - -``` -2015/06/30 20:15:41 [::1]:34688 requested /five -``` - -## Middleware chaining - -Being able to chain middleware together is an incredibly useful ability which we -get almost for free, as long as we use the signature -`func(http.Handler) http.Handler`. 
A middleware component returns the same type -which is passed into it, so simply passing the output of one middleware -component into the other is sufficient. - -However, more complex behavior with middleware can be tricky. For instance, what -if you want a piece of middleware which takes in a parameter upon creation? -Here's an example of just that, with a piece of middleware which will set a -header and its value for all requests: - -```go -package main - -import ( - "fmt" - "log" - "net/http" -) - -type numberDumper int - -func (n numberDumper) ServeHTTP(w http.ResponseWriter, r *http.Request) { - fmt.Fprintf(w, "Here's your number: %d\n", n) -} - -func logger(h http.Handler) http.Handler { - return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { - log.Printf("%s requested %s", r.RemoteAddr, r.URL) - h.ServeHTTP(w, r) - }) -} - -type headerSetter struct { - key, val string - handler http.Handler -} - -func (hs headerSetter) ServeHTTP(w http.ResponseWriter, r *http.Request) { - w.Header().Set(hs.key, hs.val) - hs.handler.ServeHTTP(w, r) -} - -func newHeaderSetter(key, val string) func(http.Handler) http.Handler { - return func(h http.Handler) http.Handler { - return headerSetter{key, val, h} - } -} - -func main() { - h := http.NewServeMux() - - h.Handle("/one", numberDumper(1)) - h.Handle("/two", numberDumper(2)) - h.Handle("/three", numberDumper(3)) - h.Handle("/four", numberDumper(4)) - h.Handle("/five", numberDumper(5)) - - h.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) { - w.WriteHeader(404) - fmt.Fprintln(w, "That's not a supported number!") - }) - - hl := logger(h) - hhs := newHeaderSetter("X-FOO", "BAR")(hl) - - err := http.ListenAndServe(":9999", hhs) - log.Fatal(err) -} -``` - -And here's the curl output: - -``` - ~ $ curl -i localhost:9999/three - HTTP/1.1 200 OK - X-Foo: BAR - Date: Wed, 01 Jul 2015 00:39:48 GMT - Content-Length: 22 - Content-Type: text/plain; charset=utf-8 - - Here's your number: 3 - -``` - 
-`newHeaderSetter` returns a function which accepts and returns an -`http.Handler`. Calling that returned function with an `http.Handler` then gets -you an `http.Handler` which will set the header given to `newHeaderSetter` -before continuing on to the given `http.Handler`. - -This may seem like a strange way of organizing this; for this example the -signature for `newHeaderSetter` could very well have looked like this: - -```go -func newHeaderSetter(key, val string, h http.Handler) http.Handler -``` - -And that implementation would have worked fine. But it would have been more -difficult to compose going forward. In the next section I'll show what I mean. - -## Composing middleware with alice - -[Alice](https://github.com/justinas/alice) is a very simple and convenient -helper for working with middleware using the function signature we've been using -thus far. Alice is used to create and use chains of middleware. Chains can even -be appended to each other, giving even further flexibility. Here's our previous -example with a couple more headers being set, but also using alice to manage the -added complexity. 
- -```go -package main - -import ( - "fmt" - "log" - "net/http" - - "github.com/justinas/alice" -) - -type numberDumper int - -func (n numberDumper) ServeHTTP(w http.ResponseWriter, r *http.Request) { - fmt.Fprintf(w, "Here's your number: %d\n", n) -} - -func logger(h http.Handler) http.Handler { - return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { - log.Printf("%s requested %s", r.RemoteAddr, r.URL) - h.ServeHTTP(w, r) - }) -} - -type headerSetter struct { - key, val string - handler http.Handler -} - -func (hs headerSetter) ServeHTTP(w http.ResponseWriter, r *http.Request) { - w.Header().Set(hs.key, hs.val) - hs.handler.ServeHTTP(w, r) -} - -func newHeaderSetter(key, val string) func(http.Handler) http.Handler { - return func(h http.Handler) http.Handler { - return headerSetter{key, val, h} - } -} - -func main() { - h := http.NewServeMux() - - h.Handle("/one", numberDumper(1)) - h.Handle("/two", numberDumper(2)) - h.Handle("/three", numberDumper(3)) - h.Handle("/four", numberDumper(4)) - - fiveHS := newHeaderSetter("X-FIVE", "the best number") - h.Handle("/five", fiveHS(numberDumper(5))) - - h.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) { - w.WriteHeader(404) - fmt.Fprintln(w, "That's not a supported number!") - }) - - chain := alice.New( - newHeaderSetter("X-FOO", "BAR"), - newHeaderSetter("X-BAZ", "BUZ"), - logger, - ).Then(h) - - err := http.ListenAndServe(":9999", chain) - log.Fatal(err) -} -``` - -In this example all requests will have the headers `X-FOO` and `X-BAZ` set, but -the `/five` endpoint will *also* have the `X-FIVE` header set. - -## Fin - -Starting with a simple idea of an interface, the `http` package allows us to -create for ourselves an incredibly useful and flexible (yet still rather simple) -ecosystem for building web apps with re-usable components, all without breaking -our static checks. 
diff --git a/_posts/2015-11-21-happy-trees.md b/_posts/2015-11-21-happy-trees.md deleted file mode 100644 index 8d36a91..0000000 --- a/_posts/2015-11-21-happy-trees.md +++ /dev/null @@ -1,235 +0,0 @@ ---- -title: Happy Trees -description: >- - Visualizing a forest of happy trees. ---- - -Source code related to this post is available [here](https://github.com/mediocregopher/happy-tree). - -This project was inspired by [this video](https://www.youtube.com/watch?v=_DpzAvb3Vk4), -which you should watch first in order to really understand what's going on. - -My inspiration came from his noting that happification could be done on numbers -in bases other than 10. I immediately thought of hexadecimal, base-16, since I'm -a programmer and that's what I think of. I also was trying to think of how one -would graphically represent a large happification tree, when I realized that -hexadecimal numbers are colors, and colors graphically represent things nicely! - -## Colors - -Colors to computers are represented using 3 bytes, encompassing red, green, and -blue. Each byte is represented by two hexadecimal digits, and they are appended -together. For example `FF0000` represents maximum red (`FF`) added to no green -and no blue. `FF5500` represents maximum red (`FF`), some green (`55`) and no -blue (`00`), which when added together results in kind of an orange color. - -## Happifying colors - -In base 10, happifying a number is done by splitting its digits, squaring each -one individually, and adding the resulting numbers. The principle works the same -for hexadecimal numbers: - -``` -A4F -A*A + 4*4 + F*F -64 + 10 + E1 -155 // 341 in decimal -``` - -So if all colors are 6-digit hexadecimal numbers, they can be happified easily! - -``` -FF5500 -F*F + F*F + 5*5 + 5*5 + 0*0 + 0*0 -E1 + E1 + 19 + 19 + 0 + 0 -0001F4 -``` - -So `FF5500` (an orangish color) happifies to `0001F4` (a darker blue). Since -order of digits doesn't matter, `5F50F0` also happifies to `0001F4`. 
From this -fact, we can make a tree (hence the happification tree). I can do this process -on every color from `000000` (black) to `FFFFFF` (white), so I will! - -## Representing the tree - -So I know I can represent the tree using color, but there's more to decide on -than that. The easy way to represent a tree would be to simply draw a literal -tree graph, with a circle for each color and lines pointing to its parent and -children. But this is boring, and also if I want to represent *all* colors the -resulting image would be enormous and/or unreadable. - -I decided on using a hollow, multi-level pie-chart. Using the example -of `000002`, it would look something like this: - -![An example of a partial multi-level pie chart](/img/happy-tree/partial.png) - -The inner arc represents the color `000002`. The second arc represents the 15 -different colors which happify into `000002`, each of them may also have their -own outer arc of numbers which happify to them, and so on. - -This representation is nice because a) It looks cool and b) it allows the -melancoils of the hexadecimals to be placed around the happification tree -(numbers which happify into `000001`), which is convenient. It's also somewhat -easier to code than a circle/branch based tree diagram. - -An important feature I had to implement was proportional slice sizes. If I were -to give each child of a color an equal size on that arc's edge the image would -simply not work. Some branches of the tree are extremely deep, while others are -very shallow. If all were given the same space, those deep branches wouldn't -even be representable by a single pixel's width, and would simply fail to show -up. So I implemented proportional slice sizes, where the size of every slice is -determined to be proportional to how many total (recursively) children it has. -You can see this in the above example, where the second level arc is largely -comprised of one giant slice, with many smaller slices taking up the end. 
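
The two building blocks described so far, happifying a hexadecimal number and giving each child a slice proportional to its weight, can be sketched like this. The code is written in Go for illustration and is not taken from the linked repository; `happify` and `splitArc` are names made up for this sketch, and the weights passed to `splitArc` stand in for the recursive child counts mentioned above.

```go
package main

import "fmt"

// happify sums the squares of the hexadecimal digits of n.
func happify(n uint32) uint32 {
	var sum uint32
	for ; n > 0; n >>= 4 {
		d := n & 0xF // lowest hex digit
		sum += d * d
	}
	return sum
}

// splitArc divides the arc from start to end (in degrees) between children,
// giving each one a slice proportional to its weight.
func splitArc(start, end float64, weights []int) [][2]float64 {
	total := 0
	for _, w := range weights {
		total += w
	}
	arcs := make([][2]float64, 0, len(weights))
	if total == 0 {
		return arcs
	}
	cur := start
	for _, w := range weights {
		width := (end - start) * float64(w) / float64(total)
		arcs = append(arcs, [2]float64{cur, cur + width})
		cur += width
	}
	return arcs
}

func main() {
	// The worked examples from the "Happifying colors" section.
	fmt.Printf("%03X\n", happify(0xA4F))
	fmt.Printf("%06X\n", happify(0xFF5500))

	// A parent spanning the full circle, with one heavy child and two light ones.
	fmt.Println(splitArc(0, 360, []int{6, 3, 1}))
}
```

Applying `splitArc` recursively, with each child's own arc as the new start and end, gives deep branches the room they need at every level of the chart.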
- -## First attempt - -My first attempt resulted in this image (click for 5000x5000 version): - -[![Result of first attempt](/img/happy-tree/happy-tree-atmp1-small.png)](/img/happy-tree/happy-tree-atmp1.png) - -The first thing you'll notice is that it looks pretty neat. - -The second thing you'll notice is that there's actually only one melancoil in -the 6-digit hexadecimal number set. The innermost black circle is `000000` which -only happifies to itself, and nothing else will happify to it (sad `000000`). -The second circle represents `000001`, and all of its runty children. And -finally the melancoil, comprised of: - -``` -00000D -> 0000A9 -> 0000B5 -> 000092 -> 000055 -> 000032 -> ... -``` - -The final thing you'll notice (or maybe it was the first, since it's really -obvious) is that it's very blue. Non-blue colors are really only represented as -leaves on their trees and don't ever really have any children of their own, so -the blue and black sections take up vastly more space. - -This makes sense. The number which should generate the largest happification -result, `FFFFFF`, only results in `000546`, which is primarily blue. So in effect -all colors happify to some shade of blue. - -This might have been it; technically this is the happification tree and the -melancoil of 6 digit hexadecimal numbers represented as colors. But it's also -boring, and I wanted to do better. - -## Second attempt - -The root of the problem is that the definition of "happification" I used -didn't produce diverse enough results. I wanted something which would give me -numbers where any of the digits could be anything. Something more random. - -I considered using a hash instead, like md5, but that has its own problems. -There's no guarantee that any number would actually reach `000001`, which isn't -required but it's a nice feature that I wanted. It also would be unlikely that -there would be any melancoils that weren't absolutely gigantic. 
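
For reference, a melancoil like the one from the first attempt can be found by happifying a number over and over until a value repeats. Here's a minimal sketch of that, written in Go for illustration and not taken from the linked repository; `happify` and `coil` are names made up here, and `happify` uses the sum-of-squares definition from the first attempt.

```go
package main

import "fmt"

// happify sums the squares of the hexadecimal digits of n.
func happify(n uint32) uint32 {
	var sum uint32
	for ; n > 0; n >>= 4 {
		d := n & 0xF
		sum += d * d
	}
	return sum
}

// coil follows n through repeated happification and returns the cycle it
// eventually falls into, starting from the first member of the cycle reached.
func coil(n uint32) []uint32 {
	seen := map[uint32]int{} // value -> index in path
	var path []uint32
	for {
		if i, ok := seen[n]; ok {
			return path[i:]
		}
		seen[n] = len(path)
		path = append(path, n)
		n = happify(n)
	}
}

func main() {
	for _, n := range coil(0x00000D) {
		fmt.Printf("%06X -> ", n)
	}
	fmt.Println("...")
}
```

A number which happifies to itself, like `000001`, simply comes back as a coil of length one.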
- -I ended up redefining what it meant to happify a hexadecimal number. Instead of -adding all the digits up, I first split up the red, green, and blue digits into -their own numbers, happified those numbers, and finally reassembled the results -back into a single number. For example: - -``` -FF5500 -FF, 55, 00 -F*F + F*F, 5*5 + 5*5, 0*0 + 0*0 -1C2, 32, 00 -C23200 -``` - -I drop that 1 on the `1C2`, because it has no place in this system. Sorry 1. - -Simply replacing that function resulted in this image (click for 5000x5000 version): - -[![Result of second attempt](/img/happy-tree/happy-tree-atmp2-small.png)](/img/happy-tree/happy-tree-atmp2.png) - -The first thing you notice is that it's so colorful! So that goal was achieved. - -The second thing you notice is that there are *significantly* more melancoils. -Hundreds, even. Here's a couple of the melancoils (each on its own line): - -``` -00000D -> 0000A9 -> 0000B5 -> 000092 -> 000055 -> 000032 -> ... -000D0D -> 00A9A9 -> 00B5B5 -> 009292 -> 005555 -> 003232 -> ... -0D0D0D -> A9A9A9 -> B5B5B5 -> 929292 -> 555555 -> 323232 -> ... -0D0D32 -> A9A90D -> B5B5A9 -> 9292B5 -> 555592 -> 323255 -> ... -... -``` - -And so on. You'll notice the first melancoil listed is the same as the one from -the first attempt. You'll also notice that the same numbers from that -melancoil are "re-used" in the rest of them as well. The second coil listed is -the same as the first, just with the numbers repeated in the 3rd and 4th digits. -The third coil has those numbers repeated once more in the 1st and 2nd digits. -The final coil is the same numbers, but with the 5th and 6th digits offset one -place in the rotation. - -The rest of the melancoils in this attempt work out to just be every conceivable -iteration of the above. This is simply a property of the algorithm chosen, and -there's not a whole lot we can do about it. - -## Third attempt - -After talking with [Mr. 
Marco](/members/#marcopolo) about the previous attempts -I got an idea that would lead me towards more attempts. The main issue I was -having in coming up with new happification algorithms was figuring out what to -do about getting a number greater than `FFFFFF`. Dropping the leading digits -just seemed.... lame. - -One solution I came up with was to simply happify again. And again, and again. -Until I got a number less than or equal to `FFFFFF`. - -With this new plan, I could increase the power by which I'm raising each -individual digit, and drop the strategy from the second attempt of splitting the -number into three parts. In the first attempt I was doing happification to the -power of 2, but what if I wanted to happify to the power of 6? It would look -something like this (starting with the number `34BEEF`): - -``` -34BEEF -3^6 + 4^6 + B^6 + E^6 + E^6 + F^6 -2D9 + 1000 + 1B0829 + 72E440 + 72E440 + ADCEA1 -1AEB223 - -1AEB223 is greater than FFFFFF, so we happify again - -1^6 + A^6 + E^6 + B^6 + 2^6 + 2^6 + 3^6 -1 + F4240 + 72E440 + 1B0829 + 40 + 40 + 2D9 -9D3203 -``` - -So `34BEEF` happifies to `9D3203`, when happifying to the power of 6. - -As mentioned before the first attempt in this blog was the 2nd power tree, -here are the trees for the 3rd, 4th, 5th, and 6th powers (each image is a link to -a larger version): - -3rd power: -[![Third attempt, 3rd power](/img/happy-tree/happy-tree-atmp3-pow3-small.png)](/img/happy-tree/happy-tree-atmp3-pow3.png) - -4th power: -[![Third attempt, 4th power](/img/happy-tree/happy-tree-atmp3-pow4-small.png)](/img/happy-tree/happy-tree-atmp3-pow4.png) - -5th power: -[![Third attempt, 5th power](/img/happy-tree/happy-tree-atmp3-pow5-small.png)](/img/happy-tree/happy-tree-atmp3-pow5.png) - -6th power: -[![Third attempt, 6th power](/img/happy-tree/happy-tree-atmp3-pow6-small.png)](/img/happy-tree/happy-tree-atmp3-pow6.png) - -A couple things to note: - -* 3-5 are still very blue. 
It's not till the 6th power that the distribution - becomes random enough to become very colorful. - -* Some powers have more coils than others. Power of 3 has a lot, and actually a - lot of them aren't coils, but single narcissistic numbers. Narcissistic - numbers are those which happify to themselves. `000000` and `000001` are - narcissistic numbers in all powers, power of 3 has quite a few more. - -* 4 looks super cool. - -Using unsigned 64-bit integers I could theoretically go up to the power of 15. -But I hit a roadblock at power of 7, in that there's actually a melancoil which -occurs whose members are all greater than `FFFFFF`. This means that my strategy -of repeatedly happifying until I get under `FFFFFF` doesn't work for any numbers -which lead into that coil. diff --git a/_posts/2017-09-06-brian-bars.md b/_posts/2017-09-06-brian-bars.md deleted file mode 100644 index 2c56272..0000000 --- a/_posts/2017-09-06-brian-bars.md +++ /dev/null @@ -1,105 +0,0 @@ ---- -title: Brian Bars -description: >- - Cheap and easy to make, healthy, vegan, high-carb, high-protein. "The Good - Stuff". -updated: 2018-01-18 ---- - -It actually blows my mind it's been 4 years since I used this blog. It was -previously a tech blog, but then I started putting all my tech-related posts on -[the cryptic blog](https://cryptic.io). As of now this is a lifestyle/travel -blog. The me of 4 years ago would be horrified. - -Now I just have to come up with a lifestyle and do some traveling. - -## Recipe - -This isn't a real recipe because I'm not going to preface it with my entire -fucking life story. Let's talk about the food. - -Brian bars: - -* Are like Clif Bars, but with the simplicity of ingredients that Larabars have. -* Are easy to make, only needing a food processor (I use a magic bullet) and a - stovetop oven. -* Keep for a long time and don't really need refrigerating (but don't mind it - neither) -* Are paleo, vegan, gluten-free, free-range, grass-fed, whatever... 
-* Are really really filling. -* Are named after me, deal with it. - -I've worked on this recipe for a bit, trying to make it workable, and will -probably keep adjusting it (and this post) as time goes on. - -### Ingredients - -Nuts and seeds. Most of this recipe is nuts and seeds. Here's the ones I used: - -* 1 cup almonds -* 1 cup peanuts -* 1 cup walnuts -* 1 cup coconut flakes/shavings/whatever -* 1/2 cup flax seeds -* 1/2 cup sesame seeds - -For all of those above it doesn't _really_ matter what nuts/seeds you use, it's -all gonna get ground up anyway. So whatever's cheap works fine. Also, avoid -salt-added ones if you can. - -The other ingredients are: - -* 1 cup raisins/currants -* 1.5 lbs of pitted dates (no added sugar! you don't need it!) -* 2 cups oats - -### Grind up the nuts - -Throw the nuts into the food processor and grind them into a powder. Then throw -that powder into a bowl along with the seeds, coconuts, raisins, and oats, and -mix em good. - -I don't _completely_ grind up the nuts, instead leaving some chunks in it here -and there, but you do you. - -### Prepare the dates - -This is the harder part, and is what took me a couple tries to get right. The -best strategy I've found is to steam the dates a bit over a stove to soften -them. Then, about a cup at a time, you can throw them in the food processor and -turn them into a paste. You may have to add a little water if your processor is -having trouble. - -Once processed you can add the dates to the mix from before and stir it all up. -It'll end up looking something like cookie dough. Except unlike cookie dough -it's completely safe to eat and maybe sorta healthy. - -### Bake it, Finish it - -Put the dough stuff in a pan of some sort, flatten it out, and stick it in the -oven at like 250 or 300 for a few hours. You're trying to cook out the water you -added earlier when you steamed the dates, as well as whatever little moisture -the dates had in the first place. 
- -Once thoroughly baked you can stick the pan in the fridge to cool and keep, -and/or cut it up into individual bars. Keep in mind that the bars are super -filling and allow for pretty small portions. Wrap em in foil or plastic wrap and -take them to-go, or keep them around for a snack. Or both. Or whatever you want -to do, it's your food. - -### Cleanup - -Dates are simultaneously magical and the most annoying thing to work with, so -there's cleanup problems you may run into with them: - -Protip #1: When cleaning your processed date slime off of your cooking utensils -I'd recommend just letting them soak in water for a while. Dry-ish date slime -will stick to everything, while soaked date slime will come right off. - -Protip #2: Apparently if you want ants, dates are a great way to get ants. My -apartment has never had an ant problem until 3 hours after I made a batch of -these and didn't wipe down my counter enough. I'm still dealing with the ants. -Apparently there are environmentally friendly ant poisons where the ants happily -carry the poison back into the nest and the whole nest eats it and dies. Which -feels kinda mean in some way, but is also pretty clever and they're just ants -anyway so fuck it. diff --git a/_posts/2018-10-25-rethinking-identity.md b/_posts/2018-10-25-rethinking-identity.md deleted file mode 100644 index d3520d7..0000000 --- a/_posts/2018-10-25-rethinking-identity.md +++ /dev/null @@ -1,292 +0,0 @@ ---- -title: Rethinking Identity -description: >- - A more useful way of thinking about identity on the internet, and using that - to build a service which makes our online life better. ---- - -In my view, the major social media platforms (Facebook, Twitter, Instagram, -etc...) are broken. They worked well at small scales, but billions of people are -now exposed to them, and [Murphy's Law][murphy] has come into effect. 
The weak -points in the platforms have been found and exploited, to the point where -they're barely usable for interacting with anyone you don't already know in -person. - -[murphy]: https://en.wikipedia.org/wiki/Murphy%27s_law - -On the other hand, social media, at its core, is a powerful tool that humans -have developed, and it's not one to be thrown away lightly (if it can be thrown -away at all). It's worthwhile to try and fix it. So that's what this post is -about. - -A lot of moaning and groaning has already been done on how social media is toxic -for the average person. But the average person isn't doing anything more than -receiving and reacting to their environment. If that environment is toxic, the -person in it becomes so as well. It's certainly possible to filter the toxicity -out, and use a platform to your own benefit, but that takes work on the user's -part. It would be nice to think that people will do more than follow the path of -least resistance, but at scale that's simply not how reality is, and people -shouldn't be expected to do that work. - -To identify what has become toxic about the platforms, first we need to identify -what a non-toxic platform would look like. - -The ideal definition for social media is to give people a place to socialize -with friends, family, and the rest of the world. Defining "socialize" is tricky, -and probably an exercise only a socially awkward person who doesn't do enough -socializing would undertake. "Expressing one's feelings, knowledge, and -experiences to other people, and receiving theirs in turn" feels like a good -approximation. A platform where true socializing was the only activity would be -ideal. 
- -Here are some trends on our social media which have nothing to do with -socializing: artificially boosted follower numbers on Instagram to obtain -product sponsors, shills in Reddit comments boosting a product or company, -Russian trolls on Twitter spreading propaganda, trolls everywhere being dicks -and switching IPs when they get banned, and [that basketball president whose -wife used burner Twitter accounts to trash talk players][president]. - -[president]: https://www.nytimes.com/2018/06/07/sports/bryan-colangelo-sixers-wife.html - -These are all examples of how anonymity can be abused on social media. I want -to say up front that I'm _not_ against anonymity on the internet, and that I -think we can have our cake and eat it too. But we _should_ acknowledge the -direct and indirect problems anonymity causes. We can't trust that anyone on -social media is being honest about who they are and what their motivation is. -This problem extends outside of social media too, to Amazon product reviews (and -basically any other review system), online polls and raffles, multiplayer games, -and surely many other cases. - -## Identity - -To fix social media, and other large swaths of the internet, we need to rethink -identity. This process started for me a long time ago, when I watched [this TED -talk][identity], which discusses ways in which we misunderstand identity. -Crucially, David Birch points out that identity is not a name, it's more -fundamental than that. - -[identity]: https://www.ted.com/talks/david_birch_identity_without_a_name - -In the context of online platforms, where a user creates an account which -identifies them in some way, identity breaks down into 3 distinct problems -which are often conflated: - -* Authentication: Is this identity owned by this person? -* Differentiation: Is this identity unique to this person? -* Authorization: Is this identity allowed to do X? - -For internet platform developers, authentication has been given the full focus. 
-Blog posts, articles, guides, and services abound which deal with properly -hashing and checking passwords, two factor authentication, proper account -recovery procedure, etc... While authentication is not a 100% solved problem, -it's had the most work done on it, and the problems which this post deals with -are not affected by it. - -The problem which should instead be focused on is differentiation. - -## Differentiation - -I want to make very clear, once more, that I am _not_ in favor of de-anonymizing -the web, and doing so is not what I'm proposing. - -Differentiation is without a doubt the most difficult identity problem to solve. -It's not even clear that it's solvable offline. Take this situation: you are in -a room, and you are told that one person is going to walk in, then leave, then -another person will do the same. These two persons may or may not be the same -person. You're allowed to do anything you like to each person (with their -consent) in order to determine if they are the same person or not. - -For the vast, vast majority of cases you can simply look with your eyeballs and -see if they are different people. But this will not work 100% of the time. -Identical twins are an obvious example of two persons looking like one, but a -malicious actor with a disguise might be one person posing as two. Biometrics -like fingerprints, iris scanning, and DNA testing fail for many reasons (the -identical twin case being one). You could attempt to give the first a unique -marking on their skin, but who's to say they don't have a solvent, which can -clean that marking off, waiting right outside the door? - -The solutions and refutations can continue on pedantically for some time, but -the point is that there is likely not a 100% solution, and even the 90% -solutions require significant investment. Differentiation is a hard problem, -which most developers don't want to solve. 
Most are fine with surrogates like -checking that an email or phone number is unique to the platform, but these -aren't enough to stop a dedicated individual or organization. - -### Roll Your Own Differentiation - -If a platform wants to roll their own solution to the differentiation problem, a -proper solution, it might look something like this: - -* Submit an image of your passport, or other government issued ID. This would - have to be checked against the appropriate government agency to ensure the - ID is legitimate. - -* Submit an image of your face, alongside a written note containing a code given - by the platform. Software to detect manipulated images would need to be - employed, as well as reverse image searching to ensure the image isn't being - reused. - -* Once completed, all data needs to be hashed/fingerprinted and then destroyed, - so sensitive data isn't sitting around on servers, but can still be checked - against future users signing up for the platform. - -* A dedicated support team would be needed to handle edge-cases and mistakes. - -None of these is trivial, nor would I trust an up-and-coming platform which is -being bootstrapped out of a basement to implement any of them correctly. -Additionally, going through with this process would be a _giant_ point of -friction for a user creating a new account; they likely would go use a different -platform instead, which didn't have all this nonsense required. - -### Differentiation as a Service - -This is the crux of this post. - -Instead of each platform rolling their own differentiation, what if there was a -service for it. Users would still have to go through the hassle described above, -but only once forever, and on a more trustable site. Then platforms, no matter -what stage of development they're at, could use that service to ensure that -their community of users is free from the problems of fake accounts and trolls. 
- -This is what the service would look like: - -* A user would have to, at some point, have gone through the steps above to - create an account on the differentiation-as-a-service (DaaS) platform. This - account would have the normal authentication mechanisms that most platforms - do (password, two-factor, etc...). - -* When creating an account on a new platform, the user would login to their DaaS - account (similar to the common "login with Google/Facebook/Twitter" buttons). - -* The DaaS then returns an opaque token, an effectively random string which - uniquely identifies that user, to the platform. The platform can then check in - its own user database for any other users using that token, and know if the - user already has an account. All of this happens without any identifying - information being passed to the platform. - -Similar to how many sites outsource to Cloudflare to handle DDoS protection, -which is better handled en masse by people familiar with the problem, the DaaS -allows for outsourcing the problem of differentiation. Users are more likely to -trust an established DaaS service than a random website they're signing up for. -And signing up for a DaaS is a one-time event, so if enough platforms are using -the DaaS it could become worthwhile for them to do so. - -Finally, since the DaaS also handles authentication, a platform could outsource -that aspect of identity management to it as well. This is optional for the -platform, but for smaller platforms which are just starting up it might be -worthwhile to save that development time. - -### Traits of a Successful DaaS - -It's possible for me to imagine a world where use of DaaS' is common, but -bridging the gap between that world and this one is not as obvious. Still, I -think it's necessary if the internet is to ever evolve past being, primarily, -a home for trolls. 
There are a number of traits of an up-and-coming DaaS which -would aid it in being accepted by the internet: - -* **Patience**: there is a critical mass of users and platforms using DaaS' - where it becomes more advantageous for platforms to use the DaaS than not. - Until then, the DaaS and platforms using it need to take deliberate but small - steps. For example: making DaaS usage optional for platform users, and giving - their accounts special marks to indicate they're "authentic" (like Twitter's - blue checkmark); giving those users' activity higher weight in algorithms; - allowing others to filter out activity of non-"authentic" users; etc... These - are all preliminary steps which can be taken which encourage but don't require - platform users to use a DaaS. - -* **User-friendly**: most likely the platforms using a DaaS are what are going - to be paying the bills. A successful DaaS will need to remember that, no - matter where the money comes from, if the users aren't happy they'll stop - using the DaaS, and platforms will be forced to switch to a different one or - stop using them altogether. User-friendliness means more than a nice - interface; it means actually caring for the users' interests, taking their - privacy and security seriously, and in all other aspects being on their side. - In that same vein, competition is important, and so... - -* **No country/government affiliation**: If the DaaS was to be run by a - government agency it would have no incentive to provide a good user - experience, since the users aren't paying the bills (they might not even be in - that country). A DaaS shouldn't be exclusive to any one government or country - anyway. Perhaps it starts out that way, to get off the ground, but ultimately - the internet is a global institution, and is healthiest when it's connecting - individuals _around the world_. A successful DaaS will reach beyond borders - and try to connect everyone. 
- -Obviously actually starting a DaaS would be a huge undertaking, and would -require proper management and good developers and all that, but such things -apply to most services. - -## Authorization - -The final aspect of identity management, which I haven't talked about yet, is -authorization. This aspect deals with what a particular identity is allowed to -do. For example, is an identity allowed to claim they have a particular name, or -are from a particular place, or are of a particular age? Other things like -administration and moderation privileges also fall under authorization, but they -are generally defined and managed within a platform. - -A DaaS has the potential to help with authorization as well, though with a giant -caveat. If a DaaS were to not fingerprint and destroy the user's data, like -their name and birthday and whatnot, but instead store them, then the following -use-case could also be implemented: - -* A platform wants to know if a user is above a certain age, let's say. It asks - the DaaS for that information. - -* The DaaS asks the user, OAuth style, whether the user is ok with giving the - platform that information. - -* If so, the platform is given that information. - -This is a tricky situation. It adds a lot of liability for the user, since their -raw data will be stored with the DaaS, ripe for hacking. It also places a lot of -trust with the DaaS to be responsible with users' data and not go giving it out -willy-nilly to others, and instead to only give out the bare-minimum that the -user allows. Since the user is not the DaaS' direct customer, this might be too -much to ask. Nevertheless, it's a use-case which is worth thinking about. - -## Dapps - -The idea of decentralized applications, or dapps, has begun to gain traction. -While not mainstream yet, I think they have potential, and it's necessary to -discuss how a DaaS would operate in a world where the internet is no longer -hosted in central datacenters. 
- -Consider an Ethereum-based dapp. If a user were to register one ethereum address -(which are really public keys) with their DaaS account, the following use-case -could be implemented: - -* A charity dapp has an ethereum contract, which receives a call from an - ethereum address asking for money. The dapp wants to ensure every person it - sends money to hasn't received any that day. - -* The DaaS has a separate ethereum contract it manages, where it stores all - addresses which have been registered to a user. There is no need to keep any - other user information in the contract. - -* The charity dapp's contract calls the DaaS' contract, asking it if the address - is one of its addresses. If so, and if the charity contract hasn't given to - that address yet today, it can send money to that address. - -There would perhaps need to be some mechanism by which a user could change their -address, which would be complex since that address might be in use by a dapp -already, but it's likely a solvable problem. - -A charity dapp is a bit of a silly example; ideally with a charity dapp there'd -also be some mechanism to ensure a person actually _needs_ the money. But -there's other dapp ideas which would become feasible, due to the inability of a -person to impersonate many people, if DaaS use becomes normal. - -## Why Did I Write This? - -Perhaps you've gotten this far and are asking: "Clearly you've thought about -this a lot, why don't you make this yourself and make some phat stacks of cash -with a startup?" The answer is that this project would need to be started and -run by serious people, who can be dedicated and thorough and responsible. I'm -not sure I'm one of those people; I get distracted easily. But I would like to -see this idea tried, and so I've written this up thinking maybe someone else -would take the reins. - -I'm not asking for equity or anything, if you want to try; it's a free idea for -the taking. 
But if it turns out to be a bazillion dollar Good Idea™, I won't say -no to a donation... diff --git a/_posts/2018-11-12-viz-1.md b/_posts/2018-11-12-viz-1.md deleted file mode 100644 index 8fd9fd9..0000000 --- a/_posts/2018-11-12-viz-1.md +++ /dev/null @@ -1,54 +0,0 @@ ---- -title: >- - Visualization 1 -description: >- - Using clojurescript and quil to generate interesting visuals -series: viz -git_repo: https://github.com/mediocregopher/viz.git -git_commit: v1 ---- - -First I want to apologize if you've seen this already, I originally had this up -on my normal website, but I've decided to instead consolidate all my work to my -blog. - -This is the first of a series of visualization posts I intend to work on, each -building from the previous one. - -<script src="/assets/viz/1/goog/base.js"></script> -<script src="/assets/viz/1/cljs_deps.js"></script> -<script>goog.require("viz.core");</script> -<p align="center"><canvas id="viz"></canvas></p> - -This visualization follows a few simple rules: - -* Any point can only be occupied by a single node. A point may be alive (filled) - or dead (empty). - -* On every tick each live point picks from 0 to N new points to spawn, where N is - the number of empty adjacent points to it. If it picks 0, it becomes dead. - -* Each line indicates the parent of a point. Lines have an arbitrary lifetime of - a few ticks, and occupy the points they connect (so new points may not spawn - on top of a line). - -* When a dead point has no lines it is cleaned up, and its point is no longer - occupied. - -The resulting behavior is somewhere between [Conway's Game of -Life](https://en.wikipedia.org/wiki/Conway%27s_Game_of_Life) and white noise. -Though each point operates independently, they tend to move together in groups. -When two groups collide head on they tend to cancel each other out, killing most -of both. When they meet while both heading in a common direction they tend to -peacefully merge towards that direction. 
- -Sometimes their world becomes so cluttered there's hardly room to move. -Sometimes a major coincidence of events leads to multiple groups canceling each -other at once, opening up the world and allowing for an explosion of new growth. - -Some groups spiral about a single point, sustaining themselves and defending -from outside groups in the same movement. This doesn't last for very long. - -The performance of this visualization is not very optimized, and will probably -eat up your CPU like nothing else. Most of the slowness comes from drawing the -lines; since there's so many individual small ones it's quite cumbersome to do. diff --git a/_posts/2018-11-12-viz-2.md b/_posts/2018-11-12-viz-2.md deleted file mode 100644 index c3e342e..0000000 --- a/_posts/2018-11-12-viz-2.md +++ /dev/null @@ -1,49 +0,0 @@ ---- -title: >- - Visualization 2 -description: >- - Now in glorious technicolor! -series: viz -git_repo: https://github.com/mediocregopher/viz.git -git_commit: v2 ---- - - -<script src="/assets/viz/2/goog/base.js"></script> -<script src="/assets/viz/2/cljs_deps.js"></script> -<script>goog.require("viz.core");</script> -<p align="center"><canvas id="viz"></canvas></p> - -This visualization builds on the previous. Structurally the cartesian grid has -been turned into an isometric one, but this is more of an environmental change -than a behavioral one. - -Behavioral changes which were made: - -* When a live point is deciding its next spawn points, it first sorts the set of - empty adjacent points from closest-to-the-center to farthest. It then chooses - a number `n` between `0` and `N` (where `N` is the sorted set's size) and - spawns new points from the first `n` points of the sorted set. `n` is chosen - based on: - - * The live point's linear distance from the center. - - * A random multiplier. - -* Each point is spawned with an attached color, where the color chosen is a - slightly different hue than its parent. 
The change is deterministic, so all - child points of the same generation have the same color. - -The second change is purely cosmetic, but does create a mesmerizing effect. The -first change alters the behavior dramatically. Only the points which compete for -the center are able to reproduce, but by the same token are more likely to be -starved out by other points doing the same. - -In the previous visualization the points moved around in groups aimlessly. Now -the groups are all competing for the same thing, the center. As a result they -congregate and are able to be viewed as a larger whole. - -The constant churn of the whole takes many forms, from a spiral in the center, -to waves crashing against each other, to outright chaos, to random purges of -nearly all points. Each form lasts for only a few seconds before giving way to -another. diff --git a/_posts/2019-08-02-program-structure-and-composability.md b/_posts/2019-08-02-program-structure-and-composability.md deleted file mode 100644 index b44c534..0000000 --- a/_posts/2019-08-02-program-structure-and-composability.md +++ /dev/null @@ -1,587 +0,0 @@ ---- -title: >- - Program Structure and Composability -description: >- - Discussing the nature of program structure, the problems presented by - complex structures, and a pattern that helps in solving those problems. ---- - -## Part 0: Introduction - -This post is focused on a concept I call “program structure,” which I will try -to shed some light on before discussing complex program structures. I will then -discuss why complex structures can be problematic to deal with, and will finally -discuss a pattern for dealing with those problems. - -My background is as a backend engineer working on large projects that have had -many moving parts; most had multiple programs interacting with each other, used -many different databases in various contexts, and faced large amounts of load -from millions of users. 
Most of this post will be framed from my perspective, -and will present problems in the way I have experienced them. I believe, -however, that the concepts and problems I discuss here are applicable to many -other domains, and I hope those with a foot in both backend systems and a second -domain can help to translate the ideas between the two. - -Also note that I will be using Go as my example language, but none of the -concepts discussed here are specific to Go. To that end, I’ve decided to favor -readable code over “correct” code, and so have elided things that most gophers -hold near-and-dear, such as error checking and proper documentation, in order to -make the code as accessible as possible to non-gophers as well. As with before, -I trust that someone with a foot in Go and another language can help me -translate between the two. - -## Part 1: Program Structure - -In this section I will discuss the difference between directory and program -structure, show how global state is antithetical to compartmentalization (and -therefore good program structure), and finally discuss a more effective way to -think about program structure. - -### Directory Structure - -For a long time, I thought about program structure in terms of the hierarchy -present in the filesystem. In my mind, a program’s structure looked like this: - -``` -// The directory structure of a project called gobdns. -src/ - config/ - dns/ - http/ - ips/ - persist/ - repl/ - snapshot/ - main.go -``` - -What I grew to learn was that this conflation of “program structure” with -“directory structure” is ultimately unhelpful. While it can’t be denied that -every program has a directory structure (and if not, it ought to), this does not -mean that the way the program looks in a filesystem in any way corresponds to -how it looks in our mind’s eye. - -The most notable way to show this is to consider a library package. 
Here is the -structure of a simple web-app which uses redis (my favorite database) as a -backend: - -``` -src/ - redis/ - http/ - main.go -``` - -If I were to ask you, based on that directory structure, what the program does -in the most abstract terms, you might say something like: “The program -establishes an http server that listens for requests. It also establishes a -connection to the redis server. The program then interacts with redis in -different ways based on the http requests that are received on the server.” - -And that would be a good guess. Here’s a diagram that depicts the program -structure, wherein the root node, `main.go`, takes in requests from `http` and -processes them using `redis`. - -{% include image.html - dir="program-structure" file="diag1.jpg" width=519 - descr="Example 1" - %} - -This is certainly a viable guess for how a program with that directory -structure operates, but consider another answer: “A component of the program -called `server` establishes an http server that listens for requests. `server` -also establishes a connection to a redis server. `server` then interacts with -that redis connection in different ways based on the http requests that are -received on the http server. Additionally, `server` tracks statistics about -these interactions and makes them available to other components. The root -component of the program establishes a connection to a second redis server, and -stores those statistics in that redis server.” Here’s another diagram to depict -_that_ program. - -{% include image.html - dir="program-structure" file="diag2.jpg" width=712 - descr="Example 2" - %} - -The directory structure could apply to either description; `redis` is just a -library which allows for interaction with a redis server, but it doesn’t -specify _which_ or _how many_ servers. However, those are extremely important -factors that are definitely reflected in our concept of the program’s -structure, and not in the directory structure. 
**What the directory structure -reflects are the different _kinds_ of components available to use, but it does -not reflect how a program will use those components.** - - -### Global State vs Compartmentalization - -The directory-centric view of structure often leads to the use of global -singletons to manage access to external resources like RPC servers and -databases. In examples 1 and 2 the `redis` library might contain code which -looks something like this: - -```go -// A mapping of connection names to redis connections. -var globalConns = map[string]*RedisConn{} - -func Get(name string) *RedisConn { - if globalConns[name] == nil { - globalConns[name] = makeRedisConnection(name) - } - return globalConns[name] -} -``` - -Even though this pattern would work, it breaks with our conception of the -program structure in more complex cases like example 2. Rather than the `redis` -component being owned by the `server` component, which actually uses it, it -would be practically owned by _all_ components, since all are able to use it. -Compartmentalization has been broken, and can only be held together through -sheer human discipline. - -**This is the problem with all global state. It is shareable among all -components of a program, and so is accountable to none of them.** One must look -at an entire codebase to understand how a globally held component is used, -which might not even be possible for a large codebase. Therefore, the -maintainers of these shared components rely entirely on the discipline of their -fellow coders when making changes, usually discovering where that discipline -broke down once the changes have been pushed live. - -Global state also makes it easier for disparate programs/components to share -datastores for completely unrelated tasks. 
In example 2, rather than creating a -new redis instance for the root component’s statistics storage, the coder might -have instead said, “well, there’s already a redis instance available, I’ll just -use that.” And so, compartmentalization would have been broken further. Perhaps -the two instances _could_ be coalesced into the same instance for the sake of -resource efficiency, but that decision would be better made at runtime via the -configuration of the program, rather than being hardcoded into the code. - -From the perspective of team management, global state-based patterns do nothing -except slow teams down. The person/team responsible for maintaining the central -library in which shared components live (`redis`, in the above examples) -becomes the bottleneck for creating new instances for new components, which -will further lead to re-using existing instances rather than creating new ones, -further breaking compartmentalization. Additionally the person/team responsible -for the central library, rather than the team using it, often finds themselves -as the maintainers of the shared resource. - -### Component Structure - -So what does proper program structure look like? In my mind the structure of a -program is a hierarchy of components, or, in other words, a tree. The leaf -nodes of the tree are almost _always_ IO related components, e.g., database -connections, RPC server frameworks or clients, message queue consumers, etc. -The non-leaf nodes will _generally_ be components that bring together the -functionalities of their children in some useful way, though they may also have -some IO functionality of their own. - -Let's look at an even more complex structure, still only using the `redis` and -`http` component types: - -{% include image.html - dir="program-structure" file="diag3.jpg" width=729 - descr="Example 3" - %} - -This component structure contains the addition of the `debug` component. 
-Clearly the `http` and `redis` components are reusable in different contexts, -but for this example the `debug` endpoint is as well. It creates a separate -http server that can be queried to perform runtime debugging of the program, -and can be tacked onto virtually any program. The `rest-api` component is -specific to this program and is therefore not reusable. Let’s dive into it a -bit to see how it might be implemented: - -```go -// RestAPI is very much not thread-safe, hopefully it doesn't have to handle -// more than one request at once. -type RestAPI struct { - redisConn *redis.RedisConn - httpSrv *http.Server - - // Statistics exported for other components to see - RequestCount int - FooRequestCount int - BarRequestCount int -} - -func NewRestAPI() *RestAPI { - r := new(RestAPI) - r.redisConn = redis.NewConn("127.0.0.1:6379") - - // mux will route requests to different handlers based on their URL path. - mux := http.NewServeMux() - mux.HandleFunc("/foo", r.fooHandler) - mux.HandleFunc("/bar", r.barHandler) - r.httpSrv = http.NewServer(mux) - - // Listen for requests and serve them in the background. - go r.httpSrv.Listen(":8000") - - return r -} - -func (r *RestAPI) fooHandler(rw http.ResponseWriter, req *http.Request) { - r.redisConn.Command("INCR", "fooKey") - r.RequestCount++ - r.FooRequestCount++ -} - -func (r *RestAPI) barHandler(rw http.ResponseWriter, req *http.Request) { - r.redisConn.Command("INCR", "barKey") - r.RequestCount++ - r.BarRequestCount++ -} -``` - - -In that snippet `rest-api` coalesced `http` and `redis` into a simple REST-like -api using pre-made library components. 
`main.go`, the root component, does much -the same: - -```go -func main() { - // Create debug server and start listening in the background - debugSrv := debug.NewServer() - - // Set up the RestAPI, this will automatically start listening - restAPI := NewRestAPI() - - // Create another redis connection and use it to store statistics - statsRedisConn := redis.NewConn("127.0.0.1:6380") - for { - time.Sleep(1 * time.Second) - statsRedisConn.Command("SET", "numReqs", restAPI.RequestCount) - statsRedisConn.Command("SET", "numFooReqs", restAPI.FooRequestCount) - statsRedisConn.Command("SET", "numBarReqs", restAPI.BarRequestCount) - } -} -``` - -One thing that is clearly missing in this program is proper configuration, -whether from command-line or environment variables, etc. As it stands, all -configuration parameters, such as the redis addresses and http listen -addresses, are hardcoded. Proper configuration actually ends up being somewhat -difficult, as the ideal case would be for each component to set up its own -configuration variables without its parent needing to be aware. For example, -`redis` could set up `addr` and `pool-size` parameters. The problem is that there -are two `redis` components in the program, and their parameters would therefore -conflict with each other. An elegant solution to this problem is discussed in -the next section. - -## Part 2: Components, Configuration, and Runtime - -The key to the configuration problem is to recognize that, even if there are -two of the same component in a program, they can’t occupy the same place in the -program’s structure. In the above example, there are two `http` components: one -under `rest-api` and the other under `debug`. Because the structure is -represented as a tree of components, the “path” of any node in the tree -uniquely represents it in the structure. 
For example, the two `http` components -in the previous example have these paths: - -``` -root -> rest-api -> http -root -> debug -> http -``` - -If each component were to know its place in the component tree, then it would -easily be able to ensure that its configuration and initialization didn’t -conflict with other components of the same type. If the `http` component sets -up a command-line parameter to know what address to listen on, the two `http` -components in that program would set up: - -``` ---rest-api-listen-addr ---debug-listen-addr -``` - -So how can we enable each component to know its path in the component structure? -To answer this, we’ll have to take a detour through a type, called `Component`. - -### Component and Configuration - -The `Component` type is a made-up type (though you’ll be able to find an -implementation of it at the end of this post). It has a single primary purpose, -and that is to convey the program’s structure to new components. - -To see how this is done, let's look at a couple of `Component`'s methods: - -```go -// Package mcmp - -// New returns a new Component which has no parents or children. It is therefore -// the root component of a component hierarchy. -func New() *Component - -// Child returns a new child of the called upon Component. -func (*Component) Child(name string) *Component - -// Path returns the Component's path in the component hierarchy. It will return -// an empty slice if the Component is the root component. -func (*Component) Path() []string -``` - -`Child` is used to create a new `Component`, corresponding to a new child node -in the component structure, and `Path` is used to retrieve the path of any -`Component` within that structure. For the sake of keeping the examples simple, -let’s pretend these functions have been implemented in a package called `mcmp`. 
-Here’s an example of how `Component` might be used in the `redis` component’s -code: - -```go -// Package redis - -func NewConn(cmp *mcmp.Component, defaultAddr string) *RedisConn { - cmp = cmp.Child("redis") - paramPrefix := strings.Join(cmp.Path(), "-") - - addrParam := flag.String(paramPrefix+"-addr", defaultAddr, "Address of redis instance to connect to") - // finish setup - - return redisConn -} -``` - -In our above example, the two `redis` components' parameters would be: - -``` -// This first parameter is for the stats redis, whose parent is the root and -// therefore doesn't have a prefix. Perhaps stats should be broken into its own -// component in order to fix this. ---redis-addr ---rest-api-redis-addr -``` - -`Component` definitely makes it easier to instantiate multiple redis components -in our program, since it allows them to know their place in the component -structure. - -Having to construct the prefix for the parameters ourselves is pretty annoying, -so let’s introduce a new package, `mcfg`, which acts like `flag` but is aware -of `Component`. Then `redis.NewConn` is reduced to: - -```go -// Package redis - -func NewConn(cmp *mcmp.Component, defaultAddr string) *RedisConn { - cmp = cmp.Child("redis") - addrParam := mcfg.String(cmp, "addr", defaultAddr, "Address of redis instance to connect to") - // finish setup - - return redisConn -} -``` - -Easy-peasy. - -#### But What About Parse? - -Sharp-eyed gophers will notice that there is a key piece missing: When is -`flag.Parse`, or its `mcfg` counterpart, called? When does `addrParam` actually -get populated? It can’t happen inside `redis.NewConn` because there might be -other components after `redis.NewConn` that want to set up parameters. To -illustrate the problem, let’s look at a simple program that wants to set up two -`redis` components: - -```go -func main() { - // Create the root Component, an empty Component. - cmp := mcmp.New() - - // Create the Components for two sub-components, foo and bar. 
- cmpFoo := cmp.Child("foo") - cmpBar := cmp.Child("bar") - - // Now we want to try to create a redis sub-component for each component. - - // This will set up the parameter "--foo-redis-addr", but bar hasn't had a - // chance to set up its corresponding parameter, so the command-line can't - // be parsed yet. - fooRedis := redis.NewConn(cmpFoo, "127.0.0.1:6379") - - // This will set up the parameter "--bar-redis-addr", but, as mentioned - // before, redis.NewConn can't parse command-line. - barRedis := redis.NewConn(cmpBar, "127.0.0.1:6379") - - // It is only after all components have been instantiated that the - // command-line arguments can be parsed - mcfg.Parse() -} -``` - -While this solves our argument parsing problem, fooRedis and barRedis are not -usable yet because the actual connections have not been made. This is a classic -chicken and the egg problem. The func `redis.NewConn` needs to make a connection -which it cannot do until _after_ `mcfg.Parse` is called, but `mcfg.Parse` cannot -be called until after `redis.NewConn` has returned. We will solve this problem -in the next section. - -### Instantiation vs Initialization - -Let’s break down `redis.NewConn` into two phases: instantiation and -initialization. Instantiation refers to creating the component on the component -structure and having it declare what it needs in order to initialize (e.g., -configuration parameters). During instantiation, nothing external to the -program is performed; no IO, no reading of the command-line, no logging, etc. -All that’s happened is that the empty template of a `redis` component has been -created. - -Initialization is the phase during which the template is filled in. -Configuration parameters are read, startup actions like the creation of database -connections are performed, and logging is output for informational and debugging -purposes. 
- -The key to making effective use of this dichotomy is to allow _all_ components -to instantiate themselves before they initialize themselves. By doing this we -can ensure, for example, that all components have had the chance to declare -their configuration parameters before configuration parsing is done. - -So let’s modify `redis.NewConn` so that it follows this dichotomy. It makes -sense to leave instantiation-related code where it is, but we need a mechanism -by which we can declare initialization code before actually calling it. For -this, I will introduce the idea of a “hook.” - -#### But First: Augment Component - -In order to support hooks, however, `Component` will need to be augmented with -a few new methods. Right now, it can only carry with it information about the -component structure, but here we will add the ability to carry arbitrary -key/value information as well: - -```go -// Package mcmp - -// SetValue sets the given key to the given value on the Component, overwriting -// any previous value for that key. -func (*Component) SetValue(key, value interface{}) - -// Value returns the value which has been set for the given key, or nil if the -// key was never set. -func (*Component) Value(key interface{}) interface{} - -// Children returns the Component's children in the order they were created. -func (*Component) Children() []*Component -``` - -The final method allows us to, starting at the root `Component`, traverse the -component structure and interact with each `Component`’s key/value store. This -will be useful for implementing hooks. - -#### Hooks - -A hook is simply a function that will run later. We will declare a new package, -calling it `mrun`, and say that it has two new functions: - -```go -// Package mrun - -// InitHook registers the given hook to the given Component. -func InitHook(cmp *mcmp.Component, hook func()) - -// Init runs all hooks registered using InitHook. Hooks are run in the order -// they were registered. 
-func Init(cmp *mcmp.Component) -``` - -With these two functions, we are able to defer the initialization phase of -startup by using the same `Components` we were passing around for the purpose -of denoting component structure. - -Now, with these few extra pieces of functionality in place, let’s reconsider the -most recent example, and make a program that creates two redis components which -exist independently of each other: - -```go -// Package redis - -// NOTE that NewConn has been renamed to InstConn, to reflect that the returned -// *RedisConn is merely instantiated, not initialized. - -func InstConn(cmp *mcmp.Component, defaultAddr string) *RedisConn { - cmp = cmp.Child("redis") - - // we instantiate an empty RedisConn instance and parameters for it. Neither - // has been initialized yet. They will remain empty until initialization has - // occurred. - redisConn := new(RedisConn) - addrParam := mcfg.String(cmp, "addr", defaultAddr, "Address of redis instance to connect to") - - mrun.InitHook(cmp, func() { - // This hook will run after parameter initialization has happened, and - // so addrParam will be usable. Once this hook has run, redisConn will be - // usable as well. - *redisConn = makeRedisConnection(*addrParam) - }) - - // Now that cmp has had configuration parameters and initialization hooks - // set into it, return the empty redisConn instance back to the parent. - return redisConn -} -``` - -```go -// Package main - -func main() { - // Create the root Component, an empty Component. - cmp := mcmp.New() - - // Create the Components for two sub-components, foo and bar. - cmpFoo := cmp.Child("foo") - cmpBar := cmp.Child("bar") - - // Add redis components to each of the foo and bar sub-components. 
- redisFoo := redis.InstConn(cmpFoo, "127.0.0.1:6379") - redisBar := redis.InstConn(cmpBar, "127.0.0.1:6379") - - // Parse will descend into the Component and all of its children, - // discovering all registered configuration parameters and filling them from - // the command-line. - mcfg.Parse(cmp) - - // Now that configuration parameters have been initialized, run the Init - // hooks for all Components. - mrun.Init(cmp) - - // At this point the redis components have been fully initialized and may be - // used. For this example we'll copy all keys from one to the other. - keys := redisFoo.Command("KEYS", "*") - for i := range keys { - val := redisFoo.Command("GET", keys[i]) - redisBar.Command("SET", keys[i], val) - } -} -``` - -## Conclusion - -While the examples given here are fairly simplistic, the pattern itself is quite -powerful. Codebases naturally accumulate small, domain-specific behaviors and -optimizations over time, especially around the IO components of the program. -Databases are used with specific options that an organization finds useful, -logging is performed in particular places, metrics are counted around certain -pieces of code, etc. - -By programming with component structure in mind, we are able to keep these -optimizations while also keeping the clarity and compartmentalization of the -code intact. We can keep our code flexible and configurable, while also -re-usable and testable. Also, the simplicity of the tools involved means they -can be extended and retrofitted for nearly any situation or use-case. - -Overall, this is a powerful pattern that I’ve found myself unable to do without -once I began using it. 
- -### Implementation - -As a final note, you can find an example implementation of the packages -described in this post here: - -* [mcmp](https://godoc.org/github.com/mediocregopher/mediocre-go-lib/mcmp) -* [mcfg](https://godoc.org/github.com/mediocregopher/mediocre-go-lib/mcfg) -* [mrun](https://godoc.org/github.com/mediocregopher/mediocre-go-lib/mrun) - -The packages are not stable and are likely to change frequently. You’ll also -find that they have been extended quite a bit from the simple descriptions found -here, based on what I’ve found useful as I’ve implemented programs using -component structures. With these two points in mind, I would encourage you to -look and take whatever functionality you find useful for yourself, and not use -the packages directly. The core pieces are not different from what has been -described in this post. diff --git a/_posts/2020-04-26-trading-in-the-rain.md b/_posts/2020-04-26-trading-in-the-rain.md deleted file mode 100644 index 3a31a95..0000000 --- a/_posts/2020-04-26-trading-in-the-rain.md +++ /dev/null @@ -1,55 +0,0 @@ ---- -title: >- - Trading in the Rain -description: >- - All those... gains... will be lost like... tears... 
---- - -<!-- MIDI.js --> -<!-- polyfill --> -<script src="/assets/trading-in-the-rain/MIDI.js/inc/shim/Base64.js" type="text/javascript"></script> -<script src="/assets/trading-in-the-rain/MIDI.js/inc/shim/Base64binary.js" type="text/javascript"></script> -<script src="/assets/trading-in-the-rain/MIDI.js/inc/shim/WebAudioAPI.js" type="text/javascript"></script> -<!-- MIDI.js package --> -<script src="/assets/trading-in-the-rain/MIDI.js/js/midi/audioDetect.js" type="text/javascript"></script> -<script src="/assets/trading-in-the-rain/MIDI.js/js/midi/gm.js" type="text/javascript"></script> -<script src="/assets/trading-in-the-rain/MIDI.js/js/midi/loader.js" type="text/javascript"></script> -<script src="/assets/trading-in-the-rain/MIDI.js/js/midi/plugin.audiotag.js" type="text/javascript"></script> -<script src="/assets/trading-in-the-rain/MIDI.js/js/midi/plugin.webaudio.js" type="text/javascript"></script> -<script src="/assets/trading-in-the-rain/MIDI.js/js/midi/plugin.webmidi.js" type="text/javascript"></script> -<!-- utils --> -<script src="/assets/trading-in-the-rain/MIDI.js/js/util/dom_request_xhr.js" type="text/javascript"></script> -<script src="/assets/trading-in-the-rain/MIDI.js/js/util/dom_request_script.js" type="text/javascript"></script> -<!-- / MIDI.js --> - -<script src="/assets/trading-in-the-rain/Distributor.js" type="text/javascript"></script> -<script src="/assets/trading-in-the-rain/MusicBox.js" type="text/javascript"></script> -<script src="/assets/trading-in-the-rain/RainCanvas.js" type="text/javascript"></script> -<script src="/assets/trading-in-the-rain/CW.js" type="text/javascript"></script> -<script src="/assets/trading-in-the-rain/SeriesComposer.js" type="text/javascript"></script> -<script src="/assets/trading-in-the-rain/main.js" type="text/javascript"></script> - - -<div id="tradingInRainModal"> -For each pair listed below, live trade data will be pulled down from the -<a href="https://docs.cryptowat.ch/websocket-api/">Cryptowat.ch 
Websocket -API</a> and used to generate musical rain drops. The price of each trade -determines both the musical note and position of the rain drop on the screen, -while the volume of each trade determines how long the note is held and how big -the rain drop is. - -<p id="markets">Pairs to be generated, by color:<br/><br/></p> - -<button id="button" onclick="run()">Click Here to Begin</button> -<p id="progress"></p> - -<script type="text/javascript"> - fillMarketP(); - if (window.addEventListener) window.addEventListener("load", autorun, false); - else if (window.attachEvent) window.attachEvent("onload", autorun); - else window.onload = autorun; -</script> -</div> - - -<canvas id="rainCanvas" style=""></canvas> diff --git a/_posts/2020-05-30-denver-protests.md b/_posts/2020-05-30-denver-protests.md deleted file mode 100644 index 710987f..0000000 --- a/_posts/2020-05-30-denver-protests.md +++ /dev/null @@ -1,161 +0,0 @@ ---- -title: >- - Denver Protests -description: >- - Craziness ---- - -# Saturday, May 30th - -We went to the May 30th protest at Civic Center Park. We were there for a few -hours during the day, leaving around 4pm. I would describe the character of the -protest as being energetic, angry, but contained. A huge crowd moved in and -around Civic Center, chanting and being rowdy, but clearly was being led. - -After a last hurrah at the pavilion it seemed that the organized event was -"over". We stayed a while longer, and eventually headed back home. I don't feel -that people really left the park at the same time we did; mostly everyone just -dispersed around the park and found somewhere to keep hanging out. - -Tonight there has been an 8pm curfew. The police lined up on the north side of -the park, armored and clearly ready for action. We watched all of this on the -live news stations, gritting our teeth through the commentary of their reporters. -As the police stood there, the clock counting down to 8, the protesters grew -more and more irritated. 
They taunted the police, and formed a line of their -own. The braver (or more dramatic) protesters walked around in the no-man's land -between them, occasionally earning themselves some teargas. - -The police began pushing forward a little just before 8, but began pushing in -earnest just after 8, after the howling. They would advance, wait, advance, wait -again. An armada of police cars, ambulances, and fire trucks followed the line as -it advanced. - -The police did not give the protesters anywhere to go except into Capitol Hill, -southeast of Civic Center Park. We watched as a huge crowd marched past the -front of our house, chanting their call and response: "What's his name?" "GEORGE -FLOYD". The feeling wasn't of violence still, just anger. Indignant at a curfew -aimed at quelling a movement, the protesters simply kept moving. The police were -never far behind. - -We sat on our front stoop with our neighbors and watched the night unfold. I -don't think a single person in our building or the buildings to the left and -right of us hadn't gone to protest today in some capacity. We came back from our -various outings and sat out front, watching the crowds and patrolling up and -down the street to keep tabs on things. - -Around 9pm the fires started. We saw them on the news, and in person. They were -generally dumpster fires, generally placed such that they were away from -buildings, clearly being done more to be annoying than to accomplish anything -specific. A very large set of fires was started a block south of us, in the -middle of the street. The fire department was there within a few minutes to put -those out, before moving on. - -From the corner of my eye, sitting back on the stoop, I noticed our neighbors -running into their backyard. We ran after them, and they told us there was a -dumpster fire in our alley. They were running with fire extinguishers, and we -ran inside to grab some of our own. 
By the time we got to the backyard the fire -was only smouldering, and the fire department was coming down the alley. We -scurried back into the backyard. A few minutes later I peeked my head around the -corner, into the alley, to see what was happening. I was greeted by at least two -police in riot gear, guarding the dumpster as the fire department worked. They -saw me but didn't move, and I quickly retreated back to the yard. - -Talking to our neighbor later we found out she had seen a group of about 10 -people back there, and watched them jump the fence into another backyard in -order to escape the alley. She thinks they, or some subset of them, started the -fire. She looked one in the eye, she says, and didn't get the impression they -were trying to cause damage, just to make a statement. - -The fires stopped not long after that, it seems. We're pretty sure the fire -trucks were just driving up and down the main roads, looking into alleys and -stopping all fires they could find. In all this time the police didn't do much. -They would hold a line, but never chase anyone. Even now, as I write this around -midnight, people are still out, meandering around in small groups, and police -are present but not really doing anything. - -It's hard to get a good view of everything though. All we have is livestreams on -YouTube to go on at this point. There's a couple intrepid amateur reporters out -there, getting into the crowds and streaming events as they happen. Right now -we're watching people moving down Lincoln towards Civic Center Park, some of -them trying to smash windows of buildings as they go. - -The violence of these protests is going to be the major story of tonight, I know -that already. That I know of there's been 3 police injured, some broken -windows, and quite a bit of graffiti. 
I do believe the tactic of pushing -everyone into Cap Hill had the desired effect of reducing looting (again, as far -as I can tell so far), but at the expense of those who live here who have to -endure latent tear gas, dumpster fires, and sirens all through the night. - -Even now, at midnight, from what I can see from my porch and from these live -streams, the protesters are not violent. At worst they are guilty of a lot of -loitering. The graffiti, the smashed windows, the injured officers, all of these -things will be held up as examples of the anarchy and violence inherent to the -protesters. But I don't think that's an honest picture. The vast, vast majority -of those out right now are civilly disobeying an unjust curfew, trying to keep -the energy of the movement alive. - -My thoughts about these things are complicated. When turning a corner on the -street I'm far more afraid to see the police than to see other protesters. The -fires have been annoying, and stupid, and unhelpful, but were never threatening. -The violence is stupid, though I don't shed many tears for a looted Chili's or -Papa Johns. The police have actually shown more restraint than I expected in all -of this, though funneling the protest into a residential neighborhood was an -incredibly stupid move. Could the protesters not have just stayed in the park? -Yes, the park would likely have been turned into an encampment, but it was -already heading in that direction due to Covid-19. Overall, this night didn't -need to be so hard, but Denver handled this well. - -But, it's only 1am, and the night has a long way to go. Things could still get -worse. Even now I'm watching people trying to break into the supreme court -building. Civic Center Park appears to be very populated again, and the police -are very present there again. It's possible I may eat my words. - -# Monday, June 1st - -Yesterday was quite a bit more tame than the craziness Saturday. 
I woke up -Sunday morning feeling antsy, and rode my bike around to see the damage. I had a -long conversation with a homeless man named Gary in Civic Center Park. He was -pissed, and had a lot to say about the "suburban kids" destroying the park he -and many others live in, causing it to be shut down and tear gassed. The -protesters saw it as a game, according to him, but it was life and death for the -homeless; three of his guys got beat up in the street, and neither police nor -protesters stopped it. - -Many people had shown up to the park early to help clean it up. Apart from the -graffiti, which was also in the process of being cleaned, it was hard to tell -anything had actually happened. Gary had some words about them as well, that -they were only there for the gram and some pats on the back, but once they left -his life would be back as it was. I could feel that, but I also appreciated that -people were cognizant that damage was being done and were willing to do -something about it. - -I rode around 16th street mall, down colfax, and back up 13th, looking to see if -anything had happened. For the most part there was no damage, save the graffiti. -A mediterranean restaurant got its windows smashed, as well as the Office Depot. -The restaurant was unfortunate, Office Depot will be ok. - -The protest yesterday was much more peaceful. The cops were nowhere to be found -when curfew hit, but did eventually show up when the protest moved down Colfax. -They had lined the streets around their precinct building there, but for the -most part the protesters just kept walking. This is when the "violence" started. -The cops moved into the street, forming a line across Colfax behind the -protesters. Police cars and vans started moving. As the protest turned back, -presumably to head back to the capitol lawn, it ran into the riot line. - -Predictably, everyone scattered. 
The cat-and-mouse game had begun, which meant -dumpster fires, broken windows, tear gas, and all the rest. Watching the whole -thing it was extremely clear to us, though not the news casters, unfortunately, -that if the police hadn't moved out into Colfax nothing would have ever -happened. Instead, the news casters lamented that people were bringing things -like helmets, gas masks, traffic cones, shields, etc... and so were clearly not there -"for the right reasons". - -The thing that the news casters couldn't seem to grasp was that the police -attempting to control these situations are what are catalyzing them in the first -place. These are protests _against_ the police, they cannot take place under the -terms the police choose. If the police were not here setting terms, but instead -working with the peaceful protesters (the vast, vast majority) to quell the -violence, no one would be here with helmets, gas masks, traffic cones, -shields... But instead the protesters feel they need to protect themselves in -order to be heard, and the police feel they have to exercise their power to -maintain control, and so the situation degrades. diff --git a/_posts/2020-07-07-viz-3.md b/_posts/2020-07-07-viz-3.md deleted file mode 100644 index f56dbb6..0000000 --- a/_posts/2020-07-07-viz-3.md +++ /dev/null @@ -1,154 +0,0 @@ ---- -title: >- - Visualization 3 -description: >- - All the pixels. -series: viz ---- - -<canvas id="canvas" style="padding-bottom: 2rem;"></canvas> - -This visualization is built from the ground up. On every frame a random set of -pixels is chosen. Each chosen pixel calculates the average of its color and the -color of a random neighbor. Some random color drift is added in as well. It -replaces its own color with that calculated color. - -Choosing a neighbor is done using the "asteroid rule", ie a pixel at the very -top row is considered to be the neighbor of the pixel on the bottom row of the -same column. 
- -Without the asteroid rule the pixels would all eventually converge into a single -uniform color, generally a light blue, due to the colors at the edge, the reds, -being quickly averaged away. With the asteroid rule in place the canvas has no -edges, thus no position on the canvas is favored and balance can be maintained. - -<script type="text/javascript"> -let rectSize = 12; - -function randn(n) { - return Math.floor(Math.random() * n); -} - -let canvas = document.getElementById("canvas"); -canvas.width = window.innerWidth - (window.innerWidth % rectSize); -canvas.height = window.innerHeight- (window.innerHeight % rectSize); -let ctx = canvas.getContext("2d"); - -let w = canvas.width / rectSize; -let h = canvas.height / rectSize; - -let matrices = new Array(2); -matrices[0] = new Array(w); -matrices[1] = new Array(w); -for (let x = 0; x < w; x++) { - matrices[0][x] = new Array(h); - matrices[1][x] = new Array(h); - for (let y = 0; y < h; y++) { - let el = { - h: 360 * (x / w), - s: "100%", - l: "50%", - }; - matrices[0][x][y] = el; - matrices[1][x][y] = el; - } -} - -// draw initial canvas, from here on out only individual rectangles will be -// filled as they get updated. 
-for (let x = 0; x < w; x++) { - for (let y = 0; y < h; y++) { - let el = matrices[0][x][y]; - ctx.fillStyle = `hsl(${el.h}, ${el.s}, ${el.l})`; - ctx.fillRect(x * rectSize, y * rectSize, rectSize, rectSize); - } -} - - -let requestAnimationFrame = - window.requestAnimationFrame || - window.mozRequestAnimationFrame || - window.webkitRequestAnimationFrame || - window.msRequestAnimationFrame; - -let neighbors = [ - [-1, -1], [0, -1], [1, -1], - [-1, 0], [1, 0], - [-1, 1], [0, 1], [1, 1], -]; - -function randNeighborAsteroid(matrix, x, y) { - let neighborCoord = neighbors[randn(neighbors.length)]; - let neighborX = x+neighborCoord[0]; - let neighborY = y+neighborCoord[1]; - neighborX = (neighborX + w) % w; - neighborY = (neighborY + h) % h; - return matrix[neighborX][neighborY]; -} - -function randNeighbor(matrix, x, y) { - while (true) { - let neighborCoord = neighbors[randn(neighbors.length)]; - let neighborX = x+neighborCoord[0]; - let neighborY = y+neighborCoord[1]; - if (neighborX < 0 || neighborX >= w || neighborY < 0 || neighborY >= h) { - continue; - } - return matrix[neighborX][neighborY]; - } -} - -let drift = 10; -function genChildH(elA, elB) { - // set the two h values, h1 <= h2 - let h1 = elA.h; - let h2 = elB.h; - if (h1 > h2) { - h1 = elB.h; - h2 = elA.h; - } - - // diff must be between 0 (inclusive) and 360 (exclusive). If it's greater - // than 180 then it's not the shortest path around, that must be the other - // way around the circle. 
- let hChild; - let diff = h2 - h1; - if (diff > 180) { - diff = 360 - diff; - hChild = h2 + (diff / 2); - } else { - hChild = h1 + (diff / 2); - } - - hChild += (Math.random() * drift * 2) - drift; - hChild = (hChild + 360) % 360; - return hChild; -} - -let tick = 0; -function doTick() { - tick++; - let currI = tick % 2; - let curr = matrices[currI]; - let lastI = (tick - 1) % 2; - let last = matrices[lastI]; - - for (let i = 0; i < (w * h / 2); i++) { - let x = randn(w); - let y = randn(h); - if (curr[x][y].lastTick == tick) continue; - - let neighbor = randNeighborAsteroid(last, x, y); - curr[x][y].h = genChildH(curr[x][y], neighbor); - curr[x][y].lastTick = tick; - ctx.fillStyle = `hsl(${curr[x][y].h}, ${curr[x][y].s}, ${curr[x][y].l})`; - ctx.fillRect(x * rectSize, y * rectSize, rectSize, rectSize); - } - - matrices[currI] = curr; - requestAnimationFrame(doTick); -} - -requestAnimationFrame(doTick); - -</script> diff --git a/_posts/2020-11-16-component-oriented-programming.md b/_posts/2020-11-16-component-oriented-programming.md deleted file mode 100644 index 3400090..0000000 --- a/_posts/2020-11-16-component-oriented-programming.md +++ /dev/null @@ -1,352 +0,0 @@ ---- -title: >- - Component-Oriented Programming -description: >- - A concise description of. ---- - -[A previous post in this -blog](/2019/08/02/program-structure-and-composability.html) focused on a -framework developed to make designing component-based programs easier. In -retrospect, the proposed pattern/framework was over-engineered. This post -attempts to present the same ideas in a more distilled form, as a simple -programming pattern and without the unnecessary framework. - -## Components - -Many languages, libraries, and patterns make use of a concept called a -"component," but in each case the meaning of "component" might be slightly -different. Therefore, to begin talking about components, it is necessary to first -describe what is meant by "component" in this post. 
- -For the purposes of this post, the properties of components include the -following. - - 1... **Abstract**: A component is an interface consisting of one or more -methods. - - 1a... A function might be considered a single-method component -_if_ the language supports first-class functions. - - 1b... A component, being an interface, may have one or more -implementations. Generally, there will be a primary implementation, which is -used during a program's runtime, and secondary "mock" implementations, which are -only used when testing other components. - - 2... **Instantiatable**: An instance of a component, given some set of -parameters, can be instantiated as a standalone entity. More than one of the -same component can be instantiated, as needed. - - 3... **Composable**: A component may be used as a parameter of another -component's instantiation. This would make it a child component of the one being -instantiated (the parent). - - 4... **Pure**: A component may not use mutable global variables (i.e., -singletons) or impure global functions (e.g., system calls). It may only use -constants and variables/components given to it during instantiation. - - 5... **Ephemeral**: A component may have a specific method used to clean -up all resources that it's holding (e.g., network connections, file handles, -language-specific lightweight threads, etc.). - - 5a... This cleanup method should _not_ clean up any child -components given as instantiation parameters. - - 5b... This cleanup method should not return until the -component's cleanup is complete. - - 5c... A component should not be cleaned up until all its -parent components are cleaned up. - -Components are composed together to create component-oriented programs. This is -done by passing components as parameters to other components during -instantiation. The `main` procedure of the program is responsible for -instantiating and composing the components of the program. - -## Example - -It's easier to show than to tell. 
This section posits a simple program and then -describes how it would be implemented in a component-oriented way. The program -chooses a random number and exposes an HTTP interface that allows users to try -and guess that number. The following are requirements of the program: - -* A guess consists of a name that identifies the user performing the guess and - the number that is being guessed; - -* A score is kept for each user who has performed a guess; - -* Upon an incorrect guess, the user should be informed of whether they guessed - too high or too low, and 1 point should be deducted from their score; - -* Upon a correct guess, the program should pick a new random number against - which to check subsequent guesses, and 1000 points should be added to the - user's score; - -* The HTTP interface should have two endpoints: one for users to submit guesses, - and another that lists out user scores from highest to lowest; - -* Scores should be saved to disk so they survive program restarts. - -It seems clear that there will be two major areas of functionality for our -program: score-keeping and user interaction via HTTP. Each of these can be -encapsulated into components called `scoreboard` and `httpHandlers`, -respectively. - -`scoreboard` will need to interact with a filesystem component to save/restore -scores (because it can't use system calls directly; see property 4). It would be -wasteful for `scoreboard` to save the scores to disk on every score update, so -instead it will do so every 5 seconds. A time component will be required to -support this. - -`httpHandlers` will be choosing the random number which is being guessed, and -will therefore need a component that produces random numbers. `httpHandlers` -will also be recording score changes to `scoreboard`, so it will need access to -`scoreboard`. 
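As a rough illustration of the composition described above, the following is a minimal sketch and not the post's linked implementation. The names `RandSrc`, `Scoreboard`, `memScoreboard`, `fixedRand`, and `Guesser` are hypothetical, and the HTTP layer, filesystem, and timer are left out so that only the parent/child wiring is shown.

```go
package main

import "fmt"

// RandSrc is a hypothetical component that produces random numbers.
type RandSrc interface {
	Int() int
}

// Scoreboard is a hypothetical component that records score changes.
type Scoreboard interface {
	Add(user string, points int)
	Score(user string) int
}

// memScoreboard is an in-memory Scoreboard; persistence is omitted here.
type memScoreboard struct{ scores map[string]int }

func NewMemScoreboard() *memScoreboard {
	return &memScoreboard{scores: map[string]int{}}
}

func (s *memScoreboard) Add(user string, points int) { s.scores[user] += points }
func (s *memScoreboard) Score(user string) int       { return s.scores[user] }

// fixedRand is a mock RandSrc that always returns the same number.
type fixedRand int

func (f fixedRand) Int() int { return int(f) }

// Guesser holds the guess-checking logic. Its child components are given to
// it as instantiation parameters rather than reached for globally.
type Guesser struct {
	rand       RandSrc
	scoreboard Scoreboard
	target     int
}

func NewGuesser(r RandSrc, sb Scoreboard) *Guesser {
	return &Guesser{rand: r, scoreboard: sb, target: r.Int()}
}

// Guess checks n against the current target and updates the user's score.
func (g *Guesser) Guess(user string, n int) string {
	switch {
	case n == g.target:
		g.scoreboard.Add(user, 1000)
		g.target = g.rand.Int() // pick a new number for subsequent guesses
		return "correct"
	case n > g.target:
		g.scoreboard.Add(user, -1)
		return "too high"
	default:
		g.scoreboard.Add(user, -1)
		return "too low"
	}
}

func main() {
	// main instantiates the leaves first, then the component that uses them.
	sb := NewMemScoreboard()
	g := NewGuesser(fixedRand(42), sb)

	fmt.Println(g.Guess("alice", 50))
	fmt.Println(g.Guess("alice", 42))
	fmt.Println(sb.Score("alice"))
}
```

Because `Guesser` only sees interfaces, swapping `fixedRand` for a real random source, or `memScoreboard` for one backed by a file, would not require changing `Guesser` itself.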
- -The example implementation will be written in go, which makes differentiating -HTTP handler functionality from the actual HTTP server quite easy; thus, there -will be an `httpServer` component that uses `httpHandlers`. - -Finally, a `logger` component will be used in various places to log useful -information during runtime. - -[The example implementation can be found -here.](/assets/component-oriented-design/v1/main.html) While most of it can be -skimmed, it is recommended to at least read through the `main` function to see -how components are composed together. Note that `main` is where all components -are instantiated, and that all components take in their child components as -part of their instantiation. - -## DAG - -One way to look at a component-oriented program is as a directed acyclic graph -(DAG), where each node in the graph represents a component, and each edge -indicates that one component depends upon another component for instantiation. -For the previous program, it's quite easy to construct such a DAG just by -looking at `main`, as in the following: - -``` -net.Listener rand.Rand os.File - ^ ^ ^ - | | | - httpServer --> httpHandlers --> scoreboard --> time.Ticker - | | | - +---------------+---------------+--> log.Logger -``` - -Note that all the leaves of the DAG (i.e., nodes with no children) describe the -points where the program meets the operating system via system calls. The leaves -are, in essence, the program's interface with the outside world. - -While it's not necessary to actually draw out the DAG for every program one -writes, it can be helpful to at least think about the program's structure in -these terms. - -## Benefits - -Looking at the previous example implementation, one would be forgiven for having -the immediate reaction of "This seems like a lot of extra work for little gain. -Why can't I just make the system calls where I need to, and not bother with -wrapping them in interfaces and all these other rules?"
- -The following sections will answer that concern by showing the benefits gained -by following a component-oriented pattern. - -### Testing - -Testing is important, that much is being assumed. - -A distinction to be made with testing is between unit and non-unit tests. Unit -tests are those for which there are no requirements for the environment outside -the test, such as the existence of global variables, running databases, -filesystems, or network services. Unit tests do not interact with the world -outside the testing procedure, but instead use mocks in place of the -functionality that would be expected by that world. - -Unit tests are important because they are faster to run and more consistent than -non-unit tests. Unit tests also force the programmer to consider different -possible states of a component's dependencies during the mocking process. - -Unit tests are often not employed by programmers, because they are difficult to -implement for code that does not expose any way to swap out dependencies for -mocks of those dependencies. The primary culprit of this difficulty is the -direct usage of singletons and impure global functions. For component-oriented -programs, all components inherently allow for the swapping out of any -dependencies via their instantiation parameters, so there's no extra effort -needed to support unit tests. - -[Tests for the example implementation can be found -here.](/assets/component-oriented-design/v1/main_test.html) Note that all -dependencies of each component being tested are mocked/stubbed next to them. - -### Configuration - -Practically all programs require some level of runtime configuration. This may -take the form of command-line arguments, environment variables, configuration -files, etc. - -For a component-oriented program, all components are instantiated in the same -place, `main`, so it's very easy to expose any arbitrary parameter to the user -via configuration. 
For any component that is affected by a configurable -parameter, that component merely needs to take an instantiation parameter for -that configurable parameter; `main` can connect the two together. This accounts -for the unit testing of a component with different configurations, while still -allowing for the configuration of any arbitrary internal functionality. - -For more complex configuration systems, it is also possible to implement a -`configuration` component that wraps whatever configuration-related -functionality is needed, which other components use as a sub-component. The -effect is the same. - -To demonstrate how configuration works in a component-oriented program, the -example program's requirements will be augmented to include the following: - -* The point change values for both correct and incorrect guesses (currently - hardcoded at 1000 and 1, respectively) should be configurable on the - command-line; - -* The save file's path, HTTP listen address, and save interval should all be - configurable on the command-line. - -[The new implementation, with newly configurable parameters, can be found -here.](/assets/component-oriented-design/v2/main.html) Most of the program has -remained the same, and all unit tests from before remain valid. The primary -difference is that `scoreboard` takes in two new parameters for the point change -values, and configuration is set up inside `main` using the `flag` package. - -### Setup/Runtime/Cleanup - -A program can be split into three stages: setup, runtime, and cleanup. Setup is -the stage during which the internal state is assembled to make runtime possible. -Runtime is the stage during which a program's actual function is being -performed. Cleanup is the stage during which the runtime stops and internal -state is disassembled. - -A graceful (i.e., reliably correct) setup is quite natural to accomplish for -most.
On the other hand, a graceful cleanup is, unfortunately, not a programmer's -first concern (if it is a concern at all). - -When building reliable and correct programs, a graceful cleanup is as important -as a graceful setup and runtime. A program is still running while it is being -cleaned up, and it's possibly still acting on the outside world. Shouldn't -it behave correctly during that time? - -Achieving a graceful setup and cleanup with components is quite simple. - -During setup, a single-threaded procedure (`main`) first constructs the leaf -components, then the components that take those leaves as parameters, then the -components that take _those_ as parameters, and so on, until the component DAG -is fully constructed. - -At this point, the program's runtime has begun. - -Once the runtime is over, signified by a process signal or some other mechanism, -it's only necessary to call each component's cleanup method (if any; see -property 5) in the reverse of the order in which the components were -instantiated. This order is inherently deterministic, as the components were -instantiated by a single-threaded procedure. - -Inherent to this pattern is the fact that each component will certainly be -cleaned up before any of its child components, as its child components must have -been instantiated first, and a component will not clean up child components -given as parameters (properties 5a and 5c). Therefore, the pattern avoids -use-after-cleanup situations. - -To demonstrate a graceful cleanup in a component-oriented program, the example -program's requirements will be augmented to include the following: - -* The program will terminate itself upon an interrupt signal; - -* During termination (cleanup), the program will save the latest set of scores - to disk one final time. 
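The reverse-order cleanup described above could be sketched roughly as follows. This is only an illustration under assumptions: the `Cleaner` interface, the toy `component` type, and the `cleanupAll` helper are hypothetical names, and signal handling is omitted so that only the ordering is shown.

```go
package main

import "fmt"

// Cleaner is a hypothetical interface for components that have a cleanup
// method (property 5).
type Cleaner interface {
	Cleanup() error
}

// component is a toy component that records the order in which cleanups run.
type component struct {
	name string
	log  *[]string
}

func (c *component) Cleanup() error {
	*c.log = append(*c.log, c.name)
	return nil
}

// cleanupAll calls Cleanup on each component in the reverse of the order in
// which they were instantiated. It keeps going after an error and returns the
// first error encountered, if any.
func cleanupAll(instantiated []Cleaner) error {
	var firstErr error
	for i := len(instantiated) - 1; i >= 0; i-- {
		if err := instantiated[i].Cleanup(); err != nil && firstErr == nil {
			firstErr = err
		}
	}
	return firstErr
}

func main() {
	var order []string

	// Setup: leaves first, then the components which depend on them.
	logger := &component{"logger", &order}
	scoreboard := &component{"scoreboard", &order}
	httpServer := &component{"httpServer", &order}
	instantiated := []Cleaner{logger, scoreboard, httpServer}

	// Cleanup: parents are cleaned up before their children.
	if err := cleanupAll(instantiated); err != nil {
		fmt.Println("cleanup error:", err)
	}
	fmt.Println(order)
}
```

Since the slice is appended to in instantiation order by a single-threaded `main`, walking it backwards is enough to guarantee that no component is cleaned up while a parent might still be using it.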
- -[The new implementation that accounts for these new requirements can be found -here.](/assets/component-oriented-design/v3/main.html) For this example, go's -`defer` feature could have been used instead, which would have been even -cleaner, but was omitted for the sake of those using other languages. - - -## Conclusion - -The component pattern helps make programs more reliable with only a small amount -of extra effort incurred. In fact, most of the pattern has to do with -establishing sensible abstractions around global functionality and remembering -certain idioms for how those abstractions should be composed together, something -most of us already do to some extent anyway. - -While beneficial in many ways, component-oriented programming is merely a tool -that can be applied in many cases. It is certain that there are cases where it -is not the right tool for the job, so apply it deliberately and intelligently. - -## Criticisms/Questions - -In lieu of a FAQ, I will attempt to premeditate questions and criticisms of the -component-oriented programming pattern laid out in this post. - -**This seems like a lot of extra work.** - -Building reliable programs is a lot of work, just as building a -reliable _anything_ is a lot of work. Many of us work in an industry that likes -to balance reliability (sometimes referred to by the more specious "quality") -with malleability and deliverability, which naturally leads to skepticism of any -suggestions requiring more time spent on reliability. This is not necessarily a -bad thing, it's just how the industry functions. - -All that said, a pattern need not be followed perfectly to be worthwhile, and -the amount of extra work incurred by it can be decided based on practical -considerations. I merely maintain that code which is (mostly) component-oriented -is easier to maintain in the long run, even if it might be harder to get off the -ground initially. 
- -**My language makes this difficult.** - -I don't know of any language which makes this pattern particularly easier than -others, so, unfortunately, we're all in the same boat to some extent (though I -recognize that some languages, or their ecosystems, make it more difficult than -others). It seems to me that this pattern shouldn't be unbearably difficult for -anyone to implement in any language either, however, as the only language -feature required is abstract typing. - -It would be nice to one day see a language that explicitly supports this -pattern by baking the component properties in as compiler-checked rules. - -**My `main` is too big** - -There's no law saying all component construction needs to happen in `main`, -that's just the most sensible place for it. If there are large sections of your -program that are independent of each other, then they could each have their own -construction functions that `main` then calls. - -Other questions that are worth asking include: Can my program be split up -into multiple programs? Can the responsibilities of any of my components be -refactored to reduce the overall complexity of the component DAG? Can the -instantiation of any components be moved within their parent's -instantiation function? - -(This last suggestion may seem to be disallowed, but is fine as long as the -parent's instantiation function remains pure.) - -**Won't this result in over-abstraction?** - -Abstraction is a necessary tool in a programmer's toolkit, there is simply no -way around it. The only questions are "how much?" and "where?" - -The use of this pattern does not affect how those questions are answered, in my -opinion, but instead aims to more clearly delineate the relationships and -interactions between the different abstracted types once they've been -established using other methods. Over-abstraction is possible and avoidable -regardless of which language, pattern, or framework is being used.
- -**Does CoP conflict with object-oriented or functional programming?** - -I don't think so. OoP languages will have abstract types as part of their core -feature-set; most difficulties are going to be with deliberately _not_ using -other features of an OoP language, and with imported libraries in the language -perhaps making life inconvenient by not following CoP (specifically regarding -cleanup and the use of singletons). - -For functional programming, it may well be that, depending on the language, CoP -is technically being used, as functional languages are already generally -antagonistic toward globals and impure functions, which is most of the battle. -If anything, the transition from functional to component-oriented programming -will generally be an organizational task. diff --git a/_posts/2021-01-01-new-year-new-resolution.md b/_posts/2021-01-01-new-year-new-resolution.md deleted file mode 100644 index 8e9edc7..0000000 --- a/_posts/2021-01-01-new-year-new-resolution.md +++ /dev/null @@ -1,50 +0,0 @@ ---- -title: >- - New Year, New Resolution -description: >- - This blog is about to get some action. ---- - -At this point I'm fairly well known amongst friends and family for my new year's -resolutions, to the point that earlier this month a friend of mine asked me -"What's it going to be this year?". In the past I've done things like no -chocolate, no fast food, no added sugar (see a theme?), and no social media. -They've all been of the "I won't do this" sort, because it's a lot easier to -stop doing something than to start doing something new. Doing something new -inherently means _also_ not doing something else; there's only so many hours in -the day, after all. - -## This Year - -This year I'm going to shake things up, I'm going to do something new. My -resolution is to have published 52 posts on this blog by Jan 1, 2022, 00:00 UTC. -Only one post per day can count towards the 52. A post must be "substantial" to -count towards the 52.
A non-substantial post would be something like the 100 -word essay about my weekend that I wrote in first grade, which went something -like "My weekend was really really really ('really' 96 more times) really really -boring". - -Other than that, it's pretty open-ended. - -## Why - -My hope is that I'll get more efficient at writing these things. Usually I take -a lot of time to craft a post, weeks in some cases. I really appreciate those of -you that have taken the time to read them, but to be frank the time commitment -just isn't worth it. With practice I can hopefully learn what exactly I have to -say that others are interested in, and then go back to spending a lot of time -crafting the things being said. - -Another part of this is going to be learning how to market myself properly, -something I've always been reticent to do. Our world is filled with people -shouting into the void of the internet, each with their own reasons for wanting -to be heard. Does it need another? Probably not. But here I am. I guess what I'm -really going to be doing is learning _why_ I want to do this; I know I want to -have others read what I write, but is it possible that that desire isn't -entirely selfish? Is it ok if it is? - -Once I'm comfortable with why I'm doing this it will, hopefully, be easier to -figure out a marketing avenue I feel comfortable with putting a lot of energy -towards. There must be at least _one_... - -So consider this #1, world. Only 51 to go. diff --git a/_posts/2021-01-09-ginger.md b/_posts/2021-01-09-ginger.md deleted file mode 100644 index 3a97d7f..0000000 --- a/_posts/2021-01-09-ginger.md +++ /dev/null @@ -1,352 +0,0 @@ ---- -title: >- - Ginger -description: >- - Yes, it does exist. ---- - -This post is about a programming language that's been bouncing around in my head -for a _long_ time. I've tried to actually implement the language three or more -times now, but every time I get stuck or run out of steam.
It doesn't help that -every time I try again the form of the language changes significantly. But all -throughout the name of the language has always been "Ginger". It's a good name. - -In the last few years the form of the language has somewhat solidified in my -head, so in lieu of actually working on it I'm going to talk about what it -currently looks like. - -## Abstract Syntax Lists - -_In the beginning_ there was assembly. Well, really in the beginning there were -punchcards, and probably something even more esoteric before that, but it was -all effectively the same thing: a list of commands the computer would execute -sequentially, with the ability to jump to odd places in the sequence depending -on conditions at runtime. For the purpose of this post, we'll call this class of -languages "abstract syntax list" (ASL) languages. - -Here's a hello world program in my favorite ASL language, brainfuck: - -``` -++++++++[>++++[>++>+++>+++>+<<<<-]>+>+>->>+[<]<-]>>.>---.+++++++..+++.>>.<-.<.++ -+.------.--------.>>+.>++. -``` - -(If you've never seen brainfuck, it's deliberately unintelligible. But it _is_ -an ASL, each character representing a single command, executed by the brainfuck -runtime from left to right.) - -ASLs did the job at the time, but luckily we've mostly moved on past them. - -## Abstract Syntax Trees - -Eventually programmers upgraded to C-like languages. Rather than a sequence of -commands, these languages were syntactically represented by an "abstract syntax -tree" (AST). Rather than executing commands in essentially the same order they -are written, an AST language compiler reads the syntax into a tree of syntax -nodes. What it then does with the tree is language dependent. - -Here's a program which outputs all numbers from 0 to 9 to stdout, written in -(slightly non-idiomatic) Go: - -```go -i := 0 -for { - if i == 10 { - break - } - fmt.Println(i) - i++ -} -``` - -When the Go compiler sees this, it's going to first parse the syntax into an -AST.
The AST might look something like this: - -``` -(root) - |-(:=) - | |-(i) - | |-(0) - | - |-(for) - |-(if) - | |-(==) - | | |-(i) - | | |-(10) - | | - | |-(break) - | - |-(fmt.Println) - | |-(i) - | - |-(++) - |-(i) -``` - -Each of the non-leaf nodes in the tree represents an operation, and the children -of the node represent the arguments to that operation, if any. From here the -compiler traverses the tree depth-first in order to turn each operation it finds -into the appropriate machine code. - -There's a sub-class of AST languages called the LISP ("LISt Processor") -languages. In a LISP language the AST is represented using lists of elements, -where the first element in each list denotes the operation and the rest of the -elements in the list (if any) represent the arguments. Traditionally each list -is represented using parentheses. For example `(+ 1 1)` represents adding 1 and -1 together. - -As a more complex example, here's how to print numbers 0 through 9 to stdout -using my favorite (and, honestly, only) LISP, Clojure: - -```clj -(doseq - [n (range 10)] - (println n)) -``` - -Much smaller, but the idea is there. In LISPs there is no differentiation -between the syntax, the AST, and the language's data structures; they are all -one and the same. For this reason LISPs generally have very powerful macro -support, wherein one uses code written in the language to transform code written -in that same language. With macros users can extend a language's functionality -to support nearly anything they need to, but because macro generation happens -_before_ compilation they can still reap the benefits of compiler optimizations. - -### AST Pitfalls - -The ASL (assembly) is essentially just a thin layer of human readability on top -of raw CPU instructions. It does nothing in the way of representing code in the -way that humans actually think about it (relationships of types, flow of data, -encapsulation of behavior).
The AST is a step towards expressing code in human -terms, but it isn't quite there in my opinion. Let me show why by revisiting the -Go example above: - -```go -i := 0 -for { - if i > 9 { - break - } - fmt.Println(i) - i++ -} -``` - -When I understand this code I don't understand it in terms of its syntax. I -understand it in terms of what it _does_. And what it does is this: - -* with a number starting at 0, start a loop. -* if the number is greater than 9, stop the loop. -* otherwise, print the number. -* add one to the number. -* go to start of loop. - -This behavior could be further abstracted into the original problem statement, -"it prints numbers 0 through 9 to stdout", but that's too general, as there -are different ways for that to be accomplished. The Clojure example first -defines a list of numbers 0 through 9 and then iterates over that, rather than -looping over a single number. These differences are important when understanding -what code is doing. - -So what's the problem? My problem with ASTs is that the syntax I've written down -does _not_ reflect the structure of the code or the flow of data which is in my -head. In the AST representation if you want to follow the flow of data (a single -number) you _have_ to understand the semantic meaning of `i` and `:=`; the AST -structure itself does not convey how data is being moved or modified. -Essentially, there's an extra implicit transformation that must be done to -understand the code in human terms. - -## Ginger: An Abstract Syntax Graph Language - -In my view the next step is towards using graphs rather than trees for -representing our code. A graph has the benefit of being able to reference -"backwards" into itself, where a tree cannot, and so can represent the flow of -data much more directly. - -I would like Ginger to be an ASG language where the language is the graph, -similar to a LISP. But what does this look like exactly? 
Well, I have a good -idea about what the graph _structure_ will be like and how it will function, but -the syntax is something I haven't bothered much with yet. Representing graph -structures in a text file is a problem to be tackled all on its own. For this -post we'll use a made-up, overly verbose, and probably non-usable syntax, but -hopefully it will convey the graph structure well enough. - -### Nodes, Edges, and Tuples - -All graphs have nodes, where each node contains a value. A single unique value -can only have a single node in a graph. Nodes are connected by edges, where -edges have a direction and can contain a value themselves. - -In the context of Ginger, a node represents a value as expected, and the value -on an edge represents an operation to take on that value. For example: - -``` -5 -incr-> n -``` - -`5` and `n` are both nodes in the graph, with an edge going from `5` to `n` that -has the value `incr`. When it comes time to interpret the graph we say that the -value of `n` can be calculated by giving `5` as the input to the operation -`incr` (increment). In other words, the value of `n` is `6`. - -What about operations which have more than one input value? For this Ginger -introduces the tuple to its graph type. A tuple is like a node, except that it's -anonymous, which allows more than one to exist within the same graph, as they do -not share the same value. For the purposes of this blog post we'll represent -tuples like this: - -``` -1 -> } -add-> t -2 -> } -``` - -`t`'s value is the result of passing a tuple of two values, `1` and `2`, as -inputs to the operation `add`. In other words, the value of `t` is `3`. - -For the syntax being described in this post we allow that a single contiguous -graph can be represented as multiple related sections. This can be done because -each node's value is unique, so when the same value is used in disparate -sections we can merge the two sections on that value. 
For example, the following -two graphs are exactly equivalent (note the parentheses wrapping the graph which -has been split): - -``` -1 -> } -add-> t -incr-> tt -2 -> } -``` - -``` -( - 1 -> } -add-> t - 2 -> } - - t -incr-> tt -) -``` - -(`tt` is `4` in both cases.) - -A tuple with only one input edge, a 1-tuple, is a no-op, semantically, but can -be useful structurally to chain multiple operations together without defining -new value names. In the above example the `t` value can be eliminated using a -1-tuple. - -``` -1 -> } -add-> } -incr-> tt -2 -> } -``` - -When an integer is used as an operation on a tuple value then the effect is to -output the value in the tuple at that index. For example: - -``` -1 -> } -0-> } -incr-> t -2 -> } -``` - -(`t` is `2`.) - -### Operations - -When a value sits on an edge it is used as an operation on the input of that -edge. Some operations will no doubt be builtin, like `add`, but users should be -able to define their own operations. This can be done using the `in` and `out` -special values. When a graph is used as an operation it is scanned for both `in` -and `out` values. `in` is set to the input value of the operation, and the value -of `out` is used as the output of the operation. - -Here we will define the `incr` operation and then use it. Note that we set the -`incr` value to be an entire sub-graph which represents the operation's body. - -``` -( in -> } -add-> out - 1 -> } ) -> incr - -5 -incr-> n -``` - -(`n` is `6`.) - -The output of an operation may itself be a tuple. Here's an implementation and -usage of `double-incr`, which increments two values at once. - -``` -( in -0-> } -incr-> } -> out - } - in -1-> } -incr-> } ) -> double-incr - -1 -> } -double-incr-> t -add-> tt -2 -> } -``` - -(`t` is a 2-tuple with values `2` and `3`; `tt` is `5`.) - -### Conditionals - -The conditional is a bit weird, and I'm not totally settled on it yet. For now -we'll use this.
The `if` operation expects as an input a 2-tuple whose first -value is a boolean and whose second value will be passed along. The `if` -operation is special in that it has _two_ output edges. The first will be taken -if the boolean is true, the second if the boolean is false. The second value in -the input tuple, the one to be passed along, is used as the input to whichever -branch is taken. - -Here is an implementation and usage of `max`, which takes two numbers and -outputs the greater of the two. Note that the `if` operation has two output -edges, but our syntax doesn't represent that very cleanly. - -``` -( in -gt-> } -if-> } -0-> out - in -> } -> } -1-> out ) -> max - -1 -> } -max-> t -2 -> } -``` - -(`t` is `2`.) - -It would be simple enough to create a `switch` macro on top of `if`, to allow -for multiple conditionals to be tested at once. - -### Loops - -Loops are tricky, and I have two thoughts about how they might be accomplished. -One is to literally draw an edge from the right end of the graph back to the -left, at the point where the loop should occur, as that's conceptually what's -happening. But representing that in a text file is difficult. For now I'll -introduce the special `recur` value, and leave this whole section as TBD. - -`recur` is a cousin of `in` and `out`, in that it's a special value and not an -operation. It takes whatever value it's set to and calls the current operation -with that as input. As an example, here is our now classic 0 through 9 printer -(assume `println` outputs whatever it was input): - -``` -// incr-1 is an operation which takes a 2-tuple and returns the same 2-tuple -// with the first element incremented. -( in -0-> } -incr-> } -> out - in -1-> } ) -> incr-1 - -( in -eq-> } -if-> out - in -> } -> } -0-> } -println-> } -incr-1-> } -> recur ) -> print-range - -0 -> } -print-range-> } -10 -> } -``` - -## Next Steps - -This post is long enough, and I think gives at least a basic idea of what I'm -going for.
The syntax presented here is _extremely_ rudimentary, and is almost -definitely not what any final version of the syntax would look like. But the -general idea behind the structure is sound, I think. - -I have a lot of further ideas for Ginger I haven't presented here. Hopefully as -time goes on and I work on the language more some of those ideas can start -taking a more concrete shape and I can write about them. - -The next thing I need to do for Ginger is to implement (again) the graph type -for it, since the last one I implemented didn't include tuples. Maybe I can -extend it instead of re-writing it. After that it will be time to really buckle -down and figure out a syntax. Once a syntax is established then it's time to -start on the compiler! diff --git a/_posts/2021-01-14-the-web.md b/_posts/2021-01-14-the-web.md deleted file mode 100644 index 4d47a57..0000000 --- a/_posts/2021-01-14-the-web.md +++ /dev/null @@ -1,239 +0,0 @@ ---- -title: >- - The Web -description: >- - What is it good for? ---- - -With the recent crisis in the US's democratic process, there's been much abuzz -in the world about social media's undoubted role in the whole debacle. The -extent to which the algorithms of Facebook, Twitter, Youtube, TikTok, etc, have -played a role in the radicalization of large segments of the world's population -is one popular topic. Another is the tactics those same companies are now -employing to try and euthanize the monster they made so much ad money in -creating. - -I don't want to talk about any of that; there is more to the web than -social media. I want to talk about what the web could be, and to do that I want -to first talk about what it has been. - -## Web 1.0 - -In the 1950's computers were generally owned by large organizations like -companies, universities, and governments. They were used to compute and manage -large amounts of data, and each existed independently of the other. 
- -In the 60's protocols began to be developed which would allow them to -communicate over large distances, and thereby share resources (both -computational and informational). - -The funding of ARPANET by the US DoD led to the initial versions of the TCP/IP -protocol in the 70's, still used today as the backbone of virtually all internet -communication. Email also came about from ARPANET around this time. - -The 80s saw the growth of the internet across the world, as ARPANET gave way to -NSFNET. It was during this time that the domain name system we use today was -developed. At this point the internet use was still mostly for large -non-commercial organizations; there was little commercial footprint, and little -private access. The first commercially available ISP, which allowed access to -the internet from private homes via dialup, wasn't launched until 1989. - -And so we find ourselves in the year 1989, when Tim Berners-Lee (TBL) first -proposed the World-Wide Web (WWW, or "the web"). You can find the original -proposal, which is surprisingly short and non-technical, -[here](https://www.w3.org/Proposal.html). - -From reading TBL's proposal it's clear that what he was after was some mechanism -for hosting information on his machine in such a way that others could find and -view it, without it needing to be explicitly sent to them. He includes the -following under the "Applications" header: - -> The application of a universal hypertext system, once in place, will cover -> many areas such as document registration, on-line help, project documentation, -> news schemes and so on. - -But out of such a humble scope grew one of the most powerful forces of the 21st -century. By the end of 1990 TBL had written the first HTML/HTTP browser and -server. By the end of 1994 sites like IMDB, Yahoo, and Bianca's Smut Shack were -live and being accessed by consumers. The web grew that fast. 
- -In my view the characteristic of the web which catalyzed its adoption so quickly -was the place-ness of it. The web is not just a protocol for transferring -information, like email, but instead is a _place_ where that information lives. -Any one place could be freely linked to any other place, and so complex and -interesting relations could be formed between people and ideas. The -contributions people make on the web can reverberate farther than they would or -could in any other medium precisely because those contributions aren't tied to -some one-off event or a deteriorating piece of physical infrastructure, but are -instead given a home which is both permanent and everywhere. - -The other advantage of the web, at the time, was its simplicity. HTML was so -simple it was basically human-readable. A basic HTTP server could be implemented -as a hobby project by anyone in any language. Hosting your own website was a -relatively straightforward task which anyone with a computer and an ISP could -undertake. - -This was the environment early adopters of the web found themselves in. - -## Web 2.0 - -The infamous dot-com crash took place around 2001. I don't believe this was a failure -inherent in the principles of the web itself, but instead was a product of -people investing in a technology they didn't fully understand. The web, as it -was then, wasn't really designed with money-making in mind. It certainly allowed -for it, but that wasn't the use-case being addressed. - -But of course, in this world we live in, if there's money to be made, it will -certainly be made. - -By 2003 the phrase "Web 2.0" started popping up. I remember this. To me "Web -2.0" meant a new aesthetic on the web, complete with bubble buttons and centered -fixed-width paragraph boxes. But what "Web 2.0" actually signified wasn't related -to any new technology or aesthetic. It was a new strategy for how companies -could enable use of the web by non-expert users, i.e. 
users who don't have the -inclination or means to host their own website. Web 2.0 was a strategy for -giving everyone a _place_ of their own on the web. - -"Web 2.0" was merely a label given to a movement which had already been in -motion for years. I think the following Wikipedia excerpt describes this period -best: - - -> In 2004, the term ["Web 2.0"] began its rise in popularity when O'Reilly Media -and MediaLive hosted the first Web 2.0 conference. In their opening remarks, -John Battelle and Tim O'Reilly outlined their definition of the "Web as -Platform", where software applications are built upon the Web as opposed to upon -the desktop. The unique aspect of this migration, they argued, is that -"customers are building your business for you". They argued that the -activities of users generating content (in the form of ideas, text, videos, or -pictures) could be "harnessed" to create value. - - -In other words, Web 2.0 turned the place-ness of the web into a commodity. -Rather than expect everyone to host, or arrange for the hosting, of their own -corner of the web, the technologists would do it for them for "free"! This -coincided with the increasing complexity of the underlying technology of the -web; websites grew to be flashy, interactive, and stateful applications which -_did_ things rather than be places which _held_ things. The idea of a hyperlink, -upon which the success of the web had been founded, became merely an -implementation detail. - -And so the walled gardens began to be built. Myspace was founded in 2003, -Facebook opened to the public in 2006, Digg (the precursor to reddit) was -launched in 2004, Flickr launched in 2004 (and was bought by Yahoo in 2005), -Google bought Blogger in 2003, and Twitter launched in 2006. In effect this -period both opened the web up to everyone and established the way we still use -it today. - -It's upon these foundations that current events unfold. 
We have platforms whose -only incentive is towards capturing new users and holding their attention, to -the exclusion of other platforms, so they can be advertised to. Users are -enticed in because they are being offered a place on the web, a place of their -own to express themselves from, in order to find out the worth of their -expressions to the rest of the world. But they aren't expressing to the world at -large, they are expressing to a social media platform, a business, and so only -the most lucrative of voices are heard. - -So much for not wanting to talk about social media. - -## Web 3.0 - -The new hot topic in crypto and hacker circles is "Web 3.0", or the -decentralized web (dweb). The idea is that we can have all the good of the -current web (the accessibility, utility, permanency, etc) without all the bad -(the centralized platforms, censorship, advertising, etc). The way forward to -this utopian dream is by building decentralized applications (dApps). - -dApps are constructed in a way where all the users of an application help to -host all the stateful content of that application. If I, as a user, post an -image to a dApp, the idea is that other users of that same dApp would lend their -meager computer resources to ensure my image is never forgotten, and in turn I -would lend mine for theirs. - -In practice building successful dApps is enormously difficult for many reasons, -and really I'm not sure there _are_ any successful ones (to date). While I -support the general sentiment behind them, I sometimes wonder about the -efficacy. What people want from the web is a place they can call their own, a -place from which they can express themselves and share their contributions with -others with all the speed and pervasiveness that the internet offers. A dApp is -just another walled garden with specific capabilities; it offers only free -hosting, not free expression. - -## Web 2.0b - -I'm not here solely to complain (just mostly). 
- -Thinking back to Web 1.0, and specifically to the turning point between 1.0 and -2.0, I'd like to propose that maybe we made a wrong turn. The issue at hand was -that hosting one's own site was still too much of a technical burden, and the -direction we went was towards having businesses host them for us. Perhaps there -was another way. - -What are the specific difficulties with hosting one's own site? Here are the -ones I can think of: - -* Bad tooling: basically none of the tools you're required to use (web server, - TLS, DNS, your home router) are designed for the average person. - -* Egregiously complex languages: making a site which looks half decent and can - do the things you want requires a _lot_ of knowledge about the underlying - languages (CSS, HTML, JavaScript, and whatever your server is written in). - -* Single point-of-failure: if your machine is off, your site is down. - -* Security: it's important to stay ahead of the hackers, but it takes time to - do so. - -* Hostile environment: this is separate from security, and includes difficulties - like dynamic home IPs and bad ISP policies (such as asymmetric upload/download - speeds). - -These are each separate avenues of attack. - -Bad tooling is a result of the fact that devs generally build technology for -themselves or their fellow devs, and only build for others when they're being -paid to do it. This is merely an attitude problem. - -Complex languages are really a sub-category of bad tooling. The consensus seems -to be that the average person isn't interested in or capable of working in -HTML/CSS/JS. This may be true today, but it wasn't always. Most of my friends in -middle and high school were well within their interest and capability to create -the most heinous MySpace pages the world has ever seen, using nothing but CSS -generators and scraps of shitty JS they found lying around. So what changed? The -tools we use to build those pages did. 
- -A hostile environment is not something any individual can do anything about, but -in the capitalist system we exist in we can at least hold in faith the idea that -eventually us customers will get what we want. It may take a long time, but all -monopolies break eventually, and someone will eventually sell us the internet -access we're asking for. If all other pieces are in place I think we'll have -enough people asking to make a difference. - -For single point-of-failure we have to grant that more than one person will be -involved, since the vast majority of people aren't going to be able to keep one -machine online consistently, let alone two or more machines. But I think we all -know at least one person who could keep a machine online with some reliability, -and they probably know a couple of other people who could do so as well. What -I'm proposing is that, rather than building tools for global decentralization, -we need tools for local decentralization, aka federation. We can make it -possible for a group of people to have their presence managed by a subset of -themselves. Those with the ability could help to host the online presence of -their family, friends, churches, etc, if given the right tools. - -Security is the hard one, but also in many ways isn't. What most people want -from the web is a place from which to express themselves. Expression doesn't -take much more than a static page, usually, and there's not much attacking one -can do against a static page. Additionally, we've already established that -there's going to be at least a _couple_ of technically minded people involved in -hosting this thing. - -So that's my idea that I'd like to build towards. First among these ideas is -that we need tools which can help people help each other host their content, and -on top of that foundation a new web can be built which values honest expression -rather than the lucrative madness which our current algorithms love so much. 
- -This project was already somewhat started by -[Cryptorado](https://github.com/Cryptorado-Community/Cryptorado-Node) while I -was a regular attendee, but since COVID started my attendance has fallen off. -Hopefully one day it can resume. In the meantime I'm going to be working on -setting up these tools for myself, and see how far I can get.