17 files changed, 1136 insertions, 0 deletions
diff --git a/_posts/2014-10-29-erlang-pitfalls.md b/_posts/2014-10-29-erlang-pitfalls.md
new file mode 100644
index 0000000..910db79
--- /dev/null
+++ b/_posts/2014-10-29-erlang-pitfalls.md
@@ -0,0 +1,190 @@
+---
+layout: post
+title: Erlang Pitfalls
+---
+
+I've been involved with a large-ish scale erlang project at Grooveshark since
+sometime around 2011. I started this project knowing absolutely nothing about
+erlang, but now I feel I have accumulated enough knowledge over time that I could
+conceivably give some back. Specifically, common pitfalls that people may run
+into when designing and writing a large-scale erlang application. Some of these
+may show up when searching for them, but some of them you may not even know you
+need to search for.
+
+## now() vs timestamp()
+
+The canonical way of getting the current timestamp in erlang is to use
+`erlang:now()`. This works great at small loads, but if you find your
+application slowing down greatly at highly parallel loads and you're calling
+`erlang:now()` a lot, it may be the culprit.
+
+A property of this method you may not realize is that it is strictly
+increasing, meaning even if two processes call it at the *exact* same time they
+will both receive different output. This is done through some locking on the
+low-level, as well as a bit of math to balance out the time getting out of sync
+in that scenario.
+
+There are situations where fetching always unique timestamps is useful, such as
+seeding RNGs and generating unique identifiers for things, but usually when
+people fetch a timestamp they just want a timestamp. For these cases,
+`os:timestamp()` can be used. It is not blocked by any locks, it simply returns
+the time.
+
+## The rpc module is slow
+
+The built-in `rpc` module is slower than you'd think. This mostly stems from it
+doing a lot of extra work for every `call` and `cast` that you do, ensuring that
+certain conditions are accounted for. If, however, it's sufficient for the
+calling side to know that a call timed-out on them and not worry about it any
+further you may benefit from simply writing your own rpc module. Alternatively,
+use [one which already exists](https://github.com/cloudant/rexi).
+
+## Don't send anonymous functions between nodes
+
+One of erlang's niceties is transparent message sending between two physical
+erlang nodes. Once nodes are connected, a process on one can send any message to
+a process on the other exactly as if they existed on the same node. This is fine
+for many data-types, but for anonymous functions it should be avoided.
+
+For example:
+
+```erlang
+RemotePid ! {fn, fun(I) -> I + 1 end}.
+```
+
+Would be better written as
+
+```erlang
+incr(I) ->
+    I + 1.
+
+RemotePid ! {fn, ?MODULE, incr}.
+```
+
+and then using `apply` in the process behind RemotePid to actually execute the
+function.
+
+This is because hot-swapping code messes with anonymous functions quite a bit.
+Erlang isn't actually sending a function definition across the wire; it's simply
+sending a reference to a function. If you've changed the code within the
+anonymous function on a node, that reference changes. The sending node is
+sending a reference to a function which may not exist anymore on the receiving
+node, and you'll get a weird error which Google doesn't return many results for.
+
+Alternatively, if you simply send atoms across the wire and use `apply` on the
+other side, only atoms are sent and the two nodes involved can have totally
+different ideas of what the function itself does without any problems.
+
+## Hot-swapping code is a convenience, not a crutch
+
+Hot swapping code is the bees-knees. It lets you not have to worry about
+rolling-restarts for trivial code changes, and so adds stability to your
+cluster. My warning is that you should not rely on it. If your cluster can't
+survive a node being restarted for a code change, then it can't survive if that
+node fails completely, or fails and comes back up. Design your system pretending
+that hot-swapping does not exist, and only once you've done that allow yourself
+to use it.
+
+## GC sometimes needs a boost
+
+Erlang garbage collection (GC) acts on a per-erlang-process basis, meaning that
+each process decides on its own to garbage collect itself. This is nice because
+it means stop-the-world isn't a problem, but it does have some interesting
+effects.
+
+We had a problem with our node memory graphs looking like an upwards facing
+line, instead of a nice sinusoid relative to the number of connections during
+the day. We couldn't find a memory leak *anywhere*, and so started profiling. We
+found that the memory seemed to be comprised of mostly binary data in process
+heaps. On a hunch my coworker Mike Cugini (who gets all the credit for this) ran
+the following on a node:
+
+```erlang
+lists:foreach(fun erlang:garbage_collect/1, erlang:processes()).
+```
+
+and saw memory drop in a huge way. We made that code run every 10 minutes or so
+and suddenly our memory problem went away.
+
+The problem is that we had a lot of processes which individually didn't have
+much heap data, but all-together were crushing the box. Each didn't think it had
+enough to garbage collect very often, so memory just kept going up. Calling the
+above forces all processes to garbage collect, and thus throw away all those
+little binary bits they were hoarding.
+
+## These aren't the solutions you are looking for
+
+The `erl` process has tons of command-line options which allow you to tweak all
+kinds of knobs. We've had tons of performance problems with our application, and
+as of yet not a single one has been solved by turning one of these knobs. They've
+all been design issues or just run-of-the-mill bugs. I'm not saying the knobs
+are *never* useful, but I haven't seen it yet.
+
+## Erlang processes are great, except when they're not
+
+The erlang model of allowing processes to manage global state works really well
+in many cases. Possibly even most cases. There are, however, times when it
+becomes a performance problem. This became apparent in the project I was working
+on for Grooveshark, which was, at its heart, a pubsub server.
+
+The architecture was very simple: each channel was managed by a process, client
+connection processes subscribed to that channel and received publishes from it.
+Easy right? The problem was that extremely high volume channels were simply not
+able to keep up with the load. The channel process could do certain things very
+fast, but there were some operations which simply took time and slowed
+everything down. For example, channels could have arbitrary properties set on
+them by their owners. Retrieving an arbitrary property from a channel was a
+fairly fast operation: client `call`s the channel process, channel process
+immediately responds with the property value. No blocking involved.
+
+But as soon as there was any kind of call which required the channel process to
+talk to yet *another* process (unfortunately necessary), things got hairy. On
+high volume channels publish/get/set operations would get massively backed up
+in the message queue while the process was blocked on another process. We tried
+many things, but ultimately gave up on the process-per-channel approach.
+
+We instead decided on keeping *all* channel state in a transactional database.
+When client processes "called" operations on a channel, they were really just
+acting on the database data inline, no message passing involved. This means that
+read-only operations are super-fast because there is minimal blocking, and if
+some random other process is being slow it only affects the one client making
+the call which is causing it to be slow, and not holding up a whole host of
+other clients.
+
+## Mnesia might not be what you want
+
+This one is probably a bit controversial, and definitely subject to use-cases.
+Do your own testing and profiling, find out what's right for you.
+
+Mnesia is erlang's solution for global state. It's an in-memory transactional
+database which can scale to N nodes and persist to disk. It is hosted
+directly in the erlang process's memory so you interact with it in erlang
+directly in your code; no calling out to database drivers and such. Sounds great
+right?
+
+Unfortunately mnesia is not a very full-featured database. It is essentially a
+key-value store which can hold arbitrary erlang data-types, albeit in a set
+schema which you lay out for it during startup. This means that more complex
+types like sorted sets and hash maps (although this was addressed with the
+introduction of the map data-type in R17) are difficult to work with within
+mnesia. Additionally, erlang's data model of immutability, while awesome
+usually, can bite you here because it's difficult (impossible?) to pull out
+chunks of data within a record without accessing the whole record.
+
+For example, when retrieving the list of processes subscribed to a channel our
+application doesn't simply pull the full list and iterate over it. This is too
+slow, and in some cases the subscriber list was so large it wasn't actually
+feasible. The channel process wasn't cleaning up its heap fast enough, so
+multiple publishes would end up with multiple copies of the giant list in
+memory. This became a problem. Instead we chain spawned processes, each of which
+pulls a set chunk of the subscriber list, and iterates over that. This is very
+difficult to implement in mnesia without pulling the full subscriber list into
+the process' memory at some point in the process.
+
+It is, however, fairly trivial to implement in redis using sorted sets. For this
+case, and many other cases after, the motto for performance improvements became
+"stick it in redis". The application is at the point where *all* state which
+isn't directly tied to a specific connection is kept in redis, encoded using
+`term_to_binary`. The performance hit of going to an outside process for data
+was actually much less than we'd originally thought, and ended up being a plus
+since we had much more freedom to do interesting hacks to speed up our
+accesses.
diff --git a/_posts/2015-07-15-go-http.md b/_posts/2015-07-15-go-http.md
new file mode 100644
index 0000000..29d7486
--- /dev/null
+++ b/_posts/2015-07-15-go-http.md
@@ -0,0 +1,545 @@
+---
+layout: post
+title: Go's http package by example
+---
+
+Go's [http](http://golang.org/pkg/net/http/) package has turned into one of my
+favorite things about the Go programming language. Initially it appears to be
+somewhat complex, but in reality it can be broken down into a couple of simple
+components that are extremely flexible in how they can be used. This guide will
+cover the basic ideas behind the http package, as well as examples in using,
+testing, and composing apps built with it.
+
+This guide assumes you have some basic knowledge of what an interface in Go is,
+and some idea of how HTTP works and what it can do.
+
+## Handler
+
+The building block of the entire http package is the `http.Handler` interface,
+which is defined as follows:
+
+```go
+type Handler interface {
+	ServeHTTP(ResponseWriter, *Request)
+}
+```
+
+Once implemented the `http.Handler` can be passed to `http.ListenAndServe`,
+which will call the `ServeHTTP` method on every incoming request.
+
+`http.Request` contains all relevant information about an incoming http request
+which is being served by your `http.Handler`.
+
+The `http.ResponseWriter` is the interface through which you can respond to the
+request. It implements the `io.Writer` interface, so you can use functions like
+`fmt.Fprintf` to write a formatted string as the response body, or ones like
+`io.Copy` to write out the contents of a file (or any other `io.Reader`). The
+response code can be set before you begin writing data using the `WriteHeader`
+method.
+
+Here's an example of an extremely simple http server:
+
+```go
+package main
+
+import (
+	"fmt"
+	"log"
+	"net/http"
+)
+
+type helloHandler struct{}
+
+func (h helloHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
+	fmt.Fprintf(w, "hello, you've hit %s\n", r.URL.Path)
+}
+
+func main() {
+	err := http.ListenAndServe(":9999", helloHandler{})
+	log.Fatal(err)
+}
+```
+
+`http.ListenAndServe` serves requests using the handler, listening on the given
+address:port. It will block unless it encounters an error listening, in which
+case we `log.Fatal`.
+
+Here's an example of using this handler with curl:
+
+```
+ ~ $ curl localhost:9999/foo/bar
+ hello, you've hit /foo/bar
+```
+
+
+## HandlerFunc
+
+Often defining a full type to implement the `http.Handler` interface is a bit
+overkill, especially for extremely simple `ServeHTTP` functions like the one
+above. The `http` package provides a helper type, `http.HandlerFunc`, which
+wraps a function which has the signature
+`func(w http.ResponseWriter, r *http.Request)`, returning an `http.Handler`
+which will call it in all cases.
+
+The following behaves exactly like the previous example, but uses
+`http.HandlerFunc` instead of defining a new type.
+
+```go
+package main
+
+import (
+	"fmt"
+	"log"
+	"net/http"
+)
+
+func main() {
+	h := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
+		fmt.Fprintf(w, "hello, you've hit %s\n", r.URL.Path)
+	})
+
+	err := http.ListenAndServe(":9999", h)
+	log.Fatal(err)
+}
+```
+
+## ServeMux
+
+On their own, the previous examples don't seem all that useful. If we wanted to
+have different behavior for different endpoints we would end up with having to
+parse path strings as well as numerous `if` or `switch` statements. Luckily
+we're provided with `http.ServeMux`, which does all of that for us. Here's an
+example of it being used:
+
+```go
+package main
+
+import (
+	"fmt"
+	"log"
+	"net/http"
+)
+
+func main() {
+	h := http.NewServeMux()
+
+	h.HandleFunc("/foo", func(w http.ResponseWriter, r *http.Request) {
+		fmt.Fprintln(w, "Hello, you hit foo!")
+	})
+
+	h.HandleFunc("/bar", func(w http.ResponseWriter, r *http.Request) {
+		fmt.Fprintln(w, "Hello, you hit bar!")
+	})
+
+	h.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
+		w.WriteHeader(404)
+		fmt.Fprintln(w, "You're lost, go home")
+	})
+
+	err := http.ListenAndServe(":9999", h)
+	log.Fatal(err)
+}
+```
+
+The `http.ServeMux` is itself an `http.Handler`, so it can be passed into
+`http.ListenAndServe`. When it receives a request it matches the request's path
+against its registered patterns: a pattern ending in `/` matches any path with
+that prefix (the longest such pattern wins), while any other pattern matches
+only that exact path. We use the `/` endpoint as a catch-all to catch any
+requests to unknown endpoints. Here are some examples of it being used:
+
+```
+ ~ $ curl localhost:9999/foo
+Hello, you hit foo!
+
+ ~ $ curl localhost:9999/bar
+Hello, you hit bar!
+
+ ~ $ curl localhost:9999/baz
+You're lost, go home
+```
+
+`http.ServeMux` has both `Handle` and `HandleFunc` methods. These do the same
+thing, except that `Handle` takes in an `http.Handler` while `HandleFunc` merely
+takes in a function, implicitly wrapping it just as `http.HandlerFunc` does.
+
+### Other muxes
+
+There are numerous replacements for `http.ServeMux` like
+[gorilla/mux](http://www.gorillatoolkit.org/pkg/mux) which give you things like
+automatically pulling variables out of paths, easily asserting what http methods
+are allowed on an endpoint, and more. Most of these replacements will implement
+`http.Handler` like `http.ServeMux` does, and accept `http.Handler`s as
+arguments, and so are easy to use in conjunction with the rest of the things
+I'm going to talk about in this post.
+
+## Composability
+
+When I say that the `http` package is composable I mean that it is very easy to
+create re-usable pieces of code and glue them together into a new working
+application. The `http.Handler` interface is the way all pieces communicate with
+each other. Here's an example of where we use the same `http.Handler` to handle
+multiple endpoints, each slightly differently:
+
+```go
+package main
+
+import (
+	"fmt"
+	"log"
+	"net/http"
+)
+
+type numberDumper int
+
+func (n numberDumper) ServeHTTP(w http.ResponseWriter, r *http.Request) {
+	fmt.Fprintf(w, "Here's your number: %d\n", n)
+}
+
+func main() {
+	h := http.NewServeMux()
+
+	h.Handle("/one", numberDumper(1))
+	h.Handle("/two", numberDumper(2))
+	h.Handle("/three", numberDumper(3))
+	h.Handle("/four", numberDumper(4))
+	h.Handle("/five", numberDumper(5))
+
+	h.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
+		w.WriteHeader(404)
+		fmt.Fprintln(w, "That's not a supported number!")
+	})
+
+	err := http.ListenAndServe(":9999", h)
+	log.Fatal(err)
+}
+```
+
+`numberDumper` implements `http.Handler`, and can be passed into the
+`http.ServeMux` multiple times to serve multiple endpoints. Here it is in action:
+
+```
+ ~ $ curl localhost:9999/one
+Here's your number: 1
+ ~ $ curl localhost:9999/five
+Here's your number: 5
+ ~ $ curl localhost:9999/bazillion
+That's not a supported number!
+```
+
+## Testing
+
+Testing http endpoints is extremely easy in Go, and doesn't even require you to
+actually listen on any ports! The `httptest` package provides a few handy
+utilities, including `NewRecorder`, which returns an implementation of
+`http.ResponseWriter` and allows you to effectively make an http request by
+calling `ServeHTTP` directly.
+Here's an example of a test for our previously implemented `numberDumper`,
+commented with what exactly is happening:
+
+```go
+package main
+
+import (
+	"fmt"
+	"net/http"
+	"net/http/httptest"
+	. "testing"
+)
+
+func TestNumberDumper(t *T) {
+	// We first create the http.Handler we wish to test
+	n := numberDumper(1)
+
+	// We create an http.Request object to test with. The http.Request is
+	// totally customizable in every way that a real-life http request is, so
+	// even the most intricate behavior can be tested
+	r, _ := http.NewRequest("GET", "/one", nil)
+
+	// httptest.ResponseRecorder implements the http.ResponseWriter interface,
+	// and as such can be passed into ServeHTTP to receive the response. It will
+	// act as if all data being given to it is being sent to a real client, when
+	// in reality it's being buffered for later observation
+	w := httptest.NewRecorder()
+
+	// Pass in our httptest.ResponseRecorder and http.Request to our
+	// numberDumper. At this point the numberDumper will act just as if it was
+	// responding to a real request
+	n.ServeHTTP(w, r)
+
+	// httptest.ResponseRecorder gives a number of fields and methods which can
+	// be used to observe the response made to our request. Here we check the
+	// response code
+	if w.Code != 200 {
+		t.Fatalf("wrong code returned: %d", w.Code)
+	}
+
+	// We can also get the full body out of the httptest.ResponseRecorder, and
+	// check that its contents are what we expect
+	body := w.Body.String()
+	if body != fmt.Sprintf("Here's your number: 1\n") {
+		t.Fatalf("wrong body returned: %s", body)
+	}
+
+}
+```
+
+In this way it's easy to create tests for your individual components that you
+are using to build your application, keeping the tests near to the functionality
+they're testing.
+
+Note: if you ever do need to spin up a real server in your tests, `httptest`
+also provides `NewServer`, which starts a server listening on a random open
+port.
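The test-server route mentioned in the note above can be sketched roughly as follows. This is only an illustration, not code from the post: `fetchBody` is a hypothetical helper name, `numberDumper` is repeated so the block stands alone, and it assumes a Go version recent enough to have `io.ReadAll`.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

type numberDumper int

func (n numberDumper) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	fmt.Fprintf(w, "Here's your number: %d\n", n)
}

// fetchBody is a hypothetical helper: it starts a throwaway server around h
// on a random local port, makes a real GET request to it, and returns the
// status code and body of the response.
func fetchBody(h http.Handler, path string) (int, string, error) {
	srv := httptest.NewServer(h)
	defer srv.Close()

	// srv.URL is the base address the test server is listening on
	resp, err := http.Get(srv.URL + path)
	if err != nil {
		return 0, "", err
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return 0, "", err
	}
	return resp.StatusCode, string(body), nil
}

func main() {
	code, body, err := fetchBody(numberDumper(1), "/one")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d %q\n", code, body)
}
```

Unlike the recorder-based test, this goes through a real TCP connection and the real client, so it is slower but also exercises things like header serialization.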
+
+## Middleware
+
+Serving endpoints is nice, but often there's functionality you need to run for
+*every* request before the actual endpoint's handler is run. For example, access
+logging. A middleware component is one which implements `http.Handler`, but will
+actually pass the request off to another `http.Handler` after doing some set of
+actions. The `http.ServeMux` we looked at earlier is actually an example of
+middleware, since it passes the request off to another `http.Handler` for actual
+processing. Here's our previous example with some logging
+middleware:
+
+```go
+package main
+
+import (
+	"fmt"
+	"log"
+	"net/http"
+)
+
+type numberDumper int
+
+func (n numberDumper) ServeHTTP(w http.ResponseWriter, r *http.Request) {
+	fmt.Fprintf(w, "Here's your number: %d\n", n)
+}
+
+func logger(h http.Handler) http.Handler {
+	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
+		log.Printf("%s requested %s", r.RemoteAddr, r.URL)
+		h.ServeHTTP(w, r)
+	})
+}
+
+func main() {
+	h := http.NewServeMux()
+
+	h.Handle("/one", numberDumper(1))
+	h.Handle("/two", numberDumper(2))
+	h.Handle("/three", numberDumper(3))
+	h.Handle("/four", numberDumper(4))
+	h.Handle("/five", numberDumper(5))
+
+	h.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
+		w.WriteHeader(404)
+		fmt.Fprintln(w, "That's not a supported number!")
+	})
+
+	hl := logger(h)
+
+	err := http.ListenAndServe(":9999", hl)
+	log.Fatal(err)
+}
+```
+
+`logger` is a function which takes in an `http.Handler` called `h`, and returns
+a new `http.Handler` which, when called, will log the request it was called with
+and then pass off its arguments to `h`. To use it we pass in our
+`http.ServeMux`, so all incoming requests will first be handled by the logging
+middleware before being passed to the `http.ServeMux`.
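Since a piece of middleware is just another `http.Handler`, it can be exercised with `httptest` in the same way `numberDumper` was. The following is a minimal sketch under some assumptions: `loggerTo` and `runOnce` are hypothetical names, and `loggerTo` differs from the post's `logger` in that it writes to a `*log.Logger` passed in by the caller, purely so the output can be captured.

```go
package main

import (
	"bytes"
	"fmt"
	"log"
	"net/http"
	"net/http/httptest"
)

// loggerTo is a hypothetical variant of the logger middleware: it logs to the
// given *log.Logger instead of the package-level one, then calls h.
func loggerTo(l *log.Logger, h http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		l.Printf("%s requested %s", r.RemoteAddr, r.URL)
		h.ServeHTTP(w, r)
	})
}

// runOnce pushes one fake GET request for path through the middleware and
// returns what was logged and what the wrapped handler wrote as the body.
func runOnce(path string) (logged, body string) {
	var buf bytes.Buffer
	l := log.New(&buf, "", 0) // no prefix or timestamp, only the message

	inner := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "inner handler ran")
	})

	r := httptest.NewRequest("GET", path, nil)
	w := httptest.NewRecorder()
	loggerTo(l, inner).ServeHTTP(w, r)

	return buf.String(), w.Body.String()
}

func main() {
	logged, body := runOnce("/five")
	fmt.Print(logged, body)
}
```

The point of the sketch is only that both halves of the middleware's behavior, the side effect and the hand-off to the wrapped handler, are observable without opening a port.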
+
+Here's an example log entry which is output when the `/five` endpoint is hit:
+
+```
+2015/06/30 20:15:41 [::1]:34688 requested /five
+```
+
+## Middleware chaining
+
+Being able to chain middleware together is an incredibly useful ability which we
+get almost for free, as long as we use the signature
+`func(http.Handler) http.Handler`. A middleware component returns the same type
+which is passed into it, so simply passing the output of one middleware
+component into the other is sufficient.
+
+However, more complex behavior with middleware can be tricky. For instance, what
+if you want a piece of middleware which takes in a parameter upon creation?
+Here's an example of just that, with a piece of middleware which will set a
+header and its value on all responses:
+
+```go
+package main
+
+import (
+	"fmt"
+	"log"
+	"net/http"
+)
+
+type numberDumper int
+
+func (n numberDumper) ServeHTTP(w http.ResponseWriter, r *http.Request) {
+	fmt.Fprintf(w, "Here's your number: %d\n", n)
+}
+
+func logger(h http.Handler) http.Handler {
+	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
+		log.Printf("%s requested %s", r.RemoteAddr, r.URL)
+		h.ServeHTTP(w, r)
+	})
+}
+
+type headerSetter struct {
+	key, val string
+	handler  http.Handler
+}
+
+func (hs headerSetter) ServeHTTP(w http.ResponseWriter, r *http.Request) {
+	w.Header().Set(hs.key, hs.val)
+	hs.handler.ServeHTTP(w, r)
+}
+
+func newHeaderSetter(key, val string) func(http.Handler) http.Handler {
+	return func(h http.Handler) http.Handler {
+		return headerSetter{key, val, h}
+	}
+}
+
+func main() {
+	h := http.NewServeMux()
+
+	h.Handle("/one", numberDumper(1))
+	h.Handle("/two", numberDumper(2))
+	h.Handle("/three", numberDumper(3))
+	h.Handle("/four", numberDumper(4))
+	h.Handle("/five", numberDumper(5))
+
+	h.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
+		w.WriteHeader(404)
+		fmt.Fprintln(w, "That's not a supported number!")
+	})
+
+	hl := logger(h)
+	hhs := newHeaderSetter("X-FOO", "BAR")(hl)
+
+	err := http.ListenAndServe(":9999", hhs)
+	log.Fatal(err)
+}
+```
+
+And here's the curl output:
+
+```
+ ~ $ curl -i localhost:9999/three
+ HTTP/1.1 200 OK
+ X-Foo: BAR
+ Date: Wed, 01 Jul 2015 00:39:48 GMT
+ Content-Length: 22
+ Content-Type: text/plain; charset=utf-8
+
+ Here's your number: 3
+
+```
+
+`newHeaderSetter` returns a function which accepts and returns an
+`http.Handler`. Calling that returned function with an `http.Handler` then gets
+you an `http.Handler` which will set the header given to `newHeaderSetter`
+before continuing on to the given `http.Handler`.
+
+This may seem like a strange way of organizing this; for this example the
+signature for `newHeaderSetter` could very well have looked like this:
+
+```
+func newHeaderSetter(key, val string, h http.Handler) http.Handler
+```
+
+And that implementation would have worked fine. But it would have been more
+difficult to compose going forward. In the next section I'll show what I mean.
+
+## Composing middleware with alice
+
+[Alice](https://github.com/justinas/alice) is a very simple and convenient
+helper for working with middleware using the function signature we've been using
+thus far. Alice is used to create and use chains of middleware. Chains can even
+be appended to each other, giving even further flexibility. Here's our previous
+example with a couple more headers being set, but also using alice to manage the
+added complexity.
+
+```go
+package main
+
+import (
+	"fmt"
+	"log"
+	"net/http"
+
+	"github.com/justinas/alice"
+)
+
+type numberDumper int
+
+func (n numberDumper) ServeHTTP(w http.ResponseWriter, r *http.Request) {
+	fmt.Fprintf(w, "Here's your number: %d\n", n)
+}
+
+func logger(h http.Handler) http.Handler {
+	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
+		log.Printf("%s requested %s", r.RemoteAddr, r.URL)
+		h.ServeHTTP(w, r)
+	})
+}
+
+type headerSetter struct {
+	key, val string
+	handler  http.Handler
+}
+
+func (hs headerSetter) ServeHTTP(w http.ResponseWriter, r *http.Request) {
+	w.Header().Set(hs.key, hs.val)
+	hs.handler.ServeHTTP(w, r)
+}
+
+func newHeaderSetter(key, val string) func(http.Handler) http.Handler {
+	return func(h http.Handler) http.Handler {
+		return headerSetter{key, val, h}
+	}
+}
+
+func main() {
+	h := http.NewServeMux()
+
+	h.Handle("/one", numberDumper(1))
+	h.Handle("/two", numberDumper(2))
+	h.Handle("/three", numberDumper(3))
+	h.Handle("/four", numberDumper(4))
+
+	fiveHS := newHeaderSetter("X-FIVE", "the best number")
+	h.Handle("/five", fiveHS(numberDumper(5)))
+
+	h.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
+		w.WriteHeader(404)
+		fmt.Fprintln(w, "That's not a supported number!")
+	})
+
+	chain := alice.New(
+		newHeaderSetter("X-FOO", "BAR"),
+		newHeaderSetter("X-BAZ", "BUZ"),
+		logger,
+	).Then(h)
+
+	err := http.ListenAndServe(":9999", chain)
+	log.Fatal(err)
+}
+```
+
+In this example all responses will have the headers `X-FOO` and `X-BAZ` set, but
+the `/five` endpoint will *also* have the `X-FIVE` header set.
+
+## Fin
+
+Starting with a simple idea of an interface, the `http` package allows us to
+create for ourselves an incredibly useful and flexible (yet still rather simple)
+ecosystem for building web apps with re-usable components, all without breaking
+our static checks.
diff --git a/_posts/2015-11-21-happy-trees.md b/_posts/2015-11-21-happy-trees.md
new file mode 100644
index 0000000..2ac2850
--- /dev/null
+++ b/_posts/2015-11-21-happy-trees.md
@@ -0,0 +1,237 @@
+---
+layout: post
+title: Happy Trees
+---
+
+Source code related to this post is available [here](https://github.com/mediocregopher/happy-tree).
+
+This project was inspired by [this video](https://www.youtube.com/watch?v=_DpzAvb3Vk4),
+which you should watch first in order to really understand what's going on.
+
+My inspiration came from his noting that happification could be done on numbers
+in bases other than 10. I immediately thought of hexadecimal, base-16, since I'm
+a programmer and that's what I think of. I also was trying to think of how one
+would graphically represent a large happification tree, when I realized that
+hexadecimal numbers are colors, and colors graphically represent things nicely!
+
+## Colors
+
+Colors to computers are represented using 3 bytes, encompassing red, green, and
+blue. Each byte is represented by two hexadecimal digits, and they are appended
+together. For example `FF0000` represents maximum red (`FF`) added to no green
+and no blue. `FF5500` represents maximum red (`FF`), some green (`55`) and no
+blue (`00`), which when added together results in kind of an orange color.
+
+## Happifying colors
+
+In base 10, happifying a number is done by splitting its digits, squaring each
+one individually, and adding the resulting numbers. The principle works the same
+for hexadecimal numbers:
+
+```
+A4F
+A*A + 4*4 + F*F
+64 + 10 + E1
+155 // 341 in decimal
+```
+
+So if all colors are 6-digit hexadecimal numbers, they can be happified easily!
+
+```
+FF5500
+F*F + F*F + 5*5 + 5*5 + 0*0 + 0*0
+E1 + E1 + 19 + 19 + 0 + 0
+0001F4
+```
+
+So `FF5500` (an orangish color) happifies to `0001F4` (a darker blue). Since
+order of digits doesn't matter, `5F50F0` also happifies to `0001F4`. From this
+fact, we can make a tree (hence the happification tree). I can do this process
+on every color from `000000` (black) to `FFFFFF` (white), so I will!
+
+## Representing the tree
+
+So I know I can represent the tree using color, but there's more to decide on
+than that. The easy way to represent a tree would be to simply draw a literal
+tree graph, with a circle for each color and lines pointing to its parent and
+children. But this is boring, and also if I want to represent *all* colors the
+resulting image would be enormous and/or unreadable.
+
+I decided on using a hollow, multi-level pie-chart. Using the example
+of `000002`, it would look something like this:
+
+![An example of a partial multi-level pie chart](/img/happy-tree/partial.png)
+
+The inner arc represents the color `000002`. The second arc represents the 15
+different colors which happify into `000002`; each of them may also have their
+own outer arc of numbers which happify to them, and so on.
+
+This representation is nice because a) It looks cool and b) it allows the
+melancoils of the hexadecimals to be placed around the happification tree
+(numbers which happify into `000001`), which is convenient. It's also somewhat
+easier to code than a circle/branch based tree diagram.
+
+An important feature I had to implement was proportional slice sizes. If I were
+to give each child of a color an equal size on that arc's edge the image would
+simply not work. Some branches of the tree are extremely deep, while others are
+very shallow. If all were given the same space, those deep branches wouldn't
+even be representable by a single pixel's width, and would simply fail to show
+up. So I implemented proportional slice sizes, where the size of every slice is
+determined to be proportional to how many total (recursively) children it has.
+You can see this in the above example, where the second level arc is largely
+comprised of one giant slice, with many smaller slices taking up the end.
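The happification step and the proportional slice sizing described above can be sketched roughly like this. It is not the code from the linked repository: the `node` type, `childSpans`, and `toyTree` are hypothetical names, the tree is a hand-picked and heavily truncated fragment under `000002`, and here each slice is weighted by its whole subtree (the node itself plus everything beneath it), which is one plausible reading of "proportional".

```go
package main

import "fmt"

// happify sums the squares of the six hexadecimal digits of a 24-bit color,
// as in the "Happifying colors" section.
func happify(c uint32) uint32 {
	var sum uint32
	for i := 0; i < 6; i++ {
		d := c & 0xF // lowest remaining hex digit
		sum += d * d
		c >>= 4
	}
	return sum
}

// node is a hypothetical tree node: a color plus the colors which happify
// into it.
type node struct {
	color    uint32
	children []*node
}

// size counts the node itself plus all of its (recursive) children.
func (n *node) size() int {
	total := 1
	for _, ch := range n.children {
		total += ch.size()
	}
	return total
}

// childSpans divides the arc [start, end), in degrees, among n's children.
// Each child's share is proportional to the size of its subtree.
func childSpans(n *node, start, end float64) [][2]float64 {
	total := n.size() - 1 // every node below n
	spans := make([][2]float64, 0, len(n.children))
	at := start
	for _, ch := range n.children {
		width := (end - start) * float64(ch.size()) / float64(total)
		spans = append(spans, [2]float64{at, at + width})
		at += width
	}
	return spans
}

// toyTree builds a truncated fragment of the tree under 000002: 000011 keeps
// two of its children, while 000101 is cut off as if it were a leaf.
func toyTree() *node {
	deep := &node{color: 0x000011, children: []*node{
		{color: 0x000014},
		{color: 0x000041},
	}}
	shallow := &node{color: 0x000101}
	return &node{color: 0x000002, children: []*node{deep, shallow}}
}

func main() {
	fmt.Printf("%06X\n", happify(0xFF5500))
	for _, span := range childSpans(toyTree(), 0, 360) {
		fmt.Printf("%.0f to %.0f degrees\n", span[0], span[1])
	}
}
```

In this toy case the deeper branch holds three of the four nodes below the root, so it gets 270 of the 360 degrees and the lone leaf gets the remaining 90.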
+
+## First attempt
+
+My first attempt resulted in this image (click for 5000x5000 version):
+
+[![Result of first attempt](/img/happy-tree/happy-tree-atmp1-small.png)](/img/happy-tree/happy-tree-atmp1.png)
+
+The first thing you'll notice is that it looks pretty neat.
+
+The second thing you'll notice is that there's actually only one melancoil in
+the 6-digit hexadecimal number set. The innermost black circle is `000000` which
+only happifies to itself, and nothing else will happify to it (sad `000000`).
+The second circle represents `000001`, and all of its runty children. And
+finally the melancoil, comprised of:
+
+```
+00000D -> 0000A9 -> 0000B5 -> 000092 -> 000055 -> 000032 -> ...
+```
+
+The final thing you'll notice (or maybe it was the first, since it's really
+obvious) is that it's very blue. Non-blue colors are really only represented as
+leaves on their trees and don't ever really have any children of their own, so
+the blue and black sections take up vastly more space.
+
+This makes sense. The number which should generate the largest happification
+result, `FFFFFF`, only results in `000546`, which is primarily blue. So in effect
+all colors happify to some shade of blue.
+
+This might have been it; technically this is the happification tree and the
+melancoil of 6 digit hexadecimal numbers represented as colors. But it's also
+boring, and I wanted to do better.
+
+## Second attempt
+
+The root of the problem is that the definition of "happification" I used
+did not give diverse enough results. I wanted something which would give me
+numbers where any of the digits could be anything. Something more random.
+
+I considered using a hash instead, like md5, but that has its own problems.
+There's no guarantee that any number would actually reach `000001`, which isn't
+required but it's a nice feature that I wanted. It also would be unlikely that
+there would be any melancoils that weren't absolutely gigantic.
+ +I ended up redefining what it meant to happify a hexadecimal number. Instead of +summing the squares of all six digits together, I first split up the red, green, and blue digits into +their own numbers, happified those numbers, and finally reassembled the results +back into a single number. For example: + +``` +FF5500 +FF, 55, 00 +F*F + F*F, 5*5 + 5*5, 0*0 + 0*0 +1C2, 32, 00 +C23200 +``` + +I drop that 1 on the `1C2`, because it has no place in this system. Sorry 1. + +Simply replacing that function resulted in this image (click for 5000x5000 version): + +[![Result of second attempt](/img/happy-tree/happy-tree-atmp2-small.png)](/img/happy-tree/happy-tree-atmp2.png) + +The first thing you notice is that it's so colorful! So that goal was achieved. + +The second thing you notice is that there are *significantly* more melancoils. +Hundreds, even. Here are a couple of the melancoils (each on its own line): + +``` +00000D -> 0000A9 -> 0000B5 -> 000092 -> 000055 -> 000032 -> ... +000D0D -> 00A9A9 -> 00B5B5 -> 009292 -> 005555 -> 003232 -> ... +0D0D0D -> A9A9A9 -> B5B5B5 -> 929292 -> 555555 -> 323232 -> ... +0D0D32 -> A9A90D -> B5B5A9 -> 9292B5 -> 555592 -> 323255 -> ... +... +``` + +And so on. You'll notice the first melancoil listed is the same as the one from +the first attempt. You'll also notice that the same numbers from that +melancoil are "re-used" in the rest of them as well. The second coil listed is +the same as the first, just with the numbers repeated in the 3rd and 4th digits. +The third coil has those numbers repeated once more in the 1st and 2nd digits. +The final coil is the same numbers, but with the 5th and 6th digits offset one +place in the rotation. + +The rest of the melancoils in this attempt work out to just be every conceivable +iteration of the above. This is simply a property of the algorithm chosen, and +there's not a whole lot we can do about it. + +## Third attempt + +After talking with [Mr. 
Marco](/members/#marcopolo) about the previous attempts +I got an idea that would lead me towards more attempts. The main issue I was +having in coming up with new happification algorithms was figuring out what to +do about getting a number greater than `FFFFFF`. Dropping the leading digits +just seemed... lame. + +One solution I came up with was to simply happify again. And again, and again. +Until I got a number less than or equal to `FFFFFF`. + +With this new plan, I could increase the power by which I'm raising each +individual digit, and drop the strategy from the second attempt of splitting the +number into three parts. In the first attempt I was doing happification to the +power of 2, but what if I wanted to happify to the power of 6? It would look +something like this (starting with the number `34BEEF`): + +``` +34BEEF +3^6 + 4^6 + B^6 + E^6 + E^6 + F^6 +2D9 + 1000 + 1B0829 + 72E440 + 72E440 + ADCEA1 +1AEB223 + +1AEB223 is greater than FFFFFF, so we happify again + +1^6 + A^6 + E^6 + B^6 + 2^6 + 2^6 + 3^6 +1 + F4240 + 72E440 + 1B0829 + 40 + 40 + 2D9 +9D3203 +``` + +So `34BEEF` happifies to `9D3203`, when happifying to the power of 6. + +As mentioned before, the first attempt in this post was the 2nd power tree. +Here are the trees for the 3rd, 4th, 5th, and 6th powers (each image is a link to +a larger version): + +3rd power: +[![Third attempt, 3rd power](/img/happy-tree/happy-tree-atmp3-pow3-small.png)](/img/happy-tree/happy-tree-atmp3-pow3.png) + +4th power: +[![Third attempt, 4th power](/img/happy-tree/happy-tree-atmp3-pow4-small.png)](/img/happy-tree/happy-tree-atmp3-pow4.png) + +5th power: +[![Third attempt, 5th power](/img/happy-tree/happy-tree-atmp3-pow5-small.png)](/img/happy-tree/happy-tree-atmp3-pow5.png) + +6th power: +[![Third attempt, 6th power](/img/happy-tree/happy-tree-atmp3-pow6-small.png)](/img/happy-tree/happy-tree-atmp3-pow6.png) + +A couple things to note: + +* 3-5 are still very blue. 
It's not till the 6th power that the distribution + becomes random enough to become very colorful. + +* Some powers have more coils than others. Power of 3 has a lot, and actually a + lot of them aren't coils, but single narcissistic numbers. Narcissistic + numbers are those which happify to themselves. `000000` and `000001` are + narcissistic numbers in all powers; power of 3 has quite a few more. + +* 4 looks super cool. + +Using unsigned 64-bit integers I could theoretically go up to the power of 15. +But I hit a roadblock at power of 7, in that there's actually a melancoil which +occurs whose members are all greater than `FFFFFF`. This means that my strategy +of repeatedly happifying until I get a number no greater than `FFFFFF` doesn't work for any numbers +which lead into that coil. + + All images linked to in this post are licensed under the [Do what the fuck you + want to public license](http://www.wtfpl.net/txt/copying/). diff --git a/_posts/2015-3-11-rabbit-hole.md b/_posts/2015-3-11-rabbit-hole.md new file mode 100644 index 0000000..190408c --- /dev/null +++ b/_posts/2015-3-11-rabbit-hole.md @@ -0,0 +1,164 @@ +--- +layout: post +title: Rabbit Hole +--- + +We've begun rolling out [SkyDNS][skydns] at my job, which has been pretty neat. +We're basing a couple future projects around being able to use it, and it's made +dynamic configuration and service discovery nice and easy. + +This post chronicles catching a bug because of our switch to SkyDNS, and how we +discover its root cause. I like to call these kinds of bugs "rabbit holes"; they +look shallow at first, but anytime you make a little progress forward a little +more is always required, until you discover the ending somewhere totally +unrelated to the start. 
+ +## The Bug + +We are seeing *tons* of these in the SkyDNS log: + +``` +[skydns] Feb 20 17:21:15.168 INFO | no nameservers defined or name too short, can not forward +``` + +I fire up tcpdump to see if I can see anything interesting, and sure enough run +across a bunch of these: + +``` +# tcpdump -vvv -s 0 -l -n port 53 +tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes + ... + $fen_ip.50257 > $skydns_ip.domain: [udp sum ok] 16218+ A? unknown. (25) + $fen_ip.27372 > $skydns_ip.domain: [udp sum ok] 16218+ A? unknown. (25) + $fen_ip.35634 > $skydns_ip.domain: [udp sum ok] 59227+ A? unknown. (25) + $fen_ip.64363 > $skydns_ip.domain: [udp sum ok] 59227+ A? unknown. (25) +``` + +It appears that some of our front end nodes (FENs) are making tons of DNS +requests trying to find the A record of `unknown`. Something on our FENs is +doing something insane and is breaking. + +## The FENs + +Hopping over to my favorite FEN we're able to see the packets in question +leaving on a tcpdump as well, but that's not helpful for finding the root cause. +We have lots of processes running on the FENs and any number of them could be +doing something crazy. + +We fire up sysdig, which is similar to systemtap and strace in that it allows +you to hook into the kernel and view various kernel activities in real time, but +it's easier to use than both. The following command dumps all UDP packets being +sent and what process is sending them: + +``` +# sysdig fd.l4proto=udp +... +2528950 22:17:35.260606188 0 php-fpm (21477) < connect res=0 tuple=$fen_ip:61173->$skydns_ip:53 +2528961 22:17:35.260611327 0 php-fpm (21477) > sendto fd=102(<4u>$fen_ip:61173->$skydns_ip:53) size=25 tuple=NULL +2528991 22:17:35.260631917 0 php-fpm (21477) < sendto res=25 data=.r...........unknown..... 
+2530470 22:17:35.261879032 0 php-fpm (21477) > ioctl fd=102(<4u>$fen_ip:61173->$skydns_ip:53) request=541B argument=7FFF82DC8728 +2530472 22:17:35.261880574 0 php-fpm (21477) < ioctl res=0 +2530474 22:17:35.261881226 0 php-fpm (21477) > recvfrom fd=102(<4u>$fen_ip:61173->$skydns_ip:53) size=1024 +2530476 22:17:35.261883424 0 php-fpm (21477) < recvfrom res=25 data=.r...........unknown..... tuple=$skydns_ip:53->$fen_ip:61173 +2530485 22:17:35.261888997 0 php-fpm (21477) > close fd=102(<4u>$fen_ip:61173->$skydns_ip:53) +2530488 22:17:35.261892626 0 php-fpm (21477) < close res=0 +``` + +Aha! We can see php-fpm is requesting something over UDP with the string +`unknown` in it. We've now narrowed down the guilty process, the rest should be +easy, right? + +## Which PHP? + +Unfortunately we're a PHP shop; knowing that php-fpm is doing something on a FEN +only narrows down the guilty codebase a little. Taking the FEN out of our load-balancer +stops the requests for `unknown`, so we *can* say that it's some user-facing +code that is the culprit. Our setup on the FENs involves users hitting nginx +for static content and nginx proxying PHP requests back to php-fpm. Since all +our virtual domains are defined in nginx, we are able to do something horrible. + +On the particular FEN we're on we make a guess about which virtual domain the +problem is likely coming from (our main app), and proxy all traffic from all +other domains to a different FEN. We still see requests for `unknown` leaving +the box, so we've narrowed the problem down a little more. + +## The Despair + +Nothing in our code is doing any direct DNS calls as far as we can find, and we +don't see any places PHP might be doing it for us. We have lots of PHP +extensions in place, all written in C and all black boxes; any of them could be +the culprit. Grepping through the likely candidates' source code for the string +`unknown` proves fruitless. + +We try xdebug at this point. 
xdebug is a profiler for PHP which will create +cachegrind files for the running code. With cachegrind you can see every +function which was ever called, how long was spent within each function, a full +call-graph, and lots more. Unfortunately xdebug outputs cachegrind files on a +per-php-fpm-process basis, and overwrites the previous file on each new request. +So xdebug is pretty much useless, since what is in the cachegrind file isn't +necessarily what spawned the DNS request. + +## Gotcha (sorta) + +We turn back to the tried and true method of dumping all the traffic using +tcpdump and perusing through that manually. + +What we find is that nearly every time there is a DNS request for `unknown`, if +we scroll up a bit there is (usually) a particular request to memcache. The +requested key is always in the style of `function-name:someid:otherstuff`. When +looking in the code around that function name we find this ominous looking call: + +```php +$ipAddress = getIPAddress(); +$geoipInfo = getCountryInfoFromIP($ipAddress); +``` + +This points us in the right direction. On a hunch we add some debug +logging to print out the `$ipAddress` variable, and sure enough it comes back as +`unknown`. AHA! + +So what we surmise is happening is that for some reason our geoip extension, +which we use to get the location data of an IP address and which +`getCountryInfoFromIP` calls, is seeing something which is *not* an IP address +and trying to resolve it. + +## Gotcha (for real) + +So the question becomes: why are we getting the string `unknown` as an IP +address? + +Adding some debug logging around the area we found before showed that +`$_SERVER['REMOTE_ADDR']`, which is the variable populated with the IP address +of the client, is sometimes `unknown`. We guess that this has something to do +with some magic we are doing on nginx's side to populate `REMOTE_ADDR` with the +real IP address of the client in the case of them going through a proxy. 
+ +Many proxies send along the header `X-Forwarded-For` to indicate the real IP of +the client they're proxying for, otherwise the server would only see the proxy's +IP. In our setup I decided that in those cases we should set the `REMOTE_ADDR` +to the real client IP so our application logic doesn't even have to worry about +it. There are a couple problems with this which render it a bad decision, one +being that if some misbehaving proxy were to, say, start sending +`X-Forwarded-For: unknown` then some written applications might mistake that to +mean the client's IP is `unknown`. + +## The Fix + +The fix here was two-fold: + +1) We now always set `$_SERVER['REMOTE_ADDR']` to be the remote address of the +requests, regardless of whether it's a proxy, and also send the application the +`X-Forwarded-For` header to do with as it pleases. + +2) Inside our app we look at all the headers sent and do some processing to +decide what the actual client IP is. PHP can handle a lot more complex logic +than nginx can, so we can do things like check to make sure the IP is an IP, and +also that it's not some NAT'd internal IP, and so forth. + +And that's it. From some weird log messages on our DNS servers to an nginx +mis-configuration on an almost unrelated set of servers, this is one of those +strange bugs that never has a nice solution and goes unsolved for a long time. +Spending the time to dive down the rabbit hole and find the answer is often +tedious, but also often very rewarding. 
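The kind of header processing described in the second part of the fix might look roughly like the sketch below. It is written in Python for brevity rather than PHP, and it is not our actual application code: the `client_ip` function, its parameters, and the choice to look only at the first `X-Forwarded-For` entry are assumptions made for illustration. A real deployment would also want to confirm that the request came from a proxy it trusts before believing the header at all.

```python
import ipaddress


def client_ip(remote_addr, forwarded_for=None):
    """Pick a client IP, falling back to the socket's remote address.

    `remote_addr` is the address of the connecting peer; `forwarded_for`
    is the raw X-Forwarded-For header value, if one was sent.
    """
    if forwarded_for:
        # The header may carry a comma-separated chain; by convention the
        # first entry is the original client.
        candidate = forwarded_for.split(",")[0].strip()
        try:
            ip = ipaddress.ip_address(candidate)
        except ValueError:
            ip = None  # e.g. the literal string "unknown"
        # Ignore NAT'd internal addresses such as 10.0.0.0/8.
        if ip is not None and not ip.is_private:
            return str(ip)
    return remote_addr


print(client_ip("198.51.100.1", "unknown"))
print(client_ip("198.51.100.1", "8.8.8.8, 10.0.0.1"))
```

With this approach a value like `unknown` never reaches the geoip lookup, because anything that fails to parse as an IP address is discarded and the remote address is used instead.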
+ +[skydns]: https://github.com/skynetservices/skydns diff --git a/img/happy-tree/happy-tree-atmp1-small.png b/img/happy-tree/happy-tree-atmp1-small.png Binary files differnew file mode 100644 index 0000000..adcf267 --- /dev/null +++ b/img/happy-tree/happy-tree-atmp1-small.png diff --git a/img/happy-tree/happy-tree-atmp1.png b/img/happy-tree/happy-tree-atmp1.png Binary files differnew file mode 100644 index 0000000..5a9baaf --- /dev/null +++ b/img/happy-tree/happy-tree-atmp1.png diff --git a/img/happy-tree/happy-tree-atmp2-small.png b/img/happy-tree/happy-tree-atmp2-small.png Binary files differnew file mode 100644 index 0000000..694858f --- /dev/null +++ b/img/happy-tree/happy-tree-atmp2-small.png diff --git a/img/happy-tree/happy-tree-atmp2.png b/img/happy-tree/happy-tree-atmp2.png Binary files differnew file mode 100644 index 0000000..60a2b16 --- /dev/null +++ b/img/happy-tree/happy-tree-atmp2.png diff --git a/img/happy-tree/happy-tree-atmp3-pow3-small.png b/img/happy-tree/happy-tree-atmp3-pow3-small.png Binary files differnew file mode 100644 index 0000000..02dfd2d --- /dev/null +++ b/img/happy-tree/happy-tree-atmp3-pow3-small.png diff --git a/img/happy-tree/happy-tree-atmp3-pow3.png b/img/happy-tree/happy-tree-atmp3-pow3.png Binary files differnew file mode 100644 index 0000000..710a0a0 --- /dev/null +++ b/img/happy-tree/happy-tree-atmp3-pow3.png diff --git a/img/happy-tree/happy-tree-atmp3-pow4-small.png b/img/happy-tree/happy-tree-atmp3-pow4-small.png Binary files differnew file mode 100644 index 0000000..311718d --- /dev/null +++ b/img/happy-tree/happy-tree-atmp3-pow4-small.png diff --git a/img/happy-tree/happy-tree-atmp3-pow4.png b/img/happy-tree/happy-tree-atmp3-pow4.png Binary files differnew file mode 100644 index 0000000..b82406e --- /dev/null +++ b/img/happy-tree/happy-tree-atmp3-pow4.png diff --git a/img/happy-tree/happy-tree-atmp3-pow5-small.png b/img/happy-tree/happy-tree-atmp3-pow5-small.png Binary files differnew file mode 100644 index 
0000000..87b7ced --- /dev/null +++ b/img/happy-tree/happy-tree-atmp3-pow5-small.png diff --git a/img/happy-tree/happy-tree-atmp3-pow5.png b/img/happy-tree/happy-tree-atmp3-pow5.png Binary files differnew file mode 100644 index 0000000..bcea6ad --- /dev/null +++ b/img/happy-tree/happy-tree-atmp3-pow5.png diff --git a/img/happy-tree/happy-tree-atmp3-pow6-small.png b/img/happy-tree/happy-tree-atmp3-pow6-small.png Binary files differnew file mode 100644 index 0000000..69742f2 --- /dev/null +++ b/img/happy-tree/happy-tree-atmp3-pow6-small.png diff --git a/img/happy-tree/happy-tree-atmp3-pow6.png b/img/happy-tree/happy-tree-atmp3-pow6.png Binary files differnew file mode 100644 index 0000000..6571ce2 --- /dev/null +++ b/img/happy-tree/happy-tree-atmp3-pow6.png diff --git a/img/happy-tree/partial.png b/img/happy-tree/partial.png Binary files differnew file mode 100644 index 0000000..59a9d3d --- /dev/null +++ b/img/happy-tree/partial.png |