From f1998c321a4eec6d75b58d84aa8610971bf21979 Mon Sep 17 00:00:00 2001
From: Brian Picciano
Date: Sat, 31 Jul 2021 11:35:39 -0600
Subject: move static files into static sub-dir, refactor nix a bit

---
 src/_posts/2014-10-29-erlang-pitfalls.md | 193 -------------------------------
 1 file changed, 193 deletions(-)
 delete mode 100644 src/_posts/2014-10-29-erlang-pitfalls.md

diff --git a/src/_posts/2014-10-29-erlang-pitfalls.md b/src/_posts/2014-10-29-erlang-pitfalls.md
deleted file mode 100644
index 7358430..0000000
--- a/src/_posts/2014-10-29-erlang-pitfalls.md
+++ /dev/null
@@ -1,193 +0,0 @@
---
title: Erlang Pitfalls
description: >-
  Common pitfalls that people may run into when designing and writing
  large-scale erlang applications.
tags: tech
---

I've been involved with a large-ish scale erlang project at Grooveshark since sometime around 2011. I started this project knowing absolutely nothing about erlang, but by now I feel I've accumulated enough knowledge that I can conceivably give some back. Specifically, here are common pitfalls that people may run into when designing and writing a large-scale erlang application. Some of these will show up when you search for them, but some you may not even know you need to search for.

## now() vs timestamp()

The canonical way of getting the current timestamp in erlang is to use `erlang:now()`. This works great at small loads, but if you find your application slowing down greatly under highly parallel loads and you're calling `erlang:now()` a lot, it may be the culprit.

A property of this function you may not realize is that its return values are strictly increasing: even if two processes call it at the *exact* same time, they will receive different results. This is done with some low-level locking, plus a bit of adjustment so the returned time doesn't drift too far from the real clock when calls come in faster than the clock advances.

There are situations where always-unique timestamps are useful, such as seeding RNGs and generating unique identifiers, but usually when people fetch a timestamp they just want a timestamp. For those cases, `os:timestamp()` can be used. It isn't guarded by any locks; it simply returns the time.

## The rpc module is slow

The built-in `rpc` module is slower than you'd think. This mostly stems from it doing a lot of extra work for every `call` and `cast` that you do, ensuring that certain conditions are accounted for. If, however, it's sufficient for the calling side to know that a call timed out on them and not worry about it any further, you may benefit from simply writing your own rpc module. Alternatively, use [one which already exists](https://github.com/cloudant/rexi).

## Don't send anonymous functions between nodes

One of erlang's niceties is transparent message sending between two physical erlang nodes. Once nodes are connected, a process on one can send any message to a process on the other exactly as if they existed on the same node. This is fine for many data-types, but for anonymous functions it should be avoided.

For example:

```erlang
RemotePid ! {fn, fun(I) -> I + 1 end}.
```

would be better written as

```erlang
incr(I) ->
    I + 1.

RemotePid ! {fn, ?MODULE, incr}.
```

and then having the receiving process use `apply` to actually execute the function.

This is because hot-swapping code messes with anonymous functions quite a bit. Erlang isn't actually sending a function definition across the wire; it's simply sending a reference to a function. If you've changed the code within the anonymous function on a node, that reference changes. The sending node ends up sending a reference to a function which may not exist anymore on the receiving node, and you'll get a weird error which Google doesn't return many results for.

Alternatively, if you simply send atoms across the wire and use `apply` on the other side, only atoms are sent, and the two nodes involved can have totally different ideas of what the function itself does without any problems.
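To illustrate the receiving side, a hypothetical receive loop might look like the following; the `{fn, Module, Function, Args}` message shape and the `remote_worker/0` name are just for illustration, not code from the actual project:

```erlang
%% Hypothetical receive loop on the remote node. Only atoms and a list of
%% arguments cross the wire, so the remote node is free to run whatever
%% version of Module it currently has loaded.
remote_worker() ->
    receive
        {fn, Module, Function, Args} ->
            Result = apply(Module, Function, Args),
            io:format("~p:~p/~p returned ~p~n",
                      [Module, Function, length(Args), Result]),
            remote_worker()
    end.
```

Sending `RemotePid ! {fn, ?MODULE, incr, [1]}` to a process running this loop would then execute `incr(1)` using whatever version of the module the remote node has loaded.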
## Hot-swapping code is a convenience, not a crutch

Hot-swapping code is the bee's knees. It lets you skip rolling restarts for trivial code changes, and so adds stability to your cluster. My warning is that you should not rely on it. If your cluster can't survive a node being restarted for a code change, then it can't survive that node failing completely, or failing and coming back up. Design your system pretending that hot-swapping does not exist, and only once you've done that allow yourself to use it.

## GC sometimes needs a boost

Erlang garbage collection (GC) acts on a per-process basis, meaning that each process decides on its own when to garbage collect itself. This is nice because it means stop-the-world isn't a problem, but it does have some interesting effects.

We had a problem with our node memory graphs looking like an ever-climbing line, instead of the nice sinusoid you'd expect given the number of connections over the course of a day. We couldn't find a memory leak *anywhere*, and so started profiling. We found that the memory seemed to be made up mostly of binary data sitting in process heaps. On a hunch my coworker Mike Cugini (who gets all the credit for this) ran the following on a node:

```erlang
lists:foreach(fun erlang:garbage_collect/1, erlang:processes()).
```

and saw memory drop in a huge way. We made that code run every 10 minutes or so and suddenly our memory problem went away.

The problem was that we had a lot of processes which individually didn't have much heap data, but all together were crushing the box. Each didn't think it had enough to garbage collect very often, so memory just kept going up. Calling the above forces all processes to garbage collect, and thus throw away all those little binary bits they were hoarding.
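A minimal sketch of running that sweep on a timer could look like this; the `gc_ticker` helper name and the hard-coded interval are illustrative, not how the actual scheduling was wired up:

```erlang
%% Hypothetical helper: a long-lived process that forces every process on
%% the node to garbage collect, then sleeps for 10 minutes and repeats.
start_gc_ticker() ->
    spawn_link(fun gc_loop/0).

gc_loop() ->
    lists:foreach(fun erlang:garbage_collect/1, erlang:processes()),
    timer:sleep(timer:minutes(10)),
    gc_loop().
```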
## These aren't the solutions you are looking for

The `erl` process has tons of command-line options which let you tweak all kinds of knobs. We've had plenty of performance problems with our application; as of yet, not a single one has been solved by turning one of these knobs. They've all been design issues or just run-of-the-mill bugs. I'm not saying the knobs are *never* useful, but I haven't seen it yet.

## Erlang processes are great, except when they're not

The erlang model of allowing processes to manage global state works really well in many cases. Possibly even most cases. There are, however, times when it becomes a performance problem. This became apparent in the project I was working on for Grooveshark, which was, at its heart, a pubsub server.

The architecture was very simple: each channel was managed by a process, and client connection processes subscribed to that channel and received publishes from it. Easy, right? The problem was that extremely high-volume channels were simply not able to keep up with the load. The channel process could do certain things very fast, but there were some operations which simply took time and slowed everything down. For example, channels could have arbitrary properties set on them by their owners. Retrieving an arbitrary property from a channel was a fairly fast operation: the client `call`s the channel process, and the channel process immediately responds with the property value. No blocking involved.

But as soon as there was any kind of call which required the channel process to talk to yet *another* process (unfortunately necessary), things got hairy. On high-volume channels, publish/get/set operations would get massively backed up in the message queue while the process was blocked on another process. We tried many things, but ultimately gave up on the process-per-channel approach.

We instead decided on keeping *all* channel state in a transactional database. When client processes "call" operations on a channel, they are really just acting on the database data inline, with no message passing involved. This means that read-only operations are super fast because there is minimal blocking, and if some other process is being slow it only affects the one client whose call depends on it, rather than holding up a whole host of other clients.

## Mnesia might not be what you want

This one is probably a bit controversial, and definitely subject to use-cases. Do your own testing and profiling, and find out what's right for you.

Mnesia is erlang's solution for global state. It's an in-memory transactional database which can scale to N nodes and persist to disk. It is hosted directly in the erlang VM's memory, so you interact with it directly from your code; no calling out to database drivers and such. Sounds great, right?

Unfortunately mnesia is not a very full-featured database. It is essentially a key-value store which can hold arbitrary erlang data-types, albeit in a set schema which you lay out for it during startup. This means that more complex types like sorted sets and hash maps (although the latter was addressed with the introduction of the map data-type in R17) are difficult to work with within mnesia. Additionally, erlang's data model of immutability, while usually awesome, can bite you here because it's difficult (impossible?) to pull out chunks of data within a record without accessing the whole record.

For example, when retrieving the list of processes subscribed to a channel, our application doesn't simply pull the full list and iterate over it. That was too slow, and in some cases the subscriber list was so large it wasn't actually feasible: the channel process couldn't clean up its heap fast enough, so multiple publishes would end up with multiple copies of the giant list in memory. This became a problem. Instead we chain spawned processes, each of which pulls a set chunk of the subscriber list and iterates over it. This is very difficult to implement in mnesia without pulling the full subscriber list into the process's memory at some point.

It is, however, fairly trivial to implement in redis using sorted sets. For this case, and many other cases after, the motto for performance improvements became "stick it in redis". The application is at the point where *all* state which isn't directly tied to a specific connection is kept in redis, encoded using `term_to_binary`. The performance hit of going to an outside process for data was actually much less than we'd originally thought, and it ended up being a plus since we had much more freedom to do interesting hacks to speed up our accesses.
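As a rough illustration of that pattern, chunked iteration over a subscriber sorted set might look something like the sketch below. The `eredis` client, the key naming, and the chunk size are all assumptions for illustration (the post doesn't name a client library), and it's simplified to a single process walking chunk by chunk rather than the chained spawns described above:

```erlang
%% Hypothetical: walk a channel's subscriber sorted set in fixed-size chunks
%% so no process ever holds the entire list in its heap at once. Members are
%% assumed to have been stored with term_to_binary/1.
for_each_subscriber(Redis, Channel, Fun) ->
    for_each_subscriber(Redis, Channel, Fun, 0, 500).

for_each_subscriber(Redis, Channel, Fun, Offset, ChunkSize) ->
    Key = <<"subscribers:", Channel/binary>>,
    {ok, Chunk} = eredis:q(Redis, ["ZRANGE", Key,
                                   integer_to_list(Offset),
                                   integer_to_list(Offset + ChunkSize - 1)]),
    lists:foreach(fun(Bin) -> Fun(binary_to_term(Bin)) end, Chunk),
    case length(Chunk) < ChunkSize of
        true  -> ok;
        false -> for_each_subscriber(Redis, Channel, Fun,
                                     Offset + ChunkSize, ChunkSize)
    end.
```

A publish then becomes "fetch a chunk, deliver to those subscribers, repeat", which keeps the memory footprint flat regardless of how large the channel gets.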
-- cgit v1.2.3