---
title: >-
    V4 of Radix, a Golang Redis Driver
description: >-
    What's new, what's improved, and where we're going from here.
tags: tech
---

Radix is a Go driver for the [Redis][redis] database. The current stable release
is v3, the docs for which can be found [here][v3]. Over the past year
(perhaps longer) I've been working on a new version, v4, with the aim of
addressing some of the shortcomings of v3 and distilling the API a bit better.

At this point v4 is in beta. While there are still some internal bugs and QoL
improvements to be made, the API is roughly stable and I wouldn't discourage
anyone from using it for a non-critical project. In the coming months I intend
to finish the polish and tag a `v4.0.0` release, but in the meantime let's go
over the major changes and improvements in v4!

You can see the v4 documentation [here][v4], if you'd like to follow along with
any of the particulars, and you can see the full CHANGELOG [here][changelog].

## Shoutouts

Before continuing I want to give a huge shoutout to [nussjustin][nussjustin].
Since before v3 was even stable, Justin has been contributing to radix in every
way possible, from running benchmarks and making very low-level performance
improvements to building whole user-facing features and responding to GitHub
issues when I get lost in the woods. Thank you Justin!

## RESP3

Starting at the lowest level, v4 supports redis's new wire protocol,
[RESP3][resp3]. The new protocol is (mostly) backwards compatible with the
previous one, and is really more an extension than anything else. The [new
resp3 sub-package][resp3pkg] is capable of marshaling and unmarshaling all of
the new wire types, including streamed aggregates and streamed strings.
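
To give a flavor of the new types, here are a few of them as they appear on the
wire (CRLFs written out literally), taken from the RESP3 spec:

```
,3.14\r\n                               <- a double
#t\r\n                                  <- a boolean
%1\r\n+key\r\n+value\r\n                <- a map with a single key/value pair
$?\r\n;4\r\nHell\r\n;2\r\no!\r\n;0\r\n  <- a streamed (chunked) blob string
```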

A major improvement at the API level is the addition of the
[resp.Opts][respOpts] type, which is used to propagate things like byte buffers
and buffered readers. This allows the resp3 package to reduce memory
allocations without relying on something like `sync.Pool`, which introduces
locking overhead.
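
To illustrate the idea (this is a conceptual sketch of the pattern, not radix's
actual `Opts` fields), unmarshaling code receives its scratch space from the
caller rather than grabbing it from a shared pool:

```go
package respsketch

import (
	"bufio"
	"io"
)

// opts is a stand-in for the kind of state resp.Opts carries around: scratch
// space and a buffered reader owned by a single connection, so reusing them
// requires no locking.
type opts struct {
	scratch []byte
	br      *bufio.Reader
}

// readBlob reads n bytes into the connection-local scratch buffer, growing it
// only when needed, instead of allocating (or hitting a sync.Pool) on every
// call.
func readBlob(o *opts, n int) ([]byte, error) {
	if cap(o.scratch) < n {
		o.scratch = make([]byte, n)
	}
	b := o.scratch[:n]
	if _, err := io.ReadFull(o.br, b); err != nil {
		return nil, err
	}
	return b, nil
}
```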

There are still some questions to be answered regarding the best way for the
main radix package to deal with the new push and attribute types, but the resp3
package is general-purpose enough to handle most strategies in the future.

In fact, the RESP3 protocol as a whole (and therefore v4's associated resp3
sub-package) is totally usable outside of redis. If you're looking for a
human-readable, binary safe, fast, and simple wire protocol which already has
great tooling and libraries across multiple programming languages, I highly
recommend checking out RESP3.

## Conn

Arguably one of the biggest design warts of v3, in my eyes, is the
[CmdAction][cmdaction] type. This type was required in order to allow for
pipelining, a feature of redis where you can write new commands to a connection
before previous ones have returned their results. The major upside of
pipelining is that N pipelined commands result in only 2 system calls (a
network write then a network read), rather than the 2N system calls (N writes
and N reads) required if each command were performed independently.

The normal v3 Action type is fairly opaque: it performs both the write and the
read internally, without exposing any way to do some other action in between
(such as performing the writes/reads for other commands in a pipeline).
CmdAction extends Action to allow the write and read to be performed
independently, and leaves it to the Pipeline type to deal with the batching.

v4 gets rid of the need for CmdAction, while allowing even more Action types to
be pipeline-able than before (e.g. [EvalScript][evalscript]). This was done by
coalescing the Encode and Decode methods on the [Conn][conn] type into a single
method: EncodeDecode. By doing this we allow Actions to perform the write/read
steps in a way which groups the two together, but leaves it to Conn to actually
perform the steps in its own way.
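
Most users will never call EncodeDecode directly, since Actions like Cmd do it
for them, but here's a rough sketch of the shape of the method (a `[]string` is
marshaled as a flat RESP array, which is how commands are written; see the Conn
docs for the exact signature):

```go
package main

import (
	"context"
	"fmt"

	"github.com/mediocregopher/radix/v4"
)

func main() {
	ctx := context.Background()

	conn, err := radix.Dial(ctx, "tcp", "127.0.0.1:6379")
	if err != nil {
		panic(err)
	}
	defer conn.Close()

	// The value to encode (the command) and the value to decode into (the
	// receiver) are handed to the Conn together, so the Conn knows they
	// belong to the same round-trip and can interleave other EncodeDecode
	// calls around them.
	var pong string
	if err := conn.EncodeDecode(ctx, []string{"PING"}, &pong); err != nil {
		panic(err)
	}
	fmt.Println(pong) // PONG
}
```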

Because Conn now has knowledge of which read/write steps go together, it's
possible to perform pipelining in nearly all cases. Aside from using the
Pipeline type manually, the v4 Conn is able to automatically pipeline most
Actions when they are performed concurrently on the same Conn. v3 had a similar
feature, called "implicit pipelining", but v4 rebrands the feature as
"connection sharing" since the mechanism is slightly different and the
applicability is broader.
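
Here's a sketch of what connection sharing looks like in practice (the address
and keys are just placeholders): the concurrent Actions below are all performed
on the same Conn, and radix batches their writes and reads into pipelined
round-trips where it can.

```go
package main

import (
	"context"
	"fmt"
	"sync"

	"github.com/mediocregopher/radix/v4"
)

func main() {
	ctx := context.Background()

	// A single Conn shared across goroutines.
	conn, err := radix.Dial(ctx, "tcp", "127.0.0.1:6379")
	if err != nil {
		panic(err)
	}
	defer conn.Close()

	var wg sync.WaitGroup
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			// These concurrent Actions get grouped into pipelined
			// writes/reads on the shared Conn automatically.
			key := fmt.Sprintf("key-%d", i)
			if err := conn.Do(ctx, radix.Cmd(nil, "SET", key, "some value")); err != nil {
				panic(err)
			}
		}(i)
	}
	wg.Wait()
}
```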

Despite the apparent simplicity of the change (combining Encode and Decode
methods), this resulted in probably the largest code difference between v3 and
v4, involving the most complex new logic and package-wide refactorings. But the
end result is a simpler, smaller API which can be applied to more use-cases. A
great win!

## Pool

In v3 the connection pool, the Pool type, was implemented with the assumption
that each Action (or CmdAction) would borrow a Conn for the duration of the
Action. As such the Pool expects to create and destroy connections as load
increases and decreases; if the number of concurrent commands goes up then the
number of connections required to handle them goes up as well, and vice-versa.

Down the road the Pool became responsible for performing implicit pipelining as
well. This allowed many commands to be grouped together on the same connection,
greatly reducing the pressure to create new connections, but the Pool
nevertheless kept the same general pattern of dynamically sizing the connection
pool.

In v4 there is no longer the assumption that each command gets its own
connection, and in fact that assumption is flipped: each connection is expected
to handle multiple commands concurrently in almost all cases. This means the
Pool can get rid of the dynamism, and opt instead for a simple static connection
pool size. There is still room in the API for some dynamic connection sizing to
be implemented later, but it's mostly unnecessary now.
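
Creating a Pool now looks something like this minimal sketch (the zero-value
PoolConfig works too and falls back to sensible defaults; see the PoolConfig
docs for the full set of fields):

```go
package main

import (
	"context"

	"github.com/mediocregopher/radix/v4"
)

func main() {
	ctx := context.Background()

	// A fixed-size pool; each connection is expected to be shared by many
	// concurrent Actions via connection sharing.
	client, err := (radix.PoolConfig{Size: 4}).New(ctx, "tcp", "127.0.0.1:6379")
	if err != nil {
		panic(err)
	}
	defer client.Close()

	var val string
	if err := client.Do(ctx, radix.Cmd(&val, "GET", "some-key")); err != nil {
		panic(err)
	}
}
```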

Some care should be taken with commands which _can't_ be pipelined, for example
blocking commands like BRPOPLPUSH and XREAD. Ideally these commands should be
performed on an individual Conn created just for that purpose. Pool _will_
properly handle them if needed, but with the caveat that the Action will
essentially remove a Conn from the Pool for its duration.
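
For blocking commands, a rough sketch of the dedicated-Conn approach:

```go
package main

import (
	"context"

	"github.com/mediocregopher/radix/v4"
)

func main() {
	ctx := context.Background()

	// A dedicated Conn just for the blocking command, so no pooled/shared
	// connection is tied up while it blocks.
	conn, err := radix.Dial(ctx, "tcp", "127.0.0.1:6379")
	if err != nil {
		panic(err)
	}
	defer conn.Close()

	// A redis timeout of 0 means BRPOPLPUSH blocks until an element shows up
	// on "src"; the Context is what bounds the wait on the client side.
	var popped string
	if err := conn.Do(ctx, radix.Cmd(&popped, "BRPOPLPUSH", "src", "dst", "0")); err != nil {
		panic(err)
	}
}
```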

[The new Pool][pool] is _vastly_ simpler in implementation than the old, as most
of the complexity has been moved into Conn. Really this whole section is an
extension of the refactoring which was started by the changes to Conn.

## MultiClient

In v3 there was a single Client type which was used to encompass Conn, Pool,
Sentinel, and Cluster, with the aim that users could just use Client in their
code and easily swap out the underlying implementation as needed.

In practice this didn't work out. The original Client type only had a Do method
for performing Actions, which would always perform the Actions against the
primary instance in the case of Cluster and Sentinel. Cluster and Sentinel ended
up being extended with DoSecondary methods, and Cluster required its own
constructor for Scanner, so if you used any of those features you would not be
able to use Client.

v4 improves this situation by introducing the [MultiClient][multiclient]
interface, which is implemented by both Cluster and Sentinel, while Conn and
Pool only implement [Client][client]. Client is intended for clients which
interact with only a single redis instance, while MultiClient is intended for
use by clients which encompass multiple redis instances, and makes the
distinction between primary and secondary instances.

In general, users will want to use MultiClient in their code and swap the
underlying implementation as their infrastructure evolves. When using only a
single Pool, one can make it into a MultiClient using the new
[ReplicaSet][replicaset].
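
For example, here's a sketch of using a Cluster through the MultiClient
interface (the address and keys are placeholders):

```go
package main

import (
	"context"

	"github.com/mediocregopher/radix/v4"
)

func main() {
	ctx := context.Background()

	// Cluster implements MultiClient; code written against MultiClient can
	// later swap in Sentinel, or a single Pool wrapped in a ReplicaSet,
	// without other changes.
	cluster, err := (radix.ClusterConfig{}).New(ctx, []string{"127.0.0.1:7000"})
	if err != nil {
		panic(err)
	}
	var mc radix.MultiClient = cluster
	defer mc.Close()

	// Writes go to a primary...
	if err := mc.Do(ctx, radix.Cmd(nil, "SET", "foo", "bar")); err != nil {
		panic(err)
	}

	// ...while DoSecondary directs reads at a secondary where possible.
	var foo string
	if err := mc.DoSecondary(ctx, radix.Cmd(&foo, "GET", "foo")); err != nil {
		panic(err)
	}
}
```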

Users can also implement their own MultiClients fairly easily, to handle their
own custom sharding or failover systems. It's not a common use-case, but it's
cool that existing types like Scanner will continue to work with them.

## Contexts

A common feature request for v3 was support for Go's [Contexts][context], which
allow callers to unblock blocked operations dynamically. There wasn't a clear
way to incorporate Contexts into v3 without greatly expanding the API
(something the Go standard library has had to do), so I saved them for v4.

In v4 all operations which might potentially block accept a Context argument.
This takes the place of timeout options and some trace events which were used in
v3, and in general simplifies things for the user.
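
In practice this just means every Do (and similar) call takes a Context; a
per-operation timeout, for example, now looks like this:

```go
package main

import (
	"context"
	"time"

	"github.com/mediocregopher/radix/v4"
)

func main() {
	client, err := (radix.PoolConfig{}).New(context.Background(), "tcp", "127.0.0.1:6379")
	if err != nil {
		panic(err)
	}
	defer client.Close()

	// Rather than configuring read/write timeouts up front, each operation
	// gets bounded by its own Context.
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	var val string
	if err := client.Do(ctx, radix.Cmd(&val, "GET", "some-key")); err != nil {
		panic(err)
	}
}
```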

This was a change for which there is not much to talk about, but which required
a _lot_ of work internally. Go's Contexts do not play nicely with its networking
primitives, and making this all work alongside connection sharing and pipelining
is a really hairy puzzle (for which there are still a few open bugs). I may one
day write a blog post just about this topic, if I can figure out how to explain
it in a way which isn't completely mind-numbing.

## Configuration

Constructors in v3 took advantage of the [functional options pattern][opts] for
accepting optional parameters. While this pattern _looks_ nice, I've since
fallen out of love with it. The implementation is a lot more complex, its
behavior is more ambiguous to users in certain cases (what happens if the same
option is passed in twice?), it makes documentation more complex, and a slice of
option functions isn't inspectable or serializable the way a struct is.

v4 uses a config struct pattern, but in a different way than I've generally seen
it. See [Pool's constructor][pool] for an example. This pattern is functionally
the same as passing the config struct as an argument to the constructor, but I
think it results in a nicer grouping in the documentation.
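
In generic terms the pattern looks like this (a made-up `FooConfig`, not one of
radix's actual types):

```go
package main

import "context"

// FooConfig holds all optional parameters, with useful zero values.
type FooConfig struct {
	// Size of something; 0 means "use the default".
	Size int
}

// Foo is the thing being constructed (a stand-in for Pool, Cluster, etc).
type Foo struct {
	cfg FooConfig
}

// New hangs the constructor off of the config struct itself, so the config,
// its fields, and the constructor are all grouped together in the generated
// documentation.
func (cfg FooConfig) New(ctx context.Context, addr string) (*Foo, error) {
	if cfg.Size == 0 {
		cfg.Size = 4
	}
	return &Foo{cfg: cfg}, nil
}

func main() {
	// The zero value config works, and overriding a field is explicit,
	// inspectable, and serializable, unlike a slice of option functions.
	foo, _ := (FooConfig{Size: 8}).New(context.Background(), "127.0.0.1:6379")
	_ = foo
}
```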

## Smaller Changes

There are some smaller sets of changes which are worth mentioning. These didn't
result in huge, package-wide changes, but will be useful for users of specific
functionality.

### Action Properties

[v4's Action type][action] has a Properties method which returns a struct
containing various fields which are useful for clients performing the Action.
This is an improvement over v3's Action, which had no such method, in that it's
more extensible going forward. Those implementing their own custom Actions
should take care to understand the Action properties.
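
A quick sketch of peeking at an Action's properties (the exact fields on the
returned struct are covered in the docs):

```go
package main

import (
	"fmt"

	"github.com/mediocregopher/radix/v4"
)

func main() {
	// Clients (including custom Client implementations) use the returned
	// fields to decide things like how to route an Action and whether it can
	// be pipelined or shared on a Conn.
	props := radix.Cmd(nil, "GET", "foo").Properties()
	fmt.Printf("%+v\n", props)
}
```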

### PubSub

The v4 [PubSubConn][pubsub] has been completely redesigned from v3's
implementation. The old design tried to do too much, and resulted in weird
edge-cases when trying to tear down a connection that a user would have to
handle themselves. The new design is simple both in implementation and usage.
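
Usage now looks roughly like the sketch below (see the PubSubConn docs for the
exact constructor and method signatures; this only shows the general shape):

```go
package main

import (
	"context"
	"fmt"

	"github.com/mediocregopher/radix/v4"
)

func main() {
	ctx := context.Background()

	conn, err := radix.Dial(ctx, "tcp", "127.0.0.1:6379")
	if err != nil {
		panic(err)
	}

	// NOTE: check the PubSubConn docs for the exact constructor/method
	// signatures; this sketch only conveys the general shape.
	ps := (radix.PubSubConfig{}).New(conn)
	defer ps.Close()

	if err := ps.Subscribe(ctx, "some-channel"); err != nil {
		panic(err)
	}

	// Messages are pulled synchronously via Next, rather than pushed onto a
	// channel as in v3.
	for {
		msg, err := ps.Next(ctx)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s: %q\n", msg.Channel, msg.Message)
	}
}
```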

### Tracing

The v4 [trace][trace] sub-package has been extended to support tracing Sentinel
events, but at the same time has been stripped of all events which could
otherwise be inferred by using Context values or by wrapping an interface like
Conn, Action, etc.

## What's Next

Obviously the most immediate goal is to get v4 stable and tagged. Once that's
done I'm sure there will be many small bugs, feature requests, etc... which come
up over time, and I'll do my best to address those as quickly as I can. I'm
very excited to start using v4 in my own day-to-day work like I currently do for
v3; it has a lot of great improvements and new flexibility that will make using
Go and redis together an even better experience than it already is.

That all said, I don't expect there to be a radix v5. I have a lot of other
projects I'd like to work on, and radix is a huge time-sink. As time goes on v4
will stabilize further and further, until all that's left is for it to gain
additional support for whatever new crazy features redis comes up with. My hope
is that the existing API is flexible enough to allow others to fill in those
gaps without any major changes to the existing code, and radix v4 can be the
final major radix version.

[redis]: https://redis.io
[v3]: https://pkg.go.dev/github.com/mediocregopher/radix/v3#section-documentation
[v4]: https://pkg.go.dev/github.com/mediocregopher/radix/v4#section-documentation
[nussjustin]: https://github.com/nussjustin
[resp3]: https://github.com/antirez/RESP3
[resp3pkg]: https://pkg.go.dev/github.com/mediocregopher/radix/v4/resp/resp3
[respOpts]: https://pkg.go.dev/github.com/mediocregopher/radix/v4/resp#Opts
[changelog]: https://github.com/mediocregopher/radix/blob/v4/CHANGELOG.md
[cmdaction]: https://pkg.go.dev/github.com/mediocregopher/radix/v3#CmdAction
[evalscript]: https://pkg.go.dev/github.com/mediocregopher/radix/v4#EvalScript
[conn]: https://pkg.go.dev/github.com/mediocregopher/radix/v4#Conn
[pool]: https://pkg.go.dev/github.com/mediocregopher/radix/v4#PoolConfig.New
[multiclient]: https://pkg.go.dev/github.com/mediocregopher/radix/v4#MultiClient
[client]: https://pkg.go.dev/github.com/mediocregopher/radix/v4#Client
[replicaset]: https://pkg.go.dev/github.com/mediocregopher/radix/v4#ReplicaSet
[context]: https://blog.golang.org/context
[opts]: https://dave.cheney.net/2014/10/17/functional-options-for-friendly-apis
[action]: https://pkg.go.dev/github.com/mediocregopher/radix/v4#Action
[pubsub]: https://pkg.go.dev/github.com/mediocregopher/radix/v4#PubSubConn
[trace]: https://pkg.go.dev/github.com/mediocregopher/radix/v4/trace