---
title: >-
    How to Secure a Webapp
description: >-
    Get ready to jump through some hoops.
tags: tech
---

In this post I will be documenting all the security hoops that one must jump
through in order to consider their webapp secure. This list should not be
considered comprehensive, as I might have forgotten something or some new hoop
might have appeared since writing.

For the context of this post a "webapp" will be considered to be an HTML/CSS/JS
website, loaded in a browser, with which users create and access accounts using
some set of credentials (probably username and password). In other words, most
popular websites today. This post will only cover those concerns which apply to
_all_ webapps of this nature, and so won't dive into any which might be incurred
by using one particular technology or another.

Some of these hoops might seem redundant or optional. That may be the case. But
if you are building a website and are beholden to passing some third-party
security audit for any reason you'll likely find yourself being forced to
implement most, if not all, of these measures anyway.

So without further ado, let's get started!

## HTTPS

At this point you have to use HTTPS; there's no excuse for not doing so. All
attempts to hit an HTTP endpoint should redirect to the equivalent HTTPS
endpoint, and you should be using [HSTS][hsts] to ensure that a browser is never
tricked into falling back to HTTP via some compromised DNS server.

[hsts]: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Strict-Transport-Security
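
Here's a rough sketch of both halves of that in Python: computing the redirect
target for a plain HTTP request, and building the HSTS header for HTTPS
responses. The function names and the one-year `max-age` are my own
assumptions, not anything mandated; wire them into whatever server you use.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit, urlunsplit

# One year in seconds. A commonly used value, but an assumption here.
HSTS_MAX_AGE_SECONDS = 31536000


def https_redirect_target(url: str) -> Optional[str]:
    """Return the HTTPS equivalent of an HTTP URL, or None if it isn't HTTP."""
    parts = urlsplit(url)
    if parts.scheme != "http":
        return None
    # Using hostname (not netloc) drops any explicit port, so that
    # http://host:80/x doesn't turn into https://host:80/x.
    host = parts.hostname or ""
    return urlunsplit(("https", host, parts.path, parts.query, parts.fragment))


def hsts_header(include_subdomains: bool = True) -> Tuple[str, str]:
    """Build the Strict-Transport-Security header for an HTTPS response."""
    value = "max-age={}".format(HSTS_MAX_AGE_SECONDS)
    if include_subdomains:
        value += "; includeSubDomains"
    return ("Strict-Transport-Security", value)
```

The HSTS header should only go out on HTTPS responses; browsers ignore it when
it arrives over plain HTTP.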

## Cookies

Cookies are an old web technology, and have always been essentially broken. Each
cookie can have certain flags set on it which change its behavior, and some of
these flags are required at this point.

### Secure

If you're storing anything sensitive in a cookie (spoiler alert: you will be)
then you need to have the Secure flag set on it. This prevents the cookie from
being sent in a non-HTTPS request.

### HTTPOnly

The HTTPOnly flag protects a cookie from XSS attacks by preventing it from being
accessible from javascript. Any cookie which is storing sensitive information
_must_ have this flag set. In the **Authentication** section we will cover the
storage of session tokens, but the TLDR is that they have to be stored in an
HTTPOnly cookie.

Practically, this means that your sessions architecture _must_ account for the
fact that the webapp itself will not have direct access to its persistent
session token(s), and therefore must have some other way of knowing that it's
logged in (e.g. a secondary, non-HTTPOnly cookie which contains no secrets but
only signals that the browser is logged in).

### SameSite

The SameSite attribute can be set to `Strict`, `Lax`, or `None`. `Lax` is the
default in modern browsers and is sufficient for most security concerns, but if
you can go with `Strict` that would be better. The downside of `Strict` is that
cookies won't be sent on the initial page-load when a user arrives at your site
via a link from another site.

In any case, even though `Lax` is the default you should still set this
attribute manually (or your auditor might get to add another bullet point to
their report).
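
To make the three flags concrete, here's a minimal sketch using Python's
standard `http.cookies` module. The cookie names `session` and `logged_in` are
hypothetical; the second one is the secret-free marker cookie described in the
HTTPOnly section above.

```python
from http.cookies import SimpleCookie
from typing import List


def session_cookie_headers(session_token: str) -> List[str]:
    """Build Set-Cookie header values for a session plus a marker cookie."""
    cookies = SimpleCookie()

    # The secret: hidden from javascript and never sent over plain HTTP.
    cookies["session"] = session_token
    cookies["session"]["secure"] = True
    cookies["session"]["httponly"] = True
    cookies["session"]["samesite"] = "Strict"
    cookies["session"]["path"] = "/"

    # The marker: carries no secret, so the webapp is allowed to read it
    # in order to know that it's logged in.
    cookies["logged_in"] = "1"
    cookies["logged_in"]["secure"] = True
    cookies["logged_in"]["samesite"] = "Strict"
    cookies["logged_in"]["path"] = "/"

    return [morsel.OutputString() for morsel in cookies.values()]
```

Each returned string is the value of one `Set-Cookie` response header.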

## Authentication

Authentication is obviously one of the juiciest targets for an attacker. It's
one thing to be able to trick a user into performing this or that action, but if
one can just log in _as_ the user then they essentially have free rein over all
of that user's information.

### Password History

Most websites use a username/password system as the first step of login. This
is... fine. We've accepted it, at any rate. But there are a couple of hoops which
must be jumped through as a result of it, and the first is password history.

I hope it goes without saying that one should be using a hashing algorithm like
bcrypt to store user passwords. But what is often not said is that, for each
user, you need to store the hashes of their last N passwords (where N is
something like 8). This way if they attempt to re-use an old password they are
not able to do so. The users must be protected from themselves, after all.
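
A sketch of what that history check might look like. Note the loud assumption:
I'm using PBKDF2 from the standard library as a stand-in for bcrypt so the
example has no dependencies, and the iteration count is illustrative only. The
function names and the `(salt, digest)` storage shape are made up for this
example.

```python
import hashlib
import hmac
import os
from typing import List, Tuple

HISTORY_SIZE = 8  # the "N" from above
PBKDF2_ITERATIONS = 100_000  # illustrative; tune for your own hardware

History = List[Tuple[bytes, bytes]]  # (salt, digest) pairs, oldest first


def hash_password(password: str, salt: bytes) -> bytes:
    # Stand-in for bcrypt, which isn't in the standard library.
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, PBKDF2_ITERATIONS
    )


def is_reused(password: str, history: History) -> bool:
    """Check the candidate against every stored (salt, digest) pair."""
    return any(
        hmac.compare_digest(hash_password(password, salt), digest)
        for salt, digest in history
    )


def record_password(password: str, history: History) -> History:
    """Return the updated history, or raise ValueError on a reused password."""
    if is_reused(password, history):
        raise ValueError("password was used recently")
    salt = os.urandom(16)
    updated = history + [(salt, hash_password(password, salt))]
    return updated[-HISTORY_SIZE:]  # keep only the last N entries
```

Because each stored hash has its own salt, the candidate password has to be
re-hashed once per history entry, so a password change costs up to N hashes.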

### Credential Stuffing/Account Enumeration

A credential stuffing attack is one where credentials are stolen from one
website and then attempted to be used on another, in the hope that users have
re-used their username/password across multiple sites. When they occur it'll
often look like a botnet spamming the authentication endpoint with tons of
different credentials.

Account enumeration is a similar attack: it's where an attacker finds a way to
get the webapp to tell them whether or not an account email/username exists in
the system, without needing to have the right password. This is often done by
analyzing the error messages returned from login or a similar endpoint (e.g.
"Sorry this username is taken"). They then run through all possible values for
that endpoint to try and enumerate which users actually exist in the system.

Account enumeration is tricky because often those errors are extremely helpful,
and we'd _like_ to keep them if we can.

I've bucketed both of these attacks in the same section because they have a
similar solution: proof-of-work. The idea is that, for each request to some
sensitive endpoint, the client must send some proof that they've done an
intensive CPU computation.

Compared to IP-based rate-limiting, PoW is much more effective against botnets
(which have a limitless set of IPs from which to spam you), while also being
much less intrusive on your real users than a captcha.

PoW stymies botnets because they are generally being hosted by low-power,
compromised machines. In addition, the systems that run these botnets are pretty
shallow in capability, because it's more lucrative to rent the botnet out than
to actually use it yourself, so it's rare for a botnet operator to go to the
trouble of implementing your PoW algorithm in the first place.

So stick a PoW requirement on any login or account creation endpoint, or any
other endpoint which might be used to enumerate accounts in the system. You can
even make the PoW difficulty rise in relation to number of recent attempts on
these endpoints, if you're feeling spry.
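
For the curious, here's a minimal hashcash-style sketch of the idea: the server
hands out a random challenge, and the client must find a nonce whose SHA-256
hash (together with the challenge) starts with enough zero bits. The scheme,
the function names, and the difficulty numbers are all assumptions of mine; in
the browser the `solve` half would be written in javascript.

```python
import hashlib
import itertools
import os


def leading_zero_bits(digest: bytes) -> int:
    """Count how many zero bits the digest starts with."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def new_challenge() -> bytes:
    """Server side: a fresh random challenge for one request."""
    return os.urandom(16)


def difficulty_for(recent_attempts: int, base_bits: int = 12,
                   max_bits: int = 24) -> int:
    """Add one bit of difficulty each time recent attempts double."""
    return min(max_bits, base_bits + max(0, recent_attempts).bit_length())


def verify(challenge: bytes, nonce: int, bits: int) -> bool:
    """Server side: one cheap hash to check the client's work."""
    digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
    return leading_zero_bits(digest) >= bits


def solve(challenge: bytes, bits: int) -> int:
    """Client side: brute-force a nonce; roughly 2**bits hashes on average."""
    for nonce in itertools.count():
        if verify(challenge, nonce, bits):
            return nonce
```

The asymmetry is the point: verifying costs the server one hash, while solving
costs the client thousands or millions. A real deployment would also need to
expire challenges and refuse to accept the same one twice.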

### MFA

All the PoW checks in the world won't help your poor user who isn't using a
different username/password for each website, and who got unlucky enough to have
those credentials leaked in a hack of a site completely separate from your own.
They also won't help your user if they _are_ using different username/passwords
for everything, but their machine gets straight up stolen IRL and the attacker
gets access to their credential storage.

What _will_ help them in these cases, however, is if your site supports
multi-factor authentication, such as [TOTP][totp]. If it does then your user
will have a further line of defense in the form of another password which
changes every 30 seconds, and which can only be accessed from a secondary device
(like their phone). If your site claims to care about the security of your
user's account then MFA is an absolute requirement.
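
TOTP itself is small enough to sketch with only the standard library. This
follows RFC 6238 with HMAC-SHA1, which is what most authenticator apps expect;
the function names and the one-step drift window are my own choices, and a real
system would also want to rate-limit guesses and reject already-used codes.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def totp(secret: bytes, at: Optional[float] = None, step: int = 30,
         digits: int = 6) -> str:
    """Compute the TOTP code for the given unix time (default: now)."""
    if at is None:
        at = time.time()
    counter = int(at) // step
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low nibble of the last byte picks an offset.
    offset = mac[-1] & 0x0F
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def verify_totp(secret: bytes, submitted: str, at: Optional[float] = None,
                window: int = 1, step: int = 30) -> bool:
    """Accept codes from the current step and `window` steps either side."""
    if at is None:
        at = time.time()
    return any(
        hmac.compare_digest(
            totp(secret, at + drift * step, step).encode("utf-8"),
            submitted.encode("utf-8"),
        )
        for drift in range(-window, window + 1)
    )
```

The shared `secret` is generated once at enrollment and shown to the user
(usually as a QR code) so their authenticator app can compute the same codes.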

It should be noted, however, that not all MFA is created equal. A TOTP system
is great, but a one-time code being sent over SMS or email is totally different
and not nearly as great. SMS is vulnerable to [SIM jacking][sim], which can be
easily used in a targeted attack against one of your users. One-time codes over
email are pointless for MFA, as most people have their email logged in on their
machine all the time, so if someone steals your user's machine they're still
screwed.

In summary: MFA is essentially required, _especially_ if the user's account is
linked to anything valuable, and must be done with real MFA systems like TOTP,
not SMS or email.

[totp]: https://www.twilio.com/docs/glossary/totp
[sim]: https://www.vice.com/en/article/3kx4ej/sim-jacking-mobile-phone-fraud

### Login Notifications

Whenever a user successfully logs into their account you should send them an email
(or some other notification) letting them know it happened. This way if it
wasn't actually them who did so, but an attacker, they can perhaps act quickly
to lock down their account and prevent any further harm. The login notification
email should have some kind of link in it which can be used to immediately lock
the account.
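
One way to build that link, sketched under my own assumptions: put an
HMAC-signed token in the URL which names the user and an expiry time, so the
lock endpoint can trust it without the user needing to log in first. The token
format and function names here are hypothetical.

```python
import hashlib
import hmac
import time
from typing import Optional


def _sign(key: bytes, payload: str) -> str:
    return hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()


def make_lock_token(key: bytes, user_id: str, expires_at: int) -> str:
    """Build a token of the form '<user_id>.<expires_at>.<signature>'."""
    payload = "{}.{}".format(user_id, expires_at)
    return "{}.{}".format(payload, _sign(key, payload))


def check_lock_token(key: bytes, token: str,
                     now: Optional[int] = None) -> Optional[str]:
    """Return the user id if the token is authentic and unexpired, else None."""
    if now is None:
        now = int(time.time())
    try:
        user_id, expires_raw, signature = token.rsplit(".", 2)
        expires_at = int(expires_raw)
    except ValueError:
        return None  # wrong number of parts, or a non-numeric expiry
    expected = _sign(key, "{}.{}".format(user_id, expires_raw))
    if not hmac.compare_digest(expected.encode("utf-8"),
                               signature.encode("utf-8")):
        return None
    if now > expires_at:
        return None
    return user_id
```

The `key` is a server-side secret. Since locking is the safe direction, a link
like this can afford to skip the login step; unlocking the account afterwards
should not be so easy.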

### Token Storage

Once your user has logged into your webapp, it's up to you, the developer, to
store their session token(s) somewhere. The question is... where? Well this
one's easy, because there's only one right answer: HTTPOnly cookies (as alluded
to earlier).

When storing session tokens you want to guard against XSS attacks which might
grab the tokens and send them to an attacker, allowing that attacker to hijack
the session and pose as the user. This means the following are not suitable
places to store the tokens:

* Local storage.
* `window`, or anything which can be accessed via `window`.
* Non-HTTPOnly cookies.

All of these are trivial for a script running in the browser to find. If a
session token is ephemeral then it may be stored in a "normal" javascript
variable somewhere _as long as_ that variable isn't accessible from a global
context. But for any tokens which need to be persisted across browser restarts
an HTTPOnly cookie is your only option.

## Cross-Site

Speaking of XSS attacks, we have some more mitigation coming up...

### CSP

Setting a [CSP][csp] for your website is key to preventing XSS. A CSP allows you
to more tightly control the allowed origins of the various entities on your site
(be they scripts, styles, images, etc...). If an entity of unexpected origin
shows up it is disallowed.

Be sure to avoid any usages of the source keywords labeled "unsafe"
(`'unsafe-inline'`, `'unsafe-eval'`; go figure), otherwise the CSP is rendered
somewhat pointless. Also, when using hostname based allowlisting try to be as
narrow as you can in your allowlist, and especially only include https hosts. If
you can you should opt for the `nonce` or hash (`sha256`, `sha384`, `sha512`)
sources instead.

[csp]: https://developer.mozilla.org/en-US/docs/Web/HTTP/CSP
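
As an illustration, here's a sketch of building a nonce-based CSP header. The
particular set of directives is just an assumed starting point, not a
recommendation for your site, and the function names are mine.

```python
import base64
import secrets
from typing import Tuple


def new_nonce() -> str:
    """A fresh, unguessable nonce. Generate a new one for every response."""
    return base64.b64encode(secrets.token_bytes(16)).decode("ascii")


def csp_header(nonce: str) -> Tuple[str, str]:
    """Build a Content-Security-Policy header with no 'unsafe' keywords."""
    directives = [
        "default-src 'self'",
        "script-src 'nonce-{}'".format(nonce),
        "style-src 'nonce-{}'".format(nonce),
        "img-src 'self'",
        "object-src 'none'",
        "base-uri 'none'",
    ]
    return ("Content-Security-Policy", "; ".join(directives))
```

The same nonce then goes on each tag you actually trust, e.g.
`<script nonce="...">`. A script injected by an attacker won't know the nonce,
so the browser refuses to run it.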

### SVG

A small but important note: if your website allows users to upload images, then
be _very_ careful about allowing users to upload SVGs. SVGs are actually XML
documents, and even worse than that they allow `<script>` tags within them! If
you can get away with it, it's better to disallow their use at all.
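
If you do go the disallow route, one approach (an assumption on my part, and
only a first line of defense) is to allowlist raster formats by their leading
"magic" bytes rather than trusting the file extension or the `Content-Type`
the uploader claims.

```python
from typing import Optional

# Leading bytes of the raster formats we're willing to accept.
RASTER_SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "image/png",
    b"\xff\xd8\xff": "image/jpeg",
    b"GIF87a": "image/gif",
    b"GIF89a": "image/gif",
}


def sniff_allowed_image(data: bytes) -> Optional[str]:
    """Return the MIME type if data starts like an allowed format, else None."""
    for signature, mime in RASTER_SIGNATURES.items():
        if data.startswith(signature):
            return mime
    return None
```

An SVG is text starting with something like `<svg` or `<?xml`, so it never
matches and gets rejected along with everything else not on the list. This
only inspects the first few bytes; it doesn't prove the rest of the file is a
valid image.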

## CSRF

The web was designed in a time when cross-site requests were considered a
feature. This has proven to be a massive mistake. We have two cross-site request
prevention techniques in this list. The first is CSRF protection.

CSRF protection will cover you from a variety of attacks, mostly of the kind
where an attacker embeds a `<form>` on their own webpage, with the form set up
to POST to _your_ website in some way. When a user of your website lands on the
attacker's page and triggers the POST, the POST will be sent along with whatever
cookies the user has stored in their browser for _your_ site!

The attacker could, potentially, trick a user into submitting a password-reset
request using a known value, or withdrawing all their money into the attacker's
bank account, or anything else the user might be able to do on their own.

The idea with CSRF is that any HTTP request made against an API should have an
unguessable token as a required parameter, called the CSRF token. The CSRF token
should be given to your webapp in a way where only your webapp could know it.
There are many ways to accomplish this, including a cookie, server-side embedded
value, etc... OWASP has put together an [entire cheatsheet full of CSRF
methods][csrf] which is well worth checking out.

[csrf]: https://cheatsheetseries.owasp.org/cheatsheets/Cross-Site_Request_Forgery_Prevention_Cheat_Sheet.html
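
As one example of the many, here's a sketch of an HMAC-based token tied to the
user's session, so a token lifted from one session is useless in another. The
token format and function names are my own assumptions; see the cheatsheet for
the tradeoffs between this and the other patterns.

```python
import hashlib
import hmac
import secrets


def _mac(key: bytes, session_id: str, random_part: str) -> str:
    message = "{}!{}".format(session_id, random_part).encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def issue_csrf_token(key: bytes, session_id: str) -> str:
    """Create a token of the form '<random>.<hmac>' bound to the session."""
    random_part = secrets.token_hex(16)
    return "{}.{}".format(random_part, _mac(key, session_id, random_part))


def check_csrf_token(key: bytes, session_id: str, token: str) -> bool:
    """Verify that the token was issued by us for this exact session."""
    random_part, separator, mac = token.partition(".")
    if not separator:
        return False
    expected = _mac(key, session_id, random_part)
    return hmac.compare_digest(expected.encode("utf-8"), mac.encode("utf-8"))
```

The webapp sends the token back on every state-changing request (a custom
header or hidden form field both work), and the server rejects any request
where `check_csrf_token` fails.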

## CORS

CORS is the other half of cross-site protection. With CSRF in place it's
somewhat redundant, but it's good to have multiple layers of protection in place
(in case you fuck up one of them by accident).

The key thing one must do for CORS protection is to set the
`Access-Control-Allow-Origin` header to the origin a request is being sent from
_only if you trust that origin_. If you stick a wildcard in that header then
you're not doing anything.
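
In code that amounts to an allowlist lookup, roughly like the sketch below. The
origin in `TRUSTED_ORIGINS` is a made-up placeholder, and the function name is
mine.

```python
from typing import Dict, Optional

# Hypothetical allowlist; replace with the origins you actually trust.
TRUSTED_ORIGINS = frozenset({"https://app.example.com"})


def cors_headers(request_origin: Optional[str]) -> Dict[str, str]:
    """Echo the request's Origin back only if it is on the allowlist."""
    # The response differs per Origin, so caches must key on it.
    headers = {"Vary": "Origin"}
    if request_origin in TRUSTED_ORIGINS:
        headers["Access-Control-Allow-Origin"] = request_origin
    return headers
```

For an untrusted (or missing) origin the header is simply left off, which
makes the browser refuse to hand the response to the calling script.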

## Random Headers

The rest of this is random HTTP headers which must be set in various contexts to
protect your users.

### Permissions Policy

The [Permissions-Policy][pp] header is fairly new and not fully standardized
yet, but there is support for it so it's worth using. It allows you to specify
exactly which browser features you expect your webapp to need, and therefore
prevent an attacker from taking advantage of some other feature that you were
never going to use anyway.

[pp]: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Feature-Policy

### X-Content-Type-Options

It's important to set `X-Content-Type-Options: nosniff` on virtually all
HTTP responses, in order to (theoretically) prevent a browser from inferring the
MIME type of the returned content.

### X-Frame-Options

Set `X-Frame-Options: deny` to prevent your webapp from being rendered in a
frame or iframe on someone else's site, which might then be used to trick one of
your users into doing something stupid.

### X-XSS-Protection

Set `X-XSS-Protection: 1; mode=block` to give older browsers which lack CSP
support some extra defense against XSS attacks. It's not super clear to me what
exactly this actually does, but it's easy enough to set.

### Referrer-Policy

Set the `Referrer-Policy` to inform your users' browsers to not send the
`Referer` header to third-party sites when your users navigate away from your
site. You don't want other websites to be able to see _yours_ in their logs, as
they could then correlate which users of theirs have accounts with you (and so
potentially have some easy targets).

### Cache-Control/Pragma

For all requests which return sensitive information (i.e. any authenticated
requests) it's important to set `Cache-Control: no-store` and `Pragma: no-cache`
on the response. This prevents some middle server or the browser from caching
the response, and potentially returning it later to someone else using your site
from the same location.
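
Gathered in one place, the headers from this section might look like the
sketch below. The `Permissions-Policy` and `Referrer-Policy` values are
assumptions I've picked for illustration (disable a few features, and only
send the referrer to your own origin); choose the ones which fit your webapp.

```python
from typing import Dict


def security_headers(authenticated: bool) -> Dict[str, str]:
    """Headers for every response, plus no-cache headers when authenticated."""
    headers = {
        # Assumed example: the webapp uses none of these browser features.
        "Permissions-Policy": "camera=(), microphone=(), geolocation=()",
        "X-Content-Type-Options": "nosniff",
        "X-Frame-Options": "deny",
        "X-XSS-Protection": "1; mode=block",
        # Assumed value: send Referer only on same-origin requests.
        "Referrer-Policy": "same-origin",
    }
    if authenticated:
        # Sensitive responses must never be stored by any cache.
        headers["Cache-Control"] = "no-store"
        headers["Pragma"] = "no-cache"
    return headers
```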

## That's It

It's probably not it, actually; these are just what I could think of off the top
of my head. Please email me if I missed any.

If you, like me, find yourself asking "how is anyone supposed to have figured
this out?" then you should A) thank me for writing it all down for you and B)
realize that at least 50% of this list has nothing to do with the web, really,
and everything to do with covering up holes that backwards compatibility has
left open. We can cover these holes, we just need everyone to agree on the path
to doing so, and to allow ourselves to leave some ancient users behind.