| 1 | +# MSC1711: X.509 certificate verification for federation connections |
| 2 | + |
| 3 | +TLS connections for server-to-server communication currently rely on an |
| 4 | +approach borrowed from the [Perspectives |
| 5 | +project](https://web.archive.org/web/20170702024706/https://perspectives-project.org/) |
| 6 | +to provide certificate verification, rather than the more normal model using |
| 7 | +certificates signed by trusted Certificate Authorities. This document sets out |
| 8 | +the reasons that this has not been a success, and suggests that we should |
| 9 | +instead revert to the CA model. |
| 10 | + |
| 11 | +## Background: the failure of the Perspectives approach |
| 12 | + |
| 13 | +The Perspectives approach replaces the conventional hierarchy of trust provided |
| 14 | +by the Certificate Authority model with a large number of "notary" servers |
| 15 | +distributed around the world. The intention is that the notary servers |
| 16 | +regularly monitor remote servers and observe the certificates they present; |
| 17 | +when making a connection to a new site, a client can correlate the certificate |
| 18 | +it presents with that seen by the notary servers. In theory this makes it very |
| 19 | +hard to mount a Man-in-the-Middle (MitM) attack, because it would require |
| 20 | +intercepting traffic between the target server and a large number of the notary |
| 21 | +servers. |
| 22 | + |
| 23 | +It is notable that the Perspectives project itself appears to have largely been |
| 24 | +abandoned: its website has mostly been repurposed, the [Firefox
| 25 | +extension](https://addons.mozilla.org/en-GB/firefox/addon/perspectives/) does |
| 26 | +not work with modern versions of Firefox, the [mailing |
| 27 | +list](https://groups.google.com/forum/#!forum/perspectives-dev) is inactive, |
| 28 | +and several of the (ten) published notary servers are no longer functional. The |
| 29 | +reasons for this are not entirely clear, though it evidently never gained
| 30 | +widespread adoption. |
| 31 | + |
| 32 | +When Matrix was originally designed in 2014, the Perspectives project was |
| 33 | +heavily active, and avoiding dependencies on the relatively centralised |
| 34 | +Certificate Authorities was attractive, in accordance with Matrix's design as a |
| 35 | +decentralised protocol. However, this has not been a success in practice. |
| 36 | + |
| 37 | +Matrix was unable to make use of the existing notary servers (largely because |
| 38 | +we wanted to extend the protocol to include signing keys): the intention was |
| 39 | +that, as the Matrix ecosystem grew, public Matrix servers would act as notary |
| 40 | +servers. However, in practice we have ended up in a situation where almost <sup |
| 41 | +id="a1">[1](#f1)</sup> every Matrix homeserver either uses `matrix.org` as the |
| 42 | +sole notary, or does no certificate verification at all. Far from avoiding the |
| 43 | +centralisation of the Certificate Authorities, the entire protocol is therefore |
| 44 | +dependent on a single point of control at `matrix.org` - and because |
| 45 | +`matrix.org` only monitors from a single location, the protection against MitM |
| 46 | +attacks is weak. |
| 47 | + |
| 48 | +It is also clear that the Perspectives approach is poorly understood. It is a
| 49 | +common error for homeservers to be deployed behind reverse-proxies which make |
| 50 | +the Perspectives-based approach unreliable. The CA model, for all its flaws, is |
| 51 | +at least commonly used, which makes it easier for administrators to deploy |
| 52 | +(secure) homeservers, and allows server implementations to leverage existing |
| 53 | +libraries. |
| 54 | + |
| 55 | +## Proposal |
| 56 | + |
| 57 | +We propose that Matrix homeservers should be required to present valid TLS |
| 58 | +certificates, signed by a known Certificate Authority, on their federation |
| 59 | +port. |
| 60 | + |
| 61 | +In order to ease transition and give administrators time to switch to a signed |
| 62 | +certificate, we will continue to follow the current, Perspectives-based
| 63 | +approach for servers whose TLS certificates fail validation. |
| 64 | + |
| 65 | +However, this fallback will be strictly time-limited, and Matrix S2S spec r0 |
| 66 | +will not accept self-signed certificates, nor will it include the |
| 67 | +`tls_fingerprints` property of the |
| 68 | +[`/_matrix/key/v2`](https://matrix.org/docs/spec/server_server/unstable.html#retrieving-server-keys) |
| 69 | +endpoints. Synapse 1.0 will not accept self-signed certificates by default. |
| 70 | + |
| 71 | +The `matrix.org` team will proactively attempt to reach out to homeserver |
| 72 | +administrators who do not update their certificates in the coming weeks. |
| 73 | + |
| 74 | +The process of determining which CAs are trusted to sign certificates would be |
| 75 | +implementation-specific, though it should almost certainly make use of existing |
| 76 | +operating-system support for maintaining such lists. It might also be useful if |
| 77 | +administrators could override this list, for the purpose of setting up a |
| 78 | +private federation using their own CA. |
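As a rough illustration of the trust-store behaviour described above, the following Python sketch builds a TLS client context that uses the operating system's CA list by default and an administrator-supplied bundle otherwise. The function name and the `ca_bundle_path` setting are hypothetical; a real homeserver would expose its own configuration option.

```python
import ssl
from typing import Optional


def build_federation_context(ca_bundle_path: Optional[str] = None) -> ssl.SSLContext:
    """Return a client-side TLS context for outbound federation requests.

    With no argument, the operating system's trusted CA list is used.
    `ca_bundle_path` is a hypothetical administrator setting pointing at a
    PEM file of CA certificates, for example for a private federation.
    """
    # create_default_context() turns on certificate and hostname checks.
    # It loads the system trust store only when no cafile/capath/cadata
    # is supplied, so a custom bundle replaces the default list.
    return ssl.create_default_context(cafile=ca_bundle_path)
```

Because a supplied bundle replaces the system list rather than extending it, an administrator who wants both public and private CAs would need a combined bundle.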
| 79 | + |
| 80 | +It would also be useful for administrators to be able to disable the |
| 81 | +certificate checks for a whitelist of domains/netmasks. This would be useful |
| 82 | +for testing, or for networks that provide server verification themselves, |
| 83 | +such as `.onion` domains on Tor or `fc00::/8` IPs on cjdns.
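One way such a whitelist check might look is sketched below in Python. The function name, its parameters, and the assumption that `host` carries no port are all illustrative; the proposal does not prescribe a configuration format.

```python
import ipaddress
from typing import Iterable


def is_verification_exempt(
    host: str, domain_suffixes: Iterable[str], networks: Iterable[str]
) -> bool:
    """Return True if certificate checks may be skipped for `host`.

    `domain_suffixes` (e.g. [".onion"]) and `networks` (e.g. ["fc00::/8"])
    are hypothetical administrator-supplied whitelists. `host` is assumed
    to be a bare hostname or IP literal with no port.
    """
    try:
        # Accept bracketed IPv6 literals such as "[fc12::1]".
        addr = ipaddress.ip_address(host.strip("[]"))
    except ValueError:
        addr = None

    if addr is not None:
        # Membership is False when the address and network versions differ.
        return any(addr in ipaddress.ip_network(net) for net in networks)

    name = host.lower().rstrip(".")
    for suffix in domain_suffixes:
        bare = suffix.lower().lstrip(".")
        # Match the suffix itself or any name beneath it, on a label boundary.
        if name == bare or name.endswith("." + bare):
            return True
    return False
```

Matching on a label boundary means that `.onion` exempts `example.onion` but not a name that merely ends in the letters `onion`.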
| 84 | + |
| 85 | +### Interaction with SRV records |
| 86 | + |
| 87 | +With the use of `SRV` records, it is possible for the hostname of a homeserver |
| 88 | +to be quite different from the matrix domain it is hosting. For example, if |
| 89 | +there were an SRV record at `_matrix._tcp.matrix.org` which pointed to |
| 90 | +`server.example.com`, then any federation requests for `matrix.org` would be |
| 91 | +routed to `server.example.com`. The question arises as to which certificate |
| 92 | +`server.example.com` should present. |
| 93 | + |
| 94 | +In short: the server should present a certificate for the matrix domain |
| 95 | +(`matrix.org` in the above example). This ensures that traffic cannot be |
| 96 | +intercepted by a MitM who can control the DNS response for the `SRV` record |
| 97 | +(perhaps via cache-poisoning or falsifying DNS responses). |
| 98 | + |
| 99 | +This will be in line with the current |
| 100 | +[requirements](https://matrix.org/docs/spec/server_server/unstable.html#resolving-server-names) |
| 101 | +in the Federation API specification for the `Host` header and, by implication, the TLS
| 102 | +Server Name Indication <sup id="a2">[2](#f2)</sup>. It is also consistent with |
| 103 | +the recommendations of |
| 104 | +[RFC6125](https://tools.ietf.org/html/rfc6125#section-6.2.1) and the |
| 105 | +conventions established by the XMPP protocol (per [RFC6120](https://tools.ietf.org/html/rfc6120#section-13.7.2.1)).
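The rule above can be sketched in Python as follows: the TCP connection goes to the `SRV` target, while the SNI value and the name checked against the certificate are the Matrix domain. This is a minimal sketch, not a description of any existing homeserver; `ConnectionPlan`, `plan_connection` and `open_federation_connection` are hypothetical names, and `SRV` resolution itself is assumed to have happened already.

```python
import socket
import ssl
from typing import NamedTuple


class ConnectionPlan(NamedTuple):
    connect_host: str  # where the TCP connection goes (the SRV target)
    connect_port: int
    tls_name: str      # name sent as SNI and checked against the certificate


def plan_connection(server_name: str, srv_target: str, srv_port: int) -> ConnectionPlan:
    """Route to the SRV target, but authenticate the Matrix domain."""
    return ConnectionPlan(srv_target, srv_port, server_name)


def open_federation_connection(plan: ConnectionPlan, context: ssl.SSLContext) -> ssl.SSLSocket:
    """Open a TLS connection according to `plan` (requires network access)."""
    sock = socket.create_connection((plan.connect_host, plan.connect_port), timeout=10)
    try:
        # server_hostname sets the SNI value and, with check_hostname enabled,
        # is the name the presented certificate must be valid for.
        return context.wrap_socket(sock, server_hostname=plan.tls_name)
    except Exception:
        sock.close()
        raise
```

For the example in this section, `plan_connection("matrix.org", "server.example.com", 8448)` (the port is an illustrative value) connects to `server.example.com` but only accepts a certificate that is valid for `matrix.org`, so a forged `SRV` response alone cannot redirect traffic to an attacker.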
| 106 | + |
| 107 | +### Extensions |
| 108 | + |
| 109 | +HTTP-Based Public Key Pinning (HPKP) and |
| 110 | +[Certificate Transparency](https://www.certificate-transparency.org) are
| 111 | +both HTTP extensions which attempt to work around some of the deficiencies in |
| 112 | +the CA model, by making it more obvious if a CA has issued a certificate |
| 113 | +incorrectly. |
| 114 | + |
| 115 | +HPKP has not been particularly successful, and is |
| 116 | +[deprecated](https://developers.google.com/web/updates/2018/04/chrome-67-deps-rems#deprecate_http-based_public_key_pinning) |
| 117 | +in Google Chrome as of April 2018. Certificate Transparency, however, is seeing
| 118 | +widespread adoption from Certificate Authorities and HTTP clients.
| 119 | + |
| 120 | +This proposal sees both technologies as optional techniques which could be |
| 121 | +provided by homeserver implementations. We encourage but do not mandate the use |
| 122 | +of Certificate Transparency. |
| 123 | + |
| 124 | +### Related work |
| 125 | + |
| 126 | +The Perspectives approach is also currently used for exchanging the keys that |
| 127 | +are used by homeservers to sign Matrix events and federation requests (the |
| 128 | +"signing keys"). Problems similar to those covered here also apply to that |
| 129 | +mechanism. This is discussed at [#1685](https://github.com/matrix-org/matrix-doc/issues/1685).
| 130 | + |
| 131 | +## Alternatives |
| 132 | + |
| 133 | +There are well-known problems with the CA model, including a number of |
| 134 | +widely-published incidents in which CAs have issued certificates |
| 135 | +incorrectly. It is therefore important to consider alternatives to the CA |
| 136 | +model. |
| 137 | + |
| 138 | +### Improving support for the Perspectives model |
| 139 | + |
| 140 | +In principle, we could double down on the Perspectives approach, and make an effort
| 141 | +to get servers other than `matrix.org` used as notary servers. However, there |
| 142 | +remain significant problems with such an approach: |
| 143 | + |
| 144 | +* Perspectives remains complex to configure correctly. Ideally, administrators
| 145 | + need to make conscious choices about which notaries to trust, which is hard |
| 146 | + to do, especially for newcomers to the ecosystem. (In practice, people use |
| 147 | + the out-of-the-box configuration, which is why everyone just uses |
| 148 | + `matrix.org` today). |
| 149 | + |
| 150 | +* A *correct* implementation of Perspectives really needs to take into account |
| 151 | + more than the latest state seen by the notary servers: some level of history |
| 152 | + should be taken into account too. |
| 153 | + |
| 154 | +Essentially, whilst we still believe the Perspectives approach has some merit, |
| 155 | +we believe it needs further research before it can be relied upon. We believe |
| 156 | +that the resources of the Matrix ecosystem are better spent elsewhere. |
| 157 | + |
| 158 | +### DANE |
| 159 | + |
| 160 | +DNS-Based Authentication of Named Entities (DANE) can be used as an alternative |
| 161 | +to the CA model. (It is arguably more appropriately used *together* with the CA |
| 162 | +model.) |
| 163 | + |
| 164 | +It is not obvious to the author of this proposal that DANE provides any |
| 165 | +material advantages over the CA model. In particular it replaces the |
| 166 | +centralised trust of the CAs with the centralised trust of the DNS registries. |
| 167 | + |
| 168 | +## Potential issues |
| 169 | + |
| 170 | +Beyond the problems already discussed with the CA model, requiring signed |
| 171 | +certificates comes with a number of downsides. |
| 172 | + |
| 173 | +### More difficult setup |
| 174 | + |
| 175 | +Configuring a working, federating homeserver is a process fraught with |
| 176 | +pitfalls. This proposal adds the requirement to obtain a signed certificate to |
| 177 | +that process. Even with modern initiatives such as Let's Encrypt, this is
| 178 | +another procedure requiring manual intervention across several moving parts. |
| 179 | + |
| 180 | +On the other hand: obtaining an SSL certificate should be a familiar process to |
| 181 | +anybody capable of hosting a production homeserver (indeed, they should |
| 182 | +probably already have one for the client port). This change also opens the |
| 183 | +possibility of putting the federation port behind a reverse-proxy without the |
| 184 | +need for additional configuration. Hopefully making the certificate usage more |
| 185 | +conventional will offset the overhead of setting up a certificate. |
| 186 | + |
| 187 | +Furthermore, homeserver implementations could provide an implementation of the |
| 188 | +ACME protocol and integration with Let's Encrypt, to make it easier for |
| 189 | +administrators to get started. (This would of course be |
| 190 | +implementation-specific, and administrators who wanted to keep control of the |
| 191 | +certificate creation process would be free to do so). |
| 192 | + |
| 193 | +### Inferior support for IP literals |
| 194 | + |
| 195 | +Whilst it is possible to obtain an SSL cert which is valid for a literal IP |
| 196 | +address, this typically requires purchase of a premium certificate; in |
| 197 | +particular, Let's Encrypt will not issue certificates for IP literals. This may |
| 198 | +make it impractical to run a homeserver which uses an IP literal, rather than a |
| 199 | +DNS name, as its `server_name`. |
| 200 | + |
| 201 | +It has long been the view of the `matrix.org` administrators that IP literals |
| 202 | +are only really suitable for internal testing. Those who wish to use them for |
| 203 | +that purpose could either disable certificate checks inside their network, or |
| 204 | +use their own CA to issue certificates. |
| 205 | + |
| 206 | +### Inferior support for hidden services (`.onion` addresses) |
| 207 | + |
| 208 | +It is currently possible to correctly route traffic to a homeserver on a |
| 209 | +`.onion` domain, provided any remote servers which may need to reach that |
| 210 | +server are configured to route to such addresses via the Tor network. However, |
| 211 | +it can be difficult to get a certificate for a `.onion` domain (again, Let's |
| 212 | +Encrypt do not support them). |
| 213 | + |
| 214 | +The reasons for requiring a signed certificate (or indeed, for using TLS at |
| 215 | +all) are weakened when traffic is routed via the Tor network. Administrators |
| 216 | +using the Tor network could disable certificate checks for `.onion` addresses. |
| 217 | + |
| 218 | +## Conclusion |
| 219 | + |
| 220 | +We believe that requiring homeservers to present an X.509 certificate signed by |
| 221 | +a recognised Certificate Authority will improve security, reduce |
| 222 | +centralisation, and eliminate some common deployment pitfalls. |
| 223 | + |
| 224 | +<a id="f1"/>[1] It's *possible* to set up homeservers to use servers other than |
| 225 | +`matrix.org` as notaries, but only a minority are actually set up this |
| 226 | +way. [↩](#a1) |
| 227 | + |
| 228 | +<a id="f2"/>[2] I've not been able to find an authoritative source on this, but |
| 229 | +most reverse-proxies will reject requests where the SNI and Host headers do not |
| 230 | +match. [↩](#a2) |