inconsistent node startup behavior depending on attempted ws-star swarm binding #1619
- Version: 0.32.3
- Platform: node
- Subsystem: js-libp2p-websockets
Type: Bug
Severity: High
Description:
In the browser, we generally have to use websocket-star swarm addresses (since binding TCP is not available). However, these addresses are temperamental and depend on the signaling server being available. Node startup will abort if the specified swarm address cannot be bound:
> const options = {
... "config": {
..... "Addresses": {
....... "Swarm": [
....... "/dns4/example.com/tcp/443/wss/p2p-websocket-star/" // bad
....... ]
....... }
..... },
... "EXPERIMENTAL": {
..... "pubsub": true
..... }
... }
undefined
> const node = new IPFS(options)
undefined
> Error: websocket error
at WS.Transport.onError (/home/arkadiy/tmp/dweb-transport/node_modules/engine.io-client...
> node.isOnline()
false
or even if ANY ws failure occurs before an address has been bound:
> const options = {
... "config": {
..... "Addresses": {
....... "Swarm": [
....... "/dns4/example.com/tcp/443/wss/p2p-websocket-star/", // bad
....... "/dns4/ws-star.discovery.libp2p.io/tcp/443/wss/p2p-websocket-star/" // good
.......
....... ]
....... }
..... },
... "EXPERIMENTAL": {
..... "pubsub": true
..... }
... }
undefined
>
> const node = new IPFS(options)
undefined
> Error: websocket error
at WS.Transport.onError (/home/arkadiy/tmp/dweb-transport/node_modules/engine.io-client/lib/transport.js:64:13)
at WebSocket.ws.onerror (/home/arkadiy/tmp/dweb-transport/node_modules/engine.io-client/lib/transports/websocket.js:150:10)
at WebSocket.onError (/home/arkadiy/tmp/dweb-transport/node_modules/engine.io-client/node_modules/ws/lib/EventTarget.js:109:16)
at WebSocket.emit (events.js:182:13)
at WebSocket.EventEmitter.emit (domain.js:442:20)
at WebSocket.finalize (/home/arkadiy/tmp/dweb-transport/node_modules/engine.io-client/node_modules/ws/lib/WebSocket.js:182:41)
at ClientRequest._req.on (/home/arkadiy/tmp/dweb-transport/node_modules/engine.io-client/node_modules/ws/lib/WebSocket.js:653:12)
at ClientRequest.emit (events.js:182:13)
at ClientRequest.EventEmitter.emit (domain.js:442:20)
at HTTPParser.parserOnIncomingClient [as onIncoming] (_http_client.js:546:21)
> node.isOnline()
false
however, if we succeed before we fail, the node starts up fine(-ish) and even lies about what addresses it's listening on:
> const options = {
... "config": {
..... "Addresses": {
....... "Swarm": [
....... "/dns4/ws-star.discovery.libp2p.io/tcp/443/wss/p2p-websocket-star/", // good
....... "/dns4/example.com/tcp/443/wss/p2p-websocket-star/", // bad
.......
....... ]
....... }
..... },
... "EXPERIMENTAL": {
..... "pubsub": true
..... }
... }
undefined
> const node = new IPFS(options)
undefined
> Swarm listening on /dns4/example.com/tcp/443/wss/p2p-websocket-star/ipfs/QmSmwDi3AmMm3pFbyvzmRZ3FfLtNAtYv5ie7ispER1kGUB // are you really?
Swarm listening on /dns4/ws-star.discovery.libp2p.io/tcp/443/wss/p2p-websocket-star/ipfs/QmSmwDi3AmMm3pFbyvzmRZ3FfLtNAtYv5ie7ispER1kGUB
> node.isOnline()
true
> node.swarm.addrs(a => console.log(a))
null
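A side note on the `null` above: it is probably the error argument rather than an address list, assuming `swarm.addrs` uses the error-first callback convention `(err, addrs)`. The sketch below shows the pattern with a hypothetical stub in place of a real node, so `stubNode`, `getSwarmAddrs`, and the placeholder value are illustrative only and not part of js-ipfs:

```javascript
// Hypothetical stand-in for an IPFS node so the snippet runs without js-ipfs.
// Assumption: swarm.addrs invokes its callback as (err, addrs).
const stubNode = {
  swarm: {
    addrs (callback) {
      // Report success asynchronously: err is null, result is the second argument.
      setImmediate(() => callback(null, ['placeholder-address-entry']))
    }
  }
}

// Wrap the error-first callback so the result is not mistaken for the error.
function getSwarmAddrs (node) {
  return new Promise((resolve, reject) => {
    node.swarm.addrs((err, addrs) => {
      if (err) {
        reject(err)
      } else {
        resolve(addrs)
      }
    })
  })
}

getSwarmAddrs(stubNode).then(addrs => console.log(addrs))
```

With a one-argument callback such as `a => console.log(a)`, only the first argument is logged, which is `null` on success under this convention.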
finally, if we specify an empty swarm address set, the node happily goes online:
> const options = {
... "config": {
..... "Addresses": {
....... "Swarm": [
....... ]
....... }
..... },
... "EXPERIMENTAL": {
..... "pubsub": true
..... }
... }
undefined
> const node = new IPFS(options)
undefined
> node.isOnline()
true
(the latter is useful if you want to dial another node directly, for example)
there seem to be at least three problems here:
- assumptions about node startup success based on TCP (where failure to bind a port is, reasonably, a fatal error) do not apply in more fragile environments like ws relay
- configuring no swarm addresses and ending up with no successfully bound swarm addresses are not treated the same way: the former goes online, the latter aborts startup
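- logic with multiple ws-star addresses is broken: whether startup succeeds depends on the order of the addresses, and an address that failed can still be reported as listening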
some related discussion in ipfs-shipyard/peer-pad-core#13, though the fix is probably at js-ipfs level