-
Notifications
You must be signed in to change notification settings - Fork 405
MSC4143: MatrixRTC #4143
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
MSC4143: MatrixRTC #4143
Conversation
d717c0b
to
cff8291
Compare
Signed-off-by: Timo K <[email protected]>
Signed-off-by: Timo K <[email protected]>
proposals/4143-matrix-rtc.md
Outdated
## Unstable prefix | ||
|
||
The state events and the well_known key introduced in this MSC use the unstable prefix | ||
`org.matrix.msc4143.` instead of `m.` as used in the text. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
empirically we seem to be using org.matrix.msc3401.call.member
rather than org.matrix.msc4143.rtc.member
in Element?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current implementation is still using the msc3401 prefix. That is wrong and will be addressed. There is still the open topic of how exactly we want to do the state keys and the event ownership and on top of that we have plans for how to index rtc member events in a better way.
The reason we changed it to (...call...
->) ...rtc...
is, that we need the call
namespace for the particular video call matrixRTC application (session type) of calling over MatrixRTC. Using that word for both. Matrix rtc sessions in general and calls will be confusing in the long run.
proposals/4143-matrix-rtc.md
Outdated
`created_ts()`+`device_id`. This is why the `m.rtc.member` events deliberately do NOT include a `membership_id`. | ||
|
||
Other then the membership sessions, there is **no event** to represent a rtc session (containing all members). | ||
Such an event would include shared information, and deciding who has authority over that is not trivial. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I continue to trip over whether it is wise to force clients to read all possible m.rtc.member
events to figure out if a call is happening, and who created it.
I /think/ that a better reason for not having an m.rtc
state event describing the existence of an RTC session is that you'd have to handle disconnection semantics on it similar to delayed-events for m.rtc.member
... at which point, why not leverage the membership events?
However, it still feels REALLY weird to not have something in state telling you whether semantically a call is intended to be happening now (and what that sort of call is, when it began, and who initiated it) - versus having to infer it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively, if the thought experiment is "what if two users both create an m.rtc
state event on different forks of the DAG at the same time?" ... is that really so bad? and does aggregating m.rtc.member
state actually make it better? if so, how?
In other words, we actually need to justify the lack of m.rtc
state event much better here, imo. In particular, having somewhere to store the metadata about the call at the point of creation (its name, its ID, whether it's intended to be a voice/video room or a group call or a conference, etc)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a large list of reasons and I 100% support the idea to go into detail in the MSC to justify this approach.
As for this comment I will just give a couple of short examples/arguments:
The security and "trolling" surface is huge. If we have one state event we either limit who can create a call or we allow everyone to mess with the event. This can go from smaller issues like ending the call for fun to larger issues like changing where the call is happening without the members noticing it. (in a LK world at least)
But even if everyone plays fair, a call can be stopped by a delayed event because the creator failed to send the refresh event. This now disconnects everyone from the call.
Independent how we make things behave, if we have a public/shared event controlling the experience for everyone we switch from:
- If there is a client with issues (or user that actively introduces issues) that user has a degraded experience
- If there is a client with issues (or user that actively introduces issues) everyone experience can be broken.
In the context of matrix where there is no central entity controlling the clients the seconds seems to be the only valid option.
|
||
- [`m.call`](www.example.com) TODO: create `m.call` MSC and add link here. | ||
|
||
## Potential issues |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fact we don't reference how to tell users that a call is happening (i.e. m.call.notify) is very disorienting here.
proposals/4143-matrix-rtc.md
Outdated
// event type: "m.rtc.member" | ||
// event key: "@user:matrix.domain_DEVICEID" | ||
{ | ||
"leave_reason": "CONNECTION_LOST" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should be namespaced. what other leave reasons are there?
proposals/4143-matrix-rtc.md
Outdated
fields are explained and how the communication with the possible foci is | ||
defined: | ||
|
||
- [`m.call`](www.example.com) TODO: create `m.call` MSC and add link here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something which is very unclear right now is the lifecycle of a call (and the draft of MSC4196 doesn't really help).
Specifically, what has happened to m.call.invite
? m.call.answer
? m.call.select_answer
? m.call.hangup
? The intention of MSC3401 was to keep using these (as per https://github.com/matrix-org/matrix-spec-proposals/blob/matthew/group-voip/proposals/3401-group-voip.md#basic-call) even when an SFU is involved, in order to preserve signalling for:
- Call was placed but isn't ringing yet (not that we have in legacy VoIP)
- Call is ringing
- Caller gave up and stopped ringing
- Call has been answered on a device
- Call was rejected locally
- Call was rejected on all devices
- Call was answered
- Call was hung up
i.e. all the state machine transitions that go into the lifecycle of a call.
I'm very worried that this seems to have been collapsed to a single m.call.notify
event from MSC4075 + m.rtc.member
events here, which seems to be missing most of the above. For instance, I can't see how the caller tells the callee that it's no longer ringing them - not to mention stuff like early media (MSC3635) or ringback tones.
Legacy VoIP was already a dumbed down version of SIP, and I'm worried that the lack of signalling semantics here is going to make mimicing SIP or PSTN semantics really hard (let alone bridging to it), and would be a backwards step from legacy voip.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason that information about this is so thin here is that this proposal tries to define the line between matrixRTC and the special case matrix rtc use case of calls.
The focus should be to find the minimum requirements for a rtc session so clients can implement everything from this msc and they will be able to have a rough idea what is happening in a room.
Everything that is not applicable to all kinds of sessions should be part of their dedicated msc's.
Calling in particular has a very rich setup/notification cycle that is not necessarily part of every rtc session.
Th idea very much is however to fully support this list:
- Call was placed but isn't ringing yet (not that we have in legacy VoIP) !! This one is TBD
- Call is ringing
- Caller gave up and stopped ringing
- Call has been answered on a device
- Call was rejected locally
- Call was rejected on all devices
- Call was answered
- Call was hung up
This proposals tries to not enforce this for matrixRTC in general however.
Your comment makes me wonder if there is any benefit in reusing the already existing legacy calling events for m.call.invite
m.call.answer
m.call.select_answer
m.call.hangup
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this proposal tries to define the line between matrixRTC and the special case matrix rtc use case of calls.
are we sure this is not an unnecessary abstraction right now? we could add it later, if/when we have a use case for MatrixRTC which doesn't involve calling? and meanwhile keep the MSCs easier to follow and less fragmented?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your comment makes me wonder if there is any benefit in reusing the already existing legacy calling events for m.call.invite m.call.answer m.call.select_answer m.call.hangup ...
There may not be any advantage to reusing those events, but it's very hard to tell when I can't figure out how those semantics map onto a MatrixRTC call life cycle today. My spidey sense is that having to infer the call state machine out of m.rtc.member and m.rtc.notify events is going to be fragile, hard to reason about, and hard to extend (e.g. how would you do early media, for compatibility with MSC3635?) - rather than throwing explicit events around when things happen. But am very happy to be corrected... especially if the rationale ends up in an MSC :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are we sure this is not an unnecessary abstraction right now? we could add it later, if/when we have a use case for MatrixRTC which doesn't involve calling? and meanwhile keep the MSCs easier to follow and less fragmented?
Since the ring procedure is entirely detangled from the rtc signalling and the session participation I am not sure it really would make this MSC easier to understand. We would need to state very explicitly, that the lifecycle is entirely detangled and implementations cannot rely on it in any way to actually create the session.
…atrix_client_property_org_matrix_msc4143_rtc_foci` so that a list is produced `group_vars/matrix_servers` was correctly populating `matrix_static_files_file_matrix_client_property_org_matrix_msc4143_rtc_foci_auto` with a list, but: - the defaults for these variables were hinting that hashmaps are necessary - merging of `_auto` and `_custom` was done as if for hashmaps, not lists As a result, `/.well-known/matrix/client` looked like this: ```json { "org.matrix.msc4143.rtc_foci": { "livekit_service_url": "https://matrix.example.com/livekit-jwt-service", "type": "livekit" } } ``` .. instead of what's expected as per MSC4143 (matrix-org/matrix-spec-proposals#4143): ```json { "org.matrix.msc4143.rtc_foci": [ { "livekit_service_url": "https://matrix.example.com/livekit-jwt-service", "type": "livekit" } ] } ``` Regardless of our incorrectly formatted `org.matrix.msc4143.rtc_foci` configuration in `/.well-known/matrix/client`, Element Web still seemed to be able to discover LiveKit JWT Service (and by extension, LiveKit Server) correctly, even without this fix.
}, | ||
"member": { | ||
"id": "xyzABCDEF10123", | ||
"device_id": "DEVICEID", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is there a device_id
, user_id
here?
These are claimed fields, no need to have them they can be found from the olm session that was used for decryption
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no olm session for the m.call.member
state events.
And we might want the option to create memberships from other accounts. (bots creating multiple memberships)
proposals/4143-matrix-rtc.md
Outdated
The fields are as follows: | ||
|
||
- `member` required object - describes the participant of the RTC session: | ||
- `id` required string - a unique identifier for this session membership as defined above. Recommended to be a UUID. It can be reused if the user leaves and rejoins the session. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe obvious but presumably the recommendation is for UUIDv4?
proposals/4143-matrix-rtc.md
Outdated
|
||
#### Leaving a session | ||
|
||
Sending an empty `m.rtc.member` event represents a leave action. The state key must be the same as boefore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is said further up that the session participant (member.user_id
) is not necessarily identical to the sender of the m.rtc.member
event. How does this interact with state protection when a user wants to leave a session that they were added to by somebody else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a really good reason, that the state key should not be used as the state protection. We would like the external user to create a session for you and allow you + them to modify this state key.
All membership events that belong to one member session can be grouped with the index | ||
`created_ts()`+`state_key`. This is why the `m.rtc.member` events deliberately do NOT include something akin to a `membership_id`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious why the chain is being build via the timestamp of the initial event and not its ID?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The timestamp is needed anyhow to compute expired events. So this is one field that is the same in all events and where we do not need an iterative approach. In the end this is an implementation detail. The spec only clarifies that this is a required property (one an implementation is allowed to rely on) of a membership.
membershipsForSingleDevice.map((m)=>m.created_ts ?? m.origin_server_ts).filter((ts)=>ts===related_events_ts)
Is a valid implementation to collect all events for connected session.
It still does not help with finding the associated leave event.
It is still helpful in making clear how created_ts
behaves.
> on the map, so that clients can omit connecting to participants that are not in their | ||
> area of interest. | ||
|
||
#### State key for `m.rtc.member` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Several applications (widgets) can send this event to the same room. In most cases the application should only receive and send it's own member events, so that data is not shared and cannot be modified by another application. Unless someone writes a widget to show and or manage all rtc data in the room.
It would be nice if the spec could define this behavior. Maybe we could have some changes to widget capabilities or some default behavior for this state event.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on what field would the widget api/driver filter the events. They all have the same type.
Signed-off-by: Timo K <[email protected]>
Signed-off-by: Timo K <[email protected]>
…device id and application Signed-off-by: Timo K <[email protected]>
Co-authored-by: Johannes Marbach <[email protected]>
Rendered
To-do:
Pull Request Checklist
#matrix-spec:matrix.org to
get feedback on this PR.