This repository was archived by the owner on Apr 26, 2024. It is now read-only.

Speed up push actions/unread counting #13846

@Fizzadar


I've been spending a bunch of time looking into push action processing & badge counting, and I think there'd be a real benefit to separating push actions from summaries (notification/unread/highlight counts). The mechanism that rolls event push actions into summaries introduces a lot of code and complexity, and the push actions table conflates push notifications with push counters. Suggestions:

Push actions

Firstly, push actions stay as is, using event_push_actions. Rows get deleted on receipt and are read/deleted by the pusher instances, so there is no need for any background work. The table is no longer used for any badge counting.

Badge counts

For badge counts we add a new table, event_push_counts, that looks roughly like (pseudo-SQL):

CREATE TABLE event_push_counts (
    user_id text,
    room_id text,
    thread_id text,
    event_stream_ordering bigint,  -- stream ordering of the event that produced this row
    notifs bigint,
    unreads bigint,
    highlights bigint
);
-- One row per user/room/thread *and* event stream ordering.
ALTER TABLE event_push_counts ADD CONSTRAINT uniq UNIQUE (user_id, room_id, thread_id, event_stream_ordering);

The key here is that rows are unique not just per user/room/thread but also per event stream ordering. This means that as events come in, new rows simply get appended according to the push actions per user. Because no existing row is updated, this avoids contention during the critical event insertion path.
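The append-only write path described above could look roughly like the following. This is only a sketch: SQLite stands in for the real database, `append_counts` is a hypothetical helper name, and `"main"` is an assumed placeholder thread ID for unthreaded events.

```python
import sqlite3

# In-memory SQLite is used purely to keep the sketch runnable; it is an
# assumption, not the database the proposal targets.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE event_push_counts (
        user_id TEXT,
        room_id TEXT,
        thread_id TEXT,
        event_stream_ordering BIGINT,
        notifs BIGINT,
        unreads BIGINT,
        highlights BIGINT,
        UNIQUE (user_id, room_id, thread_id, event_stream_ordering)
    )
    """
)


def append_counts(conn, room_id, thread_id, stream_ordering, per_user):
    """Append one row per affected user for a newly persisted event.

    `per_user` maps user_id -> (notifs, unreads, highlights). Only INSERTs
    are issued, so no existing row is read or updated on this path.
    (Hypothetical helper, not part of any existing codebase.)
    """
    conn.executemany(
        "INSERT INTO event_push_counts VALUES (?, ?, ?, ?, ?, ?, ?)",
        [
            (user_id, room_id, thread_id, stream_ordering, n, u, h)
            for user_id, (n, u, h) in per_user.items()
        ],
    )


# Two events in the same room yield two separate rows for the same user.
append_counts(conn, "!abc:matrix.org", "main", 10, {"@blah:matrix.org": (1, 1, 0)})
append_counts(conn, "!abc:matrix.org", "main", 11, {"@blah:matrix.org": (1, 1, 1)})
row_count = conn.execute("SELECT COUNT(*) FROM event_push_counts").fetchone()[0]
```

Each event only adds rows keyed by its own stream ordering, so concurrent event inserts never touch the same row.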

This makes counting a user's total unreads very simple: instead of the current loop & count per room, simply:

-- Counting all events for push badges
SELECT SUM(unreads)
FROM event_push_counts
WHERE user_id = '@blah:matrix.org';

-- Counting unread rooms for push badges
-- (COUNT(DISTINCT ...) returns a single number; only rows carrying unreads are considered)
SELECT COUNT(DISTINCT room_id)
FROM event_push_counts
WHERE user_id = '@blah:matrix.org'
AND unreads > 0;

-- Separated room/thread counts for sync responses
SELECT SUM(unreads), SUM(notifs), SUM(highlights), room_id, thread_id
FROM event_push_counts
WHERE user_id = '@blah:matrix.org'
AND room_id IN ('!abc:matrix.org', '!def:beeper.com')
GROUP BY room_id, thread_id;

The same applies to counting unreads for rooms in sync responses.
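To make the counting concrete, here is a small sketch that runs the three kinds of query against made-up sample rows. SQLite, the sample data, the `"main"` and `"$thread1"` thread IDs and the variable names are all assumptions for illustration. The unread-room count uses `COUNT(DISTINCT room_id)` so it yields a single number.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE event_push_counts ("
    " user_id TEXT, room_id TEXT, thread_id TEXT,"
    " event_stream_ordering BIGINT,"
    " notifs BIGINT, unreads BIGINT, highlights BIGINT,"
    " UNIQUE (user_id, room_id, thread_id, event_stream_ordering))"
)
# Made-up sample rows: (user, room, thread, stream ordering, notifs, unreads, highlights)
conn.executemany(
    "INSERT INTO event_push_counts VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("@blah:matrix.org", "!abc:matrix.org", "main", 10, 1, 1, 0),
        ("@blah:matrix.org", "!abc:matrix.org", "main", 11, 1, 1, 1),
        ("@blah:matrix.org", "!abc:matrix.org", "$thread1", 12, 0, 1, 0),
        ("@blah:matrix.org", "!def:beeper.com", "main", 13, 1, 1, 0),
        ("@other:matrix.org", "!abc:matrix.org", "main", 10, 1, 1, 0),
    ],
)
user = "@blah:matrix.org"

# Total unread events for the badge (COALESCE covers users with no rows).
total_unreads = conn.execute(
    "SELECT COALESCE(SUM(unreads), 0) FROM event_push_counts WHERE user_id = ?",
    (user,),
).fetchone()[0]

# Number of rooms that contain at least one unread event.
unread_rooms = conn.execute(
    "SELECT COUNT(DISTINCT room_id) FROM event_push_counts"
    " WHERE user_id = ? AND unreads > 0",
    (user,),
).fetchone()[0]

# Per room/thread counts, as a sync response would need them.
per_thread = conn.execute(
    "SELECT room_id, thread_id, SUM(unreads), SUM(notifs), SUM(highlights)"
    " FROM event_push_counts"
    " WHERE user_id = ? AND room_id IN (?, ?)"
    " GROUP BY room_id, thread_id ORDER BY room_id, thread_id",
    (user, "!abc:matrix.org", "!def:beeper.com"),
).fetchall()
```

With this sample data `total_unreads` is 4 and `unread_rooms` is 2; rows belonging to the other user are never scanned into the result.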

It is still possible to summarise these by merging rows into a higher stream ordering. Like the current system, this doesn't account for receipts that are not at the latest stream ordering, but the summarisation could be delayed to provide a window of support for this if desired. The table is leaner than the push actions table, so this shouldn't be such an issue (but it is still important to do to keep the table fast).
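One way the merge could work, as a rough sketch only: fold every row at or below a cutoff into a single row per user/room/thread, stored at the highest stream ordering that was merged. The `summarise` function, the cutoff policy and the use of SQLite are assumptions, not a settled design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE event_push_counts ("
    " user_id TEXT, room_id TEXT, thread_id TEXT,"
    " event_stream_ordering BIGINT,"
    " notifs BIGINT, unreads BIGINT, highlights BIGINT,"
    " UNIQUE (user_id, room_id, thread_id, event_stream_ordering))"
)
conn.executemany(
    "INSERT INTO event_push_counts VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("@blah:matrix.org", "!abc:matrix.org", "main", 10, 1, 1, 0),
        ("@blah:matrix.org", "!abc:matrix.org", "main", 11, 1, 1, 1),
        ("@blah:matrix.org", "!abc:matrix.org", "main", 12, 0, 1, 0),
        ("@blah:matrix.org", "!abc:matrix.org", "main", 20, 1, 1, 0),
    ],
)


def summarise(conn, cutoff):
    """Merge all rows at or below `cutoff` into one row per user/room/thread.

    Hypothetical helper: the merged row takes the highest stream ordering
    among the rows it replaces, so totals are unchanged.
    """
    with conn:  # commit on success, roll back on error
        merged = conn.execute(
            "SELECT user_id, room_id, thread_id, MAX(event_stream_ordering),"
            " SUM(notifs), SUM(unreads), SUM(highlights)"
            " FROM event_push_counts WHERE event_stream_ordering <= ?"
            " GROUP BY user_id, room_id, thread_id",
            (cutoff,),
        ).fetchall()
        # Delete first so the merged row cannot collide with the unique key.
        conn.execute(
            "DELETE FROM event_push_counts WHERE event_stream_ordering <= ?",
            (cutoff,),
        )
        conn.executemany(
            "INSERT INTO event_push_counts VALUES (?, ?, ?, ?, ?, ?, ?)", merged
        )


unreads_before = conn.execute("SELECT SUM(unreads) FROM event_push_counts").fetchone()[0]
summarise(conn, cutoff=12)
remaining = conn.execute(
    "SELECT event_stream_ordering, notifs, unreads, highlights"
    " FROM event_push_counts ORDER BY event_stream_ordering"
).fetchall()
unreads_after = conn.execute("SELECT SUM(unreads) FROM event_push_counts").fetchone()[0]
```

Choosing the cutoff well behind the newest stream ordering is what would provide the delay window mentioned above.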

Finally, rows could be cleaned out either on receipt or as a background job processing receipts. If there was a sufficient (24h?) delay before any summarisation phase, deleting on receipt shouldn't result in much contention on the table.
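Deleting on receipt might be as simple as the sketch below. It assumes a receipt can be resolved to a stream ordering and that it applies to a single thread; `clear_on_receipt` is a hypothetical name, and SQLite is used only so the example runs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE event_push_counts ("
    " user_id TEXT, room_id TEXT, thread_id TEXT,"
    " event_stream_ordering BIGINT,"
    " notifs BIGINT, unreads BIGINT, highlights BIGINT,"
    " UNIQUE (user_id, room_id, thread_id, event_stream_ordering))"
)
conn.executemany(
    "INSERT INTO event_push_counts VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("@blah:matrix.org", "!abc:matrix.org", "main", 10, 1, 1, 0),
        ("@blah:matrix.org", "!abc:matrix.org", "main", 11, 1, 1, 1),
        ("@blah:matrix.org", "!abc:matrix.org", "main", 12, 1, 1, 0),
    ],
)


def clear_on_receipt(conn, user_id, room_id, thread_id, receipt_stream_ordering):
    """Drop count rows for events at or before the read receipt.

    Hypothetical helper; returns the number of rows removed.
    """
    with conn:  # commit on success, roll back on error
        cur = conn.execute(
            "DELETE FROM event_push_counts"
            " WHERE user_id = ? AND room_id = ? AND thread_id = ?"
            " AND event_stream_ordering <= ?",
            (user_id, room_id, thread_id, receipt_stream_ordering),
        )
    return cur.rowcount


# A receipt at stream ordering 11 clears the first two rows only.
deleted = clear_on_receipt(conn, "@blah:matrix.org", "!abc:matrix.org", "main", 11)
unreads_left = conn.execute(
    "SELECT COALESCE(SUM(unreads), 0) FROM event_push_counts WHERE user_id = ?",
    ("@blah:matrix.org",),
).fetchone()[0]
```

Because rows keep their own stream ordering until summarised, a receipt that is not at the latest event still leaves the newer rows counted.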


(If this seems sensible, I can invest time to implement over the next few weeks)

Metadata


    Labels

    A-Database: DB stuff like queries, migrations, new/remove columns, indexes, unexpected entries in the db
    O-Frequent: Affects or can be seen by most users regularly or impacts most users' first experience
    S-Minor: Blocks non-critical functionality, workarounds exist.
    T-Enhancement: New features, changes in functionality, improvements in performance, or user-facing enhancements.
    T-Task: Refactoring, removal, replacement, enabling or disabling functionality, other engineering tasks.
