Skip to content

Conversation

@kaspar030
Copy link
Contributor

I stumbled across a few oddities with the embassy-time std driver:

  1. Both now() and schedule_wake() were locking inner. That made any use of now() within a waker cause a deadlock.

    => fixed by splitting inner into its fields queue and zero_instant, and using LazyLock for initialization.

  2. The alarm_thread() iterates the timer queue while holding the queue lock, but outside a critical section. Now if:

    1. another thread holds a critical section and calls schedule_wake(), it is blocked on the queue lock
    2. alarm_thread() wakes a waker that in its wake() enters a critical section
    3. the two threads dead-lock

    => fixed by wrapping the queue iteration in a critical section.

  3. Previously, schedule_wake() would just pass the Waker to the queue's schedule_wake() and signal the alarm thread. If two consecutive Wakers are scheduled, the first before now and the second after now, only the first would get woken (as soon as alarm_thread() processes the signal).
    This behaves differently than all the other time driver implementations, that all call queue.next_iteration() within schedule_wake() and thus execute already passed timeouts right away, causing two wakeups.

    While I believe consistent behavior alone justifies this change, it is also necessary for using the embassy-time machinery outside of tasks, because a Waker cannot re-schedule itself to a later timeout in its wake(), as calling schedule_wake() while the queue lock is held causes a deadlock. So a waker can only check if it should wake (through now() and its context). Through first setting it with at=0 (already passed), which wakes() once and then drops the waker from the queue, it can be set again with any at.

    => fixed by also calling queue.next_iteration() within schedule_wake().

@kaspar030
Copy link
Contributor Author

I do see this is basically fixing three things, should I split this up? Or would that triple testing?

@Dirbaio
Copy link
Member

Dirbaio commented Dec 19, 2025

we should avoid using critical_section in code targeting std. It's basically a global mutex, and using real mutexes allows for finer grained locking.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants