Support for workloads with response time requirements (realtime) #898

@handellm

Consider an RTC-style web app concurrently running a collection of ML workloads with various response-time requirements. Assume that there are no capacity issues (the system as a whole has enough compute to execute the workloads on average).

| Workload | Period/response requirement | Job completion delay |
| --- | --- | --- |
| Audio capture processing (denoising etc.) | 50 Hz | 25% |
| Video capture processing (e.g. background blur) | 30 Hz | 10-30% |
| WebSpeech | 500 ms? | (continuous) |
| LLM | Bursty | Bursty |
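To make the numbers concrete, here is a rough sketch of the time budgets the periodic rows imply. It assumes the "job completion delay" percentages are fractions of the period, which is one reading of the table and not something stated above; the function and variable names are made up for illustration.

```typescript
// Convert a rate in Hz to a period in milliseconds.
function periodMs(rateHz: number): number {
  return 1000 / rateHz;
}

// Completion budget in milliseconds.
// ASSUMPTION: the percentage is a fraction of the period.
function completionBudgetMs(rateHz: number, fraction: number): number {
  return periodMs(rateHz) * fraction;
}

// Audio: 50 Hz gives a 20 ms period; 25% of that is 5 ms.
const audioBudgetMs = completionBudgetMs(50, 0.25);

// Video: 30 Hz gives a period of about 33.3 ms; 10-30% is about 3.3-10 ms.
const videoBudgetMinMs = completionBudgetMs(30, 0.10);
const videoBudgetMaxMs = completionBudgetMs(30, 0.30);

console.log({ audioBudgetMs, videoBudgetMinMs, videoBudgetMaxMs });
```

Under that reading, the audio and video jobs have single-digit millisecond budgets, which is why a long-running LLM job ahead of them in a queue is enough to cause a miss.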

If WebNN does not take these requirements into account, there are concerns that the UA will make poor scheduling decisions, causing the system to miss deadlines. The effects:

  • robo-sound
  • audio glitches
  • dropped video frames
  • janky video
  • increased end-to-end delay (since receiver jitterbuffers need to bump target delay to account for the problems).

The ideal state is that the system understands the requirements. With the right scheduling decisions, LLM queries will be delayed a little and the latency of WebSpeech may go up, but the system as a whole meets expectations.

Assuming the backend frameworks are capable of supporting workload prioritization, I see some options to help with this case:

  1. Equip the WebNN API with hints on the workload passed.
  2. Auto-detect the type of processing. For example, when audio or video capture wakes a worker it could transport a token containing periodicity info and let that influence scheduling decisions in the WebNN backend.
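As a rough illustration of what either option could feed into, here is a minimal sketch of a hint record and an earliest-deadline-first ordering over pending jobs. Everything in it is hypothetical: `WorkloadHint`, `Job`, and `scheduleEdf` are not part of the WebNN specification or any backend, and the deadline values are illustrative figures loosely based on the rates listed earlier. With option 1 the app would supply the hint; with option 2 the UA could derive the same fields from a token attached to the capture wake-up.

```typescript
// HYPOTHETICAL hint shape; not part of the WebNN specification.
interface WorkloadHint {
  name: string;
  periodMs?: number;   // undefined for bursty work such as LLM queries
  deadlineMs?: number; // budget after release; undefined means best effort
}

interface Job {
  hint: WorkloadHint;
  releasedAtMs: number; // when the job became runnable
}

// Absolute deadline; best-effort jobs get Infinity so they sort last.
function absoluteDeadline(job: Job): number {
  return job.hint.deadlineMs === undefined
    ? Infinity
    : job.releasedAtMs + job.hint.deadlineMs;
}

// Earliest-deadline-first ordering; ties fall back to release time.
function scheduleEdf(jobs: Job[]): Job[] {
  return [...jobs].sort((a, b) => {
    const da = absoluteDeadline(a);
    const db = absoluteDeadline(b);
    if (da === db) return a.releasedAtMs - b.releasedAtMs; // also avoids Infinity - Infinity
    return da - db;
  });
}

// Illustrative values only (assumed, not measured).
const jobs: Job[] = [
  { hint: { name: "llm" }, releasedAtMs: 0 },
  { hint: { name: "speech", deadlineMs: 500 }, releasedAtMs: 0 },
  { hint: { name: "video", periodMs: 1000 / 30, deadlineMs: 10 }, releasedAtMs: 2 },
  { hint: { name: "audio", periodMs: 20, deadlineMs: 5 }, releasedAtMs: 4 },
];

const order = scheduleEdf(jobs).map((j) => j.hint.name);
console.log(order); // audio, video, speech, llm
```

A real backend would also need preemption or fine-grained chunking for this ordering to help once a long job has already started; the sketch only covers queue ordering.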

My guess is both need spec updates? Or maybe just the first option?

See also crbug.com/456006123 for repros of related issues with technologies other than WebNN. Also note that this issue is not asking for ways to understand total system capacity, as this is impossible to compute without trying workloads out.
