Consider an RTC-style web app concurrently running a collection of ML workloads with various response requirements. Assume there aren't capacity issues (the system as a whole has enough horsepower to execute the workloads on average).
| Workload | Period/response requirement | Job completion delay |
|---|---|---|
| Audio capture processing (denoising, etc.) | 50 Hz | 25% |
| Video capture processing (e.g. background blur) | 30 Hz | 10-30% |
| WebSpeech | 500ms? | (continuous) |
| LLM | Bursty | Bursty |
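To make the budgets concrete (assuming the percentages are per-job latency budgets relative to the period, which is my reading rather than something stated above), the deadlines work out to just a few milliseconds:

```ts
// Assumption: "job completion delay" is a per-job latency budget expressed as
// a fraction of the workload's period. Names here are illustrative only.
function deadlineMs(periodHz: number, budgetFraction: number): number {
  const periodMs = 1000 / periodHz;
  return periodMs * budgetFraction;
}

console.log(deadlineMs(50, 0.25)); // audio capture: 20 ms period -> 5 ms per job
console.log(deadlineMs(30, 0.10)); // video capture: ~33 ms period -> ~3.3 ms
console.log(deadlineMs(30, 0.30)); // video capture: ~33 ms period -> ~10 ms
```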
If WebNN is not made aware of these requirements, the concern is that the UA will make poor scheduling decisions and the system will miss deadlines. The effects include:
- robo-sound
- audio glitches
- dropped video frames
- janky video
- increased end-to-end delay (since receiver jitter buffers need to bump their target delay to account for the problems).
The ideal state is that the system understands these requirements. With the right scheduling decisions, LLM queries may be delayed a little and WebSpeech latency may go up somewhat, but as a whole the system meets expectations.
Assuming the backend frameworks are capable of workload prioritization, I see a couple of options to help this case:
- Equip the WebNN API with hints describing the workload being passed (a rough sketch follows this list).
- Auto-detect the type of processing. For example, when audio or video capture wakes a worker, it could carry a token containing periodicity info and let that influence scheduling decisions in the WebNN backend.
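To make the first option concrete, here is a minimal sketch of what such a hint could look like at context creation. Everything named `WorkloadHint`, `workloadHint`, `targetPeriodMs`, and `deadlineFraction` is hypothetical and not in the current WebNN spec; it only illustrates the kind of information the UA/backend would need in order to schedule well.

```ts
// Hypothetical workload hint -- NOT part of the WebNN spec today.
interface WorkloadHint {
  kind: "audio-capture" | "video-capture" | "speech" | "llm";
  targetPeriodMs?: number;   // how often jobs arrive, e.g. 20 for 50 Hz audio
  deadlineFraction?: number; // fraction of the period a single job may consume
}

async function createRealtimeAudioContext(): Promise<unknown> {
  // navigator.ml.createContext() is existing WebNN; the extra `workloadHint`
  // member is the proposed/hypothetical addition sketched in this issue.
  const ml = (navigator as unknown as {
    ml: { createContext(options: object): Promise<unknown> };
  }).ml;
  const hint: WorkloadHint = {
    kind: "audio-capture",
    targetPeriodMs: 20,      // 50 Hz capture
    deadlineFraction: 0.25,  // each job should finish within ~5 ms
  };
  return ml.createContext({
    powerPreference: "high-performance", // existing MLContextOptions member
    workloadHint: hint,                  // hypothetical new member
  });
}
```

Under the second option, the UA could attach an equivalent hint automatically when the worker is driven by an audio or video capture pipeline, so the page wouldn't need to spell it out.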
My guess is that both would need spec updates, or maybe just the first option?
See also crbug.com/456006123 for repros of related issues with technologies other than WebNN. Also note that this issue is not asking for ways to understand total system capacity, since that is impossible to compute without actually trying the workloads out.