Feature request
Currently, Transformers.js V3 defaults to using the CPU (WASM) instead of the GPU (WebGPU) due to lack of support and instability across browsers (specifically Firefox and Safari, and Chrome on Ubuntu). However, this provides a poor user experience, since performance is left on the table. As browser support for WebGPU increases (currently ~70%), this will become more important, since users may experience poor performance when better settings are available.
A better default would be device: "auto" instead of device: null, which would select (1) the quantization and (2) the device based on the following:
- Browser support (e.g., whether WebGPU is enabled)
- Device capabilities (OS, mobile vs. desktop, fp16 support)
- Model architecture/type: some models have ops which are not supported in WebGPU (e.g., BERT models are more likely to succeed than encoder-decoder models)
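To make the criteria above concrete, here is a rough sketch of how device: "auto" could be resolved. The helper names (detectCapabilities, resolveAutoDefaults), the allow-list of model types, and the specific dtype choices are all hypothetical and only for illustration; none of them exist in Transformers.js today. The only real APIs assumed are the WebGPU calls navigator.gpu.requestAdapter() and adapter.features.has("shader-f16").

```javascript
// Hypothetical sketch: these helpers are NOT part of Transformers.js.

// Model types assumed (for illustration only) to run reliably on WebGPU.
const WEBGPU_FRIENDLY_MODEL_TYPES = new Set(["bert"]);

// Probe the environment: is WebGPU usable, and does the adapter support fp16 shaders?
async function detectCapabilities() {
  const gpu = typeof navigator !== "undefined" ? navigator.gpu : undefined;
  if (!gpu) return { webgpu: false, fp16: false };
  try {
    const adapter = await gpu.requestAdapter();
    if (!adapter) return { webgpu: false, fp16: false };
    return { webgpu: true, fp16: adapter.features.has("shader-f16") };
  } catch {
    // Treat any failure while probing as "WebGPU not available".
    return { webgpu: false, fp16: false };
  }
}

// Pure policy: map capabilities and model type to a device and dtype.
function resolveAutoDefaults({ webgpu, fp16 }, modelType) {
  if (webgpu && WEBGPU_FRIENDLY_MODEL_TYPES.has(modelType)) {
    return { device: "webgpu", dtype: fp16 ? "fp16" : "fp32" };
  }
  // Otherwise fall back to CPU (WASM); q8 is an illustrative choice here.
  return { device: "wasm", dtype: "q8" };
}
```

The resolved values could then be passed as the device and dtype options when creating a pipeline. Keeping the policy in a pure function separate from the capability probe would make it easy to unit test and to extend with OS or mobile/desktop checks later.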
Motivation
Improve user experience and performance with better defaults
Your contribution
Will work with @FL33TW00D on this