Description
System information
- TensorFlow.js version (you are using): 4.20
- Are you willing to contribute it (Yes/No): Maybe :)
Describe the feature and the current behavior/state.
- We are looking into using the WebGPU backend for inference and see a decent improvement (~5-10%) over WebGL for our models, but it is much lower than we expected.
- One potential way to speed up inference would be to use the fp16 data type instead of fp32 for tensors. The WebGL backend already supports fp16, which we use. WebGPU also supports fp16, at least in Chrome on desktop (https://chromestatus.com/feature/5180552617656320)
- Ideally, we would like to use the F32_F16 precision mode as defined in TFLite to get the best trade-off between precision loss and performance.
Will this change the current api? How?
- An environment flag to set the precision (similar to the WebGL backend) would be ideal for ease of integration.
Who will benefit with this feature?
- All consumers of WebGPU backend.