The notes on this page are the author's fragmentary and immature thoughts. Please read with your own judgement!
Rethinking floating point for deep learning
Training Deep Neural Networks with 8-bit Floating Point Numbers
8-Bit Quantization and TensorFlow Lite: Speeding up mobile inference with low precision
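As a reminder of the scheme the TF Lite post is about, here is a minimal sketch of affine (scale + zero-point) 8-bit quantization, where a real value is approximated as scale * (q - zero_point). The function names and the NumPy-only setup are my own illustration, not code from any of the articles above.

```python
import numpy as np

def affine_quantize(x, num_bits=8):
    """Map a float array to uint8 with an affine (scale + zero-point) scheme.

    Approximation used: real_value ~= scale * (quantized_value - zero_point).
    """
    qmin, qmax = 0, 2 ** num_bits - 1
    # Include 0 in the range so that real zero is exactly representable.
    x_min = min(float(x.min()), 0.0)
    x_max = max(float(x.max()), 0.0)
    scale = (x_max - x_min) / (qmax - qmin) or 1.0  # avoid zero scale for constant input
    zero_point = int(round(qmin - x_min / scale))
    zero_point = max(qmin, min(qmax, zero_point))
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.uint8)
    return q, scale, zero_point

def affine_dequantize(q, scale, zero_point):
    """Recover approximate float values from the quantized representation."""
    return scale * (q.astype(np.float32) - zero_point)

if __name__ == "__main__":
    x = np.random.randn(5).astype(np.float32)
    q, scale, zp = affine_quantize(x)
    print(x)
    print(affine_dequantize(q, scale, zp))  # close to x, up to quantization error
```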