Thursday, August 10, 2023

AI on your phone? Tim Dettmers on quantization of neural networks — Manifold #41


Tim Dettmers develops computationally efficient methods for deep learning. He is a leader in quantization: coarse graining of large neural networks to increase speed and reduce hardware requirements. 

Tim developed 4- and 8-bit quantization methods that enable training and inference with large language models on affordable GPUs and CPUs, such as those commonly found in home gaming rigs.

Tim and Steve discuss: Tim's background and current research program, large language models, quantization and performance, democratization of AI technology, the open source Cambrian explosion in AI, and the future of AI. 

0:00 Introduction and Tim’s background 
18:02 Tim's interest in the efficiency and accessibility of large language models 
38:05 Inference, speed, and the potential for using consumer GPUs for running large language models 
45:55 Model training and the benefits of quantization with QLoRA 
57:14 The future of AI and large language models in the next 3-5 years and beyond
