
dc.contributor.advisor: Polyanskiy, Yury
dc.contributor.author: Savkin, Semyon
dc.date.accessioned: 2026-02-12T17:13:35Z
dc.date.available: 2026-02-12T17:13:35Z
dc.date.issued: 2025-09
dc.date.submitted: 2025-09-15T14:56:28.393Z
dc.identifier.uri: https://hdl.handle.net/1721.1/164834
dc.description.abstract: We study quantization in Machine Learning. First, we introduce NestQuant, a technique for quantization of matrix products and post-training quantization of LLMs. Beyond reducing the memory footprint, quantization accelerates inference, as the primary bottleneck during autoregressive generation is often the memory bandwidth. NestQuant leverages two nested lattices to construct an efficient vector codebook for quantization, along with practical encoding and decoding algorithms. The approach is grounded in recent theoretical work that characterizes the optimal rate–distortion trade-off for matrix products. Empirically, on Llama-3-8B, it reduces the perplexity gap between full-precision and quantized models by more than 55% relative to the current state-of-the-art technique (SpinQuant). Second, we investigate data-domain quantization for RF signals. We propose a tokenized transformer for source separation that discretizes RF waveforms into learned tokens and operates directly on the resulting sequences, outperforming strong convolutional baselines. Together, these contributions connect information-theoretic limits with deployable systems: structured vector quantizers accelerate LLM inference and enable competitive discrete representations for RF tasks.
dc.publisher: Massachusetts Institute of Technology
dc.rights: In Copyright - Educational Use Permitted
dc.rights: Copyright retained by author(s)
dc.rights.uri: https://rightsstatements.org/page/InC-EDU/1.0/
dc.title: Quantization Methods for Matrix Multiplication and Efficient Transformers
dc.type: Thesis
dc.description.degree: M.Eng.
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
mit.thesis.degree: Master
thesis.degree.name: Master of Engineering in Electrical Engineering and Computer Science

