Recent & Upcoming Talks

An Introduction to Quantization of Large Language Models

A talk about efficient LLMs, with a special focus on quantization.

Last updated on Aug 30, 2023

Model Compression Towards Efficient Deep Learning Inference

A talk on model compression towards efficient DL inference.

Last updated on Aug 29, 2023

Neural Architecture Search and Architecture Encoding

A talk on NAS research at Renmin University.

Last updated on Dec 12, 2022