Recent & Upcoming Talks
An Introduction to Quantization of Large Language Models
A talk on efficient LLMs, with a special focus on quantization.
Last updated on Aug 30, 2023
Model Compression Towards Efficient Deep Learning Inference
A talk on model compression towards efficient DL inference.
Last updated on Aug 29, 2023
Neural Architecture Search and Architecture Encoding
A talk on NAS research at Renmin University.
Last updated on Dec 12, 2022