Xuefei Ning

Research Assistant Professor at Tsinghua University

NICS-EFC, EE Dept., Tsinghua University

Since 2024, I have been a research assistant professor in the NICS-EFC group at the Department of Electronic Engineering, Tsinghua University. I received my B.S. and Ph.D. degrees from the Department of Electronic Engineering, Tsinghua University, in 2016 and 2021, advised by Prof. Huazhong Yang and Prof. Yu Wang. I spent two years (from 2021.12 to 2023.12) as a post-doctoral researcher with Prof. Yu Wang and Prof. Pinyan Lu.

My past research interests mainly lie in Model Compression and Neural Architecture Search (NAS). Currently, I co-advise ~10 graduate students, and the major focus of my team is efficient AIGC, including language models and generative models for other modalities. I serve as a reviewer for top conferences including NeurIPS, ICLR, ICML, CVPR, ICCV, ECCV, AAAI, and so on. My resume (as of 05/2023) is available here.

Our group is continuously recruiting visiting students and engineers who are interested in efficient deep learning. I have guided quite a few undergraduate, master's, and PhD students through their first projects, and I must say I have learned a lot and gained considerable experience in helping different students learn, improve their abilities, and accomplish their goals. I’m sure we can do something interesting and maybe impactful together. Email me and Prof. Yu Wang if you’re interested!

Interests

  • Neural Architecture Search
  • Efficient Deep Learning

Education

  • PhD in EE, 2016-2021

    Tsinghua University

  • BE in EE, 2012-2016

    Tsinghua University


  • 2024/02/27: 1 paper, FlashEval, is accepted by CVPR'24. This work is on selecting a compact data subset to evaluate text-to-image Diffusion models. Congrats and thanks to the students and collaborators.
  • 2024/02/09: Our paper on long-context benchmark, LV-Eval, is public on arXiv. Check the code and HuggingFace page.
  • 2024/01/17: 2 papers are accepted by ICLR'24. One is Skeleton-of-Thoughts, which accelerates LLM generation by letting the LLM itself plan and generate segments in parallel, achieving ~2x speed-ups; the other is USF, which summarizes the sampling strategies for diffusion models and searches for the best sampling strategy. Congrats and thanks to the collaborators and students.
  • 2023/12/17: My students and collaborators presented two of our works in the Efficient Natural Language and Speech Processing workshop at NeurIPS'23: LLM-MQ: Mixed-precision Quantization for Efficient LLM Deployment, and Skeleton-of-Thoughts.
  • 2023/09/22: 1 paper is accepted by NeurIPS'23. Congrats and thanks to the collaborators.
  • 2023/08/27: Gave a tutorial talk on LLM quantization for an AWS competition. Here are the tutorial-only slides! And here is the video.
  • 2023/07/27: Our technical report on prompting techniques for efficient LLM generation (work still in progress) – “Skeleton-of-Thoughts” – is public! Check the website for an introduction and demos. The code is available here.
  • 2023/07/17: 1 paper is accepted by ICCV'23. This work is on Dynamic Inference for Efficient 3D Perception. Congrats to the students and collaborators. Check the website for more information.
  • 2023/04/27: Gave a 1.5h talk at Huawei on Model Compression for Efficient DL.
  • 2023/04/25: 1 paper is accepted by ICML'23. This work is on Searching Model Schedule for Efficient Diffusion – OMS-DPM. Congrats to the students and collaborators. Check the website for an introduction and demos.
  • 2022/12/05: 1 paper, GATES++, is accepted by TPAMI.
  • 2022/11/25: Gave a 60-min talk at Inceptio.ai on practices of applying NAS for efficient DL, including (1) “how to search efficiently”: sample-based and one-shot workflows, and (2) “how to consider hardware efficiency objectives in NAS”.
  • 2022/11/24: Gave a 75-min talk at Renmin University on NAS research.
  • 2022/11/19: 3 papers, DELE / MOSP / EIO, have been accepted by AAAI'23. The topics are efficient NAS, LLCV pruning, and efficient adversarial ensemble training, respectively. The NAS work, DELE, was selected for an oral presentation. Congrats to the students and collaborators.
  • 2022/11/09: Gave a 20-min talk (starting from 50:20) at AI-Time on TA-GATES. The talk is in Chinese.
  • 2022/09/15: 1 paper, TA-GATES, is accepted by NeurIPS'22 as Spotlight.