Chengkun (Charlie) Li

MSc student in Robotics

Swiss Federal Institute of Technology Lausanne (EPFL)

Brief Bio

I hold an MSc in Robotics from the Swiss Federal Institute of Technology Lausanne (EPFL). I completed my thesis at ETH Zurich under the supervision of Prof. Marco Hutter (ETH Zurich) and Prof. Caglar Gulcehre (EPFL). My interests include robotic perception, reasoning, and actuation, with a focus on 3D vision, robot control/learning, and multimodal learning.

I have had the pleasure of working in the labs of Prof. Huijing Zhao at Peking University on 3D perception, Prof. Min Xu at Carnegie Mellon University on Cryo-ET generative models, and Prof. Mathieu Salzmann at EPFL on neural network quantization for object detection and pose estimation.

I was a research intern at ByteDance AI Lab working on multimodal representation learning, and a student researcher at Google DeepMind working on extending a Vision-Language Model with an ink modality.

I enjoy board games 🎲, soccer ⚽, tennis 🎾, and music 🎶. Feel free to reach out to me if you want to join in on a hike or play some board games.


Download my CV.

Interests

  • Efficient and Continuous Learning
  • Perception and Reasoning
  • Multimodal Learning

Education

  • MSc in Robotics, 2021 - 2024

    Swiss Federal Institute of Technology Lausanne

  • Robotics Summer School, July 2022

    Swiss Federal Institute of Technology Zurich

  • Summer Session, 2018

    University of California, Berkeley

Recent News

  • [Nov 2024] 🎉 Excited to share that our paper MQAT (Modular Quantization-Aware Training for 6D Object Pose Estimation) has been accepted to Transactions on Machine Learning Research [link].

  • [Oct 2024] 🎉 Excited to share that our project, InkSight, has been featured across multiple platforms: Google Research Blog, LinkedIn, X post, Hugging Face (AK’s post), and Hacker News.

  • [Sept 2024] 🎉 Successfully defended my thesis with a grade of 6.0/6.0 at both ETH Zurich and EPFL. Grateful to my supervisors and committee members for their invaluable feedback and support!

  • [April 2024] I began my master’s thesis at the Robotic Systems Lab and CLAIRE Lab.

  • [Feb 2024] 🌟 Completed my student researcher internship at Google Research. It was an incredible experience, filled with joy from learning and collaboration with a fantastic team.

Experience

Download full CV

Thesis Student

Robotic Systems Lab, ETH Zürich

Apr 2024 – Sep 2024

Thesis Title: Continuous Skill Learning For ANYmal Robot

Supervisors: Chenhao Li, Nikita Rudin, Skander Moalla*, Marco Hutter, Caglar Gulcehre* (* hosting supervisors from CLAIRE, EPFL)

Student Researcher

Google DeepMind, Zürich

Aug 2023 – Feb 2024
Extending a Vision-Language Model with an ink modality (formerly Google Research)

Robotics Summer School

Swiss Federal Institute of Technology Zürich

Jul 2022
Selected as a participant in the ETH Robotics Summer School; completed the “Autonomous Navigation and Rescue in the Wild” challenge. [blog post]

MSc in Robotics

Swiss Federal Institute of Technology Lausanne

Sep 2021 – Oct 2024
Enrolled in the MSc in Robotics program.

Research Intern

ByteDance Inc.

Mar 2021 – Jul 2021 Beijing, China
Research intern in the AI Lab Visual Computing group, working on multimodal representation learning.

Research Intern

Carnegie Mellon University

Aug 2020 – Mar 2021 Pittsburgh, PA
Research Intern at XU LAB | Advisor: Min Xu

Research Intern

Peking University, Beijing

Aug 2019 – Jan 2021 Beijing
Research Intern at POSS Lab | Advisor: Huijing Zhao

Summer Session

University of California, Berkeley

Jul 2018 – Aug 2018 Berkeley, CA
Enrolled in Summer Session classes on biology (focusing on ECG and EMG) and multimedia studies.

Recent Publications

Modular Quantization-Aware Training for 6D Object Pose Estimation

Edge applications, such as collaborative robotics and spacecraft rendezvous, demand efficient 6D object pose estimation on resource-constrained embedded platforms. Existing 6D object pose estimation networks are often too large for such deployments, necessitating compression while maintaining reliable performance. To address this challenge, we introduce Modular Quantization-Aware Training (MQAT), an adaptive and mixed-precision quantization-aware training strategy that exploits the modular structure of modern 6D object pose estimation architectures. MQAT guides a systematic gradated modular quantization sequence and determines module-specific bit precisions, leading to quantized models that outperform those produced by state-of-the-art uniform and mixed-precision quantization techniques. Our experiments showcase the generality of MQAT across datasets, architectures, and quantization algorithms. Additionally, we observe that MQAT quantized models can achieve an accuracy boost (>7% ADI-0.1d) over the baseline full-precision network while reducing model size by a factor of 4x or more.

InkSight: Offline-to-Online Handwriting Conversion by Learning to Read and Write

Our work aims to bridge the gap between images of handwriting and digital ink with a Vision Language Model (PaLI). To our knowledge, this is the first work that effectively does so with arbitrary photos with diverse visual characteristics and backgrounds. Furthermore, it generalizes beyond its training domain and can work on simple sketches. Human evaluation reveals that 87% of the samples produced by our model on the challenging HierText dataset are considered valid tracings of the input image, and 67% look like pen trajectories traced by a human.

Model Generalization: A Sharpness Aware Optimization Perspective

This project investigates the effects of Sharpness-Aware Minimization (SAM) and Adaptive Sharpness-Aware Minimization (ASAM) on model generalization. Our experiments demonstrate that sharpness-aware optimization techniques notably enhance generalization. In particular, ASAM shows promise in improving performance on un-normalized data.

Projects

Some of my projects, enjoy!

MQAT

Modular Quantization-Aware Training for 6D Object Pose Estimation

InkSight: Offline-to-Online Handwriting Conversion by Learning to Read and Write

We present a model to convert photos of handwriting into a digital format that reproduces component pen strokes, without the need for …

Autonomous Aerial Robotics Navigation and Control

Aerial Robotics project @ EPFL. The goal is to design and implement a control system for a quadrotor that is able to autonomously …

Pocket Food (Third Place Winner of LauzHack)

The digital platform, intended to build upon Pocket Campus, addresses food sustainability in educational institutions. The …

One Hundred Years of Cinema

(A data story project for EPFL ADA course) The movie industry has long been criticized for its lack of diversity both on and off …

ETHZ Robotics Summer School 2022

Autonomous Navigation and Detection in the Wild

EPFL DLAV Tandem Race

“This tandem racing project, part of the DLAV course at EPFL, involved developing a detection and tracking system, which we used to …

Noise2Noise with Deep Learning Framework Implementation

I implemented a PyTorch Deep Learning framework (including the backprop implementations of modules without using autograd).

Modular Lamprey Robot Control

Robotics Practical at Biorobotics Laboratory

Model Predictive Control of Mini Rocket

A course project for ME-425 at EPFL

Quadrupedal Locomotion via Central Pattern Generator and Reinforcement Learning

A course project for MICRO-507 at EPFL

Soccer Playing with Mobile Robot

A course project for MICRO-452 at EPFL

UDepFusion

An Object-Aware Dense Indoor Semantic Mapping Framework

Third-order system simulation

A PID-controlled system simulation program

BIT LaTeX template

Templates for Proposal, Projects, Report at BIT

SemanticPOSS

A Point Cloud Dataset with Large Quantity of Dynamic Instances

3D Reconstruction of Maize Plants

Project on 3D reconstruction of maize plants from RGB images.

Effects of High Intensity Interval Training

A course project @ MCB 32L, UC Berkeley

Star Wars Game

A Star Wars-themed game implemented with the SFML library

Contact

Feel free to reach out to me.