This week, we're having a talk from our very own Fuxin Li for the Friday seminar. He is an expert in point cloud deep networks, human understanding of deep learning, video object segmentation, multi-target tracking, and uncertainty estimation in deep learning: many relevant topics for us as roboticists.

As a reminder, everyone is expected to come to the seminar. We’ve got a strict no-device policy. We also have, as always, coffee and coffee cake.

For the winter term, we're again in Rogers Hall 230. Seminar is at 10am Fridays, with a students-only Q&A session following each seminar.

________________________________________________

From Sparse to Dense, and Back to Sparse Again?

Fuxin Li, Associate Professor in EECS, Oregon State University

Abstract: Computer vision architectures used to be built on a sparse sample of points in the 80s and 90s. In the 2000s, dense models started to become popular for visual recognition, as heuristically defined sparse models do not cover all the important parts of an image. However, with deep learning and end-to-end training approaches, this does not have to continue, and sparse models may still have significant advantages in saving unnecessary computation as well as being more flexible. In this talk, I will discuss the deep point cloud convolutional backbones that we have developed in the past few years, including results on point cloud segmentation tasks, as well as recent applications to interaction modeling among objects, point cloud completion, and world models for robot manipulation tasks. Point cloud approaches can also work well as 2D image recognition backbones. I will introduce our work AutoFocusFormer, which uses point cloud backbones and decoders for 2D image recognition, with a novel adaptive downsampling module that enables end-to-end learning of adaptive downsampling for dense prediction tasks such as segmentation. This is very helpful for detecting tiny objects far away in the scene, which would have been decimated by conventional grid downsampling approaches.

Bio: Fuxin Li is currently an associate professor in the School of Electrical Engineering and Computer Science at Oregon State University. He has held research positions at Apple Inc., University of Bonn and Georgia Institute of Technology. He obtained a Ph.D. degree from the Institute of Automation, Chinese Academy of Sciences in 2009. He has won an NSF CAREER award, an Amazon Research Award, and a CVPR 2024 Best Student Paper runner-up award; (co-)won the PASCAL VOC semantic segmentation challenges from 2009 to 2012; and led a team to a 4th place finish in the DAVIS Video Segmentation challenge 2017. He is a program chair of CVPR 2025. He has published more than 90 papers in computer vision, machine learning, and applications of machine learning and computer vision. His main research interests are point cloud deep networks, human understanding of deep learning, video object segmentation, multi-target tracking and uncertainty estimation in deep learning.