
Dear all,

Our next AI seminar is scheduled for March 8th, 2-3 PM. It will be followed by a 30-minute Q&A session with the graduate students.

Location: KEC 1001
Zoom link: https://oregonstate.zoom.us/j/96491555190?pwd=azJHSXZ0TFQwTFFJdkZCWFhnTW04UT09

From Large to Small Datasets: Size Generalization for Clustering Algorithm Selection
Ellen Vitercik
Assistant Professor, Management Science & Engineering and Computer Science
Stanford University

Abstract: In clustering algorithm selection, we are given a massive dataset and must efficiently select which clustering algorithm to use. We study this problem in a semi-supervised setting, with an unknown ground-truth clustering that we can only access through expensive oracle queries. Ideally, the clustering algorithm's output will be structurally close to the ground truth. We approach this problem by introducing a notion of size generalization for clustering algorithm accuracy. We identify conditions under which we can (1) subsample the massive clustering instance, (2) evaluate a set of candidate algorithms on the smaller instance, and (3) guarantee that the algorithm with the best accuracy on the small instance will have the best accuracy on the original big instance. We verify these findings both theoretically and empirically.

Speaker Bio: Ellen Vitercik is an Assistant Professor at Stanford University with a joint appointment between the Management Science & Engineering department and the Computer Science department. Her research revolves around machine learning theory, discrete optimization, and the interface between economics and computation. Before joining Stanford, she spent a year as a Miller Fellow at UC Berkeley after receiving a PhD in Computer Science from Carnegie Mellon University. Her thesis won the SIGecom Doctoral Dissertation Award and the CMU School of Computer Science Distinguished Dissertation Award.

Please watch this space for future AI Seminars: https://engineering.oregonstate.edu/EECS/research/AI

Rajesh Mangannavar, Graduate Student
Oregon State University

----
AI Seminar Important Reminders:
-> For graduate students in the AI program, attendance is strongly encouraged