Research

Our research develops a data-centric foundation for machine learning—treating datasets not as passive artifacts but as structured, dynamic objects that can be characterized, transformed, and optimized. We draw on tools from optimal transport, information theory, and geometric deep learning to formalize how data properties affect learning outcomes.

Dataset Characterization & Geometry

Using geometry and optimal transport to understand what makes data valuable for learning

Dataset Transformations & Training Dynamics

How data structure affects learning dynamics and model behavior during training

Dataset Optimization & Synthesis

Principled methods to reduce, enhance, and synthesize training data

Adaptive & Reconfigurable Models

Dynamically combining and adapting models based on constraints and objectives