Adaptive & Reconfigurable Models
Overview
Can we build models that adapt without expensive retraining? We develop methods to combine, interpolate, and reconfigure model architectures dynamically, enabling flexible systems that adjust to new constraints, domains, or objectives on the fly.
This includes techniques for model merging, weight-space interpolation, inference-time ensembling, and tokenizer adaptation, allowing practitioners to compose and customize models for specific needs.
Key Questions
- How can we interpolate between models of different sizes without retraining?
- Can we ensemble models with incompatible tokenizers at inference time?
- How do we adapt pre-trained models to new domains with minimal overhead?
Methods & Tools
- Knowledge Distillation: Transferring capabilities between models of different sizes (sketch below)
- Weight Interpolation: Continuous traversal of model space for controllable behavior (sketch below)
- Inference-Time Ensembling: Combining diverse models without shared vocabularies (sketch below)
- Tokenizer Adaptation: Translating between tokenization schemes for domain transfer (sketch below)
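The sketches below illustrate the core idea behind each method in the list above. First, a minimal knowledge-distillation loss in PyTorch, in the spirit of classic Hinton-style distillation rather than the Boomerang Distillation procedure listed under publications; `teacher`, `student`, and `input_ids` in the usage comments are hypothetical names for pre-loaded models and a tokenized batch.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      temperature: float = 2.0) -> torch.Tensor:
    """KL divergence between temperature-softened teacher and student distributions."""
    log_p_student = F.log_softmax(student_logits / temperature, dim=-1)
    p_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    # Scale by T^2 so gradient magnitudes stay comparable across temperatures.
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * temperature ** 2

# Inside a training loop (hypothetical `teacher` frozen, `student` updated):
# with torch.no_grad():
#     teacher_logits = teacher(input_ids).logits
# loss = distillation_loss(student(input_ids).logits, teacher_logits)
# loss.backward()
```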
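Second, a sketch of weight-space interpolation between two checkpoints that share an architecture. This is plain linear blending of parameters, assuming PyTorch state dicts; the interpolation papers above study richer settings than this basic case. `model_a`, `model_b`, and `merged_model` in the usage comment are hypothetical modules with identical architectures.

```python
import torch

def interpolate_state_dicts(sd_a: dict, sd_b: dict, alpha: float = 0.5) -> dict:
    """Blend two state dicts: (1 - alpha) * A + alpha * B for floating-point tensors."""
    assert sd_a.keys() == sd_b.keys(), "models must share an architecture"
    merged = {}
    for name, tensor_a in sd_a.items():
        tensor_b = sd_b[name]
        if torch.is_floating_point(tensor_a):
            merged[name] = torch.lerp(tensor_a, tensor_b, alpha)
        else:
            # Non-float buffers (e.g. integer step counters) are copied from model A.
            merged[name] = tensor_a.clone()
    return merged

# merged_model.load_state_dict(
#     interpolate_state_dicts(model_a.state_dict(), model_b.state_dict(), alpha=0.3)
# )
```

Sweeping `alpha` from 0 to 1 traces a path between the two endpoint models, which is what makes the behavior controllable at deployment time.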
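Third, a toy sketch of the character-level intuition behind ensembling models with incompatible tokenizers: each model's next-token distribution is collapsed onto its first character, and the per-character distributions are averaged, so no shared vocabulary is needed. The token probabilities are invented toy values; the actual CharED decoding procedure is more involved than this.

```python
from collections import defaultdict

def next_char_distribution(token_probs: dict) -> dict:
    """Collapse a {token: prob} distribution onto the first character of each token."""
    char_probs = defaultdict(float)
    for token, p in token_probs.items():
        if token:
            char_probs[token[0]] += p
    total = sum(char_probs.values())
    return {c: p / total for c, p in char_probs.items()}

def ensemble_next_char(dists: list, weights: list = None) -> str:
    """Average per-model character distributions and return the most likely character."""
    weights = weights or [1.0 / len(dists)] * len(dists)
    combined = defaultdict(float)
    for w, d in zip(weights, dists):
        for c, p in next_char_distribution(d).items():
            combined[c] += w * p
    return max(combined, key=combined.get)

# Two hypothetical models with different vocabularies disagree at the token level
# but agree that the next character should be a space.
model_a = {" the": 0.6, " a": 0.3, "therefore": 0.1}
model_b = {" th": 0.5, " this": 0.3, " an": 0.2}
print(repr(ensemble_next_char([model_a, model_b])))  # ' '
```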
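Finally, a sketch of the optimal-transport idea behind Sinkhorn token translation: entropy-regularized alternating row and column normalization yields a soft alignment between two tokenizers' embedding tables, which can then be used to map tokens from one vocabulary to the other. `emb_src` and `emb_tgt` are assumed, hypothetical embedding matrices, and the sparsity constraints that give S2T2 its name are omitted here.

```python
import torch
import torch.nn.functional as F

def sinkhorn_alignment(emb_src: torch.Tensor, emb_tgt: torch.Tensor,
                       epsilon: float = 0.05, n_iters: int = 200) -> torch.Tensor:
    """Soft transport plan between source and target token embeddings."""
    # Cost = negative cosine similarity, so semantically closer tokens are cheaper to align.
    cost = -F.normalize(emb_src, dim=-1) @ F.normalize(emb_tgt, dim=-1).T
    K = torch.exp(-cost / epsilon)                 # Gibbs kernel
    n_src, n_tgt = K.shape
    a = torch.full((n_src,), 1.0 / n_src)          # uniform source marginal
    b = torch.full((n_tgt,), 1.0 / n_tgt)          # uniform target marginal
    v = torch.ones(n_tgt)
    for _ in range(n_iters):                       # Sinkhorn iterations
        u = a / (K @ v)
        v = b / (K.T @ u)
    return u[:, None] * K * v[None, :]             # transport plan

# plan = sinkhorn_alignment(emb_src, emb_tgt)
# plan.argmax(dim=1) gives a hard source-to-target token mapping; keeping the soft
# plan lets embeddings or logits be translated between the two vocabularies.
```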
Selected Publications
- Boomerang Distillation Enables Zero-Shot Model Size Interpolation (arXiv 2025)
- Continuous Language Model Interpolation for Dynamic and Controllable Text Generation (TMLR 2025)
- CharED: Character-wise Ensemble Decoding for Large Language Models (arXiv 2024)
- S2T2: Sparse Sinkhorn Token Translation for Domain Adaptation (arXiv 2024)