Bac Nguyen

I am a Research Scientist at Mirelo, working on the next generation of multimodal video-to-audio models. Before that, I worked at Sony AI, specializing in Generative AI and Foundation Models. My work spans multiple domains, including computer vision, speech, and natural language processing, with a current focus on improving the efficiency and scalability of deep generative models. I am particularly interested in reducing training costs, accelerating inference, and optimizing large-scale foundation models for real-world impact.

Previously, I obtained my PhD from Ghent University in 2019, co-advised by Carlos Morell and Bernard De Baets. My PhD research focused on a supervised learning problem called metric learning: given some supervision information, the goal is to learn from examples a distance function that measures how similar or related two objects are. During my PhD, I developed various large-scale optimization techniques for distance metric learning problems under different types of supervision.

selected publications

ICLR

Improved Object-Centric Diffusion Learning with Registers and Contrastive Alignment

Bac Nguyen, Yuhta Takida, Naoki Murata, Chieh-Hsin Lai, Toshimitsu Uesaka, Stefano Ermon, and Yuki Mitsufuji

In The International Conference on Learning Representations, 2026

@inproceedings{nguyen2026coda,
  title = {Improved Object-Centric Diffusion Learning with Registers and Contrastive Alignment},
  author = {Nguyen, Bac and Takida, Yuhta and Murata, Naoki and Lai, Chieh-Hsin and Uesaka, Toshimitsu and Ermon, Stefano and Mitsufuji, Yuki},
  year = {2026},
  booktitle = {The International Conference on Learning Representations}
}

IJCNN

Improving vector-quantized image modeling with latent consistency-matching diffusion

Bac Nguyen, Chieh-Hsin Lai, Yuhta Takida, Naoki Murata, Toshimitsu Uesaka, Stefano Ermon, and Yuki Mitsufuji

In International Joint Conference on Neural Networks, 2025

Best Industry Paper Award

@inproceedings{nguyen2025improving,
  title = {Improving vector-quantized image modeling with latent consistency-matching diffusion},
  author = {Nguyen, Bac and Lai, Chieh-Hsin and Takida, Yuhta and Murata, Naoki and Uesaka, Toshimitsu and Ermon, Stefano and Mitsufuji, Yuki},
  booktitle = {International Joint Conference on Neural Networks},
  pages = {1--8},
  year = {2025},
  organization = {IEEE},
}

ECCV

SAFT: Towards out-of-distribution generalization in fine-tuning

Bac Nguyen, Stefan Uhlich, Fabien Cardinaux, Lukas Mauch, Marzieh Edraki, and Aaron Courville

In European Conference on Computer Vision, 2024

@inproceedings{nguyen2024saft,
  title = {SAFT: Towards out-of-distribution generalization in fine-tuning},
  author = {Nguyen, Bac and Uhlich, Stefan and Cardinaux, Fabien and Mauch, Lukas and Edraki, Marzieh and Courville, Aaron},
  booktitle = {European Conference on Computer Vision},
  pages = {138--154},
  year = {2024},
  organization = {Springer},
}

ICASSP

AutoTTS: End-to-end text-to-speech synthesis through differentiable duration modeling

Bac Nguyen, Fabien Cardinaux, and Stefan Uhlich

In International Conference on Acoustics, Speech and Signal Processing, 2023

@inproceedings{nguyen2023autotts,
  title = {AutoTTS: End-to-end text-to-speech synthesis through differentiable duration modeling},
  author = {Nguyen, Bac and Cardinaux, Fabien and Uhlich, Stefan},
  booktitle = {International Conference on Acoustics, Speech and Signal Processing},
  pages = {1--5},
  year = {2023},
  organization = {IEEE},
}

ICASSP

NVC-Net: End-to-end adversarial voice conversion

Bac Nguyen and Fabien Cardinaux

In International Conference on Acoustics, Speech and Signal Processing, 2022

@inproceedings{nguyen2022nvc,
  title = {NVC-Net: End-to-end adversarial voice conversion},
  author = {Nguyen, Bac and Cardinaux, Fabien},
  booktitle = {International Conference on Acoustics, Speech and Signal Processing},
  pages = {7012--7016},
  year = {2022},
  organization = {IEEE},
}