Junseok Ahn

About me

I am a third-year Integrated M.S./Ph.D. student at KAIST, advised by Professor Joon Son Chung. My research focuses on multimodal learning and generative modeling, particularly for building conversational agents that interact naturally through speech and facial expressions.

My current interests include dialogue-aware talking head generation, expressive speech synthesis, and multimodal interaction systems that integrate audio, text, and visual cues. My broader goal is to advance the capabilities of multimodal large language models (MLLMs) for real-time, human-like interaction.

Education

Integrated M.S./Ph.D. in Electrical Engineering, KAIST

Mar. 2023 ~ Present

Advisor: Joon Son Chung

B.S. in Electrical Engineering, KAIST

Mar. 2019 ~ Feb. 2023

Experience

Student Researcher, Samsung Electronics System LSI

Sep. 2022 ~ Feb. 2023

Publications

2025

TAVID: Text-Driven Audio-Visual Interactive Dialogue Generation
J.-H. Kim*, J. Ahn*, D. Kwak, J. S. Chung, and S. Watanabe
arXiv
Paper | Project Page

Deep Understanding of Sign Language for Sign to Subtitle Alignment
Y. Jang*, J. Choi*, J. Ahn, and J. S. Chung
Transactions on Multimedia
Paper

VoiceDiT: Dual-Condition Diffusion Transformer for Environment-Aware Speech Synthesis
J. Jung*, J. Ahn*, C. Jung, T. D. Nguyen, Y. Jang, and J. S. Chung
ICASSP 2025
Paper | Project Page | Code

2024

VoxSim: A perceptual voice similarity dataset
J. Ahn, Y. Kim, Y. Choi, D. Kwak, J.-H. Kim, S. Mun, and J. S. Chung
Interspeech 2024
Paper | Project Page | Code

Faces that Speak: Jointly Synthesising Talking Face and Speech from Text
Y. Jang*, J.-H. Kim*, J. Ahn, D. Kwak, H. Yang, Y. Ju, I. Kim, B. Kim, and J. S. Chung
CVPR 2024
Paper | Project Page

SlowFast Network for Continuous Sign Language Recognition
J. Ahn*, Y. Jang*, and J. S. Chung
ICASSP 2024
Paper | Code

Contact

junseok (at) mmai.io

Room 3103, N24 (LG Innovation Hall)