에이아이트릭스 AITRICS

Stable-TTS: Stable Speaker-Adaptive Text-to-Speech Synthesis via Prosody Prompting

ICASSP 2025. Wooseok Han, Minki Kang, Changhun Kim, Eunho Yang. Speaker-adaptive Text-to-Speech (TTS) synthesis has attracted considerable attention due to its broad range of applications, such as p...


Face-StyleSpeech: Enhancing Zero-shot Speech Synthesis from Face Images with Improved Face-to-Speech Mapping

ICASSP 2025. Minki Kang, Wooseok Han, Eunho Yang. Generating speech from a face image is crucial for developing virtual humans capable of interacting using their uniq...


COMPACT AND DE-BIASED NEGATIVE INSTANCE EMBEDDING FOR MULTI-INSTANCE LEARNING ON WHOLE-SLIDE IMAGE CLASSIFICATION

ICASSP 2024. Joohyung Lee, Heejeong Nam, Kwanhyung Lee, Sangchul Hahn. Whole-slide image (WSI) classification is a challenging task because 1) patches ...


WeavSpeech: Data Augmentation Strategy For Automatic Speech Recognition Via Semantic-Aware Weaving

ICASSP 2023. Kyusung Seo, Joonhyung Park, Jaeyun Song, Eunho Yang. A cut-and-paste type of data augmentation strategy has attracted considerable attention in the vision...


Grad-StyleSpeech: Any-Speaker Adaptive Text-to-Speech Synthesis with Diffusion Models

ICASSP 2023. Minki Kang, Dongchan Min, Sung Ju Hwang. There has been significant progress in Text-To-Speech (TTS) synthesis technology in recent years, thanks to the advancement in ne...


Mutually-Constrained Monotonic Multihead Attention for Online ASR

ICASSP 2021. Jaeyun Song, Hajin Shim, Eunho Yang. Despite the feature of real-time decoding, Monotonic Multihead Attention (MMA) shows comparable performance to the state-of-the-art offline methods ...

PUBLICATIONS
