Recent advances in conversational AI have demonstrated impressive capabilities in single-turn responses, yet multi-turn dialogues remain challenging for even the most sophisticated language models. Current dialogue datasets are limited in emotional range, domain diversity, and turn depth, and they are predominantly text-only, which hinders progress toward more human-like conversational systems across modalities. To address these limitations, we present DeepDialogue, a large-scale multimodal dataset containing 40,150 high-quality multi-turn dialogues spanning 41 domains and incorporating 20 distinct emotions with coherent emotional progressions. Our approach pairs 9 different language models (4B-72B parameters) to generate 65,600 initial conversations, which we then evaluate through a combination of human annotation and LLM-based quality filtering. The resulting dataset reveals fundamental insights: smaller models fail to maintain coherence beyond 6 dialogue turns; concrete domains (e.g., "cars," "travel") yield more meaningful conversations than abstract ones (e.g., "philosophy"); and cross-model interactions produce more coherent dialogues than same-model conversations. A key contribution of DeepDialogue is its speech component: we synthesize emotion-consistent voices for all 40,150 dialogues, creating the first large-scale open-source multimodal dialogue dataset that faithfully preserves emotional context across multi-turn conversations.
Conversational Analysis
Investigate the dynamics of human-like conversations generated by LLMs
Emotional Model Pretraining
Leverage large-scale data for model pretraining on synthetic emotional speech
Conversational AI
Develop more natural-sounding dialogue systems with appropriate emotional inflection
Complete BibTeX citation will be updated soon.
@misc{koudounas2025deepdialoguemultiturnemotionallyrichspoken,
title={DeepDialogue: A Multi-Turn Emotionally-Rich Spoken Dialogue Dataset},
author={Alkis Koudounas and Moreno La Quatra and Elena Baralis},
year={2025},
eprint={2505.19978},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2505.19978},
}