XPLAIN: A Proactive Scaffold Across Speech Processing Stages—Supporting Non-Native Speakers in Real-Time AI-Mediated Turn-Taking
Under review at ACM Conference on Conversational User Interfaces (CUI 2026), 2026
Status: Under review at ACM CUI 2026.
This paper introduces XPLAIN, a proactive AI scaffolding system that supports non-native (L2) speakers in real-time AI-mediated turn-taking by intervening at three distinct stages of speech processing: semantic prediction, lexical retrieval, and content integration. Unlike reactive translation or post-hoc summarization tools, XPLAIN anticipates emerging processing difficulty during conversation and provides scaffolds (lexical clarifications, idea and content suggestions, and topic summaries) timed to the listener’s processing stage.
We report on a Wizard-of-Oz study (N = 38) in which a confederate simulated XPLAIN’s anticipatory cues in dyadic virtual meetings. Results indicate 30–80% perceived gains in communicative efficiency, increased participation, and improved inclusivity for L2 speakers, with these effects moderated by individual differences in language proficiency, personality, and prior AI experience. The paper contributes a stage-aligned design framework for proactive conversational AI and empirical evidence that the type and timing of scaffolding, not just its content, shape user experience in cross-linguistic interaction.
Recommended citation: He, W. P. & Fussell, S. R. (2026). "XPLAIN: A Proactive Scaffold Across Speech Processing Stages—Supporting Non-Native Speakers in Real-Time AI-Mediated Turn-Taking." Proceedings of the ACM Conference on Conversational User Interfaces (CUI 2026). (Under review)
