Projects

Ongoing Projects

Five active projects extend the XPLAIN program toward stage-aligned, individually adaptive AI scaffolding for real-time conversation. Together they address how proactive AI tools should be calibrated to user-level cognitive dimensions, language background, conversational alignment, and the social perception of AI-assisted speech.

Personalization of Proactive AI Scaffolds

Wizard-of-Oz study (N = 40); manuscript in preparation.

This project shifts the question from which AI scaffolds to provide (XPLAIN’s lexical clarifications, idea/content suggestions, topic summaries) to for whom and when they should be triggered. Building on the baseline XPLAIN study, I designed and ran a Wizard-of-Oz study (N = 40) in which a confederate simulated context-adaptive interventions calibrated to user-level cognitive dimensions (language proficiency, working-memory profile, AI familiarity, and communication style) across conversational contexts that varied in domain familiarity (high vs. low) and turn type (comprehension- vs. production-heavy).

User Heterogeneity via Latent Class Analysis

Accepted at CogSci 2026 (Rio de Janeiro, Brazil); full paper in preparation.

Using the qualitative interview data from the personalization study as input, this companion analysis asks a theoretical question: does user heterogeneity in AI-mediated communication (AI-MC) reduce to a small number of stable types, or does it operate domain by domain?

L1- vs. L2-cued Translation Scaffolds

Wizard-of-Oz study (N = 27); manuscript in preparation.

For non-native (L2) speakers, real-time turn-taking imposes simultaneous demands: comprehend incoming speech, predict appropriate responses, access target-language lexical forms, and articulate under turn-taking constraints. AI mediation adds the further step of interpreting and integrating AI-suggested content. This project asks whether AI scaffolds should be delivered in the user’s L1 or L2 under domain-specific lexical demand.

Interactive Alignment & Semantic Integration

Manuscript in preparation.

Subjective-satisfaction metrics dominate the evaluation of conversational AI, yet they fail to capture whether AI mediation preserves shared common ground or merely papers over local breakdowns. This project introduces a dual-lens framework for evaluating proactive AI scaffolding in L2 turn-taking.

Listener Perception of AI-Assisted Speech

Pre-registered on OSF; in data collection (Spring 2026).

The same multimodal traces that allow AI tools to detect when a speaker needs help (disfluency, timing, prosody, visible ease/strain) may also leak to listeners, who may attribute those traces to the speaker rather than to the AI. This project tests the resulting “double bind” for L2 speakers using AI support.


Past Projects

Sustaining Public Goods via Prosocial Behavior

Citizens & Technology Lab (CAT Lab), Cornell University. PI: J. Nathan Matias. Jan. 2025 – Present (manuscript under review at PNAS).

Lead statistician on parallel pre-registered field experiments across four Wikipedia language communities (Arabic, German, Persian, Polish; N = 15,558 vetted editors), quantifying the causal effect of peer-to-peer gratitude on upstream reciprocity and sustained volunteer participation. Built end-to-end causal-inference pipelines in R (power analyses; intent-to-treat and complier-average causal-effect estimation; multilevel models with cluster-robust SEs; sensitivity analyses; cross-community meta-analytic pooling). Receiving thanks increased contribution time by 11%, two-week retention by 2.2 pp, and outgoing thanks by 61%, 99.8% of which were sent upstream. See the PNAS manuscript.

Cross-Linguistic Statistical Learning of Language

Cognitive Science of Language Lab, Cornell University. PI: Morten H. Christiansen. Aug. 2021 – May 2023.

Investigated how high-frequency multi-word chunks and word-marker pairs are statistically learned and how they modulate reading performance and language processing. Designed self-paced reading and statistical-learning experiments and presented findings at the International Conference on Interdisciplinary Advances in Statistical Learning (ISLA 2022, 2024).

Multilingual FrameNet Project

International Computer Science Institute (ICSI), UC Berkeley. PIs: Terry Regier, Collin Baker. Aug. 2019 – Dec. 2019.

Studied the structure of English lexical databases and trained models on nine types of frame-to-frame semantic relations. Evaluated poorly predicted relations and constructed new relations from lexical units to improve model coverage; refined the semi-automatic labeling of semantic roles and universal semantic frames to improve multilingual alignment.

Rule Generalization at Different Boundary Levels

Experimental Phonology Group, UC Berkeley. Advisor: Jesse Zymet. Aug. 2019 – May 2020.

Empirical and modeling work on how phonological rules generalize across morphological and prosodic boundaries, contributing to a broader theoretical project on the locality of phonological computation.

Acoustic Analyses of Iquito

Indigenous Language Revitalization Group, UC Berkeley. Advisor: Christine Beier. Aug. 2018 – May 2021.

Conducted acoustic analyses (Praat, ELAN) and social-background research on Iquito, an endangered Zaparoan language of Peru. Presented an Optimality-Theoretic analysis of verbal tone interactions at the 28th Manchester Phonology Meeting (2021) and a dispersion-theoretic model of tone systems at the 56th Linguistics Colloquium (2020).

Locomotion and Early Language Acquisition

Infant Study Center, UC Berkeley. PI: Joseph Campos. Jan. 2018 – Dec. 2019.

Studied the relationship between infant locomotion (walking onset), social-communicative abilities, and vocabulary growth in Chinese-American infants. Co-presented two posters at the 22nd Biennial International Congress of Infant Studies (2020).

Cost of Phonetic Cues in Mandarin–English Code-Switching

Berkeley PhonLab, UC Berkeley. PIs: Susan Lin, Alice Shen. Aug. 2017 – Dec. 2017.

Acoustic and behavioral investigation of the phonetic cost of code-switching between Mandarin and English, contributing to broader work on bilingual phonetic accommodation.