
Cornell Researcher Trains AI on Own Speech, Discovers Identity Risks

By Artūras Malašauskas, May 04, 2026
A Cornell Tech doctoral student's seven-month experiment training AI on his own AAC communication reveals how personalized speech models can reshape identity and create surveillance-like effects.

What happens when you train an artificial intelligence to speak like you? The answer, according to new research from Cornell Tech, is complicated. Doctoral student Tobias Weinberg, who uses augmentative and alternative communication (AAC) due to a neurological condition, conducted a seven-month experiment logging his real-world speech data. He then trained a language model on that data and lived with the resulting personalized system for three months.

The study, titled "I, Robot?", was presented in April at the 2026 CHI Conference on Human Factors in Computing Systems. It emerged from Cornell Tech's Matter of Tech Lab and was co-authored by Weinberg; Thijs Roumen, assistant professor at Cornell Tech; Ricardo Gonzalez Penuela, a doctoral student in information science; and Stephanie Valencia, an assistant professor at the University of Maryland.

Weinberg's motivation was straightforward. "Since I'm typing it anyway, I might as well see what I can do with it," he said. Rather than relying on hypothetical users or lab-based simulations, he used his own speech to ask: "What does it mean to train a machine to be you?" The question became the foundation for exploring promises and risks of ultra-personalized AI in AAC.

One of the most striking findings emerged before any AI entered the picture. The act of logging speech changed Weinberg's behavior. "I didn't expect the discomfort," he said. "Just knowing I was logging changed how I spoke. I was no longer just speaking in the moment. I was curating a future dataset, and that changed my sense of freedom in conversation."

This is the surveillance effect in action. The first things to disappear were informal, emotionally charged expressions—dark jokes, gossip, venting—which were filtered out to prevent inappropriate text from resurfacing in professional settings. As a result, the AI learned what Weinberg described as a "cleaned-up" version of himself. (It's like watching yourself on a security camera and suddenly forgetting how to swear.)

The physical reality of using AAC devices adds another layer of friction. Weinberg has been unable to speak since age 15. His first text-to-voice device spoke in a monotone and offered Mexican or Spanish accents, but not his native Argentinian one. "The monotone voices, the timing of interjections and conveying my personality through this new way of communication was definitely frustrating," he wrote in earlier research.

While the personalized model performed well in structured settings—helping elaborate ideas smoothly and efficiently—Weinberg found that it struggled in fast-moving social situations. "At bars, during quick topic shifts, or in mixed conversations jumping between work and personal life, the model often pushed toward familiar patterns instead of what I actually wanted to say in that moment," he said.

Remarks that work in one setting can be inappropriate in another. Once speech data is collected and aggregated, it often loses the social cues that gave it meaning. Roumen said this is a fundamental design challenge for these AI systems. "Our findings suggest that ultra-personal AAC requires a high granularity of context," he said. "Who you speak with, what the intention of the conversation is, and in what environment, all contribute to the type of suggestions that may or may not make sense."
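As a rough illustration of what "high granularity of context" could mean in practice, the sketch below tags each logged utterance with the three dimensions Roumen names (who, intent, environment) and only surfaces past phrases whose full context matches the current one. This is not the study's implementation: the `Utterance` type, the `candidates_for` function, the field names, and the sample utterances are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Utterance:
    """One logged phrase plus the context it was spoken in (hypothetical schema)."""
    text: str
    audience: str  # who the speaker was talking to, e.g. "friends"
    intent: str    # purpose of the conversation, e.g. "venting"
    setting: str   # environment, e.g. "bar"


def candidates_for(log, audience, intent, setting):
    """Return logged texts whose audience, intent, and setting all match."""
    wanted = (audience, intent, setting)
    return [u.text for u in log if (u.audience, u.intent, u.setting) == wanted]


# Made-up sample data, for illustration only.
log = [
    Utterance("Let's move the deadline to Friday.", "colleagues", "planning", "office"),
    Utterance("That meeting was a disaster.", "friends", "venting", "bar"),
]

office_suggestions = candidates_for(log, "colleagues", "planning", "office")
```

An exact match on all three fields is the strictest possible rule: it keeps a remark made while venting at a bar from resurfacing in a planning meeting, at the cost of offering fewer suggestions. A real system would likely need softer matching, and it would still face the privacy cost of recording this context in the first place.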

The implications extend beyond AAC. "Everything we found—the self-censorship, the privacy violations, the identity reshaping—happened with a system I built for myself, that I fully controlled and could turn off at any moment," Weinberg said. "Most users won't have that."

Right now, the industry is moving quickly toward deploying these systems at scale without having settled the basics: how to capture contextual information without erasing privacy; how the system knows when, what, and in front of whom to surface a suggestion; and how to keep users in control of the technology that mediates their speech.

According to the Cornell Chronicle report, Weinberg's research was supported by a Google Research Scholar Award. The work builds on his earlier papers, including "Why so serious?" which won best paper honorable mention and jury best demo awards at CHI, and "One does not simply 'Mm-hmm,'" presented at the ASSETS Conference on Computers and Accessibility in October 2025.

Through a standing partnership between Cornell Tech and YAI, a nonprofit that supports more than 20,000 people with intellectual and developmental disabilities in New York, New Jersey and California, Weinberg spent a year working with AAC users who live in group homes in Tarrytown, New York. This helped him better understand their needs and behaviors and improve his prototypes.

The research highlights a tension between agency and efficiency. While AI auto-complete lets users make humorous comments faster, it risks diminishing their sense of agency by making jokes for them rather than with them. In time-pressured scenarios, AAC users were willing to give up some agency to deliver a comment faster. This challenges earlier research holding that AAC users care most about maximum agency, which is true in general but not always.

Roumen said that while the technology itself is advancing rapidly, the social and ethical groundwork has not caught up. "Our work highlights both the potential and the risks of personalized AI," he said. "Before these systems are deployed at scale, we need to think much harder about when recording should stop, how context is preserved, and how users remain in control of what becomes their voice."

Weinberg's question remains unanswered: "Can you build a truly personal AAC without also building a surveillance system for your own speech?" The answer depends on whether developers prioritize speed over control, and whether users will accept a cleaned-up version of themselves as the price of convenience. It also depends on whether companies heed these warnings before shipping products.

Artūras Malašauskas is an AI Systems Integrator with 20+ years of production-grade web engineering experience. He has designed, shipped, and scaled enterprise Python/PHP systems for logistics, SaaS, and public-sector clients. For the past year, he has focused exclusively on AI integrations: deploying open-source LLMs, building generative media pipelines (image, audio, video), and engineering multi-agent workflows for real production environments. His standard: reproducibility, security, cost-efficient inference, and no vaporware. He documents and evaluates emerging AI tooling, separating verified capabilities from marketing noise. Technical editor at: muza-ai.eu, ai-verslas.lt, ai-naujinos.lt