The Speaking Practice Gap: Why You Understand More Than You Say

Most learners understand far more than they can say. Here's the practice gap behind it — why input alone stalls output — and how to close it for Korean.

The Sudamate Team, 9 min read

You understand far more than you can say because recognition is easy and recall is hard. That gap — between the Korean you can follow and the Korean you can produce out loud — is the practice gap, and it's the reason input-heavy study stalls right when you most want to speak. A learner's passive vocabulary is commonly estimated at several times larger than their active vocabulary — a difference often cited as 3-5x — so months of comprehensible input can leave you nodding along to a podcast and going blank the moment it's your turn to answer.

This isn't a sign you studied wrong. It's a sign you studied one half of the job. Comprehension and production are different skills, and speaking practice is the half most tools quietly skip.

Full disclosure: we make Sudamate, a Korean speaking app, so we think about this gap all day. That makes us biased — but it also means we've read the research on why understanding doesn't turn into speaking by itself, and what actually closes the distance. Here's the honest version.

Why can I understand a language far better than I can speak it?

Because understanding and producing are two different muscles, and you've only been training one. Recognizing a word when you hear it is widely held to be far easier than dragging it out of memory and saying it on cue. So input-heavy study builds a receptive vocabulary that outpaces your productive one, and the gap widens as you advance. One commonly cited finding had learners score around 68.7% on a receptive test versus 48.2% on a productive test for the same words.

That roughly 20-point asymmetry is the practice gap in one number. You can understand a sentence you could never have built yourself, because every Duolingo streak, every drama episode, every hour of input pours into the passive store and almost none into the active one. You were never the one talking.

Input builds comprehension. Output is a separate skill you have to train.

Input is necessary but not sufficient. You can't speak a language you can't understand, so comprehensible input is the foundation — but it doesn't automatically become speech. That's the whole argument between two of the most-cited ideas in language acquisition.

Stephen Krashen's input hypothesis (1977) says we acquire language by understanding input slightly above our current level — his "i+1." Understood input, in that view, is the active ingredient. Then Merrill Swain looked at Canadian French-immersion students who'd had years of rich, comprehensible input and found something stubborn: they understood nearly everything, yet their production stayed inaccurate and they rarely produced more than a clause. Comprehension had soared. Speaking hadn't followed.

Out of that, Swain proposed the comprehensible-output hypothesis (1985): producing language does work that understanding it never can. When you have to say something, you notice the gap between what you mean and what you can build, you test how the grammar works, and you reach for words by recall instead of recognition. Listening lets you skate past all of that. Speaking forces it. That's why understanding doesn't transfer on its own — output is its own skill.

Why is speaking the hardest part of learning Korean if Hangul is easy?

Because Hangul isn't the hard part — real-time grammar is. The writing system is famously learnable in a weekend, so reading was never the bottleneck. The U.S. Foreign Service Institute classifies Korean as a "super-hard" language, exceptionally difficult for native English speakers, requiring roughly 2,200 class hours (about 88 weeks) to reach professional working proficiency — the same top tier as Arabic, Chinese, and Japanese.

That figure isn't measuring how long it takes to read 한글. It's the cost of producing Korean correctly while a conversation moves: the right honorific level, the right particles, verbs conjugated on the fly. You can know every rule cold and still jam when you have to apply all three at once, in real time. "Korean is hard" and "Hangul is hard" are different claims — the first is true, the second basically isn't.

Plenty of free input resources will build your comprehension fast, and you should use them. Just know what they do and don't do: they make the listening side easier and leave the speaking side exactly where it was.

Why do I freeze or go blank when it's my turn to talk?

Usually it's anxiety, not a gap in knowledge. The words are in there — you proved that a second ago when you understood the question. What jams is the act of producing under social pressure.

Horwitz and colleagues named this in 1986 with the Foreign Language Classroom Anxiety Scale, and two of its core components do the damage here: communication apprehension and fear of negative evaluation, the fear of sounding foolish in front of someone. That fear suppresses what researchers call willingness to communicate (MacIntyre et al., 1998): your readiness to actually open your mouth, which is inversely tied to anxiety. High anxiety, low willingness, blank face.

Most apps can't touch this, because they never put you in the situation that triggers it. You can't get desensitized to live speaking by tapping multiple-choice answers in a quiet room. If freezing mid-sentence is your particular wall, we wrote a whole guide on getting your first sentences out when you usually go blank — the short version is that the cure is reps, in a place safe enough that being wrong costs nothing.

Why don't input tools and language exchange close the gap on their own?

They each solve a real piece of the problem — and each leaves the speaking piece open. Here's the honest accounting.

Input tools deserve genuine credit. Duolingo, textbooks, spaced-repetition decks, and Korean YouTube are excellent at building comprehension and a daily habit, and they're cheap or free. The limit is structural: most of their exercises are tap-to-translate or multiple-choice — recognition, not production. Reviewers and long-term users keep landing on the same line, that Duolingo "teaches you to translate, not to speak." None of that makes it bad. It makes it an input tool.

Language exchange goes the other way: a real human and real output, the thing input tools lack. But its structure works against you. Each partner is a native speaker of the other's target language, so a session splits in two, and you spend roughly half of it speaking your own language for your partner's benefit. Add mismatched levels, time-zone friction, and partners who slowly stop replying, and consistency becomes the hard part.

| Layer | What it's great at | The speaking shortfall |
| --- | --- | --- |
| Input apps / SRS | Fast comprehension, daily habit | Recognition, not production; little real speaking |
| Textbooks / video | Grammar, vocabulary, listening | You read and watch; you rarely talk |
| Language exchange | Real human, real output | 50/50 split halves output; partners ghost |

Both are worth using. Neither, on its own, reliably gives you the one thing the output hypothesis says you need: sustained, on-demand reps of producing the language.

What does effective speaking practice actually look like?

Short, frequent, feedback-rich reps beat rare marathon sessions. Two well-supported findings point the same direction. The first is deliberate practice (Ericsson et al., 1993): effortful, goal-directed repetition with immediate feedback and a chance to try again. That means not just logging hours, but reaching slightly past what you can already do and correcting as you go. The second is the spacing effect: Cepeda and colleagues' 2006 meta-analysis found spaced practice beat massed (crammed) practice in 259 of 271 cases, or about 96% of the time.

Translate that into a prescription and it's almost boringly simple. A few minutes of real speaking most days, with feedback you act on, will move you faster than a two-hour session once a fortnight. The reps have to be effortful — you producing, not you listening — and they have to come back around often enough to stick.
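If you like seeing the numbers, here is a small Python sketch of the arithmetic behind that prescription. The 259 and 271 come from the meta-analysis figure quoted above; the two practice schedules (five minutes on ten days versus one two-hour session in a fortnight) are our own illustrative assumptions, not figures from any study, and the function names are made up for this example.

```python
# Figures quoted in the text for the spacing-effect meta-analysis.
SPACED_WINS = 259
TOTAL_COMPARISONS = 271


def win_rate(wins: int, total: int) -> float:
    """Share of comparisons in which spaced practice came out ahead."""
    return wins / total


def sessions_and_minutes(days_practiced: int, minutes_per_session: int) -> tuple[int, int]:
    """Return (number of sessions, total minutes) for one session per practice day."""
    return days_practiced, days_practiced * minutes_per_session


if __name__ == "__main__":
    rate = win_rate(SPACED_WINS, TOTAL_COMPARISONS)
    print(f"Spaced practice ahead in {rate:.1%} of comparisons")

    # Hypothetical fortnight schedules (assumed numbers, for illustration only).
    daily = sessions_and_minutes(days_practiced=10, minutes_per_session=5)
    marathon = sessions_and_minutes(days_practiced=1, minutes_per_session=120)
    print(f"Short daily reps: {daily[0]} sessions, {daily[1]} minutes")
    print(f"One marathon: {marathon[0]} session, {marathon[1]} minutes")
```

Under those assumed schedules the marathon logs more total minutes, but the short reps give you ten separate occasions to retrieve and produce the language instead of one. That is the trade the spacing effect suggests is worth making.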

One more thing worth saying: the tests don't measure this. A standard TOPIK score, like most study material, leans on reading and listening (TOPIK II adds a writing section), not on whether you can hold a conversation. We unpack that split in why tests don't measure whether you can talk. If your goal is speaking, you have to practice and check the thing you're actually after.

Closing the speaking-practice gap: a layer on top of your input tools

Here's where we come in, and here's the honest frame: Sudamate is a speaking-practice layer that sits on top of your input tools — never instead of them. Your apps, textbooks, and classes build the comprehension you need before you have anything to say. Sudamate exists for the next step those tools structurally can't take: turning that passive knowledge into spoken ability.

The mechanism is straightforward. Sudamate is voice calls with an AI Korean partner — it hears your pronunciation, replies in natural casual Korean, gently corrects you, and remembers you across calls. Because you're the one producing, every call forces the noticing and recall that Swain's output hypothesis says converts understanding into speaking. Because the partner is always free and judgment-free, you sidestep language exchange's 50/50 split and its ghosting, and you get the short, frequent reps the spacing effect rewards.

The early evidence is encouraging without being a slam dunk. A 2024 study in the journal System (Kim & Su) found that Korean-as-a-foreign-language learners who did eight AI-chatbot conversation sessions showed significant gains in willingness to communicate and reduced anxiety compared with a control group. It was one semester-long study with a modest, uneven sample, so it is supportive evidence, not proof that any app makes you fluent.

And the limits are real, so we'll name them. You need comprehensible input first; absolute beginners should lean on input tools to have something to produce. AI feedback is timely but it isn't a trained human teacher's, and pronunciation and genuinely unpredictable native interaction are still where AI partners are maturing. Sudamate is the speaking layer — the missing rep, not the whole curriculum.

But it's the rep almost everything else skips. You already understand more than you can say. The only way across that gap is to say it, often, somewhere it's safe to be wrong — and that's the part we built for.

Frequently asked

Why can I understand a language but not speak it?
Because understanding and speaking are different skills. Recognizing a word (input) is far easier than recalling and producing it on demand (output), so most learners' passive vocabulary ends up several times larger than their active vocabulary — a difference often cited as 3-5x. Input-heavy study — apps, video, reading — builds comprehension fast but gives very little speaking practice, leaving a gap between what you understand and what you can say.
Is comprehensible input enough to learn a language, or do you need output too?
Input is necessary but not sufficient. Stephen Krashen's input hypothesis (1977) holds that understanding language slightly above your level drives acquisition, but the research behind Merrill Swain's comprehensible-output hypothesis (1985) found that immersion students who understood nearly everything still couldn't produce the language accurately. Producing language, whether speaking or writing, forces you to notice gaps, test what you know, and recall words rather than just recognize them, which is why output has to be practiced separately.
Why is speaking the hardest part of learning Korean if Hangul is easy?
Hangul, the writing system, is famously learnable in a weekend, so reading isn't the bottleneck. The U.S. Foreign Service Institute classifies Korean as a 'super-hard' language requiring about 2,200 class hours, and that difficulty comes from real-time grammar — honorifics, particles, and verb conjugation that you assemble while speaking. Producing correct Korean under conversational time pressure is the hard part, not recognizing letters or words.
Why do I freeze or go blank when I try to speak?
Freezing is usually foreign-language speaking anxiety, not a lack of knowledge. Horwitz and colleagues (1986) identified communication apprehension and fear of negative evaluation as core components — the fear of making mistakes in front of someone suppresses your willingness to communicate. Most apps can't fix this because they never put you in a live, low-stakes situation where you actually have to talk; the cure is frequent, judgment-free speaking practice.
Does practicing with an AI conversation partner reduce speaking anxiety?
The early evidence is encouraging. A 2024 study in the journal System found that Korean-as-a-foreign-language learners who did eight AI-chatbot conversation sessions showed significant gains in willingness to communicate and reduced anxiety compared with a control group. It's one semester-long study with a modest sample, so it's not proof any app produces fluency, but it supports the idea that low-stakes AI speaking practice helps learners get over the hump of actually talking.

Practice this, out loud.

Sudamate is voice calls in Korean with a tutor who remembers what you care about. No homework, no streaks. Just talking.
