How Spaced Repetition Actually Works (And Why Most Apps Get It Wrong)
Learn the science behind spaced repetition vocabulary learning, from Ebbinghaus's forgetting curve to modern FSRS, and why most apps still fail to implement it.
Last updated: March 2026
TL;DR
- The forgetting curve is real: you lose roughly two-thirds of new vocabulary within 24 hours unless you review it at precisely the right time.
- Spaced repetition vocabulary systems work by scheduling reviews just before forgetting, turning a sprint into a marathon. Most apps simulate this but don't actually do it.
- The gap between a good SRS algorithm and a bad one is the difference between retaining 90% of what you study and retaining 30%.
You Forgot 70% of Yesterday's Vocabulary. Here's Why.
The average learner forgets roughly 70% of new vocabulary within 24 hours of first exposure. Not because they are lazy. Not because the words were too hard. Because the human brain, by design, discards information it has not been signaled to keep.
This is not a personal failing. It is a feature. The brain maintains an estimated 100 trillion synaptic connections, and without a pruning mechanism, it would collapse under its own weight. Unused or weakly reinforced connections are systematically weakened and removed through a process called synaptic pruning. The problem for language learners is that "signal to keep" is exactly what most study methods fail to send.
There are roughly 150 vocabulary learning apps on the App Store. Almost all of them understand this problem at a surface level. Duolingo has streaks. Quizlet has flashcard stacks. Anki has an algorithm that rivals academic research software. Yet most learners still plateau, still forget, still give up.
This post explains the actual science behind spaced repetition vocabulary learning, starting with a 19th-century German psychologist who memorized nonsense syllables until he understood memory itself, and then shows you precisely how to use it to retain vocabulary permanently.
Section 1: The Forgetting Curve, What Ebbinghaus Discovered in 1885
In 1885, Hermann Ebbinghaus published Über das Gedächtnis (On Memory), one of the most cited works in cognitive psychology. His experiment was simple and slightly obsessive: he memorized lists of nonsense syllables (like "WID," "ZOF," "BAK"), waited varying amounts of time, then tested how much effort was "saved" when relearning them.
What he found became known as the forgetting curve.
The data was stark:
- After 20 minutes, roughly 42% of new material was forgotten.
- After 1 hour, 56% was gone.
- After 1 day, about 67% had vanished.
- After 6 days, 75% was lost.
- After 31 days, about 79% was lost, with retention leveling off at roughly 21%.
(These figures are derived from Ebbinghaus's "savings score," which measures how much relearning effort was saved over time; the percentage forgotten is 100 minus the savings.)
The curve is not a straight line. It drops steeply in the first few hours, then gradually flattens. This is critical: the first review window matters most. A learner who reviews new vocabulary after 24 hours is already fighting uphill; one who reviews after 20 minutes is catching the curve at its steepest descent.
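To make the shape of the curve concrete, the short Python sketch below takes the figures listed above and computes how fast material was being lost in each window. The names are illustrative, and the 31-day point is taken as 79% forgotten, the complement of the 21% retention figure.

```python
# Forgetting figures quoted above: (hours since learning, percent forgotten).
# The 31-day point is 100 - 21, the complement of the retention figure.
POINTS = [
    (20 / 60, 42),
    (1, 56),
    (24, 67),
    (6 * 24, 75),
    (31 * 24, 79),
]

def loss_rates(points):
    """Return percentage points lost per hour within each successive window."""
    rates = []
    prev_t, prev_lost = 0.0, 0.0
    for t, lost in points:
        rates.append((lost - prev_lost) / (t - prev_t))
        prev_t, prev_lost = t, lost
    return rates

if __name__ == "__main__":
    for (t, lost), rate in zip(POINTS, loss_rates(POINTS)):
        print(f"by {t:7.1f} h: {lost}% forgotten, {rate:8.3f} points/hour in this window")
```

The loss rate falls from over 100 points per hour in the first 20 minutes to under one point per hour once the first hour has passed, which is the steep-then-flat shape described above.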
For language learners, the challenge is worse than Ebbinghaus's nonsense syllables in two ways. First, volume: a TOEFL candidate learning 50 words per week faces a recall burden Ebbinghaus never tested. Second, interference: real vocabulary items compete with each other ("affect" vs. "effect," "comprise" vs. "compose"), creating proactive and retroactive inhibition that accelerates forgetting.
Memory Consolidation and Sleep
Ebbinghaus worked before neuroscience had the tools to explain why forgetting followed this curve. We now know: newly learned information exists first in working memory and must be transferred to long-term memory through a process called memory consolidation. Consolidation is energy-intensive and happens primarily during sleep, specifically during slow-wave sleep and REM cycles.
This has a direct practical implication: reviewing vocabulary in the evening before sleep is not superstition. It is physiology. Words reviewed before sleep enter the consolidation window at its most active point. Studies on declarative memory (the type used for vocabulary) consistently show that sleep-adjacent review improves next-day recall by 20–40%.
The forgetting curve, in other words, has a natural exploit: time your reviews to intercept the drop, and amplify with sleep. Spaced repetition is the system that does both.
Section 2: What Spaced Repetition Actually Is
How Does Spaced Repetition Work?
Spaced repetition is a study technique that schedules reviews of individual items at increasing intervals, calibrated to arrive just before the memory of each item is predicted to fade.
Instead of reviewing all 50 words every day (massed practice, or "cramming"), a spaced repetition system tracks each word individually and asks: when is this specific learner likely to forget this specific word? It schedules the next review at that threshold, not before (which wastes time on still-fresh memories) and not after (which lets forgetting win).
The analogy for cramming is a sprint: intense, fast, and unsustainable. Spaced repetition is a marathon: steady, compounding, and permanent. Cramming can produce a 90% score on a test taken the next morning and a 20% score on the same test taken one week later. Spaced repetition produces a 70% score the next morning and a 70% score one month later, and the gap keeps widening in spaced repetition's favor over time.
The SM-2 Algorithm
The most widely implemented spaced repetition algorithm is SM-2, developed by Piotr Wozniak for the SuperMemo software in 1987. SM-2 works like this:
- When you review a card, you rate how well you recalled it on a 0-to-5 scale.
- Based on your rating, the algorithm calculates an interval (days until next review) and an ease factor (a multiplier that grows if you keep recalling easily and shrinks if you struggle).
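- In the original SM-2, the first successful recall schedules the next review in 1 day and the second in 6 days (Anki-style variants use different starting steps, such as 4 days for an easy new card). A complete failure resets the card to 1 day.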
- After each successful review, the interval multiplies by the ease factor. A card you've recalled correctly five times might have an interval of 30 days. A card you keep failing stays at 1–2 days indefinitely.
The result is that easy vocabulary (words like "table" or "city" for a B1 learner) quickly gets scheduled monthly or less, demanding almost no daily attention. Hard vocabulary ("sycophant," "ameliorate," "perfidious") stays in heavy rotation until it's solidified.
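The steps above can be sketched in a few lines of Python. This is a minimal sketch of the SM-2 update as its published description is commonly implemented, not the code of any particular app. The `Card` class and `review` function are names invented for this example, ratings use SM-2's 0-to-5 scale, and the first two successful intervals are the original algorithm's 1 and 6 days. Implementations differ on whether the new interval uses the ease factor before or after it is updated; this sketch assumes the value before the update.

```python
from dataclasses import dataclass

MIN_EASE = 1.3  # SM-2 never lets the ease factor drop below this floor

@dataclass(frozen=True)
class Card:
    interval_days: int = 0   # days until the next review
    repetitions: int = 0     # consecutive successful recalls
    ease: float = 2.5        # SM-2's starting ease factor

def review(card: Card, quality: int) -> Card:
    """Return the card's new state after a review rated 0 (blackout) to 5 (perfect)."""
    if not 0 <= quality <= 5:
        raise ValueError("quality must be between 0 and 5")

    if quality < 3:
        # Failed recall: restart the interval sequence at 1 day.
        interval, repetitions = 1, 0
    else:
        if card.repetitions == 0:
            interval = 1
        elif card.repetitions == 1:
            interval = 6
        else:
            # Assumption: multiply by the ease factor as it stood before this review.
            interval = round(card.interval_days * card.ease)
        repetitions = card.repetitions + 1

    # Ease rises by 0.1 on a perfect recall and falls as the rating drops.
    miss = 5 - quality
    ease = max(MIN_EASE, card.ease + 0.1 - miss * (0.08 + miss * 0.02))
    return Card(interval, repetitions, ease)

if __name__ == "__main__":
    card = Card()
    for rating in (5, 5, 5, 1):
        card = review(card, rating)
        print(rating, card)
```

Three perfect recalls in a row give intervals of 1, 6, and 16 days; the failed fourth review sends the card back to a 1-day interval with a lower ease factor.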
FSRS: The 2022 Upgrade
SM-2 was excellent for 1987. In 2022, a researcher named Jarrett Ye published FSRS (Free Spaced Repetition Scheduler), a machine-learning-based algorithm trained on millions of real Anki review logs. FSRS makes three key improvements over SM-2:
- It models memory with separate variables instead of a single ease multiplier: difficulty (how hard the item is for you), stability (how long before forgetting), and retrievability (current probability of correct recall). SM-2 tracked only an interval and an ease factor.
- It recalibrates predictions per individual, not just per card. Your personal forgetting rate is factored in.
- It handles lapses (re-learning forgotten cards) more accurately, preventing the "ease hell" problem common in SM-2, where repeatedly failed cards get stuck with a low ease factor and come back so often that daily loads become crushing.
FSRS has been available in Anki since version 23.10 (November 2023) and is widely considered the best publicly available SRS algorithm as of 2026. Users must enable it manually, but adoption is growing rapidly.
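To illustrate what stability and retrievability mean in practice, here is a small Python sketch of a power-law forgetting curve of the kind FSRS uses. It follows the simple form published for FSRS version 4, where stability is the number of days after which recall probability drops to 90%. Later FSRS versions adjust the curve's shape and fit their parameters from review data, so treat the constants and function names here as assumptions, not as Anki's actual code.

```python
def retrievability(days_elapsed: float, stability: float) -> float:
    """Predicted probability of recall after `days_elapsed` days.

    By construction, the result is 0.9 when days_elapsed == stability.
    """
    return 1.0 / (1.0 + days_elapsed / (9.0 * stability))

def next_interval(stability: float, desired_retention: float = 0.9) -> float:
    """Days to wait until predicted recall has fallen to `desired_retention`."""
    if not 0.0 < desired_retention < 1.0:
        raise ValueError("desired_retention must be between 0 and 1")
    return 9.0 * stability * (1.0 / desired_retention - 1.0)

if __name__ == "__main__":
    for target in (0.9, 0.8):
        print(target, round(next_interval(10.0, target), 1))
```

With a stability of 10 days, a 90% retention target yields a 10-day interval, while relaxing the target to 80% stretches it to about 22.5 days: fewer reviews in exchange for more forgetting.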
The visualization: imagine two lines on a graph. The cramming line rises sharply before an exam, then collapses. The SRS line climbs more slowly but never falls below 70%, compounding session by session into something close to permanent retention.
Section 3: The Science Behind Why It Works
Retrieval Practice: Testing Beats Re-reading
In 2006, cognitive psychologists Henry Roediger III and Jeffrey Karpicke published a landmark study comparing learning conditions: study-only (re-reading) versus repeated testing. On a final test given one week later, students who were repeatedly tested showed significantly higher retention than those who only re-read the material. The study-only group forgot more than half of what they had originally learned, while the repeatedly tested group forgot far less.
The mechanism: retrieval practice, the act of pulling information out of memory rather than just putting it in, creates stronger, more durable memory traces than passive review. Every time you answer a flashcard, you are not simply checking whether you know the word. You are strengthening the neural pathway that retrieves it.
This is why active recall vocabulary practice systematically outperforms highlighting, re-reading, and even taking notes. Effort at retrieval time is what builds memory.
Desirable Difficulties
Robert Bjork at UCLA introduced the concept of desirable difficulties: the finding that slightly harder study conditions produce better long-term retention, even when they feel less productive in the moment.
For spaced repetition vocabulary, this means reviewing a word when you are almost going to forget it, not when it is still fresh and easy. The difficulty of barely remembering "perfidious" on the last day before forgetting is precisely what makes the retrieval event so powerful. Interval scheduling that's too aggressive (reviewing too frequently) kills this effect. Interval scheduling that's too loose produces more forgetting events, which have their own cost.
The sweet spot is the SRS interval. It is not comfortable. It is precisely calibrated to be just hard enough.
Interleaving
Blocked study means reviewing all your "transportation vocabulary" together, then all your "academic verbs." Interleaved study means mixing them randomly. Interleaving consistently produces worse performance in the short term and better performance in the long term, another desirable difficulty.
A well-implemented spaced repetition vocabulary system naturally produces interleaving: because each word has its own interval, your daily review queue mixes words from dozens of semantic categories and exposure ages. This is not an accident; it is a structural advantage over any study method that groups items by topic.
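A minimal Python sketch of why per-word intervals produce a mixed queue. The words, topics, and due dates are made up for illustration, and `due_queue` is a hypothetical helper, not part of any app's API.

```python
from datetime import date
from typing import NamedTuple

class Card(NamedTuple):
    word: str
    topic: str
    due: date

def due_queue(cards: list[Card], today: date) -> list[Card]:
    """Return every card due on or before `today`, most overdue first."""
    return sorted((c for c in cards if c.due <= today), key=lambda c: c.due)

# Illustrative deck: each word carries its own due date from earlier reviews.
TODAY = date(2026, 3, 10)
DECK = [
    Card("commute", "transportation", date(2026, 3, 9)),
    Card("ameliorate", "academic verbs", date(2026, 3, 10)),
    Card("itinerary", "transportation", date(2026, 3, 25)),
    Card("perfidious", "character", date(2026, 3, 8)),
    Card("synthesize", "academic verbs", date(2026, 4, 2)),
]

if __name__ == "__main__":
    for card in due_queue(DECK, TODAY):
        print(card.due, card.topic, card.word)
```

Nothing in the code shuffles by topic. The mix of categories falls out of the individual due dates.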
Sleep: The Consolidation Window
Neuroscience has confirmed what Ebbinghaus could only infer: the hippocampus, which tags new memories for consolidation, replays and strengthens memory traces during slow-wave sleep. Words studied within a few hours of sleep benefit from longer hippocampal replay. A 15-minute vocabulary review session before bed is worth more than a 30-minute session at noon, all else being equal.
Rhythm Word's offline capability matters here precisely because the bedtime review session is when Wi-Fi is most likely to be off, the device is on Do Not Disturb, and the optimal consolidation window is open.
Section 4: Why Most Apps Get It Wrong
Not every app that claims to use spaced repetition actually does, and the differences matter enormously for long-term retention. Here is an honest breakdown of the major players.
Duolingo
Duolingo has a large and well-funded research team. It has published studies on its own effectiveness. Its core product, however, is optimized for engagement metrics (daily active users, streak counts, notification open rates) rather than vocabulary retention specifically.
The spaced repetition component in Duolingo is subordinate to its lesson structure. Vocabulary re-appears based on lesson progression, not per-item forgetting curves. There is no difficulty rating. There is no adaptive scheduling at the word level. The gamification layer (hearts, streaks, leaderboards) is the primary mechanism driving behavior, and it has been designed to maximize daily opens, not optimal review timing.
This is not a criticism of Duolingo for general language exposure. For spaced repetition vocabulary specifically, its implementation is lightweight.
Quizlet
Quizlet's "Learn" mode simulates spaced repetition by presenting items more frequently when you miss them. This is directionally correct but structurally limited: intervals are fixed and based on within-session behavior, not multi-day scheduling. There is no cross-session memory model. A word you "learned" on Tuesday has no special status on Thursday.
Quizlet's advantage is content creation: millions of user-made sets for every subject imaginable. Its SRS implementation does not match the underlying science.
Anki
Anki is the gold standard for SRS algorithm quality. Its FSRS implementation is, as noted above, the most sophisticated publicly available scheduling algorithm. For vocabulary learners who commit to it, Anki produces exceptional long-term retention.
The widely reported friction point (frequently cited as high abandonment within the first month across Reddit, language learning forums, and productivity communities) is not the algorithm. It is the setup experience. Users must source or create their own decks, configure the interface, understand the algorithm settings, and build a daily habit around a tool that has seen minimal UX investment since the mid-2000s. The learning curve for Anki is a meaningful barrier, especially for learners who are not already technically inclined.
Anki is also fully offline and excels there. But for a learner who wants a ready-to-use system rather than a research tool to configure, the barrier is real.
The Comparison
| Feature | Duolingo | Quizlet | Anki | Rhythm Word |
|---|---|---|---|---|
| True adaptive SRS | Partial | Partial | Yes (FSRS) | Yes |
| Per-word scheduling | No | No | Yes | Yes |
| Difficulty self-rating | No | Partial | Yes | Yes |
| Real-time sentence generation | No | No | No | Yes |
| Offline capable | Limited | No | Yes | Yes |
| Setup required | None | Low | High | None |
| Modern slang / current vocab | No | User-dependent | User-dependent | Yes |
| Free to try | Freemium | Freemium | Free | Yes |
What Rhythm Word's Six Engines Address
The "6 learning engines" in Rhythm Word are not marketing language. They reflect a genuine gap in how most vocabulary apps handle memory pathways.
Human vocabulary acquisition uses multiple cognitive routes: recognizing a word when you see it is different from recalling it when writing, which is different from using it naturally in context. A system that only drills recognition (multiple-choice identification) will produce learners who can pass a reading comprehension test but cannot produce the word unprompted.
Rhythm Word's six engines cover:
- Recognition: seeing the word, identifying the meaning (passive recall)
- Production: seeing the definition, producing the word (active recall)
- Recall in context: the word appears inside a personalized sentence; the learner confirms understanding
- Contextual judgment: is this sentence using the word correctly? Forces semantic precision
- Spaced retrieval: core SRS scheduling layer, identical in logic to FSRS
- Interleaved review: mixed-topic daily queues that prevent blocked-study stagnation
Together, these six routes attack vocabulary from every angle that cognitive science has identified as load-bearing. No single engine alone is sufficient.
Section 5: What Good Spaced Repetition Looks Like
If you are evaluating any spaced repetition vocabulary system (including Rhythm Word), these are the criteria that separate real implementations from surface-level simulations.
Must-Haves
Adaptive scheduling. The interval between reviews must change based on your actual recall performance, not a fixed schedule. If you consistently struggle with a word, the system must increase review frequency. If you master a word, it must schedule reviews further apart. This is the core of SRS.
Difficulty self-rating. You must be able to tell the system how hard a recall was. This is not about honesty theater. It is the signal the algorithm uses to calibrate future intervals. An app that removes this step (by auto-detecting correctness only) loses critical information about the quality of the recall, not just the outcome.
Context-rich content. Vocabulary is not a list of word-definition pairs. Words have collocations, register (formal vs. informal), common errors, and semantic nuance. A system that shows only "perfidious = treacherous" misses the full picture. Sentences, especially sentences calibrated to the learner's current level, encode vocabulary in the neural context closest to how it will actually be used.
Nice-to-Haves
Personalized, level-adapted sentences. Static example sentences from a dictionary are better than nothing. Sentences generated to match a learner's current vocabulary level and interests are significantly better: they are comprehensible, memorable, and personally relevant. Rhythm Word generates these dynamically; a sentence for "ephemeral" shown to a B2 learner studying academic vocabulary will differ from one shown to a C1 learner reviewing for the GRE.
Offline capability. A review habit built around Wi-Fi is a habit with a single point of failure. Commute, travel, and pre-sleep review sessions all benefit from full offline functionality. Syncing can happen when connectivity returns.
Visual progress feedback. Retention curves, streak data, and due-card counts are not vanity metrics. They are feedback loops that help learners calibrate their study investment. Seeing that a word has been reviewed 8 times and now has a 30-day interval is motivating in a way that a raw flashcard stack is not.
How Rhythm Word Maps to These
Rhythm Word's adaptive scheduler adjusts intervals dynamically per word. In each review, the target word in the sentence is bold by default, indicating you remembered it. If your recall was not confident, you can tap the word to change its status: orange means the word felt fuzzy, and red means you forgot it entirely. This simple interaction gives the algorithm the precise difficulty signal it needs without interrupting your reading flow.
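Rhythm Word's internals are not described here, so the following is only a hypothetical Python sketch of how a three-state tap signal could be turned into the graded rating a scheduler consumes. The numeric values are assumptions chosen to fit SM-2's 0-to-5 scale, and all names are invented for the example.

```python
from enum import Enum

class Status(Enum):
    BOLD = "remembered"   # default state, left untouched
    ORANGE = "fuzzy"      # learner marked the word as hesitant
    RED = "forgot"        # learner marked the word as forgotten

# Hypothetical mapping onto a 0-5 quality scale; the exact numbers are assumptions.
QUALITY = {Status.BOLD: 5, Status.ORANGE: 3, Status.RED: 1}

def to_quality(status: Status) -> int:
    """Translate a tap status into a graded recall rating."""
    return QUALITY[status]

def is_lapse(status: Status) -> bool:
    """On the SM-2 scale, ratings below 3 count as a failed recall."""
    return to_quality(status) < 3
```

The design point is that "fuzzy" still counts as a successful recall but carries a lower grade, so a scheduler can shorten the next interval without resetting the word.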
Example sentences are personalized and level-matched: a beginner studying TOEFL vocabulary sees simpler sentences with the target word in an accessible context, while an advanced learner studying GRE words sees the same word in a more demanding register.
The app's offline mode downloads all card content and scheduling data locally. No Wi-Fi means no interruption. For a learner on a Seoul subway at 7 AM with patchy connectivity, this is not a minor feature. It is the difference between a maintained habit and a broken one.
Section 6: How to Start Today
The research is settled. Spaced repetition vocabulary works. The gap is always implementation, specifically building a daily habit that does not collapse after two weeks.
Here is a five-step protocol that works with any serious SRS system, and maps directly to how Rhythm Word is designed to be used.
Step 1: Choose one word list matched to your goal. Do not study "all English vocabulary." Pick one list: TOEFL Academic Word List, GRE high-frequency words, IELTS Band 7 collocations, or Everyday Contemporary English. Focused input beats diffuse input at every stage. Rhythm Word comes pre-loaded with word lists across all of these categories. Pick the one that matches your exam or goal and start there.
Step 2: Limit new words to 15–20 per day. This is not a motivation issue. It is a queue management issue. Adding 50 new words today creates a review load in 4 days that will take 45 minutes, and most people will skip it. A sustainable 15 new words per day compounding over 30 days creates a manageable queue that stays under 20 minutes daily. Consistency over volume.
Step 3: Review all due cards before adding new ones. Every day, the first action is to clear yesterday's review queue. Skipping due reviews while adding new words is the fastest way to create an unmanageable debt. The algorithm has scheduled those reviews at the optimal window. Bypassing that window increases future review load rather than reducing it. For more on daily volume strategy, see How to Learn 30 Words Per Day.
Step 4: Rate difficulty honestly. Tapping through everything without adjusting status feels fast, but it produces interval inflation: words get scheduled further out than warranted, leading to forgotten words re-entering the queue as failed cards with a reset interval. Leave the word bold only when recall was immediate and effortless. Tap to mark it orange when you hesitated. Tap to mark it red when you genuinely could not recall. The algorithm works best on honest data.
Step 5: Commit to 15 minutes daily, not 45 minutes twice a week. The spacing effect is destroyed by massed sessions. Two 45-minute sessions per week produce significantly worse retention than seven 15-minute sessions. Daily consistency is not about discipline; it is about how memory consolidation works. Short daily sessions align with the biology. For learners targeting GRE prep, Rhythm Word includes a dedicated GRE word list with a structured study protocol built in.
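The queue arithmetic behind Step 2 can be checked with a rough simulation. The Python sketch below assumes every card is always recalled correctly and follows a fixed, made-up ladder of intervals (1, 3, 7, 16, 35 days), plus a guessed pace of 15 seconds per card. Real schedulers adapt per word, so the absolute numbers are illustrative and only the comparison between paces matters.

```python
from itertools import accumulate

LADDER = (1, 3, 7, 16, 35)   # assumed gaps in days between successive reviews

def daily_reviews(new_per_day: int, days: int, ladder=LADDER) -> list[int]:
    """Reviews due on each day when `new_per_day` cards are added every day."""
    offsets = list(accumulate(ladder))   # days after introduction: 1, 4, 11, 27, 62
    load = [0] * days
    for start in range(days):
        for offset in offsets:
            if start + offset < days:
                load[start + offset] += new_per_day
    return load

def minutes(cards: int, seconds_per_card: float = 15.0) -> float:
    """Session length for `cards` items at an assumed pace."""
    return cards * seconds_per_card / 60

if __name__ == "__main__":
    for pace in (15, 50):
        reviews = daily_reviews(pace, days=30)[-1]
        total = minutes(reviews + pace)
        print(f"{pace} new/day: {reviews} reviews on day 30, {total:.1f} min with new cards")
```

Under these assumptions, 15 new words a day settles at just under 19 minutes by day 30, while 50 a day passes an hour.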
Conclusion: The Science Is Not the Hard Part
Spaced repetition has been understood for 140 years. The algorithm to implement it has been publicly available since 1987 and refined with machine learning since 2022. The research on retrieval practice, desirable difficulties, and sleep consolidation is not contested.
The hard part is closing the gap between knowing and doing, and that gap is where app design lives.
Duolingo closes it with streaks and gamification, at the cost of SRS fidelity. Anki closes it with algorithmic precision, at the cost of an accessible daily habit. Most other apps don't try seriously on either front.
The design challenge Rhythm Word was built to solve is precisely this: make a system with real SRS mechanics, context-rich personalized sentences, six cognitive learning pathways, and full offline support, and make it as easy to pick up as a social media app. No setup. No configuration. Free to try, with six learning engines and one tap to start.
Download Rhythm Word free on iOS: no complicated setup, just science.
Frequently Asked Questions
Does Duolingo use spaced repetition?
Duolingo incorporates elements of spaced repetition in its review sessions, but its core learning loop is built around lesson completion and gamification (streaks, hearts, XP) rather than per-word adaptive scheduling. Vocabulary reappears based on lesson structure, not individual forgetting curves. For a true SRS system focused specifically on vocabulary retention, Duolingo's implementation is limited compared to Anki or Rhythm Word.
Is Anki better than other spaced repetition apps?
Anki's FSRS algorithm (available since version 23.10 in November 2023) is the most sophisticated publicly available SRS implementation for vocabulary. For learners who are technically comfortable and willing to invest time in setup and deck curation, Anki delivers excellent long-term retention. However, its interface is dated and its setup friction is high; community discussions consistently cite steep early abandonment as a recurring pattern. For learners who want FSRS-quality scheduling without the configuration overhead, Rhythm Word offers a ready-to-use alternative with added sentence generation.
How long does it take for spaced repetition to work?
You will notice improved retention within the first week. The compounding effect, where reviews become less frequent because words are solidly encoded, becomes visible around 30–45 days. By day 90 with consistent 15-minute daily sessions, most learners have a vocabulary of 500–800 words with retention rates above 80%. Spaced repetition does not produce overnight results; it produces durable results that passive methods cannot match.
How many words can you learn per day with spaced repetition?
15–20 new words per day is the sustainable ceiling for most learners using a genuine SRS system, assuming a 15-minute daily session budget. At this pace, the review queue remains manageable and retention stays high. Learners who push to 50+ new words per day typically create review queues so large that sessions become overwhelming, leading to skipped days and cascading overdue cards. Quality of encoding matters more than volume of input.
What is the SM-2 algorithm?
SM-2 is a spaced repetition scheduling algorithm developed by Piotr Wozniak in 1987 for the SuperMemo software. It calculates review intervals based on two variables: the current interval (the number of days between reviews) and an ease factor (a per-card multiplier that adjusts based on recall difficulty). Easy recalls raise the ease factor and extend the next interval; failed recalls reset the interval and reduce ease. SM-2 remained the standard SRS algorithm for over 30 years. It has largely been superseded by FSRS (2022), which adds stability and retrievability as separate memory dimensions and recalibrates per individual learner.
Rhythm Word is available on iOS. If the way we think about vocabulary learning resonates with you, we would love for you to try it.