Geometric Series

From the idea to the engineering — and how to catch the model in a lie.

Section 1 of the Applied Mathematics book · interactive throughout

Now look — I want to show you one of the plainest little ideas in all of mathematics: you take some quantity and you shave it by the same fraction, again and again. That's the whole thing. And yet this one move — shrink, shrink, shrink — puts a price on a bond that promises to pay you forever, smooths the jitter out of a noisy sensor, teaches a machine to plan ahead, brings a bouncing ball to rest in a finite stretch of time, and tells a young company whether its growth will catch fire or quietly run out of breath. It's one pattern wearing a great many costumes. The last chapter pulls off the mask and names it; the chapter just before that shows you exactly where it breaks — the moments it quietly lies to you.

Chapter One — Theory

1. The Finite Geometric Series

Here's the whole idea, and it's about as plain as ideas get. You take a number, and to get the next one you just multiply by the same fixed amount — and you do it again, and again. Each term is the one before it, scaled by some constant. That constant — the thing you keep multiplying by — is what people call the common ratio, and we'll write it $r$. A sum built that way is a geometric series.

$ S_n = a + ar + ar^2 + ar^3 + \cdots + ar^{n-1} = \sum_{k=0}^{n-1} ar^k $

So $a$ is where you start, $r$ is the factor you keep applying, and you do it $n$ terms' worth.

Deriving the closed form

Now, you could add these up one at a time, but who's got the patience? There's a lovely little trick instead — multiply the whole sum by $r$, and then subtract. Watch what happens.

$ S_n = a + ar + ar^2 + \cdots + ar^{n-1} $

$ rS_n = \;\;\;\;\; ar + ar^2 + \cdots + ar^{n-1} + ar^{n} $

Take the second line away from the first, and look — almost everything in the middle washes right out. Each $ar$, each $ar^2$, all the way down, appears in both lines and cancels itself. You're left with just the two ends:

$ S_n - rS_n = a - ar^n \qquad\Longrightarrow\qquad S_n(1-r) = a(1-r^n) $

Result — Finite Sum

$ \boxed{\,S_n = a \cdot \frac{1 - r^n}{1 - r}\,} \quad (r \neq 1) $

And there it is — one little formula that swallows any finite geometric sum you can throw at it, ten terms or ten million, and that's the whole trick.

Chapter Two — Theory

2. The Infinite Geometric Series

Now let's not stop. Let the terms keep coming forever: $n \to \infty$. And here's the question that has to make you uneasy — what does an endless sum do? Does it run away on you, or does it settle? Everything hinges on $r$. Watch the $r^n$ term in the closed form, and ask what becomes of it as $n \to \infty$:

If $|r| < 1$, then $r^n \to 0$ — each step is a fraction of the one before, so the pieces shrink toward nothing.
If $|r| \geq 1$, then $r^n$ does not die away, and the whole thing blows up — the sum diverges.

So when $0 < r < 1$ — and more generally whenever $|r| < 1$ — that troublesome $r^n$ term just washes out, and look what's left:

Result — Infinite Sum

$ \boxed{\,\sum_{k=0}^{\infty} ar^k = \frac{a}{1 - r}\,} \quad (|r| < 1) $

the most useful identity in applied mathematics

And if you set $a = 1$, it gets even plainer:

$ \sum_{k=0}^{\infty} r^k = 1 + r + r^2 + r^3 + \cdots = \frac{1}{1 - r} $

Intuition

Why should an endless pile of numbers add up to anything at all? Because each piece is smaller than the last by a fixed fraction, so they fold in faster than they accumulate, and the total settles onto one finite number. Picture it: you walk half the remaining distance to a wall, then half of what's left, then half of that, on and on forever ($r = \tfrac12$) — and the steps sum to $\frac{1}{1-1/2} = 2$. You travel exactly twice that first step, and you never once touch the wall. Isn't that something?

Chapter Three — Theory

3. Differentiating Both Sides — New Identities for Free

Now look — this is where the little trick really starts to pay you back. That identity $\sum_{k=0}^{\infty} r^k = \frac{1}{1-r}$ isn't just a fact about one number. It's an equality between two functions of $r$ — the same value, every $r$ in range, all the way along. And here's the thing: if two functions march together over a whole stretch, then so do their slopes. So let's take the slope of both sides — differentiate with respect to $r$ — and see what falls out.

Left side — term by term

$ \frac{d}{dr}\sum_{k=0}^{\infty} r^k = \sum_{k=1}^{\infty} k\, r^{k-1} $

(the $k=0$ term is constant, so it drops out)

Right side — chain rule on $(1-r)^{-1}$

$ \frac{d}{dr}\left(\frac{1}{1-r}\right) = \frac{1}{(1-r)^2} $

Set the two equal — they have to be — and out comes a brand-new closed form, for nothing:

Result — First Derivative

$ \boxed{\,\sum_{k=1}^{\infty} k\, r^{k-1} = \frac{1}{(1-r)^2}\,} \quad (|r| < 1) $

And if you give the whole thing one more push by $r$, you get the version people lean on most — the one where each term carries its position as a weight:

$ \sum_{k=1}^{\infty} k\, r^{k} = \frac{r}{(1-r)^2} $

Why this matters

The plain old sum $\frac{1}{1-r}$ tells you "what's the total?" Fine. But the differentiated one answers something deeper — "where does all the weight sit? what's the average?" Watch why: that $k$ sitting in front of each term in $\sum k\,r^k$ tags every term with its position. Each thing counts according to where it lands. So this identity is, secretly, an averaging machine. And here are three honest-to-goodness problems it cracks — all of them the very same formula, just wearing different clothes:

1. How many tries until something works. Roll a die and wait for your first 6. The chance it takes you exactly $k$ rolls shrinks geometrically — each extra roll a fixed fraction less likely than the last. Now ask the average number of rolls: weight each outcome by how likely it is, and that's precisely a $\sum k\,r^k$ sum. It collapses to $1/p$ — so $1/(1/6)=6$ rolls, and not one symbol more. That "keep trying till it works" shape has a name people use — the geometric distribution: retries until a request finally goes through, attempts until the right key turns in the lock.

2. The average line you'll wait in. Picture a coffee shop where, in each stretch of service, there's a fixed chance one more customer walks in. The line that builds up settles into probabilities that decay geometrically — long lines getting rarer at a steady rate. So what's the average queue length — the number you'd actually need to decide whether to put on a second barista? It's $\sum k\cdot(\text{prob of }k)$, that same position-weighted sum staring back at you. This is the bedrock of queueing theory — how people size call centres, checkout lanes, and pools of servers.

3. The balance point of decaying weights. Here's a picture you can feel. Hang masses $1, r, r^2, \ldots$ along a weightless ruler at the marks $0,1,2,\ldots$ — each one lighter than the last. Where does it balance on your finger? That balance point — the center of mass — is $\frac{\sum k\,r^k}{\sum r^k} = \frac{r}{1-r}$. And this is exactly the bookkeeping inside a momentum optimizer's memory (§5.5): the balance point tells you how many steps back the thing is really still feeling.

So here's the punchline. One identity — got for free by taking the slope of an equation we already had — answers "what's the average?" across waiting, queueing, and weighting. Three problems that look like total strangers, until you notice they're all $\sum k\,r^k$ underneath. It's the same darn thing in a different suit.

Going further — risk, not just the average

And why stop? Take the slope again and you climb up to $k(k-1)$ weighting:

$ \sum_{k=2}^{\infty} k(k-1)\, r^{k-2} = \frac{2}{(1-r)^3} $

Now, §5.6 tells you the average number of retries before success is $1/p$. Good to know — but it's not enough for someone planning capacity. They need the spread: every so often, does a stubborn request drag on for 20-plus tries? That's the variance, and to get it you need the second moment $\sum k^2\cdot(\text{prob})$ — and that $k^2$ is exactly what drags you to the second derivative. Work it through and it comes out to $\frac{1-p}{p^2}$. Take a request that succeeds a quarter of the time: the average is 4 tries, but the standard deviation is about 3.5 — so the retry counts are flung wide around that average. That's the whole reason careful systems cap their retries and sprinkle in jitter. And going from "what's typical" to "how nasty is the tail" — that's the very same move that carries you from a portfolio's expected return over to its volatility.

Integrating instead — the logarithm

We took slopes; now let's go the other way and add up the area under the series, term by term. And look what comes out — the logarithm:

$ \int_0^x \frac{dr}{1-r} = \sum_{k=0}^{\infty}\frac{x^{k+1}}{k+1} \;\Longrightarrow\; -\ln(1-x) = \sum_{k=1}^{\infty}\frac{x^k}{k} = x + \frac{x^2}{2} + \frac{x^3}{3} + \cdots $

How a calculator actually computes $\ln$. Here's something that ought to surprise you: your calculator has no "logarithm" inside it. None. It can add, multiply, divide — and that's the lot. So how does it hand you $\ln(1.1)$? It runs exactly this series with $x=0.1$: $\;0.1 - \frac{0.1^2}{2} + \cdots \approx 0.0953$, and because $x$ is small, the terms shrink so fast you're done almost before you've started. The same series even tells you why $\ln$ crawls along so slowly — every term gets divided by $k$, so each one chips in less. And that's the real reason, right down at the level of the series, that log scales (decibels, pH, Richter, the $O(\log n)$ running times) can squash enormous ranges down to something you can hold in your hand.

The Seed

One starting identity, an entire family of results — take its slope and you get averages and risk; add up its area and you get logarithms. The seed of the whole thing is $\sum r^k = \frac{1}{1-r}$.

Chapter Four — Theory

4. The Toolkit

Here's the whole thing on one card. Six identities — and every one of them grew out of that single seed, the same shrinking-by-a-fixed-fraction idea wearing six different costumes. Keep this beside you when we get to the applications; that's all you'll need.

The geometric-series toolkit
Identity	Closed form	Condition
Finite sum	$a\dfrac{1-r^n}{1-r}$	$r \neq 1$
Infinite sum	$\dfrac{a}{1-r}$	$\|r\|<1$
First derivative	$\dfrac{1}{(1-r)^2}$	$\|r\|<1$
Weighted $\,k\,r^k$	$\dfrac{r}{(1-r)^2}$	$\|r\|<1$
Second derivative	$\dfrac{2}{(1-r)^3}$	$\|r\|<1$
Integral	$-\ln(1-x)$	$\|x\|<1$

Chapter Five — Applications

5. Real-World Applications

Now look — for each of these, the part worth watching isn't the formula at all. It's the problem someone was actually stuck on, and the one little move that got them unstuck. And keep your eye out for the same fingerprint every single time: something decays by a fixed fraction each step, and we want its total or its mean.

Finance

How do you price something that pays forever?

The problem. Picture Britain in 1751, rolling all its war debts into one peculiar bond they called the consol. This thing never came due — no final payday, no return of your money. It just promised to pay you a fixed sum, say £3, every year, forever. So ask yourself the honest question: what's a promise of £3 a year, for all eternity, worth to you today? It can't be infinity — nobody hands over infinity. But how on earth do you add up a stream that never, ever stops?

Why the obvious approach fails. Try the lazy way — write down $3+3+3+\cdots$ and keep going. It runs away on you; it diverges. But here's the thing everybody forgets: a pound handed to you 50 years out is worth far less to you today, because today's pound you could put to work. At 5% a year, £1 due next year is only worth $\frac{1}{1.05}$ today; £1 two years out, $\frac{1}{1.05^2}$; and so on down the line.

The move. So watch what each payment really is — it's the one before it, shrunk by that same factor $\frac{1}{1.05}$. That's our shrinking pattern again, a geometric series with $r=\frac{1}{1.05}$, and we already know how to sum it:

$ PV = \frac{3}{1.05}+\frac{3}{1.05^2}+\frac{3}{1.05^3}+\cdots = \frac{3}{0.05} = 60. $

And the whole endless stream collapses to something astonishingly plain — an endless £3-a-year is worth exactly £60 today. Price = payment ÷ interest rate, and that's all there is to it. People traded consols on this very reasoning for over 250 years, until Britain bought back the last of them in 2015. The same argument prices a share of stock as the discounted sum of every dividend it will ever pay: the price is the series.

Signal Processing

How do you smooth a noisy sensor without storing its history?

The problem. Suppose you've got a thermostat, or a heart-rate monitor, and the readings come in jittery. The obvious cure is to average the last $N$ of them — but that means holding onto $N$ readings and re-adding the whole pile every update. On a tiny chip ticking thousands of times a second, you simply can't.

The move. Here's a one-line update that keeps no history at all:

$ y[n] = (1-a)\,x[n] + a\,y[n-1]. $

The new value just blends the freshest reading with the last smoothed value — one number, carried forward. But now you ought to feel uneasy: if the output keeps feeding back into itself forever, does it settle down, or does it blow up?

Why a geometric series answers it. Unroll that feedback and look at what the output secretly is — a weighted sum of all the past inputs, with weights $(1-a),(1-a)a,(1-a)a^2,\ldots$ shrinking geometrically. And two things fall right out:

The weights add up to exactly 1 (since $(1-a)\cdot\frac{1}{1-a}=1$) — so it's an honest average, neither puffing up a steady signal nor whittling it down.
The filter is stable iff $|a|<1$ — which is the very same condition that makes the series converge.

So the convergence condition isn't bookkeeping here — it is the line between a filter that settles quietly and one that screams off into distortion. This runs in essentially every sensor, audio codec, and control loop you'll find today.

Reinforcement Learning

How do you compare a reward now against rewards later?

The problem. Here's the question that has to make you uneasy. You've got an agent that just keeps running, and you'd like to add up all its future rewards — but two different strategies can both total "infinity," and infinity equals infinity ranks exactly nothing. A lousy way to choose what to do next. You can't steer toward a goal you can't even score.

The move. So pick a number $\gamma\in(0,1)$ — call it the discount factor — and weight a reward $k$ steps out by $\gamma^k$. For a steady reward $r$ at every step:

$ G = r + \gamma r + \gamma^2 r + \cdots = \frac{r}{1-\gamma}. $

Why this is the whole game. Two things happen at once. The discount makes the value finite, so strategies finally get real, rankable scores — and at the same time it bakes in a taste for sooner over later. And the marvelous thing is how concrete $\frac{1}{1-\gamma}$ turns out to be: at $\gamma=0.99$ it's $100$, so the agent is, in effect, "planning about 100 steps ahead." Turning the $\gamma$ knob is turning the planning horizon. Every value-based method — Q-learning, and the algorithms behind game-playing and robotics — leans on this one sum being finite.

Physics

A ball bounces infinitely many times — so why does it stop?

The problem. Drop a ball, and let each bounce come back to 60% of the height before it. In principle it bounces infinitely often. Surely that ought to take infinite time, and cover infinite distance? And yet you watch it die on the floor in a couple of seconds. So where does the paradox give?

The move — distance. The ball falls $h$, then goes up-and-back-down $h\cdot0.6$, then $h\cdot0.6^2$, and so on:

$ D = h + 2h(0.6) + 2h(0.6)^2 + \cdots = h\cdot\frac{1+0.6}{1-0.6} = 4h. $

Drop it from one metre and it travels exactly 4 metres — finite, even with infinitely many bounces.

The sharper paradox — finite time. Now the trickier one. A fall from height $H$ takes $\sqrt{2H/g}$ seconds — the time goes as the square root of the height. Each bounce reaches $0.6$ of the height, so it stays in the air for $\sqrt{0.6}\approx0.775$ of the time before it. Here's the subtlety: the heights shrink by $0.6$, but the airtimes shrink by $\sqrt{0.6}$ — you have to tease that out. With $t_0=\sqrt{2h/g}$:

$ T = t_0 + 2t_0 q + 2t_0 q^2 + \cdots = t_0\cdot\frac{1+q}{1-q},\qquad q=\sqrt{0.6}. $

The bracket comes to $\frac{1.775}{0.225}\approx 7.9$; with $t_0=\sqrt{2/9.8}\approx0.45$ s, the entire infinite cascade is over in about 3.5 seconds — which is exactly what your eyes told you.

That's the physical answer to Zeno's puzzle, 2,400 years old: infinitely many shrinking steps add up to a finite whole. And there's a modeling lesson tucked in — distance and time carry different ratios ($0.6$ against $\sqrt{0.6}$) — which shows up every time you cross over from an energy quantity to a timing one. The same maths runs anything that rings down: a plucked string, a struck bell, a suspension settling.

Machine Learning

Why do optimizers "remember" old gradients, and how much?

The problem. Raw gradients are noisy — each mini-batch shouts a slightly different direction, so the training jitters all over. What you'd really like is the smoothed trend of the recent ones; but stashing thousands of them is too costly.

The move. Momentum optimizers — the Adam optimizer sitting behind most modern models — keep a running average in which each past gradient fades by $\beta\approx0.9$ every step:

$ m_t = (1-\beta)\big(g_t + \beta g_{t-1} + \beta^2 g_{t-2} + \cdots\big). $

What the series tells the engineer.

Fair average? The weights $(1-\beta)\beta^k$ sum to 1 — so yes, it is.
How far back? The differentiated identity hands you the average gradient "age" as $\frac{\beta}{1-\beta}$. At $\beta=0.9$ that's $9$; at $0.99$, about $99$. That one knob is the memory length — and the series turns an abstract decay rate into the plain step count practitioners actually reach for and tune.

Distributed Systems

How many times will this request retry — and can the network survive it?

The problem. Requests fail now and then, so they retry. And somebody has to budget for that — provision for one attempt when the true average is four, and the whole system buckles under the load.

The move. Each independent attempt succeeds with probability $p$, so "exactly $k$ tries" has probability $p(1-p)^{k-1}$. The average wants $\sum k\,p(1-p)^{k-1}$ — and here's where the differentiated identity earns its keep:

$ E[\text{tries}] = \sum_{k=1}^{\infty} k\,p(1-p)^{k-1} = \frac{1}{p}. $

Why it's load-bearing. A request that succeeds a quarter of the time averages $1/0.25=4$ attempts — so a naive retry policy quietly quadruples your traffic. And during an outage, when $p$ falls through the floor, the retries avalanche and knock over the very server they're aimed at. Knowing the average is $1/p$ is what lets you set sane caps and back off. That same $1/p$ also sizes skip lists and hash tables, where "probes until a free slot" is the same until-first-success question wearing a different suit.

Virality & Growth

Will this campaign fizzle, and how big will it get?

The problem. Picture a referral program: users invite friends, who invite more friends. The founders need to know the one thing that matters — will this take off, or quietly run out of breath, and how many users does a single sign-up ultimately drag along behind it?

The move. If each user brings in $R$ new ones on average, and you start from a single person:

$ \text{Total} = 1 + R + R^2 + \cdots = \frac{1}{1-R}\quad(R<1). $

The single most important number. The whole thing hangs on $R$ against 1 — the same threshold the epidemiologists call the reproduction number. If $R<1$, the growth fizzles: at $R=0.8$, one user pulls in $1/0.2=5$ in total, and then it's done. If $R>1$ it explodes (until the market fills up — see Probe 5). So the series hands the team a clean go/no-go: measure $R$; below 1 the campaign can't carry itself, and the formula tells you exactly how modest the payoff will be. The same arithmetic decides whether a meme catches fire or a chain letter dies on the vine.

Interactive — Practice

▸ Interactive Lab

Now it's your turn to play. Grab a slider, give it a nudge, and watch every plot answer at once — they're all dancing to one tune. That single number, the ratio $r$, runs the whole show. Prefer the keyboard? Focus a slider and let the arrow keys do the nudging — Home and End jump you straight to the extremes. Go on, turn the knob and see what the one little fraction does.

Controls

ratio r0.60 converges → 2.50

first term a1.0

Figure 1

Partial sums climbing toward the limit

Sₙ = a + ar + ar² + … → a/(1−r)

Each dot adds one more term. When r < 1 the staircase flattens onto the dashed limit — convergence you can see. Past 1 it never settles.

Figure 2

Why it converges: each term is a shrinking slice

tₖ = a·rᵏ

The bars are individual terms. Each is a fixed fraction of the one before — they vanish fast enough that infinitely many still add to something finite.

Figure 3

The cliff: how the total blows up as r → 1

total = 1/(1−r)

Gentle for small r, then rocketing toward infinity near 1. The marker is your current r.

Figure 4

The differentiated identity: weighting by position

Σ k·rᵏ = r/(1−r)² — the "averaging machine"

Plain weights rᵏ (ink) always shrink; multiply each by its position k (teal) and a hump appears. The balance point of the rᵏ weights — their center of mass — is r/(1−r), the dashed line.

Figure 5

The bouncing ball that stops in finite time

each bounce reaches r × previous height

A real trajectory: infinitely many ever-shorter arcs piling into finite total distance and time. Heights shrink by r, but bounce durations shrink by √r.

Chapter Six — Practice

6. Word Problems

Here's the thing about these — every one hands you all the facts you need, and not one of them whispers the words "geometric," "ratio," or "$r$." Nobody's going to do that for you out in the world. So that's your job: spot the thing being shaved by the same fraction over and over, figure out what's really playing the role of $r$, reach for the right identity, and solve.

Problem 1 · Pharmacology

A patient takes a 200 mg pill every 8 hours. By the next dose, only 70% of any drug present remains. After months, the amount right after a pill settles to a constant steady-state peak.

(a) What is that steady-state peak (mg)? (b) Safety requires it stay below 700 mg — are we safe?

Domain note. "Steady state" is just the moment the level stops climbing — what you add with each pill exactly matches what the body clears away. That one fact about 70% remaining is the whole story; you don't need a word more.

Problem 2 · Epidemiology

Each infected person passes a new illness to 0.85 new people on average before recovering. A single infected traveller arrives in a town that has never seen it.

(a) Counting the traveller and everyone eventually infected, how many total cases on average before the chain dies out? (b) To keep total cases per introduction under 10, what's the largest average new infections per person?

Domain note. Each person sets off their own little chain, and those chains run on independently of one another. The 0.85 is all the epidemiology there is here — your real work is seeing what role that number plays.

Problem 3 · Optics

A pane of tinted glass transmits 80% of light and reflects 20% (reflected light is lost). A solar panel sits behind a stack of 5 panes.

(a) What fraction of light reaches the panel? (b) With infinitely thin coatings each transmitting 99%, how many can you stack before less than half the light gets through?

Domain note. Each pane works on whatever light actually reaches it — nothing more. Transmitting 80% means the light gets multiplied by 0.8 going through. But watch yourself here: which tool does this really call for — is it a sum at all?

Problem 4 · Audio

A clap echoes in a hall; each reflection returns 65% of the previous pass's energy. The first arrival delivers energy $E$.

(a) Total acoustic energy from the clap and all echoes, in terms of $E$? (b) "Muddy" means the echoes exceed the original — do they here?

Domain note. "Energy multiplied by 0.65 each pass" — that's every bit of acoustics you need.

Problem 5 · Reliability

A spacecraft attempts a radio lock once per orbit; each attempt independently succeeds with probability 0.4, retrying until it locks.

(a) Average orbits until first lock? (b) To get average lock time under 2 orbits, what's the minimum per-attempt success probability?

Domain note. "Independently, probability 0.4 each time" tells you the flip side too: failure comes up 0.6 each time. And notice what you're chasing — an average count of tries, not a grand total.

Problem 6 · Geophysics

A core sample's layers are weighted for a recency-adjusted average: top layer weight 1, next $0.6$, next $0.6^2$, decreasing by 0.6 per layer for effectively infinitely many layers.

(a) What average depth (layer-number, top = layer 0) do these weights point to — i.e., the expected layer if you picked one in proportion to its weight?

Domain note. You want the average position when the weights run $1,0.6,0.6^2,\ldots$ This one asks for more than the plain sum — think back to what differentiating handed you.

Solutions

Problem 1 — Drug. Every old dose gets shaved down by 0.7 each cycle, so the body's carrying $200 + 200(0.7) + 200(0.7)^2 + \cdots$ — our shrinking series, with $a=200,\,r=0.7$. (a) $\frac{200}{1-0.7}=\frac{200}{0.3}\approx 666.7$ mg. (b) $666.7 < 700$ — safe, but only just, with no room to spare.

What breaks this? That clean number leans on one thing — the same dose, the same schedule, the same 70% clearance, forever. Skip a pill or take two and the steady input it all rests on quietly falls apart; then you've got no shortcut left but to track it dose by dose. (See Probe 2.)

Problem 2 — Outbreak. $r=0.85$; total $=1+0.85+0.85^2+\cdots$. (a) $\frac{1}{1-0.85}=\frac{1}{0.15}\approx 6.67$ cases. (b) $\frac{1}{1-r}<10 \Rightarrow r<0.9$ — just under 0.9.

What breaks this? Push $r$ past 1 and the sum runs away on you — it blows up. And even under 1, the tidy answer pretends the population is endless and everybody can still catch it. In real life, as more people become immune, $r$ keeps sliding down, the outbreak crests and turns over, and you need a different machine — an SIR or logistic model. (See Probe 1.)

Problem 3 — Glass stack. Here's the trap: $r=0.8$, and the hand wants to reach for $\frac{1}{1-r}$ — but don't. This is a finite product, not a sum at all. (a) $0.8^5\approx 0.328$, about 33%. (b) $0.99^n < 0.5 \Rightarrow n > \frac{\ln 0.5}{\ln 0.99}\approx 68.9$, so 69 coatings.

What breaks this? Light through panes gets multiplied, not piled up — so there's no series to add in the first place. A stack of filters is a geometric product ($0.8^n$), and it just steadily shrinks, exactly as your eyes tell you it should. (See Probe 3.)

Problem 4 — Echoes. $E + 0.65E + 0.65^2E + \cdots$ with $a=E,\,r=0.65$. (a) $\frac{E}{1-0.65}=\frac{E}{0.35}\approx 2.86E$. (b) The echoes by themselves come to $2.86E-E=1.86E > E$ — yes, muddy.

What breaks this? That 65% loss is what keeps $r<1$ and keeps the answer finite. Make the chamber a perfect mirror and $r=1$, and the total flies off to infinity ($1/0$). And $r\ge1$ would mean the room is somehow handing back more energy than the clap put in — which a passive room simply can't do. That's why the model behaves. (See Probe 4.)

Problem 5 — Radio lock. This is the geometric case with $p=0.4$, and for the average you reach for the differentiated identity, $E=1/p$. (a) $\frac{1}{0.4}=2.5$ orbits. (b) $\frac{1}{p}<2 \Rightarrow p>0.5$ — just over 0.5.

What breaks this? The whole thing rests on the tries being independent and carrying the same 0.4 every orbit. Let bad weather hang around for a few orbits and the failures bunch together — the tries stop being independent, and $1/p$ no longer tells the truth. (Cousin of Probe 5.)

Problem 6 — Sediment. What you want is the average layer index $k$, where each layer carries weight $r^k,\,r=0.6$:

$ \bar k = \frac{\sum_{k=0}^{\infty}k\,r^k}{\sum_{k=0}^{\infty}r^k} = \frac{r/(1-r)^2}{1/(1-r)} = \frac{r}{1-r}. $

Put $r=0.6$ in and out it falls: $\frac{0.6}{0.4}=1.5$ — the weights are pointing at layer 1.5. (Notice the top of that fraction used the differentiated identity; the plain old sum on its own never could have answered this.)

What breaks this? It takes for granted that the decay factor 0.6 is exactly the same between every neighboring pair of layers. But real layers don't oblige — they come in different thicknesses, so the factor isn't a clean constant, and the moment it drifts as you go deeper, that neat $\frac{r}{1-r}$ falls apart on you.

Chapter Six·c — Practice

6c. When the Model Breaks

Now look — that lovely little formula has been keeping three secrets from you the whole time. Every problem we did obeyed all three, which is exactly why the answers came out clean. The real skill isn't plugging into the formula — it's noticing the moment one of these promises quietly breaks:

The ratio is constant — the same $r$ at every step, forever.
$|r|<1$ — each term shrinks, so the sum converges.
The process runs unbounded — no external limit (finite population, resource, time) is being ignored.

Probe 1 · the outbreak with no controls

Controls drop; each person now infects 2.5 others. (a) What does $\frac{1}{1-r}$ give? (b) A student says "real epidemics stop!" — what is the model failing to capture?

Here's the thing to get straight first: this was never one infection passing to exactly one. Even back at 0.85 the thing fanned out — it only summed to something finite because each round of sick people came out smaller than the last. All quarantine ever did was shove $r$ down below 1.

(a) Watch what the formula hands you: $\frac{1}{1-2.5}=\frac{1}{-1.5}\approx-0.67$. A negative number of sick people? That's nonsense — we marched the formula clean out of the country where it lives. With $r=2.5>1$ the terms don't shrink, they grow, $r^n\to\infty$, and the series diverges (assumption 2 fails).

(b) But there's a deeper trouble — assumption 3. The model pretends the crowd is infinite and everybody's still catchable, with $r$ pinned at one value. In the real world, as more people get immune the effective $r$ slides down, so the outbreak crests and rolls back over, and to capture that turning you need an SIR/logistic model. So two promises break at once: $r<1$ fails, and $r$ wasn't even constant to begin with.

Probe 2 · the drug with a missed dose

The patient becomes erratic — 200 mg some days, 0 others, occasionally 400. (a) Which assumption breaks? (b) Is the model worthless now?

(a) The steady input goes. The body still clears the drug at the same fixed rate (0.7) — that part's fine — but the dose $a$ is jumping around, and the series leans on the same $a$ landing in every single term.

(b) Worthless? No — just dented. That 667 mg becomes your idealized adherent yardstick, the patient who never misses. For the erratic one, step through it day by day, or just fence it in: the level can't climb past what perfect dosing at the highest dose would reach. A formula that's broken often still hands you a perfectly good bound — and a bound you can trust beats an exact answer you can't.

Probe 3 · the glass stack taken too far

A student writes $1+0.8+0.8^2+\cdots=\frac{1}{1-0.8}=5$ and concludes "5 panes let through 5× the light." (a) The error? (b) Why does the sum converge yet answer nothing here?

(a) Multiplying got mistaken for adding. Light through glass multiplies ($0.8\times0.8\times\cdots$) — each pane chews on whatever made it through the one before. There's no pile of things being added up, so there's no series here at all. A stack of filters is a geometric product, $0.8^n$, and it shrinks toward dark.

(b) Now the funny part — the sum really does equal 5, and the arithmetic is flawless. It's the right answer to five fading echoes, each 80% of the last. The number's honest. It just answers a question nobody in this room asked. A convergent formula is not automatically the right formula.

Probe 4 · the perfect echo chamber

Perfectly reflective walls — no energy lost per pass. (a) What is $r$, and the total? (b) What would $r\ge1$ mean physically?

(a) Nothing gets lost, so $r=1$, and now $\frac{1}{1-1}=\frac{1}{0}$ — undefined, the total diverges. Right at the edge, $r=1$, convergence has already let go.

(b) What would $r\ge1$ actually be saying? That every bounce comes back carrying at least as much energy as showed up — the room would be handing back more than it got, an amplifier. A plain, passive room can't do that, so $r<1$ comes free, guaranteed by conservation of energy. The convergence condition is enforced by a physical law — and that, you see, is why these models keep on converging out in the real world. Nature doesn't care about our notation.

Probe 5 · the referral campaign that "works" — the silent failure

Each user invites 0.7 new users; from 10,000 sign-ups, $\frac{10{,}000}{1-0.7}\approx 33{,}333$. Finite, positive, reasonable — no red flags. Six months later growth stalls at 19,000. (a) Which assumption failed silently? (b) What did the constant 0.7 hide?

This is the one that ought to keep you up at night. The math checks out, the answer looks reasonable, so nothing trips your alarm. Assumption 1 caves in without a sound — that tidy 0.7 is an average pretending to be a constant, standing in for a thing that's quietly changing underneath it.

(a) The early diehards refer like crazy, maybe 0.9; but later the easy people are already in, friend networks start overlapping, the shine wears off — so the rate slides down round by round (0.9, 0.6, 0.4, 0.2…). A ratio that keeps falling gives a much thinner total, and there's your 19k-against-33k gap, sitting right out in the open.

(b) It hid market saturation and network overlap — Probe 1's running-out-of-susceptible-people, just wearing a business suit. And one more sting: $\frac{1}{1-r}$ bends upward, it's convex, so feeding it the average ratio overshoots the real total (that's Jensen's inequality — the function of the average isn't the average of the function).

The lesson. The other probes broke loudly — a negative count, a divide-by-zero, a category mistake you could spot across the room. This one broke without a peep. The only guard you've got is to grill the input before you trust the output: "Is $r$ honestly constant, or is it the average of something drifting?" The most dangerous wrong answers are the ones that look right.

The takeaway for modelers

Before trusting a/(1−r), ask…
Ask…	If "no"…
Is the same $r$ applied every step?	Input/rate varies → recursion or bounds (Probe 2)
Is the ratio under 1 in magnitude?	Diverges → genuine unbounded growth, or use SIR/logistic (Probes 1, 4)
Is it truly a sum of unbounded contributions?	May be a product, not a series (Probe 3); or a real-world cap applies (Probe 1)
Is $r$ a real constant, or an average of a trend?	Believable-but-wrong answer → check if $r$ drifts over time (Probe 5)

And here's the marvelous part: a crazy answer — a negative total, a division by zero, "5× the light" — isn't a wall you've hit. It's the model leaning over and telling you exactly which promise you broke. The nastiest case is the one that fires no alarm at all, and that's why you've got to question the assumptions before you ever trust the number.

Chapter Seven — Close

7. The One Pattern to Remember

The Pattern

You take some quantity and shave it by the same fraction $r$ again and again — with $|r|<1$ — and you want one of two things. Either the whole total piled up (reach for $\frac{1}{1-r}$), or a weighted average / expectation, where the later terms carry their own weight (reach for the differentiated forms like $\frac{r}{(1-r)^2}$).

And that's the whole story. The moment you spot something shrinking by a fixed fraction over and over, you already know the answer folds up into one clean closed form — you don't have to add up the endless stream by hand, the formula's waiting for you. And here's the part I love: that little condition $|r|<1$ isn't some fussy footnote the mathematicians stuck on to be safe. It's physics. It's nearly always doubling as a stability or sanity check on the real thing — a filter that settles instead of screaming, a discount that's less than full, a process that quietly fades rather than blowing up in your face.

One slider, one ratio r — and watch: every plot moves together. That's the whole thing in one picture. It was one pattern all along.