Posted by: Alexandre Borovik | April 14, 2008

## Donald Knuth: Calculus via O notation

Continuing the theme of alternative approaches to teaching calculus, I take the liberty of posting a letter sent by Donald Knuth to the Notices of the American Mathematical Society in March 1998 (TeX file).

Professor Anthony W. Knapp
P O Box 333
East Setauket, NY 11733

Dear editor,

I am pleased to see so much serious attention being given to improvements in the way calculus has traditionally been taught, but I’m surprised that nobody has been discussing the kinds of changes that I personally believe would be most valuable. If I were responsible for teaching calculus to college undergraduates and advanced high school students today, and if I had the opportunity to deviate from the existing textbooks, I would certainly make major changes by emphasizing several notational improvements that advanced mathematicians have been using for more than a hundred years.

The most important of these changes would be to introduce the $O$ notation and related ideas at an early stage. This notation, first used by Bachmann in 1894 and later popularized by Landau, has the great virtue that it makes calculations simpler, so it simplifies many parts of the subject, yet it is highly intuitive and easily learned. The key idea is to be able to deal with quantities that are only partly specified, and to use them in the midst of formulas.

I would begin my ideal calculus course by introducing a simpler “$A$ notation,” which means “absolutely at most.” For example, $A(2)$ stands for a quantity whose absolute value is less than or equal to $2$. This notation has a natural connection with decimal numbers: Saying that $\pi$ is approximately $3.14$ is equivalent to saying that $\pi=3.14+A(.005)$. Students will easily discover how to calculate with $A$:

$10^{A(2)}=A(100)$

$\bigl(3.14+A(.005)\bigr)\bigl(1+A(0.01)\bigr)$

$\qquad = 3.14+A(.005)+A(0.0314)+A(.00005)$

$\qquad=3.14+A(0.3645)=3.14+A(.04)\,.$
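
The arithmetic above can be tried out mechanically. The following is only a minimal sketch, not part of the letter: the class name `Approx` and its fields are hypothetical, and it assumes the rule $(a+A(r))(b+A(s))=ab+A(|a|s)+A(|b|r)+A(rs)$, which is what the expansion above uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Approx:
    """Hypothetical helper: a quantity center + A(radius),
    i.e. |true value - center| <= radius."""
    center: float
    radius: float = 0.0

    def __add__(self, other: "Approx") -> "Approx":
        # In the worst case the two error bounds simply add.
        return Approx(self.center + other.center, self.radius + other.radius)

    def __mul__(self, other: "Approx") -> "Approx":
        # (a + A(r)) (b + A(s)) = ab + A(|a| s) + A(|b| r) + A(r s)
        return Approx(
            self.center * other.center,
            abs(self.center) * other.radius
            + abs(other.center) * self.radius
            + self.radius * other.radius,
        )


pi_approx = Approx(3.14, 0.005)   # pi = 3.14 + A(.005)
factor = Approx(1.0, 0.01)        # 1 + A(0.01)
product = pi_approx * factor
print(product)
```

In this sketch the three error terms $.005$, $0.0314$ and $.00005$ add up to roughly $0.03645$, which is within the final bound $A(.04)$.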

I would of course explain that the equality sign is not symmetric with respect to such notations; we have $3=A(5)$ and $4=A(5)$ but not $3=4$, nor can we say that $A(5)=4$. We can, however, say that $A(0)=0$. As de Bruijn points out in [1, 1.2], mathematicians customarily use the $=$ sign as they use the word “is” in English: Aristotle is a man, but a man isn’t necessarily Aristotle.

The $A$ notation applies to variable quantities as well as to constant ones. For example,

$\sin x=A(1);$

$A(x) =xA(1)\,;$

$A(x)+A(y) =A(x+y)$ if $x\geq 0$ and $y\geq 0\,;$

$\bigl(1+A(t)\bigr){}^2 =1+3A(t)$ if $t=A(1)\,.$
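
As a sketch of why the last rule holds (one possible derivation, using only the rules already listed):

$\bigl(1+A(t)\bigr){}^2=1+2A(t)+A(t)^2=1+2A(t)+A(t^2)\,,$

and $t=A(1)$ means $|t|\leq 1$, so $t^2\leq|t|$ and therefore $A(t^2)=A(t)$. Adding the two error terms gives $1+3A(t)$.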

Once students have caught on to the idea of $A$ notation, they are ready for $O$ notation, which is even less specific. In its simplest form, $O(x)$ stands for something that is $CA(x)$ for some constant $C$, but we don’t say what $C$ is. We also define side conditions on the variables that appear in the formulas. For example, if $n$ is a positive integer we can say that any quadratic polynomial in $n$ is $O(n^2)$. If $n$ is sufficiently large, we can deduce that

$\bigl(n+O(\sqrt{n}\,)\bigr)\bigl(\ln n+\gamma+O(1/n)\bigr)$

$\quad=n\ln n+\gamma n+O(1)$

$\qquad\null+O(\sqrt{n}\ln n)+O(\sqrt{n}\,)+O(1/\sqrt{n}\,)$

$\quad=n\ln n+\gamma n+O(\sqrt{n}\ln n)\,.$
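
A concrete instance can be checked numerically. The sketch below is an illustration only, under stated assumptions: it takes $n+\sqrt{n}$ as one particular quantity of the form $n+O(\sqrt{n}\,)$ and the harmonic number $H_n$ as one particular quantity of the form $\ln n+\gamma+O(1/n)$. The function names are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant, to double precision


def harmonic(n):
    """H_n = 1 + 1/2 + ... + 1/n, which equals ln n + gamma + O(1/n)."""
    return sum(1.0 / k for k in range(1, n + 1))


def scaled_remainder(n):
    """Remainder of the product after removing n ln n + gamma n,
    divided by sqrt(n) ln n. It should stay bounded as n grows."""
    product = (n + math.sqrt(n)) * harmonic(n)
    main_terms = n * math.log(n) + EULER_GAMMA * n
    return (product - main_terms) / (math.sqrt(n) * math.log(n))


for n in (10, 100, 1_000, 10_000, 100_000):
    print(n, scaled_remainder(n))
```

For this choice the printed ratios stay bounded, which is what the final $O(\sqrt{n}\ln n)$ term asserts: the $O(1)$, $O(\sqrt{n}\,)$ and $O(1/\sqrt{n}\,)$ terms are all absorbed by it.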

I would define the derivative by first defining what might be called a “strong derivative”: The function $f$ has a strong derivative $f'(x)$ at point $x$ if

$f(x+\epsilon)=f(x)+f'(x)\epsilon+O(\epsilon^2)$

whenever $\epsilon$ is sufficiently small. The vast majority of all functions that arise in practical work have strong derivatives, so I believe this definition best captures the intuition I want students to have about derivatives. We see immediately, for example, that if $f(x)=x^2$ we have

$(x+\epsilon)^2=x^2+2x\epsilon+\epsilon^2\,,$

so the derivative of $x^2$ is $2x$. And if the derivative of $x^n$ is $d_n(x)$, we have

$(x+\epsilon)^{n+1}=(x+\epsilon)\bigl(x^n+d_n(x)\epsilon+O(\epsilon^2)\bigr)$

$\qquad=x^{n+1}+\bigl(xd_n(x)+x^n\bigr)\epsilon+O(\epsilon^2)\,;$

hence the derivative of $x^{n+1}$ is $xd_n(x)+x^n$ and we find by induction that

$d_n(x)=nx^{n-1}.$
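
The definition lends itself to a quick numerical experiment. This is a minimal sketch, not a proof: the helper `strong_derivative_ratio` is a hypothetical name, and the test function $x^3$ at $x=2$ is an arbitrary choice.

```python
def strong_derivative_ratio(f, df, x, eps):
    """Return |f(x+eps) - f(x) - df(x)*eps| / eps**2.

    If df(x) is a strong derivative of f at x, this ratio stays bounded
    as eps shrinks; the bound is the hidden constant in O(eps^2).
    """
    return abs(f(x + eps) - f(x) - df(x) * eps) / eps**2


def cube(x):
    return x**3


def cube_derivative(x):
    return 3 * x**2


x0 = 2.0
ratios = [
    strong_derivative_ratio(cube, cube_derivative, x0, 10.0**-k)
    for k in range(1, 5)
]
print(ratios)
```

Since $(x+\epsilon)^3-x^3-3x^2\epsilon=3x\epsilon^2+\epsilon^3$, the ratio here is $3x+\epsilon$ up to rounding, so it settles near $6$ rather than blowing up.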

Similarly if $f$ and $g$ have strong derivatives $f'(x)$ and $g'(x)$, we readily find

$f(x+\epsilon)g(x+\epsilon)=f(x)g(x)+\bigl(f'(x)g(x)+f(x)g'(x)\bigr)\epsilon +O(\epsilon^2)$

and this gives the strong derivative of the product. The chain rule

$f\bigl(g(x+\epsilon)\bigr)=f\bigl(g(x)\bigr)+f'\bigl(g(x)\bigr)g'(x)\epsilon +O(\epsilon^2)$

also follows when $f$ has a strong derivative at point $g(x)$ and $g$ has a strong derivative at $x$.
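
One way to fill in the chain rule step, as a sketch: write $\delta=g(x+\epsilon)-g(x)=g'(x)\epsilon+O(\epsilon^2)$, so that $\delta=O(\epsilon)$. For $\epsilon$ small enough that $\delta$ lies in the range where the expansion of $f$ at $g(x)$ holds,

$f\bigl(g(x)+\delta\bigr)=f\bigl(g(x)\bigr)+f'\bigl(g(x)\bigr)\delta+O(\delta^2)$

$\qquad=f\bigl(g(x)\bigr)+f'\bigl(g(x)\bigr)\bigl(g'(x)\epsilon+O(\epsilon^2)\bigr)+O(\epsilon^2)$

$\qquad=f\bigl(g(x)\bigr)+f'\bigl(g(x)\bigr)g'(x)\epsilon+O(\epsilon^2)\,.$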

Once it is known that integration is the inverse of differentiation and related to the area under a curve, we can observe, for example, that if $f$ and $f'$ both have strong derivatives at $x$, then

$f(x+\epsilon)-f(x)=\int_0^{\epsilon}f'(x+t)\,dt$

$\qquad=\int_0^{\epsilon}\bigl(f'(x)+f''(x)\,t+O(t^2)\bigr)\,dt$

$\qquad=f'(x)\epsilon+f''(x)\epsilon^2\!/2+O(\epsilon^3)\,.$

I’m sure it would be a pleasure for both students and teacher if calculus were taught in this way. The extra time needed to introduce $O$ notation is amply repaid by the simplifications that occur later. In fact, there probably will be time to introduce the “$o$ notation,” which is equivalent to the taking of limits, and to give the general definition of a not-necessarily-strong derivative:

$f(x+\epsilon)=f(x)+f'(x)\epsilon+o(\epsilon)\,.$

The function $f$ is continuous at $x$ if

$f(x+\epsilon)=f(x)+o(1)\,;$

and so on. But I would not mind leaving a full exploration of such things to a more advanced course, when it will easily be picked up by anyone who has learned the basics with $O$ alone. Indeed, I have not needed to use “$o$” in 2200 pages of The Art of Computer Programming, although many techniques of advanced calculus are applied throughout those books to a great variety of problems.
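
An illustrative example of the gap between the two definitions (not taken from the letter): let $f(x)=|x|^{3/2}$ and look at $x=0$. Then

$f(\epsilon)=|\epsilon|^{3/2}=o(\epsilon)\,,$

because $|\epsilon|^{3/2}/|\epsilon|=|\epsilon|^{1/2}$ tends to $0$, so $f'(0)=0$ in the general sense. But $|\epsilon|^{3/2}/\epsilon^2=|\epsilon|^{-1/2}$ is unbounded near $0$, so $f(\epsilon)$ is not $O(\epsilon^2)$ and $f$ has no strong derivative there.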

Students will be motivated to use $O$ notation for two important reasons. First, it significantly simplifies calculations because it allows us to be sloppy — but in a satisfactorily controlled way. Second, it appears in the power series calculations of symbolic algebra systems like Maple and Mathematica, which today’s students will surely be using.

For more than 20 years I have dreamed of writing a calculus text entitled O Calculus, in which the subject would be taught along the lines sketched above. More pressing projects, such as the development of the TeX system, have made that impossible, although I did try to write a good introduction to $O$ notation for post-calculus students in [2, Chapter 9].

Perhaps my ideas are preposterous, but I’m hoping that this letter will catch the attention of people who are much more capable than I of writing calculus texts for the new millennium. And I hope that some of these now-classical ideas will prove to be at least half as fruitful for students of the next generation as they have been for me.

Sincerely,

Donald E. Knuth

Professor

[1] N. G. de Bruijn, Asymptotic Methods in Analysis (Amsterdam: North-Holland, 1958).

[2] R. L. Graham, D. E. Knuth, and O. Patashnik, Concrete Mathematics (Reading, Mass.: Addison-Wesley, 1989).

## Responses

2. Very interesting article. Also, since I’m the first to comment I thought I’d mention you’ve hit the front page of Reddit! Congrats!

3. Congratulations should be directed to Donald Knuth, but he, apparently, does not read e-mails.

4. Perhaps it’s just me, but this idea seems as though it would needlessly obfuscate an already difficult-to-grasp concept. It’s a hand-waving exercise in an area where actual understanding of limits and infinitesimals is desired.

For example, what would one gain by saying sin(x) = A(1)? This is known from the definition of the sin function, but seldom in regular calculus classes does a student need to use the bounds of the function in a calculation.

As another, specifying pi = 3.14 + A(0.005) is an interesting way to approximate its value descriptively, but it’s necessarily imprecise. It doesn’t help the final calculations, in which case the student will use as many decimals as they see fit, or the predefined Pi variable in their calculators, and when writing equations, the symbol for pi is exact.

I can see introducing this notation and its concepts in a calculus course designed specifically as a mathematical primer for computer scientists, for whom it will be eventually useful, but I would have strong objection to the use of this idea in an introduction to calculus, or even forays into Real Analysis, where delta-epsilon style proofs have significant meaning which could potentially be lost by stating O(1).

Just my two cents.

5. It strikes me as very domain specific. Who outside of computer science majors would need it? Most other engineers and science majors don’t see it often as far as I know, and if you understand limits, O(f(x)) is not hard to understand. I think, if anything, spending less time on limits and more time on an introduction to differential equations would benefit students more.

6. Interesting idea… but I hate it when CS folks overload the ‘=’ sign when ϵ (U+03F5 or $\in$ if it doesn’t print here) is what’s really meant. It’s much simpler to be clear about what is an element and what is a set.

7. Wow. So that explains Calc I. Where were you three semesters ago?

@ Kevin: You make some good points; however, my guess is that teaching it this way would probably result in more people “getting it.” Those that should have an actual understanding of limits and infinitesimals (assuming that this will result in a pseudo-understanding of limits and infinitesimals) will probably be able to pick up that understanding regardless of which teaching approach is taken in the intro to Calc course.

8. It is also easy to explain all of those other functions, like w(x) and W(x), which would take at most one two-hour class, and even that is an overestimate.
Note that these functions are also important, mainly when working with big field.
Thus, I approve of Knuth’s idea, and I’d be happy to see a textbook with this kind of fundamentals.

Breno

9. It’s fine to tout the benefits of big-oh notation for future computer scientists who will never encounter a badly-behaved function in their lives, and who will constantly be using big-oh notation. For future mathematicians, however, the fundamental concept of limit will have far more general applicability down the road. I would have been most distressed in my later math courses had limits been left “for a more advanced course” by my introductory calculus teacher.

It is also telling that Knuth says, “Once it is known that integration is the inverse of differentiation […]”. How would one prove the Fundamental Theorem of Calculus using big-oh notation instead of limits? I’m imagining being presented with a proof of the Fundamental Theorem of Calculus later in my mathematical career, and thinking, “Oh, so that’s what calculus was about. Why didn’t they just tell me?”

10. To Karl Juhnke: Here is a proof of the Fundamental Theorem of Calculus in o notation. Let $F(x)=\int_a^x f(u)du$ and $f$ continuous at $b$. Then, because $f(x)-f(b)=o(1)$, $F(x)-F(b)-f(b)(x-b)=o(|x-b|)$, that is the same as $F'(b)=f(b)$.

11. A proof of the Fundamental Theorem of Calculus for a Lipschitz function in O notation and with the Knuth’s definition of the derivative. Let $f$ be Lipschitz, i.e., $f(x)-f(u)=O(|x-u|)$ and $F(x)=\int_a^x f(u)du$.
Then, because $f(x)-f(b)=O(|x-b|)$, $F(x)-F(b)=f(b)(x-b)+O((x-b)^)$, and this is the same as $F'(b)=f(b)$ according to the Knuth’s definition.

12. Oops! the penultimate formula in my previous comment should be:
$F(x)-F(b)=f(b)(x-b)+O((x-b)^2)$
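
For readers following the argument, here is one way to spell out the estimate, as a sketch that assumes the integral is already defined, is linear, and satisfies $\bigl|\int g\bigr|\leq\int|g|$. With $K$ the Lipschitz constant of $f$:

$F(x)-F(b)-f(b)(x-b)=\int_b^x\bigl(f(u)-f(b)\bigr)\,du\,,$

$\Bigl|\int_b^x\bigl(f(u)-f(b)\bigr)\,du\Bigr|\leq\Bigl|\int_b^x K|u-b|\,du\Bigr|=K(x-b)^2\!/2\,,$

so the left-hand side is $O\bigl((x-b)^2\bigr)$.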

13. To Kevin: The O notation in fact gives us the explicit instances of the epsilon-delta definitions, where delta is a function of epsilon of a specific form, instead of the claim that “for every epsilon there is delta, such that…” Let’s take a look at the O-definition of the derivative suggested by Knuth. It says that $f(x+h)=f(x)+f'(x)h+O(h^2)$, which can be rewritten as $(f(x+h)-f(x))/h-f'(x)=O(|h|)$, i.e., we can take $\delta=\epsilon /K$ with some constant $K$ when we say that $f'(x)$ is the limit of $(f(x+h)-f(x))/h$ for $h \rightarrow 0$. Come to think about it, the phrase “for every epsilon there is delta” is very much the same as “delta is a function of epsilon.” This whole tradition of cramming this epsilon-delta mantra into every definition looks like a throwback to the times when the abstract notion of a function was not widely known and universally accepted yet. We can do better today. In any case, the translations between the O-definitions, explicit estimates, and epsilon-delta mantras are totally straightforward and should not cause any trouble. On the other hand, anybody who has trouble with these translations should think twice before trying to become a mathematician; he may not be up to it.

The O-definitions and “strong derivative” suggested by Knuth will allow exactly what you want, i.e., spending less (=zero) time on limits and spending the saved time on differential equations.

14. Is there a prize for pointing out an error in Knuth’s calculations?

In the first example, Knuth reduces A(.005) + A(0.0314) + A(.00005) to A(0.3645). I think there is a zero missing (or the decimal is misplaced); it should be A(0.03645).

15. @Rob

There is a prize if he printed it in a book. See this wikipedia entry to try to claim your “hexadollar” check: http://en.wikipedia.org/wiki/Knuth_reward_check

16. Using Big-O notation SERIOUSLY speeds up the process for finding divergence or convergence for series and sequences. You go from around 10 operations to 2-3 operations. Super-quick.

17. omg yes!!! I’m doing this right now in my class

18. “mathematicians customarily use the = sign as they use the word “is” in English”

News to me. I think it’s computer scientists who are careless with the = symbol. I think if this set of ideas were to be expressed without this very bad one it might be more compelling.

19. I like

sin(x) = A(1).

I think it would have been clarifying to me at a certain stage. While it does not confer a great deal of information about the sine function the information it does convey is definitely useful to the beginner.

To write

-1 <= sin(x) <= 1

forces the reader to digest more symbols. Symbol overload is a problem for most people in studying math. I say reduce it whenever possible.

20. Despite all the computational simplifications delivered by O notation, Knuth doesn’t go far enough in my opinion. He still sticks with the pointwise notion of differentiation. His constant $C$, implicitly entering into his definition of “strong derivative at point $x$”, is allowed to depend on $x$ in an uncontrolled manner. When he is talking about continuity, he is still talking about pointwise continuity. It means that getting from his definitions to any practical results, such as the fact that a function with a positive derivative is increasing, requires a good deal of subtle reasoning that involves completeness of the reals, such as the existence of the least upper bound. He still would need the uniform continuity of pointwise continuous functions on a closed interval to build a definite integral, and that involves compactness, i.e., the Bolzano-Weierstrass lemma and such.

On the other hand, if we strengthen his definition of “strong differentiability” even further, by requiring the estimate
$|f(x+h)-f(x)-f'(x)h| \leq Kh^2$
that is uniform in $x$, we will end up with the derivatives that are automatically Lipschitz, and very simple proofs of the basic facts of calculus, that do not use the heavy machinery of classical analysis, such as completeness and compactness. This approach to calculus has been systematically developed. See the posting Calculus without limits on the previous reincarnation of this blog.

If Lipschitz estimates are too restrictive, Hölder comes to the rescue, and all the proofs stay the same. Any other modulus of continuity can be used as well, and since any function continuous on a segment has a modulus of continuity, this approach captures all the results about continuously differentiable functions. But we hardly need anything beyond Lipschitz in an undergraduate calculus course.

22. The example “=3.14 + A(0.3645)” should be “=3.14 + A(0.03645)”.

Without this, I was looking at that example and going “huh? I don’t get it, even addition doesn’t work correctly, so how can it be called simple?” … Then I realised the example was wrong.

23. Kevin and Karl: I don’t think the article proposes elimination of limits from the curriculum. Much less any kind of “fudging” or harmful imprecision. Rather, it discusses intermediate, yet meaningful and strictly defined, concepts. I see nothing wrong with that. Defining little-oh indeed is equivalent to defining limits, hence continuity, derivatives and integrals. Using big-oh as an important stepping stone may be worth a try.

Michael, Tonio: Look at recent publications in analysis. Big- and little-oh are used, and with the “abused” equality sign. Most of the time this is fine, as there is no possibility for confusion.

24. In my opinion Knuth doesn’t go far enough.
He still sticks with pointwise notions of differentiability and continuity that still require some heavy tools from classical analysis, such as completeness and compactness, to get any practical results. If his estimates had been uniform in x, he would have ended up with a much simpler theory, based on uniform notions and not requiring these heavy tools. See Calculus without limits in the previous reincarnation of this blog.

25. I’m in Calc2 and perusing this is giving me a headache. I think I’ll stick with how I’m currently being taught :p integrating from 1 to 2 is simple enough, why throw in another layer of thinking? besides, college is easy enough these days, why allow something that simplifies it even more.

and btw I am a CS student. I know the importance of big O in CS but personally the mathematician side of me is offended to see it being “integrated” into calculus 😉

26. To misha: I don’t see how you can control your error term in your proof of the Fundamental Theorem of Calculus without unpacking the definition of integration. Come to think of it, integration wasn’t defined at all in Knuth’s letter. Maybe he just didn’t have time. Instead of jumping ahead to wonder how the big theorem was proved with big-oh notation and without limits, I should have first asked how integration was defined with big-oh notation and without limits. Perhaps with that answer under my belt it would all become clear to me.

27. On further consideration, perhaps the magic is not in the notation at all, but rather in assuming all functions are sufficiently well-behaved so that the notation is adequate? I can see a strong case for teaching “Calculus of Friendly Functions” to the majority of people who study calculus. For me, however, real analysis was full of intuition-breaking functions that forced me to go back to the definitions. It seems that having some practice with limits prior to real analysis helped me get in synch with all the mind-bending concepts, whereas had I been taught calculus with big-oh notation, my intuition would be under-developed. Are fans of Knuth’s proposal suggesting it is better for mathematicians as well as for computer scientists?

28. Lipschitz function — got it now.

29. To Karl Juhnke: Look, I’m not entirely against continuity, limits, pointwise differentiability and such. I’m just against starting with them. If you make the estimate in Knuth’s strong derivative definition uniform in $x$, you will end up with calculus of Lipschitz functions. Scroll down to Calculus without limits on the previous reincarnation of this blog for details. A lot of the intuition-breaking functions are artifacts of the weak “classical” definitions and are irrelevant in the vast majority of applications. Also “Calculus of Friendly Functions” can be a good stepping stone to the classical analysis even for math majors, who will have to learn about Lipschitz functions and moduli of continuity anyway.

30. Karl Juhnke said:
I can see a strong case for teaching “Calculus of Friendly Functions” to the majority of people who study calculus.

I cannot agree more. The only issue is elementary definition of an appropriate class of functions.

By the way, there is a well-developed theory of “o-minimal structures”, part of model theory, where every definable function (that is, “friendly” function) has a wonderful property: it is piecewise monotonous-and-taking-all-intermediate-values. Unfortunately, it is difficult to deal with general monotonous-and-taking-all-intermediate-values functions: they are continuous (and, moreover, obviously so, since these are archetypal continuous functions, fitting into our intuition of continuity), but the class is not closed under addition. Finding a narrower explicitly and elementarily defined subclass with good natural properties is an interesting problem, but it is unclear even whether it has a solution.

The theory of o-minimal structures originates in a classical work by Alfred Tarski on decision procedure for Euclidean geometry.

31. Coming back to calculus, we actually can define an increasing function on an interval to be continuous if it doesn’t skip any values, and then we can define a function $f$ to be continuous at $a$ if there is an increasing continuous function $h$, defined for $x \geq 0$, such that $h(0)=0$ and $|f(x)-f(a)| \leq h(|x-a|)$. This is the way it is done in a ground-breaking book “An Infinite Series Approach to Calculus” by Susan Bassein (Publish or Perish, 1993), page 67. A short note on continuity I wrote a while ago, especially exercises 3 to 8, may clarify the matter (since then I have abandoned continuity in favor of explicit uniform estimates).

Now, the class of increasing continuous functions is closed under addition (and multiplication by positive constants), so it provides an adequate tool to develop the o-notation systematically.

Pushing this approach a bit further by requiring the estimate on $|f(x)-f(a)|$ to be uniform in both $x$ and $a$, we arrive naturally at “Calculus without limits” from the previous reincarnation of this blog.

Since any continuous function on a closed interval has a global modulus of continuity, all the continuous functions become “friendly functions,” and we can look at differentiation as division of $f(x)-f(a)$ by $x-a$ in the ring of friendly functions. Of course, it is natural to start with polynomials and then move to Lipschitz and maybe Hölder functions as “friendly,” before exploring the general continuity. This is the approach that I love.

32. With all apologies to Knuth (and a lot of reverence) …

I’m not a mathematician, but I tutored a lot of calculus to reluctant business majors to make ends meet in college. When I tried to get a student to understand what calculus “is” I often found Leibniz’s notation to be superior to Lagrangian notation. It stresses, simply and visually, the fact that we’re talking about slope when we’re talking about derivatives.

While this notation looks like fun for CS majors (as noted above) it also appears like it gets away from what is fundamentally being emphasized: just take the slope.

Add in the confusion of saying “=” no longer denotes transitive equality and you might have a bigger mess than the one you started with.

33. I’m a big Knuth fan, but the thought of teaching calculus this way gives me a combination of a headache and fits of laughter. Some of my calculus students cannot remember how to add integer fractions, cannot solve 3x+2=0, and want to use the Product Rule to differentiate ln(x) (because it’s “ln” times x). Something tells me Knuth wasn’t entirely serious in pushing this as a “calculus reform” effort.

34. Alexandre,

I realize this may come across as nitpicking, but I’m not sure I’d agree that the theory of o-minimal structures originates in the Tarski-Seidenberg theorem. Certainly that work, and particularly their method of quantifier elimination, has been a source of inspiration for many, but the result is restricted to just one particular structure: that of semi-algebraic sets (sets which are definable by means of polynomial inequalities).

My understanding is that only later did it dawn on model theorists that the o-minimality condition on a general structure had such powerful consequences — and due to a paucity of examples (the main one being Tarski’s), the subject didn’t really take off until Wilkie came out with his remarkable result, that the expansion of semi-algebraic sets that results by adding an exponential function is o-minimal. You may be more expert than I in this area, but I’d be more inclined to say that the theory really originates in those two developments, even conceding that Tarski-Seidenberg has always played an archetypal role.

You can do a lot of fun calculus (from the big O point of view) with Wilkie’s structure and further expansions, but the uncomfortable fact remains that with o-minimality, you can never incorporate the sine function in this setting. Is this something that Shiota’s X-sets can handle?

35. Pushing Knuth’s idea to its limit, take any modulus of continuity, i.e. a convex increasing function $m(\epsilon)$, defined for $\epsilon \geq 0$, $m(0)=0$ and not skipping any values (i.e., continuous at 0, continuity for strictly positive $\epsilon$ being automatic). Then Knuth’s “strong derivative” concept with $O(\epsilon)$ replaced by $O(m(|\epsilon|))$ and $O$ uniform in $x$, will give you calculus without limits.

36. It should have been “…$O(\epsilon ^2)$ replaced by $O(|\epsilon|m(|\epsilon|))$…” in comment 30, sorry for messing up.

To Robert: your remark only indicates that any “calculus reform” will not work without a reform of the rest of mathematical education.

37. Todd Trimble said:

You can do a lot of fun calculus (from the big O point of view) with Wilkie’s structure and further expansions, but the uncomfortable fact remains that with o-minimality, you can never incorporate the sine function in this setting. Is this something that Shiota’s X-sets can handle?

Yes, Shiota’s universe looks like a good idea. But in any case, a lot of serious mathematical work will be needed before a reasonable concept of “friendly functions” is born.

38. I claim that Calculus of Friendly Functions is already here for everybody to enjoy. Here is how. Take your favorite modulus of continuity $m$, then uniform $m$-differentiability can be defined by the inequality
$|f(x+h)-f(x)-f'(x)h| \leq K|h|m(|h|)$ uniform in $x$.
The derivative will be uniformly $m$-continuous, i.e., it will satisfy the inequality $|f'(x+h)-f'(x)| \le 2Km(|h|)$ uniform in $x$. Any $m$-continuous function $f$ has a primitive $F$ that is uniformly $m$-differentiable. All the proofs are clean and simple, on the level of high school algebra. How can it be friendlier?
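
As a sketch of where the factor $2K$ in the bound on $f'(x+h)-f'(x)$ can come from: apply the defining inequality once at the point $x$ with step $h$, and once at the point $x+h$ with step $-h$:

$|f(x+h)-f(x)-f'(x)h| \leq K|h|m(|h|)\,,$

$|f(x)-f(x+h)+f'(x+h)h| \leq K|h|m(|h|)\,.$

Adding the two expressions inside the absolute values and using the triangle inequality gives

$|(f'(x+h)-f'(x))h| \leq 2K|h|m(|h|)\,,$

and dividing by $|h|$ for $h\neq 0$ yields the stated estimate.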

39. Misha: My dream is to get rid of inequalities and work with piecewise monotonous-and-taking-all-intermediate-values functions. This is a class of functions sufficient for doing most of mathematical economics (but not models of derivatives trading, I have to admit — but perhaps derivatives trading will eventually be outlawed).

40. To Alexandre: I must admit that getting rid of inequalities looks a bit too radical to me. What is left then? Aren’t inequalities implicitly present in O and o notations and semi-algebraic and semi-analytic sets? What kind of problems in mathematical economics can be treated, or you would like to treat, without inequalities? Can you give a reference maybe? It’s probably a topic for another posting, since you and Todd clearly went off on a tangent here. I’m not sure whether complete formalization and axiomatization of O and o notations will help make calculus more widely understandable.

41. Misha, the “tangent” where I was a commenter was about friendly functions, where you were a participant as well. Alexandre was pointing out o-minimal structures as giving classes of friendly functions, but part of my point (and his too) was that these classes may not be general enough.

The “o” stands for “order”, that is the binary relation < which is assumed to be part of the structure we are considering — the theory of o-minimal structures is therefore in the direction opposite to getting rid of inequalities. (“O-minimal” means that the only subsets of $\mathbb{R}$ which are definable in the structure are the ones already guaranteed to be in the structure: finite unions of points and intervals. The book by van den Dries, Tame Topology and O-minimal Structures, is a very good introduction — the word “tame” meaning friendly in the sense of being free of pathology; cf. Grothendieck’s Esquisse d’un Programme.)

42. Thanks for the explanations and the references, Todd. As for the tension between friendliness as the absence of pathologies (or amenability to explicit or numerical calculations) and generality of our axioms, I think it will always be with us. How to resolve this tension of course depends on the problem that the theory is applied to. My attitude is that theories are mostly the means to the ends (of solving problems), not the ends in themselves.

43. I can not resist quoting the concluding remarks of the presidential address to the London Mathematical Society by Michael Atiyah (Bull. London Math. Soc., 10, 1978, 69-76), called “The Unity of Mathematics,” that still ring true today, maybe even more so than in 1976:

The main theme of my lecture has been to illustrate the unity of mathematics by discussing a few examples that range from Number Theory through Algebra, Geometry, Topology and Analysis. This interaction is, in my view, not simply an occasional interesting incident, but rather it is of the essence of mathematics. Finding analogies between different phenomena and developing the techniques to exploit these analogies is the basic mathematical approach to the physical world. It is therefore hardly surprising that it should also figure prominently internally within mathematics itself. I feel that this have to be emphasized because the axiomatic era has tended to divide mathematics into special branches, each restricted to developing the consequences of a given set of axioms. Now I am not entirely against the axiomatic approach so long as it is regarded as a convenient temporary device to concentrate the mind, but it should not be given too high a status.

A secondary theme implicit in my lecture has been the importance of simplicity in mathematics. The most useful piece of advice I would give to a mathematics student is always to suspect an impressive sounding Theorem if it doesn’t have a special case which is both simple and nontrivial. I have tried to select examples that satisfy these conditions.

Both unity and simplicity are essential, since the aim of mathematics is to explain as much as possible in simple basic terms. Mathematics is still after all a human activity, not a computer programme, and if our accumulated experience is to be passed on from generation to generation we must continually strive to simplify and unify.

44. To Alexandre: you can “get rid of inequalities” in differentiation, by viewing differentiation as division in your favorite class of (globally defined) “friendly” functions (uniformly continuous functions will do). This will work for many variables as well, because you have to divide by polynomials only, and polynomials can vanish only on sets without any interior points. I tried to explain it elsewhere, but encountered some misunderstanding and resistance. So you can look at differentiation as a purely algebraic matter. But still inequalities sneak in through the back door, so to speak, because you have to describe your friendly functions, and some restrictions on the variation $f(x)-f(u)$ in terms of inequalities will be necessary (you have mentioned monotonicity, for example). Also at some point you will have to explain why tangents look like tangents, why they cling to the graphs, and you will need inequalities again to explain the very meaning of differentiation as linear approximation.
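
As a minimal sketch of what “differentiation as division” can look like (the example $f(x)=x^3$ and the symbol $q$ for the quotient are illustrative assumptions, not taken from the comment itself), divide the variation $f(x)-f(u)$ by $x-u$ and then set $u=x$:

$x^3-u^3=(x-u)(x^2+xu+u^2)$

$q(x,u)=x^2+xu+u^2$

$f'(x)=q(x,x)=3x^2$

No limit is taken here. The inequalities come back only when one asks how closely the tangent clings to the graph, for instance through the identity $|f(x)-f(u)-f'(u)(x-u)|=|x+2u|\,(x-u)^2$, which follows from the same factorization.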

A correction: In comment #30, line 2, it should be concave, not convex.

45. Robert wrote: I’m a big Knuth fan, but the thought of teaching calculus this way gives me a combination of a headache and fits of laughter. Some of my calculus students cannot remember how to add integer fractions, cannot solve 3x+2=0, and want to use the Product Rule to differentiate ln(x) (because it’s “ln” times x).

That is the most ridiculous thing. Why are these students even taking calculus? They should be taking remedial math.

46. I moved my web page to http://www.mathfoolery.com a couple of days ago, sorry for any inconvenience. Also, my not quite finished article at http://arxiv.org/abs/0905.3611 may be of interest.

47. I have some trouble understanding the big-O notation. I need help with how the function is used in maths practice.

48. Some of calculus can be done on discrete structures, without limits. That would greatly speed up a course.

This approach could be used for example to tell a group of normal, non-mathematical adults what a derivative is. Gilbert Strang got the basic picture down to 40 minutes here: http://ocw.mit.edu/resources/res-18-005-highlights-of-calculus-spring-2010/highlights_of_calculus/big-picture-of-calculus/ but I think you could explain what a derivative is in 10 minutes, with computer graphics, pre-loaded data (for example http://media.tumblr.com/4567487c802ce60df944785c233cb5eb/tumblr_inline_mkgbvtq2351qbdydv.png + http://media.tumblr.com/3c6ff79662a5b628e55b86c39c0675f1/tumblr_inline_mkgax5A89h1qbdydv.png), lag/diff, and “overloaded” Cartesian pictures like this: http://media.tumblr.com/cf27f5a7d9d28700c7c329f600bcd375/tumblr_inline_mf8tm1KZkY1qbdydv.png + http://media.tumblr.com/3c6ff79662a5b628e55b86c39c0675f1/tumblr_inline_mf8t98qfo11qbdydv.png + http://media.tumblr.com/7b578ae7121a9eedd86c7f1fd92385b9/tumblr_inline_mf8rimE0T81qbdydv.png. (Strang does essentially this in one of his lectures: comment that square − lag( square ) = double, say “Hmm…” and leave it there for contemplation.)
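
The “square − lag( square )” remark can be tried out in a few lines. The following is only a sketch: the helper name `lag_diff` and the choice of the integers 0 to 7 are assumptions made for illustration, not something from Strang's lecture.

```python
def lag_diff(values):
    """Return values[n] - values[n-1] for n = 1, 2, ... (the lag/diff step)."""
    return [curr - prev for prev, curr in zip(values, values[1:])]


# Illustrative data: the squares of 0, 1, ..., 7.
n_values = list(range(8))
squares = [n * n for n in n_values]

# Differences of consecutive squares.
differences = lag_diff(squares)

# Compare each difference with "double", i.e. 2n.
for n, d in zip(n_values[1:], differences):
    print(f"n={n}: n^2 - (n-1)^2 = {d}, 2n = {2 * n}")
```

On an integer grid the difference is exactly $2n-1$, so “double” is off by one; with a finer step $h$ the same computation gives $2x-h$, which is where the “Hmm…” leads.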

Or you could quickly show undergraduates a baby version of Green’s theorem on a 1-simplex, again in under 10 minutes.
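
One possible reading of the “baby version” (an assumption on my part about what is meant) is the telescoping identity on a subdivided segment: the sum of the differences over the pieces equals the difference of the values at the two endpoints. A small sketch, with the hypothetical helper names `sum_over_pieces` and `boundary_value` and an arbitrarily chosen cubic:

```python
def sum_over_pieces(f, points):
    """Add up f(b) - f(a) over each small piece [a, b] of the subdivision."""
    return sum(f(b) - f(a) for a, b in zip(points, points[1:]))


def boundary_value(f, points):
    """Evaluate f on the boundary of the whole segment: right end minus left end."""
    return f(points[-1]) - f(points[0])


def cube(n):
    # Arbitrary illustrative function; any function on the grid would do.
    return n ** 3


# The segment [2, 9] cut into unit pieces.
grid = list(range(2, 10))

interior = sum_over_pieces(cube, grid)
boundary = boundary_value(cube, grid)
print(interior, boundary)
```

All the interior endpoints cancel in pairs, which is the discrete shadow of the cancellation along shared edges in Green's theorem.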
