I recommend against it. The symbolism of formal logic is indispensable in the discussion of the logic of mathematics, but used as a means of transmitting ideas from one mortal to another it becomes a cumbersome code. The author had to code his thoughts in it (I deny that anybody thinks in terms of ∃, ∀, ∧, and the like), and the reader has to decode what the author wrote; both steps are a waste of time and an obstruction to understanding. Symbolic presentation, in the sense of either the modern logician or the classical epsilontist, is something that machines can write and few but machines can read.
So much for “any”. Other offenders, charged with lesser crimes, are “where”, and “equivalent”, and “if … then … if … then”. “Where” is usually a sign of a lazy afterthought that should have been thought through before. “If n is sufficiently large, then |aₙ| < ε, where ε is a preassigned positive number”; both disease and cure are clear. “Equivalent” for theorems is logical nonsense. (By “theorem” I mean a mathematical truth, something that has been proved. A meaningful statement can be false, but a theorem cannot; “a false theorem” is self-contradictory.) What sense does it make to say that the completeness of L² is equivalent to the representation theorem for linear functionals on L²? What is meant is that the proofs of both theorems are moderately hard, but once one of them has been proved, either one, the other can be proved with relatively much less work. The logically precise word “equivalent” is not a good word for that. As for “if … then … if … then”, that is just a frequent stylistic bobble committed by quick writers and rued by slow readers. “If p, then if q, then r.” Logically all is well (p ⇒ (q ⇒ r)), but psychologically it is just another pebble to stumble over, unnecessarily. Usually all that is needed to avoid it is to recast the sentence, but no universally good recasting exists; what is best depends on what is important in the case at hand. It could be “If p and q, then r”, or “In the presence of p, the hypothesis q implies the conclusion r”, or many other versions.
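The claim that the nested conditional is logically sound, and that the recast sentence says the same thing, can be checked once and then forgotten. The following is only a sketch; the letters p, q, r stand for the three statements in the sentence being recast and are labels assumed here for illustration.

```latex
% Sketch: the nested "if ... then ... if ... then" and the recast
% "if ... and ..., then" are the same statement (the law of exportation).
% p, q, r are placeholder statements assumed for this illustration.
\[
  \bigl(p \Rightarrow (q \Rightarrow r)\bigr)
  \;\Longleftrightarrow\;
  \bigl((p \wedge q) \Rightarrow r\bigr)
\]
% Reason: each side is false exactly when p and q are true and r is false,
% that is, each side is equivalent to
\[
  \neg p \,\vee\, \neg q \,\vee\, r .
\]
```

Since the two forms are interchangeable, the choice between them is purely a matter of which reads more easily in the case at hand.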
The examples of mathematical diction mentioned so far were really logical matters. To illustrate the possibilities of the unobtrusive use of precise language in the everyday sense of the working mathematician, I briefly mention three examples: function, sequence, and contain.
I belong to the school that believes that functions and their values are sufficiently different that the distinction should be maintained. No fuss is necessary, or at least no visible, public fuss; just refrain from saying things like “the function z² + 1 is even”. It takes a little longer to say “the function f defined by f(z) = z² + 1 is even”, or, what is from many points of view preferable, “the function z → z² + 1 is even”, but it is a good habit that can sometimes save the reader (and the author) from serious blunder and that always makes for smoother reading.
“Sequence” always means “function whose domain is the set of natural numbers”. When an author writes “the union of a sequence of measurable sets is measurable” he is guiding the reader’s attention to where it doesn’t belong. The theorem has nothing to do with the firstness of the first set, the secondness of the second, and so on; the sequence is irrelevant. The correct statement is that “the union of a countable set of measurable sets is measurable” (or, if a different emphasis is wanted, “the union of a countably infinite set of measurable sets is measurable”). The theorem that “the limit of a sequence of measurable functions is measurable” is a very different thing; there “sequence” is correctly used. If a reader knows what a sequence is, if he feels the definition in his bones, then the misuse of the word will distract him and slow his reading down, if ever so slightly; if he doesn’t really know, then the misuse will seriously postpone his ultimate understanding.
“Contain” and “include” are almost always used as synonyms, often by the same people who carefully coach their students that ∈ and ⊂ are not the same thing at all. It is extremely unlikely that the interchangeable use of contain and include will lead to confusion. Still, some years ago I started an experiment, and I am still trying it: I have systematically and always, in spoken word and written, used “contain” for ∈ and “include” for ⊂. I don’t say that I have proved anything by this, but I can report that (a) it is very easy to get used to, (b) it does no harm whatever, and (c) I don’t think that anybody ever noticed it. I suspect, but that is not likely to be provable, that this kind of terminological consistency (with no fuss made about it) might nevertheless contribute to the reader’s (and listener’s) comfort.
Consistency, by the way, is a major virtue and its opposite is a cardinal sin in exposition. Consistency is important in language, in notation, in references, in typography – it is important everywhere, and its absence can cause anything from mild irritation to severe misinformation.
My advice about the use of words can be summed up as follows. (1) Avoid technical terms, and especially the creation of new ones, whenever possible. (2) Think hard about the new ones that you must create; consult Roget; and make them as appropriate as possible. (3) Use the old ones correctly and consistently, but with a minimum of obtrusive pedantry.
Everything said about words applies, mutatis mutandis, to the even smaller units of mathematical writing, the mathematical symbols. The best notation is no notation; whenever it is possible to avoid the use of a complicated alphabetic apparatus, avoid it. A good attitude to the preparation of written mathematical exposition is to pretend that it is spoken. Pretend that you are explaining the subject to a friend on a long walk in the woods, with no paper available; fall back on symbolism only when it is really necessary.
A corollary to the principle that the less there is of notation the better it is, and in analogy with the principle of omitting irrelevant assumptions, avoid the use of irrelevant symbols. Example: “On a compact space every real-valued continuous function f is bounded.” What does the symbol “f” contribute to the clarity of that statement? Another example: “If 0 ≤ limₙ αₙ^(1/n) = ρ < 1, then limₙ αₙ = 0.” What does “ρ” contribute here? The answer is the same in both cases (nothing), but the reasons for the presence of the irrelevant symbols may be different. In the first case “f” may be just a nervous habit; in the second case “ρ” is probably a preparation for the proof. The nervous habit is easy to break. The other is harder, because it involves more work for the author. Without the “ρ” in the statement, the proof will take a half line longer; it will have to begin with something like “Write ρ = limₙ αₙ^(1/n).” The repetition (of “limₙ αₙ^(1/n)”) is worth the trouble; both statement and proof read more easily and more naturally.
A showy way to say “use no superfluous letters” is to say “use no letter only once”. What I am referring to here is what logicians would express by saying “leave no variable free”. In the example above, the one about continuous functions, “f” was a free variable. The best way to eliminate “f” is to convert it from free to bound. Most mathematicians would do that by saying “If f is a real-valued continuous function on a compact space, then f is bounded.” Some logicians would insist on pointing out that “f” is still free in the new sentence (twice), and technically they would be right. To make it bound, it would be necessary to insert “for all f” at some grammatically appropriate point, but the customary way mathematicians handle the problem is to refer (tacitly) to the (tacit) convention that every sentence is preceded by all the universal quantifiers that are needed to convert all its variables into bound ones.
The rule of never leaving a free variable in a sentence, like many of the rules I am stating, is sometimes better to break than to obey. The sentence, after all, is an arbitrary unit, and if you want a free “f” dangling in one sentence so that you may refer to it in a later sentence in, say, the same paragraph, I don’t think you should necessarily be drummed out of the regiment. The rule is essentially sound, just the same, and while it may be bent sometimes, it does not deserve to be shattered into smithereens.
There are other symbolic logical hairs that can lead to obfuscation, or, at best, temporary bewilderment, unless they are carefully split. Suppose, for an example, that somewhere you have displayed the relation
(*)    ∫₀¹ |f(x)|² dx < ∞
as, say, a theorem proved about some particular f. If, later, you run across another function g with what looks like the same property, you should resist the temptation to say “g also satisfies (*)”. That’s logical and alphabetical nonsense. Say instead “(*) remains satisfied if f is replaced by g”, or, better, give (*) a name (in this case it has a customary one) and say “g also belongs to L²(0,1)”.
What about “inequality (*)”, or “equation (7)”, or “formula (iii)”; should all displays be labelled or numbered? My answer is no. Reason: just as you shouldn’t mention irrelevant assumptions or name irrelevant concepts, you also shouldn’t attach irrelevant labels. Some small part of the reader’s attention is attracted to the label, and some small part of his mind will wonder why the label is there. If there is a reason, then the wonder serves a healthy purpose by way of preparation, with no fuss, for a future reference to the same idea; if there is no reason, then the attention and the wonder are wasted.
It’s good to be stingy in the use of labels, but parsimony also can be carried to extremes. I do not recommend that you do what Dickson once did [10]. On p. 89 he says: “Then … we have (1) …” – but p. 89 is the beginning of a new chapter, and happens to contain no display at all, let alone one bearing the label (1). The display labelled (1) occurs on p. 90, overleaf, and I never thought of looking for it there. That trick gave me a helpless and bewildered five minutes. When I finally saw the light, I felt both stupid and cheated, and I have never forgiven Dickson.
One place where cumbersome notation quite often enters is in mathematical induction. Sometimes it is unavoidable. More often, however, I think that indicating the step from 1 to 2 and following it by an airy “and so on” is as rigorously unexceptionable as the detailed computation, and much more understandable and convincing. Similarly, a general statement about matrices is frequently best proved not by the exhibition of many aᵢⱼ’s, accompanied by triples of dots laid out in rows and columns and diagonals, but by the proof of a typical (say 3 × 3) special case.
There is a pattern in all these injunctions about the avoidance of notation. The point is that the rigorous concept of a mathematical proof can be taught to a stupid computing machine in one way only, but to a human being endowed with geometric intuition, with daily increasing experience, and with the impatient inability to concentrate on repetitious detail for very long, that way is a bad way. Another illustration of this is a proof that consists of a chain of expressions separated by equal signs. Such a proof is easy to write. The author starts from the first equation, makes a natural substitution to get the second, collects terms, permutes, inserts and immediately cancels an inspired factor, and by steps such as these proceeds till he gets the last equation. This is, once again, coding, and the reader is forced not only to learn as he goes, but, at the same time, to decode as he goes. The double effort is needless. By spending another ten minutes writing a carefully worded paragraph, the author can save each of his readers half an hour and a lot of confusion. The paragraph should be a recipe for action, to replace the unhelpful code that merely reports the results of the act and leaves the reader to guess how they were obtained. The paragraph would say something like this: “For the proof, first substitute p for q, then collect terms, permute the factors, and, finally, insert and cancel a factor r.”
A familiar trick of bad teaching is to begin a proof by saying: “Given ε, let δ be (ε/(3M² + 2))^(1/2)”. This is the traditional backward proof-writing of classical analysis. It has the advantage of being easily verifiable by a machine (as opposed to understandable by a human being), and it has the dubious advantage that something at the end comes out to be less than ε, instead of less than, say, ((3M² + 7)ε/24)^(1/3). The way to make the human reader’s task less demanding is obvious: write the proof forward. Start, as the author always starts, by putting something less than ε, and then do what needs to be done – multiply by 3M² + 7 at the right time and divide by 24 later, etc., etc. – till you end up with what you end up with. Neither arrangement is elegant, but the forward one is graspable and rememberable.
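A small worked case may make the contrast concrete. The example is not from the essay: the function x → x², the point a, and the normalization δ ≤ 1 are assumptions chosen only because they keep the arithmetic short. It is a sketch of the forward arrangement, in which the estimate is carried out first and the right choice of the small number emerges at the end instead of being announced at the start.

```latex
% Illustration only (not from the essay): continuity of x -> x^2 at a point a,
% written "forward". Start from a bound on |x - a| and see what comes out.
Suppose $|x - a| < \delta$ and $\delta \le 1$. Then
\[
  |x + a| \le |x - a| + 2|a| < 1 + 2|a| ,
\]
so that
\[
  |x^2 - a^2| = |x - a|\,|x + a| < \delta\,(1 + 2|a|) .
\]
% Only now is it clear how small delta has to be:
Consequently, given $\varepsilon > 0$, any
$\delta \le \min\{1,\ \varepsilon/(1 + 2|a|)\}$
gives $|x^2 - a^2| < \varepsilon$.
```

The backward version of the same proof would open with “Given ε, let δ be min{1, ε/(1 + 2|a|)}”, which is correct but gives the reader no hint of where the expression came from.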
There is not much harm that can be done with non-alphabetical symbols, but there too consistency is good and so is the avoidance of individually unnoticed but collectively abrasive abuses. Thus, for instance, it is good to use a symbol so consistently that its verbal translation is always the same. It is good, but it is probably impossible; nonetheless it’s a better aim than no aim at all. How are we to read “∈”: as the verb phrase “is in” or as the preposition “in”? Is it correct to say: “For x ∈ A, we have x ∈ B,” or “If x ∈ A, then x ∈ B”? I strongly prefer the latter (always read “∈” as “is in”) and I doubly deplore the former (both usages occur in the same sentence). It’s easy to write and it’s easy to read “For x in A, we have x ∈ B”; all dissonance and all even momentary ambiguity is avoided. The same is true for “⊂” even though the verbal translation is longer, and even more true for “≤”. A sentence such as “Whenever a positive number is ≤ 3, its square is ≤ 9” is ugly.
Not only paragraphs, sentences, words, letters, and mathematical symbols, but even the innocent looking symbols of standard prose can be the source of blemishes and misunderstanding; I refer to punctuation marks. A couple of examples will suffice. First: an equation, or inequality, or inclusion, or any other mathematical clause is, in its informative content, equivalent to a clause in ordinary language, and, therefore, it demands just as much to be separated from its neighbors. In other words: punctuate symbolic sentences just as you would verbal ones. Second: don’t overwork a small punctuation mark such as a period or a comma. They are easy for the reader to overlook, and the oversight causes backtracking, confusion, delay. Example: “Assume that a ∈ X. X belongs to the class C, …”. The period between the two X’s is overworked, and so is this one: “Assume that X vanishes. X belongs to the class C, …”. A good general rule is: never start a sentence with a symbol. If you insist on starting the sentence with a mention of the thing the symbol denotes, put the appropriate word in apposition, thus: “The set X belongs to the class C, …”.
The overworked period is no worse than the overworked comma. Not “For invertible X, X* also is invertible”, but “For invertible X, the adjoint X* also is invertible”. Similarly, not “Since p ≠ 0, p ∈ U”, but “Since p ≠ 0, it follows that p ∈ U”. Even the ordinary “If you don’t like it, lump it” (or, rather, its mathematical relatives) is harder to digest than the stuffy-sounding “If you don’t like it, then lump it”; I recommend “then” with “if” in all mathematical contexts. The presence of “then” can never confuse; its absence can.
A final technicality that can serve as an expository aid, and should be mentioned here, is in a sense smaller than even the punctuation marks, and yet it is a conspicuous aspect of the printed page. What I am talking about is the layout, the architecture, the appearance of the page itself, of all the pages. Experience with writing, or perhaps even with fully conscious and critical reading, should give you a feeling for how what you are now writing will look when it’s printed. If it looks like solid prose, it will have a forbidding, sermony aspect; if it looks like computational hash, with a page full of symbols, it will have a frightening, complicated aspect. The golden mean is golden. Break it up, but not too small; use prose, but not too much. Intersperse enough displays to give the eye a chance to help the brain; use symbols, but in the middle of enough prose to keep the mind from drowning in a morass of suffixes.
I said before, and I’d like to say for emphasis again, that the differences among books, articles, lectures, and letters (and whatever other means of communication you can think of) are smaller than the similarities.
When you are writing a research paper, the role of the “slips of paper” out of which a book outline can be constructed might be played by the theorems and the proofs that you have discovered; but the game of solitaire that you have to play with them is the same.
A lecture is a little different. In the beginning a lecture is an expository paper; you plan it and write it the same way. The difference is that you must keep the difficulties of oral presentation in mind. The reader of a book can let his attention wander, and later, when he decides to, he can pick up the thread, with nothing lost except his own time; a member of a lecture audience cannot do that. The reader can try to prove your theorems for himself, and use your exposition as a check on his work; the hearer cannot do that. The reader’s attention span is short enough; the hearer’s is much shorter. If computations are unavoidable, a reader can be subjected to them; a hearer must never be. Half the art of good writing is the art of omission; in speaking, the art of omission is nine-tenths of the trick. These differences are not large. To be sure, even a good expository paper, read out loud, would make an awful lecture – but not worse than some I have heard.
The appearance of the printed page is replaced, for a lecture, by the appearance of the blackboard, and the author’s imagined audience is replaced for the lecturer by live people; these are big differences. As for the blackboard: it provides the opportunity to make something grow and come alive in a way that is not possible with the printed page. (Lecturers who prepare a blackboard, cramming it full before they start speaking, are unwise and unkind to audiences.) As for live people: they provide an immediate feedback that every author dreams about but can never have.
The basic problems of all expository communication are the same; they are the ones I have been describing in this essay. Content, aim and organization, plus the vitally important details of grammar, diction, and notation – they, not showmanship, are the essential ingredients of good lectures, as well as good books.
Smooth, consistent, effective communication has enemies; they are called editorial assistants or copyreaders.
An editor can be a very great help to a writer. Mathematical writers must usually live without this help, because the editor of a mathematical book must be a mathematician, and there are very few mathematical editors. The ideal editor, who must potentially understand every detail of the author’s subject, can give the author an inside but nonetheless unbiased view of the work that the author himself cannot have. The ideal editor is the union of the friend, wife, student, and expert junior-grade whose contribution to writing I described earlier. The mathematical editors of book series and journals don’t even come near to the ideal. Their editorial work is but a small fraction of their life, whereas to be a good editor is a full-time job. The ideal mathematical editor does not exist; the friend-wife-etc. combination is only an almost ideal substitute.
The editorial assistant is a full-time worker whose job is to catch your inconsistencies, your grammatical slips, your errors of diction, your misspellings – everything that you can do wrong, short of the mathematical content. The trouble is that the editorial assistant does not regard himself as an extension of the author, and he usually degenerates into a mechanical misapplier of mechanical rules. Let me give some examples.
I once studied certain transformations called “measure-preserving”. (Note the hyphen: it plays an important role, by making a single word, an adjective, out of two words.) Some transformations pertinent to that study failed to deserve the name; their failure was indicated, of course, by the prefix “non”. After a long sequence of misunderstood instructions, the printed version spoke of a “nonmeasure preserving transformation”. That is nonsense, of course, amusing nonsense, but, as such, it is distracting and confusing nonsense.
A mathematician friend reports that in the manuscript of a book of his he wrote something like “p or q holds according as x is negative or positive”. The editorial assistant changed that to “p or q holds according as x is positive or negative”, on the grounds that it sounds better that way. That could be funny if it weren’t sad, and, of course, very very wrong.
A common complaint of anyone who has ever discussed quotation marks with the enemy concerns their relation to other punctuation. There appears to be an international typographical decree according to which a period or a comma immediately to the right of a quotation is “ugly”. (As here: the editorial assistant would have changed that to “ugly.” if I had let him.) From the point of view of the logical mathematician (and even more the mathematical logician) the decree makes no sense; the comma or period should come where the logic of the situation forces it to come. Thus,
He said: “The comma is ugly.”
Here, clearly, the period belongs inside the quote; the two situations are different and no inelastic rule can apply to both.
Moral: there are books on “style” (which frequently means typographical conventions), but their mechanical application by editorial assistants can be harmful. If you want to be an author, you must be prepared to defend your style; go forearmed into the battle.
The battle against copyreaders is the author’s last task, but it’s not the one that most authors regard as the last. The subjectively last step comes just before: it is to finish the book itself – to stop writing. That’s hard.
There is always something left undone, always either something more to say, or a better way to say something, or, at the very least, a disturbing vague sense that the perfect addition or improvement is just around the corner, and the dread that its omission would be everlasting cause for regret. Even as I write this, I regret that I did not include a paragraph or two on the relevance of euphony and prosody to mathematical exposition. Or, hold a minute!, surely I cannot stop without a discourse on the proper naming of concepts (why “commutator” is good and “set of first category” is bad) and the proper way to baptize theorems (why “the closed graph theorem” is good and “the Cauchy-Buniakowski-Schwarz theorem” is bad). And what about that sermonette that I haven’t been able to phrase satisfactorily about following a model. Choose someone, I was going to say, whose writing can touch you and teach you, and adapt and modify his style to fit your personality and your subject – surely I must get that said somehow.
There is no solution to this problem except the obvious one; the only way to stop is to be ruthless about it. You can postpone the agony a bit, and you should do so, by proofreading, by checking the computations, by letting the manuscript ripen, and then by reading the whole thing over in a gulp, but you won’t want to stop any more then than before.
When you’ve written everything you can think of, take a day or two to read over the manuscript quickly and to test it for the obvious major points that would first strike a stranger’s eye. Is the mathematics good, is the exposition interesting, is the language clear, is the format pleasant and easy to read? Then proofread and check the computations; that’s an obvious piece of advice, and no one needs to be told how to do it. “Ripening” is easy to explain but not always easy to do: it means to put the manuscript out of sight and try to forget it for a few months. When you have done all that, and then re-read the whole work from a rested point of view, you have done all you can. Don’t wait and hope for one more result, and don’t keep on polishing. Even if you do get that result or do remove that sharp corner, you’ll only discover another mirage just ahead.
To sum it all up: begin at the beginning, go on till you come to the end, and then, with no further ado, stop.
I have come to the end of all the advice on mathematical writing that I can compress into one essay. The recommendations I have been making are based partly on what I do, more on what I regret not having done, and most on what I wish others had done for me. You may criticize what I’ve said on many grounds, but I ask that a comparison of my present advice with my past actions not be one of them. Do, please, as I say, and not as I do, and you’ll do better. Then rewrite this essay and tell the next generation how to do better still.
[1] Heisel, C. T., The circle squared beyond refutation, Heisel, Cleveland (1934).
[2] Nelson, E., A proof of Liouville’s theorem, Proc. A.M.S. 12 (1961) 995.
[3] Dunford, N. and Schwartz, J. T., Linear operators, Interscience, New York (1958, 1963).
[4] Birkhoff, G. D., Proof of the ergodic theorem, Proc. N.A.S. U.S.A. 17 (1931) 656–660.
[5] Thurber, J. and Nugent, E., The male animal, Random House, New York (1940).
[6] Lefschetz, S., Algebraic topology, A.M.S., New York (1942).
[7] Fowler, H. W., Modern English usage (Second edition, revised by Sir Ernest Gowers), Oxford, New York (1965).
[8] Roget’s International Thesaurus, Crowell, New York (1946).
[9] Webster’s New International Dictionary (Second edition, unabridged), Merriam, Springfield (1951).
[10] Dickson, L. E., Modern algebraic theories, Sanborn, Chicago (1926).