Wednesday, November 28, 2007

Arbitrary Names

Language Log is a treasure trove. Every time I dive into its archives I find something old, but new to me, that is interesting.

In 2004, Geoffrey Pullum discussed his refusal to learn which business model for massage is called incall and which is called outcall, on the grounds that the names are utterly unhelpful and "if I can't see how it follows by some sort of linguistic principles, I will just forget it again." Later that same day, Christopher Potts pointed out that similar problems exist with ordering food "in" or "out", pushing an appointment "up" or "back", and (my favorite) identifying the root of a tree structure as maximal or minimal in the dominance relation. (Potts also says that "linguists draw their trees upside down, with the root at the top of the page", which of course is right-side up to computer scientists like me, as long as we're talking about trees as an abstract graph structure rather than about trees, the plants.)

I love reading this kind of LL post, because some of these are the kinds of things about language, especially technical language, that drive me crazy.

Example: Statistics textbooks classify the ways a statistical test can go wrong as "type 1 errors" and "type 2 errors". (Or maybe they use Roman numerals. The point, namely that these names are stupid, is the same.) I have an old book (actually, a new copy of a Dover edition of an old text) that calls them "errors of the first kind" and "errors of the second kind". Now, one of these kinds of errors is when your test tells you two things are different when they're really the same, and the other is when the test tells you they are the same when they're really different. Like Pullum, I absolutely refuse to even try to remember which is which. And also like Pullum, I anticipate that many people reading this statement would respond to it by trying to tell me anyway: the fact that I don't know is not the point; the point is that the meanings of these technical terms are arbitrary, reflecting nothing about the events they describe but depending instead on what order some founding father of statistics happened to think of them in.

So what do I suggest? Well, in many contexts, like testing people for diseases or testing drugs for treatment effects or identifying spam or whatever, the terms "false positive" and "false negative" are transparent and work perfectly well. In other cases, where the designation of outcomes as "positive" and "negative" is not obvious, the statistics literature provides the terms "rejection error" and "acceptance error", referring to rejection or acceptance of the null hypothesis; which hypothesis is null is an unambiguous technical fact about the statistical method being used.
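To make the naming concrete, here is a minimal Python sketch of the four possible outcomes of a test, labeled only with the transparent terms. The names `Outcome` and `classify` are made up for illustration, and the comments pairing "rejection error" with "false positive" assume the usual setup in which the null hypothesis is the "nothing going on" case (no disease, no treatment effect, not spam).

```python
from enum import Enum


class Outcome(Enum):
    """Possible results of a statistical test, named by what actually happened."""
    CORRECT_REJECTION = "correct rejection"
    CORRECT_ACCEPTANCE = "correct acceptance"
    # The null hypothesis was rejected although it is true.
    # Assuming the null is "no effect", this is a false positive.
    REJECTION_ERROR = "rejection error"
    # The null hypothesis was accepted although it is false.
    # Assuming the null is "no effect", this is a false negative.
    ACCEPTANCE_ERROR = "acceptance error"


def classify(null_is_true: bool, test_rejects_null: bool) -> Outcome:
    """Name the outcome from the truth of the null and the test's decision."""
    if test_rejects_null:
        return Outcome.REJECTION_ERROR if null_is_true else Outcome.CORRECT_REJECTION
    return Outcome.CORRECT_ACCEPTANCE if null_is_true else Outcome.ACCEPTANCE_ERROR


if __name__ == "__main__":
    for null_is_true in (True, False):
        for rejects in (True, False):
            outcome = classify(null_is_true, rejects)
            print(f"null true: {null_is_true}, rejected: {rejects} -> {outcome.value}")
```

Each error is named after the decision that produced it, so nothing in the sketch depends on remembering an ordinal.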

Interestingly, there are tons of these arbitrary namings in mathematics, and I don't object to all of them. I don't really care, for example, that one kind of function is called a homomorphism and another very different kind of function is called a homeomorphism, even though (as far as I can tell) there's no way to look at those words and determine from their structure which means what. But in this case, and many others, there's a good reason not to mind: a homomorphism is a function you study in algebra, and a homeomorphism is one you study in topology. So if you're a nonmathematician seeing these terms for the first time in this paragraph, I could forgive you for being confused, but the average mathematician probably learned the definitions of those two words in separate classes, probably in separate semesters of undergraduate math. The two concepts are probably separate enough that no one who cares about either one can possibly be confused.

Similarly, before I studied Euclidean geometry in ninth grade, I could never remember which kind of pair of angles was called complementary and which was called supplementary. (To be truthful, it's now been long enough that I will have to go look them up before I can finish this paragraph.) (Okay, looked it up. And guess what? I had it right.) There's an episode of the Cosby Show where Cockroach, complaining to Dr. Huxtable about having to study for a math test, says "I couldn't care less about complementary and supplementary angles." If he meant he didn't want to learn which word meant which thing, and Mrs. Westlake had organized her syllabus such that the two definitions were on the same pop quiz, then I can't say I blame him. But when I took geometry, the notion of a straight angle was introduced a whole week or two before that of a right angle. I'm not sure why that was, but it had the pleasant effect of letting us talk about supplementary angles for quite a while before we had to know about complementary angles. By the time we had to learn the second word, we were comfortable enough with the first word that they didn't interfere. (At least not for me. My classmates might not have agreed.) The difference between the classroom experience and the, well, call it the "dictionary experience" was impressive to me even then.

And don't get me started on confluence and the Church-Rosser property. Because I've spent so much time on these two examples that I'll have to save that one for later.
