Knuth’s -yllion number notation
Introduction
Donald Knuth proposed an alternative way to write out large numbers as English phrases. His scheme needs fewer distinct number names than the conventional notation. Take this number for example: 12,345,678,900,011.
In conventional English notation we write this number as: twelve trillion three hundred forty-five billion six hundred seventy-eight million nine hundred thousand eleven.
12 | 345 | 678 | 900 | 011 |
twelve | three hundred forty-five | six hundred seventy-eight | nine hundred | eleven |
trillion | billion | million | thousand |
In the bottom row of the table above, we see that there is a new number name every three digits. That is to say, there is a name for 10^3 (thousand), 10^6 (million), 10^9 (billion), 10^12 (trillion), etc.
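The three-digit grouping described above can be sketched in a few lines of Python. This is an illustrative sketch rather than the downloadable implementation: the function names are hypothetical, and the range is deliberately limited to non-negative integers below 10^15 so that only the names shown in the table are needed.

```python
# Illustrative sketch; all names here are hypothetical, not taken from the article's code.
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]
# One new name per group of three digits (assumed cutoff: trillion).
SCALES = ["", "thousand", "million", "billion", "trillion"]


def below_thousand(n):
    """Spell out an integer in the range 0..999."""
    parts = []
    if n >= 100:
        parts += [ONES[n // 100], "hundred"]
        n %= 100
    if n >= 20:
        word = TENS[n // 10]
        if n % 10:
            word += "-" + ONES[n % 10]
        parts.append(word)
    elif n > 0 or not parts:
        parts.append(ONES[n])
    return " ".join(parts)


def conventional_words(n):
    """Spell out 0 <= n < 10**15 in conventional short scale notation."""
    if not 0 <= n < 10**15:
        raise ValueError("this sketch only covers 0 <= n < 10**15")
    if n == 0:
        return "zero"
    groups = []  # three-digit groups, least significant first
    while n > 0:
        n, group = divmod(n, 1000)
        groups.append(group)
    parts = []
    for i in reversed(range(len(groups))):
        if groups[i] == 0:
            continue  # an all-zero group contributes no words
        parts.append(below_thousand(groups[i]))
        if SCALES[i]:
            parts.append(SCALES[i])
    return " ".join(parts)


print(conventional_words(12345678900011))
```

The structure is a flat loop: each group of three digits is spelled independently and tagged with its own scale name.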
In Knuth’s -yllion number notation, we group the digits as 12,3456;7890,0011, and write in words: twelve myriad thirty-four hundred fifty-six myllion seventy-eight hundred ninety myriad eleven.
12 | 34 | 56 | 78 | 90 | 00 | 11 |
twelve | thirty-four | fifty-six | seventy-eight | ninety | (zero) | eleven |
 | hundred | | hundred | | (hundred) | |
myriad | | | | myriad | | |
 | | myllion | | | | |
In this case, there is a new number name every time the number of digits doubles. For example, 10^4 is myriad, 10^8 is myllion, 10^16 is byllion, 10^32 is tryllion, etc.
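Because each new name squares the previous unit, the conversion is naturally recursive: split the number at the largest unit that fits, spell the high part, say the unit name, then spell the low part. The Python sketch below assumes only the names listed above (hundred through tryllion), so it covers 0 ≤ n < 10^64; the function names are hypothetical and this is not the downloadable implementation.

```python
# Illustrative sketch; names are hypothetical, not taken from the article's code.
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]
# Each unit is the square of the next one down, largest first.
UNITS = [(10**32, "tryllion"), (10**16, "byllion"), (10**8, "myllion"),
         (10**4, "myriad"), (10**2, "hundred")]


def below_hundred(n):
    """Spell out an integer in the range 0..99."""
    if n < 20:
        return ONES[n]
    word = TENS[n // 10]
    return word if n % 10 == 0 else word + "-" + ONES[n % 10]


def yllion_words(n):
    """Spell out 0 <= n < 10**64 in -yllion notation (sketch)."""
    if not 0 <= n < 10**64:
        raise ValueError("this sketch only covers 0 <= n < 10**64")
    if n < 100:
        return below_hundred(n)
    for unit, name in UNITS:
        if n >= unit:
            # high < unit always holds here, because n is below unit squared.
            high, low = divmod(n, unit)
            phrase = yllion_words(high) + " " + name
            if low:
                phrase += " " + yllion_words(low)
            return phrase


print(yllion_words(12345678900011))
```

Note that under this scheme 1000 comes out as "ten hundred", since the next name after hundred is myriad.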
Source code
Here are programs to demonstrate these two systems for naming numbers. Both the Java and Python implementations behave identically.
- Java: IntegerToWordsDemo.java (main), IntegerToWordsTest.java (JUnit)
- Python: integer-to-words.py
- TypeScript/JavaScript: integer-to-words.ts / integer-to-words.js
Supported features:
- Printing a bunch of random large numbers to standard output as a demonstration. (Sample output)
- Converting an integer −10^66 < n < 10^66 to conventional English short scale notation (thousand, million, billion, etc.).
- Converting an integer −10^8192 < n < 10^8192 to Knuth’s -yllion notation in English (myriad, myllion, byllion, etc.).
- Converting an integer −10^8192 < n < 10^8192 to Knuth’s -yllion notation in Chinese (萬, 億, 兆, 京, etc.).
- Adding separators for an integer in conventional English notation, e.g.: 12345678 → 12,345,678.
- Adding separators for an integer in Knuth’s -yllion notation, e.g.: 12345678901234567890 → 1234:5678,9012;3456,7890.
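The two separator schemes in the last two items can be sketched as follows. The conventional case uses Python's built-in `,` format specifier. The -yllion case is inferred from the examples on this page (comma every 4 digits, semicolon every 8, colon every 16), and because no separator beyond the colon appears here, the hypothetical `yllion_separators` function refuses numbers of 10^32 or more rather than guessing one.

```python
def conventional_separators(n):
    """Insert a comma every three digits, e.g. 12345678 -> 12,345,678."""
    return format(n, ",")


def yllion_separators(n):
    """Insert -yllion separators for 0 <= n < 10**32 (sketch)."""
    if not 0 <= n < 10**32:
        raise ValueError("this sketch only covers 0 <= n < 10**32")
    digits = str(n)
    groups = []  # four-digit groups, rightmost first
    while digits:
        groups.append(digits[-4:])
        digits = digits[:-4]
    result = groups[0]
    for i in range(1, len(groups)):
        # The boundary sits 4*i digits from the right end.
        if i % 2 == 1:
            sep = ","   # odd multiple of 4 digits (myriad boundary)
        elif i % 4 == 2:
            sep = ";"   # odd multiple of 8 digits (myllion boundary)
        else:
            sep = ":"   # 16 digits (byllion boundary); the only case left below 10**32
        result = groups[i] + sep + result
    return result


print(conventional_separators(12345678))
print(yllion_separators(12345678901234567890))
```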
Notes
Parsing conventional number notation is fairly easy for humans. You read a spelled number whose value is between 1 and 999, followed by an -illion word. Then you repeat this reading procedure on the next small block of words. At the large scale, this encoding scheme is quite “flat” in structure.
Parsing Knuth’s -yllion notation gets increasingly difficult with large numbers because you have to keep track of which words have been recently used, in order to determine how much of the previous phrase the -yllion word applies to. Overall, the phrases form a nested block structure that quickly gets confusing for average humans.
The spelling of the names used for the -illion numbers is badly non-uniform. For example, 10^((4+1)×3) is quadrillion but 10^((14+1)×3) is quattuordecillion; 10^((2+1)×3) is billion but 10^((20+1)×3) is vigintillion. (Knuth’s -yllion notation inherits this spelling inconsistency too, but needs extremely large numbers before the problem appears in practice.)
When framed in the language of asymptotic analysis, we can say that for an arbitrary n-digit integer, conventional notation uses Θ(n) different words, whereas the -yllion notation uses only Θ(log n) different words.
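To make the growth rates concrete, the sketch below counts how many distinct scale names above "hundred" can appear in an n-digit number under each system. The helper names are hypothetical, and the counts are upper bounds reached in the worst case, since a number with all-zero groups uses fewer names.

```python
def conventional_name_count(num_digits):
    """Names for 10^3, 10^6, ... that fit in a number with num_digits digits."""
    return max(0, (num_digits - 1) // 3)


def yllion_name_count(num_digits):
    """Names for 10^4, 10^8, 10^16, ... that fit in a number with num_digits digits."""
    # 10^(2^k) has 2^k + 1 digits, so we need 2^k <= num_digits - 1 with k >= 2.
    return max(0, (num_digits - 1).bit_length() - 2)


for digits in (14, 64, 1000):
    print(digits, conventional_name_count(digits), yllion_name_count(digits))
```

For the 14-digit example at the top of the page this gives 4 names (thousand, million, billion, trillion) versus 2 names (myriad, myllion).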
All of this talk of spelling out large numbers in words is moot in practice. Both the conventional system and Knuth’s system are unwieldy and verbose. Realistically, large numbers are written out as plain numeric digits, possibly with some side annotations to keep track of how many digits were written. It is true that large numbers have practical applications and are routinely processed on computers for cryptography (e.g. 2048-bit RSA encryption). But the numbers are handled internally and never need to be seen by humans. Even when the numbers are displayed for debugging purposes, writing out the digits is sufficient, and using words to spell out the numbers is counterproductive.
The algorithms I implemented for converting numbers to phrases in conventional and Knuth’s notations are not the most efficient; I emphasized simple, clear code over absolute performance. For example, YllionEnglishNotation.numberToWords() repeatedly converts between String and BigInteger representations of a number, even though it is possible to convert to String just once. Also, the string concatenation logic can be made more efficient by passing a StringBuilder into the recursive function calls instead of producing immutable Strings as return values.
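As a rough illustration of both optimizations, here is a Python analogue rather than the Java code itself. The number is converted to a decimal string once, the recursion splits that string by length instead of doing big-integer division, and words are appended to a shared list (standing in for a Java StringBuilder) that is joined once at the end. All names are hypothetical, and the sketch assumes 0 ≤ n < 10^64.

```python
# Illustrative Python analogue; names are hypothetical, not the article's Java code.
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]
# UNIT_NAMES[k - 1] names the unit 10^(2^k).
UNIT_NAMES = ["hundred", "myriad", "myllion", "byllion", "tryllion"]


def _emit(digits, out):
    """Append the words for a decimal digit string to the shared list out."""
    digits = digits.lstrip("0")
    if not digits:
        return  # an all-zero block contributes no words
    if len(digits) <= 2:
        n = int(digits)  # small lookup on at most two digits, not a bignum conversion
        if n < 20:
            out.append(ONES[n])
        else:
            word = TENS[n // 10]
            out.append(word if n % 10 == 0 else word + "-" + ONES[n % 10])
        return
    # Largest k such that the unit 10^(2^k) fits in a number of this length.
    k = (len(digits) - 1).bit_length() - 1
    split = 1 << k
    _emit(digits[:-split], out)   # high part
    out.append(UNIT_NAMES[k - 1])
    _emit(digits[-split:], out)   # low part


def yllion_words_accumulating(n):
    """Spell out 0 <= n < 10**64 using one string conversion and one join."""
    if not 0 <= n < 10**64:
        raise ValueError("this sketch only covers 0 <= n < 10**64")
    out = []
    _emit(str(n), out)  # the only integer-to-string conversion
    return " ".join(out) if out else "zero"


print(yllion_words_accumulating(12345678900011))
```

The slicing still creates substrings, so this is a sketch of the idea rather than a tuned implementation.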