Fast QR Code generator library
Introduction
This is an alternate, speed-optimized implementation of my QR library in Java. It contrasts with my standard/
The slow library is available in Java plus 5 other programming languages, but the fast one is only available in Java. To achieve speed-ups, I rewrote parts of the logic to implement caching and optimize low-level arithmetic, at the cost of making the architecture more complicated and the code longer. The fast library can be useful for applications that need to generate large batches of QR Code symbols. Two demo applications are provided to illustrate the usage.
Source code
Browse the project’s source code at GitHub: https://github.com/nayuki/QR-Code-generator/tree/master/java-fast
import java.awt.image.BufferedImage; import java.io.File; import java.util.List; import javax.imageio.ImageIO; import io.nayuki.fastqrcodegen.*; // Simple operation QrCode qr0 = QrCode.encodeText("Hello, world!", QrCode.Ecc.MEDIUM); BufferedImage img = toImage(qr0, 4, 10); // See QrCodeGeneratorDemo ImageIO.write(img, "png", new File("qr-code.png")); // Manual operation List<QrSegment> segs = QrSegment.makeSegments("3141592653589793238462643383"); QrCode qr1 = QrCode.encodeSegments(segs, QrCode.Ecc.HIGH, 5, 5, 2, false); for (int y = 0; y < qr1.size; y++) { for (int x = 0; x < qr1.size; x++) { (... paint qr1.getModule(x, y) ...) } }
Design notes
Here is a high-level explanation of how I took my reference/
BitBuffer
classThe slow bit buffer class is based on
java.util.BitSet
. The fast bit buffer is based on low-level manipulation of an array of 32-bitint
values.The slow buffer only reads or writes one bit at a time, whereas the fast buffer works with many consecutive bits that don’t cross a word boundary.
The speed of the bit buffer impacts the creation and concatenation of QR segments.
QrSegment
classThe slow
makeNumeric()
usesInteger.parseInt()
, but the fast version calculates directly on the character codes.The slow
makeAlphanumeric()
usesString.indexOf()
, whereas the fast version queries a look-up table once per character.The slow
isNumeric()
andisAlphanumeric()
use regular expressions to test the incoming string before starting to encode. The fast variants of the functions perform the character set checking while encoding output bits; if the string is invalid then the partially encoded result is discarded and an exception is thrown.The constructor (which generally shouldn’t be used because it’s quite low-level) takes a raw array and length instead of a
BitBuffer
object, in order to reduce a layer of pointer indirection.Overall, the speed improvement within this class shouldn’t be great. The main speed-up comes from
QrSegment
utilizing the newBitBuffer
class.
ReedSolomonGenerator
classThe QR Code standard uses between 7 to 30 error correction bytes per block. We can create Reed–Solomon encoders for given ECC sizes, and cache them for later reuse. These cached encoders include pre-computed multiplication tables.
Note that QR Code symbols of many versions and error correction levels may share the same RS ECC generator, due to having the same number of error correction bytes per block.
QrTemplate
classSome aspects of a QR Code symbol only depend on the version (size) and not on the user data being conveyed. These pieces of information can be pre-computed and cached, if you generate many QR Codes that have the same version number.
Most of the function patterns stay the same: finders, alignment, timing, version info block, and a single dark module. The format info (mask and error correction level) does vary. Hence, the invariant parts of a QR Code of a particular version are rendered onto an initially blank template.
There is also a temporary template that marks whether each module is for a function pattern versus a bit from a user data/ECC codeword.
The template computes and saves the zigzag scan. Hence the template knows, for each user data/ECC codeword bit, what location the bit is rendered at in the final QR Code symbol.
The template also saves all 8 mask patterns, with respect to the version number and avoidance of function pattern modules.
Memoizer
classThis is a thread-safe function execution cache that is used for
ReedSolomonGenerator
andQrTemplate
.When a worker is given f and x and wants to compute f(x), it first checks if the dictionary (cache) already contains the answer. If the key-value mapping exists and the soft reference’s target is not null, then it immediately returns that value. Note that the soft reference could be cleared by the garbage collector at any time to reclaim memory. Otherwise, the worker checks if another worker is already trying to compute the same value; if so then it waits for the other worker to finish and then it rechecks the dictionary; otherwise it marks the entry as pending, runs the function, stores the result in the dictionary, notifies all waiting workers, and returns the answer.
QrCode
classThe early parts of the QR Code encoding process are the same in the slow and fast versions: making segments, finding a version number and error correction level, and concatenating segments.
The computation of error correction bytes is performed using the fast, cacheable Reed–Solomon generator.
With the version number known and all the data plus ECC bytes, an appropriate QR Code template is cloned as the basis of this barcode.
The format info is computed and drawn. Then the codewords are drawn using the pre-computed zigzag scan.
If automatic masking is requested, then each of the 8 masks is applied, the penalty score is computed, and the mask is removed.
The
getPenaltyScore()
method fuses all the horizontal-scanning loops into one, and likewise for the vertical-scanning loops. This reduces the amount of overhead spent on extracting bits from specific locations in the QR Code symbol.
Automatic mask selection is an expensive feature that seems to resist effective optimization. Surprisingly in the fast library, the evaluation of one penalty score takes more time than all the other steps needed to render a QR Code symbol. When comparing the fast and slow libraries with the same input data and options, the speed-up is greater if manual masking is used, and less if automatic masking is used.
Benchmark timings
I measured the time to generate a single QR barcode, for all versions from 1 to 40, for all 4 error correction levels, for both the slow and fast Java libraries. The data for each case is a random lowercase string (which must be encoded in byte mode), such that the length is the longest supported by the particular version and error correction level. The platform is OpenJDK Java 12.0.1 (64-bit), Intel Core i5-4690 (Haswell) 3.50 GHz, Windows 8.1 Pro (64-bit).
Manual mask selection
The computations performed are rendering the main QR Code content, applying one mask, and not calculating the penalty score. All times are in microseconds (μs). The speed-up factor is the slow time divided by the corresponding fast time.
Version | Slow L | Slow M | Slow Q | Slow H | Fast L | Fast M | Fast Q | Fast H | Speedup L | Speedup M | Speedup Q | Speedup H |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 13 | 10 | 10 | 10 | 4 | 3 | 2 | 2 | 3.40× | 3.88× | 4.38× | 5.06× |
2 | 16 | 16 | 17 | 18 | 3 | 4 | 3 | 2 | 5.04× | 4.37× | 5.99× | 10.33× |
3 | 24 | 27 | 20 | 20 | 5 | 4 | 4 | 3 | 4.52× | 7.05× | 5.33× | 7.59× |
4 | 36 | 29 | 31 | 22 | 6 | 4 | 4 | 4 | 5.71× | 7.47× | 8.74× | 6.23× |
5 | 51 | 44 | 31 | 30 | 5 | 6 | 4 | 4 | 9.57× | 7.88× | 7.66× | 7.17× |
6 | 53 | 44 | 43 | 40 | 6 | 6 | 5 | 5 | 8.61× | 8.03× | 8.16× | 8.14× |
7 | 64 | 52 | 43 | 43 | 7 | 6 | 5 | 5 | 8.80× | 8.75× | 8.17× | 7.86× |
8 | 84 | 69 | 56 | 52 | 9 | 8 | 7 | 7 | 9.68× | 8.89× | 8.32× | 7.89× |
9 | 110 | 80 | 63 | 59 | 11 | 9 | 8 | 8 | 10.18× | 9.00× | 7.71× | 7.56× |
10 | 100 | 99 | 79 | 73 | 11 | 11 | 10 | 9 | 8.79× | 9.20× | 8.15× | 7.78× |
11 | 122 | 124 | 95 | 78 | 13 | 13 | 11 | 10 | 9.20× | 9.50× | 8.38× | 7.41× |
12 | 152 | 121 | 103 | 92 | 16 | 14 | 13 | 12 | 9.66× | 8.79× | 8.07× | 7.59× |
13 | 176 | 138 | 117 | 94 | 18 | 16 | 15 | 13 | 9.91× | 8.84× | 8.01× | 7.13× |
14 | 207 | 157 | 116 | 106 | 20 | 17 | 15 | 15 | 10.11× | 9.02× | 7.64× | 7.31× |
15 | 201 | 177 | 154 | 119 | 21 | 20 | 18 | 16 | 9.42× | 9.03× | 8.39× | 7.28× |
16 | 235 | 204 | 154 | 144 | 24 | 22 | 19 | 19 | 9.71× | 9.18× | 8.00× | 7.58× |
17 | 272 | 226 | 180 | 153 | 27 | 24 | 22 | 20 | 10.01× | 9.34× | 8.35× | 7.58× |
18 | 314 | 239 | 195 | 168 | 31 | 26 | 24 | 22 | 10.07× | 9.07× | 8.08× | 7.51× |
19 | 330 | 264 | 207 | 175 | 33 | 29 | 26 | 24 | 10.03× | 9.10× | 8.08× | 7.31× |
20 | 357 | 282 | 243 | 202 | 36 | 31 | 30 | 27 | 9.88× | 9.01× | 8.18× | 7.47× |
21 | 386 | 301 | 249 | 221 | 38 | 33 | 31 | 29 | 10.02× | 9.07× | 8.13× | 7.56× |
22 | 415 | 341 | 283 | 221 | 42 | 37 | 34 | 31 | 10.00× | 9.29× | 8.29× | 7.19× |
23 | 468 | 373 | 304 | 253 | 46 | 40 | 37 | 34 | 10.08× | 9.25× | 8.30× | 7.49× |
24 | 502 | 397 | 328 | 276 | 50 | 43 | 40 | 37 | 10.02× | 9.18× | 8.29× | 7.53× |
25 | 499 | 431 | 352 | 291 | 51 | 47 | 43 | 39 | 9.85× | 9.22× | 8.22× | 7.41× |
26 | 558 | 458 | 358 | 317 | 56 | 50 | 44 | 43 | 9.93× | 9.19× | 8.09× | 7.43× |
27 | 620 | 486 | 396 | 335 | 62 | 53 | 49 | 45 | 10.00× | 9.09× | 8.16× | 7.43× |
28 | 649 | 514 | 425 | 354 | 65 | 56 | 52 | 48 | 9.99× | 9.14× | 8.24× | 7.44× |
29 | 690 | 546 | 446 | 374 | 69 | 60 | 54 | 51 | 9.95× | 9.11× | 8.18× | 7.40× |
30 | 733 | 587 | 479 | 397 | 74 | 64 | 58 | 54 | 9.93× | 9.15× | 8.22× | 7.37× |
31 | 777 | 621 | 503 | 421 | 78 | 68 | 62 | 57 | 9.93× | 9.14× | 8.11× | 7.36× |
32 | 823 | 657 | 539 | 447 | 83 | 72 | 66 | 61 | 9.94× | 9.12× | 8.19× | 7.34× |
33 | 871 | 694 | 565 | 473 | 88 | 76 | 70 | 64 | 9.92× | 9.11× | 8.13× | 7.35× |
34 | 920 | 732 | 594 | 502 | 93 | 80 | 73 | 68 | 9.93× | 9.10× | 8.09× | 7.37× |
35 | 967 | 769 | 621 | 518 | 97 | 84 | 76 | 70 | 9.99× | 9.18× | 8.14× | 7.37× |
36 | 1019 | 810 | 653 | 551 | 102 | 88 | 80 | 75 | 9.99× | 9.18× | 8.12× | 7.36× |
37 | 1072 | 844 | 686 | 574 | 107 | 93 | 85 | 78 | 10.00× | 9.12× | 8.12× | 7.32× |
38 | 1128 | 890 | 721 | 599 | 113 | 98 | 89 | 82 | 10.01× | 9.10× | 8.12× | 7.29× |
39 | 1174 | 936 | 757 | 636 | 118 | 103 | 93 | 87 | 9.98× | 9.10× | 8.10× | 7.33× |
40 | 1233 | 983 | 796 | 663 | 123 | 108 | 98 | 91 | 10.01× | 9.11× | 8.11× | 7.30× |
Version | Slow L | Slow M | Slow Q | Slow H | Fast L | Fast M | Fast Q | Fast H | Speedup L | Speedup M | Speedup Q | Speedup H |
Automatic mask selection
The computations performed are rendering the main QR Code content, iterating 8 times of {apply mask, calculate penalty score, remove mask}, and applying one final mask. All times are in microseconds (μs). The speed-up factor is the slow time divided by the corresponding fast time.
Version | Slow L | Slow M | Slow Q | Slow H | Fast L | Fast M | Fast Q | Fast H | Speedup L | Speedup M | Speedup Q | Speedup H |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 102 | 94 | 93 | 92 | 63 | 56 | 54 | 53 | 1.61× | 1.68× | 1.71× | 1.73× |
2 | 142 | 137 | 137 | 137 | 86 | 80 | 78 | 77 | 1.64× | 1.71× | 1.75× | 1.79× |
3 | 195 | 191 | 183 | 183 | 117 | 108 | 108 | 106 | 1.66× | 1.76× | 1.70× | 1.73× |
4 | 255 | 242 | 242 | 232 | 151 | 140 | 139 | 137 | 1.69× | 1.73× | 1.74× | 1.69× |
5 | 324 | 310 | 297 | 296 | 185 | 176 | 173 | 174 | 1.75× | 1.76× | 1.72× | 1.70× |
6 | 385 | 371 | 370 | 369 | 224 | 214 | 213 | 214 | 1.72× | 1.74× | 1.74× | 1.72× |
7 | 462 | 443 | 435 | 433 | 268 | 257 | 256 | 255 | 1.72× | 1.72× | 1.70× | 1.70× |
8 | 553 | 534 | 520 | 515 | 316 | 307 | 304 | 304 | 1.75× | 1.74× | 1.71× | 1.69× |
9 | 657 | 625 | 609 | 605 | 369 | 357 | 357 | 356 | 1.78× | 1.75× | 1.71× | 1.70× |
10 | 730 | 726 | 710 | 703 | 422 | 410 | 413 | 412 | 1.73× | 1.77× | 1.72× | 1.71× |
11 | 838 | 843 | 816 | 801 | 479 | 472 | 470 | 473 | 1.75× | 1.79× | 1.74× | 1.69× |
12 | 964 | 935 | 918 | 909 | 543 | 533 | 535 | 536 | 1.77× | 1.75× | 1.72× | 1.70× |
13 | 1086 | 1048 | 1034 | 1011 | 611 | 598 | 603 | 602 | 1.78× | 1.75× | 1.72× | 1.68× |
14 | 1226 | 1180 | 1143 | 1135 | 682 | 671 | 674 | 672 | 1.80× | 1.76× | 1.70× | 1.69× |
15 | 1336 | 1312 | 1296 | 1263 | 757 | 747 | 751 | 750 | 1.77× | 1.76× | 1.73× | 1.68× |
16 | 1487 | 1462 | 1415 | 1404 | 836 | 825 | 828 | 829 | 1.78× | 1.77× | 1.71× | 1.69× |
17 | 1649 | 1604 | 1567 | 1542 | 920 | 908 | 912 | 911 | 1.79× | 1.77× | 1.72× | 1.69× |
18 | 1818 | 1747 | 1716 | 1694 | 1005 | 993 | 1002 | 999 | 1.81× | 1.76× | 1.71× | 1.69× |
19 | 1965 | 1907 | 1863 | 1833 | 1094 | 1085 | 1091 | 1093 | 1.80× | 1.76× | 1.71× | 1.68× |
20 | 2135 | 2067 | 2041 | 2004 | 1184 | 1179 | 1184 | 1187 | 1.80× | 1.75× | 1.72× | 1.69× |
21 | 2282 | 2224 | 2187 | 2165 | 1267 | 1273 | 1280 | 1285 | 1.80× | 1.75× | 1.71× | 1.68× |
22 | 2455 | 2413 | 2377 | 2322 | 1362 | 1372 | 1382 | 1389 | 1.80× | 1.76× | 1.72× | 1.67× |
23 | 2665 | 2599 | 2556 | 2518 | 1470 | 1477 | 1487 | 1497 | 1.81× | 1.76× | 1.72× | 1.68× |
24 | 2869 | 2790 | 2749 | 2705 | 1583 | 1591 | 1605 | 1607 | 1.81× | 1.75× | 1.71× | 1.68× |
25 | 3026 | 2989 | 2942 | 2899 | 1691 | 1699 | 1717 | 1727 | 1.79× | 1.76× | 1.71× | 1.68× |
26 | 3254 | 3196 | 3126 | 3101 | 1804 | 1820 | 1830 | 1842 | 1.80× | 1.76× | 1.71× | 1.68× |
27 | 3498 | 3399 | 3349 | 3304 | 1928 | 1943 | 1956 | 1967 | 1.81× | 1.75× | 1.71× | 1.68× |
28 | 3759 | 3654 | 3580 | 3524 | 2077 | 2086 | 2089 | 2097 | 1.81× | 1.75× | 1.71× | 1.68× |
29 | 4000 | 3885 | 3803 | 3741 | 2209 | 2218 | 2224 | 2233 | 1.81× | 1.75× | 1.71× | 1.68× |
30 | 4237 | 4128 | 4037 | 3972 | 2334 | 2349 | 2364 | 2368 | 1.82× | 1.76× | 1.71× | 1.68× |
31 | 4486 | 4362 | 4265 | 4198 | 2470 | 2481 | 2492 | 2507 | 1.82× | 1.76× | 1.71× | 1.67× |
32 | 4743 | 4608 | 4521 | 4442 | 2615 | 2630 | 2649 | 2651 | 1.81× | 1.75× | 1.71× | 1.68× |
33 | 5001 | 4861 | 4769 | 4686 | 2755 | 2768 | 2791 | 2794 | 1.82× | 1.76× | 1.71× | 1.68× |
34 | 5278 | 5127 | 5022 | 4937 | 2901 | 2917 | 2944 | 2942 | 1.82× | 1.76× | 1.71× | 1.68× |
35 | 5535 | 5375 | 5275 | 5186 | 3045 | 3062 | 3092 | 3093 | 1.82× | 1.76× | 1.71× | 1.68× |
36 | 5818 | 5650 | 5540 | 5450 | 3203 | 3220 | 3249 | 3253 | 1.82× | 1.75× | 1.71× | 1.68× |
37 | 6108 | 5927 | 5821 | 5721 | 3365 | 3379 | 3410 | 3415 | 1.81× | 1.75× | 1.71× | 1.68× |
38 | 6407 | 6216 | 6105 | 6000 | 3518 | 3546 | 3575 | 3587 | 1.82× | 1.75× | 1.71× | 1.67× |
39 | 6699 | 6510 | 6393 | 6287 | 3696 | 3710 | 3743 | 3742 | 1.81× | 1.75× | 1.71× | 1.68× |
40 | 7022 | 6834 | 6705 | 6586 | 3855 | 3886 | 3920 | 3932 | 1.82× | 1.76× | 1.71× | 1.68× |
Version | Slow L | Slow M | Slow Q | Slow H | Fast L | Fast M | Fast Q | Fast H | Speedup L | Speedup M | Speedup Q | Speedup H |
Observations
For manual masking, the fast library is about 8× the speed of the slow library. It represents a good payoff for the amount of new code written.
For automatic masking, the speed-up is only about 1.7×. This figure does not adequately justify the effort and complexity of the new code.
For the fast library, manual masking is about 40 times faster than automatic masking. Using manual masking is a good choice if the input already has good randomness properties, such as the output of a hash function.
For a given version, using less error correction (and storing more useful data) takes somewhat more computation time. This is entirely due to how much work the Reed–Solomon encoding performs; it is unaffected by the templates and rendering logic.