Project Nayuki

RC4 cipher in x86 assembly

The core of the RC4 stream cipher is a very small amount of code, so for fun I implemented it in x86 assembly language to see how fast I could make it run.

Excluding comments and blank lines, my x86 code is 40 lines long, and the main encryption loop consists of only 14 instructions. Interestingly, there are just enough registers on x86 to hold all the relevant values for this algorithm, so no register spills are needed.
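For readers who have not seen RC4 written out, here is a minimal C sketch of the standard algorithm (key scheduling followed by the keystream/XOR loop). It is not the code published on this page: the names Rc4State, rc4_init, rc4_crypt, and rc4_self_test are hypothetical and chosen only for illustration. The self-test uses the widely published vector key "Key", plaintext "Plaintext".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical state layout (an assumption, not the article's struct). */
typedef struct {
    uint8_t i, j;      /* the two running indices */
    uint8_t s[256];    /* the permutation of 0..255 */
} Rc4State;

/* Key scheduling: start from the identity permutation and shuffle it by the key. */
void rc4_init(Rc4State *st, const uint8_t *key, size_t keylen) {
    assert(keylen > 0);  /* an empty key would divide by zero below */
    for (int k = 0; k < 256; k++)
        st->s[k] = (uint8_t)k;
    uint8_t j = 0;
    for (int k = 0; k < 256; k++) {
        j = (uint8_t)(j + st->s[k] + key[k % keylen]);
        uint8_t tmp = st->s[k];
        st->s[k] = st->s[j];
        st->s[j] = tmp;
    }
    st->i = 0;
    st->j = 0;
}

/* Main loop: XOR the keystream into msg in place.
   Encryption and decryption are the same operation. */
void rc4_crypt(Rc4State *st, uint8_t *msg, size_t len) {
    uint8_t i = st->i, j = st->j;
    for (size_t n = 0; n < len; n++) {
        i = (uint8_t)(i + 1);
        j = (uint8_t)(j + st->s[i]);
        uint8_t tmp = st->s[i];   /* swap s[i] and s[j] */
        st->s[i] = st->s[j];
        st->s[j] = tmp;
        msg[n] ^= st->s[(uint8_t)(st->s[i] + st->s[j])];
    }
    st->i = i;
    st->j = j;
}

/* Returns 1 if key "Key" turns "Plaintext" into BB F3 16 E8 D9 40 AF 0A D3. */
int rc4_self_test(void) {
    static const uint8_t expected[9] =
        {0xBB, 0xF3, 0x16, 0xE8, 0xD9, 0x40, 0xAF, 0x0A, 0xD3};
    uint8_t msg[9];
    memcpy(msg, "Plaintext", 9);
    Rc4State st;
    rc4_init(&st, (const uint8_t *)"Key", 3);
    rc4_crypt(&st, msg, 9);
    return memcmp(msg, expected, 9) == 0;
}
```

The loop in rc4_crypt keeps only a handful of values live at once: the two indices, the state array pointer, the message pointer, a loop bound, and a temporary or two. That gives a rough feel for why the register count matters when hand-writing this loop for x86.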

Source code

This code offers a reusable function that performs RC4 encryption and also a demo main() function that runs a self-test and a speed benchmark.


To use this code, compile it on Linux with this command:

Then run the executable with ./rc4-test.

Benchmark results

A quick, informal benchmark on Intel Core 2 Quad Q6600 2.40 GHz (using a single core), Ubuntu 10.04, GCC 4.4.3 gives these numbers:

Therefore, we see that my x86 code is 1.45× as fast as my C code. Manually writing and optimizing assembly code seems to pay off significantly in this case.
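A ratio like this comes from timing both implementations on the same workload and dividing the slower time by the faster one. The following is a minimal sketch of that kind of measurement, not the benchmark code from this page: BufferFn, xor_fill, time_buffer_fn, and speedup are hypothetical names, and xor_fill is only a stand-in workload to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical signature for any routine that processes a buffer in place. */
typedef void (*BufferFn)(uint8_t *buf, size_t len);

/* Trivial stand-in workload (an assumption); swap in the cipher under test. */
void xor_fill(uint8_t *buf, size_t len) {
    for (size_t n = 0; n < len; n++)
        buf[n] ^= 0x5A;
}

/* Returns the CPU seconds spent calling fn `iters` times on a len-byte
   buffer, or -1.0 if the buffer cannot be allocated. */
double time_buffer_fn(BufferFn fn, size_t len, int iters) {
    uint8_t *buf = calloc(len, 1);
    if (buf == NULL)
        return -1.0;
    clock_t start = clock();
    for (int k = 0; k < iters; k++)
        fn(buf, len);
    clock_t end = clock();
    volatile uint8_t sink = buf[0];  /* read the result so the work is observable */
    (void)sink;
    free(buf);
    return (double)(end - start) / CLOCKS_PER_SEC;
}

/* How many times as fast the quicker run is, given both elapsed times. */
double speedup(double slow_seconds, double fast_seconds) {
    assert(fast_seconds > 0.0);
    return slow_seconds / fast_seconds;
}
```

Note that clock() measures CPU time at fairly coarse resolution, so a run should last at least a second or so per implementation before the ratio means much.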


More info