Native hash functions for Java

This is a library of popular cryptographic hash functions implemented in pure Java, along with speed-optimized versions in C, x86 assembly, and x86-64 assembly. The native versions are called from Java through the Java Native Interface (JNI).

Benchmark

Hash function	Java speed	C speed (-O0)	C speed (-O1)	x86-64 speed	Native speedup
MD2	9.42 MiB/s	3.55 MiB/s	4.74 MiB/s		0.50×
MD4	249 MiB/s	145 MiB/s	452 MiB/s	576 MiB/s	2.31×
MD5	187 MiB/s	108 MiB/s	326 MiB/s	418 MiB/s	2.24×
RIPEMD-128	148 MiB/s	63 MiB/s	197 MiB/s	190 MiB/s	1.33×
RIPEMD-160	59 MiB/s	49 MiB/s	139 MiB/s	134 MiB/s	2.36×
RIPEMD-256	139 MiB/s	111 MiB/s	265 MiB/s		1.91×
RIPEMD-320	65 MiB/s	83 MiB/s	196 MiB/s		3.02×
SHA-1	123 MiB/s	110 MiB/s	305 MiB/s	322 MiB/s	2.62×
SHA-224	78 MiB/s	71 MiB/s	118 MiB/s	131 MiB/s	1.68×
SHA-256	78 MiB/s	71 MiB/s	118 MiB/s	131 MiB/s	1.68×
SHA-384	106 MiB/s	106 MiB/s	176 MiB/s	200 MiB/s	1.89×
SHA-512	106 MiB/s	106 MiB/s	176 MiB/s	200 MiB/s	1.89×
Tiger	145 MiB/s	136 MiB/s	284 MiB/s		1.96×
Tiger2	145 MiB/s	136 MiB/s	284 MiB/s		1.96×
Whirlpool	23.7 MiB/s	11.3 MiB/s	44.2 MiB/s	61.5 MiB/s	2.59×

The native speedup column divides the fastest native result in each row by the Java speed. GCC optimization levels above -O1 produced slower C code in most cases. At -O2 and -O3, most algorithms were a few percent slower and SHA-1 was 25% slower. The exceptions were RIPEMD-320, which reached 199 MiB/s at -O2 (versus 196 MiB/s at -O1), and Whirlpool, which reached 46.7 MiB/s at -O3 (versus 44.2 MiB/s at -O1). The x86-64 assembly builds were compiled at -O1, which was about one percent faster than -O0 (no meaningful difference).

All the benchmark results above are based on: CPU = Intel Core 2 Quad Q6600 2.40 GHz (single-threaded), OS = Ubuntu 10.04 (64-bit), compiler = GCC 4.4.3, JVM = OpenJDK 1.6.0_33 HotSpot server. All benchmarks ran in 64-bit mode.
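For readers who want to reproduce figures in the same unit, here is a minimal sketch of measuring hashing throughput in MiB/s. It is not the author's BenchmarkHashes program: it times the JDK's built-in MessageDigest as a stand-in, and the class name, method name, chunk size, and warm-up length are all assumptions made for illustration.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class ThroughputSketch {
    /** Hashes at least minBytes of zero bytes and returns the speed in MiB/s. */
    static double measureMiBPerSec(String algorithm, long minBytes) {
        MessageDigest md;
        try {
            md = MessageDigest.getInstance(algorithm);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException(e);
        }
        byte[] chunk = new byte[1 << 16];  // 64 KiB per update() call (arbitrary choice)
        long hashed = 0;
        long start = System.nanoTime();
        while (hashed < minBytes) {
            md.update(chunk);
            hashed += chunk.length;
        }
        md.digest();  // Include the final padding step in the timing
        long elapsedNanos = Math.max(1, System.nanoTime() - start);
        return (hashed / 1048576.0) / (elapsedNanos / 1e9);  // 1 MiB = 2^20 bytes
    }

    public static void main(String[] args) {
        measureMiBPerSec("SHA-1", 16L << 20);  // Warm-up pass so the JIT compiles the hot loop
        System.out.printf("SHA-1: %.1f MiB/s%n", measureMiBPerSec("SHA-1", 64L << 20));
    }
}
```

The warm-up pass matters on HotSpot, because the first iterations run in the interpreter and would otherwise drag the average down.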

Source code

Browse the full source code at GitHub: https://github.com/nayuki/Native-hashes-for-Java

Or download a ZIP of all the files: https://github.com/nayuki/Native-hashes-for-Java/archive/master.zip

The code is open source under the MIT License.

Overview

How to use one of these hash functions:

import nayuki.nativehash.*;

Sha1 hasher = new Sha1();
byte[] b = (... read stuff ...);
hasher.update(b);
byte[] hash = hasher.getHash();
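
The snippet above leaves the reading step open. As a rough sketch of one way to fill it in, the following feeds an input stream to a hasher chunk by chunk. It uses the JDK's MessageDigest as a stand-in because it has the same update-then-finish shape; with this library, the corresponding calls would presumably be hasher.update(...) and hasher.getHash(), but whether update accepts an offset and length is an assumption. The class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class StreamHashSketch {
    /** Reads the stream to its end in 8 KiB chunks and returns the SHA-1 hash. */
    static byte[] sha1Of(InputStream in) {
        try {
            MessageDigest hasher = MessageDigest.getInstance("SHA-1");
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1)
                hasher.update(buf, 0, n);  // Only the first n bytes of buf are valid
            return hasher.digest();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Formats a byte array as lowercase hexadecimal. */
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes)
            sb.append(String.format("%02x", b));
        return sb.toString();
    }

    public static void main(String[] args) {
        InputStream in = new ByteArrayInputStream("abc".getBytes(StandardCharsets.US_ASCII));
        System.out.println(toHex(sha1Of(in)));  // prints a9993e364706816aba3e25717850c26c9cd0d89d
    }
}
```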

The source code comes in a number of parts:

Java programs

Examples: java/demo/sha1sum.java, java/demo/nayuki/nativehash/BenchmarkHashes.java

These are main programs runnable from the command line. They illustrate how this hashing library is used in practice.

Java hashers

Examples: java/src/nayuki/nativehash/BlockHasher.java, java/src/nayuki/nativehash/Md5.java

These implement the non-speed-critical parts of the hash function such as initialization, block accumulation, final padding, and hash value serialization. The compression function can be either pure Java or native code.
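
To make "block accumulation" concrete, here is a minimal sketch of how a hasher can buffer arbitrary-length updates and hand only whole blocks to a compression function. It is not the code of BlockHasher.java; the class name, the compress signature, and the demo subclass are hypothetical.

```java
abstract class BlockAccumulatorSketch {
    private final byte[] block;  // Holds a partially filled block between updates
    private int filled = 0;      // Number of valid bytes in block

    BlockAccumulatorSketch(int blockLength) {
        block = new byte[blockLength];
    }

    /** Processes whole blocks in msg[off .. off+len); len is a multiple of the block length. */
    protected abstract void compress(byte[] msg, int off, int len);

    public void update(byte[] b, int off, int len) {
        if (filled > 0) {  // Top up the partially filled block first
            int n = Math.min(block.length - filled, len);
            System.arraycopy(b, off, block, filled, n);
            filled += n;
            off += n;
            len -= n;
            if (filled < block.length)
                return;  // Input exhausted before the block filled up
            compress(block, 0, block.length);
            filled = 0;
        }
        int whole = len / block.length * block.length;
        if (whole > 0)
            compress(b, off, whole);  // Whole blocks go straight from the caller's array
        System.arraycopy(b, off + whole, block, 0, len - whole);  // Keep the tail for later
        filled = len - whole;
    }

    int buffered() {
        return filled;
    }

    /** Demo subclass: counts the bytes passed to compress() instead of hashing them. */
    static final class Counting extends BlockAccumulatorSketch {
        long compressedBytes = 0;
        Counting(int blockLength) { super(blockLength); }
        @Override protected void compress(byte[] msg, int off, int len) { compressedBytes += len; }
    }

    /** Feeds chunks of the given sizes to a 64-byte-block accumulator; returns {compressed, buffered}. */
    static long[] demo(int... chunkSizes) {
        Counting h = new Counting(64);
        for (int size : chunkSizes)
            h.update(new byte[size], 0, size);
        return new long[]{h.compressedBytes, h.buffered()};
    }

    public static void main(String[] args) {
        long[] r = demo(10, 100, 18);
        System.out.println(r[0] + " bytes compressed, " + r[1] + " buffered");  // prints 128 bytes compressed, 0 buffered
    }
}
```

Because compress only ever sees whole blocks, the same accumulator can sit in front of either a pure Java or a native compression function.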

Java compressors

Example: java/test/nayuki/nativehash/Sha1Java.java

These implement the speed-critical compression function of each hash function in pure Java. These classes provide full hashing capability on any platform (even if the C or assembly code is ignored), a reference implementation to check that the native implementation produces the same values, and a comparison point for speed benchmarking (to see how much gain the native code yields).
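
As an illustration of what a pure Java compressor involves, here is a textbook SHA-1 compression function written from the published algorithm, plus just enough padding code to check it against the JDK. It is not the contents of Sha1Java.java; the class name and method signatures are hypothetical.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

class Sha1CompressSketch {
    /** Applies the SHA-1 compression function to each 64-byte block in msg[off .. off+len). */
    static void compress(int[] state, byte[] msg, int off, int len) {
        if (len % 64 != 0)
            throw new IllegalArgumentException("Length must be a multiple of 64");
        int[] w = new int[80];
        for (int end = off + len; off < end; off += 64) {
            for (int i = 0; i < 16; i++)  // Load the block as 16 big-endian words
                w[i] = (msg[off + i * 4] & 0xFF) << 24 | (msg[off + i * 4 + 1] & 0xFF) << 16
                     | (msg[off + i * 4 + 2] & 0xFF) << 8 | (msg[off + i * 4 + 3] & 0xFF);
            for (int i = 16; i < 80; i++)  // Expand the message schedule
                w[i] = Integer.rotateLeft(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1);
            int a = state[0], b = state[1], c = state[2], d = state[3], e = state[4];
            for (int i = 0; i < 80; i++) {
                int f, k;
                if (i < 20)      { f = (b & c) | (~b & d);          k = 0x5A827999; }
                else if (i < 40) { f = b ^ c ^ d;                   k = 0x6ED9EBA1; }
                else if (i < 60) { f = (b & c) | (b & d) | (c & d); k = 0x8F1BBCDC; }
                else             { f = b ^ c ^ d;                   k = 0xCA62C1D6; }
                int t = Integer.rotateLeft(a, 5) + f + e + k + w[i];
                e = d;
                d = c;
                c = Integer.rotateLeft(b, 30);
                b = a;
                a = t;
            }
            state[0] += a; state[1] += b; state[2] += c; state[3] += d; state[4] += e;
        }
    }

    /** One-shot SHA-1: pads the message, compresses it, and serializes the state. */
    static byte[] hash(byte[] msg) {
        int[] state = {0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0};
        int paddedLen = (msg.length + 9 + 63) / 64 * 64;  // Room for 0x80 and the 8-byte length
        byte[] padded = Arrays.copyOf(msg, paddedLen);
        padded[msg.length] = (byte) 0x80;
        long bitLen = (long) msg.length * 8;
        for (int i = 0; i < 8; i++)  // Big-endian bit length in the last 8 bytes
            padded[paddedLen - 1 - i] = (byte) (bitLen >>> (i * 8));
        compress(state, padded, 0, paddedLen);
        byte[] out = new byte[20];
        for (int i = 0; i < 20; i++)
            out[i] = (byte) (state[i / 4] >>> (24 - i % 4 * 8));
        return out;
    }

    /** Returns true if this sketch agrees with the JDK's SHA-1 on the given message. */
    static boolean matchesJdk(byte[] msg) {
        try {
            return Arrays.equals(hash(msg), MessageDigest.getInstance("SHA-1").digest(msg));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        for (int len : new int[]{0, 3, 55, 56, 64, 1000})
            System.out.println("length " + len + ": " + (matchesJdk(new byte[len]) ? "match" : "MISMATCH"));
    }
}
```

Checking against an independent implementation, as matchesJdk does here, is the same idea as using the Java compressor as a reference for the native one.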

Java main tests

Examples: java/test/nayuki/nativehash/HashTest.java, java/test/nayuki/nativehash/Sha256Test.java

These test suites include test vectors (known input-output values) for each hash function, as well as generic tests for block splitting equivalence and Java vs. native implementation value checking. For production usage with native code only, the entire java/test directory can be disregarded.
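
A block splitting equivalence test checks that hashing a message in pieces gives the same result as hashing it in one call. Here is a minimal sketch of that idea, again using the JDK's MessageDigest as a stand-in for this library's hashers. The class and method names are hypothetical and are not taken from HashTest.java.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.Random;

class SplitEquivalenceSketch {
    /** Hashes msg in one call and again as two pieces cut at index split; returns true if equal. */
    static boolean splitMatchesWhole(String algorithm, byte[] msg, int split) {
        try {
            byte[] whole = MessageDigest.getInstance(algorithm).digest(msg);
            MessageDigest md = MessageDigest.getInstance(algorithm);
            md.update(msg, 0, split);
            md.update(msg, split, msg.length - split);
            return Arrays.equals(whole, md.digest());
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** Tries every split point from 0 to msg.length inclusive. */
    static boolean allSplitsMatch(String algorithm, byte[] msg) {
        for (int split = 0; split <= msg.length; split++) {
            if (!splitMatchesWhole(algorithm, msg, split))
                return false;
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] msg = new byte[200];  // Spans several 64-byte blocks
        new Random(0).nextBytes(msg);
        System.out.println("SHA-256 split equivalence: " + allSplitsMatch("SHA-256", msg));
    }
}
```

Trying every split point exercises the cases where a piece ends exactly on, just before, and just after a block boundary.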

C wrappers for JNI

Example: native/sha512-jni.c

These implement the interfacing and data conversion between Java data types and native data types. The Java code calls into these native functions, which do some processing before calling the native compression functions.
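
When the JVM links a native method by name (that is, without explicit registration through RegisterNatives), the C function must be named according to the JNI convention: Java_, then the fully qualified class name, then the method name, with special characters escaped. The sketch below computes that name for simple cases. Whether this project relies on name-based linking is an assumption, and the class and method names passed in main are hypothetical.

```java
class JniNameSketch {
    /** Returns the C symbol the JVM looks up for a non-overloaded native method. */
    static String jniSymbol(String className, String methodName) {
        return "Java_" + mangle(className) + "_" + mangle(methodName);
    }

    /** Applies the JNI escapes for names: separators become '_', '_' becomes "_1", others "_0xxxx". */
    private static String mangle(String name) {
        StringBuilder sb = new StringBuilder();
        for (char c : name.toCharArray()) {
            boolean alphanumeric = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9');
            if (alphanumeric)
                sb.append(c);
            else if (c == '.' || c == '/')
                sb.append('_');      // Package separator
            else if (c == '_')
                sb.append("_1");     // Literal underscore
            else
                sb.append(String.format("_0%04x", (int) c));  // Any other UTF-16 code unit
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical class and method names, chosen only to show the pattern
        System.out.println(jniSymbol("nayuki.nativehash.Sha512", "compress"));
        // prints Java_nayuki_nativehash_Sha512_compress
    }
}
```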

C and assembly compressors

Examples: native/ripemd160-compress.c, native/whirlpool-compress-x86.S

These implement the hash function’s main compression function in C code (good for any platform), or assembly code for the x86 or x86-64 CPU instruction sets. For the x86/x86-64 code, SSE2 is often required.

Instructions

To build and run the code, follow these steps (Linux only, not supported on Windows):

Download the full ZIP archive from the repository and unpack it.
Compile the Java classes as per the normal procedure: with javac, with your favorite Java build system, or by setting up a project in an IDE. Compiling the runnable main test classes (all the *Test.java files) is optional.
For the native code, choose whether you want to use the hash compression functions from C, x86, or x86-64.
- If using the C compression functions, then invoke the C compiler with a command like this (split into multiple lines for clarity):
```
gcc -Wall -shared -fPIC -O1 \
    -I /usr/lib/jvm/java-1.6.0-openjdk/include/ \
    -o libnayuki-native-hashes.so \
    native/*.c
```
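  Note that you will need to change the -I argument to the include directory of your own Java installation, which contains jni.h and other C/C++ header files. Depending on the JDK, you may also need a second -I argument for its platform subdirectory (such as include/linux), where jni_md.h is often located.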
- If using the x86 compression functions, then invoke the C compiler with a command like this (some hashes don’t have an x86 implementation):
```
gcc -Wall -shared -fPIC -O1 \
    -I /usr/lib/jvm/java-1.6.0-openjdk/include/ \
    -o libnayuki-native-hashes.so \
    native/*-jni.c  native/*-x86.S \
    native/{md2,ripemd256,ripemd320,tiger}-compress.c
```
- If using the x86-64 compression functions, then invoke the C compiler with a command like this (some hashes don’t have an x86-64 implementation):
```
gcc -Wall -shared -fPIC -O1 \
    -I /usr/lib/jvm/java-1.6.0-openjdk/include/ \
    -o libnayuki-native-hashes.so \
    native/*-jni.c  native/*-x8664.S \
    native/{md2,ripemd256,ripemd320,tiger}-compress.c
```
Now you should have produced a file libnayuki-native-hashes.so in your current working directory. This will be loaded by the JVM later.
To let the Java VM load the native library at run time, either add the directory containing libnayuki-native-hashes.so to the environment variable LD_LIBRARY_PATH or set the java.library.path system property with the -D option of the JVM.

For example: java -Djava.library.path=/home/user/nativehash -cp /home/user/nativehash/javabin nayuki.nativehash.Sha256Test
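
If the JVM cannot find the library, the first call into native code typically fails with an UnsatisfiedLinkError. A small diagnostic along these lines can help check the path setup before running the real programs. It is only a sketch: the library name nayuki-native-hashes is inferred from the file name libnayuki-native-hashes.so, and the class and method names are hypothetical.

```java
class LoadCheckSketch {
    /** Tries to load a native library by its short name; returns false if the JVM cannot find it. */
    static boolean tryLoad(String libName) {
        try {
            System.loadLibrary(libName);
            return true;
        } catch (UnsatisfiedLinkError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String name = "nayuki-native-hashes";  // Assumed short name, inferred from the .so file name
        System.out.println("java.library.path  = " + System.getProperty("java.library.path"));
        System.out.println("File name expected = " + System.mapLibraryName(name));
        System.out.println("Loaded             = " + tryLoad(name));
    }
}
```

Run it with the same -Djava.library.path or LD_LIBRARY_PATH setting that you intend to use for the hashing programs.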

Notes

This was my first project involving the use of JNI, and it turned out quite well. After the initial hurdle of configuring paths and stuff properly and understanding the usage model, everything was a matter of straightforward effort (not much guessing, cleverness, or debugging).
This project derives the core Java design and hash function implementations from my cryptography library (not officially published or supported), and the native code comes from my numerous hash functions in C and x86 assembly implemented and published previously (without the Java interfacing).
In the interest of simplicity, this project omits a number of nice features implemented in the Project79068 Cryptography Library, such as: separate objects for hash functions vs. hashers; a convenience function for hashing a byte array without explicitly creating a hasher; smart hash value objects that can be compared to each other and serialized as bytes or as hexadecimal strings; and unification of certain hashers that have very similar implementation details.