Project Nayuki

Forcing a file’s CRC to any value

CRC-32 demos ZIP file

Example files:

By modifying any 4 consecutive bytes in a file, you can change the file’s CRC to any value you choose. This technique is useful for vanity purposes, and also proves that the CRC-32 hash is extremely malleable and unsuitable for protecting a file from intentional modifications.

This page provides programs to demonstrate this technique and includes an explanation of the mathematics involved. Note that the technique is entirely algebraic – without guessing or using brute force, the program calculates the necessary modification to the file in one shot.

Source code

HxD hex editor

Compile: javac
Run: java forcecrc32 FileName ByteOffset NewCrc32Value
Example: java forcecrc32 picture.bmp 31337 DEADBEEF


Usage: python FileName ByteOffset NewCrc32Value
Example: python document.txt 94 CAFEF00D

C: forcecrc32.c

Compile: cc forcecrc32.c -o forcecrc32
Run: ./forcecrc32 FileName ByteOffset NewCrc32Value
Example: ./forcecrc32 sound.wav 2807 88111188


Review of CRC-32

These are the steps to calculate the CRC-32 hash of any file or message:

  1. Append the bytes 00 00 00 00 to the message.

  2. XOR the initial 4 bytes by FF FF FF FF.

  3. Interpret the sequence of bytes as a (very long) list of polynomial coefficients. Earlier bytes are assigned higher powers. Less significant bits are assigned higher powers. So for example, the hexadecimal byte sequence 01 C0 represents the sequence of polynomial coefficients 10000000 00000011, which represents the polynomial \(x^{15} + x^1 + x^0\).

  4. Divide this preprocessed message polynomial by the CRC-32 generator polynomial[0] in \(\text{GF}(2)\) and keep only the remainder.

  5. Interpret the remainder as a 32-bit integer by mapping less significant bits to higher powers. For example, the polynomial \(x^{31} + x^3 + x^1 + x^0\) is represented by the integer 0xD0000001.

  6. XOR the value by 0xFFFFFFFF to obtain the CRC.

Forcing the CRC

The core of the CRC algorithm (including CRC-32 in particular) is polynomial division, where the polynomials have coefficients drawn from the finite field \(\text{GF}(2)\).

But to do more advanced analysis on the properties of CRCs, we need to adopt a more formal view of things. The underlying field for the coefficients is always \(\text{GF}(2)\). The generator polynomial defines a quotient ring[1], and now every polynomial we work with is an element of this ring. So the goal is to take the preprocessed message polynomial \(M(x)\) (which may have a degree greater than the generator polynomial’s degree) and reduce it to its canonical form by doing polynomial division. Remember that from now on, we treat each polynomial as a single object, and the result of each arithmetic operation is a polynomial object.

To force the CRC to any value we desire, read the following observations and derivation:

  1. Let \(G(x)\) be the CRC generator polynomial.

  2. Let \(n\) be the bit length of the original message. Let \(k\) be the chosen bit offset from the start of the message where the message will be modified.

  3. Let \(M(x)\) be the preprocessed message polynomial.

  4. Let \(R(x) \equiv M(x) \!\mod\! G(x)\) be the remainder polynomial, having degree less than \(G\)’s degree.

  5. Let \(R'(x)\) be the new remainder chosen by the user.

  6. Let \(D(x) \equiv R'(x) - R(x) \!\mod\! G(x)\). This is the delta for the remainder.

  7. The act of modifying degree-of-\(G\) bits of the message \(M(x)\) at bit offset \(k\) means XORing those bits by some polynomial \(E(x)\).

  8. Let \(M'(x)\) be the preprocessed message after modification, which we will compute.

  9. Let \(E(x)\) be the delta applied to the message bits that will be modified, which we will compute.

  10. Algebraically, we have that \(M'(x) \equiv M(x) + E(x) \, x^{n - k} \!\mod\! G(x)\).

  11. We want \(M'(x) \equiv R'(x) \!\mod\! G(x)\). But we know \(M(x) \equiv R(x) \!\mod\! G(x)\).

  12. Substitute and subtract to get \(E(x) \equiv D(x) \, (x^{n-k})^{-1} \mod G(x)\).

To summarize, in order to compute how to modify the message bytes, we need to calculate the difference between the current CRC and the desired CRC, calculate a polynomial power mod \(G(x)\), calculate a reciprocal mod \(G(x)\), and calculate a product mod \(G(x)\).


More info