Cryptographic primitives in plain Python
Do you want to learn how a cipher like AES or a hash function like SHA-256 is actually computed?
Here I present popular crypto algorithms in straightforward Python code, with logic that is easy to follow.
Source code
Download the complete package:
crypto-primitives-plain-python.zip
Or browse individual files:
- Hash functions:
  - md2hash.py (MD2, Message Digest)
  - md4hash.py (MD4)
  - md5hash.py (MD5)
  - sha1hash.py (SHA-1, Secure Hash Algorithm)
  - sha256hash.py (SHA-256)
  - sha512hash.py (SHA-512)
  - sha3hash.py (SHA-3 family: SHA3-224, SHA3-256, SHA3-384, SHA3-512)
  - skeinhash.py (Skein)
  - whirlpoolhash.py (Whirlpool)
- Ciphers:
  - aescipher.py (AES, Advanced Encryption Standard)
  - blowfishcipher.py (Blowfish)
  - descipher.py (DES, Data Encryption Standard)
  - ideacipher.py (IDEA, International Data Encryption Algorithm)
  - teacipher.py (TEA, Tiny Encryption Algorithm)
  - twofishcipher.py (Twofish)
- Miscellaneous:
The code is open source under the MIT license.
Explanation
Modern digital cryptography might look like black magic to the novice programmer. But actually, cryptographic functions are built from sequences of basic operations. These operations include addition, bitwise XOR, bit shifting, table look-up, looping, et cetera.
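To make this concrete, here is a minimal sketch of the TEA block cipher, which needs nothing beyond addition, XOR, bit shifting, and a loop (no table look-up). It is written for this explanation and is not the code from teacipher.py; the function names and the big-endian byte order are my own assumptions.

```python
MASK32 = 0xFFFFFFFF   # Keeps results within 32 bits, like uint32 arithmetic in C
DELTA = 0x9E3779B9    # TEA's round constant
NUM_CYCLES = 32       # The standard number of cycles


def _unpack_words(data, count):
    # Split a byte sequence into 32-bit words (big-endian is an assumption here)
    return [int.from_bytes(data[i * 4 : i * 4 + 4], "big") for i in range(count)]


def tea_encrypt_block(block, key):
    """Encrypt one 8-byte block under a 16-byte key; returns new bytes."""
    block, key = bytes(block), bytes(key)
    if len(block) != 8 or len(key) != 16:
        raise ValueError("Need an 8-byte block and a 16-byte key")
    v0, v1 = _unpack_words(block, 2)
    k = _unpack_words(key, 4)
    total = 0
    for _ in range(NUM_CYCLES):
        total = (total + DELTA) & MASK32
        v0 = (v0 + (((v1 << 4) + k[0]) ^ (v1 + total) ^ ((v1 >> 5) + k[1]))) & MASK32
        v1 = (v1 + (((v0 << 4) + k[2]) ^ (v0 + total) ^ ((v0 >> 5) + k[3]))) & MASK32
    return v0.to_bytes(4, "big") + v1.to_bytes(4, "big")


def tea_decrypt_block(block, key):
    """Undo tea_encrypt_block by running the same steps in reverse order."""
    block, key = bytes(block), bytes(key)
    if len(block) != 8 or len(key) != 16:
        raise ValueError("Need an 8-byte block and a 16-byte key")
    v0, v1 = _unpack_words(block, 2)
    k = _unpack_words(key, 4)
    total = (DELTA * NUM_CYCLES) & MASK32
    for _ in range(NUM_CYCLES):
        v1 = (v1 - (((v0 << 4) + k[2]) ^ (v0 + total) ^ ((v0 >> 5) + k[3]))) & MASK32
        v0 = (v0 - (((v1 << 4) + k[0]) ^ (v1 + total) ^ ((v1 >> 5) + k[1]))) & MASK32
        total = (total - DELTA) & MASK32
    return v0.to_bytes(4, "big") + v1.to_bytes(4, "big")
```

Encrypting a block and then decrypting it with the same key should give back the original 8 bytes, which is an easy sanity check to run while inserting print statements inside the loop.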
The fact that cryptography is accomplished via arithmetic isn’t so obvious these days. In practice, ciphers and hash functions are hidden behind libraries with tidy interfaces. And the source code for these crypto libraries is often written in intimidating programming languages like C (with lots of preprocessor macros), assembly, or even HDL. In high-level languages like Python, using a cryptographic function usually means calling a native function that was written in C and compiled to machine code, rather than one implemented in pure Python, because pure-Python code is much slower for this kind of work.
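As an illustration of such a tidy interface, this is how a SHA-256 hash is typically computed with Python's standard hashlib module. In CPython the hashing itself runs in compiled native code, so none of the underlying arithmetic is visible from the caller's side.

```python
import hashlib

# One call hides the whole algorithm: padding, message schedule, and 64 rounds
digest = hashlib.sha256(b"abc").hexdigest()
print(digest)
# ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad
```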
To help the curious programmer who wants to understand what really happens inside of ciphers and other cryptographic primitives, I wrote implementations of popular crypto algorithms in plain, straightforward Python code and published them here on this page.
My code is optimized for clarity and simplicity, not speed or memory usage. It’s easy to insert print statements into the code to examine intermediate data values. My hope is that once you understand how these implementations work, you can translate the abstract algorithms to any language of your choice or start optimizing for performance.
Some conventions to note about the code:
- All functions (public and private) take input values as arguments and return new output values. They do not modify any lists or data structures in place. Also, all functions are pure and do not write any global state.
- Public functions take either bytes, bytearray, list of int, or tuple of int as input; they return bytes or bytearray as output. Private (internal) functions usually use bytes or tuple of int, both of which are immutable; this provides extra defense against accidental programming errors.
- Because the cryptographic functions use byte lists as input/output, the module cryptocommon provides utility functions to convert between byte lists, hexadecimal strings, and ASCII strings.
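The conventions above can be sketched roughly as follows. This is only an illustration: the function names here are hypothetical and are not the actual names used in cryptocommon or the other modules, and the conversions simply lean on Python's built-in bytes methods.

```python
def xor_bytes(a, b):
    """Return new bytes holding the byte-wise XOR of two equal-length inputs.

    Accepts bytes, bytearray, list of int, or tuple of int.
    Neither argument is modified.
    """
    a = bytes(a)  # Copy into an immutable type before doing any work
    b = bytes(b)
    if len(a) != len(b):
        raise ValueError("Inputs must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


# Hypothetical conversion helpers, in the spirit of what the text describes
def hex_to_bytes(s):
    return bytes.fromhex(s)


def bytes_to_hex(b):
    return bytes(b).hex()


def ascii_to_bytes(s):
    return s.encode("ascii")


def bytes_to_ascii(b):
    return bytes(b).decode("ascii")


message = [0x01, 0x02, 0x03]            # A mutable list of int as input
result = xor_bytes(message, (3, 2, 1))  # A tuple of int works as well
print(bytes_to_hex(result))             # Hex form is convenient for test vectors
```

Because the inputs are copied into bytes at the start, the caller's list stays untouched, and any accidental attempt to assign into the internal values raises an error instead of silently corrupting data.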
For my serious, non-pedagogical implementations of cryptographic hashes, which are optimized for speed, see these pages/projects:
- Project Nayuki: Various hash functions in C and x86 assembly
- Project Nayuki: Native hash functions for Java (mainly C + x86 code, with a thin Java wrapper)
- GitHub: Project79068 Cryptography Library (pure Java implementation, with framework to reduce repeated code)