Crack legacy ZIP encryption with Biham and Kocher's known-plaintext attack.
A ZIP archive may contain many entries whose content can be compressed and encrypted. In particular, entries can be encrypted with a password-based symmetric encryption algorithm referred to as traditional PKWARE encryption, legacy encryption, or ZipCrypto. This algorithm generates a pseudo-random stream of bytes (keystream) which is XORed with the entry's content (plaintext) to produce encrypted data (ciphertext).
The generator's state, made of three 32-bit integers, is initialized using the password and then continuously updated with plaintext as encryption continues. This encryption algorithm is vulnerable to known-plaintext attacks, as shown by Eli Biham and Paul C. Kocher in the research paper "A known plaintext attack on the PKZIP stream cipher".
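The keystream generator described above can be sketched in Python as follows. This is a minimal illustration, not bkcrack's code: the names ZipCryptoState, encrypt and decrypt are hypothetical, the constants are those of the publicly documented traditional PKWARE scheme, and the sketch leaves out the 12-byte encryption header that real ZIP entries prepend to their data.

```python
MASK32 = 0xFFFFFFFF


def _make_crc_table():
    """Build the standard reflected CRC-32 table (polynomial 0xEDB88320)."""
    table = []
    for i in range(256):
        c = i
        for _ in range(8):
            c = (c >> 1) ^ 0xEDB88320 if c & 1 else c >> 1
        table.append(c)
    return table


CRC_TABLE = _make_crc_table()


def crc32_step(crc, byte):
    """Advance a CRC-32 register by one input byte."""
    return (crc >> 8) ^ CRC_TABLE[(crc ^ byte) & 0xFF]


class ZipCryptoState:
    """Hypothetical helper holding the three 32-bit integers of the state."""

    def __init__(self, password=b""):
        # Fixed initial values, then one update per password byte.
        self.key0, self.key1, self.key2 = 0x12345678, 0x23456789, 0x34567890
        for b in password:
            self.update(b)

    def update(self, plain_byte):
        """Update the state with one plaintext (or password) byte."""
        self.key0 = crc32_step(self.key0, plain_byte)
        self.key1 = ((self.key1 + (self.key0 & 0xFF)) * 0x08088405 + 1) & MASK32
        self.key2 = crc32_step(self.key2, self.key1 >> 24)

    def keystream_byte(self):
        """Derive the next keystream byte from key2 only."""
        t = (self.key2 | 2) & 0xFFFF
        return ((t * (t ^ 1)) >> 8) & 0xFF


def encrypt(password, plaintext):
    state = ZipCryptoState(password)
    out = bytearray()
    for p in plaintext:
        out.append(p ^ state.keystream_byte())
        state.update(p)  # the state is updated with the plaintext byte
    return bytes(out)


def decrypt(password, ciphertext):
    state = ZipCryptoState(password)
    out = bytearray()
    for c in ciphertext:
        p = c ^ state.keystream_byte()
        out.append(p)
        state.update(p)  # same update as encryption, using recovered plaintext
    return bytes(out)


# Round trip with an illustrative password and message.
message = b"known plaintext attack demo"
assert decrypt(b"secret", encrypt(b"secret", message)) == message
```

Because each keystream byte depends only on the current state, and the state depends only on the password and the plaintext processed so far, anyone who holds the three integers at some position can decrypt from that position onward without the password.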
Given ciphertext and 12 or more bytes of the corresponding plaintext, the internal state of the keystream generator can be recovered. This internal state is enough to decipher the ciphertext entirely, as well as other entries encrypted with the same password. It can also be used to brute-force the password with a complexity of n^(l-6), where n is the size of the character set and l is the length of the password.
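The starting point of a known-plaintext attack on any XOR stream cipher is that known plaintext bytes reveal the matching keystream bytes. The sketch below shows only that first step, not the state recovery itself; the function name known_keystream and the sample bytes are made up for illustration, and the offset parameter assumes the position of the known bytes within the entry is known.

```python
def known_keystream(ciphertext, plaintext, offset=0):
    """Return keystream bytes implied by plaintext known at the given offset.

    Since ciphertext = plaintext XOR keystream, it follows that
    keystream = ciphertext XOR plaintext for every position where both are known.
    """
    segment = ciphertext[offset:offset + len(plaintext)]
    if len(segment) != len(plaintext):
        raise ValueError("known plaintext extends past the end of the ciphertext")
    return bytes(c ^ p for c, p in zip(segment, plaintext))


# Illustrative values only, not taken from a real archive.
sample_ciphertext = bytes.fromhex("a1b2c3d4")
sample_plaintext = bytes.fromhex("01020304")
assert known_keystream(sample_ciphertext, sample_plaintext) == bytes.fromhex("a0b0c0d0")
```

The attack then searches for internal states consistent with these keystream bytes, which is the part that a tool such as bkcrack automates.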
bkcrack is a command-line tool that implements this known-plaintext attack. The main features are:
Recover internal state from ciphertext and plaintext.
Change a ZIP archive's password using the internal state.
Recover the original password from the internal state.
Additional information and source code can be found here: https://github.com/kimci86/bkcrack