Zip password recovery

Occasionally, part of a data recovery or data conversion operation may involve password recovery, where a customer has lost the password used to encrypt a file and needs to attempt to recover their data from that file. Arcmexer contains a function that can be useful in Java applications that need to attempt password recovery from encrypted zip files. It currently works with AES-encrypted zip files, although support for PKZIP-encrypted files may be added in a future version.

With AES-encrypted ZIP files, there is currently no publicly known way to just "work out" what the password is. After all, that's the whole point of a secure encryption algorithm. Password recovery therefore involves testing a series of candidate passwords. The procedure is essentially as follows. In this example, we test for passwords based on particular words plus an optional single digit:

// Base words to try; each is tested as-is and with one digit (0-9) appended
String[] possiblePasswords = {"password", "dog", "john", "peter",
  "football", "jacknicholson", /* ... */};

// f identifies the encrypted zip archive (defined elsewhere)
ArchiveReader r = ArchiveReader.getReader(f);
ArchiveEntry entry;

next_entry:
  while ((entry = r.nextEntry()) != null) {
    for (String pwBase : possiblePasswords) {
      // digitNo == -1 means "no digit appended"
      for (int digitNo = -1; digitNo <= 9; digitNo++) {
        String pw = (digitNo == -1) ? pwBase :
          (pwBase + digitNo);
        if (entry.isProbablyCorrectPassword(pw)) {
          System.out.println("Entry " + entry.getFilename() +
            " has password '" + pw + "'");
          // A likely password was found: move on to the next entry
          continue next_entry;
        }
      }
    }
  }

The interesting method in this case is isProbablyCorrectPassword(). This returns true if the password supplied appears to be the password used to encrypt the data. The method doesn't have to decrypt the data to make this decision, so it's faster than just trying to call getInputStream(). It has a false-positive rate of about 1 in 65 thousand (i.e. on average, about one in every 65,000 or so incorrect candidate passwords will spuriously return true). This means that in a real-life application, you would probably want to modify code such as the above so that when isProbablyCorrectPassword() returns true, it actually decrypts the data and inspects it before concluding that the correct password has been found.

(However, if the password supplied is correct, it will definitely return true; you won't get a "false negative".)
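
The two-stage idea described above can be sketched independently of Arcmexer. In the following, quickCheck stands in for a call to isProbablyCorrectPassword(), and fullCheck is a hypothetical slower test that decrypts the entry and inspects the result; the class and method names are illustrative assumptions and not part of the library. (List.of() requires Java 9 or later.)

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

final class TwoStageCheck {
    /**
     * Returns the first candidate that passes both checks.
     * quickCheck: fast test that may give false positives but never false negatives.
     * fullCheck:  slow, authoritative test (hypothetical: decrypt and inspect the data).
     */
    static Optional<String> findPassword(List<String> candidates,
                                         Predicate<String> quickCheck,
                                         Predicate<String> fullCheck) {
        for (String pw : candidates) {
            // The slow check only runs for the few candidates that pass the fast one
            if (quickCheck.test(pw) && fullCheck.test(pw)) {
                return Optional.of(pw);
            }
        }
        return Optional.empty();
    }
}
```

Because the quick check rejects almost all wrong candidates, the cost of the full check is paid only for the real password and the occasional false positive.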

The method could also be useful when different files are encrypted with different passwords that you know, but you have forgotten which password was used with which file.

Candidate passwords

The success of this method clearly hinges on supplying a list of candidate passwords in order of likelihood. Beyond dictionary words, common first names and surnames, and actors' names, good candidates often come from speaking to the customer in question, to find out names of users, dates of birth, names of users' spouses, pets, sports, favourite football teams etc.
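
As a minimal sketch of how such a list might be built, the following expands each base word into a few common variants: the word as given, with its first letter capitalised, and each of those with a single digit appended. The class name and the choice of variants are assumptions made for illustration; in practice you would tailor the variants to what you know about the user. (List.of() requires Java 9 or later.)

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class CandidateGenerator {
    /**
     * Expands base words into candidate passwords, keeping the order of the
     * base words (most likely first) and dropping duplicates.
     */
    static List<String> expand(List<String> baseWords) {
        Set<String> out = new LinkedHashSet<>(); // preserves insertion order
        for (String base : baseWords) {
            if (base.isEmpty()) {
                continue;
            }
            String capitalised =
                Character.toUpperCase(base.charAt(0)) + base.substring(1);
            for (String stem : List.of(base, capitalised)) {
                out.add(stem);                   // the word on its own
                for (int digit = 0; digit <= 9; digit++) {
                    out.add(stem + digit);       // the word plus one digit
                }
            }
        }
        return new ArrayList<>(out);
    }
}
```

For a lower-case base word this produces 22 candidates, so the number of checks grows quickly as more variants are added.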

If the ZIP file was encrypted with a truly strong password, then of course, recovery is unlikely. Recovery relies on the fact that in real life, most users don't choose very strong passwords.

Performance and multithreading

Note that the ZIP encryption scheme is deliberately designed so that checking a password is a slow operation. Depending on your hardware, isProbablyCorrectPassword() will take on the order of a few milliseconds per check (i.e. you will be able to check roughly 100-200 passwords per processor per second).
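
To illustrate where the time goes, the sketch below follows the published WinZip AES format, in which the keys are derived from the password with PBKDF2 (HMAC-SHA1, 1000 iterations) and the final two bytes of the derived material serve as a password verification value. That 2-byte value is consistent with the false-positive rate of about 1 in 65 thousand mentioned earlier. This is not Arcmexer's own code, and the class and method names are assumptions made for illustration.

```java
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

final class AesZipVerifierSketch {
    static final int ITERATIONS = 1000; // iteration count given in the WinZip AES format

    /** Derives the 2-byte password verification value for a candidate password. */
    static byte[] deriveVerifier(char[] password, byte[] salt, int aesKeyBits) {
        int keyBytes = aesKeyBits / 8;
        // Derived material: encryption key + authentication key + 2-byte verifier
        int totalBits = (2 * keyBytes + 2) * 8;
        try {
            SecretKeyFactory factory =
                SecretKeyFactory.getInstance("PBKDF2WithHmacSHA1");
            byte[] derived = factory.generateSecret(
                new PBEKeySpec(password, salt, ITERATIONS, totalBits)).getEncoded();
            return Arrays.copyOfRange(derived, 2 * keyBytes, 2 * keyBytes + 2);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("PBKDF2 not available", e);
        }
    }

    /** True if the candidate yields the verifier stored with the entry. */
    static boolean looksCorrect(String candidate, byte[] salt,
                                int aesKeyBits, byte[] storedVerifier) {
        return Arrays.equals(
            deriveVerifier(candidate.toCharArray(), salt, aesKeyBits),
            storedVerifier);
    }
}
```

Each candidate costs a full run of the key derivation, which is what makes the check slow, but no file data needs to be decrypted.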

The isProbablyCorrectPassword() method is designed to be thread-safe, so that if you are running on a multiprocessor machine, you can check several passwords simultaneously to speed up recovery.
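
One possible way to arrange this is with a fixed-size thread pool, as in the sketch below. The check predicate stands in for a thread-safe test such as a call to isProbablyCorrectPassword() on a given entry; the class and method names are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Predicate;

final class ParallelPasswordSearch {
    /**
     * Tests candidates on several threads and returns the first candidate
     * (in list order) for which the check returns true.
     */
    static Optional<String> search(List<String> candidates,
                                   Predicate<String> check, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Boolean>> results = new ArrayList<>();
            for (String pw : candidates) {
                Callable<Boolean> task = () -> check.test(pw);
                results.add(pool.submit(task));
            }
            // Read results in submission order so likelier candidates win
            for (int i = 0; i < results.size(); i++) {
                if (results.get(i).get()) {
                    return Optional.of(candidates.get(i));
                }
            }
            return Optional.empty();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow(); // discard any checks still queued
        }
    }
}
```

A typical choice for the number of threads would be Runtime.getRuntime().availableProcessors(). For very long candidate lists you would probably submit the work in batches rather than all at once, so that the task queue does not hold every candidate in memory.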

Known issue

At present, isProbablyCorrectPassword() works only on AES-encrypted entries; it will always return false on entries encrypted with the traditional PKZIP encryption.


Copyright © Javamex UK 2009. All rights reserved. Please note that the software provided on this site is provided "as is".
No guarantee is made that it will be suitable for a particular purpose. In downloading the software, you agree to use it at your own risk. You also agree not to use it for any illegal purpose.