Friday, April 24, 2009

Compute hashes in Linux and OS X

Hashes are used to verify the integrity of a file. A hash function is a one-way algorithm designed so that, in practice, no two different files should produce the same hash. If even a small change is made to a file, its hash will look vastly different. That makes hashes useful for verifying that a file has not been corrupted or tampered with, and they are often published alongside files offered for download. The server hosting a file posts the file's hash so that anyone who downloads it can compute their own hash of the file and compare it to the one on the server.
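
To see the "small change, very different hash" behavior for yourself, here is a minimal sketch using the openssl tool covered below. The two input strings are made up for illustration, and I'm assuming awk is installed so the digest can be pulled out of openssl's output.

```shell
# Hash two strings that differ by a single character.
# awk '{print $NF}' keeps only the last field of the output, which is the hex digest.
hash_a=$(printf 'hello world' | openssl dgst -sha1 | awk '{print $NF}')
hash_b=$(printf 'hello worle' | openssl dgst -sha1 | awk '{print $NF}')

# Print both digests so they can be compared by eye.
echo "$hash_a"
echo "$hash_b"
```

Even though only the last letter differs, the two digests should have nothing obvious in common.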

Linux, OS X, and most *nix variants come with tools built in for computing file hashes. By using Cygwin in Windows environments these tools are available as well. One of them is the openssl command, and its syntax looks like this:
openssl dgst -[hash] [file]
The hash algorithms I tend to see most often are SHA1 and MD5. To compute either of these with a file called "specification.txt," the syntax would look like this:
openssl dgst -md5 specification.txt
openssl dgst -sha1 specification.txt
Depending on your version of OpenSSL, some other hashes available using this tool are MD2, MD4, RMD160, and SHA.
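
Comparing long hex strings by eye is error prone, so the check against the server's posted hash can be scripted. This is only a sketch: verify_hash is a hypothetical helper name of my own, not part of openssl, and it assumes awk and tr are available.

```shell
# Hypothetical helper: succeed only if the file's digest matches the published one.
# Usage: verify_hash <algorithm> <file> <expected-hex-digest>
verify_hash() {
    algo=$1
    file=$2
    # Published hashes are sometimes uppercase; normalize to lowercase.
    expected=$(printf '%s' "$3" | tr 'A-F' 'a-f')

    # The digest is the last field of openssl's output line.
    actual=$(openssl dgst "-$algo" "$file" | awk '{print $NF}')

    if [ "$actual" = "$expected" ]; then
        echo "OK: $file matches the published $algo hash"
    else
        echo "MISMATCH: $file has $algo hash $actual" >&2
        return 1
    fi
}
```

You would call it as verify_hash sha1 specification.txt followed by the hash copied from the download page. A non-zero exit status means the file should not be trusted.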

Hashes, however, are not perfect. The idea with them is that no one should be able to reverse the algorithm and determine what a file looks like from its hash, and that no one should be able to produce two different files that share a hash. MD5, for example, has a known weakness: collisions can be constructed, meaning two different files can be crafted to give the same hash output. A proposed solution to this is to check the results of multiple hash functions for a file.
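
One way to act on the multiple-hash idea is to compute several digests of the same file in one pass and compare each against its published value. The sketch below assumes awk is available; multi_hash is a hypothetical helper name of my own.

```shell
# Hypothetical helper: print "<algorithm> <digest>" for each algorithm given.
# Usage: multi_hash <file> <algorithm>...
multi_hash() {
    file=$1
    shift
    for algo in "$@"; do
        # The digest is the last field of openssl's output line.
        digest=$(openssl dgst "-$algo" "$file" | awk '{print $NF}')
        echo "$algo $digest"
    done
}
```

Running multi_hash specification.txt md5 sha1 prints one line per algorithm. A tampered file would have to collide under both functions at once to slip through.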
