Some years ago I was writing code that used Bloom filters to remember which files contained which words. The filter's efficiency fell well short of the theoretical value. I spent a couple of days looking for a bug in my code and found none. For my hash function I was using variations of classic pseudo-random number generators, so by elimination I came to suspect the quality of the hash. After I had put the problem aside for a few days, it occurred to me that an implementation of DES, the Data Encryption Standard, was readily available. Using it would either provide a high-quality hash or reveal a weakness in DES. With the DES encryption of the data as the hash, the Bloom code worked at its theoretical efficiency. I now recommend SHA1, since it is faster and was designed specifically with such requirements in mind.

SHA1 produces 160 bits, and a Bloom filter usually requires more. If you append an integer to the end of the data before hashing, each distinct integer gives you another 160 hash bits. The integer may be written in either binary or decimal.
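
The integer-suffix idea can be sketched in Python with the standard hashlib module. This is only an illustration under stated assumptions, not the code from the original project: the names hash_bits, bloom_indices, add and might_contain are hypothetical, the counter is appended in decimal, and the filter size is assumed to be a power of two so that each bit index can be cut straight out of the hash stream.

```python
import hashlib


def hash_bits(data: bytes, nbits: int) -> int:
    """Return nbits of hash material for data as a non-negative integer.

    SHA1 gives 160 bits per call, so a decimal counter (0, 1, 2, ...) is
    appended to the data and the digests are concatenated until enough
    bits are available. The decimal encoding is an assumption of this
    sketch; a binary counter would serve equally well.
    """
    out = b""
    counter = 0
    while len(out) * 8 < nbits:
        out += hashlib.sha1(data + str(counter).encode("ascii")).digest()
        counter += 1
    excess = len(out) * 8 - nbits
    return int.from_bytes(out, "big") >> excess  # drop surplus low bits


def bloom_indices(data: bytes, k: int, m: int) -> list:
    """Return k bit positions in an m-bit filter (m must be a power of two)."""
    if m < 2 or m & (m - 1):
        raise ValueError("this sketch assumes m is a power of two")
    width = m.bit_length() - 1          # bits needed for one index
    stream = hash_bits(data, k * width)
    mask = m - 1
    return [(stream >> (i * width)) & mask for i in range(k)]


def add(bits: bytearray, data: bytes, k: int) -> None:
    """Set the k bits for data in the filter held in bits."""
    for i in bloom_indices(data, k, len(bits) * 8):
        bits[i >> 3] |= 1 << (i & 7)


def might_contain(bits: bytearray, data: bytes, k: int) -> bool:
    """True if all k bits for data are set (false positives are possible)."""
    return all(bits[i >> 3] & (1 << (i & 7))
               for i in bloom_indices(data, k, len(bits) * 8))
```

With a 1024-bit filter (bytearray(128)) and k = 20, each item needs 200 hash bits, so hash_bits makes two SHA1 calls, one for the suffix 0 and one for the suffix 1.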