Description

Miscellaneous Hash Functions

Author

Kon Lovett

Requires

Usage

(require-extension hashes)

Download

hashes.egg

Documentation

Hash Procedures

A suite of hash procedures. All have the same signature. All return hash values in a 32-bit range, so non-fixnum results possible! The {HASH} name refers to one of the hash algorithm symbols below.

Use of the 'hashes' extension will cause all the hash functions to load. Use the individual '{HASH}' extensions when the entire suite is not desired.

Though this egg is categorized as 'cryptographic' not a one of these is suitable for cryptographic work!

procedure: ({HASH}-prim STRING LENGTH SEED)

Returns the hash of STRING of LENGTH, using SEED.

procedure: ({HASH} STRING [LENGTH [SEED]])

Returns the hash of STRING. When LENGTH is missing (string-length STRING) is assumed. When SEED is missing 0 is assumed.

RJMXHash

Bob Jenkins' MIX Lookup 2 hash function.

RJL3Hash

Bob Jenkins' MIX Lookup 3 hash function.

TWMXHash

Thomas Wang's hash function with original MIX.

TWSHMXHash

Thomas Wang's hash function with shift MIX.

TWSHMLMXHash

Thomas Wang's hash function with shift-multiply MIX.

TWMGMXHash

Thomas Wang's hash function with Magic MIX.

FNVHash

Fowler/Noll/Vo 1 hash function. Substitutes the algorithm's initial value when the seed is non-zero.

FNVAHash

Fowler/Noll/Vo 1a hash function. Substitutes the algorithm's initial value when the seed is non-zero.

PHSFHash

Paul Hsieh's SuperFast hash function. Substitutes the algorithm's initial value when the seed is non-zero.

RSHash

Robert Sedgwick's "Algorithms in C" hash function.

JSHash

A bitwise hash function written by Justin Sobel. Ignores the seed when 0.

PJWHash

Hash algorithm is based on work by Peter J. Weinberger of AT&T Bell Labs.

ELFHash

Similar to the PJW Hash function, but tweaked for 32-bit processors. It's the hash function widely used on most UNIX systems.

BKDRHash

This hash function comes from Brian Kernighan and Dennis Ritchie's book "The C Programming Language". It is a simple hash function using a strange set of possible seeds which all constitute a pattern of 31....31...31 etc, it seems to be very similar to the DJB hash function

SDBMHash

This is the algorithm of choice which is used in the open source SDBM project. The hash function seems to have a good over-all distribution for many different data sets. It seems to work well in situations where there is a high variance in the MSBs of the elements in a data set.

DJBHash

An algorithm produced by Professor Daniel J. Bernstein and shown first to the world on the usenet newsgroup comp.lang.c. It is one of the most efficient hash functions ever published. Substitutes the algorithm's initial value when the seed is non-zero.

NDJBHash

Now favored by Bernstein. Substitutes the algorithm's initial value when the seed is non-zero.

DEKHash

An algorithm proposed by Donald E. Knuth in "The Art Of Computer Programming, Volume 3", under the topic of sorting and search chapter 6.4. Substitutes the algorithm's initial value when the seed is non-zero.

APHash

Arash Partow's hash function. A hybrid rotative and additive hash function algorithm based around four primes 3, 5, 7, and 11.

CRCHash

The crc32 procedure wrapped as above. Ignores the seed when 0.

BRPHash

A hash function by Bruno R. Preiss.

PYHash

The Python String Object hash function.

ISPLHash

The ISpell hash function.

TWUserMixHash Procedures

Thomas Wang's hash function with a user supplied MIX procedure.

procedure: (make-TWUserMixHash-primitive-procedure MIXER [UNSAFE #f])

Returns a hash primitive procedure, (scheme-object unsigned-integer32 unsigned-integer32 -> unsigned-integer32), for the procedure MIX, (unsigned-integer32 -> unsigned-integer32).

When UNSAFE no exception checking is performed.

procedure: (make-TWUserMixHash MIXER [UNSAFE #f])

Returns 5 values: HASH-PRIM, HASH, :binary-digest, :digest, and :primitive.

Digest Procedures

An acceptable input object for the digest procedures is a string, input-port, blob, vector, list, or homogeneous-vector. See message-digest for more information.

procedure: ({HASH}:digest OBJECT)

Returns the hash of OBJECT as a hexadecimal text string.

procedure: ({HASH}:binary-digest OBJECT)

Returns the hash of OBJECT as a byte-string.

procedure: ({HASH}:primitive)

Returns the hash primitive object.

Hash Auxillary Procedures

Note: The domain 'number' below means '(or flonum fixnum)', not the full Scheme 'number' domain.

parameter: (current-hash-seed [NEW-SEED])

Returns or sets the current default hash seed. The initial value is 0.

procedure: (make-fixnum-bounded-hash {HASH} [GETLEN string-length] [GETINT (constantly 0)])

Returns a hash function with a SRFI-69 signature, but with a fixnum domain; i.e. (object #!optional (positive fixnum) -> fixnum). The GETLEN will be used to aquire the OBJECT length for the {HASH}. The GETINT will supply the initial hash value.

procedure: (make-bounded-hash {HASH} [GETLEN string-length] [GETINT (constantly 0)])

Returns a hash function with a SRFI-69 signature, but with a real domain; i.e. (object #!optional (positive number) -> number). The GETLEN will be used to aquire the OBJECT length for the {HASH}. The GETINT will supply the initial hash value.

procedure: (make-seeded-hash {HASH} [SEED])

Returns a curried {HASH} of 1 or 2 arguments with the supplied SEED. When the seed is missing the (current-hash-seed) is assumed.

procedure: (make-mask-hash {HASH} MASK)

Returns a {HASH} with the hash value bitwise and'ed with the supplied MASK.

procedure: (make-range-hash {HASH} UPPER [LOWER])

Returns a {HASH} with the hash value restricted to the supplied exact interval, [LOWER UPPER]. When LOWER is missing 0 is assumed. The signature is that of the {HASH}.

procedure: (make-real-hash {HASH})

Returns a {HASH} with the hash value restricted to the interval, [0.0 1.0]. The signature is that of the {HASH}.

procedure: (make-hash-procedure HASH-PRIM [BYTE-LENGTH string-length])

Returns a hash procedure, (scheme-object #!optional unsigned-integer32 unsigned-integer32 -> unsigned-integer32), for the hash primitive procedure HASH-PRIM.

procedure: (make-hash-message-digest-procedures HASH-PRIM)

Returns a list of :binary-digest, :digest, and :primitivefor the hash primitive procedure HASH-PRIM.

Range Procedures

procedure: (make-range-restriction UPPER [LOWER])

Returns a procedure of 1 argument, (-> number number). The arguments will be swapped if necessary so the range is [LOWER UPPER]. When LOWER missing 0 is assumed.

procedure: (make-fixnum-range-restriction UPPER [LOWER])

Returns a procedure of 1 argument, (-> number fixnum). The arguments will be swapped if necessary so the range is [LOWER UPPER]. When LOWER missing 0 is assumed.

LOWER & UPPER must be fixnums.

unsigned-integer32 Procedures

procedure: (unsigned-integer32-ref OBJECT)

Returns the first 32-bits of OBJECT as an unsigned-integer32.

procedure: (unsigned-integer32-set! OBJECT NUMBER)

Sets the first 32-bits of STRING to NUMBER.

Hash Search

Usage

rabin-karp
procedure: (make-rabin-karp-string-search SUBSTRINGS [TEST [HASH]])

Returns a procedure of one argument, the search string, and two optional arguments, the start and end positions within the string. The search procedure returns a list of the matched substring and a list of the start and end positions of the match in the search string. Returns #f when no match found. Similar to the regex unit string-match procedure.

SUBSTRINGS is a list of strings. TEST is an equivalence procedure. HASH is a SRFI-69 compliant hash procedure.

Version

License

Copyright (c) 2006, Kon Lovett.  All rights reserved.

Permission is hereby granted, free of charge, to any person obtaining a
copy of this software and associated documentation files (the Software),
to deal in the Software without restriction, including without limitation
the rights to use, copy, modify, merge, publish, distribute, sublicense,
and/or sell copies of the Software, and to permit persons to whom the
Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included
in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED ASIS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
OTHER DEALINGS IN THE SOFTWARE.

Does not supercede any restrictions found in the source code.