
Too many demands for a single cryptographic hash function to satisfy
Cryptographic hash functions are used for a very wide variety of purposes.
These many purposes place a wide variety of demands on these hash functions.
Below the Definitions and assumptions section
you will find a partial set of
requirements for Cryptographically
sound hash functions.
This long list of requirements reflect just some of the demands placed
on cryptographic hash today.
I believe it is unreasonable to expect that a single, useful cryptographic
hash function can satisfy ALL requirements we effectively place given
the wide variety of demands that we have today.
My concern with the
NIST
Cryptographic Hash Algorithm Competition is that it seeks a
single candidate algorithm.
Instead of a single cryptographic hash function, we need several
such functions.
Each of those several hash functions could be designed to
better satisfy a subset of the
requirements for Cryptographically
sound hash functions.
Designers of cryptographic protocols and procedures would need to
determine their hash function requirements and then select the appropriate
hash function that was designed to meet those hash function requirements.
Some may object on the basis that this would complicate
cryptographic protocols and procedures.
I believe that today's simplistic "one cryptographic hash
function fits all" possible requirements is both wrong
and naive.
Instead we need specifically designed cryptographic hash functions
designed to need a select set of function requirements.
Some function requirements are difficult to achieve than others.
It is likely that a cryptographic hash function designed to
meet those more difficult to achieve requirements will be
more complex.
Moreover, that added complexity comes with a performance and
implementation cost.
Applications that do not require the more difficult requirements
could opt to use cryptographic hash functions that are likely
to be less complex and easier to implement and use.
It would be nice if there were a "one cryptographic hash
function fits all" that was fast and east to implement.
Unfortunately the world of cryptographic mathematic appears
to not allow such naive wish.
We really do need standardize on more than one cryptographic hash
function.
We really do need those family of cryptographic hash functions
to be more specifically target to useful subset of
function requirements.
Comments and corrections welcome
This page is a work in progress.
Moreover, the notation on this page is prone to human (my) error.
Comments, corrections and suggestions are welcome.
See How to contact Landon
for my latest contact information.
Cryptographically sound hash function requirements
NOTE: Please see the
Definitions and assumptions section
at the bottom for details on the variables, constants and
related assumption made in the requirements.
 Zero inverse resistant hash map:
It is "hard to find" x
such that:
h(x) == 0
BTW:
This is known as resistance to the TPM hardware design flaw megacrack.
 Inverse resistant hash map:
Given x,
where 0 < x < R
it is hard to find y
such that:
h(y) == x
BTW:
This is also known as resistance to 1st preimage attacks.
 Collision resistant hash map:
It is "hard to find" x,
AND it is "hard to find" y,
where
x != y
such that:
h(x) == h(y)
BTW:
This requirement resists finding hash collisions.
 Second preimage collision resistant hash map:
Given x,
it is "hard to find" y,
where x != y
such that:
h(x) == h(y)
BTW:
This resists finding a hash collision for a particular given value.
 Concatenation collision resistant hash map:
Given x,
AND given y,
where x != y
it is "hard to find" p,
where
0 ≤ (p  x) < R,
AND where
0 ≤ (p  y) < R
such that:
h(p  x) == h(p  y)
OR
it is "hard to find" q,
where
0 ≤ (x  q) < R,
AND where
0 ≤ (y  q) < R
such that:
h(x  q) == h(y  q)
BTW:
This is known as the suffix or prefix collision attack.
 Double concatenation collision resistant hash map:
Given x,
AND given y,
where
x != y,
it is "hard to find" j,
AND it is "hard to find" k,
where
j != k
AND where either j or k (but not both) may be empty,
AND where
(j  x) != (k  y)
AND where
0 ≤ (j  x) < R
AND where
0 ≤ (k  y) < R
such that:
h(j  x) == h(k  y)
OR
it is "hard to find" p,
AND it is "hard to find" q,
where
p != q
AND where either p or q (but not both) may be empty,
AND where
(x  p) != (y  q)
AND where
0 ≤ (x  p) < R
AND where
0 ≤ (y  q) < R
such that:
h(x  p) == h(y  q)
BTW:
This is a more general form of the suffix or prefix collision attack that
allows for different prefixes or suffixes to be used.
 Single bit set or clear resistant hash map:
It is "hard to find" x
such that:
h(x) is a power of 2
OR
The 2's compliment of h(x) is a power of 2
BTW:
The first case means it is hard to find a value that, when hashed,
will produce a value with only a single bit set to 1.
The second case means it is hard to find a value that, when hashed,
will produce a value with only a single bit set to 0.
 Incremental or decremental preimage collision resistant hash map:
Given x,
it is "hard to find" y,
such that:
h(x) + 1 == h(y)
OR
such that:
h(x)  1 == h(y)
 Domain coverage map:
Foreach 0 ≤ y < D,
there exists at least one value x,
where
0 ≤ x < R
such that:
h(x) == y
BTW:
This means that the hash function is able output all possible domain values.
 Compound zero inverse resistant hash map:
Given x,
it is "hard to find" N,
where
1 < N < B
such that:
h^{N}(x) == 0
 Compound inverse resistant hash map:
Given x,
it is "hard to find" y,
AND
it is "hard to find" N,
where
1 < N < B
such that:
h^{N}(y) == x
 Compound collision resistant hash map:
It is "hard to find" x,
AND
it is "hard to find" y,
where
x != y
AND
it is "hard to find" N,
where
1 < N < B
such that:
h^{N}(x) ==
h^{N}(y)
BTW:
This resists finding any hash collision by repeated hashing.
 Compound 2nd preimage collision resistant hash map:
Given x,
it is "hard to find" y,
where x != y
AND
it is "hard to find" N,
where
1 < N < B
such that:
h^{N}(x) ==
h^{N}(y)
BTW:
This resists finding a hash collision by repeated hashing for a particular given value.
 Double compound collision resistant hash map:
It is "hard to find" x,
AND
it is "hard to find" y,
where
x != y
AND
it is "hard to find" M,
where
2 < M < B
AND
it is "hard to find" N,
where
1 < N < M
such that:
h^{N}(x) ==
h^{M}(y)
 Double compound second preimage resistant hash map:
Given x,
it is "hard to find" y,
where
x != y
AND
it is "hard to find" M,
where
2 < M < B
AND
it is "hard to find" N,
where
1 < N < M
such that:
h^{N}(x) ==
h^{M}(y)
 Functionrelated preimage collision resistant hash map:
Given x,
AND
given an O(1) nondegenerate elementary function f(y),
where 0 ≤ f(y) < D
for all y,
where
0 ≤ y < R
it is "hard to find" z
such that:
f(h(x)) == h(z)
BTW:
This means that one cannot find two values whose hash
has an elementary relationship.
Don't be fooled by the word elementary.
Here the word elementary refers to an function that
falls under the scope of
elementary
number theory.
Elementary functions can be quite complex!
Simple examples include:
f(x) == 2*(x)
f(x) == x^{2} mod D
f(x) == the count of the number of 1 bits in x
f(x) == the xth prime (where the required primes have been precomputed)
Certain degenerate functions are excluded.
For example, this function is excluded:
f(any value) == h(z)
Because it requires the h(z) ti be known
and because it always returns that predetermined answer
regardless of input.
Another example of a function that is excluded is a function
that makes an exhaustive search of the entire hash space
returning the first collision discovered after the hash space
has been searched.
This brute force exhaustive search function would run
in constant time (because it always does a complete search
of the same sized space before turning an answer).
However for realistic hash space sizes, such a brute force
function is impractical to run to completion.
 Compound functionrelated preimage collision resistant hash map:
Given x,
AND
given an O(1) nontrivial elementary function f(z),
where
0 ≤ f(z) < D
for all z in the half open interval [0,R)
it is "hard to find" N,
where
1 < N < B
AND
it is "hard to find" y
such that:
f(h^{N}(x)) == h(y)
BTW:
This means that one cannot find a compound hash of a value that
has an elementary relationship with the hash of another value.
See the "Functionrelated preimage collision resistant hash
map" requirement for details on nontrivial elementary functions.
 Uniform dispersion map:
For any set U_{N}
where
0 < N < B
AND
where the values U_{N} have been "uniformly distributed"
over the half open interval [0,R) by a
"cryptographically sound deterministic bit generator"
the hash of each member of U_{N}
appears to be "uniformly distributed" over the half open interval:
[0,D)
BTW:
This means that the hash of random values appears to
be well distributed across the hash function domain.
 Pseudorandom map:
For a given N,
where 0 < N < B,
the concatenation of the hash of each member of U_{N}
appears to be a sequence generated by a
"cryptographically sound deterministic bit generator".
BTW:
This means that the hash function acts as a
cryptographically sound deterministic bit generator
that was seeded by U_{N}.
That is:
h(U_{N}[0]) 
h(U_{N}[1]) 
...
h(U_{N}[N1])
passes the requirements for output of from a
"cryptographically sound random bit generator".
 Pseudorandom bit flip dispersion map:
For a given N,
where 0 < N < B,
AND
for a given set U_{N} that was
generated over the half open interval [0,R) by a
"cryptographically sound deterministic bit generator",
foreach value x over the half open interval [0,N),
AND
foreach value i over the half open interval [0,W),
the concatenation of the set of values from this expression:
h(U_{N}[x]) xor
h(U_{N}[x] xor 2^{i})
passes the requirements for output of from a
"cryptographically sound random bit generator".
BTW:
Randomly changing an input bit will appear to
randomly change about half the output bits.
Moreover, the output bit positions
that change appear to be randomly selected.
Definitions and assumptions
 h(x)
This is a given cryptographic hash function.
 All values are unsigned integers unless otherwise specified.
 R and D
The range (R) and domain (D)
of the hash function h(x).
In particular:
For all 0 ≤ x < R: 0 ≤
h(x) < D
 B
This value is known as the "birthday domain limit"
In particular:
B = floor(sqrt(D))
where floor() is the floor function and sqrt() is the square root function.
 W
This is the bit width of the output of the hash function.
In particular:
W is the smallest value such that
2^{W} ≥ D
 h^{N}(x)
A compound hash function. Example:
h^{3}(x) ==
h(h(h(x)))
 L
The lifetime of the hash function.
In particular:
L is the amount time where h(x)
is expected to be used.
After period T,
h(x) will become deprecated for use.
The value L is expected to be defined by either the hash function
creators and/or the hash standard.
 C
A large collection of classical (nonquantum) computers.
The value C is expected to be defined by either the hash function
creators and/or the hash standard.
 P
A very small probability that is 0 < P < 1.
The value P is expected to be defined by either the hash function
creators and/or the hash standard.
P is not an unsigned integer.
 (p  x)
The bitwise concatenation of the values p
AND
x.
 Variables
j, k,
p, q,
x, y,
z,
are hashable
These variables are in the half open interval [0,R).
 "hard to find"
A solution is said to be "hard to find" if C
classical (nonquantum) computers working constantly over
the hash function lifetime L, has a probability of discovering
a solution is ≤ P.
 U_{N} {U_{N}[0],
U_{N}[1],
... U_{N}[N1]}
A set of N unique values
such that:
0 < N < D
Foreach 0 ≤ i < N
0 ≤ U_{N}[i] < D
Foreach 0 ≤ i < N1
Foreach i+1 ≤ j < N
U_{N}[i] != U_{N}[j]
 "True random source"
A "true random source"
is a theoretical random bit generator that is
perfect in every way.
The next value produced is completely uncorrelated with all previous
values.
Random bit statistical tests that run in polynomial time determine
the likelihood (or lack thereof) that a given finite set of data could
have been produced by a "true random source".
For a sequence of generated bits to pass a statistical test, said
sequence must pass with the same level of confidence as an
equivalent length sequence from a true random source.
 "Cryptographically strong random bit generator"
A "cryptographically strong random bit generator" passes all
statistical tests that run in polynomial time asymptotically.
It will pass any statistical test for randomness that does not require
an exponentially increasing to infinite amount of time to run.
All such polynomial time statistical tests will be unable to distinguish
the random bit generator from a "true random source".
The values produced by the generator are random in an absolutely precise way.
There is absolutely no better way to predict the sequence than by tossing
a coin.
Furthermore, having a large chunk of output from the random bit
generator does not help in predicting past or future values.
 "Cryptographically sound random bit generator"
A "cryptographically sound random bit generator"
is a very high quality random bit generator that is almost certainly a
"cryptographically strong random bit generator".
While the generator lacks a formal proof, there should exist a solid
and well reasoned argument for its cryptographic strength.
A "cryptographically sound random bit generator" will pass
statistical tests with a very high level of confidence.
In fact, sound generators will pass tests at the same confidence level
as strong generators.
In particular, any standard battery of statistical tests for randomness,
when given a statistically significant amount of data, will not be able
to distinguish the random bit generator from a
"true random source".
 The requirements listed are not indented to be orthogonal
Some requirements may overlap other requirements.
The list of requirements is not intended to be a minimal set of requirements.
 The requirements listed are not complete.
The list of requirements listed is not intended to be
an exhaustive list of requirements.
TODO: Things to add someday
As I said, this is a work in progress.
Someday I plan to add:
 Define "uniformly distributed"
 Increase the orthogonality of the requirements
 Add more BTW text to each of the requirements
 Add of the few missing requirements
 Make the list of elementary function examples more cryptographically relevant
