You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
This document is a work-in-progress. Please contact the author if you see errors or omissions.
Summary
Most of the secure hash functions ever designed have turned out to be
vulnerable to collision attacks. This includes the widely-used secure
hash functions MD5 and SHA-1.
What about pre-image and second-pre-image attacks? Have practical hash
functions historically been vulnerable to those?
I summarize here the history of attacks on secure hash functions in order
to yield an answer to that.
The main result is that there is a big gap between the history of
collision attacks and pre-image attacks. Almost all older secure hash
functions have fallen to collision attacks. Almost none have ever
fallen to pre-image attacks.
Secondarily, almost no new secure hash functions (designed after
approximately the year 2000) have so far succumbed to collision attacks,
either.
Preliminaries
The input to a secure hash function is called the pre-image and the
output is called the image.
A hash function collision is two different inputs (pre-images) which
result in the same output. A hash function is collision-resistant if an
adversary can't find any collision.
A hash function is pre-image resistant if, given an output (image), an
adversary can't find any input (pre-image) which results in that output.
A hash function is second-pre-image resistant if, given one
pre-image, an adversary can't find any other pre-image which results in
the same image.
When collision attacks don't matter
There are cases where collision-resistance doesn't matter at all and what
you care about is second-pre-image resistance.
For such uses it would be harmless to be able to generate collisions, but
harmful to be able to generate pre-images or second-pre-images. For this
purpose the relevant question is not whether hash function designs have
historically been revealed to be vulnerable to collisions but instead
whether they've been revealed to be vulnerable to (second-)pre-images.
hash-based digital signatures
An example of this is the construction of hash-based digital
signatures. Hash-based digital signatures are secure (resistant to
forgery) as long as the hash function they are built on has
second-pre-image resistance, e.g. SPHINCS.
Such a hash-based digital signature would fail if its underlying hash
function failed at second-pre-image resistance, but this is the only
way that it could be broken—any attack which was able to forge digital
signatures against such a scheme would have to violate the
second-pre-image resistance of the underlying hash function.
One reason that hash-based digital signatures might be useful is that if
an attacker has a sufficiently large quantum computer, they could forge
digital signatures that rely on factorization or discrete log, such as
RSA, DSA, ECDSA, or Ed25519. There is no reason to think that such a
quantum computer would enable them to break secure hash functions,
however.
Another reason is that even if the attacker does not have a
sufficiently large quantum computer, but has a mathematical breakthrough
that allows them to exploit the asymmetric crypto technique (such as
factoring, discrete log, code-based crypto, etc.), then they would be
able exploit asymmetric-crypto-based digital signatures, but not
hash-based digital signatures.
What about in the other direction, though? Can't we imagine an attacker
who can break hash-based signatures but can't break
asymmetric-crypto-based signatures? No—there cannot be such an
attacker. Any attacker who can break hash-based signatures can also break
asymmetric-crypto-based signatures, because the latter rely on hash
functions in addition to relying on their asymmetric crypto primitives.
color key: is relying on this safe?
unsafe
You can be exploited if you rely on this.
safe
There is no reason to believe that relying on this will make you
vulnerable to exploitation.
all other post-quantum
(McEliece, NTRUsign,
LWE, Ring-LWE,
Lattice-based signatures,
code-based signatures,
Rainbow,
multivariate-quadratic,
etc.)
safe
safe
unsafe
unsafe
unsafe
all others (RSA, DSA,
ECDSA, Ed25519, etc.)
safe
unsafe
unsafe
unsafe
unsafe
When collision attacks do matter
Be careful about this! The ability to generate collisions can be
surprisingly harmful to many systems. This is one of those subtleties of
cryptographic engineering which frequently trip up engineers who are not
cryptography experts. The famous “Internet Root Cert” attack [18] is an
example of engineers working at VeriSign incorrectly thinking that their
system was not threatened by collisions (in the absence of
second-pre-images).
git, which uses SHA-1, is like VeriSign's MD5 certificates in this
way—it is believed by its developers [50] that a mere collision attack
(not second-pre-image) against SHA-1 wouldn't make git users vulnerable
to malicious action, but no-one has written a security proof showing that
git is safe against this attack.
In contrast to VeriSign and git, the cryptographic constructions
mentioned above come with proofs showing that the security of the
construction is guaranteed, assuming the security of some underlying
component. For example, the hash-based digital signature SPHINCS comes
with a proof that any possible attack which couldn't generate
second-pre-images against the hash function couldn't forge signatures.
Results
Here are the results of my search for all state-of-the-art attacks on
widely-studied hash functions.
The bottom line is that no widely-studied hash function has ever
succumbed to a (second-)pre-image attack except for one.
That single exception is the second-oldest secure hash function ever
designed, Snefru, which was designed in 1989 and 1990, and which turned
out to be vulnerable to differential cryptanalysis. Differential
cryptanalysis was discovered (by the open research community) in 1990.
No other widely-studied hash function has been shown to be vulnerable to
a practical (second-)pre-image attack. Furthermore, no other
widely-studied hash function has been shown to be vulnerable to a
(second-)pre-image attack that is more efficient than brute force, even
if we were to count attacks too expensive for anyone to actually
implement!
The history of (second-)pre-image attacks is therefore quite different
from the history of collision attacks. Most hash functions have been
proven vulnerable to collision attacks more efficient than brute force,
and even to collision attacks that could be implemented in practice.
History of attacks on hash functions
This is a timeline of the publication of hash functions and of
publication of weaknesses in hash functions.
I omit attacks on reduced-round or otherwise weakened variants of hash
functions (there are a lot of those). I omit attacks that have
unrealistic requirements, like attacks that require 2¹²⁸ precomputation
or require the messages to be 2⁵⁶ blocks long (there are a lot of those,
too).
color key: is relying on this safe?
no
You can be exploited if you rely on this.
maybe
There are known attacks but they are probably too expensive to
actually implement. If the attacks have been secretly improved, or if
the attacker has more efficient computational resources than we think,
then maybe you can be exploited if you rely on this.
maybe
There are no known attacks that are cheaper than brute force, but the
hash output size is small enough that brute force might be feasible,
so maybe you can be exploited if you rely on this.
yes
There is no known attack cheaper than brute force, and to pay for a
brute force attack is far, far beyond the bounds of possibility for
the forseeable future. There is no reason to believe that relying on
this will make you vulnerable to exploitation.
Cycles per byte were taken from on ebash's amd64-pluton1mn,
4096-byte blocks, median measurement, except for Tiger, which was
is not measured on that machine and was instead taken from ebash's
amd64-h9ivy, and Panama, which is not measured on ebash. For
Panama, I measured it on my laptop (an Intel(R) Core(TM) i5-3427U,
which is similar to the ebash amd64-h9ivy machine) with Crypto++
v5.6.2's implementation of Panama. I also measured MD5, SHA-1,
SHA-256, SHA-512, SHA-3-256, SHA-3-512, Tiger, Whirlpool, and
RIPEMD-160 on my machine and confirmed that their measurements on
my machine were similar to the measurements posted from
amd64-h9ivy.
For MD2, I marked it as "maybe" safe in the collisions column up
until 2010 and then marked is as "no". This is even though there
are no known collision attacks on them better than brute
force. This is because MD2's 128-bit output means the brute force
attack takes only 2⁶⁴ comp and negligible memory to find a
collision. To do that much comp has become feasible over the last
few years. For example, in 2014 the Bitcoin mining network is
doing it approximately every 10 minutes [45], [46]!
SHA-0 was considered unsafe beginning in 1995, not because of any
published attack on it, nor because the 2⁸⁰ work factor for the
brute force collision attack was feasible, but because the NSA had
asserted that something was wrong with SHA-0 when they published
SHA-1.
[§]
RIPEMD-160's 160-bit output means it takes only 2⁸⁰ comp and
negligible memory to find a collision. In my estimation this was
safe until recently and is now “maybe” safe. See also [47] and
Table 5.1 of [49].
Discussion
The main result of this investigation is that there is a big gap between
the historical successes of collision attacks and the almost
non-existence successes of pre-image attacks. This is evidence that a
cryptosystem which is invulnerable to collision-attacks (even if still
vulnerable to pre-image attacks) is much stronger than one which is
vulnerable to collision-attacks.
Another interesting pattern that I perceive in these results is that
maybe sometime between 1996 (Tiger) and 2000 (Whirlpool), humanity
learned how to make collision-resistant hash functions, and none of the
prominent secure hash functions designed since that era have succumbed to
collision attacks.
Or maybe this is just a 15-year-long hiatus, and in the future we'll
discover how to perform collision attacks against the "modern" secure
hash functions. Looking in the rearview mirror can't answer that for us.
Acknowledgments
Thanks to Daira Hopwood, Andreas Hülsing, and Samuel Neves for comments on this note.