mirror of
https://github.com/Llewellynvdm/Tomb.git
synced 2024-11-06 05:17:59 +00:00
Linux hard disk encryption settings

This page intends to educate the reader about the existing weaknesses
of the public-IV on-disk format commonly used with cryptoloop and
dm-crypt (used in IV-plain mode). This page aims to facilitate risk
calculation when utilising Linux hard disk encryption. The attacks
presented on this page may pose a threat to you, but at the same time
may be totally irrelevant for others. At the end of this document, the
reader should be able to make a good choice according to their security
needs.

A good quote with respect to this topic is ''All security involves
trade-offs'' from Beyond Fear (Bruce Schneier). You should keep in mind
that perfect security is unachievable and should not be your goal. For
instance, when using pass phrase based cryptography, you have to trust
that the underlying system is secure, that the computer system has not
been tampered with, and that nobody is watching you. The most obvious
weakness is the last one, but even if you make sure that neither a
person nor a camera is around, how about the keyboard you're typing on?
Has it been manipulated while you were getting your lunch?

So security comes at a price, and the price when designing
cryptographic algorithms is performance. You will be introduced to the
fastest of all setups available, the "public-IV", which sacrifices
security properties for speed. After that we will talk about ESSIV, the
newest of the IV modes implemented. It can deal with watermarking for a
relatively small price. Then you'll be introduced to the draft
specifications of the Security in Storage Working Group ([18]SISWG).
Currently SISWG is considering EME and LRW for standardisation. EME
along with its cousin CMC seems to provide the best security level, but
imposes additional encryption steps. Plumb-IV is discussed only for
reference, because it has the same performance penalty as CMC, but in
contrast suffers from the weaknesses of CBC encryption.

As a convention, this document uses the term "block" when it refers to
a single block of plain or cipher text (usually 16 bytes), and the term
"sector" when it refers to a 512-byte wide hard disk block.

CBC Mode: The basic level

Most hard disk encryption systems utilise CBC to encrypt bulk data.
Good descriptions of CBC and other common cipher modes are available at
  * [19]Wikipedia
  * [20]Connected: An Internet Encyclopedia
  * [21]NIST: Recommendation for Block Cipher Modes of Operation (CBC
    is at PDF Page 17)

Please make sure you're familiar with CBC before proceeding.

Since CBC encryption is a recursive algorithm, the encryption of the
n-th block requires the encryption of all preceding blocks, 0 to n-1.
Thus, if we ran the whole hard disk encryption in CBC mode, we would
have to re-encrypt the whole hard disk whenever the first computation
step changed, that is, whenever the first plain text block changed. Of
course, this is an undesired property, therefore the CBC chaining is
cut every sector and restarted with a new initialisation vector (IV),
so we can encrypt sectors individually. The choice of the sector as the
smallest unit matches the smallest unit of hard disks, where a sector
is also atomic in terms of access.

For reference, I will give a formal definition of CBC encryption and
decryption. Note that decryption is not recursive, in contrast to
encryption, since it is a function only of C[n-1] and C[n].

Encryption:
   C[-1] = IV
   C[n] = E(P[n] ⊕ C[n-1])
Decryption:
   C[-1] = IV
   P[n] = C[n-1] ⊕ D(C[n])

The next sections will deal with how this IV is chosen.

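The two definitions above can be sketched in a few lines of Python.
This is only an illustration: E and D below are a toy Feistel
construction built from SHA-256, chosen so the example runs with the
standard library alone. It is an assumption of this sketch, not a
cipher used by cryptoloop or dm-crypt, and must not be used for real
data.

```python
import hashlib

BLOCK = 16  # bytes per cipher block, as for a 128-bit cipher


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _f(key: bytes, rnd: int, half: bytes) -> bytes:
    # Round function of the toy Feistel cipher (illustrative only).
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[:BLOCK // 2]


def E(key: bytes, block: bytes) -> bytes:
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for rnd in range(4):
        left, right = right, xor(left, _f(key, rnd, right))
    return left + right


def D(key: bytes, block: bytes) -> bytes:
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for rnd in reversed(range(4)):
        left, right = xor(right, _f(key, rnd, left)), left
    return left + right


def cbc_encrypt(key, iv, plain_blocks):
    prev, out = iv, []                    # C[-1] = IV
    for p in plain_blocks:
        prev = E(key, xor(p, prev))       # C[n] = E(P[n] xor C[n-1])
        out.append(prev)
    return out


def cbc_decrypt(key, iv, cipher_blocks):
    prev, out = iv, []                    # C[-1] = IV
    for c in cipher_blocks:
        out.append(xor(prev, D(key, c)))  # P[n] = C[n-1] xor D(C[n])
        prev = c
    return out


key = b"demo key"
iv = bytes(BLOCK)
sector = [bytes([i]) * BLOCK for i in range(32)]  # 32 blocks = 512 bytes
cipher = cbc_encrypt(key, iv, sector)
recovered = cbc_decrypt(key, iv, cipher)
```

Note how cbc_decrypt needs only the current and the preceding cipher
block for each step, while cbc_encrypt has to carry the chaining value
through the whole sector.
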
The IV Modes

The "public-IV"

The IV for sector n is simply the 32-bit version of the number n,
encoded in little-endian and padded with zeros to the block size of the
cipher used, if necessary. This is the simplest IV mode, but at the
same time the most vulnerable.

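As a small sketch of the description above, the public-IV can be
computed as follows. The function name and the decision to truncate
larger sector numbers to their low 32 bits are assumptions made for
this example.

```python
import struct


def public_iv(sector: int, block_size: int = 16) -> bytes:
    # 32-bit little-endian sector number, zero-padded to the block size.
    return struct.pack("<I", sector & 0xFFFFFFFF).ljust(block_size, b"\x00")


iv_for_sector_1 = public_iv(1)
```

Anyone who knows the sector number can compute this value, which is
what the attacks in the following sections rely on.
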
ESSIV

E(Sector|Salt) IV, ESSIV for short, derives the IV from key material by
encrypting the sector number with a hashed version of the key material,
the salt. ESSIV does not specify a particular hash algorithm, but the
digest size of the hash must be an accepted key size for the block
cipher in use. As the IV depends on a non-public piece of information,
the key, the sequence of IVs is not known, and the attacks based on it
can't be launched.

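The derivation can be sketched as below. Two things are assumptions of
this example and not part of the text: SHA-256 as the hash, and the
encoding of the sector number as a zero-padded little-endian 64-bit
value. E is again only a stand-in (a truncated HMAC, which is not
invertible), used so the example runs without a cipher library.

```python
import hashlib
import hmac
import struct

BLOCK = 16


def E(key: bytes, block: bytes) -> bytes:
    # Stand-in for the block cipher E; a real setup would use e.g. AES.
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]


def essiv(key: bytes, sector: int) -> bytes:
    salt = hashlib.sha256(key).digest()  # hashed key material
    encoded = struct.pack("<Q", sector).ljust(BLOCK, b"\x00")
    return E(salt, encoded)              # IV = E_salt(sector number)


iv0 = essiv(b"disk key", 0)
iv1 = essiv(b"disk key", 1)
```

Without the key, an observer cannot predict how iv0 and iv1 differ,
in contrast to the public-IV where the difference is known.
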
Plumb-IV

The IV is computed by hashing (or MAC-ing) the plain text from the
second block to the last. Additionally, the sector number and the key
are used as input. If a byte changes in the plain text of the blocks 2
to n, the first block is influenced through the change of the IV. As
the first encryption affects all subsequent encryption steps due to the
nature of CBC, the whole sector is changed.

Decryption is possible because CBC is not recursive for decryption. The
prerequisites for a successful CBC decryption are two consecutive
cipher blocks. The latter one is decrypted and the former one is XOR-ed
into the decryption result, yielding the original plain text.
Therefore, independent of the IV scheme, decryption is possible from
the 2nd to the last block. After the recovery of these plain text
blocks, the IV can be computed, and finally the first block can be
decrypted as well.

The only weakness of this scheme is its performance. It has to process
the data twice: first to obtain the IV, and then to produce the CBC
encryption with this IV. With the same performance penalty CMC is able
to achieve better security properties (CMC is discussed later), thus
plumb-IV will remain unimplemented.

The attack arsenal

Content leaks

This attack can be mounted against any system operating in CBC mode. It
rests on the property that in CBC decryption the preceding cipher
block's influence is simple, that is, it is XORed into the plain text.
The preceding cipher block, C[n-1], is readily available on disk (for
n > 0) or may be deduced from the IV (for n = 0). If an attacker finds
two blocks with identical cipher text, he knows that both cipher texts
have been formed according to:
   C[m] = E(P[m] ⊕ C[m-1])
   C[n] = E(P[n] ⊕ C[n-1])
Since he found that C[m] = C[n], it holds that
   P[m] ⊕ C[m-1] = P[n] ⊕ C[n-1]
which can be rewritten as
   C[m-1] ⊕ C[n-1] = P[n] ⊕ P[m]
The left hand side is known to the attacker by reading the preceding
cipher text from disk. If one of the blocks is the first block of a
sector, the IV must be examined instead (when it is available, as it is
in public-IV). The attacker is now able to deduce the difference
between the plain texts by examining the difference of C[m-1] and
C[n-1]. If one of the plain text blocks happens to be zero, the
difference yields the original content of the other plain text block.

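The relation can be checked with a short sketch. The collision C[m] =
C[n] is forced here by choosing C[n-1] suitably, whereas on a real disk
it would have to occur by chance. E is a stand-in (a truncated HMAC),
and all values are made up for the example.

```python
import hashlib
import hmac

BLOCK = 16


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def E(key: bytes, block: bytes) -> bytes:
    # Stand-in for the block cipher E; only encryption is needed here.
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]


key = b"unknown to the attacker"
c_m_prev = bytes(range(BLOCK))           # C[m-1], readable from disk
p_m = bytes(BLOCK)                       # P[m], happens to be all zero
p_n = b"secret data 16B!"                # P[n], unknown to the attacker
c_n_prev = xor(xor(p_m, c_m_prev), p_n)  # C[n-1], chosen to force C[m] = C[n]

c_m = E(key, xor(p_m, c_m_prev))
c_n = E(key, xor(p_n, c_n_prev))

# The attacker sees c_m == c_n and XORs the two preceding cipher blocks.
leaked = xor(c_m_prev, c_n_prev)         # equals P[m] xor P[n]
```

Because P[m] is zero in this example, leaked is exactly P[n]; the key
was never used by the attacker.
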
Another piece of information is available to the attacker. If further
identical cipher blocks follow directly after the initial identical
pair, the corresponding plain text blocks are equal as well. No
information about the content of those blocks can be extracted, since
the information is extracted from the respective preceding cipher
blocks, but those are all required to be equal.

Let's have a look at the chance of succeeding with this attack.
Assuming the output of a cipher forms a uniform distribution, the
chance, p, that two given blocks are identical is 2^-blocksize. For
instance, p = 1/2^128 for a 128-bit cipher. Because the number of
possible pairs develops as an arithmetic series in n, the number of
cipher blocks, the chance of not finding two identical blocks is given
by
   (1-p)^(n(n-1)/2)
As p is very small, but in contrast the exponent is very big, we apply
the logarithm to get meaningful answers, that is
   n(n-1)/2 × ln(1-p)
An example: The number of cipher blocks available on a 200GB disk with
known C[n-1] is 200GB × 1024^2 KB/GB × 64 blocks/KB ^1. In other words,
a 128-bit block is 16 bytes, so the number of 16-byte blocks in a 200GB
hard disk is 13.4 billion. Therefore, n = 1.342e10. For a 128-bit
cipher, p = 2^-128. Hence,
   ln(1-p) = -2.939e-39
   n(n-1)/2 = 9.007e19

   n(n-1)/2 × ln(1-p) = -2.647e-19
   1 - e^(-2.647e-19) = 2.647e-19
The last term is the chance of finding at least one pair of identical
cipher blocks. But how does this number grow in n? While it is small,
it grows roughly quadratically, because the number of pairs grows with
n^2. Plotting a few decimal powers shows that the chance of finding at
least one identical cipher pair flips to 1 between n = 10^19 and
n = 10^20 (between n = 10^38 and n = 10^39 for a 256-bit cipher). For a
128-bit cipher, n = 10^19 blocks correspond to about 146 million TB of
storage (for a 256-bit cipher, n = 10^38 blocks of 32 bytes correspond
to roughly 10^27 TB).

^1 The number of blocks with an available preceding cipher block is 62
per KB for all non-public IV schemes, i.e. ESSIV/plumb IV.

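The estimate can be reproduced with the following sketch. The function
name is made up for this example; log1p and expm1 are used because p is
far too small for the naive formula to survive floating point rounding.

```python
import math


def collision_probability(n_blocks: int, block_bits: int) -> float:
    # Chance of at least one identical pair among n_blocks cipher blocks,
    # assuming uniformly distributed cipher output.
    p = 2.0 ** -block_bits
    pairs = n_blocks * (n_blocks - 1) / 2
    log_no_collision = pairs * math.log1p(-p)  # n(n-1)/2 * ln(1-p)
    return -math.expm1(log_no_collision)       # 1 - e^(...)


n = 200 * 1024**3 // 16  # 16-byte blocks on a 200GB disk
chance = collision_probability(n, 128)
```

Evaluating collision_probability for n = 10**19 and n = 10**20 with 128
bits shows the transition from a small chance to near certainty.
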
Data existence leak: The Watermark

No IV format discussed on this page allows the user to deny the
existence of encrypted data. Neither cryptoloop nor dm-crypt is an
implementation of a deniable cryptography system. But the problem is
more serious with public-IV.

With public-IV and the predictable difference it introduces in the
first blocks of a sequence of plain text, data can be watermarked,
which means that the watermarked data is detectable even when the key
has not been recovered. As shown in the section above, the existence of
two blocks with identical cipher text is very unlikely and coincidence
can be excluded, which is relevant when somebody tries to demonstrate
in court that certain data is in an encrypted partition.

As the IV progresses with a foreseeable pattern and is guaranteed to
change the least significant bit every step, we can build an identical
pair of cipher texts by writing three consecutive sectors, each with a
flipped LSB relative to the previous one. (The reason it's three
instead of two is that the second least significant bit might change as
well.) This "public-IV"-driven CBC encryption will output exactly the
same cipher text for two consecutive sectors. An attacker can search
the disk for identical consecutive blocks to find the watermark. This
can be done in a single pass, and is much more feasible than finding
two identical blocks that are scattered over the disk, as in the
previous attack. A few bits of information can be encoded into the
watermarks, which might serve as a tag to prove the existence of
copyright infringing material.

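The three-sector construction can be sketched as follows. The sector
content, the helper names and the stand-in E (a truncated HMAC) are
assumptions of this example; only the public-IV rule and the LSB flip
are taken from the text.

```python
import hashlib
import hmac
import struct

BLOCK = 16
BLOCKS_PER_SECTOR = 32  # 512-byte sectors


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def E(key: bytes, block: bytes) -> bytes:
    # Stand-in for the block cipher E; only encryption is needed here.
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]


def public_iv(sector: int) -> bytes:
    return struct.pack("<I", sector & 0xFFFFFFFF).ljust(BLOCK, b"\x00")


def encrypt_sector(key, sector, plain_blocks):
    prev, out = public_iv(sector), []
    for p in plain_blocks:
        prev = E(key, xor(p, prev))
        out.append(prev)
    return out


def watermark(count=3):
    # Consecutive sectors whose first block differs only in its LSB.
    first = b"W" * BLOCK
    rest = [b"M" * BLOCK] * (BLOCKS_PER_SECTOR - 1)
    sectors = []
    for _ in range(count):
        sectors.append([first] + rest)
        first = bytes([first[0] ^ 1]) + first[1:]
    return sectors


def find_watermark(key, start_sector):
    # Indices i where sector i and i+1 have identical cipher text.
    cipher = [encrypt_sector(key, start_sector + i, s)
              for i, s in enumerate(watermark())]
    return [i for i in range(len(cipher) - 1) if cipher[i] == cipher[i + 1]]


hits_even = find_watermark(b"any key", 10)
hits_odd = find_watermark(b"any key", 11)
```

Whether the data lands on an even or an odd sector, one of the two
adjacent pairs collides, which is why three sectors suffice without
knowing the position on disk.
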
A complete description of watermarking can be found in [22]Encrypted
Watermarks and Linux Laptop Security. The attack can be defeated by
using ESSIV.

Data modification leak

CBC encryption is recursive, so the n-th block depends on all previous
blocks. But a dependency in the other direction would also be
desirable. Why? The weakness becomes visible if storage on a remote
computer is used, or more likely, if the hard disk exhibits good
forensic properties. The point is, the attacker has to have access to
an earlier (in time) cipher text of a sector, either by recording it
from the network, or by using forensic methods.

An attacker can now guess data modification patterns by examining the
historic data. If a sector is overwritten with a partially changed
plain text, there is a number of bytes at the beginning which are
unchanged. This point of change^2 is directly reflected in the cipher
text. So an attacker can deduce the point of the change in plain text
by finding the point where the cipher text starts to differ.

This weakness is present in public-IV and ESSIV.

^2 aligned to the cipher block size boundaries

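A sketch of what the attacker does with two snapshots of the same
sector. E is a stand-in (a truncated HMAC), the IV is fixed because the
sector number does not change between the two writes, and the helper
names are made up for the example.

```python
import hashlib
import hmac

BLOCK = 16


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def E(key: bytes, block: bytes) -> bytes:
    # Stand-in for the block cipher E; only encryption is needed here.
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]


def encrypt_sector(key, iv, plain_blocks):
    prev, out = iv, []
    for p in plain_blocks:
        prev = E(key, xor(p, prev))
        out.append(prev)
    return out


def first_changed_block(old_cipher, new_cipher):
    # Index of the first cipher block that differs, or None if identical.
    for i, (old, new) in enumerate(zip(old_cipher, new_cipher)):
        if old != new:
            return i
    return None


key, iv = b"unknown key", bytes(BLOCK)
old_plain = [bytes([i]) * BLOCK for i in range(32)]
new_plain = list(old_plain)
new_plain[5] = b"X" + new_plain[5][1:]  # one byte changes in block 5

change_at = first_changed_block(encrypt_sector(key, iv, old_plain),
                                encrypt_sector(key, iv, new_plain))
```

The attacker learns the block index of the first change without
knowing the key.
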
Malleable plain text

The decryption structure of CBC is the source of this weakness.
Malleability (with respect to cryptography) is defined as the property
that a modification of the cipher text results in a predictable change
of the plain text. To put it formally, there is a function f that,
applied to the cipher text, C' = f(C), is associated with a known
function f', which correctly predicts the resulting plain text
P' = D(C') from P, that is, P' = f'(P).

As we can see in its definition, CBC decryption depends on C[n-1]. An
attacker can flip arbitrary bits in the plain text by flipping bits in
C[n-1]. More formally^3, if
   P = P[1] || P[2] || ... || P[i] || ... || P[n]
   C = E[CBC](P)
   C = C[1] || C[2] || ... || C[i-1] || ... || C[n]
the function
   f(C[1] || ... || C[n]) = C[1] || ... || C[i-1] XOR M || ... || C[n]
is associated with the function f', which predicts the resulting plain
text correctly (apart from block P[i-1], which decrypts to garbage) as
   f'(P[1] || ... || P[n]) = P[1] || ... || P[i] XOR M || ... || P[n]
The first block of the CBC cipher text stream is not malleable, because
it depends on the IV, which is not modifiable for an attacker.

^3 The IV parameter for E[CBC] has been intentionally omitted.

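The bit flip can be tried out with the sketch below. E and D are a toy
Feistel construction built from SHA-256, an assumption of this example
that only serves to have an invertible cipher available in the standard
library. Blocks are indexed from 0 in the code.

```python
import hashlib

BLOCK = 16


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _f(key: bytes, rnd: int, half: bytes) -> bytes:
    # Round function of the toy Feistel cipher (illustrative only).
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[:BLOCK // 2]


def E(key: bytes, block: bytes) -> bytes:
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for rnd in range(4):
        left, right = right, xor(left, _f(key, rnd, right))
    return left + right


def D(key: bytes, block: bytes) -> bytes:
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for rnd in reversed(range(4)):
        left, right = xor(right, _f(key, rnd, left)), left
    return left + right


def cbc_encrypt(key, iv, plain_blocks):
    prev, out = iv, []
    for p in plain_blocks:
        prev = E(key, xor(p, prev))
        out.append(prev)
    return out


def cbc_decrypt(key, iv, cipher_blocks):
    prev, out = iv, []
    for c in cipher_blocks:
        out.append(xor(prev, D(key, c)))
        prev = c
    return out


key, iv = b"unknown key", bytes(BLOCK)
plain = [bytes([i]) * BLOCK for i in range(1, 5)]
cipher = cbc_encrypt(key, iv, plain)

i = 2                                        # target plain text block
mask = b"\x01" + bytes(BLOCK - 1)            # M: flip one bit
tampered = list(cipher)
tampered[i - 1] = xor(cipher[i - 1], mask)   # f: C[i-1] XOR M
result = cbc_decrypt(key, iv, tampered)
```

result[i] equals plain[i] XOR M, result[i - 1] is garbage, and all
other blocks are unchanged.
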
Movable

At the expense of one block decrypting to garbage, an attacker can move
plain text around as he likes. CBC decryption depends on two variables,
C[n-1] and C[n]. Both can be modified at will. To make meaningful
modifications, an attacker has to replace the pair C[n-1] and C[n] with
another cipher text pair from disk. The first block C[n-1] will decrypt
to garbage, but the second block C[n] will yield a copy of the plain
text of the copied cipher block. This attack is also known as the
copy&paste attack. It is mountable against any CBC setup. The only
limitation is that the first block, C[0], can't be replaced with
something meaningful, as C[-1] can't be modified, because it's the IV.

CMC and EME: Tweakable wide block cipher modes

CMC is a new chaining mode. It stands for ''CBC-Mask-CBC''. It works by
processing the data in three steps: first CBC, then masking the cipher
text, and then another CBC step, but this time backwards. The last step
introduces a dependency from the last block to the first block. The
authors of the CMC paper provide a proof for the security of this mode,
making a secure 128-bit cipher a secure 4096-bit cipher (sector size).
As in normal CBC, this scheme also takes an IV, but the authors call it
a tweak.

EME is CMC's cousin. EME has also been authored by Halevi and Rogaway,
and for the same purpose. The difference to CMC is that EME is
parallelizable, that is, all operations of the underlying cipher can be
evaluated in parallel. To introduce an interdependency among the
resulting cipher blocks, the encryption happens in two stages. Between
these stages a mask is computed from all intermediate blocks and
applied to each intermediate block. This step causes an interdependency
among the cipher blocks. After applying the mask, another encryption
step diffuses the mask.

The interdependency among the resulting blocks allows CMC and EME to be
nonmovable and nonmalleable, and to prevent content leaks and in-sector
data modification patterns. The tweaks are encrypted by both cipher
modes, thus both are nonwatermarkable.

For simplicity, the EME description above omitted the pre- and
post-whitening steps as well as the multiplications in GF(2^128). An
in-depth specification can be found at the [23]Cryptology ePrint
Archive. An applicable draft specification for EME-32-AES can be found
at [24]SISWG. I have written an EME-32-AES test implementation for
Linux 2.6. It's available [25]here. The CMC paper is available from the
[26]Cryptology ePrint Archive as well.

LRW: A tweakable narrow block cipher mode

EME as well as CMC are comparatively secure cipher modes, but heavy in
terms of performance. LRW tries to cope with most of the security
requirements, and at the same time provide good performance. LRW is a
narrow block cipher mode, that is, it operates only on one block
instead of a whole sector. To tie a cipher block to a location on disk
(to make it unmovable), a logical index is included in the computation.
For LRW you have to provide two keys, one for the cipher and one for
the cipher mode. The second key is multiplied with the logical index in
GF(2^128) and used as pre- and post-whitening for the encryption. With
those whitening steps the block is effectively tied to a logical index.
The logical index is usually the absolute position on disk measured in
units of the block size of the cipher algorithm. The different choice
of the measuring unit is the only difference between the logical index
and the public-IV.

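The whitening can be sketched as below. Several details are assumptions
of this example and are not taken from the LRW draft: the reduction
polynomial x^128 + x^7 + x^2 + x + 1, the plain big-endian mapping
between bytes and field elements, and the stand-in E (a truncated
HMAC). The draft fixes its own bit and byte conventions, so this code
does not produce interoperable output.

```python
import hashlib
import hmac

BLOCK = 16
POLY = (1 << 128) | 0x87  # x^128 + x^7 + x^2 + x + 1 (assumed here)


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def gf_mul(a: int, b: int) -> int:
    # Multiply two elements of GF(2^128); bit i is the coefficient of x^i.
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a >> 128:
            a ^= POLY
        b >>= 1
    return result


def E(key: bytes, block: bytes) -> bytes:
    # Stand-in for the block cipher E; only encryption is shown here.
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]


def lrw_encrypt_block(k1: bytes, k2: bytes, index: int, plain: bytes) -> bytes:
    # k2 must be BLOCK bytes long; the tweak is k2 times the logical index.
    tweak = gf_mul(int.from_bytes(k2, "big"), index).to_bytes(BLOCK, "big")
    return xor(E(k1, xor(plain, tweak)), tweak)  # pre- and post-whitening


k1, k2 = b"cipher key", bytes(range(1, BLOCK + 1))
block = b"same plain text!"
c1 = lrw_encrypt_block(k1, k2, 1, block)
c2 = lrw_encrypt_block(k1, k2, 2, block)
```

The same plain text block yields different cipher text at different
logical indices, which is what ties a block to its position.
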
The LRW draft is available from the [27]SISWG mailing list archive.

Summarising

The following table shows a comparison between the security properties
of different encryption setups and their computational costs. The
number of cipher calls, XOR operations and additional operations are
stated in terms of encryption blocks, n.

IV mode      cipher  content  water-    malle-  movable  modification  cipher  XOR   additional
             mode    leaks    markable  able             detection^5   calls   ops   op.
public-IV    CBC     Yes      Yes       Yes     Yes      Yes           n       n     None
ESSIV        CBC     Yes      No        Yes     Yes      Yes           n+1     n     None
Plumb-IV1^4  CBC     Yes      No        Yes     Yes      No            2n-1    2n    None
public-IV    CMC     No       No        No      No       No            2n+1    2n+1  1 LW GF ⊗
public-IV    EME     No       No        No      No       No            2n+1    5n    3n-1 LW GF ⊗
public-IV    LRW     No       No        No      No       Yes           n       2n    n HW GF ⊗

Legend:
  * LW GF ⊗: light-weight Galois field multiplication, that is, a
    multiplication with a constant x^2^i, which can be computed in
    θ(1).
  * HW GF ⊗: heavy-weight Galois field multiplication, that is, a
    multiplication with an arbitrary constant, which can be computed in
    θ(bits).

^4 plumb-IV1 uses CBC-MAC instead of hashing, so we can make a good
comparison with other ciphers in terms of cipher/XOR calls.
^5 detectable partial in-sector modification
__________________________________________________________________

Clemens Fruhwirth, also author of LUKS and ESSIV, porter of
cryptoloop, aes-i586 for 2.6, twofish-i586, and implementor of
EME-32-AES. This text is an excerpt of my diploma thesis.

This page has been reviewed by

   Dr. Ernst Molitor
   Arno Wagner
   James Hughes, "Security in Storage Working Group" chair

Additional thanks to Pascal Brisset for pointing out an error in the
Bernoulli estimation in an earlier version of this document, and to
Adam J. Richter for pointing out an error in the KB/GB ratio.

Content and design, Copyright © 2004-2008 Clemens Fruhwirth, unless
stated otherwise
Original design by [28]haran | Additional art by [29]LinuxArt | Blog
by [30]NanoBlogger

References

   1. http://clemens.endorphin.org/
   2. http://clemens.endorphin.org/credits
   3. http://clemens.endorphin.org/aboutme
   4. http://clemens.endorphin.org/cryptography
   5. http://blog.clemens.endorphin.org/
   6. http://clemens.endorphin.org/patches
   7. http://clemens.endorphin.org/archive
   8. http://clemens.endorphin.org/Cryptoloop_Migration_Guide
   9. http://clemens.endorphin.org/LUKS
  10. http://clemens.endorphin.org/AFsplitter
  11. http://clemens.endorphin.org/lo-tracker
  12. http://blog.clemens.endorphin.org/2008/12/luks-on-disk-format-revision-111.html
  13. http://blog.clemens.endorphin.org/2008/11/xmonad-gridselect.html
  14. http://blog.clemens.endorphin.org/2008/11/workaround-for-bittorrent-traffic.html
  15. http://blog.clemens.endorphin.org/2008/09/i-love-lolcat-meme.html
  16. http://blog.clemens.endorphin.org/2008/09/counter-steganography-research.html
  17. http://clemens.endorphin.org/cryptography
  18. http://www.siswg.org/
  19. http://en.wikipedia.org/wiki/Block_cipher_modes_of_operation
  20. http://www.freesoft.org/CIE/Topics/143.htm
  21. http://csrc.nist.gov/publications/nistpubs/800-38a/sp800-38a.pdf
  22. http://www.tcs.hut.fi/~mjos/doc/wisa2004.pdf
  23. http://eprint.iacr.org/2003/147/
  24. http://grouper.ieee.org/groups/1619/email/pdf00011.pdf
  25. http://article.gmane.org/gmane.linux.kernel.device-mapper.dm-crypt/544
  26. http://eprint.iacr.org/2003/148/
  27. http://grouper.ieee.org/groups/1619/email/msg00160.html
  28. http://www.oswd.org/user/profile/id/3013
  29. http://www.linuxart.com/
  30. http://nanoblogger.sourceforge.net/