mirror of
https://github.com/Llewellynvdm/Tomb.git
synced 2024-11-25 06:07:38 +00:00
405 lines
20 KiB
Plaintext
405 lines
20 KiB
Plaintext
|
||
Linux hard disk encryption settings
|
||
|
||
This page intends to educate the reader about the existing weaknesses
|
||
of the public-IV on-disk format commonly used with cryptoloop and
|
||
dm-crypt (used in IV-plain mode). This page aims to facilitate risk
|
||
calculation when utilising Linux hard disk encryption. The attacks
|
||
presented on this page may pose a thread to you, but at the same time
|
||
may be totally irrelevant for others. At the end of this document, the
|
||
reader should be able to make a good choice according to his security
|
||
needs.
|
||
|
||
A good quote with respect to this topic is ''All security involves
|
||
trade-offs'' from Beyond Fear (Bruce Schneier). You should keep in mind
|
||
that perfect security is unachievable and by all means shouldn't be
|
||
your goal. For instance, when using pass phrase based cryptography, you
|
||
have to trust in that the underlying system is secure, the computer
|
||
system has not been tampered with, and nobody is watching you. The most
|
||
obvious weakness is the last one, but even if you make sure nobody nor
|
||
any camera is around, how about the keyboard you're typing on? Has it
|
||
been manipulated while you have been getting your lunch?
|
||
|
||
So security comes for a price, and the price when designing
|
||
cryptography security algorithms is performance. You will be introduced
|
||
to the fastest of all setups available, the "public-IV", which
|
||
sacrifices security properties for speed. After that we will talk about
|
||
ESSIV, the newest of IV modes implemented. It comes for a small price,
|
||
but it can deal with watermarking for a relatively small price. Then
|
||
you'll be introduced to the draft specifications of the Security in
|
||
Storage Working Group ([18]SISWG). Currently SISWG is considering EME
|
||
and LRW for standardisation. EME along with it's cousin CMC seems to
|
||
provide the best security level, but imposes additional encryption
|
||
steps. Plumb-IV is discussed only for reference, because it has the
|
||
same performance penalty as CMC, but in contrast suffers from
|
||
weaknesses of CBC encryption.
|
||
|
||
As convention, this document will use the term "blocks", when it
|
||
refers to a single block of plain or cipher text (usually 16 byte),
|
||
and will use the term "sectors", when it refers to a 512-byte wide hard
|
||
disk block.
|
||
|
||
CBC Mode: The basic level
|
||
|
||
Most hard disk encryption systems utilise CBC to encrypt bulk data.
|
||
Good descriptions on CBC and other common cipher modes are available at
|
||
* [19]Wikipedia
|
||
* [20]Connected: An Internet Encyclopedia
|
||
* [21]NIST: Recommendation for Block Cipher Modes of Operation (CBC
|
||
is at PDF Page 17)
|
||
|
||
Please make sure you're familiar with CBC before proceeding.
|
||
|
||
Since CBC encryption is a recursive algorithm, the encryption of the
|
||
n-th block requires the encryption of all preceding blocks, 0 till n-1.
|
||
Thus, if we would run the whole hard disk encryption in CBC mode, one
|
||
would have to re-encrypt the whole hard disk, if the first computation
|
||
step changed, this is, when the first plain text block changed. Of
|
||
course, this is an undesired property, therefore the CBC chaining is
|
||
cut every sector and restarted with a new initialisation vector (IV),
|
||
so we can encrypt sectors individually. The choice of the sector as
|
||
smallest unit matches with the smallest unit of hard disks, where a
|
||
sector is also atomic in terms of access.
|
||
|
||
For reference, I will give a formal definition of CBC encryption and
|
||
decryption. Note, that decryption is not recursive, in contrast to
|
||
encryption, since it's a function only of C[n-1] and C[n].
|
||
Encryption:
|
||
C[-1] = IV
|
||
C[n] = E(P[n] ⊕ C[n-1])
|
||
Decryption:
|
||
C[-1] = IV
|
||
P[n] = C[n-1] ⊕ D(C[n])
|
||
The next sections will deal with how this IV is chosen.
|
||
|
||
The IV Modes
|
||
|
||
The "public-IV"
|
||
|
||
The IV for sector n is simply the 32-bit version of the number n
|
||
encoded in little-endian padded with zeros to the block-size of the
|
||
cipher used, if necessary. This is the most simple IV mode, but at the
|
||
same the most vulnerable.
|
||
|
||
ESSIV
|
||
|
||
E(Sector|Salt) IV, short ESSIV, derives the IV from key material via
|
||
encryption of the sector number with a hashed version of the key
|
||
material, the salt. ESSIV does not specify a particular hash algorithm,
|
||
but the digest size of the hash must be an accepted key size for the
|
||
block cipher in use. As the IV depends on a none public piece of
|
||
information, the key, the sequence of IV is not known, and the attacks
|
||
based on this can't be launched.
|
||
|
||
plumb IV
|
||
|
||
The IV is computed by hashing (or MAC-ing) the plain text from the
|
||
second block till the last. Additionally, the sector number and the key
|
||
are used as input as well. If a byte changes in the plain text of the
|
||
blocks 2 till n, the first block is influenced by the change of the IV.
|
||
As the first encryption effects all subsequent encryption steps due to
|
||
the nature of CBC, the whole sector is changed.
|
||
|
||
Decryption is possible because CBC is not recursive for decryption. The
|
||
prerequisites for a successful CBC decryption are two subsequent cipher
|
||
blocks. The former one is decrypted and the first one is XOR-ed into
|
||
the decryption result yielding the original plain text. Therefore
|
||
independent of the IV scheme, decryption is possible from the 2nd to
|
||
the last block. After the recovery of these plain text blocks, the IV
|
||
can be computed, and finally the first block can be decrypted as well.
|
||
|
||
The only weakness of this scheme is it's performance. It has to process
|
||
data twice: first for obtaining the IV, and then to produce the CBC
|
||
encryption with this IV. With the same performance penalty CMC is able
|
||
to achieve better security properties (CMC is discussed later), thus
|
||
plumb-IV will remain unimplemented.
|
||
|
||
The attack arsenal
|
||
|
||
Content leaks
|
||
|
||
This attack can be mounted against any system operating in CBC Mode. It
|
||
rests on the property, that in CBC decryption, the preceding cipher
|
||
block's influence is simple, that is, it's XORed into the plain text.
|
||
The preceding cipher block, C[n-1], is readily available on disk (for n
|
||
> 0) or may be deduced from the IV (for n = 0). If an attacker finds
|
||
two blocks with identical cipher text, he knows that both cipher texts
|
||
have been formed according to:
|
||
C[m] = E(P[m] ⊕ C[m-1] )
|
||
C[n] = E(P[n] ⊕ C[n-1] )
|
||
Since he found that C[m] = C[n], it holds
|
||
P[m] ⊕ C[m-1] = P[n] ⊕ C[n-1]
|
||
which can be rewritten as
|
||
C[m-1] ⊕ C[n-1] = P[n] ⊕ P[m]
|
||
The left hand side is known to the attacker by reading the preceding
|
||
cipher text from disk. If one of the blocks is the first block of a
|
||
sector, the IV must be examined instead (when it's available as it is
|
||
in public-IV). The attacker is now able to deduce the difference
|
||
between the plain texts by examining the difference of C[m-1] and
|
||
C[n-1]. If one of the plain text blocks happens to be zero, the
|
||
difference yields the original content of the other related plain text
|
||
block.
|
||
|
||
Another information is available to the attacker. Any succeeding
|
||
identical pair of cipher text, that follows the initial identical
|
||
cipher pair, is equal. No information about the content of those pairs
|
||
can be extracted, since the information is extracted from the
|
||
respective preceding cipher blocks, but those are all required to be
|
||
equal.
|
||
|
||
Let's have a look at the chance of succeeding with this attack.
|
||
Assuming the output of a cipher forms an uniform distribution, the
|
||
chance, p, of finding an identical block is 2^-blocksize. For instance,
|
||
p = 1/2^128 for a 128-bit cipher. Because the number of possible pairs
|
||
develops as an arithmetic series in n, the number of sectors, the
|
||
chance of not finding two identical blocks is given by
|
||
(1-p)^n(n-1)/2
|
||
As p is very small, but in contrast the power is very big, we apply the
|
||
logarithm to get meaningful answers, that is
|
||
n(n-1)/2 ln (1-p)
|
||
An example: The number of cipher blocks available on 200GB disk with
|
||
known C[n-1] is 200GB × 1024^2 KB/GB × 64/1KB ^1. Or in other words, a
|
||
128-bit block is 16 bytes, so the number of 16-byte blocks in a 200GB
|
||
hard disk is 13.4 billion. Therefore, n = 1.342e10. For a 128-bit
|
||
cipher, p = 2^-128. Hence,
|
||
ln(1-p) = -2.939e-39
|
||
n(n-1)/2 = 9.007e19
|
||
|
||
n(n-1)/2 ln (1-p) = -2.647e-19
|
||
1-e^-2.776e-13 = 2.647e-19
|
||
The last term is the chance of finding at least one pair of identical
|
||
cipher blocks. But how does this number grow in n? Obviously
|
||
exponentially. Plotting a few a decimal powers shows that the chance
|
||
for finding at least on identical cipher pair flips to 1 around n =
|
||
10^20 (n = 10^40 for a 256-bit cipher). This inflection point is reached
|
||
for a 146 million TB storage (or a hundred thousand trillion trillions
|
||
TB storage for a 256-bit cipher).
|
||
|
||
^1The blocks with available preceding cipher blocks is 62/1KB for all
|
||
non-public IV schemes, i.e. ESSIV/plumb IV
|
||
|
||
Data existence leak: The Watermark
|
||
|
||
No IV format discussed on this page allows the user to deny the
|
||
existence of encrypted data. Neither cryptoloop nor dm-crypt is an
|
||
implementation of a deniable cryptography system. But the problem is
|
||
more serious with public-IV.
|
||
|
||
With public IV and the predicable difference it introduces in the first
|
||
blocks of a sequence of plain text, data can be watermarked, which
|
||
means, the watermarked data is detectable even when the key has not
|
||
been recovered. As shown in the paragraph above, the existence of two
|
||
blocks with identical cipher text is very unlikely and coincidence can
|
||
be excluded, which is relevant when somebody tries to demonstrate
|
||
before the law that certain data is in an encrypted partition.
|
||
|
||
As the IV progresses with a foreseeable pattern and is guaranteed to
|
||
change the least significant bit ever step, we can build identical pair
|
||
of cipher text by writing three consecutive sectors each with a flipped
|
||
LSB relative to the previous. (The reason it's three instead of two is,
|
||
that the second least significant bit might change as well.) This
|
||
"public-IV"-driven CBC encryption will output exactly the same cipher
|
||
text for two consecutive sectors. An attacker can search the disk for
|
||
identical consecutive blocks to find the watermark. This can be done in
|
||
a single pass, and is much more feasible than finding to identical
|
||
blocks, that are scattered on the disk, as in the previous attack. A
|
||
few bits of information can be encoded into the watermarks, which might
|
||
serve as tag to prove the existence copyright infringing material.
|
||
|
||
A complete description of watermarking can be found in [22]Encrypted
|
||
Watermarks and Linux Laptop Security. The attack can be defeated by
|
||
using ESSIV.
|
||
|
||
Data modification leak
|
||
|
||
CBC encryption is recursive, so the n-th block depends on all previous
|
||
blocks. But the other way round would also be nice. Why? The weakness
|
||
becomes visible, if storage on a remote computer is used, or more
|
||
likely, the hard disk exhibits good forensic properties. The point is,
|
||
the attacker has to have access to preceding (in time) cipher text of a
|
||
sector, either by recording it from the network, or by using forensic
|
||
methods.
|
||
|
||
An attacker can now guess data modification patterns by examining the
|
||
historic data. If a sector is overwritten with a partial changed plain
|
||
text, there is an amount of bytes at the beginning, which are
|
||
unchanged. This point of change^2 is directly reflected in the cipher
|
||
text. So an attacker can deduce the point of the change in plain text
|
||
by finding the point where the cipher text starts to differ.
|
||
|
||
This weakness is present in public-IV and ESSIV.
|
||
|
||
^2aligned to the cipher block size boundaries
|
||
|
||
Malleable plain text
|
||
|
||
The decryption structure of CBC is the source of this weakness.
|
||
Malleability (with respect to cryptography) is defined as a
|
||
modification of the cipher text that will resulting in a predictable
|
||
change in plain text. To put it formally, there is a function f(C),
|
||
that, if applied to the cipher text, C' = f(C), will result in a known
|
||
function f', which will predict the resulting plain text, P' = D(C'),
|
||
correctly assuming P is known, that is P' = f'(P).
|
||
|
||
As we can see in it's definition, CBC decryption depends on C[n-1]. An
|
||
attacker can flip arbitrary bits in the plain text by flipping bit in
|
||
C[n-1]. More formally^3, if
|
||
P = P[1] || P[2] || ... || P[i] || ... || P[n]
|
||
C = E[CBC](P)
|
||
C = C[1] || C[2] || ... || C[i-1] || ... || C[n]
|
||
the function
|
||
f(C[1] || ... || C[n]) = C[1] || ... || C[i-1] XOR M || ... || C[n]
|
||
follows the function f', which predicts the resulting plain text
|
||
correctly as,
|
||
f'(P[1] || ... || P[n]) = P[1] || ... || P[i] XOR M || ... || P[n]
|
||
The first block of the CBC cipher text stream is not malleable, because
|
||
it depends on the IV, which is not modifiable for an attacker.
|
||
|
||
^3The IV parameter for E[CBC] has been intentionally omitted.
|
||
|
||
Movable
|
||
|
||
On the expense of one block decrypting to garbage, an attacker can move
|
||
around plain text as he likes. CBC decryption depends on two variables,
|
||
C[n-1] and C[n]. Both can be modified at free will. To make meaningful
|
||
modifications, an attacker has to replace the pair C[n-1] and C[n] with
|
||
other cipher text pair from disk. The first block C[n-1] will decrypt
|
||
to garbage, but the second block C[n] will yield a copy of the plain
|
||
text of the copied cipher block. This attack is also known as
|
||
copy&paste attack. This attack is mountable against any CBC setup. The
|
||
only limitation is, the first block, C[0], can't be replaced with
|
||
something meaningful, as C[-1] can't be modified, because it's the IV.
|
||
|
||
CMC and EME: Tweakable wide block cipher modes
|
||
|
||
CMC is a new chaining mode. It stands for ''CBC-Mask-CBC''. It works by
|
||
processing the data in three steps, first CBC, then masking the cipher
|
||
text, and then another CBC step, but this time backwards. The last step
|
||
introduces a dependency from the last block to the first block. The
|
||
authors of the CMC paper provide a prove for the security of this mode,
|
||
making a secure 128-bit cipher a secure 4096-bit cipher (sector size).
|
||
As in normal CBC, this scheme also takes an IV, but the authors call it
|
||
tweak.
|
||
|
||
EME is CMC's cousin. EME has also been authored by Haveli and Rogaway
|
||
as well been authored for the same purpose. The difference to CMC is,
|
||
that EME is parallelizable, that is, all operations of the underlying
|
||
cipher can be evaluated in parallel. To introduce an interdependency
|
||
among the resulting cipher blocks, the encryption happens in two
|
||
stages. Between these stages a mask is computed from all intermediate
|
||
blocks and applied to each intermediate block. This step causes an
|
||
interdependency among the cipher blocks. After applying the mask,
|
||
another encryption step diffuses the mask.
|
||
|
||
The interdependency among the resulting blocks allow CMC and EME to be
|
||
nonmovable, nonmalleable, to prevent content leaks and in-sector data
|
||
modification patterns. The tweaks are encrypted by both cipher modes,
|
||
thus both are nonwatermarkable.
|
||
|
||
For simplicity, the EME description above omitted the pre- and post-
|
||
whitening steps as well as the multiplications in GF(2^128). An
|
||
in-depth specification can be found at the [23]Cryptology ePrint
|
||
Archive. An applicable draft specification for EME-32-AES can be found
|
||
at [24]SISWG. I have written an EME-32-AES test implementation for
|
||
Linux 2.6. It's available [25]here. The CMC paper is available from the
|
||
[26]Cryptology ePrint Archive as well.
|
||
|
||
LRW: A tweakable narrow block cipher mode
|
||
|
||
EME as well as CMC are comparatively secure cipher modes, but heavy in
|
||
terms of performance. LRW tries to cope with most of security
|
||
requirements, and at the same time provide a good performance. LRW is a
|
||
narrow block cipher mode, that is, it operates only on one block,
|
||
instead of a whole sector. To make a cipher block tied to a location on
|
||
disk (to make it unmovable), a logical index is included in the
|
||
computation. For LRW you have to provide two keys, one for the cipher
|
||
and one for the cipher mode. The second key is multiplied with a
|
||
logical index under GF(2^128) and used as pre- and post- whitening for
|
||
encryption. With those whitening steps the block is effectively tied to
|
||
a logical index. The logical index is usually the absolute position on
|
||
disk measured with the block size of the cipher algorithm. The
|
||
different choice of the measuring unit is the only different between
|
||
the logical index and the public-IV.
|
||
|
||
The LRW draft is available from the [27]SISWG mailing list archive.
|
||
|
||
Summarising
|
||
|
||
The following table shows a comparison between the security properties
|
||
of different encryption setups and their computational costs. The
|
||
number of cipher calls, XOR operations and additional operations are
|
||
stated in terms of encryption blocks, n.
|
||
IV mode cipher mode content leaks watermarkable malleable movable
|
||
modification detection^5 cipher calls XOR ops additional op.
|
||
public-IV CBC Yes Yes Yes Yes Yes n n None
|
||
ESSIV CBC Yes No Yes Yes Yes n+1 n None
|
||
Plumb-IV1^4 CBC Yes No Yes Yes No 2n-1 2n None
|
||
public-IV CMC No No No No No 2n+1 2n+1 1 LW GF ⊗
|
||
public-IV EME No No No No No 2n+1 5n 3n-1 LW GF ⊗
|
||
public-IV LRW No No No No Yes n 2n n HW GF ⊗
|
||
|
||
Legend:
|
||
* LW GF ⊗: light-weight Galois field multiplication, that is, a
|
||
multiplication with a constant x^2^i, which can be computed in
|
||
θ(1).
|
||
* HW GF ⊗: heavy-weight Galois field multiplication, that is, a
|
||
multiplication with an arbitrary constant, which can be computed in
|
||
θ(bits).
|
||
|
||
^4plumb-IV1 uses CBC-MAC instead of hashing, so we can make a good
|
||
comparison with other ciphers in terms of cipher/XOR calls.
|
||
^5detectable partial in-sector modification
|
||
__________________________________________________________________
|
||
|
||
Clemens Fruhwirth, , also author of LUKS and ESSIV, porter of
|
||
cryptoloop, aes-i586 for 2.6., twofish-i586, and implementor of
|
||
EME-32-AES. This text is an excerpt of my diploma thesis.
|
||
|
||
This page has been reviewed by
|
||
|
||
Dr. Ernst Molitor
|
||
Arno Wagner
|
||
James Hughes , "Security in Storage Working Group" chair
|
||
|
||
Additional thanks to Pascal Brisset, for pointing out an error in the
|
||
Bernoulli estimation in an earlier version of this document, further
|
||
Adam J. Richter for pointing out an error in the KB/GB ratio.
|
||
|
||
Content and design, Copyright © 2004-2008 Clemens Fruhwirth, unless
|
||
stated otherwise
|
||
Original design by [28]haran | Additional art by [29]LinuxArt | | Blog
|
||
by [30]NanoBlogger
|
||
|
||
References
|
||
|
||
1. http://clemens.endorphin.org/
|
||
2. http://clemens.endorphin.org/credits
|
||
3. http://clemens.endorphin.org/aboutme
|
||
4. http://clemens.endorphin.org/cryptography
|
||
5. http://blog.clemens.endorphin.org/
|
||
6. http://clemens.endorphin.org/patches
|
||
7. http://clemens.endorphin.org/archive
|
||
8. http://clemens.endorphin.org/Cryptoloop_Migration_Guide
|
||
9. http://clemens.endorphin.org/LUKS
|
||
10. http://clemens.endorphin.org/AFsplitter
|
||
11. http://clemens.endorphin.org/lo-tracker
|
||
12. http://blog.clemens.endorphin.org/2008/12/luks-on-disk-format-revision-111.html
|
||
13. http://blog.clemens.endorphin.org/2008/11/xmonad-gridselect.html
|
||
14. http://blog.clemens.endorphin.org/2008/11/workaround-for-bittorrent-traffic.html
|
||
15. http://blog.clemens.endorphin.org/2008/09/i-love-lolcat-meme.html
|
||
16. http://blog.clemens.endorphin.org/2008/09/counter-steganography-research.html
|
||
17. http://clemens.endorphin.org/cryptography
|
||
18. http://www.siswg.org/
|
||
19. http://en.wikipedia.org/wiki/Block_cipher_modes_of_operation
|
||
20. http://www.freesoft.org/CIE/Topics/143.htm
|
||
21. http://csrc.nist.gov/publications/nistpubs/800-38a/sp800-38a.pdf
|
||
22. http://www.tcs.hut.fi/~mjos/doc/wisa2004.pdf
|
||
23. http://eprint.iacr.org/2003/147/
|
||
24. http://grouper.ieee.org/groups/1619/email/pdf00011.pdf
|
||
25. http://article.gmane.org/gmane.linux.kernel.device-mapper.dm-crypt/544
|
||
26. http://eprint.iacr.org/2003/148/
|
||
27. http://grouper.ieee.org/groups/1619/email/msg00160.html
|
||
28. http://www.oswd.org/user/profile/id/3013
|
||
29. http://www.linuxart.com/
|
||
30. http://nanoblogger.sourceforge.net/
|