mirror of
https://github.com/torvalds/linux.git
synced 2026-04-18 06:44:00 -04:00
Update verity.rst to add a dedicated section about FEC and improve the documentation for the FEC-related parameters. Signed-off-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
350 lines
15 KiB
ReStructuredText
=========
dm-verity
=========

Device-Mapper's "verity" target provides transparent integrity checking of
block devices using a cryptographic digest provided by the kernel crypto API.
This target is read-only.

Construction Parameters
=======================

::

    <version> <dev> <hash_dev>
    <data_block_size> <hash_block_size>
    <num_data_blocks> <hash_start_block>
    <algorithm> <digest> <salt>
    [<#opt_params> <opt_params>]

<version>
    This is the type of the on-disk hash format.

    0 is the original format used in Chromium OS.
      The salt is appended when hashing, digests are stored contiguously and
      the rest of the block is padded with zeroes.

    1 is the current format that should be used for new devices.
      The salt is prepended when hashing and each digest is
      padded with zeroes to a power-of-two size.

<dev>
    This is the device containing data, the integrity of which needs to be
    checked.  It may be specified as a path, like /dev/sdaX, or a device number,
    <major>:<minor>.

<hash_dev>
    This is the device that supplies the hash tree data.  It may be
    specified in the same way as <dev> and may be the same device.  If the
    same device is used, <hash_start_block> should be outside the configured
    dm-verity device.

<data_block_size>
    The block size on a data device in bytes.
    Each block corresponds to one digest on the hash device.

<hash_block_size>
    The size of a hash block in bytes.

<num_data_blocks>
    The number of data blocks on the data device.  Additional blocks are
    inaccessible.  You can place hashes on the same partition as the data; in
    this case, the hashes are placed after <num_data_blocks>.

<hash_start_block>
    This is the offset, in <hash_block_size>-blocks, from the start of hash_dev
    to the root block of the hash tree.

<algorithm>
    The cryptographic hash algorithm used for this device.  This should
    be the name of the algorithm, like "sha1".

<digest>
    The hexadecimal encoding of the cryptographic hash of the root hash block
    and the salt.  This hash should be trusted as there is no other authenticity
    beyond this point.

<salt>
    The hexadecimal encoding of the salt value.

<#opt_params>
    Number of optional parameters. If there are no optional parameters,
    the optional parameters section can be skipped or #opt_params can be zero.
    Otherwise #opt_params is the number of following arguments.

    Example of optional parameters section::

        1 ignore_corruption

ignore_corruption
    Log corrupted blocks, but allow read operations to proceed normally.

restart_on_corruption
    Restart the system when a corrupted block is discovered. This option is
    not compatible with ignore_corruption and requires user space support to
    avoid restart loops.

panic_on_corruption
    Panic the device when a corrupted block is discovered. This option is
    not compatible with ignore_corruption and restart_on_corruption.

restart_on_error
    Restart the system when an I/O error is detected.
    This option can be combined with the restart_on_corruption option.

panic_on_error
    Panic the device when an I/O error is detected. This option is
    not compatible with the restart_on_error option but can be combined
    with the panic_on_corruption option.

ignore_zero_blocks
    Do not verify blocks that are expected to contain zeroes and always return
    zeroes instead. This may be useful if the partition contains unused blocks
    that are not guaranteed to contain zeroes.

use_fec_from_device <fec_dev>
    Use forward error correction (FEC) parity data from the specified device to
    try to automatically recover from corruption and I/O errors.

    If this option is given, then <fec_roots> and <fec_blocks> must also be
    given.  <hash_block_size> must also be equal to <data_block_size>.

    <fec_dev> can be the same as <dev>, in which case <fec_start> must be
    outside the data area.  It can also be the same as <hash_dev>, in which case
    <fec_start> must be outside the hash and optional additional metadata areas.

    If the data <dev> is encrypted, the <fec_dev> should be too.

    For more information, see `Forward error correction`_.

fec_roots <num>
    The number of parity bytes in each 255-byte Reed-Solomon codeword.  The
    Reed-Solomon code used will be an RS(255, k) code where k = 255 - fec_roots.

    The supported values are 2 through 24 inclusive.  Higher values provide
    stronger error correction.  However, the minimum value of 2 already provides
    strong error correction due to the use of interleaving, so 2 is the
    recommended value for most users.  fec_roots=2 corresponds to an
    RS(255, 253) code, which has a space overhead of about 0.8%.

fec_blocks <num>
    The total number of <data_block_size> blocks that are error-checked using
    FEC.  This must be at least the sum of <num_data_blocks> and the number of
    blocks needed by the hash tree.  It can include additional metadata blocks,
    which are assumed to be accessible on <hash_dev> following the hash blocks.

    Note that this is *not* the number of parity blocks.  The number of parity
    blocks is inferred from <fec_blocks>, <fec_roots>, and <data_block_size>.

fec_start <offset>
    This is the offset, in <data_block_size> blocks, from the start of <fec_dev>
    to the beginning of the parity data.

check_at_most_once
    Verify data blocks only the first time they are read from the data device,
    rather than every time.  This reduces the overhead of dm-verity so that it
    can be used on systems that are memory and/or CPU constrained.  However, it
    provides a reduced level of security because only offline tampering of the
    data device's content will be detected, not online tampering.

    Hash blocks are still verified each time they are read from the hash device,
    since verification of hash blocks is less performance critical than data
    blocks, and a hash block will not be verified any more after all the data
    blocks it covers have been verified anyway.

root_hash_sig_key_desc <key_description>
    This is the description of the USER_KEY that the kernel will look up to get
    the pkcs7 signature of the root hash. The pkcs7 signature is used to validate
    the root hash during the creation of the device mapper block device.
    Verification of the root hash depends on the config
    DM_VERITY_VERIFY_ROOTHASH_SIG being set in the kernel.  The signatures are
    checked against the builtin trusted keyring by default, or the secondary
    trusted keyring if DM_VERITY_VERIFY_ROOTHASH_SIG_SECONDARY_KEYRING is set.
    The secondary trusted keyring includes by default the builtin trusted
    keyring, and it can also gain new certificates at run time if they are
    signed by a certificate already in the secondary trusted keyring.

try_verify_in_tasklet
    If verity hashes are in cache and the IO size does not exceed the limit,
    verify data blocks in bottom half instead of workqueue. This option can
    reduce IO latency. The size limits can be configured via
    /sys/module/dm_verity/parameters/use_bh_bytes. The four comma-separated
    values correspond to limits for IOPRIO_CLASS_NONE, IOPRIO_CLASS_RT,
    IOPRIO_CLASS_BE and IOPRIO_CLASS_IDLE in turn.
    For example::

        <none>,<rt>,<be>,<idle>
        4096,4096,4096,4096

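The compatibility rules stated for the corruption and error handling options
above can be checked before a table line is assembled. The following is only a
sketch in Python: ``CONFLICTS`` and ``build_opt_params`` are hypothetical names
that are not part of dm-verity or any existing tool, and the conflict list
covers only the incompatibilities that this document states explicitly.

```python
# Pairs of optional parameters that this document describes as incompatible.
# (Hypothetical helper data; not taken from the kernel sources.)
CONFLICTS = [
    ("ignore_corruption", "restart_on_corruption"),
    ("ignore_corruption", "panic_on_corruption"),
    ("restart_on_corruption", "panic_on_corruption"),
    ("restart_on_error", "panic_on_error"),
]


def build_opt_params(options):
    """Build the '<#opt_params> <opt_params>' part of a verity table line.

    'options' is a list of tuples such as ("ignore_corruption",) or
    ("fec_roots", 2).  Every word, including option arguments, counts
    towards <#opt_params>.
    """
    names = {opt[0] for opt in options}
    for first, second in CONFLICTS:
        if first in names and second in names:
            raise ValueError(f"{first} is not compatible with {second}")
    words = [str(word) for opt in options for word in opt]
    return " ".join([str(len(words))] + words)
```

For instance, ``build_opt_params([("ignore_corruption",)])`` yields the
``1 ignore_corruption`` section shown in the <#opt_params> example, and an
option that takes an argument contributes two words to the count.
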
Theory of operation
===================

dm-verity is meant to be set up as part of a verified boot path.  This
may be anything ranging from a boot using tboot or trustedgrub to just
booting from a known-good device (like a USB drive or CD).

When a dm-verity device is configured, it is expected that the caller
has been authenticated in some way (cryptographic signatures, etc).
After instantiation, all hashes will be verified on-demand during
disk access.  If they cannot be verified up to the root node of the
tree, the root hash, then the I/O will fail.  This should detect
tampering with any data on the device and the hash data.

Cryptographic hashes are used to assert the integrity of the device on a
per-block basis. This allows for a lightweight hash computation on first read
into the page cache. Block hashes are stored linearly, aligned to the nearest
block size.

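As an illustration of the per-block hashing, the sketch below computes the
digest of a single block with the salt placed as described for the <version>
parameter (appended for format 0, prepended for format 1). It uses Python's
``hashlib`` purely for demonstration; ``salted_digest`` is a hypothetical
helper name, and the sketch does not cover how digests are packed into hash
blocks.

```python
import hashlib


def salted_digest(block, salt, version=1, algorithm="sha256"):
    """Return the digest of one block, salted according to <version>."""
    h = hashlib.new(algorithm)
    if version == 1:
        # Format 1: the salt is prepended when hashing.
        h.update(salt)
        h.update(block)
    elif version == 0:
        # Format 0: the salt is appended when hashing.
        h.update(block)
        h.update(salt)
    else:
        raise ValueError("unknown hash format version")
    return h.digest()
```

With a non-empty salt the two formats produce different digests for the same
block, which is why <version> must match the format the hash device was
created with.
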
Hash Tree
---------

Each node in the tree is a cryptographic hash.  If it is a leaf node, the hash
of some data block on disk is calculated. If it is an intermediary node,
the hash of a number of child nodes is calculated.

Each entry in the tree is a collection of neighboring nodes that fit in one
block.  The number is determined based on block_size and the size of the
selected cryptographic digest algorithm.  The hashes are linearly-ordered in
this entry and any unaligned trailing space is ignored but included when
calculating the parent node.

The tree looks something like:

	alg = sha256, num_blocks = 32768, block_size = 4096

::

                                 [   root    ]
                                /    . . .    \
                     [entry_0]                 [entry_1]
                    /  . . .  \                 . . .   \
         [entry_0_0]   . . .  [entry_0_127]    . . . .  [entry_1_127]
           / ... \             /   . . .  \             /           \
     blk_0 ... blk_127  blk_16256   blk_16383      blk_32640 . . . blk_32767

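The shape of the example tree can be reproduced with a short calculation. The
sketch below assumes the version 1 format, where each digest occupies a
power-of-two sized slot, and uses the hypothetical helper name
``hash_tree_levels``; it counts hash blocks per level only and does not model
the on-disk ordering.

```python
import hashlib


def hash_tree_levels(num_blocks, block_size, algorithm="sha256"):
    """Return the number of hash blocks per level, from the leaves to the root."""
    digest_size = hashlib.new(algorithm).digest_size
    # Assumption (format 1): each digest is padded to a power-of-two size.
    slot_size = 1 << (digest_size - 1).bit_length()
    hashes_per_block = block_size // slot_size
    levels = []
    count = num_blocks
    while True:
        # Each hash block covers up to 'hashes_per_block' nodes of the level below.
        count = -(-count // hashes_per_block)  # ceiling division
        levels.append(count)
        if count == 1:
            return levels
```

For sha256 with 4096-byte blocks, one hash block holds 128 digests, so 32768
data blocks need 256 leaf-level hash blocks (``entry_0_0`` to
``entry_1_127``), 2 blocks above them (``entry_0`` and ``entry_1``) and a
single root block.
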
Forward error correction
------------------------

dm-verity's optional forward error correction (FEC) support adds strong error
correction capabilities to dm-verity.  It allows systems that would be rendered
inoperable by errors to continue operating, albeit with reduced performance.

FEC uses Reed-Solomon (RS) codes that are interleaved across the entire
device(s), allowing long bursts of corrupt or unreadable blocks to be recovered.

dm-verity validates any FEC-corrected block against the wanted hash before using
it.  Therefore, FEC doesn't affect the security properties of dm-verity.

The integration of FEC with dm-verity provides significant benefits over a
separate error correction layer:

- dm-verity invokes FEC only when a block's hash doesn't match the wanted hash
  or the block cannot be read at all.  As a result, FEC doesn't add overhead to
  the common case where no error occurs.

- dm-verity hashes are also used to identify erasure locations for RS decoding.
  This allows correcting twice as many errors.

FEC uses an RS(255, k) code where k = 255 - fec_roots.  fec_roots is usually 2.
This means that each k (usually 253) message bytes have fec_roots (usually 2)
bytes of parity data added to get a 255-byte codeword.  (Many external sources
call RS codewords "blocks".  Since dm-verity already uses the term "block" to
mean something else, we'll use the clearer term "RS codeword".)

FEC checks fec_blocks blocks of message data in total, consisting of:

1. The data blocks from the data device
2. The hash blocks from the hash device
3. Optional additional metadata that follows the hash blocks on the hash device

dm-verity assumes that the FEC parity data was computed as if the following
procedure were followed:

1. Concatenate the message data from the above sources.
2. Zero-pad to the next multiple of k blocks.  Let msg be the resulting byte
   array, and msglen its length in bytes.
3. For 0 <= i < msglen / k (for each RS codeword):

   a. Select msg[i + j * msglen / k] for 0 <= j < k.
      Consider these to be the 'k' message bytes of an RS codeword.
   b. Compute the corresponding 'fec_roots' parity bytes of the RS codeword,
      and concatenate them to the FEC parity data.

Step 3a interleaves the RS codewords across the entire device using an
interleaving degree of data_block_size * ceil(fec_blocks / k).  This is the
maximal interleaving, such that the message data consists of a region containing
byte 0 of all the RS codewords, then a region containing byte 1 of all the RS
codewords, and so on up to the region for byte 'k - 1'.  Note that the number of
codewords is set to a multiple of data_block_size; thus, the regions are
block-aligned, and there is an implicit zero padding of up to 'k - 1' blocks.

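The layout arithmetic implied by the procedure above can be sketched as
follows. The function names ``fec_layout`` and ``codeword_message_offsets``
are hypothetical, and the parity size is derived only from the statements
above (one RS codeword per interleaved position, fec_roots parity bytes per
codeword); treat it as an illustration rather than a description of the kernel
implementation.

```python
import math


def fec_layout(fec_blocks, fec_roots, data_block_size):
    """Derive the sizes used by the interleaved RS(255, k) layout."""
    k = 255 - fec_roots
    # Blocks in each of the k regions after zero-padding to a multiple of k.
    region_blocks = math.ceil(fec_blocks / k)
    # Interleaving degree, which is also the number of RS codewords.
    num_codewords = data_block_size * region_blocks
    return {
        "k": k,
        "num_codewords": num_codewords,
        "msglen": num_codewords * k,
        "parity_bytes": num_codewords * fec_roots,
        "parity_blocks": region_blocks * fec_roots,
        "space_overhead": fec_roots / k,
    }


def codeword_message_offsets(i, k, msglen):
    """Offsets into msg of the k message bytes of RS codeword i (step 3a)."""
    stride = msglen // k
    return [i + j * stride for j in range(k)]
```

For example, with fec_roots = 2, 4096-byte blocks and fec_blocks = 1000, the
message is padded to 1012 blocks, there are 16384 codewords, and the message
bytes of codeword 0 sit at offsets 0, 16384, 32768 and so on, one in each
region.
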
This interleaving allows long bursts of errors to be corrected.  It provides
much stronger error correction than storage devices typically provide, while
keeping the space overhead low.

The cost is slow decoding: correcting a single block usually requires reading
254 extra blocks spread evenly across the device(s).  However, that is
acceptable because dm-verity uses FEC only when there is actually an error.

The list below contains additional details about the RS codes used by
dm-verity's FEC.  Userspace programs that generate the parity data need to use
these parameters for the parity data to match exactly:

- Field used is GF(256)
- Bytes are mapped to/from GF(256) elements in the natural way, where bits 0
  through 7 (low-order to high-order) map to the coefficients of x^0 through x^7
- Field generator polynomial is x^8 + x^4 + x^3 + x^2 + 1
- The codes used are systematic, BCH-view codes
- Primitive element alpha is 'x'
- First consecutive root of code generator polynomial is 'x^0'

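To make the parameters listed above concrete, here is a minimal systematic
Reed-Solomon encoder over GF(256) written in Python. It is a sketch for
illustration only: all function names are hypothetical, and it assumes that
the first message byte is the highest-degree coefficient of the codeword
polynomial, which this document does not specify. Parity produced by it should
not be assumed to be byte-for-byte identical to what the kernel expects
without checking against a real implementation.

```python
# Exponent/logarithm tables for GF(256) with the field polynomial
# x^8 + x^4 + x^3 + x^2 + 1 (0x11d) and primitive element alpha = x (0x02).
EXP = [0] * 510
LOG = [0] * 256
_value = 1
for _power in range(255):
    EXP[_power] = _value
    LOG[_value] = _power
    _value <<= 1
    if _value & 0x100:
        _value ^= 0x11D
for _power in range(255, 510):
    EXP[_power] = EXP[_power - 255]


def gf_mul(a, b):
    """Multiply two GF(256) elements."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def generator_poly(nroots):
    """Coefficients, highest degree first, of (x - alpha^0)...(x - alpha^(nroots-1))."""
    g = [1]
    for r in range(nroots):
        root = EXP[r]
        nxt = g + [0]  # g(x) * x
        for idx, coef in enumerate(g):
            nxt[idx + 1] ^= gf_mul(coef, root)  # plus g(x) * alpha^r
        g = nxt
    return g


def rs_parity(message, nroots):
    """Parity bytes: remainder of message(x) * x^nroots divided by g(x)."""
    g = generator_poly(nroots)
    rem = [0] * nroots
    for byte in message:
        feedback = byte ^ rem[0]
        rem = rem[1:] + [0]
        for idx in range(nroots):
            rem[idx] ^= gf_mul(feedback, g[idx + 1])
    return bytes(rem)


def poly_eval(codeword, x):
    """Evaluate the codeword polynomial (highest degree first) at x."""
    acc = 0
    for byte in codeword:
        acc = gf_mul(acc, x) ^ byte
    return acc
```

A codeword built as ``message + rs_parity(message, fec_roots)`` evaluates to
zero at alpha^0 through alpha^(fec_roots - 1), which is the property a decoder
relies on when computing syndromes.
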
On-disk format
==============

The verity kernel code does not read the verity metadata on-disk header.
It only reads the hash blocks which directly follow the header.
It is expected that a user-space tool will verify the integrity of the
verity header.

Alternatively, the header can be omitted and the dmsetup parameters can
be passed via the kernel command-line in a rooted chain of trust where
the command-line is verified.

Directly following the header (and with sector number padded to the next hash
block boundary) are the hash blocks, which are stored one depth (tree level)
at a time, starting from the root, and sorted in order of increasing index
within each depth.

The full specification of kernel parameters and on-disk metadata format
is available at the cryptsetup project's wiki page

  https://gitlab.com/cryptsetup/cryptsetup/wikis/DMVerity

Status
======

1. V (for Valid) is returned if every check performed so far was valid.
   If any check failed, C (for Corruption) is returned.
2. Number of blocks corrected by Forward Error Correction.
   '-' if Forward Error Correction is not enabled.

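A monitoring script might interpret the two status fields as in the sketch
below. It assumes that the input is only the target-specific part of the
status line (the two fields listed above, separated by whitespace);
``parse_verity_status`` is a hypothetical helper name.

```python
def parse_verity_status(fields):
    """Parse the two verity status fields, e.g. "V -" or "C 3"."""
    parts = fields.split()
    if len(parts) < 2 or parts[0] not in ("V", "C"):
        raise ValueError(f"unexpected verity status: {fields!r}")
    # '-' means that Forward Error Correction is not enabled.
    corrected = None if parts[1] == "-" else int(parts[1])
    return {"valid": parts[0] == "V", "fec_corrected_blocks": corrected}
```
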
Example
=======
Set up a device::

  # dmsetup create vroot --readonly --table \
    "0 2097152 verity 1 /dev/sda1 /dev/sda2 4096 4096 262144 1 sha256 "\
    "4392712ba01368efdf14b05c76f9e4df0d53664630b5d48632ed17a137f39076 "\
    "1234000000000000000000000000000000000000000000000000000000000000"

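The numbers in this table line are related: device-mapper table lengths are
given in 512-byte sectors, so 262144 data blocks of 4096 bytes correspond to
2097152 sectors. The sketch below assembles the same table line from its
parts; ``verity_table`` is a hypothetical helper, not an existing tool or API.

```python
def verity_table(dev, hash_dev, data_block_size, hash_block_size,
                 num_data_blocks, hash_start_block, algorithm, digest, salt,
                 version=1):
    """Assemble a verity table line without optional parameters."""
    # Device-mapper expresses the mapped length in 512-byte sectors.
    sectors = num_data_blocks * data_block_size // 512
    fields = [0, sectors, "verity", version, dev, hash_dev,
              data_block_size, hash_block_size,
              num_data_blocks, hash_start_block,
              algorithm, digest, salt]
    return " ".join(str(field) for field in fields)
```

Calling it with the values from the example reproduces the string passed to
``dmsetup create`` with ``--table``.
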
A command line tool veritysetup is available to compute or verify
the hash tree or activate the kernel device. This is available from
the cryptsetup upstream repository https://gitlab.com/cryptsetup/cryptsetup/
(as a libcryptsetup extension).

Create hash on the device::

  # veritysetup format /dev/sda1 /dev/sda2
  ...
  Root hash: 4392712ba01368efdf14b05c76f9e4df0d53664630b5d48632ed17a137f39076

Activate the device::

  # veritysetup create vroot /dev/sda1 /dev/sda2 \
    4392712ba01368efdf14b05c76f9e4df0d53664630b5d48632ed17a137f39076