Document the vault "interchangeable passphrases" artefact as an FAQ
Marco Ricci

Marco Ricci committed on 2025-01-27 19:39:03
Showing 4 changed files with 137 additions and 0 deletions.

... ...
@@ -0,0 +1,131 @@
1
+---
2
+title: What are "interchangeable passphrases" in `vault`, and what does that mean in practice?
3
+---
4
+
5
+# What are "interchangeable passphrases" in `vault`, and what does that mean in practice?
6
+
7
+## What are "interchangeable passphrases"?
8
+
9
+The "vault" derivation scheme internally uses PBKDF2-HMAC-SHA1[^1] to turn
10
+the master passphrase[^2] into a pseudo-random bit sequence, which then
11
+drives the actual passphrase derivation.
12
+In this context, the master passphrase is passed directly as a key to
13
+HMAC-SHA1, and because HMAC-SHA1 internally works with keys of exactly 64 bytes,
14
+the key is first subject to the HMAC key mapping procedure.
15
+Because the mapping of infinitely many arbitrarily sized keys to 64-byte
16
+sized keys cannot be one-to-one, there exist pairs of keys that behave
17
+identically when passed into (PBKDF2-)HMAC-SHA1, i.e., the keys (master
18
+passphrases) are "interchangeable" from the vault scheme's perspective.
19
+
20
+Fundamentally, this is an issue of *encoding*: the master passphrase is
21
+interpreted as an encoding of the HMAC-SHA1 key, and this encoding is not
22
+unique, so the effective space of HMAC-SHA1 keys is reduced through the
23
+presence of "non-canonical" encodings of keys.
24
+
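A minimal Python sketch of what interchangeability looks like in practice, using only the standard library's `hashlib.pbkdf2_hmac`. The salt, the iteration count and the sample passphrases are arbitrary placeholders chosen for illustration and are not the values `vault` actually uses; the helper name `pbkdf2_sha1` is likewise hypothetical.

```python
import hashlib


def pbkdf2_sha1(passphrase: bytes, salt: bytes = b"example-salt", rounds: int = 8) -> bytes:
    """Run PBKDF2-HMAC-SHA1 with placeholder parameters (assumptions, not vault's)."""
    return hashlib.pbkdf2_hmac("sha1", passphrase, salt, rounds)


# Case 1: a passphrase shorter than 64 bytes and the same passphrase with
# trailing NUL bytes map to the same 64-byte HMAC key.
short = b"correct horse"
padded = short + b"\x00" * 3
assert pbkdf2_sha1(short) == pbkdf2_sha1(padded)

# Case 2: a passphrase longer than 64 bytes and its raw SHA1 digest map to
# the same 64-byte HMAC key.
long_passphrase = b"x" * 65
digest = hashlib.sha1(long_passphrase).digest()
assert pbkdf2_sha1(long_passphrase) == pbkdf2_sha1(digest)

# An unrelated passphrase still yields a different result.
assert pbkdf2_sha1(short) != pbkdf2_sha1(b"correct horse battery")
```

Both equalities hold for any salt and iteration count, because PBKDF2 only ever sees the passphrase through HMAC.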
25
+  [^1]: PBKDF2 is a key derivation function, published in [RFC 2898][].
26
+  It uses a pseudo-random function such as HMAC-SHA1 (keyed-hash message
27
+  authentication code, specified in [RFC 2104][] and using SHA1 as the
28
+  underlying hash function) when processing its input.  PBKDF2 passes the
29
+  key on to its pseudo-random function, and otherwise only depends on the
30
+  output of the pseudo-random function, not on the key.
31
+
32
+  [^2]: If you use a master SSH key, it is first converted to an "equivalent
33
+  master passphrase".
34
+
35
+  [RFC 2104]: https://datatracker.ietf.org/doc/html/rfc2104
36
+  [RFC 2898]: https://datatracker.ietf.org/doc/html/rfc2898
37
+
38
+## What is the HMAC key mapping procedure?
39
+
40
+???+ abstract "HMAC key mapping procedure"
41
+
42
+    Let <var>MP</var> denote the master passphrase, and let <var>K</var>
43
+    denote the HMAC key candidate.  Let <var>B</var> denote the block size
44
+    of HMAC-SHA1 in bytes, i.e., `64`.  At the beginning,
45
+    <var>K</var> = <var>MP</var>.
46
+
47
+    1.  If <var>K</var> (= <var>MP</var>) is longer than <var>B</var> bytes, set
48
+        <var>K</var> to `SHA1(K)`.  This updates <var>K</var> for all
49
+        further steps below.
50
+    2.  If <var>K</var> is shorter than <var>B</var> bytes, append as many NUL
51
+        bytes as necessary to extend <var>K</var> to size <var>B</var>.
52
+    3.  Use <var>K</var> as the HMAC key.
53
+
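The procedure above can be sketched in Python as follows. The function name `hmac_sha1_key` is hypothetical, and the code illustrates the key handling described in RFC 2104; it is not taken from `derivepassphrase` itself.

```python
import hashlib
import hmac

BLOCK_SIZE = 64  # B: the block size of HMAC-SHA1 in bytes


def hmac_sha1_key(master_passphrase: bytes) -> bytes:
    """Map an arbitrary-length passphrase to the 64-byte key HMAC-SHA1 uses."""
    key = master_passphrase
    # Step 1: keys longer than the block size are replaced by their SHA1 digest.
    if len(key) > BLOCK_SIZE:
        key = hashlib.sha1(key).digest()
    # Step 2: keys shorter than the block size are right-padded with NUL bytes.
    # Step 3: the result is the effective HMAC key.
    return key.ljust(BLOCK_SIZE, b"\x00")


# The mapped key is interchangeable with the original passphrase: HMAC-SHA1
# produces the same tag for both.
message = b"example message"
for passphrase in (b"short", b"x" * 64, b"y" * 100):
    expected = hmac.new(passphrase, message, "sha1").digest()
    actual = hmac.new(hmac_sha1_key(passphrase), message, "sha1").digest()
    assert expected == actual
```

Applying `hmac_sha1_key` to its own output returns the same 64 bytes, so the mapped key can be viewed as the canonical representative of all passphrases interchangeable with it.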
54
+## What effect does the HMAC key mapping procedure have on key security?
55
+
56
+Counting passphrases of up to 64 bytes, the key space shrinks to about 99.6%
57
+of its original size.  But since it started out astronomically large (slightly
58
+more than 2^512^), it *still* is astronomically large.
59
+
60
+??? example "Mathematical details: key space"
61
+
62
+    | variant                      | key space size | fraction of total size |
63
+    |:-----------------------------|:---------------|:-----------------------|
64
+    | 64-byte keys only            | 256^64^ = 13407807929942597099574024998205846127479365820592393377723561443721764030073546976801874298166903427690031858186486050853753882811946569946433649006084096 | 99.609375% |
65
+    | full key size up to 64 bytes | (256^65^ – 1) / (256 – 1) = 13460387568883548460748825096238025916214579019888834136067575410167731732152266768867764001296969715641757473316629133406121545097483615318772604492382465 | 100% |
66
+
67
+    The key space sizes can be calculated using the following formulas.  Let
68
+    <var>q</var> = `256` denote the alphabet size of binary strings, and let
69
+    <var>n</var> denote the string length.  The total count of all strings
70
+    of size <var>n</var> is <var>q</var>^<var>n</var>^, and the total count
71
+    of all strings up to (and including) size <var>n</var> is
72
+    (<var>q</var><sup><var>n</var> + 1</sup> – 1) / (<var>q</var> – 1) per
73
+    the formula for geometric series.
74
+
75
+    Verification:
76
+
77
+    ~~~~ shell-session
78
+    $ # using GNU bc 1.07.1
79
+    $ 
80
+    $ BC_LINE_LENGTH=0 bc <<'HERE'
81
+    > # The total count of size 64 byte strings.
82
+    > 256^64
83
+    > # The total count of byte strings of size 64 or less.
84
+    > (256^65 - 1) / (256 - 1)
85
+    > # The fraction of the former within the latter.
86
+    > scale = 8
87
+    > (256^64) / ((256^65 - 1) / (256 - 1))
88
+    > HERE
89
+    13407807929942597099574024998205846127479365820592393377723561443721764030073546976801874298166903427690031858186486050853753882811946569946433649006084096
90
+    13460387568883548460748825096238025916214579019888834136067575410167731732152266768867764001296969715641757473316629133406121545097483615318772604492382465
91
+    .99609375
92
+    $ 
93
+    $ BC_LINE_LENGTH=0 bc <<'HERE'
94
+    > # The fraction of keys shorter than 64 bytes, compared with 1/256.
95
+    > scale = 160
96
+    > (256^64 - 1) / (256^65 - 1)
97
+    > 1 / 256
98
+    > HERE
99
+    .0039062499999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999997097
100
+    .0039062500000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
101
+    ~~~~
102
+
103
+In particular, assuming a sufficiently secure master passphrase, this
104
+mapping procedure is still cryptographically secure against attackers
105
+who do not possess the master passphrase if the hash function (here:
106
+SHA1) is secure against preimage attacks:
107
+
108
+  * The attacker can attempt to guess a NUL-extended version of the
109
+    passphrase if it is at most 64 bytes long.  This
110
+    has the same computational cost as guessing the master passphrase
111
+    directly, which is cryptographically secure by assumption.
112
+
113
+  * The attacker can attempt to guess a hashed and NUL-extended version of
114
+    the passphrase if it is longer than 64 bytes.  This amounts to carrying
115
+    out a preimage attack against the SHA1 digest of the master passphrase,
116
+    which is infeasible by assumption.
117
+
118
+## What effect does the HMAC key mapping procedure have on `derivepassphrase`?
119
+
120
+`derivepassphrase vault` does not check for interchangeable passphrases, and
121
+will happily accept any (non-empty) passphrase it is given.
122
+The [`derivepassphrase.vault.Vault`][] class does not check for
123
+interchangeable passphrases either, and will happily accept any passphrase it
124
+is given, even empty ones.
125
+
126
+Most interchangeable variations of a master passphrase contain binary
127
+characters such as NUL, or even arbitrary byte sequences, which may be hard
128
+to type in or impossible to express in certain storage formats.  As such, it
129
+is unlikely (though supported) that the user would want to enter
130
+or store a different, interchangeable version of their master passphrase in
131
+the first place.
... ...
@@ -4,5 +4,8 @@ title: Explanation overview
4 4
 
5 5
 * [How to comply with the "altered versions" clause of the
6 6
   license][FAQ_ALTERED_VERSIONS]
7
+* [What are "interchangeable passphrases" in `vault`, and what does that mean
8
+  in practice?][FAQ_INTERCHANGABLE_PASSPHRASES]
7 9
 
8 10
 [FAQ_ALTERED_VERSIONS]: faq-altered-versions.md
11
+[FAQ_INTERCHANGABLE_PASSPHRASES]: faq-vault-interchangable-passphrases.md
... ...
@@ -117,6 +117,7 @@ nav:
117 117
   - Design & Background:
118 118
     - explanation/index.md
119 119
     - '"altered versions" license requirement': explanation/faq-altered-versions.md
120
+    - '"interchangeable passphrases" in vault': explanation/faq-vault-interchangable-passphrases.md
120 121
   - Changelog:
121 122
     - Changelog: changelog.md
122 123
     - Upgrade notes: upgrade-notes.md
... ...
@@ -136,6 +137,7 @@ markdown_extensions:
136 137
   - smarty
137 138
   - toc:
138 139
       permalink: true
140
+  - pymdownx.caret
139 141
   - pymdownx.details
140 142
   - pymdownx.snippets:
141 143
       base_path:
... ...
@@ -44,6 +44,7 @@ nav:
44 44
   - Design & Background:
45 45
     - explanation/index.md
46 46
     - '"altered versions" license requirement': explanation/faq-altered-versions.md
47
+    - '"interchangeable passphrases" in vault': explanation/faq-vault-interchangable-passphrases.md
47 48
   - Changelog:
48 49
     - Changelog: changelog.md
49 50
     - Upgrade notes: upgrade-notes.md
50 51