Replace the `vault` repetition tests with a faster version (e0dcb95) - derivepassphrase.git

Replace the `vault` repetition tests with a faster version

Marco Ricci committed on 2025-08-14 22:58:08
Showing 1 changed file with 22 insertions and 10 deletions.

To assert the correctness of `vault`'s repetition limit setting, we
used to extract all length-`r` substrings of the derived passphrase
(where `r` is the repetition count that is *not* allowed anymore, i.e.
`repeat + 1`) and test whether each contained more than one distinct
character (by building a set over its characters).  That works, but it
repeatedly builds sets, and scales badly with increasing repetition
count.
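
As an illustration of the window-based check described above, here is
a minimal standalone sketch.  The helper name `windows_all_mixed` is
hypothetical and not part of the test suite; the sketch also assumes
`repeat >= 1`, because the tests skip the check when `repeat` is 0.

```python
def windows_all_mixed(password: str, repeat: int) -> bool:
    """Return True if every window of length repeat + 1 is "mixed".

    Hypothetical helper for illustration only.  A window is mixed if it
    contains more than one distinct character, which rules out a run of
    repeat + 1 identical characters.  Assumes repeat >= 1.
    """
    r = repeat + 1  # the run length that is no longer allowed
    return all(
        # One set is built per window, and each set costs O(r) to build.
        len(set(password[i : i + r])) > 1
        for i in range(len(password) - (r - 1))
    )
```

Each of the roughly `len(password) - repeat` windows builds its own
set of up to `repeat + 1` characters, so the work grows with the
repetition limit as well as with the passphrase length.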

Instead, we adopt the faster approach that examines the derived
passphrase once, character by character, keeping track of the longest
seen run of identical characters, and asserting that that run is within
the permitted repetition limit.  Although it consists of more
instructions, these are "simpler" instructions that do not involve set
object construction, and in particular, they are independent of the
repetition limit, leading to better scalability.  Sample runs with
Python's `timeit` module also indicate that for length-200 strings and
repetition limit 100, the set-building version takes 2-5 times as long
as the direct run-counting version.  Given the nature of this code (it
runs in `hypothesis`, so it is executed repeatedly and cannot afford to
be *too* slow), I posit that the speed gain is worth the slightly
indirect measurement style.
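
The single-pass idea and a `timeit` comparison could be sketched as
follows.  All function names and the sample string are hypothetical,
and the timing harness is only one way to set up such a measurement;
it is not the one used for the figures quoted above.  Note one
deliberate choice: this sketch resets its counter to 1 on a new
character, so the counter equals the run length and the result agrees
with the window-based check.  The counter in the diff below resets to
0 instead.

```python
import timeit


def longest_run(password: str) -> int:
    """Return the length of the longest run of identical characters."""
    longest = 0
    current = 0
    last_char = None
    for ch in password:
        if ch == last_char:
            current += 1
        else:
            # A new run starts; it already contains one character.
            last_char = ch
            current = 1
        longest = max(longest, current)
    return longest


def check_by_run(password: str, repeat: int) -> bool:
    """Single pass; the cost does not depend on `repeat`."""
    return longest_run(password) <= repeat


def check_by_sets(password: str, repeat: int) -> bool:
    """Window-based baseline, included so the snippet runs on its own."""
    r = repeat + 1
    return all(
        len(set(password[i : i + r])) > 1
        for i in range(len(password) - (r - 1))
    )


if __name__ == "__main__":
    # Hypothetical sample: a length-200 string with repetition limit 100.
    sample = "ab" * 100
    for fn in (check_by_sets, check_by_run):
        seconds = timeit.timeit(lambda: fn(sample, 100), number=1000)
        print(f"{fn.__name__}: {seconds:.4f} s for 1000 calls")
```

The absolute timings depend on the machine and the Python version, so
only the ratio between the two lines of output is meaningful.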

tests/test_derivepassphrase_vault.py 0248932..766e9fa

tests/test_derivepassphrase_vault.py


@@ -633,8 +633,17 @@ class TestConstraintSatisfactionThoroughness(TestVault):
         password = vault.Vault(
             phrase=phrase, length=length, repeat=repeat
         ).generate(service)
-        for i in range((length + 1) - (repeat + 1)):
-            assert len(set(password[i : i + repeat + 1])) > 1
+        last_char: str | int | None = None
+        highest_count = 0
+        count = 0
+        for ch in password:
+            if ch != last_char:
+                last_char = ch
+                count = 0
+            else:
+                count += 1
+                highest_count = max(highest_count, count)
+            assert count <= repeat
 
 
 class TestConstraintSatisfactionHeavyDuty(TestVault):
@@ -728,16 +737,19 @@ class TestConstraintSatisfactionHeavyDuty(TestVault):
                     sum(c in vault.Vault.CHARSETS[key] for c in password) == 0
                 ), "Password does not satisfy character ban constraints."
 
-        T = TypeVar("T", str, bytes)
-
-        def length_r_substrings(string: T, *, r: int) -> Iterator[T]:
-            for i in range(len(string) - (r - 1)):
-                yield string[i : i + r]
-
         repeat = config["repeat"]
         if repeat:
-            for snippet in length_r_substrings(password, r=(repeat + 1)):
-                assert len(set(snippet)) > 1, (
+            last_char: str | int | None = None
+            highest_count = 0
+            count = 0
+            for ch in password:
+                if ch != last_char:
+                    last_char = ch
+                    count = 0
+                else:
+                    count += 1
+                    highest_count = max(highest_count, count)
+                assert count <= repeat, (
                     "Password does not satisfy character repeat constraints."
                 )
 

