Use "loadgroup" scheduling in the test suite runner
Marco Ricci

Marco Ricci commited on 2026-01-17 19:00:49
Zeige 3 geänderte Dateien mit 47 Einfügungen und 2 Löschungen.


Parallelize the test suite via the "loadgroup" scheduler, instead of the
"worksteal" scheduler.  There is currently only one `xtest_group` marker
value, so effectively, the scheduler schedules the *marked* tests all to
the same worker, and the others in whatever manner.  We can thus rely on
the marked tests executing serially, and do not need locks to protect
them (or their fixture calls) from concurrent access.

This eliminates "locking implementations" as both a source of errors and
as another group of code that needs debugging, testing, and coverage.
(Which was, unfortunately, our experience with the `filelock` package we
used to protect non-isolated SSH agents on The Annoying OS during
fixture setup and teardown.)

As a bonus, because the "loadgroup" scheduler lazily assigns work items
as other items are completed, the performance is similar to the
"worksteal" scheduler it is replacing.
... ...
@@ -412,7 +412,7 @@ sqlite_cache = true
412 412
 enable_error_code = ['ignore-without-code']
413 413
 
414 414
 [tool.pytest.ini_options]
415
-addopts = '--doctest-modules --dist=worksteal --import-mode=importlib'
415
+addopts = '--doctest-modules --dist=loadgroup --import-mode=importlib'
416 416
 pythonpath = ['src']
417 417
 testpaths = ['src', 'tests']
418 418
 xfail_strict = true
... ...
@@ -26,6 +26,7 @@ from derivepassphrase import _types, ssh_agent
26 26
 from derivepassphrase.ssh_agent import socketprovider
27 27
 from tests import data, machinery
28 28
 from tests.data import callables
29
+from tests.machinery import pytest as pytest_machinery
29 30
 
30 31
 if TYPE_CHECKING:
31 32
     from collections.abc import Iterator, Sequence
... ...
@@ -509,6 +510,12 @@ for key, handler in spawn_handlers.items():
509 510
             "environment variable.",
510 511
         ),
511 512
     ]
513
+    # Mark non-isolated agents as, well, using non-isolated agents.
514
+    # Assume by default that an agent is potentially non-isolated unless
515
+    # we can be absolutely sure it is isolated, by design.  (The "stub"
516
+    # agents fall into the latter category.)
517
+    if handler.agent_type != data.KnownSSHAgent.StubbedSSHAgent:
518
+        marks.append(pytest_machinery.non_isolated_agent_use)
512 519
     # Some agents require UNIX domain socket support.
513 520
     if key in {"unix-pageant", "openssh"}:
514 521
         marks.append(
... ...
@@ -719,8 +726,21 @@ def spawn_named_agent(  # noqa: C901
719 726
     )
720 727
 
721 728
 
722
-@pytest.fixture
729
+# Hack: We cannot associate markers with fixtures (the pytest devs say
730
+# there are many unresolved questions when it comes to fixture and
731
+# marker overriding in such a case), so we fake this by trivially
732
+# parametrizing the fixture.  (The parametrizations *can* contain the
733
+# necessary marks.)
734
+@pytest.fixture(
735
+    params=[
736
+        pytest.param(
737
+            "non_isolated_agent_use",
738
+            marks=pytest_machinery.non_isolated_agent_use,
739
+        )
740
+    ]
741
+)
723 742
 def running_ssh_agent(  # pragma: no cover [external]
743
+    request: pytest.FixtureRequest,
724 744
 ) -> Iterator[data.RunningSSHAgentInfo]:
725 745
     """Ensure a running SSH agent, if possible, as a pytest fixture.
726 746
 
... ...
@@ -742,6 +762,7 @@ def running_ssh_agent(  # pragma: no cover [external]
742 762
             If no agent is running or can be spawned, skip this test.
743 763
 
744 764
     """
765
+    del request
745 766
 
746 767
     def prepare_environment(
747 768
         agent_type: data.KnownSSHAgent,
... ...
@@ -151,6 +151,30 @@ All current heavy duty tests are integration tests.
151 151
 """
152 152
 
153 153
 
154
+non_isolated_agent_use = pytest.mark.xdist_group("non_isolated_agent_use")
155
+"""
156
+A cached `pytest` mark, for use with xdist's "loadgroup" scheduling,
157
+that marks a test as part of the group of tests that use a potentially
158
+non-isolated agent, and thus need to be run serially, in the same worker
159
+(because of the setup and teardown).
160
+
161
+This is a pessimistic approach: all agent use is marked with this mark
162
+*unless* we know for sure that this constellation is guaranteed safe.
163
+To date, this is only the case for the fake agents from this very test
164
+suite.
165
+
166
+The use of this mark is, if not outright a hack, at least an abuse of
167
+xdist features.  The "correct" way to handle this situation is to employ
168
+a lock, but we have had bad experiences with trying to do this in
169
+a portable way without adding oodles of extra code (that also needs to be
170
+tested), or effectively forcing *everything* into serial execution.  It
171
+also only works because we have precisely one group, and because we have
172
+no work stealing (at least, not with the only scheduler that honors
173
+`xdist_group`).  In particular, it bars us from adding more groups or
174
+grouping along different criteria.
175
+"""
176
+
177
+
154 178
 # Parameter sets
155 179
 # ==============
156 180
 
157 181