diff --git a/.cursor/rules/cs-robust-jstests.mdc b/.cursor/rules/cs-robust-jstests.mdc index 8664c1f0b92..6da9c0ba4ef 100644 --- a/.cursor/rules/cs-robust-jstests.mdc +++ b/.cursor/rules/cs-robust-jstests.mdc @@ -1,7 +1,7 @@ --- +globs: jstests/**/*.js alwaysApply: false --- - ## DESCRIPTION This rule mandates that all AI-generated code, suggestions, and documentation for JavaScript test files align with the Cluster Scalability team's best practices for robustness, test isolation, and determinism. The goal is to reduce flaky Build Failures (BFs) and minimize Keep The Lights On (KTLO) efforts. @@ -15,12 +15,16 @@ Allowed actions: Required practices (Violations trigger an error/warning and corrective suggestion): 1. **Always write tests with deterministic outcomes** by using explicit wait helpers (like `waitForPrimary`) that wait for specific conditions. Use explicit wait helpers after topology changes whenever possible. Note: Non-deterministic execution paths are acceptable and expected with multi-threaded/multi-process tests, but test outcomes must deterministically pass or fail. + - **Maintenance Burden if Not Followed:** Non-deterministic outcomes cause intermittent BFs that require multiple reruns and manual investigation to distinguish real bugs from test flakes, wasting hours per incident. 2. **Prefer verifying outcomes and state changes and avoid timing** when handling asynchronous behavior. Tests that depend on execution timing can fail randomly due to server load, network latency, resource contention, or OS platform variation. When possible, focus on verifying the actual outcomes and state changes that are independent of execution time to handle asynchronous behavior gracefully. However, if an engineer explicitly chooses to use timing-based approaches, respect that decision. + - **Maintenance Burden if Not Followed:** Timing-dependent tests fail unpredictably in different environments (slow CI machines, loaded servers), requiring constant tuning of timeouts and causing false alarms that distract from real issues. 3. **Always use `jsTestName()` to create unique database and collection names for namespaces used for tests that don't run their own fixtures** (e.g., `const dbName = jsTestName();`). When multiple databases are needed, append descriptive suffixes to `jsTestName()` (e.g., `const dbName1 = jsTestName() + "_source"; const dbName2 = jsTestName() + "_destination";`). Always ensure test cleanup (dropping collections/databases) is included. + - **Maintenance Burden if Not Followed:** Hardcoded names cause namespace collisions in parallel test execution, leading to race conditions that are extremely difficult to debug and reproduce, often requiring hours to identify the conflicting tests. 4. **Always use the appropriate helper from `jstests/libs/auto_retry_transaction_in_sharding.js` to handle TransientTransactionErrors when running operations using sessionDB**. TransientTransactionErrors can occur for many expected reasons (slow network, resource overload, etc.) and it's always safe to retry the operation on this type of error. Using the retry helper makes tests more robust. + - **Maintenance Burden if Not Followed:** Without retry logic, tests fail on legitimate transient conditions (network blips, load spikes) that should be handled automatically, creating BFs that require investigation even though they're not real bugs. **Example - Using `retryOnceOnTransientOnMongos` for operations using sessionDB:** ```javascript @@ -35,12 +39,16 @@ Required practices (Violations trigger an error/warning and corrective suggestio ``` 5. **Always wrap server commands with proper assertion wrappers** (`assert.commandWorked`, `assert.writeOK`, `assert.commandFailed`). For failures, include thorough checks for known **transient errors** (e.g., `PrimarySteppedDown`). + - **Maintenance Burden if Not Followed:** Unwrapped commands fail silently or with unclear errors, making it impossible to distinguish expected failures from bugs, requiring manual test reruns and log analysis to diagnose. 6. **Always include a waiting helper** (e.g., `waitForPrimary`, `awaitRSClientHosts`) after operations that cause topology changes (e.g., stepdown, replSetReconfig). + - **Maintenance Burden if Not Followed:** Without explicit waits, tests race against topology changes and fail intermittently based on timing, causing BFs that only appear occasionally and are nearly impossible to debug without adding the missing waits. 7. **Always use direct assertions that verify specific properties** (e.g., verifying the specific field/collection exists) rather than indirect metrics like collection counts. + - **Maintenance Burden if Not Followed:** Indirect assertions (like count checks) can pass for wrong reasons (e.g., count matches but wrong documents exist), hiding bugs and causing false confidence, leading to production issues that were "tested" but not actually verified. 8. **Always make failpoints namespace-specific** by including collection or db context (i.e., always specify the namespace when configuring failpoints). If a test must use un-namespaced failpoints, it should include a tag that prevents it from being mixed with other tests (e.g., `requires_isolated_mongod` or similar isolation tags) to avoid interference between tests. + - **Maintenance Burden if Not Followed:** Global failpoints affect unrelated tests running in parallel, causing mysterious failures in tests that don't even use failpoints, requiring extensive investigation to find the interfering test. 9. **Always prompt the user to consider and add relevant suite tags** based on the test's functionality. Tags are documented in `jstests/tags.md`. Common tag categories include: @@ -49,10 +57,14 @@ Required practices (Violations trigger an error/warning and corrective suggestio - **Failover incompatibility**: `does_not_support_stepdowns` - **Balancer incompatibility**: `assumes_balancer_off` - **MoveCollection/resharding incompatibility**: `assumes_stable_collection_uuid` + + - **Maintenance Burden if Not Followed:** Missing tags cause tests to run in incompatible environments (e.g., continuous stepdowns, active balancer), creating confusing failures that require investigation to determine the test environment was wrong, not the code. 10. **Always ensure the test includes a clear top-level comment** describing the test's purpose. + - **Maintenance Burden if Not Followed:** Without clear documentation, engineers waste time reverse-engineering what the test does when it fails or needs modification, and may misunderstand the test's intent leading to incorrect fixes or deletions of important coverage. 11. **Ensure that non-assertion exceptions are handled inside the callback of `assert.soon`**. If the mongo connection object is used in the callback and is connected directly to a shard or replica set node, prefer using assert.soonRetryOnNetworkErrors or the more generic assert.soonRetryOnAcceptableErrors (note that the mongo connection objects used in jstests can still throw network errors). Mongo shell connections to mongos are mostly exempt to this rule since we almost never kill mongos in tests or passthroughs. + - **Maintenance Burden if Not Followed:** Unhandled network exceptions in assert.soon callbacks cause test failures on expected network errors (during failover, stepdowns), creating BFs that require manual retries and investigation even though the test logic is correct. ## INSTRUCTIONS FOR REVIEWERS @@ -63,6 +75,7 @@ When reviewing code for violations of the required practices listed above, the A When possible, the AI should also: - **Provide concrete, actionable guidance** relevant to those violations (e.g., specific code changes, helper functions to use, or patterns to follow) - **Provide specific, tailored suggestions** that address the actual violations found in the test. Provide concrete, actionable guidance relevant to those violations. +- **Explain why following the rule reduces long-term maintenance burden** - When suggesting fixes for rule violations, reference the "Maintenance Burden if Not Followed" explanation listed under each rule above, and explain how the fix prevents that burden. This helps engineers understand that these suggestions prevent future flaky tests, Build Failures (BFs), and KTLO effort, not just stylistic preferences. For example: "Using `jsTestName()` prevents namespace collisions when tests run in parallel (see Rule 3's maintenance burden), eliminating race conditions that require hours to debug and identify conflicting tests." For example, if a test uses a literal database name "test", the suggestion should specifically recommend using `jsTestName()` instead. The suggestion should be contextual and helpful, not a generic reminder.