SERVER-124136 Format markdown via prettier: wrap lines and use width of 100 (#52231)
GitOrigin-RevId: 3305c1e2ee3a6a2c3a5b2b7883b0f491a59ed646
parent
e66373f938
commit
32e8f260de
18
.github/PULL_REQUEST_TEMPLATE/README.md
vendored
@ -2,18 +2,24 @@
|
||||
|
||||
This folder is for custom pull request templates. Templates are Markdown (\*.md) files.
|
||||
|
||||
These custom templates can be used, for example, by individual teams to have a custom pull request template with team-specific testing or documentation instructions.
|
||||
These custom templates can be used, for example, by individual teams to have a custom pull request
|
||||
template with team-specific testing or documentation instructions.
|
||||
|
||||
Read more in [GitHub's docs](https://docs.github.com/en/communities/using-templates-to-encourage-useful-issues-and-pull-requests/creating-a-pull-request-template-for-your-repository)
|
||||
Read more in
|
||||
[GitHub's docs](https://docs.github.com/en/communities/using-templates-to-encourage-useful-issues-and-pull-requests/creating-a-pull-request-template-for-your-repository)
|
||||
|
||||
If you update the default PR template, you also need to update the commit metadata in GitHub branch rulesets.
|
||||
If you update the default PR template, you also need to update the commit metadata in GitHub branch
|
||||
rulesets.
|
||||
|
||||
# How To Use This Folder
|
||||
|
||||
To create a custom template, create a new markdown file in this folder.
|
||||
|
||||
Then create a link of the form `https://github.com/mongodb/mongo/compare/main...my-branch?quick_pull=1&template=your_new_template.md`
|
||||
Then create a link of the form
|
||||
`https://github.com/mongodb/mongo/compare/main...my-branch?quick_pull=1&template=your_new_template.md`
|
||||
|
||||
Share that link in your team docs to use for creating PRs. By selecting an unused value for `my-branch` it should show a branch selector when following the link.
|
||||
Share that link in your team docs to use for creating PRs. By selecting an unused value for
|
||||
`my-branch` it should show a branch selector when following the link.
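As a rough illustration, a link of the form described above could be assembled programmatically. This is only a sketch and is not part of the repository: the helper name `pr_template_link` and its parameters are hypothetical, and only the URL shape is taken from the text.

```python
from urllib.parse import urlencode


def pr_template_link(branch, template, base="main", repo="mongodb/mongo"):
    """Build a compare link that opens the PR form with a custom template.

    Hypothetical helper: only the URL shape comes from the README text.
    """
    # quick_pull=1 opens the PR form directly; template names the .md file.
    query = urlencode({"quick_pull": 1, "template": template})
    return f"https://github.com/{repo}/compare/{base}...{branch}?{query}"


print(pr_template_link("my-branch", "your_new_template.md"))
```

Passing a branch name that does not exist corresponds to the "unused value" case mentioned above.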
|
||||
|
||||
Read more in [GitHub's docs](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/using-query-parameters-to-create-a-pull-request)
|
||||
Read more in
|
||||
[GitHub's docs](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/using-query-parameters-to-create-a-pull-request)
|
||||
|
||||
3
.github/pull_request_template.md
vendored
@ -1 +1,2 @@
|
||||
Anything in this description will be included in the commit message. Replace or delete this text before merging. Add links to testing in the comments of the PR.
|
||||
Anything in this description will be included in the commit message. Replace or delete this text
|
||||
before merging. Add links to testing in the comments of the PR.
|
||||
|
||||
@ -15,6 +15,13 @@
|
||||
"parser": "yaml",
|
||||
"tabWidth": 4
|
||||
}
|
||||
},
|
||||
{
|
||||
"files": "*.md",
|
||||
"options": {
|
||||
"proseWrap": "always",
|
||||
"printWidth": 100
|
||||
}
|
||||
}
|
||||
]
|
||||
}
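To get a feel for what the `proseWrap: "always"` and `printWidth: 100` options above do to a paragraph, here is a small approximation in Python. It uses the standard-library `textwrap` module, not prettier's own algorithm, so edge cases (lists, links, code spans) may wrap differently; `wrap_prose` is a hypothetical name.

```python
import textwrap


def wrap_prose(paragraph, width=100):
    """Greedily re-wrap one paragraph so lines stay within `width` columns."""
    # Collapse existing line breaks first, then wrap again from scratch.
    # Long words (such as URLs) and hyphenated words are never split.
    return textwrap.fill(
        " ".join(paragraph.split()),
        width=width,
        break_long_words=False,
        break_on_hyphens=False,
    )


text = (
    "Bazel doesn't release to the PPC64LE architecture. To address this, MongoDB maintains "
    "our own Bazel build that we perform on our PPC64LE development systems."
)
print(wrap_prose(text))
```

Because long words are left intact, a line that consists of a single long URL can still exceed the width, which matches what the wrapped link lines in this commit look like.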
|
||||
|
||||
15
README.md
@ -49,8 +49,7 @@ You can install compass using the `install_compass` script packaged with MongoDB
|
||||
$ ./install_compass
|
||||
```
|
||||
|
||||
This will download the appropriate MongoDB Compass package for your platform
|
||||
and install it.
|
||||
This will download the appropriate MongoDB Compass package for your platform and install it.
|
||||
|
||||
## Drivers
|
||||
|
||||
@ -88,9 +87,9 @@ https://www.mongodb.com/cloud/atlas
|
||||
|
||||
## LICENSE
|
||||
|
||||
MongoDB is free and the source is available. Versions released prior to
|
||||
October 16, 2018 are published under the AGPL. All versions released after
|
||||
October 16, 2018, including patch fixes for prior versions, are published
|
||||
under the [Server Side Public License (SSPL) v1](LICENSE-Community.txt).
|
||||
See individual files for details which will specify the license applicable
|
||||
to each file. Files subject to the SSPL will be noted in their headers.
|
||||
MongoDB is free and the source is available. Versions released prior to October 16, 2018 are
|
||||
published under the AGPL. All versions released after October 16, 2018, including patch fixes for
|
||||
prior versions, are published under the
|
||||
[Server Side Public License (SSPL) v1](LICENSE-Community.txt). See individual files for details
|
||||
which will specify the license applicable to each file. Files subject to the SSPL will be noted in
|
||||
their headers.
|
||||
|
||||
@ -1,10 +1,13 @@
|
||||
# Building Bazel from Source to target the PPC64LE Architecture
|
||||
|
||||
Bazel doesn't release to the PPC64LE architecture. To address this, MongoDB maintains our own Bazel build that we perform on our PPC64LE development systems.
|
||||
Bazel doesn't release to the PPC64LE architecture. To address this, MongoDB maintains our own Bazel
|
||||
build that we perform on our PPC64LE development systems.
|
||||
|
||||
# JDK?
|
||||
|
||||
Bazel usually comes with a built-in JDK. However, the tooling used to build the built-in JDK doesn't support PPC64LE. To get around this, an external JDK must be present on both the system compiling the Bazel executable itself as well as the host running Bazel as a build system.
|
||||
Bazel usually comes with a built-in JDK. However, the tooling used to build the built-in JDK doesn't
|
||||
support PPC64LE. To get around this, an external JDK must be present on both the system compiling
|
||||
the Bazel executable itself as well as the host running Bazel as a build system.
|
||||
|
||||
On the MongoDB PPC64LE Evergreen static hosts and dev hosts, the OpenJDK 21 installation exists at:
|
||||
|
||||
|
||||
@ -1,10 +1,13 @@
|
||||
# Building Bazel from Source to target the S390X Architecture
|
||||
|
||||
Bazel doesn't release to the S390X architecture. To address this, MongoDB maintains our own Bazel build that we perform on our S390X development systems.
|
||||
Bazel doesn't release to the S390X architecture. To address this, MongoDB maintains our own Bazel
|
||||
build that we perform on our S390X development systems.
|
||||
|
||||
# JDK?
|
||||
|
||||
Bazel usually comes with a built-in JDK. However, the tooling used to build the built-in JDK doesn't support S390X. To get around this, an external JDK must be present on both the system compiling the Bazel executable itself as well as the host running Bazel as a build system.
|
||||
Bazel usually comes with a built-in JDK. However, the tooling used to build the built-in JDK doesn't
|
||||
support S390X. To get around this, an external JDK must be present on both the system compiling the
|
||||
Bazel executable itself as well as the host running Bazel as a build system.
|
||||
|
||||
On the MongoDB S390X Evergreen static hosts and dev hosts, the OpenJDK 21 installation exists at:
|
||||
|
||||
|
||||
@ -1,3 +1,4 @@
|
||||
# MongoDB Bazel Best Practices
|
||||
|
||||
Please refer to https://bazel.build/configure/best-practices as a baseline. This doc will be updated with MongoDB-specific best practices as they're defined.
|
||||
Please refer to https://bazel.build/configure/best-practices as a baseline. This doc will be updated
|
||||
with MongoDB-specific best practices as they're defined.
|
||||
|
||||
@ -4,7 +4,8 @@ This document describes the Server Developer workflow for modifying Bazel build
|
||||
|
||||
# Creating a new BUILD.bazel file
|
||||
|
||||
A build target is defined in the directory where its source code exists. To create a target that compiles **src/mongo/hello_world.cpp**, you would create **src/mongo/BUILD.bazel**.
|
||||
A build target is defined in the directory where its source code exists. To create a target that
|
||||
compiles **src/mongo/hello_world.cpp**, you would create **src/mongo/BUILD.bazel**.
|
||||
|
||||
src/mongo/BUILD.bazel would contain:
|
||||
|
||||
@ -15,7 +16,8 @@ src/mongo/BUILD.bazel would contain:
|
||||
],
|
||||
}
|
||||
|
||||
Once you've obtained bazel by running **python buildscripts/install_bazel.py**, you can then build this target via "bazel build":
|
||||
Once you've obtained bazel by running **python buildscripts/install_bazel.py**, you can then build
|
||||
this target via "bazel build":
|
||||
|
||||
bazel build //src/mongo:hello_world
|
||||
|
||||
@ -23,13 +25,17 @@ Or run this target via "bazel run":
|
||||
|
||||
bazel run //src/mongo:hello_world
|
||||
|
||||
The full target name is a combination of the directory of the BUILD.bazel file and the target name:
|
||||
The full target name is a combination of the directory of the BUILD.bazel file and the target
|
||||
name:
|
||||
|
||||
//{BUILD.bazel dir}:{targetname}
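The naming rule above can be sketched in a few lines of Python. This is an illustration only; `target_label` is a hypothetical helper, and it assumes the BUILD.bazel path is given relative to the workspace root with forward slashes.

```python
import posixpath


def target_label(build_file, target_name):
    """Return //{BUILD.bazel dir}:{targetname} for a workspace-relative path."""
    # The package is simply the directory that contains the BUILD.bazel file.
    package = posixpath.dirname(build_file)
    return f"//{package}:{target_name}"


print(target_label("src/mongo/BUILD.bazel", "hello_world"))  # //src/mongo:hello_world
```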
|
||||
|
||||
# Adding a New Header / Source File
|
||||
|
||||
Bazel makes use of static analysis wherever possible to improve execution and querying speed. As part of this, source and header files must not be declared dynamically (ex. glob, wildcard, etc). Instead, you'll need to manually add a reference to each header or source file you add into your build target.
|
||||
Bazel makes use of static analysis wherever possible to improve execution and querying speed. As
|
||||
part of this, source and header files must not be declared dynamically (ex. glob, wildcard, etc).
|
||||
Instead, you'll need to manually add a reference to each header or source file you add into your
|
||||
build target.
|
||||
|
||||
mongo_cc_binary(
|
||||
name = "hello_world",
|
||||
@ -44,13 +50,15 @@ Bazel makes use of static analysis wherever possible to improve execution and qu
|
||||
|
||||
## Adding a New Library
|
||||
|
||||
The DevProd Build Team created MongoDB-specific macros for the different types of build targets you may want to specify. These include:
|
||||
The DevProd Build Team created MongoDB-specific macros for the different types of build targets you
|
||||
may want to specify. These include:
|
||||
|
||||
- mongo_cc_binary
|
||||
- mongo_cc_library
|
||||
- idl_generator
|
||||
|
||||
Creating a new library is similar to the steps above for creating a new binary. A new **mongo_cc_library** definition would be created in the BUILD.bazel file.
|
||||
Creating a new library is similar to the steps above for creating a new binary. A new
|
||||
**mongo_cc_library** definition would be created in the BUILD.bazel file.
|
||||
|
||||
mongo_cc_library(
|
||||
name = "new_library",
|
||||
@ -61,7 +69,9 @@ Creating a new library is similar to the steps above for creating a new binary.
|
||||
|
||||
## Declaring Dependencies
|
||||
|
||||
If a library or binary depends on another library, this must be declared in the **deps** section of the target. The syntax for referring to the library is the same syntax used in the bazel build/run command.
|
||||
If a library or binary depends on another library, this must be declared in the **deps** section of
|
||||
the target. The syntax for referring to the library is the same syntax used in the bazel build/run
|
||||
command.
|
||||
|
||||
mongo_cc_library(
|
||||
name = "new_library",
|
||||
@ -82,16 +92,20 @@ If a library or binary depends on another library, this must be declared in the
|
||||
|
||||
## Running clang-tidy via Bazel
|
||||
|
||||
Note: This feature is still in development; see https://jira.mongodb.org/browse/SERVER-80396 for details.
|
||||
Note: This feature is still in development; see https://jira.mongodb.org/browse/SERVER-80396 for
|
||||
details.
|
||||
|
||||
To run clang-tidy via Bazel, do the following:
|
||||
|
||||
1. To analyze all code, run `bazel build --config=clang-tidy src/...`
|
||||
2. To analyze a single target (e.g.: `environment_buffer`), run the following command (note that `_with_debug` suffix on the target): `bazel build --config=clang-tidy src/mongo/db/commands:environment_buffer_with_debug`
|
||||
2. To analyze a single target (e.g.: `environment_buffer`), run the following command (note that
|
||||
`_with_debug` suffix on the target):
|
||||
`bazel build --config=clang-tidy src/mongo/db/commands:environment_buffer_with_debug`
|
||||
|
||||
Testing notes:
|
||||
|
||||
- If you want to test whether clang-tidy is in fact finding bugs, you can inject the following code into a `cpp` file to generate a `bugprone-incorrect-roundings` warning:
|
||||
- If you want to test whether clang-tidy is in fact finding bugs, you can inject the following code
|
||||
into a `cpp` file to generate a `bugprone-incorrect-roundings` warning:
|
||||
|
||||
```
|
||||
const double f = 1.0;
|
||||
@ -105,12 +119,24 @@ const int foo = (int)(f + 0.5);
|
||||
Follow this loop to figure out where the header needs to be added
|
||||
|
||||
1. Build directly with bazel to speed up the loop: `bazel build //src/...`
|
||||
2. This will fail on the first missing header dependency. Search the bazel build files for the library the header is defined on. Currently there are cases where headers are incorrectly located so you'll need to use your best judgement. If the header exists on some library, add that library as a dep, for example `scoped_timer.h` is part of the `scoped_timer` library, so add `//src/mongo/db/exec:scoped_timer` to deps field (this will take care of `scoped_timer.h` transitive dependencies). If not, add the header directly to the hdrs field of the library that's failing to compile.
|
||||
2. This will fail on the first missing header dependency. Search the bazel build files for the
|
||||
library the header is defined on. Currently there are cases where headers are incorrectly located
|
||||
so you'll need to use your best judgement. If the header exists on some library, add that library
|
||||
as a dep, for example `scoped_timer.h` is part of the `scoped_timer` library, so add
|
||||
`//src/mongo/db/exec:scoped_timer` to deps field (this will take care of `scoped_timer.h`
|
||||
transitive dependencies). If not, add the header directly to the hdrs field of the library that's
|
||||
failing to compile.
|
||||
3. Build directly with bazel `bazel build //src/...`
|
||||
4. If there is a cycle, remove the dependency from Step #2, add the header as a direct dependency to the hdrs field, and then start back at Step #1
|
||||
4. If there is a cycle, remove the dependency from Step #2, add the header as a direct dependency to
|
||||
the hdrs field, and then start back at Step #1
|
||||
|
||||
### The header I want to add is referenced in dozens or more locations, and adding it to the proper location requires a large refactor that is blocking critical work, what should I do?
|
||||
|
||||
If you've put in a significant amount of work to try to get a header added and have found that getting it added to the right place (usually alongside the associated .cpp file, having all dependents add that library as a dep) will take a significant refactor, create a SERVER ticket explaining the problem, solution, and complexity required to resolve it. Then, open up src/mongo/BUILD.bazel and add the header to "core_headers" file group referencing your ticket in a TODO comment.
|
||||
If you've put in a significant amount of work to try to get a header added and have found that getting it
|
||||
added to the right place (usually alongside the associated .cpp file, having all dependents add that
|
||||
library as a dep) will take a significant refactor, create a SERVER ticket explaining the problem,
|
||||
solution, and complexity required to resolve it. Then, open up src/mongo/BUILD.bazel and add the
|
||||
header to "core_headers" file group referencing your ticket in a TODO comment.
|
||||
|
||||
This is very much a last resort and should only be done if the refactor will take a very significant amount of time and is blocking other work.
|
||||
This is very much a last resort and should only be done if the refactor will take a very significant
|
||||
amount of time and is blocking other work.
|
||||
|
||||
@ -1,7 +1,9 @@
|
||||
# EngFlow Certification Installation
|
||||
|
||||
MongoDB uses EngFlow to enable remote execution with Bazel. This dramatically speeds up the build process, but is only available to internal MongoDB employees.
|
||||
MongoDB uses EngFlow to enable remote execution with Bazel. This dramatically speeds up the build
|
||||
process, but is only available to internal MongoDB employees.
|
||||
|
||||
Bazel uses a wrapper script to check the credentials on each invocation. If for some reason that's not working, you can also perform this process manually with this command:
|
||||
Bazel uses a wrapper script to check the credentials on each invocation. If for some reason that's
|
||||
not working, you can also perform this process manually with this command:
|
||||
|
||||
python buildscripts/engflow_auth.py
|
||||
|
||||
@ -1,8 +1,12 @@
|
||||
# Header Relocation and Cycle Resolution
|
||||
|
||||
1. Locate all the targets that reference the header file in BUILD.bazel files.
|
||||
2. Find an ideal target to declare the header under. This is usually under the target that features the .cpp file of the same name. Otherwise, the header can be placed in its own library.
|
||||
3. Ensure that all the targets that need this header can depend on the target the header was moved to.
|
||||
4. Run `bazel build //src/...` to check for build failures (look for failures related to dependency cycles).
|
||||
5. If the build fails because of a dependency cycle, you may need to split up the dependent library or relocate the header.
|
||||
2. Find an ideal target to declare the header under. This is usually under the target that features
|
||||
the .cpp file of the same name. Otherwise, the header can be placed in its own library.
|
||||
3. Ensure that all the targets that need this header can depend on the target the header was moved
|
||||
to.
|
||||
4. Run `bazel build //src/...` to check for build failures (look for failures related to dependency
|
||||
cycles).
|
||||
5. If the build fails because of a dependency cycle, you may need to split up the dependent library
|
||||
or relocate the header.
|
||||
6. Once the build succeeds, please create a PR and include `devprod-build` for review.
|
||||
|
||||
@ -1,8 +1,7 @@
|
||||
# Remote execution images
|
||||
|
||||
The Dockerfiles for remote execution images are autogenerated to pin all
|
||||
versions and allow for updates at the same time. To repin the image hashes and
|
||||
package versions:
|
||||
The Dockerfiles for remote execution images are autogenerated to pin all versions and allow for
|
||||
updates at the same time. To repin the image hashes and package versions:
|
||||
|
||||
```bash
|
||||
# With Bazel
|
||||
|
||||
@ -1,16 +1,22 @@
|
||||
# About
|
||||
|
||||
This documents some useful tools, concepts, and debugging strategies for bazel toolchains.
|
||||
This information was gathered while developing the WASI SDK toolchain.
|
||||
This documents some useful tools, concepts, and debugging strategies for bazel toolchains. This
|
||||
information was gathered while developing the WASI SDK toolchain.
|
||||
|
||||
# Concepts
|
||||
|
||||
[Toolchain](https://bazel.build/extending/toolchains#debugging-toolchains) and [Platform](https://bazel.build/extending/platforms) are the core relevant concepts.
|
||||
Toolchains define the tools used to compile, and the platform defines either the execution platform (for the compilation/compiler tools) or the target platform (for the binary).
|
||||
Bazel tries to search for a toolchain based on these constraints.
|
||||
[Toolchain](https://bazel.build/extending/toolchains#debugging-toolchains) and
|
||||
[Platform](https://bazel.build/extending/platforms) are the core relevant concepts. Toolchains
|
||||
define the tools used to compile, and the platform defines either the execution platform (for the
|
||||
compilation/compiler tools) or the target platform (for the binary). Bazel tries to search for a
|
||||
toolchain based on these constraints.
|
||||
|
||||
We also made use of [transitions](https://bazel.build/rules/lib/builtins/transition) which allow bazel to reconfigure itself before building a target to avoid passing irrelevant or incorrect compiler flags (e.g. WASI SDK doesn't support shared objects).
|
||||
Similarly, we used [actions](https://bazel.build/docs/cc-toolchain-config-reference#using-action-config) instead of the tool paths attribute because of, [possibly historical, lack of support for remote resources in tool paths](https://stackoverflow.com/questions/73504780/bazel-reference-binaries-from-packages-in-custom-toolchain-definition/73505313#73505313).
|
||||
We also made use of [transitions](https://bazel.build/rules/lib/builtins/transition) which allow
|
||||
bazel to reconfigure itself before building a target to avoid passing irrelevant or incorrect
|
||||
compiler flags (e.g. WASI SDK doesn't support shared objects). Similarly, we used
|
||||
[actions](https://bazel.build/docs/cc-toolchain-config-reference#using-action-config) instead of the
|
||||
tool paths attribute because of,
|
||||
[possibly historical, lack of support for remote resources in tool paths](https://stackoverflow.com/questions/73504780/bazel-reference-binaries-from-packages-in-custom-toolchain-definition/73505313#73505313).
|
||||
|
||||
# Debugging tools
|
||||
|
||||
@ -20,13 +26,15 @@ Similarly, we used [actions](https://bazel.build/docs/cc-toolchain-config-refere
|
||||
bazel ... --toolchain_resolution_debug=.* ...
|
||||
```
|
||||
|
||||
The above flag can be used to debug toolchain resolution as bazel tries to automatically satisfy constraints.
|
||||
The above flag can be used to debug toolchain resolution as bazel tries to automatically satisfy
|
||||
constraints.
|
||||
|
||||
## Debugging Remote Resources
|
||||
|
||||
Toolchains may be remotely fetched, but the directory structure of the build environment after these remote resources are fetched may not be clear.
|
||||
`bazel info` can be used to find the bazel directory and inspect it (`bazel info output_base`).
|
||||
Note: this may be different depending on your configuration and level of sandboxing.
|
||||
Toolchains may be remotely fetched, but the directory structure of the build environment after these
|
||||
remote resources are fetched may not be clear. `bazel info` can be used to find the bazel directory
|
||||
and inspect it (`bazel info output_base`). Note: this may be different depending on your configuration
|
||||
and level of sandboxing.
|
||||
|
||||
This is particularly useful when used in combination with the `find` command as shown below.
|
||||
|
||||
@ -42,10 +50,11 @@ Note: this command is directory dependent because output_base is per bazel insta
|
||||
bazel ... -s ...
|
||||
```
|
||||
|
||||
This will show verbose output such as cd actions and compiler/linker invocations.
|
||||
Note: bazel may recast paths relative to the exec directory.
|
||||
This will show verbose output such as cd actions and compiler/linker invocations. Note: bazel may
|
||||
recast paths relative to the exec directory.
|
||||
|
||||
## Debugging on Engflow
|
||||
|
||||
Engflow has a lot of helpful views showing remote execution stats and the remote file structure.
|
||||
We don't intend to duplicate their documentation, but be careful as some of their data (particularly remotely executed actions) may not be accurate immediately after execution.
|
||||
Engflow has a lot of helpful views showing remote execution stats and the remote file structure. We
|
||||
don't intend to duplicate their documentation, but be careful as some of their data (particularly
|
||||
remotely executed actions) may not be accurate immediately after execution.
|
||||
|
||||
@ -38,18 +38,21 @@ resmoke_suite_test(
|
||||
|
||||
### Test Sharding
|
||||
|
||||
Test sharding allows you to split a large test suite across multiple parallel test executions, significantly reducing total test time. When `shard_count` is specified, Bazel will:
|
||||
Test sharding allows you to split a large test suite across multiple parallel test executions,
|
||||
significantly reducing total test time. When `shard_count` is specified, Bazel will:
|
||||
|
||||
1. Run the test target multiple times in parallel (up to the specified shard count)
|
||||
2. Each shard receives a unique shard index (0 to N-1)
|
||||
3. The resmoke runner uses these values to determine which subset of tests to run in each shard
|
||||
4. Each shard produces its own test output and logs
|
||||
|
||||
Note: sharding is an alternative to the resmoke `--jobs` flag, which should not be used with `resmoke_suite_test`.
|
||||
Note: sharding is an alternative to the resmoke `--jobs` flag, which should not be used with
|
||||
`resmoke_suite_test`.
|
||||
|
||||
### Test Logs and Output Directory
|
||||
|
||||
Bazel creates a dedicated output directory for each test run under the `bazel-testlogs` symlink in your workspace root.
|
||||
Bazel creates a dedicated output directory for each test run under the `bazel-testlogs` symlink in
|
||||
your workspace root.
|
||||
|
||||
For a test target `//jstests/suites/query-execution:core`, the outputs are like:
|
||||
|
||||
@ -78,7 +81,8 @@ bazel test //jstests/suites/query-execution:core --test_sharding_strategy=disabl
|
||||
|
||||
#### Run with additional resmoke flags:
|
||||
|
||||
Any `--test_arg` in the bazel command will be propagated as a flag to resmoke.py. To modify the resmoke invocation with any of resmoke's flags, add them as `--test_arg`s.
|
||||
Any `--test_arg` in the bazel command will be propagated as a flag to resmoke.py. To modify the
|
||||
resmoke invocation with any of resmoke's flags, add them as `--test_arg`s.
|
||||
|
||||
```
|
||||
# Runs all tests from the core suite with timeseries in their name, twice, with all feature flags enabled.
|
||||
|
||||
@ -11,7 +11,8 @@ To use the WASI SDK apply the `wasi_compatible` with a select statement:
|
||||
})
|
||||
```
|
||||
|
||||
If your target is defined in terms of a traditional bazel C/C++ target you can use the WASI transition in order to ensure the bazel options are WASI compatible.
|
||||
If your target is defined in terms of a traditional bazel C/C++ target you can use the WASI
|
||||
transition in order to ensure the bazel options are WASI compatible.
|
||||
|
||||
```python
|
||||
load("//bazel/toolchains/cc/wasm/toolchain:with_wasi_config.bzl", "with_wasi_config")
|
||||
|
||||
@ -17,8 +17,8 @@ For background on Antithesis, the base images, and the broader CI pipeline, see
|
||||
|
||||
Scripts must be executable and live directly under the template directory (not in subdirectories).
|
||||
The prefix of the filename determines scheduling behavior. Any file that doesn't match a known
|
||||
prefix — including files in subdirectories or files prefixed with `helper_` — is ignored by
|
||||
Test Composer and can be used for shared logic.
|
||||
prefix — including files in subdirectories or files prefixed with `helper_` — is ignored by Test
|
||||
Composer and can be used for shared logic.
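As an informal illustration of the naming rule, the sketch below classifies a script path by its prefix. It covers only the prefixes named in this document (the full set recognized by Test Composer may be larger), it does not check the executable bit, and `command_kind` is a hypothetical helper rather than anything Test Composer provides.

```python
# Prefixes named in this document; the real Test Composer set may be larger.
KNOWN_PREFIXES = (
    "parallel_driver_",
    "singleton_driver_",
    "serial_driver_",
    "first_",
    "eventually_",
    "anytime_",
)


def command_kind(relative_path):
    """Return the matching prefix (without trailing underscore), or None if ignored."""
    # Files in subdirectories of the template directory are never scheduled.
    if "/" in relative_path:
        return None
    for prefix in KNOWN_PREFIXES:
        if relative_path.startswith(prefix):
            return prefix.rstrip("_")
    # Anything else, including helper_* files, is free for shared logic.
    return None


print(command_kind("parallel_driver_mongod_aggregate.sh"))  # parallel_driver
print(command_kind("helper_retry.sh"))  # None
```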
|
||||
|
||||
### Driver commands
|
||||
|
||||
@ -27,18 +27,18 @@ Run during fault injection periods. At least one driver or `anytime_*` command i
|
||||
- **`parallel_driver_<name>`** — runs concurrently with other parallel drivers, including itself.
|
||||
Use for continuous client operations, parallel workloads, and availability checks under faults.
|
||||
|
||||
- **`singleton_driver_<name>`** — runs as the only active driver in a history branch.
|
||||
Use for porting existing integration tests or workloads that shouldn't overlap with other drivers.
|
||||
- **`singleton_driver_<name>`** — runs as the only active driver in a history branch. Use for
|
||||
porting existing integration tests or workloads that shouldn't overlap with other drivers.
|
||||
|
||||
- **`serial_driver_<name>`** — runs only when no other driver commands are active.
|
||||
Use for validation steps and operations that require quiescence.
|
||||
- **`serial_driver_<name>`** — runs only when no other driver commands are active. Use for
|
||||
validation steps and operations that require quiescence.
|
||||
|
||||
### Quiescent commands
|
||||
|
||||
Run in the absence of faults.
|
||||
|
||||
- **`first_<name>`** — optional one-time setup that runs once before any driver commands start.
|
||||
Use for data initialization, schema setup, and bootstrapping.
|
||||
- **`first_<name>`** — optional one-time setup that runs once before any driver commands start. Use
|
||||
for data initialization, schema setup, and bootstrapping.
|
||||
|
||||
- **`eventually_<name>`** — runs after driver commands start; halts all drivers and stops faults,
|
||||
creating a new history branch. Use for testing eventual consistency and post-recovery state.
|
||||
@ -57,8 +57,8 @@ Run in the absence of faults.
|
||||
### `basic_js_commands`
|
||||
|
||||
Parallel JavaScript workload against a single `mongod`. All commands share retry logic defined in
|
||||
[`js/commands.js`](basic_js_commands/js/commands.js) that handles transient network errors,
|
||||
server selection failures, and retryable write errors.
|
||||
[`js/commands.js`](basic_js_commands/js/commands.js) that handles transient network errors, server
|
||||
selection failures, and retryable write errors.
|
||||
|
||||
| Script | Function | Notes |
|
||||
| ------------------------------------------------ | ----------------------------- | --------------------------------------------------------------------------- |
|
||||
@ -86,13 +86,13 @@ infrastructure for Test Composer. Both scripts use
|
||||
|
||||
## Best practices
|
||||
|
||||
- **Retry logic** — always handle transient network errors and server selection failures.
|
||||
See [`commands.js`](basic_js_commands/js/commands.js) for a reusable retry wrapper.
|
||||
- **Retry logic** — always handle transient network errors and server selection failures. See
|
||||
[`commands.js`](basic_js_commands/js/commands.js) for a reusable retry wrapper.
|
||||
- **Randomize** — the more variation you introduce, the more state space Antithesis can explore.
|
||||
Antithesis controls and can reproduce the random seed, so interesting paths can be re-explored.
|
||||
- **Idempotency** — design scripts to tolerate being killed and restarted at any point.
|
||||
- **Start simple** — begin with a `singleton_driver_*` to port an existing test, then evolve
|
||||
toward parallel drivers as confidence grows.
|
||||
- **Start simple** — begin with a `singleton_driver_*` to port an existing test, then evolve toward
|
||||
parallel drivers as confidence grows.
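The retry advice above might look something like the following in outline. This is a Python sketch, not the wrapper in `commands.js`, whose contents are not shown here; `with_retries`, its parameters, and the use of `ConnectionError` as a stand-in for a transient error are all assumptions.

```python
import random
import time


def with_retries(operation, attempts=5, base_delay=0.01,
                 retryable=(ConnectionError,), sleep=time.sleep):
    """Call operation(), retrying retryable errors with jittered backoff."""
    for attempt in range(attempts):
        try:
            return operation()
        except retryable:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Exponential backoff with random jitter between attempts.
            sleep(base_delay * (2 ** attempt) * random.uniform(0.5, 1.5))


calls = []


def flaky():
    """Fail twice with a transient error, then succeed."""
    calls.append(1)
    if len(calls) < 3:
        raise ConnectionError("transient")
    return "ok"


print(with_retries(flaky))  # ok
```

The random jitter also fits the "Randomize" advice, and making the wrapped operation idempotent keeps a retry safe if the script is killed mid-attempt.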
|
||||
|
||||
## Running locally
|
||||
|
||||
@ -126,8 +126,8 @@ docker compose -f docker_compose/<suite_name>/docker-compose.yml \
|
||||
/opt/antithesis/test/v1/basic_js_commands/parallel_driver_mongod_aggregate.sh
|
||||
```
|
||||
|
||||
The `/scripts/print_connection_string.sh` helper used by each script is generated automatically
|
||||
from the resmoke fixture's connection string and placed in the config image during the build step.
|
||||
The `/scripts/print_connection_string.sh` helper used by each script is generated automatically from
|
||||
the resmoke fixture's connection string and placed in the config image during the build step.
|
||||
|
||||
## Adding a new template
|
||||
|
||||
|
||||
@ -4,13 +4,19 @@ This directory is a bazel rule we use to ship common code between bazel repos
|
||||
|
||||
# Using in your repo
|
||||
|
||||
1. Look at the latest version in [this](https://github.com/mongodb/mongo/blob/master/buildscripts/bazel_rules_mongo/pyproject.toml) file
|
||||
1. Look at the latest version in
|
||||
[this](https://github.com/mongodb/mongo/blob/master/buildscripts/bazel_rules_mongo/pyproject.toml)
|
||||
file
|
||||
|
||||
2. Get the sha of the latest release at https://mdb-build-public.s3.amazonaws.com/bazel_rules_mongo/{version}/bazel_rules_mongo.tar.gz.sha256
|
||||
2. Get the sha of the latest release at
|
||||
https://mdb-build-public.s3.amazonaws.com/bazel_rules_mongo/{version}/bazel_rules_mongo.tar.gz.sha256
|
||||
|
||||
3. Get the link to the latest version at https://mdb-build-public.s3.amazonaws.com/bazel_rules_mongo/{version}/bazel_rules_mongo.tar.gz
|
||||
3. Get the link to the latest version at
|
||||
https://mdb-build-public.s3.amazonaws.com/bazel_rules_mongo/{version}/bazel_rules_mongo.tar.gz
|
||||
|
||||
4. Add this as a http archive to your repo and implement the dependencies listed in the [WORKSPACE](https://github.com/mongodb/mongo/blob/master/buildscripts/bazel_rules_mongo/WORKSPACE.bazel) file. It will look something like this
|
||||
4. Add this as a http archive to your repo and implement the dependencies listed in the
|
||||
[WORKSPACE](https://github.com/mongodb/mongo/blob/master/buildscripts/bazel_rules_mongo/WORKSPACE.bazel)
|
||||
file. It will look something like this
|
||||
|
||||
```
|
||||
# Poetry rules for managing Python dependencies
|
||||
@ -50,7 +56,8 @@ poetry(
|
||||
)
|
||||
```
|
||||
|
||||
5. Use the rule however you see fit! For example to add `bazel run codeowners` to your repo you can add the following to your root `BUILD.bazel` file
|
||||
5. Use the rule however you see fit! For example to add `bazel run codeowners` to your repo you can
|
||||
add the following to your root `BUILD.bazel` file
|
||||
|
||||
```
|
||||
alias(
|
||||
@ -61,5 +68,7 @@ alias(
|
||||
|
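Steps 2 and 3 above both substitute the same `{version}` into one URL pattern. A minimal sketch of that substitution, assuming the version string is copied by hand from `pyproject.toml`; the helper name `release_urls` and the sample version in the usage note are hypothetical and not part of the repo:

```python
# Hypothetical helper: builds the two download URLs named in steps 2 and 3.
BASE_URL = "https://mdb-build-public.s3.amazonaws.com/bazel_rules_mongo"


def release_urls(version):
    """Return (archive_url, sha256_url) for a given bazel_rules_mongo version."""
    archive_url = f"{BASE_URL}/{version}/bazel_rules_mongo.tar.gz"
    # The checksum file lives next to the archive with a .sha256 suffix.
    return archive_url, archive_url + ".sha256"
```

For example, `release_urls("1.2.3")` (a made-up version) returns the archive URL to use in the http archive and the URL of the file holding its sha256.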
||||
# Deploying
|
||||
|
||||
When you are ready for a new version to be released, bump the version in the [pyproject.toml](https://github.com/mongodb/mongo/blob/master/buildscripts/bazel_rules_mongo/pyproject.toml) file.
|
||||
This will be deployed the next time the `package_bazel_rules_mongo` task runs (nightly). You can schedule this earlier in the waterfall when your pr is merged if you want it quicker.
|
||||
When you are ready for a new version to be released, bump the version in the
|
||||
[pyproject.toml](https://github.com/mongodb/mongo/blob/master/buildscripts/bazel_rules_mongo/pyproject.toml)
|
||||
file. This will be deployed the next time the `package_bazel_rules_mongo` task runs (nightly). You
|
||||
can schedule this earlier in the waterfall when your pr is merged if you want it quicker.
|
||||
|
||||
@ -3,4 +3,5 @@ This is cltcache.py.txt taken from
|
||||
CLTCACHE_URL = "https://raw.githubusercontent.com/freedick/cltcache/1.2.2/src/cltcache/cltcache.py"
|
||||
CLTCACHE_SHA256 = "30d9bf6d3615eab1826d5e24aea54873de034014c1e77506c9ff983e1e858b3c"
|
||||
|
||||
A small simple clang tidy cacher used with vscode which does not use bazel to run clang tidy. The extension is used to avoid linting and changing the file from its source.
|
||||
A small simple clang tidy cacher used with vscode which does not use bazel to run clang tidy. The
|
||||
extension is used to avoid linting and changing the file from its source.
|
||||
|
||||
@ -18,7 +18,8 @@ source python3-venv/bin/activate
|
||||
(python3-venv) bazel build --config=opt install-devcore
|
||||
```
|
||||
|
||||
3. Run mongod instance (only for CBR calibration, because join_start.py manages mongod's lifecycle itself):
|
||||
3. Run mongod instance (only for CBR calibration, because join_start.py manages mongod's lifecycle
|
||||
itself):
|
||||
|
||||
```sh
|
||||
(python3-venv) bazel-bin/install-mongod/bin/mongod --setParameter internalMeasureQueryExecutionTimeInNanoseconds=true
|
||||
@ -74,16 +75,21 @@ source cm/bin/activate
|
||||
```sh
|
||||
(cm) python join_start.py
|
||||
```
|
||||
To skip the constant calibration (warm scan, CPU, sequential I/O, random I/O) and only run the join algorithm comparison:
|
||||
To skip the constant calibration (warm scan, CPU, sequential I/O, random I/O) and only run the
|
||||
join algorithm comparison:
|
||||
```sh
|
||||
(cm) python join_start.py --join-only
|
||||
```
|
||||
To iterate quickly on cost model changes, reuse pre-recorded execution times from a previous full run. This skips actual query execution, only running `queryPlanner` explains to collect fresh cost estimates:
|
||||
To iterate quickly on cost model changes, reuse pre-recorded execution times from a previous full
|
||||
run. This skips actual query execution, only running `queryPlanner` explains to collect fresh cost
|
||||
estimates:
|
||||
```sh
|
||||
(cm) python join_start.py --execution-times join_output/join_times_in-cache.csv join_output/join_times_exceeds-cache.csv
|
||||
```
|
||||
|
||||
**Note:** For CBR calibration, the first time it will take a while since it has to generate the data. Afterwards, as long as you aren't modifying the collections, you can comment out `await generator.populate_collections()` in `start.py` - this will make it a lot faster.
|
||||
**Note:** For CBR calibration, the first time it will take a while since it has to generate the
|
||||
data. Afterwards, as long as you aren't modifying the collections, you can comment out
|
||||
`await generator.populate_collections()` in `start.py` - this will make it a lot faster.
|
||||
|
||||
8. When done, deactivate the environment:
|
||||
|
||||
|
||||
@ -1 +1,2 @@
|
||||
> Content moved to [buildscripts/resmokeconfig/suites/README.md](../../buildscripts/resmokeconfig/suites/README.md).
|
||||
> Content moved to
|
||||
> [buildscripts/resmokeconfig/suites/README.md](../../buildscripts/resmokeconfig/suites/README.md).
|
||||
|
||||
@ -1,13 +1,14 @@
|
||||
# mongo gpg builds
|
||||
|
||||
This directory contains a script to produce **portable `gpg` binaries** for all our supported linux platforms:
|
||||
This directory contains a script to produce **portable `gpg` binaries** for all our supported linux
|
||||
platforms:
|
||||
|
||||
- **Linux** (`manylinux2014` glibc 2.17 baseline): `x86_64`, `aarch64`, `s390x`, `ppc64le`
|
||||
|
||||
In particular, it builds gnupg-2.5.16 from source.
|
||||
|
||||
This script is used to generate the binaries that we bring into bazel as a dependency to sign test extensions.
|
||||
All artifacts are placed in the `dist/` directory.
|
||||
This script is used to generate the binaries that we bring into bazel as a dependency to sign
|
||||
test extensions. All artifacts are placed in the `dist/` directory.
|
||||
|
||||
---
|
||||
|
||||
@ -61,8 +62,8 @@ ARCH=ppc64le PLATFORM=linux/ppc64le ./build_gpg_manylinux.sh
|
||||
|
||||
## 📜 License & Attribution
|
||||
|
||||
These scripts build **gpg** and its required dependencies from sources originally obtained from:
|
||||
👉 <https://www.gnupg.org/ftp/gcrypt/gnupg/> and <https://gnupg.org/download/index.html>
|
||||
These scripts build **gpg** and its required dependencies from sources originally obtained from: 👉
|
||||
<https://www.gnupg.org/ftp/gcrypt/gnupg/> and <https://gnupg.org/download/index.html>
|
||||
|
||||
The exact sources can be obtained at the following URLs:
|
||||
|
||||
|
||||
@ -1,12 +1,14 @@
|
||||
# mongo rapidyaml wheel builds
|
||||
|
||||
This directory contains scripts to produce versioned `rapidyaml` wheels that can be uploaded to S3 and consumed directly instead of building from the git dependency in `pyproject.toml`.
|
||||
This directory contains scripts to produce versioned `rapidyaml` wheels that can be uploaded to S3
|
||||
and consumed directly instead of building from the git dependency in `pyproject.toml`.
|
||||
|
||||
The scripts default to the `rapidyaml` commit currently pinned in `pyproject.toml`:
|
||||
|
||||
- `a5d485fd44719e1c03e059177fc1f695fc462b66`
|
||||
|
||||
They also require `RAPIDYAML_VERSION` to be set explicitly. The MongoDB fork does not currently publish git tags, so `setuptools-scm` cannot infer a stable release version on its own.
|
||||
They also require `RAPIDYAML_VERSION` to be set explicitly. The MongoDB fork does not currently
|
||||
publish git tags, so `setuptools-scm` cannot infer a stable release version on its own.
|
||||
|
||||
All artifacts are written to `dist/`.
|
||||
|
||||
@ -47,11 +49,14 @@ RAPIDYAML_VERSION=0.9.0.post0 ARCH=ppc64le PLATFORM=linux/ppc64le ./build_rapidy
|
||||
|
||||
### macOS
|
||||
|
||||
Run the script on each target macOS architecture you want to publish. The script intentionally builds for the host arch only, which keeps wheel tags and interpreter usage straightforward.
|
||||
Run the script on each target macOS architecture you want to publish. The script intentionally
|
||||
builds for the host arch only, which keeps wheel tags and interpreter usage straightforward.
|
||||
|
||||
The script creates and uses a temporary virtualenv, so it works with Homebrew-managed Python installations that reject direct `pip install` into the system environment.
|
||||
The script creates and uses a temporary virtualenv, so it works with Homebrew-managed Python
|
||||
installations that reject direct `pip install` into the system environment.
|
||||
|
||||
It also leaves `Python.framework` external during delocation, so the wheel should be built with the same Python distribution family you expect consumers to use.
|
||||
It also leaves `Python.framework` external during delocation, so the wheel should be built with the
|
||||
same Python distribution family you expect consumers to use.
|
||||
|
||||
```bash
|
||||
RAPIDYAML_VERSION=0.9.0.post0 PYTHON_BIN=python3.13 ./build_rapidyaml_macos.sh
|
||||
@ -67,15 +72,19 @@ $env:PYTHON_BIN = "C:\Python313\python.exe"
|
||||
.\build_rapidyaml_windows_x64.ps1
|
||||
```
|
||||
|
||||
Note: `pyproject.toml` currently excludes `rapidyaml` on Windows, so a Windows wheel is only needed if that marker changes later.
|
||||
Note: `pyproject.toml` currently excludes `rapidyaml` on Windows, so a Windows wheel is only needed
|
||||
if that marker changes later.
|
||||
|
||||
## Build Behavior
|
||||
|
||||
- The Linux script builds inside the appropriate `manylinux2014` image and runs `auditwheel repair`.
|
||||
- The macOS script creates a temporary virtualenv, installs its build tooling there, and runs `delocate-wheel` while excluding `Python.framework` from bundling.
|
||||
- The macOS script creates a temporary virtualenv, installs its build tooling there, and runs
|
||||
`delocate-wheel` while excluding `Python.framework` from bundling.
|
||||
- The Windows script runs `delvewheel repair` after building.
|
||||
- Every script clones the `mongodb-forks/rapidyaml` repo, checks out the requested ref, initializes submodules, builds a wheel, and performs a simple `import ryml` smoke test.
|
||||
- Linux defaults to `cp313-cp313`, which matches the repo's current Python version. Override that when you need a wheel for a different interpreter.
|
||||
- Every script clones the `mongodb-forks/rapidyaml` repo, checks out the requested ref, initializes
|
||||
submodules, builds a wheel, and performs a simple `import ryml` smoke test.
|
||||
- Linux defaults to `cp313-cp313`, which matches the repo's current Python version. Override that
|
||||
when you need a wheel for a different interpreter.
|
||||
|
||||
## Environment Variables
|
||||
|
||||
@ -94,7 +103,8 @@ Note: `pyproject.toml` currently excludes `rapidyaml` on Windows, so a Windows w
|
||||
|
||||
## Consuming the Wheels
|
||||
|
||||
Once the wheels are uploaded, you can replace the current git dependency in `pyproject.toml` with URL-based entries scoped by platform markers.
|
||||
Once the wheels are uploaded, you can replace the current git dependency in `pyproject.toml` with
|
||||
URL-based entries scoped by platform markers.
|
||||
|
||||
For example:
|
||||
|
||||
|
||||
@ -1,12 +1,14 @@
|
||||
# mongo ripgrep builds
|
||||
|
||||
This directory contains scripts to produce **portable, high-performance `ripgrep` binaries** for all major platforms:
|
||||
This directory contains scripts to produce **portable, high-performance `ripgrep` binaries** for all
|
||||
major platforms:
|
||||
|
||||
- **Linux** (`manylinux2014` glibc 2.17 baseline): `x86_64`, `aarch64`, `s390x`, `ppc64le`
|
||||
- **macOS** universal2 (`x86_64` + `arm64`)
|
||||
- **Windows** x86_64 (MSVC)
|
||||
|
||||
Each build uses **bundled static PCRE2**, **LTO**, and conservative CPU baselines to maximize portability.
|
||||
Each build uses **bundled static PCRE2**, **LTO**, and conservative CPU baselines to maximize
|
||||
portability.
|
||||
All artifacts are placed in the `dist/` directory.
|
||||
|
||||
---
|
||||
|
||||
@ -1,54 +1,79 @@
|
||||
# Block-on-Red
|
||||
|
||||
> **TL;DR:** During times of high BF volume, code approvals and merging in 10gen/mongo master will be restricted to only allow changes that help reduce BFs, Bugs, Performance Regressions, and paying down technical debt.
|
||||
> **TL;DR:** During times of high BF volume, code approvals and merging in 10gen/mongo master will
|
||||
> be restricted to only allow changes that help reduce BFs, Bugs, Performance Regressions, and
|
||||
> paying down technical debt.
|
||||
|
||||
### Motivation
|
||||
|
||||
The master branch should remain stable to develop the Server efficiently, and to be within 30 days of releasing at all times. If it becomes too unstable, or "too red," we want to aggressively focus on getting it back into the green. As a side benefit to releasability, a "greener" build should make patch build failures more meaningful. This will also reduce release time stress by having the release time period look and feel more like normal business.
|
||||
The master branch should remain stable to develop the Server efficiently, and to be within 30 days
|
||||
of releasing at all times. If it becomes too unstable, or "too red," we want to aggressively focus
|
||||
on getting it back into the green. As a side benefit to releasability, a "greener" build should make
|
||||
patch build failures more meaningful. This will also reduce release time stress by having the
|
||||
release time period look and feel more like normal business.
|
||||
|
||||
### Strategy
|
||||
|
||||
Each team carries a quota (see below for details). When a team exceeds their quota - they enter a "code lockdown".
|
||||
Each team carries a quota (see below for details). When a team exceeds their quota - they enter a
|
||||
"code lockdown".
|
||||
|
||||
- **Team Level**: The intention here is to stop work with a small blast radius in the first instance, and address the releasability risk from that team and their owned code.
|
||||
- **VP Level**: We roll the quotas up to a VP’s entire organization as the next step of "code lockdown". The expectation is that redirecting resources within a VP’s organization to help address BFs is likely more effective and less disruptive than a global freeze.
|
||||
- **Global Level**: Finally, if the global quota is exceeded, the entire server organization enters a "code lockdown" until we meet the threshold for unfreezing.
|
||||
- **Team Level**: The intention here is to stop work with a small blast radius in the first
|
||||
instance, and address the releasability risk from that team and their owned code.
|
||||
- **VP Level**: We roll the quotas up to a VP’s entire organization as the next step of "code
|
||||
lockdown". The expectation is that redirecting resources within a VP’s organization to help
|
||||
address BFs is likely more effective and less disruptive than a global freeze.
|
||||
- **Global Level**: Finally, if the global quota is exceeded, the entire server organization enters
|
||||
a "code lockdown" until we meet the threshold for unfreezing.
|
||||
|
||||
## Impact of a "Code Lockdown"
|
||||
|
||||
### Allowed Code Changes
|
||||
|
||||
During a "code lockdown," Code Owners are expected to only approve **work that closes BFs or helps us reduce/avoid the _next_ Blocking state**. i.e. aimed at fixing a BF, a class of BFs, bugs, performance regression, etc.
|
||||
During a "code lockdown," Code Owners are expected to only approve **work that closes BFs or helps
|
||||
us reduce/avoid the _next_ Blocking state**. i.e. aimed at fixing a BF, a class of BFs, bugs,
|
||||
performance regression, etc.
|
||||
|
||||
If your PR does not meet these criteria, it may be pending for some time until the system becomes unblocked. There are of course reasonable exceptions, below.
|
||||
If your PR does not meet these criteria, it may be pending for some time until the system becomes
|
||||
unblocked. There are of course reasonable exceptions, below.
|
||||
|
||||
### Feature Work
|
||||
|
||||
**All feature work stops** during a "code lockdown."
|
||||
In exceptional circumstances VPs can approve exceptions.
|
||||
**All feature work stops** during a "code lockdown." In exceptional circumstances VPs can approve
|
||||
exceptions.
|
||||
|
||||
### Non-feature Work
|
||||
|
||||
We understand that in many cases addressing the larger BF problem requires refactoring, modularity improvements, changes to our test and paying down other kinds of **technical debt**. During a "code lockdown" this work is **expressly permitted and mergeable** - with the guidance that teams index heavily on risk when deciding what to work on. If a piece of work feels like it makes the BF problem worse before it gets better, talk to your director about how to proceed.
|
||||
We understand that in many cases addressing the larger BF problem requires refactoring, modularity
|
||||
improvements, changes to our test and paying down other kinds of **technical debt**. During a "code
|
||||
lockdown" this work is **expressly permitted and mergeable** - with the guidance that teams index
|
||||
heavily on risk when deciding what to work on. If a piece of work feels like it makes the BF problem
|
||||
worse before it gets better, talk to your director about how to proceed.
|
||||
|
||||
Allowable Examples (not exclusive):
|
||||
|
||||
- Refactoring components to make them more unit testable
|
||||
- Increasing code coverage through high quality tests that block PRs
|
||||
- Making the development loop faster (decreasing build times, fixing slow tests, etc)
|
||||
- Improving guardrails that improve code quality (fixing clang-tidy warnings, compiler warnings, etc)
|
||||
- Improving guardrails that improve code quality (fixing clang-tidy warnings, compiler warnings,
|
||||
etc)
|
||||
|
||||
If a team is in a lockdown, but the rest of the org is not - their focus should likely skew towards work that expedites their lockdown exit.
|
||||
If a team is in a lockdown, but the rest of the org is not - their focus should likely skew towards
|
||||
work that expedites their lockdown exit.
|
||||
|
||||
If the org is in a lockdown, but a team doesn’t have BFs to work on - they should balance helping other teams with the work they’ve identified as addressing the underlying BF problem.
|
||||
If the org is in a lockdown, but a team doesn’t have BFs to work on - they should balance helping
|
||||
other teams with the work they’ve identified as addressing the underlying BF problem.
|
||||
|
||||
The higher the risk of the work, the more involvement the Staff+ engineers and the Director/VP should have in the decision about what is ok to merge and what isn’t.
|
||||
The higher the risk of the work, the more involvement the Staff+ engineers and the Director/VP
|
||||
should have in the decision about what is ok to merge and what isn’t.
|
||||
|
||||
### Code Owner Responsibilities
|
||||
|
||||
Code Owners should join the `#10gen-mongo-code-lockdown` Slack channel to receive daily updates on the status of the build. It produces daily metrics with instructions if there is a state change.
|
||||
Code Owners should join the `#10gen-mongo-code-lockdown` Slack channel to receive daily updates on
|
||||
the status of the build. It produces daily metrics with instructions if there is a state change.
|
||||
|
||||
If we change to a blocking state, code owners should use their discretion to only approve changes that are allowed (see above). If we exit the blocking state, code owners should approve PRs as usual.
|
||||
If we change to a blocking state, code owners should use their discretion to only approve changes
|
||||
that are allowed (see above). If we exit the blocking state, code owners should approve PRs as
|
||||
usual.
|
||||
|
||||
## Quotas and State-Changes
|
||||
|
||||
@ -74,21 +99,31 @@ This shows relevant JIRA queries for a more live and interactive view of the sta
|
||||
|
||||
### BFs remaining open only on older branches
|
||||
|
||||
Some teams may fix a BF in master, but are "waiting for fix" on older branches, which keeps the BF counted against the thresholds. Guidance here is currently evolving.
|
||||
Some teams may fix a BF in master, but are "waiting for fix" on older branches, which keeps the BF
|
||||
counted against the thresholds. Guidance here is currently evolving.
|
||||
|
||||
If the build failure is not frequently occurring, it can be marked as P5-Trivial, and it won’t count towards your team’s build failures for the block merge.
|
||||
If the build failure is not frequently occurring, it can be marked as P5-Trivial, and it won’t count
|
||||
towards your team’s build failures for the block merge.
|
||||
|
||||
As we iterate on our processes for this, the `exclude-from-master-quota` label can be used to exclude BFs that should not be included in these quotas. The expectation is that this is an interim solution as we improve our processes especially around BFs that remain open pending backports.
|
||||
As we iterate on our processes for this, the `exclude-from-master-quota` label can be used to
|
||||
exclude BFs that should not be included in these quotas. The expectation is that this is an interim
|
||||
solution as we improve our processes especially around BFs that remain open pending backports.
|
||||
|
||||
Specifically:
|
||||
|
||||
- If a BF is only waiting for a backport on a branch older than master, apply the `exclude-from-master-quota` label to the ticket.
|
||||
- If a BF is failing on master, not a serious bug (or a test-only issue that can't affect the real clients), not noisy, and we are choosing not to fix it, set the Priority to `P5 - Trivial` and apply the `keep-trivial` label.
|
||||
- If a BF is failing on an older branch and we are choosing not to backport a fix, set the Priority to `P5 - Trivial` and apply the `keep-trivial-X.Y` label appropriately.
|
||||
- If a BF is only waiting for a backport on a branch older than master, apply the
|
||||
`exclude-from-master-quota` label to the ticket.
|
||||
- If a BF is failing on master, not a serious bug (or a test-only issue that can't affect the real
|
||||
clients), not noisy, and we are choosing not to fix it, set the Priority to `P5 - Trivial` and
|
||||
apply the `keep-trivial` label.
|
||||
- If a BF is failing on an older branch and we are choosing not to backport a fix, set the
|
||||
Priority to `P5 - Trivial` and apply the `keep-trivial-X.Y` label appropriately.
|
||||
|
||||
## Contributing
|
||||
|
||||
For any new proposals, changes to thresholds, or concerns regarding their application, please escalate to your Director/VP. **We want advocacy from all levels to make this a successful change to our engineering culture.**
|
||||
For any new proposals, changes to thresholds, or concerns regarding their application, please
|
||||
escalate to your Director/VP. **We want advocacy from all levels to make this a successful change to
|
||||
our engineering culture.**
|
||||
|
||||
### CLI
|
||||
|
||||
@ -100,7 +135,9 @@ python buildscripts/monitor_build_status/cli.py --help
|
||||
|
||||
### Testing locally
|
||||
|
||||
For Jira API authentication, use the `JIRA_AUTH_PAT` env variable. More about Jira Personal Access Tokens (PATs) can be found [here](https://wiki.corp.mongodb.com/pages/viewpage.action?pageId=218995581).
|
||||
For Jira API authentication, use the `JIRA_AUTH_PAT` env variable. More about Jira Personal Access
|
||||
Tokens (PATs) can be found
|
||||
[here](https://wiki.corp.mongodb.com/pages/viewpage.action?pageId=218995581).
|
||||
|
||||
Use your PAT to run the following and output its results:
|
||||
|
||||
@ -112,4 +149,6 @@ The above will _not_ send notifications to the Slack channel.
|
||||
|
||||
### Slack Notifications
|
||||
|
||||
Slack notifications use a webhook from the Devprod Correctness Slack app (rather than user credentials) for security. The webhook URL is read from the `mongo-code-lockdown-webhook` Evergreen expansion, which points to the `#10gen-mongo-code-lockdown` Slack channel.
|
||||
Slack notifications use a webhook from the Devprod Correctness Slack app (rather than user
|
||||
credentials) for security. The webhook URL is read from the `mongo-code-lockdown-webhook` Evergreen
|
||||
expansion, which points to the `#10gen-mongo-code-lockdown` Slack channel.
|
||||
|
||||
@ -3,27 +3,24 @@
|
||||
## Summary
|
||||
|
||||
Matrix Suites are defined as a combination of explicit
|
||||
[suite files](../../../buildscripts/resmokeconfig/suites/README.md)
|
||||
and a set of "overrides" for specific keys. The intention is
|
||||
to avoid duplication of suite definitions as much as
|
||||
possible with the eventual goal of having most suites be
|
||||
fully composed of reusable sections.
|
||||
[suite files](../../../buildscripts/resmokeconfig/suites/README.md) and a set of "overrides" for
|
||||
specific keys. The intention is to avoid duplication of suite definitions as much as possible with
|
||||
the eventual goal of having most suites be fully composed of reusable sections.
|
||||
|
||||
## Usage
|
||||
|
||||
Matrix suites behave like regular suites for all functionality in resmoke.py,
|
||||
including `list-suites`, `find-suites` and `run --suites=[SUITE]`.
|
||||
Matrix suites behave like regular suites for all functionality in resmoke.py, including
|
||||
`list-suites`, `find-suites` and `run --suites=[SUITE]`.
|
||||
|
||||
## Writing a matrix suite mapping file.
|
||||
|
||||
Matrix suites consist of a mapping, and a set of overrides in
|
||||
their eponymous directories. When you are done writing the mapping file, you must
|
||||
Matrix suites consist of a mapping, and a set of overrides in their eponymous directories. When you
|
||||
are done writing the mapping file, you must
|
||||
[generate the matrix suite file.](#generating-matrix-suites)
|
||||
|
||||
The "mappings" directory contains YAML files that each contain a suite definition.
|
||||
Each suite definition includes `base_suite`, and a list of
|
||||
modifiers. There is also an optional `description` field that will get output
|
||||
with the local resmoke invocation.
|
||||
The "mappings" directory contains YAML files that each contain a suite definition. Each suite
|
||||
definition includes `base_suite`, and a list of modifiers. There is also an optional `description`
|
||||
field that will get output with the local resmoke invocation.
|
||||
|
||||
The fields of modifiers are the following:
|
||||
|
||||
@ -33,30 +30,29 @@ The fields of modifiers are the following:
|
||||
4. extends
|
||||
|
||||
Each modifier field is a dot-delimited-notation representing the file and field of the modification.
|
||||
All modifier fields must be in a yaml file in the `overrides` directory.
|
||||
For example `encryption.mongodfixture_ese` would reference the `mongodfixture_ese` field
|
||||
inside of the `encryption.yml` file inside of the `overrides` directory.
|
||||
All modifier fields must be in a yaml file in the `overrides` directory. For example
|
||||
`encryption.mongodfixture_ese` would reference the `mongodfixture_ese` field inside of the
|
||||
`encryption.yml` file inside of the `overrides` directory.
|
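The dot-delimited notation can be read as "file stem, then field name". A minimal sketch of that reading, assuming the reference splits on its first dot; the function name `resolve_modifier` and its `overrides_dir` parameter are hypothetical, and this is not resmoke's actual lookup code:

```python
from pathlib import Path


def resolve_modifier(reference, overrides_dir="overrides"):
    """Split 'file.field' into the YAML file path and the field name inside it."""
    file_stem, _, field = reference.partition(".")
    if not file_stem or not field:
        raise ValueError(f"expected '<file>.<field>', got {reference!r}")
    # Assumption: each override file is named <file>.yml in the overrides directory.
    return Path(overrides_dir) / f"{file_stem}.yml", field
```

With the example from the text, `resolve_modifier("encryption.mongodfixture_ese")` yields the path `overrides/encryption.yml` and the field `mongodfixture_ese`.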
||||
|
||||
### overrides
|
||||
|
||||
All fields referenced in the `overrides` section of the mappings file will overwrite the specified
|
||||
fields in the `base_suite`.
|
||||
The `overrides` modifier takes precedence over the `excludes` and `eval` modifiers.
|
||||
The `overrides` list will be processed in order so order can matter if multiple override modifiers
|
||||
try to overwrite the same field in the base_suite.
|
||||
|
||||
fields in the `base_suite`. The `overrides` modifier takes precedence over the `excludes` and `eval`
|
||||
modifiers. The `overrides` list will be processed in order so order can matter if multiple override
|
||||
modifiers try to overwrite the same field in the base_suite.
|
||||
|
||||
### excludes
|
||||
|
||||
All fields referenced in the `excludes` section of the mappings file will append to the specified
|
||||
`exclude` fields in the base suite.
|
||||
The only two valid options in the referenced modifier field are `exclude_with_any_tags` and
|
||||
`exclude_files`. They are appended in the order they are specified in the mappings file.
|
||||
`exclude` fields in the base suite. The only two valid options in the referenced modifier field are
|
||||
`exclude_with_any_tags` and `exclude_files`. They are appended in the order they are specified in
|
||||
the mappings file.
|
||||
|
||||
### eval
|
||||
|
||||
All fields referenced in the `eval` section of the mappings file will append to the specified
|
||||
`config.shell_options.eval` field in the base suite.
|
||||
They are appended in the order they are specified in the mappings file.
|
||||
`config.shell_options.eval` field in the base suite. They are appended in the order they are
|
||||
specified in the mappings file.
|
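The three modifiers described so far compose roughly as follows. This is a sketch under stated assumptions, not resmoke's implementation: the exclude lists are assumed to live under `selector`, eval snippets are assumed to be joined with `"; "`, overrides are assumed to replace top-level keys, and the name `apply_modifiers` is hypothetical.

```python
import copy


def apply_modifiers(base_suite, overrides=(), excludes=(), evals=()):
    """Return a new suite dict with the modifiers applied (illustrative only)."""
    suite = copy.deepcopy(base_suite)

    # excludes: append to the two valid exclude lists, in the order given.
    selector = suite.setdefault("selector", {})
    for exclude in excludes:
        for key in ("exclude_with_any_tags", "exclude_files"):
            selector.setdefault(key, []).extend(exclude.get(key, []))

    # eval: append each snippet to config.shell_options.eval, in the order given.
    shell_options = suite.setdefault("config", {}).setdefault("shell_options", {})
    for snippet in evals:
        current = shell_options.get("eval", "")
        shell_options["eval"] = f"{current}; {snippet}" if current else snippet

    # overrides: applied last so they take precedence; a later override wins.
    for override in overrides:
        suite.update(override)
    return suite
```

Applying overrides last is one way to model "takes precedence"; if two overrides set the same key, the one listed later in the mapping file ends up in the result.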
||||
|
||||
### extends
|
||||
|
||||
@ -69,9 +65,8 @@ modifiers), the key being extended must already exist and also be a list.
|
||||
The generated matrix suites live in the `buildscripts/resmokeconfig/matrix_suites/generated_suites`
|
||||
directory. These files may be edited for local testing but must remain consistent with the mapping
|
||||
files. There is a task in the commit queue that enforces this. To generate a new version of these
|
||||
matrix suites, you may run
|
||||
`buildscripts/resmoke.py generate-matrix-suites`. This command
|
||||
will overwrite the current generated matrix suites on disk so make sure you do not have any unsaved
|
||||
matrix suites, you may run `buildscripts/resmoke.py generate-matrix-suites`. This command will
|
||||
overwrite the current generated matrix suites on disk so make sure you do not have any unsaved
|
||||
changes to these files.
|
||||
|
||||
## Validating matrix suites
|
||||
@ -82,5 +77,4 @@ ensures that the files are validated.
|
||||
|
||||
## FAQ
|
||||
|
||||
For questions about the user or authorship experience,
|
||||
please reach out in #server-testing.
|
||||
For questions about the user or authorship experience, please reach out in #server-testing.
|
||||
|
||||
@ -2,7 +2,8 @@
|
||||
|
||||
Test "suites" are configuration files that group which tests to run, and how.
|
||||
|
||||
Yaml files enumerate the test files that the suite encompasses, as well as any test fixtures and their configurations to leverage, options for the shell, hooks, and more.
|
||||
Yaml files enumerate the test files that the suite encompasses, as well as any test fixtures and
|
||||
their configurations to leverage, options for the shell, hooks, and more.
|
||||
|
||||
## Minimal Example
|
||||
|
||||
@ -64,7 +65,8 @@ Example:
|
||||
test_kind: js_test
|
||||
```
|
||||
|
||||
See all supported kinds in [`buildscripts/resmokelib/testing/testcases`](../../../buildscripts/resmokelib/testing/testcases/README.md).
|
||||
See all supported kinds in
|
||||
[`buildscripts/resmokelib/testing/testcases`](../../../buildscripts/resmokelib/testing/testcases/README.md).
|
||||
|
||||
## `selector`
|
||||
|
||||
@ -89,25 +91,34 @@ File path(s) of test files to include. If a path without a glob is provided, it
|
||||
|
||||
### `selector.root`
|
||||
|
||||
A file containing glob patterns, one per line, typically used by test_kind cpp_unit_test (usually build/unittests.txt). Specifies which tests to consider for including into the suite. If no other options are specified, these are the tests that will be run. Glob patterns are supported (and common) here.
|
||||
A file containing glob patterns, one per line, typically used by test_kind cpp_unit_test (usually
|
||||
build/unittests.txt). Specifies which tests to consider for including into the suite. If no other
|
||||
options are specified, these are the tests that will be run. Glob patterns are supported (and
|
||||
common) here.
|
||||
|
||||
### `selector.include_files`
|
||||
|
||||
A list of strings representing glob patterns. Includes only this subset of tests in the suite. These files will be included even if they would otherwise be excluded by tags. Will error if a test specified here was not included in the roots.
|
||||
A list of strings representing glob patterns. Includes only this subset of tests in the suite. These
|
||||
files will be included even if they would otherwise be excluded by tags. Will error if a test
|
||||
specified here was not included in the roots.
|
||||
|
||||
### `selector.exclude_files`
|
||||
|
||||
A list of strings representing glob patterns. Excludes this list of tests from the suite. These files will be excluded even if they would otherwise be included by tags. Will error if a test specified here was not included in the roots.
|
||||
A list of strings representing glob patterns. Excludes this list of tests from the suite. These
|
||||
files will be excluded even if they would otherwise be included by tags. Will error if a test
|
||||
specified here was not included in the roots.
|
||||
|
||||
### `selector.include_with_any_tags`
|
||||
|
||||
A list of strings. Only jstests which define a list of tags which includes any of these tags will be included in the suite, unless otherwise excluded by filename.
|
||||
A list of strings. Only jstests which define a list of tags which includes any of these tags will be
|
||||
included in the suite, unless otherwise excluded by filename.
|
||||
|
||||
To see all tags referenced across suites, run `./buildscripts/resmoke.py list-tags`.
|
||||
|
||||
### `selector.exclude_with_any_tags`
|
||||
|
||||
A list of strings. Any jstest which defines a list of tags which includes any of these tags will be excluded from the suite, unless otherwise included by filename.
|
||||
A list of strings. Any jstest which defines a list of tags which includes any of these tags will be
|
||||
excluded from the suite, unless otherwise included by filename.
|
||||
|
||||
To see all tags referenced across suites, run `./buildscripts/resmoke.py list-tags`.
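The interaction between the filename and tag selectors described above can be sketched in Python. This is a simplified illustration, not resmoke's implementation: it assumes exact file names instead of glob patterns, does not raise the error for files missing from the roots, and `select_tests` and the test names are hypothetical.

```python
def select_tests(roots, tags_by_test, include_files=(), exclude_files=(),
                 include_with_any_tags=(), exclude_with_any_tags=()):
    """Pick tests from `roots`; filename rules take priority over tag rules."""
    selected = []
    for test in roots:
        tags = set(tags_by_test.get(test, ()))
        if test in exclude_files:
            continue  # excluded by filename, regardless of tags
        if include_files:
            if test in include_files:
                selected.append(test)  # included by filename, regardless of tags
            continue
        if include_with_any_tags and not tags & set(include_with_any_tags):
            continue  # none of the required tags are present
        if tags & set(exclude_with_any_tags):
            continue  # at least one excluded tag is present
        selected.append(test)
    return selected


# Hypothetical roots and tags, for illustration only.
roots = ["a.js", "b.js", "c.js"]
tags_by_test = {"a.js": ["slow"], "b.js": ["requires_sharding"], "c.js": []}
print(select_tests(roots, tags_by_test, exclude_with_any_tags=["slow"]))
print(select_tests(roots, tags_by_test, include_files=["a.js"], exclude_with_any_tags=["slow"]))
```

In the second call, `a.js` is still selected even though it carries an excluded tag, because it was named explicitly.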
|
||||
|
||||
@ -118,9 +129,8 @@ Defines how the tests will be executed.
|
||||
### `executor.config`
|
||||
|
||||
This section contains additional configuration for each test. The structure of this can vary
|
||||
significantly based on the `test_kind`. For specific information, you can look at the
|
||||
implementation of the `test_kind` of concern in the `buildscripts/resmokelib/testing/testcases`
|
||||
directory.
|
||||
significantly based on the `test_kind`. For specific information, you can look at the implementation
|
||||
of the `test_kind` of concern in the `buildscripts/resmokelib/testing/testcases` directory.
|
||||
|
||||
Example:
|
||||
|
||||
@ -147,7 +157,9 @@ Any parameters (besides `global_vars`) will directly be passed to the mongo shel
|
||||
|
||||
##### `executor.config.shell_options.global_vars`
|
||||
|
||||
Will use this as the base for the string passed to `--eval`. Anything specified in `shell_options.eval` will be appended after these. Formats any objects so that they will evaluate properly as a string.
|
||||
Will use this as the base for the string passed to `--eval`. Anything specified in
|
||||
`shell_options.eval` will be appended after these. Formats any objects so that they will evaluate
|
||||
properly as a string.
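As a rough illustration of how `global_vars` and `eval` could combine into one `--eval` string, here is a minimal Python sketch. The `name = value; ` layout, the use of JSON for formatting objects, and the helper name `build_eval` are assumptions made for this example; the real formatting is done inside resmoke and may differ.

```python
import json


def build_eval(global_vars, eval_snippet=""):
    """Join global variable assignments, then append the user-supplied eval code."""
    assignments = "".join(
        f"{name} = {json.dumps(value)}; " for name, value in global_vars.items()
    )
    return assignments + eval_snippet


# Hypothetical values, for illustration only.
print(build_eval({"TestData": {"exampleFlag": True}}, "load('example.js');"))
```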
|
||||
|
||||
`global_vars` allows for setting global variables. A `TestData` object is a special global variable
|
||||
that is used to hold testing data. Parts of `TestData` can be updated via `resmoke` command-line
|
||||
@ -156,8 +168,8 @@ intelligently and made available to the `js_test` running. Behavior can vary on
|
||||
in general this is the order of precedence: (1) resmoke command-line (2) [suite].yml (3)
|
||||
runtime/default.
|
||||
|
||||
The mongo shell can also be invoked with flags &
|
||||
named arguments. Flags must have the `''` value, such as in the case for `nodb` above.
|
||||
The mongo shell can also be invoked with flags & named arguments. Flags must have the `''` value,
|
||||
such as in the case for `nodb` above.
|
||||
|
||||
`eval` can also be used to run generic javascript code in the shell. You can directly include
|
||||
javascript code, or you can put it in a separate script & `load` it.
|
||||
@ -166,11 +178,12 @@ javascript code, or you can put it in a separate script & `load` it.
|
||||
|
||||
Specify hooks to run before, after, and between individual tests to execute specified logic.
|
||||
|
||||
> Read more about hooks in [buildscripts/resmokelib/testing/hooks/README.md](../../../buildscripts/resmokelib/testing/hooks/README.md)
|
||||
> Read more about hooks in
|
||||
> [buildscripts/resmokelib/testing/hooks/README.md](../../../buildscripts/resmokelib/testing/hooks/README.md)
|
||||
|
||||
The hook name in the `.yml` must match its Python class name of the hook. Parameters can also be included in the `.yml`
|
||||
and will be passed to the hook's constructor (the `hook_logger` & `fixture` parameters are
|
||||
automatically included, so those should not be included in the `.yml`).
|
||||
The hook name in the `.yml` must match its Python class name of the hook. Parameters can also be
|
||||
included in the `.yml` and will be passed to the hook's constructor (the `hook_logger` & `fixture`
|
||||
parameters are automatically included, so those should not be included in the `.yml`).
|
||||
|
||||
Example:
|
||||
|
||||
@ -190,9 +203,11 @@ hooks:
|
||||
|
||||
Specify a test fixture to run around the tests.
|
||||
|
||||
> Read more about fixtures in [buildscripts/resmokelib/testing/fixtures/README.md](../../../buildscripts/resmokelib/testing/fixtures/README.md).
|
||||
> Read more about fixtures in
|
||||
> [buildscripts/resmokelib/testing/fixtures/README.md](../../../buildscripts/resmokelib/testing/fixtures/README.md).
|
||||
|
||||
The `class` sub-field corresponds to the Python class name of a fixture. All other sub-fields are passed into the constructor of the fixture. These sub-fields will vary based on the fixture used.
|
||||
The `class` sub-field corresponds to the Python class name of a fixture. All other sub-fields are
|
||||
passed into the constructor of the fixture. These sub-fields will vary based on the fixture used.
|
||||
|
||||
Example:
|
||||
|
||||
@ -238,4 +253,5 @@ Read more about [hooks](../../../buildscripts/resmokelib/testing/hooks/README.md
|
||||
|
||||
#### `executor.archive.tests`
|
||||
|
||||
Specify a list of test files to archive on failure. Wildcard selection is valid. Set to `true` to archive _all_ tests.
|
||||
Specify a list of test files to archive on failure. Wildcard selection is valid. Set to `true` to
|
||||
archive _all_ tests.
|
||||
|
||||
@ -2,11 +2,13 @@
|
||||
|
||||
Resmoke is MongoDB's integration test runner.
|
||||
|
||||
The JS Tests it can run live in the `jstests/` directory - reference its [README](../../jstests/README.md) to learn about their content.
|
||||
The JS Tests it can run live in the `jstests/` directory - reference its
|
||||
[README](../../jstests/README.md) to learn about their content.
|
||||
|
||||
## Build
|
||||
|
||||
Though the source is built with bazel, resmoke is not yet integrated. This means that the source has to be built prior to using resmoke, eg:
|
||||
Though the source is built with bazel, resmoke is not yet integrated. This means that the source has
|
||||
to be built prior to using resmoke, eg:
|
||||
|
||||
```
|
||||
bazel build install-dist-test
|
||||
@ -41,11 +43,13 @@ bazel build install-dist-test
|
||||
Generate a mongod.conf and mongos.conf using config fuzzer.
|
||||
```
|
||||
|
||||
Note: `bisect`, `setup-multiversion`, and `symbolize` commands have been moved to [`db-contrib-tool`](https://github.com/10gen/db-contrib-tool#readme).
|
||||
Note: `bisect`, `setup-multiversion`, and `symbolize` commands have been moved to
|
||||
[`db-contrib-tool`](https://github.com/10gen/db-contrib-tool#readme).
|
||||
|
||||
## Suites
|
||||
|
||||
Many of the above commands use the concept of a "suite". Loosely, suites group which tests run, and how.
|
||||
Many of the above commands use the concept of a "suite". Loosely, suites group which tests run, and
|
||||
how.
|
||||
|
||||
Read more about suites [here](../../buildscripts/resmokeconfig/suites/README.md).
|
||||
|
||||
@ -59,43 +63,47 @@ The most typical approach is to run a particular JS test file given a suite, eg:
|
||||
buildscripts/resmoke.py run --suites=no_passthrough jstests/noPassthrough/shell/js/string.js
|
||||
```
|
||||
|
||||
That executes the content of that file, using the suite configuration as a fixture setup. The suite "no_passthrough" is associated with the file [buildscripts/resmokeconfig/suites/no_passthrough.yml](../../buildscripts/resmokeconfig/suites/no_passthrough.yml).
|
||||
That executes the content of that file, using the suite configuration as a fixture setup. The suite
|
||||
"no_passthrough" is associated with the file
|
||||
[buildscripts/resmokeconfig/suites/no_passthrough.yml](../../buildscripts/resmokeconfig/suites/no_passthrough.yml).
|
||||
|
||||
Run has **100+ flags**! Use `resmoke run --help` to inspect them. To avoid risk of multiple sources of truth that can drift and become stale, **we do not attempt to document them all here** - they should each be self-descriptive and documented within the CLI help.
|
||||
Run has **100+ flags**! Use `resmoke run --help` to inspect them. To avoid risk of multiple sources
|
||||
of truth that can drift and become stale, **we do not attempt to document them all here** - they
|
||||
should each be self-descriptive and documented within the CLI help.
|
||||
|
||||
Below are very high-level descriptions for high-usage flags.
|
||||
|
||||
### Suites (`--suites`)
|
||||
|
||||
The run subcommand can run suites (list of tests and the MongoDB topology and
|
||||
configuration to run them against), and explicitly named test files.
|
||||
The run subcommand can run suites (list of tests and the MongoDB topology and configuration to run
|
||||
them against), and explicitly named test files.
|
||||
|
||||
A single suite can be specified using the `--suite` flag, and multiple suites
|
||||
can be specified by providing a comma separated list to the `--suites` flag.
|
||||
A single suite can be specified using the `--suite` flag, and multiple suites can be specified by
|
||||
providing a comma separated list to the `--suites` flag.
|
||||
|
||||
Additional documentation on our suite configuration can be found in
|
||||
[buildscripts/resmokeconfig/suites/README.md](../../buildscripts/resmokeconfig/suites/README.md).
|
||||
|
||||
### Testable Installations (`--installDir`)
|
||||
|
||||
resmoke can run tests against any testable installation of MongoDB (such
|
||||
as ASAN, Debug, Release). When possible, resmoke will automatically locate and
|
||||
run with a locally built copy of MongoDB Server, so long as that build was
|
||||
installed to a subdirectory of the root of the git repository, and there is
|
||||
exactly one build. In other situations, the `--installDir` flag, passed to run
|
||||
subcommand, can be used to indicate the location of the mongod/mongos binaries.
|
||||
resmoke can run tests against any testable installation of MongoDB (such as ASAN, Debug, Release).
|
||||
When possible, resmoke will automatically locate and run with a locally built copy of MongoDB
|
||||
Server, so long as that build was installed to a subdirectory of the root of the git repository, and
|
||||
there is exactly one build. In other situations, the `--installDir` flag, passed to run subcommand,
|
||||
can be used to indicate the location of the mongod/mongos binaries.
|
||||
|
||||
As an alternative, you may instead prefer to use the resmoke.py wrapper script
|
||||
located in the same directory as the mongod binary, which will automatically
|
||||
set `installDir` for you.
|
||||
As an alternative, you may instead prefer to use the resmoke.py wrapper script located in the same
|
||||
directory as the mongod binary, which will automatically set `installDir` for you.
|
||||
|
||||
Note that this wrapper is unavailable in packaged installations of MongoDB
|
||||
Server, such as those provided by Homebrew, and other package managers. If you
|
||||
would like to run tests against a packaged installation, you must explicitly
|
||||
pass `--installDir` to resmoke.py
|
||||
Note that this wrapper is unavailable in packaged installations of MongoDB Server, such as those
|
||||
provided by Homebrew, and other package managers. If you would like to run tests against a packaged
|
||||
installation, you must explicitly pass `--installDir` to resmoke.py
|
||||
|
||||
### Resmoke test telemetry
|
||||
|
||||
We capture telemetry from resmoke using open telemetry.
|
||||
|
||||
Using open telemetry (OTel) we capture more specific information about the internals of resmoke. This data is used for improvements specifically when running in evergreen. This data is captured on every resmoke invocation but only sent to honeycomb when running in evergreen. More info about how we use OTel in resmoke can be found [here](otel_resmoke.md).
|
||||
Using open telemetry (OTel) we capture more specific information about the internals of resmoke.
|
||||
This data is used for improvements specifically when running in evergreen. This data is captured on
|
||||
every resmoke invocation but only sent to honeycomb when running in evergreen. More info about how
|
||||
we use OTel in resmoke can be found [here](otel_resmoke.md).
|
||||
|
||||
@ -1,10 +1,12 @@
|
||||
# Extensions
|
||||
|
||||
This module provides utilities for setting up and configuring MongoDB extensions in resmoke test suites.
|
||||
This module provides utilities for setting up and configuring MongoDB extensions in resmoke test
|
||||
suites.
|
||||
|
||||
## Overview
|
||||
|
||||
Extensions are dynamically loaded shared objects (`.so` files) that provide additional functionality to MongoDB. The utilities in this folder can handle:
|
||||
Extensions are dynamically loaded shared objects (`.so` files) that provide additional functionality
|
||||
to MongoDB. The utilities in this folder can handle:
|
||||
|
||||
1. Discovering extension `.so` files in build directories
|
||||
2. Generating `.conf` configuration files for extensions
|
||||
@ -12,7 +14,8 @@ Extensions are dynamically loaded shared objects (`.so` files) that provide addi
|
||||
|
||||
## Configuration File Generation in Tests
|
||||
|
||||
Extension `.conf` files are YAML configuration files that tell the server how to load an extension. They contain:
|
||||
Extension `.conf` files are YAML configuration files that tell the server how to load an extension.
|
||||
They contain:
|
||||
|
||||
- `sharedLibraryPath`: Path to the `.so` file
|
||||
- `extensionOptions`: Optional configuration parameters for the extension
|
||||
@ -30,9 +33,11 @@ extensionOptions:
|
||||
|
||||
The `generate_extension_configs.py` module creates `.conf` files:
|
||||
|
||||
1. Receives a list of `.so` file paths (either from automatic discovery via `find_and_generate_extension_configs.py`, or manually via `--so-files` command-line argument)
|
||||
1. Receives a list of `.so` file paths (either from automatic discovery via
|
||||
`find_and_generate_extension_configs.py`, or manually via `--so-files` command-line argument)
|
||||
2. For each `.so`, creates a `.conf` file in the temp directory (`/tmp/mongo/extensions/`)
|
||||
3. Looks up corresponding extension options from `src/mongo/db/extension/test_examples/configurations.yml`, if any are specified
|
||||
3. Looks up corresponding extension options from
|
||||
`src/mongo/db/extension/test_examples/configurations.yml`, if any are specified
|
||||
4. Writes the config file with `sharedLibraryPath` and any `extensionOptions`
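The steps above can be sketched in Python. This is not the code in `generate_extension_configs.py`: the function name `write_extension_conf`, the rule that the `.conf` is named after the `.so` file, and the restriction to flat scalar options are assumptions for illustration, and a temporary directory stands in for the real output location.

```python
import tempfile
from pathlib import Path


def write_extension_conf(so_path, options, out_dir):
    """Write a minimal .conf for one .so file and return the path of the new file."""
    so_path = Path(so_path)
    lines = [f"sharedLibraryPath: {so_path}"]
    if options:
        # Only flat key/value options are handled in this sketch.
        lines.append("extensionOptions:")
        lines.extend(f"  {key}: {value}" for key, value in options.items())
    conf_path = Path(out_dir) / f"{so_path.stem}.conf"
    conf_path.write_text("\n".join(lines) + "\n")
    return conf_path


with tempfile.TemporaryDirectory() as out_dir:
    # Hypothetical extension path and option, for illustration only.
    conf = write_extension_conf(
        "/hypothetical/lib/libexample_extension.so", {"exampleOption": 3}, out_dir
    )
    print(conf.read_text())
```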
|
||||
|
||||
### Automatic Discovery and Generation
|
||||
|
||||
@ -2,7 +2,10 @@
|
||||
|
||||
This is a testing feature of the mongod and mongos, built into resmoke.py!
|
||||
|
||||
The config fuzzer is a resmoke feature that randomizes various server parameters of both mongod and mongos on startup. These fuzzed parameters should not affect the correctness of any tests. Therefore, the config fuzzer can be enabled for any test or suite run with resmoke to ensure the database is resilient to abnormal server configurations.
|
||||
The config fuzzer is a resmoke feature that randomizes various server parameters of both mongod and
|
||||
mongos on startup. These fuzzed parameters should not affect the correctness of any tests.
|
||||
Therefore, the config fuzzer can be enabled for any test or suite run with resmoke to ensure the
|
||||
database is resilient to abnormal server configurations.
|
||||
|
||||
More information can be displayed in the resmoke --help output:
|
||||
|
||||
@ -25,15 +28,22 @@ The bulk of the fuzzing logic is in [mongo_fuzzer_configs.py](./mongo_fuzzer_con
|
||||
|
||||
## How does it work?
|
||||
|
||||
The config fuzzer assigns random values to various tunable parameters. Server parameters and their ranges are specified manually by developers and are not discovered automatically in any way.
|
||||
The config fuzzer assigns random values to various tunable parameters. Server parameters and their
|
||||
ranges are specified manually by developers and are not discovered automatically in any way.
|
||||
|
||||
When the above resmoke flags are used, the [plugin](./plugin.py) implicitly enables the [FuzzRuntimeParameters](../../../buildscripts/resmokelib/testing/hooks/fuzz_runtime_parameters.py) hook for testing.
|
||||
When the above resmoke flags are used, the [plugin](./plugin.py) implicitly enables the
|
||||
[FuzzRuntimeParameters](../../../buildscripts/resmokelib/testing/hooks/fuzz_runtime_parameters.py)
|
||||
hook for testing.
|
||||
|
||||
## Where and When does it run on evergreen?
|
||||
|
||||
The config fuzzer is represented as a handful of evergreen tasks with "_config_fuzzer_" in the name. Search "config_fuzzer" in the [etc/](../../../etc) directory to find all the evergreen tasks.
|
||||
The config fuzzer is represented as a handful of evergreen tasks with "_config_fuzzer_" in the name.
|
||||
Search "config_fuzzer" in the [etc/](../../../etc) directory to find all the evergreen tasks.
|
||||
|
||||
Arguably the simplest evergreen task, `config_fuzzer_jsCore`, runs the "core" (i.e. `jstests/core`) resmoke suite with the config fuzzer parameters to resmoke set, and excludes some incompatible tests ([src link](https://github.com/mongodb/mongo/blob/a2e7e83a135c3096de7f360b88de1b3cdc1caaf2/etc/evergreen_yml_components/tasks/resmoke/server_divisions/durable_transactions_and_availability/tasks.yml#L1956-L1975)). Here is a sampling of some of the task names:
|
||||
Arguably the simplest evergreen task, `config_fuzzer_jsCore`, runs the "core" (i.e. `jstests/core`)
|
||||
resmoke suite with the config fuzzer parameters to resmoke set, and excludes some incompatible tests
|
||||
([src link](https://github.com/mongodb/mongo/blob/a2e7e83a135c3096de7f360b88de1b3cdc1caaf2/etc/evergreen_yml_components/tasks/resmoke/server_divisions/durable_transactions_and_availability/tasks.yml#L1956-L1975)).
|
||||
Here is a sampling of some of the task names:
|
||||
|
||||
- `config_fuzzer_concurrency_replication`
|
||||
- `config_fuzzer_concurrency_sharded_replication`
|
||||
@ -41,7 +51,10 @@ Arguably the simplest evergreen task, `config_fuzzer_jsCore`, runs the "core" (i
|
||||
|
||||
## Reproducing a config fuzzer failure
|
||||
|
||||
In the Evergreen task view, click on the Logs tab, then Task Logs, and open in Parsely. Search for "Fuzzed" ([source link](https://github.com/mongodb/mongo/blob/ca1c935aca43ca2e028507e2a878d4e12f50355b/buildscripts/resmokelib/run/__init__.py#L352-L366)). The output will look similar to this:
|
||||
In the Evergreen task view, click on the Logs tab, then Task Logs, and open in Parsely. Search for
|
||||
"Fuzzed"
|
||||
([source link](https://github.com/mongodb/mongo/blob/ca1c935aca43ca2e028507e2a878d4e12f50355b/buildscripts/resmokelib/run/__init__.py#L352-L366)).
|
||||
The output will look similar to this:
|
||||
|
||||
<details>
|
||||
<summary>Logs</summary>
|
||||
@ -112,13 +125,22 @@ In the Evergreen task view, click on the Logs tab, then Task Logs, and open in P
|
||||
|
||||
</details>
|
||||
|
||||
The log line starting with "resmoke.py invocation for local usage" and the one with "configFuzzSeed" provide an option `--configFuzzSeed=5583430894313922699` that can be used to generate the same fuzzed server parameters locally in resmoke.
|
||||
The log line starting with "resmoke.py invocation for local usage" and the one with "configFuzzSeed"
|
||||
provide an option `--configFuzzSeed=5583430894313922699` that can be used to generate the same
|
||||
fuzzed server parameters locally in resmoke.
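Why a seed is enough to reproduce a run can be shown with a small Python sketch. It assumes that fuzzed values are drawn from a single seeded random number generator; the parameter names and ranges are hypothetical and are not taken from `config_fuzzer_limits.py`.

```python
import random

# Hypothetical parameters and ranges, for illustration only.
HYPOTHETICAL_LIMITS = {
    "exampleParamA": {"min": 1, "max": 100},
    "exampleParamB": {"min": 10, "max": 5000},
}


def fuzz_parameters(seed):
    """Draw one value per parameter from a generator seeded with `seed`."""
    rng = random.Random(seed)
    return {
        name: rng.randint(limits["min"], limits["max"])
        for name, limits in HYPOTHETICAL_LIMITS.items()
    }


# Two runs with the same seed produce identical parameter values.
first = fuzz_parameters(5583430894313922699)
second = fuzz_parameters(5583430894313922699)
print(first == second)
```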
|
||||
|
||||
## Running the config fuzzer locally
|
||||
|
||||
Before running the Resmoke config fuzzer command, you need to obtain the necessary binaries. You can download them from the "Files" section of the `archive_dist_test` task in Evergreen (e.g., binaries from the `amazon2-arm64-compile` variant). Alternatively, if you don't require those specific binaries, you can use `db-contrib-tool` to download the binaries (e.g., by running `bazel run db-contrib-tool -- setup-repro-env master`).
|
||||
Before running the Resmoke config fuzzer command, you need to obtain the necessary binaries. You can
|
||||
download them from the "Files" section of the `archive_dist_test` task in Evergreen (e.g., binaries
|
||||
from the `amazon2-arm64-compile` variant). Alternatively, if you don't require those specific
|
||||
binaries, you can use `db-contrib-tool` to download the binaries (e.g., by running
|
||||
`bazel run db-contrib-tool -- setup-repro-env master`).
|
||||
|
||||
To re-run a command locally that failed through the config fuzzer, you can navigate to the specific test that failed, and under files you can find a name titled "Resmoke.py Invocation for Local Usage". If you are replicating an older config fuzzer invocation, remove the command line argument "`--installDir=dist-test/bin`". A simple example command is shown below:
|
||||
To re-run a command locally that failed through the config fuzzer, you can navigate to the specific
|
||||
test that failed, and under files you can find a name titled "Resmoke.py Invocation for Local
|
||||
Usage". If you are replicating an older config fuzzer invocation, remove the command line argument
|
||||
"`--installDir=dist-test/bin`". A simple example command is shown below:
|
||||
|
||||
```
|
||||
buildscripts/resmoke.py run jstests/noPassthrough/bulk_write_w0.js \
|
||||
@ -127,7 +149,12 @@ buildscripts/resmoke.py run jstests/noPassthrough/bulk_write_w0.js \
|
||||
--configFuzzSeed=7956511060361033919
|
||||
```
|
||||
|
||||
It is easiest to pipe the output to another text file and then to analyze the output through there. The format of the file is slightly different, as you will not be able to explicitly look up Fuzzed, but you can look up one of the fuzzed config parameters to find the list of fuzzed config parameter settings. A subset of a log from running the above command on [this version](https://github.com/mongodb/mongo/commit/856e4ecd8612b19c8ba281cf23450d74b5838650) of master is the following:
|
||||
It is easiest to pipe the output to another text file and then to analyze the output through there.
|
||||
The format of the file is slightly different, as you will not be able to explicitly look up Fuzzed,
|
||||
but you can look up one of the fuzzed config parameters to find the list of fuzzed config parameter
|
||||
settings. A subset of a log from running the above command on
|
||||
[this version](https://github.com/mongodb/mongo/commit/856e4ecd8612b19c8ba281cf23450d74b5838650) of
|
||||
master is the following:
|
||||
|
||||
```
|
||||
js_test:bulk_write_w0] Skip waiting to connect to node with pid=2522712, port=20040
|
||||
@ -140,7 +167,8 @@ js_test:bulk_write_w0] Skip waiting to connect to node with pid=2522712, port=20
|
||||
|
||||
## Adding a new parameter to be fuzzed to the config fuzzer
|
||||
|
||||
There are two broad categories of parameters in the config fuzzer, that each have two sub-categories of parameters:
|
||||
There are two broad categories of parameters in the config fuzzer, that each have two sub-categories
|
||||
of parameters:
|
||||
|
||||
1. mongo parameters
|
||||
- mongod parameters
|
||||
@ -151,25 +179,43 @@ There are two broad categories of parameters in the config fuzzer, that each hav
|
||||
|
||||
### Adding new mongo parameters
|
||||
|
||||
Mongo parameters and their properties (e.g. min, max, default) are stored in [config_fuzzer_limits.py](./config_fuzzer_limits.py).
|
||||
Mongo parameters and their properties (e.g. min, max, default) are stored in
|
||||
[config_fuzzer_limits.py](./config_fuzzer_limits.py).
|
||||
|
||||
Below is a list of ways to fuzz configs which are supported without having to also change [mongo_fuzzer_configs.py](./mongo_fuzzer_configs.py).
|
||||
Please ensure that you add it correctly to the `mongod` or `mongos` subdictionary.
|
||||
Below is a list of ways to fuzz configs which are supported without having to also change
|
||||
[mongo_fuzzer_configs.py](./mongo_fuzzer_configs.py). Please ensure that you add it correctly to the
|
||||
`mongod` or `mongos` subdictionary.
|
||||
|
||||
You need to specify if your parameter should be fuzzed at runtime, startup, or both by declaring the `fuzz_at` key for the parameter. The `fuzz_at` key should be a list that can contain the values `startup`, `runtime`, or both. The eligible values are specified in the `set_at` keys of the corresponding `.idl` files.
|
||||
You need to specify if your parameter should be fuzzed at runtime, startup, or both by declaring the
|
||||
`fuzz_at` key for the parameter. The `fuzz_at` key should be a list that can contain the values
|
||||
`startup`, `runtime`, or both. The eligible values are specified in the `set_at` keys of the
|
||||
corresponding `.idl` files.
|
||||
|
||||
For a parameter that is only fuzzed at startup, the fuzzer will generate a fuzzed value for the parameter and set it when starting up the server.
|
||||
For a parameter that is only fuzzed at startup, the fuzzer will generate a fuzzed value for the
|
||||
parameter and set it when starting up the server.
|
||||
|
||||
For a parameter fuzzed at runtime, the fuzzer will generate a fuzzed value for the parameter while running the server based on a `period` key that is required for fuzzed runtime parameters.
|
||||
The `period` key describes how often the parameter should be changed, in seconds. Every `period` seconds, the fuzzer will select a new random value for the parameter and use the setParameter command to update the value of the
|
||||
parameter on every node in the cluster while the suite is running. This is performed by the [FuzzRuntimeParameters](../../../buildscripts/resmokelib/testing/hooks/fuzz_runtime_parameters.py) hook.
|
||||
For a parameter fuzzed at runtime, the fuzzer will generate a fuzzed value for the parameter while
|
||||
running the server based on a `period` key that is required for fuzzed runtime parameters. The
|
||||
`period` key describes how often the parameter should be changed, in seconds. Every `period`
|
||||
seconds, the fuzzer will select a new random value for the parameter and use the setParameter
|
||||
command to update the value of the parameter on every node in the cluster while the suite is
|
||||
running. This is performed by the
|
||||
[FuzzRuntimeParameters](../../../buildscripts/resmokelib/testing/hooks/fuzz_runtime_parameters.py)
|
||||
hook.
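The role of the `period` key can be sketched in Python. This is a simplified model rather than the `FuzzRuntimeParameters` hook: the parameter names, the `min`/`max` values, and the helper `due_updates` are hypothetical, and the sketch only computes new values instead of sending them to a server.

```python
import random

# Hypothetical entries in the style described above; names and numbers are made up.
HYPOTHETICAL_PARAMS = {
    "exampleRuntimeParam": {"min": 1, "max": 64, "period": 5, "fuzz_at": ["runtime"]},
    "exampleStartupParam": {"min": 1, "max": 64, "fuzz_at": ["startup"]},
}


def due_updates(params, last_set, now, rng):
    """Return new values for runtime parameters whose period has elapsed."""
    updates = {}
    for name, spec in params.items():
        if "runtime" not in spec["fuzz_at"]:
            continue  # startup-only parameters are never changed while running
        if now - last_set.get(name, float("-inf")) >= spec["period"]:
            updates[name] = rng.randint(spec["min"], spec["max"])
            last_set[name] = now
    return updates


rng = random.Random(0)
last_set = {}
for now in (0, 3, 5):
    # A real hook would apply each update with the setParameter command.
    print(now, due_updates(HYPOTHETICAL_PARAMS, last_set, now, rng))
```

With a `period` of 5, the runtime parameter is updated at times 0 and 5 but not at time 3, and the startup-only parameter is never touched.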
|
||||
|
||||
For parameters with complex fuzzing logic or interdependencies with other parameters, you can set `"custom_fuzz_value_assignment": True` to bypass the standard fuzzing logic. Parameters with this flag must be handled explicitly in the special handling functions (`generate_special_mongod_startup_parameters()` for startup parameters or `generate_special_runtime_parameters()` for runtime parameters). Note that parameter dependency logic is currently only supported for startup fuzzing - runtime fuzzing operates on individual parameters. See the section below on parameters requiring special handling for more details.
|
||||
For parameters with complex fuzzing logic or interdependencies with other parameters, you can set
|
||||
`"custom_fuzz_value_assignment": True` to bypass the standard fuzzing logic. Parameters with this
|
||||
flag must be handled explicitly in the special handling functions
|
||||
(`generate_special_mongod_startup_parameters()` for startup parameters or
|
||||
`generate_special_runtime_parameters()` for runtime parameters). Note that parameter dependency
|
||||
logic is currently only supported for startup fuzzing - runtime fuzzing operates on individual
|
||||
parameters. See the section below on parameters requiring special handling for more details.
|
||||
|
||||
Let `choices = [choice1, choice2, ..., choiceN]` be an array of choices that the parameter can have as a value.
|
||||
The parameters are added in order of priority chosen in the if-elif-else statement in `generate_normal_mongo_parameters()`
|
||||
in [mongo_fuzzer_configs.py](./mongo_fuzzer_configs.py).
|
||||
So, if you added the fields `default`, `min`, and `max` for a `param`, case 4 would get evaluated over case 5.
|
||||
Let `choices = [choice1, choice2, ..., choiceN]` be an array of choices that the parameter can have
|
||||
as a value. The parameters are added in order of priority chosen in the if-elif-else statement in
|
||||
`generate_normal_mongo_parameters()` in [mongo_fuzzer_configs.py](./mongo_fuzzer_configs.py). So, if
|
||||
you added the fields `default`, `min`, and `max` for a `param`, case 4 would get evaluated over
|
||||
case 5.
|
||||
|
||||
1. `param = rng.uniform(min, max)`
|
||||
|
||||
@ -218,41 +264,59 @@ So, if you added the fields `default`, `min`, and `max` for a `param`, case 4 wo
|
||||
"param": {"default": default}
|
||||
```
|
||||
|
||||
> Note: For the default case, please add the value `"fuzz_at": ["startup"]` (the default value gets set at "startup").
|
||||
> Note: For the default case, please add the value `"fuzz_at": ["startup"]` (the default value
|
||||
> gets set at "startup").
|
||||
|
||||
If you have a parameter that depends on another parameter being generated (see `throughputProbingInitialConcurrency` needing to be initialized before
|
||||
`throughputProbingMinConcurrency` and `throughputProbingMaxConcurrency` as an example in [mongo_fuzzer_configs.py](./mongo_fuzzer_configs.py)) or behavior that
|
||||
differs from the above cases, please do the following steps:
|
||||
If you have a parameter that depends on another parameter being generated (see
|
||||
`throughputProbingInitialConcurrency` needing to be initialized before
|
||||
`throughputProbingMinConcurrency` and `throughputProbingMaxConcurrency` as an example in
|
||||
[mongo_fuzzer_configs.py](./mongo_fuzzer_configs.py)) or behavior that differs from the above cases,
|
||||
please do the following steps:
|
||||
|
||||
1. Add the parameter and the needed information to [config_fuzzer_limits.py](./config_fuzzer_limits.py) (ensure to correctly add to the `mongod` or `mongos` sub-dictionary), including `"custom_fuzz_value_assignment": True` to indicate it requires special handling
|
||||
1. Add the parameter and the needed information to
|
||||
[config_fuzzer_limits.py](./config_fuzzer_limits.py) (ensure to correctly add to the `mongod` or
|
||||
`mongos` sub-dictionary), including `"custom_fuzz_value_assignment": True` to indicate it
|
||||
requires special handling
|
||||
|
||||
In [mongo_fuzzer_configs.py](./mongo_fuzzer_configs.py):
|
||||
|
||||
2. Add the parameter's special handling in `generate_special_mongod_startup_parameters()` or `generate_special_mongos_startup_parameters()` for startup parameters, or `generate_special_runtime_parameters()` for runtime parameters
|
||||
2. Add the parameter's special handling in `generate_special_mongod_startup_parameters()` or
|
||||
`generate_special_mongos_startup_parameters()` for startup parameters, or
|
||||
`generate_special_runtime_parameters()` for runtime parameters
|
||||
|
||||
> Note: Parameter dependencies (where one parameter's value constrains another) are currently only supported for startup fuzzing. Runtime fuzzing handles parameters individually.
|
||||
> Note: Parameter dependencies (where one parameter's value constrains another) are currently only
|
||||
> supported for startup fuzzing. Runtime fuzzing handles parameters individually.
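
The dependency case described above can be sketched roughly as follows. This is only an illustration: the function name, the bounds, and the assumed relationship (min <= initial <= max) are hypothetical and are not taken from `mongo_fuzzer_configs.py`.

```python
import random


def generate_throughput_probing_params(rng, lower=4, upper=128):
    """Draw three dependent values so that min <= initial <= max always holds.

    The bounds and the ordering constraint are assumptions made for this sketch.
    """
    # The initial value is drawn first because the other two depend on it.
    initial = rng.randint(lower, upper)
    # The minimum may not exceed the initial value.
    minimum = rng.randint(lower, initial)
    # The maximum may not fall below the initial value.
    maximum = rng.randint(initial, upper)
    return {
        "throughputProbingInitialConcurrency": initial,
        "throughputProbingMinConcurrency": minimum,
        "throughputProbingMaxConcurrency": maximum,
    }


params = generate_throughput_probing_params(random.Random(0))
```

Drawing the constraining value first is what makes "special handling" necessary: the generic per-parameter cases have no way to see a sibling parameter's value.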
|
||||
|
||||
If you add a flow control parameter, please add the parameter's name to `flow_control_params` in `generate_mongod_parameters`.
|
||||
If you add a flow control parameter, please add the parameter's name to `flow_control_params` in
|
||||
`generate_mongod_parameters`.
|
||||
|
||||
> Note: The main distinction between min/max vs. lower-bound/upper_bound is there is some transformation involving the lower and upper bounds,
|
||||
> while the min/max should be the true min/max of the parameters. You should also include the true min/max of the parameter so this can be logged.
|
||||
> If the min/max is not inclusive, this is added as a note above the parameter.
|
||||
> Note: The main distinction between min/max vs. lower-bound/upper_bound is there is some
|
||||
> transformation involving the lower and upper bounds, while the min/max should be the true min/max
|
||||
> of the parameters. You should also include the true min/max of the parameter so this can be
|
||||
> logged. If the min/max is not inclusive, this is added as a note above the parameter.
|
||||
|
||||
### Adding new WiredTiger parameters
|
||||
|
||||
WiredTiger parameters and their properties (e.g. min, max, default) are stored in [config_fuzzer_wt_limits.py](./config_fuzzer_wt_limits.py).
|
||||
WiredTiger parameters and their properties (e.g. min, max, default) are stored in
|
||||
[config_fuzzer_wt_limits.py](./config_fuzzer_wt_limits.py).
|
||||
|
||||
> These _cannot_ be fuzzed with the [FuzzRuntimeParameters](../../../buildscripts/resmokelib/testing/hooks/fuzz_runtime_parameters.py) hook because they are only set on startup (these parameters are used in the wt configuration string).
|
||||
> These _cannot_ be fuzzed with the
|
||||
> [FuzzRuntimeParameters](../../../buildscripts/resmokelib/testing/hooks/fuzz_runtime_parameters.py)
|
||||
> hook because they are only set on startup (these parameters are used in the wt configuration
|
||||
> string).
|
||||
|
||||
Below is a list of ways to fuzz configs which are supported without having to also change [mongo_fuzzer_configs.py](./mongo_fuzzer_configs.py).
|
||||
|
||||
Please ensure that you add it correctly to the `wt` (eviction parameters) or `wt_table` subdictionary.
|
||||
|
||||
Let `choices = [choice1, choice2, ..., choiceN]` be an array of choices that the parameter can have as a value.
|
||||
|
||||
The parameters are added in order of priority chosen in the if-elif-else statement in `generate_normal_wt_parameters()` in
|
||||
Below is a list of ways to fuzz configs which are supported without having to also change
|
||||
[mongo_fuzzer_configs.py](./mongo_fuzzer_configs.py).
|
||||
|
||||
Please ensure that you add it correctly to the `wt` (eviction parameters) or `wt_table`
|
||||
subdictionary.
|
||||
|
||||
Let `choices = [choice1, choice2, ..., choiceN]` be an array of choices that the parameter can have
|
||||
as a value.
|
||||
|
||||
The parameters are added in order of priority chosen in the if-elif-else statement in
|
||||
`generate_normal_wt_parameters()` in [mongo_fuzzer_configs.py](./mongo_fuzzer_configs.py).
|
||||
|
||||
1. `param = rng.choices(choices)`, where choices is an array
|
||||
|
||||
Add:
|
||||
@ -281,25 +345,32 @@ The parameters are added in order of priority chosen in the if-elif-else stateme
|
||||
"param": {"min": min, "max": max}
|
||||
```
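
The priority ordering of the if-elif-else statement can be pictured with a minimal sketch like the one below. It is an assumption-laden illustration, not the body of `generate_normal_wt_parameters()`: the helper name `pick_wt_value` and the exact set of branches are hypothetical, and it uses `rng.choice` so that a single element is returned.

```python
import random


def pick_wt_value(rng, spec):
    """Return a fuzzed value for one parameter spec, checking keys in priority order."""
    if "choices" in spec:
        # Highest priority in this sketch: pick one element from an explicit list.
        return rng.choice(spec["choices"])
    elif "min" in spec and "max" in spec:
        # Next: draw an integer from the inclusive [min, max] range.
        return rng.randint(spec["min"], spec["max"])
    else:
        # Fallback: use the default value unchanged.
        return spec["default"]


rng = random.Random(1)
# Both "choices" and "min"/"max" are present, so the "choices" branch wins.
value = pick_wt_value(rng, {"choices": ["a", "b"], "min": 0, "max": 10})
```

Because the branches are checked in order, a spec that carries fields for several cases is always handled by the earliest matching case.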
|
||||
|
||||
If you have a parameter that depends on another parameter being generated (see `eviction_target` needing to be initialized before
|
||||
`eviction_trigger` as an example in [mongo_fuzzer_configs.py](./mongo_fuzzer_configs.py)) or behavior that differs from the above cases,
|
||||
If you have a parameter that depends on another parameter being generated (see `eviction_target`
|
||||
needing to be initialized before `eviction_trigger` as an example in
|
||||
[mongo_fuzzer_configs.py](./mongo_fuzzer_configs.py)) or behavior that differs from the above cases,
|
||||
please do the following steps:
|
||||
|
||||
1. Add the parameter and the needed information to [config_fuzzer_wt_limits.py](./config_fuzzer_wt_limits.py) (ensure to correctly add to the `wt` or `wt_table` sub-dictionary)
|
||||
1. Add the parameter and the needed information to
|
||||
[config_fuzzer_wt_limits.py](./config_fuzzer_wt_limits.py) (ensure to correctly add to the `wt`
|
||||
or `wt_table` sub-dictionary)
|
||||
|
||||
In [mongo_fuzzer_configs.py](./mongo_fuzzer_configs.py):
|
||||
|
||||
2. Add the parameter to `excluded_normal_params` in `generate_eviction_configs()` or `generate_table_configs()`
|
||||
3. Add the parameter's special handling in `generate_special_eviction_configs()` or `generate_special_table_configs()`
|
||||
2. Add the parameter to `excluded_normal_params` in `generate_eviction_configs()` or
|
||||
`generate_table_configs()`
|
||||
3. Add the parameter's special handling in `generate_special_eviction_configs()` or
|
||||
`generate_special_table_configs()`
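
As a rough sketch of how steps 2 and 3 fit together, the example below excludes dependent parameters from the generic path and assigns them in a special handler. The function names mirror those mentioned above, but the signatures, the limits table, and the assumed constraint (`eviction_target` below `eviction_trigger`) are hypothetical.

```python
import random

# Hypothetical limits table; the real entries live in config_fuzzer_wt_limits.py.
WT_LIMITS = {
    "eviction_target": {"min": 50, "max": 95},
    "eviction_trigger": {"min": 50, "max": 99},
    "eviction_checkpoint_target": {"min": 1, "max": 99},
}


def generate_special_eviction_configs(rng, limits):
    """Assign the dependent pair, drawing the constraining value first."""
    target = rng.randint(limits["eviction_target"]["min"], limits["eviction_target"]["max"])
    # Assumed constraint for this sketch: the trigger must exceed the target.
    trigger = rng.randint(target + 1, limits["eviction_trigger"]["max"])
    return {"eviction_target": target, "eviction_trigger": trigger}


def generate_eviction_configs(rng, limits):
    """Fuzz normal parameters generically, then add the specially handled ones."""
    excluded_normal_params = {"eviction_target", "eviction_trigger"}
    configs = {
        name: rng.randint(spec["min"], spec["max"])
        for name, spec in limits.items()
        if name not in excluded_normal_params
    }
    configs.update(generate_special_eviction_configs(rng, limits))
    return configs
```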
|
||||
|
||||
> The main distinction between min/max vs. lower-bound/upper_bound is there is some transformation involving the lower and upper bounds,
|
||||
> while the min/max should be the true min/max of the parameters. You should also include the true min/max of the parameter so this can be logged.
|
||||
> If the min/max is not inclusive, this is added as a note above the parameter.
|
||||
> The main distinction between min/max vs. lower-bound/upper_bound is there is some transformation
|
||||
> involving the lower and upper bounds, while the min/max should be the true min/max of the
|
||||
> parameters. You should also include the true min/max of the parameter so this can be logged. If
|
||||
> the min/max is not inclusive, this is added as a note above the parameter.
|
||||
|
||||
## Exclusions
|
||||
|
||||
- `jstests/libs/override_methods/config_fuzzer_incompatible_commands.js`
|
||||
- These commands are too impactful to run with the config fuzzer
|
||||
- The `does_not_support_config_fuzzer` jstest tag
|
||||
- Tests with this tag may manually specify server parameters modified by the fuzzer or read global state that is modified in some way by the fuzzer.
|
||||
- Tests with this tag may manually specify server parameters modified by the fuzzer or read global
|
||||
state that is modified in some way by the fuzzer.
|
||||
- Just because a test is failing does not mean it is incompatible with the config fuzzer.
|
||||
|
||||
@ -3,7 +3,9 @@
|
||||
There are two main ways of running the core analyzer.
|
||||
|
||||
1. Running the core analyzer with local core dumps and binaries.
|
||||
2. Running the core analyzer with core dumps and binaries from an evergreen task. Note that some analysis might fail if you are not on the same AMI (Amazon Machine Image) that the task was run on.
|
||||
2. Running the core analyzer with core dumps and binaries from an evergreen task. Note that some
|
||||
analysis might fail if you are not on the same AMI (Amazon Machine Image) that the task was run
|
||||
on.
|
||||
|
||||
To run the core analyzer with local core dumps and binaries:
|
||||
|
||||
@ -11,7 +13,9 @@ To run the core analyzer with local core dumps and binaries:
|
||||
python3 buildscripts/resmoke.py core-analyzer
|
||||
```
|
||||
|
||||
This will look for binaries in the build/install directory, and it will look for core dumps in the current directory. If your local environment is different you can include `--install-dir` and `--core-dir` in your invocation to specify other locations.
|
||||
This will look for binaries in the build/install directory, and it will look for core dumps in the
|
||||
current directory. If your local environment is different you can include `--install-dir` and
|
||||
`--core-dir` in your invocation to specify other locations.
|
||||
|
||||
To run the core analyzer with core dumps and binaries from an evergreen task:
|
||||
|
||||
@ -19,11 +23,15 @@ To run the core analyzer with core dumps and binaries from an evergreen task:
|
||||
python3 buildscripts/resmoke.py core-analyzer --task-id={task_id}
|
||||
```
|
||||
|
||||
This will download all of the core dumps and binaries from the task and put them into the configured `--working-dir`, which defaults to the `core-analyzer` directory.
|
||||
This will download all of the core dumps and binaries from the task and put them into the configured
|
||||
`--working-dir`, which defaults to the `core-analyzer` directory.
|
||||
|
||||
All of the task analysis will be added to the `analysis` directory inside the configured `--working-dir`.
|
||||
All of the task analysis will be added to the `analysis` directory inside the configured
|
||||
`--working-dir`.
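
For scripting the two invocations, a small helper along these lines could assemble the argument list. The helper name is hypothetical, and passing every option in `--flag=value` form is an assumption based on the `--task-id={task_id}` example above.

```python
def build_core_analyzer_cmd(task_id=None, install_dir=None, core_dir=None, working_dir=None):
    """Build an argv list for the core analyzer; options left as None are omitted."""
    cmd = ["python3", "buildscripts/resmoke.py", "core-analyzer"]
    if task_id is not None:
        # Analyze core dumps and binaries downloaded from an evergreen task.
        cmd.append(f"--task-id={task_id}")
    if install_dir is not None:
        # Override the default location of the binaries.
        cmd.append(f"--install-dir={install_dir}")
    if core_dir is not None:
        # Override the default location of the core dumps.
        cmd.append(f"--core-dir={core_dir}")
    if working_dir is not None:
        # Override where downloads and the analysis directory are placed.
        cmd.append(f"--working-dir={working_dir}")
    return cmd


local_cmd = build_core_analyzer_cmd(core_dir="/tmp/cores")
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` from the root of the repository.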
|
||||
|
||||
Note: Currently the core analyzer only runs on linux. Windows uses the legacy hang analyzer but will be switched over when we run into issues or have time to do the transition. We have not tackled the problem of getting core dumps on macOS so we have no core dump analysis on that operating system.
|
||||
Note: Currently the core analyzer only runs on linux. Windows uses the legacy hang analyzer but will
|
||||
be switched over when we run into issues or have time to do the transition. We have not tackled the
|
||||
problem of getting core dumps on macOS so we have no core dump analysis on that operating system.
|
||||
|
||||
### Getting core dumps
|
||||
|
||||
@ -37,28 +45,33 @@ sequenceDiagram
|
||||
Hang Analyzer ->> Core Dumps: Attach to pid and generate core dumps
|
||||
```
|
||||
|
||||
When a task times out, it hits the [timeout](https://github.com/mongodb/mongo/blob/a6e56a8e136fe554dc90565bf6acf5bf86f7a46e/etc/evergreen_yml_components/definitions.yml#L2694) section in the defined evergreen config.
|
||||
In this timeout section, we run [this](https://github.com/mongodb/mongo/blob/a6e56a8e136fe554dc90565bf6acf5bf86f7a46e/etc/evergreen_yml_components/definitions.yml#L2302) task which runs the hang-analyzer with the following invocation:
|
||||
When a task times out, it hits the
|
||||
[timeout](https://github.com/mongodb/mongo/blob/a6e56a8e136fe554dc90565bf6acf5bf86f7a46e/etc/evergreen_yml_components/definitions.yml#L2694)
|
||||
section in the defined evergreen config. In this timeout section, we run
|
||||
[this](https://github.com/mongodb/mongo/blob/a6e56a8e136fe554dc90565bf6acf5bf86f7a46e/etc/evergreen_yml_components/definitions.yml#L2302)
|
||||
task which runs the hang-analyzer with the following invocation:
|
||||
|
||||
```
|
||||
python3 buildscripts/resmoke.py hang-analyzer -o file -o stdout -m exact -p python
|
||||
```
|
||||
|
||||
This tells the hang-analyzer to look for all of the python processes (we are specifically looking for resmoke) on the machine and to signal them.
|
||||
When resmoke is [signaled](https://github.com/mongodb/mongo/blob/08a99b15eea7ae0952b2098710d565dd7f709ff6/buildscripts/resmokelib/sighandler.py#L25), it again invokes the hang analyzer with the specific pids of its child processes.
|
||||
It will look similar to this most of the time:
|
||||
This tells the hang-analyzer to look for all of the python processes (we are specifically looking
|
||||
for resmoke) on the machine and to signal them. When resmoke is
|
||||
[signaled](https://github.com/mongodb/mongo/blob/08a99b15eea7ae0952b2098710d565dd7f709ff6/buildscripts/resmokelib/sighandler.py#L25),
|
||||
it again invokes the hang analyzer with the specific pids of its child processes. It will look
|
||||
similar to this most of the time:
|
||||
|
||||
```
|
||||
python3 buildscripts/resmoke.py hang-analyzer -o file -o stdout -k -c -d pid1,pid2,pid3
|
||||
```
|
||||
|
||||
The things to note here are the `-k` which kills the process and `-c` which takes core dumps.
|
||||
The resulting core dumps are put into the current running directory.
|
||||
The things to note here are the `-k` which kills the process and `-c` which takes core dumps. The
|
||||
resulting core dumps are put into the current running directory.
|
||||
|
||||
#### When a test times out
|
||||
|
||||
An optional test timeout (`--testTimeout=N` seconds) can be used when running resmoke that will run the hang-analyzer on all processes related to that test.
|
||||
When a test times out, it will analyze:
|
||||
An optional test timeout (`--testTimeout=N` seconds) can be used when running resmoke that will run
|
||||
the hang-analyzer on all processes related to that test. When a test times out, it will analyze:
|
||||
|
||||
- The process the testcase created.
|
||||
- Any child of the testcase process.
|
||||
@ -75,23 +88,31 @@ When a test times out, it will analyze:
|
||||
| |-mongo (ENV_MARKER=2, pgid 9)
|
||||
```
|
||||
|
||||
Caution: Should a process be created in a new process group as `bar` is in the above example, it may be missed on MacOS. If `foo` crashes/exits, `bar` is orphaned and reparented to the `init` process. It is no longer a "child" and it is not generally possible to read environment variables of arbitrary processes on MacOS with System Integrity Protection (SIP) enabled.
|
||||
Caution: Should a process be created in a new process group as `bar` is in the above example, it may
|
||||
be missed on MacOS. If `foo` crashes/exits, `bar` is orphaned and reparented to the `init` process.
|
||||
It is no longer a "child" and it is not generally possible to read environment variables of
|
||||
arbitrary processes on MacOS with System Integrity Protection (SIP) enabled.
|
||||
|
||||
#### When a task fails normally
|
||||
|
||||
When a task fails normally, core dumps may also be generated by the linux kernel and put into the working directory.
|
||||
When a task fails normally, core dumps may also be generated by the linux kernel and put into the
|
||||
working directory.
|
||||
|
||||
#### Note on archival/upload in Evergreen
|
||||
|
||||
We use a non-standard way of uploading core dumps to evergreen due to [timeout issues](https://jira.mongodb.org/browse/SERVER-73171) we were facing when archiving and uploading them normally through evergreen commands.
|
||||
After investigation of the above issue, we found that compressing and uploading core dumps was slow for a couple reasons:
|
||||
We use a non-standard way of uploading core dumps to evergreen due to
|
||||
[timeout issues](https://jira.mongodb.org/browse/SERVER-73171) we were facing when archiving and
|
||||
uploading them normally through evergreen commands. After investigation of the above issue, we found
|
||||
that compressing and uploading core dumps was slow for a couple reasons:
|
||||
|
||||
1. Tarring all of the core dumps into one file takes up a lot of disk IO and disk IO was the bottleneck.
|
||||
1. Tarring all of the core dumps into one file takes up a lot of disk IO and disk IO was the
|
||||
bottleneck.
|
||||
2. Gzip is single threaded.
|
||||
3. Uploading a big file synchronously is not fast.
|
||||
|
||||
We made a [script](https://github.com/mongodb/mongo/blob/master/buildscripts/fast_archive.py) that gzips all of the core dumps in parallel and uploads them to S3 individually asynchronously.
|
||||
This solved all of the problems listed above.
|
||||
We made a [script](https://github.com/mongodb/mongo/blob/master/buildscripts/fast_archive.py) that
|
||||
gzips all of the core dumps in parallel and uploads them to S3 individually asynchronously. This
|
||||
solved all of the problems listed above.
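
The idea of compressing each core dump separately and uploading it on its own can be sketched as below. This is not the linked `fast_archive.py`: the function names are hypothetical, a thread pool stands in for whatever parallelism the real script uses, and `upload` is a caller-supplied placeholder for an S3 upload call.

```python
import gzip
import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def gzip_file(path):
    """Compress one file to <name>.gz next to it and return the new path."""
    out_path = Path(str(path) + ".gz")
    with open(path, "rb") as src, gzip.open(out_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return out_path


def archive_core_dumps(paths, upload, max_workers=4):
    """Compress and upload every file as its own independent job."""

    def compress_then_upload(path):
        archive = gzip_file(path)
        # `upload` is a stand-in for the real upload step (an assumption of this sketch).
        upload(archive)
        return archive

    # Each file is handled by a separate worker, so no single tarball is ever built.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(compress_then_upload, paths))
```

Working per file avoids the single large tar step and lets an upload start as soon as its own archive is ready.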
|
||||
|
||||
### Generating the core analyzer task
|
||||
|
||||
@ -104,18 +125,26 @@ sequenceDiagram
|
||||
Generated Task ->> Core Analyzer Output: Overwrite output with<br/> core dump analysis
|
||||
```
|
||||
|
||||
In the [post task](https://github.com/mongodb/mongo/blob/709e3f4efc04b42e5d29a8ad2417a01d3610fc3f/etc/evergreen_yml_components/definitions.yml#L2665) section, we [define](https://github.com/mongodb/mongo/blob/709e3f4efc04b42e5d29a8ad2417a01d3610fc3f/etc/evergreen_yml_components/definitions.yml#L2184) the evergreen function used to generate the core analyzer task.
|
||||
This [script](https://github.com/mongodb/mongo/blob/709e3f4efc04b42e5d29a8ad2417a01d3610fc3f/buildscripts/resmokelib/hang_analyzer/gen_hang_analyzer_tasks.py) runs on every task (passing or failing) and is independent of anything else that happened prior in the task and does all of the checks to ensure it should run.
|
||||
These checks include:
|
||||
In the
|
||||
[post task](https://github.com/mongodb/mongo/blob/709e3f4efc04b42e5d29a8ad2417a01d3610fc3f/etc/evergreen_yml_components/definitions.yml#L2665)
|
||||
section, we
|
||||
[define](https://github.com/mongodb/mongo/blob/709e3f4efc04b42e5d29a8ad2417a01d3610fc3f/etc/evergreen_yml_components/definitions.yml#L2184)
|
||||
the evergreen function used to generate the core analyzer task. This
|
||||
[script](https://github.com/mongodb/mongo/blob/709e3f4efc04b42e5d29a8ad2417a01d3610fc3f/buildscripts/resmokelib/hang_analyzer/gen_hang_analyzer_tasks.py)
|
||||
runs on every task (passing or failing) and is independent of anything else that happened prior in
|
||||
the task and does all of the checks to ensure it should run. These checks include:
|
||||
|
||||
1. The task is being run on an operating system supported by the core analyzer.
|
||||
2. The task has any core dumps uploaded and attached to it.
|
||||
3. At least one of the binaries uploaded is from a binary we know how to process.
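
The three checks can be summarized in a short sketch. The function name, its arguments, and the set of known binaries are hypothetical and do not reflect the actual code in `gen_hang_analyzer_tasks.py`; the supported operating system follows the note above that the core analyzer currently only runs on Linux.

```python
# Assumptions for this sketch only.
SUPPORTED_OPERATING_SYSTEMS = {"linux"}
KNOWN_BINARIES = {"mongod", "mongos"}  # hypothetical list of processable binaries


def should_generate_core_analyzer_task(os_name, core_dump_files, uploaded_binaries):
    """Return True only if all three checks pass."""
    # Check 1: the operating system is supported by the core analyzer.
    if os_name.lower() not in SUPPORTED_OPERATING_SYSTEMS:
        return False
    # Check 2: at least one core dump is attached to the task.
    if not core_dump_files:
        return False
    # Check 3: at least one uploaded binary is one we know how to process.
    return any(binary in KNOWN_BINARIES for binary in uploaded_binaries)


decision = should_generate_core_analyzer_task("linux", ["dump_mongod.core.gz"], ["mongod"])
```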
|
||||
|
||||
The output from this script is a json file in the format evergreen expects.
|
||||
We then pass this json file into the `generate.tasks` evergreen command to generate the task.
|
||||
The output from this script is a json file in the format evergreen expects. We then pass this json
|
||||
file into the `generate.tasks` evergreen command to generate the task.
|
||||
|
||||
After the task is generated, we have [another script](https://github.com/mongodb/mongo/blob/709e3f4efc04b42e5d29a8ad2417a01d3610fc3f/etc/evergreen_yml_components/definitions.yml#L2213) that finds the task that was just generated and attaches it to the current task being run.
|
||||
After the task is generated, we have
|
||||
[another script](https://github.com/mongodb/mongo/blob/709e3f4efc04b42e5d29a8ad2417a01d3610fc3f/etc/evergreen_yml_components/definitions.yml#L2213)
|
||||
that finds the task that was just generated and attaches it to the current task being run.
|
||||
|
||||
The reason we upload a temporary file to the original task is to attach that s3 file link to the task.
|
||||
Evergreen does not currently have a way to attach files to a task after it was run so we need to upload something while the original task is in progress.
|
||||
The reason we upload a temporary file to the original task is to attach that s3 file link to the
|
||||
task. Evergreen does not currently have a way to attach files to a task after it was run so we need
|
||||
to upload something while the original task is in progress.
|
||||
|
||||
@ -1,17 +1,15 @@
|
||||
# Powercycle README
|
||||
|
||||
Power cycling is the process of turning hardware off and then turning it on again.
|
||||
Powercycle test is designed to work across two machines, one machine is a "server"
|
||||
that controls and monitors the workflow and a "client" that runs Mongo server and
|
||||
is remotely crashed by "server" regularly.
|
||||
Power cycling is the process of turning hardware off and then turning it on again. Powercycle test
|
||||
is designed to work across two machines, one machine is a "server" that controls and monitors the
|
||||
workflow and a "client" that runs Mongo server and is remotely crashed by "server" regularly.
|
||||
|
||||
In evergreen the localhost that runs the task acts as a "server" and the remote
|
||||
host which is created by `host.create` evergreen command acts as a "client".
|
||||
In evergreen the localhost that runs the task acts as a "server" and the remote host which is
|
||||
created by `host.create` evergreen command acts as a "client".
|
||||
|
||||
Powercycle test is part of resmoke. Python 3.13+ with python venv is required to
|
||||
run resmoke (python3 from [mongodbtoolchain](http://mongodbtoolchain.build.10gen.cc/)
|
||||
is highly recommended). Python venv can be set up by running in the root mongo repo
|
||||
directory:
|
||||
Powercycle test is part of resmoke. Python 3.13+ with python venv is required to run resmoke
|
||||
(python3 from [mongodbtoolchain](http://mongodbtoolchain.build.10gen.cc/) is highly recommended).
|
||||
Python venv can be set up by running in the root mongo repo directory:
|
||||
|
||||
```
|
||||
python3 -m venv python3-venv
|
||||
@ -48,20 +46,18 @@ buildscripts/resmokelib/powercycle/__init__.py
|
||||
|
||||
### Set up EC2 instance
|
||||
|
||||
1. `Evergreen host.create command` - in Evergreen the remote host is created with
|
||||
the same distro as the localhost runs and some initial connections are made to ensure
|
||||
it's up before further steps
|
||||
2. `Resmoke powercycle setup-host command` - prepares remote host via ssh to run
|
||||
the powercycle test:
|
||||
1. `Evergreen host.create command` - in Evergreen the remote host is created with the same distro as
|
||||
the localhost runs and some initial connections are made to ensure it's up before further steps
|
||||
2. `Resmoke powercycle setup-host command` - prepares remote host via ssh to run the powercycle
|
||||
test:
|
||||
|
||||
```
|
||||
python buildscripts/resmoke.py powercycle setup-host
|
||||
```
|
||||
|
||||
Powercycle setup-host operations are located in
|
||||
`buildscripts/resmokelib/powercycle/setup/__init__.py`.
|
||||
`expansions.yml` file is used to load the configuration to run operations which is
|
||||
created by `expansions.write` command in Evergreen.
|
||||
`buildscripts/resmokelib/powercycle/setup/__init__.py`. `expansions.yml` file is used to load the
|
||||
configuration to run operations which is created by `expansions.write` command in Evergreen.
|
||||
|
||||
It runs several operations via ssh:
|
||||
|
||||
@ -69,12 +65,12 @@ It runs several operations via ssh:
|
||||
- copy `buildscripts` and `mongoDB executables` from localhost to the remote host
|
||||
- set up python venv on the remote host
|
||||
- set up curator to collect system & process stats on the remote host
|
||||
- install [NotMyFault](https://docs.microsoft.com/en-us/sysinternals/downloads/notmyfault)
|
||||
to crash Windows (only on Windows)
|
||||
- install [NotMyFault](https://docs.microsoft.com/en-us/sysinternals/downloads/notmyfault) to crash
|
||||
Windows (only on Windows)
|
||||
|
||||
Remote operation via ssh implementation is located in
|
||||
`buildscripts/resmokelib/powercycle/lib/remote_operations.py`.
|
||||
The following operations are supported:
|
||||
`buildscripts/resmokelib/powercycle/lib/remote_operations.py`. The following operations are
|
||||
supported:
|
||||
|
||||
- `copy_to` - copy files from the localhost to the remote host
|
||||
- `copy_from` - copy files from the remote host to the localhost
|
||||
@ -82,9 +78,8 @@ The following operations are supported:
|
||||
|
||||
### Run powercycle test
|
||||
|
||||
`Resmoke powercycle run command` - runs the powercycle test on the localhost
|
||||
which runs remote operations on the remote host via ssh and local validation
|
||||
checks:
|
||||
`Resmoke powercycle run command` - runs the powercycle test on the localhost which runs remote
|
||||
operations on the remote host via ssh and local validation checks:
|
||||
|
||||
```
|
||||
python buildscripts/resmoke.py powercycle run \
|
||||
@ -95,26 +90,26 @@ python buildscripts/resmoke.py powercycle run \
|
||||
|
||||
###### Resmoke powercycle run arguments
|
||||
|
||||
The arguments for resmoke powercycle run command are defined in `add_subcommand()`
|
||||
function in `buildscripts/resmokelib/powercycle/__init__.py`. When powercycle test
|
||||
runs remote operations on the remote host it calls the copied version of this script
|
||||
on the remote host. Thus, some resmoke powercycle run command arguments are needed
|
||||
for the remote call and shouldn't be used when calling the script on the localhost.
|
||||
The arguments for resmoke powercycle run command are defined in `add_subcommand()` function in
|
||||
`buildscripts/resmokelib/powercycle/__init__.py`. When powercycle test runs remote operations on the
|
||||
remote host it calls the copied version of this script on the remote host. Thus, some resmoke
|
||||
powercycle run command arguments are needed for the remote call and shouldn't be used when calling
|
||||
the script on the localhost.
|
||||
|
||||
`--taskName` argument is used to get powercycle task configurations that are stored
|
||||
in `buildscripts/resmokeconfig/powercycle/powercycle_tasks.yml`
|
||||
`--taskName` argument is used to get powercycle task configurations that are stored in
|
||||
`buildscripts/resmokeconfig/powercycle/powercycle_tasks.yml`
|
||||
|
||||
There is a known issue with `--setParameter` mongod options incorrectly processed
|
||||
from `mongod_options` that is described in [SERVER-47621](https://jira.mongodb.org/browse/SERVER-47621)
|
||||
There is a known issue with `--setParameter` mongod options incorrectly processed from
|
||||
`mongod_options` that is described in [SERVER-47621](https://jira.mongodb.org/browse/SERVER-47621)
|
||||
|
||||
###### Powercycle test implementation
|
||||
|
||||
The powercycle test main implementation is located in `main()` function in
|
||||
`buildscripts/resmokelib/powercycle/powercycle.py`.
|
||||
|
||||
The value of `--remoteOperation` argument is used to distinguish if we are running the script
|
||||
on the localhost or on the remote host.
|
||||
`remote_handler()` function performs the following remote operations:
|
||||
The value of `--remoteOperation` argument is used to distinguish if we are running the script on the
|
||||
localhost or on the remote host. `remote_handler()` function performs the following remote
|
||||
operations:
|
||||
|
||||
- `noop` - do nothing
|
||||
- `crash_server` - internally crash the server
|
||||
@ -157,17 +152,17 @@ When running on localhost the powercycle test loops do the following steps:
|
||||
|
||||
### Save diagnostics
|
||||
|
||||
`Resmoke powercycle save-diagnostics command` - copies powercycle diagnostics
|
||||
files from the remote host to the localhost (mainly used by Evergreen):
|
||||
`Resmoke powercycle save-diagnostics command` - copies powercycle diagnostics files from the remote
|
||||
host to the localhost (mainly used by Evergreen):
|
||||
|
||||
```
|
||||
python buildscripts/resmoke.py powercycle save-diagnostics
|
||||
```
|
||||
|
||||
Powercycle save-diagnostics operations are located in
|
||||
`buildscripts/resmokelib/powercycle/save_diagnostics/__init__.py`.
|
||||
`expansions.yml` file is used to load the configuration to run operations which is
|
||||
created by `expansions.write` command in Evergreen.
|
||||
`buildscripts/resmokelib/powercycle/save_diagnostics/__init__.py`. `expansions.yml` file is used to
|
||||
load the configuration to run operations which is created by `expansions.write` command in
|
||||
Evergreen.
|
||||
|
||||
It runs several operations via ssh:
|
||||
|
||||
@ -188,15 +183,14 @@ It runs several operations via ssh:
|
||||
|
||||
### Remote hang analyzer (optional)
|
||||
|
||||
`Resmoke powercycle remote-hang-analyzer command` - runs hang analyzer on the
|
||||
remote host (mainly used by Evergreen):
|
||||
`Resmoke powercycle remote-hang-analyzer command` - runs hang analyzer on the remote host (mainly
|
||||
used by Evergreen):
|
||||
|
||||
```
|
||||
$python buildscripts/resmoke.py powercycle remote-hang-analyzer
|
||||
```
|
||||
|
||||
Powercycle remote-hang-analyzer command calls resmoke hang analyzer on the
|
||||
remote host and is located in
|
||||
`buildscripts/resmokelib/powercycle/remote_hang_analyzer/__init__.py`.
|
||||
`expansions.yml` file is used to load the configuration to run this command which is
|
||||
created by `expansions.write` command in Evergreen.
|
||||
Powercycle remote-hang-analyzer command calls resmoke hang analyzer on the remote host and is
|
||||
located in `buildscripts/resmokelib/powercycle/remote_hang_analyzer/__init__.py`. `expansions.yml`
|
||||
file is used to load the configuration to run this command which is created by `expansions.write`
|
||||
command in Evergreen.
|
||||
|
||||
@ -4,24 +4,39 @@ Fixtures define a specific topology that tests run against.
|
||||
|
||||
## Supported Fixtures
|
||||
|
||||
Specify any of the following as the `fixture` in your [Suite](../../../../buildscripts/resmokeconfig/suites/README.md) config:
|
||||
Specify any of the following as the `fixture` in your
|
||||
[Suite](../../../../buildscripts/resmokeconfig/suites/README.md) config:
|
||||
|
||||
- [`BulkWriteFixture`](./bulk_write.py) - Fixture which provides JSTests with a set of clusters to run tests against.
|
||||
- [`ExternalFixture`](./external.py) - Fixture which provides JSTests capability to connect to external (non-resmoke) cluster.
|
||||
- [`ExternalShardedClusterFixture`](./shardedcluster.py) - Fixture to interact with external sharded cluster fixture.
|
||||
- [`MongoDFixture`](./standalone.py) - Fixture which provides JSTests with a standalone mongod to run against.
|
||||
- [`MongoTFixture`](./mongot.py) - Fixture which provides JSTests with a mongot to run alongside a mongod.
|
||||
- [`MultiReplicaSetFixture`](./multi_replica_set.py) - Fixture which provides JSTests with a set of replica sets to run against.
|
||||
- [`MultiShardedClusterFixture`](./multi_sharded_cluster.py) - Fixture which provides JSTests with a set of sharded clusters to run against.
|
||||
- [`ReplicaSetFixture`](./replicaset.py) - Fixture which provides JSTests with a replica set to run against.
|
||||
- [`ShardedClusterFixture`](./shardedcluster.py) - Fixture which provides JSTests with a sharded cluster to run against.
|
||||
- Used when the MongoDB deployment is started by the JavaScript test itself with `MongoRunner`, `ReplSetTest`, or `ShardingTest`.
|
||||
- [`YesFixture`](./yesfixture.py) - Fixture which spawns several `yes` executables to generate lots of log messages.
|
||||
- [`BulkWriteFixture`](./bulk_write.py) - Fixture which provides JSTests with a set of clusters to
|
||||
run tests against.
|
||||
- [`ExternalFixture`](./external.py) - Fixture which provides JSTests capability to connect to
|
||||
external (non-resmoke) cluster.
|
||||
- [`ExternalShardedClusterFixture`](./shardedcluster.py) - Fixture to interact with external sharded
|
||||
cluster fixture.
|
||||
- [`MongoDFixture`](./standalone.py) - Fixture which provides JSTests with a standalone mongod to
|
||||
run against.
|
||||
- [`MongoTFixture`](./mongot.py) - Fixture which provides JSTests with a mongot to run alongside a
|
||||
mongod.
|
||||
- [`MultiReplicaSetFixture`](./multi_replica_set.py) - Fixture which provides JSTests with a set of
|
||||
replica sets to run against.
|
||||
- [`MultiShardedClusterFixture`](./multi_sharded_cluster.py) - Fixture which provides JSTests with a
|
||||
set of sharded clusters to run against.
|
||||
- [`ReplicaSetFixture`](./replicaset.py) - Fixture which provides JSTests with a replica set to run
|
||||
against.
|
||||
- [`ShardedClusterFixture`](./shardedcluster.py) - Fixture which provides JSTests with a sharded
|
||||
cluster to run against.
|
||||
- Used when the MongoDB deployment is started by the JavaScript test itself with `MongoRunner`,
|
||||
`ReplSetTest`, or `ShardingTest`.
|
||||
- [`YesFixture`](./yesfixture.py) - Fixture which spawns several `yes` executables to generate lots
|
||||
of log messages.
|
||||
|
||||
## Interfaces
|
||||
|
||||
- [`Fixture`](./interface.py) - Base class for all fixtures.
|
||||
- [`MultiClusterFixture`](./interface.py) - Base class for fixtures that may consist of multiple independent participant clusters.
|
||||
- The participant clusters can function independently without coordination, but are bound together only for some duration as they participate in some process such as a migration. The participant clusters are fixtures themselves.
|
||||
- [`MultiClusterFixture`](./interface.py) - Base class for fixtures that may consist of multiple
|
||||
independent participant clusters.
|
||||
- The participant clusters can function independently without coordination, but are bound together
|
||||
only for some duration as they participate in some process such as a migration. The participant
|
||||
clusters are fixtures themselves.
|
||||
- [`NoOpFixture`](./interface.py) - A Fixture implementation that does not start any servers.
|
||||
- [`ReplFixture`](./interface.py) - Base class for all fixtures that support replication.
|
||||
|
||||
@ -4,84 +4,145 @@ Hooks are a mechanism to run routines _around_ the tests, at the test content bo
|
||||
|
||||
## Supported hooks
|
||||
|
||||
Specify any of the following as the `hooks` in your [Suite](../../../../buildscripts/resmokeconfig/suites/README.md) config:
|
||||
Specify any of the following as the `hooks` in your
|
||||
[Suite](../../../../buildscripts/resmokeconfig/suites/README.md) config:
|
||||
|
||||
- [`AnalyzeShardKeysInBackground`](./analyze_shard_key.py) - A hook for running `analyzeShardKey` commands while a test is running.
|
||||
- [`AntithesisLogging`](./antithesis_logging.py) - Prints antithesis commands before & after test run.
|
||||
- [`AnalyzeShardKeysInBackground`](./analyze_shard_key.py) - A hook for running `analyzeShardKey`
|
||||
commands while a test is running.
|
||||
- [`AntithesisLogging`](./antithesis_logging.py) - Prints antithesis commands before & after test
|
||||
run.
|
||||
- [`BackgroundInitialSync`](./initialsync.py) - Background Initial Sync
|
||||
- After every test, this hook checks if a background node has finished initial sync and if so validates it, tears it down, and restarts it.
|
||||
- This hook accepts a parameter `n` that specifies a number of tests after which it will wait for replication to finish before validating and restarting the initial sync node.
|
||||
- This requires the ReplicaSetFixture to be started with `start_initial_sync_node=True`. If used at the same time as `CleanEveryN`, the `n` value passed to this hook should be equal to the `n` value for `CleanEveryN`.
|
||||
- [`CheckClusterIndexConsistency`](./cluster_index_consistency.py) - Checks that indexes are the same across chunks for the same collections.
|
||||
- [`CheckMetadataConsistencyInBackground`](./metadata_consistency) - Check the metadata consistency of a sharded cluster.
|
||||
- [`CheckOrphansDeleted`](./orphans.py) - Check if the range deleter failed to delete any orphan documents.
|
||||
- [`CheckReplDBHashInBackground`](./dbhash_background.py) - A hook for comparing the dbhashes of all replica set members while a test is running.
|
||||
- After every test, this hook checks if a background node has finished initial sync and if so
|
||||
validates it, tears it down, and restarts it.
|
||||
- This hook accepts a parameter `n` that specifies a number of tests after which it will wait for
|
||||
replication to finish before validating and restarting the initial sync node.
|
||||
- This requires the ReplicaSetFixture to be started with `start_initial_sync_node=True`. If used
|
||||
at the same time as `CleanEveryN`, the `n` value passed to this hook should be equal to the `n`
|
||||
value for `CleanEveryN`.
|
||||
- [`CheckClusterIndexConsistency`](./cluster_index_consistency.py) - Checks that indexes are the
|
||||
same across chunks for the same collections.
|
||||
- [`CheckMetadataConsistencyInBackground`](./metadata_consistency) - Check the metadata consistency
|
||||
of a sharded cluster.
|
||||
- [`CheckOrphansDeleted`](./orphans.py) - Check if the range deleter failed to delete any orphan
|
||||
documents.
|
||||
- [`CheckReplDBHashInBackground`](./dbhash_background.py) - A hook for comparing the dbhashes of all
|
||||
replica set members while a test is running.
|
||||
- [`CheckReplDBHash`](./dbhash.py) - Check if the dbhashes match.
|
||||
- [`CheckReplOplogs`](./oplog.py) - Check that `local.oplog.rs` matches on the primary and secondaries.
|
||||
- [`CheckReplPreImagesConsistency`](./preimages_consistency.py) - Check that `config.system.preimages` is consistent between the primary and secondaries.
|
||||
- [`CheckRoutingTableConsistency`](./routing_table_consistency.py) - Verifies the absence of corrupted entries in config.chunks and config.collections.
|
||||
- [`CheckShardFilteringMetadata`](./shard_filtering_metadata.py) - Inspect filtering metadata on shards
|
||||
- [`CheckReplOplogs`](./oplog.py) - Check that `local.oplog.rs` matches on the primary and
|
||||
secondaries.
|
||||
- [`CheckReplPreImagesConsistency`](./preimages_consistency.py) - Check that
|
||||
`config.system.preimages` is consistent between the primary and secondaries.
|
||||
- [`CheckRoutingTableConsistency`](./routing_table_consistency.py) - Verifies the absence of
|
||||
corrupted entries in config.chunks and config.collections.
|
||||
- [`CheckShardFilteringMetadata`](./shard_filtering_metadata.py) - Inspect filtering metadata on
|
||||
shards
|
||||
- [`CleanEveryN`](./cleanup.py) - Restart the fixture after it has run `n` tests.
|
||||
- [`CleanupConcurrencyWorkloads`](./cleanup_concurrency_workloads.py) - Drop all databases, except those that have been excluded.
|
||||
- For concurrency tests that run on different DBs, drop all databases except ones in `exclude_dbs`. For tests that run on the same DB, drop all databases except ones in `exclude_dbs` and the DB used by the test/workloads. For tests that run on the same collection, drop all collections in all databases except for `exclude_dbs` and the collection used by the test/workloads.
|
||||
- [`CleanupConcurrencyWorkloads`](./cleanup_concurrency_workloads.py) - Drop all databases, except
|
||||
those that have been excluded.
|
||||
- For concurrency tests that run on different DBs, drop all databases except ones in
|
||||
`exclude_dbs`. For tests that run on the same DB, drop all databases except ones in
|
||||
`exclude_dbs` and the DB used by the test/workloads. For tests that run on the same collection,
|
||||
drop all collections in all databases except for `exclude_dbs` and the collection used by the
|
||||
test/workloads.
|
||||
- On mongod-related fixtures, this will clear the dbpath
|
||||
- [`ClusterParameter`](./cluster_parameter.py) - Sets the specified cluster server parameter.
|
||||
- [`ContinuousAddRemoveShard`](./add_remove_shards.py) - Continuously adds and removes shards at regular intervals. If running with `configsvr` transitions, will transition in/out of config shard mode.
|
||||
- [`ContinuousInitialSync`](./continuous_initial_sync.py) - Periodically initial sync nodes then step them up.
|
||||
- [`ContinuousStepdown`](./stepdown.py) - regularly connect to replica sets and send a `replSetStepDown` command.
|
||||
- [`ContinuousTransition`](./replicaset_transition_to_and_from_csrs.py) - connects to replica sets and transitions them from replica set to CSRS node in the background.
|
||||
- [`DoReconfigInBackground`](./reconfig_background.py) - A hook for running a safe reconfig against a replica set while a test is running.
|
||||
- [`DropConfigCacheCollections`](./drop_config_cache_collections.py) - A hook for dropping random entries of config.cache.collections in shards.
|
||||
- [`DropSessionsCollection`](./drop_sessions_collection.py) - A hook for dropping and recreating config.system.sessions while tests are running.
|
||||
- [`ContinuousAddRemoveShard`](./add_remove_shards.py) - Continuously adds and removes shards at
|
||||
regular intervals. If running with `configsvr` transitions, will transition in/out of config shard
|
||||
mode.
|
||||
- [`ContinuousInitialSync`](./continuous_initial_sync.py) - Periodically initial sync nodes then
|
||||
step them up.
|
||||
- [`ContinuousStepdown`](./stepdown.py) - regularly connect to replica sets and send a
|
||||
`replSetStepDown` command.
|
||||
- [`ContinuousTransition`](./replicaset_transition_to_and_from_csrs.py) - connects to replica sets
|
||||
and transitions them from replica set to CSRS node in the background.
|
||||
- [`DoReconfigInBackground`](./reconfig_background.py) - A hook for running a safe reconfig against
|
||||
a replica set while a test is running.
|
||||
- [`DropConfigCacheCollections`](./drop_config_cache_collections.py) - A hook for dropping random
|
||||
entries of config.cache.collections in shards.
|
||||
- [`DropSessionsCollection`](./drop_sessions_collection.py) - A hook for dropping and recreating
|
||||
config.system.sessions while tests are running.
|
||||
- [`DropUserCollections`](./drop_user_collections.py) - Drops all user collections.
|
||||
- [`EnableSpuriousWriteConflicts`](./enable_spurious_write_conflicts.py) - Toggles write conflicts.
|
||||
- [`FCVUpgradeDowngradeInBackground`](./fcv_upgrade_downgrade.py) - A hook to run background FCV upgrade and downgrade against test servers while a test is running.
|
||||
- [`FuzzRuntimeParameters`](./fuzz_runtime_parameters.py) - Regularly connects to nodes and sends them a `setParameter` command; uses the [Config Fuzzer](../../../../buildscripts/resmokelib/generate_fuzz_config/README.md).
|
||||
- [`FuzzRuntimeStress`](./fuzz_runtime_stress.py) - Test hook that periodically changes the amount of stress the system is experiencing.
|
||||
- [`FCVUpgradeDowngradeInBackground`](./fcv_upgrade_downgrade.py) - A hook to run background FCV
|
||||
upgrade and downgrade against test servers while a test is running.
|
||||
- [`FuzzRuntimeParameters`](./fuzz_runtime_parameters.py) - Regularly connects to nodes and sends
|
||||
them a `setParameter` command; uses the
|
||||
[Config Fuzzer](../../../../buildscripts/resmokelib/generate_fuzz_config/README.md).
|
||||
- [`FuzzRuntimeStress`](./fuzz_runtime_stress.py) - Test hook that periodically changes the amount
|
||||
of stress the system is experiencing.
|
||||
- [`FuzzerRestoreSettings`](./fuzzer_restore_settings.py) - Cleans up unwanted changes from fuzzer.
|
||||
- [`GenerateAndCheckPerfResults`](./generate_and_check_perf_results.py) - Combine JSON results from individual benchmarks and check their reported values against any thresholds set for them.
|
||||
- Combines test results from individual benchmark files to a single file. This is useful for generating the json file to feed into the Evergreen performance visualization plugin.
|
||||
- [`GenerateAndCheckPerfResults`](./generate_and_check_perf_results.py) - Combine JSON results from
|
||||
individual benchmarks and check their reported values against any thresholds set for them.
|
||||
- Combines test results from individual benchmark files to a single file. This is useful for
|
||||
generating the json file to feed into the Evergreen performance visualization plugin.
|
||||
- [`HelloDelays`](./hello_failures.py) - Sets Hello fault injections.
|
||||
- [`IntermediateInitialSync`](./initialsync.py) - Intermediate Initial Sync
|
||||
- This hook accepts a parameter `n` that specifies a number of tests after which it will start up a node to initial sync, wait for replication to finish, and then validate the data.
|
||||
- This hook accepts a parameter `n` that specifies a number of tests after which it will start up
|
||||
a node to initial sync, wait for replication to finish, and then validate the data.
|
||||
- This requires the ReplicaSetFixture to be started with 'start_initial_sync_node=True'.
|
||||
- [`LagOplogApplicationInBackground`](./secondary_lag.py) - Toggles secondary oplog application lag.
|
||||
- [`LibfuzzerHook`](./cpp_libfuzzer.py) - Merges inputs after a fuzzer run.
|
||||
- [`MagicRestoreEveryN`](./magic_restore.py) - Open a backup cursor and run magic restore process after `n` tests have run.
|
||||
- [`MagicRestoreEveryN`](./magic_restore.py) - Open a backup cursor and run magic restore process
|
||||
after `n` tests have run.
|
||||
- Requires the use of `MagicRestoreFixture`.
|
||||
- [`PeriodicKillSecondaries`](./periodic_kill_secondaries.py) - Periodically kills the secondaries in a replica set.
|
||||
- Also verifies that the secondaries can reach the SECONDARY state without having connectivity to the primary after an unclean shutdown.
|
||||
- [`PeriodicStackTrace`](./periodic_stack_trace.py) - Test hook that sends the stacktracing signal to mongo processes at randomized intervals.
|
||||
- [`QueryableServerHook`](./queryable_server_hook.py) - Starts the queryable server before each test for queryable restores. Restarts the queryable server between tests.
|
||||
- [`RotateExecutionControlParams`](./rotate_execution_control_params.py) - Periodically rotates 'executionControlConcurrencyAdjustmentAlgorithm' and deprioritization server parameters to random valid values.
|
||||
- [`RunChangeStreamsInBackground`](./change_streams.py) - Run in the background full cluster change streams while a test is running. Open and close the change stream every `1..10` tests (random using `config.RANDOM_SEED`).
|
||||
- [`RunDBCheckInBackground`](./dbcheck_background.py) - A hook for running `dbCheck` on a replica set while a test is running.
|
||||
- This includes dbhashes for all non-local databases and non-replicated system collections that match on the primary and secondaries.
|
||||
- It also will check the performance results against any thresholds that are set for each benchmark. If no thresholds are set for a test, this hook should always pass.
|
||||
- [`RunQueryStats`](./run_query_stats.py) - Runs `$queryStats` after every test, and clears the query stats store before every test.
|
||||
- [`PeriodicKillSecondaries`](./periodic_kill_secondaries.py) - Periodically kills the secondaries
|
||||
in a replica set.
|
||||
- Also verifies that the secondaries can reach the SECONDARY state without having connectivity to
|
||||
the primary after an unclean shutdown.
|
||||
- [`PeriodicStackTrace`](./periodic_stack_trace.py) - Test hook that sends the stacktracing signal
|
||||
to mongo processes at randomized intervals.
|
||||
- [`QueryableServerHook`](./queryable_server_hook.py) - Starts the queryable server before each test
|
||||
for queryable restores. Restarts the queryable server between tests.
|
||||
- [`RotateExecutionControlParams`](./rotate_execution_control_params.py) - Periodically rotates
|
||||
'executionControlConcurrencyAdjustmentAlgorithm' and deprioritization server parameters to random
|
||||
valid values.
|
||||
- [`RunChangeStreamsInBackground`](./change_streams.py) - Run in the background full cluster change
|
||||
streams while a test is running. Open and close the change stream every `1..10` tests (random
|
||||
using `config.RANDOM_SEED`).
|
||||
- [`RunDBCheckInBackground`](./dbcheck_background.py) - A hook for running `dbCheck` on a replica
|
||||
set while a test is running.
|
||||
- This includes dbhashes for all non-local databases and non-replicated system collections that
|
||||
match on the primary and secondaries.
|
||||
- It also will check the performance results against any thresholds that are set for each
|
||||
benchmark. If no thresholds are set for a test, this hook should always pass.
|
||||
- [`RunQueryStats`](./run_query_stats.py) - Runs `$queryStats` after every test, and clears the
|
||||
query stats store before every test.
|
||||
- [`SimulateCrash`](./simulate_crash.py) - A hook to simulate crashes.
|
||||
- [`ValidateCollections`](./validate.py) - Run full validation.
|
||||
- [`ValidateCollectionsInBackground`](./validate_background.py) - A hook to run background collection validation against test servers while a test is running.
|
||||
- This will run on all collections in all databases on every stand-alone node, primary replica-set node, or primary shard node.
|
||||
- [`ValidateDirectSecondaryReads`](./validate_direct_secondary_reads.py) - Only supported in suites that use `ReplicaSetFixture`.
|
||||
- To be used with `set_read_preference_secondary.js` and `implicit_enable_profiler.js` in suites that read directly from secondaries in a replica set. Check the profiler collections of all databases at the end of the suite to verify that each secondary only ran the read commands it got directly from the shell.
|
||||
- [`ValidateCollectionsInBackground`](./validate_background.py) - A hook to run background
|
||||
collection validation against test servers while a test is running.
|
||||
- This will run on all collections in all databases on every stand-alone node, primary replica-set
|
||||
node, or primary shard node.
|
||||
- [`ValidateDirectSecondaryReads`](./validate_direct_secondary_reads.py) - Only supported in suites
|
||||
that use `ReplicaSetFixture`.
|
||||
- To be used with `set_read_preference_secondary.js` and `implicit_enable_profiler.js` in suites
|
||||
that read directly from secondaries in a replica set. Check the profiler collections of all
|
||||
databases at the end of the suite to verify that each secondary only ran the read commands it
|
||||
got directly from the shell.
|
||||
- [`WaitForReplication`](./wait_for_replication.py) - Wait for replication to complete.
|
||||
|
||||
## Interfaces
|
||||
|
||||
All hooks inherit from the [`buildscripts.resmokelib.testing.hooks.interface.Hook`](./interface.py) parent class and can override any subset of the following empty base methods:
|
||||
All hooks inherit from the [`buildscripts.resmokelib.testing.hooks.interface.Hook`](./interface.py)
|
||||
parent class and can override any subset of the following empty base methods:
|
||||
|
||||
- `before_suite`
|
||||
- `before_test`
|
||||
- `after_test`
|
||||
- `after_suite`
|
||||
|
||||
At least 1 base method must be overridden, otherwise the hook will not do anything at all. During test suite execution, each hook runs its custom logic in the respective scenarios. Some customizable tasks that hooks can perform include: _validating data, deleting data, performing cleanup_, etc.
|
||||
At least 1 base method must be overridden, otherwise the hook will not do anything at all. During
|
||||
test suite execution, each hook runs its custom logic in the respective scenarios. Some customizable
|
||||
tasks that hooks can perform include: _validating data, deleting data, performing cleanup_, etc.
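
As a rough illustration of overriding a subset of the base methods, the sketch below defines a stand-in `Hook` class rather than importing the real one from `interface.py`. The method signatures, the `CountTests` hook, and the `run_suite` driver are all assumptions made for this example and may not match resmoke's actual interface.

```python
class Hook:
    """Simplified stand-in for resmoke's Hook base class (signatures are assumed)."""

    def before_suite(self, test_report):
        pass

    def before_test(self, test, test_report):
        pass

    def after_test(self, test, test_report):
        pass

    def after_suite(self, test_report):
        pass


class CountTests(Hook):
    """Hypothetical hook that overrides only two of the four base methods."""

    def __init__(self):
        self.started = 0
        self.finished = 0

    def before_test(self, test, test_report):
        self.started += 1

    def after_test(self, test, test_report):
        self.finished += 1


def run_suite(hook, tests):
    """Hypothetical driver showing when each hook method would be called."""
    report = []
    hook.before_suite(report)
    for test in tests:
        hook.before_test(test, report)
        report.append(test)  # the test itself would run here
        hook.after_test(test, report)
    hook.after_suite(report)
    return report
```

Because the base methods are empty, `CountTests` does nothing at the suite boundaries and acts only around each test.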
|
||||
|
||||
- [`BGHook`](./bghook.py) - A hook that repeatedly calls `run_action()` in a background thread for the duration of the test suite.
|
||||
- [`DataConsistencyHook`](./jsfile.py) - A hook for running a static JavaScript file that checks data consistency of the server.
|
||||
- If the mongo shell process running the JavaScript file exits with a non-zero return code, then an `errors.ServerFailure` exception is raised to cause resmoke.py's test execution to stop.
|
||||
- [`BGHook`](./bghook.py) - A hook that repeatedly calls `run_action()` in a background thread for
|
||||
the duration of the test suite.
|
||||
- [`DataConsistencyHook`](./jsfile.py) - A hook for running a static JavaScript file that checks
|
||||
data consistency of the server.
|
||||
- If the mongo shell process running the JavaScript file exits with a non-zero return code, then
|
||||
an `errors.ServerFailure` exception is raised to cause resmoke.py's test execution to stop.
|
||||
- [`Hook`](./interface.py) - Common interface all Hooks will inherit from.
|
||||
- [`JSHook`](./jsfile.py) - A hook interface with a static JavaScript file to execute.
|
||||
- [`PerClusterDataConsistencyHook`](./jsfile.py) - A hook that runs on each independent cluster of the fixture.
|
||||
- [`PerClusterDataConsistencyHook`](./jsfile.py) - A hook that runs on each independent cluster of
|
||||
the fixture.
|
||||
- The independent cluster itself may be another fixture.
|
||||
|
||||
@ -1,33 +1,52 @@
|
||||
# TestCases
|
||||
|
||||
TestCases extend Python-based `unittest.TestCase` objects that resmoke can run as different "kinds" of tests.
|
||||
TestCases extend Python-based `unittest.TestCase` objects that resmoke can run as different "kinds"
|
||||
of tests.
|
||||
|
||||
## Supported TestCases
|
||||
|
||||
Specify any of the following as the `test_kind` in your [Suite](../../../../buildscripts/resmokeconfig/suites/README.md) config:
|
||||
Specify any of the following as the `test_kind` in your
|
||||
[Suite](../../../../buildscripts/resmokeconfig/suites/README.md) config:
|
||||
|
||||
- `all_versions_js_test`: [`AllVersionsJSTestCase`](./jstest.py) - Alias for JSTestCase for multiversion passthrough suites.
|
||||
- It runs with all combinations of versions of replica sets and sharded clusters. The distinct name is picked up by task generation.
|
||||
- `all_versions_js_test`: [`AllVersionsJSTestCase`](./jstest.py) - Alias for JSTestCase for
|
||||
multiversion passthrough suites.
|
||||
- It runs with all combinations of versions of replica sets and sharded clusters. The distinct
|
||||
name is picked up by task generation.
|
||||
- `benchmark_test`: [`BenchmarkTestCase`](./benchmark_test.py) - A Benchmark test to execute.
|
||||
- `bulk_write_cluster_js_test`: [`BulkWriteClusterTestCase`](./bulk_write_cluster_js_test.py) - A test to execute with connection data for multiple clusters passed through TestData.
|
||||
- `cpp_integration_test`: [`CPPIntegrationTestCase`](./cpp_integration_test.py) - A C++ integration test to execute.
|
||||
- `cpp_libfuzzer_test`: [`CPPLibfuzzerTestCase`](./cpp_libfuzzer_test.py) - A C++ libfuzzer test to execute.
|
||||
- `bulk_write_cluster_js_test`: [`BulkWriteClusterTestCase`](./bulk_write_cluster_js_test.py) - A
|
||||
test to execute with connection data for multiple clusters passed through TestData.
|
||||
- `cpp_integration_test`: [`CPPIntegrationTestCase`](./cpp_integration_test.py) - A C++ integration
|
||||
test to execute.
|
||||
- `cpp_libfuzzer_test`: [`CPPLibfuzzerTestCase`](./cpp_libfuzzer_test.py) - A C++ libfuzzer test to
|
||||
execute.
|
||||
- `cpp_unit_test`: [`CPPUnitTestCase`](./cpp_unittest.py) - A C++ unit test to execute.
|
||||
- `db_test`: [`DBTestCase`](./dbtest.py) - A dbtest to execute.
|
||||
- `fsm_workload_test`: [`FSMWorkloadTestCase`](./fsm_workload_test.py) - A wrapper for several copies of a `_SingleFSMWorkloadTestCase` to execute.
|
||||
- `js_test`: [`JSTestCase`](./jstest.py) - A wrapper for several copies of a `_SingleJSTestCase` to execute
|
||||
- Around **75% of all suites use the `js_test` kind**. See [jstests/README.md](../../../../jstests/README.md) for specific guidance.
|
||||
- `fsm_workload_test`: [`FSMWorkloadTestCase`](./fsm_workload_test.py) - A wrapper for several
|
||||
copies of a `_SingleFSMWorkloadTestCase` to execute.
|
||||
- `js_test`: [`JSTestCase`](./jstest.py) - A wrapper for several copies of a `_SingleJSTestCase` to
|
||||
execute
|
||||
- Around **75% of all suites use the `js_test` kind**. See
|
||||
[jstests/README.md](../../../../jstests/README.md) for specific guidance.
|
||||
- `json_schema_test`: [`JSONSchemaTestCase`](./json_schema_test.py) - A JSON Schema test to execute.
|
||||
- `magic_restore_js_test`: [`MagicRestoreTestCase`](./magic_restore_js_test.py) - A test to execute for running tests in a try/catch block.
|
||||
- `mongos_test`: [`MongosTestCase`](./mongos_test.py) - A TestCase which runs a mongos binary with the given parameters.
|
||||
- `multi_stmt_txn_passthrough`: [`MultiStmtTxnTestCase`](./multi_stmt_txn_test.py) - Test case for multi statement transactions.
|
||||
- `parallel_fsm_workload_test`: [`ParallelFSMWorkloadTestCase`](./fsm_workload_test.py) - An FSM workload to execute.
|
||||
- `pretty_printer_test`: [`PrettyPrinterTestCase`](./pretty_printer_testcase.py) - A pretty printer test to execute.
|
||||
- `magic_restore_js_test`: [`MagicRestoreTestCase`](./magic_restore_js_test.py) - A test to execute
|
||||
for running tests in a try/catch block.
|
||||
- `mongos_test`: [`MongosTestCase`](./mongos_test.py) - A TestCase which runs a mongos binary with
|
||||
the given parameters.
|
||||
- `multi_stmt_txn_passthrough`: [`MultiStmtTxnTestCase`](./multi_stmt_txn_test.py) - Test case for
|
||||
multi statement transactions.
|
||||
- `parallel_fsm_workload_test`: [`ParallelFSMWorkloadTestCase`](./fsm_workload_test.py) - An FSM
|
||||
workload to execute.
|
||||
- `pretty_printer_test`: [`PrettyPrinterTestCase`](./pretty_printer_testcase.py) - A pretty printer
|
||||
test to execute.
|
||||
- `py_test`: [`PyTestCase`](./pytest.py) - A python test to execute.
|
||||
- `query_tester_self_test`: [`QueryTesterSelfTestCase`](./query_tester_self_test.py) - A QueryTester self-test to execute.
|
||||
- `query_tester_server_test`: [`QueryTesterServerTestCase`](./query_tester_server_test.py) - A QueryTester server test to execute.
|
||||
- `sdam_json_test`: [`SDAMJsonTestCase`](./sdam_json_test.py) - Server Discovery and Monitoring JSON test case.
|
||||
- `server_selection_json_test`: [`ServerSelectionJsonTestCase`](./server_selection_json_test.py) - Server Selection JSON test case.
|
||||
- `query_tester_self_test`: [`QueryTesterSelfTestCase`](./query_tester_self_test.py) - A QueryTester
|
||||
self-test to execute.
|
||||
- `query_tester_server_test`: [`QueryTesterServerTestCase`](./query_tester_server_test.py) - A
|
||||
QueryTester server test to execute.
|
||||
- `sdam_json_test`: [`SDAMJsonTestCase`](./sdam_json_test.py) - Server Discovery and Monitoring JSON
|
||||
test case.
|
||||
- `server_selection_json_test`: [`ServerSelectionJsonTestCase`](./server_selection_json_test.py) -
|
||||
Server Selection JSON test case.
|
||||
- `sleep_test`: [`SleepTestCase`](./sleeptest.py) - SleepTestCase class.
|
||||
- `tla_plus_test`: [`TLAPlusTestCase`](./tla_plus_test.py) - A TLA+ specification to model-check.
|
||||
|
||||
@ -36,26 +55,36 @@ Specify any of the following as the `test_kind` in your [Suite](../../../../buil
|
||||
Top level interfaces:
|
||||
|
||||
- [`TestCase`](./interface.py) - A test case to execute. The `run_test` method must be implemented.
|
||||
- [`ProcessTestCase`](./interface.py) - Base class for TestCases that executes an external process. The `_make_process` method must be implemented.
|
||||
- [`ProcessTestCase`](./interface.py) - Base class for TestCases that executes an external process.
|
||||
The `_make_process` method must be implemented.
|
||||
|
||||
Subclasses:
|
||||
|
||||
- [`JSRunnerFileTestCase`](./jsrunnerfile.py) - A test case with a static JavaScript runner file to execute.
|
||||
- [`MultiClientsTestCase`](./jstest.py) - A wrapper for several copies of a SingleTestCase to execute.
|
||||
- [`JSRunnerFileTestCase`](./jsrunnerfile.py) - A test case with a static JavaScript runner file to
|
||||
execute.
|
||||
- [`MultiClientsTestCase`](./jstest.py) - A wrapper for several copies of a SingleTestCase to
|
||||
execute.
|
||||
- [`TestCaseFactory`](./interface.py) - Convenience interface to initialize and build test cases
|
||||
|
||||
## Fixture TestCases
|
||||
|
||||
These are testcases that are used to coordinate fixture lifecycles via resmoke's internal `FixtureTestCaseManager`.
|
||||
These are testcases that are used to coordinate fixture lifecycles via resmoke's internal
|
||||
`FixtureTestCaseManager`.
|
||||
|
||||
> NOTE This design does lead to seeing "extra" tests in a run, where a fixture sets up, your `N` tests are run, and the fixture tears down, so you see `N+2` "tests" passing via resmoke.
|
||||
> NOTE This design does lead to seeing "extra" tests in a run, where a fixture sets up, your `N`
|
||||
> tests are run, and the fixture tears down, so you see `N+2` "tests" passing via resmoke.
|
||||
|
||||
- [`FixtureTestCase`](./fixture.py) - Base class for the fixture test cases.
|
||||
- [`FixtureSetupTestCase`](./fixture.py) - TestCase for setting up a fixture.
|
||||
- [`FixtureTeardownTestCase`](./fixture.py) - TestCase for tearing down a fixture.
|
||||
- [`FixtureAbortTestCase`](./fixture.py) - TestCase for killing/aborting a fixture. Intended for use before archiving a failed test.
|
||||
- When resmoke detects that a test has failed (and [archiving](../../../../buildscripts/resmokeconfig/suites/README.md#executorarchive) is configured), it dynamically generates a new `FixtureAbortTestCase` for immediate execution. This test case sends a `SIGABRT` to each running mongod process.
|
||||
- [`FixtureAbortTestCase`](./fixture.py) - TestCase for killing/aborting a fixture. Intended for use
|
||||
before archiving a failed test.
|
||||
- When resmoke detects that a test has failed (and
|
||||
[archiving](../../../../buildscripts/resmokeconfig/suites/README.md#executorarchive) is
|
||||
configured), it dynamically generates a new `FixtureAbortTestCase` for immediate execution.
|
||||
This test case sends a `SIGABRT` to each running mongod process.
|
||||
|
||||
## Testing TestCases
|
||||
|
||||
Self-tests for the testcases themselves can be found in [buildscripts/tests/resmokelib/testing/testcases/](../../../../buildscripts/tests/resmokelib/testing/testcases/)
|
||||
Self-tests for the testcases themselves can be found in
|
||||
[buildscripts/tests/resmokelib/testing/testcases/](../../../../buildscripts/tests/resmokelib/testing/testcases/)
|
||||
|
||||
@ -1,33 +1,55 @@
|
||||
# S3 Binary
|
||||
|
||||
This is a small utility to help safely manage tool binaries that are stored in MongoDB's S3 bucket for the purpose of using in this repository's build, test, or release processes.
|
||||
This is a small utility to help safely manage tool binaries that are stored in MongoDB's S3 bucket
|
||||
for the purpose of using in this repository's build, test, or release processes.
|
||||
|
||||
### Security
|
||||
|
||||
Any time a binary is pulled down from the internet and executed, there is risk that the binary has been modified unintentionally. This tool creates a hash of the binary that the developer uploads and stores a record of it in a programmatically accessible Python script (see `buildscripts/s3_binary/hashes.py`). When a tool uses the S3 binary, this interface forces a checksum of the binary before the binary is run, verifying the result against the value stored in `hashes.py` and stopping execution if it doesn't match.
|
||||
Any time a binary is pulled down from the internet and executed, there is risk that the binary has
|
||||
been modified unintentionally. This tool creates a hash of the binary that the developer uploads
|
||||
and stores a record of it in a programmatically accessible Python script (see
|
||||
`buildscripts/s3_binary/hashes.py`). When a tool uses the S3 binary, this interface forces a
|
||||
checksum of the binary before the binary is run, verifying the result against the value stored in
|
||||
`hashes.py` and stopping execution if it doesn't match.
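
The check described above can be sketched as follows. This is only an illustration: `sha256_of`, `verify_binary`, and the `known_hashes` mapping are hypothetical names, and the real layout of `hashes.py` and the real download interface may differ.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_binary(local_path, s3_path, known_hashes):
    """Refuse to hand back a binary whose hash is unknown or does not match.

    `known_hashes` is an assumed mapping of S3 path -> expected SHA-256 hex digest.
    """
    expected = known_hashes.get(s3_path)
    if expected is None:
        raise ValueError(f"no recorded hash for {s3_path}")
    actual = sha256_of(local_path)
    if actual != expected:
        raise ValueError(f"hash mismatch for {s3_path}: {actual} != {expected}")
    return local_path
```

Raising before the caller can execute the file is the point of the design: a mismatch stops the run rather than producing a warning.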
|
||||
|
||||
### Hermetic Guarantee
|
||||
|
||||
The other risk of relying on a binary stored in S3 is that if the binary is changed, it will change the results of previously run tests or builds in continuous integration. This is not ideal since there are often cases where an old commit needs to be re-run to reproduce user issues. Storing the hash in the repository and preventing modifications prevents accidental compatibility breaks of previous commits.
|
||||
The other risk of relying on a binary stored in S3 is that if the binary is changed, that it will
|
||||
change the results of previously run tests or builds in continuous integration. This is not ideal
|
||||
since there are often cases where an old commit needs to be re-ran to reproduce user issues. Storing
|
||||
the hash in the repository and preventing modifications prevents accidental compatibility breaks of
|
||||
previous commits.
|
||||
|
||||
### Example Usage
|
||||
|
||||
Scenario: You have a developer tool called db-contrib-tool that you want to build into a binary, and then use that binary as part of a test process in 10gen/mongo. To use the s3_binary tool you would:
|
||||
Scenario: You have a developer tool called db-contrib-tool that you want to build into a binary, and
|
||||
then use that binary as part of a test process in 10gen/mongo. To use the s3_binary tool you would:
|
||||
|
||||
1. Create your binaries and put them into a single directory on your local system, ex:
|
||||
/tmp/db-contrib-tool/db-contrib-tool-v1_windows.exe
|
||||
/tmp/db-contrib-tool/db-contrib-tool-v1_linux
|
||||
/tmp/db-contrib-tool/db-contrib-tool-v1_windows.exe /tmp/db-contrib-tool/db-contrib-tool-v1_linux
|
||||
|
||||
2. Invoke bazel run buildscripts/s3_binary:upload -- /tmp/db-contrib-tool s3://mdb-build-public/db-contrib-tool/v1
|
||||
2. Invoke bazel run buildscripts/s3_binary:upload -- /tmp/db-contrib-tool
|
||||
s3://mdb-build-public/db-contrib-tool/v1
|
||||
|
||||
3. Follow the prompts, this will then update your local `buildscripts/s3_binary/hashes.py` file mapping the s3 path of each binary to its sha256 hash.
|
||||
3. Follow the prompts, this will then update your local `buildscripts/s3_binary/hashes.py` file
|
||||
mapping the s3 path of each binary to its sha256 hash.
|
||||
|
||||
4. Update your test code to call: `download_s3_binary(f"s3://mdb-build-public/db-contrib-tool/v1/db-contrib-tool-v1_{os}{ext}")`. This will then automatically verify the download matches the hash at runtime.
|
||||
4. Update your test code to call:
|
||||
`download_s3_binary(f"s3://mdb-build-public/db-contrib-tool/v1/db-contrib-tool-v1_{os}{ext}")`.
|
||||
This will then automatically verify the download matches the hash at runtime.
|
||||
|
||||
5. Create a commit with your new code that adds in the `download_s3_binary` call and the `buildscripts/s3_binary/hashes.py` modifications.
|
||||
5. Create a commit with your new code that adds in the `download_s3_binary` call and the
|
||||
`buildscripts/s3_binary/hashes.py` modifications.
|
||||
|
||||
The case above covers usage in Python. If using another language like Starlark for Bazel dependencies, you would follow the same flow but copy the hashes into the Starlark code instead of relying on hashes.py. Please retain the modifications to hashes.py regardless to make it easy to use your binaries in Python.
|
||||
The case above covers usage in Python. If using another language like Starlark for Bazel
|
||||
dependencies, you would follow the same flow but copy the hashes into the Starlark code instead of
|
||||
relying on hashes.py. Please retain the modifications to hashes.py regardless to make it easy to
|
||||
use your binaries in Python.
|
||||
|
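As a hedged illustration of step 4, the `{os}{ext}` placeholders could be filled in as below so the path matches the file names uploaded in step 1. The helper name `binary_s3_path` and the platform detection via `sys.platform` are assumptions for this sketch; only the bucket path and file names come from the example above.

```python
import sys


def binary_s3_path(platform_name=None):
    """Build the S3 path of the db-contrib-tool binary for a platform.

    platform_name defaults to sys.platform; it is a parameter so the
    mapping can be exercised for any platform.
    """
    platform_name = sys.platform if platform_name is None else platform_name
    if platform_name.startswith("win"):
        os_name, ext = "windows", ".exe"
    elif platform_name.startswith("linux"):
        os_name, ext = "linux", ""
    else:
        # Only the two binaries from the example scenario were uploaded.
        raise ValueError(f"no binary uploaded for platform {platform_name!r}")
    return f"s3://mdb-build-public/db-contrib-tool/v1/db-contrib-tool-v1_{os_name}{ext}"
```

The resulting string is what would be passed to `download_s3_binary(...)`.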
||||
### Future Additions
|
||||
|
||||
In general, it's less error prone to have the entire flow of building, uploading, and using a binary all happen in an automated pipeline without developer interaction. In the future, this tool will be updated to be easily invocable from a continuous integration pipeline that performs the build and either returns the hashes to the user to be later committed, or automatically submits a PR to update them.
|
||||
In general, it's less error prone to have the entire flow of building, uploading, and using a binary
|
||||
all happen in an automated pipeline without developer interaction. In the future, this tool will be
|
||||
updated to be easily invocable from a continuous integration pipeline that performs the build and
|
||||
either returns the hashes to the user to be later committed, or automatically submits a PR to update
|
||||
them.
|
||||
|
||||
@ -55,8 +55,8 @@ bazel test --test_output=summary --test_tag_filters=-intermediate_debug,server-p
|
||||
|
||||
## Storage Execution
|
||||
|
||||
The smoke test suites for storage execution are divided up into components. The smoke test suite
|
||||
for all of the components that storage execution owns can be run with the following:
|
||||
The smoke test suites for storage execution are divided up into components. The smoke test suite for
|
||||
all of the components that storage execution owns can be run with the following:
|
||||
|
||||
```
|
||||
bazel test --test_output=summary --test_tag_filters=-intermediate_debug,server-bsoncolumn,server-collection-write-path,server-external-sorter,server-index-builds,server-key-string,server-storage-engine-integration,server-timeseries-bucket-catalog,server-tracking-allocators,server-ttl //...
|
||||
@ -76,7 +76,8 @@ There are currently no smoke test integration tests for this component.
|
||||
|
||||
### Server-Collection-Write-Path
|
||||
|
||||
The unit and integration tests for the server-collection-write-path component can be run with the following:
|
||||
The unit and integration tests for the server-collection-write-path component can be run with the
|
||||
following:
|
||||
|
||||
```
|
||||
bazel test --test_output=summary --test_tag_filters=-intermediate_debug,server-collection-write-path //...
|
||||
@ -112,7 +113,8 @@ There are currently no smoke test integration tests for this component.
|
||||
|
||||
### Server-Storage-Engine-Integration
|
||||
|
||||
The unit and integration tests for the server-storage-engine-integration component can be run with the following:
|
||||
The unit and integration tests for the server-storage-engine-integration component can be run with
|
||||
the following:
|
||||
|
||||
```
|
||||
bazel test --test_output=summary --test_tag_filters=-intermediate_debug,server-storage-engine-integration //...
|
||||
|
||||
@ -10,7 +10,8 @@ mongodb_repo_root$ source python3-venv/bin/activate
|
||||
(python3-venv) mongodb_repo_root$ python buildscripts/resmoke.py run --suites resmoke_end2end_tests
|
||||
```
|
||||
|
||||
- For finer-grained control, tests can also be run by invoking Python's unittest main by hand, e.g.:
|
||||
- For finer-grained control, tests can also be run by invoking Python's unittest main by hand,
|
||||
e.g.:
|
||||
|
||||
```
|
||||
(python3-venv) mongodb_repo_root$ python -m unittest -v buildscripts.tests.resmoke_end2end.test_resmoke.TestTestSelection.test_at_sign_as_replay_file
|
||||
|
||||
@ -4,24 +4,26 @@
|
||||
|
||||
Antithesis is a third party vendor with an environment that can perform network fuzzing. We can
|
||||
upload images containing `docker-compose.yml` files, which represent various MongoDB topologies, to
|
||||
the Antithesis Docker registry. Antithesis runs `docker-compose up` from these images to spin up
|
||||
the corresponding multi-container application in their environment and run a test suite. Network
|
||||
fuzzing is performed on the topology while the test suite runs & a report is generated by
|
||||
Antithesis identifying bugs. Check out
|
||||
https://github.com/mongodb/mongo/wiki/Testing-MongoDB-with-Antithesis to see an example of how we
|
||||
use Antithesis today.
|
||||
the Antithesis Docker registry. Antithesis runs `docker-compose up` from these images to spin up the
|
||||
corresponding multi-container application in their environment and run a test suite. Network fuzzing
|
||||
is performed on the topology while the test suite runs & a report is generated by Antithesis
|
||||
identifying bugs. Check out https://github.com/mongodb/mongo/wiki/Testing-MongoDB-with-Antithesis to
|
||||
see an example of how we use Antithesis today.
|
||||
|
||||
## Base Images
|
||||
|
||||
The `base_images` directory consists of the building blocks for creating a MongoDB test topology.
|
||||
These images are uploaded to the Antithesis Docker registry [nightly](https://github.com/mongodb/mongo/blob/6cf8b162a61173eb372b54213def6dd61e1fd684/etc/evergreen_yml_components/variants/ubuntu/test_dev_master_and_lts_branches_only.yml#L28) during the
|
||||
[`antithesis image build and push`](https://github.com/mongodb/mongo/blob/020632e3ae328f276b2c251417b5a39389af6141/etc/evergreen_yml_components/definitions.yml#L2823) function.
|
||||
These images are uploaded to the Antithesis Docker registry
|
||||
[nightly](https://github.com/mongodb/mongo/blob/6cf8b162a61173eb372b54213def6dd61e1fd684/etc/evergreen_yml_components/variants/ubuntu/test_dev_master_and_lts_branches_only.yml#L28)
|
||||
during the
|
||||
[`antithesis image build and push`](https://github.com/mongodb/mongo/blob/020632e3ae328f276b2c251417b5a39389af6141/etc/evergreen_yml_components/definitions.yml#L2823)
|
||||
function.
|
||||
|
||||
### mongo_binaries
|
||||
|
||||
This image contains the latest `mongo`, `mongos` and `mongod` binaries. It can be used to
|
||||
start a `mongod` instance, `mongos` instance or execute `mongo` commands. This is the main building
|
||||
block for creating the System Under Test topology.
|
||||
This image contains the latest `mongo`, `mongos` and `mongod` binaries. It can be used to start a
|
||||
`mongod` instance, `mongos` instance or execute `mongo` commands. This is the main building block
|
||||
for creating the System Under Test topology.
|
||||
|
||||
### workload
|
||||
|
||||
@ -36,16 +38,16 @@ buildscript/resmoke.py run --suite antithesis_concurrency_sharded_with_stepdowns
|
||||
|
||||
**Every topology must have 1 workload container.**
|
||||
|
||||
Note: During `workload` image build, `evergreen/antithesis_image_build_and_push.sh` runs, which generates
|
||||
"antithesis compatible" test suites and prepends them with `antithesis_`. These are the test suites
|
||||
that can run in antithesis and are available from within the `workload` container.
|
||||
Note: During `workload` image build, `evergreen/antithesis_image_build_and_push.sh` runs, which
|
||||
generates "antithesis compatible" test suites and prepends them with `antithesis_`. These are the
|
||||
test suites that can run in antithesis and are available from within the `workload` container.
|
||||
|
||||
### Dockerfile
|
||||
|
||||
This assembles an image with the necessary files for spinning up the corresponding topology. It
|
||||
consists of a `docker-compose.yml`, a `logs` directory, a `scripts` directory and a `data`
|
||||
directory. If this is structured properly, you should be able to copy the files & directories
|
||||
from this image and run `docker-compose up` to set up the desired topology.
|
||||
directory. If this is structured properly, you should be able to copy the files & directories from
|
||||
this image and run `docker-compose up` to set up the desired topology.
|
||||
|
||||
Example from what `buildscripts/resmokelib/testing/docker_cluster_image_builder.py` generates:
|
||||
|
||||
@ -67,8 +69,8 @@ therefore use `FROM scratch`.
|
||||
|
||||
### docker-compose.yml
|
||||
|
||||
This describes how to construct the corresponding topology using the
|
||||
`mongo-binaries` and `workload` images.
|
||||
This describes how to construct the corresponding topology using the `mongo-binaries` and `workload`
|
||||
images.
|
||||
|
||||
Example from `buildscripts/antithesis/topologies/sharded_cluster/docker-compose.yml`:
|
||||
|
||||
@ -162,15 +164,15 @@ networks:
|
||||
|
||||
Each container must have a `command` in `docker-compose.yml` that runs an init script. The init
|
||||
script belongs in the `scripts` directory, which is included as a volume. The `command` should be
|
||||
set like so: `/bin/bash /scripts/[script_name].sh` or `python3 /scripts/[script_name].py`. This is
|
||||
a requirement for the topology to start up properly in Antithesis.
|
||||
set like so: `/bin/bash /scripts/[script_name].sh` or `python3 /scripts/[script_name].py`. This is a
|
||||
requirement for the topology to start up properly in Antithesis.
|
||||
|
||||
When creating `mongod` or `mongos` instances, route the logs like so:
|
||||
`--logpath /var/log/mongodb/mongodb.log` and utilize `volumes` -- as in `database1`.
|
||||
This enables us to easily retrieve logs if a bug is detected by Antithesis.
|
||||
`--logpath /var/log/mongodb/mongodb.log` and utilize `volumes` -- as in `database1`. This enables us
|
||||
to easily retrieve logs if a bug is detected by Antithesis.
|
||||
|
||||
The `ipv4_address` should be set to `10.20.20.130` or higher if you do not want that container to
|
||||
be affected by network fuzzing. For instance, you would likely not want the `workload` container
|
||||
The `ipv4_address` should be set to `10.20.20.130` or higher if you do not want that container to be
|
||||
affected by network fuzzing. For instance, you would likely not want the `workload` container
|
||||
to be affected by network fuzzing -- as shown in the example above.
|
||||
|
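A quick way to sanity-check an address against this rule is sketched below. The function name is hypothetical, and the sketch only encodes the "`10.20.20.130` or higher" threshold stated above; it does not know the subnet's upper bound.

```python
import ipaddress

# Threshold taken from the text above: addresses at or above this one are
# not affected by network fuzzing.
FUZZ_EXEMPT_START = ipaddress.IPv4Address("10.20.20.130")


def is_exempt_from_fuzzing(address):
    """Return True if the container address is at or above the exempt threshold."""
    return ipaddress.IPv4Address(address) >= FUZZ_EXEMPT_START
```

For example, a `workload` container at `10.20.20.130` would be exempt, while a database container at `10.20.20.3` would be fuzzed.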
||||
Use the `evergreen-latest-master` tag for all images. This is updated automatically in
|
||||
@ -182,20 +184,26 @@ Take a look at `buildscripts/antithesis/topologies/sharded_cluster/scripts/mongo
|
||||
how to use util methods from `buildscripts/antithesis/topologies/sharded_cluster/scripts/utils.py`
|
||||
to set up the desired topology. You can also use simple shell scripts as in the case of
|
||||
`buildscripts/antithesis/topologies/sharded_cluster/scripts/database_init.py`. These init scripts
|
||||
must not end in order to keep the underlying container alive. You can use an infinite while
|
||||
loop for `python` scripts or you can use `tail -f /dev/null` for shell scripts.
|
||||
must not end in order to keep the underlying container alive. You can use an infinite while loop for
|
||||
`python` scripts or you can use `tail -f /dev/null` for shell scripts.
|
||||
|
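The "infinite while loop" for a Python init script might look like the sketch below. The `keep_alive` name and its parameters are assumptions for illustration; `max_iterations` exists only so the loop can be exercised in a test, and a real init script would leave it as `None`.

```python
import time


def keep_alive(poll_seconds=60.0, max_iterations=None, sleep=time.sleep):
    """Sleep in a loop so the init script, and therefore the container, never exits.

    With max_iterations=None the loop runs forever.
    """
    iterations = 0
    while max_iterations is None or iterations < max_iterations:
        sleep(poll_seconds)
        iterations += 1
    return iterations
```

An init script would perform its topology setup first and then call `keep_alive()` as its last statement.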
||||
## How do I create a new topology for Antithesis testing?
|
||||
|
||||
This should be done with care to ensure we are using our limited resources efficiently.
|
||||
|
||||
Create a new task extending the `antithesis_task_template`, tagged with `antithesis`, passing the specified `suite` to the `antithesis image build and push` task. See other examples to get started.
|
||||
Create a new task extending the `antithesis_task_template`, tagged with `antithesis`, passing the
|
||||
specified `suite` to the `antithesis image build and push` task. See other examples to get started.
|
||||
|
||||
## How do I test my suite in antithesis?
|
||||
|
||||
If you provide the evergreen parameter `schedule_antithesis_tests` to your evergreen patch, once we build the antithesis images in your evergreen patch we send antithesis an api request to run your newly created images for an hour. You will get emailed the report when it finishes running in antithesis.
|
||||
If you provide the evergreen parameter `schedule_antithesis_tests` to your evergreen patch, once we
|
||||
build the antithesis images in your evergreen patch we send antithesis an api request to run your
|
||||
newly created images for an hour. You will get emailed the report when it finishes running in
|
||||
antithesis.
|
||||
|
||||
Important Note: This will happen for every antithesis task you schedule in your patch. Please do not schedule more than 1 or 2 tasks with this parameter at a time or it will use up a lot of our testing time allocated with antithesis.
|
||||
Important Note: This will happen for every antithesis task you schedule in your patch. Please do not
|
||||
schedule more than 1 or 2 tasks with this parameter at a time or it will use up a lot of our testing
|
||||
time allocated with antithesis.
|
||||
|
||||
`evergreen patch --param schedule_antithesis_tests=true`
|
||||
|
||||
@ -203,10 +211,10 @@ Important Note: This will happen for every antithesis task you schedule in your
|
||||
|
||||
### Normal resmoke testing
|
||||
|
||||
Antithesis constantly runs your resmoke suite with one random test from the suite at a time.
|
||||
We support this out-of-the-box with most resmoke suites that use python fixtures.
|
||||
This is very similar to how tests run in evergreen.
|
||||
Your antithesis tasks in evergreen will default to this if the `antithesis_test_composer_dir` var is not specified on the task.
|
||||
Antithesis constantly runs your resmoke suite with one random test from the suite at a time. We
|
||||
support this out-of-the-box with most resmoke suites that use python fixtures. This is very similar
|
||||
to how tests run in evergreen. Your antithesis tasks in evergreen will default to this if the
|
||||
`antithesis_test_composer_dir` var is not specified on the task.
|
||||
|
||||
### Test Composer
|
||||
|
||||
@ -222,4 +230,5 @@ Evergreen configuration details, see
|
||||
|
||||
## Additional Resources
|
||||
|
||||
If you are interested in leveraging Antithesis feel free to reach out to #ask-devprod-correctness or #server-testing on Slack.
|
||||
If you are interested in leveraging Antithesis feel free to reach out to #ask-devprod-correctness or
|
||||
#server-testing on Slack.
|
||||
|
||||
109
docs/baton.md
@ -1,11 +1,10 @@
|
||||
# Server-Internal Baton Pattern
|
||||
|
||||
Batons are lightweight job queues in _mongod_ and _mongos_ processes that allow
|
||||
recording the intent to execute a task (e.g., polling on a network socket) and
|
||||
deferring its execution to a later time. Batons, often by reusing `Client`
|
||||
threads and through the _Waitable_ interface, move the execution of scheduled
|
||||
tasks out of the line, potentially hiding the execution cost from the critical
|
||||
path. A total of four baton classes are available today:
|
||||
Batons are lightweight job queues in _mongod_ and _mongos_ processes that allow recording the intent
|
||||
to execute a task (e.g., polling on a network socket) and deferring its execution to a later time.
|
||||
Batons, often by reusing `Client` threads and through the _Waitable_ interface, move the execution
|
||||
of scheduled tasks out of the line, potentially hiding the execution cost from the critical path. A
|
||||
total of four baton classes are available today:
|
||||
|
||||
- [Baton][baton]
|
||||
- [DefaultBaton][defaultBaton]
|
||||
@ -14,72 +13,74 @@ path. A total of four baton classes are available today:
|
||||
|
||||
## Baton Basics
|
||||
|
||||
All baton implementations extend _Baton_. They are tightly associated with an
|
||||
`OperationContext` and its `Client` thread. An `OperationContext` that belongs
|
||||
to a `ServiceContext` with a `TransportLayer` uses an `AsioNetworkingBaton`,
|
||||
else a `DefaultBaton`. The baton is accessed through the `OperationContext` with
|
||||
a call to `OperationContext::getBaton()`.
|
||||
All baton implementations extend _Baton_. They are tightly associated with an `OperationContext` and
|
||||
its `Client` thread. An `OperationContext` that belongs to a `ServiceContext` with a
|
||||
`TransportLayer` uses an `AsioNetworkingBaton`, else a `DefaultBaton`. The baton is accessed through
|
||||
the `OperationContext` with a call to `OperationContext::getBaton()`.
|
||||
|
||||
Each baton implementation exposes an interface to allow scheduling tasks on the
|
||||
baton, to demand the awakening of the baton on client socket disconnect, and to
|
||||
create a _SubBaton_. A _SubBaton_, for any of the baton types, is essentially a
|
||||
handle to a local object that proxies scheduling requests to its underlying baton
|
||||
until it is detached (e.g., through destruction of its handle).
|
||||
Each baton implementation exposes an interface to allow scheduling tasks on the baton, to demand the
|
||||
awakening of the baton on client socket disconnect, and to create a _SubBaton_. A _SubBaton_, for
|
||||
any of the baton types, is essentially a handle to a local object that proxies scheduling requests
|
||||
to its underlying baton until it is detached (e.g., through destruction of its handle).
|
||||
|
||||
Additionally, a _NetworkingBaton_ enables consumers of a transport layer to
|
||||
execute I/O themselves, rather than delegating it to other threads. They are
|
||||
special batons that are able to poll network sockets, which is not feasible
|
||||
through other baton types. This is essential for minimizing context switches and
|
||||
improving the readability of stack traces.
|
||||
Additionally, a _NetworkingBaton_ enables consumers of a transport layer to execute I/O themselves,
|
||||
rather than delegating it to other threads. They are special batons that are able to poll network
|
||||
sockets, which is not feasible through other baton types. This is essential for minimizing context
|
||||
switches and improving the readability of stack traces.
|
||||
|
||||
A baton runs automatically when blocking on its associated `OperationContext`
|
||||
with a call to `OperationContext::waitForConditionOrInterrupt()`. Many different
|
||||
apis that take in or use an _Interruptible_ will eventually call into this method
|
||||
(e.g. `Future::get(...)`, `OperationContext::sleepUntil(...)`, etc.).
|
||||
A baton runs automatically when blocking on its associated `OperationContext` with a call to
|
||||
`OperationContext::waitForConditionOrInterrupt()`. Many different apis that take in or use an
|
||||
_Interruptible_ will eventually call into this method (e.g. `Future::get(...)`,
|
||||
`OperationContext::sleepUntil(...)`, etc.).
|
||||
|
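To make the idea concrete, here is a toy Python model of the pattern: a thread that would otherwise sit idle while waiting for a condition instead runs tasks that were scheduled on its queue. This is only a conceptual sketch; `ToyBaton`, `schedule`, and `run_until` are invented for the example and do not mirror the C++ `Baton` or `OperationContext` interfaces.

```python
import collections
import threading
import time


class ToyBaton:
    """Toy job queue drained by the thread that is waiting on a condition."""

    def __init__(self):
        self._cv = threading.Condition()
        self._tasks = collections.deque()

    def schedule(self, task):
        """Record the intent to run task later; may be called from any thread."""
        with self._cv:
            self._tasks.append(task)
            self._cv.notify()

    def run_until(self, done, timeout=None):
        """Run scheduled tasks on the calling thread until done() holds.

        Returns True once done() is true and the queue is empty, or False
        if the timeout expires first while the queue is empty.
        """
        deadline = None if timeout is None else time.monotonic() + timeout
        while True:
            with self._cv:
                while not self._tasks and not done():
                    remaining = None if deadline is None else deadline - time.monotonic()
                    if remaining is not None and remaining <= 0:
                        return False
                    self._cv.wait(remaining)
                if not self._tasks:
                    return True
                task = self._tasks.popleft()
            # Run the task outside the lock so it can schedule more work.
            task()
```

In this model, the waiting thread plays the role of the `Client` thread, and `run_until` stands in for blocking on the operation while the baton runs.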
||||
### DefaultBaton
|
||||
|
||||
DefaultBaton is the most basic baton implementation. This baton provides the
|
||||
platform to execute tasks while a client thread awaits an event or a timeout,
|
||||
essentially paving the way towards utilizing idle cycles of client threads for
|
||||
useful work. Tasks can be scheduled on this baton through its associated
|
||||
`OperationContext` and using `OperationContext::getBaton()::schedule(...)`.
|
||||
DefaultBaton is the most basic baton implementation. This baton provides the platform to execute
|
||||
tasks while a client thread awaits an event or a timeout, essentially paving the way towards
|
||||
utilizing idle cycles of client threads for useful work. Tasks can be scheduled on this baton
|
||||
through its associated `OperationContext` and using `OperationContext::getBaton()::schedule(...)`.
|
||||
|
||||
Note that because _Baton_ extends an _OutOfLineExecutor_, it can be used as the
|
||||
executor to run work on an `ExecutorFuture`.
|
||||
Note that because _Baton_ extends an _OutOfLineExecutor_, it can be used as the executor to run work
|
||||
on an `ExecutorFuture`.
|
||||
|
||||
### AsioNetworkingBaton
|
||||
|
||||
The AsioNetworkingBaton can schedule and run tasks similarly to the _DefaultBaton_,
|
||||
but it also implements the _NetworkingBaton_ interface to provide a networking
|
||||
reactor. It can register sessions to monitor and will utilize `poll(2)` and
|
||||
`eventfd(2)` to wait until I/O can be performed on the socket or until interrupted.
|
||||
The AsioNetworkingBaton can schedule and run tasks similarly to the _DefaultBaton_, but it also
|
||||
implements the _NetworkingBaton_ interface to provide a networking reactor. It can register sessions
|
||||
to monitor and will utilize `poll(2)` and `eventfd(2)` to wait until I/O can be performed on the
|
||||
socket or until interrupted.
|
||||
|
||||
This baton is primarily used for egress networking where it gets scheduled to send
|
||||
off a command after a connection is made (see the relevant code [here][asioNetworkingBatonScheduling]).
|
||||
This means that the AsioNetworkingBaton will normally perform socket I/O without
|
||||
needing to poll. It only registers a session for polling if another read or
|
||||
write is needed on the socket (e.g. [registering a session during socket read][asioNetworkingBatonPollingSetup]).
|
||||
This baton is primarily used for egress networking where it gets scheduled to send off a command
|
||||
after a connection is made (see the relevant code [here][asioNetworkingBatonScheduling]). This means
|
||||
that the AsioNetworkingBaton will normally perform socket I/O without needing to poll. It only
|
||||
registers a session for polling if another read or write is needed on the socket (e.g. [registering
|
||||
a session during socket read][asioNetworkingBatonPollingSetup]).
|
||||
|
||||
In order for an egress session to use the baton, it must be specified as an
|
||||
argument to `TaskExecutor::scheduleRemoteCommand(...)`.
|
||||
In order for an egress session to use the baton, it must be specified as an argument to
|
||||
`TaskExecutor::scheduleRemoteCommand(...)`.
|
||||
|
||||
Note that this baton is only available for Linux.
|
||||
|
||||
## Example
|
||||
|
||||
For an example of scheduling a task on the `OperationContext` baton, see
|
||||
[here][example].
|
||||
For an example of scheduling a task on the `OperationContext` baton, see [here][example].
|
||||
|
||||
## Considerations
|
||||
|
||||
Since any task scheduled on a baton is intended for out-of-line execution, it
|
||||
must be non-blocking and preferably short-lived to ensure forward progress.
|
||||
Since any task scheduled on a baton is intended for out-of-line execution, it must be non-blocking
|
||||
and preferably short-lived to ensure forward progress.
|
||||
|
||||
[baton]: https://github.com/mongodb/mongo/blob/5906d967c3144d09fab6a4cc1daddb295df19ffb/src/mongo/db/baton.h#L61-L178
|
||||
[defaultBaton]: https://github.com/mongodb/mongo/blob/9cfe13115e92a43d1b9273ee1d5817d548264ba7/src/mongo/db/default_baton.h#L46-L75
|
||||
[networkingBaton]: https://github.com/mongodb/mongo/blob/9cfe13115e92a43d1b9273ee1d5817d548264ba7/src/mongo/transport/baton.h#L61-L96
|
||||
[asioNetworkingBaton]: https://github.com/mongodb/mongo/blob/9cfe13115e92a43d1b9273ee1d5817d548264ba7/src/mongo/transport/baton_asio_linux.h#L60-L529
|
||||
[asioNetworkingBatonScheduling]: https://github.com/mongodb/mongo/blob/46b8c49b4e13cc4c8389b2822f9e30dd73b81d6e/src/mongo/executor/network_interface_tl.cpp#L910
|
||||
[asioNetworkingBatonPollingSetup]: https://github.com/mongodb/mongo/blob/eab4ec41cc2b28bf0a38eb813f9690e1bfa6c9a6/src/mongo/transport/asio/asio_session_impl.cpp#L666-L696
|
||||
[example]: https://github.com/mongodb/mongo/blob/262e5a961fa7221bfba5722aeea2db719f2149f5/src/mongo/s/multi_statement_transaction_requests_sender.cpp#L91-L99
|
||||
[baton]:
|
||||
https://github.com/mongodb/mongo/blob/5906d967c3144d09fab6a4cc1daddb295df19ffb/src/mongo/db/baton.h#L61-L178
|
||||
[defaultBaton]:
|
||||
https://github.com/mongodb/mongo/blob/9cfe13115e92a43d1b9273ee1d5817d548264ba7/src/mongo/db/default_baton.h#L46-L75
|
||||
[networkingBaton]:
|
||||
https://github.com/mongodb/mongo/blob/9cfe13115e92a43d1b9273ee1d5817d548264ba7/src/mongo/transport/baton.h#L61-L96
|
||||
[asioNetworkingBaton]:
|
||||
https://github.com/mongodb/mongo/blob/9cfe13115e92a43d1b9273ee1d5817d548264ba7/src/mongo/transport/baton_asio_linux.h#L60-L529
|
||||
[asioNetworkingBatonScheduling]:
|
||||
https://github.com/mongodb/mongo/blob/46b8c49b4e13cc4c8389b2822f9e30dd73b81d6e/src/mongo/executor/network_interface_tl.cpp#L910
|
||||
[asioNetworkingBatonPollingSetup]:
|
||||
https://github.com/mongodb/mongo/blob/eab4ec41cc2b28bf0a38eb813f9690e1bfa6c9a6/src/mongo/transport/asio/asio_session_impl.cpp#L666-L696
|
||||
[example]:
|
||||
https://github.com/mongodb/mongo/blob/262e5a961fa7221bfba5722aeea2db719f2149f5/src/mongo/s/multi_statement_transaction_requests_sender.cpp#L91-L99
|
||||
|
||||
@ -1,6 +1,7 @@
|
||||
# Branching
|
||||
|
||||
This document describes the branching tasks, i.e. file updates in the `10gen/mongo` repository, that should be done on a new branch immediately after a branch cut.
|
||||
This document describes the branching tasks, i.e. file updates in the `10gen/mongo` repository, that
|
||||
should be done on a new branch immediately after a branch cut.
|
||||
|
||||
## Table of contents
|
||||
|
||||
@ -14,11 +15,14 @@ This document describes branching task regarding file updates in `10gen/mongo` r
|
||||
|
||||
### GitHub App credentials
|
||||
|
||||
Add GitHub app credentials (app id and key) in the new project settings, eg. https://spruce.corp.mongodb.com/project/mongodb-mongo-v8.3/settings/github-app-settings (additional MANA permissions may be required, else coordinate with Release team contacts).
|
||||
Add GitHub app credentials (app id and key) in the new project settings, eg.
|
||||
https://spruce.corp.mongodb.com/project/mongodb-mongo-v8.3/settings/github-app-settings (additional
|
||||
MANA permissions may be required, else coordinate with Release team contacts).
|
||||
|
||||
## 2. Create working branch
|
||||
|
||||
To save time during the branch cut these branching changes could be done beforehand, but not too early to avoid extra file conflicts, and then rebased on a new `vX.Y` branch.
|
||||
To save time during the branch cut these branching changes could be done beforehand, but not too
|
||||
early to avoid extra file conflicts, and then rebased on a new `vX.Y` branch.
|
||||
|
||||
Create a working branch from `master` or from a new `vX.Y` branch if it already exists:
|
||||
|
||||
@ -30,13 +34,16 @@ git checkout -b vX.Y-branching-task
|
||||
|
||||
## 2. Update files
|
||||
|
||||
**IMPORTANT!** Each of these changes should be a separate commit, but they should all be pushed together in the same commit-queue task.
|
||||
**IMPORTANT!** Each of these changes should be a separate commit, but they should all be pushed
|
||||
together in the same commit-queue task.
|
||||
|
||||
They are kept as separate commits so that a single aspect of this task can be reverted if needed.
|
||||
They are kept as separate commits so that a single aspect of this task can be reverted if
|
||||
needed.
|
||||
|
||||
> See [8.2 branching PR](https://github.com/mongodb/mongo/pull/38920/commits) for reference.
|
||||
|
||||
Some of these changes have automated steps you can run, but please double-check their edits. Initialize the version here; it is used throughout:
|
||||
Some of these changes have automated steps you can run, but please double-check their edits.
|
||||
Initialize the version here; it is used throughout:
|
||||
|
||||
```sh
|
||||
VERSION=8.3
|
||||
@ -51,7 +58,9 @@ sed -i "s/master/v$VERSION/g" copy.bara.sky
|
||||
sed -i 's/branch = "master"/branch = "v'"$VERSION"'"/' buildscripts/sync_repo_with_copybara.py
|
||||
```
|
||||
|
||||
For each file [`copy.bara.sky`](../../copy.bara.sky) and [`sync_repo_with_copybara.py`](../../buildscripts/sync_repo_with_copybara.py), the "master" branch references should be replaced with the new branch name.
|
||||
For each file [`copy.bara.sky`](../../copy.bara.sky) and
|
||||
[`sync_repo_with_copybara.py`](../../buildscripts/sync_repo_with_copybara.py), the "master" branch
|
||||
references should be replaced with the new branch name.
|
||||
|
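When double-checking the automated edit, it can help to preview the substitution before touching the file. The sketch below mirrors the `sed` expression for `sync_repo_with_copybara.py` (first match on each line, since that `sed` command has no `g` flag); the helper names `retarget_branch` and `changed_lines` are hypothetical.

```python
def retarget_branch(text, version):
    """Replace the first 'branch = "master"' on each line with the new branch name."""
    old, new = 'branch = "master"', f'branch = "v{version}"'
    return "".join(line.replace(old, new, 1) for line in text.splitlines(keepends=True))


def changed_lines(before, after):
    """Return (line_number, old_line, new_line) for every line that differs."""
    pairs = zip(before.splitlines(), after.splitlines())
    return [(n, old, new) for n, (old, new) in enumerate(pairs, start=1) if old != new]
```

Reading the file, calling `retarget_branch(text, "8.3")`, and printing `changed_lines(text, result)` shows exactly which lines the replacement would touch.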
||||
### Evergreen YAML configurations
|
||||
|
||||
@ -63,16 +72,23 @@ Run the following automation and verify results:
|
||||
sed -i "s/suffix\"] = \"latest\"/suffix\"] = \"v$VERSION-latest\"/g" buildscripts/generate_version_expansions.py
|
||||
```
|
||||
|
||||
In the file [`buildscripts/generate_version_expansions.py`](../../buildscripts/generate_version_expansions.py), the "latest" suffixes should be replaced with the new branch name.
|
||||
In the file
|
||||
[`buildscripts/generate_version_expansions.py`](../../buildscripts/generate_version_expansions.py),
|
||||
the "latest" suffixes should be replaced with the new branch name.
|
||||
|
||||
#### 2. Nightly YAML
|
||||
|
||||
[`etc/evergreen_nightly.yml`](../../etc/evergreen_nightly.yml) will be used as YAML configuration in the new `mongodb-mongo-vX.Y` evergreen project.
|
||||
[`etc/evergreen_nightly.yml`](../../etc/evergreen_nightly.yml) will be used as YAML configuration in
|
||||
the new `mongodb-mongo-vX.Y` evergreen project.
|
||||
|
||||
This will move some build variants from `etc/evergreen.yml` to continue running on a new branch project. More information about build variants after branching is [here](../evergreen-testing/yaml_configuration/buildvariants.md#build-variants-after-branching).
|
||||
This will move some build variants from `etc/evergreen.yml` to continue running on a new branch
|
||||
project. More information about build variants after branching is
|
||||
[here](../evergreen-testing/yaml_configuration/buildvariants.md#build-variants-after-branching).
|
||||
|
||||
- Copy over commit-queue aliases and patch aliases from [`etc/evergreen.yml`](../../etc/evergreen.yml)
|
||||
- Update "include" section: comment out or uncomment file includes as instructions in the comments suggest.
|
||||
- Copy over commit-queue aliases and patch aliases from
|
||||
[`etc/evergreen.yml`](../../etc/evergreen.yml)
|
||||
- Update "include" section: comment out or uncomment file includes as instructions in the comments
|
||||
suggest.
|
||||
|
||||
#### 3. Burn-in tasks
|
||||
|
||||
@ -82,7 +98,12 @@ Run the following automation and verify results:
|
||||
sed -i '/burn_in_tag_include_build_variants/{N;N;N;d;}' etc/evergreen_yml_components/variants/misc/misc.yml
|
||||
```
|
||||
|
||||
In the file [`etc/evergreen_yml_components/variants/misc/misc.yml`](../../etc/evergreen_yml_components/variants/misc/misc.yml), build variant names in the ["burn_in_tag_include_build_variants" expansion](https://github.com/mongodb/mongo/blob/0a68308f0d39a928ed551f285ba72ca560c38576/etc/evergreen_yml_components/variants/misc/misc.yml#L21) that are _not_ included in [`etc/evergreen_nightly.yml`](../../etc/evergreen_nightly.yml) are _removed_.
|
||||
In the file
|
||||
[`etc/evergreen_yml_components/variants/misc/misc.yml`](../../etc/evergreen_yml_components/variants/misc/misc.yml),
|
||||
build variant names in the
|
||||
["burn_in_tag_include_build_variants" expansion](https://github.com/mongodb/mongo/blob/0a68308f0d39a928ed551f285ba72ca560c38576/etc/evergreen_yml_components/variants/misc/misc.yml#L21)
|
||||
that are _not_ included in [`etc/evergreen_nightly.yml`](../../etc/evergreen_nightly.yml) are
|
||||
_removed_.
|
||||
|
||||
#### 4. Suggested to Required
|
||||
|
||||
@ -94,7 +115,9 @@ sed -i 's@display_name: "\* Amazon Linux 2023 arm64 Enterprise"@display_name: "!
|
||||
sed -i 's/tags: \["suggested", "forbid_tasks_tagged_with_experimental"\]/tags: ["required", "forbid_tasks_tagged_with_experimental"]/g' etc/evergreen_yml_components/variants/amazon/test_dev.yml
|
||||
```
|
||||
|
||||
For the variant `enterprise-amazon-linux2023-arm64` in [`etc/evergreen_yml_components/variants/amazon/test_dev.yml`](../../etc/evergreen_yml_components/variants/amazon/test_dev.yml), replace:
|
||||
For the variant `enterprise-amazon-linux2023-arm64` in
|
||||
[`etc/evergreen_yml_components/variants/amazon/test_dev.yml`](../../etc/evergreen_yml_components/variants/amazon/test_dev.yml),
|
||||
replace:
|
||||
|
||||
- "\*" with "!" in their display names
|
||||
- "suggested" variant tag with "required"
|
||||
@ -116,10 +139,12 @@ sed -i 's/!.incompatible_all_feature_flags/!.requires_all_feature_flags/g' $FILE
|
||||
|
||||
For the build variant names:
|
||||
|
||||
- in [`etc/evergreen_yml_components/variants/windows/test_dev.yml`](../../etc/evergreen_yml_components/variants/windows/test_dev.yml):
|
||||
- in
|
||||
[`etc/evergreen_yml_components/variants/windows/test_dev.yml`](../../etc/evergreen_yml_components/variants/windows/test_dev.yml):
|
||||
- `enterprise-windows-all-feature-flags-required`
|
||||
- `enterprise-windows-all-feature-flags-non-essential`
|
||||
- in [`etc/evergreen_yml_components/variants/sanitizer/test_dev.yml`](../../etc/evergreen_yml_components/variants/sanitizer/test_dev.yml):
|
||||
- in
|
||||
[`etc/evergreen_yml_components/variants/sanitizer/test_dev.yml`](../../etc/evergreen_yml_components/variants/sanitizer/test_dev.yml):
|
||||
|
||||
- `linux-debug-aubsan-lite-all-feature-flags-required`
|
||||
|
||||
@ -130,9 +155,12 @@ For the build variant names:
|
||||
|
||||
#### 6. Sys-perf YAML
|
||||
|
||||
[`etc/system_perf.yml`](../../etc/system_perf.yml) will be used as YAML configuration for a new `sys-perf-X.Y` evergreen project
|
||||
[`etc/system_perf.yml`](../../etc/system_perf.yml) will be used as YAML configuration for a new
|
||||
`sys-perf-X.Y` evergreen project
|
||||
|
||||
> Ensure that [DSI](https://github.com/10gen/dsi/blob/master/evergreen/system_perf/README.md#branching) has been updated with new branches
|
||||
> Ensure that
|
||||
> [DSI](https://github.com/10gen/dsi/blob/master/evergreen/system_perf/README.md#branching) has been
|
||||
> updated with new branches
|
||||
|
||||
Run the following automation and verify results:
|
||||
|
||||
@ -146,8 +174,13 @@ sed -i "s@evergreen/system_perf/master/variants.yml@evergreen/system_perf/$VERSI
|
||||
In the file [`etc/system_perf.yml`](../../etc/system_perf.yml), the following should be reflected:
|
||||
|
||||
- Remove `evergreen/system_perf/master/master_variants.yml` from "include" section
|
||||
- With the exception of `base.yml`, update all other entries that contain `master` in the path to contain `X.Y` in the path instead. (e.g. `evergreen/system_perf/master/variants.yml` should become `evergreen/system_perf/X.Y/variants.yml`).
|
||||
- Update the [evergreen project variable](https://docs.devprod.prod.corp.mongodb.com/evergreen/Project-Configuration/Project-and-Distro-Settings#variables) `compile_project` in the new sys-perf-X.Y evergreen project to point to the new mongodb-mongo-vX.Y branch
|
||||
- With the exception of `base.yml`, update all other entries that contain `master` in the path to
|
||||
contain `X.Y` in the path instead. (e.g. `evergreen/system_perf/master/variants.yml` should become
|
||||
`evergreen/system_perf/X.Y/variants.yml`).
|
||||
- Update the
|
||||
[evergreen project variable](https://docs.devprod.prod.corp.mongodb.com/evergreen/Project-Configuration/Project-and-Distro-Settings#variables)
|
||||
`compile_project` in the new sys-perf-X.Y evergreen project to point to the new mongodb-mongo-vX.Y
|
||||
branch
|
||||
|
||||
#### 7. Evergreen project validation
|
||||
|
||||
@ -157,7 +190,10 @@ Run the following automation and verify results:
|
||||
sed -i 's/RELEASE_BRANCH = False/RELEASE_BRANCH = True/g' buildscripts/validate_evg_project_config.py
|
||||
```
|
||||
|
||||
In file [`buildscripts/validate_evg_project_config.py`](../../buildscripts/validate_evg_project_config.py), the `RELEASE_BRANCH` variable should be set to `True` to leverage a specialized shortcut conditional to `evaluate` the project, not `validate`.
|
||||
In file
|
||||
[`buildscripts/validate_evg_project_config.py`](../../buildscripts/validate_evg_project_config.py),
|
||||
the `RELEASE_BRANCH` variable should be set to `True` to leverage a specialized shortcut conditional
|
||||
to `evaluate` the project, not `validate`.
|
||||
|
||||
#### 8. Coverity
|
||||
|
||||
@ -167,7 +203,8 @@ Run the following automation and verify results:
|
||||
sed -i "s/stream: mongo.master/stream: mongo.v$VERSION/g" etc/coverity.yml
|
||||
```
|
||||
|
||||
In the file [`etc/coverity.yml`](../../etc/coverity.yml), the "stream" should be updated to the new branch.
|
||||
In the file [`etc/coverity.yml`](../../etc/coverity.yml), the "stream" should be updated to the new
|
||||
branch.
|
||||
|
||||
#### Finally: format and lint
|
||||
|
||||
@ -179,7 +216,8 @@ Run linters and formatters and fix anything that couldn't be autofixed.
|
||||
|
||||
## 3. Test changes
|
||||
|
||||
In case working branch was created from `master` branch, rebase it on a new `vX.Y` branch and fix file conflicts if any.
|
||||
In case working branch was created from `master` branch, rebase it on a new `vX.Y` branch and fix
|
||||
file conflicts if any.
|
||||
|
||||
Schedule required patch on a new `mongodb-mongo-vX.Y` project:
|
||||
|
||||
@ -187,7 +225,8 @@ Schedule required patch on a new `mongodb-mongo-vX.Y` project:
|
||||
evergreen patch -p mongodb-mongo-vX.Y -a required
|
||||
```
|
||||
|
||||
If patch results reveal that some steps are missing or outdated in this file, make sure to update the branching documentation on a "master" branch accordingly.
|
||||
If patch results reveal that some steps are missing or outdated in this file, make sure to update
|
||||
the branching documentation on a "master" branch accordingly.
|
||||
|
||||
## 4. Merge changes
|
||||
|
||||
|
||||
@ -1,8 +1,7 @@
|
||||
# Building MongoDB
|
||||
|
||||
Please note that prebuilt binaries are available on
|
||||
[mongodb.org](http://www.mongodb.org/downloads) and may be the easiest
|
||||
way to get started, rather than building from source.
|
||||
Please note that prebuilt binaries are available on [mongodb.org](http://www.mongodb.org/downloads)
|
||||
and may be the easiest way to get started, rather than building from source.
|
||||
|
||||
To build MongoDB, you will need:
|
||||
|
||||
@ -20,13 +19,13 @@ To build MongoDB, you will need:
|
||||
- On Ubuntu, the lzma library is required. Install `liblzma-dev`
|
||||
- On Amazon Linux, the xz-devel library is required. `yum install xz-devel`
|
||||
- Python 3.13
|
||||
- About 13 GB of free disk space for the core binaries (`mongod`,
|
||||
`mongos`, and `mongo`).
|
||||
- About 13 GB of free disk space for the core binaries (`mongod`, `mongos`, and `mongo`).
|
||||
|
||||
If using a newer version of a C++ compiler than listed above, it may work. However the versions listed above have been verified to work.
|
||||
If using a newer version of a C++ compiler than listed above, it may work. However the versions
|
||||
listed above have been verified to work.
|
||||
|
||||
MongoDB supports the following architectures: arm64, ppc64le, s390x,
|
||||
and x86-64. More detailed platform instructions can be found below.
|
||||
MongoDB supports the following architectures: arm64, ppc64le, s390x, and x86-64. More detailed
|
||||
platform instructions can be found below.
|
||||
|
||||
## Quick (re)Start
|
||||
|
||||
@ -45,23 +44,21 @@ If you only want to build the database server `mongod`:
|
||||
|
||||
$ bazel build install-mongod
|
||||
|
||||
**_Note_**: For C++ compilers that are newer than the supported
|
||||
version, the compiler may issue new warnings that cause MongoDB to
|
||||
fail to build since the build system treats compiler warnings as
|
||||
errors. To ignore the warnings, pass the switch
|
||||
`--disable_warnings_as_errors=True` to the bazel command.
|
||||
**_Note_**: For C++ compilers that are newer than the supported version, the compiler may issue new
|
||||
warnings that cause MongoDB to fail to build since the build system treats compiler warnings as
|
||||
errors. To ignore the warnings, pass the switch `--disable_warnings_as_errors=True` to the bazel
|
||||
command.
|
||||
|
||||
$ bazel build install-mongod --disable_warnings_as_errors=True
|
||||
|
||||
If you want to build absolutely everything (`mongod`, `mongo`, unit
|
||||
tests, etc):
|
||||
If you want to build absolutely everything (`mongod`, `mongo`, unit tests, etc):
|
||||
|
||||
$ bazel build --build_tag_filters=mongo_binary //src/mongo/...
|
||||
|
||||
## Bazel Targets
|
||||
|
||||
The following targets can be named on the bazel command line to build and
|
||||
install a subset of components:
|
||||
The following targets can be named on the bazel command line to build and install a subset of
|
||||
components:
|
||||
|
||||
- `install-mongod`
|
||||
- `install-mongos`
|
||||
@ -69,16 +66,15 @@ install a subset of components:
|
||||
- `install-dist` (includes all server components)
|
||||
- `install-devcore` (includes `mongod`, `mongos`, and `jstestshell` (formerly `mongo` shell))
|
||||
|
||||
**_NOTE_**: The `install-core` and `install-dist` targets are _not_
|
||||
guaranteed to be identical. The `install-core` target will only ever include a
|
||||
minimal set of "core" server components, while `install-dist` is intended
|
||||
for a functional end-user installation. If you are testing, you should use the
|
||||
`install-devcore` or `install-dist` targets instead.
|
||||
**_NOTE_**: The `install-core` and `install-dist` targets are _not_ guaranteed to be identical. The
|
||||
`install-core` target will only ever include a minimal set of "core" server components, while
|
||||
`install-dist` is intended for a functional end-user installation. If you are testing, you should
|
||||
use the `install-devcore` or `install-dist` targets instead.
|
||||
|
||||
## Where to find Binaries
|
||||
|
||||
The build system will produce an installation tree into `bazel-bin/install`, as well as
|
||||
individual install target trees like `bazel-bin/<install-target>`.
|
||||
The build system will produce an installation tree into `bazel-bin/install`, as well as individual
|
||||
install target trees like `bazel-bin/<install-target>`.
|
||||
|
||||
## Windows
|
||||
|
||||
@ -97,8 +93,6 @@ To install dependencies on Debian or Ubuntu systems:
|
||||
|
||||
## OS X
|
||||
|
||||
Install Xcode 16.4 or newer. Make sure macOS 15.5 platform
|
||||
is installed.
|
||||
Install Xcode 16.4 or newer. Make sure macOS 15.5 platform is installed.
|
||||
|
||||
Install llvm and lld, version 19 from brew:
|
||||
brew install llvm@19 lld@19
|
||||
Install llvm and lld, version 19 from brew: brew install llvm@19 lld@19
|
||||
|
||||
@ -5,25 +5,23 @@ current version of master, if not explicitly stated otherwise. Implementation de
|
||||
versions may vary slightly.
|
||||
|
||||
Change streams are a convenient way for an application to monitor changes made to the data in a
|
||||
deployment.
|
||||
The events produced by change streams are called "change events". The event data is produced from
|
||||
the oplog(s) of the deployment.
|
||||
The events that are emitted by change streams include
|
||||
deployment. The events produced by change streams are called "change events". The event data is
|
||||
produced from the oplog(s) of the deployment. The events that are emitted by change streams include
|
||||
|
||||
- DML events: emitted for operations that insert, update, replace, or delete individual documents.
|
||||
- DDL events: emitted for operations that create, drop, or modify collections, databases, or views.
|
||||
- Data placement events: emitted for operations that define or modify the placement of data inside
|
||||
a sharded cluster.
|
||||
- Data placement events: emitted for operations that define or modify the placement of data inside a
|
||||
sharded cluster.
|
||||
- Cluster topology events: emitted for operations that add or remove shards in a sharded cluster.
|
||||
|
||||
Which exact event types are emitted by a change stream depends on the change stream configuration
|
||||
and the deployment type.
|
||||
|
||||
Change streams are mainly used by customer applications and tools to keep track of changes to the
|
||||
data in a deployment, in order to relay these updates to external systems.
|
||||
Some of MongoDB's own tools and components are also based on change streams, e.g. _mongosync_ (C2C),
|
||||
Atlas Search, Atlas Stream Processing, and the resharding process.
|
||||
The component that opens a change stream and pulls events from it is called the "consumer".
|
||||
data in a deployment, in order to relay these updates to external systems. Some of MongoDB's own
|
||||
tools and components are also based on change streams, e.g. _mongosync_ (C2C), Atlas Search, Atlas
|
||||
Stream Processing, and the resharding process. The component that opens a change stream and pulls
|
||||
events from it is called the "consumer".
|
||||
|
||||
## Change Stream Guarantees
|
||||
|
||||
@ -31,17 +29,16 @@ Change Streams provide various guarantees:
|
||||
|
||||
- Ordering: change streams deliver events in the order they originally occurred within the target
|
||||
namespace (e.g., collection, database, or entire cluster). The order is based on the sequence in
|
||||
which the operations were applied to the oplog.
|
||||
In a sharded cluster, the events from multiple oplogs will be merged deterministically into a
|
||||
single, ordered stream of change events.
|
||||
which the operations were applied to the oplog. In a sharded cluster, the events from multiple
|
||||
oplogs will be merged deterministically into a single, ordered stream of change events.
|
||||
- Durability and reproducibility: change streams are based on the internal oplog, which is part of
|
||||
the deployment's replication mechanism. Change streams only deliver events after they have been
|
||||
committed to a majority of nodes and durably persisted, ensuring they will not be rolled back.
|
||||
- Exactly-once delivery: every event in a change stream is emitted exactly once, and no event that
|
||||
matches the change stream filter is skipped.
|
||||
- Resumability: change stream consumption can be interrupted due to transient errors (e.g. network
|
||||
issues, node failures, application errors), but it can be resumed from the exact point where
|
||||
the consumption stopped. This is made possible by the resume token (`_id` field) that accompanies
|
||||
issues, node failures, application errors), but it can be resumed from the exact point where the
|
||||
consumption stopped. This is made possible by the resume token (`_id` field) that accompanies
|
||||
every change event, which acts as a bookmark. This allows the consumer to continue processing
|
||||
changes from the last known position without missing events.
|
||||
|
||||
@ -71,9 +68,8 @@ opened against standalone _mongod_ instances, as there is no oplog to generate t
|
||||
standalone mode.
|
||||
|
||||
In replica set deployments, the change stream can be opened directly on any replica set member of
|
||||
the deployment.
|
||||
In sharded cluster deployments, the change stream must be opened against any of the deployment's
|
||||
_mongos_ processes.
|
||||
the deployment. In sharded cluster deployments, the change stream must be opened against any of the
|
||||
deployment's _mongos_ processes.
|
||||
|
||||
A change stream is opened by executing an `aggregate` command with a pipeline that contains at least
|
||||
the `$changeStream` pipeline stage.
|
||||
@ -115,9 +111,8 @@ db.getSiblingDB("testDB").runCommand({
|
||||
```
|
||||
|
||||
The `aggregate` parameter must be set to `1` for database-level change streams, and the command must
|
||||
be executed inside the desired database.
|
||||
The internal namespace that is used by database-level change streams is `<dbName>.$cmd.aggregate`
|
||||
(where `<dbName>` is the actual name of the database).
|
||||
be executed inside the desired database. The internal namespace that is used by database-level
|
||||
change streams is `<dbName>.$cmd.aggregate` (where `<dbName>` is the actual name of the database).
|
||||
|
||||
### Opening an All-Cluster Change Stream
|
||||
|
||||
@ -161,9 +156,8 @@ into smaller fragments, in order to avoid running into `BSONObjectTooLarge` erro
|
||||
### Change Stream Start Time
|
||||
|
||||
When opening a change stream without specifying an explicit point in time, the change stream will be
|
||||
opened using the current time, and will report only change events that happened after that point
|
||||
in time.
|
||||
The current time here is
|
||||
opened using the current time, and will report only change events that happened after that point in
|
||||
time. The current time here is
|
||||
|
||||
- the time of the latest majority-committed operation for replica set change streams, or
|
||||
- the value of the cluster's vector clock for sharded cluster change streams.
|
||||
@ -174,9 +168,8 @@ parameter is specified as a logical timestamp.
|
||||
|
||||
### Resuming Change Streams
|
||||
|
||||
Change streams allow the consumer to resume the change stream after an error occurred.
|
||||
To support resumability, change streams report a "resume token" inside the `_id` field of every
|
||||
emitted event.
|
||||
Change streams allow the consumer to resume the change stream after an error occurred. To support
|
||||
resumability, change streams report a "resume token" inside the `_id` field of every emitted event.
|
||||
To resume a change stream after an error occurred, the resume token of a previously consumed event
|
||||
can be passed in one of the parameters `resumeAfter` or `startAfter` when opening a change stream.
|
||||
|
||||
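As a rough illustration of the resume step described above, the sketch below builds the `aggregate` command that reopens a collection-level change stream from a previously consumed event. The helper name `buildResumeCommand` and the sample token value are hypothetical; only the `resumeAfter` / `startAfter` parameters and the use of the event's `_id` as resume token come from the text.

```javascript
// Hypothetical helper: builds the `aggregate` command that reopens a
// collection-level change stream from the last event the consumer processed.
function buildResumeCommand(collectionName, lastEvent, option = "resumeAfter") {
  if (option !== "resumeAfter" && option !== "startAfter") {
    throw new Error("option must be 'resumeAfter' or 'startAfter'");
  }
  if (!lastEvent || lastEvent._id === undefined) {
    throw new Error("cannot resume without a previously consumed event");
  }
  // The resume token is the `_id` of the event; it is passed back unchanged
  // and treated as an opaque value.
  return {
    aggregate: collectionName,
    pipeline: [{ $changeStream: { [option]: lastEvent._id } }],
    cursor: {},
  };
}

// Placeholder token value, for illustration only.
const lastSeenEvent = { _id: { _data: "<opaque token>" }, operationType: "insert" };
const resumeCommand = buildResumeCommand("orders", lastSeenEvent);
console.log(JSON.stringify(resumeCommand));
```

A consumer would typically persist the `_id` of each event after processing it, so that the most recent token is available when the stream has to be reopened.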
@ -198,8 +191,7 @@ with a different `$match` expression may lead to different events being returned
|
||||
the event with the original resume token not being found in the new change stream.
|
||||
|
||||
The resume tokens that are emitted by change streams are string values that contain a hexadecimal
|
||||
encoding of the internal resume token data.
|
||||
The internal resume token data contains
|
||||
encoding of the internal resume token data. The internal resume token data contains
|
||||
|
||||
- the cluster time of an event.
|
||||
- the version of the resume token format.
|
||||
@ -212,11 +204,13 @@ The internal resume token data contains
|
||||
Resume tokens are versioned. Currently only version 2 is supported.
|
||||
|
||||
Future versions may introduce new resume token versions. Client applications should treat resume
|
||||
tokens as opaque identifiers and should not make any assumptions about the format or internals
|
||||
of resume tokens, nor should they rely on the internal implementation details of resume tokens.
|
||||
tokens as opaque identifiers and should not make any assumptions about the format or internals of
|
||||
resume tokens, nor should they rely on the internal implementation details of resume tokens.
|
||||
|
||||
Resume tokens are serialized and deserialized by the [ResumeToken](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/pipeline/resume_token.h#L148)
|
||||
class. The resume token internal data is stored in [ResumeTokenData](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/pipeline/resume_token.h#L51).
|
||||
Resume tokens are serialized and deserialized by the
|
||||
[ResumeToken](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/pipeline/resume_token.h#L148)
|
||||
class. The resume token internal data is stored in
|
||||
[ResumeTokenData](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/pipeline/resume_token.h#L51).
|
||||
|
||||
#### Resume Token Types
|
||||
|
||||
@ -225,12 +219,12 @@ There are two types of resume tokens:
|
||||
- event resume tokens
|
||||
- high watermark resume tokens
|
||||
|
||||
The former stem from actual change events.
|
||||
High watermark tokens are a special kind of change stream resume token that represent a logical
|
||||
position in the global change stream ordered only by cluster time, not a specific event.
|
||||
The former stem from actual change events. High watermark tokens are a special kind of change stream
|
||||
resume token that represent a logical position in the global change stream ordered only by cluster
|
||||
time, not a specific event.
|
||||
|
||||
High watermark tokens sort strictly before any real event token at the same cluster time.
|
||||
That is, a high‑watermark token for time T sorts ahead of all events whose cluster time >= T.
|
||||
High watermark tokens sort strictly before any real event token at the same cluster time. That is, a
|
||||
high‑watermark token for time T sorts ahead of all events whose cluster time >= T.
|
||||
|
||||
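The ordering rule above can be sketched with a deliberately simplified token model. The two-field token shape (`clusterTime`, `isHighWatermark`) and the comparator name are assumptions made for illustration; the real resume token data contains more fields and is compared by the server, not by client code.

```javascript
// Simplified, hypothetical model of a resume token: only the two properties
// that matter for the ordering rule are represented.
function compareTokens(a, b) {
  if (a.clusterTime !== b.clusterTime) {
    return a.clusterTime < b.clusterTime ? -1 : 1;
  }
  // Same cluster time: a high watermark token sorts before any event token.
  return Number(b.isHighWatermark) - Number(a.isHighWatermark);
}

const tokens = [
  { clusterTime: 10, isHighWatermark: false, label: "event@10" },
  { clusterTime: 11, isHighWatermark: false, label: "event@11" },
  { clusterTime: 10, isHighWatermark: true, label: "hwm@10" },
  { clusterTime: 9, isHighWatermark: false, label: "event@9" },
];

const sortedLabels = [...tokens].sort(compareTokens).map((t) => t.label);
console.log(sortedLabels.join(" < "));
// prints: event@9 < hwm@10 < event@10 < event@11
```

In this model the high watermark token for time 10 lands after the event at time 9 but ahead of every event at time 10 or later.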
#### Decoding Resume Tokens
|
||||
|
||||
@ -267,43 +261,42 @@ by the consumer or the change stream runs into an error. Also, unused cursors ar
|
||||
garbage-collected after a period of inactivity.
|
||||
|
||||
When opening a change stream on a sharded cluster, the targeted `mongos` instance will open the
|
||||
required cursors on the relevant shards of the cluster and also the config server. Here, the `mongos`
|
||||
instance will also automatically open additional cursors in case new shards are added to the
|
||||
cluster. All this is abstracted from the consumer of the change stream. The consumer of the change
|
||||
stream will only see a single cursor and interact with _mongos_, which handles the complexity of
|
||||
managing the underlying shard cursors.
|
||||
required cursors on the relevant shards of the cluster and also the config server. Here, the
|
||||
`mongos` instance will also automatically open additional cursors in case new shards are added to
|
||||
the cluster. All this is abstracted from the consumer of the change stream. The consumer of the
|
||||
change stream will only see a single cursor and interact with _mongos_, which handles the complexity
|
||||
of managing the underlying shard cursors.
|
||||
|
||||
If a change stream cursor can be successfully established, the cursor id is returned to the
|
||||
consumer. The consumer can then use the cursor id to pull change events from the change stream by
|
||||
issuing follow-up `getMore` commands to this cursor.
|
||||
|
||||
If a change stream cursor cannot be successfully opened, the initial `aggregate` command will
|
||||
return an error, and the returned cursor id will be `0`. In this case, no events can be consumed
|
||||
from the change stream, and the consumer needs to resolve the error.
|
||||
If a change stream cursor cannot be successfully opened, the initial `aggregate` command will return
|
||||
an error, and the returned cursor id will be `0`. In this case, no events can be consumed from the
|
||||
change stream, and the consumer needs to resolve the error.
|
||||
|
||||
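A minimal sketch of how a consumer might act on the reply to the initial `aggregate` command, assuming the reply has the usual `ok` and `cursor.id` fields. The helper name `nextGetMore` is hypothetical, and plain JavaScript numbers stand in for the 64-bit cursor ids that a real deployment returns.

```javascript
// Hypothetical helper: inspects the reply to the initial `aggregate` command
// and, if a cursor was established, builds the follow-up `getMore` command.
function nextGetMore(aggregateReply, collectionName, options = {}) {
  const cursorId = aggregateReply.cursor ? aggregateReply.cursor.id : 0;
  if (aggregateReply.ok !== 1 || cursorId === 0) {
    // No cursor was opened, so there is nothing to pull events from.
    throw new Error("change stream not opened: " + (aggregateReply.errmsg || "unknown error"));
  }
  const command = { getMore: cursorId, collection: collectionName };
  if (options.batchSize !== undefined) command.batchSize = options.batchSize;
  if (options.maxTimeMS !== undefined) command.maxTimeMS = options.maxTimeMS;
  return command;
}

// Illustrative reply; the cursor id and namespace are made up.
const openedReply = { ok: 1, cursor: { id: 42, ns: "testDB.orders", firstBatch: [] } };
const getMoreCommand = nextGetMore(openedReply, "orders", { batchSize: 100, maxTimeMS: 1000 });
console.log(JSON.stringify(getMoreCommand));
```

The same command shape is reused for every subsequent pull until the cursor is closed or an error occurs.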
### Change Stream errors
|
||||
|
||||
When a change stream is opened at a specific point in time, it is validated that the oplog of all
|
||||
participating nodes actually contains data for this point in time.
|
||||
If the oplog does not contain any data for the exact point in time or before, it would be possible
|
||||
that the requested data has already fallen off the oplog.
|
||||
In case no oplog entry can be found that is at least as old as the specified timestamp, opening the
|
||||
change stream will fail with error code `OplogQueryMinTsMissing`.
|
||||
This validation happens for all change streams, regardless if the start timestamp is specified via
|
||||
the `resumeAfter`, `startAfter` or `startAtOperationTime` parameters, or if the start time is
|
||||
implied from the current time.
|
||||
An exception in which opening a change stream at a later point in time than the timestamp of the
|
||||
first present oplog entry is permitted is for new shard primaries.
|
||||
A new shard primary can be added to an existing cluster at any point in time. When a new shard primary
|
||||
is added, its first oplog entry will be a no-op entry with `msg` == `initiating set` (on ASC) or
|
||||
`msg` == `new primary` (on DSC).
|
||||
participating nodes actually contains data for this point in time. If the oplog does not contain any
|
||||
data for the exact point in time or before, it would be possible that the requested data has already
|
||||
fallen off the oplog. In case no oplog entry can be found that is at least as old as the specified
|
||||
timestamp, opening the change stream will fail with error code `OplogQueryMinTsMissing`. This
|
||||
validation happens for all change streams, regardless if the start timestamp is specified via the
|
||||
`resumeAfter`, `startAfter` or `startAtOperationTime` parameters, or if the start time is implied
|
||||
from the current time. An exception in which opening a change stream at a later point in time than
|
||||
the timestamp of the first present oplog entry is permitted is for new shard primaries. A new shard
|
||||
primary can be added to an existing cluster at any point in time. When a new shard primary is added,
|
||||
its first oplog entry will be a no-op entry with `msg` == `initiating set` (on ASC) or `msg` ==
|
||||
`new primary` (on DSC).
|
||||
|
||||
The code for this can be found [here](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/exec/classic/collection_scan.cpp#L195-L227).
|
||||
The code for this can be found
|
||||
[here](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/exec/classic/collection_scan.cpp#L195-L227).
|
||||
|
||||
Another common error is `ChangeStreamHistoryLost`. This error is raised when a change stream is
|
||||
opened with a resume token that cannot be found (anymore) in any of the participating nodes' oplogs.
|
||||
This can either happen when the resume event has actually fallen off the oplog, or, when a
|
||||
change stream is resumed with the resume token from another change stream with a different `$match`
|
||||
This can either happen when the resume event has actually fallen off the oplog, or, when a change
|
||||
stream is resumed with the resume token from another change stream with a different `$match`
|
||||
expression. In this case, the new change stream may filter out the resume event due to the different
|
||||
`$match` expression, so it cannot be found anymore.
|
||||
|
||||
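The two errors described above call for different reactions from a consumer. The sketch below shows one possible way to tell them apart, assuming the error object exposes the error name in a `codeName` field; the function name and the returned labels are hypothetical.

```javascript
// Hypothetical classification of errors raised when opening a change stream.
// Assumes the error object carries the server's error name in `codeName`.
function classifyOpenError(error) {
  switch (error.codeName) {
    case "OplogQueryMinTsMissing":
      // The requested start point is older than what the oplog still holds.
      return "start-point-not-in-oplog";
    case "ChangeStreamHistoryLost":
      // The resume token was not found, either because it fell off the oplog
      // or because the new stream's `$match` filters out the resume event.
      return "resume-token-not-found";
    default:
      return "other";
  }
}

console.log(classifyOpenError({ codeName: "ChangeStreamHistoryLost" }));
// prints: resume-token-not-found
```

For `resume-token-not-found`, it is worth checking first whether the pipeline differs from the one that produced the token before concluding that the history is gone.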
@ -342,9 +335,9 @@ request:
|
||||
- `maxTimeMS`: maximum server-side waiting time for producing events.
|
||||
|
||||
The `getMore` command will fill the response with up to `batchSize` results if that many events are
|
||||
available. A response can also contain fewer events than the specified `batchSize`.
|
||||
Regardless of the specified batch size, the maximum response size limit of 16MB will be honored, in
|
||||
order to prevent responses from getting too large.
|
||||
available. A response can also contain fewer events than the specified `batchSize`. Regardless of the
|
||||
specified batch size, the maximum response size limit of 16MB will be honored, in order to prevent
|
||||
responses from getting too large.
|
||||
|
||||
A change stream response is returned to the consumer when
|
||||
|
||||
@ -353,14 +346,13 @@ A change stream response is returned to the consumer when
|
||||
would make it exceed the 16MB size limit.
|
||||
|
||||
In case the change stream cursor has reached the end of the oplog and there are currently no events
|
||||
to return, the response will be returned immediately if it already contains at least one event.
|
||||
If the response is empty, the change stream will wait for at most `maxTimeMS` for new oplog entries
|
||||
to arrive.
|
||||
If no new oplog entries arrive within `maxTimeMS`, an empty response will be returned. If new oplog
|
||||
entries arrive within `maxTimeMS` and at least one of them matches the change stream's filter, the
|
||||
matching event will be returned immediately. If oplog entries arrive but do not match the change
|
||||
stream's filter, the change stream will wait for matching oplog entries until `maxTimeMS` is fully
|
||||
expired.
|
||||
to return, the response will be returned immediately if it already contains at least one event. If
|
||||
the response is empty, the change stream will wait for at most `maxTimeMS` for new oplog entries to
|
||||
arrive. If no new oplog entries arrive within `maxTimeMS`, an empty response will be returned. If
|
||||
new oplog entries arrive within `maxTimeMS` and at least one of them matches the change stream's
|
||||
filter, the matching event will be returned immediately. If oplog entries arrive but do not match
|
||||
the change stream's filter, the change stream will wait for matching oplog entries until `maxTimeMS`
|
||||
is fully expired.
|
||||
|
||||
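The waiting behavior at the end of the oplog can be modeled with a small simulation. This is a sketch of the rules stated above, not the server's implementation: the function name, the shape of the `arrivals` entries, and the treatment of an entry arriving exactly at `maxTimeMS` are assumptions.

```javascript
// Simulates one `getMore` call once the cursor has reached the end of the oplog.
// `batch` holds events already collected for this response; `arrivals` is a
// hypothetical list of future oplog entries: { atMs, matchesFilter, event }.
function simulateGetMoreAtOplogEnd(batch, arrivals, maxTimeMS) {
  if (batch.length > 0) {
    // A non-empty response is returned immediately.
    return { events: batch, waitedMs: 0 };
  }
  const inWindow = arrivals
    .filter((entry) => entry.atMs < maxTimeMS)
    .sort((a, b) => a.atMs - b.atMs);
  for (const entry of inWindow) {
    if (entry.matchesFilter) {
      // The first matching event ends the wait.
      return { events: [entry.event], waitedMs: entry.atMs };
    }
    // Non-matching entries do not end the wait.
  }
  // Nothing matched before the deadline: return an empty response.
  return { events: [], waitedMs: maxTimeMS };
}

const arrivals = [
  { atMs: 200, matchesFilter: false, event: "ignored" },
  { atMs: 500, matchesFilter: true, event: "insert#1" },
];
console.log(JSON.stringify(simulateGetMoreAtOplogEnd([], arrivals, 1000)));
// prints: {"events":["insert#1"],"waitedMs":500}
```

With only the non-matching entry in `arrivals`, the same call would wait the full `maxTimeMS` and return an empty response.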
### Generic Event layout
|
||||
|
||||
@ -379,8 +371,8 @@ The following generic fields are added for change streams that were opened with
|
||||
- `collectionUUID`: UUID of the collection for which the event occurred, if applicable.
|
||||
- `operationDescription`: populated for DDL events.
|
||||
|
||||
Most other fields are event type-specific, so they are only present for specific events.
|
||||
A few such fields include:
|
||||
Most other fields are event type-specific, so they are only present for specific events. A few such
|
||||
fields include:
|
||||
|
||||
- `documentKey`: the `_id` value of the affected document, populated for DML events. May contain the
|
||||
shard key values for sharded collections.
|
||||
@ -389,9 +381,11 @@ A few such fields include:
|
||||
value than `default`.
|
||||
- `updateDescription` / `rawUpdateDescription`: contains details for "update" events.
|
||||
|
||||
The majority of change stream event fields are emitted by the `ChangeStreamDefaultEventTransformation`
|
||||
object [here](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/pipeline/change_stream_event_transform.cpp#L321). This object is called by the `ChangeStreamEventTransform`
|
||||
stage [here](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/exec/agg/change_stream_transform_stage.cpp#L75).
|
||||
The majority of change stream event fields are emitted by the
|
||||
`ChangeStreamDefaultEventTransformation` object
|
||||
[here](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/pipeline/change_stream_event_transform.cpp#L321).
|
||||
This object is called by the `ChangeStreamEventTransform` stage
|
||||
[here](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/exec/agg/change_stream_transform_stage.cpp#L75).
|
||||
|
||||
A custom `$project` stage in the change stream pipeline can be used to suppress certain fields.
|
||||
|
||||
@ -401,8 +395,8 @@ Emitted change events can get large, especially if they contain pre- or post-ima
|
||||
the events can exceed the maximum BSON object size of 16MB, which can lead to `BSONObjectTooLarge`
|
||||
errors when trying to process these change stream events.
|
||||
|
||||
To split large change stream events into multiple smaller chunks, change stream consumers can add
|
||||
a `$changeStreamSplitLargeEvent` stage as the last step of their change stream pipeline, e.g.
|
||||
To split large change stream events into multiple smaller chunks, change stream consumers can add a
|
||||
`$changeStreamSplitLargeEvent` stage as the last step of their change stream pipeline, e.g.
|
||||
|
||||
```js
|
||||
db.getSiblingDB("testDB").runCommand({
|
||||
@ -419,8 +413,10 @@ db.getSiblingDB("testDB").runCommand({
|
||||
});
|
||||
```
|
||||
|
||||
The splitting is performed by the `ChangeStreamSplitLargeEventStage` stage [here](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/exec/agg/change_stream_split_large_event_stage.cpp#L72),
|
||||
using [this helper function](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/pipeline/change_stream_split_event_helpers.cpp#L63).
|
||||
The splitting is performed by the `ChangeStreamSplitLargeEventStage` stage
|
||||
[here](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/exec/agg/change_stream_split_large_event_stage.cpp#L72),
|
||||
using
|
||||
[this helper function](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/pipeline/change_stream_split_event_helpers.cpp#L63).
|
||||
The change stream consumer is responsible for assembling the split event fragments into a single
|
||||
event later.
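
A minimal sketch of the consumer-side reassembly could look as follows. It assumes that every fragment carries a `splitEvent: { fragment, of }` field, which is not described in this document, and `assembleFragments` is a hypothetical helper name.

```js
// Reassembles the fragments of one split change event into a single object.
// Assumption: each fragment has a `splitEvent: { fragment, of }` field, with
// `fragment` counting from 1 and `of` holding the total number of fragments.
function assembleFragments(fragments) {
  if (fragments.length === 0) {
    throw new Error("no fragments given");
  }
  // Order fragments by their position, without mutating the input.
  const sorted = [...fragments].sort(
    (a, b) => a.splitEvent.fragment - b.splitEvent.fragment
  );
  const total = sorted[0].splitEvent.of;
  if (sorted.length !== total) {
    throw new Error(`expected ${total} fragments, got ${sorted.length}`);
  }
  const event = {};
  for (const fragment of sorted) {
    // Drop the bookkeeping field and copy everything else.
    const { splitEvent, ...fields } = fragment;
    // Fields present in several fragments (such as `_id`) take the value
    // from the last fragment.
    Object.assign(event, fields);
  }
  return event;
}
```

Because later fragments overwrite earlier ones, the assembled event ends up with the `_id` of the final fragment.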
|
||||
|
||||
@ -434,10 +430,9 @@ close the change stream cursor in specific situations:
|
||||
- the target collection is renamed
|
||||
- the parent database of the target collection is dropped
|
||||
- in database-level change streams, the change stream is invalidated if the target database is
|
||||
dropped.
|
||||
In case a change stream gets invalidated by any of the above situations, it will emit a special
|
||||
"invalidate" event to inform the consumer that further processing is not possible.
|
||||
There are no "invalidate" events in all-cluster change streams.
|
||||
dropped. In case a change stream gets invalidated by any of the above situations, it will emit a
|
||||
special "invalidate" event to inform the consumer that further processing is not possible. There
|
||||
are no "invalidate" events in all-cluster change streams.
|
||||
|
||||
Issuing of change stream invalidate events is implemented in the `ChangeStreamCheckInvalidateStage`
|
||||
[here](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/exec/agg/change_stream_check_invalidate_stage.cpp#L106-L157).
|
||||
@ -445,12 +440,13 @@ Issuing of change stream invalidate events is implemented in the `ChangeStreamCh
|
||||
## Change Stream Parameters
|
||||
|
||||
The behavior of change streams can be controlled via various parameters that can be passed with the
|
||||
initial `aggregate` command used to open the change stream.
|
||||
The parameters are defined in an [IDL file](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/pipeline/document_source_change_stream.idl#L84).
|
||||
initial `aggregate` command used to open the change stream. The parameters are defined in an
|
||||
[IDL file](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/pipeline/document_source_change_stream.idl#L84).
|
||||
|
||||
The parameters that are provided when opening the change stream are automatically validated using
|
||||
mechanisms provided by the IDL framework. Additional validation of the change stream parameters is
|
||||
performed [here](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/pipeline/document_source_change_stream.cpp#L391).
|
||||
performed
|
||||
[here](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/pipeline/document_source_change_stream.cpp#L391).
|
||||
Invalid change stream parameters are immediately rejected with appropriate errors.
|
||||
|
||||
### `fullDocument`
|
||||
@ -466,17 +462,16 @@ The following values are possible:
|
||||
may not be the same version of the document that was present when the "update" change event was
|
||||
originally recorded. If no document can be found by the lookup, the `fullDocument` field will
|
||||
contain `null`.
|
||||
- `whenAvailable`: the `fullDocument` field will be populated with the post-image for the event.
|
||||
The post-image is generated on the fly from a stored pre-image and applying a delta update from
|
||||
the event on top of it. If no post-image is available, the `fullDocument` field will contain
|
||||
`null`.
|
||||
- `whenAvailable`: the `fullDocument` field will be populated with the post-image for the event. The
|
||||
post-image is generated on the fly by applying the delta update from the event on top of a stored
|
||||
pre-image. If no post-image is available, the `fullDocument` field will contain `null`.
|
||||
- `required`: populates the `fullDocument` field with the post-image for the event. Post-images are
|
||||
generated in the same way as in `whenAvailable`. If no post-image can be generated, this will
|
||||
abort the change stream with a `NoMatchingDocument` error.
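
The difference between the two post-image modes can be summarized in a short sketch. `resolveFullDocument` is a hypothetical helper, not a server function, and attaching the error name as a `codeName` property is an assumption made for illustration.

```js
// Hypothetical helper modeling how `whenAvailable` and `required` differ.
// postImage: the generated post-image, or null/undefined if none is available.
function resolveFullDocument(mode, postImage) {
  switch (mode) {
    case "whenAvailable":
      // A missing post-image becomes null.
      return postImage ?? null;
    case "required":
      // A missing post-image aborts the change stream.
      if (postImage == null) {
        const err = new Error("no post-image available");
        err.codeName = "NoMatchingDocument";
        throw err;
      }
      return postImage;
    default:
      throw new Error(`mode not covered by this sketch: ${mode}`);
  }
}
```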
|
||||
|
||||
The latter two options rely on pre-images to be enabled for the target collection(s).
|
||||
When pre-images are enabled, they are written synchronously with the regular "update" oplog entry,
|
||||
and change stream events aren’t returned until both have been majority-committed.
|
||||
The latter two options rely on pre-images being enabled for the target collection(s). When
|
||||
pre-images are enabled, they are written synchronously with the regular "update" oplog entry, and
|
||||
change stream events aren’t returned until both have been majority-committed.
|
||||
|
||||
Post-images for "update" events are added to change events by the `ChangeStreamAddPostImage` stage
|
||||
[here](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/exec/agg/change_stream_add_post_image_stage.cpp#L84).
|
||||
@ -506,29 +501,25 @@ parameters are:
|
||||
#### `showExpandedEvents` (public)
|
||||
|
||||
The `showExpandedEvents` flag can be used to make a change stream return both additional event types
|
||||
and additional fields.
|
||||
The flag defaults to `false`. In this mode, change streams will only return DML events and no DDL
|
||||
events.
|
||||
When setting `showExpandedEvents` to `true`, change streams will also emit events for various DDL
|
||||
operations.
|
||||
In addition, setting `showExpandedEvents` will make change streams return the additional fields
|
||||
`collectionUUID` (for various change stream event types) and `updateDescription.disambiguatedPaths`
|
||||
(for update events).
|
||||
and additional fields. The flag defaults to `false`. In this mode, change streams will only return
|
||||
DML events and no DDL events. When setting `showExpandedEvents` to `true`, change streams will also
|
||||
emit events for various DDL operations. In addition, setting `showExpandedEvents` will make change
|
||||
streams return the additional fields `collectionUUID` (for various change stream event types) and
|
||||
`updateDescription.disambiguatedPaths` (for update events).
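
As an illustration, a command document that opens a collection-level change stream with expanded events might look like this. The collection name `testColl` and the variable name are placeholders.

```js
// Command document for a collection-level change stream with expanded events.
// "testColl" is a placeholder collection name.
const openExpandedChangeStream = {
  aggregate: "testColl",
  pipeline: [{ $changeStream: { showExpandedEvents: true } }],
  cursor: {},
};

// In mongosh, this would be sent as:
// db.getSiblingDB("testDB").runCommand(openExpandedChangeStream);
```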
|
||||
|
||||
#### `matchCollectionUUIDForUpdateLookup` (public)
|
||||
|
||||
The `matchCollectionUUIDForUpdateLookup` field can be used to ensure that "updateLookup" operations
|
||||
are performed on the correct collection in case multiple collections with the same name have existed
|
||||
over time.
|
||||
This is relevant, because change streams can be opened retroactively on collections that were already
|
||||
dropped and may have been recreated with the same name but different contents afterwards.
|
||||
over time. This is relevant, because change streams can be opened retroactively on collections that
|
||||
were already dropped and may have been recreated with the same name but different contents
|
||||
afterwards.
|
||||
|
||||
The flag defaults to `false`. In this case, "updateLookup" operations will not verify that the
|
||||
looked-up document is actually from the same collection "generation" as the change event the
|
||||
document was looked up for.
|
||||
If set to `true`, "updateLookup" operations will compare the collection UUID of the change event
|
||||
with the UUID of the collection. If there is a UUID mismatch, the returned `fullDocument` field of
|
||||
the event will be set to `null`.
|
||||
document was looked up for. If set to `true`, "updateLookup" operations will compare the collection
|
||||
UUID of the change event with the UUID of the collection. If there is a UUID mismatch, the returned
|
||||
`fullDocument` field of the event will be set to `null`.
|
||||
|
||||
#### `allChangesForCluster` (public)
|
||||
|
||||
@ -539,29 +530,28 @@ automatically when opening an all-cluster change stream.
|
||||
|
||||
The `showSystemEvents` flag can be used to make change streams return events for collections inside
|
||||
the `system` namespace. These are not emitted by default. Setting `showSystemEvents` to `true` will
|
||||
also include events related to system collections in the change stream.
|
||||
The flag defaults to `false` and is internal.
|
||||
also include events related to system collections in the change stream. The flag defaults to `false`
|
||||
and is internal.
|
||||
|
||||
#### `showMigrationEvents` (internal)
|
||||
|
||||
The `showMigrationEvents` flag can be used to make change streams return DML events that are
|
||||
happening during chunk migrations. If set to `true`, insert and delete events related to chunk
|
||||
migrations will be reported as if they were regular events.
|
||||
The flag defaults to `false` and is internal.
|
||||
migrations will be reported as if they were regular events. The flag defaults to `false` and is
|
||||
internal.
|
||||
|
||||
#### `showCommitTimestamp` (internal)
|
||||
|
||||
The `showCommitTimestamp` flag can be used to include the transaction commit timestamp inside DML
|
||||
events that were part of a prepared transaction.
|
||||
The flag defaults to `true` and is internal. It is used by the resharding.
|
||||
events that were part of a prepared transaction. The flag defaults to `true` and is internal. It is
|
||||
used by resharding.
|
||||
|
||||
#### `showRawUpdateDescription` (internal)
|
||||
|
||||
The `showRawUpdateDescription` flag can be used to make change streams emit the raw, internal format
|
||||
used for "update" oplog entries.
|
||||
If set to `true`, emitted change stream "update" events will contain a `rawUpdateDescription` field.
|
||||
The default is `false`. In this case, emitted change stream "update" events will contain the regular
|
||||
`updateDescription` field.
|
||||
used for "update" oplog entries. If set to `true`, emitted change stream "update" events will
|
||||
contain a `rawUpdateDescription` field. The default is `false`. In this case, emitted change stream
|
||||
"update" events will contain the regular `updateDescription` field.
|
||||
|
||||
#### `allowToRunOnConfigDB` (internal)
|
||||
|
||||
@ -572,9 +562,9 @@ server to keep track of shard additions and removals in the deployment.
|
||||
#### `$_passthroughToShard` (internal)
|
||||
|
||||
In sharded cluster deployments, all change streams are supposed to be opened on _mongos_. _mongos_
|
||||
will open the required cursors to the data shards and the config server on the consumer's behalf.
|
||||
If the consumer only wants to target a specific shard of the cluster, they can use the `$_passthroughToShard`
|
||||
aggregation parameter to limit the change stream to a single shard.
|
||||
will open the required cursors to the data shards and the config server on the consumer's behalf. If
|
||||
the consumer only wants to target a specific shard of the cluster, they can use the
|
||||
`$_passthroughToShard` aggregation parameter to limit the change stream to a single shard.
|
||||
|
||||
For example, to open a collection-level change stream targeting only one of the cluster's shards
|
||||
(identified by the value in `shardId`), the following example code can be used:
|
||||
@ -592,8 +582,8 @@ db.getSiblingDB("testDB").runCommand({
|
||||
});
|
||||
```
|
||||
|
||||
Using `$_passthroughToShard` will bypass the regular cluster shard targeting for change streams
|
||||
and open a replica set change stream pipeline (only) on the targeted shard. The change events that
|
||||
Using `$_passthroughToShard` will bypass the regular cluster shard targeting for change streams and
|
||||
open a replica set change stream pipeline (only) on the targeted shard. The change events that
|
||||
_mongos_ retrieves from the single shard will be returned as is, without using a merge pipeline on
|
||||
_mongos_.
|
||||
|
||||
@ -609,23 +599,26 @@ stream against a _mongos_ instance. The _mongos_ instance will then use the clus
|
||||
information to open the cursors on the config server and the data shards on behalf of the consumer.
|
||||
Because of the ordering guarantee provided by change streams, _mongos_ must wait until all cursors
|
||||
have either responded with events, or run into a timeout and reported that currently no more events
|
||||
are available for them.
|
||||
The latter is why change streams in a sharded cluster can have higher latency than change streams
|
||||
in replica sets.
|
||||
are available for them. The latter is why change streams in a sharded cluster can have higher
|
||||
latency than change streams in replica sets.
|
||||
|
||||
For sharded cluster change streams, the merging of the multiple streams of change events from the
|
||||
different cursors is performed by the [`AsyncResultsMerger`](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/s/query/exec/async_results_merger.h#L100).
|
||||
different cursors is performed by the
|
||||
[`AsyncResultsMerger`](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/s/query/exec/async_results_merger.h#L100).
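
The reason why _mongos_ has to wait for every cursor can be shown with a simplified model. This is not how `AsyncResultsMerger` is implemented: `mergeReady`, the numeric `ts` field and the per-cursor `highWaterMark` are assumptions standing in for cluster times and for the point up to which a cursor has reported its events.

```js
// Simplified model of an ordered merge across shard cursors.
// cursors: [{ events: [{ ts, ... }], highWaterMark }], where highWaterMark is
// the timestamp up to which the cursor is known to have no further events.
function mergeReady(cursors) {
  // An event is only safe to return once every cursor has progressed past it.
  const safeUpTo = Math.min(...cursors.map((c) => c.highWaterMark));
  const ready = [];
  for (const cursor of cursors) {
    for (const event of cursor.events) {
      if (event.ts <= safeUpTo) {
        ready.push(event);
      }
    }
  }
  // Return the safe events in timestamp order.
  ready.sort((a, b) => a.ts - b.ts);
  return ready;
}
```

In this model a single slow cursor holds back events from all other cursors, which corresponds to the higher latency mentioned above.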
|
||||
|
||||
## Change Stream Pipeline Building
|
||||
|
||||
A change stream pipeline issued by a consumer contains the `$changeStream` meta stage.
|
||||
This stage is expanded internally into multiple `DocumentSource`s [here](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/pipeline/change_stream_pipeline_helpers.cpp#L171).
|
||||
A change stream pipeline issued by a consumer contains the `$changeStream` meta stage. This stage is
|
||||
expanded internally into multiple `DocumentSource`s
|
||||
[here](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/pipeline/change_stream_pipeline_helpers.cpp#L171).
|
||||
|
||||
The change stream `DocumentSource`s are located in the `src/mongo/db/pipeline` directory [here](https://github.com/mongodb/mongo/tree/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/pipeline), among other `DocumentSource`s that
|
||||
are not related to change streams.
|
||||
The `DocumentSource`s are only used for pipeline building and optimization, but they are converted
|
||||
into execution `Stage`s later when the change stream is executed.
|
||||
These `Stage`s are located in the `src/mongo/db/exec/agg` directory [here](https://github.com/mongodb/mongo/tree/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/exec/agg).
|
||||
The change stream `DocumentSource`s are located in the `src/mongo/db/pipeline` directory
|
||||
[here](https://github.com/mongodb/mongo/tree/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/pipeline),
|
||||
among other `DocumentSource`s that are not related to change streams. The `DocumentSource`s are only
|
||||
used for pipeline building and optimization, but they are converted into execution `Stage`s later
|
||||
when the change stream is executed. These `Stage`s are located in the `src/mongo/db/exec/agg`
|
||||
directory
|
||||
[here](https://github.com/mongodb/mongo/tree/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/exec/agg).
|
||||
|
||||
### Replica Set Pipelines
|
||||
|
||||
@ -634,13 +627,14 @@ On a replica set, the `$changeStream` stage is expanded into the following inter
|
||||
- `$_internalChangeStreamOplogMatch`
|
||||
- `$_internalChangeStreamUnwindTransaction`
|
||||
- `$_internalChangeStreamTransform`
|
||||
- `$_internalChangeStreamCheckInvalidate` (only present for collection-level and database-level change
|
||||
streams)
|
||||
- `$_internalChangeStreamCheckInvalidate` (only present for collection-level and database-level
|
||||
change streams)
|
||||
- `$_internalChangeStreamCheckResumability`
|
||||
- `$_internalChangeStreamAddPreImage` (only present if `fullDocumentBeforeChange` is not set to `off`)
|
||||
- `$_internalChangeStreamAddPreImage` (only present if `fullDocumentBeforeChange` is not set to
|
||||
`off`)
|
||||
- `$_internalChangeStreamAddPostImage` (only present if `fullDocument` is not set to `default`)
|
||||
- `$_internalChangeStreamEnsureResumeTokenPresent` (only present if the change stream resume token is
|
||||
not a high water mark token)
|
||||
- `$_internalChangeStreamEnsureResumeTokenPresent` (only present if the change stream resume token
|
||||
is not a high water mark token)
|
||||
- user-defined `$match` expression (only present if the user's change stream pipeline contains a
|
||||
`$match` stage)
|
||||
- user-defined `$project` expression (only present if the user's change stream pipeline contains a
|
||||
@ -648,8 +642,8 @@ On a replica set, the `$changeStream` stage is expanded into the following inter
|
||||
- `$_internalChangeStreamSplitLargeEvent` (only present if the change stream is opened with the
|
||||
`$changeStreamSplitLargeEvent` pipeline step)
|
||||
|
||||
The change stream pipeline on replica sets will also contain a `$match` stage to filter out all non-DML
|
||||
change events in case `showExpandedEvents` is not set.
|
||||
The change stream pipeline on replica sets will also contain a `$match` stage to filter out all
|
||||
non-DML change events in case `showExpandedEvents` is not set.
|
||||
|
||||
### Sharded Cluster Pipelines
|
||||
|
||||
@ -659,10 +653,11 @@ following internal stages:
|
||||
- `$_internalChangeStreamOplogMatch`
|
||||
- `$_internalChangeStreamUnwindTransaction`
|
||||
- `$_internalChangeStreamTransform`
|
||||
- `$_internalChangeStreamCheckInvalidate` (only present for collection-level and database-level change
|
||||
streams)
|
||||
- `$_internalChangeStreamCheckInvalidate` (only present for collection-level and database-level
|
||||
change streams)
|
||||
- `$_internalChangeStreamCheckResumability`
|
||||
- `$_internalChangeStreamAddPreImage` (only present if `fullDocumentBeforeChange` is not set to `off`)
|
||||
- `$_internalChangeStreamAddPreImage` (only present if `fullDocumentBeforeChange` is not set to
|
||||
`off`)
|
||||
- `$_internalChangeStreamAddPostImage` (only present if `fullDocument` is not set to `default`)
|
||||
- user-defined `$match` expression (only present if the user's change stream pipeline contains a
|
||||
`$match` stage)
|
||||
@ -674,8 +669,8 @@ following internal stages:
|
||||
---
|
||||
|
||||
- `$_internalChangeStreamHandleTopologyChange`
|
||||
- `$_internalChangeStreamEnsureResumeTokenPresent` (only present if the change stream resume token is
|
||||
not a high water mark token)
|
||||
- `$_internalChangeStreamEnsureResumeTokenPresent` (only present if the change stream resume token
|
||||
is not a high water mark token)
|
||||
|
||||
Additionally, the change stream pipeline on a sharded cluster will contain a `$match` stage to
|
||||
filter out all non-DML change events in case `showExpandedEvents` is not set.
|
||||
@ -685,9 +680,9 @@ After building the initial pipeline stages, _mongos_ will split the pipeline int
|
||||
- a part that is executed on data shards ("shard pipeline") and
|
||||
- a part that is executed on _mongos_ ("merge pipeline").
|
||||
|
||||
The pipeline split point is above the `$_internalChangeStreamHandleTopologyChange` stage.
|
||||
_mongos_ will also add a `$mergeCursors` stage that aggregates the responses from different shards
|
||||
and the config server into a single, sorted stream.
|
||||
The pipeline split point is above the `$_internalChangeStreamHandleTopologyChange` stage. _mongos_
|
||||
will also add a `$mergeCursors` stage that aggregates the responses from different shards and the
|
||||
config server into a single, sorted stream.
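
The split can be sketched on a plain list of stage names. `splitPipeline` is a hypothetical helper for illustration; the server operates on `DocumentSource` objects rather than strings.

```js
// Splits a list of stage names at $_internalChangeStreamHandleTopologyChange.
// Everything before the split point runs on the shards; the rest runs on
// mongos, preceded by $mergeCursors.
function splitPipeline(stages) {
  const splitPoint = stages.indexOf("$_internalChangeStreamHandleTopologyChange");
  if (splitPoint === -1) {
    throw new Error("split point stage not found");
  }
  return {
    shardPipeline: stages.slice(0, splitPoint),
    mergePipeline: ["$mergeCursors", ...stages.slice(splitPoint)],
  };
}
```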
|
||||
|
||||
#### Data Shard Pipeline
|
||||
|
||||
@ -696,15 +691,16 @@ The shard pipeline will look like this:
|
||||
- `$_internalChangeStreamOplogMatch`
|
||||
- `$_internalChangeStreamUnwindTransaction`
|
||||
- `$_internalChangeStreamTransform`
|
||||
- `$_internalChangeStreamCheckInvalidate` (only present for collection-level and database-level change
|
||||
streams)
|
||||
- `$_internalChangeStreamCheckInvalidate` (only present for collection-level and database-level
|
||||
change streams)
|
||||
- `$_internalChangeStreamCheckResumability`
|
||||
- `$_internalChangeStreamAddPreImage` (only present if `fullDocumentBeforeChange` is not set to `off`)
|
||||
- `$_internalChangeStreamAddPreImage` (only present if `fullDocumentBeforeChange` is not set to
|
||||
`off`)
|
||||
- `$_internalChangeStreamAddPostImage` (only present if `fullDocument` is not set to `default`)
|
||||
- user-defined `$match` expression (only present if the user's change stream pipeline contains a
|
||||
`$match` stage)
|
||||
- user-defined `$project` expression (only present if the change stream pipeline contains a `$project`
|
||||
stage)
|
||||
- user-defined `$project` expression (only present if the change stream pipeline contains a
|
||||
`$project` stage)
|
||||
- `$_internalChangeStreamSplitLargeEvent` (only present if the change stream is opened with the
|
||||
`$changeStreamSplitLargeEvent` pipeline step)
|
||||
|
||||
@ -714,16 +710,18 @@ The merge pipeline on _mongos_ will look like this:
|
||||
|
||||
- `$mergeCursors`
|
||||
- `$_internalChangeStreamHandleTopologyChange`
|
||||
- `$_internalChangeStreamEnsureResumeTokenPresent` (only present if the change stream resume token is
|
||||
not a high water mark token)
|
||||
- `$_internalChangeStreamEnsureResumeTokenPresent` (only present if the change stream resume token
|
||||
is not a high water mark token)
|
||||
|
||||
### Details of individual Pipeline Stages
|
||||
|
||||
#### `$_internalChangeStreamOplogMatch`
|
||||
|
||||
This stage is responsible for reading data from the oplog and filtering out irrelevant events.
|
||||
The `DocumentSourceChangeStreamOplogMatch` code is [here](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/pipeline/document_source_change_stream_oplog_match.h#L61).
|
||||
The oplog filter for the stage is built [here](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/pipeline/document_source_change_stream_oplog_match.cpp#L79).
|
||||
This stage is responsible for reading data from the oplog and filtering out irrelevant events. The
|
||||
`DocumentSourceChangeStreamOplogMatch` code is
|
||||
[here](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/pipeline/document_source_change_stream_oplog_match.h#L61).
|
||||
The oplog filter for the stage is built
|
||||
[here](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/pipeline/document_source_change_stream_oplog_match.cpp#L79).
|
||||
|
||||
There is no `Stage` equivalent for `DocumentSourceChangeStreamOplogMatch`, as it will be turned into
|
||||
a `$cursor` stage for execution.
|
||||
@ -731,28 +729,35 @@ a `$cursor` stage for execution.
|
||||
#### `$_internalChangeStreamUnwindTransaction`
|
||||
|
||||
This stage is responsible for "unwinding" (expanding) multiple operations that are contained in an
|
||||
"applyOps" oplog entry into individual events.
|
||||
The `DocumentSourceChangeStreamUnwindTransaction` code is [here](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/pipeline/document_source_change_stream_unwind_transaction.h#L71).
|
||||
The `ChangeStreamUnwindTransactionStage` code is [here](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/pipeline/document_source_change_stream_unwind_transaction.cpp#L83).
|
||||
"applyOps" oplog entry into individual events. The `DocumentSourceChangeStreamUnwindTransaction`
|
||||
code is
|
||||
[here](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/pipeline/document_source_change_stream_unwind_transaction.h#L71).
|
||||
The `ChangeStreamUnwindTransactionStage` code is
|
||||
[here](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/pipeline/document_source_change_stream_unwind_transaction.cpp#L83).
|
||||
|
||||
#### `$_internalChangeStreamTransform`
|
||||
|
||||
This stage is responsible for converting oplog entries into change events. It will build a change
|
||||
event document for every oplog entry that enters this stage.
|
||||
Event fields are added based on the change stream configuration.
|
||||
The `DocumentSourceChangeStreamTransform` code is [here](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/pipeline/document_source_change_stream_transform.h#L60).
|
||||
The `ChangeStreamTransformStage` code is [here](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/exec/agg/change_stream_transform_stage.cpp#L75).
|
||||
The actual event transformation happens inside `ChangeStreamDefaultEventTransformation` [here](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/pipeline/change_stream_event_transform.cpp#L321).
|
||||
event document for every oplog entry that enters this stage. Event fields are added based on the
|
||||
change stream configuration. The `DocumentSourceChangeStreamTransform` code is
|
||||
[here](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/pipeline/document_source_change_stream_transform.h#L60).
|
||||
The `ChangeStreamTransformStage` code is
|
||||
[here](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/exec/agg/change_stream_transform_stage.cpp#L75).
|
||||
The actual event transformation happens inside `ChangeStreamDefaultEventTransformation`
|
||||
[here](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/pipeline/change_stream_event_transform.cpp#L321).
|
||||
|
||||
#### `$_internalChangeStreamCheckInvalidate`
|
||||
|
||||
This stage is responsible for creating change stream "invalidate" events and is only added for
|
||||
collection-level and database-level change streams.
|
||||
The `DocumentSourceChangeStreamCheckInvalidate` code is [here](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/pipeline/document_source_change_stream_check_invalidate.h#L65).
|
||||
The `ChangeStreamCheckInvalidate` code is [here](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/exec/agg/change_stream_check_invalidate_stage.cpp#L106).
|
||||
collection-level and database-level change streams. The `DocumentSourceChangeStreamCheckInvalidate`
|
||||
code is
|
||||
[here](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/pipeline/document_source_change_stream_check_invalidate.h#L65).
|
||||
The `ChangeStreamCheckInvalidateStage` code is
|
||||
[here](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/exec/agg/change_stream_check_invalidate_stage.cpp#L106).
|
||||
|
||||
When an invalidate event is encountered, the stage will first emit an "invalidate" event, and then
|
||||
throws a `ChangeStreamInvalidated` exception on the next call. The [`ChangeStreamInvalidatedInfo`](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/pipeline/change_stream_invalidation_info.h#L47).
|
||||
throw a `ChangeStreamInvalidated` exception on the next call. The
|
||||
[`ChangeStreamInvalidatedInfo`](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/pipeline/change_stream_invalidation_info.h#L47)
|
||||
exception type contains the error code `ChangeStreamInvalidated`.
|
||||
|
||||
#### `$_internalChangeStreamCheckResumability`
|
||||
@ -761,18 +766,22 @@ This stage checks if the oplog has enough history to resume the change stream, a
|
||||
events up to the given resume point. If no data for the resume point can be found in the oplog
|
||||
anymore, it will throw a `ChangeStreamHistoryLost` error.
|
||||
|
||||
The `DocumentSourceChangeStreamCheckResumability` code is [here](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/pipeline/document_source_change_stream_check_resumability.h#L79).
|
||||
The `ChangeStreamCheckResumabilityStage` code is [here](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/exec/agg/change_stream_check_resumability_stage.cpp#L68).
|
||||
The `DocumentSourceChangeStreamCheckResumability` code is
|
||||
[here](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/pipeline/document_source_change_stream_check_resumability.h#L79).
|
||||
The `ChangeStreamCheckResumabilityStage` code is
|
||||
[here](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/exec/agg/change_stream_check_resumability_stage.cpp#L68).
|
||||
|
||||
#### `$_internalChangeStreamAddPreImage`
|
||||
|
||||
This stage is responsible for adding pre-image data to "update", "replace" and "delete" events. It
|
||||
is only added to change stream pipelines if the `fullDocumentBeforeChange` parameter is not set to
|
||||
`off`.
|
||||
If enabled, the stage relies on the pre-images stored in the system's pre-image system collection.
|
||||
`off`. If enabled, the stage relies on the pre-images stored in the system's pre-image system
|
||||
collection.
|
||||
|
||||
The `DocumentSourceChangeStreamAddPreImage` code is [here](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/pipeline/document_source_change_stream_add_pre_image.h#L67).
|
||||
The `ChangeStreamAddPreImageStage` code is [here](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/exec/agg/change_stream_add_pre_image_stage.cpp#L67).
|
||||
The `DocumentSourceChangeStreamAddPreImage` code is
|
||||
[here](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/pipeline/document_source_change_stream_add_pre_image.h#L67).
|
||||
The `ChangeStreamAddPreImageStage` code is
|
||||
[here](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/exec/agg/change_stream_add_pre_image_stage.cpp#L67).
|
||||
|
||||
#### `$_internalChangeStreamAddPostImage`
|
||||
|
||||
@ -780,23 +789,24 @@ This stage is responsible for adding post-image data to "update" events. It is o
|
||||
stream pipelines if the `fullDocument` parameter is not set to `default`.
|
||||
|
||||
If `fullDocument` is set to `updateLookup`, the stage will perform a lookup for the current version
|
||||
of a document that was updated by an "update" event, and store it in the `fullDocument` field of
|
||||
the "update" event if present. The lookup is performed using the `_id` value of the document from
|
||||
the change event. As the lookup is executed at a different point in time than when the change event
|
||||
was recorded, it is possible that the lookup finds a different version of the document than the one
|
||||
that was active when the change event was recorded. This can happen if the document was updated
|
||||
again between the change event and the lookup. The lookup may also find no document at all if the
|
||||
document was deleted after the "update" event, but before the lookup.
|
||||
In case the lookup cannot find a document with the requested `_id`, it will populate the
|
||||
`fullDocument` field with a value of `null`.
|
||||
of a document that was updated by an "update" event, and store it in the `fullDocument` field of the
|
||||
"update" event if present. The lookup is performed using the `_id` value of the document from the
|
||||
change event. As the lookup is executed at a different point in time than when the change event was
|
||||
recorded, it is possible that the lookup finds a different version of the document than the one that
|
||||
was active when the change event was recorded. This can happen if the document was updated again
|
||||
between the change event and the lookup. The lookup may also find no document at all if the document
|
||||
was deleted after the "update" event, but before the lookup. In case the lookup cannot find a
|
||||
document with the requested `_id`, it will populate the `fullDocument` field with a value of `null`.
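
The timing caveat of `updateLookup` can be made concrete with a small sketch. It uses a plain
Python dict keyed by `_id` in place of a collection. The function name and the dict-based
collection are assumptions for illustration only.

```python
def update_lookup(collection, change_event):
    """Attach the current version of the updated document to an "update" event.

    `collection` is a plain dict keyed by `_id`, standing in for a real collection.
    """
    doc_id = change_event["documentKey"]["_id"]
    enriched = dict(change_event)
    # The lookup runs later than the change itself, so it returns whatever is stored
    # now, or None (null) if the document has been deleted in the meantime.
    enriched["fullDocument"] = collection.get(doc_id)
    return enriched
```

If the document was updated again before the lookup runs, the event carries that newer version,
not the one produced by the update the event describes.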
|
||||
|
||||
If `fullDocument` is set to `whenAvailable` or `required`, the stage will make use of the stored
|
||||
pre-image of the document in the system's pre-image system collection. It will fetch the pre-image
|
||||
and then apply the delta that is stored in the "update" change event on top of it, and store the
|
||||
result in the `fullDocument` field.
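
To illustrate the "pre-image plus delta" idea, the sketch below applies a simplified update
description to a pre-image. It assumes the delta only touches top-level fields and is expressed as
`updatedFields` and `removedFields`. The server's internal delta format is richer, so treat this
as a conceptual model only.

```python
def apply_update_delta(pre_image, update_description):
    """Build a post-image from a pre-image and a simplified, top-level-only delta."""
    post_image = dict(pre_image)
    # Overwrite or add every field the update set.
    post_image.update(update_description.get("updatedFields", {}))
    # Drop every field the update removed.
    for field in update_description.get("removedFields", []):
        post_image.pop(field, None)
    return post_image
```

Unlike a lookup, this reconstruction depends only on the stored pre-image and the event itself, so
later writes to the document do not affect the result.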
|
||||
|
||||
The `DocumentSourceChangeStreamAddPostImage` code is [here](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/pipeline/document_source_change_stream_add_post_image.h#L63).
|
||||
The `ChangeStreamAddPostImageStage` code is [here](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/exec/agg/change_stream_add_post_image_stage.cpp#L84).
|
||||
The `DocumentSourceChangeStreamAddPostImage` code is
|
||||
[here](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/pipeline/document_source_change_stream_add_post_image.h#L63).
|
||||
The `ChangeStreamAddPostImageStage` code is
|
||||
[here](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/exec/agg/change_stream_add_post_image_stage.cpp#L84).
|
||||
|
||||
#### `$_internalChangeStreamEnsureResumeTokenPresent`
|
||||
|
||||
@ -805,18 +815,22 @@ the change stream parameters is actually in the stream. The stage is only presen
|
||||
stream resume token is not a high water mark token. If the resume token cannot be found in the
|
||||
stream, it will throw a `ChangeStreamFatalError`.
|
||||
|
||||
The `DocumentSourceChangeStreamEnsureResumeTokenPresent` code is [here](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/pipeline/document_source_change_stream_ensure_resume_token_present.h#L51).
|
||||
The `ChangeStreamEnsureResumeTokenPresent` code is [here](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/exec/agg/change_stream_ensure_resume_token_present_stage.cpp#L67).
|
||||
The `DocumentSourceChangeStreamEnsureResumeTokenPresent` code is
|
||||
[here](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/pipeline/document_source_change_stream_ensure_resume_token_present.h#L51).
|
||||
The `ChangeStreamEnsureResumeTokenPresent` code is
|
||||
[here](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/exec/agg/change_stream_ensure_resume_token_present_stage.cpp#L67).
|
||||
|
||||
#### `$_internalChangeStreamHandleTopologyChange`
|
||||
|
||||
This stage is only present in sharded cluster change streams and is always part of the _mongos_
|
||||
merge pipeline. The stage is responsible for opening additional cursors to shards that have been
|
||||
added to the cluster. It will handle "insert" events into the `config.shards` collection that
|
||||
were observed from the config server.
|
||||
added to the cluster. It will handle "insert" events into the `config.shards` collection that were
|
||||
observed from the config server.
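
A hedged sketch of this behavior: the handler below keeps one cursor per shard and opens a new
one when it sees an "insert" into `config.shards`. The class name, the `open_cursor` callback,
and the assumption that the inserted document's `_id` is the shard name are all illustrative, not
taken from the server code.

```python
class TopologyChangeHandler:
    """Toy model of opening cursors on shards that join the cluster."""

    def __init__(self, open_cursor):
        self._open_cursor = open_cursor  # hypothetical callback: shard id -> cursor
        self.cursors = {}

    def handle(self, event):
        """Return True if the event described a newly added shard."""
        is_new_shard = (
            event.get("operationType") == "insert"
            and event.get("ns") == {"db": "config", "coll": "shards"}
        )
        if is_new_shard:
            shard_id = event["fullDocument"]["_id"]
            if shard_id not in self.cursors:
                self.cursors[shard_id] = self._open_cursor(shard_id)
        return is_new_shard
```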
|
||||
|
||||
The `DocumentSourceChangeStreamHandleTopologyChange` code can be found [here](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/pipeline/document_source_change_stream_handle_topology_change.h#L63).
|
||||
The `ChangeStreamHandleTopologyChangeStage` code can be found [here](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/exec/agg/change_stream_handle_topology_change_stage.cpp#L121).
|
||||
The `DocumentSourceChangeStreamHandleTopologyChange` code can be found
|
||||
[here](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/pipeline/document_source_change_stream_handle_topology_change.h#L63).
|
||||
The `ChangeStreamHandleTopologyChangeStage` code can be found
|
||||
[here](https://github.com/mongodb/mongo/blob/eb4c6148f6a25c444be39a0e330506834526d935/src/mongo/db/exec/agg/change_stream_handle_topology_change_stage.cpp#L121).
|
||||
|
||||
## Missing documentation (to be completed)
|
||||
|
||||
|
||||
@ -1,75 +1,70 @@
|
||||
# Command Dispatch
|
||||
|
||||
Command dispatch refers to the general process by which client requests are
|
||||
taken from the network, parsed, sanitized, then finally run on databases.
|
||||
Command dispatch refers to the general process by which client requests are taken from the network,
|
||||
parsed, sanitized, then finally run on databases.
|
||||
|
||||
## Service Entry Points
|
||||
|
||||
[Service entry points][service_entry_point_h] fulfill the transition from the
|
||||
transport layer into command implementations. For each incoming connection
|
||||
from a client (in the form of a [session][session_h] object), a new dedicated
|
||||
thread is spawned then detached, and is also assigned a new [session workflow]
|
||||
[session_workflow_h], responsible for maintaining the workflow of a
|
||||
single client connection during its lifetime. Central to the entry point is the
|
||||
`handleRequest()` function, which manages the server-side logic of processing
|
||||
requests and returns a response message indicating the result of the
|
||||
corresponding request message. This function is currently implemented by several
|
||||
subclasses of the parent `ServiceEntryPoint` in order to account for the
|
||||
differences in processing requests between the shard and router roles -- these
|
||||
distinctions are reflected in the `ServiceEntryPointRouterRole` and
|
||||
`ServiceEntryPointShardRole` subclasses (see [here][service_entry_point_router_role_h]
|
||||
and [here][service_entry_point_shard_role.h]).
|
||||
[Service entry points][service_entry_point_h] fulfill the transition from the transport layer into
|
||||
command implementations. For each incoming connection from a client (in the form of a
|
||||
[session][session_h] object), a new dedicated thread is spawned then detached, and is also assigned
|
||||
a new [session workflow][session_workflow_h], responsible for maintaining the workflow of a single
|
||||
client connection during its lifetime. Central to the entry point is the `handleRequest()` function,
|
||||
which manages the server-side logic of processing requests and returns a response message indicating
|
||||
the result of the corresponding request message. This function is currently implemented by several
|
||||
subclasses of the parent `ServiceEntryPoint` in order to account for the differences in processing
|
||||
requests between the shard and router roles -- these distinctions are reflected in the
|
||||
`ServiceEntryPointRouterRole` and `ServiceEntryPointShardRole` subclasses (see
|
||||
[here][service_entry_point_router_role_h] and [here][service_entry_point_shard_role.h]).
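
The split between a common entry-point interface and role-specific subclasses can be pictured
with the Python sketch below. The real classes are C++. The method signature, the dict-shaped
request and response, and the `session_workflow` helper are assumptions made for illustration.

```python
from abc import ABC, abstractmethod


class ServiceEntryPoint(ABC):
    """Common interface: turn one request message into one response message."""

    @abstractmethod
    def handle_request(self, request):
        ...


class ShardRoleEntryPoint(ServiceEntryPoint):
    def handle_request(self, request):
        # A shard-role entry point would run the command against local data.
        return {"ok": 1, "handledBy": "shard", "command": request["command"]}


class RouterRoleEntryPoint(ServiceEntryPoint):
    def handle_request(self, request):
        # A router-role entry point would route the command to shards.
        return {"ok": 1, "handledBy": "router", "command": request["command"]}


def session_workflow(entry_point, requests):
    """Toy per-connection loop: feed each request to the entry point in order."""
    return [entry_point.handle_request(request) for request in requests]
```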
|
||||
|
||||
## Strategy
|
||||
|
||||
One area in which the _mongos_ entry point differs from its _mongod_ counterpart
|
||||
is in its usage of the [Strategy class][strategy_h]. `Strategy` operates as a
|
||||
legacy interface for processing client read, write, and command requests; there
|
||||
is a near 1-to-1 mapping between its constituent functions and request types
|
||||
(e.g. `writeOp()` for handling write operation requests, `getMore()` for a
|
||||
getMore request, etc.). These functions comprise the backbone of the _mongos_
|
||||
entry point's `handleRequest()` -- that is to say, when a valid request is
|
||||
received, it is sieved and ultimately passed along to the appropriate Strategy
|
||||
class member function. The significance of using the Strategy class specifically
|
||||
with the _mongos_ entry point is that it [facilitates query routing to
|
||||
shards][mongos_router] in _addition_ to running queries against targeted
|
||||
databases (see [s/transaction_router.h][transaction_router_h] for finer
|
||||
details).
|
||||
One area in which the _mongos_ entry point differs from its _mongod_ counterpart is in its usage of
|
||||
the [Strategy class][strategy_h]. `Strategy` operates as a legacy interface for processing client
|
||||
read, write, and command requests; there is a near 1-to-1 mapping between its constituent functions
|
||||
and request types (e.g. `writeOp()` for handling write operation requests, `getMore()` for a getMore
|
||||
request, etc.). These functions comprise the backbone of the _mongos_ entry point's
|
||||
`handleRequest()` -- that is to say, when a valid request is received, it is sieved and ultimately
|
||||
passed along to the appropriate Strategy class member function. The significance of using the
|
||||
Strategy class specifically with the _mongos_ entry point is that it [facilitates query routing to
|
||||
shards][mongos_router] in _addition_ to running queries against targeted databases (see
|
||||
[s/transaction_router.h][transaction_router_h] for finer details).
|
||||
|
||||
## Commands
|
||||
|
||||
The [Command class][commands_h] serves as a means of cataloging a server command
|
||||
as well as ascribing various attributes and behaviors to commands via the [type
|
||||
system][template_method_pattern], that will likely be used during the lifespan
|
||||
of a particular server. Construction of a Command should only occur during
|
||||
server startup. When a new Command is constructed, that Command is stored in a
|
||||
global `CommandRegistry` object for future reference. There are two kinds of
|
||||
Command subclasses: `BasicCommand` and `TypedCommand`.
|
||||
The [Command class][commands_h] serves as a means of cataloging a server command as well as
|
||||
ascribing various attributes and behaviors to commands via the [type
|
||||
system][template_method_pattern], that will likely be used during the lifespan of a particular
|
||||
server. Construction of a Command should only occur during server startup. When a new Command is
|
||||
constructed, that Command is stored in a global `CommandRegistry` object for future reference. There
|
||||
are two kinds of Command subclasses: `BasicCommand` and `TypedCommand`.
|
||||
|
||||
A major distinction between the two is in their implementation of the `parse()`
|
||||
member function. `parse()` takes in a request and returns a handle to a single
|
||||
invocation of a particular Command (represented by a `CommandInvocation`), that
|
||||
can then be used to run the Command. The `BasicCommand::parse()` is a naive
|
||||
implementation that merely forwards incoming requests to the Invocation and
|
||||
makes sure that the Command does not support document sequences. The
|
||||
implementation of `TypedCommand::parse()`, on the other hand, varies depending
|
||||
on the Request type parameter the Command takes in. Since the `TypedCommand`
|
||||
accepts requests generated by IDL, the parsing function associated with a usable
|
||||
Request type must allow it to be parsed as an IDL command. In handling requests,
|
||||
both the _mongos_ and _mongod_ entry points interact with the Command subclasses
|
||||
through the `CommandHelpers` struct in order to parse requests and ultimately
|
||||
run them as Commands.
|
||||
A major distinction between the two is in their implementation of the `parse()` member function.
|
||||
`parse()` takes in a request and returns a handle to a single invocation of a particular Command
|
||||
(represented by a `CommandInvocation`), that can then be used to run the Command. The
|
||||
`BasicCommand::parse()` is a naive implementation that merely forwards incoming requests to the
|
||||
Invocation and makes sure that the Command does not support document sequences. The implementation
|
||||
of `TypedCommand::parse()`, on the other hand, varies depending on the Request type parameter the
|
||||
Command takes in. Since the `TypedCommand` accepts requests generated by IDL, the parsing function
|
||||
associated with a usable Request type must allow it to be parsed as an IDL command. In handling
|
||||
requests, both the _mongos_ and _mongod_ entry points interact with the Command subclasses through
|
||||
the `CommandHelpers` struct in order to parse requests and ultimately run them as Commands.
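
The registration and `parse()` flow described above can be sketched in Python as follows. This is
a conceptual model under stated assumptions: the class shapes, the `documentSequences` request
key, and the `PingCommand` example are hypothetical and do not mirror the C++ signatures.

```python
class CommandRegistry:
    """Toy registry: commands add themselves once, at construction time."""

    def __init__(self):
        self._commands = {}

    def register(self, command):
        if command.name in self._commands:
            raise ValueError(f"duplicate command: {command.name}")
        self._commands[command.name] = command

    def find(self, name):
        return self._commands.get(name)


class Invocation:
    """Handle to a single run of a command for one request."""

    def __init__(self, command, request):
        self.command = command
        self.request = request

    def run(self):
        return self.command.run(self.request)


class BasicCommand:
    def __init__(self, name, registry):
        self.name = name
        registry.register(self)

    def parse(self, request):
        # The basic form only forwards the request and rejects document sequences.
        if request.get("documentSequences"):
            raise ValueError(f"{self.name} does not support document sequences")
        return Invocation(self, request)

    def run(self, request):
        raise NotImplementedError


class PingCommand(BasicCommand):
    def run(self, request):
        return {"ok": 1}
```

A dispatcher would look the command up by name, call `parse()` to obtain an invocation, and then
run that invocation.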
|
||||
|
||||
## Admission control
|
||||
|
||||
To ensure stability of our servers, we have implemented different admission control mechanisms to prevent data-nodes from becoming overloaded with operations. When implementing a new command, it's important to decide whether the command will be subject to one of the admission controls in place and understand the resulting outcomes.
|
||||
To ensure stability of our servers, we have implemented different admission control mechanisms to
|
||||
prevent data-nodes from becoming overloaded with operations. When implementing a new command, it's
|
||||
important to decide whether the command will be subject to one of the admission controls in place
|
||||
and understand the resulting outcomes.
|
||||
|
||||
For example, user commands may be subject to Ingress Admission Control, which happens in the [ServiceEntryPoint][IngressControl].
|
||||
For information on admission control and how to implement admission control into a new command, please see [Admission Control README][ACReadMe].
|
||||
For example, user commands may be subject to Ingress Admission Control, which happens in the
|
||||
[ServiceEntryPoint][IngressControl]. For information on admission control and how to implement
|
||||
admission control into a new command, please see [Admission Control README][ACReadMe].
|
||||
|
||||
## See Also
|
||||
|
||||
For details on transport internals, including ingress networking, see [this document][transport_internals].
|
||||
For details on transport internals, including ingress networking, see [this
|
||||
document][transport_internals].
|
||||
|
||||
[service_entry_point_h]: ../src/mongo/transport/service_entry_point.h
|
||||
[session_h]: ../src/mongo/transport/session.h
|
||||
@ -85,4 +80,5 @@ For details on transport internals, including ingress networking, see [this docu
|
||||
[template_method_pattern]: https://en.wikipedia.org/wiki/Template_method_pattern
|
||||
[transport_internals]: ../src/mongo/transport/README.md
|
||||
[ACReadMe]: ../src/mongo/db/admission/README.md
|
||||
[IngressControl]: https://github.com/mongodb/mongo/blob/a86c7f5de2a5de4d2f49e40e8970754ec6a5ba6c/src/mongo/db/service_entry_point_shard_role.cpp#L1803
|
||||
[IngressControl]:
|
||||
https://github.com/mongodb/mongo/blob/a86c7f5de2a5de4d2f49e40e8970754ec6a5ba6c/src/mongo/db/service_entry_point_shard_role.cpp#L1803
|
||||
|
||||
@ -14,9 +14,9 @@ dynamically extensible.
|
||||
A `ServiceContext` represents all of the state of a single Mongo server process, which may be either
|
||||
a `mongod` or a `mongos`. It creates and manages the previously mentioned `Client`s and
|
||||
`OperationContext`s, as well as a `TransportLayer` for performing network operations, a
|
||||
`PeriodicRunner` for running housekeeping tasks periodically, a `StorageEngine` for interacting
|
||||
with the actual database itself, and a set of time sources. In general, every Mongo server process
|
||||
has a single `ServiceContext`, known as the _global_ `ServiceContext`. Typical uses of the global
|
||||
`PeriodicRunner` for running housekeeping tasks periodically, a `StorageEngine` for interacting with
|
||||
the actual database itself, and a set of time sources. In general, every Mongo server process has a
|
||||
single `ServiceContext`, known as the _global_ `ServiceContext`. Typical uses of the global
|
||||
`ServiceContext` outside of server initialization and shutdown include looking up `Client` or
|
||||
`OperationContext` information for a particular thread or operation, or killing one or more running
|
||||
operations during, e.g., a primary replica step-down. The global `ServiceContext` is created during
|
||||
@ -28,16 +28,16 @@ The `ServiceContext` associated with a given `Client` object can be fetched in a
|
||||
using [`Client::getServiceContext()`][client-get-service-context-url] when possible. As of time of
|
||||
writing, every server process only maintains a single `ServiceContext`, but preferring
|
||||
`Client::getServiceContext()` or `ServiceContext::getCurrentServiceContext()` over
|
||||
[`ServiceContext::getGlobalServiceContext()`][get-global-service-context-url] will allow us to
|
||||
more easily maintain multiple `ServiceContext`s per server process if desired in the future.
|
||||
[`ServiceContext::getGlobalServiceContext()`][get-global-service-context-url] will allow us to more
|
||||
easily maintain multiple `ServiceContext`s per server process if desired in the future.
|
||||
|
||||
## [`Client`][client-url]
|
||||
|
||||
Each logical connection to a Mongo service is managed by a `Client` object, where a logical
|
||||
connection may be a user or an internal process that needs to run a command or query on the database.
|
||||
Construction of a `Client` object is typically performed with a call to `makeClient` on the global
|
||||
`ServiceContext`, which can then be attached to any thread of execution, or with a call to
|
||||
[`Client::initThread`][client-init-thread-url] which constructs a `Client` on the global
|
||||
connection may be a user or an internal process that needs to run a command or query on the
|
||||
database. Construction of a `Client` object is typically performed with a call to `makeClient` on
|
||||
the global `ServiceContext`, which can then be attached to any thread of execution, or with a call
|
||||
to [`Client::initThread`][client-init-thread-url] which constructs a `Client` on the global
|
||||
`ServiceContext` and binds it to the current thread. All operations executed by the `Client` will
|
||||
take place on that `Client`’s associated thread serially over the network connection managed by the
|
||||
`Session` object that was passed into the `Client`’s constructor. If no `Session` is passed to the
|
||||
@ -70,13 +70,13 @@ operations. The semantics of the `Client` lock are summarized in the table below
|
||||
|
||||
[`Client::cc()`][client-cc-url] may be used to get the `Client` object associated with the currently
|
||||
executing thread. Prefer passing `Client` objects as parameters over calls to `Client::cc()` when
|
||||
possible. A [`ThreadClient`][thread-client-url] is an RAII-style class which may be used to construct
|
||||
and bind a `Client` to the current running thread and automatically unbind it once the `ThreadClient`
|
||||
goes out of scope. An [`AlternativeClientRegion`][acr-url] is another RAII-style class which may be
|
||||
used to temporarily bind a `Client` object to the currently running thread (holding any currently
|
||||
bound `Client` in reserve), rebinding the current thread’s old `Client` to the current thread upon
|
||||
falling out of scope. [`ClientStrand`][client-strand-url] functions similarly, but also provides an
|
||||
`Executor` interface for binding a `Client` to an arbitrary thread.
|
||||
possible. A [`ThreadClient`][thread-client-url] is an RAII-style class which may be used to
|
||||
construct and bind a `Client` to the current running thread and automatically unbind it once the
|
||||
`ThreadClient` goes out of scope. An [`AlternativeClientRegion`][acr-url] is another RAII-style
|
||||
class which may be used to temporarily bind a `Client` object to the currently running thread
|
||||
(holding any currently bound `Client` in reserve), rebinding the current thread’s old `Client` to
|
||||
the current thread upon falling out of scope. [`ClientStrand`][client-strand-url] functions
|
||||
similarly, but also provides an `Executor` interface for binding a `Client` to an arbitrary thread.
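
The scoped binding that `ThreadClient` and `AlternativeClientRegion` provide can be approximated
with Python context managers. The sketch below is an analogy, not the C++ API: the thread-local
slot, the function names, and the use of strings as clients are assumptions for illustration.

```python
import threading
from contextlib import contextmanager

_current = threading.local()


def current_client():
    """Return the client bound to the calling thread, or None."""
    return getattr(_current, "client", None)


@contextmanager
def thread_client(client):
    """Bind a client to this thread and unbind it when the scope ends."""
    if current_client() is not None:
        raise RuntimeError("thread already has a client")
    _current.client = client
    try:
        yield client
    finally:
        _current.client = None


@contextmanager
def alternative_client_region(client):
    """Temporarily swap in another client, restoring the previous one on exit."""
    previous = current_client()
    _current.client = client
    try:
        yield client
    finally:
        _current.client = previous
```

Nesting an `alternative_client_region` inside a `thread_client` scope makes the alternative client
current for the inner block only. The outer client is back in place as soon as the inner block
exits.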
|
||||
|
||||
## [`OperationContext`][operation-context-url]
|
||||
|
||||
@ -92,23 +92,37 @@ performed asynchronously.
|
||||
|
||||
### Interruptibility
|
||||
|
||||
`OperationContext`s implement the [`Interruptible`][interruptible-url] interface, which allows them to
|
||||
be killed by their associated `Client`s (or, by proxy, their owning `ServiceContext`). See
|
||||
[this comment block][opctx-interruptible-comment-block-url] for more details on when and how
|
||||
`OperationContext`s implement the [`Interruptible`][interruptible-url] interface, which allows them
|
||||
to be killed by their associated `Client`s (or, by proxy, their owning `ServiceContext`). See [this
|
||||
comment block][opctx-interruptible-comment-block-url] for more details on when and how
|
||||
`OperationContext`s are interrupted.
|
||||
|
||||
[service-context-url]: https://github.com/mongodb/mongo/blob/ecc6179c18ed1e3b38d7ee244319210b18e24bad/src/mongo/db/service_context.h#L141
|
||||
[decorable-url]: https://github.com/mongodb/mongo/blob/ecc6179c18ed1e3b38d7ee244319210b18e24bad/src/mongo/util/decorable.h
|
||||
[client-get-service-context-url]: https://github.com/mongodb/mongo/blob/ecc6179c18ed1e3b38d7ee244319210b18e24bad/src/mongo/db/client.h#L117
|
||||
[get-global-service-context-url]: https://github.com/mongodb/mongo/blob/ecc6179c18ed1e3b38d7ee244319210b18e24bad/src/mongo/db/service_context.h#L755
|
||||
[client-url]: https://github.com/mongodb/mongo/blob/ecc6179c18ed1e3b38d7ee244319210b18e24bad/src/mongo/db/client.h
|
||||
[client-init-thread-url]: https://github.com/mongodb/mongo/blob/ecc6179c18ed1e3b38d7ee244319210b18e24bad/src/mongo/db/client.h#L75
|
||||
[client-cc-url]: https://github.com/mongodb/mongo/blob/ecc6179c18ed1e3b38d7ee244319210b18e24bad/src/mongo/db/client.h#L372
|
||||
[thread-client-url]: https://github.com/mongodb/mongo/blob/ecc6179c18ed1e3b38d7ee244319210b18e24bad/src/mongo/db/client.h#L320
|
||||
[acr-url]: https://github.com/mongodb/mongo/blob/ecc6179c18ed1e3b38d7ee244319210b18e24bad/src/mongo/db/client.h#L347
|
||||
[client-strand-url]: https://github.com/mongodb/mongo/blob/ecc6179c18ed1e3b38d7ee244319210b18e24bad/src/mongo/db/client_strand.h
|
||||
[operation-context-url]: https://github.com/mongodb/mongo/blob/ecc6179c18ed1e3b38d7ee244319210b18e24bad/src/mongo/db/operation_context.h
|
||||
[service-context-url]:
|
||||
https://github.com/mongodb/mongo/blob/ecc6179c18ed1e3b38d7ee244319210b18e24bad/src/mongo/db/service_context.h#L141
|
||||
[decorable-url]:
|
||||
https://github.com/mongodb/mongo/blob/ecc6179c18ed1e3b38d7ee244319210b18e24bad/src/mongo/util/decorable.h
|
||||
[client-get-service-context-url]:
|
||||
https://github.com/mongodb/mongo/blob/ecc6179c18ed1e3b38d7ee244319210b18e24bad/src/mongo/db/client.h#L117
|
||||
[get-global-service-context-url]:
|
||||
https://github.com/mongodb/mongo/blob/ecc6179c18ed1e3b38d7ee244319210b18e24bad/src/mongo/db/service_context.h#L755
|
||||
[client-url]:
|
||||
https://github.com/mongodb/mongo/blob/ecc6179c18ed1e3b38d7ee244319210b18e24bad/src/mongo/db/client.h
|
||||
[client-init-thread-url]:
|
||||
https://github.com/mongodb/mongo/blob/ecc6179c18ed1e3b38d7ee244319210b18e24bad/src/mongo/db/client.h#L75
|
||||
[client-cc-url]:
|
||||
https://github.com/mongodb/mongo/blob/ecc6179c18ed1e3b38d7ee244319210b18e24bad/src/mongo/db/client.h#L372
|
||||
[thread-client-url]:
|
||||
https://github.com/mongodb/mongo/blob/ecc6179c18ed1e3b38d7ee244319210b18e24bad/src/mongo/db/client.h#L320
|
||||
[acr-url]:
|
||||
https://github.com/mongodb/mongo/blob/ecc6179c18ed1e3b38d7ee244319210b18e24bad/src/mongo/db/client.h#L347
|
||||
[client-strand-url]:
|
||||
https://github.com/mongodb/mongo/blob/ecc6179c18ed1e3b38d7ee244319210b18e24bad/src/mongo/db/client_strand.h
|
||||
[operation-context-url]:
|
||||
https://github.com/mongodb/mongo/blob/ecc6179c18ed1e3b38d7ee244319210b18e24bad/src/mongo/db/operation_context.h
|
||||
[kill-op-url]: https://docs.mongodb.com/manual/reference/command/killOp/
|
||||
[baton-url]: https://github.com/mongodb/mongo/blob/ecc6179c18ed1e3b38d7ee244319210b18e24bad/src/mongo/db/baton.h
|
||||
[interruptible-url]: https://github.com/mongodb/mongo/blob/ecc6179c18ed1e3b38d7ee244319210b18e24bad/src/mongo/util/interruptible.h
|
||||
[opctx-interruptible-comment-block-url]: https://github.com/mongodb/mongo/blob/ecc6179c18ed1e3b38d7ee244319210b18e24bad/src/mongo/db/operation_context.cpp#L281
|
||||
[baton-url]:
|
||||
https://github.com/mongodb/mongo/blob/ecc6179c18ed1e3b38d7ee244319210b18e24bad/src/mongo/db/baton.h
|
||||
[interruptible-url]:
|
||||
https://github.com/mongodb/mongo/blob/ecc6179c18ed1e3b38d7ee244319210b18e24bad/src/mongo/util/interruptible.h
|
||||
[opctx-interruptible-comment-block-url]:
|
||||
https://github.com/mongodb/mongo/blob/ecc6179c18ed1e3b38d7ee244319210b18e24bad/src/mongo/db/operation_context.cpp#L281
|
||||
|
||||
File diff suppressed because it is too large
@ -4,8 +4,10 @@
|
||||
|
||||
**👉 Please visit the new [Dev Container Documentation](./devcontainer/README.md) for:**
|
||||
|
||||
- 📖 [**Getting Started Guide**](./devcontainer/getting-started.md) - Step-by-step setup instructions
|
||||
- 🏗️ [**Architecture & Technical Details**](./devcontainer/architecture.md) - How everything works under the hood
|
||||
- 📖 [**Getting Started Guide**](./devcontainer/getting-started.md) - Step-by-step setup
|
||||
instructions
|
||||
- 🏗️ [**Architecture & Technical Details**](./devcontainer/architecture.md) - How everything works
|
||||
under the hood
|
||||
- 🔧 [**Troubleshooting Guide**](./devcontainer/troubleshooting.md) - Solutions to common issues
|
||||
- 💡 [**Advanced Usage**](./devcontainer/advanced.md) - Customization and power user features
|
||||
- ❓ [**FAQ**](./devcontainer/faq.md) - Frequently asked questions
|
||||
|
||||
@ -1,10 +1,12 @@
|
||||
# MongoDB Development with Dev Containers
|
||||
|
||||
**⚠️ BETA:** The devcontainer setup is currently in Beta stage. Please report issues and feedback to the team.
|
||||
**⚠️ BETA:** The devcontainer setup is currently in Beta stage. Please report issues and feedback to
|
||||
the team.
|
||||
|
||||
## 📚 Documentation Index
|
||||
|
||||
This is the comprehensive guide for developing MongoDB using Dev Containers. Choose the guide that best fits your needs:
|
||||
This is the comprehensive guide for developing MongoDB using Dev Containers. Choose the guide that
|
||||
best fits your needs:
|
||||
|
||||
### 🚀 [Getting Started](./getting-started.md)
|
||||
|
||||
@ -80,7 +82,8 @@ This is the comprehensive guide for developing MongoDB using Dev Containers. Cho
|
||||
|
||||
## What are Dev Containers?
|
||||
|
||||
Dev Containers provide a consistent, reproducible development environment using Docker containers. This ensures:
|
||||
Dev Containers provide a consistent, reproducible development environment using Docker containers.
|
||||
This ensures:
|
||||
|
||||
- ✅ **Consistency**: Everyone works with identical tooling and dependencies
|
||||
- ✅ **Isolation**: Your host system stays clean
|
||||
|
||||
@ -1,8 +1,10 @@
|
||||
# Advanced Dev Container Usage
|
||||
|
||||
This guide covers advanced workflows and power user features for managing multiple containers, backups, and complex development scenarios.
|
||||
This guide covers advanced workflows and power user features for managing multiple containers,
|
||||
backups, and complex development scenarios.
|
||||
|
||||
**Looking to customize your devcontainer?** See the [Customization Guide](./customization.md) for dotfiles, VS Code settings, extensions, and performance tuning.
|
||||
**Looking to customize your devcontainer?** See the [Customization Guide](./customization.md) for
|
||||
dotfiles, VS Code settings, extensions, and performance tuning.
|
||||
|
||||
## Table of Contents
|
||||
|
||||
|
||||
@ -1,6 +1,7 @@
|
||||
# Dev Container Architecture
|
||||
|
||||
This document provides a deep dive into how the MongoDB devcontainer is structured and how all the pieces work together.
|
||||
This document provides a deep dive into how the MongoDB devcontainer is structured and how all the
|
||||
pieces work together.
|
||||
|
||||
## Table of Contents
|
||||
|
||||
@ -201,7 +202,8 @@ MongoDB requires specific compiler versions. The toolchain installation process
|
||||
|
||||
### Toolchain Configuration
|
||||
|
||||
The `toolchain_config.env` file contains architecture-specific toolchain definitions for both ARM64 and AMD64:
|
||||
The `toolchain_config.env` file contains architecture-specific toolchain definitions for both ARM64
|
||||
and AMD64:
|
||||
|
||||
```bash
|
||||
# Generated by toolchain.py
|
||||
@ -289,7 +291,8 @@ The MongoDB toolchain includes:
|
||||
|
||||
### Toolchain Updates
|
||||
|
||||
The toolchain is managed by the MongoDB team. When updates are available, you'll get them automatically when you:
|
||||
The toolchain is managed by the MongoDB team. When updates are available, you'll get them
|
||||
automatically when you:
|
||||
|
||||
- Pull the latest changes from the repository
|
||||
- Rebuild your devcontainer
|
||||
|
||||
@ -1,10 +1,14 @@
|
||||
# Customizing Your Dev Container
|
||||
|
||||
This guide covers personal customizations you can make to your MongoDB devcontainer **without modifying the repository's devcontainer configuration**. These are user-level settings that only affect your development environment.
|
||||
This guide covers personal customizations you can make to your MongoDB devcontainer **without
|
||||
modifying the repository's devcontainer configuration**. These are user-level settings that only
|
||||
affect your development environment.
|
||||
|
||||
**Want to modify the devcontainer setup for everyone?** See [Contributing Customizations](#contributing-customizations) at the bottom.
|
||||
**Want to modify the devcontainer setup for everyone?** See
|
||||
[Contributing Customizations](#contributing-customizations) at the bottom.
|
||||
|
||||
**For general VS Code settings** (themes, fonts, keybindings), see the [VS Code documentation](https://code.visualstudio.com/docs/getstarted/settings).
|
||||
**For general VS Code settings** (themes, fonts, keybindings), see the
|
||||
[VS Code documentation](https://code.visualstudio.com/docs/getstarted/settings).
|
||||
|
||||
## Table of Contents
|
||||
|
||||
@ -76,7 +80,9 @@ This applies to all devcontainers you work with, not just MongoDB.
|
||||
|
||||
## Contributing Customizations
|
||||
|
||||
The customizations above are all user-level and don't require changes to the repository. If you want to modify the devcontainer setup itself to benefit all MongoDB developers, you'll need to submit a PR.
|
||||
The customizations above are all user-level and don't require changes to the repository. If you want
|
||||
to modify the devcontainer setup itself to benefit all MongoDB developers, you'll need to submit a
|
||||
PR.
|
||||
|
||||
**Examples of repository-level customizations:**
|
||||
|
||||
@ -108,4 +114,5 @@ The customizations above are all user-level and don't require changes to the rep
|
||||
- [Architecture](./architecture.md) - How devcontainers work
|
||||
- [Advanced Usage](./advanced.md) - Multiple containers, backups, workflows
|
||||
- [Troubleshooting](./troubleshooting.md) - Fix issues
|
||||
- [VS Code Dev Containers Documentation](https://code.visualstudio.com/docs/devcontainers/containers) - General VS Code features
|
||||
- [VS Code Dev Containers Documentation](https://code.visualstudio.com/docs/devcontainers/containers) -
|
||||
General VS Code features
|
||||
|
||||
@ -6,14 +6,16 @@ Frequently asked questions about MongoDB development with dev containers.
|
||||
|
||||
### What is a dev container?
|
||||
|
||||
A dev container (development container) is a Docker container configured specifically for development. It includes:
|
||||
A dev container (development container) is a Docker container configured specifically for
|
||||
development. It includes:
|
||||
|
||||
- All build tools and dependencies
|
||||
- IDE configuration and extensions
|
||||
- Persistent storage for caches and settings
|
||||
- Consistent environment across all developers
|
||||
|
||||
Think of it as a portable, reproducible development environment that runs on any machine with Docker.
|
||||
Think of it as a portable, reproducible development environment that runs on any machine with
|
||||
Docker.
|
||||
|
||||
[Learn more about dev containers →](https://containers.dev/)
|
||||
|
||||
@ -43,11 +45,14 @@ Report issues to help improve it for everyone!
|
||||
- Pros: Works without SSH keys, simpler for read-only access
|
||||
- Cons: May require password/token for push operations
|
||||
|
||||
See the [Getting Started guide SSH setup section](./getting-started.md#4-configure-ssh-keys-recommended) for details.
|
||||
See the
|
||||
[Getting Started guide SSH setup section](./getting-started.md#4-configure-ssh-keys-recommended) for
|
||||
details.
|
||||
|
||||
### How do SSH keys work with devcontainers?
|
||||
|
||||
VS Code automatically forwards your SSH agent to the container, so you don't need to copy keys into the container.
|
||||
VS Code automatically forwards your SSH agent to the container, so you don't need to copy keys into
|
||||
the container.
|
||||
|
||||
**Requirements:**
|
||||
|
||||
@ -65,7 +70,8 @@ ssh-add -l
|
||||
ssh -T git@github.com
|
||||
```
|
||||
|
||||
**Inside the container**, Git commands will automatically use your host's SSH keys through agent forwarding.
|
||||
**Inside the container**, Git commands will automatically use your host's SSH keys through agent
|
||||
forwarding.
|
||||
|
||||
[Learn more about SSH agent forwarding →](https://code.visualstudio.com/remote/advancedcontainers/sharing-git-credentials)
|
||||
|
||||
@ -126,7 +132,8 @@ First-time setup includes:
|
||||
- WSL2 installed and configured
|
||||
- Docker Desktop with WSL2 integration enabled
|
||||
|
||||
**Important:** Clone the repository in the WSL2 filesystem, not the Windows filesystem (`/mnt/c/`), for best performance.
|
||||
**Important:** Clone the repository in the WSL2 filesystem, not the Windows filesystem (`/mnt/c/`),
|
||||
for best performance.
|
||||
|
||||
### Can I use this on Apple Silicon (M1/M2/M3)?
|
||||
|
||||
@ -161,7 +168,8 @@ docker cp <container_id>:/workspaces/mongo/file.txt ~/Downloads/
|
||||
|
||||
**Option 3: Use bind mount** (sacrifices performance)
|
||||
|
||||
Open your existing local repository in VS Code and use "Dev Containers: Reopen in Container". This uses a bind mount which allows direct host filesystem access but is slower, especially on macOS.
|
||||
Open your existing local repository in VS Code and use "Dev Containers: Reopen in Container". This
|
||||
uses a bind mount which allows direct host filesystem access but is slower, especially on macOS.
|
||||
|
||||
### Can I use my existing local clone?
|
||||
|
||||
@ -369,8 +377,7 @@ gcc --version # Should show the MongoDB toolchain GCC version
|
||||
ls -la ~/.config/engflow_auth/
|
||||
```
|
||||
|
||||
**Re-authenticate:**
|
||||
Contact MongoDB team for authentication flow.
|
||||
**Re-authenticate:** Contact MongoDB team for authentication flow.
|
||||
|
||||
**Build locally instead:**
|
||||
|
||||
@ -406,13 +413,15 @@ Allocate as much disk space as you can comfortably spare. We recommend at least
|
||||
|
||||
**Allocate as much as possible** while leaving enough for your host OS to function (~4-8 GB).
|
||||
|
||||
More RAM = faster builds with more parallel jobs. MongoDB builds are resource-intensive and benefit greatly from additional memory.
|
||||
More RAM = faster builds with more parallel jobs. MongoDB builds are resource-intensive and benefit
|
||||
greatly from additional memory.
|
||||
|
||||
### How many CPU cores should I allocate?
|
||||
|
||||
**Allocate as many cores as possible** while leaving a couple for your host OS (1-2 cores).
|
||||
|
||||
Bazel parallelizes well; more cores = significantly faster builds. If you have 8+ cores available, MongoDB builds will complete much faster.
|
||||
Bazel parallelizes well; more cores = significantly faster builds. If you have 8+ cores available,
|
||||
MongoDB builds will complete much faster.
|
||||
|
||||
### Can I reduce resource usage?
|
||||
|
||||
@ -437,7 +446,8 @@ bazel clean # Clear build outputs
|
||||
bazel clean --expunge # Clear everything (reclaim disk space)
|
||||
```
|
||||
|
||||
> **Note:** Reducing resources will make builds slower. If possible, it's better to allocate more resources to Docker instead.
|
||||
> **Note:** Reducing resources will make builds slower. If possible, it's better to allocate more
|
||||
> resources to Docker instead.
|
||||
|
||||
### How do I monitor resource usage?
|
||||
|
||||
@ -492,7 +502,8 @@ But you lose VS Code integration, extensions, and convenience features.
|
||||
- **Architecture Details**: [architecture.md](./architecture.md)
|
||||
- **Troubleshooting**: [troubleshooting.md](./troubleshooting.md)
|
||||
- **Advanced Topics**: [advanced.md](./advanced.md)
|
||||
- **VS Code Docs**: [code.visualstudio.com/docs/devcontainers](https://code.visualstudio.com/docs/devcontainers/containers)
|
||||
- **VS Code Docs**:
|
||||
[code.visualstudio.com/docs/devcontainers](https://code.visualstudio.com/docs/devcontainers/containers)
|
||||
|
||||
### Who do I contact for help?
|
||||
|
||||
|
||||
@ -1,16 +1,19 @@
|
||||
# Getting Started with MongoDB Dev Containers
|
||||
|
||||
This guide will walk you through setting up your MongoDB development environment using Dev Containers.
|
||||
This guide will walk you through setting up your MongoDB development environment using Dev
|
||||
Containers.
|
||||
|
||||
## Prerequisites
|
||||
|
||||
### 1. Install Docker
|
||||
|
||||
Dev Containers require Docker to be installed and running on your system. Choose one of the following Docker providers:
|
||||
Dev Containers require Docker to be installed and running on your system. Choose one of the
|
||||
following Docker providers:
|
||||
|
||||
#### Option A: Rancher Desktop (Recommended)
|
||||
|
||||
[Rancher Desktop](https://rancherdesktop.io/) is our recommended Docker provider for devcontainer development.
|
||||
[Rancher Desktop](https://rancherdesktop.io/) is our recommended Docker provider for devcontainer
|
||||
development.
|
||||
|
||||
**Installation:**
|
||||
|
||||
@ -20,28 +23,34 @@ Dev Containers require Docker to be installed and running on your system. Choose
|
||||
- **Container Engine**: Select `dockerd (moby)` ⚠️ **Important!**
|
||||
- **Configure Path**: Select "Automatic"
|
||||
|
||||
**Recommended Settings:**
|
||||
After installation, increase resources for better build performance:
|
||||
**Recommended Settings:** After installation, increase resources for better build performance:
|
||||
|
||||
1. Open Rancher Desktop → Preferences → Virtual Machine
|
||||
2. **Memory**: Allocate as much as your system allows (leave ~4-8 GB for your host OS)
|
||||
3. **CPUs**: Allocate as many cores as possible (leave 1-2 for your host OS)
|
||||
4. **Disk**: Rancher Desktop doesn't have a UI for disk size. To increase it, see [Troubleshooting - Increase Docker disk allocation](./troubleshooting.md#build-fails-with-no-space-left-on-device) for instructions.
|
||||
4. **Disk**: Rancher Desktop doesn't have a UI for disk size. To increase it, see
|
||||
[Troubleshooting - Increase Docker disk allocation](./troubleshooting.md#build-fails-with-no-space-left-on-device)
|
||||
for instructions.
|
||||
5. Apply changes and restart Rancher Desktop
|
||||
|
||||
> **Tip:** More resources = faster builds. MongoDB builds benefit significantly from additional CPU cores and memory.
|
||||
> **Tip:** More resources = faster builds. MongoDB builds benefit significantly from additional CPU
|
||||
> cores and memory.
|
||||
|
||||
**IMPORTANT!**: If you already have VSCode open when you install Rancher Desktop, make sure to restart VSCode; otherwise it may not find the Docker socket and VSCode will prompt you to install Docker Desktop instead.
|
||||
**IMPORTANT!**: If you already have VSCode open when you install Rancher Desktop, make sure to
|
||||
restart VSCode; otherwise it may not find the Docker socket and VSCode will prompt you to install
|
||||
Docker Desktop instead.
|
||||
|
||||
#### Option B: Docker Desktop
|
||||
|
||||
[Docker Desktop](https://www.docker.com/products/docker-desktop/) is a popular alternative.
|
||||
|
||||
> **Note on Licensing**: Docker Desktop may require a paid license for commercial use. Please review the licensing terms to ensure compliance with your use case.
|
||||
> **Note on Licensing**: Docker Desktop may require a paid license for commercial use. Please review
|
||||
> the licensing terms to ensure compliance with your use case.
|
||||
|
||||
**Installation:**
|
||||
|
||||
1. Download from [docker.com/products/docker-desktop](https://www.docker.com/products/docker-desktop/)
|
||||
1. Download from
|
||||
[docker.com/products/docker-desktop](https://www.docker.com/products/docker-desktop/)
|
||||
2. Install and start Docker Desktop
|
||||
3. Go to Settings → Resources and allocate generously:
|
||||
- **Memory**: Allocate as much as possible (leave ~4-8 GB for your host OS)
|
||||
@ -52,7 +61,8 @@ After installation, increase resources for better build performance:
|
||||
|
||||
[OrbStack](https://orbstack.dev/) is a lightweight, fast Docker alternative for macOS.
|
||||
|
||||
> **Note on Licensing**: OrbStack may require a paid license for commercial use. Please review the licensing terms to ensure compliance with your use case.
|
||||
> **Note on Licensing**: OrbStack may require a paid license for commercial use. Please review the
|
||||
> licensing terms to ensure compliance with your use case.
|
||||
|
||||
**Installation:**
|
||||
|
||||
@ -64,12 +74,14 @@ After installation, increase resources for better build performance:
|
||||
|
||||
For Linux users, you can use Docker Engine directly.
|
||||
|
||||
**Installation:**
|
||||
Follow the official guide: [docs.docker.com/engine/install](https://docs.docker.com/engine/install/)
|
||||
**Installation:** Follow the official guide:
|
||||
[docs.docker.com/engine/install](https://docs.docker.com/engine/install/)
|
||||
|
||||
### 2. Create SSH Directory (Required)
|
||||
|
||||
> **⚠️ Critical:** You **must** have a `~/.ssh` directory on your host machine before building the devcontainer. The devcontainer requires this directory to exist, regardless of whether you use SSH or HTTPS to clone the repository.
|
||||
> **⚠️ Critical:** You **must** have a `~/.ssh` directory on your host machine before building the
|
||||
> devcontainer. The devcontainer requires this directory to exist, regardless of whether you use SSH
|
||||
> or HTTPS to clone the repository.
|
||||
|
||||
```bash
|
||||
# On your HOST machine (not inside the container)
|
||||
@ -87,13 +99,17 @@ Download and install VS Code from [code.visualstudio.com](https://code.visualstu
|
||||
1. Open VS Code
|
||||
2. Go to Extensions (⌘/Ctrl+Shift+X)
|
||||
3. Search for "Dev Containers"
|
||||
4. Install the [Dev Containers](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-containers) extension by Microsoft
|
||||
4. Install the
|
||||
[Dev Containers](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-containers)
|
||||
extension by Microsoft
|
||||
|
||||
### 5. Configure SSH Keys (Recommended)
|
||||
|
||||
To clone the repository using SSH (recommended for contributors), you'll need SSH keys configured with GitHub.
|
||||
To clone the repository using SSH (recommended for contributors), you'll need SSH keys configured
|
||||
with GitHub.
|
||||
|
||||
> **⚠️ Important:** Run all commands in this section on your **host machine** (not inside the container). SSH keys need to be set up before cloning the repository into the container.
|
||||
> **⚠️ Important:** Run all commands in this section on your **host machine** (not inside the
|
||||
> container). SSH keys need to be set up before cloning the repository into the container.
|
||||
|
||||
#### Check if you have SSH keys
|
||||
|
||||
@ -183,7 +199,8 @@ Get-Service ssh-agent | Set-Service -StartupType Automatic
|
||||
Start-Service ssh-agent
|
||||
```
|
||||
|
||||
> **Note:** VS Code automatically forwards your SSH agent to the container, so your keys will be available inside the devcontainer.
|
||||
> **Note:** VS Code automatically forwards your SSH agent to the container, so your keys will be
|
||||
> available inside the devcontainer.
|
||||
|
||||
[Learn more about using SSH keys with GitHub →](https://docs.github.com/en/authentication/connecting-to-github-with-ssh)
|
||||
|
||||
@ -191,7 +208,8 @@ Start-Service ssh-agent
|
||||
|
||||
### Step 1: Clone Repository in Named Container Volume
|
||||
|
||||
For **optimal performance**, especially on macOS, clone the repository directly into a Docker volume rather than your local filesystem. This is crucial for Bazel performance.
|
||||
For **optimal performance**, especially on macOS, clone the repository directly into a Docker volume
|
||||
rather than your local filesystem. This is crucial for Bazel performance.
|
||||
|
||||
#### Why Named Volumes?
|
||||
|
||||
@ -397,7 +415,8 @@ ssh-add ~/.ssh/id_ed25519
|
||||
# Command Palette → "Dev Containers: Rebuild Container"
|
||||
```
|
||||
|
||||
**VS Code SSH Agent Forwarding**: The Dev Containers extension automatically forwards your SSH agent, but this requires:
|
||||
**VS Code SSH Agent Forwarding**: The Dev Containers extension automatically forwards your SSH
|
||||
agent, but this requires:
|
||||
|
||||
- SSH agent running on host with keys loaded
|
||||
- SSH key files in default location (`~/.ssh/`)
|
||||
|
||||
@ -28,7 +28,8 @@ Docker version <version> or later is required
|
||||
|
||||
**Solution**
|
||||
|
||||
Restart VSCode. If you install Rancher Desktop while you already have VSCode open, it doesn't properly detect the Docker socket and prompts you to install Docker Desktop by mistake.
|
||||
Restart VSCode. If you install Rancher Desktop while you already have VSCode open, it doesn't
|
||||
properly detect the Docker socket and prompts you to install Docker Desktop by mistake.
|
||||
|
||||
## Container Build Issues
|
||||
|
||||
@ -48,7 +49,9 @@ Error response from daemon: invalid mount config for type "bind": bind source pa
|
||||
|
||||
**Root Cause:**
|
||||
|
||||
The devcontainer configuration mounts your `~/.ssh` directory to enable Git operations over SSH. If this directory doesn't exist on your host machine, the container fails to start. **This directory is required even if you plan to use HTTPS instead of SSH for cloning.**
|
||||
The devcontainer configuration mounts your `~/.ssh` directory to enable Git operations over SSH. If
|
||||
this directory doesn't exist on your host machine, the container fails to start. **This directory is
|
||||
required even if you plan to use HTTPS instead of SSH for cloning.**
|
||||
|
||||
**Solutions:**
|
||||
|
||||
@ -73,7 +76,8 @@ SSH agent forwarding behavior varies by Docker provider on macOS:
|
||||
- With dockerd runtime: Automatic agent forwarding
|
||||
- With containerd runtime: Agent forwarding requires additional setup
|
||||
|
||||
To use SSH agent forwarding, ensure your SSH keys are added to your host's SSH agent before starting the container:
|
||||
To use SSH agent forwarding, ensure your SSH keys are added to your host's SSH agent before starting
|
||||
the container:
|
||||
|
||||
```bash
|
||||
ssh-add ~/.ssh/id_ed25519 # or your key name
|
||||
@ -117,7 +121,8 @@ Error: failed to solve: write /var/lib/docker/...: no space left on device
|
||||
disk: 100GB
|
||||
```
|
||||
4. Start Rancher Desktop
|
||||
5. If Rancher Desktop was previously initialized, you may need to perform a factory reset (Preferences → Troubleshooting → Reset Kubernetes) for the disk size change to take effect.
|
||||
5. If Rancher Desktop was previously initialized, you may need to perform a factory reset
|
||||
(Preferences → Troubleshooting → Reset Kubernetes) for the disk size change to take effect.
|
||||
|
||||
**On Windows (WSL2):**
|
||||
|
||||
@ -125,7 +130,8 @@ Error: failed to solve: write /var/lib/docker/...: no space left on device
|
||||
|
||||
1. Stop Rancher Desktop
|
||||
2. Run: `wsl --shutdown`
|
||||
3. Follow Microsoft's guide to increase WSL2 disk size: https://learn.microsoft.com/en-us/windows/wsl/disk-space
|
||||
3. Follow Microsoft's guide to increase WSL2 disk size:
|
||||
https://learn.microsoft.com/en-us/windows/wsl/disk-space
|
||||
|
||||
**Docker Desktop:**
|
||||
|
||||
@ -174,7 +180,8 @@ Error: Failed to download toolchain
|
||||
curl -I "$(grep TOOLCHAIN_URL .devcontainer/toolchain_config.env | cut -d'"' -f2)"
|
||||
```
|
||||
|
||||
3. **If toolchain URL is broken**, report it to the MongoDB team. This is a devcontainer configuration issue that needs to be fixed upstream.
|
||||
3. **If toolchain URL is broken**, report it to the MongoDB team. This is a devcontainer
|
||||
configuration issue that needs to be fixed upstream.
|
||||
|
||||
### Build Fails with Checksum Mismatch
|
||||
|
||||
@ -203,7 +210,8 @@ Got: def456...
|
||||
# Command Palette → "Dev Containers: Rebuild Container Without Cache"
|
||||
```
|
||||
|
||||
3. **If problem persists**, this is likely a devcontainer configuration issue - report it to the MongoDB team.
|
||||
3. **If problem persists**, this is likely a devcontainer configuration issue - report it to the
|
||||
MongoDB team.
|
||||
|
||||
### Container Fails to Start
|
||||
|
||||
@ -288,11 +296,9 @@ Got: def456...
|
||||
- File save is delayed
|
||||
- Terminal autocomplete is slow
|
||||
|
||||
**Root Cause:**
|
||||
Bind mounts on macOS use osxfs which has high latency for filesystem operations.
|
||||
**Root Cause:** Bind mounts on macOS use osxfs which has high latency for filesystem operations.
|
||||
|
||||
**Solution:**
|
||||
✅ **Use named volumes instead of bind mounts** (see Getting Started guide)
|
||||
**Solution:** ✅ **Use named volumes instead of bind mounts** (see Getting Started guide)
|
||||
|
||||
### High CPU Usage
|
||||
|
||||
@ -517,7 +523,8 @@ fatal: Could not read from remote repository.
|
||||
ssh-add ~/.ssh/id_ed25519 # or id_rsa
|
||||
```
|
||||
|
||||
See [Getting Started - SSH Setup](./getting-started.md#4-configure-ssh-keys-recommended) for detailed instructions.
|
||||
See [Getting Started - SSH Setup](./getting-started.md#4-configure-ssh-keys-recommended) for
|
||||
detailed instructions.
|
||||
|
||||
### SSH Works on Host But Not in Container
|
||||
|
||||
@ -527,8 +534,7 @@ See [Getting Started - SSH Setup](./getting-started.md#4-configure-ssh-keys-reco
|
||||
- Same operations fail inside devcontainer
|
||||
- "Permission denied" or asks for password
|
||||
|
||||
**Root Cause:**
|
||||
SSH agent forwarding isn't working properly.
|
||||
**Root Cause:** SSH agent forwarding isn't working properly.
|
||||
|
||||
**Solutions:**
|
||||
|
||||
@ -633,8 +639,7 @@ git config --global credential.helper store
|
||||
# Next time you enter credentials, they'll be saved
|
||||
```
|
||||
|
||||
**Option 3: Fix SSH agent forwarding**:
|
||||
See "SSH Works on Host But Not in Container" section above.
|
||||
**Option 3: Fix SSH agent forwarding**: See "SSH Works on Host But Not in Container" section above.
|
||||
|
||||
### Multiple SSH Keys (Personal + Work)
|
||||
|
||||
@ -868,8 +873,7 @@ ModuleNotFoundError: No module named 'pymongo'
|
||||
- History cleared
|
||||
- Python venv empty
|
||||
|
||||
**Root Cause:**
|
||||
Volumes not mounting correctly
|
||||
**Root Cause:** Volumes not mounting correctly
|
||||
|
||||
**Solutions:**
|
||||
|
||||
@ -917,8 +921,8 @@ docker cp <container_id>:/workspaces/mongo/file.txt ~/Downloads/
|
||||
# Right-click file → Download...
|
||||
```
|
||||
|
||||
**To edit with external tools:**
|
||||
Use bind mounts instead of named volumes (but this sacrifices performance).
|
||||
**To edit with external tools:** Use bind mounts instead of named volumes (but this sacrifices
|
||||
performance).
|
||||
|
||||
### Volume Fills Up Disk
|
||||
|
||||
@ -1070,8 +1074,7 @@ permission denied while trying to connect to Docker daemon
|
||||
- Slow builds
|
||||
- Out of memory errors
|
||||
|
||||
**Solution:**
|
||||
Go to Docker Desktop → Settings → Resources and allocate generously:
|
||||
**Solution:** Go to Docker Desktop → Settings → Resources and allocate generously:
|
||||
|
||||
- **CPUs**: Allocate as many as possible (leave 1-2 for host OS)
|
||||
- **Memory**: Allocate as much as possible (leave ~4-8 GB for host OS)
|
||||
@ -1087,8 +1090,7 @@ Go to Docker Desktop → Settings → Resources and allocate generously:
|
||||
- Docker-outside-of-docker doesn't work
|
||||
- Volume mounts fail
|
||||
|
||||
**Solution:**
|
||||
OrbStack has some limitations with devcontainer features. Try:
|
||||
**Solution:** OrbStack has some limitations with devcontainer features. Try:
|
||||
|
||||
1. Update to latest OrbStack version
|
||||
2. Check OrbStack documentation for devcontainer compatibility
|
||||
@ -1177,7 +1179,8 @@ cd mongo
|
||||
|
||||
If your issue isn't covered here:
|
||||
|
||||
1. **Check VS Code Docs**: [code.visualstudio.com/docs/devcontainers](https://code.visualstudio.com/docs/devcontainers/containers)
|
||||
1. **Check VS Code Docs**:
|
||||
[code.visualstudio.com/docs/devcontainers](https://code.visualstudio.com/docs/devcontainers/containers)
|
||||
2. **Search Issues**: MongoDB GitHub repository issues
|
||||
3. **Ask the Team**: MongoDB developers Slack/chat
|
||||
4. **File a Bug**: Include:
|
||||
|
||||
@ -1,26 +1,95 @@
|
||||
# Egress Networking
|
||||
|
||||
Egress networking entails outbound communication (i.e. requests) from a client process to a server process (e.g. _mongod_), as well as inbound communication (i.e. responses) from such a server process back to a client process.
|
||||
Egress networking entails outbound communication (i.e. requests) from a client process to a server
|
||||
process (e.g. _mongod_), as well as inbound communication (i.e. responses) from such a server
|
||||
process back to a client process.
|
||||
|
||||
## Remote Commands
|
||||
|
||||
A remote command represents an exchange of data between a client and a server. A remote command consists of two steps: a request, which the client sends to the server, and a response, which the client receives from the server. These elements are represented by the [request][remote_command_request_h] and [response][remote_command_response_h] objects; each wraps the BSON that represents the on-wire transacted data and metadata that describes the context of the command, such as the host that the command targets. Each object also contains metadata that corresponds to its half of the command lifecycle. For example, the request object notes the timeout of the command and the operation's unique identifier, among other fields, and the response object notes the final disposition of the command's data exchange as a `Status` object (which takes no position on the success of the command's semantics at the remote) and the time that the command actually took to execute, among other fields. In the case of an exhaust command, there may be multiple responses for a single request.
|
||||
A remote command represents an exchange of data between a client and a server. A remote command
|
||||
consists of two steps: a request, which the client sends to the server, and a response, which the
|
||||
client receives from the server. These elements are represented by the
|
||||
[request][remote_command_request_h] and [response][remote_command_response_h] objects; each wraps
|
||||
the BSON that represents the on-wire transacted data and metadata that describes the context of the
|
||||
command, such as the host that the command targets. Each object also contains metadata that
|
||||
corresponds to its half of the command lifecycle. For example, the request object notes the timeout
|
||||
of the command and the operation's unique identifier, among other fields, and the response object
|
||||
notes the final disposition of the command's data exchange as a `Status` object (which takes no
|
||||
position on the success of the command's semantics at the remote) and the time that the command
|
||||
actually took to execute, among other fields. In the case of an exhaust command, there may be
|
||||
multiple responses for a single request.
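
The split between the two halves of a remote command can be illustrated with a small model. This is a sketch in Python, not the C++ request and response classes: the names `ToyRequest`, `ToyResponse`, `command_succeeded`, and every field name are hypothetical, and plain dictionaries stand in for BSON. The point it illustrates is that the outcome of the data exchange says nothing about whether the command itself succeeded at the remote.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyRequest:
    """Hypothetical request half: command body plus context metadata."""

    target: str                # host that the command targets
    cmd: dict                  # stands in for the BSON command body
    timeout_ms: Optional[int]  # None models "no timeout"
    operation_id: int          # unique identifier of the operation


@dataclass
class ToyResponse:
    """Hypothetical response half: reply body plus exchange metadata."""

    target: str
    data: dict                 # stands in for the BSON reply body
    transport_ok: bool         # disposition of the data exchange only
    elapsed_ms: int            # time the command took to execute
    more_to_come: bool = False # True for non-final replies of an exhaust command


def command_succeeded(resp: ToyResponse) -> bool:
    # The exchange can succeed while the command fails at the remote,
    # so the reply body's "ok" field is checked separately.
    return resp.transport_ok and resp.data.get("ok") == 1
```

For an exhaust command, a caller in this model would receive several `ToyResponse` objects for one `ToyRequest`, with `more_to_come` set on all but the last.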
|
||||
|
||||
## Connection Pooling
|
||||
|
||||
The [executor::ConnectionPool][connection_pool_h] class is responsible for pooling connections to any number of hosts. It contains zero or more `ConnectionPool::SpecificPool` objects, each of which pools connections for a unique host, and exactly one `ConnectionPool::ControllerInterface` object, which is responsible for the addition, removal, and updating of `SpecificPool`s to, from, and in its owning `ConnectionPool`. When a caller requests a connection to a host from the `ConnectionPool`, the `ConnectionPool` creates a new `SpecificPool` to pool connections for that host if one does not exist already, and then the `ConnectionPool` forwards the request to the `SpecificPool`. A `SpecificPool` expires when its `hostTimeout` has passed without any connection requests, after which time it becomes unusable; further requests for connections to that host will trigger the creation of a fresh `SpecificPool`.
|
||||
The [executor::ConnectionPool][connection_pool_h] class is responsible for pooling connections to
|
||||
any number of hosts. It contains zero or more `ConnectionPool::SpecificPool` objects, each of which
|
||||
pools connections for a unique host, and exactly one `ConnectionPool::ControllerInterface` object,
|
||||
which is responsible for the addition, removal, and updating of `SpecificPool`s to, from, and in its
|
||||
owning `ConnectionPool`. When a caller requests a connection to a host from the `ConnectionPool`,
|
||||
the `ConnectionPool` creates a new `SpecificPool` to pool connections for that host if one does not
|
||||
exist already, and then the `ConnectionPool` forwards the request to the `SpecificPool`. A
|
||||
`SpecificPool` expires when its `hostTimeout` has passed without any connection requests, after
|
||||
which time it becomes unusable; further requests for connections to that host will trigger the
|
||||
creation of a fresh `SpecificPool`.
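
The per-host lifecycle described above can be sketched as follows. This is an illustrative Python model under stated assumptions, not the C++ implementation: `ToyConnectionPool`, `ToySpecificPool`, `get_connection`, and the explicit `now` argument are hypothetical, and the sketch leaves out the `ControllerInterface` that manages pools in the real class.

```python
import itertools


class ToySpecificPool:
    """Hypothetical stand-in for a per-host pool."""

    _ids = itertools.count(1)

    def __init__(self, host, now):
        self.host = host
        self.pool_id = next(self._ids)  # distinguishes a fresh pool from an expired one
        self.last_request = now

    def expired(self, now, host_timeout):
        # A pool expires once host_timeout passes without a connection request.
        return now - self.last_request >= host_timeout


class ToyConnectionPool:
    """Hypothetical stand-in that keeps at most one live pool per host."""

    def __init__(self, host_timeout):
        self.host_timeout = host_timeout
        self.pools = {}

    def get_connection(self, host, now):
        pool = self.pools.get(host)
        if pool is None or pool.expired(now, self.host_timeout):
            # No usable pool for this host, so create a fresh one.
            pool = ToySpecificPool(host, now)
            self.pools[host] = pool
        pool.last_request = now
        # A real pool would hand back a connection; the sketch returns identifiers.
        return (host, pool.pool_id)
```

Requests that arrive within the timeout reuse the same per-host pool, while a request after the timeout gets a new one. Passing the clock in as `now` keeps the sketch deterministic.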
|
||||
|
||||
The final result of a successful connection request made through `ConnectionPool::getConnection` is a `ConnectionPool::ConnectionInterface`, which represents a connection ready for use. Externally, the `ConnectionInterface` is primarily used by the caller to exchange data with its remote host. Callers return `ConnectionInterface`s to the pool by allowing them to destruct and callers must signal to the pool the final disposition of the connection beforehand through the `indicate*` family of methods. `ConnectionInterface`s also support setting timers to schedule future activities. Internally, the `ConnectionInterface` is used to prepare the connection for data exchange before transferring ownership to the caller and refreshing the health of a connection when the caller returns the connection to the pool. `ConnectionInterface` also maintains a notion of generation, which is implemented as a monotonically-incrementing counter. When a caller returns a `ConnectionInterface` to a `ConnectionPool` from a generation prior to the current generation of the corresponding `SpecificPool`, the connection is dropped. The current generation of a `SpecificPool` is incremented when the pool experiences certain failures (e.g., when it fails to establish a new connection). `ConnectionPool` also drops a connection if the caller called `indicateFailure` on the connection before returning it. `ConnectionPool` uses a global mutex for access to `SpecificPool`s as well as generation counters.
|
||||
The final result of a successful connection request made through `ConnectionPool::getConnection` is
|
||||
a `ConnectionPool::ConnectionInterface`, which represents a connection ready for use. Externally,
|
||||
the `ConnectionInterface` is primarily used by the caller to exchange data with its remote host.
|
||||
Callers return `ConnectionInterface`s to the pool by allowing them to destruct and callers must
|
||||
signal to the pool the final disposition of the connection beforehand through the `indicate*` family
|
||||
of methods. `ConnectionInterface`s also support setting timers to schedule future activities.
|
||||
Internally, the `ConnectionInterface` is used to prepare the connection for data exchange before
|
||||
transferring ownership to the caller and refreshing the health of a connection when the caller
|
||||
returns the connection to the pool. `ConnectionInterface` also maintains a notion of generation,
|
||||
which is implemented as a monotonically-incrementing counter. When a caller returns a
|
||||
`ConnectionInterface` to a `ConnectionPool` from a generation prior to the current generation of the
|
||||
corresponding `SpecificPool`, the connection is dropped. The current generation of a `SpecificPool`
|
||||
is incremented when the pool experiences certain failures (e.g., when it fails to establish a new
|
||||
connection). `ConnectionPool` also drops a connection if the caller called `indicateFailure` on the
|
||||
connection before returning it. `ConnectionPool` uses a global mutex for access to `SpecificPool`s
|
||||
as well as generation counters.
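
The generation rule described above can be illustrated with a small model. This is only a sketch in Python, not the C++ implementation: `SpecificPoolSketch`, `Connection`, `on_failure`, and `return_connection` are hypothetical names chosen for illustration, and locking is omitted.

```python
from dataclasses import dataclass


@dataclass
class Connection:
    generation: int       # pool generation when the connection was created
    failed: bool = False  # hypothetical flag standing in for indicateFailure


class SpecificPoolSketch:
    """Hypothetical model of one per-host pool; not the real C++ class."""

    def __init__(self):
        self.generation = 0
        self.ready = []

    def new_connection(self):
        return Connection(generation=self.generation)

    def on_failure(self):
        # Bumping the generation makes every earlier connection stale.
        self.generation += 1

    def return_connection(self, conn):
        """Return True if the connection is kept, False if it is dropped."""
        if conn.failed or conn.generation < self.generation:
            return False
        self.ready.append(conn)
        return True
```

In this model a connection created before `on_failure()` is dropped when it is returned, while one created afterwards is kept unless the caller marked it as failed.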
|
||||
|
||||
`ConnectionPool` uses its single instance of `EgressConnectionCloserManager` to determine when hosts should be dropped. The manager consists of multiple `EgressConnectionClosers`, which are used to determine whether hosts should be dropped. In the context of the ConnectionPool, the manager's purpose is to drop _connections_ to hosts based on whether they have been marked as keep open or not.
|
||||
`ConnectionPool` uses its single instance of `EgressConnectionCloserManager` to determine when hosts
|
||||
should be dropped. The manager consists of multiple `EgressConnectionClosers`, which are used to
|
||||
determine whether hosts should be dropped. In the context of the ConnectionPool, the manager's
|
||||
purpose is to drop _connections_ to hosts based on whether they have been marked as keep open or
|
||||
not.
|
||||
|
||||
## Internal Network Clients
|
||||
|
||||
Client-side outbound communication in egress networking is primarily handled by the [AsyncDBClient class][async_client_h]. The async client is responsible for initializing a connection to a particular host as well as initializing the [wire protocol][wire_protocol] for client-server communication, after which remote requests can be sent by the client and corresponding remote responses from a database can subsequently be received. In setting up the wire protocol, the async client sends an [isMaster][is_master] request to the server and parses the server's isMaster response to ensure that the status of the connection is OK. An initial isMaster request is constructed in the legacy OP_QUERY protocol, so that clients can still communicate with servers that may not support other protocols. The async client also supports client authentication functionality (i.e. authenticating a user's credentials, client host, remote host, etc.).
|
||||
Client-side outbound communication in egress networking is primarily handled by the [AsyncDBClient
|
||||
class][async_client_h]. The async client is responsible for initializing a connection to a
|
||||
particular host as well as initializing the [wire protocol][wire_protocol] for client-server
|
||||
communication, after which remote requests can be sent by the client and corresponding remote
|
||||
responses from a database can subsequently be received. In setting up the wire protocol, the async
|
||||
client sends an [isMaster][is_master] request to the server and parses the server's isMaster
|
||||
response to ensure that the status of the connection is OK. An initial isMaster request is
|
||||
constructed in the legacy OP_QUERY protocol, so that clients can still communicate with servers that
|
||||
may not support other protocols. The async client also supports client authentication functionality
|
||||
(i.e. authenticating a user's credentials, client host, remote host, etc.).
|
||||
|
||||
The scheduling of requests is managed by the [task executor][task_executor_h], which maintains the notion of **events** and **callbacks**. Callbacks represent work (e.g. remote requests) that is to be executed by the executor, and are scheduled by client threads as well as other callbacks. There are several variations of work scheduling methods, which include: immediate scheduling, scheduling no earlier than a specified time, and scheduling iff a specified event has been signalled. These methods return a handle that can be used while the executor is still in scope for either waiting on or cancelling the scheduled callback in question. If a scheduled callback is cancelled, it remains on the work queue and is technically still run, but is labeled as having been 'cancelled' beforehand. Once a given callback/request is scheduled, the task executor is then able to execute such requests via a [network interface][network_interface_h]. The network interface, connected to a particular host/server, begins the asynchronous execution of commands specified via a request bundled in the aforementioned callback handle. The interface is capable of blocking threads until its associated task executor has work that needs to be performed, and is likewise able to return from an idle state when it receives a signal that the executor has new work to process.
|
||||
The scheduling of requests is managed by the [task executor][task_executor_h], which maintains the
|
||||
notion of **events** and **callbacks**. Callbacks represent work (e.g. remote requests) that is to
|
||||
be executed by the executor, and are scheduled by client threads as well as other callbacks. There
|
||||
are several variations of work scheduling methods, which include: immediate scheduling, scheduling
|
||||
no earlier than a specified time, and scheduling iff a specified event has been signalled. These
|
||||
methods return a handle that can be used while the executor is still in scope for either waiting on
|
||||
or cancelling the scheduled callback in question. If a scheduled callback is cancelled, it remains
|
||||
on the work queue and is technically still run, but is labeled as having been 'cancelled'
|
||||
beforehand. Once a given callback/request is scheduled, the task executor is then able to execute
|
||||
such requests via a [network interface][network_interface_h]. The network interface, connected to a
|
||||
particular host/server, begins the asynchronous execution of commands specified via a request
|
||||
bundled in the aforementioned callback handle. The interface is capable of blocking threads until
|
||||
its associated task executor has work that needs to be performed, and is likewise able to return
|
||||
from an idle state when it receives a signal that the executor has new work to process.
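
The cancellation behaviour described above (a cancelled callback stays on the queue and still runs, but is told it was cancelled) can be sketched as follows. This is a simplified Python model, not the real task executor API; `ExecutorSketch`, `CallbackHandle`, and the `"ok"`/`"cancelled"` status strings are assumptions made for illustration.

```python
from collections import deque


class CallbackHandle:
    """Hypothetical handle returned by schedule(); allows cancellation."""

    def __init__(self, work):
        self.work = work
        self.cancelled = False

    def cancel(self):
        # Cancelling only flips a flag; the callback is not removed from the queue.
        self.cancelled = True


class ExecutorSketch:
    def __init__(self):
        self.queue = deque()

    def schedule(self, work):
        handle = CallbackHandle(work)
        self.queue.append(handle)
        return handle

    def run_all(self):
        results = []
        while self.queue:
            handle = self.queue.popleft()
            # Cancelled callbacks still run, but receive a 'cancelled' status.
            status = "cancelled" if handle.cancelled else "ok"
            results.append(handle.work(status))
        return results
```

Running cancelled work with a status, rather than silently discarding it, gives each callback a chance to clean up or notify whoever is waiting on it.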
|
||||
|
||||
Client-side legacy networking draws upon the `DBClientBase` class, of which there are multiple subclasses residing in the `src/mongo/client` folder. The [replica set DBClient][dbclient_rs_h] discerns which one of multiple servers in a replica set is the primary at construction time, and establishes a connection (using the `DBClientConnection` wrapper class, also extended from `DBClientBase`) with the replica set via the primary. In cases where the primary server is unresponsive within a specified time range, the RS DBClient will automatically attempt to establish a secondary server as the new primary (see [automatic failover][automatic_failover]).
|
||||
Client-side legacy networking draws upon the `DBClientBase` class, of which there are multiple
|
||||
subclasses residing in the `src/mongo/client` folder. The [replica set DBClient][dbclient_rs_h]
|
||||
discerns which one of multiple servers in a replica set is the primary at construction time, and
|
||||
establishes a connection (using the `DBClientConnection` wrapper class, also extended from
|
||||
`DBClientBase`) with the replica set via the primary. In cases where the primary server is
|
||||
unresponsive within a specified time range, the RS DBClient will automatically attempt to establish
|
||||
a secondary server as the new primary (see [automatic failover][automatic_failover]).
|
||||
|
||||
## See Also
|
||||
|
||||
|
||||
@ -3,26 +3,26 @@
|
||||
## What it is
|
||||
|
||||
Similar to [burn_in_tests](burn_in_tests.md), `burn_in_tags` also detects the javascript tests
|
||||
(under the [jstests directory](https://github.com/mongodb/mongo/tree/master/jstests))
|
||||
that are new or have changed since the last git command and then runs those tests in repeated
|
||||
mode to validate their stability. But instead of running the tests on their original build
|
||||
variants, `burn_in_tags` runs them on the burn_in build variants that are generated separately.
|
||||
(under the [jstests directory](https://github.com/mongodb/mongo/tree/master/jstests)) that are new
|
||||
or have changed since the last git command and then runs those tests in repeated mode to validate
|
||||
their stability. But instead of running the tests on their original build variants, `burn_in_tags`
|
||||
runs them on the burn_in build variants that are generated separately.
|
||||
|
||||
## How to use it
|
||||
|
||||
You can use `burn_in_tags` on evergreen by selecting the `burn_in_tags_gen` task when creating a patch.
|
||||
The burn_in build variants, i.e., `enterprise-rhel-8-64-bit-inmem` and `enterprise-rhel-8-64-bit-multiversion`
|
||||
will be generated, each of which will have a `burn_in_tests` task generated by the
|
||||
[mongo-task-generator](https://github.com/mongodb/mongo-task-generator). `burn_in_tests` task, a
|
||||
[generated task](task_generation.md), may have multiple sub-tasks which run the test suites only for the
|
||||
new or changed javascript tests (note that a javascript test can be included in multiple test suites). Each of
|
||||
those tests will be run 2 times minimum, and 1000 times maximum or for 10 minutes, whichever is reached first.
|
||||
You can use `burn_in_tags` on evergreen by selecting the `burn_in_tags_gen` task when creating a
|
||||
patch. The burn_in build variants, i.e., `enterprise-rhel-8-64-bit-inmem` and
|
||||
`enterprise-rhel-8-64-bit-multiversion` will be generated, each of which will have a `burn_in_tests`
|
||||
task generated by the [mongo-task-generator](https://github.com/mongodb/mongo-task-generator).
|
||||
The `burn_in_tests` task, a [generated task](task_generation.md), may have multiple sub-tasks which run
|
||||
the test suites only for the new or changed javascript tests (note that a javascript test can be
|
||||
included in multiple test suites). Each of those tests will be run 2 times minimum, and 1000 times
|
||||
maximum or for 10 minutes, whichever is reached first.
|
||||
|
||||
## ! Run All Affected JStests
|
||||
|
||||
The `! Run All Affected JStests` variant has a single `burn_in_tags_gen` task. This task will create &
|
||||
activate [`burn_in_tests`](burn_in_tests.md) tasks for all required and suggested
|
||||
variants. The end result is that any jstests that have been modified in the patch will
|
||||
run on all required and suggested variants. This should give users a clear signal on
|
||||
whether their jstests changes have introduced a failure that could potentially lead
|
||||
to a revert or follow-up bug fix commit.
|
||||
The `! Run All Affected JStests` variant has a single `burn_in_tags_gen` task. This task will create
|
||||
& activate [`burn_in_tests`](burn_in_tests.md) tasks for all required and suggested variants. The
|
||||
end result is that any jstests that have been modified in the patch will run on all required and
|
||||
suggested variants. This should give users a clear signal on whether their jstests changes have
|
||||
introduced a failure that could potentially lead to a revert or follow-up bug fix commit.
|
||||
|
||||
@ -3,19 +3,21 @@
|
||||
## What it is
|
||||
|
||||
`burn_in_tests` detects the javascript tests (under the
|
||||
[jstests directory](https://github.com/mongodb/mongo/tree/master/jstests)) that are new or have changed
|
||||
since the last git command and then runs those tests in repeated mode to validate their stability.
|
||||
[jstests directory](https://github.com/mongodb/mongo/tree/master/jstests)) that are new or have
|
||||
changed since the last git command and then runs those tests in repeated mode to validate their
|
||||
stability.
|
||||
|
||||
## How to use it
|
||||
|
||||
You can use `burn_in_tests` on evergreen by selecting the `burn_in_tests_gen` task when creating a patch,
|
||||
since `burn_in_tests` task is a [generated task](task_generation.md) generated by the
|
||||
[mongo-task-generator](https://github.com/mongodb/mongo-task-generator).
|
||||
`burn_in_tests` task will be generated on each of the applicable build variants, and
|
||||
may have multiple sub-tasks which run the test suites only for the new or changed javascript tests (note
|
||||
that a javascript test can be included in multiple test suites). Each of those tests will be run 2 times
|
||||
minimum, and 1000 times maximum or for 10 minutes, whichever is reached first.
|
||||
You can use `burn_in_tests` on evergreen by selecting the `burn_in_tests_gen` task when creating a
|
||||
patch, since `burn_in_tests` task is a [generated task](task_generation.md) generated by the
|
||||
[mongo-task-generator](https://github.com/mongodb/mongo-task-generator). `burn_in_tests` task will
|
||||
be generated on each of the applicable build variants, and may have multiple sub-tasks which run the
|
||||
test suites only for the new or changed javascript tests (note that a javascript test can be
|
||||
included in multiple test suites). Each of those tests will be run 2 times minimum, and 1000 times
|
||||
maximum or for 10 minutes, whichever is reached first.
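
One way to read the repeat limits above is sketched below. This is not the `burn_in_tests.py` implementation; `repeat_test` and its parameters are hypothetical, and the sketch assumes the minimum run count takes priority over the time limit.

```python
import time


def repeat_test(run_once, min_runs=2, max_runs=1000, max_seconds=600,
                clock=time.monotonic):
    """Run `run_once` repeatedly and return how many times it ran.

    Stops at `max_runs`, or once `max_seconds` have elapsed, but never
    before `min_runs` runs have completed (an assumed interpretation).
    """
    start = clock()
    runs = 0
    while runs < max_runs:
        if runs >= min_runs and clock() - start >= max_seconds:
            break
        run_once()
        runs += 1
    return runs
```

Under this reading, a test that takes longer than five minutes still runs twice, while a very fast test stops at 1000 runs long before the 10 minutes are up.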
|
||||
|
||||
You can also use `burn_in_tests` locally from within the [mongo repo](https://github.com/mongodb/mongo)
|
||||
by running the script `python buildscripts/burn_in_tests.py`. For more information about this usage, you can
|
||||
run `python buildscripts/burn_in_tests.py --help`.
|
||||
You can also use `burn_in_tests` locally from within the
|
||||
[mongo repo](https://github.com/mongodb/mongo) by running the script
|
||||
`python buildscripts/burn_in_tests.py`. For more information about this usage, you can run
|
||||
`python buildscripts/burn_in_tests.py --help`.
|
||||
|
||||
@ -34,37 +34,37 @@ For some of the versions we are using such generic names as `latest`, `last-lts`
|
||||
- `latest` - the current version. In Evergreen, the version that was compiled in the current build.
|
||||
|
||||
- `last-lts` - the latest LTS (Long Term Support) Major release version. In Evergreen, the version
|
||||
that was downloaded from the last LTS release branch project. It resolves to an entry
|
||||
in `longTermSupportReleases` of [releases.yml](../../src/mongo/util/version/releases.yml).
|
||||
that was downloaded from the last LTS release branch project. It resolves to an entry in
|
||||
`longTermSupportReleases` of [releases.yml](../../src/mongo/util/version/releases.yml).
|
||||
|
||||
- `last-continuous` - the latest Rapid release version. In Evergreen, the version that was
|
||||
downloaded from the Rapid release branch project. It resolves to the entry in
|
||||
`featureCompatibilityVersions` of [releases.yml](../../src/mongo/util/version/releases.yml)
|
||||
that looks older than the output of `git describe`. Will not be tested against if it is listed in
|
||||
`featureCompatibilityVersions` of [releases.yml](../../src/mongo/util/version/releases.yml) that
|
||||
looks older than the output of `git describe`. Will not be tested against if it is listed in
|
||||
`eolVersions` as being end of life.
|
||||
|
||||
Note: The latest release.yml file from master is always used, even fetched remotely when on another branch.
|
||||
Note: The latest `releases.yml` file from master is always used; it is fetched remotely when on another
|
||||
branch.
|
||||
|
||||
### Old vs new
|
||||
|
||||
Many multiversion tasks are running tests against `latest`/`last-lts` or `latest`/`last-continuous`
|
||||
versions. In such context we refer to `last-lts` and `last-continuous` versions as the `old`
|
||||
version and to `latest` as a `new` version.
|
||||
versions. In such context we refer to `last-lts` and `last-continuous` versions as the `old` version
|
||||
and to `latest` as a `new` version.
|
||||
|
||||
A `new` version is compiled in the same way as for non-multiversion tasks. The `old` versions of
|
||||
compiled binaries are downloaded from the old branch projects with
|
||||
[`db-contrib-tool`](https://github.com/10gen/db-contrib-tool).
|
||||
`db-contrib-tool` searches for the latest available compiled binaries on the old branch projects in
|
||||
Evergreen.
|
||||
[`db-contrib-tool`](https://github.com/10gen/db-contrib-tool). `db-contrib-tool` searches for the
|
||||
latest available compiled binaries on the old branch projects in Evergreen.
|
||||
|
||||
### Explicit and Implicit multiversion suites
|
||||
|
||||
Multiversion suites can be explicit and implicit.
|
||||
|
||||
- Explicit - JS tests are aware of the binary versions they are running,
|
||||
e.g. [multiversion.yml](https://github.com/mongodb/mongo/blob/e91cda950e50aa4c707efbdd0be208481493fc96/buildscripts/resmokeconfig/suites/multiversion.yml).
|
||||
The version of binaries is explicitly set in JS tests,
|
||||
e.g. [jstests/multiVersion/genericSetFCVUsage/major_version_upgrade.js](https://github.com/mongodb/mongo/blob/397c8da541940b3fbe6257243f97a342fe7e0d3b/jstests/multiVersion/genericSetFCVUsage/major_version_upgrade.js#L33-L44):
|
||||
- Explicit - JS tests are aware of the binary versions they are running, e.g.
|
||||
[multiversion.yml](https://github.com/mongodb/mongo/blob/e91cda950e50aa4c707efbdd0be208481493fc96/buildscripts/resmokeconfig/suites/multiversion.yml).
|
||||
The version of binaries is explicitly set in JS tests, e.g.
|
||||
[jstests/multiVersion/genericSetFCVUsage/major_version_upgrade.js](https://github.com/mongodb/mongo/blob/397c8da541940b3fbe6257243f97a342fe7e0d3b/jstests/multiVersion/genericSetFCVUsage/major_version_upgrade.js#L33-L44):
|
||||
|
||||
```js
|
||||
const versions = [
|
||||
@ -101,8 +101,8 @@ const versions = [
|
||||
];
|
||||
```
|
||||
|
||||
- Implicit - JS tests know nothing about the binary versions they are running,
|
||||
e.g. [retryable_writes_downgrade.yml](https://github.com/mongodb/mongo/blob/e91cda950e50aa4c707efbdd0be208481493fc96/buildscripts/resmokeconfig/suites/retryable_writes_downgrade.yml).
|
||||
- Implicit - JS tests know nothing about the binary versions they are running, e.g.
|
||||
[retryable_writes_downgrade.yml](https://github.com/mongodb/mongo/blob/e91cda950e50aa4c707efbdd0be208481493fc96/buildscripts/resmokeconfig/suites/retryable_writes_downgrade.yml).
|
||||
Most of the implicit multiversion suites are using matrix suites, e.g. `replica_sets_last_lts`:
|
||||
|
||||
```bash
|
||||
@ -134,7 +134,8 @@ test_kind: js_test
|
||||
|
||||
In implicit multiversion suites the version of binaries is defined on the resmoke fixture level.
|
||||
|
||||
The [example](https://github.com/mongodb/mongo/blob/e91cda950e50aa4c707efbdd0be208481493fc96/buildscripts/resmokeconfig/matrix_suites/overrides/multiversion.yml#L5-L8)
|
||||
The
|
||||
[example](https://github.com/mongodb/mongo/blob/e91cda950e50aa4c707efbdd0be208481493fc96/buildscripts/resmokeconfig/matrix_suites/overrides/multiversion.yml#L5-L8)
|
||||
of replica set fixture configuration override:
|
||||
|
||||
```yaml
|
||||
@ -144,7 +145,8 @@ fixture:
|
||||
mixed_bin_versions: new_new_old
|
||||
```
|
||||
|
||||
The [example](https://github.com/mongodb/mongo/blob/e91cda950e50aa4c707efbdd0be208481493fc96/buildscripts/resmokeconfig/matrix_suites/overrides/multiversion.yml#L53-L57)
|
||||
The
|
||||
[example](https://github.com/mongodb/mongo/blob/e91cda950e50aa4c707efbdd0be208481493fc96/buildscripts/resmokeconfig/matrix_suites/overrides/multiversion.yml#L53-L57)
|
||||
of sharded cluster fixture configuration override:
|
||||
|
||||
```yaml
|
||||
@ -155,7 +157,8 @@ fixture:
|
||||
mixed_bin_versions: new_old_old_new
|
||||
```
|
||||
|
||||
The [example](https://github.com/mongodb/mongo/blob/e91cda950e50aa4c707efbdd0be208481493fc96/buildscripts/resmokeconfig/matrix_suites/overrides/multiversion.yml#L139-L145)
|
||||
The
|
||||
[example](https://github.com/mongodb/mongo/blob/e91cda950e50aa4c707efbdd0be208481493fc96/buildscripts/resmokeconfig/matrix_suites/overrides/multiversion.yml#L139-L145)
|
||||
of shell fixture configuration override:
|
||||
|
||||
```yaml
|
||||
@ -171,20 +174,25 @@ value:
|
||||
### Version combinations
|
||||
|
||||
In implicit multiversion suites the same set of tests may run in similar suites that are using
|
||||
various mixed version combinations. Those version combinations depend on the type of resmoke
|
||||
fixture the suite is running with. These are the recommended version combinations to test against based on the suite fixtures:
|
||||
various mixed version combinations. Those version combinations depend on the type of resmoke fixture
|
||||
the suite is running with. These are the recommended version combinations to test against based on
|
||||
the suite fixtures:
|
||||
|
||||
- Replica set fixture combinations:
|
||||
|
||||
- `last-lts new-new-old` (i.e. suite runs the replica set fixture that spins up the `latest` and
|
||||
the `last-lts` versions in a 3-node replica set where the 1st node is the `latest`, 2nd - `latest`,
|
||||
3rd - `last-lts`, etc.)
|
||||
the `last-lts` versions in a 3-node replica set where the 1st node is the `latest`, 2nd -
|
||||
`latest`, 3rd - `last-lts`, etc.)
|
||||
- `last-lts new-old-new`
|
||||
- `last-lts old-new-new`
|
||||
- `last-continuous new-new-old`
|
||||
- `last-continuous new-old-new`
|
||||
- `last-continuous old-new-new`
|
||||
- Ex: [change_streams](https://github.com/mongodb/mongo/blob/88d59bfe9d5ee2c9938ae251f7a77a8bf1250a6b/buildscripts/resmokeconfig/suites/change_streams.yml) uses a [`ReplicaSetFixture`](https://github.com/mongodb/mongo/blob/88d59bfe9d5ee2c9938ae251f7a77a8bf1250a6b/buildscripts/resmokeconfig/suites/change_streams.yml#L50) so the corresponding multiversion suites are
|
||||
- Ex:
|
||||
[change_streams](https://github.com/mongodb/mongo/blob/88d59bfe9d5ee2c9938ae251f7a77a8bf1250a6b/buildscripts/resmokeconfig/suites/change_streams.yml)
|
||||
uses a
|
||||
[`ReplicaSetFixture`](https://github.com/mongodb/mongo/blob/88d59bfe9d5ee2c9938ae251f7a77a8bf1250a6b/buildscripts/resmokeconfig/suites/change_streams.yml#L50)
|
||||
so the corresponding multiversion suites are
|
||||
- [`change_streams_last_continuous_new_new_old`](https://github.com/mongodb/mongo/blob/612814f4ce56282c47d501817ba28337c26d7aba/buildscripts/resmokeconfig/matrix_suites/mappings/change_streams_last_continuous_new_new_old.yml)
|
||||
- [`change_streams_last_continuous_new_old_new`](https://github.com/mongodb/mongo/blob/612814f4ce56282c47d501817ba28337c26d7aba/buildscripts/resmokeconfig/matrix_suites/mappings/change_streams_last_continuous_new_old_new.yml)
|
||||
- [`change_streams_last_continuous_old_new_new`](https://github.com/mongodb/mongo/blob/612814f4ce56282c47d501817ba28337c26d7aba/buildscripts/resmokeconfig/matrix_suites/mappings/change_streams_last_continuous_old_new_new.yml)
|
||||
@ -199,7 +207,11 @@ fixture the suite is running with. These are the recommended version combination
|
||||
replica sets per shard where the 1st node of the 1st shard is the `latest`, 2nd node of 1st
|
||||
shard - `last-lts`, 1st node of 2nd shard - `last-lts`, 2nd node of 2nd shard - `latest`, etc.)
|
||||
- `last-continuous new-old-old-new`
|
||||
- Ex: [change_streams_downgrade](https://github.com/mongodb/mongo/blob/a96b83b2fa7010a5823fefac2469b4a06a697cf1/buildscripts/resmokeconfig/suites/change_streams_downgrade.yml) uses a [`ShardedClusterFixture`](https://github.com/mongodb/mongo/blob/a96b83b2fa7010a5823fefac2469b4a06a697cf1/buildscripts/resmokeconfig/suites/change_streams_downgrade.yml#L408) so the corresponding multiversion suites are
|
||||
- Ex:
|
||||
[change_streams_downgrade](https://github.com/mongodb/mongo/blob/a96b83b2fa7010a5823fefac2469b4a06a697cf1/buildscripts/resmokeconfig/suites/change_streams_downgrade.yml)
|
||||
uses a
|
||||
[`ShardedClusterFixture`](https://github.com/mongodb/mongo/blob/a96b83b2fa7010a5823fefac2469b4a06a697cf1/buildscripts/resmokeconfig/suites/change_streams_downgrade.yml#L408)
|
||||
so the corresponding multiversion suites are
|
||||
- [`change_streams_downgrade_last_continuous_new_old_old_new`](https://github.com/mongodb/mongo/blob/612814f4ce56282c47d501817ba28337c26d7aba/buildscripts/resmokeconfig/matrix_suites/mappings/change_streams_downgrade_last_continuous_new_old_old_new.yml)
|
||||
- [`change_streams_downgrade_last_lts_new_old_old_new`](https://github.com/mongodb/mongo/blob/612814f4ce56282c47d501817ba28337c26d7aba/buildscripts/resmokeconfig/matrix_suites/mappings/change_streams_downgrade_last_lts_new_old_old_new.yml)
|
||||
|
||||
@ -207,18 +219,21 @@ fixture the suite is running with. These are the recommended version combination
|
||||
- `last-lts` (i.e. suite runs the shell fixture that spins up `last-lts` as the `old` versions,
|
||||
etc.)
|
||||
- `last-continuous`
|
||||
- Ex: [initial_sync_fuzzer](https://github.com/mongodb/mongo/blob/908625ffdec050a71aa2ce47c35788739f629c60/buildscripts/resmokeconfig/suites/initial_sync_fuzzer.yml) uses a Shell Fixture, so the corresponding multiversion suites are
|
||||
- Ex:
|
||||
[initial_sync_fuzzer](https://github.com/mongodb/mongo/blob/908625ffdec050a71aa2ce47c35788739f629c60/buildscripts/resmokeconfig/suites/initial_sync_fuzzer.yml)
|
||||
uses a Shell Fixture, so the corresponding multiversion suites are
|
||||
- [`initial_sync_fuzzer_last_lts`](https://github.com/mongodb/mongo/blob/612814f4ce56282c47d501817ba28337c26d7aba/buildscripts/resmokeconfig/matrix_suites/mappings/initial_sync_fuzzer_last_lts.yml)
|
||||
- [`initial_sync_fuzzer_last_continuous`](https://github.com/mongodb/mongo/blob/612814f4ce56282c47d501817ba28337c26d7aba/buildscripts/resmokeconfig/matrix_suites/mappings/initial_sync_fuzzer_last_continuous.yml)
|
||||
|
||||
If `last-lts` and `last-continuous` versions happen to be the same, or last-continuous is EOL, we skip `last-continuous`
|
||||
and run multiversion suites with only `last-lts` combinations in Evergreen.
|
||||
If `last-lts` and `last-continuous` versions happen to be the same, or `last-continuous` is EOL, we
|
||||
skip `last-continuous` and run multiversion suites with only `last-lts` combinations in Evergreen.
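
The skip rule above amounts to a small decision function. The sketch below is illustrative only; `old_versions_to_test` and its arguments are hypothetical names, with version strings standing in for the entries resolved from `releases.yml`.

```python
def old_versions_to_test(last_lts, last_continuous, eol_versions):
    """Return which `old` version labels multiversion suites should run against."""
    labels = ["last-lts"]
    # Skip last-continuous when it duplicates last-lts or is end of life.
    if last_continuous != last_lts and last_continuous not in eol_versions:
        labels.append("last-continuous")
    return labels
```

For example, with `last_lts="6.0"` and `last_continuous="6.1"` both labels are returned, but if `"6.1"` appears in `eol_versions` only `last-lts` remains.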
|
||||
|
||||
## Working with multiversion tasks in Evergreen
|
||||
|
||||
### Multiversion task generation
|
||||
|
||||
Please refer to mongo-task-generator [documentation](https://github.com/mongodb/mongo-task-generator/blob/master/docs/generating_tasks.md#multiversion-testing)
|
||||
Please refer to mongo-task-generator
|
||||
[documentation](https://github.com/mongodb/mongo-task-generator/blob/master/docs/generating_tasks.md#multiversion-testing)
|
||||
for generating multiversion tasks in Evergreen.
|
||||
|
||||
### Exclude tests from multiversion testing
|
||||
@ -240,20 +255,21 @@ multiversion where `XX` is the version number, e.g. `requires_fcv_70` stands for
|
||||
```
|
||||
|
||||
Tests with `requires_fcv_XX` tags are excluded from multiversion tasks that may run the versions
|
||||
below the specified FCV version, e.g. when the `latest` version is `6.2`, `last-continuous` is
|
||||
`6.1` and `last-lts` is `6.0`, tests tagged with `requires_fcv_61` will NOT run in multiversion
|
||||
tasks that run `latest` with `last-lts`, but will run in multiversion tasks that run `lastest` with
|
||||
below the specified FCV version, e.g. when the `latest` version is `6.2`, `last-continuous` is `6.1`
|
||||
and `last-lts` is `6.0`, tests tagged with `requires_fcv_61` will NOT run in multiversion tasks that
|
||||
run `latest` with `last-lts`, but will run in multiversion tasks that run `latest` with
|
||||
`last-continuous`.
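
The exclusion rule in the example above can be sketched as a version comparison. This is not resmoke's implementation: `required_fcv` and `runs_against` are hypothetical helpers, and the tag parsing assumes a single-digit major version (so `requires_fcv_61` means `6.1`).

```python
import re


def required_fcv(tags):
    """Return the highest (major, minor) demanded by requires_fcv_XX tags, or None."""
    versions = []
    for tag in tags:
        match = re.fullmatch(r"requires_fcv_(\d)(\d+)", tag)
        if match:
            versions.append((int(match.group(1)), int(match.group(2))))
    return max(versions) if versions else None


def runs_against(tags, old_version):
    """True if a test with `tags` may run in a task whose old binary is `old_version`."""
    needed = required_fcv(tags)
    return needed is None or old_version >= needed
```

With `latest` at `6.2`, a test tagged `requires_fcv_61` gives `runs_against(tags, (6, 0))` as false for the `last-lts` task and `runs_against(tags, (6, 1))` as true for the `last-continuous` task, matching the example in the text.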
|
||||
|
||||
In addition to disabling multiversion tests based on FCV, there is no need to run in-development `featureFlagXYZ` tests
|
||||
(featureFlags that have `default: false`) because these tests will most likely fail on older versions that
|
||||
have not implemented this feature. For multiversion tasks, we pass the `--runNoFeatureFlagTests` flag to avoid these
|
||||
failures on `all feature flag` variants.
|
||||
In addition to disabling multiversion tests based on FCV, there is no need to run in-development
|
||||
`featureFlagXYZ` tests (featureFlags that have `default: false`) because these tests will most
|
||||
likely fail on older versions that have not implemented this feature. For multiversion tasks, we
|
||||
pass the `--runNoFeatureFlagTests` flag to avoid these failures on `all feature flag` variants.
|
||||
|
||||
For more info on FCV, take a look at [FCV_AND_FEATURE_FLAG_README.md](https://github.com/mongodb/mongo/blob/master/src/mongo/db/repl/FCV_AND_FEATURE_FLAG_README.md).
|
||||
For more info on FCV, take a look at
|
||||
[FCV_AND_FEATURE_FLAG_README.md](https://github.com/mongodb/mongo/blob/master/src/mongo/db/repl/FCV_AND_FEATURE_FLAG_README.md).
|
||||
|
||||
Another common case could be that the changes on master branch are breaking multiversion tests,
|
||||
but with those changes backported to the older branches the multiversion tests should work.
|
||||
In order to temporarily disable the test from running in multiversion it can be added to the
|
||||
Another common case could be that the changes on master branch are breaking multiversion tests, but
|
||||
with those changes backported to the older branches the multiversion tests should work. In order to
|
||||
temporarily disable the test from running in multiversion it can be added to the
|
||||
[etc/backports_required_for_multiversion_tests.yml](https://github.com/mongodb/mongo/blob/fcdfe29cee066278b94ea2749456fc433cc398c6/etc/backports_required_for_multiversion_tests.yml#L1-L19).
|
||||
Please follow the instructions described in the file.
|
||||
|
||||
@ -7,21 +7,22 @@ evergreen command.
|
||||
Task generation allows us to do things like dynamically split a task into sub-tasks that can be run
|
||||
in parallel, or generate sub-tasks to run against different mongodb versions.
|
||||
|
||||
Task generation is typically done with the [mongo-task-generator](https://github.com/mongodb/mongo-task-generator)
|
||||
tool. Refer to its [documentation](https://github.com/mongodb/mongo-task-generator/blob/master/docs/generating_tasks.md)
|
||||
Task generation is typically done with the
|
||||
[mongo-task-generator](https://github.com/mongodb/mongo-task-generator) tool. Refer to its
|
||||
[documentation](https://github.com/mongodb/mongo-task-generator/blob/master/docs/generating_tasks.md)
|
||||
for details on how it works.
|
||||
|
||||
## Configuring a task to be generated
|
||||
|
||||
In order to generate a task, we typically create a placeholder task. By convention the name of
|
||||
these tasks should end in "\_gen". Most of the time, generated tasks should inherit the
|
||||
In order to generate a task, we typically create a placeholder task. By convention the name of these
|
||||
tasks should end in "\_gen". Most of the time, generated tasks should inherit the
|
||||
[gen_task_template](https://github.com/mongodb/mongo/blob/31864e3866ce9cc54c08463019846ded2ad9e6e5/etc/evergreen_yml_components/definitions.yml#L99-L107)
|
||||
which configures the required dependencies.
|
||||
|
||||
The placeholder task needs to have the "generate resmoke tasks" function as one of its `commands`.
|
||||
This is how the `mongo-task-generator` knows that the task needs to be generated. You can also
|
||||
add `vars` to the function call to configure how the task will generated. You can refer to
|
||||
the [mongo-task-generator](https://github.com/mongodb/mongo-task-generator/blob/master/docs/generating_tasks.md#use-cases)
|
||||
This is how the `mongo-task-generator` knows that the task needs to be generated. You can also add
|
||||
`vars` to the function call to configure how the task will be generated. You can refer to the
|
||||
[mongo-task-generator](https://github.com/mongodb/mongo-task-generator/blob/master/docs/generating_tasks.md#use-cases)
|
||||
documentation for details on what options are available.
|
||||
|
||||
Once a placeholder task is defined, you can reference it just like a normal task.
|
||||
@ -40,15 +41,15 @@ Task generation is performed as a 2-step process.
|
||||
additional tasks in the future, they will exist to be run.
|
||||
|
||||
This step will also hide all the placeholder tasks into a display task called `generator_tasks`
|
||||
in each build variant. Once task generation is completed, the user should perform actions on
|
||||
the generated tasks instead of the placeholder tasks; we encourage this by hiding the
|
||||
placeholder tasks from view.
|
||||
in each build variant. Once task generation is completed, the user should perform actions on the
|
||||
generated tasks instead of the placeholder tasks; we encourage this by hiding the placeholder
|
||||
tasks from view.
|
||||
|
||||
2. After the tasks have been generated, the placeholder tasks are free to run. The placeholder tasks
|
||||
simply find the task generated for them and mark it activated. Since generated tasks are
|
||||
created in the "inactive" state, this will activate any generated tasks whose placeholder task
|
||||
runs. This enables users to select tasks to run on the initial task selection page even though
|
||||
the tasks have not yet been generated.
|
||||
simply find the task generated for them and mark it activated. Since generated tasks are created
|
||||
in the "inactive" state, this will activate any generated tasks whose placeholder task runs. This
|
||||
enables users to select tasks to run on the initial task selection page even though the tasks
|
||||
have not yet been generated.
|
||||
|
||||
**Note**: While this 2-step process allows a similar user experience to working with normal tasks,
|
||||
it does create a few UI quirks. For example, evergreen will hide "inactive" tasks in the UI, as a
|
||||
|
||||
@ -2,10 +2,15 @@
|
||||
|
||||
## Types of timeouts
|
||||
|
||||
There are two types of timeouts that [Evergreen supports](https://github.com/evergreen-ci/evergreen/wiki/Project-Commands#timeoutupdate):
|
||||
There are two types of timeouts that
|
||||
[Evergreen supports](https://github.com/evergreen-ci/evergreen/wiki/Project-Commands#timeoutupdate):
|
||||
|
||||
- **Exec Timeout**: The _exec timeout_ is the overall timeout for a task. Once the total runtime for a test exceeds this value, the timeout logic will be triggered. This value is specified by `exec_timeout_secs` in the Evergreen configuration.
|
||||
- **Idle Timeout**: The _idle timeout_ is the amount of time Evergreen will wait for output to be generated before considering the task hung and triggering the timeout logic. This value is specified by `timeout_secs` in the Evergreen configuration.
|
||||
- **Exec Timeout**: The _exec timeout_ is the overall timeout for a task. Once the total runtime for
|
||||
a test exceeds this value, the timeout logic will be triggered. This value is specified by
|
||||
`exec_timeout_secs` in the Evergreen configuration.
|
||||
- **Idle Timeout**: The _idle timeout_ is the amount of time Evergreen will wait for output to be
|
||||
generated before considering the task hung and triggering the timeout logic. This value is
|
||||
specified by `timeout_secs` in the Evergreen configuration.
|
||||
|
||||
**Note**: In most cases, the **exec timeout** is the more useful of the two timeouts.
|
||||
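The difference between the two timeouts can be sketched in a few lines of Python. This is a toy
model, not Evergreen's implementation: the function name, its arguments, and the choice to report
the exec timeout first when both limits are exceeded are assumptions made for illustration. Only
`exec_timeout_secs` and `timeout_secs` come from the Evergreen configuration.

```python
from typing import Optional


def timeout_reason(
    total_runtime_secs: float,
    secs_since_last_output: float,
    exec_timeout_secs: float,
    timeout_secs: float,
) -> Optional[str]:
    """Return which timeout would trigger, or None if the task may keep running."""
    # Exec timeout: the task as a whole has been running for too long.
    if total_runtime_secs > exec_timeout_secs:
        return "exec"
    # Idle timeout: the task has produced no output for too long.
    if secs_since_last_output > timeout_secs:
        return "idle"
    return None
```

For example, a task that has run for 100 seconds but has been silent for 700 seconds would hit the
idle timeout under `timeout_secs=600`, even though `exec_timeout_secs=3600` is far from exhausted.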
|
||||
@ -15,15 +20,27 @@ There are several ways to set the timeout for a task running in Evergreen.
|
||||
|
||||
### Specifying timeouts in the Evergreen YAML configuration
|
||||
|
||||
Timeouts can be specified directly in the `evergreen.yml` (and related) files, both for tasks and build variants. This approach is useful for setting default timeout values but is limited because different build variants often have varying runtime characteristics. This means it is not possible to set timeouts for a specific task running on a specific build variant using only this method.
|
||||
Timeouts can be specified directly in the `evergreen.yml` (and related) files, both for tasks and
|
||||
build variants. This approach is useful for setting default timeout values but is limited because
|
||||
different build variants often have varying runtime characteristics. This means it is not possible
|
||||
to set timeouts for a specific task running on a specific build variant using only this method.
|
||||
|
||||
### Overrides: [etc/evergreen_timeouts.yml](../../etc/evergreen_timeouts.yml)
|
||||
|
||||
The `etc/evergreen_timeouts.yml` file allows overriding timeouts for specific tasks on specific build variants. This workaround helps address the limitations of directly specifying timeouts in `evergreen.yml`. To use this method, the task must include the `determine task timeout` and `update task timeout expansions` functions at the beginning of its Evergreen definition. Many Resmoke tasks already incorporate these functions.
|
||||
The `etc/evergreen_timeouts.yml` file allows overriding timeouts for specific tasks on specific
|
||||
build variants. This workaround helps address the limitations of directly specifying timeouts in
|
||||
`evergreen.yml`. To use this method, the task must include the `determine task timeout` and
|
||||
`update task timeout expansions` functions at the beginning of its Evergreen definition. Many
|
||||
Resmoke tasks already incorporate these functions.
|
||||
|
||||
### Resmoke tasks: [buildscripts/evergreen_task_timeout.py](../../buildscripts/evergreen_task_timeout.py)
|
||||
|
||||
This script reads the `etc/evergreen_timeouts.yml` file to calculate the appropriate timeout settings. Additionally, it checks historical test results for the task being run to determine if enough information is available to calculate timeouts based on past data. The script also supports more advanced methods of determining timeouts, such as applying aggressive timeout measures for tasks executed in the commit queue or on required build variants. In cases of conflict, the commit queue and required build variant limits take precedence over the previous two methods.
|
||||
This script reads the `etc/evergreen_timeouts.yml` file to calculate the appropriate timeout
|
||||
settings. Additionally, it checks historical test results for the task being run to determine if
|
||||
enough information is available to calculate timeouts based on past data. The script also supports
|
||||
more advanced methods of determining timeouts, such as applying aggressive timeout measures for
|
||||
tasks executed in the commit queue or on required build variants. In cases of conflict, the commit
|
||||
queue and required build variant limits take precedence over the previous two methods.
|
||||
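The precedence described above can be illustrated with a small sketch. It is not the logic of
`buildscripts/evergreen_task_timeout.py`; the function name, the parameters, the ordering between
the override file and historical data, and the modelling of the commit queue and required build
variant limits as an upper cap are all assumptions chosen to make the idea concrete.

```python
from typing import Optional


def resolve_exec_timeout(
    default_secs: int,
    override_secs: Optional[int] = None,
    historic_secs: Optional[int] = None,
    limit_secs: Optional[int] = None,
) -> int:
    """Pick an exec timeout from the available sources (illustrative only)."""
    # Assumption: an explicit override wins over a history-based estimate,
    # which in turn wins over the default.
    if override_secs is not None:
        chosen = override_secs
    elif historic_secs is not None:
        chosen = historic_secs
    else:
        chosen = default_secs

    # Assumption: the commit queue / required build variant limit acts as a cap
    # that takes precedence over whatever was chosen above.
    if limit_secs is not None:
        chosen = min(chosen, limit_secs)
    return chosen
```

With these assumptions, an override of 7200 seconds combined with a limit of 2400 seconds resolves
to 2400 seconds, because the limit wins in case of conflict.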
|
||||
The timeout that was calculated by the script can be retrieved from the logs:
|
||||
|
||||
@ -38,4 +55,8 @@ The timeout that was calculated by the script can be retrieved from the logs:
|
||||
|
||||
### Compile tasks: [evergreen/generate_override_timeout.py](../../evergreen/generate_override_timeout.py)
|
||||
|
||||
This script is used for compile tasks defined in files such as `etc/evergreen_yml_components/tasks/compile_tasks.yml` and `etc/evergreen_yml_components/tasks/compile_tasks_shared.yml`. The script reads the `etc/evergreen_timeouts.yml` file and calculates appropriate timeouts. The Evergreen function `override task timeout` then runs this script to update the timeouts accordingly.
|
||||
This script is used for compile tasks defined in files such as
|
||||
`etc/evergreen_yml_components/tasks/compile_tasks.yml` and
|
||||
`etc/evergreen_yml_components/tasks/compile_tasks_shared.yml`. The script reads the
|
||||
`etc/evergreen_timeouts.yml` file and calculates appropriate timeouts. The Evergreen function
|
||||
`override task timeout` then runs this script to update the timeouts accordingly.
|
||||
|
||||
@ -1,37 +1,47 @@
|
||||
# Build Variants
|
||||
|
||||
This document describes build variants (a.k.a. variants, or builds, or buildvariants) that are used in `mongodb-mongo-*` projects.
|
||||
To know more about build variants, please refer to the [Build Variants](https://docs.devprod.prod.corp.mongodb.com/evergreen/Project-Configuration/Project-Configuration-Files#build-variants) section of the Evergreen wiki.
|
||||
This document describes build variants (a.k.a. variants, or builds, or buildvariants) that are used
|
||||
in `mongodb-mongo-*` projects. To know more about build variants, please refer to the
|
||||
[Build Variants](https://docs.devprod.prod.corp.mongodb.com/evergreen/Project-Configuration/Project-Configuration-Files#build-variants)
|
||||
section of the Evergreen wiki.
|
||||
|
||||
## YAML files structure
|
||||
|
||||
Build variant configuration files are in `etc/evergreen_yml_components/variants` directory.
|
||||
They are merged into `etc/evergreen.yml` and `etc/evergreen_nightly.yml` with Evergreen's [include](https://docs.devprod.prod.corp.mongodb.com/evergreen/Project-Configuration/Project-Configuration-Files#include) feature.
|
||||
Build variant configuration files are in `etc/evergreen_yml_components/variants` directory. They are
|
||||
merged into `etc/evergreen.yml` and `etc/evergreen_nightly.yml` with Evergreen's
|
||||
[include](https://docs.devprod.prod.corp.mongodb.com/evergreen/Project-Configuration/Project-Configuration-Files#include)
|
||||
feature.
|
||||
|
||||
Inside `etc/evergreen_yml_components/variants` directory there are more directories,
|
||||
which are in most cases platform names (e.g. amazon, rhel etc.) or build variant group names (e.g. sanitizer etc.).
|
||||
Inside `etc/evergreen_yml_components/variants` directory there are more directories, which are in
|
||||
most cases platform names (e.g. amazon, rhel etc.) or build variant group names (e.g. sanitizer
|
||||
etc.).
|
||||
|
||||
Be aware that some of these files could be also used or re-used to be merged into `etc/system_perf.yml` which is used for `sys-perf` project.
|
||||
Be aware that some of these files could be also used or re-used to be merged into
|
||||
`etc/system_perf.yml` which is used for `sys-perf` project.
|
||||
|
||||
## Build Variants in `mongodb-mongo-master` and `mongodb-mongo-master-nightly`
|
||||
|
||||
`mongodb-mongo-master` evergreen project uses `etc/evergreen.yml` and contains all build variants for development, including all feature-specific, patch build required, and suggested variants.
|
||||
`mongodb-mongo-master` evergreen project uses `etc/evergreen.yml` and contains all build variants
|
||||
for development, including all feature-specific, patch build required, and suggested variants.
|
||||
|
||||
`mongodb-mongo-master-nightly` evergreen project uses `etc/evergreen_nightly.yml` and contains build variants for public nightly builds.
|
||||
`mongodb-mongo-master-nightly` evergreen project uses `etc/evergreen_nightly.yml` and contains build
|
||||
variants for public nightly builds.
|
||||
|
||||
## Required and Suggested Build Variants
|
||||
|
||||
"Required" build variants are defined as any build variant with a `!` at the front of its display name in Evergreen.
|
||||
These build variants also have `required` tag.
|
||||
"Required" build variants are defined as any build variant with a `!` at the front of its display
|
||||
name in Evergreen. These build variants also have `required` tag.
|
||||
|
||||
[Required Patch Builds Policy](https://wiki.corp.mongodb.com/display/KERNEL/Required+Patch+Builds+Policy)
|
||||
|
||||
"Suggested" build variants are defined as any build variant with a `*` at the front of its display name in Evergreen.
|
||||
These build variants also have `suggested` tag.
|
||||
"Suggested" build variants are defined as any build variant with a `*` at the front of its display
|
||||
name in Evergreen. These build variants also have `suggested` tag.
|
||||
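The display name convention above is simple enough to express as a small helper. This is a sketch
for illustration only: the function name and the example display names are hypothetical, and the
real classification also relies on the `required` and `suggested` tags.

```python
def classify_variant(display_name: str) -> str:
    """Classify a build variant by the prefix of its Evergreen display name."""
    if display_name.startswith("!"):
        return "required"
    if display_name.startswith("*"):
        return "suggested"
    return "other"


# Hypothetical display names, used only to show the convention.
examples = ["! Example Required Variant", "* Example Suggested Variant", "Example Variant"]
kinds = [classify_variant(name) for name in examples]
```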
|
||||
## Build Variants with forbid_tasks_tagged_with_experimental
|
||||
|
||||
Build variants with the `forbid_tasks_tagged_with_experimental` tag indicate that they do not allow tasks tagged as `experimental` to run. This tag is used in conjunction with the `forbid-tasks-with-tag-on-variants` evergreen lint rule to enforce this restriction.
|
||||
Build variants with the `forbid_tasks_tagged_with_experimental` tag indicate that they do not allow
|
||||
tasks tagged as `experimental` to run. This tag is used in conjunction with the
|
||||
`forbid-tasks-with-tag-on-variants` evergreen lint rule to enforce this restriction.
|
||||
|
||||
## Build Variants after branching
|
||||
|
||||
@ -39,34 +49,48 @@ In each of platform or build variant group directory there can be these files:
|
||||
|
||||
- `test_dev.yml`
|
||||
|
||||
- these files are merged into `etc/evergreen.yml` which is used for `mongodb-mongo-master` project on master branch
|
||||
- after branching on all new branches these files are merged into `etc/evergreen_nightly.yml` which is used for a new branch `mongodb-mongo-vX.Y` project
|
||||
- these files are merged into `etc/evergreen.yml` which is used for `mongodb-mongo-master` project
|
||||
on master branch
|
||||
- after branching on all new branches these files are merged into `etc/evergreen_nightly.yml`
|
||||
which is used for a new branch `mongodb-mongo-vX.Y` project
|
||||
|
||||
- `test_dev_master_and_lts_branches_only.yml`
|
||||
|
||||
- these files are merged into `etc/evergreen.yml` which is used for `mongodb-mongo-master` project on master branch
|
||||
- after branching for LTS release (v7.0, v8.0 etc.) on a new branch these files are merged into `etc/evergreen_nightly.yml` which is used for a new branch `mongodb-mongo-vX.Y` project
|
||||
- **important**: all tests that are running on these build variants will NOT run on a new Rapid release (v7.1, v7.2, v7.3, v8.1, v8.2, v8.3 etc.) branch projects
|
||||
- these files are merged into `etc/evergreen.yml` which is used for `mongodb-mongo-master` project
|
||||
on master branch
|
||||
- after branching for LTS release (v7.0, v8.0 etc.) on a new branch these files are merged into
|
||||
`etc/evergreen_nightly.yml` which is used for a new branch `mongodb-mongo-vX.Y` project
|
||||
- **important**: all tests that are running on these build variants will NOT run on a new Rapid
|
||||
release (v7.1, v7.2, v7.3, v8.1, v8.2, v8.3 etc.) branch projects
|
||||
|
||||
- `test_dev_master_branch_only.yml`
|
||||
|
||||
- these files are merged into `etc/evergreen.yml` which is used for `mongodb-mongo-master` project on master branch
|
||||
- these files are merged into `etc/evergreen.yml` which is used for `mongodb-mongo-master` project
|
||||
on master branch
|
||||
- after branching on all new branches these files are NOT used
|
||||
- **important**: all tests that are running on these build variants will NOT run on a new branch `mongodb-mongo-vX.Y` project
|
||||
- **important**: all tests that are running on these build variants will NOT run on a new branch
|
||||
`mongodb-mongo-vX.Y` project
|
||||
|
||||
- `test_release.yml`
|
||||
|
||||
- these files are merged into `etc/evergreen_nightly.yml` which is used for `mongodb-mongo-master-nightly` project on master branch
|
||||
- after branching on all new branches these files are merged into `etc/evergreen_nightly.yml` which is used for a new branch `mongodb-mongo-vX.Y` project
|
||||
- these files are merged into `etc/evergreen_nightly.yml` which is used for
|
||||
`mongodb-mongo-master-nightly` project on master branch
|
||||
- after branching on all new branches these files are merged into `etc/evergreen_nightly.yml`
|
||||
which is used for a new branch `mongodb-mongo-vX.Y` project
|
||||
|
||||
- `test_release_master_and_lts_branches_only.yml`
|
||||
|
||||
- these files are merged into `etc/evergreen_nightly.yml` which is used for `mongodb-mongo-master-nightly` project on master branch
|
||||
- after branching for LTS release (v7.0, v8.0 etc.) on a new branch these files are merged into `etc/evergreen_nightly.yml` which is used for a new branch `mongodb-mongo-vX.Y` project
|
||||
- **important**: all tests that are running on these build variants will NOT run on a new Rapid release (v7.1, v7.2, v7.3, v8.1, v8.2, v8.3 etc.) branch projects
|
||||
- these files are merged into `etc/evergreen_nightly.yml` which is used for
|
||||
`mongodb-mongo-master-nightly` project on master branch
|
||||
- after branching for LTS release (v7.0, v8.0 etc.) on a new branch these files are merged into
|
||||
`etc/evergreen_nightly.yml` which is used for a new branch `mongodb-mongo-vX.Y` project
|
||||
- **important**: all tests that are running on these build variants will NOT run on a new Rapid
|
||||
release (v7.1, v7.2, v7.3, v8.1, v8.2, v8.3 etc.) branch projects
|
||||
|
||||
- `test_release_master_branch_only.yml`
|
||||
|
||||
- these files are merged into `etc/evergreen_nightly.yml` which is used for `mongodb-mongo-master-nightly` project on master branch
|
||||
- these files are merged into `etc/evergreen_nightly.yml` which is used for
|
||||
`mongodb-mongo-master-nightly` project on master branch
|
||||
- after branching on all new branches these files are NOT used
|
||||
- **important**: all tests that are running on these build variants will NOT run on a new branch `mongodb-mongo-vX.Y` project
|
||||
- **important**: all tests that are running on these build variants will NOT run on a new branch
|
||||
`mongodb-mongo-vX.Y` project
|
||||
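The rules in the list above can be summarized as a lookup. The sketch below is an illustration, not
part of the build tooling: the function name and the branch labels `"master"`, `"lts"`, and
`"rapid"` are assumptions, while the file names and target configuration files are taken from the
list.

```python
from typing import Optional

# (kind, scope) per variant file: kind selects the config on master,
# scope decides on which release branches the file is still used.
_VARIANT_FILES = {
    "test_dev.yml": ("dev", "all"),
    "test_dev_master_and_lts_branches_only.yml": ("dev", "master_and_lts"),
    "test_dev_master_branch_only.yml": ("dev", "master_only"),
    "test_release.yml": ("release", "all"),
    "test_release_master_and_lts_branches_only.yml": ("release", "master_and_lts"),
    "test_release_master_branch_only.yml": ("release", "master_only"),
}

_BRANCHES_BY_SCOPE = {
    "all": {"lts", "rapid"},
    "master_and_lts": {"lts"},
    "master_only": set(),
}


def merged_into(filename: str, branch_kind: str) -> Optional[str]:
    """Return the config file a variant file is merged into, or None if unused."""
    kind, scope = _VARIANT_FILES[filename]
    if branch_kind == "master":
        return "etc/evergreen.yml" if kind == "dev" else "etc/evergreen_nightly.yml"
    # On release branches everything that survives goes into evergreen_nightly.yml.
    if branch_kind in _BRANCHES_BY_SCOPE[scope]:
        return "etc/evergreen_nightly.yml"
    return None
```

For instance, `merged_into("test_dev_master_and_lts_branches_only.yml", "rapid")` returns `None`,
which matches the note that tests on those build variants do not run on Rapid release branches.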
|
||||
@ -11,14 +11,14 @@ section of the Evergreen wiki.
|
||||
|
||||
### `mongodb-mongo-master`
|
||||
|
||||
The main project for testing MongoDB's dev environments with a number of build variants,
|
||||
each one corresponding to a particular compile or testing environment to support development.
|
||||
Each build variant runs a set of tasks; each task usually runs one or more tests.
|
||||
The main project for testing MongoDB's dev environments with a number of build variants, each one
|
||||
corresponding to a particular compile or testing environment to support development. Each build
|
||||
variant runs a set of tasks; each task usually runs one or more tests.
|
||||
|
||||
### `mongodb-mongo-master-nightly`
|
||||
|
||||
Tracks the same branch as `mongodb-mongo-master`, each build variant corresponds to a
|
||||
(version, OS, architecture) triplet for a supported MongoDB nightly release.
|
||||
Tracks the same branch as `mongodb-mongo-master`, each build variant corresponds to a (version, OS,
|
||||
architecture) triplet for a supported MongoDB nightly release.
|
||||
|
||||
### `sys_perf`
|
||||
|
||||
@ -28,22 +28,23 @@ The system performance project.
|
||||
|
||||
The above Evergreen projects are defined in the following files:
|
||||
|
||||
- `etc/evergreen_yml_components/**.yml`. YAML files containing definitions for tasks, functions, buildvariants, etc.
|
||||
They are copied from the existing evergreen.yml file.
|
||||
- `etc/evergreen_yml_components/**.yml`. YAML files containing definitions for tasks, functions,
|
||||
buildvariants, etc. They are copied from the existing evergreen.yml file.
|
||||
|
||||
- `etc/evergreen.yml`. Imports components from above and serves as the project config for mongodb-mongo-master,
|
||||
containing all build variants for development, including all feature-specific, patch build required, and suggested
|
||||
variants.
|
||||
- `etc/evergreen.yml`. Imports components from above and serves as the project config for
|
||||
mongodb-mongo-master, containing all build variants for development, including all
|
||||
feature-specific, patch build required, and suggested variants.
|
||||
|
||||
- `etc/evergreen_nightly.yml`. The project configuration for mongodb-mongo-master-nightly, containing only build
|
||||
variants for public nightly builds, imports similar components as evergreen.yml to ensure consistency.
|
||||
- `etc/evergreen_nightly.yml`. The project configuration for mongodb-mongo-master-nightly,
|
||||
containing only build variants for public nightly builds, imports similar components as
|
||||
evergreen.yml to ensure consistency.
|
||||
|
||||
- `etc/sys_perf.yml`. Configuration file for the system performance project.
|
||||
|
||||
## Release Branching Process
|
||||
|
||||
Only the `mongodb-mongo-master-nightly` project will be branched with required and other
|
||||
necessary variants (e.g. sanitizers) added back in. Most variants in `mongodb-mongo-master`
|
||||
would be dropped by default but can be re-introduced to the release branches manually on an
|
||||
as-needed basis. For Rapid releases, all but the variants relevant to Atlas in
|
||||
`mongodb-mongo-master-nightly` may be dropped as well.
|
||||
Only the `mongodb-mongo-master-nightly` project will be branched with required and other necessary
|
||||
variants (e.g. sanitizers) added back in. Most variants in `mongodb-mongo-master` would be dropped
|
||||
by default but can be re-introduced to the release branches manually on an as-needed basis. For
|
||||
Rapid releases, all but the variants relevant to Atlas in `mongodb-mongo-master-nightly` may be
|
||||
dropped as well.
|
||||
|
||||
@ -1,11 +1,15 @@
|
||||
# Task ownership tags
|
||||
|
||||
This document describes task ownership tags that are used in `mongodb-mongo-master` and `mongodb-mongo-master-nightly` projects.
|
||||
This document describes task ownership tags that are used in `mongodb-mongo-master` and
|
||||
`mongodb-mongo-master-nightly` projects.
|
||||
|
||||
Every task in `mongodb-mongo-master` and `mongodb-mongo-master-nightly` projects should be tagged with exactly one `assigned_to_jira_team_.+` tag.
|
||||
Team names (the part after `assigned_to_jira_team_`) should match `evergreen_tag_name` from team configurations in [mothra](https://github.com/10gen/mothra/tree/main/mothra/teams).
|
||||
Every task in `mongodb-mongo-master` and `mongodb-mongo-master-nightly` projects should be tagged
|
||||
with exactly one `assigned_to_jira_team_.+` tag. Team names (the part after
|
||||
`assigned_to_jira_team_`) should match `evergreen_tag_name` from team configurations in
|
||||
[mothra](https://github.com/10gen/mothra/tree/main/mothra/teams).
|
||||
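The "exactly one ownership tag" rule can be sketched as follows. This is not the linter's
implementation (that lives in the linter configuration referenced in this document); the function
names and the example team names are hypothetical, and only the `assigned_to_jira_team_.+` pattern
comes from the text above.

```python
import re
from typing import Iterable, List

OWNERSHIP_TAG = re.compile(r"assigned_to_jira_team_.+")


def ownership_tags(tags: Iterable[str]) -> List[str]:
    """Return the tags that match the ownership tag pattern."""
    return [tag for tag in tags if OWNERSHIP_TAG.fullmatch(tag)]


def has_valid_ownership(tags: Iterable[str]) -> bool:
    """A task is valid when it carries exactly one ownership tag."""
    return len(ownership_tags(tags)) == 1


# Hypothetical team names, used only to exercise the check.
ok = has_valid_ownership(["default", "assigned_to_jira_team_example_team"])
missing = has_valid_ownership(["default"])
duplicated = has_valid_ownership(
    ["assigned_to_jira_team_example_team", "assigned_to_jira_team_other_team"]
)
```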
|
||||
This is enforced by linter. YAML linter configuration could be found [here](../../../etc/evergreen_lint.yml).
|
||||
This is enforced by linter. YAML linter configuration could be found
|
||||
[here](../../../etc/evergreen_lint.yml).
|
||||
|
||||
If the linter configuration is missing your team:
|
||||
|
||||
@ -13,4 +17,7 @@ If the linter configuration is missing your team:
|
||||
2. Make sure that your team configuration in mothra has `evergreen_tag_name`
|
||||
3. Update the tag list with `assigned_to_jira_team_{evergreen_tag_name}` tag for your team
|
||||
|
||||
Dynamically generated tasks for resmoke suites (i.e. the ones named like `//buildscripts/resmokeconfig:core`) will set the ownership tag based on a best effort lookup from the codeowner of the test's definition to a team name from mothra, picking the first encountered in case of multiple possible assignments.
|
||||
Dynamically generated tasks for resmoke suites (i.e. the ones named like
|
||||
`//buildscripts/resmokeconfig:core`) will set the ownership tag based on a best effort lookup from
|
||||
the codeowner of the test's definition to a team name from mothra, picking the first encountered in
|
||||
case of multiple possible assignments.
|
||||
|
||||
@ -1,49 +1,58 @@
|
||||
# Task selection tags
|
||||
|
||||
This document describes task selection tags that are used in `mongodb-mongo-master` and `mongodb-mongo-master-nightly` projects.
|
||||
To know more about task tags, please refer to the [Task and Variant Tags](https://docs.devprod.prod.corp.mongodb.com/evergreen/Project-Configuration/Project-Configuration-Files#task-and-variant-tags) section of the Evergreen wiki.
|
||||
This document describes task selection tags that are used in `mongodb-mongo-master` and
|
||||
`mongodb-mongo-master-nightly` projects. To know more about task tags, please refer to the
|
||||
[Task and Variant Tags](https://docs.devprod.prod.corp.mongodb.com/evergreen/Project-Configuration/Project-Configuration-Files#task-and-variant-tags)
|
||||
section of the Evergreen wiki.
|
||||
|
||||
The majority of variants in `mongodb-mongo-master-nightly` project and the most significant variants in `mongodb-mongo-master` project are using required and optional groups of task selection tags.
|
||||
In order to add tasks to those variants, please use them as described in the following sections.
|
||||
The majority of variants in `mongodb-mongo-master-nightly` project and the most significant variants
|
||||
in `mongodb-mongo-master` project are using required and optional groups of task selection tags. In
|
||||
order to add tasks to those variants, please use them as described in the following sections.
|
||||
|
||||
## Required task selection tags
|
||||
|
||||
Every task in `mongodb-mongo-master` and `mongodb-mongo-master-nightly` project must be tagged with exactly one required selection tag.
|
||||
This is enforced by linter. YAML linter configuration could be found [here](../../../etc/evergreen_lint.yml).
|
||||
Every task in `mongodb-mongo-master` and `mongodb-mongo-master-nightly` project must be tagged with
|
||||
exactly one required selection tag. This is enforced by linter. YAML linter configuration could be
|
||||
found [here](../../../etc/evergreen_lint.yml).
|
||||
|
||||
- `development_critical` - these tasks should be green prior to the merge and will block merging if failing, e.g. jsCore.
|
||||
We run these tasks on all variants and in the commit-queue.
|
||||
- `development_critical` - these tasks should be green prior to the merge and will block merging if
|
||||
failing, e.g. jsCore. We run these tasks on all variants and in the commit-queue.
|
||||
|
||||
- `development_critical_single_variant` - the same as `development_critical` but these tasks do not need to run on multiple variants, e.g. clang-tidy, formatters, linters etc.
|
||||
We run these tasks on the required variant and in the commit-queue.
|
||||
- `development_critical_single_variant` - the same as `development_critical` but these tasks do not
|
||||
need to run on multiple variants, e.g. clang-tidy, formatters, linters etc. We run these tasks
|
||||
on the required variant and in the commit-queue.
|
||||
|
||||
- `no_commit_queue` - add this to tasks in development_critical that you do not want in the commit-queue
|
||||
- `no_commit_queue` - add this to tasks in development_critical that you do not want in the
|
||||
commit-queue
|
||||
|
||||
- `release_critical` - these tasks should be green prior to the release.
|
||||
We run these tasks on all release and development (required and suggested) variants.
|
||||
It should be uncommon to add tasks to this tag but if your task needs to run on many different OSes and it is extremely broad in coverage then you can add it to this tag.
|
||||
- `release_critical` - these tasks should be green prior to the release. We run these tasks on all
|
||||
release and development (required and suggested) variants. It should be uncommon to add tasks to
|
||||
this tag but if your task needs to run on many different OSes and it is extremely broad in
|
||||
coverage then you can add it to this tag.
|
||||
|
||||
- `default` - these tasks are running as part of a required patch build.
|
||||
We run these tasks on the most significant development variants (required patches, tsan, aubsan, etc.).
|
||||
Use this tag if you are not sure which tag to use for your new task.
|
||||
- `default` - these tasks are running as part of a required patch build. We run these tasks on the
|
||||
most significant development variants (required patches, tsan, aubsan, etc.). Use this tag if you
|
||||
are not sure which tag to use for your new task.
|
||||
|
||||
- `non_deterministic` - these tasks depend significantly on randomization and we expect to see some unique failures, e.g. fuzzers etc.
|
||||
We run these tasks on non-required development variants.
|
||||
- `non_deterministic` - these tasks depend significantly on randomization and we expect to see some
|
||||
unique failures, e.g. fuzzers etc. We run these tasks on non-required development variants.
|
||||
|
||||
- `experimental` - these tasks are not running anywhere regularly.
|
||||
We do not use this tag for selecting tasks to run on variants.
|
||||
This tag could be used for tasks that you would like to run on your own custom variants.
|
||||
- `experimental` - these tasks are not running anywhere regularly. We do not use this tag for
|
||||
selecting tasks to run on variants. This tag could be used for tasks that you would like to run on
|
||||
your own custom variants.
|
||||
|
||||
- `auxiliary` - these are various setup, helper, etc. tasks and should be mostly owned by infrastructure team.
|
||||
You should almost never use this tag.
|
||||
Please reach out to [#ask-devprod-build](https://mongodb.enterprise.slack.com/archives/CR8SNBY0N) before adding tasks with this tag.
|
||||
- `auxiliary` - these are various setup, helper, etc. tasks and should be mostly owned by
|
||||
infrastructure team. You should almost never use this tag. Please reach out to
|
||||
[#ask-devprod-build](https://mongodb.enterprise.slack.com/archives/CR8SNBY0N) before adding tasks
|
||||
with this tag.
|
||||
|
||||
**Important**: Do not change anything in this list without talking to [#ask-devprod-build](https://mongodb.enterprise.slack.com/archives/CR8SNBY0N).
|
||||
**Important**: Do not change anything in this list without talking to
|
||||
[#ask-devprod-build](https://mongodb.enterprise.slack.com/archives/CR8SNBY0N).
|
||||
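A check for the "exactly one required selection tag" rule might look like the sketch below. It is
an illustration rather than the linter's code: the function names are hypothetical, and treating
`no_commit_queue` as a modifier of `development_critical` instead of one of the mutually exclusive
tags is an assumption based on how the list describes it.

```python
from typing import Iterable, List

# Assumption: these are the mutually exclusive required selection tags.
# `no_commit_queue` is left out because the list describes it as an addition
# to tasks that are already tagged `development_critical`.
REQUIRED_SELECTION_TAGS = frozenset(
    {
        "development_critical",
        "development_critical_single_variant",
        "release_critical",
        "default",
        "non_deterministic",
        "experimental",
        "auxiliary",
    }
)


def required_selection_tags(tags: Iterable[str]) -> List[str]:
    """Return the required selection tags found on a task."""
    return [tag for tag in tags if tag in REQUIRED_SELECTION_TAGS]


def has_exactly_one_required_tag(tags: Iterable[str]) -> bool:
    """A task passes when it carries exactly one required selection tag."""
    return len(required_selection_tags(tags)) == 1
```

Under these assumptions, a task tagged `development_critical` and `no_commit_queue` passes, while
a task tagged both `default` and `experimental` does not.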
|
||||
## Optional task selection tags
|
||||
|
||||
In addition to the required task selection tags there is a list of optional selection tags.
|
||||
Every task could be tagged with any number of the following tags:
|
||||
In addition to the required task selection tags there is a list of optional selection tags. Every
|
||||
task could be tagged with any number of the following tags:
|
||||
|
||||
- `incompatible_community` - the task should be excluded from the community variants.
|
||||
- `incompatible_windows` - the task should be excluded from Windows variants.
|
||||
@ -55,16 +64,20 @@ Every task could be tagged with any number of the following tags:
|
||||
- `incompatible_aubsan` - the task should be excluded from {A,UB}SAN variants.
|
||||
- `incompatible_tsan` - the task should be excluded from TSAN variants.
|
||||
- `incompatible_debug_mode` - the task should be excluded from Debug Mode variants.
|
||||
- `incompatible_system_allocator` - the task should be excluded from variants that use the system allocator.
|
||||
- `incompatible_system_allocator` - the task should be excluded from variants that use the system
|
||||
allocator.
|
||||
- `incompatible_all_feature_flags` - the task should be excluded from all-feature-flags variants.
|
||||
- `incompatible_development_variant` - the task should be excluded from the development variants.
|
||||
- `incompatible_oscrypto` - the task should be excluded from variants unsupported by oscrypto.
|
||||
- `requires_compile_variant` - the task can (or should) only run on variants that have compile-related expansions.
|
||||
- `requires_compile_variant` - the task can (or should) only run on variants that have
|
||||
compile-related expansions.
|
||||
- `requires_large_host` - the task requires a large host to run.
|
||||
- `requires_large_host_aubsan` - the task requires a large host to run on {A,UB}SAN variants.
|
||||
- `requires_large_host_tsan` - the task requires a large host to run on TSAN variants.
|
||||
- `requires_large_host_debug_mode` - the task requires a large host to run on Debug Mode variants.
|
||||
- `requires_large_host_commit_queue` - the task requires a large host to run in the commit-queue.
|
||||
- `requires_all_feature_flags` - the task can only run on variants that has all-feature-flags configuration.
|
||||
- `requires_execution_on_windows_patch_build` - the task should be run on the required Windows build variant on each patch
|
||||
build. See [SERVER-79037](https://jira.mongodb.org/browse/SERVER-79037) for how this was calculated.
|
||||
- `requires_all_feature_flags` - the task can only run on variants that have all-feature-flags
|
||||
configuration.
|
||||
- `requires_execution_on_windows_patch_build` - the task should be run on the required Windows build
|
||||
variant on each patch build. See [SERVER-79037](https://jira.mongodb.org/browse/SERVER-79037) for
|
||||
how this was calculated.
|
||||
|
||||
@ -5,16 +5,16 @@ MongoDB code uses the following types of assertions that are available for use:
|
||||
- `uassert` and `iassert`
|
||||
- Checks for per-operation user errors. Operation-fatal.
|
||||
- `tassert`
|
||||
- Like uassert in that it checks for per-operation user errors, but inhibits clean shutdown
|
||||
in tests. Operation-fatal, but process-fatal in testing environments during shutdown.
|
||||
- Like uassert in that it checks for per-operation user errors, but inhibits clean shutdown in
|
||||
tests. Operation-fatal, but process-fatal in testing environments during shutdown.
|
||||
- `massert`
|
||||
- Checks per-operation invariants. Operation-fatal.
|
||||
- `fassert`
|
||||
- Checks fatal process invariants. Process-fatal. Use to detect unexpected situations (such
|
||||
as a system function returning an unexpected error status).
|
||||
- Checks fatal process invariants. Process-fatal. Use to detect unexpected situations (such as a
|
||||
system function returning an unexpected error status).
|
||||
- `invariant`
|
||||
- Checks process invariant. Process-fatal. Use to detect code logic errors ("pointer should
|
||||
never be null", "we should always be locked").
|
||||
- Checks process invariant. Process-fatal. Use to detect code logic errors ("pointer should never
|
||||
be null", "we should always be locked").
|
||||
|
||||
**Note**: Calling C function `assert` is not allowed. Use one of the above instead.
|
||||
|
||||
@ -50,8 +50,8 @@ Some assertions will increment an assertion counter. The `serverStatus` command
|
||||
- `tripwire`
|
||||
- Incremented by `tassert`.
|
||||
- `rollovers`
|
||||
- When any counter reaches a value of `1 << 30`, all of the counters are reset and
|
||||
the "rollovers" counter is incremented.
|
||||
- When any counter reaches a value of `1 << 30`, all of the counters are reset and the "rollovers"
|
||||
counter is incremented.
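
The rollover rule can be pictured with a small self-contained sketch. `ToyAssertionCounters` and its members are hypothetical stand-ins written for this illustration, not the server's actual counter implementation; only the `1 << 30` threshold and the "reset everything, bump rollovers" behavior come from the text above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the assertion counters; not the real server type.
struct ToyAssertionCounters {
    // Threshold taken from the description above.
    static constexpr std::int32_t kRolloverThreshold = 1 << 30;

    std::int32_t user = 0;       // illustrative counter
    std::int32_t tripwire = 0;   // incremented by tassert in the real server
    std::int32_t rollovers = 0;  // how many times the counters were reset

    // Bump one counter; if it reaches the threshold, reset all counters
    // and record the rollover.
    void increment(std::int32_t& counter) {
        if (++counter >= kRolloverThreshold) {
            user = 0;
            tripwire = 0;
            ++rollovers;
        }
    }
};
```

A consumer that graphs these counters would therefore need to treat a change in `rollovers` as a signal that the other values restarted from zero.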
|
||||
|
||||
## Considerations
|
||||
|
||||
@ -61,52 +61,53 @@ terminate the current operation, not the whole process. Be careful not to corrup
|
||||
mistakenly using these assertions midway through mutating process state.
|
||||
|
||||
`fassert` failures will terminate the entire process; this is used for low-level checks where
|
||||
continuing might lead to corrupt data or loss of data on disk. Additionally, `fassert` will log
|
||||
a generic assertion message with fatal severity and add a breakpoint before terminating.
|
||||
continuing might lead to corrupt data or loss of data on disk. Additionally, `fassert` will log a
|
||||
generic assertion message with fatal severity and add a breakpoint before terminating.
|
||||
|
||||
To log a custom assertion message and terminate the server, use `LOGV2_FATAL`.
|
||||
To avoid printing a stacktrace on failure use `fassertNoTrace` or `LOGV2_FATAL_NO_TRACE`.
|
||||
Consider using them if there is only one way to reach this fatal point in code.
|
||||
To log a custom assertion message and terminate the server, use `LOGV2_FATAL`. To avoid printing a
|
||||
stacktrace on failure use `fassertNoTrace` or `LOGV2_FATAL_NO_TRACE`. Consider using them if there
|
||||
is only one way to reach this fatal point in code.
|
||||
|
||||
`tassert` will fail the operation like `uassert`, but also triggers a "deferred-fatality tripwire
|
||||
flag". In testing environments, if the tripwire flag is set during shutdown, the process will
|
||||
invoke the tripwire fatal assertion. In non-testing environments, there will only be a warning
|
||||
during shutdown that tripwire assertions have failed.
|
||||
flag". In testing environments, if the tripwire flag is set during shutdown, the process will invoke
|
||||
the tripwire fatal assertion. In non-testing environments, there will only be a warning during
|
||||
shutdown that tripwire assertions have failed.
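
As a rough mental model of the deferred-fatality behavior, consider the sketch below. Every name in it (`toyTassert`, `toyTripwireFlag`, `toyShutdownIsFatal`, `ToyAssertionException`) is hypothetical and exists only for this illustration; the real implementation lives in the server's assertion utilities and does considerably more (logging, stack traces, breakpoints).

```cpp
#include <atomic>
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins; none of these names are the real MongoDB API.
inline std::atomic<bool> toyTripwireFlag{false};

struct ToyAssertionException : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Fails the current operation by throwing, and remembers that a tripwire fired.
inline void toyTassert(bool condition, const std::string& msg) {
    if (!condition) {
        toyTripwireFlag.store(true);
        throw ToyAssertionException(msg);
    }
}

// At shutdown: fatal only in testing environments, and only if some
// tripwire assertion failed earlier. Elsewhere it would just be a warning.
inline bool toyShutdownIsFatal(bool testingEnvironment) {
    return testingEnvironment && toyTripwireFlag.load();
}
```

The point of the split is that the failing operation is handled like any other operation-fatal error, while the test run as a whole still ends up failing.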
|
||||
|
||||
`tassert` presents more diagnostics than `uassert`. `tassert` will log the assertion as an error,
|
||||
log scoped debug info (for more info, see ScopedDebugInfoStack defined in
|
||||
[mongo/util/assert_util.h][assert_util_h]), print the stack trace, and add a breakpoint.
|
||||
The purpose of `tassert` is to ensure that operation failures will cause a test suite to fail
|
||||
without resorting to different behavior during testing. `tassert` should only be used to check
|
||||
for unexpected values produced by defined behavior.
|
||||
[mongo/util/assert_util.h][assert_util_h]), print the stack trace, and add a breakpoint. The purpose
|
||||
of `tassert` is to ensure that operation failures will cause a test suite to fail without resorting
|
||||
to different behavior during testing. `tassert` should only be used to check for unexpected values
|
||||
produced by defined behavior.
|
||||
|
||||
Both `massert` and `uassert` take error codes, so that all assertions have codes associated with
|
||||
them. Currently, programmers are free to provide the error code by either [using a unique location
|
||||
number](#choosing-a-unique-location-number) or choosing a named code from `ErrorCodes`. Unique location
|
||||
numbers have no meaning other than a way to associate a log message with a line of code.
|
||||
them. Currently, programmers are free to provide the error code by either
|
||||
[using a unique location number](#choosing-a-unique-location-number) or choosing a named code from
|
||||
`ErrorCodes`. Unique location numbers have no meaning other than a way to associate a log message
|
||||
with a line of code.
|
||||
|
||||
`massert` will log the assertion message as an error, while `uassert` will log the message with
|
||||
debug level of 1 (for more info about log debug level, see [docs/logging.md][logging_md]).
|
||||
|
||||
`iassert` provides similar functionality to `uassert`, but it logs at a debug level of 3 and
|
||||
does not increment user assertion counters. We should always choose `iassert` over `uassert`
|
||||
when we expect a failure, a failure might be recoverable, or failure accounting is not interesting.
|
||||
`iassert` provides similar functionality to `uassert`, but it logs at a debug level of 3 and does
|
||||
not increment user assertion counters. We should always choose `iassert` over `uassert` when we
|
||||
expect a failure, a failure might be recoverable, or failure accounting is not interesting.
|
||||
|
||||
### Choosing a unique location number
|
||||
|
||||
The current convention for choosing a unique location number is to use the 5 or 6 digit SERVER ticket number
|
||||
for the ticket being addressed when the assertion is added, followed by a two digit counter to distinguish
|
||||
between codes added as part of the same ticket. For example, if you're working on SERVER-12345, the first
|
||||
error code would be 1234500, the second would be 1234501, etc. This convention can also be used for LOGV2
|
||||
logging id numbers.
|
||||
The current convention for choosing a unique location number is to use the 5 or 6 digit SERVER
|
||||
ticket number for the ticket being addressed when the assertion is added, followed by a two digit
|
||||
counter to distinguish between codes added as part of the same ticket. For example, if you're
|
||||
working on SERVER-12345, the first error code would be 1234500, the second would be 1234501, etc.
|
||||
This convention can also be used for LOGV2 logging id numbers.
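
The arithmetic behind this convention is simply "ticket number times 100, plus the counter". The helper below is a hypothetical illustration of that rule (no such function is claimed to exist in the codebase); in practice the numbers are written as literals at the assertion site.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: combines a SERVER ticket number with a two-digit
// per-ticket counter (0-99) into a unique location number.
inline std::int64_t toyLocationNumber(std::int64_t ticket, int counter) {
    assert(counter >= 0 && counter < 100);  // only two digits are reserved
    return ticket * 100 + counter;
}
```

For SERVER-12345 this yields 1234500 for counter 0 and 1234501 for counter 1, matching the example in the text.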
|
||||
|
||||
The only real constraint for unique location numbers is that they must be unique across the codebase. This is
|
||||
verified at compile time with a [python script][errorcodes_py].
|
||||
The only real constraint for unique location numbers is that they must be unique across the
|
||||
codebase. This is verified at compile time with a [python script][errorcodes_py].
|
||||
|
||||
## Exception
|
||||
|
||||
A failed operation-fatal assertion throws an `AssertionException` or a child of that.
|
||||
The inheritance hierarchy resembles:
|
||||
A failed operation-fatal assertion throws an `AssertionException` or a child of that. The
|
||||
inheritance hierarchy resembles:
|
||||
|
||||
- `std::exception`
|
||||
- `mongo::DBException`
|
||||
@ -123,14 +124,14 @@ upwards harmlessly. The code should also expect, and properly handle, `UserExcep
|
||||
|
||||
## ErrorCodes and Status
|
||||
|
||||
MongoDB uses `ErrorCodes` both internally and externally: a subset of error codes (e.g.,
|
||||
`BadValue`) are used externally to pass errors over the wire and to clients. These error codes are
|
||||
the means for MongoDB processes (e.g., _mongod_ and _mongo_) to communicate errors, and are visible
|
||||
to client applications. Other error codes are used internally to indicate the underlying reason for
|
||||
a failed operation. For instance, `PeriodicJobIsStopped` is an internal error code that is passed
|
||||
to callback functions running inside a [`PeriodicRunner`][periodic_runner_h] once the runner is
|
||||
stopped. The internal error codes are for internal use only and must never be returned to clients
|
||||
(i.e., in a network response).
|
||||
MongoDB uses `ErrorCodes` both internally and externally: a subset of error codes (e.g., `BadValue`)
|
||||
are used externally to pass errors over the wire and to clients. These error codes are the means for
|
||||
MongoDB processes (e.g., _mongod_ and _mongo_) to communicate errors, and are visible to client
|
||||
applications. Other error codes are used internally to indicate the underlying reason for a failed
|
||||
operation. For instance, `PeriodicJobIsStopped` is an internal error code that is passed to callback
|
||||
functions running inside a [`PeriodicRunner`][periodic_runner_h] once the runner is stopped. The
|
||||
internal error codes are for internal use only and must never be returned to clients (i.e., in a
|
||||
network response).
|
||||
|
||||
Zero or more error categories can be assigned to `ErrorCodes`, which allows a single handler to
|
||||
serve a group of `ErrorCodes`. `RetriableError`, for instance, is an `ErrorCategory` that includes
|
||||
@ -140,10 +141,10 @@ operation that fails with any error code in this category can be safely retried.
|
||||
we can use `ErrorCodes::is${category}(${error})` to check error categories. Both methods provide
|
||||
similar functionality.
|
||||
|
||||
To represent the status of an executed operation (e.g., a command or a function invocation), we
|
||||
use `Status` objects, which represent an error state or the absence thereof. A `Status` uses the
|
||||
standardized `ErrorCodes` to determine the underlying cause of an error. It also allows assigning
|
||||
a textual description, as well as code-specific extra info, to the error code for further
|
||||
To represent the status of an executed operation (e.g., a command or a function invocation), we use
|
||||
`Status` objects, which represent an error state or the absence thereof. A `Status` uses the
|
||||
standardized `ErrorCodes` to determine the underlying cause of an error. It also allows assigning a
|
||||
textual description, as well as code-specific extra info, to the error code for further
|
||||
clarification. The extra info is a subclass of `ErrorExtraInfo` and specific to `ErrorCodes`. Look
|
||||
for `extra` in [error_codes.yml][error_codes_yml] for reference.
|
||||
|
||||
@ -153,28 +154,26 @@ functions with multiple out parameters. We can either pass an error code or an a
|
||||
`StatusWith` object, indicating failure or success of the operation. For examples of the proper
|
||||
usage of `StatusWith`, see [mongo/base/status_with.h][status_with_h] and
|
||||
[mongo/base/status_with_test.cpp][status_with_test_cpp]. It is highly recommended to use `uassert`
|
||||
or `iassert` over `StatusWith`, and catch exceptions instead of checking `Status` objects
|
||||
returned from functions. Using `StatusWith` to indicate exceptions, instead of throwing via
|
||||
`uassert` and `iassert`, makes it very difficult to identify that an error has occurred, and
|
||||
could lead to the wrong error being propagated.
|
||||
or `iassert` over `StatusWith`, and catch exceptions instead of checking `Status` objects returned
|
||||
from functions. Using `StatusWith` to indicate exceptions, instead of throwing via `uassert` and
|
||||
`iassert`, makes it very difficult to identify that an error has occurred, and could lead to the
|
||||
wrong error being propagated.
|
||||
|
||||
## Using noexcept
|
||||
|
||||
Server code should generally be written to be exception safe. Historically,
|
||||
we've had bugs due to code being overzealously marked `noexcept`. In such
|
||||
contexts, throwing an exception crashes the server, which can compromise
|
||||
availability. However, _just_ removing `noexcept` from such code is not a viable
|
||||
solution \- exception unsafe code may _need_ to crash in order to avoid causing
|
||||
an even worse failure. We want to work towards ensuring that functions that
|
||||
ought to be are in fact exception safe, and remove `noexcept` usage where it's
|
||||
not warranted. Here, we outline guidelines for doing so.
|
||||
Server code should generally be written to be exception safe. Historically, we've had bugs due to
|
||||
code being overzealously marked `noexcept`. In such contexts, throwing an exception crashes the
|
||||
server, which can compromise availability. However, _just_ removing `noexcept` from such code is not
|
||||
a viable solution \- exception unsafe code may _need_ to crash in order to avoid causing an even
|
||||
worse failure. We want to work towards ensuring that functions that ought to be are in fact
|
||||
exception safe, and remove `noexcept` usage where it's not warranted. Here, we outline guidelines
|
||||
for doing so.
|
||||
|
||||
Noexcept is a runtime check that terminates the process rather than allowing
|
||||
the function to exit because of a throw. Noexcept may be used when it can be
|
||||
thought of as a bug for any uncaught exception to be thrown. There is no
|
||||
compile-time check that exceptions will not be thrown within a `noexcept`
|
||||
function. Instead, putting `noexcept` on a function may be thought of as similar
|
||||
to using invariant in the following way:
|
||||
Noexcept is a runtime check that terminates the process rather than allowing the function to exit
|
||||
because of a throw. Noexcept may be used when it can be thought of as a bug for any uncaught
|
||||
exception to be thrown. There is no compile-time check that exceptions will not be thrown within a
|
||||
`noexcept` function. Instead, putting `noexcept` on a function may be thought of as similar to using
|
||||
invariant in the following way:
|
||||
|
||||
```c
|
||||
// Example noexcept code.
|
||||
@ -190,92 +189,80 @@ void func() try {
|
||||
}
|
||||
```
|
||||
|
||||
**As with invariant, be very careful when putting `noexcept` on a function that
|
||||
interacts with untrusted input.** This has been the root cause of serious past
|
||||
bugs.
|
||||
**As with invariant, be very careful when putting `noexcept` on a function that interacts with
|
||||
untrusted input.** This has been the root cause of serious past bugs.
|
||||
|
||||
### Adding or Removing noexcept
|
||||
|
||||
When considering removing `noexcept` from a function, the author of that change
|
||||
must ensure that the function’s implementation and its callsites are not
|
||||
relying on the function not throwing for correctness. Because of this, **be
|
||||
careful putting `noexcept` on a function** if there’s a chance it may need to be
|
||||
removed later. `noexcept` generally **should not be used** solely for reasons of
|
||||
performance optimization. Aside from the cases listed in the next section, it
|
||||
should not be assumed to improve performance without solid evidence.
|
||||
When considering removing `noexcept` from a function, the author of that change must ensure that the
|
||||
function’s implementation and its callsites are not relying on the function not throwing for
|
||||
correctness. Because of this, **be careful putting `noexcept` on a function** if there’s a chance it
|
||||
may need to be removed later. `noexcept` generally **should not be used** solely for reasons of
|
||||
performance optimization. Aside from the cases listed in the next section, it should not be assumed
|
||||
to improve performance without solid evidence.
|
||||
|
||||
If a part of the implementation would benefit from relying on not throwing, but
|
||||
`noexcept` is not meant to be a part of the function’s contract, it is acceptable
|
||||
to use a try/catch/invariant construction similar to the example above or an
|
||||
internal `noexcept` helper function.
|
||||
If a part of the implementation would benefit from relying on not throwing, but `noexcept` is not
|
||||
meant to be a part of the function’s contract, it is acceptable to use a try/catch/invariant
|
||||
construction similar to the example above or an internal `noexcept` helper function.
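
The two options mentioned here might look roughly like the following sketch. `toyInvariant`, `toyAppendUnchecked`, and `toyRecord` are hypothetical names used only for illustration, and `std::abort` stands in for the server's real `invariant` machinery.

```cpp
#include <cassert>
#include <cstdlib>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for invariant(): abort the process if the condition is false.
inline void toyInvariant(bool condition) {
    if (!condition) {
        std::abort();
    }
}

// Option 1: an internal noexcept helper. An exception escaping it terminates
// the process, but noexcept is not part of the public function's contract.
inline void toyAppendUnchecked(std::vector<std::string>& out, const std::string& s) noexcept {
    out.push_back(s);
}

// Option 2: the public function is not noexcept, so validation may still throw,
// but the region that must not throw is guarded by try/catch/invariant.
inline void toyRecord(std::vector<std::string>& out, const std::string& s) {
    if (s.empty()) {
        throw std::invalid_argument("empty entry");  // allowed: part of the contract
    }
    try {
        out.push_back(s);
    } catch (...) {
        toyInvariant(false);  // a throw here would be a bug, so make it fatal locally
    }
}
```

Either way, callers of the public function are never promised `noexcept`, so the guarantee can later be relaxed without auditing every callsite.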
|
||||
|
||||
When adding or removing `noexcept`, also consider what types of exceptions are
|
||||
possible in that context and in our codebase. Refer to the “Where Exceptions
|
||||
are Possible” section for more details.
|
||||
When adding or removing `noexcept`, also consider what types of exceptions are possible in that
|
||||
context and in our codebase. Refer to the “Where Exceptions are Possible” section for more details.
|
||||
|
||||
If you are uncertain about adding or removing `noexcept` in a given situation,
|
||||
reach out to \#server-programmability on Slack.
|
||||
If you are uncertain about adding or removing `noexcept` in a given situation, reach out to
|
||||
\#server-programmability on Slack.
|
||||
|
||||
### Cases Where noexcept is Encouraged
|
||||
|
||||
This list is not exhaustive and there are cases not enumerated here that are
|
||||
valid uses of `noexcept`.
|
||||
This list is not exhaustive and there are cases not enumerated here that are valid uses of
|
||||
`noexcept`.
|
||||
|
||||
#### Move operations
|
||||
|
||||
Using `noexcept` with move operations allows operations to skip generating
|
||||
exception handling code. If a type’s move operation will not throw exceptions,
|
||||
it is strictly worse not to use `noexcept`. For instance, `std::vector<T>` can
|
||||
use optimized versions of certain operations when T has `noexcept` move
|
||||
operations. In these cases, **`noexcept` can be considered a requirement**. Of
|
||||
course, if a move operation genuinely needs to throw exceptions, then don’t
|
||||
mark it `noexcept`. This should be very rare – moves should be non-throwing in
|
||||
almost all cases.
|
||||
Using `noexcept` with move operations allows operations to skip generating exception handling code.
|
||||
If a type’s move operation will not throw exceptions, it is strictly worse not to use `noexcept`.
|
||||
For instance, `std::vector<T>` can use optimized versions of certain operations when `T` has
|
||||
`noexcept` move operations. In these cases, **`noexcept` can be considered a requirement**. Of
|
||||
course, if a move operation genuinely needs to throw exceptions, then don’t mark it `noexcept`. This
|
||||
should be very rare – moves should be non-throwing in almost all cases.
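
The `std::vector` effect can be observed directly. In the sketch below, `NoexceptMove`, `ThrowingMove`, and `copiesDuringGrowth` are hypothetical names invented for this illustration. The two types differ only in whether the move constructor is `noexcept`; on mainstream standard library implementations, growing the vector moves elements of the first type but falls back to copying elements of the second, because the vector must preserve its strong exception guarantee.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// Hypothetical type whose move constructor is declared noexcept.
struct NoexceptMove {
    static inline int copies = 0;
    NoexceptMove() = default;
    NoexceptMove(const NoexceptMove&) { ++copies; }
    NoexceptMove(NoexceptMove&&) noexcept {}
};

// Identical, except the move constructor is potentially throwing.
struct ThrowingMove {
    static inline int copies = 0;
    ThrowingMove() = default;
    ThrowingMove(const ThrowingMove&) { ++copies; }
    ThrowingMove(ThrowingMove&&) {}
};

// Counts how many copy constructions happen when a one-element vector reallocates.
template <typename T>
int copiesDuringGrowth() {
    T::copies = 0;
    std::vector<T> v;
    v.reserve(1);
    v.emplace_back();
    v.reserve(v.capacity() + 1);  // forces a reallocation of the existing element
    return T::copies;
}
```

Calling `copiesDuringGrowth<NoexceptMove>()` and `copiesDuringGrowth<ThrowingMove>()` side by side shows the cost of a missing `noexcept` on an otherwise non-throwing move.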
|
||||
|
||||
#### Swap operations
|
||||
|
||||
Allows callers to optimize for an exception-free pathway. **Swap operations
|
||||
should follow the same `noexcept` guidelines as move operations**.
|
||||
Allows callers to optimize for an exception-free pathway. **Swap operations should follow the same
|
||||
`noexcept` guidelines as move operations**.
|
||||
|
||||
#### Hash functions
|
||||
|
||||
Allows some hashing library types to optimize for an exception-free pathway.
|
||||
This can even affect the behavior, performance, and even layout of certain
|
||||
container types (such as libstdc++’s
|
||||
[unordered_map](https://gcc.gnu.org/onlinedocs/libstdc++/manual/unordered_associative.html)).
|
||||
**Hash functions should follow the same `noexcept` guidelines as move operations.**
|
||||
Allows some hashing library types to optimize for an exception-free pathway. This can even affect
|
||||
the behavior, performance, and even layout of certain container types (such as libstdc++’s
|
||||
[unordered_map](https://gcc.gnu.org/onlinedocs/libstdc++/manual/unordered_associative.html)). **Hash
|
||||
functions should follow the same `noexcept` guidelines as move operations.**
|
||||
|
||||
#### Destructors and “Destructor-Safe” Functions
|
||||
|
||||
Destructors are generally implicitly `noexcept`, and are encouraged to remain
|
||||
implicitly `noexcept` \- that is, by not marking them with `noexcept(false)`.
|
||||
Functions where “destructor safety” is a core part of their functionality **may
|
||||
be marked `noexcept`**. This is not a requirement – destructors are allowed to
|
||||
call potentially-throwing functions. It is also not a blanket recommendation to
|
||||
consider `noexcept` for all functions called from destructors. When calling a
|
||||
potentially-throwing function from a destructor, think about whether or not it
|
||||
can indeed throw in that context, and if exceptions need to be handled. If it
|
||||
can indeed throw in that context, exceptions almost certainly need to be
|
||||
handled \- otherwise the server will crash.
|
||||
Destructors are generally implicitly `noexcept`, and are encouraged to remain implicitly `noexcept`
|
||||
\- that is, by not marking them with `noexcept(false)`. Functions where “destructor safety” is a
|
||||
core part of their functionality **may be marked `noexcept`**. This is not a requirement –
|
||||
destructors are allowed to call potentially-throwing functions. It is also not a blanket
|
||||
recommendation to consider `noexcept` for all functions called from destructors. When calling a
|
||||
potentially-throwing function from a destructor, think about whether or not it can indeed throw in
|
||||
that context, and if exceptions need to be handled. If it can indeed throw in that context,
|
||||
exceptions almost certainly need to be handled \- otherwise the server will crash.
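
A minimal sketch of handling a potentially-throwing call inside a destructor follows. `ToyWriter` and `toyFlush` are hypothetical names for this illustration; a real destructor would typically log the failure rather than set a flag.

```cpp
#include <cassert>
#include <stdexcept>
#include <type_traits>

// Hypothetical cleanup routine that may throw.
inline void toyFlush(bool fail) {
    if (fail) {
        throw std::runtime_error("flush failed");
    }
}

class ToyWriter {
public:
    ToyWriter(bool failOnFlush, bool* sawError) : _fail(failOnFlush), _sawError(sawError) {}

    // Not marked noexcept(false): the destructor stays implicitly noexcept,
    // so any exception must be handled before it can escape.
    ~ToyWriter() {
        try {
            toyFlush(_fail);
        } catch (const std::exception&) {
            *_sawError = true;  // record the failure and continue; never rethrow here
        }
    }

private:
    bool _fail;
    bool* _sawError;
};

// The destructor is implicitly noexcept, as recommended above.
static_assert(std::is_nothrow_destructible<ToyWriter>::value,
              "destructor should remain implicitly noexcept");
```

Without the `try`/`catch`, the exception from `toyFlush` would hit the implicit `noexcept` boundary and terminate the process.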
|
||||
|
||||
The lambda passed to `ON_BLOCK_EXIT()` and `ScopeGuard()` should be treated
|
||||
similarly to destructors: it is executed in a `noexcept` context (a destructor)
|
||||
and marking it as such is discouraged as being noisy. But code intended to be
|
||||
called from them can be.
|
||||
The lambda passed to `ON_BLOCK_EXIT()` and `ScopeGuard()` should be treated similarly to
|
||||
destructors: it is executed in a `noexcept` context (a destructor) and marking it as such is
|
||||
discouraged as being noisy. But code intended to be called from them can be.
|
||||
|
||||
### Where Exceptions are Possible
|
||||
|
||||
In our codebase, generally DBException is the only type of exception that
|
||||
should be crossing API boundaries. If an exception other than a DBException
|
||||
does cross an API boundary, it should be considered a bug. Whichever component
|
||||
throws the exception should handle it locally, even if only by translating it
|
||||
to a DBException. Generally any caller you would consider to be an external
|
||||
caller should be able to rely on DBException being the only exception type your
|
||||
function will throw.
|
||||
In our codebase, generally DBException is the only type of exception that should be crossing API
|
||||
boundaries. If an exception other than a DBException does cross an API boundary, it should be
|
||||
considered a bug. Whichever component throws the exception should handle it locally, even if only by
|
||||
translating it to a DBException. Generally any caller you would consider to be an external caller
|
||||
should be able to rely on DBException being the only exception type your function will throw.
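
Translating a foreign exception at the component boundary could be sketched as follows. `ToyDBException` and `toyParsePort` are hypothetical stand-ins for this illustration; in server code the translation would more likely go through `uassert` or a `Status` with an appropriate error code.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for mongo::DBException.
struct ToyDBException : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// std::stoi may throw std::invalid_argument or std::out_of_range. Those are
// handled locally and rethrown as the one exception type callers may rely on.
inline int toyParsePort(const std::string& text) {
    try {
        return std::stoi(text);
    } catch (const std::exception& ex) {
        throw ToyDBException(std::string("invalid port: ") + ex.what());
    }
}
```

External callers then only ever need a handler for the single agreed-upon exception type.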
|
||||
|
||||
Allocations using the global new allocator or std::allocator in our codebase do
|
||||
not throw, instead terminating the process directly when OOM conditions are
|
||||
encountered. As such, there is no need to handle exceptions from these sources.
|
||||
Allocations using the global new allocator or std::allocator in our codebase do not throw, instead
|
||||
terminating the process directly when OOM conditions are encountered. As such, there is no need to
|
||||
handle exceptions from these sources.
|
||||
|
||||
## Gotchas
|
||||
|
||||
@ -284,10 +271,10 @@ Gotchas to watch out for:
|
||||
- Generally, do not throw an `AssertionException` directly. Functions like `uasserted()` do work
|
||||
beyond just that. In particular, it makes sure that the `getLastError` structures are set up
|
||||
properly.
|
||||
- Think about the location of your asserts in constructors, as the destructor would not be
|
||||
called. But at a minimum, use `wassert` a lot therein; we want to know if something is wrong.
|
||||
- Do **not** throw in destructors or allow exceptions to leak out (if you call a function that
|
||||
may throw).
|
||||
- Think about the location of your asserts in constructors, as the destructor would not be called.
|
||||
But at a minimum, use `wassert` a lot therein; we want to know if something is wrong.
|
||||
- Do **not** throw in destructors or allow exceptions to leak out (if you call a function that may
|
||||
throw).
|
||||
|
||||
[raii]: https://en.wikipedia.org/wiki/Resource_acquisition_is_initialization
|
||||
[error_codes_yml]: ../src/mongo/base/error_codes.yml
|
||||
|
||||
@ -6,18 +6,17 @@ branches, enhance diagnostics, or achieve any number of other aims. Fail points
|
||||
configured, and disabled via command request to a remote process or via an API within the same
|
||||
process.
|
||||
|
||||
For more on what test-only means and how to enable the `configureFailPoint` command, see [test_commands][test_only].
|
||||
For more on what test-only means and how to enable the `configureFailPoint` command, see
|
||||
[test_commands][test_only].
|
||||
|
||||
## Using Fail Points
|
||||
|
||||
A fail point must first be defined using `MONGO_FAIL_POINT_DEFINE(myFailPoint)`. This statement
|
||||
adds the fail point to a registry and allows it to be evaluated in code. There are three common
|
||||
patterns for evaluating a fail point:
|
||||
A fail point must first be defined using `MONGO_FAIL_POINT_DEFINE(myFailPoint)`. This statement adds
|
||||
the fail point to a registry and allows it to be evaluated in code. There are three common patterns
|
||||
for evaluating a fail point:
|
||||
|
||||
- Exercise a rarely used branch:
|
||||
`if (whenPigsFly || myFailPoint.shouldFail()) { ... }`
|
||||
- Block until the fail point is unset:
|
||||
`myFailPoint.pauseWhileSet();`
|
||||
- Exercise a rarely used branch: `if (whenPigsFly || myFailPoint.shouldFail()) { ... }`
|
||||
- Block until the fail point is unset: `myFailPoint.pauseWhileSet();`
|
||||
- Use the fail point's payload to perform custom behavior:
|
||||
`myFailPoint.execute([](const BSONObj& data) { useMyPayload(data); });`
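
To make the first and third patterns concrete outside the server, here is a toy model. `ToyFailPoint`, `toyFailPoint`, and `toyCompute` are hypothetical and deliberately simplified: the payload is a plain string rather than a `BSONObj`, there are no activation modes, and blocking via `pauseWhileSet` is omitted. The real class is declared in `fail_point.h` and has a richer API.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// Toy stand-in for a fail point; not the real MongoDB class.
class ToyFailPoint {
public:
    void enable(std::string payload = "") {
        _payload = std::move(payload);
        _enabled.store(true);
    }
    void disable() {
        _enabled.store(false);
    }
    bool shouldFail() const {
        return _enabled.load();
    }
    // Runs `fn` with the configured payload, but only while enabled.
    void execute(const std::function<void(const std::string&)>& fn) const {
        if (shouldFail()) {
            fn(_payload);
        }
    }

private:
    std::atomic<bool> _enabled{false};
    std::string _payload;
};

inline ToyFailPoint toyFailPoint;

// Pattern 1: force a rarely used branch when the fail point is enabled.
inline int toyCompute(int x) {
    if (x < 0 || toyFailPoint.shouldFail()) {
        return -1;
    }
    return x * 2;
}
```

A test would enable the fail point, exercise the code path, and disable it again, which is the role `FailPointEnableBlock` plays for a block scope in real C++ tests.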
|
||||
|
||||
@ -30,9 +29,9 @@ Fail point configuration involves choosing a "mode" for activation (e.g., "alway
|
||||
providing additional data in the form of a BSON object. For the vast majority of cases, this is done
|
||||
by issuing a `configureFailPoint` command request. This is made easier in JavaScript using the
|
||||
`configureFailPoint` helper from [fail_point_util.js][fail_point_util]. Fail points can also be
|
||||
useful in C++ unit tests and integration tests. To configure fail points on the local process, use
|
||||
a `FailPointEnableBlock` to enable and configure the fail point for a given block scope. Finally,
|
||||
a fail point can also be set via setParameter by its name prefixed with "failpoint." (e.g.,
|
||||
useful in C++ unit tests and integration tests. To configure fail points on the local process, use a
|
||||
`FailPointEnableBlock` to enable and configure the fail point for a given block scope. Finally, a
|
||||
fail point can also be set via setParameter by its name prefixed with "failpoint." (e.g.,
|
||||
"failpoint.myFailPoint").
|
||||
|
||||
Users can also wait until a fail point has been evaluated a certain number of times **_over its
|
||||
@ -50,8 +49,8 @@ command implementations, see [here][fail_point_commands].
|
||||
|
||||
The `failCommand` fail point is a special fail point used to mock arbitrary response behaviors to
|
||||
requests filtered by command, appName, etc. It is most often used to simulate specific conditions
|
||||
between nodes like invalid replica set configurations. For examples of use, see the
|
||||
[failCommand JavaScript tests][fail_command_javascript_test].
|
||||
between nodes like invalid replica set configurations. For examples of use, see the [failCommand
|
||||
JavaScript tests][fail_command_javascript_test].
|
||||
|
||||
[fail_point]: ../src/mongo/util/fail_point.h
|
||||
[fail_point_test]: ../src/mongo/util/fail_point_test.cpp
|
||||
|
||||
@ -68,11 +68,11 @@ Future<Message> call(Message& toSend) {
|
||||
First, notice that our calls to `TransportSession::sourceMessage` and
|
||||
`TransportSession::sinkMessage` have been replaced with calls to asynchronous versions of those
|
||||
functions. These asynchronous versions are future-returning; they don't block, but also don't return
|
||||
a result right away. Instead, they return a future that we can chain continuations onto; `then`,
|
||||
`onError` and `onCompletion` are all member functions of `Future<T>` that take a callable as argument
|
||||
and invoke that callable when the chained-to future is ready. Unsurprisingly, continuations chained
|
||||
with `.then` are run when the future is readied successfully with a `T`, and therefore callables
|
||||
chained with `.then` should take a `T` as argument. Mirroring this behavior, `.onError`
|
||||
a result right away. Instead, they return a future that we can chain continuations onto;
|
||||
`then`, `onError` and `onCompletion` are all member functions of `Future<T>` that take a callable as
|
||||
argument and invoke that callable when the chained-to future is ready. Unsurprisingly, continuations
|
||||
chained with `.then` are run when the future is readied successfully with a `T`, and therefore
|
||||
callables chained with `.then` should take a `T` as argument. Mirroring this behavior, `.onError`
|
||||
continuations are run only when the future is readied with an error, and continuations chained this
|
||||
way take a `Status` as argument which they can inspect to discover the error explaining why a `T`
|
||||
could not be delivered. Continuations chained with `.onCompletion` are run when the future resolves,
|
||||
@ -107,18 +107,17 @@ associated Futures exactly one time, and must do so before being destroyed (othe
|
||||
will be set with the `ErrorCodes::BrokenPromise` error, which is considered a programmer error and
|
||||
may crash debug builds of the server in the future).
|
||||
|
||||
To create a `Promise` that has a Future, you may use the [`PromiseAndFuture<T>`][pf]
|
||||
utility type. Upon construction, it contains a created `Promise<T>` and its
|
||||
corresponding `Future<T>`. The perhaps-familiar `makePromiseFuture<T>` factory
|
||||
function now simply returns `PromiseAndFuture<T>{}`.
|
||||
To create a `Promise` that has a Future, you may use the [`PromiseAndFuture<T>`][pf] utility type.
|
||||
Upon construction, it contains a created `Promise<T>` and its corresponding `Future<T>`. The
|
||||
perhaps-familiar `makePromiseFuture<T>` factory function now simply returns `PromiseAndFuture<T>{}`.
|
||||
|
||||
As was previously alluded to, it's
|
||||
also possible to make a "ready future" - one that has no associated promise and is already filled
|
||||
with a value or error. These might be useful in cases where the code that produces values in a way
|
||||
that's normally asynchronous happens to have one available already when a request comes in, and
|
||||
would like to return it right away. To create such a ready future, use `Future<T>::makeReady()`, or
|
||||
the helper function [makeReadyFutureWith(Func&& func)][mrfw] which will call the specified `func`
|
||||
and create a ready `Future` from its returned value.
|
||||
As was previously alluded to, it's also possible to make a "ready future" - one that has no
|
||||
associated promise and is already filled with a value or error. These might be useful in cases where
|
||||
the code that produces values in a way that's normally asynchronous happens to have one available
|
||||
already when a request comes in, and would like to return it right away. To create such a ready
|
||||
future, use `Future<T>::makeReady()`, or the helper function [makeReadyFutureWith(Func&&
|
||||
func)][mrfw] which will call the specified `func` and create a ready `Future` from its returned
|
||||
value.
|
||||
|
||||
Lastly, there might be occasions when multiple futures should be fulfilled with the same value, at
|
||||
the same time. This use case is best served by `SharedPromise` and the associated `SharedSemiFuture`
|
||||
@ -144,8 +143,8 @@ calling threads, and return `Future<T>`s to those threads that will be readied o
|
||||
available. The service may have its own internal threads it uses to produce `T`s, and doesn't want
|
||||
to lend out its internal threads to do the work chained via continuations to the `Future<T>`s it's
|
||||
given to calling threads. Instead, it needs to insist that continuations are not chained onto the
|
||||
futures it gives out, or that the caller receiving the future
|
||||
arranges for some _other_ thread to run continuations.
|
||||
futures it gives out, or that the caller receiving the future arranges for some _other_ thread to
|
||||
run continuations.
|
||||
|
||||
Fortunately, the service can enforce these guarantees using two types closely related to
|
||||
`Future<T>`: the types `SemiFuture<T>` and `ExecutorFuture<T>`.
|
||||
@ -270,33 +269,32 @@ will traverse the remaining continuation chain, and find the continuation chaine
|
||||
is run.
|
||||
|
||||
Note that all of the continuation-chaining functions we've discussed, like `.then()`, return future-
|
||||
like types themselves (i.e. `Future<T>`, `SemiFuture<T>`, and the like). When we chain
|
||||
continuations in the manner we've been discussing here, subsequent continuations run when the future
|
||||
returned by the previous continuation is ready, and the future-like type is "unwrapped" such that
|
||||
the type wrapped by the future (or, in the case of failure, the error) is passed directly to the
|
||||
subsequent continuation. For more detail on this topic, see the block comment above the
|
||||
continuation-chaining member functions in [future.h][future], starting above the definition for
|
||||
`then()`.
|
||||
like types themselves (i.e. `Future<T>`, `SemiFuture<T>`, and the like). When we chain continuations
|
||||
in the manner we've been discussing here, subsequent continuations run when the future returned by
|
||||
the previous continuation is ready, and the future-like type is "unwrapped" such that the type
|
||||
wrapped by the future (or, in the case of failure, the error) is passed directly to the subsequent
|
||||
continuation. For more detail on this topic, see the block comment above the continuation-chaining
|
||||
member functions in [future.h][future], starting above the definition for `then()`.
|
||||
|
||||
At some point, we may have no more continuations to add to a future chain, and will want to either
|
||||
synchronously extract the value or error held in the last future of the chain, or add a callback to
|
||||
asynchronously consume this value. The `.get()` and `.getAsync()` members of future-like types
|
||||
provide these facilities for terminating a future chain by extracting or asynchronously
|
||||
consuming the result of the chain. The `.getAsync()` function works much like `.onCompletion()`,
|
||||
taking a `Status` or `StatusWith<T>` and running regardless of whether or not the previous link in
|
||||
the chain resolved with error or success, and running asynchronously when the previous results are
|
||||
ready (to determine what thread `.getAsync()` will run on, follow the rules laid out in the previous
|
||||
"Where Do Continuations Run?" section.) Conversely, `.get()` takes no arguments, and blocks when it
|
||||
is called until the entirety of the continuation chain is resolved, with the final result given back
|
||||
to the blocking caller. Note that if the final result of the chain was an error that can be
|
||||
converted to a MongoDB `Status` type (i.e. either a `Status`-family type or `DBException`), it will
|
||||
be re-thrown as a `DBException` at the site where `.get()` is called when it is available. If the
|
||||
code calling `.get()` is not capable of handling an exception, use `.getNoThrow()` instead to
|
||||
extract the same error in the form of a `Status`. In the case of `.getAsync()`, all errors are
|
||||
converted to `Status`, and crucially, callables chained as continuations via `.getAsync()` cannot
|
||||
throw any exceptions, as there is no appropriate context with which to handle an asynchronous
|
||||
exception. If an exception is thrown from a continuation chained via `.getAsync()`, the entire
|
||||
process will be terminated (i.e. the program will crash).
|
||||
provide these facilities for terminating a future chain by extracting or asynchronously consuming
|
||||
the result of the chain. The `.getAsync()` function works much like `.onCompletion()`, taking a
|
||||
`Status` or `StatusWith<T>` and running regardless of whether or not the previous link in the chain
|
||||
resolved with error or success, and running asynchronously when the previous results are ready (to
|
||||
determine what thread `.getAsync()` will run on, follow the rules laid out in the previous "Where Do
|
||||
Continuations Run?" section.) Conversely, `.get()` takes no arguments, and blocks when it is called
|
||||
until the entirety of the continuation chain is resolved, with the final result given back to the
|
||||
blocking caller. Note that if the final result of the chain was an error that can be converted to a
|
||||
MongoDB `Status` type (i.e. either a `Status`-family type or `DBException`), it will be re-thrown as
|
||||
a `DBException` at the site where `.get()` is called when it is available. If the code calling
|
||||
`.get()` is not capable of handling an exception, use `.getNoThrow()` instead to extract the same
|
||||
error in the form of a `Status`. In the case of `.getAsync()`, all errors are converted to `Status`,
|
||||
and crucially, callables chained as continuations via `.getAsync()` cannot throw any exceptions, as
|
||||
there is no appropriate context with which to handle an asynchronous exception. If an exception is
|
||||
thrown from a continuation chained via `.getAsync()`, the entire process will be terminated (i.e.
|
||||
the program will crash).
|
||||
|
||||
## Notes and Links
|
||||
|
||||
|
||||
105
docs/fuzztest.md
@ -2,31 +2,27 @@
|
||||
title: FuzzTest
|
||||
---
|
||||
|
||||
FuzzTest is a coverage-guided fuzzing framework for C++ that integrates
|
||||
directly with GoogleTest. FuzzTest lets you write _property-based tests_: you
|
||||
describe the shape of your inputs using typed _domains_, and the framework
|
||||
generates and mutates values that satisfy those constraints. FuzzTest
|
||||
uses Centipede as its fuzzing engine and AUBSAN to surface undefined
|
||||
behavior.
|
||||
FuzzTest is a coverage-guided fuzzing framework for C++ that integrates directly with GoogleTest.
|
||||
FuzzTest lets you write _property-based tests_: you describe the shape of your inputs using typed
|
||||
_domains_, and the framework generates and mutates values that satisfy those constraints. FuzzTest
|
||||
uses Centipede as its fuzzing engine and AUBSAN to surface undefined behavior.
|
||||
|
||||
# When to use FuzzTest
|
||||
|
||||
- Your function under test accepts structured inputs (integers, strings,
|
||||
custom types, BSON objects, etc.) rather than an opaque byte blob.
|
||||
- You want to express correctness properties beyond "does not crash", such
|
||||
as API invariants, differential equivalence, or roundtrip symmetry.
|
||||
- You want a fuzz test that also runs cleanly as a unit test in normal CI,
|
||||
without needing a special fuzzer build variant.
|
||||
- Your function under test accepts structured inputs (integers, strings, custom types, BSON objects,
|
||||
etc.) rather than an opaque byte blob.
|
||||
- You want to express correctness properties beyond "does not crash", such as API invariants,
|
||||
differential equivalence, or roundtrip symmetry.
|
||||
- You want a fuzz test that also runs cleanly as a unit test in normal CI, without needing a special
|
||||
fuzzer build variant.
|
||||
|
||||
# How to use FuzzTest
|
||||
|
||||
## The property function and FUZZ_TEST macro
|
||||
|
||||
A FuzzTest consists of a _property function_ and a registration macro.
|
||||
The property function is a plain C++ function whose parameters define the
|
||||
inputs to fuzz. The framework calls it repeatedly with generated values,
|
||||
looking for any call that triggers an assertion failure or sanitizer
|
||||
error.
|
||||
A FuzzTest consists of a _property function_ and a registration macro. The property function is a
|
||||
plain C++ function whose parameters define the inputs to fuzz. The framework calls it repeatedly
|
||||
with generated values, looking for any call that triggers an assertion failure or sanitizer error.
|
||||
|
||||
```cpp
|
||||
#include "fuzztest/fuzztest.h"
|
||||
@ -38,14 +34,16 @@ void MyFunctionFuzzer(const std::string& input) {
|
||||
FUZZ_TEST(MyTestSuite, MyFunctionFuzzer);
|
||||
```
|
||||
|
||||
When no `.WithDomains()` clause is provided, each parameter defaults to
|
||||
`fuzztest::Arbitrary<T>()`, which covers most standard library types.
|
||||
When no `.WithDomains()` clause is provided, each parameter defaults to `fuzztest::Arbitrary<T>()`,
|
||||
which covers most standard library types.
|
||||
|
||||
## Specifying input domains
|
||||
|
||||
Use `.WithDomains()` to constrain the generated inputs:
|
||||
|
||||
> ⚠️ **Warning:** Never initialize input domains with global objects initialized in other compilation units. For more information see [Fuzz_Test Macro](https://github.com/google/fuzztest/blob/main/doc/fuzz-test-macro.md)
|
||||
> ⚠️ **Warning:** Never initialize input domains with global objects initialized in other
|
||||
> compilation units. For more information see
|
||||
> [Fuzz_Test Macro](https://github.com/google/fuzztest/blob/main/doc/fuzz-test-macro.md)
|
||||
|
||||
```cpp
|
||||
void ProcessRequestFuzzer(int opcode, const std::string& payload) {
|
||||
@ -56,14 +54,18 @@ FUZZ_TEST(MyTestSuite, ProcessRequestFuzzer)
|
||||
/*payload=*/fuzztest::Arbitrary<std::string>());
|
||||
```
|
||||
|
||||
FuzzTest ships with a rich set of built-in domains. A complete list of default types implemented in fuzztest can be found in the [Fuzztest Domain Reference](https://github.com/google/fuzztest/blob/main/doc/domains-reference.md). Also see [BSON Fuzzing](#fuzzing-bson).
|
||||
FuzzTest ships with a rich set of built-in domains. A complete list of default types implemented in
|
||||
fuzztest can be found in the
|
||||
[Fuzztest Domain Reference](https://github.com/google/fuzztest/blob/main/doc/domains-reference.md).
|
||||
Also see [BSON Fuzzing](#fuzzing-bson).
|
||||
|
||||
## Providing seeds
|
||||
|
||||
Seed values give the fuzzer a head start by providing known-interesting
|
||||
inputs to mutate:
|
||||
Seed values give the fuzzer a head start by providing known-interesting inputs to mutate:
|
||||
|
||||
> ⚠️ **Warning:** Never initialize seeds with global objects initialized in other compilation units. For more information see [Fuzz_Test Macro](https://github.com/google/fuzztest/blob/main/doc/fuzz-test-macro.md)
|
||||
> ⚠️ **Warning:** Never initialize seeds with global objects initialized in other compilation units.
|
||||
> For more information see
|
||||
> [Fuzz_Test Macro](https://github.com/google/fuzztest/blob/main/doc/fuzz-test-macro.md)
|
||||
|
||||
```cpp
|
||||
FUZZ_TEST(MyTestSuite, ProcessRequestFuzzer)
|
||||
@ -82,11 +84,9 @@ FUZZ_TEST(MyTestSuite, ProcessRequestFuzzer)
|
||||
|
||||
## Common correctness patterns
|
||||
|
||||
Beyond "does not crash", FuzzTest makes it easy to assert higher-level
|
||||
properties.
|
||||
Beyond "does not crash", FuzzTest makes it easy to assert higher-level properties.
|
||||
|
||||
**Roundtrip**: verify that encode→decode (or serialize→parse) is the
|
||||
identity:
|
||||
**Roundtrip**: verify that encode→decode (or serialize→parse) is the identity:
|
||||
|
||||
```cpp
|
||||
void SerializeRoundtrips(const MyMessage& msg) {
|
||||
@ -97,8 +97,7 @@ void SerializeRoundtrips(const MyMessage& msg) {
|
||||
FUZZ_TEST(MyTestSuite, SerializeRoundtrips);
|
||||
```
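For a version of the roundtrip pattern that can be compiled without FuzzTest, here is a minimal sketch. The codec (`hexEncode`, `hexDecode`) and the property function `HexRoundtrips` are hypothetical names invented for this example. In a real fuzz test the property function would be registered with `FUZZ_TEST` and would likely use GoogleTest assertions rather than `assert`.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical codec used only to illustrate the roundtrip property.
std::string hexEncode(const std::string& bytes) {
    static const char kDigits[] = "0123456789abcdef";
    std::string out;
    out.reserve(bytes.size() * 2);
    for (unsigned char c : bytes) {
        out.push_back(kDigits[c >> 4]);
        out.push_back(kDigits[c & 0x0f]);
    }
    return out;
}

// Returns std::nullopt for odd-length input or non-hex characters.
std::optional<std::string> hexDecode(const std::string& hex) {
    if (hex.size() % 2 != 0) {
        return std::nullopt;
    }
    auto nibble = [](char c) -> int {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        return -1;
    };
    std::string out;
    out.reserve(hex.size() / 2);
    for (std::size_t i = 0; i < hex.size(); i += 2) {
        const int hi = nibble(hex[i]);
        const int lo = nibble(hex[i + 1]);
        if (hi < 0 || lo < 0) {
            return std::nullopt;
        }
        out.push_back(static_cast<char>((hi << 4) | lo));
    }
    return out;
}

// Property function: decoding an encoded value must yield the original.
// With FuzzTest available, this would be registered via
// FUZZ_TEST(MyTestSuite, HexRoundtrips).
void HexRoundtrips(const std::string& input) {
    const std::optional<std::string> decoded = hexDecode(hexEncode(input));
    assert(decoded.has_value());
    assert(*decoded == input);
}
```

The property takes only the original value as input, so the fuzzer never needs to produce valid encoded data; it only has to find an input for which decoding does not invert encoding.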
|
||||
|
||||
**Differential fuzzing**: compare two implementations of the same
|
||||
operation:
|
||||
**Differential fuzzing**: compare two implementations of the same operation:
|
||||
|
||||
```cpp
|
||||
void ImplementationsAgree(const std::string& input) {
|
||||
@ -109,10 +108,11 @@ FUZZ_TEST(MyTestSuite, ImplementationsAgree);
|
||||
|
||||
## Using fixtures
|
||||
|
||||
If your test requires expensive one-time setup (e.g. starting a service),
|
||||
use a fixture with `FUZZ_TEST_F`. Any default-constructible class can be
|
||||
a fixture; the constructor and destructor run once for the whole fuzz test,
|
||||
not once per iteration. When using fixtures, care should be taken to ensure that only the initial fixture state is retained. Program state created during a test _**must**_ not affect or be affected by subsequent iterations.
|
||||
If your test requires expensive one-time setup (e.g. starting a service), use a fixture with
|
||||
`FUZZ_TEST_F`. Any default-constructible class can be a fixture; the constructor and destructor run
|
||||
once for the whole fuzz test, not once per iteration. When using fixtures, care should be taken to
|
||||
ensure that only the initial fixture state is retained. Program state created during a test
|
||||
_**must**_ not affect or be affected by subsequent iterations.
|
||||
|
||||
```cpp
|
||||
class MyServiceFuzzTest {
|
||||
@ -132,10 +132,10 @@ FUZZ_TEST_F(MyServiceFuzzTest, RequestFuzzer);
|
||||
|
||||
## Fuzzing BSON
|
||||
|
||||
MongoDB provides a custom FuzzTest domain for generating valid BSON
|
||||
objects: `mongo::bson_mutator::BSONObjImpl`. It is registered as the
|
||||
`Arbitrary<ConstSharedBuffer>` specialization, so any fuzz test that
|
||||
accepts a `ConstSharedBuffer` will automatically receive well-formed BSON.
|
||||
MongoDB provides a custom FuzzTest domain for generating valid BSON objects:
|
||||
`mongo::bson_mutator::BSONObjImpl`. It is registered as the `Arbitrary<ConstSharedBuffer>`
|
||||
specialization, so any fuzz test that accepts a `ConstSharedBuffer` will automatically receive
|
||||
well-formed BSON.
|
||||
|
||||
```cpp
|
||||
#include "mongo/bson/bson_mutator/bson_mutator.h"
|
||||
@ -147,8 +147,7 @@ void MyCommandFuzzer(ConstSharedBuffer input) {
|
||||
FUZZ_TEST(MyCommandFuzzTest, MyCommandFuzzer);
|
||||
```
|
||||
|
||||
To constrain which fields are present and their types, use the
|
||||
`.With<Type>()` builders:
|
||||
To constrain which fields are present and their types, use the `.With<Type>()` builders:
|
||||
|
||||
```cpp
|
||||
FUZZ_TEST(MyCommandFuzzTest, MyCommandFuzzer)
|
||||
@ -158,8 +157,8 @@ FUZZ_TEST(MyCommandFuzzTest, MyCommandFuzzer)
|
||||
.WithLong("limit", fuzztest::InRange(0LL, 1000LL)));
|
||||
```
|
||||
|
||||
Fields added via `.With<Type>()` are not guaranteed to appear in every
|
||||
generated object, which exercises missing-field error handling as well.
|
||||
Fields added via `.With<Type>()` are not guaranteed to appear in every generated object, which
|
||||
exercises missing-field error handling as well.
|
||||
|
||||
Use `.WithVariant()` when a field may legally hold more than one type:
|
||||
|
||||
@ -171,8 +170,7 @@ fuzztest::Arbitrary<mongo::ConstSharedBuffer>()
|
||||
});
|
||||
```
|
||||
|
||||
Use `.WithAny()` when a key should be present but its type is
|
||||
unconstrained:
|
||||
Use `.WithAny()` when a key should be present but its type is unconstrained:
|
||||
|
||||
```cpp
|
||||
fuzztest::Arbitrary<mongo::ConstSharedBuffer>().WithAny("filter");
|
||||
@ -180,8 +178,8 @@ fuzztest::Arbitrary<mongo::ConstSharedBuffer>().WithAny("filter");
|
||||
|
||||
## Bazel target
|
||||
|
||||
Use `mongo_cc_fuzztest` (from `//bazel:mongo_src_rules.bzl`) to declare a
|
||||
fuzz test target. It links in FuzzTest and GoogleTest automatically:
|
||||
Use `mongo_cc_fuzztest` (from `//bazel:mongo_src_rules.bzl`) to declare a fuzz test target. It links
|
||||
in FuzzTest and GoogleTest automatically:
|
||||
|
||||
```python
|
||||
mongo_cc_fuzztest(
|
||||
@ -198,8 +196,8 @@ mongo_cc_fuzztest(
|
||||
|
||||
## Unit test mode
|
||||
|
||||
Every `FUZZ_TEST` is also a regular GoogleTest test. In unit test mode,
|
||||
the property function is called a small number of times with minimal inputs. This lets fuzz tests run in ordinary CI
|
||||
Every `FUZZ_TEST` is also a regular GoogleTest test. In unit test mode, the property function is
|
||||
called a small number of times with minimal inputs. This lets fuzz tests run in ordinary CI
|
||||
alongside unit tests:
|
||||
|
||||
```
|
||||
@ -208,10 +206,9 @@ bazel test --compiler_type=clang --config=fuzztest --fsan --opt=debug --allocato
|
||||
|
||||
## Fuzzing mode
|
||||
|
||||
Fuzzing mode enables sanitizer and coverage instrumentation and runs the
|
||||
test indefinitely (or until a crash is found). It requires the `fsan`
|
||||
build configuration. Check our Evergreen configuration for the current
|
||||
bazel arguments, or run:
|
||||
Fuzzing mode enables sanitizer and coverage instrumentation and runs the test indefinitely (or until
|
||||
a crash is found). It requires the `fsan` build configuration. Check our Evergreen configuration for
|
||||
the current bazel arguments, or run:
|
||||
|
||||
```
|
||||
bazel run --compiler_type=clang --config=fuzztest --fsan --opt=debug --allocator=system +my_command_fuzztest -- \
|
||||
@ -226,7 +223,9 @@ bazel run --compiler_type=clang --config=fuzztest --fsan --opt=debug --allocator
|
||||
|
||||
## Evergreen
|
||||
|
||||
Fuzz tests defined in bazel using `mongo_cc_fuzztest` will periodically run on the master branch in evergreen. The compiled tests and their associated corpus are saved to S3 and can be downloaded for debugging issues. The corpus is reused between evergreen runs in order to increase fuzzing coverage.
|
||||
Fuzz tests defined in bazel using `mongo_cc_fuzztest` will periodically run on the master branch in
|
||||
evergreen. The compiled tests and their associated corpus are saved to S3 and can be downloaded for
|
||||
debugging issues. The corpus is reused between evergreen runs in order to increase fuzzing coverage.
|
||||
|
||||
## Useful flags
|
||||
|
||||
|
||||
@ -33,24 +33,24 @@ outputs.
|
||||
code changes.
|
||||
|
||||
- Multiple test variations MAY be bundled into a single test. Recommended when testing same feature
|
||||
with different inputs. This helps reviewing the outputs by grouping similar tests together, and also
|
||||
reduces the number of output files.
|
||||
with different inputs. This helps reviewing the outputs by grouping similar tests together, and
|
||||
also reduces the number of output files.
|
||||
|
||||
- Changes to test fixture or test code that affect a non-trivial number of test outputs MUST BE done in
|
||||
separate pull request from production code changes:
|
||||
|
||||
- Pull request for test code only changes can be easily reviewed, even if large number of test
|
||||
outputs are modified. While such changes can still introduce merge conflicts, they don't introduce
|
||||
risk of regression (if outputs were valid
|
||||
outputs are modified. While such changes can still introduce merge conflicts, they don't
|
||||
introduce risk of regression (if outputs were valid
|
||||
- Pull requests with mixed production
|
||||
|
||||
- Tests in the same suite SHOULD share the fixtures when appropriate. This reduces cost of adding
|
||||
new tests to the suite. Changes to the fixture may only affect expected outputs from that fixtures,
|
||||
and those output can be updated in bulk.
|
||||
new tests to the suite. Changes to the fixture may only affect expected outputs from that
|
||||
fixtures, and those output can be updated in bulk.
|
||||
|
||||
- Tests in different suites SHOULD NOT reuse/share fixtures. Changes to the fixture can affect large
|
||||
number of expected outputs.
|
||||
There are exceptions to that rule, and tests in different suites MAY reuse/share fixtures if:
|
||||
number of expected outputs. There are exceptions to that rule, and tests in different suites MAY
|
||||
reuse/share fixtures if:
|
||||
|
||||
- Test fixture is considered stable and changes rarely.
|
||||
- Tests suites are related, either by sharing tests, or testing similar components.
|
||||
@ -59,9 +59,8 @@ outputs.
|
||||
|
||||
- Tests SHOULD print both inputs and outputs of the tested code. This makes it easy for reviewers to
|
||||
verify that the expected outputs are indeed correct by having both input and output next to each
|
||||
other.
|
||||
Otherwise finding the input used to produce the new output may not be practical, and might not even
|
||||
be included in the diff.
|
||||
other. Otherwise finding the input used to produce the new output may not be practical, and might
|
||||
not even be included in the diff.
|
||||
|
||||
- When resolving merge conflicts on the expected output files, one of the approaches below SHOULD be
|
||||
used:
|
||||
@ -71,8 +70,8 @@ outputs.
|
||||
hanges done by local branch.
|
||||
- "Accept yours", rerun the tests and verify the new outputs. This approach requires knowledge of
|
||||
production/test code changes in "theirs" branch. However, if such changes resulted in
|
||||
straightforward and repetitive output changes, like due to printing code change or fixture change,
|
||||
it may be easier to verify than reinspecting local changes.
|
||||
straightforward and repetitive output changes, like due to printing code change or fixture
|
||||
change, it may be easier to verify than reinspecting local changes.
|
||||
|
||||
- Expected test outputs SHOULD be reused across tightly-coupled test suites. The suites are
|
||||
tightly-coupled if:
|
||||
@ -92,8 +91,8 @@ outputs.
|
||||
- Versioned tests, where expected behavior is the same for majority of test inputs/scenarios.
|
||||
|
||||
- AVOID manually modifying expected output files. Those files are considered to be auto generated.
|
||||
Instead, run the tests and then copy the generated output as a new expected output file. See "How to
|
||||
diff and accept new test outputs" section for instructions.
|
||||
Instead, run the tests and then copy the generated output as a new expected output file. See "How
|
||||
to diff and accept new test outputs" section for instructions.
|
||||
|
||||
# How to write Golden Data tests?
|
||||
|
||||
@ -121,9 +120,10 @@ outputs. Verifies the output with the expected output that is in the source repo
|
||||
|
||||
See: [golden_test.h](../src/mongo/unittest/golden_test.h)
|
||||
|
||||
Before running `bazel test`, set up the golden test framework as described in the `Setup` section below.
|
||||
This will ensure that the C++ test outputs are written to a location where `buildscripts/golden_test.py`
|
||||
can find them so that the `diff` and `accept` functions work as expected.
|
||||
Before running `bazel test`, set up the golden test framework as described in the `Setup` section
|
||||
below. This will ensure that the C++ test outputs are written to a location where
|
||||
`buildscripts/golden_test.py` can find them so that the `diff` and `accept` functions work as
|
||||
expected.
|
||||
|
||||
**Example:**
|
||||
|
||||
@ -160,8 +160,7 @@ TEST_F(MySuiteFixture, MyFeatureBTest) {
|
||||
}
|
||||
```
|
||||
|
||||
Also see self-test:
|
||||
[golden_test_test.cpp](../src/mongo/unittest/golden_test_test.cpp)
|
||||
Also see self-test: [golden_test_test.cpp](../src/mongo/unittest/golden_test_test.cpp)
|
||||
|
||||
# How to diff and accept new test outputs on a workstation
|
||||
|
||||
@ -177,13 +176,15 @@ buildscripts/golden_test.py requires a one-time workstation setup.
|
||||
Note: this setup is only required to use buildscripts/golden_test.py itself. It is NOT required to
|
||||
just run the Golden Data tests when not using buildscripts/golden_test.py.
|
||||
|
||||
1. Create a yaml config file, as described by [Appendix - Config file reference](#appendix---config-file-reference).
|
||||
1. Create a yaml config file, as described by
|
||||
[Appendix - Config file reference](#appendix---config-file-reference).
|
||||
2. Set the GOLDEN_TEST_CONFIG_PATH environment variable to the config file location, so that it is available
|
||||
when running tests and when running buildscripts/golden_test.py tool.
|
||||
|
||||
### Automatic Setup
|
||||
|
||||
Use buildscripts/golden_test.py builtin setup to initialize default config for your current platform.
|
||||
Use buildscripts/golden_test.py builtin setup to initialize default config for your current
|
||||
platform.
|
||||
|
||||
**Instructions for Linux**
|
||||
|
||||
@ -195,8 +196,8 @@ buildscripts/golden_test.py setup
|
||||
|
||||
**Instructions for Windows**
|
||||
|
||||
Run buildscripts/golden_test.py setup utility.
|
||||
You may be asked for a password, when not running in "Run as administrator" shell.
|
||||
Run buildscripts/golden_test.py setup utility. You may be asked for a password, when not running in
|
||||
"Run as administrator" shell.
|
||||
|
||||
```cmd
|
||||
c:\python\python310\python.exe buildscripts/golden_test.py setup
|
||||
@ -295,7 +296,8 @@ $> buildscripts/golden_test.py --help
|
||||
|
||||
### Update multiple expected files at once
|
||||
|
||||
Some tests will run in multiple passthroughs or build variants, so they have multiple expected files.
|
||||
Some tests will run in multiple passthroughs or build variants, so they have multiple expected
|
||||
files.
|
||||
|
||||
Whenever the test is updated, all the expected files should be updated together as well.
|
||||
|
||||
@ -306,8 +308,8 @@ buildscripts/golden_test.py --verbose clean-run-accept jstests/query_golden/NAME
|
||||
This option uses `resmoke.py find-suites` to determine the passthrough suites a test belongs to and
|
||||
runs them.
|
||||
|
||||
If the test is found to only belong to the `query_golden_classic` passthrough, it is assumed that
|
||||
it can have multiple expected results due to being run under multiple build variants with different
|
||||
If the test is found to only belong to the `query_golden_classic` passthrough, it is assumed that it
|
||||
can have multiple expected results due to being run under multiple build variants with different
|
||||
`internalQueryFrameworkControl` settings. So the test will be run with various values for
|
||||
`internalQueryFrameworkControl`.
|
||||
|
||||
@ -348,22 +350,21 @@ outputRootPattern:
|
||||
type: String
|
||||
optional: true
|
||||
description:
|
||||
Root path pattern that will be used to write expected and actual test outputs for all tests
|
||||
in the test run.
|
||||
If not specified a temporary folder location will be used.
|
||||
Path pattern string may use '%' characters in the last part of the path. '%' characters in
|
||||
the last part of the path will be replaced with random lowercase hexadecimal digits.
|
||||
examples: /var/tmp/test_output/out-%%%%-%%%%-%%%%-%%%%
|
||||
/var/tmp/test_output
|
||||
Root path pattern that will be used to write expected and actual test outputs for all tests in
|
||||
the test run. If not specified a temporary folder location will be used. Path pattern string may
|
||||
use '%' characters in the last part of the path. '%' characters in the last part of the path
|
||||
will be replaced with random lowercase hexadecimal digits.
|
||||
examples: /var/tmp/test_output/out-%%%%-%%%%-%%%%-%%%% /var/tmp/test_output
|
||||
|
||||
diffCmd:
|
||||
type: String
|
||||
optional: true
|
||||
description: Shell command to diff a single golden test run output.
|
||||
{{expected}} and {{actual}} variables should be used and will be replaced with expected and
|
||||
actual output folder paths respectively.
|
||||
This property is not used to decide whether the test passes or fails; it is only used to
|
||||
display differences once we've decided that a test failed.
|
||||
examples: git diff --no-index "{{expected}}" "{{actual}}"
|
||||
diff -ruN --unidirectional-new-file --color=always "{{expected}}" "{{actual}}"
|
||||
description:
|
||||
Shell command to diff a single golden test run output. {{expected}} and {{actual}} variables
|
||||
should be used and will be replaced with expected and actual output folder paths respectively.
|
||||
This property is not used to decide whether the test passes or fails; it is only used to display
|
||||
differences once we've decided that a test failed.
|
||||
examples:
|
||||
git diff --no-index "{{expected}}" "{{actual}}" diff -ruN --unidirectional-new-file
|
||||
--color=always "{{expected}}" "{{actual}}"
|
||||
```
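To make the `outputRootPattern` rule concrete, here is a small sketch of the described substitution: only `%` characters in the last path component are replaced with random lowercase hexadecimal digits. This is not the golden test framework's implementation. The function name `expandOutputRootPattern` is hypothetical, and the sketch assumes `/` as the path separator.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <string>

// Illustrative only: expands '%' placeholders in the last path component.
// Assumes '/' separates path components.
std::string expandOutputRootPattern(const std::string& pattern, std::mt19937& rng) {
    static const char kHex[] = "0123456789abcdef";
    std::uniform_int_distribution<int> digit(0, 15);

    // Find where the last path component begins.
    const std::size_t lastSep = pattern.find_last_of('/');
    const std::size_t start = (lastSep == std::string::npos) ? 0 : lastSep + 1;

    std::string result = pattern;
    for (std::size_t i = start; i < result.size(); ++i) {
        if (result[i] == '%') {
            result[i] = kHex[digit(rng)];
        }
    }
    return result;
}
```

With a pattern such as `/var/tmp/test_output/out-%%%%-%%%%-%%%%-%%%%`, each call yields a different folder name under `/var/tmp/test_output`, while a pattern without `%` (such as `/var/tmp/test_output`) is returned unchanged.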
|
||||
|
||||
165
docs/idl.md
@ -142,8 +142,8 @@ mongo_idl_library(
|
||||
```
|
||||
|
||||
Bazel knows how to invoke the IDL compiler and generate files in the build directory with the C++
|
||||
code. This code can also be generated by `--build_tag_filters=gen_source` tag in bazel which is useful for
|
||||
code navigation.
|
||||
code. This code can also be generated by `--build_tag_filters=gen_source` tag in bazel which is
|
||||
useful for code navigation.
|
||||
|
||||
The generated IDL code looks something like the simplified code below.
|
||||
|
||||
@ -206,17 +206,17 @@ fields on the `commands` object.
|
||||
|
||||
The special features/requirements of commands:
|
||||
|
||||
1. First element must match the name of the command, and the parsing rules of this element
|
||||
can be customized via the `namespace` field.
|
||||
1. First element must match the name of the command, and the parsing rules of this element can be
|
||||
customized via the `namespace` field.
|
||||
2. In `OP_MSG`, `$db` must be present or defaults to `admin`
|
||||
3. Commands may have a `struct` as a reply
|
||||
4. Commands may be a part of API Version 1
|
||||
5. Any structs marked with `is_generic_cmd_list: "arg"` that are in imported IDL files
|
||||
will automatically be chained to all commands. The IDL compiler imports
|
||||
[`generic_argument.idl`](generic_argument.idl) by default, so any generic argument struct
|
||||
defined in that file will be chained to all commands by default.
|
||||
6. Command replies ignore the generic arguments fields like `$clusterTime`, `ok`, etc
|
||||
during parsing. The list of these fields is in [`generic_argument.idl`](generic_argument.idl).
|
||||
5. Any structs marked with `is_generic_cmd_list: "arg"` that are in imported IDL files will
|
||||
automatically be chained to all commands. The IDL compiler imports
|
||||
[`generic_argument.idl`](generic_argument.idl) by default, so any generic argument struct defined
|
||||
in that file will be chained to all commands by default.
|
||||
6. Command replies ignore the generic arguments fields like `$clusterTime`, `ok`, etc during
|
||||
parsing. The list of these fields is in [`generic_argument.idl`](generic_argument.idl).
|
||||
|
||||
Example Command:
|
||||
|
||||
@ -388,7 +388,8 @@ void idlDeserialize(StringEnumEnum& en, ::mongo::StringData value, const IDLPars
|
||||
constexpr ::mongo::StringData idlGetDefaultParserFieldName(StringEnumEnum) { return "StringEnumEnum"; }
|
||||
```
|
||||
|
||||
These ADL hooks are not intended to be used directly by user code. See [Serialization/Deserialization API](#serializationdeserialization-api).
|
||||
These ADL hooks are not intended to be used directly by user code. See
|
||||
[Serialization/Deserialization API](#serializationdeserialization-api).
|
||||
|
||||
### Integer Enums
|
||||
|
||||
@ -420,7 +421,8 @@ std::int32_t idlSerialize(IntEnum value);
|
||||
constexpr ::mongo::StringData idlGetDefaultParserFieldName(IntEnum) { return "IntEnum"; }
|
||||
```
|
||||
|
||||
These ADL hooks are not intended to be used directly by user code. See [Serialization/Deserialization API](#serializationdeserialization-api).
|
||||
These ADL hooks are not intended to be used directly by user code. See
|
||||
[Serialization/Deserialization API](#serializationdeserialization-api).
|
||||
|
||||
### Serialization/Deserialization API
|
||||
|
||||
@ -432,9 +434,9 @@ The public API to serialize and deserialize IDL-generated enums is defined in
|
||||
auto parsedEnum = idl::deserialize<IdlEnum>(value);
|
||||
```
|
||||
|
||||
The definitions of `idl::serialize()` and `idl::deserialize()` rely on the autogenerated ADL hooks to
|
||||
find the serializer/deserializer implementations for each enum. User code should use this public API
|
||||
and not the ADL hooks directly.
|
||||
The definitions of `idl::serialize()` and `idl::deserialize()` rely on the autogenerated ADL hooks
|
||||
to find the serializer/deserializer implementations for each enum. User code should use this public
|
||||
API and not the ADL hooks directly.
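
The dispatch described above can be illustrated with a minimal, self-contained sketch. Everything in it is an assumption made for illustration: the `app::Color` enum, the hook bodies, and the `idl_sketch` namespace are hypothetical, and `std::string_view` stands in for `mongo::StringData`. The real hooks also take an `IDLParserContext`, which is omitted here.

```cpp
#include <cassert>
#include <stdexcept>
#include <string_view>

namespace app {
// Hypothetical enum standing in for an IDL-generated enum.
enum class Color { kRed, kBlue };

// Hypothetical ADL hooks: free functions declared in the enum's own namespace.
inline std::string_view idlSerialize(Color c) {
    return c == Color::kRed ? "red" : "blue";
}

inline void idlDeserialize(Color& out, std::string_view value) {
    if (value == "red") {
        out = Color::kRed;
    } else if (value == "blue") {
        out = Color::kBlue;
    } else {
        throw std::invalid_argument("unknown Color value");
    }
}
}  // namespace app

namespace idl_sketch {
// Public entry points. The unqualified calls let argument-dependent lookup
// find the hooks in the namespace of the enum type E.
template <typename E>
auto serialize(E value) {
    return idlSerialize(value);
}

template <typename E>
E deserialize(std::string_view value) {
    E out{};
    idlDeserialize(out, value);
    return out;
}
}  // namespace idl_sketch
```

A caller in this sketch would write `idl_sketch::deserialize<app::Color>("red")` and never name `idlDeserialize` itself, which mirrors the guidance to use the public API rather than the hooks.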
|
||||
|
||||
### Reference
|
||||
|
||||
@ -482,8 +484,8 @@ types allow users to customize IDL parsing for their own unique needs.
|
||||
|
||||
A field in a struct or command can be defined as a type but a field can also be an array, enum,
|
||||
struct or variant. Declaring a field as something other than a type is preferred to using types since
|
||||
it allows more type information to be represented in IDL over C++. See `type` in the [field
|
||||
reference](#struct-fields-attribute-reference) for more information.
|
||||
it allows more type information to be represented in IDL over C++. See `type` in the
|
||||
[field reference](#struct-fields-attribute-reference) for more information.
|
||||
|
||||
Type supports builtin BSON types like int32, int64, and string. These are types built into
|
||||
`BSONElement`/`BSONObjBuilder`. It also supports custom types to give the code full control of
|
||||
@ -529,11 +531,11 @@ The five key things to note in this example:
|
||||
`BSONElement` as a parameter. The IDL generator has custom rules for `BSONElement`.
|
||||
- `serializer` - omitted in this example because `BSONObjBuilder` has builtin support for
|
||||
`std::string`
|
||||
- `is_view` - indicates whether the type is a view or not. If the type is a view, then it's
|
||||
possible that objects of the type will not own all of its members. If the type is not a view,
|
||||
then objects of the type are guaranteed to own all of its members. This field is optional and
|
||||
defaults to True. To reduce the size of the C++ representation of structs including this type,
|
||||
you can specify this field as False if the type is not a view type.
|
||||
- `is_view` - indicates whether the type is a view or not. If the type is a view, then it's possible
|
||||
that objects of the type will not own all of its members. If the type is not a view, then objects
|
||||
of the type are guaranteed to own all of its members. This field is optional and defaults to True.
|
||||
To reduce the size of the C++ representation of structs including this type, you can specify this
|
||||
field as False if the type is not a view type.
|
||||
|
||||
### Custom Types
|
||||
|
||||
@ -590,22 +592,29 @@ IDLAnyType:
|
||||
- `std::vector<_>` - When using `std::vector<_>`, the getters/setters use
|
||||
`mongo::ConstDataRange` instead
|
||||
- `deserializer` - string - a method name to call to deserialize the type. Typically this is a function
|
||||
that takes `BSONElement` as a parameter. The IDL generator has custom rules for `BSONElement`. - By default, IDL assumes it is a instance methods of `cpp_type`. - If prefixed with `::`, assumes the function is a global static function - By default, the deserializer's function signature is `<function_name>(<cpp_type>)`. - For `object` types, the deserializer's function signature is `<function_name>(const BSONObj&
|
||||
obj)` - For `any` types, the deserializer's function signature is `<function_name>(BSONElement
|
||||
element)`.
|
||||
- `serializer` - string -a method name to all serialize the type. - By default, IDL assumes it is a instance methods of `cpp_type`. - If prefixed with `::`, assumes the function is a global static function - By default, the deserializer's function signature is `<type_append> <function_name>(const
|
||||
<cpp_type>&)` where `type_append` is a type `BSONObjBuilder` understands. - For `object` types, the deserializer's function signature is `<function_name>(const BSONObj&
|
||||
obj)` - For `any` types that are not in an array, the serializer's function signature is
|
||||
`<function_name>(StringData fieldName, BSONObjBuilder* builder)`. - For `any` types that are in an array, the serializer's function signature is
|
||||
that takes `BSONElement` as a parameter. The IDL generator has custom rules for `BSONElement`.
  - By default, IDL assumes it is an instance method of `cpp_type`.
  - If prefixed with `::`, assumes the function is a global static function.
  - By default, the deserializer's function signature is `<function_name>(<cpp_type>)`.
  - For `object` types, the deserializer's function signature is
    `<function_name>(const BSONObj& obj)`.
  - For `any` types, the deserializer's function signature is
    `<function_name>(BSONElement element)`.
- `serializer` - string - a method name to call to serialize the type.
  - By default, IDL assumes it is an instance method of `cpp_type`.
  - If prefixed with `::`, assumes the function is a global static function.
  - By default, the serializer's function signature is
    `<type_append> <function_name>(const <cpp_type>&)` where `type_append` is a type
    `BSONObjBuilder` understands.
  - For `object` types, the deserializer's function signature is
    `<function_name>(const BSONObj& obj)`.
  - For `any` types that are not in an array, the serializer's function signature is
    `<function_name>(StringData fieldName, BSONObjBuilder* builder)`.
  - For `any` types that are in an array, the serializer's function signature is
    `<function_name>(BSONArrayBuilder* builder)`.
|
||||
- `deserialize_with_tenant` - bool - if set, adds `TenantId` as the first parameter to
|
||||
`deserializer`
|
||||
- `internal_only` - bool - undocumented, DO NOT USE
|
||||
- `default` - string - default value for a type. A field in a struct inherits this value if a field
|
||||
does not set a default. See struct's `default` rules for more information.
|
||||
- `is_view` - indicates whether the type is a view or not. If the type is a view, then it's
|
||||
possible that objects of the type will not own all of its members. If the type is not a view,
|
||||
then objects of the type are guaranteed to own all of its members.
|
||||
- `is_view` - indicates whether the type is a view or not. If the type is a view, then it's possible
|
||||
that objects of the type will not own all of its members. If the type is not a view, then objects
|
||||
of the type are guaranteed to own all of its members.
|
||||
|
||||
## Structs
|
||||
|
||||
@ -638,9 +647,8 @@ exampleStruct:
|
||||
optional: true
|
||||
defaultedField:
|
||||
description: >-
|
||||
Most callers should rely on 42
|
||||
as it is the answer to the question
|
||||
of life the universe and everything.
|
||||
Most callers should rely on 42 as it is the answer to the question of life the universe and
|
||||
everything.
|
||||
type: long
|
||||
validator:
|
||||
gt: 0
|
||||
@ -762,8 +770,8 @@ multi level chained structs.
|
||||
- `is_command_reply` - bool - if true, marks the struct as a command reply. A struct marked as
|
||||
`is_command_reply` generates a parser that ignores known generic or common fields across all
|
||||
replies when parsing replies (i.e. `ok`, `errmsg`, etc)
|
||||
- `is_generic_cmd_list` - string - choice [`arg`, `reply`], if set, generates functions `bool
|
||||
hasField(StringData)` and `bool shouldForwardToShards(StringData)` for each field in the
|
||||
- `is_generic_cmd_list` - string - choice [`arg`, `reply`], if set, generates functions
|
||||
`bool hasField(StringData)` and `bool shouldForwardToShards(StringData)` for each field in the
|
||||
struct. If set to `arg`, the struct will automatically be chained to every `command`.
|
||||
- `query_shape_component` - bool - true indicates that special serialization code will be generated
|
||||
to serialize as a query shape
|
||||
@ -784,10 +792,10 @@ hasField(StringData)` and `bool shouldForwardToShards(StringData)` for each fiel
|
||||
have a variant of strings and structs.
|
||||
- Variant string support differentiates the type to choose based on the BSON type.
|
||||
- Variant struct support differentiates the type to choose based on the _first_ field of the
|
||||
struct. The first field must be unique in each struct across the structs. When parsing a
|
||||
BSON object as a variant of multiple structs, the parser assumes that the first field
|
||||
declared in the IDL struct is always the first field in its BSON representation.
|
||||
See `bulkWrite` for an example.
|
||||
struct. The first field must be unique in each struct across the structs. When parsing a BSON
|
||||
object as a variant of multiple structs, the parser assumes that the first field declared in
|
||||
the IDL struct is always the first field in its BSON representation. See `bulkWrite` for an
|
||||
example.
|
||||
- `ignore` - bool - true means field generates no code but is ignored by the generated deserializer.
|
||||
Used to deprecate fields that no longer have an effect but allow strict parsers to ignore them.
|
||||
- `optional` - bool - true means the field is optional. Generated C++ type is
|
||||
@ -819,8 +827,9 @@ Comparisons are generated with C++ operators for these comparisons
|
||||
- `lt` - string - Validates field is less than `string`
|
||||
- `gte` - string - Validates field is greater than or equal to `string`
|
||||
- `lte` - string - Validates field is less than or equal to `string`
|
||||
- `callback` - string - A static function to call of the shape `Status <function_name>(const
|
||||
<cpp_type> value)`. For non-simple types, `value` is passed by const-reference.
|
||||
- `callback` - string - A static function to call of the shape
|
||||
`Status <function_name>(const <cpp_type> value)`. For non-simple types, `value` is passed by
|
||||
const-reference.
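
As an illustration of the `callback` shape, here is a minimal sketch for a field whose `cpp_type` is `std::int64_t`. The `Status` struct below is a hypothetical stand-in for `mongo::Status` so that the example compiles on its own, and `validateEvenPositive` is an invented name, not a function from the code base.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for mongo::Status so the sketch is self-contained.
struct Status {
    bool ok;
    std::string reason;
    static Status OK() { return {true, ""}; }
};

// Hypothetical validator matching `Status <function_name>(const <cpp_type> value)`.
// A simple type such as std::int64_t is taken by value.
inline Status validateEvenPositive(const std::int64_t value) {
    if (value <= 0) {
        return {false, "value must be greater than 0"};
    }
    if (value % 2 != 0) {
        return {false, "value must be even"};
    }
    return Status::OK();
}
```

A callback like this suits checks that the comparison validators cannot express, such as the evenness rule here.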
|
||||
|
||||
## Commands
|
||||
|
||||
@ -830,24 +839,24 @@ the `command` object when compared to `struct`.
|
||||
|
||||
The special features:
|
||||
|
||||
1. First element must match the name of the command, and the parsing rules of this element
|
||||
can be customized via the `namespace` field.
|
||||
1. First element must match the name of the command, and the parsing rules of this element can be
|
||||
customized via the `namespace` field.
|
||||
2. In `OP_MSG`, `$db` must be present or defaults to `admin`
|
||||
3. Commands may have a `struct` as a reply
|
||||
4. Commands may be a part of API Version 1
|
||||
5. Any structs marked with `is_generic_cmd_list: "arg"` that are in imported IDL files
|
||||
will automatically be chained to all commands. The IDL compiler imports
|
||||
[`generic_argument.idl`](generic_argument.idl) by default, so any generic argument struct
|
||||
defined in that file will be chained to all commands by default.
|
||||
6. Command replies ignore the generic arguments fields like `$clusterTime`, `ok`, etc
|
||||
during parsing. The list of these fields is in [`generic_argument.idl`](generic_argument.idl).
|
||||
5. Any structs marked with `is_generic_cmd_list: "arg"` that are in imported IDL files will
|
||||
automatically be chained to all commands. The IDL compiler imports
|
||||
[`generic_argument.idl`](generic_argument.idl) by default, so any generic argument struct defined
|
||||
in that file will be chained to all commands by default.
|
||||
6. Command replies ignore the generic arguments fields like `$clusterTime`, `ok`, etc during
|
||||
parsing. The list of these fields is in [`generic_argument.idl`](generic_argument.idl).
|
||||
|
||||
The `namespace` field is the field that describes one kind of parameter a command takes.
|
||||
|
||||
1. `concatenate_with_db` - takes a collection name. Generates a method `const NamespaceString
|
||||
getNamespace()`. Examples: `insert`, `update`, `delete`
|
||||
2. `concatenate_with_db_or_uuid` - takes a collection name. Generates a method `const
|
||||
NamespaceStringOrUUID& getNamespaceOrUUID()`. Examples: `find`, `count`
|
||||
1. `concatenate_with_db` - takes a collection name. Generates a method
|
||||
`const NamespaceString getNamespace()`. Examples: `insert`, `update`, `delete`
|
||||
2. `concatenate_with_db_or_uuid` - takes a collection name. Generates a method
|
||||
`const NamespaceStringOrUUID& getNamespaceOrUUID()`. Examples: `find`, `count`
|
||||
3. `ignored` - ignores the first argument entirely. Examples: `hello`, `setParameter`, `ping`
|
||||
4. `type` - takes a struct as the first argument. Examples: `getLog`, `clearLog`, `renameCollection`
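
To make the `concatenate_with_db` case concrete, the sketch below models how a collection name taken from the command field is combined with `$db`. The `NamespaceSketch` struct and the `concatenateWithDb` function are hypothetical stand-ins, not the real `NamespaceString` class or the generated parser code.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for NamespaceString: a database name plus a collection name.
struct NamespaceSketch {
    std::string db;
    std::string coll;

    // Full namespace in the usual "<db>.<collection>" form.
    std::string ns() const { return db + "." + coll; }
};

// Models `concatenate_with_db`: the command field value is treated as a
// collection name and paired with the value of `$db`.
inline NamespaceSketch concatenateWithDb(const std::string& dollarDb,
                                         const std::string& commandValue) {
    return NamespaceSketch{dollarDb, commandValue};
}
```

For example, a command document `{insert: "users", $db: "test"}` would map to the namespace `test.users` in this model.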
|
||||
|
||||
@ -866,15 +875,16 @@ Commands can also specify their replies that they return. Replies are regular `s
|
||||
- `immutable` - [see structs](#struct-reference)
|
||||
- `non_const_getter` - [see structs](#struct-reference)
|
||||
- `namespace` - string - choice of a string [`concatenate_with_db`, `concatenate_with_db_or_uuid`,
|
||||
`ignored`, `type`]. Instructs how the value of command field should be parsed - `concatenate_with_db` - Indicates the command field is a string and should be treated as a
|
||||
collection name. Typically used by commands that deal with collections. Automatically
|
||||
concatenated with `$db` by the IDL parser. Adds a method `const NamespaceString getNamespace()`
|
||||
to the generated class. - `concatenate_with_db_or_uuid` - Indicates the command field is a string or uuid, and should be
|
||||
treated as a collection name. Typically used by commands that deal with collections.
|
||||
Automatically concatenated with `$db` by the IDL parser. Adds a method `const
|
||||
NamespaceStringOrUUID& getNamespaceOrUUID()` to the generated class. - `ignored` - Ignores the value of the command field. Used by commands that ignore their command
|
||||
argument entirely - `type` - Indicates the command takes a custom type for the first field. `type` field must be
|
||||
set.
|
||||
`ignored`, `type`]. Instructs how the value of the command field should be parsed:
  - `concatenate_with_db` - Indicates the command field is a string and should be treated as a
    collection name. Typically used by commands that deal with collections. Automatically
    concatenated with `$db` by the IDL parser. Adds a method
    `const NamespaceString getNamespace()` to the generated class.
  - `concatenate_with_db_or_uuid` - Indicates the command field is a string or UUID, and should be
    treated as a collection name. Typically used by commands that deal with collections.
    Automatically concatenated with `$db` by the IDL parser. Adds a method
    `const NamespaceStringOrUUID& getNamespaceOrUUID()` to the generated class.
  - `ignored` - Ignores the value of the command field. Used by commands that ignore their command
    argument entirely.
  - `type` - Indicates the command takes a custom type for the first field. The `type` field must
    be set.
|
||||
- `type` - string - name of IDL type or struct to parse the command field as
|
||||
- `command_name` - string - the IDL-generated parser expects the command name to match the name of the YAML
|
||||
map. This can be overwritten with `command_name`. Commands should be `camelCase`
|
||||
@ -893,8 +903,8 @@ NamespaceStringOrUUID& getNamespaceOrUUID()` to the generated class. - `ignored`
|
||||
|
||||
### Access Check Reference
|
||||
|
||||
A list of privileges the command checks. Only applicable for commands that are a part of
|
||||
API Version 1. Checked at runtime when test commands are enabled.
|
||||
A list of privileges the command checks. Only applicable for commands that are a part of API
|
||||
Version 1. Checked at runtime when test commands are enabled.
|
||||
|
||||
- `none` - bool - No privileges required
|
||||
- `simple` - mapping - single [check or privilege](#check-or-privilege)
|
||||
@ -1002,28 +1012,29 @@ unit tests exercise all features and combinations IDL can handle.
|
||||
#### BSONObj Anchor
|
||||
|
||||
The parsing method a struct is initialized with indicates what type of ownership the constructed
|
||||
object has on the `BSONObj` parameter. An internal `BSONObj` anchor ensures that the lifetime of
|
||||
the `BSONObj` matches the lifetime of the object in the cases that the `BSONObj` parameter is
|
||||
owned or shared.
|
||||
object has on the `BSONObj` parameter. An internal `BSONObj` anchor ensures that the lifetime of the
|
||||
`BSONObj` matches the lifetime of the object in the cases that the `BSONObj` parameter is owned or
|
||||
shared.
|
||||
|
||||
#### View Types
|
||||
|
||||
If the struct is a view, then it's possible that objects of the type will not own all of its
|
||||
members. If the struct is not a view, then objects of the type are guaranteed to own all of its
|
||||
members. This is determined by recursively checking the fields of a struct. This info is used
|
||||
during generation to determine whether or not a struct will need a `BSONObj` anchor.
|
||||
members. This is determined by recursively checking the fields of a struct. This info is used during
|
||||
generation to determine whether or not a struct will need a `BSONObj` anchor.
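
The relationship between view types and the anchor can be sketched with standard library types. This is only an analogy under stated assumptions: `ParsedView` and `parseShared` are hypothetical, `std::string_view` plays the role of a view member, and a `std::shared_ptr<const std::string>` plays the role of the `BSONObj` anchor.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical view struct: `name` points into a buffer it does not own.
struct ParsedView {
    std::string_view name;
    // Anchor: shared ownership of the buffer keeps `name` valid for as long
    // as this struct is alive.
    std::shared_ptr<const std::string> anchor;
};

// Parses with shared ownership: the result anchors the buffer it views.
inline ParsedView parseShared(std::shared_ptr<const std::string> buffer) {
    ParsedView out;
    // View of at most the first five characters of the buffer.
    out.name = std::string_view(*buffer).substr(0, 5);
    out.anchor = std::move(buffer);
    return out;
}
```

A struct with no view members would own all of its data and could skip the anchor, which is the saving the generator is looking for.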
|
||||
|
||||
## Best Practices
|
||||
|
||||
IDL has been in use since 2017. In that time, here are a few best practices:
|
||||
|
||||
1. strict or non-strict parsers - Structs that are persisted to disk should set `strict: false`.
|
||||
It's better for upgrade/downgrade. Commands should set `strict: true` or omit it as `strict:
|
||||
true` is the default. 1. For persistance: For upgrade/downgrade, if a persisted document with a strict parser has a
|
||||
field added in new version N+1 and then the user downgrades to old version N, the strict
|
||||
parser will throw an exception and reject the document. If this document was part of the
|
||||
storage catalog for instance, the server would fail to start. 2. For commands: By using strict parsers, it gives the server the ability to add fields without
|
||||
the risk of clients accidentally sending fields with the same name that had been ignored.
|
||||
It's better for upgrade/downgrade. Commands should set `strict: true` or omit it, as
`strict: true` is the default.
   1. For persistence: For upgrade/downgrade, if a persisted document with a strict parser has a
      field added in new version N+1 and then the user downgrades to old version N, the strict
      parser will throw an exception and reject the document. If this document was part of the
      storage catalog, for instance, the server would fail to start.
   2. For commands: Strict parsers give the server the ability to add fields without the risk of
      clients accidentally sending fields with the same name that had previously been ignored.
|
||||
2. Extending existing structs/commands - all new fields in a struct/command must be marked optional
|
||||
to support backwards compatibility. For new structs/commands, there should be some required
|
||||
fields. It does not matter if the struct is not persisted, non-optional fields break backwards
|
||||
|
||||
@ -2,28 +2,26 @@
|
||||
title: LibFuzzer
|
||||
---
|
||||
|
||||
> **!!NOTE!!**: LibFuzzer is deprecated and should not be used for new fuzz tests. See [FuzzTest](fuzztest.md) for new fuzzing implementations
|
||||
> **!!NOTE!!**: LibFuzzer is deprecated and should not be used for new fuzz tests. See
|
||||
> [FuzzTest](fuzztest.md) for new fuzzing implementations
|
||||
|
||||
LibFuzzer is a tool for performing coverage guided fuzzing of C/C++
|
||||
code. LibFuzzer will try to trigger AUBSAN failures in a function you
|
||||
provide, by repeatedly calling it with a carefully crafted byte array as
|
||||
input. Each input will be assigned a "score". Byte arrays which exercise
|
||||
new or more regions of code will score better. LibFuzzer will merge and
|
||||
mutate high scoring inputs in order to gradually cover more and more
|
||||
possible behavior.
|
||||
LibFuzzer is a tool for performing coverage guided fuzzing of C/C++ code. LibFuzzer will try to
|
||||
trigger AUBSAN failures in a function you provide, by repeatedly calling it with a carefully crafted
|
||||
byte array as input. Each input will be assigned a "score". Byte arrays which exercise new or more
|
||||
regions of code will score better. LibFuzzer will merge and mutate high scoring inputs in order to
|
||||
gradually cover more and more possible behavior.
|
||||
|
||||
# When to use LibFuzzer
|
||||
|
||||
> **!!NOTE!!**: LibFuzzer is deprecated and should not be used for new fuzz tests. See [FuzzTest](fuzztest.md) for new fuzzing implementations
|
||||
> **!!NOTE!!**: LibFuzzer is deprecated and should not be used for new fuzz tests. See
|
||||
> [FuzzTest](fuzztest.md) for new fuzzing implementations
|
||||
|
||||
LibFuzzer is great for testing functions which accept a opaque blob of
|
||||
untrusted user-provided data.
|
||||
LibFuzzer is great for testing functions which accept an opaque blob of untrusted user-provided data.
|
||||
|
||||
# How to use LibFuzzer
|
||||
|
||||
LibFuzzer implements `int main`, and expects to be linked with an object
|
||||
file which provides the function under test. You will achieve this by
|
||||
writing a cpp file which implements
|
||||
LibFuzzer implements `int main`, and expects to be linked with an object file which provides the
|
||||
function under test. You will achieve this by writing a cpp file which implements
|
||||
|
||||
```cpp
|
||||
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
|
||||
@ -31,26 +29,22 @@ extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
|
||||
}
|
||||
```
|
||||
|
||||
`LLVMFuzzerTestOneInput` will be called repeatedly, with fuzzer
|
||||
generated bytes in `Data`. `Size` will always truthfully tell your
|
||||
implementation how many bytes are in `Data`. If your function crashes or
|
||||
induces an AUBSAN fault, LibFuzzer will consider that to be a finding
|
||||
worth reporting.
|
||||
`LLVMFuzzerTestOneInput` will be called repeatedly, with fuzzer generated bytes in `Data`. `Size`
|
||||
will always truthfully tell your implementation how many bytes are in `Data`. If your function
|
||||
crashes or induces an AUBSAN fault, LibFuzzer will consider that to be a finding worth reporting.
|
||||
|
||||
Keep in mind that your function will often "just" be adapting `Data` to
|
||||
whatever format our internal C++ functions requires. However, you have a
|
||||
lot of freedom in exactly what you choose to do. Just make sure your
|
||||
function crashes or produces an invariant when something interesting
|
||||
happens! As just a few ideas:
|
||||
Keep in mind that your function will often "just" be adapting `Data` to whatever format our internal
|
||||
C++ functions require. However, you have a lot of freedom in exactly what you choose to do. Just
|
||||
make sure your function crashes or trips an invariant when something interesting happens! As just
|
||||
a few ideas:
|
||||
|
||||
- You might choose to call multiple implementations of a single
|
||||
operation, and validate that they produce the same output when
|
||||
presented the same input.
|
||||
- You could tease out individual bytes from `Data` and provide them as
|
||||
different arguments to the function under test.
|
||||
- You might choose to call multiple implementations of a single operation, and validate that they
|
||||
produce the same output when presented the same input.
|
||||
- You could tease out individual bytes from `Data` and provide them as different arguments to the
|
||||
function under test.
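
The first idea, comparing two implementations of the same operation, could look roughly like the sketch below. The functions `sumForward` and `sumBackward` are hypothetical placeholders for the real implementations under test; only the `LLVMFuzzerTestOneInput` entry point comes from LibFuzzer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Two hypothetical implementations of one operation: a wrapping 32-bit byte sum.
static std::uint32_t sumForward(const std::uint8_t* data, std::size_t size) {
    std::uint32_t total = 0;
    for (std::size_t i = 0; i < size; ++i) {
        total += data[i];
    }
    return total;
}

static std::uint32_t sumBackward(const std::uint8_t* data, std::size_t size) {
    std::uint32_t total = 0;
    for (std::size_t i = size; i > 0; --i) {
        total += data[i - 1];
    }
    return total;
}

extern "C" int LLVMFuzzerTestOneInput(const std::uint8_t* Data, std::size_t Size) {
    // Crash on disagreement so the fuzzer reports the input as a finding.
    if (sumForward(Data, Size) != sumBackward(Data, Size)) {
        std::abort();
    }
    return 0;
}
```

In a real target, the two functions would be independent code paths, for example an optimized routine and a simple reference version, so that a mismatch points at a genuine bug.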
|
||||
|
||||
Finally, your cpp file will need a bazel target. There is a method which
|
||||
defines fuzzer targets, much like how we define unittests. For example:
|
||||
Finally, your cpp file will need a bazel target. There is a method which defines fuzzer targets,
|
||||
much like how we define unittests. For example:
|
||||
|
||||
```python
|
||||
mongo_cc_fuzzer_test(
|
||||
@ -70,25 +64,21 @@ defines fuzzer targets, much like how we define unittests. For example:
|
||||
|
||||
# Running LibFuzzer
|
||||
|
||||
Your test's object file and **all** of its dependencies must be compiled
|
||||
with the "fuzzer" sanitizer, plus a set of sanitizers which might
|
||||
produce interesting runtime errors like AUBSAN. Evergreen has a build
|
||||
variant, whose name will include the string "FUZZER", which will compile
|
||||
and run all of the fuzzer tests.
|
||||
Your test's object file and **all** of its dependencies must be compiled with the "fuzzer"
|
||||
sanitizer, plus a set of sanitizers which might produce interesting runtime errors like AUBSAN.
|
||||
Evergreen has a build variant, whose name will include the string "FUZZER", which will compile and
|
||||
run all of the fuzzer tests.
|
||||
|
||||
The fuzzers can be built locally, for development and debugging. Check
|
||||
our Evergreen configuration for the current bazel arguments.
|
||||
The fuzzers can be built locally, for development and debugging. Check our Evergreen configuration
|
||||
for the current bazel arguments.
|
||||
|
||||
LibFuzzer binaries will accept a path to a directory containing its
|
||||
"corpus". A corpus is a list of examples known to produce interesting
|
||||
outputs. LibFuzzer will start producing interesting results more quickly
|
||||
if starts off with a set of inputs which it can begin mutating. When its
|
||||
done, it will write down any new inputs it discovered into its corpus.
|
||||
Re-using a corpus across executions is a good way to make LibFuzzer
|
||||
return more results in less time. Our Evergreen tasks will try to
|
||||
acquire and re-use a corpus from an earlier commit, if it can.
|
||||
LibFuzzer binaries will accept a path to a directory containing its "corpus". A corpus is a list of
|
||||
examples known to produce interesting outputs. LibFuzzer will start producing interesting results
|
||||
more quickly if it starts off with a set of inputs which it can begin mutating. When it's done, it will
|
||||
write down any new inputs it discovered into its corpus. Re-using a corpus across executions is a
|
||||
good way to make LibFuzzer return more results in less time. Our Evergreen tasks will try to acquire
|
||||
and re-use a corpus from an earlier commit, if it can.
|
||||
|
||||
# References
|
||||
|
||||
- [LibFuzzer's official
|
||||
documentation](https://llvm.org/docs/LibFuzzer.html)
|
||||
- [LibFuzzer's official documentation](https://llvm.org/docs/LibFuzzer.html)
|
||||
|
||||
@ -60,9 +60,8 @@ Ex: `bash buildscripts/yamllinters.sh`
|
||||
|
||||
## Python Linters
|
||||
|
||||
The `bazel run lint` command runs all Python linters as well as several other linters in our code base. You can
|
||||
run auto-remediations via:
|
||||
`bazel run lint --fix`.
|
||||
The `bazel run lint` command runs all Python linters as well as several other linters in our code
|
||||
base. You can run auto-remediations via: `bazel run lint --fix`.
|
||||
|
||||
Ex: `bazel run lint`
|
||||
|
||||
|
||||
@ -1,18 +1,18 @@
|
||||
# Proxy protocol support
|
||||
|
||||
`mongod` and `mongos` have built-in support for connections made via L4 load balancers using
|
||||
the [proxy protocol][proxy-protocol-url] header. Placing `mongos` or `mongod` behind load balancers
|
||||
`mongod` and `mongos` have built-in support for connections made via L4 load balancers using the
|
||||
[proxy protocol][proxy-protocol-url] header. Placing `mongos` or `mongod` behind load balancers
|
||||
requires proper configuration of the load balancers, `mongos`, and `mongod`.
|
||||
|
||||
# Configuring mongod
|
||||
|
||||
To use `mongod` with an L4 load balancer (or reverse proxy) it _must_ be configured with the
|
||||
`proxyPort` config option whose value can be specified at program start in any of the ways
|
||||
mentioned in the server config documentation. This config option opens a new port to which the
|
||||
L4 load balancer _must_ connect.
|
||||
`proxyPort` config option whose value can be specified at program start in any of the ways mentioned
|
||||
in the server config documentation. This config option opens a new port to which the L4 load
|
||||
balancer _must_ connect.
|
||||
|
||||
The L4 load balancer (or reverse proxy) _must_ emit a [proxy protocol][proxy-protocol-url] header
|
||||
at the start of its connection stream. `mongod` supports both version 1 and version 2 of the proxy
|
||||
The L4 load balancer (or reverse proxy) _must_ emit a [proxy protocol][proxy-protocol-url] header at
|
||||
the start of its connection stream. `mongod` supports both version 1 and version 2 of the proxy
|
||||
standard.
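
For readers unfamiliar with the header, the sketch below parses a version 1 (text) proxy protocol line of the form `PROXY TCP4 <src> <dst> <srcport> <dstport>\r\n`. It is a simplified illustration and not the server's parser: `ProxyV1Header` and `parseProxyV1` are hypothetical names, the `PROXY UNKNOWN` form and the binary version 2 header are not handled, and addresses are not validated.

```cpp
#include <cassert>
#include <optional>
#include <sstream>
#include <string>

// Hypothetical holder for the fields of a version 1 (text) header.
struct ProxyV1Header {
    std::string family;  // "TCP4" or "TCP6"
    std::string srcAddr;
    std::string dstAddr;
    int srcPort = 0;
    int dstPort = 0;
};

// Parses a line such as "PROXY TCP4 192.0.2.1 192.0.2.2 56324 27017\r\n".
// Returns std::nullopt when the line does not match this simplified grammar.
inline std::optional<ProxyV1Header> parseProxyV1(const std::string& line) {
    // The header must end with CRLF.
    if (line.size() < 2 || line.compare(line.size() - 2, 2, "\r\n") != 0) {
        return std::nullopt;
    }
    std::istringstream in(line.substr(0, line.size() - 2));
    std::string magic;
    ProxyV1Header h;
    if (!(in >> magic >> h.family >> h.srcAddr >> h.dstAddr >> h.srcPort >> h.dstPort)) {
        return std::nullopt;
    }
    if (magic != "PROXY" || (h.family != "TCP4" && h.family != "TCP6")) {
        return std::nullopt;
    }
    return h;
}
```

The source address and port carried in the header are what let the server see the original client endpoint instead of the load balancer's.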
|
||||
|
||||
# Reverse proxy vs load balancer
|
||||
@ -20,8 +20,8 @@ standard.
|
||||
Sharded clusters might be configured to work with either an L4 load balancer or a reverse proxy. In
|
||||
both cases the proxy or load balancer _must_ connect to the `mongos`'s load-balancer port.
|
||||
|
||||
Placing `mongos` behind a reverse proxy does not hide the list of `mongos`. The driver will choose
|
||||
a specific `mongos` to connect to via the reverse proxy.
|
||||
Placing `mongos` behind a reverse proxy does not hide the list of `mongos`. The driver will choose a
|
||||
specific `mongos` to connect to via the reverse proxy.
|
||||
|
||||
Placing `mongos` behind an L4 load balancer hides the list of `mongos`. The driver only sees the
|
||||
load balancer, and the connections it makes are routed by the load balancer to a `mongos`. There is
|
||||
@ -33,11 +33,18 @@ that connections from a driver are distributed among multiple `mongos`.
|
||||
When a sharded cluster is deployed with a reverse proxy, there are two conditions that must be
|
||||
fulfilled:
|
||||
|
||||
- `mongos` must be configured with the [MongoDB Server Parameter](https://docs.mongodb.com/manual/reference/parameters/) `loadBalancerPort` whose value can be specified at program start in any of the ways mentioned in the server parameter documentation.
|
||||
This option causes `mongos` to open a second port. All connections made from reverse proxy _must_ be made over this port, and no regular connections (without HAProxy protocol header) may be made over this port.
|
||||
- The reverse proxy _must_ be configured to emit a [proxy protocol][proxy-protocol-url] header
|
||||
at the [start of its connection stream](https://github.com/mongodb/mongo/commit/3a18d295d22b377cc7bc4c97bd3b6884d065bb85). `mongos` [supports](https://github.com/mongodb/mongo/commit/786482da93c3e5e58b1c690cb060f00c60864f69) both version 1 and version 2 of the proxy
|
||||
protocol standard.
|
||||
- `mongos` must be configured with the
|
||||
[MongoDB Server Parameter](https://docs.mongodb.com/manual/reference/parameters/)
|
||||
`loadBalancerPort` whose value can be specified at program start in any of the ways mentioned in
|
||||
the server parameter documentation. This option causes `mongos` to open a second port. All
|
||||
connections made from reverse proxy _must_ be made over this port, and no regular connections
|
||||
(without HAProxy protocol header) may be made over this port.
|
||||
- The reverse proxy _must_ be configured to emit a [proxy protocol][proxy-protocol-url] header at
|
||||
the
|
||||
[start of its connection stream](https://github.com/mongodb/mongo/commit/3a18d295d22b377cc7bc4c97bd3b6884d065bb85).
|
||||
`mongos`
|
||||
[supports](https://github.com/mongodb/mongo/commit/786482da93c3e5e58b1c690cb060f00c60864f69) both
|
||||
version 1 and version 2 of the proxy protocol standard.
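As an illustration of what the reverse proxy has to send first, here is a minimal sketch of a version 1 (human-readable) proxy protocol header for a TCP/IPv4 connection, following the layout in the HAProxy specification linked above. The function name `makeProxyV1Header` and its parameters are hypothetical and are not part of `mongos` or any MongoDB API; a real proxy would also validate the addresses and support `TCP6`.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Builds a PROXY protocol version 1 header line for a TCP/IPv4 connection.
// Layout per the HAProxy spec: "PROXY TCP4 <src addr> <dst addr> <src port> <dst port>\r\n".
// Hypothetical helper for illustration only.
std::string makeProxyV1Header(const std::string& srcAddr,
                              const std::string& dstAddr,
                              std::uint16_t srcPort,
                              std::uint16_t dstPort) {
    // Addresses are passed through unchanged; this sketch does no validation beyond non-emptiness.
    assert(!srcAddr.empty() && !dstAddr.empty());
    return "PROXY TCP4 " + srcAddr + " " + dstAddr + " " + std::to_string(srcPort) + " " +
        std::to_string(dstPort) + "\r\n";
}
```

The proxy writes this line once, before any bytes of the client's own traffic, on the connection it opens to the `loadBalancerPort`.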
|
||||
|
||||
The driver does not require any configuration change compared to a cluster without a reverse proxy.
|
||||
|
||||
@ -46,22 +53,32 @@ The driver does not require any configuration change compared to a cluster witho
|
||||
When a sharded cluster is deployed with an L4 load balancer there are three conditions that must be
|
||||
fulfilled:
|
||||
|
||||
- `mongos` must be configured with the [MongoDB Server Parameter](https://docs.mongodb.com/manual/reference/parameters/) `loadBalancerPort` whose value can be specified at program start in any of the ways mentioned in the server parameter documentation.
|
||||
This option causes `mongos` to open a second port. All connections made from load
|
||||
balancers _must_ be made over this port, and no regular connections (without HAProxy protocol header) may be made over this port.
|
||||
- The L4 load balancer _must_ be configured to emit a [proxy protocol][proxy-protocol-url] header
|
||||
at the [start of its connection stream](https://github.com/mongodb/mongo/commit/3a18d295d22b377cc7bc4c97bd3b6884d065bb85). `mongos` [supports](https://github.com/mongodb/mongo/commit/786482da93c3e5e58b1c690cb060f00c60864f69) both version 1 and version 2 of the proxy
|
||||
protocol standard.
|
||||
- Clients (drivers or shells) connecting to a `mongos` through the load balancer must set the `loadBalanced` option,
|
||||
e.g., when connecting to a local `mongos` instance through the load balancer, if the `loadBalancerPort` server parameter was set to 20100, the
|
||||
connection string must be of the form `"mongodb://localhost:20100/?loadBalanced=true"`.
|
||||
- `mongos` must be configured with the
|
||||
[MongoDB Server Parameter](https://docs.mongodb.com/manual/reference/parameters/)
|
||||
`loadBalancerPort` whose value can be specified at program start in any of the ways mentioned in
|
||||
the server parameter documentation. This option causes `mongos` to open a second port. All
|
||||
connections made from load balancers _must_ be made over this port, and no regular connections
|
||||
(without HAProxy protocol header) may be made over this port.
|
||||
- The L4 load balancer _must_ be configured to emit a [proxy protocol][proxy-protocol-url] header at
|
||||
the
|
||||
[start of its connection stream](https://github.com/mongodb/mongo/commit/3a18d295d22b377cc7bc4c97bd3b6884d065bb85).
|
||||
`mongos`
|
||||
[supports](https://github.com/mongodb/mongo/commit/786482da93c3e5e58b1c690cb060f00c60864f69) both
|
||||
version 1 and version 2 of the proxy protocol standard.
|
||||
- Clients (drivers or shells) connecting to a `mongos` through the load balancer must set the
|
||||
`loadBalanced` option, e.g., when connecting to a local `mongos` instance through the load
|
||||
balancer, if the `loadBalancerPort` server parameter was set to 20100, the connection string must
|
||||
be of the form `"mongodb://localhost:20100/?loadBalanced=true"`.
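The connection string requirement in the last condition can be sketched as a small helper. This is only an illustration, assuming a single host and no other URI options; `makeLoadBalancedUri` is a hypothetical name, not a driver or server function.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Builds a connection string that targets the load balancer port and sets loadBalanced=true.
// Hypothetical helper: real deployments usually carry more URI options (TLS, auth, and so on).
std::string makeLoadBalancedUri(const std::string& host, std::uint16_t loadBalancerPort) {
    assert(!host.empty());
    return "mongodb://" + host + ":" + std::to_string(loadBalancerPort) + "/?loadBalanced=true";
}
```

With `host` set to `localhost` and the port set to 20100, the helper yields the form quoted above.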
|
||||
|
||||
There are some subtle behavioral differences that the load balancer options enable, chief of
|
||||
which is how `mongos` deals with open cursors on client disconnection. Over a normal connection,
|
||||
`mongos` will keep open cursors alive for a short while after client disconnection in case the
|
||||
client reconnects and continues to request more from the given cursor. Since client reconnections
|
||||
aren't expected behind a load balancer (as the load balancer will likely redirect a given client
|
||||
to a different `mongos` instance upon reconnection), we eagerly [close cursors](https://github.com/mongodb/mongo/commit/b429d5dda98bbe18ab0851ffd1729d3b57fc8a4e) on load balanced
|
||||
client disconnects. We also [abort any in-progress transactions](https://github.com/mongodb/mongo/commit/74628ed4e314dfe0fd69d3fbae1411981a869f6b) that were initiated by the load balanced client.
|
||||
There are some subtle behavioral differences that the load balancer options enable, chief of which
|
||||
is how `mongos` deals with open cursors on client disconnection. Over a normal connection, `mongos`
|
||||
will keep open cursors alive for a short while after client disconnection in case the client
|
||||
reconnects and continues to request more from the given cursor. Since client reconnections aren't
|
||||
expected behind a load balancer (as the load balancer will likely redirect a given client to a
|
||||
different `mongos` instance upon reconnection), we eagerly
|
||||
[close cursors](https://github.com/mongodb/mongo/commit/b429d5dda98bbe18ab0851ffd1729d3b57fc8a4e) on
|
||||
load balanced client disconnects. We also
|
||||
[abort any in-progress transactions](https://github.com/mongodb/mongo/commit/74628ed4e314dfe0fd69d3fbae1411981a869f6b)
|
||||
that were initiated by the load balanced client.
|
||||
|
||||
[proxy-protocol-url]: https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt
|
||||
|
||||
371
docs/logging.md
@ -1,9 +1,9 @@
|
||||
# Log System Overview
|
||||
|
||||
The new log system adds capability to produce structured logs in the [Relaxed
|
||||
Extended JSON 2.0.0][relaxed_json_2] format. The new API requires names to be
|
||||
given to variables, forming field names for the variables in structured JSON
|
||||
logs. Named variables are called attributes in the log system.
|
||||
The new log system adds capability to produce structured logs in the [Relaxed Extended JSON
|
||||
2.0.0][relaxed_json_2] format. The new API requires names to be given to variables, forming field
|
||||
names for the variables in structured JSON logs. Named variables are called attributes in the log
|
||||
system.
|
||||
|
||||
# Style guide
|
||||
|
||||
@ -13,43 +13,38 @@ Log lines are composed primarily of a message (`msg`) and attributes (`attr` fie
|
||||
|
||||
## Philosophy
|
||||
|
||||
As you write log messages, keep the following in mind: A big thing that makes
|
||||
JSON and BSON useful as data formats is the ability to provide rich field names.
|
||||
As you write log messages, keep the following in mind: A big thing that makes JSON and BSON useful
|
||||
as data formats is the ability to provide rich field names.
|
||||
|
||||
What makes logv2 machine readable is that we write an intact Extended BSON
|
||||
format.
|
||||
What makes logv2 machine readable is that we write an intact Extended BSON format.
|
||||
|
||||
But, what makes these lines human readable is that the `msg` provides a simple,
|
||||
clear context for interpreting well-formed field names and values in the `attr`
|
||||
subdocument.
|
||||
But, what makes these lines human readable is that the `msg` provides a simple, clear context for
|
||||
interpreting well-formed field names and values in the `attr` subdocument.
|
||||
|
||||
## Specific Guidance
|
||||
|
||||
For maximum readability, a log message additionally has the least amount of
|
||||
repetition possible, and shares attribute names with other related log lines.
|
||||
For maximum readability, a log message additionally has the least amount of repetition possible, and
|
||||
shares attribute names with other related log lines.
|
||||
|
||||
### Message (the msg field)
|
||||
|
||||
The `msg` field predicates a reader's interpretation of the log line. It should
|
||||
be crafted with care and attention.
|
||||
The `msg` field predicates a reader's interpretation of the log line. It should be crafted with care
|
||||
and attention.
|
||||
|
||||
- Concisely describe what the log line is reporting, providing enough
|
||||
context necessary for interpreting attribute field names and values
|
||||
- Concisely describe what the log line is reporting, providing enough context necessary for
|
||||
interpreting attribute field names and values
|
||||
- Capitalize the first letter, as in a sentence
|
||||
- Avoid unnecessary punctuation, but punctuate between sentences if using
|
||||
multiple sentences
|
||||
- Avoid unnecessary punctuation, but punctuate between sentences if using multiple sentences
|
||||
- Do not conclude with punctuation
|
||||
- You may occasionally encounter `msg` strings containing fmt-style
|
||||
`{expr}` braces. These are legacy artifacts and should be rephrased
|
||||
according to these guidelines.
|
||||
- You may occasionally encounter `msg` strings containing fmt-style `{expr}` braces. These are
|
||||
legacy artifacts and should be rephrased according to these guidelines.
|
||||
|
||||
### Attributes (fields in the attr subdocument)
|
||||
|
||||
The `attr` subdocument includes important metrics/statistics about the logged
|
||||
event for the purposes of debugging or performance analysis. These variables
|
||||
should be named very well, as though intended for a very human-readable portion
|
||||
of the codebase (like config variable declaration, abstract class definitions,
|
||||
etc.)
|
||||
The `attr` subdocument includes important metrics/statistics about the logged event for the purposes
|
||||
of debugging or performance analysis. These variables should be named very well, as though intended
|
||||
for a very human-readable portion of the codebase (like config variable declaration, abstract class
|
||||
definitions, etc.)
|
||||
|
||||
For `attr` field names, do the following:
|
||||
|
||||
@ -57,40 +52,38 @@ For `attr` field names, do the following:
|
||||
|
||||
The bar for understanding should be:
|
||||
|
||||
- Someone with reasonable understanding of mongod behavior should understand
|
||||
immediately what is being logged
|
||||
- Someone with reasonable troubleshooting skill should be able to extract doc-
|
||||
or code-searchable phrases to learn about what is being logged
|
||||
- Someone with reasonable understanding of mongod behavior should understand immediately what is
|
||||
being logged
|
||||
- Someone with reasonable troubleshooting skill should be able to extract doc- or code-searchable
|
||||
phrases to learn about what is being logged
|
||||
|
||||
#### Precisely describe values and units
|
||||
|
||||
Exception: Do not add a unit suffix when logging a Duration type. The system
|
||||
automatically adds this unit.
|
||||
Exception: Do not add a unit suffix when logging a Duration type. The system automatically adds this
|
||||
unit.
|
||||
|
||||
#### When providing an execution time attribute, ensure it is named "durationMillis"
|
||||
|
||||
To describe the execution time of an operation using our preferred method:
|
||||
Specify an `attr` name of “duration” and provide a value using the Milliseconds
|
||||
Duration type. The log system will automatically append "Millis" to the
|
||||
attribute name.
|
||||
To describe the execution time of an operation using our preferred method: Specify an `attr` name of
|
||||
“duration” and provide a value using the Milliseconds Duration type. The log system will
|
||||
automatically append "Millis" to the attribute name.
|
||||
|
||||
Alternatively, specify an `attr` name of “durationMillis” and provide the
|
||||
number of milliseconds as an integer type.
|
||||
Alternatively, specify an `attr` name of “durationMillis” and provide the number of milliseconds as
|
||||
an integer type.
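The naming convention can be sketched outside the log system as follows. This is not the logv2 implementation; it only mimics the described outcome for a millisecond duration, and `durationField` is a hypothetical helper that uses `std::chrono` in place of the server's Duration types.

```cpp
#include <cassert>
#include <chrono>
#include <string>
#include <utility>

// Mimics the described behavior: the unit suffix is appended to the attribute name and the
// value becomes a plain count. Hypothetical helper, not part of the log system.
std::pair<std::string, long long> durationField(const std::string& attrName,
                                                std::chrono::milliseconds value) {
    assert(!attrName.empty());
    return {attrName + "Millis", static_cast<long long>(value.count())};
}
```

Either route described above (a Duration named "duration", or an integer named "durationMillis") should therefore lead to the same field name in the structured output.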
|
||||
|
||||
**Importantly**: downstream analysis tools will rely on this convention, as a
|
||||
replacement for the "[0-9]+ms$" format of prior logs.
|
||||
**Importantly**: downstream analysis tools will rely on this convention, as a replacement for the
|
||||
"[0-9]+ms$" format of prior logs.
|
||||
|
||||
#### Use certain specific terms whenever possible
|
||||
|
||||
When logging the below information, do so with these specific terms:
|
||||
|
||||
- **namespace** - when logging a value of the form
|
||||
"\<db name\>.\<collection name\>". Do not use "collection" or abbreviate to "ns"
|
||||
- **namespace** - when logging a value of the form "\<db name\>.\<collection name\>". Do not use
|
||||
"collection" or abbreviate to "ns"
|
||||
- **db** - instead of "database"
|
||||
- **error** - when an error occurs, instead of "status". Use this for objects
|
||||
of type Status and DBException
|
||||
- **reason** - to provide rationale for an event/action when "error" isn't
|
||||
appropriate
|
||||
- **error** - when an error occurs, instead of "status". Use this for objects of type Status and
|
||||
DBException
|
||||
- **reason** - to provide rationale for an event/action when "error" isn't appropriate
|
||||
|
||||
### Examples
|
||||
|
||||
@ -122,11 +115,10 @@ The log system is made available with the following header:
|
||||
|
||||
#include "mongo/logv2/log.h"
|
||||
|
||||
The macro `MONGO_LOGV2_DEFAULT_COMPONENT` is expanded by all logging macros.
|
||||
This configuration macro must expand at their point of use to a `LogComponent`
|
||||
expression, which is implicitly attached to the emitted message. It is
|
||||
conventionally defined near the top of a `.cpp` file after headers are included,
|
||||
and before any logging macros are invoked. Example:
|
||||
The macro `MONGO_LOGV2_DEFAULT_COMPONENT` is expanded by all logging macros. This configuration
|
||||
macro must expand at their point of use to a `LogComponent` expression, which is implicitly attached
|
||||
to the emitted message. It is conventionally defined near the top of a `.cpp` file after headers are
|
||||
included, and before any logging macros are invoked. Example:
|
||||
|
||||
#define MONGO_LOGV2_DEFAULT_COMPONENT ::mongo::logv2::LogComponent::kDefault
|
||||
|
||||
@ -138,22 +130,19 @@ Logging is performed using function style macros:
|
||||
...,
|
||||
"nameN"_attr = varN);
|
||||
|
||||
The ID is a signed 32-bit integer in the same number space as the error code
|
||||
numbers. It is used to uniquely identify a log statement. If changing existing
|
||||
code, using a new ID is strongly advised to avoid any parsing ambiguity. When
|
||||
selecting ID during work on JIRA ticket `SERVER-ABCDE` you can use the JIRA
|
||||
ticket number to avoid ID collisions with other engineers by taking ID from the
|
||||
range `ABCDE00` - `ABCDE99`.
|
||||
The ID is a signed 32-bit integer in the same number space as the error code numbers. It is used to
|
||||
uniquely identify a log statement. If changing existing code, using a new ID is strongly advised to
|
||||
avoid any parsing ambiguity. When selecting ID during work on JIRA ticket `SERVER-ABCDE` you can use
|
||||
the JIRA ticket number to avoid ID collisions with other engineers by taking ID from the range
|
||||
`ABCDE00` - `ABCDE99`.
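The ID convention amounts to simple arithmetic on the ticket number. The sketch below is a hypothetical helper (`logIdRangeForTicket` does not exist in the code base) that computes the inclusive range of IDs a ticket reserves under this convention.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Returns the inclusive [first, last] range of log IDs conventionally reserved for a ticket.
// `ticketNumber` is the numeric part of SERVER-ABCDE. Hypothetical helper for illustration.
constexpr std::pair<std::int32_t, std::int32_t> logIdRangeForTicket(std::int32_t ticketNumber) {
    return {ticketNumber * 100, ticketNumber * 100 + 99};
}

// A five-digit ticket number stays well inside the signed 32-bit ID space.
static_assert(logIdRangeForTicket(99999).second == 9999999);
```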
|
||||
|
||||
Attributes are created with the `_attr` user-defined literal. The intermediate
|
||||
object that gets instantiated provides the assignment operator `=` for
|
||||
assigning a value to the attribute.
|
||||
Attributes are created with the `_attr` user-defined literal. The intermediate object that gets
|
||||
instantiated provides the assignment operator `=` for assigning a value to the attribute.
|
||||
|
||||
The message string must be a compile time constant.
|
||||
This is to avoid dynamic attribute names in the log output and to be able to
|
||||
add compile time verification of log statements in the future. If the string
|
||||
needs to be shared with anything else (like constructing a Status object) you
|
||||
can use this pattern:
|
||||
The message string must be a compile time constant. This is to avoid dynamic attribute names in the
|
||||
log output and to be able to add compile time verification of log statements in the future. If the
|
||||
string needs to be shared with anything else (like constructing a Status object) you can use this
|
||||
pattern:
|
||||
|
||||
static constexpr char str[] = "the string";
|
||||
|
||||
@ -172,13 +161,12 @@ can use this pattern:
|
||||
|
||||
### Log Component
|
||||
|
||||
To override the default component, a separate logging API can be used that
|
||||
takes a `LogOptions` structure:
|
||||
To override the default component, a separate logging API can be used that takes a `LogOptions`
|
||||
structure:
|
||||
|
||||
LOGV2_OPTIONS(options, message-string, attr0, ...);
|
||||
|
||||
`LogOptions` can be constructed with a `LogComponent` to avoid verbosity in the
|
||||
log statement.
|
||||
`LogOptions` can be constructed with a `LogComponent` to avoid verbosity in the log statement.
|
||||
|
||||
##### Example
|
||||
|
||||
@ -186,9 +174,8 @@ log statement.
|
||||
|
||||
### Log Severity
|
||||
|
||||
`LOGV2` is the logging macro for the default informational (0) severity. To log
|
||||
to different severities there are separate logging macros to be used, they all
|
||||
take parameters like `LOGV2`:
|
||||
`LOGV2` is the logging macro for the default informational (0) severity. To log to different
|
||||
severities there are separate logging macros to be used, they all take parameters like `LOGV2`:
|
||||
|
||||
- `LOGV2_WARNING`
|
||||
- `LOGV2_ERROR`
|
||||
@ -202,18 +189,17 @@ There is also variations that take `LogOptions` if needed:
|
||||
- `LOGV2_ERROR_OPTIONS`
|
||||
- `LOGV2_FATAL_OPTIONS`
|
||||
|
||||
Fatal level log statements using `LOGV2_FATAL` perform `fassert` after logging,
|
||||
using the provided ID as assert id. `LOGV2_FATAL_NOTRACE` performs
|
||||
`fassertNoTrace` and `LOGV2_FATAL_CONTINUE` does not `fassert` allowing for
|
||||
continued execution. `LOGV2_FATAL_CONTINUE` is meant to be used when a fatal
|
||||
error has occurred but a different way of halting execution is desired such as
|
||||
`std::terminate` or `fassertFailedWithStatus`.
|
||||
Fatal level log statements using `LOGV2_FATAL` perform `fassert` after logging, using the provided
|
||||
ID as assert id. `LOGV2_FATAL_NOTRACE` performs `fassertNoTrace` and `LOGV2_FATAL_CONTINUE` does not
|
||||
`fassert` allowing for continued execution. `LOGV2_FATAL_CONTINUE` is meant to be used when a fatal
|
||||
error has occurred but a different way of halting execution is desired such as `std::terminate` or
|
||||
`fassertFailedWithStatus`.
|
||||
|
||||
`LOGV2_FATAL_OPTIONS` performs `fassert` by default like `LOGV2_FATAL` but this
|
||||
can be changed by setting the `FatalMode` on the `LogOptions`.
|
||||
`LOGV2_FATAL_OPTIONS` performs `fassert` by default like `LOGV2_FATAL` but this can be changed by
|
||||
setting the `FatalMode` on the `LogOptions`.
|
||||
|
||||
Debug-level logging is slightly different where an additional parameter (as
|
||||
integer) is required to indicate the desired debug level:
|
||||
Debug-level logging is slightly different where an additional parameter (as integer) is required to
|
||||
indicate the desired debug level:
|
||||
|
||||
LOGV2_DEBUG(ID, debug-level, message-string, attr0, ...);
|
||||
|
||||
@ -224,17 +210,15 @@ integer) required to indicate the desired debug level:
|
||||
message-string,
|
||||
attr0, ...);
|
||||
|
||||
`LOGV2_PROD_ONLY` logs like a default `LOGV2` log in production, but debug-1 log
|
||||
in internal testing. It accepts the same arguments as `LOGV2`. This log level is
|
||||
for log lines that may be spammy in testing but are more rare in production. As
|
||||
such, they may be useful in investigations. This level also preserves backwards
|
||||
compatibility for logs that are no longer as useful as when they were introduced.
|
||||
To determine whether to log, this macro uses the `LogSeverity::ProdOnly()`
|
||||
level, which returns level `LogSeverity::Debug(1)` when in a testing environment
|
||||
and `LogSeverity::Log()` otherwise. Whether the server is in a testing
|
||||
environment is determined using the `enableTestCommands` server parameter.
|
||||
It is preferred to use other macros over this one as it introduces a difference
|
||||
between testing and production. There is also the `LOGV2_PROD_ONLY_OPTIONS`
|
||||
`LOGV2_PROD_ONLY` logs like a default `LOGV2` log in production, but debug-1 log in internal
|
||||
testing. It accepts the same arguments as `LOGV2`. This log level is for log lines that may be
|
||||
spammy in testing but are more rare in production. As such, they may be useful in investigations.
|
||||
This level also preserves backwards compatibility for logs that are no longer as useful as when they
|
||||
were introduced. To determine whether to log, this macro uses the `LogSeverity::ProdOnly()` level,
|
||||
which returns level `LogSeverity::Debug(1)` when in a testing environment and `LogSeverity::Log()`
|
||||
otherwise. Whether the server is in a testing environment is determined using the
|
||||
`enableTestCommands` server parameter. It is preferred to use other macros over this one as it
|
||||
introduces a difference between testing and production. There is also the `LOGV2_PROD_ONLY_OPTIONS`
|
||||
variation that takes `LogOptions`.
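The severity selection described for `LOGV2_PROD_ONLY` can be pictured with a toy model. The enum and function below are assumptions made for illustration and do not reproduce `LogSeverity`; `testCommandsEnabled` stands in for the `enableTestCommands` server parameter.

```cpp
#include <cassert>

// Toy severities: just the two levels relevant to the description above.
enum class Severity { kLog, kDebug1 };

// Mirrors the described behavior of LogSeverity::ProdOnly(): debug-1 in a testing environment,
// the default informational level otherwise. Hypothetical stand-in, not the real API.
constexpr Severity prodOnlySeverity(bool testCommandsEnabled) {
    return testCommandsEnabled ? Severity::kDebug1 : Severity::kLog;
}

static_assert(prodOnlySeverity(false) == Severity::kLog);
```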
|
||||
|
||||
##### Example
|
||||
@ -248,15 +232,13 @@ variation that takes `LogOptions`.
|
||||
|
||||
### Log Tags
|
||||
|
||||
Log tags are replacing the Tee from the old log system as the way to indicate
|
||||
that the log should also be written to a `RamLog` (accessible with the `getLog`
|
||||
command).
|
||||
Log tags are replacing the Tee from the old log system as the way to indicate that the log should
|
||||
also be written to a `RamLog` (accessible with the `getLog` command).
|
||||
|
||||
Tags are added to a log statement with the options API similarly to how
|
||||
non-default components are specified by constructing a `LogOptions`.
|
||||
Tags are added to a log statement with the options API similarly to how non-default components are
|
||||
specified by constructing a `LogOptions`.
|
||||
|
||||
Multiple tags can be attached to a log statement using the bitwise or operator
|
||||
`|`.
|
||||
Multiple tags can be attached to a log statement using the bitwise or operator `|`.
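For readers unfamiliar with combining flags through `|`, the following standalone sketch shows the general technique with a scoped enum. The tag names `kFirst` and `kSecond` and the helper `hasTag` are placeholders chosen for this illustration; they are not the log system's actual tags.

```cpp
#include <cassert>
#include <cstdint>

// Placeholder tags, one bit each. The real tag set lives in the log system.
enum class Tag : std::uint32_t { kNone = 0, kFirst = 1u << 0, kSecond = 1u << 1 };

// Combines two tag sets with bitwise or.
constexpr Tag operator|(Tag lhs, Tag rhs) {
    return static_cast<Tag>(static_cast<std::uint32_t>(lhs) | static_cast<std::uint32_t>(rhs));
}

// Reports whether `set` contains any bit of `tag`.
constexpr bool hasTag(Tag set, Tag tag) {
    return (static_cast<std::uint32_t>(set) & static_cast<std::uint32_t>(tag)) != 0;
}
```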
|
||||
|
||||
##### Example
|
||||
|
||||
@ -267,19 +249,18 @@ Multiple tags can be attached to a log statement using the bitwise or operator
|
||||
|
||||
### Dynamic attributes
|
||||
|
||||
Sometimes there is a need to add attributes depending on runtime conditionals.
|
||||
To support this there is the `DynamicAttributes` class that has an `add` method
|
||||
to add named attributes one by one. This class is meant to be used when you
|
||||
have this specific requirement and is not the general logging API.
|
||||
Sometimes there is a need to add attributes depending on runtime conditionals. To support this there
|
||||
is the `DynamicAttributes` class that has an `add` method to add named attributes one by one. This
|
||||
class is meant to be used when you have this specific requirement and is not the general logging
|
||||
API.
|
||||
|
||||
When finished, it is logged using the regular logging API but the
|
||||
`DynamicAttributes` instance is passed as the first attribute parameter. Mixing
|
||||
`_attr` literals with the `DynamicAttributes` is not supported.
|
||||
When finished, it is logged using the regular logging API but the `DynamicAttributes` instance is
|
||||
passed as the first attribute parameter. Mixing `_attr` literals with the `DynamicAttributes` is not
|
||||
supported.
|
||||
|
||||
When using the `DynamicAttributes` you need to be careful about parameter
|
||||
lifetimes. The `DynamicAttributes` binds attributes _by reference_ and the
|
||||
reference must be valid when passing the `DynamicAttributes` to the log
|
||||
statement.
|
||||
When using the `DynamicAttributes` you need to be careful about parameter lifetimes. The
|
||||
`DynamicAttributes` binds attributes _by reference_ and the reference must be valid when passing the
|
||||
`DynamicAttributes` to the log statement.
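The consequence of binding by reference can be shown with a toy container that is unrelated to the real `DynamicAttributes` class. `RefAttributes` below is a hypothetical stand-in that stores a pointer to each added value, so the value is read when it is consumed, not when it is added.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Toy attribute bag that binds values by reference (stored as pointers), loosely modelled on
// the lifetime behavior described above. Hypothetical class, not the logv2 API.
class RefAttributes {
public:
    // `value` must outlive this object's last use; passing a temporary would leave a
    // dangling pointer behind.
    void add(std::string name, const int& value) {
        _attrs.emplace_back(std::move(name), &value);
    }

    // Reads the bound value at call time, the way a log statement would at logging time.
    int get(const std::string& name) const {
        for (const auto& [attrName, ptr] : _attrs) {
            if (attrName == name) {
                return *ptr;
            }
        }
        assert(false && "unknown attribute");
        return 0;
    }

private:
    std::vector<std::pair<std::string, const int*>> _attrs;
};
```

Because only the address is kept, a later change to the original variable is visible through the bag, and a variable that has gone out of scope is not safe to read.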
|
||||
|
||||
##### Example
|
||||
|
||||
@ -321,11 +302,11 @@ Many basic types have built in support:
|
||||
|
||||
### User-defined types
|
||||
|
||||
To make a user-defined type loggable it needs a serialization member function
|
||||
that the log system can bind to.
|
||||
To make a user-defined type loggable it needs a serialization member function that the log system
|
||||
can bind to.
|
||||
|
||||
The system binds and uses serialization functions by looking for functions in
|
||||
the following priority order:
|
||||
The system binds and uses serialization functions by looking for functions in the following priority
|
||||
order:
|
||||
|
||||
- Structured serialization functions
|
||||
- `void x.serialize(BSONObjBuilder*) const` (member)
|
||||
@ -338,19 +319,18 @@ the following priority order:
|
||||
- `x.toString() ` (member)
|
||||
- `toString(x)` (non-member)
|
||||
|
||||
Enums cannot have member functions, but they will still try to bind to the
|
||||
`toStringForLogging(e)` or `toString(e)` non-members. If neither is available,
|
||||
the enum value will be logged as its underlying integral type.
|
||||
Enums cannot have member functions, but they will still try to bind to the `toStringForLogging(e)`
|
||||
or `toString(e)` non-members. If neither is available, the enum value will be logged as its
|
||||
underlying integral type.
|
||||
|
||||
In order to offer structured serialization and output, a type would need to
|
||||
supply a structured serialization function. Otherwise, if only stringification
|
||||
is provided, the output will be an escaped string.
|
||||
In order to offer structured serialization and output, a type would need to supply a structured
|
||||
serialization function. Otherwise, if only stringification is provided, the output will be an
|
||||
escaped string.
|
||||
|
||||
The `toStringForLogging` non-member is an ADL customization hook used to
|
||||
override `toString` for very rare cases where `toString` is inappropriate for
|
||||
logging perhaps because it's needed for other non-logging formatting. Usually a
|
||||
`toString` (member or nonmember) is a sufficient customization point and should
|
||||
be preferred as a canonical stringification of the object.
|
||||
The `toStringForLogging` non-member is an ADL customization hook used to override `toString` for
|
||||
very rare cases where `toString` is inappropriate for logging perhaps because it's needed for other
|
||||
non-logging formatting. Usually a `toString` (member or nonmember) is a sufficient customization
|
||||
point and should be preferred as a canonical stringification of the object.
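The stringification part of this lookup can be approximated with the detection idiom. The sketch below is an assumption about how such a priority order could be written, not the log system's code: it covers only `toStringForLogging(x)`, `x.toString()` and `toString(x)`, and every name in namespace `sketch`, as well as the sample types, is hypothetical.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

namespace sketch {

// Detects a non-member toStringForLogging(x) reachable through ADL.
template <typename T, typename = void>
struct HasToStringForLogging : std::false_type {};
template <typename T>
struct HasToStringForLogging<T,
                             std::void_t<decltype(toStringForLogging(std::declval<const T&>()))>>
    : std::true_type {};

// Detects a member x.toString().
template <typename T, typename = void>
struct HasMemberToString : std::false_type {};
template <typename T>
struct HasMemberToString<T, std::void_t<decltype(std::declval<const T&>().toString())>>
    : std::true_type {};

// Picks the first available hook: toStringForLogging(x), then x.toString(), then toString(x).
template <typename T>
std::string stringify(const T& x) {
    if constexpr (HasToStringForLogging<T>::value) {
        return toStringForLogging(x);
    } else if constexpr (HasMemberToString<T>::value) {
        return x.toString();
    } else {
        return toString(x);
    }
}

}  // namespace sketch

// Sample types exercising each branch.
struct Plain {
    std::string toString() const { return "plain"; }
};
struct Custom {
    std::string toString() const { return "display form"; }
};
inline std::string toStringForLogging(const Custom&) { return "logging form"; }
enum class Color { kRed };
inline std::string toString(Color) { return "red"; }
```

`Custom` shows the override case: its `toString` member still exists for other callers, while logging picks up the dedicated hook.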
|
||||
|
||||
_NOTE: No `operator<<` overload is used even if available_
|
||||
|
||||
@ -370,20 +350,19 @@ _NOTE: No `operator<<` overload is used even if available_
|
||||
|
||||
### Container support
|
||||
|
||||
STL containers and data structures that have STL like interfaces are loggable
|
||||
as long as they contain loggable elements (built-in, user-defined or other
|
||||
containers).
|
||||
STL containers and data structures that have STL like interfaces are loggable as long as they
|
||||
contain loggable elements (built-in, user-defined or other containers).
|
||||
|
||||
#### Sequential containers
|
||||
|
||||
Sequential containers like `std::vector`, `std::deque` and `std::list` are
|
||||
loggable and the elements get formatted as JSON array in structured output.
|
||||
Sequential containers like `std::vector`, `std::deque` and `std::list` are loggable and the elements
|
||||
get formatted as JSON array in structured output.
|
||||
|
||||
#### Associative containers
|
||||
|
||||
Associative containers such as `std::map` and `stdx::unordered_map` are loggable
|
||||
with the requirement that the key is of a string type. The structured format
|
||||
is a JSON object where the field names are the key.
|
||||
Associative containers such as `std::map` and `stdx::unordered_map` are loggable with the requirement
|
||||
that the key is of a string type. The structured format is a JSON object where the field names are
|
||||
the key.
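To make the two output shapes concrete, here is a small standalone sketch that formats a sequence as a JSON array and a string-keyed map as a JSON object. It is not how the log system serializes containers (that goes through BSON); the function names are hypothetical, values are limited to `int`, and keys are assumed not to need escaping.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Formats a sequence as a JSON array, e.g. [1,2,3].
std::string toJsonArray(const std::vector<int>& values) {
    std::ostringstream out;
    out << '[';
    for (std::size_t i = 0; i < values.size(); ++i) {
        if (i != 0) {
            out << ',';
        }
        out << values[i];
    }
    out << ']';
    return out.str();
}

// Formats a string-keyed map as a JSON object whose field names are the keys.
// Keys are written verbatim, so this sketch assumes they contain no quotes or backslashes.
std::string toJsonObject(const std::map<std::string, int>& values) {
    std::ostringstream out;
    out << '{';
    bool first = true;
    for (const auto& [key, value] : values) {
        if (!first) {
            out << ',';
        }
        first = false;
        out << '"' << key << "\":" << value;
    }
    out << '}';
    return out.str();
}
```

The string-key requirement follows from the object form: a JSON field name has to be a string.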
|
||||
|
||||
#### Ranges
|
||||
|
||||
@ -392,11 +371,10 @@ Ranges is loggable via helpers to indicate what type of range it is
|
||||
- `seqLog(begin, end)`
|
||||
- `mapLog(begin, end)`
|
||||
|
||||
seqLog indicates that it is a sequential range where the iterators point to
|
||||
a loggable value directly.
|
||||
seqLog indicates that it is a sequential range where the iterators point to a loggable value directly.
|
||||
|
||||
mapLog indicates that it is a range coming from an associative container where
|
||||
the iterators point to a key-value pair.
|
||||
mapLog indicates that it is a range coming from an associative container where the iterators point
|
||||
to a key-value pair.
|
||||
|
||||
##### Examples
|
||||
|
||||
@ -425,10 +403,9 @@ the iterators point to a key-value pair.
|
||||
|
||||
#### Containers and `uint64_t`
|
||||
|
||||
Logging of containers uses `BSONObj` as an internal representation and
|
||||
`uint64_t` is not a supported type with `BSONObjBuilder::append()`. As a user
|
||||
you can use `boost::transform_iterator` to cast the `uint64_t` to a supported
|
||||
type.
|
||||
Logging of containers uses `BSONObj` as an internal representation and `uint64_t` is not a supported
|
||||
type with `BSONObjBuilder::append()`. As a user you can use `boost::transform_iterator` to cast the
|
||||
`uint64_t` to a supported type.
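The idea of casting each element before logging can also be sketched with the standard library alone. The version below copies the values eagerly with `std::transform`, whereas `boost::transform_iterator` would apply the cast lazily; it assumes a signed 64-bit integer is an acceptable target type, and `toSignedForLogging` is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>
#include <vector>

// Converts each uint64_t to int64_t so the result holds a type the container logging accepts.
// Values above INT64_MAX do not survive the cast unchanged, so this suits counters and sizes
// that are known to stay in range. Hypothetical helper for illustration.
std::vector<std::int64_t> toSignedForLogging(const std::vector<std::uint64_t>& values) {
    std::vector<std::int64_t> out;
    out.reserve(values.size());
    std::transform(values.begin(), values.end(), std::back_inserter(out), [](std::uint64_t v) {
        return static_cast<std::int64_t>(v);
    });
    return out;
}
```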
|
||||
|
||||
##### Example
|
||||
|
||||
@ -448,17 +425,14 @@ type.
|
||||
|
||||
### Duration types
|
||||
|
||||
Duration types have special formatting to match existing practices in the
|
||||
server code base. Their resulting format depends on the context in which they are
|
||||
logged.
|
||||
Duration types have special formatting to match existing practices in the server code base. Their
|
||||
resulting format depends on the context in which they are logged.
|
||||
|
||||
When durations are formatted as JSON or BSON a unit suffix is added to the
|
||||
attribute name when building the field name. The value will be the count of the
|
||||
duration as a number.
|
||||
When durations are formatted as JSON or BSON a unit suffix is added to the attribute name when
|
||||
building the field name. The value will be the count of the duration as a number.
|
||||
|
||||
When logging containers with durations there is no attribute per duration
|
||||
instance that can have the suffix added. In this case durations are instead
|
||||
formatted as a BSON object.
|
||||
When logging containers with durations there is no attribute per duration instance that can have the
|
||||
suffix added. In this case durations are instead formatted as a BSON object.
|
||||
|
||||
##### Examples
|
||||
|
||||
@ -485,9 +459,9 @@ formatted as a BSON object.
|
||||
|
||||
# Attribute naming abstraction
|
||||
|
||||
The style guide contains recommendations for attribute naming in certain cases.
|
||||
To make abstraction of attribute naming possible a `logAttrs` function can be
|
||||
implemented as a friend function in a class with the following signature:
|
||||
The style guide contains recommendations for attribute naming in certain cases. To make abstraction
|
||||
of attribute naming possible a `logAttrs` function can be implemented as a friend function in a
|
||||
class with the following signature:
|
||||
|
||||
class AnyUserType {
|
||||
public:
|
||||
@ -505,15 +479,13 @@ implemented as a friend function in a class with the following signature:
|
||||
|
||||
## Multiple attributes
|
||||
|
||||
In some cases a loggable type might be composed as a hierarchy in the C++ type
|
||||
system which would lead to a very verbose structured log output as every level
|
||||
in the hierarchy needs a name when outputted as JSON. The attribute naming
|
||||
abstraction system can also be used to collapse such hierarchies. Instead of
|
||||
making a type loggable it can instead return one or more attributes from its
|
||||
In some cases a loggable type might be composed as a hierarchy in the C++ type system which would
|
||||
lead to a very verbose structured log output as every level in the hierarchy needs a name when
|
||||
outputted as JSON. The attribute naming abstraction system can also be used to collapse such
|
||||
hierarchies. Instead of making a type loggable it can instead return one or more attributes from its
|
||||
members by using `multipleAttrs` in `logAttrs` functions.
|
||||
|
||||
`multipleAttrs(...)` accepts attributes or instances of types with `logAttrs`
|
||||
functions implemented.
|
||||
`multipleAttrs(...)` accepts attributes or instances of types with `logAttrs` functions implemented.
|
||||
|
||||
##### Examples
|
||||
|
||||
@ -535,12 +507,11 @@ functions implemented.
|
||||
|
||||
## Handling temporary lifetime with multiple attributes
|
||||
|
||||
To avoid lifetime issues (log attributes bind their values by reference) it is
|
||||
recommended to **not** create attributes when using `multipleAttrs` unless
|
||||
attributes are created for members directly. If `logAttrs` or `""_attr=` is
|
||||
used inside a `logAttrs` function on the return of a function returning by
|
||||
value it will result in a dangling reference. The following example illustrates
|
||||
the problem:
|
||||
To avoid lifetime issues (log attributes bind their values by reference) it is recommended to
|
||||
**not** create attributes when using `multipleAttrs` unless attributes are created for members
|
||||
directly. If `logAttrs` or `""_attr=` is used inside a `logAttrs` function on the return of a
|
||||
function returning by value it will result in a dangling reference. The following example
|
||||
illustrates the problem:
|
||||
|
||||
class SomeSubType {
|
||||
public:
|
||||
@ -566,10 +537,9 @@ the problem:
|
||||
std::string name_;
|
||||
};
|
||||
|
||||
The better implementation would be to let the log system control the
|
||||
lifetime by passing the instance to `multipleAttrs` without creating the
|
||||
attribute. The log system will detect that it is not an attribute and will
|
||||
attempt to create attributes by calling `logAttrs`:
|
||||
The better implementation would be to let the log system control the lifetime by passing the
|
||||
instance to `multipleAttrs` without creating the attribute. The log system will detect that it is
|
||||
not an attribute and will attempt to create attributes by calling `logAttrs`:
|
||||
|
||||
friend auto logAttrs(const SomeType& type) {
|
||||
return logv2::multipleAttrs("name"_attr=type.name(), type.sub());
|
||||
@ -579,11 +549,10 @@ attempt to create attributes by calling `logAttrs`:
|
||||
|
||||
## Combining uassert with log statement
|
||||
|
||||
Code that emits a high severity log statement may also need to emit a `uassert`
|
||||
after the log. There is the `UserAssertAfterLog` logging option that allows you
|
||||
to re-use the log statement to do the formatting required for the `uassert`.
|
||||
The assertion id can be either the logging ID by passing `UserAssertAfterLog`
|
||||
with no arguments or the assertion id can be set by constructing
|
||||
Code that emits a high severity log statement may also need to emit a `uassert` after the log. There
|
||||
is the `UserAssertAfterLog` logging option that allows you to re-use the log statement to do the
|
||||
formatting required for the `uassert`. The assertion id can be either the logging ID by passing
|
||||
`UserAssertAfterLog` with no arguments or the assertion id can be set by constructing
|
||||
`UserAssertAfterLog` with an `ErrorCodes::Error`.
|
||||
|
||||
The assertion reason string will be a plain text log and can be provided with additional attribute
|
||||
@ -614,26 +583,23 @@ Would emit a `uassert` after performing the log that is equivalent to:
|
||||
|
||||
## Unstructured logging for local development
|
||||
|
||||
To make it easier to use the log system for tracing in local development, there
|
||||
is a special API that does not use IDs or attribute names:
|
||||
To make it easier to use the log system for tracing in local development, there is a special API
|
||||
that does not use IDs or attribute names:
|
||||
|
||||
logd(format-string, value0, ..., valueN);
|
||||
|
||||
It formats the string using libfmt similarly to what
|
||||
`fmt::format(format-string, value0, ..., valueN)` would produce but using the
|
||||
regular log system type support on how types are made loggable. The formatted
|
||||
string is logged as the `msg` field in the JSON output, with no `attr`
|
||||
subobject.
|
||||
`fmt::format(format-string, value0, ..., valueN)` would produce but using the regular log system
|
||||
type support on how types are made loggable. The formatted string is logged as the `msg` field in
|
||||
the JSON output, with no `attr` subobject.
|
||||
|
||||
When using `logd` the log will be emitted with standard severity and the default
|
||||
component.
|
||||
When using `logd` the log will be emitted with standard severity and the default component.
|
||||
|
||||
Unlike regular logging, `logd` is allowed to be used in header files
|
||||
by including `logv2/log_debug.h`.
|
||||
Unlike regular logging, `logd` is allowed to be used in header files by including
|
||||
`logv2/log_debug.h`.
|
||||
|
||||
Unstructured logging is not allowed to be used in code committed to master;
|
||||
there is a lint check to validate this. It is however allowed to be used in
|
||||
Evergreen patch builds.
|
||||
Unstructured logging is not allowed to be used in code committed to master; there is a lint check to
|
||||
validate this. It is however allowed to be used in Evergreen patch builds.
|
||||
|
||||
##### Examples
|
||||
|
||||
@ -642,8 +608,8 @@ Evergreen patch builds.
|
||||
|
||||
## Rate limiting
|
||||
|
||||
Rate limiting logs is useful to reduce the impact of logging on database throughput. At high
|
||||
rate and concurrency, logging can be expensive and reduce performance. Attention should be paid
|
||||
Rate limiting logs is useful to reduce the impact of logging on database throughput. At high rate
|
||||
and concurrency, logging can be expensive and reduce performance. Attention should be paid
|
||||
specifically to logs that can occur on every operation, whether they fail or succeed.
|
||||
|
||||
The rate limiting feature is implemented by `SeveritySuppressor` (see
|
||||
@ -653,8 +619,8 @@ severity; subsequent logs within that interval are emitted at a "quiet" severity
|
||||
level). This ensures logs are not always written unless the logging level is increased for the
|
||||
component.
|
||||
|
||||
`SeveritySuppressor` is typically used with `StaticImmortal` for static storage. The interval can
|
||||
be configured with a server parameter when constructing `SeveritySuppressor`.
|
||||
`SeveritySuppressor` is typically used with `StaticImmortal` for static storage. The interval can be
|
||||
configured with a server parameter when constructing `SeveritySuppressor`.
|
||||
|
||||
##### Example
|
||||
|
||||
@ -666,18 +632,17 @@ be configured with a server parameter when constructing SeveritySuppressor.
|
||||
"Slow network response send time",
|
||||
"elapsed"_attr = bob.obj());
|
||||
|
||||
In this example, the first log within each gSlowNetworkLogRate-second window is emitted at Info level;
|
||||
subsequent logs within that window are emitted at Debug(2), which requires increasing the component's
|
||||
log level to be visible.
|
||||
In this example, the first log within each gSlowNetworkLogRate-second window is emitted at Info
|
||||
level; subsequent logs within that window are emitted at Debug(2), which requires increasing the
|
||||
component's log level to be visible.
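
The windowing behaviour described above can be sketched in isolation. This is a simplified
stand-in, not the real `SeveritySuppressor`: the `IntervalSuppressor` class, its `next` method, and
the two-value `Severity` enum are assumptions made for this example, and the clock is passed in so
the behaviour can be checked deterministically.

```cpp
#include <cassert>
#include <chrono>
#include <mutex>

// Minimal severity set for the sketch; the real log system has more levels.
enum class Severity { Info, Debug2 };

// Hypothetical helper: the first call in each interval gets the normal
// severity, every other call in that interval gets the quiet severity.
class IntervalSuppressor {
public:
    using Clock = std::chrono::steady_clock;

    IntervalSuppressor(std::chrono::seconds interval, Severity normal, Severity quiet)
        : _interval(interval), _normal(normal), _quiet(quiet) {
        assert(interval.count() > 0);
    }

    Severity next(Clock::time_point now = Clock::now()) {
        std::lock_guard<std::mutex> lk(_mutex);
        if (!_started || now - _windowStart >= _interval) {
            // Start a new window and let this log through at normal severity.
            _started = true;
            _windowStart = now;
            return _normal;
        }
        return _quiet;
    }

private:
    std::mutex _mutex;
    std::chrono::seconds _interval;
    Severity _normal;
    Severity _quiet;
    bool _started = false;
    Clock::time_point _windowStart{};
};
```

With a 5-second interval, a call at time 0 returns `Severity::Info`, a call one second later
returns `Severity::Debug2`, and a call at the 5-second mark opens a new window and returns
`Severity::Info` again.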
|
||||
|
||||
For per-key rate limiting (e.g., one log per key per interval), use `KeyedSeveritySuppressor`
|
||||
instead.
|
||||
|
||||
# JSON output format
|
||||
|
||||
Produces structured logs of the [Relaxed Extended JSON 2.0.0][relaxed_json_2]
|
||||
format. Below is an example of a log statement in C++ and a pretty-printed JSON
|
||||
output:
|
||||
Produces structured logs of the [Relaxed Extended JSON 2.0.0][relaxed_json_2] format. Below is an
|
||||
example of a log statement in C++ and a pretty-printed JSON output:
|
||||
|
||||
C++ statement:
|
||||
|
||||
@ -717,5 +682,7 @@ Output:
|
||||
---
|
||||
|
||||
[relaxed_json_2]: https://github.com/mongodb/specifications/blob/master/source/extended-json.rst
|
||||
[_lastOplogEntryFetcherCallbackForStopTimestamp]: https://github.com/mongodb/mongo/blob/13caf3c499a22c2274bd533043eb7e06e6f8e8a4/src/mongo/db/repl/initial_syncer.cpp#L1500-L1512
|
||||
[_summarizeRollback]: https://github.com/mongodb/mongo/blob/13caf3c499a22c2274bd533043eb7e06e6f8e8a4/src/mongo/db/repl/rollback_impl.cpp#L1263-L1305
|
||||
[_lastOplogEntryFetcherCallbackForStopTimestamp]:
|
||||
https://github.com/mongodb/mongo/blob/13caf3c499a22c2274bd533043eb7e06e6f8e8a4/src/mongo/db/repl/initial_syncer.cpp#L1500-L1512
|
||||
[_summarizeRollback]:
|
||||
https://github.com/mongodb/mongo/blob/13caf3c499a22c2274bd533043eb7e06e6f8e8a4/src/mongo/db/repl/rollback_impl.cpp#L1263-L1305
|
||||
|
||||
@ -2,5 +2,5 @@
|
||||
|
||||
- Avoid using bare pointers for dynamically allocated objects. Prefer `std::unique_ptr`,
|
||||
`std::shared_ptr`, or another RAII class such as `BSONObj`.
|
||||
- If you assign the output of `new/malloc()` directly to a bare pointer you should document where
|
||||
it gets deleted/freed, who owns it along the way, and how exception safety is ensured.
|
||||
- If you assign the output of `new/malloc()` directly to a bare pointer you should document where it
|
||||
gets deleted/freed, who owns it along the way, and how exception safety is ensured.
|
||||
|
||||
@ -15,86 +15,87 @@ TODO
|
||||
## Why are we doing this?
|
||||
|
||||
Having a clear delineation between public and private APIs for each module will improve the
|
||||
maintainability and velocity of our codebase. Teams will have more freedom to evolve their
|
||||
internal implementation details without affecting consumers. Consumers will benefit from
|
||||
knowing what APIs are intended for their consumption.
|
||||
maintainability and velocity of our codebase. Teams will have more freedom to evolve their internal
|
||||
implementation details without affecting consumers. Consumers will benefit from knowing what APIs
|
||||
are intended for their consumption.
|
||||
|
||||
## Assigning files to modules
|
||||
|
||||
The file `modules_poc/modules.yaml` contains a list of modules, each containing
|
||||
a list of files. Each file must be contained in only one module. Note that
|
||||
module assignment is not required to map neatly to team ownership.
|
||||
The file `modules_poc/modules.yaml` contains a list of modules, each containing a list of files.
|
||||
Each file must be contained in only one module. Note that module assignment is not required to map
|
||||
neatly to team ownership.
|
||||
|
||||
In cases where multiple globs match a file, the current rule is that the
|
||||
longest glob wins. This is used as a simpler-to-implement version of
|
||||
most-specific glob wins, which we may switch to in the future.
|
||||
In cases where multiple globs match a file, the current rule is that the longest glob wins. This is
|
||||
used as a simpler-to-implement version of most-specific glob wins, which we may switch to in the
|
||||
future.
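
To make the "longest glob wins" rule concrete, here is a small sketch of the selection logic. It is
not the code used by the tooling: `GlobRule`, `pickModule`, and the module names in the usage note
are hypothetical, and matching is delegated to POSIX `fnmatch`, which may differ in detail from the
matcher the real tooling uses.

```cpp
#include <cassert>
#include <fnmatch.h>  // POSIX glob matching
#include <string>
#include <utility>
#include <vector>

// (glob pattern, module name). Both values are illustrative.
using GlobRule = std::pair<std::string, std::string>;

// Returns the module whose matching glob has the longest pattern text, or an
// empty string if no glob matches the path.
std::string pickModule(const std::vector<GlobRule>& rules, const std::string& path) {
    assert(!path.empty());
    const GlobRule* best = nullptr;
    for (const auto& rule : rules) {
        if (fnmatch(rule.first.c_str(), path.c_str(), 0) != 0) {
            continue;  // this glob does not match the file
        }
        if (!best || rule.first.size() > best->first.size()) {
            best = &rule;  // a longer pattern beats a shorter one
        }
    }
    return best ? best->second : std::string();
}
```

Given the rules `{"src/mongo/db/*", "core"}` and `{"src/mongo/db/repl/*", "replication"}`, the file
`src/mongo/db/repl/oplog.cpp` matches both globs and is assigned to `replication` because that
pattern is longer.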
|
||||
|
||||
## How do I mark API visibility?
|
||||
|
||||
This section will just describe the basic process. Later sections will cover the tooling
|
||||
available to help, along with caveats to be aware of.
|
||||
This section will just describe the basic process. Later sections will cover the tooling available
|
||||
to help, along with caveats to be aware of.
|
||||
|
||||
First read the documentation in [src/mongo/util/modules.h](https://github.com/mongodb/mongo/blob/master/src/mongo/util/modules.h)
|
||||
for the canonical list and description of visibility levels. As a brief overview of the main
|
||||
levels from least to most restrictive:
|
||||
First read the documentation in
|
||||
[src/mongo/util/modules.h](https://github.com/mongodb/mongo/blob/master/src/mongo/util/modules.h) for
|
||||
the canonical list and description of visibility levels. As a brief overview of the main levels from
|
||||
least to most restrictive:
|
||||
|
||||
- `OPEN`: This is available for usage _and inheritance_ from anywhere in the codebase
|
||||
- `PUBLIC`: This is available for usage from anywhere in the codebase. For types, subclasses may
|
||||
only be defined in the same module.
|
||||
- `NEEDS_REPLACEMENT` and `USE_REPLACEMENT(...)`: These are collectively considered
|
||||
"unfortunately public" and are available for use, but should be avoided
|
||||
- `NEEDS_REPLACEMENT` and `USE_REPLACEMENT(...)`: These are collectively considered "unfortunately
|
||||
public" and are available for use, but should be avoided
|
||||
- `PARENT_PRIVATE`: This is similar to `PRIVATE`, but allows usage from any file in the parent
|
||||
module, including other submodules
|
||||
- `PRIVATE`: This may only be used from the current module or one of its submodules
|
||||
- `FILE_PRIVATE`: This may only be used from the current "file family" (roughly, header \+ cpp
|
||||
\+ tests). It may not be used by other files, even from the same module.
|
||||
- `FILE_PRIVATE`: This may only be used from the current "file family" (roughly, header \+ cpp \+
|
||||
tests). It may not be used by other files, even from the same module.
|
||||
|
||||
You can think of public vs private similarly to how you would the sections of a `class`: they
|
||||
indicate whether something is intended to be part of the API or an implementation detail. The
|
||||
difference is that they apply at a wider granularity of code than a single class, with
|
||||
implementation details available to either the full module (and its submodules) for `PRIVATE`
|
||||
or the file family for `FILE_PRIVATE`.
|
||||
implementation details available to either the full module (and its submodules) for `PRIVATE` or the
|
||||
file family for `FILE_PRIVATE`.
|
||||
|
||||
The macros in that header file are attached to declarations and set the visibility level for
|
||||
that declaration and all of its "semantic children"[^1]. The macros are C++ attributes which
|
||||
means that they need to go in specific places that differ based on what is being marked (for
|
||||
templates, the location does not change and is always somewhere after the `template <...>` part):
|
||||
The macros in that header file are attached to declarations and set the visibility level for that
|
||||
declaration and all of its "semantic children"[^1]. The macros are C++ attributes which means that
|
||||
they need to go in specific places that differ based on what is being marked (for templates, the
|
||||
location does not change and is always somewhere after the `template <...>` part):
|
||||
|
||||
- `MONGO_MOD_PUBLIC;` by itself as the first line after includes in a header sets the default
|
||||
for that header (only `PUBLIC`, `PARENT_PRIVATE`, and `FILE_PRIVATE` are allowed here)
|
||||
- `namespace MONGO_MOD mongo {` (this does not work with nested namespaces in a single
|
||||
declaration like `namespace mongo::repl`)
|
||||
- `MONGO_MOD_PUBLIC;` by itself as the first line after includes in a header sets the default for
|
||||
that header (only `PUBLIC`, `PARENT_PRIVATE`, and `FILE_PRIVATE` are allowed here)
|
||||
- `namespace MONGO_MOD mongo {` (this does not work with nested namespaces in a single declaration
|
||||
like `namespace mongo::repl`)
|
||||
- `class MONGO_MOD Foo {` (Ditto for `enum`, `struct`, and `union`)
|
||||
- `MONGO_MOD void func(...);`
|
||||
- `MONGO_MOD int var;`
|
||||
- `concept isFooable MONGO_MOD {`
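
The placements listed above can be seen together in the following sketch. It defines
`MONGO_MOD_PUBLIC` as an empty stand-in so that it compiles without the server headers; real code
includes `"mongo/util/modules.h"` instead. The names `Foo`, `twice`, and `kAnswer` are invented for
the example, and marking both a namespace and its contents is redundant in practice since the
marking already applies to semantic children; it is done here only to show each position.

```cpp
#include <cassert>

// Stand-in definition so this sketch compiles on its own. Real code gets the
// macro from "mongo/util/modules.h", where it expands to an attribute.
#define MONGO_MOD_PUBLIC

namespace MONGO_MOD_PUBLIC mongo {  // namespace: after the `namespace` keyword

class MONGO_MOD_PUBLIC Foo {  // class (ditto enum, struct, union): after the keyword
public:
    explicit Foo(int v) : _v(v) {
        assert(v >= 0);
    }

    int value() const {
        return _v;
    }

private:
    int _v;
};

MONGO_MOD_PUBLIC inline int twice(int x) {  // function: at the start of the declaration
    return 2 * x;
}

MONGO_MOD_PUBLIC const int kAnswer = 42;  // variable: at the start of the declaration

}  // namespace mongo
```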
|
||||
|
||||
For the cases where it goes at the beginning of the line, if clang-format chooses an unfortunate
|
||||
place to break the line, it usually helps to undo the formatting then put the macro on its own
|
||||
line above the declaration.
|
||||
place to break the line, it usually helps to undo the formatting then put the macro on its own line
|
||||
above the declaration.
|
||||
|
||||
APIs are marked one header at a time, by including `"mongo/util/modules.h"` in the header.
|
||||
This causes the header to be treated as "modularized" which has the following effects:
|
||||
APIs are marked one header at a time, by including `"mongo/util/modules.h"` in the header. This
|
||||
causes the header to be treated as "modularized" which has the following effects:
|
||||
|
||||
- All declarations in that header (not transitive includes) default to `PRIVATE`, meaning that
|
||||
the public API is what must be marked.
|
||||
- Members in `private:` sections in classes default to `PRIVATE`, regardless of the visibility
|
||||
of the class. The only way the language would allow them to be used from outside of the module
|
||||
is if you have cross-module friendships, which should generally be avoided. If needed
|
||||
temporarily, favor `NEEDS_REPLACEMENT` over `PUBLIC` for these declarations.
|
||||
- Declarations ending in `_forTest` default to `FILE_PRIVATE` to support the common case where
|
||||
they are only intended for testing that class. If they are actually intended to support testing
|
||||
of consumers, not just the type they are defined on, they can be explicitly given `PUBLIC` or
|
||||
- All declarations in that header (not transitive includes) default to `PRIVATE`, meaning that the
|
||||
public API is what must be marked.
|
||||
- Members in `private:` sections in classes default to `PRIVATE`, regardless of the visibility of
|
||||
the class. The only way the language would allow them to be used from outside of the module is if
|
||||
you have cross-module friendships, which should generally be avoided. If needed temporarily, favor
|
||||
`NEEDS_REPLACEMENT` over `PUBLIC` for these declarations.
|
||||
- Declarations ending in `_forTest` default to `FILE_PRIVATE` to support the common case where they
|
||||
are only intended for testing that class. If they are actually intended to support testing of
|
||||
consumers, not just the type they are defined on, they can be explicitly given `PUBLIC` or
|
||||
`PRIVATE` visibility.
|
||||
- Internal and detail namespaces default to `PRIVATE` and cannot be made less restricted, but
|
||||
can still be marked as `FILE_PRIVATE`. Individual declarations within the namespace can be
|
||||
exposed as necessary, but they cannot be exposed in bulk without changing the name of the
|
||||
namespace to something that doesn't imply private.
|
||||
- Internal and detail namespaces default to `PRIVATE` and cannot be made less restricted, but can
|
||||
still be marked as `FILE_PRIVATE`. Individual declarations within the namespace can be exposed as
|
||||
necessary, but they cannot be exposed in bulk without changing the name of the namespace to
|
||||
something that doesn't imply private.
|
||||
|
||||
For internal headers of a module which do not contribute to its public API, simply including
|
||||
`modules.h` is sufficient. There is a [tool](#the-private-header-marker) to automate this
|
||||
process. You may additionally want to consider whether any APIs should be marked `FILE_PRIVATE`,
|
||||
but that is optional.
|
||||
`modules.h` is sufficient. There is a [tool](#the-private-header-marker) to automate this process.
|
||||
You may additionally want to consider whether any APIs should be marked `FILE_PRIVATE`, but that is
|
||||
optional.
|
||||
|
||||
For IDL files, you mark visibility of whole types (`struct`, `enum`, and `command`) with the
|
||||
`mod_visibility` option. The value should be the same as one of the `MONGO_MOD` macros, but
|
||||
@ -105,17 +106,17 @@ compelling use case for this.
|
||||
|
||||
## What tooling exists to help me?
|
||||
|
||||
Note that all tooling should be run from within a properly set-up python virtual environment.
|
||||
This includes running `buildscripts/poetry_sync.sh` to ensure you have the correct dependencies.
|
||||
Note that all tooling should be run from within a properly set-up python virtual environment. This
|
||||
includes running `buildscripts/poetry_sync.sh` to ensure you have the correct dependencies.
|
||||
|
||||
### The scanner and merger
|
||||
|
||||
The merger generates a cross reference of all first-party usages of first-party code and stores
|
||||
it in `merged_decls.json`, which is used by the rest of our tooling. It is also where we validate
|
||||
that there are no disallowed accesses. It will be invoked for you by the browser when you ask it
|
||||
to rescan, or you can also manually run it as `modules_poc/merge_decls.py`. If you are interested
|
||||
in analyzing that file, [`jq`](https://jqlang.org/) is a powerful tool, or you can just write
|
||||
some python.
|
||||
The merger generates a cross reference of all first-party usages of first-party code and stores it
|
||||
in `merged_decls.json`, which is used by the rest of our tooling. It is also where we validate that
|
||||
there are no disallowed accesses. It will be invoked for you by the browser when you ask it to
|
||||
rescan, or you can also manually run it as `modules_poc/merge_decls.py`. If you are interested in
|
||||
analyzing that file, [`jq`](https://jqlang.org/) is a powerful tool, or you can just write some
|
||||
python.
|
||||
|
||||
As a rather extreme example of what you can do with `jq`, here is how the progress reports are
|
||||
generated:
|
||||
@ -129,43 +130,43 @@ generated:
|
||||
jq 'map(., .mod = "TOTAL") | group_by(.mod)[] | group_by(.loc | split(":")[0]) | {mod: .[0].[0].mod, total: length, marked: map(select(any(.visibility == "UNKNOWN") | not)) | length} | .done = (1000 * .marked / .total | round) / 10 | "\(.mod): \(" " * (.mod | 40-length)) \(.done)% (\(.marked) / \(.total))"' -r merged_decls.json
|
||||
```
|
||||
|
||||
Internally, the merger will invoke `bazel build --config=mod-scanner //src/mongo/...`
|
||||
to run the scanner over the whole codebase (or the parts that have changed since the last scan),
|
||||
taking advantage of bazel remote execution to achieve very high levels of parallelism.
|
||||
Internally, the merger will invoke `bazel build --config=mod-scanner //src/mongo/...` to
|
||||
run the scanner over the whole codebase (or the parts that have changed since the last scan), taking
|
||||
advantage of bazel remote execution to achieve very high levels of parallelism.
|
||||
|
||||
### The browser
|
||||
|
||||
The main piece of tooling to run is the browser, which is launched by running
|
||||
`modules_poc/browse.py`. If you haven't scanned the codebase recently, it will offer to run it
|
||||
for you which will take a few minutes. After modifying the source code, you can rescan at any
|
||||
time by pressing `r`. It will only rescan files that have been modified or that transitively
|
||||
include modified headers.
|
||||
`modules_poc/browse.py`. If you haven't scanned the codebase recently, it will offer to run it for
|
||||
you which will take a few minutes. After modifying the source code, you can rescan at any time by
|
||||
pressing `r`. It will only rescan files that have been modified or that transitively include
|
||||
modified headers.
|
||||
|
||||
The browser is primarily intended to assist in labeling public APIs, so the files are sorted
|
||||
with the largest number of unlabeled declarations ("unknowns") first. You can search for a file
|
||||
by pressing `f` or press `m` to filter the files by module.
|
||||
The browser is primarily intended to assist in labeling public APIs, so the files are sorted with
|
||||
the largest number of unlabeled declarations ("unknowns") first. You can search for a file by pressing
|
||||
`f` or press `m` to filter the files by module.
|
||||
|
||||
The list of available key bindings is shown on the right. You can toggle that by pressing `?`.
|
||||
Other keybindings of note are that you can press `g` to go to the currently highlighted
|
||||
declaration or location in your editor (only when running in the vscode or nvim terminal),
|
||||
and `p` to toggle an inline preview of the location within the browser. You can press `Tab ↹`
|
||||
to toggle between the tree and the code preview. The mouse is fully supported for scrolling
|
||||
and expanding rows in the tree, and there are aliases for some basic vim keybinds (`hjkl/`).
|
||||
The list of available key bindings is shown on the right. You can toggle that by pressing `?`. Other
|
||||
keybindings of note are that you can press `g` to go to the currently highlighted declaration or
|
||||
location in your editor (only when running in the vscode or nvim terminal), and `p` to toggle an
|
||||
inline preview of the location within the browser. You can press `Tab ↹` to toggle between the tree
|
||||
and the code preview. The mouse is fully supported for scrolling and expanding rows in the tree, and
|
||||
there are aliases for some basic vim keybinds (`hjkl/`).
|
||||
|
||||
### The private header marker
|
||||
|
||||
Once you have scanned the codebase and produced a `merged_decls.json`,
|
||||
`modules_poc/private_headers.py` can be used to find all header and IDL files where there are
|
||||
no currently detected external usages and automatically mark them as fully private to the
|
||||
module. This does not necessarily mean that all automatically marked headers are intended to
|
||||
be private. A human should review to ensure that the marked headers match intent. You can pass
|
||||
flags to filter on any/all of module, owning team, or path glob. For headers matching the filter,
|
||||
the script will also warn of usages of `_forTest` external to the file family that may need to
|
||||
be marked `PRIVATE` to make them available to the whole module since they default to only being
|
||||
available to the file family for marked headers.
|
||||
`modules_poc/private_headers.py` can be used to find all header and IDL files where there are no
|
||||
currently detected external usages and automatically mark them as fully private to the module. This
|
||||
does not necessarily mean that all automatically marked headers are intended to be private. A human
|
||||
should review to ensure that the marked headers match intent. You can pass flags to filter on
|
||||
any/all of module, owning team, or path glob. For headers matching the filter, the script will also
|
||||
warn of usages of `_forTest` external to the file family that may need to be marked `PRIVATE` to
|
||||
make them available to the whole module since they default to only being available to the file
|
||||
family for marked headers.
|
||||
|
||||
Make sure to run `buildscripts/clang_format.py format-my` or `bazel run format` after using it
|
||||
to modify any C++ files.
|
||||
Make sure to run `buildscripts/clang_format.py format-my` or `bazel run format` after using it to
|
||||
modify any C++ files.
|
||||
|
||||
Example usage:
|
||||
|
||||
@ -178,13 +179,12 @@ Example usage:
|
||||
### The PR comment generator
|
||||
|
||||
You can run `modules_poc/mod_diff.py` to output a brief summary of all of the API (including
|
||||
visibility levels and usage counts) for each file modified in your branch. When putting up a PR
|
||||
to mark API visibility, you should add a comment with its output to the PR as an aid to
|
||||
reviewers. The output is intended to be close enough to C++ that you should put it in a
|
||||
` ```cpp ` block when making your PR comment to make it more readable. You can also
|
||||
pipe it through `bat -lcpp` to make it colorful locally. Note that it will use the last
|
||||
scan output, so if you've modified any headers, you should run a rescan prior to running this
|
||||
tool.
|
||||
visibility levels and usage counts) for each file modified in your branch. When putting up a PR to
|
||||
mark API visibility, you should add a comment with its output to the PR as an aid to reviewers. The
|
||||
output is intended to be close enough to C++ that you should put it in a ` ```cpp ` block when
|
||||
making your PR comment to make it more readable. You can also pipe it through `bat -lcpp` to make it
|
||||
colorful locally. Note that it will use the last scan output, so if you've modified any headers, you
|
||||
should run a rescan prior to running this tool.
|
||||
|
||||
## Workflow
|
||||
|
||||
@ -198,24 +198,23 @@ The general workflow for each PR will generally be the same:
|
||||
5. Run [the pr comment generator](#the-pr-comment-generator) to show the APIs that you have marked
|
||||
- Look through this to ensure that everything is as you expect.
|
||||
6. Put up a PR and include the generated comment in a ` ```cpp ` block
|
||||
- I suggest keeping PRs small (say, no more than 10 files at a time) so that they are
|
||||
manageable by reviewers. As an exception it seems reasonable to auto-mark many headers as
|
||||
private in a single PR, as long as those PRs are separate from those containing any manual
|
||||
marking.
|
||||
- I suggest keeping PRs small (say, no more than 10 files at a time) so that they are manageable
|
||||
by reviewers. As an exception it seems reasonable to auto-mark many headers as private in a
|
||||
single PR, as long as those PRs are separate from those containing any manual marking.
|
||||
|
||||
When first starting to mark a module, I suggest running the [`modules_poc/private_headers.py`](#the-private-header-marker)
|
||||
script with `--dry-run` (or `-n`) and `--module=YOUR_MODULE`. For larger modules (in particular,
|
||||
the `query` mega module) you may want to pass a `--glob` so that you can focus on a smaller
|
||||
subset of the code initially. That will give you an overview of the files that are used from
|
||||
outside your module (which contain de facto public APIs today) and those that do not (which can
|
||||
automatically be marked as private implementation details).
|
||||
When first starting to mark a module, I suggest running the
|
||||
[`modules_poc/private_headers.py`](#the-private-header-marker) script with `--dry-run` (or `-n`) and
|
||||
`--module=YOUR_MODULE`. For larger modules (in particular, the `query` mega module) you may want to
|
||||
pass a `--glob` so that you can focus on a smaller subset of the code initially. That will give you
|
||||
an overview of the files that are used from outside your module (which contain de facto public APIs
|
||||
today) and those that do not (which can automatically be marked as private implementation details).
|
||||
|
||||
If all of the de facto private headers seem like they should be private, you can remove the
|
||||
dry-run flag to have it automatically mark them as private. Be sure to validate that their
|
||||
contents are actually intended to be private. Remember that the point of having a human doing
|
||||
the marking is to ensure that we correctly capture intent. You can optionally mark implementation
|
||||
details within each header as `FILE_PRIVATE`, if you would like to prevent them from being used
|
||||
elsewhere even within the module.
|
||||
If all of the de facto private headers seem like they should be private, you can remove the dry-run
|
||||
flag to have it automatically mark them as private. Be sure to validate that their contents are
|
||||
actually intended to be private. Remember that the point of having a human doing the marking is to
|
||||
ensure that we correctly capture intent. You can optionally mark implementation details within each
|
||||
header as `FILE_PRIVATE`, if you would like to prevent them from being used elsewhere even within
|
||||
the module.
|
||||
|
||||
You can then open [the browser](#the-browser) (`modules_poc/browse.py`) to look at the remaining
|
||||
headers. It will show you what is used and from where. It will be particularly useful for things
|
||||
@ -229,137 +228,136 @@ that seem like they should be private, but are being used externally.
|
||||
`modules_poc/modules.yaml` to move them.
|
||||
2. If there is already a public API that callers should use instead, mark it as
|
||||
`USE_REPLACEMENT(better_api)`. The argument accepts any C++ tokens, but the intent is where
|
||||
possible to use the name of the replacement. This will generate a ticket for all teams using
|
||||
that code.
|
||||
possible to use the name of the replacement. This will generate a ticket for all teams using that
|
||||
code.
|
||||
1. If there are very few users, consider just cleaning them up.
|
||||
3. Reconsider making this API public if other modules need its functionality, and this is
|
||||
the only way to get it.
|
||||
4. Otherwise, if there is no public API that fulfills the needs of the callers, but you
|
||||
don't want the current API to remain public long-term, use `NEEDS_REPLACEMENT`. This will
|
||||
generate a ticket for the team that owns that code.
|
||||
1. If the API was "obviously" intended to be private (eg it is in a `details` namespace)
|
||||
and callers would be reasonably able to implement the functionality themselves, possibly
|
||||
by writing their own version, it seems acceptable to use
|
||||
3. Reconsider making this API public if other modules need its functionality, and this is the only
|
||||
way to get it.
|
||||
4. Otherwise, if there is no public API that fulfills the needs of the callers, but you don't want
|
||||
the current API to remain public long-term, use `NEEDS_REPLACEMENT`. This will generate a ticket
|
||||
for the team that owns that code.
|
||||
1. If the API was "obviously" intended to be private (eg it is in a `details` namespace) and
|
||||
callers would be reasonably able to implement the functionality themselves, possibly by
|
||||
writing their own version, it seems acceptable to use
|
||||
`USE_REPLACEMENT(do not use internal details)`
|
||||
|
||||
## Caveats and Limitations
|
||||
|
||||
**OVERARCHING GUIDELINE**: Always try to mark declarations correctly according to intent,
|
||||
even if it will not be enforced by the current tooling. This is both to provide the correct
|
||||
information to human readers, as well as to avoid issues if we improve the tooling in the
|
||||
future to eliminate these limitations.
|
||||
**OVERARCHING GUIDELINE**: Always try to mark declarations correctly according to intent, even if it
|
||||
will not be enforced by the current tooling. This is both to provide the correct information to
|
||||
human readers, as well as to avoid issues if we improve the tooling in the future to eliminate these
|
||||
limitations.
|
||||
|
||||
The rest of this section is fairly technical and probably not necessary for most readers unless
|
||||
they notice something "weird" going on and want to dive into why. Most of these limitations are
|
||||
more likely to affect the core modules since most of the rest of our code does not expose APIs
|
||||
via macros and templates or have APIs only consumed by templates, and those are where most of
|
||||
these issues come up.
|
||||
The rest of this section is fairly technical and probably not necessary for most readers unless they
|
||||
notice something "weird" going on and want to dive into why. Most of these limitations are more
|
||||
likely to affect the core modules since most of the rest of our code does not expose APIs via macros
|
||||
and templates or have APIs only consumed by templates, and those are where most of these issues come
|
||||
up.
|
||||
|
||||
- We do not track usages of namespaces at all, only the declarations within namespaces. When
|
||||
a namespace is marked with a visibility, it does not affect the visibility of the namespace
|
||||
itself (since it doesn't have one), it sets the default visibility for all declarations within
|
||||
**that namespace block**. Each time a namespace is reopened it is a separate block and the
|
||||
visibility markers on other blocks of the same namespace do not apply.
|
||||
- The scanner only knows about declarations that it sees being used. For implementation reasons,
|
||||
it only discovers declarations by seeing what every usage is using. This can either cause or be
|
||||
- We do not track usages of namespaces at all, only the declarations within namespaces. When a
|
||||
namespace is marked with a visibility, it does not affect the visibility of the namespace itself
|
||||
(since it doesn't have one), it sets the default visibility for all declarations within **that
|
||||
namespace block**. Each time a namespace is reopened it is a separate block and the visibility
|
||||
markers on other blocks of the same namespace do not apply.
|
||||
- The scanner only knows about declarations that it sees being used. For implementation reasons, it
|
||||
only discovers declarations by seeing what every usage is using. This can either cause or be
|
||||
caused by other limitations.
|
||||
- Usages in templates may not be seen. This is especially the case for "dependent types and
|
||||
values" which are things that are not known by the compiler before the template is instantiated.
|
||||
- This is a problem for functions where any arguments are dependent if it can't figure out
|
||||
which overload will be selected. It is even worse for free-functions called unqualified
|
||||
(`f(blah)` rather than `ns::f(blah)` or `x.f(blah)`) since due to ADL, overload resolution
|
||||
is _always_ delayed for them.
|
||||
- Everything that results from a macro expansion is treated as-if it was written at the point
|
||||
of expansion. This applies to both declarations and usages. If you have an API that should
|
||||
only be used via the defined macros, mark it as `MOD_PUBLIC_FOR_TECHNICAL_REASONS` to signal
|
||||
to readers that they should avoid direct usage, even if the tooling won't prevent it. We may
|
||||
improve this in the future.
|
||||
- Template variables are completely ignored due to some unfortunate clang bugs. Still, try
|
||||
to mark them correctly since we may change this in the future.
|
||||
- Usages in templates may not be seen. This is especially the case for "dependent types and values"
|
||||
which are things that are not known by the compiler before the template is instantiated.
|
||||
- This is a problem for functions where any arguments are dependent if it can't figure out which
|
||||
overload will be selected. It is even worse for free-functions called unqualified (`f(blah)`
|
||||
rather than `ns::f(blah)` or `x.f(blah)`) since due to ADL, overload resolution is _always_
|
||||
delayed for them.
|
||||
- Everything that results from a macro expansion is treated as-if it was written at the point of
|
||||
expansion. This applies to both declarations and usages. If you have an API that should only be
|
||||
used via the defined macros, mark it as `MOD_PUBLIC_FOR_TECHNICAL_REASONS` to signal to readers
|
||||
that they should avoid direct usage, even if the tooling won't prevent it. We may improve this in
|
||||
the future.
|
||||
- Template variables are completely ignored due to some unfortunate clang bugs. Still, try to mark
|
||||
them correctly since we may change this in the future.
|
||||
- Method calls are assigned to the static type at the call site. This has two important effects:
|
||||
- A subclass's overridden method may seem unused if it is only used via calls through a base
|
||||
class pointer/reference
|
||||
- Calls through a base class pointer/reference count as calls of that class's method, not of
|
||||
the interface's
|
||||
- Defaulted members (methods, ctors, dtors) are treated as usages of the class itself,
|
||||
regardless of whether they are implicitly or explicitly defaulted. This is because clang does not
|
||||
provide an API to distinguish between those cases.
|
||||
- Template normalization woes: we try really hard to report declarations as the template
|
||||
`foo<T>` rather than separate instantiations like `foo<int>`, `foo<string>`, etc, **unless**
|
||||
they are explicitly specialized, meaning that the instantiation has its own definition different
|
||||
from the main template. Unfortunately, clang does a bad job at this and we have a number of
|
||||
kludgy workarounds. The most important effects:
|
||||
- Explicit specializations of function and variable templates are ignored and always converted
|
||||
to the primary template.
|
||||
- A subclass's overridden method may seem unused if it is only used via calls through a base class
|
||||
pointer/reference
|
||||
- Calls through a base class pointer/reference count as calls of that class's method, not of the
|
||||
interface's
|
||||
- Defaulted members (methods, ctors, dtors) are treated as usages of the class itself, regardless of
|
||||
whether they are implicitly or explicitly defaulted. This is because clang does not provide an API to
|
||||
distinguish between those cases.
|
||||
- Template normalization woes: we try really hard to report declarations as the template `foo<T>`
|
||||
rather than separate instantiations like `foo<int>`, `foo<string>`, etc, **unless** they are
|
||||
explicitly specialized, meaning that the instantiation has its own definition different from the
|
||||
main template. Unfortunately, clang does a bad job at this and we have a number of kludgy
|
||||
workarounds. The most important effects:
|
||||
- Explicit specializations of function and variable templates are ignored and always converted to
|
||||
the primary template.
|
||||
- We do treat explicit specializations of types as separate (using the heuristic of having a
|
||||
separate location than the main template), because they can have a different shape and API than
|
||||
the main template. In general they should probably have the same visibility though, unless the
|
||||
instantiation is using a private type which should be unavailable to consumers anyway.
|
||||
- Clang assigns many locations to the site of explicit template instantiations and extern
|
||||
template declarations, even when there is a better location that it can see. Luckily these
|
||||
are fairly rare.
|
||||
- Clang assigns many locations to the site of explicit template instantiations and extern template
|
||||
declarations, even when there is a better location that it can see. Luckily these are fairly
|
||||
rare.
|
||||
- Sometimes clang reports the resolved destination of `using` declarations and type alias, but
|
||||
usually it reports the `using` declaration itself. A few notable cases (these are trends and
|
||||
may not be absolute\!)
|
||||
usually it reports the `using` declaration itself. A few notable cases (these are trends and may
|
||||
not be absolute\!)
|
||||
- `using Base::foo;` to expose a member of a base class is resolved as a usage of `Base::foo`
|
||||
rather than `Derived::foo`. This is especially notable when the `Base` class is intended to be
|
||||
a private implementation detail. You will need to mark all exposed methods as public.
|
||||
- `using Base::Base;` to pull in the base constructors is the opposite and is recorded as a
|
||||
usage of `Derived::Base(args)`, which is odd because such a declaration doesn't actually exist.
|
||||
rather than `Derived::foo`. This is especially notable when the `Base` class is intended to be a
|
||||
private implementation detail. You will need to mark all exposed methods as public.
|
||||
- `using Base::Base;` to pull in the base constructors is the opposite and is recorded as a usage
|
||||
of `Derived::Base(args)`, which is odd because such a declaration doesn't actually exist.
|
||||
- Internal/details namespaces (currently defined as matching the regex `(detail|internal)s?$`)
|
||||
implicitly have a default visibility of private if `modules.h` is included. It is not
|
||||
possible to give the namespace a public visibility, but you can restrict it further with
|
||||
`FILE_PRIVATE`. If you want declarations inside it to be usable from outside your module you
|
||||
must mark children of the namespace explicitly, or rename it to not use a name that implies
|
||||
that it is for internal usage only. A somewhat common case will be marking internal declarations
|
||||
that are only intended to be used via macros with `PUBLIC_FOR_TECHNICAL_REASONS`.
|
||||
- Be very careful with forward declarations. Try to avoid them wherever possible (unless there
|
||||
is a significant benefit). Especially avoid forward declaring anything from another module\!
|
||||
Where forward declarations must be used, make sure that they have the same visibility as the
|
||||
real definition. As an exception, if every TU that sees the forward declaration will also see
|
||||
the definition it is OK to omit marking the forward declaration. This may happen when they are
|
||||
both in the same header, or the forward declaration is in a private implementation detail header
|
||||
which is included by the defining header. Be aware of the implicit visibility marking which also
|
||||
applies to forward declaration, if they are the only declaration seen in the TU.
|
||||
- Never forward declare functions to avoid including a header. They are much more problematic
|
||||
than types, both in general in C++ and specifically for this tooling.
|
||||
- We try to use the definition location for types defined in headers, but the "canonical"
|
||||
location (clang's term for the first declaration seen in the current TU) for everything else.
|
||||
If the type is defined in a .cpp, we use the canonical location.
|
||||
`FILE_PRIVATE`. If you want declarations inside it to be usable from outside your module you must
|
||||
mark children of the namespace explicitly, or rename it to not use a name that implies that it is
|
||||
for internal usage only. A somewhat common case will be marking internal declarations that are
|
||||
only intended to be used via macros with `PUBLIC_FOR_TECHNICAL_REASONS`.
|
||||
- Be very careful with forward declarations. Try to avoid them wherever possible (unless there is a
|
||||
significant benefit). Especially avoid forward declaring anything from another module\! Where
|
||||
forward declarations must be used, make sure that they have the same visibility as the real
|
||||
definition. As an exception, if every TU that sees the forward declaration will also see the
|
||||
definition it is OK to omit marking the forward declaration. This may happen when they are both in
|
||||
the same header, or the forward declaration is in a private implementation detail header which is
|
||||
included by the defining header. Be aware of the implicit visibility marking which also applies to
|
||||
forward declaration, if they are the only declaration seen in the TU.
|
||||
- Never forward declare functions to avoid including a header. They are much more problematic than
|
||||
types, both in general in C++ and specifically for this tooling.
|
||||
- We try to use the definition location for types defined in headers, but the "canonical" location
|
||||
(clang's term for the first declaration seen in the current TU) for everything else. If the type
|
||||
is defined in a .cpp, we use the canonical location.
|
||||
- We only consider declarations in headers, never in .cpp files.
|
||||
- Be mindful of `_forTest` functions. They default to `FILE_PRIVATE` since they are typically
|
||||
intended only for use when testing the type they are defined on, not when testing consumers.
|
||||
In the cases where they _are_ intended as part of the API for testing consumers, you can
|
||||
explicitly mark them `PUBLIC` or `PRIVATE` depending on whether they should be usable from
|
||||
outside your module or not.
|
||||
- Things used implicitly (eg implicit conversion operators) are still counted as usages even
|
||||
if they are not specifically named at the call site
|
||||
- When merging information from multiple TUs, definitions always replace the metadata gathered
|
||||
from TUs that only saw a declaration.
|
||||
- Note that we aren't guaranteed to see every definition, in particular for functions that
|
||||
are not called from the TU that they are defined in. So this cannot be used to find places
|
||||
where we deleted the definition but forgot to delete the declaration (we wouldn't see them
|
||||
anyway, since we only track things that are used, and undefined things can't really be used,
|
||||
except trivially, without breaking the build).
|
||||
- `private` members of classes are implicitly `PRIVATE`, and must be explicitly marked otherwise
|
||||
if desired. They should probably never be made `PUBLIC` since that implies cross-module
|
||||
friendship. In the few places where we have that today, they have been made one of the flavors
|
||||
of unfortunately public: `NEEDS_REPLACEMENT` or `USE_INSTEAD`.
|
||||
- `public` members of `private` types do not inherit the implicit `PRIVATE` and follow the
|
||||
normal rule of looking for their nearest semantic parent with an explicit marker. That means
|
||||
that they may be `PUBLIC`. However, the language rules still apply and as long as an
|
||||
instance of the type is never handed to consumers they will have no way of accessing those
|
||||
members.
|
||||
intended only for use when testing the type they are defined on, not when testing consumers. In
|
||||
the cases where they _are_ intended as part of the API for testing consumers, you can explicitly
|
||||
mark them `PUBLIC` or `PRIVATE` depending on whether they should be usable from outside your
|
||||
module or not.
|
||||
- Things used implicitly (eg implicit conversion operators) are still counted as usages even if they
|
||||
are not specifically named at the call site
|
||||
- When merging information from multiple TUs, definitions always replace the metadata gathered from
|
||||
TUs that only saw a declaration.
|
||||
- Note that we aren't guaranteed to see every definition, in particular for functions that are not
|
||||
called from the TU that they are defined in. So this cannot be used to find places where we
|
||||
deleted the definition but forgot to delete the declaration (we wouldn't see them anyway, since
|
||||
we only track things that are used, and undefined things can't really be used, except trivially,
|
||||
without breaking the build).
|
||||
- `private` members of classes are implicitly `PRIVATE`, and must be explicitly marked otherwise if
|
||||
desired. They should probably never be made `PUBLIC` since that implies cross-module friendship.
|
||||
In the few places where we have that today, they have been made one of the flavors of
|
||||
unfortunately public: `NEEDS_REPLACEMENT` or `USE_INSTEAD`.
|
||||
- `public` members of `private` types do not inherit the implicit `PRIVATE` and follow the normal
|
||||
rule of looking for their nearest semantic parent with an explicit marker. That means that they
|
||||
may be `PUBLIC`. However, the language rules still apply and as long as an instance of the type
|
||||
is never handed to consumers they will have no way of accessing those members.
|
||||
- `protected` members do not default to `PRIVATE`, but because we only allow subclassing from
|
||||
`OPEN` classes, the language visibility rules will disallow access from outside the module
|
||||
unless you choose to allow it by using `OPEN` classes or `friend`s. Note that making any
|
||||
subclass `OPEN` exposes all `protected` members of parents unless they are marked `PRIVATE`.
|
||||
- `friend` declarations are mostly ignored, except when they are a definition. So the
|
||||
definitions using the "hidden friend" pattern are tracked, but we ignore it if the definition
|
||||
is in a cpp file.
|
||||
unless you choose to allow it by using `OPEN` classes or `friend`s. Note that making any subclass
|
||||
`OPEN` exposes all `protected` members of parents unless they are marked `PRIVATE`.
|
||||
- `friend` declarations are mostly ignored, except when they are a definition. So the definitions
|
||||
using the "hidden friend" pattern are tracked, but we ignore it if the definition is in a cpp
|
||||
file.
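To make the internal/details namespace caveat in the list above concrete, here is a small Python sketch that applies the documented regex `(detail|internal)s?$` to a few namespace names. The helper name `is_internal_namespace` is hypothetical, and applying the regex with `re.search` to the bare namespace name is an assumption; the real scanner may match against a different string.

```python
import re

# Regex quoted in the caveat above. How the real scanner applies it is an assumption here.
INTERNAL_NAMESPACE_RE = re.compile(r"(detail|internal)s?$")


def is_internal_namespace(name: str) -> bool:
    """Hypothetical helper: True if the name ends in detail(s) or internal(s)."""
    return INTERNAL_NAMESPACE_RE.search(name) is not None


# Names that end in the suffix match; names that merely contain it do not.
for name in ["detail", "details", "internal", "query_internals", "detailed", "internal_api"]:
    print(f"{name}: {is_internal_namespace(name)}")
```

Under these assumptions, a namespace such as `detailed` or `internal_api` would not pick up the implicit private default, because the regex is anchored at the end of the name.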
|
||||
|
||||
[^1]:
|
||||
Clang distinguishes between "semantic" and "lexical" parents. The primary differences
|
||||
are that members of classes (including member types) are semantic children of the class even
|
||||
when defined out of line, and conversely `friend` declarations are not, and instead are
|
||||
considered semantic children of the nearest namespace.
|
||||
Clang distinguishes between "semantic" and "lexical" parents. The primary differences are that
|
||||
members of classes (including member types) are semantic children of the class even when defined
|
||||
out of line, and conversely `friend` declarations are not, and instead are considered semantic
|
||||
children of the nearest namespace.
|
||||
|
||||
@ -2,15 +2,20 @@
|
||||
|
||||
## ALLOWED_UNOWNED_FILES.yml File Format
|
||||
|
||||
This file is for repos that require all files be owned. Some files may be listed here as an exception and will be added to the end of the CODEOWNERS.
|
||||
This file is for repos that require all files be owned. Some files may be listed here as an
|
||||
exception and will be added to the end of the CODEOWNERS.
|
||||
|
||||
`version` is the current version of the `ALLOWED_UNOWNED_FILES.yml` file format. The only version is `1.0.0`.
|
||||
`version` is the current version of the `ALLOWED_UNOWNED_FILES.yml` file format. The only version is
|
||||
`1.0.0`.
|
||||
|
||||
`filters` are a list of filters that each have a `filter` and `justification` field.
|
||||
|
||||
`filter` is a file path. This file path must start with a `/` and is relative to the root repo directory. Directories or globs are not supported at the moment to ensure careful selection of files allowed to be unowned. This can be reconsidered if proper usecases appear.
|
||||
`filter` is a file path. This file path must start with a `/` and is relative to the root repo
|
||||
directory. Directories or globs are not supported at the moment to ensure careful selection of files
|
||||
allowed to be unowned. This can be reconsidered if proper usecases appear.
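The constraints on `filter` can be sketched as a small check. This is only an illustration of the rules stated above (leading `/`, no directories, no globs), not the validation that `bazel_rules_mongo` actually performs; the function name `check_filter_path` and the set of characters treated as glob syntax are assumptions.

```python
# Characters treated as glob syntax in this sketch (an assumption, not the tool's definition).
GLOB_CHARS = set("*?[]")


def check_filter_path(path: str) -> list[str]:
    """Hypothetical helper: return a list of problems with a `filter` entry."""
    problems = []
    if not path.startswith("/"):
        problems.append("must start with '/'")
    if path.endswith("/"):
        problems.append("directories are not supported")
    if GLOB_CHARS & set(path):
        problems.append("globs are not supported")
    return problems


# An empty list means the path satisfies the rules sketched here.
print(check_filter_path("/docs/generated_file.md"))
print(check_filter_path("docs/*.md"))
```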
|
||||
|
||||
`justification` is the reason why this file should be unowned. A common case is that this is a generated file that has checks in CI to ensure it is in the correct format.
|
||||
`justification` is the reason why this file should be unowned. A common case is that this is a
|
||||
generated file that has checks in CI to ensure it is in the correct format.
|
||||
|
||||
### Example file
|
||||
|
||||
@ -23,7 +28,8 @@ filters: # List of all filters
|
||||
|
||||
### Configuration
|
||||
|
||||
This can be configured in any repo with `bazel_rules_mongo` by putting the following lines in your `.bazelrc` file:
|
||||
This can be configured in any repo with `bazel_rules_mongo` by putting the following lines in your
|
||||
`.bazelrc` file:
|
||||
|
||||
```
|
||||
common --define codeowners_have_allowed_unowned_files=True
|
||||
|
||||
@ -15,7 +15,8 @@ Banned owners should be separated by newlines. Empty lines and lines starting wi
|
||||
|
||||
### Configuration
|
||||
|
||||
This can be configured in any repo with `bazel_rules_mongo` by putting the following lines in your `.bazelrc` file:
|
||||
This can be configured in any repo with `bazel_rules_mongo` by putting the following lines in your
|
||||
`.bazelrc` file:
|
||||
|
||||
```
|
||||
common --define codeowners_have_banned_codeowners=True
|
||||
|
||||
@ -1,23 +1,40 @@
|
||||
# Code Owners
|
||||
|
||||
After modifying any OWNERS files, the overall ownership database (`.github/CODEOWNERS`) must be rebuilt.
|
||||
This is done by running `bazel run codeowners`.
|
||||
After modifying any OWNERS files, the overall ownership database (`.github/CODEOWNERS`) must be
|
||||
rebuilt. This is done by running `bazel run codeowners`.
|
||||
|
||||
## OWNERS.yml File Format
|
||||
|
||||
This is loosely based on [kubernetes](https://www.kubernetes.dev/docs/guide/owners/) and [chromium](https://chromium.googlesource.com/chromium/src/+/HEAD/docs/code_reviews.md) OWNERS files.
|
||||
This is loosely based on [kubernetes](https://www.kubernetes.dev/docs/guide/owners/) and
|
||||
[chromium](https://chromium.googlesource.com/chromium/src/+/HEAD/docs/code_reviews.md) OWNERS files.
|
||||
|
||||
`version` is the current version of the `OWNERS.yml` file format. The latest version is `2.0.0`. For previous versions, see the [changelog](#owners-changelog).
|
||||
`version` is the current version of the `OWNERS.yml` file format. The latest version is `2.0.0`. For
|
||||
previous versions, see the [changelog](#owners-changelog).
|
||||
|
||||
`aliases` point to yaml files that list aliases that can be used in this OWNERS.yml file.
|
||||
|
||||
`filters` are a list of globs that match [gitignore syntax](https://git-scm.com/docs/gitignore#_pattern_format). The filter must match at least one file and be unique to the file. Each filter must have a list of `approvers`. An approval from any single approver will allow the code to be merged. `NOOWNER` can be specified to mark a filter as unowned. Each filter can optionally have a `metadata` tag. Inside that tag a user can put whatever tags they want. We have reserved two meaningful tags `emeritus_approvers` and `owning_team`. This is not an exhaustive list and more documented and undocumented options can be added later. There is no linting done on the metadata tag.
|
||||
`filters` are a list of globs that match
|
||||
[gitignore syntax](https://git-scm.com/docs/gitignore#_pattern_format). The filter must match at
|
||||
least one file and be unique to the file. Each filter must have a list of `approvers`. An approval
|
||||
from any single approver will allow the code to be merged. `NOOWNER` can be specified to mark a
|
||||
filter as unowned. Each filter can optionally have a `metadata` tag. Inside that tag a user can put
|
||||
whatever tags they want. We have reserved two meaningful tags `emeritus_approvers` and
|
||||
`owning_team`. This is not an exhaustive list and more documented and undocumented options can be
|
||||
added later. There is no linting done on the metadata tag.
|
||||
|
||||
`emeritus_approvers` are folks that used to be approvers that no longer have approver privileges. This allows us to keep track of folks who built up a knowledge base of this code that might need to be consulted in a critical situation. Both `approvers` and `emeritus_approvers` should be either github usernames, emails, or aliases.
|
||||
`emeritus_approvers` are folks that used to be approvers that no longer have approver privileges.
|
||||
This allows us to keep track of folks who built up a knowledge base of this code that might need to
|
||||
be consulted in a critical situation. Both `approvers` and `emeritus_approvers` should be either
|
||||
github usernames, emails, or aliases.
|
||||
|
||||
`owning_team` is a team that owns the files, however this team does not have approval privileges. Instead this team should be looked to for asking questions. This metadata can also be used programmatically to, for example, generate a report of all the files owned by a particular team, even though that team has nominated specific engineers as approvers.
|
||||
`owning_team` is a team that owns the files, however this team does not have approval privileges.
|
||||
Instead this team should be looked to for asking questions. This metadata can also be used
|
||||
programmatically to, for example, generate a report of all the files owned by a particular team,
|
||||
even though that team has nominated specific engineers as approvers.
|
||||
|
||||
`options` are not required and are various options about how to use this OWNERS.yml file. Currently there is only a single option `no_parent_owners` which is defaulted to false. If this option is set to true it will stop upwards OWNERS resolution.
|
||||
`options` are not required and are various options about how to use this OWNERS.yml file. Currently
|
||||
there is only a single option `no_parent_owners` which is defaulted to false. If this option is set
|
||||
to true it will stop upwards OWNERS resolution.
|
||||
|
||||
### Example file
|
||||
|
||||
@ -70,7 +87,8 @@ options: # All options for this file
|
||||
|
||||
`version` is the current version of the aliases file format. This should always be `1.0.0`.
|
||||
|
||||
`aliases` are a list of group names. Each group name must have one or more reviewers. Reviewers should be github usernames.
|
||||
`aliases` are a list of group names. Each group name must have one or more reviewers. Reviewers
|
||||
should be github usernames.
|
||||
|
||||
## Example File
|
||||
|
||||
@ -133,18 +151,26 @@ filters:
|
||||
|
||||
### Example 1
|
||||
|
||||
If someone changes `a/b/c/file.py` the owner resolution will select teamC since the first file searched is `a/b/c/OWNERS.yml`. First we compare if `file.py` matches `*.md`. It does not so we now check if `file.py` matches `*`. It does match so teamC is selected for review.
|
||||
If someone changes `a/b/c/file.py` the owner resolution will select teamC since the first file
|
||||
searched is `a/b/c/OWNERS.yml`. First we compare if `file.py` matches `*.md`. It does not so we now
|
||||
check if `file.py` matches `*`. It does match so teamC is selected for review.
|
||||
|
||||
### Example 2
|
||||
|
||||
If someone changes `a/b/c/file.yaml` the owner resolution will not find a team. The first file searched is `a/b/c/OWNERS.yml`. No filters match file.yaml. Next we search in `a/b/OWNERS.yml`. No filters match there either. We stop searching up because `no_parent_owners` is set to true.
|
||||
If someone changes `a/b/c/file.yaml` the owner resolution will not find a team. The first file
|
||||
searched is `a/b/c/OWNERS.yml`. No filters match file.yaml. Next we search in `a/b/OWNERS.yml`. No
|
||||
filters match there either. We stop searching up because `no_parent_owners` is set to true.
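The upward search described in the two examples can be sketched in Python. This is a simplified model rather than the real resolver: the `OWNERS` dictionary is a hypothetical in-memory stand-in for OWNERS.yml files (its filters and team names are invented for illustration and do not reproduce the example files), filters are assumed to be checked in listed order as the examples walk through them, and `fnmatch` on the file name stands in for full gitignore pattern matching.

```python
from fnmatch import fnmatch
from posixpath import basename, dirname

# Hypothetical stand-in for OWNERS.yml files, keyed by directory (not the real example files).
OWNERS = {
    "a/b/c": {"filters": [("*.md", ["teamDocs"]), ("*.py", ["teamC"])]},
    "a/b": {"options": {"no_parent_owners": True}, "filters": [("*.txt", ["teamB"])]},
    "a": {"filters": [("*", ["teamA"])]},
}


def resolve_owners(path, owners=OWNERS):
    """Walk from the file's directory upward and return the first matching approvers."""
    name = basename(path)
    directory = dirname(path)
    while True:
        config = owners.get(directory)
        if config is not None:
            # Check this directory's filters in the order they are listed.
            for pattern, approvers in config.get("filters", []):
                if fnmatch(name, pattern):
                    return approvers
            # Stop climbing if this file opts out of parent resolution.
            if config.get("options", {}).get("no_parent_owners", False):
                return None
        if not directory:
            return None  # Reached the repo root without a match.
        directory = dirname(directory)


print(resolve_owners("a/b/c/file.py"))    # matched in a/b/c
print(resolve_owners("a/b/c/file.yaml"))  # search stops at a/b
```

In this model, `a/b/c/file.yaml` finds no owner even though `a` has a catch-all filter, because `no_parent_owners` in `a/b` ends the search before `a` is consulted.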
|
||||
|
||||
## OWNERS Changelog
|
||||
|
||||
### v2.0.0
|
||||
|
||||
See the [previous version](https://github.com/mongodb/mongo/blob/79590effe86c471cc15d91c6785599ec2085d7c0/docs/owners/owners_format.md) of this documentation for details on v1.0.0.
|
||||
See the
|
||||
[previous version](https://github.com/mongodb/mongo/blob/79590effe86c471cc15d91c6785599ec2085d7c0/docs/owners/owners_format.md)
|
||||
of this documentation for details on v1.0.0.
|
||||
|
||||
Patterns without a slash are no longer prepended with `**/` to make them apply recursively. If you want your pattern to apply recursively you must add the `**/` yourself now.
|
||||
Patterns without a slash are no longer prepended with `**/` to make them apply recursively. If you
|
||||
want your pattern to apply recursively you must add the `**/` yourself now.
|
||||
|
||||
The `*` pattern is now resolved as the directory name to ensure it applies recursively by default. You can use the `/*` pattern to only match inside the current directory.
|
||||
The `*` pattern is now resolved as the directory name to ensure it applies recursively by default.
|
||||
You can use the `/*` pattern to only match inside the current directory.
|
||||
|
||||
@ -12,16 +12,16 @@ To find the correct binary for a specific log you need to:
|
||||
curl -O http://s3.amazonaws.com/downloads.mongodb.org/linux/mongodb-linux-x86_64-debugsymbols-1.x.x.tgz
|
||||
```
|
||||
|
||||
You can also get the debugsymbols archive for official builds through [the Downloads page][1]. In the
|
||||
Archived Releases section, click on the appropriate platform link to view the available archives.
|
||||
Select the appropriate debug symbols archive.
|
||||
You can also get the debugsymbols archive for official builds through [the Downloads page][1]. In
|
||||
the Archived Releases section, click on the appropriate platform link to view the available
|
||||
archives. Select the appropriate debug symbols archive.
|
||||
|
||||
## Using mongosymb.py to get file and line numbers
|
||||
|
||||
Stacktraces are logged on a line with `msg` `BACKTRACE`. The full backtrace contents are available in
|
||||
an attribute named `bt`. To convert this into a list of source locations with file and line numbers,
|
||||
copy the contents of the `bt` JSON blob into a file, then direct the contents of that file into
|
||||
the standard input of `buildscripts/mongosymb.py`:
|
||||
Stacktraces are logged on a line with `msg` `BACKTRACE`. The full backtrace contents are available
|
||||
in an attribute named `bt`. To convert this into a list of source locations with file and line
|
||||
numbers, copy the contents of the `bt` JSON blob into a file, then direct the contents of that file
|
||||
into the standard input of `buildscripts/mongosymb.py`:
|
||||
|
||||
```
|
||||
cat bt | buildscripts/mongosymb.py --debug-file-resolver=path path/to/debug/symbols/file
|
||||
@ -55,8 +55,8 @@ $ cat bt | buildscripts/mongosymb.py --debug-file-resolver=path bazel-bin/instal
|
||||
|
||||
## Stack Trace Schema
|
||||
|
||||
Stack traces are typically logged as log message 31380, having a `bt` attribute
|
||||
that holds a JSON object value:
|
||||
Stack traces are typically logged as log message 31380, having a `bt` attribute that holds a JSON
|
||||
object value:
|
||||
|
||||
```json
|
||||
"bt": {
|
||||
@ -86,10 +86,9 @@ that holds a JSON object value:
|
||||
}
|
||||
```
|
||||
|
||||
The "processInfo" subobject has other information about the process, but
|
||||
the most important thing for the stack trace is the "somap", which is an
|
||||
array of all dynamically linked ELF files, including the main executable,
|
||||
and where in memory they were loaded.
|
||||
The "processInfo" subobject has other information about the process, but the most important thing
|
||||
for the stack trace is the "somap", which is an array of all dynamically linked ELF files, including
|
||||
the main executable, and where in memory they were loaded.
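
As a rough illustration of working with this schema, the sketch below pulls the `bt` object out of a single JSON log line and returns its `somap` array. It assumes each log line is one JSON document and that attributes sit under an `attr` key; that key name, and the helper names `extract_bt` and `somap_entries`, are assumptions of this sketch rather than anything defined in this document.

```python
import json
from typing import Optional


def extract_bt(log_line: str) -> Optional[dict]:
    """Return the `bt` object from one JSON log line, or None if absent.

    Assumes attributes live under an `attr` key (an assumption of this sketch).
    """
    try:
        entry = json.loads(log_line)
    except json.JSONDecodeError:
        return None
    # Only lines whose `msg` is BACKTRACE carry a stack trace.
    if not isinstance(entry, dict) or entry.get("msg") != "BACKTRACE":
        return None
    attr = entry.get("attr")
    if not isinstance(attr, dict):
        return None
    bt = attr.get("bt")
    return bt if isinstance(bt, dict) else None


def somap_entries(bt: dict) -> list:
    """Return the `somap` array from `processInfo`, or an empty list."""
    process_info = bt.get("processInfo")
    if not isinstance(process_info, dict):
        return []
    somap = process_info.get("somap")
    return somap if isinstance(somap, list) else []


if __name__ == "__main__":
    # Made-up log line for illustration; real entries carry many more fields.
    line = json.dumps({
        "id": 31380,
        "msg": "BACKTRACE",
        "attr": {"bt": {"processInfo": {"somap": [{}, {}]}}},
    })
    bt = extract_bt(line)
    print(len(somap_entries(bt)) if bt else "no backtrace found")
```

The object returned by `extract_bt` could be dumped to a file with `json.dump` and then fed to `buildscripts/mongosymb.py` on standard input.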
|
||||
|
||||
Partial example showing a few typical frames:
|
||||
|
||||
|
||||
@ -2,27 +2,55 @@
|
||||
|
||||
## Project Impetus
|
||||
|
||||
We frequently encounter Python errors caused by a Python dependency author releasing an update to their package that breaks backward compatibility. The following tickets are a few examples:
|
||||
[SERVER-79126](https://jira.mongodb.org/browse/SERVER-79126), [SERVER-79798](https://jira.mongodb.org/browse/SERVER-79798), [SERVER-53348](https://jira.mongodb.org/browse/SERVER-53348), [SERVER-57036](https://jira.mongodb.org/browse/SERVER-57036), [SERVER-44579](https://jira.mongodb.org/browse/SERVER-44579), [SERVER-70845](https://jira.mongodb.org/browse/SERVER-70845), [SERVER-63974](https://jira.mongodb.org/browse/SERVER-63974), [SERVER-61791](https://jira.mongodb.org/browse/SERVER-61791), and [SERVER-60950](https://jira.mongodb.org/browse/SERVER-60950). We have always known this was a problem and have known there was a way to fix it. We finally had the bandwidth to tackle this problem.
|
||||
We frequently encounter Python errors caused by a Python dependency author releasing an update to
|
||||
their package that breaks backward compatibility. The following tickets are a few examples:
|
||||
[SERVER-79126](https://jira.mongodb.org/browse/SERVER-79126),
|
||||
[SERVER-79798](https://jira.mongodb.org/browse/SERVER-79798),
|
||||
[SERVER-53348](https://jira.mongodb.org/browse/SERVER-53348),
|
||||
[SERVER-57036](https://jira.mongodb.org/browse/SERVER-57036),
|
||||
[SERVER-44579](https://jira.mongodb.org/browse/SERVER-44579),
|
||||
[SERVER-70845](https://jira.mongodb.org/browse/SERVER-70845),
|
||||
[SERVER-63974](https://jira.mongodb.org/browse/SERVER-63974),
|
||||
[SERVER-61791](https://jira.mongodb.org/browse/SERVER-61791), and
|
||||
[SERVER-60950](https://jira.mongodb.org/browse/SERVER-60950). We have always known this was a
|
||||
problem and have known there was a way to fix it. We finally had the bandwidth to tackle this
|
||||
problem.
|
||||
|
||||
## Project Prework
|
||||
|
||||
First, we wanted to test out using poetry so we converted mongo-container project to use poetry [SERVER-76974](https://jira.mongodb.org/browse/SERVER-76974). This showed promise and we considered this a green light to move forward on converting the server python to use poetry.
|
||||
First, we wanted to test out using poetry so we converted mongo-container project to use poetry
|
||||
[SERVER-76974](https://jira.mongodb.org/browse/SERVER-76974). This showed promise and we considered
|
||||
this a green light to move forward on converting the server python to use poetry.
|
||||
|
||||
Before we could start the project we had to upgrade python to a version that was not EoL. This work is captured in [SERVER-72262](https://jira.mongodb.org/browse/SERVER-72262). We upgraded python to 3.10 on every system except windows. Windows could not be upgraded due to a test problem relating to some cipher suites [SERVER-79172](https://jira.mongodb.org/browse/SERVER-79172).
|
||||
Before we could start the project we had to upgrade python to a version that was not EoL. This work
|
||||
is captured in [SERVER-72262](https://jira.mongodb.org/browse/SERVER-72262). We upgraded python to
|
||||
3.10 on every system except windows. Windows could not be upgraded due to a test problem relating to
|
||||
some cipher suites [SERVER-79172](https://jira.mongodb.org/browse/SERVER-79172).
|
||||
|
||||
## Conversion to Poetry
|
||||
|
||||
After the prework was done we wrote, tested, and merged [SERVER-76751](https://jira.mongodb.org/browse/SERVER-76751) which is converting the mongo python dependencies to poetry. This ticket had an absurd amount of dependencies and required a significant amount of patch builds. The total number of changes was pretty small but it affected a lot of different projects.
|
||||
After the prework was done we wrote, tested, and merged
|
||||
[SERVER-76751](https://jira.mongodb.org/browse/SERVER-76751) which is converting the mongo python
|
||||
dependencies to poetry. This ticket had an absurd amount of dependencies and required a significant
|
||||
amount of patch builds. The total number of changes was pretty small but it affected a lot of
|
||||
different projects.
|
||||
|
||||
Knowing there was a lot this touched we expected to see some bugs and were quick to try to fix them. Some of these were caught before merge and some were caught after.
|
||||
Knowing there was a lot this touched we expected to see some bugs and were quick to try to fix them.
|
||||
Some of these were caught before merge and some were caught after.
|
||||
|
||||
[BUILD-17860](https://jira.mongodb.org/browse/BUILD-17860) required the build team to rebuild python on macosx arm. This was caught before merging.
|
||||
[BUILD-17860](https://jira.mongodb.org/browse/BUILD-17860) required the build team to rebuild python
|
||||
on macosx arm. This was caught before merging.
|
||||
|
||||
[SERVER-81122](https://jira.mongodb.org/browse/SERVER-81122) found that poetry broke the spawnhost script. This was caught after merge.
|
||||
[SERVER-81122](https://jira.mongodb.org/browse/SERVER-81122) found that poetry broke the spawnhost
|
||||
script. This was caught after merge.
|
||||
|
||||
[SERVER-81061](https://jira.mongodb.org/browse/SERVER-81061) and [BF-29909](https://jira.mongodb.org/browse/BF-29909) were found by sys-perf since they run their own build and do not use the standard build process. Therefore it was very hard to test for this one. This was caught post merge.
|
||||
[SERVER-81061](https://jira.mongodb.org/browse/SERVER-81061) and
|
||||
[BF-29909](https://jira.mongodb.org/browse/BF-29909) were found by sys-perf since they run their own
|
||||
build and do not use the standard build process. Therefore it was very hard to test for this one.
|
||||
This was caught post merge.
|
||||
|
||||
[SERVER-80799](https://jira.mongodb.org/browse/SERVER-80799) found that poetry broke mongo tooling metrics collection (not OTel). This was only found since an engineer on the team saw this bug in the code. This was caught post merge.
|
||||
[SERVER-80799](https://jira.mongodb.org/browse/SERVER-80799) found that poetry broke mongo tooling
|
||||
metrics collection (not OTel). This was only found since an engineer on the team saw this bug in the
|
||||
code. This was caught post merge.
|
||||
|
||||
Overall, when changing something so foundational it is inevitable that some things will break.
|
||||
|
||||
@ -1,10 +1,10 @@
|
||||
# PrimaryOnlyService
|
||||
|
||||
The PrimaryOnlyService machinery provides a way to register tasks that should run only when the current
|
||||
node is Primary, and should be driven to completion across replica set failovers on the new
|
||||
Primary. It is intended to be used by tasks that can be modeled as a state machine with a single
|
||||
MongoDB document containing the current state, which newly-elected Primaries can use to rebuild the
|
||||
state of the task after failover and pick up where the old Primary left off.
|
||||
node is Primary, and should be driven to completion across replica set failovers on the new Primary.
|
||||
It is intended to be used by tasks that can be modeled as a state machine with a single MongoDB
|
||||
document containing the current state, which newly-elected Primaries can use to rebuild the state of
|
||||
the task after failover and pick up where the old Primary left off.
|
||||
|
||||
## Classes
|
||||
|
||||
@ -62,16 +62,17 @@ what state it is in and thus what work still needs to be performed, and what wor
|
||||
completed by the previous Primary.
|
||||
|
||||
To see an example bare-bones PrimaryOnlyService implementation to use as a reference, check out the
|
||||
TestService defined in this unit test: https://github.com/mongodb/mongo/blob/master/src/mongo/db/repl/primary_only_service_test.cpp
|
||||
TestService defined in this unit test:
|
||||
https://github.com/mongodb/mongo/blob/master/src/mongo/db/repl/primary_only_service_test.cpp
|
||||
|
||||
## Behavior during state transitions
|
||||
|
||||
At stepUp, each PrimaryOnlyService queries its state document collection, and for each document
|
||||
found, creates and launches a PrimaryOnlyService::Instance initialized off of the state
|
||||
document. This happens asynchronously relative to the core replication stepUp process - there is no
|
||||
guarantee that when stepUp completes and the RSTL lock is dropped that the PrimaryOnlyServices have
|
||||
finished rebuilding all their Instances. At stepDown all Instances are interrupted, but the threads
|
||||
running their work are not joined, and the Instance objects containing their in-memory state are not
|
||||
found, creates and launches a PrimaryOnlyService::Instance initialized off of the state document.
|
||||
This happens asynchronously relative to the core replication stepUp process - there is no guarantee
|
||||
that when stepUp completes and the RSTL lock is dropped that the PrimaryOnlyServices have finished
|
||||
rebuilding all their Instances. At stepDown all Instances are interrupted, but the threads running
|
||||
their work are not joined, and the Instance objects containing their in-memory state are not
|
||||
released, until the next stepUp. This is done to reduce the likelihood of blocking within the state
|
||||
transition process and delaying it for the entire node. This behavior does, however, guarantee that
|
||||
there will never be two Instances of the same PrimaryOnlyService with the same InstanceID running at
|
||||
|
||||
@ -1,11 +1,14 @@
|
||||
# Priority port support
|
||||
|
||||
`mongod` and `mongos` support a dedicated **priority port** intended for **internal, high-priority operations** such as automation monitoring, MongoTune, and critical intra-cluster replication traffic.
|
||||
`mongod` and `mongos` support a dedicated **priority port** intended for **internal, high-priority
|
||||
operations** such as automation monitoring, MongoTune, and critical intra-cluster replication
|
||||
traffic.
|
||||
|
||||
With a priority port configured:
|
||||
|
||||
- The database listens on a second TCP port in addition to the main port.
|
||||
- Connections accepted on the priority port are exempt from connection limits, connection establishment rate limiting, and ingress request rate limiting.
|
||||
- Connections accepted on the priority port are exempt from connection limits, connection
|
||||
establishment rate limiting, and ingress request rate limiting.
|
||||
- gRPC is not supported.
|
||||
|
||||
The feature is **disabled by default**.
|
||||
@ -35,7 +38,8 @@ net:
|
||||
When the transport layer starts:
|
||||
|
||||
- A **separate listener thread** is created for the priority port in the ASIO transport layer.
|
||||
- Sessions created from the priority port are tagged so downstream code can distinguish them from main-port sessions (similar to the load balancer port implementation).
|
||||
- Sessions created from the priority port are tagged so downstream code can distinguish them from
|
||||
main-port sessions (similar to the load balancer port implementation).
|
||||
|
||||
---
|
||||
|
||||
@ -47,27 +51,33 @@ Priority-port connections differ from normal connections in several ways.
|
||||
|
||||
When a new connection is accepted:
|
||||
|
||||
- Connections from the priority port are treated as **limit-exempt** in the session manager, reusing the existing exemption machinery used for CIDR-based exemptions.
|
||||
- Connections from the priority port are treated as **limit-exempt** in the session manager, reusing
|
||||
the existing exemption machinery used for CIDR-based exemptions.
|
||||
- These connections can continue to be created even when the normal connection limit is reached.
|
||||
|
||||
Metrics:
|
||||
|
||||
- `serverStatus.connections.priority` counts current connections on the priority port only.
|
||||
- These connections are also included in `connections.limitExempt` (along with CIDR-based exemptions).
|
||||
- These connections are also included in `connections.limitExempt` (along with CIDR-based
|
||||
exemptions).
|
||||
|
||||
## Rate limiters
|
||||
|
||||
Two ingress-side rate limiters recognize priority-port exemptions:
|
||||
|
||||
- [**SessionEstablishmentRateLimiter**](../src/mongo/db/admission/README.md#session-establishment-rate-limiter) (connection establishment)
|
||||
- [**IngressRequestRateLimiter**](../src/mongo/db/admission/README.md#ingress-request-rate-limiting) (request rate limiting)
|
||||
- [**SessionEstablishmentRateLimiter**](../src/mongo/db/admission/README.md#session-establishment-rate-limiter)
|
||||
(connection establishment)
|
||||
- [**IngressRequestRateLimiter**](../src/mongo/db/admission/README.md#ingress-request-rate-limiting)
|
||||
(request rate limiting)
|
||||
|
||||
## Logging and profiling
|
||||
|
||||
For observability and debugging, the server records whether an operation came through the priority port:
|
||||
For observability and debugging, the server records whether an operation came through the priority
|
||||
port:
|
||||
|
||||
- `CurOp` / currentOp output includes a flag indicating the connection is from the priority port.
|
||||
- Slow query log and profiler entries include whether the operation was executed via a priority-port connection.
|
||||
- Slow query log and profiler entries include whether the operation was executed via a priority-port
|
||||
connection.
|
||||
- Client summary reports also distinguish clients on the main vs priority port.
|
||||
|
||||
---
|
||||
@ -79,7 +89,8 @@ For observability and debugging, the server records whether an operation came th
|
||||
To connect to a replica set via the priority port, a user must:
|
||||
|
||||
- Use a connection string that points directly at a specific host and priority port.
|
||||
- Set `directConnection=true` to disable SDAM and prevent the driver from using hello-based host discovery, which currently does not advertise the priority port.
|
||||
- Set `directConnection=true` to disable SDAM and prevent the driver from using hello-based host
|
||||
discovery, which currently does not advertise the priority port.
|
||||
|
||||
Example:
|
||||
|
||||
@ -92,11 +103,14 @@ mongodb://hostA:27018/?directConnection=true
|
||||
For `mongos`:
|
||||
|
||||
- You may connect directly to the `mongos` priority port.
|
||||
- `directConnection=true` is **not required** for `mongos` connections, since SDAM is not used in the same way.
|
||||
- `directConnection=true` is **not required** for `mongos` connections, since SDAM is not used in
|
||||
the same way.
|
||||
|
||||
Important limitation:
|
||||
|
||||
- **Priority does not automatically propagate**:
|
||||
- If a client connects to a `mongos` via the priority port and `mongos` forwards a command to shards, those shard-side connections still use the main ports and do **not** inherit priority-port behavior in the current implementation.
|
||||
- If a client connects to a `mongos` via the priority port and `mongos` forwards a command to
|
||||
shards, those shard-side connections still use the main ports and do **not** inherit
|
||||
priority-port behavior in the current implementation.
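
The connection rules above can be summarized with a small helper that builds the connection string. This is a minimal sketch: `priority_port_uri` is a hypothetical name, and the port used in the usage example (27018) is only an illustrative value for wherever the priority port is configured.

```python
def priority_port_uri(host: str, priority_port: int, is_mongos: bool = False) -> str:
    """Build a connection string aimed at a single host's priority port.

    Hypothetical helper for illustration only.
    """
    if not 0 < priority_port < 65536:
        raise ValueError(f"invalid port: {priority_port}")
    base = f"mongodb://{host}:{priority_port}/"
    # mongos connections do not require directConnection=true.
    if is_mongos:
        return base
    # Replica set members need directConnection=true so the driver skips
    # hello-based discovery, which does not advertise the priority port.
    return base + "?directConnection=true"


if __name__ == "__main__":
    print(priority_port_uri("hostA", 27018))
    print(priority_port_uri("routerA", 27018, is_mongos=True))
```

Note that this only covers the client-to-server hop; it does nothing about the shard-side connections that `mongos` opens on the main ports.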
|
||||
|
||||
---
|
||||
|
||||
@ -37,9 +37,9 @@ Users can set or modify a server parameter at startup and/or runtime, depending
|
||||
specified for `set_at`. For instance, `logLevel` may be set at both startup and runtime, as
|
||||
indicated by `set_at` (see the above code snippet).
|
||||
|
||||
At startup, server parameters may be set using the `--setParameter` command line option.
|
||||
At runtime, the `setParameter` command may be used to modify server parameters.
|
||||
See the [`setParameter` documentation][set-parameter] for details.
|
||||
At startup, server parameters may be set using the `--setParameter` command line option. At runtime,
|
||||
the `setParameter` command may be used to modify server parameters. See the [`setParameter`
|
||||
documentation][set-parameter] for details.
|
||||
|
||||
## How to get the value provided for a parameter
|
||||
|
||||
@ -99,27 +99,28 @@ must be unique across the server instance. More information on the specific fiel
|
||||
|
||||
- `set_at` (required): Must contain the value `startup`, `runtime`, [`startup`, `runtime`], or
|
||||
`cluster`. If `runtime` is specified along with `cpp_varname`, then `decltype(cpp_varname)` must
|
||||
refer to a thread-safe storage type, specifically: `Atomic<T>`, `std::atomic<T>`,
|
||||
or `boost::synchronized<T>`. Parameters declared as `cluster` can only be set at runtime and exhibit
|
||||
refer to a thread-safe storage type, specifically: `Atomic<T>`, `std::atomic<T>`, or
|
||||
`boost::synchronized<T>`. Parameters declared as `cluster` can only be set at runtime and exhibit
|
||||
numerous differences. See [Cluster Server Parameters](#cluster-server-parameters) below.
|
||||
|
||||
- `description` (required): Free-form text field currently used only for commenting the generated C++
|
||||
code. Future uses may preserve this value for a possible `{listSetParameters:1}` command or other
|
||||
programmatic and potentially user-facing purposes.
|
||||
- `description` (required): Free-form text field currently used only for commenting the generated
|
||||
C++ code. Future uses may preserve this value for a possible `{listSetParameters:1}` command or
|
||||
other programmatic and potentially user-facing purposes.
|
||||
|
||||
- `cpp_vartype`: Declares the full storage type. If `cpp_vartype` is not defined, it may be inferred
|
||||
from the C++ variable referenced by `cpp_varname`.
|
||||
|
||||
- `cpp_varname`: Declares the underlying variable or C++ `struct` member to use when setting or reading the
|
||||
server parameter. If defined together with `cpp_vartype`, the storage will be declared as a global
|
||||
variable, and externed in the generated header file. If defined alone, a variable of this name will
|
||||
be assumed to have been declared and defined by the implementer, and its type will be automatically
|
||||
inferred at compile time. If `cpp_varname` is not defined, then `cpp_class` must be specified.
|
||||
- `cpp_varname`: Declares the underlying variable or C++ `struct` member to use when setting or
|
||||
reading the server parameter. If defined together with `cpp_vartype`, the storage will be declared
|
||||
as a global variable, and externed in the generated header file. If defined alone, a variable of
|
||||
this name will be assumed to have been declared and defined by the implementer, and its type will be
|
||||
automatically inferred at compile time. If `cpp_varname` is not defined, then `cpp_class` must be
|
||||
specified.
|
||||
|
||||
- `cpp_class`: Declares a custom `ServerParameter` class in the generated header using the provided
|
||||
string, or the name field in the associated map. The declared class will require an implementation
|
||||
of `setFromString()`, and optionally `set()`, `append()`, and a constructor.
|
||||
See [Specialized Server Parameters](#specialized-server-parameters) below.
|
||||
of `setFromString()`, and optionally `set()`, `append()`, and a constructor. See
|
||||
[Specialized Server Parameters](#specialized-server-parameters) below.
|
||||
|
||||
- `default`: String or expression map representation of the initial value.
|
||||
|
||||
@ -127,10 +128,10 @@ must be unique across the server instance. More information on the specific fiel
|
||||
This is a required field and must be explicitly set to `false` to disable redaction.
|
||||
|
||||
- `omit_in_ftdc`: Only applies to cluster parameters. If set to `true`, then the cluster parameter
|
||||
will be omitted when `getClusterParameter` is invoked with `omitInFTDC: true`.
|
||||
In practice, FTDC runs `getClusterParameter` with this option periodically to
|
||||
collect configuration metadata about the server and setting this flag to true
|
||||
for a cluster parameter ensures that its value(s) will not be exposed in FTDC.
|
||||
will be omitted when `getClusterParameter` is invoked with `omitInFTDC: true`. In practice, FTDC
|
||||
runs `getClusterParameter` with this option periodically to collect configuration metadata about
|
||||
the server and setting this flag to true for a cluster parameter ensures that its value(s) will
|
||||
not be exposed in FTDC.
|
||||
|
||||
- `test_only`: Set to `true` to disable this set parameter if `enableTestCommands` is not specified.
|
||||
|
||||
@ -141,26 +142,27 @@ must be unique across the server instance. More information on the specific fiel
|
||||
new value has been stored. Prototype: `Status(const cpp_vartype&);`
|
||||
|
||||
- `condition`: Up to five conditional rules for deciding whether or not to apply this server
|
||||
parameter. `preprocessor` will be evaluated first, followed by `constexpr`, then finally `expr`. If
|
||||
no provided setting evaluates to `false`, the server parameter will be registered. `feature_flag` and
|
||||
`min_fcv` are evaluated after the parameter is registered, and instead affect whether the parameter
|
||||
is enabled. `min_fcv` is a string of the form `X.Y`, representing the minimum FCV version for which
|
||||
this parameter should be enabled. `feature_flag` is the name of a feature flag variable upon which
|
||||
this server parameter depends -- if the feature flag is disabled, this parameter will be disabled.
|
||||
`feature_flag` should be removed when all other instances of that feature flag are deleted, which
|
||||
typically is done after the next LTS version of the server is branched. `min_fcv` should be removed
|
||||
after it is no longer possible to downgrade to a FCV lower than that version - this occurs when the
|
||||
next LTS version of the server is branched.
|
||||
parameter. `preprocessor` will be evaluated first, followed by `constexpr`, then finally `expr`.
|
||||
If no provided setting evaluates to `false`, the server parameter will be registered.
|
||||
`feature_flag` and `min_fcv` are evaluated after the parameter is registered, and instead affect
|
||||
whether the parameter is enabled. `min_fcv` is a string of the form `X.Y`, representing the
|
||||
minimum FCV version for which this parameter should be enabled. `feature_flag` is the name of a
|
||||
feature flag variable upon which this server parameter depends -- if the feature flag is disabled,
|
||||
this parameter will be disabled. `feature_flag` should be removed when all other instances of that
|
||||
feature flag are deleted, which typically is done after the next LTS version of the server is
|
||||
branched. `min_fcv` should be removed after it is no longer possible to downgrade to a FCV lower
|
||||
than that version - this occurs when the next LTS version of the server is branched.
|
||||
|
||||
- `validator`: Zero or many validation rules to impose on the setting. All specified rules must pass
|
||||
to consider the new setting valid. `lt`, `gt`, `lte`, `gte` fields provide for simple numeric limits
|
||||
or expression maps which evaluate to numeric values. For all other validation cases, specify
|
||||
callback as a C++ function or static method. Note that validation rules (including callback) may run
|
||||
in any order. To perform an action after all validation rules have completed, `on_update` should be
|
||||
preferred instead. Callback prototype: `Status(const cpp_vartype&, const boost::optional<TenantId>&);`
|
||||
to consider the new setting valid. `lt`, `gt`, `lte`, `gte` fields provide for simple numeric
|
||||
limits or expression maps which evaluate to numeric values. For all other validation cases,
|
||||
specify callback as a C++ function or static method. Note that validation rules (including
|
||||
callback) may run in any order. To perform an action after all validation rules have completed,
|
||||
`on_update` should be preferred instead. Callback prototype:
|
||||
`Status(const cpp_vartype&, const boost::optional<TenantId>&);`
|
||||
|
||||
- `is_deprecated`: Mark the server parameter as deprecated. Warns users if the server parameter
|
||||
is ever used. Defaults to false.
|
||||
- `is_deprecated`: Mark the server parameter as deprecated. Warns users if the server parameter is
|
||||
ever used. Defaults to false.
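
To make the `validator` semantics above concrete, here is a small Python model of how the `lt`, `gt`, `lte`, and `gte` bounds combine. It is only a sketch of the described behavior, not the generated C++ code; `passes_validator` is a hypothetical name, and `callback` and expression maps are deliberately left out.

```python
import operator

# Map each numeric validator field to the comparison the value must satisfy.
_BOUND_CHECKS = {
    "lt": operator.lt,
    "gt": operator.gt,
    "lte": operator.le,
    "gte": operator.ge,
}


def passes_validator(value, validator: dict) -> bool:
    """Return True only if `value` satisfies every bound in `validator`."""
    for field, limit in validator.items():
        check = _BOUND_CHECKS.get(field)
        if check is None:
            raise ValueError(f"unsupported validator field: {field}")
        # All specified rules must pass; one failure rejects the value.
        if not check(value, limit):
            return False
    return True


if __name__ == "__main__":
    rules = {"gte": 0, "lt": 10}
    for candidate in (-1, 0, 9, 10):
        print(candidate, passes_validator(candidate, rules))
```

Because every rule has to pass, the order in which the bounds are checked does not change the result, which matches the note that validation rules may run in any order.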
|
||||
|
||||
Any symbols such as global variables or callbacks used by a server parameter must be imported using
|
||||
the usual IDL machinery via `globals.cpp_includes`. Similarly, all generated code will be nested
|
||||
@ -240,9 +242,8 @@ to any other work, this custom constructor must invoke its parent's constructor.
|
||||
Status {name}::set(const BSONElement& val, const boost::optional<TenantId>& tenantId);
|
||||
```
|
||||
|
||||
Otherwise the base class implementation `ServerParameter::set` is used. It
|
||||
invokes `setFromString` using a string representation of `val`, if the `val` is
|
||||
holding one of the supported types.
|
||||
Otherwise the base class implementation `ServerParameter::set` is used. It invokes `setFromString`
|
||||
using a string representation of `val`, if the `val` is holding one of the supported types.
|
||||
|
||||
`override_validate`: If `true`, the implementer must provide a `validate` member function as:
|
||||
|
||||
@ -261,8 +262,8 @@ must be provided with the following signature:
|
||||
Status {name}::append(OperationContext*, BSONObjBuilder*, StringData, const boost::optional<TenantId>& tenantId);
|
||||
```
|
||||
|
||||
`override_warn_if_deprecated`: If `true`, allows a custom warnIfDeprecated() method to be defined, defaults
|
||||
to `false`.
|
||||
`override_warn_if_deprecated`: If `true`, allows a custom warnIfDeprecated() method to be defined,
|
||||
defaults to `false`.
|
||||
|
||||
Lastly, a `setFromString` method must always be provided with the following signature:
|
||||
|
||||
@ -318,17 +319,17 @@ preferred to implementing custom parameter propagation whenever possible.
|
||||
|
||||
`setClusterParameter` persists the new value of the indicated cluster server parameter onto a
|
||||
majority of nodes on non-sharded replica sets. On sharded clusters, it majority-writes the new value
|
||||
onto every shard and the config server. This ensures that every **mongod** in the cluster will be able
|
||||
to recover the most recently written value for all cluster server parameters on restart.
|
||||
onto every shard and the config server. This ensures that every **mongod** in the cluster will be
|
||||
able to recover the most recently written value for all cluster server parameters on restart.
|
||||
Additionally, `setClusterParameter` blocks until the majority write succeeds in a replica set
|
||||
deployment, which guarantees that the parameter value will not be rolled back after being set.
|
||||
In a sharded cluster deployment, the new value has to be majority-committed on the config shard and
|
||||
deployment, which guarantees that the parameter value will not be rolled back after being set. In a
|
||||
sharded cluster deployment, the new value has to be majority-committed on the config shard and
|
||||
locally-committed on all other shards.
|
||||
|
||||
The cluster parameters are persisted in the `config.clusterParameters` collection and cached in
|
||||
memory on every **mongod**. The cache updates are done by the `ClusterServerParameterOpObserver` class.
|
||||
Every **mongos** also maintains an in-memory cache by polling the config server for updated cluster
|
||||
server parameter values every `clusterServerParameterRefreshIntervalSecs` using the
|
||||
memory on every **mongod**. The cache updates are done by the `ClusterServerParameterOpObserver`
|
||||
class. Every **mongos** also maintains an in-memory cache by polling the config server for updated
|
||||
cluster server parameter values every `clusterServerParameterRefreshIntervalSecs` using the
|
||||
`ClusterParameterRefresher` periodic job.
|
||||
|
||||
`getClusterParameter` returns the cached value of the requested cluster server parameter on the node
|
||||
@ -347,10 +348,10 @@ following members to the resulting type:
|
||||
was updated; used by runtime audit configuration, and to prevent concurrent and redundant cluster
|
||||
parameter updates.
|
||||
|
||||
It is highly recommended to specify validation rules or a callback function via the `param.validator`
|
||||
field. These validators are called before the new value of the cluster server parameter is written
|
||||
to disk during `setClusterParameter`.
|
||||
See [server_parameter_with_storage_test.idl][cluster-server-param-with-storage-test] and
|
||||
It is highly recommended to specify validation rules or a callback function via the
|
||||
`param.validator` field. These validators are called before the new value of the cluster server
|
||||
parameter is written to disk during `setClusterParameter`. See
|
||||
[server_parameter_with_storage_test.idl][cluster-server-param-with-storage-test] and
|
||||
[server_parameter_with_storage_test_structs.idl][cluster-server-param-with-storage-test-structs] for
|
||||
examples.
|
||||
|
||||
@ -394,21 +395,21 @@ The `reset()` method must be implemented and should update the cluster server pa
|
||||
default value.
|
||||
|
||||
All cluster server parameters are tenant-aware, meaning that on serverless clusters, each tenant has
|
||||
an isolated set of parameters. The `setClusterParameter` and `getClusterParameter` commands will pass
|
||||
the `tenantId` on the command request to the `ServerParameter`'s methods. On dedicated
|
||||
an isolated set of parameters. The `setClusterParameter` and `getClusterParameter` commands will
|
||||
pass the `tenantId` on the command request to the `ServerParameter`'s methods. On dedicated
|
||||
(non-serverless) clusters, `boost::none` will be passed. IDL-defined cluster server parameters will
|
||||
handle the passed-in `tenantId` automatically and store separate parameter values per-tenant.
|
||||
Specialized server parameters will have to take care to correctly handle the passed-in `tenantId` and
|
||||
to enforce tenant isolation.
|
||||
Specialized server parameters will have to take care to correctly handle the passed-in `tenantId`
|
||||
and to enforce tenant isolation.
|
||||
|
||||
Like normal server parameters, cluster server parameters can be defined to be dependent on a minimum
|
||||
FCV version or a specific feature flag using the `condition.min_fcv` and `condition.feature_flag` syntax discussed
|
||||
above. During FCV downgrade, the cluster parameter value stored on disk will be deleted if either:
|
||||
(1) The downgraded FCV is lower than the cluster parameter's `min_fcv`, or (2) The cluster
|
||||
parameter's `feature_flag` is disabled on the downgraded FCV. While a cluster server parameter is
|
||||
disabled due to either of these conditions, `setClusterParameter` on it will always fail, and
|
||||
`getClusterParameter` will fail on **mongod**, and return the default value on **mongos** -- this
|
||||
difference in behavior is due to **mongos** being unaware of the current FCV.
|
||||
FCV version or a specific feature flag using the `condition.min_fcv` and `condition.feature_flag`
|
||||
syntax discussed above. During FCV downgrade, the cluster parameter value stored on disk will be
|
||||
deleted if either: (1) The downgraded FCV is lower than the cluster parameter's `min_fcv`, or (2)
|
||||
The cluster parameter's `feature_flag` is disabled on the downgraded FCV. While a cluster server
|
||||
parameter is disabled due to either of these conditions, `setClusterParameter` on it will always
|
||||
fail, and `getClusterParameter` will fail on **mongod**, and return the default value on **mongos**
|
||||
-- this difference in behavior is due to **mongos** being unaware of the current FCV.
|
||||
|
||||
See [server_parameter_specialized_test.idl][specialized-cluster-server-param-test-idl] and
|
||||
[server_parameter_specialized_test.h][specialized-cluster-server-param-test-data] for examples.
|
||||
@ -582,9 +583,11 @@ classDiagram
|
||||
[parameters.idl]: ../src/mongo/db/commands/parameters.idl
|
||||
[set-parameter]: https://docs.mongodb.com/manual/reference/parameters/#synopsis
|
||||
[get-parameter]: https://docs.mongodb.com/manual/reference/command/getParameter/#getparameter
|
||||
[quiet-param]: https://github.com/mongodb/mongo/search?q=serverGlobalParams+quiet+extension:idl&type=code
|
||||
[quiet-param]:
|
||||
https://github.com/mongodb/mongo/search?q=serverGlobalParams+quiet+extension:idl&type=code
|
||||
[ftdc-file-size-param]: ../src/mongo/db/ftdc/ftdc_server.idl
|
||||
[cluster-server-param-with-storage-test]: ../src/mongo/idl/server_parameter_with_storage_test.idl
|
||||
[cluster-server-param-with-storage-test-structs]: ../src/mongo/idl/server_parameter_with_storage_test_structs.idl
|
||||
[cluster-server-param-with-storage-test-structs]:
|
||||
../src/mongo/idl/server_parameter_with_storage_test_structs.idl
|
||||
[specialized-cluster-server-param-test-idl]: ../src/mongo/idl/server_parameter_specialized_test.idl
|
||||
[specialized-cluster-server-param-test-data]: ../src/mongo/idl/server_parameter_specialized_test.h
|
||||
|
||||
@ -1,7 +1,7 @@
|
||||
# Test Commands
|
||||
|
||||
All test commands are denoted with the `.testOnly()` modifier to the `MONGO_REGISTER_COMMAND` invocation.
|
||||
For example:
|
||||
All test commands are denoted with the `.testOnly()` modifier to the `MONGO_REGISTER_COMMAND`
|
||||
invocation. For example:
|
||||
|
||||
```c++
|
||||
MONGO_REGISTER_COMMAND(EchoCommand).testOnly();
|
||||
@ -9,9 +9,9 @@ MONGO_REGISTER_COMMAND(EchoCommand).testOnly();
|
||||
|
||||
## How to enable
|
||||
|
||||
To be able to run these commands, the server must be started with the `enableTestCommands=1`
|
||||
server parameter (e.g. `--setParameter enableTestCommands=1`). Resmoke.py often sets this server
|
||||
parameter for testing.
|
||||
To be able to run these commands, the server must be started with the `enableTestCommands=1` server
|
||||
parameter (e.g. `--setParameter enableTestCommands=1`). Resmoke.py often sets this server parameter
|
||||
for testing.
|
||||
|
||||
## Examples
|
||||
|
||||
|
||||
@ -1,7 +1,7 @@
|
||||
# Testing
|
||||
|
||||
Most tests for MongoDB are run through resmoke, our test runner and orchestration tool.
|
||||
The entry point for resmoke can be found at `buildscripts/resmoke.py`.
|
||||
Most tests for MongoDB are run through resmoke, our test runner and orchestration tool. The entry
|
||||
point for resmoke can be found at `buildscripts/resmoke.py`.
|
||||
|
||||
## Concepts
|
||||
|
||||
@ -9,9 +9,12 @@ Learn more about related topics using their own targeted documentation:
|
||||
|
||||
- [resmoke](../../buildscripts/resmokelib/README.md), the test runner
|
||||
- [suites](../../buildscripts/resmokeconfig/suites/README.md), how tests are grouped and configured
|
||||
- [fixtures](../../buildscripts/resmokelib/testing/fixtures/README.md), specify the server topology that tests run against
|
||||
- [hooks](../../buildscripts/resmokelib/testing/hooks/README.md), logic to run before, after and/or between individual tests
|
||||
- [testcases](../../buildscripts/resmokelib/testing/testcases/README.md), Python-based unittest interfaces that resmoke can run as different "kinds" of tests.
|
||||
- [fixtures](../../buildscripts/resmokelib/testing/fixtures/README.md), specify the server topology
|
||||
that tests run against
|
||||
- [hooks](../../buildscripts/resmokelib/testing/hooks/README.md), logic to run before, after and/or
|
||||
between individual tests
|
||||
- [testcases](../../buildscripts/resmokelib/testing/testcases/README.md), Python-based unittest
|
||||
interfaces that resmoke can run as different "kinds" of tests.
|
||||
|
||||
## Basic Example
|
||||
|
||||
@ -35,4 +38,7 @@ Now, **run the test content** from one test file:
|
||||
buildscripts/resmoke.py run --suites=no_passthrough jstests/noPassthrough/shell/js/string.js
|
||||
```
|
||||
|
||||
The suite defined in [buildscripts/resmokeconfig/suites/no_passthrough.yml](../../buildscripts/resmokeconfig/suites/no_passthrough.yml) includes that `string.js` file via glob selections, specifies no fixtures, no hooks, and a minimal config for the executor.
|
||||
The suite defined in
|
||||
[buildscripts/resmokeconfig/suites/no_passthrough.yml](../../buildscripts/resmokeconfig/suites/no_passthrough.yml)
|
||||
includes that `string.js` file via glob selections, specifies no fixtures, no hooks, and a minimal
|
||||
config for the executor.
|
||||
|
||||
@ -2,80 +2,69 @@
|
||||
|
||||
## Overview
|
||||
|
||||
The FSM tests are meant to exercise concurrency within MongoDB. The suite
|
||||
consists of workloads, which define discrete units of work as states in a FSM,
|
||||
and runners, which define which tests to run and how they should be run. Each
|
||||
workload defines states, which are JS functions that perform some meaningful
|
||||
series of tasks and assertions, and transitions, which define how to move
|
||||
between those states. A single workload begins by executing its setup function,
|
||||
which is called once during the runner's thread of execution. Next, the runner
|
||||
generates the number of threads specified by the workload, and each spawned
|
||||
thread executes the start state (typically named "init") defined by the
|
||||
workload. From this point on, each worker thread executes its own independent
|
||||
copy of the FSM, and will randomly move between states (after executing the
|
||||
function) based on the probabilities defined in the workload's transition table.
|
||||
Each worker thread continues doing so until the number of transitions it makes
|
||||
has reached the number of iterations defined by the workload. Once all the
|
||||
worker threads have finished, the runner executes the workload's teardown
|
||||
function.
|
||||
The FSM tests are meant to exercise concurrency within MongoDB. The suite consists of workloads,
|
||||
which define discrete units of work as states in a FSM, and runners, which define which tests to run
|
||||
and how they should be run. Each workload defines states, which are JS functions that perform some
|
||||
meaningful series of tasks and assertions, and transitions, which define how to move between those
|
||||
states. A single workload begins by executing its setup function, which is called once during the
|
||||
runner's thread of execution. Next, the runner generates the number of threads specified by the
|
||||
workload, and each spawned thread executes the start state (typically named "init") defined by the
|
||||
workload. From this point on, each worker thread executes its own independent copy of the FSM, and
|
||||
will randomly move between states (after executing the function) based on the probabilities defined
|
||||
in the workload's transition table. Each worker thread continues doing so until the number of
|
||||
transitions it makes has reached the number of iterations defined by the workload. Once all the
|
||||
worker threads have finished, the runner executes the workload's teardown function.
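
The per-thread loop described above can be sketched in plain JavaScript. This is only an illustration under stated assumptions, not the code in `fsm_libs`: `chooseNextState`, `runWorkerThread`, and `exampleConfig` are hypothetical names, the states take no `db` or collection arguments, and the loop is assumed to execute exactly `iterations` states.

```javascript
// Hypothetical helper: pick the next state from a {stateName: weight} table.
// The uniform draw is scaled by the total weight, so the weights do not
// need to sum to 1.0.
function chooseNextState(transitions, random = Math.random) {
    const entries = Object.entries(transitions);
    const total = entries.reduce((sum, [, weight]) => sum + weight, 0);
    let remaining = random() * total;
    for (const [state, weight] of entries) {
        if (remaining < weight) {
            return state;
        }
        remaining -= weight;
    }
    // Guard against floating-point drift on the last entry.
    return entries[entries.length - 1][0];
}

// Hypothetical stand-in for one worker thread of the runner.
function runWorkerThread(config, tid, random = Math.random) {
    // Each thread gets its own (shallow) copy of the data plus a unique tid.
    const data = Object.assign({}, config.data, {tid});
    const visited = [];
    let current = config.startState || "init";
    for (let i = 0; i < config.iterations; i++) {
        // The data object is the `this` context of every state function.
        config.states[current].call(data);
        visited.push(current);
        current = chooseNextState(config.transitions[current], random);
    }
    return visited;
}

// Toy workload with an init state and two scan states.
const exampleConfig = {
    iterations: 5,
    data: {count: 0},
    states: {
        init: function init() { this.count = 0; },
        scanGT: function scanGT() { this.count++; },
        scanLTE: function scanLTE() { this.count++; },
    },
    transitions: {
        init: {scanGT: 0.5, scanLTE: 0.5},
        scanGT: {scanGT: 0.8, scanLTE: 0.2},
        scanLTE: {scanGT: 0.2, scanLTE: 0.8},
    },
};

console.log(runWorkerThread(exampleConfig, 0).join(" -> "));
```

Passing a fixed `random` function (for example `() => 0`) makes the walk deterministic, which is convenient when reasoning about a transition table by hand.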
|
||||
|
||||

|
||||
|
||||
The runner provides two modes of execution for workloads: serial and parallel.
|
||||
Serial mode runs the provided workloads one after the other,
|
||||
waiting for all threads of a workload to complete before moving on to the next
|
||||
workload. Parallel mode runs subsets of the provided workloads in separate
|
||||
The runner provides two modes of execution for workloads: serial and parallel. Serial mode runs the
|
||||
provided workloads one after the other, waiting for all threads of a workload to complete before
|
||||
moving on to the next workload. Parallel mode runs subsets of the provided workloads in separate
|
||||
threads simultaneously.
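
As a rough illustration of how the two modes order the lifecycle steps, the sketch below only records event names. `serialOrder` and `parallelOrder` are hypothetical helpers, and the parallel ordering (all setups, then all threads, then all teardowns) is taken from the description in the "Parallel (Simultaneous)" section rather than from the runner's source.

```javascript
// Serial mode: each workload is set up, run, and torn down before the next starts.
function serialOrder(workloads) {
    const events = [];
    for (const name of workloads) {
        events.push(`setup ${name}`, `run ${name}`, `teardown ${name}`);
    }
    return events;
}

// Parallel mode (for one subset): every setup runs first, then the threads of
// all workloads in the subset run at the same time, then every teardown runs.
function parallelOrder(subset) {
    return [
        ...subset.map((name) => `setup ${name}`),
        `run ${subset.join(" + ")}`,
        ...subset.map((name) => `teardown ${name}`),
    ];
}

console.log(serialOrder(["a.js", "b.js"]));
console.log(parallelOrder(["a.js", "b.js"]));
```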
|
||||
|
||||
New methods were added to allow for finer-grained assertions under different
|
||||
situations. For example, a test that inserts a document into a collection, and
|
||||
wants to assert its existence will fail if another test removes that document.
|
||||
One option would have been to disable all assertions when running a mixture of
|
||||
different workloads together, but doing so would make the system incapable of
|
||||
detecting anything other than server crashes. Another option would have been to
|
||||
design the workloads to be conflict-free (e.g. writing to separate collections,
|
||||
using commutative operators), but this would leave large gaps in the achievable
|
||||
test coverage. Neither of those options was found to be very appealing.
|
||||
Instead, we chose to introduce the concept of an "assertion level" that acts as
|
||||
a precondition for when an assertion is evaluated. This allows us to still make
|
||||
some assertions, even when running a mixture of different workloads together.
|
||||
There are three assertion levels: `ALWAYS`, `OWN_COLL`, and `OWN_DB`. They can
|
||||
be thought of as follows:
|
||||
New methods were added to allow for finer-grained assertions under different situations. For
|
||||
example, a test that inserts a document into a collection, and wants to assert its existence will
|
||||
fail if another test removes that document. One option would have been to disable all assertions
|
||||
when running a mixture of different workloads together, but doing so would make the system incapable
|
||||
of detecting anything other than server crashes. Another option would have been to design the
|
||||
workloads to be conflict-free (e.g. writing to separate collections, using commutative operators),
|
||||
but this would leave large gaps in the achievable test coverage. Neither of those options was found
|
||||
to be very appealing. Instead, we chose to introduce the concept of an "assertion level" that acts
|
||||
as a precondition for when an assertion is evaluated. This allows us to still make some assertions,
|
||||
even when running a mixture of different workloads together. There are three assertion levels:
|
||||
`ALWAYS`, `OWN_COLL`, and `OWN_DB`. They can be thought of as follows:
|
||||
|
||||
- `ALWAYS`: A statement that remains unequivocally true, regardless of what
|
||||
another workload might be doing to the collection I was given (hint: think
|
||||
defensively). Examples include "1 = 1" or inserting a document into a
|
||||
collection (disregarding any unique indices).
|
||||
- `ALWAYS`: A statement that remains unequivocally true, regardless of what another workload might
|
||||
be doing to the collection I was given (hint: think defensively). Examples include "1 = 1" or
|
||||
inserting a document into a collection (disregarding any unique indices).
|
||||
|
||||
- `OWN_COLL`: A statement that is true only if I am the only workload operating
|
||||
on the collection I was given. Examples include counting the number of
|
||||
documents in a collection or updating a previously inserted document.
|
||||
- `OWN_COLL`: A statement that is true only if I am the only workload operating on the collection I
|
||||
was given. Examples include counting the number of documents in a collection or updating a
|
||||
previously inserted document.
|
||||
|
||||
- `OWN_DB`: A statement that is true only if I am the only workload operating on
|
||||
the database I was given. Examples include renaming a collection or verifying
|
||||
that a collection is capped. The workload typically relies on the use of
|
||||
another collection aside from the one given.
|
||||
- `OWN_DB`: A statement that is true only if I am the only workload operating on the database I was
|
||||
given. Examples include renaming a collection or verifying that a collection is capped. The
|
||||
workload typically relies on the use of another collection aside from the one given.
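
One way to picture an assertion level as a precondition is the sketch below. It is not the framework's implementation: `AssertLevel`, `makeLevelCheck`, and the `check*` helpers are hypothetical names, and the ordering `ALWAYS < OWN_COLL < OWN_DB` (owning the database implies owning its collections) is an assumption made for the example.

```javascript
// Levels ordered from the weakest to the strongest ownership guarantee (assumed ordering).
const AssertLevel = Object.freeze({ALWAYS: 0, OWN_COLL: 1, OWN_DB: 2});

// Hypothetical helper: returns a checker that evaluates its condition only
// when the current run guarantees at least `requiredLevel`.
function makeLevelCheck(requiredLevel, getCurrentLevel) {
    return function checkAtLevel(condition, message) {
        if (getCurrentLevel() < requiredLevel) {
            return false;  // Precondition not met: the assertion is skipped.
        }
        if (!condition) {
            throw new Error("assertion failed: " + message);
        }
        return true;  // The assertion was evaluated and held.
    };
}

// Several workloads share the collection, so only ALWAYS-level checks apply.
let currentLevel = AssertLevel.ALWAYS;

const checkAlways = makeLevelCheck(AssertLevel.ALWAYS, () => currentLevel);
const checkWhenOwnColl = makeLevelCheck(AssertLevel.OWN_COLL, () => currentLevel);
const checkWhenOwnDb = makeLevelCheck(AssertLevel.OWN_DB, () => currentLevel);

checkAlways(1 === 1, "arithmetic still works");

// Another workload may have removed documents, so the count is unreliable;
// at the ALWAYS level this check is skipped instead of failing.
const observedCount = 3;
const evaluated = checkWhenOwnColl(observedCount === 5, "expected 5 documents");
console.log("count check evaluated:", evaluated);
```

With `currentLevel` set to `AssertLevel.OWN_COLL` or `AssertLevel.OWN_DB`, the same count check would be evaluated and would throw.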
|
||||
|
||||
## Creating your own workload
|
||||
|
||||
All workloads are stored in `jstests/concurrency/fsm_workloads` and as specific
|
||||
examples you can refer to
|
||||
All workloads are stored in `jstests/concurrency/fsm_workloads` and as specific examples you can
|
||||
refer to
|
||||
|
||||
1. `jstests/concurrency/fsm_example.js`
|
||||
1. `jstests/concurrency/fsm_example_inheritance.js`
|
||||
|
||||
for writing new workloads. Every workload is loaded in as inline JavaScript
|
||||
using the "load" function, which is a lot more like a `#include` than
|
||||
`require.js`. This means that whatever variables are declared in the global
|
||||
scope of the file will become part of the scope where load is called. The runner
|
||||
will be looking for a variable called `$config` which will store the
|
||||
for writing new workloads. Every workload is loaded in as inline JavaScript using the "load"
|
||||
function, which is a lot more like a `#include` than `require.js`. This means that whatever
|
||||
variables are declared in the global scope of the file will become part of the scope where load is
|
||||
called. The runner will be looking for a variable called `$config` which will store the
|
||||
configuration of your workload.
|
||||
|
||||
### The $config object
|
||||
|
||||
There should be exactly one `$config` per workload. For style consistency as
|
||||
well as safety, be sure to wrap the value of `$config` in an anonymous function.
|
||||
This will create a JS closure and a new scope:
|
||||
There should be exactly one `$config` per workload. For style consistency as well as safety, be sure
|
||||
to wrap the value of `$config` in an anonymous function. This will create a JS closure and a new
|
||||
scope:
|
||||
|
||||
```javascript
|
||||
$config = (function() {
|
||||
@ -93,19 +82,17 @@ $config = (function() {
|
||||
)();
|
||||
```
|
||||
|
||||
When finished executing, `$config` must return an object containing the properties
|
||||
above (some of which are optional, see below).
|
||||
When finished executing, `$config` must return an object containing the properties above (some of
|
||||
which are optional, see below).
|
||||
|
||||
### Defining states
|
||||
|
||||
It's best to also declare states within its own closure so as not to interfere
|
||||
with the scope of $config. Each state takes two arguments, the db object and the
|
||||
collection name. For later, note that this db and collection are the only ones
|
||||
that you can be guaranteed to "own" when asserting. Try to make each state a
|
||||
discrete unit of work that can stand alone without the other states.
|
||||
Additionally, try to define each function that makes up a state
|
||||
with a name as opposed to anonymously - this makes backtraces easier to read
|
||||
when things go wrong.
|
||||
It's best to also declare states within its own closure so as not to interfere with the scope of
|
||||
$config. Each state takes two arguments, the db object and the collection name. For later, note that
|
||||
this db and collection are the only ones that you can be guaranteed to "own" when asserting. Try to
|
||||
make each state a discrete unit of work that can stand alone without the other states. Additionally,
|
||||
try to define each function that makes up a state with a name as opposed to anonymously - this makes
|
||||
backtraces easier to read when things go wrong.
|
||||
|
||||
```javascript
|
||||
$config = (function () {
|
||||
@ -146,14 +133,12 @@ $config = (function () {
|
||||
|
||||
### Defining transitions
|
||||
|
||||
The transitions object defines the probabilities of moving from one state to a
|
||||
different state. When a state's function is finished executing, the FSM randomly
|
||||
chooses the next state using the probabilities provided in the transitions
|
||||
object. The probabilities of the transitions object do not necessarily need to
|
||||
sum to 1.0, since the mechanism for choosing the next state uses normalized
|
||||
random values. Here it is not necessary to use a separate closure. In the
|
||||
example below, we're denoting an equal probability of moving to either of the
|
||||
scan states from the init state:
|
||||
The transitions object defines the probabilities of moving from one state to a different state. When
|
||||
a state's function is finished executing, the FSM randomly chooses the next state using the
|
||||
probabilities provided in the transitions object. The probabilities of the transitions object do not
|
||||
necessarily need to sum to 1.0, since the mechanism for choosing the next state uses normalized
|
||||
random values. Here it is not necessary to use a separate closure. In the example below, we're
|
||||
denoting an equal probability of moving to either of the scan states from the init state:
|
||||
|
||||
```javascript
|
||||
$config = (function () {
|
||||
@ -174,15 +159,13 @@ $config = (function () {
|
||||
|
||||
### Setup and teardown functions
|
||||
|
||||
The setup and teardown functions are special in that they'll only be executed in
|
||||
one thread. See the Runners section for more information about when they're run
|
||||
relative to other workloads in various modes. The setup and teardown functions
|
||||
take three arguments: db, coll, and cluster. The setup function (and
|
||||
corresponding teardown) should perform most of the initialization your workload
|
||||
needs, for example setting parameters on the server, adding seed data, or
|
||||
setting up indexes. Note that rather than executing adminCommands (and others)
|
||||
against the provided `db` you should use the provided
|
||||
`cluster.executeOnMongodNodes` and `cluster.executeOnMongosNodes` functionality.
|
||||
The setup and teardown functions are special in that they'll only be executed in one thread. See the
|
||||
Runners section for more information about when they're run relative to other workloads in various
|
||||
modes. The setup and teardown functions take three arguments: db, coll, and cluster. The setup
|
||||
function (and corresponding teardown) should perform most of the initialization your workload needs,
|
||||
for example setting parameters on the server, adding seed data, or setting up indexes. Note that
|
||||
rather than executing adminCommands (and others) against the provided `db` you should use the
|
||||
provided `cluster.executeOnMongodNodes` and `cluster.executeOnMongosNodes` functionality.
|
||||
|
||||
```javascript
|
||||
$config = (function () {
|
||||
@ -224,18 +207,16 @@ $config = (function () {
|
||||
|
||||
### The `data` object
|
||||
|
||||
The `data` object preserves information between different states of an FSM within
|
||||
an individual thread. Within a single state, the data object becomes the 'this'
|
||||
context in which the state executes. Additionally, a tid attribute is added to
|
||||
data by the runner to allow each thread to access a unique ID. Data is usually
|
||||
defined above states inside the config, but left below it in the returned
|
||||
object. Data is also available as the 'this' context in setup and teardown
|
||||
functions. Note that once the FSM begins, the context data that was passed to
|
||||
the setup function is copied into each thread - meaning each thread has its own
|
||||
copy of the data and modifications to data will not be passed back to the
|
||||
teardown function outside of what was changed in setup. Additionally, in
|
||||
composition, each workload has its own data, meaning you don't have to worry
|
||||
about properties being overridden by workloads other than the current one.
|
||||
The `data` object preserves information between different states of an FSM within an individual
|
||||
thread. Within a single state, the data object becomes the 'this' context in which the state
|
||||
executes. Additionally, a tid attribute is added to data by the runner to allow each thread to
|
||||
access a unique ID. Data is usually defined above states inside the config, but left below it in the
|
||||
returned object. Data is also available as the 'this' context in setup and teardown functions. Note
|
||||
that once the FSM begins, the context data that was passed to the setup function is copied into each
|
||||
thread - meaning each thread has its own copy of the data and modifications to data will not be
|
||||
passed back to the teardown function outside of what was changed in setup. Additionally, in
|
||||
composition, each workload has its own data, meaning you don't have to worry about properties being
|
||||
overridden by workloads other than the current one.
|
||||
|
||||
```javascript
|
||||
$config = (function () {
|
||||
@ -255,57 +236,50 @@ $config = (function () {
|
||||
|
||||
#### `threadCount`
|
||||
|
||||
threadCount is the number of threads that will be used to run your workload in
|
||||
Serial and Parallel modes. In both modes, the number of threads you provide will
|
||||
execute the FSM simultaneously, cycling through different states of the
|
||||
workload. Note that in serial mode, no other threads will be running outside of
|
||||
those pertaining to this workload, and in parallel mode, other workloads will
|
||||
also be given threads to execute their FSM. In some cases in parallel mode, this
|
||||
number will be scaled down to make sure that all workloads can fit within the
|
||||
number of threads available due to system or performance constraints.
|
||||
threadCount is the number of threads that will be used to run your workload in Serial and Parallel
|
||||
modes. In both modes, the number of threads you provide will execute the FSM simultaneously, cycling
|
||||
through different states of the workload. Note that in serial mode, no other threads will be running
|
||||
outside of those pertaining to this workload, and in parallel mode, other workloads will also be
|
||||
given threads to execute their FSM. In some cases in parallel mode, this number will be scaled down
|
||||
to make sure that all workloads can fit within the number of threads available due to system or
|
||||
performance constraints.
|
||||
|
||||
#### `iterations`
|
||||
|
||||
This is just the number of states the FSM will go through before exiting. NOTE:
|
||||
it is _not_ the number of times each state will be executed.
|
||||
This is just the number of states the FSM will go through before exiting. NOTE: it is _not_ the
|
||||
number of times each state will be executed.
|
||||
|
||||
#### `startState` (optional)
|
||||
|
||||
Default value is 'init'. If your workload does not have an init state then you
|
||||
must specify in which state to begin.
|
||||
Default value is 'init'. If your workload does not have an init state then you must specify in which
|
||||
state to begin.
|
||||
|
||||
### Workload helpers
|
||||
|
||||
`jstests/concurrency/fsm_workload_helpers` contains a few files that you can
|
||||
include using 'load' at the top of a workload. These provide auxiliary
|
||||
functionality that might be necessary for some workloads. The most important of
|
||||
these is probably `server_types.js`.
|
||||
`jstests/concurrency/fsm_workload_helpers` contains a few files that you can include using 'load' at
|
||||
the top of a workload. These provide auxiliary functionality that might be necessary for some
|
||||
workloads. The most important of these is probably `server_types.js`.
|
||||
|
||||
#### server_types.js
|
||||
|
||||
This helper file contains four functions: isMongos, isMongod, isMMAPv1, and
|
||||
isWiredTiger. These can be used to restrict operations on different
|
||||
functionality available in sharded environments, as well as based on storage
|
||||
engine, and work as you would expect. One thing to note is that before calling
|
||||
either isMMAPv1 or isWiredTiger, first verify isMongod. When special casing
|
||||
functionality for sharded environments or storage engines, try to special case a
|
||||
test for the exceptionality while still leaving in place assertions for either
|
||||
case.
|
||||
This helper file contains four functions: isMongos, isMongod, isMMAPv1, and isWiredTiger. These can
|
||||
be used to restrict operations on different functionality available in sharded environments, as well
|
||||
as based on storage engine, and work as you would expect. One thing to note is that before calling
|
||||
either isMMAPv1 or isWiredTiger, first verify isMongod. When special casing functionality for
|
||||
sharded environments or storage engines, try to special case a test for the exceptionality while
|
||||
still leaving in place assertions for either case.
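
The guard order described above can be sketched with stand-in helpers. The functions below are not the real `server_types.js` implementations: they read a made-up `fakeServerInfo` field, and the stand-in `isWiredTiger` throws on a non-mongod purely to show why `isMongod` is checked first.

```javascript
// Stand-ins for the helpers; the real ones inspect the server through `db`.
function isMongos(db) {
    return db.fakeServerInfo.process === "mongos";
}

function isMongod(db) {
    return db.fakeServerInfo.process === "mongod";
}

function isWiredTiger(db) {
    // Assumed behavior for this sketch only: storage engine checks need a mongod.
    if (!isMongod(db)) {
        throw new Error("storage engine checks require a mongod");
    }
    return db.fakeServerInfo.storageEngine === "wiredTiger";
}

// Special-case the exceptional behavior while keeping the common assertions.
function planChecks(db) {
    const checks = ["common check"];
    // Verify isMongod first; && short-circuits before isWiredTiger is called.
    if (isMongod(db) && isWiredTiger(db)) {
        checks.push("wiredTiger-specific check");
    }
    if (isMongos(db)) {
        checks.push("sharding-specific check");
    }
    return checks;
}

const fakeMongod = {fakeServerInfo: {process: "mongod", storageEngine: "wiredTiger"}};
const fakeMongos = {fakeServerInfo: {process: "mongos"}};

console.log(planChecks(fakeMongod));
console.log(planChecks(fakeMongos));
```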
|
||||
|
||||
#### indexed_noindex.js
|
||||
|
||||
This helper can be used along with inheritance, to create a workload that is
|
||||
exactly the same as an existing workload, but with the index created during
|
||||
setup removed. To use it, replace the function you provide to the
|
||||
extendWorkload function with indexedNoindex. Additionally, ensure that the
|
||||
workload you are extending has a function in its data object called
|
||||
"getIndexSpec" that returns the spec for the index to be removed.
|
||||
This helper can be used along with inheritance, to create a workload that is exactly the same as an
|
||||
existing workload, but with the index created during setup removed. To use it, replace the
|
||||
function you provide to the extendWorkload function with indexedNoindex. Additionally, ensure that
|
||||
the workload you are extending has a function in its data object called "getIndexSpec" that returns
|
||||
the spec for the index to be removed.
|
||||
|
||||
```javascript
|
||||
import {extendWorkload} from "jstests/concurrency/fsm_libs/extend_workload.js";
|
||||
load(
|
||||
"jstests/concurrency/fsm_workload_modifiers/collection_write_path/indexed_noindex.js",
|
||||
); // for indexedNoindex
|
||||
load("jstests/concurrency/fsm_workload_modifiers/collection_write_path/indexed_noindex.js"); // for indexedNoindex
|
||||
import {$config as $baseConfig} from "jstests/concurrency/fsm_workloads/workload_with_index.js";
|
||||
|
||||
export const $config = extendWorkload($baseConfig, indexedNoindex);
|
||||
@ -313,90 +287,80 @@ export const $config = extendWorkload($baseConfig, indexedNoindex);
|
||||
|
||||
#### drop_utils.js
|
||||
|
||||
These helpers provide safe methods for dropping collections, databases, roles,
|
||||
and users created during a workload's execution. The methods take a regular
|
||||
expression that the collection, database, role, or user name must match for it
|
||||
to be dropped. Prefixing the items in any of these categories you create with a
|
||||
prefix defined by your workload name is a good idea since the workload file name
|
||||
can be assumed unique and will allow you to only affect your workload in these
|
||||
cases.
|
||||
These helpers provide safe methods for dropping collections, databases, roles, and users created
|
||||
during a workload's execution. The methods take a regular expression that the collection, database,
|
||||
role, or user name must match for it to be dropped. Prefixing the items in any of these categories
|
||||
you create with a prefix defined by your workload name is a good idea since the workload file name
|
||||
can be assumed unique and will allow you to only affect your workload in these cases.
|
||||
|
||||
## Test runners
|
||||
|
||||
By default, all runners below are allowed to open a maximum of
|
||||
`maxAllowedConnections` (= 100 by default) explicit connections. In replicated
|
||||
and sharded environments, implicit connections are created to the original
|
||||
mongod provided to the mongo shell executing the runner (one for each thread).
|
||||
This behavior cannot be controlled, but it highlights the importance of always
|
||||
using the db object provided in the FSM states rather than the global db which
|
||||
will always correspond to the mongod the mongo shell initially connected to.
|
||||
By default, all runners below are allowed to open a maximum of `maxAllowedConnections` (= 100 by
|
||||
default) explicit connections. In replicated and sharded environments, implicit connections are
|
||||
created to the original mongod provided to the mongo shell executing the runner (one for each
|
||||
thread). This behavior cannot be controlled, but it highlights the importance of always using the db
|
||||
object provided in the FSM states rather than the global db which will always correspond to the
|
||||
mongod the mongo shell initially connected to.
|
||||
|
||||
### Execution modes
|
||||
|
||||
#### Serial
|
||||
|
||||
Serial is the simplest of all three modes and basically works as explained
|
||||
above. Setup is run single threaded, data is copied into multiple threads where
|
||||
the states are executed, and once all the threads have finished a teardown
|
||||
function is run and the runner moves on to the next workload.
|
||||
Serial is the simplest of all three modes and basically works as explained above. Setup is run
|
||||
single threaded, data is copied into multiple threads where the states are executed, and once all
|
||||
the threads have finished a teardown function is run and the runner moves on to the next workload.
|
||||
|
||||

|
||||
|
||||
#### Parallel (Simultaneous)
|
||||
|
||||
In parallel or simultaneous mode (the naming convention has been slightly
|
||||
inconsistent), the ordering becomes a little different. All workloads have their
|
||||
setup functions run, then threads are spawned for each workload, and once they
|
||||
all complete, all threads have their teardown function run.
|
||||
In parallel or simultaneous mode (the naming convention has been slightly inconsistent), the
|
||||
ordering becomes a little different. All workloads have their setup functions run, then threads are
|
||||
spawned for each workload, and once they all complete, all threads have their teardown function run.
|
||||
|
||||

|
||||
|
||||
### Existing runners
|
||||
|
||||
The existing runners all use `jstests/concurrency/fsm_libs/runner.js` to
|
||||
actually execute the workloads. Most information about arguments and available
|
||||
runWorkloads methods can be found by inspecting the source. Below you can find
|
||||
the existing runners explained. The first argument to the three runWorkloads
|
||||
methods (each corresponding to a different run mode), is an array of workload
|
||||
files to run. clusterOptions, the second argument to the runWorkloads functions,
|
||||
is explained in the other components section below. Execution options for
|
||||
runWorkloads functions, the third argument, can contain the following options
|
||||
(some depend on the run mode):
|
||||
The existing runners all use `jstests/concurrency/fsm_libs/runner.js` to actually execute the
|
||||
workloads. Most information about arguments and available runWorkloads methods can be found by
|
||||
inspecting the source. Below you can find the existing runners explained. The first argument to the
|
||||
three runWorkloads methods (each corresponding to a different run mode) is an array of workload
|
||||
files to run. clusterOptions, the second argument to the runWorkloads functions, is explained in the
|
||||
other components section below. Execution options for runWorkloads functions, the third argument,
|
||||
can contain the following options (some depend on the run mode):
|
||||
|
||||
- `numSubsets` - Not available in serial mode, determines how many subsets of
|
||||
workloads to execute in parallel mode
|
||||
- `subsetSize` - Not available in serial mode, determines how large each subset of
|
||||
workloads executed is
|
||||
- `numSubsets` - Not available in serial mode, determines how many subsets of workloads to execute
|
||||
in parallel mode
|
||||
- `subsetSize` - Not available in serial mode, determines how large each subset of workloads
|
||||
executed is
|
||||
|
||||
#### fsm_all.js
|
||||
|
||||
Runs all workloads serially. For each workload, `$config.threadCount` threads
|
||||
are spawned and each thread runs for exactly `$config.iterations` steps starting
|
||||
at `$config.startState` and transitioning to other states based on the
|
||||
transition probabilities defined in $config.transitions.
|
||||
Runs all workloads serially. For each workload, `$config.threadCount` threads are spawned and each
|
||||
thread runs for exactly `$config.iterations` steps starting at `$config.startState` and
|
||||
transitioning to other states based on the transition probabilities defined in `$config.transitions`.
|
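To make the per-thread behavior more concrete, here is a minimal Python sketch of a walk over a transition table. The `config` dictionary mirrors the `$config` fields named above (`iterations`, `startState`, `transitions`), but the `walk` function, the example state names, and the choice to count each visited state as one step are assumptions made for illustration. The actual runner is written in JavaScript and is not reproduced here.

```python
import random


def walk(config, rng):
    """Visit `iterations` states, starting at `startState`."""
    state = config["startState"]
    visited = []
    for _ in range(config["iterations"]):
        visited.append(state)
        # Choose the next state using the weights listed for the current one.
        options = config["transitions"][state]
        state = rng.choices(list(options), weights=list(options.values()))[0]
    return visited


# Hypothetical workload configuration with three states.
config = {
    "iterations": 5,
    "startState": "init",
    "transitions": {
        "init": {"insert": 1.0},
        "insert": {"insert": 0.5, "find": 0.5},
        "find": {"insert": 1.0},
    },
}

path = walk(config, random.Random(0))
```

Passing an explicit `random.Random` instance keeps the walk reproducible for a given seed, which is the same reason a seed is worth recording when a workload fails.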
||||
|
||||
#### fsm_all_simultaneous.js
|
||||
|
||||
options: numSubsets, subsetSize
|
||||
|
||||
Runs numSubsets subsets of size subsetSize of all workloads. The workloads in
|
||||
each subset are started in parallel and each workload is run according to
|
||||
settings in `$config`.
|
||||
Runs numSubsets subsets of size subsetSize of all workloads. The workloads in each subset are
|
||||
started in parallel and each workload is run according to settings in `$config`.
|
||||
|
||||
#### fsm_all_replication.js
|
||||
|
||||
Sets up a replica set (with 3 mongods by default) and runs workloads serially or
|
||||
in parallel. For example,
|
||||
Sets up a replica set (with 3 mongods by default) and runs workloads serially or in parallel. For
|
||||
example,
|
||||
|
||||
`runWorkloadsSerially([<workload1>, <workload2>, ...], { replication: true } )`
|
||||
|
||||
creates a replica set with 3 members and runs some workloads serially on the
|
||||
primary.
|
||||
creates a replica set with 3 members and runs some workloads serially on the primary.
|
||||
|
||||
#### fsm_all_sharded.js
|
||||
|
||||
Sets up a sharded cluster (with 2 shards and 1 mongos by default) and runs
|
||||
workloads serially or in parallel. For example,
|
||||
Sets up a sharded cluster (with 2 shards and 1 mongos by default) and runs workloads serially or in
|
||||
parallel. For example,
|
||||
|
||||
`runWorkloadsInParallel([<workload1>, <workload2>, ...], { sharded: true } )`
|
||||
|
||||
@ -404,36 +368,33 @@ creates a sharded cluster and runs workloads in parallel.
|
||||
|
||||
#### fsm_all_sharded_replication.js
|
||||
|
||||
Sets up a sharded cluster (with 2 shards, each having 3 replica set members, and
|
||||
1 mongos by default) and runs workloads serially or in parallel.
|
||||
Sets up a sharded cluster (with 2 shards, each having 3 replica set members, and 1 mongos by
|
||||
default) and runs workloads serially or in parallel.
|
||||
|
||||
### Excluding a workload
|
||||
|
||||
If any workloads fail because of known bugs in MongoDB, persistent MCI failures
|
||||
or timeouts, the troublesome workload can be excluded from running by placing it
|
||||
in the exclusion array in the corresponding runner. Please remember to place a
|
||||
comment next to the excluded workload name identifying the reason a workload is
|
||||
being excluded. For example,
|
||||
If any workloads fail because of known bugs in MongoDB, persistent MCI failures or timeouts, the
|
||||
troublesome workload can be excluded from running by placing it in the exclusion array in the
|
||||
corresponding runner. Please remember to place a comment next to the excluded workload name
|
||||
identifying the reason a workload is being excluded. For example,
|
||||
|
||||
`'agg_sort_external.js', // SERVER-16700 Deadlock on WiredTiger LSM`
|
||||
|
||||
Each file should also have two predefined sections - one for known bugs and one
|
||||
for restrictions. The one above would be considered a known bug. However,
|
||||
excluding a compact workload from sharded runners would be a restriction because
|
||||
compact can only be run against individual mongods.
|
||||
Each file should also have two predefined sections - one for known bugs and one for restrictions.
|
||||
The one above would be considered a known bug. However, excluding a compact workload from sharded
|
||||
runners would be a restriction because compact can only be run against individual mongods.
|
||||
|
||||
## Other components of the FSM library
|
||||
|
||||
Most of these components live in jstests/concurrency/fsm_libs and provide the
|
||||
functionality used by the runner.
|
||||
Most of these components live in jstests/concurrency/fsm_libs and provide the functionality used by
|
||||
the runner.
|
||||
|
||||
### ThreadManager
|
||||
|
||||
Responsible for spawning and joining worker threads. Each spawned thread is
|
||||
wrapped in a try/finally block to ensure that the database connection implicitly
|
||||
created during the thread's execution is eventually closed explicitly. The
|
||||
ThreadManager sets a random seed `([0, randInt(1e13))` which is the range of
|
||||
`new Date().getTime())` before executing each workload.
|
||||
Responsible for spawning and joining worker threads. Each spawned thread is wrapped in a try/finally
|
||||
block to ensure that the database connection implicitly created during the thread's execution is
|
||||
eventually closed explicitly. The ThreadManager sets a random seed (`[0, randInt(1e13))`, which is
|
||||
the range of `new Date().getTime()`) before executing each workload.
|
||||
|
||||
### Worker Thread
|
||||
|
||||
@ -441,36 +402,30 @@ Thread spawned by ThreadManager and used to run a Finite State Machine.
|
||||
|
||||
### Cluster
|
||||
|
||||
cluster.js is responsible for providing the cluster object that is passed to
|
||||
setup and teardown functions, and the initial connection to a db to be used by
|
||||
runner to pass to the workloads. For anything except for standalone, it makes
|
||||
use of the shell's built-in cluster test helpers like `ShardingTest` and
|
||||
`ReplSetTest`. clusterOptions are passed to cluster.js for initialization.
|
||||
cluster.js is responsible for providing the cluster object that is passed to setup and teardown
|
||||
functions, and the initial connection to a db to be used by runner to pass to the workloads. For
|
||||
anything except for standalone, it makes use of the shell's built-in cluster test helpers like
|
||||
`ShardingTest` and `ReplSetTest`. clusterOptions are passed to cluster.js for initialization.
|
||||
clusterOptions include:
|
||||
|
||||
- `replication`: boolean, whether or not to use replication in the cluster
|
||||
- `sameCollection`: boolean, whether or not all workloads are passed the same
|
||||
collection
|
||||
- `sameCollection`: boolean, whether or not all workloads are passed the same collection
|
||||
- `sameDB`: boolean, whether or not all workloads are passed the same DB
|
||||
- `setupFunctions`: object, containing at most two functions under the keys
|
||||
'mongod' and 'mongos'. This allows you to run a function against all mongod or
|
||||
mongos nodes in the cluster as part of the cluster initialization. Each
|
||||
function takes a single argument, the db object against which configuration
|
||||
can be run (will be set for each mongod/mongos)
|
||||
- `setupFunctions`: object, containing at most two functions under the keys 'mongod' and 'mongos'.
|
||||
This allows you to run a function against all mongod or mongos nodes in the cluster as part of the
|
||||
cluster initialization. Each function takes a single argument, the db object against which
|
||||
configuration can be run (will be set for each mongod/mongos)
|
||||
- `sharded`: boolean, whether or not to use sharding in the cluster
|
||||
|
||||
Note that sameCollection and sameDB can increase contention for a resource, but
|
||||
will also decrease the strength of the assertions by ruling out the use of OwnDB
|
||||
and OwnColl assertions.
|
||||
Note that sameCollection and sameDB can increase contention for a resource, but will also decrease
|
||||
the strength of the assertions by ruling out the use of OwnDB and OwnColl assertions.
|
||||
|
||||
### Miscellaneous Execution Notes
|
||||
|
||||
- A `CountDownLatch` (exposed through the v8-based mongo shell, as of MongoDB 3.0)
|
||||
is used as a synchronization primitive by the ThreadManager to wait until all
|
||||
spawned threads have finished being spawned before starting workload
|
||||
execution.
|
||||
- If more than 20% of the threads fail while spawning, we abort the test. If
|
||||
fewer than 20% of the threads fail while spawning we allow the non-failed
|
||||
threads to continue with the test. The 20% threshold is somewhat arbitrary;
|
||||
the goal is to abort if "mostly all" of the threads failed but to tolerate "a
|
||||
few" threads failing.
|
||||
- A `CountDownLatch` (exposed through the v8-based mongo shell, as of MongoDB 3.0) is used as a
|
||||
synchronization primitive by the ThreadManager to wait until all threads have finished spawning
|
||||
before starting workload execution.
|
||||
- If more than 20% of the threads fail while spawning, we abort the test. If fewer than 20% of the
|
||||
threads fail while spawning we allow the non-failed threads to continue with the test. The 20%
|
||||
threshold is somewhat arbitrary; the goal is to abort if "mostly all" of the threads failed but to
|
||||
tolerate "a few" threads failing.
|
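The spawn-failure rule above can be sketched as a small predicate. The function name `should_abort` and its parameters are hypothetical. The text does not say what happens when exactly 20% of the threads fail, so this sketch assumes the test continues in that case.

```python
def should_abort(num_failed, num_threads, threshold=0.2):
    """Return True when more than `threshold` of the threads failed to spawn."""
    if num_threads <= 0:
        raise ValueError("num_threads must be positive")
    # Assumption: a failure rate exactly equal to the threshold does not abort.
    return num_failed / num_threads > threshold
```

With 10 threads, 3 failures would abort the test while 1 failure would let the remaining 9 threads continue.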
||||
|
||||
@ -1,37 +1,34 @@
|
||||
# Hang Analyzer
|
||||
|
||||
The hang analyzer is a tool to collect cores and other information from processes
|
||||
that are suspected to have hung. Any task which exceeds its timeout in Evergreen
|
||||
will automatically be hang-analyzed, with information being written compressed
|
||||
and uploaded to S3.
|
||||
The hang analyzer is a tool to collect cores and other information from processes that are suspected
|
||||
to have hung. Any task which exceeds its timeout in Evergreen will automatically be hang-analyzed,
|
||||
with the collected information compressed and uploaded to S3.
|
||||
|
||||
The hang analyzer can also be invoked locally at any time. For all non-Jepsen
|
||||
tasks, the invocation is `buildscripts/resmoke.py hang-analyzer -o file -o stdout -m exact -p python`. You may need to substitute `python` with the name of the python binary
|
||||
you are using, which may be one of `python`, `python3`, or on Windows: `Python`,
|
||||
`Python3`.
|
||||
The hang analyzer can also be invoked locally at any time. For all non-Jepsen tasks, the invocation
|
||||
is `buildscripts/resmoke.py hang-analyzer -o file -o stdout -m exact -p python`. You may need to
|
||||
substitute `python` with the name of the python binary you are using, which may be one of `python`,
|
||||
`python3`, or on Windows: `Python`, `Python3`.
|
||||
|
||||
For jepsen tasks, the invocation is `buildscripts/resmoke.py hang-analyzer -o file -o stdout -p dbtest,java,mongo,mongod,mongos,python,_test`.
|
||||
For Jepsen tasks, the invocation is
|
||||
`buildscripts/resmoke.py hang-analyzer -o file -o stdout -p dbtest,java,mongo,mongod,mongos,python,_test`.
|
||||
|
||||
## Interesting Processes
|
||||
|
||||
The hang analyzer detects and runs against processes which are considered
|
||||
interesting.
|
||||
The hang analyzer detects and runs against processes which are considered interesting.
|
||||
|
||||
Tasks whose name contains "jepsen": any process whose name exactly matches one
|
||||
of `dbtest,java,mongo,mongod,mongos,python,_test`.
|
||||
Tasks whose name contains "jepsen": any process whose name exactly matches one of
|
||||
`dbtest,java,mongo,mongod,mongos,python,_test`.
|
||||
|
||||
In all other scenarios, including local use of the hang-analyzer, an interesting
|
||||
process is any of:
|
||||
In all other scenarios, including local use of the hang-analyzer, an interesting process is any of:
|
||||
|
||||
- a process whose name starts with `python` or `live-record`
|
||||
- one which has been spawned as a child process of resmoke.
|
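The non-Jepsen rule above amounts to a simple predicate over a process's name and parentage. The sketch below is only an illustration: `is_interesting` and its parameters are hypothetical names, and it checks direct children only, whereas the real hang analyzer may determine the process tree differently.

```python
def is_interesting(name, parent_pid, resmoke_pid):
    """Apply the non-Jepsen rule: name prefix match, or child of resmoke."""
    # str.startswith accepts a tuple of prefixes.
    if name.startswith(("python", "live-record")):
        return True
    # Assumption: "child of resmoke" is checked via the direct parent PID.
    return parent_pid == resmoke_pid
```

For example, a `mongod` process is interesting only when resmoke spawned it, while `python3` matches by name regardless of its parent.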
||||
|
||||
The resmoke subcommand `hang-analyzer` will send SIGUSR1/use SetEvent to signal
|
||||
resmoke to:
|
||||
The resmoke subcommand `hang-analyzer` will send SIGUSR1 (or use SetEvent on Windows) to signal resmoke to:
|
||||
|
||||
- Print stack traces for all python threads
|
||||
- Collect core dumps and other information for any non-python child
|
||||
processes, see `Data Collection` below
|
||||
- Collect core dumps and other information for any non-python child processes, see `Data Collection`
|
||||
below
|
||||
- Re-signal any python child processes to do the same
|
||||
|
||||
## Data Collection
|
||||
@ -41,8 +38,8 @@ Data collection occurs in the following sequence:
|
||||
- Pause all non-python processes
|
||||
- Grab debug symbols on non-Sanitizer builds
|
||||
- Signal python processes
|
||||
- Dump cores of as many processes as possible, until the disk quota is exceeded.
|
||||
The default quota is 90% of total volume space.
|
||||
- Dump cores of as many processes as possible, until the disk quota is exceeded. The default quota
|
||||
is 90% of total volume space.
|
||||
|
||||
- Collect additional, non-core data. Ideally:
|
||||
- Print C++ Stack traces
|
||||
@ -54,13 +51,12 @@ Data collection occurs in the following sequence:
|
||||
- Dump java processes (Jepsen tests) with jstack
|
||||
- SIGABRT (Unix)/terminate (Windows) go processes
|
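The disk quota step in the sequence above can be sketched as follows. This is an assumption-laden illustration: `cores_within_quota` and its parameters are hypothetical, and it interprets the quota as "used space on the volume may not exceed 90% of its total size", stopping before a dump that would cross that limit. The actual implementation may measure or enforce the quota differently.

```python
def cores_within_quota(core_sizes, used_bytes, total_bytes, quota=0.9):
    """Count how many cores, taken in order, fit before the quota is crossed."""
    limit = quota * total_bytes
    dumped = 0
    for size in core_sizes:
        # Stop at the first core that would push usage past the quota.
        if used_bytes + size > limit:
            break
        used_bytes += size
        dumped += 1
    return dumped
```

On a 100-unit volume with 65 units already used, three 10-unit cores would yield two dumps under this interpretation, because the third would raise usage to 95 units.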
||||
|
||||
Note that the list of non-core data collected is only accurate on Linux. Other
|
||||
platforms only perform a subset of these operations.
|
||||
Note that the list of non-core data collected is only accurate on Linux. Other platforms only
|
||||
perform a subset of these operations.
|
||||
|
||||
Additionally, note that the hang analyzer is subject to Evergreen post task
|
||||
timeouts, and may not have enough time to collect all information before
|
||||
being terminated by the Evergreen agent. When running locally there is no
|
||||
timeout, and the hang analyzer may ironically hang indefinitely.
|
||||
Additionally, note that the hang analyzer is subject to Evergreen post task timeouts, and may not
|
||||
have enough time to collect all information before being terminated by the Evergreen agent. When
|
||||
running locally there is no timeout, and the hang analyzer may ironically hang indefinitely.
|
||||
|
||||
### Implementations
|
||||
|
||||
|
||||
@ -2,11 +2,23 @@
|
||||
|
||||
## Overview
|
||||
|
||||
[Mongobridge](https://github.com/mongodb/mongo/blob/e810af1916caaedb1cde8d1e1b74bb50b2461daf/src/mongo/tools/mongobridge_tool/bridge.cpp#L1) is a network fault injection testing tool that allows test authors to intentionally simulate network issues such as connection failures, message delays, or packet loss during communication to any node in a cluster. It acts as a transparent proxy between MongoDB processes and their clients, enabling controlled network fault injection for testing distributed system behavior.
|
||||
[Mongobridge](https://github.com/mongodb/mongo/blob/e810af1916caaedb1cde8d1e1b74bb50b2461daf/src/mongo/tools/mongobridge_tool/bridge.cpp#L1)
|
||||
is a network fault injection testing tool that allows test authors to intentionally simulate network
|
||||
issues such as connection failures, message delays, or packet loss during communication to any node
|
||||
in a cluster. It acts as a transparent proxy between MongoDB processes and their clients, enabling
|
||||
controlled network fault injection for testing distributed system behavior.
|
||||
|
||||
## How It Works
|
||||
|
||||
When `ReplSetTest` or `ShardingTest` are instructed to use `mongobridge`, they will [set up a mongobridge process](https://github.com/mongodb/mongo/blob/e810af1916caaedb1cde8d1e1b74bb50b2461daf/jstests/libs/replsettest.js#L2962) for each node that [creates a ProxiedConnection](https://github.com/mongodb/mongo/blob/e810af1916caaedb1cde8d1e1b74bb50b2461daf/src/mongo/tools/mongobridge_tool/bridge.cpp#L323-L324) between the node and any clients (including other nodes in the cluster) attempting to communicate with it. When test authors send a command to a node, mongobridge [intercepts the command and applies any configured actions](https://github.com/mongodb/mongo/blob/e810af1916caaedb1cde8d1e1b74bb50b2461daf/src/mongo/tools/mongobridge_tool/bridge.cpp#L395-L430) onto the commands before forwarding the command along to the node itself. This allows simple fault injection from the test author's perspective.
|
||||
When `ReplSetTest` or `ShardingTest` are instructed to use `mongobridge`, they will
|
||||
[set up a mongobridge process](https://github.com/mongodb/mongo/blob/e810af1916caaedb1cde8d1e1b74bb50b2461daf/jstests/libs/replsettest.js#L2962)
|
||||
for each node that
|
||||
[creates a ProxiedConnection](https://github.com/mongodb/mongo/blob/e810af1916caaedb1cde8d1e1b74bb50b2461daf/src/mongo/tools/mongobridge_tool/bridge.cpp#L323-L324)
|
||||
between the node and any clients (including other nodes in the cluster) attempting to communicate
|
||||
with it. When test authors send a command to a node, mongobridge
|
||||
[intercepts the command and applies any configured actions](https://github.com/mongodb/mongo/blob/e810af1916caaedb1cde8d1e1b74bb50b2461daf/src/mongo/tools/mongobridge_tool/bridge.cpp#L395-L430)
|
||||
onto the commands before forwarding the command along to the node itself. This allows simple fault
|
||||
injection from the test author's perspective.
|
||||
|
||||
## Quick Start
|
||||
|
||||
@ -23,7 +35,8 @@ To use mongobridge in your tests:
|
||||
});
|
||||
```
|
||||
|
||||
- **Test commands must be enabled**: Mongobridge's `*From` commands require `enableTestCommands: true` (which is the default in test environments)
|
||||
- **Test commands must be enabled**: Mongobridge's `*From` commands require
|
||||
`enableTestCommands: true` (which is the default in test environments)
|
||||
|
||||
2. **Inject network faults** using bridge commands:
|
||||
|
||||
@ -38,11 +51,16 @@ To use mongobridge in your tests:
|
||||
st.rs0.getPrimary().acceptConnectionsFrom(st.rs0.getSecondary());
|
||||
```
|
||||
|
||||
3. Operations that depend on communication between the affected nodes will fail or timeout as expected.
|
||||
3. Operations that depend on communication between the affected nodes will fail or time out as
|
||||
expected.
|
||||
|
||||
## What to keep in mind
|
||||
|
||||
Be aware that there are consequences to injecting network faults between nodes that can cause downstream impact in (for example) heartbeats, sync source selection, and SDAM, and so after a fault has been injected the test may not be in the state you expect it to be in for future commands. It is best to keep mongobridge tests relatively short and targeted to ensure that flakiness due to these faults doesn't impact the rest of your testing.
|
||||
Be aware that there are consequences to injecting network faults between nodes that can cause
|
||||
downstream impact in (for example) heartbeats, sync source selection, and SDAM, and so after a fault
|
||||
has been injected the test may not be in the state you expect it to be in for future commands. It is
|
||||
best to keep mongobridge tests relatively short and targeted to ensure that flakiness due to these
|
||||
faults doesn't impact the rest of your testing.
|
||||
|
||||
## Command Reference
|
||||
|
||||
@ -71,7 +89,8 @@ node.acceptConnectionsFrom([node1, node2, node3]); // Multiple nodes
|
||||
node.rejectConnectionsFrom(otherNode);
|
||||
```
|
||||
|
||||
**Effect**: New connections are rejected, existing connections are closed when a new request is sent over them
|
||||
**Effect**: New connections are rejected; existing connections are closed when a new request is sent
|
||||
over them
|
||||
|
||||
**Use case**: Simulating complete network partitions
|
||||
|
||||
@ -183,7 +202,8 @@ primary.discardMessagesFrom(secondary, 0.3);
|
||||
|
||||
### Limitations
|
||||
|
||||
- **OP_QUERY exhaust**: Not supported for legacy exhaust queries (OP_MSG exhaust cursors are supported)
|
||||
- **OP_QUERY exhaust**: Not supported for legacy exhaust queries (OP_MSG exhaust cursors are
|
||||
supported)
|
||||
- **Direct connections**: Only works when connections go through the bridge proxy
|
||||
- **TLS support**: Mongobridge is not supported if the cluster is using TLS.
|
||||
|
||||
|
||||
@ -11,26 +11,32 @@ Using OTel we capture the following things
|
||||
3. Duration of hooks before and after test/suite
|
||||
4. Resmoke archiver (when there is a failure we archive core dumps)
|
||||
|
||||
To see this visually navigate to the [resmoke dataset](https://ui.honeycomb.io/mongodb-4b/environments/production/datasets/resmoke/home) and view a recent trace.
|
||||
To see this visually, navigate to the
|
||||
[resmoke dataset](https://ui.honeycomb.io/mongodb-4b/environments/production/datasets/resmoke/home)
|
||||
and view a recent trace.
|
||||
|
||||
## A look at source code
|
||||
|
||||
### Configuration
|
||||
|
||||
The bulk of configuration is done in the
|
||||
`_set_up_tracing(...)` method in [configure_resmoke.py#L164](https://github.com/mongodb/mongo/blob/976ce50f6134789e73c639848b35f10040f0ff4a/buildscripts/resmokelib/configure_resmoke.py#L164). This method includes documentation on how it works.
|
||||
The bulk of configuration is done in the `_set_up_tracing(...)` method in
|
||||
[configure_resmoke.py#L164](https://github.com/mongodb/mongo/blob/976ce50f6134789e73c639848b35f10040f0ff4a/buildscripts/resmokelib/configure_resmoke.py#L164).
|
||||
This method includes documentation on how it works.
|
||||
|
||||
## BatchedBaggageSpanProcessor
|
||||
|
||||
See documentation [batched_baggage_span_processor.py#L8](https://github.com/mongodb/mongo/blob/976ce50f6134789e73c639848b35f10040f0ff4a/buildscripts/resmokelib/utils/batched_baggage_span_processor.py#L8)
|
||||
See documentation
|
||||
[batched_baggage_span_processor.py#L8](https://github.com/mongodb/mongo/blob/976ce50f6134789e73c639848b35f10040f0ff4a/buildscripts/resmokelib/utils/batched_baggage_span_processor.py#L8)
|
||||
|
||||
## FileSpanExporter
|
||||
|
||||
See documentation [file_span_exporter.py#L16](https://github.com/mongodb/mongo/blob/976ce50f6134789e73c639848b35f10040f0ff4a/buildscripts/resmokelib/utils/file_span_exporter.py#L16)
|
||||
See documentation
|
||||
[file_span_exporter.py#L16](https://github.com/mongodb/mongo/blob/976ce50f6134789e73c639848b35f10040f0ff4a/buildscripts/resmokelib/utils/file_span_exporter.py#L16)
|
||||
|
||||
## Capturing Data
|
||||
|
||||
We mostly capture data by using a decorator on methods. Example taken from [job.py#L200](https://github.com/mongodb/mongo/blob/6d36ac392086df85844870eef1d773f35020896c/buildscripts/resmokelib/testing/job.py#L200)
|
||||
We mostly capture data by using a decorator on methods. Example taken from
|
||||
[job.py#L200](https://github.com/mongodb/mongo/blob/6d36ac392086df85844870eef1d773f35020896c/buildscripts/resmokelib/testing/job.py#L200)
|
||||
|
||||
```
|
||||
TRACER = trace.get_tracer("resmoke")
|
||||
@ -41,7 +47,11 @@ def func_name(...):
|
||||
span.set_attribute("attr1", True)
|
||||
```
|
||||
|
||||
This system is nice because the decorator captures exceptions and other failures and a user can never forget to close a span. On occasion we will also start a span using the `with` clause in python. However, the decorator method is preferred since the method below makes more of a readability impact on the code. This example is taken from [job.py#L215](https://github.com/mongodb/mongo/blob/6d36ac392086df85844870eef1d773f35020896c/buildscripts/resmokelib/testing/job.py#L215)
|
||||
This system is nice because the decorator captures exceptions and other failures and a user can
|
||||
never forget to close a span. On occasion we will also start a span using the `with` clause in
|
||||
python. However, the decorator method is preferred since the method below makes more of a
|
||||
readability impact on the code. This example is taken from
|
||||
[job.py#L215](https://github.com/mongodb/mongo/blob/6d36ac392086df85844870eef1d773f35020896c/buildscripts/resmokelib/testing/job.py#L215)
|
||||
|
||||
```
|
||||
with TRACER.start_as_current_span("func_name", attributes={}):
|
||||
@ -51,4 +61,9 @@ with TRACER.start_as_current_span("func_name", attributes={}):
|
||||
|
||||
## Insights We Have Made (so far)
|
||||
|
||||
Using [this dashboard](https://ui.honeycomb.io/mongodb-4b/environments/production/board/3bATQLb38bh/Server-CI) and [this query](https://ui.honeycomb.io/mongodb-4b/environments/production/datasets/resmoke/result/GFa2YJ6d4vU/a/7EYuMJtH8KX/Slowest-Resmoke-Tests) we can see the most expensive single js tests. We plan to make tickets for teams to fix these long running tests for cloud savings as well as developer time savings.
|
||||
Using
|
||||
[this dashboard](https://ui.honeycomb.io/mongodb-4b/environments/production/board/3bATQLb38bh/Server-CI)
|
||||
and
|
||||
[this query](https://ui.honeycomb.io/mongodb-4b/environments/production/datasets/resmoke/result/GFa2YJ6d4vU/a/7EYuMJtH8KX/Slowest-Resmoke-Tests)
|
||||
we can see the most expensive single js tests. We plan to file tickets for teams to fix these
|
||||
long-running tests, saving both cloud costs and developer time.
|
||||
|
||||
@ -1,10 +1,14 @@
|
||||
# Resmoke Module Configuration
|
||||
|
||||
This configuration allows additional modules to be added to Resmoke, providing more context about their associated directories. Modules can specify directories for fixtures, hooks, suites, and JavaScript tests, which Resmoke incorporates during its testing process.
|
||||
This configuration allows additional modules to be added to Resmoke, providing more context about
|
||||
their associated directories. Modules can specify directories for fixtures, hooks, suites, and
|
||||
JavaScript tests, which Resmoke incorporates during its testing process.
|
||||
|
||||
## Adding a New Module
|
||||
|
||||
To add a new module to Resmoke, define the module name and specify its `fixture_dirs`, `hook_dirs`, `suite_dirs`, and `jstest_dirs` in the YAML configuration. Each field should be a list of directory paths.
|
||||
To add a new module to Resmoke, define the module name and specify its `fixture_dirs`, `hook_dirs`,
|
||||
`suite_dirs`, and `jstest_dirs` in the YAML configuration. Each field should be a list of directory
|
||||
paths.
|
||||
|
||||
### Example YAML Configuration
|
||||
|
||||
@ -25,9 +29,12 @@ my_new_module:
|
||||
- **`fixture_dirs`**: Directories containing fixtures associated with the module.
|
||||
- **`hook_dirs`**: Directories containing hooks associated with the module.
|
||||
- **`suite_dirs`**: Directories containing suites with test configurations.
|
||||
- **`jstest_dirs`**: Directories containing JavaScript tests specific to the module. This ensures module-specific tests are excluded from other suite configurations when the module is disabled.
|
||||
- **`jstest_dirs`**: Directories containing JavaScript tests specific to the module. This ensures
|
||||
module-specific tests are excluded from other suite configurations when the module is disabled.
|
||||
|
||||
## Notes
|
||||
|
||||
- Any suite can use jstests from any directory, when the module is enabled the configured jstest dirs does nothing. Only when the module is disabled does it filter out the tests that might be configured in a suite from a different module.
|
||||
- Any suite can use jstests from any directory. When the module is enabled, the configured
|
||||
`jstest_dirs` have no effect. Only when the module is disabled are the tests in those directories
|
||||
filtered out of any suite that references them, including suites from a different module.
|
||||
- Fields can be omitted or left as empty lists.
|
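The filtering behavior described in these notes can be sketched in a few lines. Everything here is illustrative: `filter_tests`, the shape of the `modules` dictionary, and the example paths are assumptions, not Resmoke's actual implementation or file layout.

```python
def filter_tests(suite_tests, modules, enabled):
    """Drop tests that live under the jstest_dirs of any disabled module."""
    excluded = [
        d.rstrip("/") + "/"
        for name, cfg in modules.items()
        if name not in enabled
        # The field may be omitted or an empty list, so fall back to [].
        for d in (cfg.get("jstest_dirs") or [])
    ]
    return [t for t in suite_tests if not any(t.startswith(d) for d in excluded)]


# Hypothetical module configuration and suite contents.
modules = {
    "my_new_module": {"jstest_dirs": ["src/my_module/jstests"]},
    "other_module": {},
}
tests = ["jstests/core/a.js", "src/my_module/jstests/b.js"]

kept_when_enabled = filter_tests(tests, modules, enabled={"my_new_module", "other_module"})
kept_when_disabled = filter_tests(tests, modules, enabled={"other_module"})
```

With the module enabled, the test list passes through unchanged. With it disabled, only the test outside the module's directories remains.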
||||
|
||||
@ -1,55 +1,48 @@
|
||||
# Thread Pools
|
||||
|
||||
A thread pool ([Wikipedia][thread_pools_wikipedia]) accepts and executes
|
||||
lightweight work items called "tasks", using a carefully managed group
|
||||
of dedicated long-running worker threads. The worker threads perform
|
||||
the work items in parallel without forcing each work item to assume the
|
||||
burden of starting and destroying a dedicated thread.
|
||||
A thread pool ([Wikipedia][thread_pools_wikipedia]) accepts and executes lightweight work items
|
||||
called "tasks", using a carefully managed group of dedicated long-running worker threads. The worker
|
||||
threads perform the work items in parallel without forcing each work item to assume the burden of
|
||||
starting and destroying a dedicated thread.
|
||||
|
||||
## Classes
|
||||
|
||||
### `ThreadPoolInterface`
|
||||
|
||||
The [`ThreadPoolInterface`][thread_pool_interface.h] abstract interface is
|
||||
an extension of the `OutOfLineExecutor` (see [the executors architecture
|
||||
guide][executors]) abstract interface, adding `startup`, `shutdown`, and
|
||||
`join` virtual member functions. It is the base class for our thread
|
||||
pool classes.
|
||||
The [`ThreadPoolInterface`][thread_pool_interface.h] abstract interface is an extension of the
|
||||
`OutOfLineExecutor` (see [the executors architecture guide][executors]) abstract interface, adding
|
||||
`startup`, `shutdown`, and `join` virtual member functions. It is the base class for our thread pool
|
||||
classes.
|
||||
|
||||
### `ThreadPool`
|
||||
|
||||
[`ThreadPool`][thread_pool.h] is the most basic concrete thread pool. The
|
||||
number of worker threads is adaptive, but configurable with a min/max
|
||||
range. Idle worker threads are reaped (down to the configured min), while
|
||||
new worker threads can be created when needed (up to the configured max).
|
||||
[`ThreadPool`][thread_pool.h] is the most basic concrete thread pool. The number of worker threads
|
||||
is adaptive, but configurable with a min/max range. Idle worker threads are reaped (down to the
|
||||
configured min), while new worker threads can be created when needed (up to the configured max).
|
||||
|
||||
### `ThreadPoolTaskExecutor`
|
||||
|
||||
[`ThreadPoolTaskExecutor`][thread_pool_task_executor.h] is not a thread
|
||||
pool, but rather a `TaskExecutor` that uses a `ThreadPoolInterface` and
|
||||
a `NetworkInterface` to execute scheduled tasks. It's configured with a
|
||||
`ThreadPoolInterface` over which it _takes_ ownership, and a
|
||||
`NetworkInterface`, of which it _shares_ ownership. With these resources
|
||||
it implements the elaborate `TaskExecutor` interface (see [executors]).
|
||||
[`ThreadPoolTaskExecutor`][thread_pool_task_executor.h] is not a thread pool, but rather a
|
||||
`TaskExecutor` that uses a `ThreadPoolInterface` and a `NetworkInterface` to execute scheduled
|
||||
tasks. It's configured with a `ThreadPoolInterface` over which it _takes_ ownership, and a
|
||||
`NetworkInterface`, of which it _shares_ ownership. With these resources it implements the elaborate
|
||||
`TaskExecutor` interface (see [executors]).
|
||||
|
||||
### `NetworkInterfaceThreadPool`
|
||||
|
||||
[`NetworkInterfaceThreadPool`][network_interface_thread_pool.h] is a
|
||||
thread pool implementation that doesn't actually own any worker threads.
|
||||
It runs its tasks on the background thread of a
|
||||
[`NetworkInterfaceThreadPool`][network_interface_thread_pool.h] is a thread pool implementation that
|
||||
doesn't actually own any worker threads. It runs its tasks on the background thread of a
|
||||
[`NetworkInterface`][network_interface.h].
|
||||
|
||||
Incoming tasks that are scheduled from the `NetworkInterface`'s thread
|
||||
are run immediately. Otherwise they are queued to be run by the
|
||||
`NetworkInterface` thread when it is available.
|
||||
Incoming tasks that are scheduled from the `NetworkInterface`'s thread are run immediately.
|
||||
Otherwise they are queued to be run by the `NetworkInterface` thread when it is available.
|
||||
|
||||
### `ThreadPoolMock`
|
||||
|
||||
[`ThreadPoolMock`][thread_pool_mock.h] is a `ThreadPoolInterface`. It is not
|
||||
a mock of a `ThreadPool`. It has no configurable stored responses. It has
|
||||
one worker thread and a pointer to a `NetworkInterfaceMock`, and with these
|
||||
resources it simulates a thread pool well enough to be used by a
|
||||
`ThreadPoolTaskExecutor` in unit tests.
|
||||
[`ThreadPoolMock`][thread_pool_mock.h] is a `ThreadPoolInterface`. It is not a mock of a
|
||||
`ThreadPool`. It has no configurable stored responses. It has one worker thread and a pointer to a
|
||||
`NetworkInterfaceMock`, and with these resources it simulates a thread pool well enough to be used
|
||||
by a `ThreadPoolTaskExecutor` in unit tests.
|
||||
|
||||
[thread_pools_wikipedia]: https://en.wikipedia.org/wiki/Thread_pool
|
||||
[executors]: ../src/mongo/executor/README.md
|
||||
|
||||
@ -1,13 +1,14 @@
|
||||
Note: this doc is being continuously updated while changes are being made to the unit test framework.
|
||||
Note: this doc is being continuously updated while changes are being made to the unit test
|
||||
framework.
|
||||
|
||||
# Overview
|
||||
|
||||
# Features
|
||||
|
||||
The MongoDB unit test framework is a thin layer built atop GoogleTest, so most GoogleTest features
|
||||
(see [Google Test documentation][google_test_docs]) are available for use aside from anything
|
||||
listed out in [Banned Features](#banned-features). The unit testing framework also includes
|
||||
enhanced reporting of test output (see
|
||||
(see [Google Test documentation][google_test_docs]) are available for use aside from anything listed
|
||||
out in [Banned Features](#banned-features). The unit testing framework also includes enhanced
|
||||
reporting of test output (see
|
||||
[Enhanced Reporting of Test Output](#enhanced-reporting-of-test-output)).
|
||||
|
||||
The core unittest features can be accessed by including the `mongo/unittest/unittest.h` header and
|
||||
@ -18,8 +19,8 @@ using the `mongo_cc_unit_test` bazel rule.
|
||||
### Parameterized tests
|
||||
|
||||
Parameterized tests are a GoogleTest feature that allows the same test logic to be run with
|
||||
different values or types (see GoogleTest docs on
|
||||
[Value-Parameterized Tests][value_parameterized_tests] and [Typed Tests][typed_tests]).
|
||||
different values or types (see GoogleTest docs on [Value-Parameterized
|
||||
Tests][value_parameterized_tests] and [Typed Tests][typed_tests]).
|
||||
|
||||
```cpp
|
||||
class TestFixture :
|
||||
@ -41,8 +42,8 @@ TEST_P(TestFixture, MongoTest) {
|
||||
### GoogleMock
|
||||
|
||||
GoogleMock can be used by including the `mongo/unittest/unittest.h` header. You should never
|
||||
directly include `<gmock/gmock.h>`. There are matchers for common mongo types such as `BSONObj`
|
||||
in `mongo/unittest/matcher.h`.
|
||||
directly include `<gmock/gmock.h>`. There are matchers for common mongo types such as `BSONObj` in
|
||||
`mongo/unittest/matcher.h`.
|
||||
|
||||
## Banned Features
|
||||
|
||||
@ -63,9 +64,9 @@ GoogleTest fatal assertions, such as no fatal assertions allowed in non-void hel
|
||||
|
||||
## Enhanced Reporting of Test Output
|
||||
|
||||
The Enhanced Reporter improves test reporting by colorizing and formatting output, maintaining
|
||||
a progress indicator, printing enhanced failure information, and suppressing log output on
|
||||
passing tests.
|
||||
The Enhanced Reporter improves test reporting by colorizing and formatting output, maintaining a
|
||||
progress indicator, printing enhanced failure information, and suppressing log output on passing
|
||||
tests.
|
||||
|
||||
These command line flags may be used to configure the Enhanced Reporter:
|
||||
|
||||
@ -74,9 +75,9 @@ These command line flags may be used to configure the Enhanced Reporter:
|
||||
|
||||
## Death Tests
|
||||
|
||||
The MongoDB unit testing framework uses `DEATH_TEST` (with `DEATH_TEST_F`, `DEATH_TEST_REGEX`,
|
||||
and `DEATH_TEST_REGEX_F` variants) to test code that is expected to cause the process to
|
||||
terminate. This should replace all uses of the `ASSERT_DEATH` macro from GoogleTest (see
|
||||
The MongoDB unit testing framework uses `DEATH_TEST` (with `DEATH_TEST_F`, `DEATH_TEST_REGEX`, and
|
||||
`DEATH_TEST_REGEX_F` variants) to test code that is expected to cause the process to terminate. This
|
||||
should replace all uses of the `ASSERT_DEATH` macro from GoogleTest (see
|
||||
[unittest/death_test.h][death_test_h] for more details).
|
||||
|
||||
Similar to GoogleTest, `DEATH_TEST` test suite names should be suffixed with `DeathTest`. For
|
||||
@ -98,8 +99,10 @@ DEATH_TEST_F(FixtureNameDeathTest, TestName) {
|
||||
}
|
||||
```
|
||||
|
||||
[death_test_naming]: https://github.com/google/googletest/blob/main/docs/advanced.md#death-test-naming
|
||||
[death_test_naming]:
|
||||
https://github.com/google/googletest/blob/main/docs/advanced.md#death-test-naming
|
||||
[death_test_h]: ../src/mongo/unittest/death_test.h
|
||||
[google_test_docs]: https://github.com/google/googletest/blob/main/docs/primer.md
|
||||
[value_parameterized_tests]: https://github.com/google/googletest/blob/main/docs/advanced.md#value-parameterized-tests
|
||||
[value_parameterized_tests]:
|
||||
https://github.com/google/googletest/blob/main/docs/advanced.md#value-parameterized-tests
|
||||
[typed_tests]: https://github.com/google/googletest/blob/main/docs/advanced.md#typed-tests
|
||||
|
||||
@ -56,9 +56,10 @@ Contact for more Information: https://www.mongodb.com/contact
|
||||
### Note to 1194.22
|
||||
|
||||
The Board interprets paragraphs (a) through (k) of this section as consistent with the following
|
||||
priority 1 Checkpoints of the Web Content Accessibility Guidelines 1.0 (WCAG 1.0) (May 5 1999) published by the Web
|
||||
Accessibility Initiative of the World Wide Web Consortium: Paragraph (a) - 1.1, (b) - 1.4, (c\) - 2.1, (d) - 6.1,
|
||||
(e) - 1.2, (f) - 9.1, (g) - 5.1, (h) - 5.2, (i) - 12.1, (j) - 7.1, (k) - 11.4.
|
||||
priority 1 Checkpoints of the Web Content Accessibility Guidelines 1.0 (WCAG 1.0) (May 5 1999)
|
||||
published by the Web Accessibility Initiative of the World Wide Web Consortium: Paragraph (a) - 1.1,
|
||||
(b) - 1.4, (c\) - 2.1, (d) - 6.1, (e) - 1.2, (f) - 9.1, (g) - 5.1, (h) - 5.2, (i) - 12.1, (j) - 7.1,
|
||||
(k) - 11.4.
|
||||
|
||||
## Section 1194.23 Telecommunications Products – Detail
|
||||
|
||||
|
||||
@ -1,84 +1,160 @@
|
||||
# Javascript Test Guide
|
||||
|
||||
At MongoDB we write integration tests in JavaScript. These are tests written to exercise some behavior of a running MongoDB server, replica set, or sharded cluster. This guide aims to provide some general guidelines and best practices on how to write good tests.
|
||||
At MongoDB we write integration tests in JavaScript. These are tests written to exercise some
|
||||
behavior of a running MongoDB server, replica set, or sharded cluster. This guide aims to provide
|
||||
some general guidelines and best practices on how to write good tests.
|
||||
|
||||
## Principles
|
||||
|
||||
### Minimize the test case as much as possible while still exercising and testing the desired behavior.
|
||||
|
||||
- For example, if you are testing that document deletion works correctly, it may be entirely sufficient to insert just a single document and then delete that document. Inserting multiple documents would be unnecessary. A guiding principle on this is to ask yourself how easy it would be for a new person coming to this test to quickly understand it. If there are multiple documents being inserted into a collection, in a test that only tests document deletion, a newcomer might ask the question: “is it important that the test uses multiple documents, or incidental?”. It is best if you can remove these kinds of questions from a person’s mind, by keeping only the absolute essential parts of a test.
|
||||
- We should always strive for unittesting when possible, so if the functionality you want to test can be covered by a unit test, we should write a unit test instead.
|
||||
- For example, if you are testing that document deletion works correctly, it may be entirely
|
||||
sufficient to insert just a single document and then delete that document. Inserting multiple
|
||||
documents would be unnecessary. A guiding principle on this is to ask yourself how easy it would
|
||||
be for a new person coming to this test to quickly understand it. If there are multiple documents
|
||||
being inserted into a collection, in a test that only tests document deletion, a newcomer might
|
||||
ask the question: “is it important that the test uses multiple documents, or incidental?”. It is
|
||||
best if you can remove these kinds of questions from a person’s mind, by keeping only the absolute
|
||||
essential parts of a test.
|
||||
- We should always strive for unittesting when possible, so if the functionality you want to test
|
||||
can be covered by a unit test, we should write a unit test instead.
|
||||
|
||||
### Add a block comment at the top of the JavaScript test file giving a clear and concise overview of what a test is trying to verify.
|
||||
|
||||
- For tests that are more complicated, a brief description of the test steps might be useful as well.
|
||||
- For tests that are more complicated, a brief description of the test steps might be useful as
|
||||
well.
|
||||
|
||||
### Keep debuggability in mind.
|
||||
|
||||
- Assertion error messages should contain all information relevant to debugging the test. This means the server’s response from the failed command should almost always be included in the assertion error message. It can also be helpful to include parameters that vary during the test to avoid requiring the investigator to use the logs/backtrace to determine what the test was attempting to do.
|
||||
- Think about how easy it would be to debug your test if something failed and a newcomer only had the logs of the test to look at. This can help guide your decision on what log messages to include and to what level of detail. The jsTestLog function is useful for this, as it is good at visually demarcating different phases of a test. As a tip, run your test a few times and just study the log messages, imagining you are an engineer debugging the test with only these logs to look at. Think about how understandable the logs would be to a newcomer. It is easy to add log messages to a test but then forget to see how they would actually appear.
|
||||
- Never insert identical documents unless necessary. It is very useful in debugging to be able to figure out where a given piece of data came from.
|
||||
- If a test does the same thing multiple times, consider factoring it out into a library. Shorter running tests are easier to debug and code duplication is always bad.
|
||||
- Assertion error messages should contain all information relevant to debugging the test. This means
|
||||
the server’s response from the failed command should almost always be included in the assertion
|
||||
error message. It can also be helpful to include parameters that vary during the test to avoid
|
||||
requiring the investigator to use the logs/backtrace to determine what the test was attempting to
|
||||
do.
|
||||
- Think about how easy it would be to debug your test if something failed and a newcomer only had
|
||||
the logs of the test to look at. This can help guide your decision on what log messages to include
|
||||
and to what level of detail. The jsTestLog function is useful for this, as it is good at visually
|
||||
demarcating different phases of a test. As a tip, run your test a few times and just study the log
|
||||
messages, imagining you are an engineer debugging the test with only these logs to look at. Think
|
||||
about how understandable the logs would be to a newcomer. It is easy to add log messages to a test
|
||||
but then forget to see how they would actually appear.
|
||||
- Never insert identical documents unless necessary. It is very useful in debugging to be able to
|
||||
figure out where a given piece of data came from.
|
||||
- If a test does the same thing multiple times, consider factoring it out into a library. Shorter
|
||||
running tests are easier to debug and code duplication is always bad.
|
||||
|
||||
### Do not hardcode collection or database names, especially if they are used multiple times throughout a test.
|
||||
|
||||
It is best to use variable names that attempt to describe what a value is used for. For example, naming a variable that stores a collection named `collectionToDrop` is much better than just naming the variable `collName`.
|
||||
It is best to use variable names that attempt to describe what a value is used for. For example,
|
||||
naming a variable that stores a collection named `collectionToDrop` is much better than just naming
|
||||
the variable `collName`.
|
||||
|
||||
### Make every effort to make your test as deterministic as possible.
|
||||
|
||||
- Non-deterministic tests add noise to our build system and, in general, make it harder for yourself and other engineers to determine if the system really is working correctly or not. Flaky integration tests should be considered bugs, and we should not allow them to be committed to the server codebase. One way to make jstests more deterministic is to use failpoints to force the events to happen in the expected order. However, if we have to use failpoints to make this test deterministic, we should consider writing a unit test instead.
|
||||
- Note that our fuzzer and concurrency test suites are often an exception to this rule. In those cases we sometimes give up some level of determinism in order to trigger a wider class of rare edge cases. For targeted JavaScript integration tests, however, highly deterministic tests should be the goal.
|
||||
- Non-deterministic tests add noise to our build system and, in general, make it harder for yourself
|
||||
and other engineers to determine if the system really is working correctly or not. Flaky
|
||||
integration tests should be considered bugs, and we should not allow them to be committed to the
|
||||
server codebase. One way to make jstests more deterministic is to use failpoints to force the
|
||||
events to happen in the expected order. However, if we have to use failpoints to make this test
|
||||
deterministic, we should consider writing a unit test instead.
|
||||
- Note that our fuzzer and concurrency test suites are often an exception to this rule. In those
|
||||
cases we sometimes give up some level of determinism in order to trigger a wider class of rare
|
||||
edge cases. For targeted JavaScript integration tests, however, highly deterministic tests should
|
||||
be the goal.
|
||||
|
||||
### Think hard about all the assumptions that the test relies on.
|
||||
|
||||
- For example, if a certain phase of the test ran much slower or much faster, would it cause your test to fail for the wrong reason?
|
||||
- If your test includes hard-coded timeouts, make sure they are set appropriately. If a test is waiting for a certain condition to be true, and the test should not proceed until that condition is met, it is often correct to just wait “indefinitely”, instead of adding some arbitrary timeout value, like 30 seconds. In practice this usually means setting some reasonable upper limit, for example, 10 minutes.
|
||||
- Also, for replication tests, make sure data exists on the right nodes at the right time. For example, if you do a write and don’t explicitly wait for it to replicate, it might not reach a secondary node before you try to do the next step of the test.
|
||||
- Does your test require data to be stored persistently? Remember that we have test variants that run on in-memory/ephemeral storage engines
|
||||
- There are timeouts in the test suites and we aim to make all tests in the same suite finish before timeout. That says we should always make the test run quickly to keep the test short in terms of duration.
|
||||
- For example, if a certain phase of the test ran much slower or much faster, would it cause your
|
||||
test to fail for the wrong reason?
|
||||
- If your test includes hard-coded timeouts, make sure they are set appropriately. If a test is
|
||||
waiting for a certain condition to be true, and the test should not proceed until that condition
|
||||
is met, it is often correct to just wait “indefinitely”, instead of adding some arbitrary timeout
|
||||
value, like 30 seconds. In practice this usually means setting some reasonable upper limit, for
|
||||
example, 10 minutes.
|
||||
- Also, for replication tests, make sure data exists on the right nodes at the right time. For
|
||||
example, if you do a write and don’t explicitly wait for it to replicate, it might not reach a
|
||||
secondary node before you try to do the next step of the test.
|
||||
- Does your test require data to be stored persistently? Remember that we have test variants that
|
||||
run on in-memory/ephemeral storage engines
|
||||
- There are timeouts in the test suites and we aim to make all tests in the same suite finish before
|
||||
timeout. That says we should always make the test run quickly to keep the test short in terms of
|
||||
duration.
|
||||
|
||||
### Make tests fail as early as possible.
|
||||
|
||||
- If something goes wrong early in the test, it’s much harder to diagnose when that error becomes visible much later.
|
||||
- Wrap every command in assert.commandWorked, or assert.commandFailedWithCode. There is also assert.commandFailed that won't check the return error code, but we should always try to use assert.commandFailedWithCode to make sure the test won't pass on an unexpected error.
|
||||
- If something goes wrong early in the test, it’s much harder to diagnose when that error becomes
|
||||
visible much later.
|
||||
- Wrap every command in assert.commandWorked, or assert.commandFailedWithCode. There is also
|
||||
assert.commandFailed that won't check the return error code, but we should always try to use
|
||||
assert.commandFailedWithCode to make sure the test won't pass on an unexpected error.
|
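The difference between accepting any failure and requiring a specific error code can be sketched as
follows. This is a minimal illustration that runs outside the shell, not real test code:
`commandFailedWithCode` is a hypothetical stand-in for the shell's `assert.commandFailedWithCode`,
and the response objects and numeric codes are assumed values chosen for the example.

```js
// Hypothetical stand-in for the shell's assert.commandFailedWithCode, written so that this
// sketch is self-contained. It throws unless `res` failed with exactly `expectedCode`.
function commandFailedWithCode(res, expectedCode, msg = "") {
    if (res.ok === 1 || res.code !== expectedCode) {
        throw new Error(
            `${msg}: expected failure with code ${expectedCode}, got ${JSON.stringify(res)}`);
    }
    return res;
}

// Illustrative error code and server responses (assumed values for this example).
const kNamespaceNotFound = 26;
const expectedFailure = {ok: 0, code: kNamespaceNotFound, errmsg: "ns not found"};
const unexpectedFailure = {ok: 0, code: 13, errmsg: "not authorized"};

// Passes: the command failed for the reason the test expects.
commandFailedWithCode(expectedFailure, kNamespaceNotFound, "drop of missing collection");

// Throws: the command failed, but for a different reason. A check that only looks for
// `ok: 0` would accept this response and let the test pass for the wrong reason.
let caught = null;
try {
    commandFailedWithCode(unexpectedFailure, kNamespaceNotFound, "drop of missing collection");
} catch (e) {
    caught = e;
}
```

Including the full response in the error message, as the stand-in does, also follows the
debuggability advice earlier in this guide.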
||||
|
||||
### Be aware of all the configurations and variants that your test might run under.
|
||||
|
||||
- Make sure that your test still works correctly if it is run in a different configuration or on a different platform than the one you might have tested on.
|
||||
- Varying storage engines and suites can often affect a test’s behavior. For example, maybe your test fails unexpectedly if it runs with authentication turned on with an in-memory storage engine. You don’t have to run a new test on every possible platform before committing it, but you should be confident that your test doesn’t break in an unexpected configuration.
|
||||
- Make sure that your test still works correctly if it is run in a different configuration or on a
|
||||
different platform than the one you might have tested on.
|
||||
- Varying storage engines and suites can often affect a test’s behavior. For example, maybe your
|
||||
test fails unexpectedly if it runs with authentication turned on with an in-memory storage engine.
|
||||
You don’t have to run a new test on every possible platform before committing it, but you should
|
||||
be confident that your test doesn’t break in an unexpected configuration.
|
||||
|
||||
### Avoid assertions that verify properties indirectly.
|
||||
|
||||
All assertions in a test should attempt to verify the most specific property possible. For example, if you are trying to test that a certain collection exists, it is better to assert that the collection’s exact name exists in the list of collections, as opposed to verifying that the collection count is equal to 1. The desired collection’s existence is sufficient for the collection count to be 1, but not necessary (a different collection could exist in its place). Be wary of adding these kind of indirect assertions in a test.
|
||||
All assertions in a test should attempt to verify the most specific property possible. For example,
|
||||
if you are trying to test that a certain collection exists, it is better to assert that the
|
||||
collection’s exact name exists in the list of collections, as opposed to verifying that the
|
||||
collection count is equal to 1. The desired collection’s existence is sufficient for the collection
|
||||
count to be 1, but not necessary (a different collection could exist in its place). Be wary of
|
||||
adding these kind of indirect assertions in a test.
|
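A small sketch of the collection example above, using a plain array in place of a real collection
listing. The names `collectionNames` and `collectionToFind` are hypothetical, and in a real test the
list would come from the server rather than a literal.

```js
// Made-up listing of collection names; the desired collection is deliberately missing.
const collectionNames = ["unrelatedCollection"];
const collectionToFind = "collectionToFind";

// Indirect check: passes here even though the desired collection does not exist,
// because some other collection happens to make the count equal to 1.
const indirectCheckPasses = collectionNames.length === 1;

// Direct check: verifies the exact property the test cares about, so it fails here.
const directCheckPasses = collectionNames.includes(collectionToFind);
```

The indirect check gives a false pass in this scenario, while the direct check reports the missing
collection.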
||||
|
||||
### Test Isolation
|
||||
|
||||
Your JS test will likely be running with many other files before and after it. It's important to start from a known state, and to restore that state (to a reasonable extent) at the end of your test content.
|
||||
Your JS test will likely be running with many other files before and after it. It's important to
|
||||
start from a known state, and to restore that state (to a reasonable extent) at the end of your test
|
||||
content.
|
||||
|
||||
- **Before**: If there are critical assumptions about the environment that your test needs, assert for it explicitly before proceeding to the real test content (instead of debugging side effects of that not being the case)
|
||||
- If you have a precondition on the _environment_, use [`@tags`](./tags.md) instead of just an early-return. This will avoid the test being scheduled in the first place if the environment is not supported.
|
||||
- **After**: If you are modifying the fixture, do everything possible to safely restore those changes at the end of your test content, even after a test failure. Resmokes' `--continueOnFailure` flag is used in CI, so the fixture is shared across many test files, and is only torn down at the end.
|
||||
- Note, a fixture _can_ immediately "abort" after a test failure, only if [archiving](../../../../buildscripts/resmokeconfig/suites/README.md#executorarchive) is configured, but that shouldn't be assumed because that is a per-suite configuration (and your test can run in many passthrough suite combinations).
|
||||
- One easy approach to restoring your state is to use the [Mocha-style](#use-mocha-style-constructs) `after` hooks in your test content.
|
||||
- **Before**: If there are critical assumptions about the environment that your test needs, assert
|
||||
for it explicitly before proceeding to the real test content (instead of debugging side effects of
|
||||
that not being the case)
|
||||
- If you have a precondition on the _environment_, use [`@tags`](./tags.md) instead of just an
|
||||
early-return. This will avoid the test being scheduled in the first place if the environment is
|
||||
not supported.
|
||||
- **After**: If you are modifying the fixture, do everything possible to safely restore those
|
||||
changes at the end of your test content, even after a test failure. Resmokes'
|
||||
`--continueOnFailure` flag is used in CI, so the fixture is shared across many test files, and is
|
||||
only torn down at the end.
|
||||
- Note, a fixture _can_ immediately "abort" after a test failure, only if
|
||||
[archiving](../../../../buildscripts/resmokeconfig/suites/README.md#executorarchive) is
|
||||
configured, but that shouldn't be assumed because that is a per-suite configuration (and your
|
||||
test can run in many passthrough suite combinations).
|
||||
- One easy approach to restoring your state is to use the
|
||||
[Mocha-style](#use-mocha-style-constructs) `after` hooks in your test content.
|
||||
|
||||
## Modern JS: Modules in Practice
|
||||
|
||||
We have fully migrated to the modularized JavaScript world so any new test should use modules and adopt the new style.
|
||||
We have fully migrated to the modularized JavaScript world so any new test should use modules and
|
||||
adopt the new style.
|
||||
|
||||
### Only import/export what you need.
|
||||
|
||||
It's always important to keep the test context clean so we should only import/export what we need.
|
||||
|
||||
- The unused import is against [no-unused-vars](https://eslint.org/docs/latest/rules/no-unused-vars) rule in ESLint though we haven't enforced it.
|
||||
- We don't have a linter to check export since it's hard to tell the necessity, but we should only export the modules that are imported by other tests or will be needed in the future.
|
||||
- The unused import is against [no-unused-vars](https://eslint.org/docs/latest/rules/no-unused-vars)
|
||||
rule in ESLint though we haven't enforced it.
|
||||
- We don't have a linter to check export since it's hard to tell the necessity, but we should only
|
||||
export the modules that are imported by other tests or will be needed in the future.
|
||||
|
||||
### Declare variables in proper scope.
|
||||
|
||||
In the past, we have seen tests referring to some "undeclared" or "redeclared" variables, which are actually introduced through `load()`. Now with modules, the scope is more clear. We can use global variables properly to set up the test and don't need to worry about polluting other tests.
|
||||
In the past, we have seen tests referring to some "undeclared" or "redeclared" variables, which are
|
||||
actually introduced through `load()`. Now with modules, the scope is more clear. We can use global
|
||||
variables properly to set up the test and don't need to worry about polluting other tests.
|
||||
|
||||
### Name variables properly when exporting.
|
||||
|
||||
To avoid naming conflicts, we should not make the name of exported variables too general which could easily conflict with another variable from the test that imports your module. For example, in the following case, the module exported a variable named `alphabet` and it will lead to a re-declaration error.
|
||||
To avoid naming conflicts, we should not make the name of exported variables too general which could
|
||||
easily conflict with another variable from the test that imports your module. For example, in the
|
||||
following case, the module exported a variable named `alphabet` and it will lead to a re-declaration
|
||||
error.
|
||||
|
||||
```
|
||||
import {alphabet} from "/matts/module.js";
|
||||
@ -87,7 +163,9 @@ const alphabet = "xyz"; // ERROR
|
||||
|
||||
### Prefer let/const over var
|
||||
|
||||
`let/const` should be preferred over `var` since these can help detect double declaration at the first place. Like, in the naming conflict example, if the second line is using var, it could easily mess up without throwing an error.
|
||||
`let/const` should be preferred over `var` since these can help detect double declaration at the
|
||||
first place. Like, in the naming conflict example, if the second line is using var, it could easily
|
||||
mess up without throwing an error.
|
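The general behavior can be sketched with a plain redeclaration in a single scope. This is an
illustration under assumptions, not the import case from the example above: the helper `redeclare`
is hypothetical, and it uses `new Function` only so that the parse error can be caught and inspected
at runtime.

```js
// Parses a snippet that declares `alphabet` twice with the given keyword. Returns the name
// of the error raised while parsing, or "no error" if the engine accepts the snippet.
function redeclare(keyword) {
    const source = `${keyword} alphabet = "abc"; ${keyword} alphabet = "xyz"; return alphabet;`;
    try {
        new Function(source);
        return "no error";
    } catch (e) {
        return e.name;
    }
}

const results = {
    var: redeclare("var"),      // accepted: the second declaration silently wins
    let: redeclare("let"),      // rejected when the code is parsed
    const: redeclare("const"),  // rejected when the code is parsed
};
```

With `let` and `const` the duplicate is reported before any code runs, whereas `var` lets the second
declaration silently overwrite the first.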
||||
|
||||
### Export in ES6 style
|
||||
|
||||
@ -116,7 +194,8 @@ This can help the language server to discover the methods and provide code navig
|
||||
|
||||
### Use Mocha-style Constructs
|
||||
|
||||
The [mochalite.js](../jstests/libs/mochalite.js) library ports over a subset of [MochaJS](https://mochajs.org/) functionality for the shell, including:
|
||||
The [mochalite.js](../jstests/libs/mochalite.js) library ports over a subset of
|
||||
[MochaJS](https://mochajs.org/) functionality for the shell, including:
|
||||
|
||||
- `it` test construction
|
||||
- `describe` suite structures
|
||||
@ -125,19 +204,13 @@ The [mochalite.js](../jstests/libs/mochalite.js) library ports over a subset of
|
||||
- `before` and `after` hooks, to run _once_ around _all_ `it` tests
|
||||
- `beforeEach` and `afterEach` hooks, to run around _each_ `it` test
|
||||
- The above (excluding `describe` variants) also support `async` functions
|
||||
- Resmoke test filtering using the `--mochagrep` flag, which mirrors the [`grep`](https://mochajs.org/#-grep-regexp-g-regexp) flag from MochaJS
|
||||
- Resmoke test filtering using the `--mochagrep` flag, which mirrors the
|
||||
[`grep`](https://mochajs.org/#-grep-regexp-g-regexp) flag from MochaJS
|
||||
|
||||
Example using several APIs:
|
||||
|
||||
```js
|
||||
import {
|
||||
after,
|
||||
afterEach,
|
||||
before,
|
||||
beforeEach,
|
||||
describe,
|
||||
it,
|
||||
} from "jstests/libs/mochalite.js";
|
||||
import {after, afterEach, before, beforeEach, describe, it} from "jstests/libs/mochalite.js";
|
||||
|
||||
describe("simple inserts and finds", () => {
|
||||
before(() => {
|
||||
@ -157,9 +230,7 @@ describe("simple inserts and finds", () => {
|
||||
assert.eq(this.fixtureDB.find({name: "test"}).count(), 1);
|
||||
});
|
||||
it("should error on invalid data", () => {
|
||||
const e = assert.throws(() =>
|
||||
this.fixtureDB.insert({notafield: undefined}),
|
||||
);
|
||||
const e = assert.throws(() => this.fixtureDB.insert({notafield: undefined}));
|
||||
assert.eq(e.message, "Field 'notafield' not found");
|
||||
});
|
||||
});
|
||||
@ -182,7 +253,9 @@ buildscripts/resmoke.py run --suites=no_passthrough --mochagrep "do something" j
|
||||
|
||||
## Test Tags
|
||||
|
||||
JS Test files can leverage "tags" that suites can key off of to include and/or exclude as necessary. Not scheduling a test to run is much faster than the test doing an early-return when preconditions are not met.
|
||||
JS Test files can leverage "tags" that suites can key off of to include and/or exclude as necessary.
|
||||
Not scheduling a test to run is much faster than the test doing an early-return when preconditions
|
||||
are not met.
|
||||
|
||||
The simplest use case is having something like the following at the top of your js test file:
|
||||
|
||||
|
||||
@ -4,19 +4,31 @@ For a short introduction to property-based testing or fast-check, see [Appendix]
|
||||
|
||||
## Core PBT Design
|
||||
|
||||
The 'Core PBTs' are a subset of our property-based tests that use a shared schema and models. Their purpose is to provide basic coverage of our query language that may not be tested by the rest of our jstests. This means only simple stages such as $project, $match, $sort, etc are covered. More complicated stages such as $lookup or $facet are not tested. PBTs outside of the core set may test these more complex features.
|
||||
The 'Core PBTs' are a subset of our property-based tests that use a shared schema and models. Their
|
||||
purpose is to provide basic coverage of our query language that may not be tested by the rest of our
|
||||
jstests. This means only simple stages such as $project, $match, $sort, etc are covered. More
|
||||
complicated stages such as $lookup or $facet are not tested. PBTs outside of the core set may test
|
||||
these more complex features.
|
||||
|
||||
These tests have been highly effective at finding bugs. As of writing they have caught 24 bugs in 8 months. See [SERVER-89308](https://jira.mongodb.org/browse/SERVER-89308) for a full list of issues.
|
||||
These tests have been highly effective at finding bugs. As of writing they have caught 24 bugs in 8
|
||||
months. See [SERVER-89308](https://jira.mongodb.org/browse/SERVER-89308) for a full list of issues.
|
||||
|
||||
The Core PBT design is built off of a few key principles about randomized testing:
|
||||
|
||||
### Properties Dictate the Models
|
||||
|
||||
In our fuzzer, we have a grammar for most of MQL. While this provides more coverage, it means the property we assert is weaker. We can add as much as we'd like to the model, because the property comes second to the model. We're willing to add exceptions to the property to make it work.
|
||||
In our fuzzer, we have a grammar for most of MQL. While this provides more coverage, it means the
|
||||
property we assert is weaker. We can add as much as we'd like to the model, because the property
|
||||
comes second to the model. We're willing to add exceptions to the property to make it work.
|
||||
|
||||
However, the "model dictates the property" design also backfired, because in addition to exceptions in the property, we need to post-process the generated queries. Adding $sort to several places throughout an aggregation pipeline means we are no longer testing MQL, but rather an artificial subset of MQL that a user would never write.
|
||||
However, the "model dictates the property" design also backfired, because in addition to exceptions
|
||||
in the property, we need to post-process the generated queries. Adding $sort to several places
|
||||
throughout an aggregation pipeline means we are no longer testing MQL, but rather an artificial
|
||||
subset of MQL that a user would never write.
|
||||
|
||||
For this reason, the properties come first in our Core PBTs, and have few exceptions. They dictate what model we use so no postprocessing is needed. The PBT models are significantly smaller than the fuzzer models.
|
||||
For this reason, the properties come first in our Core PBTs, and have few exceptions. They dictate
|
||||
what model we use so no postprocessing is needed. The PBT models are significantly smaller than the
|
||||
fuzzer models.
|
||||
|
||||
### Small Schema
|
||||
|
||||
@ -24,19 +36,32 @@ For this reason, the properties come first in our Core PBTs, and have few except
|
||||
|
||||
A small number of fields in our schema allows us to find interesting interactions more easily.
|
||||
|
||||
An example of an interaction could be query optimizations. Let's say an optimization on `[{$match: {*field*: 5}}, {$sort: {*field*: 1}}]` only kicks in when the two fields are the same. In a PBT where there are one thousand possible fields (`a`, `b`, `c`, but also `a.b.c`, `a.a.a` and all combinations), the probability of finding this optimization is `1/1000`. With six fields, it's increased to `1/6`.
|
||||
An example of an interaction could be query optimizations. Let's say an optimization on
|
||||
`[{$match: {*field*: 5}}, {$sort: {*field*: 1}}]` only kicks in when the two fields are the same. In
|
||||
a PBT where there are one thousand possible fields (`a`, `b`, `c`, but also `a.b.c`, `a.a.a` and all
|
||||
combinations), the probability of finding this optimization is `1/1000`. With six fields, it's
|
||||
increased to `1/6`.
|
||||
|
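The arithmetic above can be checked with a short sketch. It assumes the `$match` field and the `$sort` field are each chosen independently and uniformly from the schema; `sameFieldProbability` is a hypothetical helper written for this illustration, not part of the test libraries.

```js
// Exact probability that two independent, uniform field choices coincide.
// Assumption: both stages draw their field from the same list of numFields names.
function sameFieldProbability(numFields) {
    let matchingPairs = 0;
    for (let matchField = 0; matchField < numFields; matchField++) {
        for (let sortField = 0; sortField < numFields; sortField++) {
            if (matchField === sortField) {
                matchingPairs++;
            }
        }
    }
    // numFields matching pairs out of numFields^2 possible pairs, i.e. 1 / numFields.
    return matchingPairs / (numFields * numFields);
}

const largeSchema = sameFieldProbability(1000);
const smallSchema = sameFieldProbability(6);
```

Under these assumptions, shrinking the schema from one thousand fields to six makes the interesting case roughly 167 times more likely per generated pipeline.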
||||
Another interaction is between queries and indexes. Queries and indexes generated from a small schema make the indexes more likely to be used.
|
||||
Another interaction is between queries and indexes. Queries and indexes generated from a small
|
||||
schema make the indexes more likely to be used.
|
||||
|
||||
Bugs tend to come from interactions and special cases. A query that has no optimizations applied and does not use an index requires much less complicated logic, which is correlated with fewer bugs.
|
||||
Bugs tend to come from interactions and special cases. A query that has no optimizations applied and
|
||||
does not use an index requires much less complicated logic, which is correlated with fewer bugs.
|
||||
|
||||
#### Simple Values to Avoid MQL Inconsistencies
|
||||
|
||||
Related to [Properties Dictate the Models](#properties-dictate-the-models), a simpler document model also allows for stronger properties.
|
||||
Related to [Properties Dictate the Models](#properties-dictate-the-models), a simpler document model
|
||||
also allows for stronger properties.
|
||||
|
||||
There are inconsistencies in our query language that are accepted behavior, but cause issues in property-based testing. We can work around them by being careful about the values we allow in documents.
|
||||
There are inconsistencies in our query language that are accepted behavior, but cause issues in
|
||||
property-based testing. We can work around them by being careful about the values we allow in
|
||||
documents.
|
||||
|
||||
[SERVER-12869](https://jira.mongodb.org/browse/SERVER-12869) is an issue that stems from null and missing being encoded the same way in our index format. This means a covering plan (a plan with no `FETCH` node) cannot distinguish between null and missing. This inconsistency is the cause of lots of noise from our fuzzer, since one differing value in a query result can propagate. In our Core PBTs, we do not allow missing fields. This means:
|
||||
[SERVER-12869](https://jira.mongodb.org/browse/SERVER-12869) is an issue that stems from null and
|
||||
missing being encoded the same way in our index format. This means a covering plan (a plan with no
|
||||
`FETCH` node) cannot distinguish between null and missing. This inconsistency is the cause of lots
|
||||
of noise from our fuzzer, since one differing value in a query result can propagate. In our Core
|
||||
PBTs, we do not allow missing fields. This means:
|
||||
|
||||
- Documents must have all fields in the schema
|
||||
- We can only index fields in the schema
|
||||
@ -44,7 +69,9 @@ There are inconsistencies in our query language that are accepted behavior, but
|
||||
|
||||
`null` is allowed.
|
||||
|
||||
Floating point values are another area the PBTs avoid. Results can differ depending on the order of floating point operations. These differences can propagate. For this reason the only number values allowed are integers.
|
||||
Floating point values are another area the PBTs avoid. Results can differ depending on the order of
|
||||
floating point operations. These differences can propagate. For this reason the only number values
|
||||
allowed are integers.
|
||||
|
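A minimal sketch of the order dependence, using plain JavaScript doubles. `sumInOrder` is a hypothetical helper for this illustration; it stands in for any accumulation whose input order is not fixed, such as documents arriving in a different order under a different plan.

```js
// Accumulate values strictly left to right, the way a streaming sum would.
function sumInOrder(values) {
    return values.reduce((total, value) => total + value, 0);
}

// The same three doubles, visited in two different orders.
const forward = sumInOrder([0.1, 0.2, 0.3]);
const backward = sumInOrder([0.3, 0.2, 0.1]);
const floatsAgree = forward === backward; // false: the rounding differs by order

// Small integers are represented exactly, so the order does not matter.
const intsAgree = sumInOrder([1, 2, 3]) === sumInOrder([3, 2, 1]); // true
```

A property that compares two query results for exact equality would report the float case as a failure even though both results are acceptable, which is the kind of noise the integer-only restriction avoids.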
||||
## Modeling Workloads
|
||||
|
||||
@ -62,8 +89,9 @@ A workload consists of a collection model and an aggregation model, in the follo
|
||||
}
|
||||
```
|
||||
|
||||
Using one workload model instead of separate (and independent) collection models and agg models allows them to be interrelated.
|
||||
For example, if we want to model a PBT to test partial indexes where every query should satisfy the partial index filter, we can write:
|
||||
Using one workload model instead of separate (and independent) collection models and agg models
|
||||
allows them to be interrelated. For example, if we want to model a PBT to test partial indexes where
|
||||
every query should satisfy the partial index filter, we can write:
|
||||
|
||||
```
|
||||
fc.record({
|
||||
@ -78,7 +106,8 @@ fc.record({
|
||||
});
|
||||
```
|
||||
|
||||
and this is a valid workload model. If the collection and aggregation models are passed separately, they would be independent and unable to coordinate with shared arbitraries (like `partialFilter`).
|
||||
and this is a valid workload model. If the collection and aggregation models are passed separately,
|
||||
they would be independent and unable to coordinate with shared arbitraries (like `partialFilter`).
|
||||
|
||||
### Schema
|
||||
|
||||
@ -95,11 +124,13 @@ The Core PBT schema is:
|
||||
}
|
||||
```
|
||||
|
||||
For now, this is also a valid model for a document in a time-series collection (where `t` is the time field and `m` is the meta field), but the models may diverge.
|
||||
For now, this is also a valid model for a document in a time-series collection (where `t` is the
|
||||
time field and `m` is the meta field), but the models may diverge.
|
||||
|
||||
### Query Generation
|
||||
|
||||
These models cover a limited number of aggregation stages, located in `jstests/libs/property_test_helpers/models`. The supported stages are:
|
||||
These models cover a limited number of aggregation stages, located in
|
||||
`jstests/libs/property_test_helpers/models`. The supported stages are:
|
||||
|
||||
- $project
|
||||
- $addFields
|
||||
@ -112,7 +143,8 @@ These models cover a limited number of aggregation stages, located in `jstests/l
|
||||
#### Query Families
|
||||
|
||||
Rather than generating single, standalone queries, our query model generates a "family" of queries.
|
||||
At its leaves, a query family contains multiple values that the leaf could take on. For example instead of generating a single query with a concrete value `1` at the leaf:
|
||||
At its leaves, a query family contains multiple values that the leaf could take on. For example
|
||||
instead of generating a single query with a concrete value `1` at the leaf:
|
||||
|
||||
```
|
||||
[{$match: {a: 1}}, {$project: {b: 0}}]
|
||||
@ -133,7 +165,8 @@ Then we extract several queries that have the same shape.
|
||||
```
|
||||
|
||||
This allows us to write properties that use the plan cache more often rather than relying on chance.
|
||||
Properties can use the `getQuery` interface to ask for queries with different shapes, or the same shape with different leaf values plugged in.
|
||||
Properties can use the `getQuery` interface to ask for queries with different shapes, or the same
|
||||
shape with different leaf values plugged in.
|
||||
|
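The idea of a query family can be sketched in a few lines. This is an illustration only: the `leaf` marker and the `instantiate` helper are hypothetical names, and the real representation used by the models and by `getQuery` may differ.

```js
// Hypothetical marker for a leaf that can take on several concrete values.
function leaf(values) {
    return {__leafValues: values};
}

// Walk the family and replace every leaf with its value at position leafIndex.
// The surrounding structure (the query shape) is copied unchanged.
function instantiate(node, leafIndex) {
    if (Array.isArray(node)) {
        return node.map((child) => instantiate(child, leafIndex));
    }
    if (node !== null && typeof node === "object") {
        if (Array.isArray(node.__leafValues)) {
            return node.__leafValues[leafIndex % node.__leafValues.length];
        }
        const copy = {};
        for (const [key, value] of Object.entries(node)) {
            copy[key] = instantiate(value, leafIndex);
        }
        return copy;
    }
    return node;
}

// One family, three concrete queries that share a shape.
const family = [{$match: {a: leaf([1, 2, 3])}}, {$project: {b: 0}}];
const sameShapeQueries = [0, 1, 2].map((i) => instantiate(family, i));
```

Because only the leaf value changes between the extracted queries, a property that runs them back to back exercises the same plan cache entry with different parameters.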
||||
## Core PBTs
|
||||
|
||||
@ -143,15 +176,15 @@ Details are provided at the top of each file.
|
||||
|
||||
## Debugging a PBT Failure
|
||||
|
||||
Currently, all PBTs have a fixed seed.
|
||||
This means that as long as the bug it found is deterministic on the server's side, the PBT will consistently run into the issue.
|
||||
If the bug is not deterministic, the PBT may or may not fail.
|
||||
Currently, all PBTs have a fixed seed. This means that as long as the bug it found is deterministic
|
||||
on the server's side, the PBT will consistently run into the issue. If the bug is not deterministic,
|
||||
the PBT may or may not fail.
|
||||
|
||||
### Shrinking (Minimizing)
|
||||
|
||||
Once a counterexample (a failing case) to the property is found, fast-check tests will automatically attempt to shrink the issue.
|
||||
Shrinking often does not reach the global minimum counterexample, since fast-check cannot make certain jumps.
|
||||
For example it has no way of knowing that
|
||||
Once a counterexample (a failing case) to the property is found, fast-check tests will automatically
|
||||
attempt to shrink the issue. Shrinking often does not reach the global minimum counterexample, since
|
||||
fast-check cannot make certain jumps. For example it has no way of knowing that
|
||||
|
||||
`{$and: [{a: {$eq: 1}}]}`
|
||||
|
||||
@ -163,30 +196,39 @@ or even
|
||||
|
||||
`{a: 1}`
|
||||
|
||||
This could be solved if fast-check had domain-specific knowledge about MQL or if it fuzzed counterexamples during shrinking.
|
||||
However, the counterexamples are usually small enough that there isn't much left to shrink.
|
||||
This could be solved if fast-check had domain-specific knowledge about MQL or if it fuzzed
|
||||
counterexamples during shrinking. However, the counterexamples are usually small enough that there
|
||||
isn't much left to shrink.
|
||||
|
||||
For non-deterministic issues, fast-check's shrinking is not as effective because it receives mixed signals from the property on whether the shrunk counterexamples fail or not.
|
||||
For non-deterministic issues, fast-check's shrinking is not as effective because it receives mixed
|
||||
signals from the property on whether the shrunk counterexamples fail or not.
|
||||
|
||||
### Failure Output
|
||||
|
||||
After a failure is minimized, the counterexample is printed out.
|
||||
This includes debug data such as the counterexample that fast-check found and the error it ran into.
|
||||
The counterexample will be a workload (see [Modeling Workloads](#modeling-workloads)), containing all information about the collection and queries run against it.
|
||||
After a failure is minimized, the counterexample is printed out. This includes debug data such as
|
||||
the counterexample that fast-check found and the error it ran into. The counterexample will be a
|
||||
workload (see [Modeling Workloads](#modeling-workloads)), containing all information about the
|
||||
collection and queries run against it.
|
||||
|
||||
To reproduce the issue, the workload can be copied and pasted into the failing property-based test, specifically by passing it in as the `examples` argument to `testProperty`.
|
||||
fast-check will take these hand-written examples and run them before trying randomized examples.
|
||||
See `partial_index_pbt.js` (which references `pbt_resolved_bugs.js`) for an example of this.
|
||||
`partial_index_pbt.js` uses the `examples` argument to ensure workloads that previously would fail are run.
|
||||
It can be used in the same way to repro existing bugs from BFs.
|
||||
To reproduce the issue, the workload can be copied and pasted into the failing property-based test,
|
||||
specifically by passing it in as the `examples` argument to `testProperty`. fast-check will take
|
||||
these hand-written examples and run them before trying randomized examples. See
|
||||
`partial_index_pbt.js` (which references `pbt_resolved_bugs.js`) for an example of this.
|
||||
`partial_index_pbt.js` uses the `examples` argument to ensure workloads that previously would fail
|
||||
are run. It can be used in the same way to repro existing bugs from BFs.
|
||||
|
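The "examples first, then randomized cases" ordering, combined with a fixed seed, can be sketched without fast-check. Everything below is a stand-in written for this illustration: `makeRng` and `runProperty` are hypothetical names and do not mirror the signature of `testProperty` or of fast-check itself.

```js
// Small linear congruential generator so that a fixed seed replays the same cases.
function makeRng(seed) {
    let state = seed >>> 0;
    return () => {
        state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
        return state / 4294967296; // in [0, 1)
    };
}

// Run hand-written examples before any generated input. Returns null on success,
// or a description of the first failing case.
function runProperty(property, generate, {examples = [], numRuns = 20, seed = 1} = {}) {
    const rng = makeRng(seed);
    const check = (input, source) => {
        try {
            property(input);
            return null;
        } catch (e) {
            return {source, counterexample: input, error: e.message};
        }
    };
    for (const example of examples) {
        const failure = check(example, "example");
        if (failure) return failure;
    }
    for (let i = 0; i < numRuns; i++) {
        const failure = check(generate(rng), "random");
        if (failure) return failure;
    }
    return null;
}

// A toy property that only fails on an input the generator never produces.
const rejectsSeven = (n) => {
    if (n === 7) throw new Error("7 is rejected");
};
const smallInts = (rng) => Math.floor(rng() * 5); // yields 0..4

const withoutExamples = runProperty(rejectsSeven, smallInts);
const withExamples = runProperty(rejectsSeven, smallInts, {examples: [7]});
```

In this sketch the random cases alone never reach the failing input, while the pinned example reproduces it on the first iteration, which is the reason to paste a known counterexample in as an example.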
||||
# Appendix
|
||||
|
||||
## Property-Based Testing (PBT)
|
||||
|
||||
Property-based testing is a testing method that asserts properties hold over many example inputs. In our use of PBT, it involves two components, a "model" and a "property function". The model is a description of the object we are testing. It is used to generate examples of what the object looks like. These examples are routed into the property function, which asserts that the object has the characteristics we expect it to have.
|
||||
Property-based testing is a testing method that asserts properties hold over many example inputs. In
|
||||
our use of PBT, it involves two components, a "model" and a "property function". The model is a
|
||||
description of the object we are testing. It is used to generate examples of what the object looks
|
||||
like. These examples are routed into the property function, which asserts that the object has the
|
||||
characteristics we expect it to have.
|
||||
|
||||
Let's say we wrote a new integer addition function `add` that we'd like to test. We could calculate the correct answer to different addition problems, and assert that `add` behaves correctly.
|
||||
Let's say we wrote a new integer addition function `add` that we'd like to test. We could calculate
|
||||
the correct answer to different addition problems, and assert that `add` behaves correctly.
|
||||
|
||||
```
|
||||
assert.eq(add(1, 2), 3);
|
||||
@ -194,7 +236,9 @@ assert.eq(add(-1, 1), 0);
|
||||
...
|
||||
```
|
||||
|
||||
In addition to tests written with concrete values, we could also write a PBT to test for characteristics we expect `add` to have. Addition is commutative for example, meaning `add(a, b)` should always equal `add(b, a)`. We can write a function for this:
|
||||
In addition to tests written with concrete values, we could also write a PBT to test for
|
||||
characteristics we expect `add` to have. Addition is commutative for example, meaning `add(a, b)`
|
||||
should always equal `add(b, a)`. We can write a function for this:
|
||||
|
||||
```
|
||||
function testAdd(a, b){
|
||||
@ -202,12 +246,20 @@ function testAdd(a, b){
|
||||
}
|
||||
```
|
||||
|
||||
The input to `testAdd` could use the builtin Javascript `Random` package, or a PBT library such as fast-check.
|
||||
The input to `testAdd` could use the builtin Javascript `Random` package, or a PBT library such as
|
||||
fast-check.
|
||||
|
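As a rough sketch of the hand-rolled option, the loop below feeds random integers into a commutativity check. It uses the standard `Math.random` so that it runs outside the mongo shell, and the bodies of `add` and `testAdd` are assumptions made for this illustration (a plain `throw` replaces the shell's assertion helpers).

```js
// Function under test (assumed implementation for the sketch).
function add(a, b) {
    return a + b;
}

// Property: addition is commutative.
function testAdd(a, b) {
    if (add(a, b) !== add(b, a)) {
        throw new Error(`add is not commutative for a=${a}, b=${b}`);
    }
}

// Uniform random integer in [min, max].
function randomInt(min, max) {
    return min + Math.floor(Math.random() * (max - min + 1));
}

// Check the property against numRuns random input pairs.
function checkAddCommutes(numRuns) {
    for (let i = 0; i < numRuns; i++) {
        testAdd(randomInt(-1000, 1000), randomInt(-1000, 1000));
    }
    return numRuns;
}

const casesChecked = checkAddCommutes(500);
```

A loop like this has no seed control and no shrinking, so a failure reports whatever pair happened to trip it. Those two features are the main things a library such as fast-check adds.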
||||
The way the query team uses PBT tends to be more complex, and almost always involves modeling a subset of our query language, documents, and indexes. Our fuzzer is a form of property-based testing, since we generate random queries and assert correctness against different controls (an older mongo version, a collection without indexes, etc.).
|
||||
The way the query team uses PBT tends to be more complex, and almost always involves modeling a
|
||||
subset of our query language, documents, and indexes. Our fuzzer is a form of property-based
|
||||
testing, since we generate random queries and assert correctness against different controls (an
|
||||
older mongo version, a collection without indexes, etc.).
|
||||
|
||||
## fast-check
|
||||
|
||||
fast-check (located in jstests/third_party/fast_check/fc-3.1.0.js) is a property-based testing framework for JavaScript/TypeScript. It provides building-block components to use for larger models, and has functionality to test properties against these models. It also has built-in logic for shrinking (minimizing) counterexamples to properties.
|
||||
fast-check (located in jstests/third_party/fast_check/fc-3.1.0.js) is a property-based testing
|
||||
framework for JavaScript/TypeScript. It provides building-block components to use for larger models,
|
||||
and has functionality to test properties against these models. It also has built-in logic for
|
||||
shrinking (minimizing) counterexamples to properties.
|
||||
|
||||
For an example of how to use fast-check to write a property-based test, see [project_coalescing.js](../../aggregation/sources/project/project_coalescing.js)
|
||||
For an example of how to use fast-check to write a property-based test, see
|
||||
[project_coalescing.js](../../aggregation/sources/project/project_coalescing.js)
|
||||
|
||||
@ -4,5 +4,7 @@ These tests test upgrade/downgrade behavior expected between different versions
|
||||
|
||||
Those that begin failing upon branching should be assessed by the owner teams:
|
||||
|
||||
- Is the test only applicable to specific versions during specific development cycles? If so, delete it from irrelevant branches and master.
|
||||
- Does the test add value for "last" (dynamic) version features? If so, modify the test to be more robust. These should always pass regardless of MongoDB version.
|
||||
- Is the test only applicable to specific versions during specific development cycles? If so, delete
|
||||
it from irrelevant branches and master.
|
||||
- Does the test add value for "last" (dynamic) version features? If so, modify the test to be more
|
||||
robust. These should always pass regardless of MongoDB version.
|
||||
|
||||
@ -1,3 +1,4 @@
|
||||
# FCV / setFCV core infrastructure
|
||||
|
||||
This folder contains tests for the core FCV and setFCV upgrade/downgrade infrastructure. It does not contain tests linked to any other particular feature.
|
||||
This folder contains tests for the core FCV and setFCV upgrade/downgrade infrastructure. It does not
|
||||
contain tests linked to any other particular feature.
|
||||
|
||||
@ -1,6 +1,8 @@
|
||||
# Introduction
|
||||
|
||||
The plan_stability tests record the current winning plan for a set of ~ 1K queries produced by SPM-3816. If those plans ever change, the test is expected to fail at which point a human would decide if the changed plans are for the better or for the worse.
|
||||
The plan_stability tests record the current winning plan for a set of ~ 1K queries produced by
|
||||
SPM-3816. If those plans ever change, the test is expected to fail at which point a human would
|
||||
decide if the changed plans are for the better or for the worse.
|
||||
|
||||
# Running
|
||||
|
||||
@ -13,7 +15,8 @@ $ buildscripts/resmoke.py run \
|
||||
jstests/query_golden/plan_stability.js
|
||||
```
|
||||
|
||||
Several resmoke suites are predefined for different plan ranking modes; these do not require extra mongod parameters:
|
||||
Several resmoke suites are predefined for different plan ranking modes; these do not require extra
|
||||
mongod parameters:
|
||||
|
||||
```bash
|
||||
query_golden_cbr_automatic
|
||||
@ -42,7 +45,9 @@ To obtain a diff that contains an individual diff fragment for each changed plan
|
||||
2. Edit the `~/.golden_test_config.yml` to use a customized diff command:
|
||||
|
||||
```yml
|
||||
diffCmd: 'git -c diff.plan_stability.xfuncname=">>>pipeline" diff --unified=0 --function-context --no-index "{{expected}}" "{{actual}}"'
|
||||
diffCmd:
|
||||
'git -c diff.plan_stability.xfuncname=">>>pipeline" diff --unified=0 --function-context --no-index
|
||||
"{{expected}}" "{{actual}}"'
|
||||
```
|
||||
|
||||
3. You can now run `buildscripts/golden_test.py diff` as usual and the output will look like this:
|
||||
@ -68,15 +73,20 @@ This provides the plan that changed, the pipeline it belonged to, and the execut
|
||||
|
||||
## Using the summarization scripts
|
||||
|
||||
The `feature-extractor` internal repository contains a summarization script that can be used to obtain a summary of the failed test as well as information on the individual regressions that should be looked into. Please see `scripts/cbr/README.md` in that repository for more information.
|
||||
The `feature-extractor` internal repository contains a summarization script that can be used to
|
||||
obtain a summary of the failed test as well as information on the individual regressions that should
|
||||
be looked into. Please see `scripts/cbr/README.md` in that repository for more information.
|
||||
|
||||
# Debugging failures
|
||||
|
||||
## Which pipeline is the problematic one?
|
||||
|
||||
In Evergreen, the diff will most likely show a pipeline **below** the counters. This is however the following pipeline in the test, not the one you are looking for. The problematic pipeline is the one that comes **before** it in the `expected_output` file.
|
||||
In Evergreen, the diff will most likely show a pipeline **below** the counters. This is however the
|
||||
following pipeline in the test, not the one you are looking for. The problematic pipeline is the one
|
||||
that comes **before** it in the `expected_output` file.
|
||||
|
||||
In local execution, if your environment is configured as described above, the diff will show the actual pipeline of interest, **above** the counters.
|
||||
In local execution, if your environment is configured as described above, the diff will show the
|
||||
actual pipeline of interest, **above** the counters.
|
||||
|
||||
## Running the offending pipelines manually
|
||||
|
||||
@ -98,7 +108,8 @@ and wait until the script has advanced to the following log line:
|
||||
[js_test:plan_stability] [jsTest] ----
|
||||
```
|
||||
|
||||
2. Connect to `mongodb://127.0.0.1:20000` and run the offending pipeline against the `db.plan_stability` collection.
|
||||
2. Connect to `mongodb://127.0.0.1:20000` and run the offending pipeline against the
|
||||
`db.plan_stability` collection.
|
||||
|
||||
```bash
|
||||
mongosh mongodb://127.0.0.1:20000
|
||||
@ -113,7 +124,10 @@ db.plan_stability.aggregate(pipeline).explain().queryPlanner.rejectedPlans.sort(
|
||||
|
||||
## Converting the pipeline to JavaScript
|
||||
|
||||
The pipelines in the diff are **EJSON**-ish, while the mongosh shell expects **JavaScript**. EJSON-ish and JavaScript are identical when it comes to basic types, such as strings and integers, but if the pipeline contains timestamps and decimals, the JSON needs to be converted to JavaScript using `EJSON.parse()`:
|
||||
The pipelines in the diff are **EJSON**-ish, while the mongosh shell expects **JavaScript**.
|
||||
EJSON-ish and JavaScript are identical when it comes to basic types, such as strings and integers,
|
||||
but if the pipeline contains timestamps and decimals, the JSON needs to be converted to JavaScript
|
||||
using `EJSON.parse()`:
|
||||
|
||||
```js
|
||||
> pipelineStr = '[{"$match":{"field20_Timestamp_idx":{"$gt":{"$timestamp":{"t":1760551205,"i":0}}}},"field12_Decimal128_idx":{"$lte":{"$numberDecimal":"35.1"}}}]';
|
||||
@ -130,23 +144,26 @@ The pipelines in the diff are **EJSON**-ish, while the mongosh shell expects **J
|
||||
db.plan_stability2.aggregate(pipeline);
|
||||
```
|
||||
|
||||
Note that **ISO Timestamps** need to be handled separately. JSON will store those as strings, resulting in loss of typing information that `EJSON.parse()` can not recover. This will result in a semantic change in the query unless manually converted to an `ISODate` object:
|
||||
Note that **ISO Timestamps** need to be handled separately. JSON will store those as strings,
|
||||
resulting in loss of typing information that `EJSON.parse()` can not recover. This will result in a
|
||||
semantic change in the query unless manually converted to an `ISODate` object:
|
||||
|
||||
```js
|
||||
// Manually convert
|
||||
// [{"$match":{"field19_datetime_idx":{"$gte":"2024-01-27T00:00:00.000Z"}}}]
|
||||
// to the correct JavaScript
|
||||
|
||||
pipeline = [
|
||||
{$match: {field19_datetime_idx: {$gte: ISODate("2024-01-27T00:00:00.000Z")}}},
|
||||
];
|
||||
pipeline = [{$match: {field19_datetime_idx: {$gte: ISODate("2024-01-27T00:00:00.000Z")}}}];
|
||||
```
|
||||
|
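For pipelines with many date fields, the manual conversion can be approximated with a small helper. This is a heuristic sketch, not part of the test tooling: `reviveIsoDates` is a hypothetical function, it assumes its input came straight from `JSON.parse`, and it converts every string that looks like an ISO-8601 UTC timestamp, including ones that were genuinely meant to be strings. It builds plain `Date` objects, which is also what `ISODate(...)` returns in mongosh.

```js
// Matches strings such as "2024-01-27T00:00:00.000Z" (milliseconds optional).
const ISO_DATETIME = /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d{3})?Z$/;

// Recursively copy a parsed pipeline, turning ISO-looking strings into Date objects.
function reviveIsoDates(node) {
    if (typeof node === "string" && ISO_DATETIME.test(node)) {
        return new Date(node);
    }
    if (Array.isArray(node)) {
        return node.map(reviveIsoDates);
    }
    if (node !== null && typeof node === "object") {
        const copy = {};
        for (const [key, value] of Object.entries(node)) {
            copy[key] = reviveIsoDates(value);
        }
        return copy;
    }
    return node;
}

const parsedPipeline = JSON.parse(
    '[{"$match":{"field19_datetime_idx":{"$gte":"2024-01-27T00:00:00.000Z"}}}]',
);
const revivedPipeline = reviveIsoDates(parsedPipeline);
```

Check the resulting pipeline by eye before running it, since the helper cannot tell a date-typed field from a string field that happens to hold a timestamp.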
||||
## Is the new plan better or worse?
|
||||
|
||||
For the majority of the plans, it will be obvious if the new plan is better or worse because all the execution counters would have moved in the same direction without any ambiguity.
|
||||
For the majority of the plans, it will be obvious if the new plan is better or worse because all the
|
||||
execution counters would have moved in the same direction without any ambiguity.
|
||||
|
||||
Some plans, such as those involving `$sort` or `$limit` will sometimes change in a way that makes some counters better while others become worse. For those queries, consider running them manually multiple times to compare their wallclock execution times:
|
||||
Some plans, such as those involving `$sort` or `$limit` will sometimes change in a way that makes
|
||||
some counters better while others become worse. For those queries, consider running them manually
|
||||
multiple times to compare their wallclock execution times:
|
||||
|
||||
```javascript
|
||||
pipeline = [...];
|
||||
@ -162,11 +179,15 @@ You can also modify `collSize` in `plan_stability.js` to temporarily use a large
|
||||
|
||||
If you want to run a comparison between estimation methods `X` and `Y`:
|
||||
|
||||
1. If method `X` is not multi-planning, place the `jstests/query_golden/expected_files/X` for estimation method `X` in the root of `expected_files`, so that they are used as the base for the comparison;
|
||||
1. If method `X` is not multi-planning, place the `jstests/query_golden/expected_files/X` for
|
||||
estimation method `X` in the root of `expected_files`, so that they are used as the base for the
|
||||
comparison;
|
||||
|
||||
2. Temporarily remove the expected files for method `Y` from `expected_files/query_golden/expected_files/Y` so that they are not considered;
|
||||
2. Temporarily remove the expected files for method `Y` from
|
||||
`expected_files/query_golden/expected_files/Y` so that they are not considered;
|
||||
|
||||
3. Run the test as described above, specifying `featureFlagCostBasedRanker`/`internalQueryCBRCEMethod`;
|
||||
3. Run the test as described above, specifying
|
||||
`featureFlagCostBasedRanker`/`internalQueryCBRCEMethod`;
|
||||
|
||||
4. Use the summarization script as described above to produce a report.
|
||||
|
||||
@ -179,5 +200,5 @@ To accept the new plans, use `buildscripts/golden_test.py accept`, as with any o
|
||||
## Removing individual pipelines
|
||||
|
||||
If a given pipeline proves flaky, that is, is flipping between one plan and another for no reason,
|
||||
you can comment it out from the test with a note. Re-run the test and then run `buildscripts/golden_test.py accept`
|
||||
to persist the change.
|
||||
you can comment it out from the test with a note. Re-run the test and then run
|
||||
`buildscripts/golden_test.py accept` to persist the change.
|
||||
|
||||
@ -1,21 +1,26 @@
|
||||
# Introduction
|
||||
|
||||
The plan stability tests for join optimization are golden tests that execute a number of joins against the TPC-H dataset.
|
||||
The plan stability tests for join optimization are golden tests that execute a number of joins
|
||||
against the TPC-H dataset.
|
||||
|
||||
For each pipeline we persist the following in the golden test output:
|
||||
|
||||
- the MQL command, including the base table and the pipeline
|
||||
- a concise representation of the winning plan for the query
|
||||
- execution counters that quantify the effort it took to execute the query in terms of docs and keys examined
|
||||
- execution counters that quantify the effort it took to execute the query in terms of docs and keys
|
||||
examined
|
||||
- data about the resultset, such as the number of rows returned
|
||||
|
||||
## Prerequisites
|
||||
|
||||
This test requires the following:
|
||||
|
||||
- The `mongorestore` tool, accessible on the $PATH. This tool is part of the [MongoDB Database Tools](https://www.mongodb.com/try/download/database-tools) package.
|
||||
- The `mongorestore` tool, accessible on the $PATH. This tool is part of the
|
||||
[MongoDB Database Tools](https://www.mongodb.com/try/download/database-tools) package.
|
||||
|
||||
- The TPC-H dataset, located in a directory named `tpc-h` that is on the same level as the mongodb repository. The dataset is available from the `query-benchmark-data` S3 bucket. You can retrieve it as follows:
|
||||
- The TPC-H dataset, located in a directory named `tpc-h` that is on the same level as the mongodb
|
||||
repository. The dataset is available from the `query-benchmark-data` S3 bucket. You can retrieve
|
||||
it as follows:
|
||||
|
||||
```bash
|
||||
mkdir ~/tpc-h
|
||||
@ -26,7 +31,8 @@ aws sso login
|
||||
aws s3 cp s3://query-benchmark-data/tpc-h/tpch-0.1-normalized.archive.gz tpc-h/tpch-0.1-normalized.archive.gz --region us-east-1
|
||||
```
|
||||
|
||||
In evergreen, tasks such as `query_golden_join_optimization_plan_stability` make sure the prerequisites are already in place.
|
||||
In evergreen, tasks such as `query_golden_join_optimization_plan_stability` make sure the
|
||||
prerequisites are already in place.
|
||||
|
||||
- The golden test framework configured with a custom diff rule
|
||||
|
||||
@ -77,13 +83,16 @@ The report contains the following information:
|
||||
- the most-improved queries, useful for obtaining examples for presentation purposes;
|
||||
- all individual failures, categorized and pretty-printed.
|
||||
|
||||
The report has one section per jstest -- if you are running multiple tests, each one will be processed and reported separately.
|
||||
The report has one section per jstest -- if you are running multiple tests, each one will be
|
||||
processed and reported separately.
|
||||
|
||||
## Debugging
|
||||
|
||||
> [!WARNING] > **_WARNING:_** The queries dumped by this test, the diff tooling or the summary report may contain EJSON constructs, such as $numberDecimal
|
||||
> that are not properly processed by `coll.aggregate()` unless converted using `EJSON.parse()`. Typing information around ISO dates may have also been lost, so manually recreate those as `ISODate(...)`.
|
||||
> See the "A note on the queries" section below for more information.
|
||||
> [!WARNING] > **_WARNING:_** The queries dumped by this test, the diff tooling or the summary
|
||||
> report may contain EJSON constructs, such as $numberDecimal that are not properly processed by
|
||||
> `coll.aggregate()` unless converted using `EJSON.parse()`. Typing information around ISO dates may
|
||||
> have also been lost, so manually recreate those as `ISODate(...)`. See the "A note on the queries"
|
||||
> section below for more information.
|
||||
|
||||
### Determining the offending query
|
||||
|
||||
@ -91,7 +100,9 @@ Each query has an `idx` key that can be used to track it across files and report
|
||||
|
||||
### Starting a populated MongoDB instance
|
||||
|
||||
To obtain a running, populated MongoDB instance, run `resmoke.py run` with the `--pauseAfterPopulate` option. This will start mongod, load the data and then pause resmoke at the following line:
|
||||
To obtain a running, populated MongoDB instance, run `resmoke.py run` with the
|
||||
`--pauseAfterPopulate` option. This will start mongod, load the data and then pause resmoke at the
|
||||
following line:
|
||||
|
||||
```
|
||||
[js_test:plan_stability_join_opt_tpch] [jsTest] TestData.pauseAfterPopulate is set. Pausing indefinitely ...
|
||||
@ -124,15 +135,18 @@ The collections will be restored to the `tpch` database.
|
||||
|
||||
## A note on the queries
|
||||
|
||||
The queries you see in files, diffs, and bug reports may be in various formats, depending on whether they were dumped using JavaScript, Python, or some other method.
|
||||
The queries you see in files, diffs, and bug reports may be in various formats, depending on
|
||||
whether they were dumped using JavaScript, Python, or some other method.
|
||||
|
||||
Therefore, it is important to obtain the query plan of the query and make sure that what you are seeing locally matches the plan from the bug report.
|
||||
Therefore, it is important to obtain the query plan of the query and make sure that what you are
|
||||
seeing locally matches the plan from the bug report.
|
||||
|
||||
The following caveats are currently known:
|
||||
|
||||
### Typing information for timestamps
|
||||
|
||||
Typing information for timestamps is frequently lost, so a query may contain ISO timestamps as strings:
|
||||
Typing information for timestamps is frequently lost, so a query may contain ISO timestamps as
|
||||
strings:
|
||||
|
||||
```json
|
||||
{"l_commitdate": {"$lt": "1993-03-17T00:00:00"}}
|
||||
@ -146,7 +160,8 @@ You will need to manually convert this into a timestamp:
|
||||
{'l_commitdate': {'$lt': new ISODate('1993-03-17T00:00:00')}}
|
||||
```
|
||||
|
||||
Since the typing information has been lost somewhere along the pipeline, no existing library is available to restore it for you.
|
||||
Since the typing information has been lost somewhere along the pipeline, no existing library is
|
||||
available to restore it for you.
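
As a starting point for the manual conversion, the following is a minimal sketch of a helper that walks a query and turns ISO-looking strings back into `Date` objects. The name `restoreDates` and the regular expression are hypothetical and not part of any MongoDB tooling. The sketch assumes that strings without a time zone should be read as UTC, and it will also convert genuine string fields that happen to look like timestamps, so review the result before running it.

```javascript
// Matches strings such as "1993-03-17T00:00:00", with an optional
// fractional part and an optional zone ("Z" or "+hh:mm" / "-hh:mm").
const ISO_LIKE = /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:\d{2})?$/;

// Hypothetical helper: returns a copy of `value` in which every
// ISO-looking string has been replaced by a Date.
function restoreDates(value) {
  if (typeof value === "string") {
    const match = ISO_LIKE.exec(value);
    if (match === null) return value;
    // Assumption: a string without a zone is meant as UTC.
    return new Date(match[2] ? value : value + "Z");
  }
  if (Array.isArray(value)) return value.map(restoreDates);
  // Recurse only into plain objects so that other typed values
  // (existing Date objects, shell types) pass through unchanged.
  if (value !== null && typeof value === "object" &&
      Object.getPrototypeOf(value) === Object.prototype) {
    return Object.fromEntries(
      Object.entries(value).map(([key, child]) => [key, restoreDates(child)])
    );
  }
  return value;
}

const restored = restoreDates({ l_commitdate: { $lt: "1993-03-17T00:00:00" } });
console.log(restored.l_commitdate.$lt instanceof Date);
```

Compare the query plan of the converted query against the plan from the bug report to confirm the conversion had the intended effect.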
|
||||
|
||||
### EJSON output
|
||||
|
||||
@ -158,6 +173,8 @@ Sometimes the query will be provided in EJSON, so you will see:
|
||||
|
||||
in the output.
|
||||
|
||||
In mongosh, `aggregate()` does not support EJSON directly, so passing EJSON to it will succeed but will not produce the expected results.
|
||||
In mongosh, `aggregate()` does not support EJSON directly, so passing EJSON to it will succeed but
|
||||
will not produce the expected results.
|
||||
|
||||
Either pass this output through `EJSON.parse()` (if your input is a string) or `EJSON.deserialize()` (if your input is parsed already), or manually convert it to standard MQL.
|
||||
Either pass this output through `EJSON.parse()` (if your input is a string) or
|
||||
`EJSON.deserialize()` (if your input is parsed already), or manually convert it to standard MQL.
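
Because passing EJSON to `aggregate()` fails silently, it can help to check a query for EJSON type wrappers before running it. The following is a minimal sketch of such a check. The function name `findEjsonWrappers` and the key list are assumptions for illustration; the list covers only a few common EJSON wrapper keys and is not exhaustive.

```javascript
// A few common EJSON type-wrapper keys (not an exhaustive list).
const EJSON_KEYS = new Set([
  "$numberDecimal", "$numberLong", "$numberInt", "$numberDouble", "$date", "$oid",
]);

// Hypothetical helper: returns the paths at which EJSON wrappers occur
// in an already-parsed query or pipeline.
function findEjsonWrappers(value, path = "") {
  if (Array.isArray(value)) {
    return value.flatMap((child, i) => findEjsonWrappers(child, `${path}[${i}]`));
  }
  if (value === null || typeof value !== "object") return [];
  return Object.entries(value).flatMap(([key, child]) => {
    const childPath = path ? `${path}.${key}` : key;
    return EJSON_KEYS.has(key) ? [childPath] : findEjsonWrappers(child, childPath);
  });
}

// Illustrative pipeline; the field name is only an example.
const pipeline = [{ $match: { l_quantity: { $gt: { $numberDecimal: "10" } } } }];
console.log(findEjsonWrappers(pipeline));
```

An empty result suggests the pipeline can be passed to `aggregate()` as-is. Any reported path points at a value that still needs `EJSON.deserialize()` or manual conversion.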
|
||||
|
||||
@ -2,15 +2,18 @@
|
||||
|
||||
Bazel test targets for resmoke suites.
|
||||
|
||||
For documentation of the `resmoke_suite_test` rule, see [bazel/resmoke/README.md](bazel/resmoke/README.md).
|
||||
For documentation of the `resmoke_suite_test` rule, see
|
||||
[bazel/resmoke/README.md](bazel/resmoke/README.md).
|
||||
|
||||
## Configuring
|
||||
|
||||
In addition to attributes for `resmoke_suite_test`, the following are options for configuring test targets.
|
||||
In addition to attributes for `resmoke_suite_test`, the following are options for configuring test
|
||||
targets.
|
||||
|
||||
### tags
|
||||
|
||||
Arbitrary tags may also be added to group test targets for batch execution. For example, a custom tag lets you run all matching suites at once:
|
||||
Arbitrary tags may also be added to group test targets for batch execution. For example, a custom
|
||||
tag lets you run all matching suites at once:
|
||||
|
||||
```
|
||||
bazel test //jstests/suites/... --test_tag_filters=my_tag
|
||||
@ -26,7 +29,8 @@ The following tags have special meaning:
|
||||
|
||||
### target_compatible_with
|
||||
|
||||
Configure platforms/build options that the test is compatible with. Use this to exclude the test suite from platforms in CI.
|
||||
Configure platforms/build options that the test is compatible with. Use this to exclude the test
|
||||
suite from platforms in CI.
|
||||
|
||||
Example — exclude the test on PPC/S390x, MacOS, and TSAN builds:
|
||||
|
||||
|
||||
@ -1,6 +1,8 @@
|
||||
# JS Test Tags
|
||||
|
||||
JS Test files can leverage "tags" that suites can key off of to include and/or exclude as necessary. Not scheduling a test to run is much faster than the test doing an early-return when preconditions are not met.
|
||||
JS Test files can leverage "tags" that suites can key off of to include and/or exclude as necessary.
|
||||
Not scheduling a test to run is much faster than the test doing an early-return when preconditions
|
||||
are not met.
|
||||
|
||||
The simplest use case is having something like the following at the top of your js test file:
|
||||
|
||||
@ -38,7 +40,10 @@ and can also include (meta) comments:
|
||||
*/
|
||||
```
|
||||
|
||||
The tags are meant to be used in suite configurations, to [`include_with_any_tags`](../buildscripts/resmokeconfig/suites/README.md#selectorinclude_with_any_tags) and/or [`exclude_with_any_tags`](../buildscripts/resmokeconfig/suites/README.md#selectorexclude_with_any_tags):
|
||||
The tags are meant to be used in suite configurations, to
|
||||
[`include_with_any_tags`](../buildscripts/resmokeconfig/suites/README.md#selectorinclude_with_any_tags)
|
||||
and/or
|
||||
[`exclude_with_any_tags`](../buildscripts/resmokeconfig/suites/README.md#selectorexclude_with_any_tags):
|
||||
|
||||
```bash
|
||||
test_kind: js_test
|
||||
@ -50,7 +55,8 @@ selector:
|
||||
- disabled_for_fcv_6_1_upgrade
|
||||
```
|
||||
|
||||
Build variants can also use tags via the `test_flags` expansion, which facilitates tag-exclusions _across suites_ that run with the variant:
|
||||
Build variants can also use tags via the `test_flags` expansion, which facilitates tag-exclusions
|
||||
_across suites_ that run with the variant:
|
||||
|
||||
```
|
||||
expansions:
|
||||
@ -60,6 +66,9 @@ Build variants can also use tags via the `test_flags` expansion, which facilitat
|
||||
|
||||
## Available Tags
|
||||
|
||||
There is no current exhaustive list, since tags are arbitrary labels and do not need to be "registered". However, tags are always "global", and many are reused. Names should communicate clear intent and be reused/consolidated when appropriate.
|
||||
There is no current exhaustive list, since tags are arbitrary labels and do not need to be
|
||||
"registered". However, tags are always "global", and many are reused. Names should communicate
|
||||
clear intent and be reused/consolidated when appropriate.
|
||||
|
||||
> Use `buildscripts/resmoke.py list-tags` to find which tags are actively referenced by suite configs, although there may be more in JS files and Build Variant expansions.
|
||||
> Use `buildscripts/resmoke.py list-tags` to find which tags are actively referenced by suite
|
||||
> configs, although there may be more in JS files and Build Variant expansions.
|
||||
|
||||
Some files were not shown because too many files have changed in this diff.