Steve McClure 32e8f260de SERVER-124136 Format markdown via prettier: wrap lines and use width of 100 (#52231 )

GitOrigin-RevId: 3305c1e2ee3a6a2c3a5b2b7883b0f491a59ed646

2026-04-21 19:20:11 +00:00

Cost Model Calibrator

Getting Started

1) Set Up Mongod

First, prepare the MongoDB server:

Activate the standard virtual environment:

source python3-venv/bin/activate

Build the server with optimizations (this makes document insertion faster):

(python3-venv) bazel build --config=opt install-devcore

Run a mongod instance. This step is needed only for CBR calibration, because join_start.py manages mongod's lifecycle itself:

(python3-venv) bazel-bin/install-mongod/bin/mongod --setParameter internalMeasureQueryExecutionTimeInNanoseconds=true
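
Before starting CBR calibration, it can help to confirm that mongod is accepting connections. The following is a minimal sketch, not part of the calibrator: `wait_for_port` is a hypothetical helper, and it assumes mongod listens on localhost at its default port 27017, since the command above does not pass `--port`.

```python
import socket
import time


def wait_for_port(host="127.0.0.1", port=27017, timeout_s=30.0, interval_s=0.5):
    """Return True once a TCP connection to host:port succeeds, False on timeout.

    Hypothetical helper: it only checks that something is listening on the
    port, not that the listener is actually mongod.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # create_connection raises OSError (e.g. connection refused) on failure.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval_s)
```

Calling `wait_for_port()` with no arguments polls the assumed default address for up to 30 seconds.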

2) Set Up Cost Model Calibrator

In another terminal:

Navigate to the cost model directory:

cd buildscripts/cost_model

Set up a Python alias that points to the MongoDB toolchain:

alias python=/opt/mongodbtoolchain/v4/bin/python3

Deactivate any existing Python environment (if needed):

deactivate

Create a new virtual environment:

/opt/mongodbtoolchain/v4/bin/python3 -m venv cm

Activate the new environment:

source cm/bin/activate

Install required packages:

(cm) python -m pip install -r requirements.txt
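
The steps above switch between two virtual environments (python3-venv and cm), so it is easy to install packages into the wrong interpreter. One thing to be aware of: in bash and zsh an alias named `python` is resolved before PATH lookup, so `python` may still point at the toolchain interpreter after `source cm/bin/activate`. The sketch below is one way to check which interpreter is actually running; the function names are illustrative and not part of the calibrator.

```python
import sys


def in_virtualenv():
    """True when the interpreter runs inside a venv.

    Inside a venv, sys.prefix points at the environment while
    sys.base_prefix still points at the interpreter it was created from.
    """
    return sys.prefix != sys.base_prefix


def describe_interpreter():
    """Return a one-line description of the running interpreter."""
    kind = "virtual environment" if in_virtualenv() else "base interpreter"
    return f"{kind}: {sys.executable}"


print(describe_interpreter())
```

If this reports a base interpreter while the `(cm)` prompt is shown, invoking `cm/bin/python` directly avoids the ambiguity.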

Run the calibrator:

For CBR cost model calibration:
```
(cm) python start.py
```
For JOO cost model calibration:
```
(cm) python join_start.py
```
To skip the constant calibration (warm scan, CPU, sequential I/O, random I/O) and only run the join algorithm comparison:
```
(cm) python join_start.py --join-only
```
To iterate quickly on cost model changes, reuse pre-recorded execution times from a previous full run. This skips query execution entirely and runs only queryPlanner explains to collect fresh cost estimates:
```
(cm) python join_start.py --execution-times join_output/join_times_in-cache.csv join_output/join_times_exceeds-cache.csv
```
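
Before passing pre-recorded files to `--execution-times`, a quick sanity check that they exist and contain rows can save a failed run. The column layout of these CSV files is not described here, so this sketch only reads the header line and counts data rows; `summarize_csv` is a hypothetical helper.

```python
import csv
from pathlib import Path


def summarize_csv(path):
    """Return (header, row_count) for a CSV file.

    Raises FileNotFoundError if the file is missing. Makes no assumption
    about which columns the calibrator writes.
    """
    with Path(path).open(newline="") as f:
        reader = csv.reader(f)
        header = next(reader, [])  # empty list for an empty file
        row_count = sum(1 for _ in reader)
    return header, row_count
```

For example, `summarize_csv("join_output/join_times_in-cache.csv")` returns the header fields and the number of recorded rows for the first file in the command above.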

Note: For CBR calibration, the first run takes a while because it has to generate the data. Afterwards, as long as you aren't modifying the collections, you can comment out `await generator.populate_collections()` in start.py, which makes subsequent runs much faster.
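
As an alternative to commenting the line in and out, the populate step could be guarded by a switch. This is a sketch under stated assumptions, not how start.py currently works: `SKIP_POPULATE` is a hypothetical environment variable, `maybe_populate` is a hypothetical wrapper, and `FakeGenerator` merely stands in for the real generator object so the example is self-contained.

```python
import asyncio
import os


class FakeGenerator:
    """Stand-in for the real data generator; only records whether it ran."""

    def __init__(self):
        self.populated = False

    async def populate_collections(self):
        self.populated = True


async def maybe_populate(generator, skip=None):
    """Populate collections unless skipping was requested.

    When skip is None, fall back to the hypothetical SKIP_POPULATE
    environment variable ("1" means skip). Returns True if data was generated.
    """
    if skip is None:
        skip = os.environ.get("SKIP_POPULATE") == "1"
    if not skip:
        await generator.populate_collections()
    return not skip


print(asyncio.run(maybe_populate(FakeGenerator(), skip=True)))
```

The trade-off is the same as with commenting out the call: skipping is only safe while the collections from an earlier run are still present and unmodified.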

When done, deactivate the environment:

(cm) deactivate

Install New Packages

Install the package:

(cm) python -m pip install <package_name>

Update requirements.txt:

(cm) python -m pip freeze > requirements.txt
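
After regenerating requirements.txt, it can be useful to confirm that the pinned versions match what is installed in the active environment. The sketch below handles only plain `name==version` lines as written by `pip freeze`; editable installs and URL requirements are ignored. Both function names are illustrative.

```python
from importlib import metadata


def parse_pins(lines):
    """Extract {package_name: version} from 'name==version' lines."""
    pins = {}
    for line in lines:
        # Drop trailing comments and environment markers, then whitespace.
        line = line.split("#", 1)[0].split(";", 1)[0].strip()
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip().lower()] = version.strip()
    return pins


def find_mismatches(pins):
    """Return {name: (pinned, installed)} where the two differ.

    installed is None when the package is not installed at all.
    """
    mismatches = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches
```

With the cm environment active, `find_mismatches(parse_pins(open("requirements.txt")))` returns an empty dict when every pin matches the installed version.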
