SERVER-123290 Update extension host README with rules registration (#53966)

GitOrigin-RevId: a430843bb17367024ffa971275476187c3cea875
Adithi Raghavan 2026-05-19 16:56:24 -04:00 committed by MongoDB Bot
parent 43f60af37b
commit 229a1c0755

@@ -62,3 +62,86 @@ extensionOptions:
At startup, the host will first load a `.conf` file, then use its `sharedLibraryPath` to access and
load the `SharedLibrary` file that represents the extension and pass the corresponding
`extensionOptions` to its initialization function.
## Pipeline Rewrite Rules
Extensions can register pipeline rewrite rules to enable query optimizations. On the host side,
rules are stored in `_extensionRuleRegistry`, a static
`unordered_map<string, vector<PipelineRewriteRule>>` inside `DocumentSourceExtensionOptimizable`
that is populated once at startup when `registerStageRules()` is called through the host portal.
### Registering rules
The extension calls `_registerStageRules<StageDescriptor>(portal, rules)` inside `initialize()`:
```cpp
void initialize(const sdk::HostPortalHandle& portal) override {
    _registerStage<MyStageDescriptor>(portal);
    std::vector<PipelineRewriteRule> rules{
        {"absorbMatch", kPipelineRewriteRuleTagReordering},
        {"inlineProject", kPipelineRewriteRuleTagInPlace},
    };
    _registerStageRules<MyStageDescriptor>(portal, rules);
}
```
This call crosses the C API boundary through the host portal and arrives at `registerStageRules()`
on the host, which inserts the rules into `_extensionRuleRegistry` keyed by stage name. Each rule
has a **name** (used to dispatch to the correct extension override) and a **tag**:
- `kInPlace` (written as `kPipelineRewriteRuleTagInPlace` in the example above): the rule only
  modifies the stage's internal state. The host routes these rules through
  `extensionDispatcherInPlacePrecondition`.
- `kReordering` (written as `kPipelineRewriteRuleTagReordering` in the example above): the rule may
  erase or reorder adjacent pipeline stages. The host routes these rules through
  `extensionDispatcherReorderingPrecondition`.
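
As a rough illustration of the host-side storage described above, the sketch below models a registry keyed by stage name. It is not the host implementation: `RuleRegistry`, `RuleTag`, `rulesFor()`, and the simplified `PipelineRewriteRule` struct are hypothetical stand-ins, and appending on repeated registration is an assumption.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins for the real types; names and layout are assumptions.
enum class RuleTag { kInPlace, kReordering };

struct PipelineRewriteRule {
    std::string name;  // used later to dispatch to the extension override
    RuleTag tag;       // selects which host-side dispatcher precondition applies
};

class RuleRegistry {
public:
    // Stores the rules under the stage name. Appending on repeated registration
    // is an assumption of this sketch, not documented host behavior.
    void registerStageRules(const std::string& stageName,
                            const std::vector<PipelineRewriteRule>& rules) {
        assert(!stageName.empty());
        auto& slot = _rules[stageName];
        slot.insert(slot.end(), rules.begin(), rules.end());
    }

    // Returns a copy of the rules for a stage, or an empty vector if none exist.
    std::vector<PipelineRewriteRule> rulesFor(const std::string& stageName) const {
        auto it = _rules.find(stageName);
        if (it == _rules.end()) {
            return {};
        }
        return it->second;
    }

private:
    std::unordered_map<std::string, std::vector<PipelineRewriteRule>> _rules;
};
```

Registering `absorbMatch` and `inlineProject` under a single stage name and then calling `rulesFor()` with that name returns both rules in registration order; an unknown stage name yields an empty vector.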
### Dispatch at optimization time
When a `DocumentSourceExtensionOptimizable` stage is constructed, `_buildOwnedRewriteRules()` wraps
the entries from `_extensionRuleRegistry` for that stage name into host-side
`rule_based_rewrites::pipeline::PipelineRewriteRule` objects (via `wrapExtensionRule()`), caching
them in `_ownedRewriteRules`. The rules must be owned per stage instance because the host-side
Rule-Based Rewriter (RBR) rules hold callables that are bound to the specific extension stage object.
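
The per-instance ownership can be pictured with the simplified sketch below. It assumes the wrapped rule stores a `std::function` whose lambda captures the stage's `this` pointer; `Stage`, `HostRule`, `RegisteredRule`, and the `enabledRule` constructor argument are hypothetical names used only for illustration, and the real `wrapExtensionRule()` may differ.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical registry entry: only the rule name matters for this sketch.
struct RegisteredRule {
    std::string name;
};

// Hypothetical host-side rule: carries a callable bound to one stage instance.
struct HostRule {
    std::string name;
    std::function<bool()> precondition;
};

class Stage {
public:
    Stage(std::string enabledRule, const std::vector<RegisteredRule>& registered)
        : _enabledRule(std::move(enabledRule)) {
        for (const auto& rule : registered) {
            _ownedRewriteRules.push_back(wrapExtensionRule(rule));
        }
    }

    // The wrapped callables capture `this`, so copying the stage would leave
    // the copies pointing at the original object.
    Stage(const Stage&) = delete;
    Stage& operator=(const Stage&) = delete;

    // Stand-in for the extension override: accepts only one rule name.
    bool evaluateRulePrecondition(const std::string& ruleName) const {
        return ruleName == _enabledRule;
    }

    const std::vector<HostRule>& ownedRewriteRules() const { return _ownedRewriteRules; }

private:
    HostRule wrapExtensionRule(const RegisteredRule& rule) {
        assert(!rule.name.empty());
        return HostRule{rule.name,
                        [this, name = rule.name] { return evaluateRulePrecondition(name); }};
    }

    std::string _enabledRule;
    std::vector<HostRule> _ownedRewriteRules;
};
```

Two `Stage` objects built from the same registry entries end up with distinct rule objects, and each callable answers for its own stage, which is why a single shared rule list would not be enough in this model.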
During the optimization pass the Rule-Based Rewriter calls one of the two host-side dispatcher
preconditions. Each precondition calls `dispatchExtensionRules(ctx, tagFilter)`, which iterates
`_ownedRewriteRules` and queues any rule whose tag matches the filter with `ctx.addRule()`.
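
The tag filtering described above can be sketched as follows. This is a simplified model, not the host code: `OwnedRule`, `RuleTag`, and `RewriteContext` are hypothetical stand-ins, and `dispatchExtensionRules` is written as a free function that receives the owned rules explicitly, whereas the host version takes only `ctx` and `tagFilter`.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class RuleTag { kInPlace, kReordering };

// Hypothetical host-side rule owned by a stage instance.
struct OwnedRule {
    std::string name;
    RuleTag tag;
};

// Hypothetical rewrite context: addRule() records which rules were queued.
struct RewriteContext {
    std::vector<std::string> queued;

    void addRule(const OwnedRule& rule) {
        assert(!rule.name.empty());
        queued.push_back(rule.name);
    }
};

// Queues every owned rule whose tag matches the filter, preserving order.
void dispatchExtensionRules(RewriteContext& ctx,
                            const std::vector<OwnedRule>& ownedRules,
                            RuleTag tagFilter) {
    for (const auto& rule : ownedRules) {
        if (rule.tag == tagFilter) {
            ctx.addRule(rule);
        }
    }
}
```

With one reordering rule and one in-place rule owned by the stage, dispatching with `RuleTag::kInPlace` queues only the in-place rule, so each of the two dispatcher preconditions sees only the rules it is responsible for.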
### Precondition and transform callbacks
For each queued rule, the RBR engine calls the extension's `evaluateRulePrecondition`. The call
crosses the C API boundary through the `MongoExtensionLogicalAggStage` vtable, with
`PipelineRewriteContextAdapter` wrapping the host-side context so the extension can safely inspect
adjacent stages via its `PipelineRewriteContextHandle`.
```cpp
bool evaluateRulePrecondition(
    std::string_view ruleName,
    extension::ConstPipelineRewriteContextHandle ctx) const override {
    // absorbMatch applies only when the next stage is a $match.
    if (ruleName == "absorbMatch") {
        return ctx->hasAtLeastNNextStages(1) &&
            ctx->getNthNextStage(1)->getName() == "$match";
    }
    return false;
}
```
If the precondition returns `true`, the engine calls `evaluateRuleTransform` through the same vtable
path. Returning `true` from the transform signals a structural pipeline modification, causing the
rewriter to re-queue the stage for further optimization passes.
```cpp
bool evaluateRuleTransform(
    std::string_view ruleName,
    extension::PipelineRewriteContextHandle ctx) override {
    if (ruleName == "absorbMatch") {
        BSONObj filter = ctx->getNthNextStage(1)->getFilter();
        // fold filter into this stage's internal state …
        ctx->eraseNthNext(1);
        return true;
    }
    return false;
}
```
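
To make the precondition, transform, and re-queue sequence concrete, the following self-contained sketch simulates it without the C API boundary. Everything here is a simplified assumption: `RewriteContext` holds only the names of the following stages, index 1 is taken to mean the immediately following stage, and `applyRule()` and `runUntilStable()` are hypothetical helpers standing in for the RBR engine.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical, simplified context: only the names of the stages that follow
// the current one. Index 1 is assumed to mean the immediately following stage.
struct RewriteContext {
    std::vector<std::string> nextStages;

    bool hasAtLeastNNextStages(std::size_t n) const { return nextStages.size() >= n; }

    const std::string& nthNextStageName(std::size_t n) const {
        assert(n >= 1 && n <= nextStages.size());
        return nextStages[n - 1];
    }

    void eraseNthNext(std::size_t n) {
        assert(n >= 1 && n <= nextStages.size());
        nextStages.erase(nextStages.begin() + static_cast<std::ptrdiff_t>(n - 1));
    }
};

// Stand-in for an extension stage that absorbs a following $match.
struct ExtensionStage {
    int absorbedFilters = 0;

    bool evaluateRulePrecondition(std::string_view ruleName, const RewriteContext& ctx) const {
        if (ruleName == "absorbMatch") {
            return ctx.hasAtLeastNNextStages(1) && ctx.nthNextStageName(1) == "$match";
        }
        return false;
    }

    bool evaluateRuleTransform(std::string_view ruleName, RewriteContext& ctx) {
        if (ruleName == "absorbMatch") {
            ++absorbedFilters;    // stands in for folding the filter into the stage
            ctx.eraseNthNext(1);  // structural change to the pipeline
            return true;
        }
        return false;
    }
};

// Runs the transform only if the precondition holds; true means "re-queue".
bool applyRule(ExtensionStage& stage, std::string_view ruleName, RewriteContext& ctx) {
    if (!stage.evaluateRulePrecondition(ruleName, ctx)) {
        return false;
    }
    return stage.evaluateRuleTransform(ruleName, ctx);
}

// Re-applies the rule while it keeps reporting a change; returns the count.
std::size_t runUntilStable(ExtensionStage& stage, std::string_view ruleName, RewriteContext& ctx) {
    std::size_t applications = 0;
    while (applyRule(stage, ruleName, ctx)) {
        ++applications;
    }
    return applications;
}
```

In this model, a stage followed by two `$match` stages and a `$limit` absorbs both `$match` stages across two applications and then stops, because the precondition fails once `$limit` is next.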
See `src/mongo/db/extension/test_examples/desugar/vector_search_optimization.cpp` for a complete
worked example.