diff --git a/doc/developer/index.rst b/doc/developer/index.rst
new file mode 100644
index 000000000..2ce1e0536
--- /dev/null
+++ b/doc/developer/index.rst
@@ -0,0 +1,9 @@
+Developer Guide
+===============
+
+Technical guide for contributors to PyMongo.
+
+.. toctree::
+   :maxdepth: 1
+
+   periodic_executor
diff --git a/doc/developer/periodic_executor.rst b/doc/developer/periodic_executor.rst
new file mode 100644
index 000000000..b794ad80a
--- /dev/null
+++ b/doc/developer/periodic_executor.rst
@@ -0,0 +1,69 @@
+Periodic Executors
+==================
+
+.. currentmodule:: pymongo
+
+PyMongo implements a :class:`~periodic_executor.PeriodicExecutor` for two
+purposes: as the background thread for :class:`~monitor.Monitor`, and to
+regularly check if there are `OP_KILL_CURSORS` messages that must be sent to the server.
+
+Monitoring
+----------
+
+For each server in the topology, :class:`~topology.Topology` launches a
+monitor thread. This thread must not prevent the topology from being freed,
+so it weakrefs the topology. Furthermore, it uses a weakref callback to close
+itself promptly when the topology is freed.
+
+Solid lines represent strong references, dashed lines weak ones:
+
+.. generated with graphviz from periodic-executor-refs.dot
+
+.. image:: ../static/periodic-executor-refs.png
+
+See `Stopping Executors`_ below for an explanation of ``_EXECUTORS``.
+
+Killing Cursors
+---------------
+
+An incompletely iterated :class:`~cursor.Cursor` on the client represents an
+open cursor object on the server. In code like this, we lose a reference to
+the cursor before finishing iteration::
+
+    for doc in collection.find():
+        raise Exception()
+
+We try to send an `OP_KILL_CURSORS` to the server to tell it to clean up the
+server-side cursor. But we must not take any locks directly from the cursor's
+destructor (see `PYTHON-799 `_),
+so we cannot safely use the PyMongo data structures required to send a message.
+The solution is to add the cursor's id to an array on the
+:class:`~mongo_client.MongoClient` without taking any locks.
+
+Each client has a :class:`~periodic_executor.PeriodicExecutor` devoted to
+checking the array for cursor ids. Any it sees are the result of cursors that
+were freed while the server-side cursor was still open. The executor can safely
+take the locks it needs in order to send the `OP_KILL_CURSORS` message.
+
+Stopping Executors
+------------------
+
+Just as :class:`~cursor.Cursor` must not take any locks from its destructor,
+neither can :class:`~mongo_client.MongoClient` or :class:`~topology.Topology`.
+Thus, although the client calls :meth:`close` on its kill-cursors thread, and
+the topology calls :meth:`close` on all its monitor threads, the :meth:`close`
+method cannot actually call :meth:`wake` on the executor, since :meth:`wake`
+takes a lock.
+
+Instead, executors wake very frequently to check if ``self._stopped`` is set,
+and if so they exit.
+
+A thread can log spurious errors if it wakes late in the Python interpreter's
+shutdown sequence, so we try to join threads before then. Each periodic
+executor (either a monitor or a kill-cursors thread) adds a weakref to itself
+to a set called ``_EXECUTORS``, in the ``periodic_executor`` module.
+
+An `exit handler`_ runs on shutdown and tells all executors to stop, then
+tries (with a short timeout) to join all executor threads.
+
+.. _exit handler: https://docs.python.org/2/library/atexit.html
diff --git a/doc/index.rst b/doc/index.rst
index 471224cd5..ada6a87ee 100644
--- a/doc/index.rst
+++ b/doc/index.rst
@@ -32,6 +32,9 @@ everything you need to know to use **PyMongo**.
   A listing of Python tools and libraries that have been written for MongoDB.
 
+:doc:`developer/index`
+  Developer guide for contributors to PyMongo.
+
 Getting Help
 ------------
 If you're having trouble or have questions about PyMongo, the best place to ask is the `MongoDB user group `_.
 Once you get an answer, it'd be great if you could work it back into this documentation and contribute!
@@ -88,4 +91,4 @@ Indices and tables
    contributors
    changelog
    python3
-
+   developer/index
diff --git a/doc/static/periodic-executor-refs.dot b/doc/static/periodic-executor-refs.dot
new file mode 100644
index 000000000..003af9f57
--- /dev/null
+++ b/doc/static/periodic-executor-refs.dot
@@ -0,0 +1,16 @@
+digraph "Monitor and PeriodicExecutor" {
+  // Strong references.
+  topology -> server
+  server -> monitor
+  monitor -> executor
+  executor -> "target()"
+  "target()" -> self_ref
+
+  // Weak references
+  edge [style="dashed"];
+
+  self_ref -> monitor [curved=true]
+  monitor -> topology
+  executor -> thread
+  _EXECUTORS -> executor
+}
diff --git a/doc/static/periodic-executor-refs.png b/doc/static/periodic-executor-refs.png
new file mode 100644
index 000000000..9530a5669
Binary files /dev/null and b/doc/static/periodic-executor-refs.png differ
diff --git a/pymongo/periodic_executor.py b/pymongo/periodic_executor.py
index c630b9258..a71b378f0 100644
--- a/pymongo/periodic_executor.py
+++ b/pymongo/periodic_executor.py
@@ -70,7 +70,7 @@ class PeriodicExecutor(object):
         callback; see monitor.py.
 
         Since this can be called from a weakref callback during garbage
-        collection it must take no locks!
+        collection it must take no locks! That means it cannot call wake().
         """
         self._stopped = True