PYTHON-1993 Add client-side field level encryption documentation examples

Specify pymongocrypt<2.0.0 in setup.py for compatibility.
Shane Harvey 2019-11-14 14:43:36 -08:00
parent 849a415356
commit d0423d2d53
9 changed files with 540 additions and 45 deletions


@ -1,5 +1,5 @@
:mod:`encryption` -- Client side encryption
===========================================
:mod:`encryption` -- Client-Side Field Level Encryption
=======================================================
.. automodule:: pymongo.encryption
:members:


@ -1,8 +1,8 @@
:mod:`encryption_options` -- Support for automatic client side encryption
:mod:`encryption_options` -- Automatic Client-Side Field Level Encryption
=========================================================================
.. automodule:: pymongo.encryption_options
:synopsis: Support for automatic client side encryption
:synopsis: Support for automatic client-side field level encryption
.. autoclass:: pymongo.encryption_options.AutoEncryptionOpts
:members:

doc/examples/encryption.rst (new file)

@ -0,0 +1,498 @@
Client-Side Field Level Encryption
==================================
New in MongoDB 4.2, client-side field level encryption allows an application
to encrypt specific data fields in addition to pre-existing MongoDB
encryption features such as `Encryption at Rest
<https://docs.mongodb.com/manual/core/security-encryption-at-rest/>`_ and
`TLS/SSL (Transport Encryption)
<https://docs.mongodb.com/manual/core/security-transport-encryption/>`_.
With field level encryption, applications can encrypt fields in documents
*prior* to transmitting data over the wire to the server. Client-side field
level encryption supports workloads where applications must guarantee that
unauthorized parties, including server administrators, cannot read the
encrypted data.
.. seealso:: The MongoDB documentation for `Client-Side Field Level Encryption
<https://docs.mongodb.com/manual/core/security-client-side-encryption/>`_.
Dependencies
------------
To get started using client-side field level encryption in your project,
you will need to install the
`pymongocrypt <https://pypi.org/project/pymongocrypt/>`_ library
as well as the driver itself. Install both the driver and a compatible
version of pymongocrypt like this::

  $ python -m pip install 'pymongo[encryption]'
Note that installing on Linux requires pip 19 or later for manylinux2010 wheel
support. For more information about installing pymongocrypt see
`the installation instructions on the project's PyPI page
<https://pypi.org/project/pymongocrypt/>`_.
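Before constructing an encrypted client, an application may want to confirm that the optional dependency is importable. The following is a minimal sketch that uses only the standard library. The helper name ``have_pymongocrypt`` is hypothetical and not part of PyMongo, which raises its own ``ConfigurationError`` when the library is missing.

```python
import importlib.util


def have_pymongocrypt():
    """Return True if the pymongocrypt package can be imported.

    Hypothetical helper for illustration; not part of PyMongo.
    """
    # find_spec() returns None when no top-level module of that name exists.
    return importlib.util.find_spec("pymongocrypt") is not None


if __name__ == "__main__":
    if have_pymongocrypt():
        print("pymongocrypt is available")
    else:
        print("pymongocrypt is missing; install it with: "
              "python -m pip install 'pymongo[encryption]'")
```

A check like this only reports whether the package is present; it does not verify that the installed version satisfies the ``pymongocrypt<2.0.0`` constraint.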
mongocryptd
-----------
The ``mongocryptd`` binary is required for automatic client-side encryption
and is included as a component in the `MongoDB Enterprise Server package
<https://docs.mongodb.com/manual/administration/install-enterprise/>`_.
For detailed installation instructions see
`the MongoDB documentation on mongocryptd
<https://docs.mongodb.com/manual/reference/security-client-side-encryption-appendix/#mongocryptd>`_.
``mongocryptd`` performs the following:
- Parses the automatic encryption rules specified to the database connection.
If the JSON schema contains invalid automatic encryption syntax or any
document validation syntax, ``mongocryptd`` returns an error.
- Uses the specified automatic encryption rules to mark fields in read and
write operations for encryption.
- Rejects read/write operations that may return unexpected or incorrect results
when applied to an encrypted field. For supported and unsupported operations,
see `Read/Write Support with Automatic Field Level Encryption
<https://docs.mongodb.com/manual/reference/security-client-side-query-aggregation-support/>`_.
A MongoClient configured with auto encryption will automatically spawn the
``mongocryptd`` process from the application's ``PATH``. Applications can
control the spawning behavior as part of the automatic encryption options.
For example, to set the path to the ``mongocryptd`` process::

  auto_encryption_opts = AutoEncryptionOpts(
      ...,
      mongocryptd_spawn_path='/path/to/mongocryptd')
To control the logging output of ``mongocryptd``, pass options using
``mongocryptd_spawn_args``::

  auto_encryption_opts = AutoEncryptionOpts(
      ...,
      mongocryptd_spawn_args=['--logpath=/path/to/mongocryptd.log', '--logappend'])
If your application wishes to manage the ``mongocryptd`` process manually,
it is possible to disable spawning ``mongocryptd``::

  auto_encryption_opts = AutoEncryptionOpts(
      ...,
      mongocryptd_bypass_spawn=True,
      # URI of the local ``mongocryptd`` process.
      mongocryptd_uri='mongodb://localhost:27020')
``mongocryptd`` is only responsible for supporting automatic client-side field
level encryption and does not itself perform any encryption or decryption.
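The spawning options described in this section can be collected in one place. The sketch below is an illustration under stated assumptions: ``mongocryptd_kwargs`` and ``mongocryptd_on_path`` are hypothetical helpers, and only the returned option names (``mongocryptd_spawn_path``, ``mongocryptd_spawn_args``, ``mongocryptd_bypass_spawn``, ``mongocryptd_uri``) come from :class:`~pymongo.encryption_options.AutoEncryptionOpts`.

```python
import shutil


def mongocryptd_kwargs(spawn_path=None, log_path=None, external_uri=None):
    """Build mongocryptd-related keyword arguments for AutoEncryptionOpts.

    Hypothetical helper: only the returned option names come from PyMongo.
    """
    if external_uri is not None:
        # The application manages mongocryptd itself, so disable spawning
        # and point the driver at the already running process.
        return {"mongocryptd_bypass_spawn": True,
                "mongocryptd_uri": external_uri}
    kwargs = {}
    if spawn_path is not None:
        kwargs["mongocryptd_spawn_path"] = spawn_path
    if log_path is not None:
        kwargs["mongocryptd_spawn_args"] = [
            "--logpath=%s" % (log_path,), "--logappend"]
    return kwargs


def mongocryptd_on_path():
    """Return the full path of a mongocryptd binary found on PATH, or None."""
    return shutil.which("mongocryptd")


if __name__ == "__main__":
    print(mongocryptd_kwargs(log_path="/path/to/mongocryptd.log"))
    print("mongocryptd on PATH:", mongocryptd_on_path())
```

The resulting dictionary would be expanded into the constructor call, for example ``AutoEncryptionOpts(kms_providers, key_vault_namespace, **mongocryptd_kwargs(...))``.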
.. _automatic-client-side-encryption:
Automatic Client-Side Field Level Encryption
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Automatic client-side field level encryption is enabled by creating a
:class:`~pymongo.mongo_client.MongoClient` with the ``auto_encryption_opts``
option set to an instance of
:class:`~pymongo.encryption_options.AutoEncryptionOpts`. The following
examples show how to set up automatic client-side field level encryption
using :class:`~pymongo.encryption.ClientEncryption` to create a new
encryption data key.
.. note:: Automatic client-side field level encryption requires MongoDB 4.2
enterprise or a MongoDB 4.2 Atlas cluster. The community version of the
server supports automatic decryption as well as
:ref:`explicit-client-side-encryption`.
Providing Local Automatic Encryption Rules
``````````````````````````````````````````
The following example shows how to specify automatic encryption rules via the
``schema_map`` option. The automatic encryption rules are expressed using a
`strict subset of the JSON Schema syntax
<https://docs.mongodb.com/manual/reference/security-client-side-automatic-json-schema/>`_.
Supplying a ``schema_map`` provides more security than relying on
JSON Schemas obtained from the server. It protects against a
malicious server advertising a false JSON Schema, which could trick
the client into sending unencrypted data that should be encrypted.
JSON Schemas supplied in the ``schema_map`` only apply to configuring
automatic client-side field level encryption. Other validation
rules in the JSON schema will not be enforced by the driver and
will result in an error::

  import os

  from bson.codec_options import CodecOptions
  from bson import json_util

  from pymongo import MongoClient
  from pymongo.encryption import (Algorithm,
                                  ClientEncryption)
  from pymongo.encryption_options import AutoEncryptionOpts


  def create_json_schema_file(kms_providers, key_vault_namespace,
                              key_vault_client):
      client_encryption = ClientEncryption(
          kms_providers,
          key_vault_namespace,
          key_vault_client,
          # The CodecOptions class used for encrypting and decrypting.
          # This should be the same CodecOptions instance you have configured
          # on MongoClient, Database, or Collection. We will not be calling
          # encrypt() or decrypt() in this example so we can use any
          # CodecOptions.
          CodecOptions())

      # Create a new data key and json schema for the encryptedField.
      # https://docs.mongodb.com/manual/reference/security-client-side-automatic-json-schema/
      data_key_id = client_encryption.create_data_key(
          'local', key_alt_names=['pymongo_encryption_example_1'])
      schema = {
          "properties": {
              "encryptedField": {
                  "encrypt": {
                      "keyId": [data_key_id],
                      "bsonType": "string",
                      "algorithm":
                          Algorithm.AEAD_AES_256_CBC_HMAC_SHA_512_Deterministic
                  }
              }
          },
          "bsonType": "object"
      }
      # Use CANONICAL_JSON_OPTIONS so that other drivers and tools will be
      # able to parse the MongoDB extended JSON file.
      json_schema_string = json_util.dumps(
          schema, json_options=json_util.CANONICAL_JSON_OPTIONS)

      with open('jsonSchema.json', 'w') as file:
          file.write(json_schema_string)


  def main():
      # The MongoDB namespace (db.collection) used to store the
      # encrypted documents in this example.
      encrypted_namespace = "test.coll"

      # This must be the same master key that was used to create
      # the encryption key.
      local_master_key = os.urandom(96)
      kms_providers = {"local": {"key": local_master_key}}

      # The MongoDB namespace (db.collection) used to store
      # the encryption data keys.
      key_vault_namespace = "encryption.__pymongoTestKeyVault"
      key_vault_db_name, key_vault_coll_name = key_vault_namespace.split(".", 1)

      # The MongoClient used to access the key vault (key_vault_namespace).
      key_vault_client = MongoClient()
      key_vault = key_vault_client[key_vault_db_name][key_vault_coll_name]
      # Ensure that two data keys cannot share the same keyAltName.
      key_vault.drop()
      key_vault.create_index(
          "keyAltNames",
          unique=True,
          partialFilterExpression={"keyAltNames": {"$exists": True}})

      create_json_schema_file(
          kms_providers, key_vault_namespace, key_vault_client)

      # Load the JSON Schema and construct the local schema_map option.
      with open('jsonSchema.json', 'r') as file:
          json_schema_string = file.read()
      json_schema = json_util.loads(json_schema_string)
      schema_map = {encrypted_namespace: json_schema}

      auto_encryption_opts = AutoEncryptionOpts(
          kms_providers, key_vault_namespace, schema_map=schema_map)

      client = MongoClient(auto_encryption_opts=auto_encryption_opts)
      db_name, coll_name = encrypted_namespace.split(".", 1)
      coll = client[db_name][coll_name]
      # Clear old data
      coll.drop()

      coll.insert_one({"encryptedField": "123456789"})
      print('Decrypted document: %s' % (coll.find_one(),))
      unencrypted_coll = MongoClient()[db_name][coll_name]
      print('Encrypted document: %s' % (unencrypted_coll.find_one(),))


  if __name__ == "__main__":
      main()
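The shape of the ``schema_map`` value can be seen without a running server. The following sketch builds the same structure from plain dictionaries. The helper names are hypothetical, the key id is a placeholder string (in the example above it is the value returned by ``create_data_key``), and the algorithm string is assumed to match the value of ``Algorithm.AEAD_AES_256_CBC_HMAC_SHA_512_Deterministic``.

```python
# Assumed to equal Algorithm.AEAD_AES_256_CBC_HMAC_SHA_512_Deterministic.
DETERMINISTIC = "AEAD_AES_256_CBC_HMAC_SHA_512-Deterministic"


def encrypt_rule(key_id, bson_type, algorithm=DETERMINISTIC):
    """Return the "encrypt" rule for a single field (hypothetical helper)."""
    return {
        "encrypt": {
            "keyId": [key_id],
            "bsonType": bson_type,
            "algorithm": algorithm,
        }
    }


def build_schema(rules):
    """Wrap per-field rules in an object schema (hypothetical helper)."""
    return {"bsonType": "object", "properties": dict(rules)}


def build_schema_map(namespace, schema):
    """Map a "db.collection" namespace to its encryption schema."""
    return {namespace: schema}


if __name__ == "__main__":
    # "<data key id>" stands in for the value returned by create_data_key().
    schema = build_schema(
        {"encryptedField": encrypt_rule("<data key id>", "string")})
    print(build_schema_map("test.coll", schema))
```

Keeping the rules in small functions like these makes it easier to add further encrypted fields to the same collection schema.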
Server-Side Field Level Encryption Enforcement
``````````````````````````````````````````````
The MongoDB 4.2 server supports using schema validation to enforce encryption
of specific fields in a collection. This schema validation will prevent an
application from inserting unencrypted values for any fields marked with the
``"encrypt"`` JSON schema keyword.
The following example shows how to set up automatic client-side field level
encryption using
:class:`~pymongo.encryption.ClientEncryption` to create a new encryption
data key and create a collection with the
`Automatic Encryption JSON Schema Syntax
<https://docs.mongodb.com/manual/reference/security-client-side-automatic-json-schema/>`_::

  import os

  from bson.codec_options import CodecOptions
  from bson.binary import STANDARD

  from pymongo import MongoClient
  from pymongo.encryption import (Algorithm,
                                  ClientEncryption)
  from pymongo.encryption_options import AutoEncryptionOpts
  from pymongo.errors import OperationFailure
  from pymongo.write_concern import WriteConcern


  def main():
      # The MongoDB namespace (db.collection) used to store the
      # encrypted documents in this example.
      encrypted_namespace = "test.coll"

      # This must be the same master key that was used to create
      # the encryption key.
      local_master_key = os.urandom(96)
      kms_providers = {"local": {"key": local_master_key}}

      # The MongoDB namespace (db.collection) used to store
      # the encryption data keys.
      key_vault_namespace = "encryption.__pymongoTestKeyVault"
      key_vault_db_name, key_vault_coll_name = key_vault_namespace.split(".", 1)

      # The MongoClient used to access the key vault (key_vault_namespace).
      key_vault_client = MongoClient()
      key_vault = key_vault_client[key_vault_db_name][key_vault_coll_name]
      # Ensure that two data keys cannot share the same keyAltName.
      key_vault.drop()
      key_vault.create_index(
          "keyAltNames",
          unique=True,
          partialFilterExpression={"keyAltNames": {"$exists": True}})

      client_encryption = ClientEncryption(
          kms_providers,
          key_vault_namespace,
          key_vault_client,
          # The CodecOptions class used for encrypting and decrypting.
          # This should be the same CodecOptions instance you have configured
          # on MongoClient, Database, or Collection. We will not be calling
          # encrypt() or decrypt() in this example so we can use any
          # CodecOptions.
          CodecOptions())

      # Create a new data key and json schema for the encryptedField.
      data_key_id = client_encryption.create_data_key(
          'local', key_alt_names=['pymongo_encryption_example_2'])
      json_schema = {
          "properties": {
              "encryptedField": {
                  "encrypt": {
                      "keyId": [data_key_id],
                      "bsonType": "string",
                      "algorithm":
                          Algorithm.AEAD_AES_256_CBC_HMAC_SHA_512_Deterministic
                  }
              }
          },
          "bsonType": "object"
      }

      auto_encryption_opts = AutoEncryptionOpts(
          kms_providers, key_vault_namespace)
      client = MongoClient(auto_encryption_opts=auto_encryption_opts)

      db_name, coll_name = encrypted_namespace.split(".", 1)
      db = client[db_name]
      # Clear old data
      db.drop_collection(coll_name)
      # Create the collection with the encryption JSON Schema.
      db.create_collection(
          coll_name,
          # uuid_representation=STANDARD is required to ensure that any
          # UUIDs in the $jsonSchema document are encoded to BSON Binary
          # with the standard UUID subtype 4. This is only needed when
          # running the "create" collection command with an encryption
          # JSON Schema.
          codec_options=CodecOptions(uuid_representation=STANDARD),
          write_concern=WriteConcern(w="majority"),
          validator={"$jsonSchema": json_schema})
      coll = client[db_name][coll_name]

      coll.insert_one({"encryptedField": "123456789"})
      print('Decrypted document: %s' % (coll.find_one(),))
      unencrypted_coll = MongoClient()[db_name][coll_name]
      print('Encrypted document: %s' % (unencrypted_coll.find_one(),))
      try:
          unencrypted_coll.insert_one({"encryptedField": "123456789"})
      except OperationFailure as exc:
          print('Unencrypted insert failed: %s' % (exc.details,))


  if __name__ == "__main__":
      main()
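The examples above split each ``db.collection`` namespace with ``split(".", 1)``. Splitting only on the first dot matters because a collection name may itself contain dots, while a database name may not. A minimal sketch of that step with basic validation follows; ``split_namespace`` is a hypothetical helper, not a PyMongo function.

```python
def split_namespace(namespace):
    """Split "db.collection" into (db_name, coll_name).

    Only the first dot separates the two parts, so dots inside the
    collection name are preserved. Hypothetical helper for illustration.
    """
    db_name, sep, coll_name = namespace.partition(".")
    if not sep or not db_name or not coll_name:
        raise ValueError(
            "namespace must have the form 'db.collection': %r" % (namespace,))
    return db_name, coll_name


if __name__ == "__main__":
    print(split_namespace("encryption.__pymongoTestKeyVault"))
    print(split_namespace("test.coll.with.dots"))
```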
.. _explicit-client-side-encryption:
Explicit Encryption
~~~~~~~~~~~~~~~~~~~
Explicit encryption is a MongoDB community feature and does not use the
``mongocryptd`` process. Explicit encryption is provided by the
:class:`~pymongo.encryption.ClientEncryption` class, for example::

  import os

  from pymongo import MongoClient
  from pymongo.encryption import (Algorithm,
                                  ClientEncryption)


  def main():
      # This must be the same master key that was used to create
      # the encryption key.
      local_master_key = os.urandom(96)
      kms_providers = {"local": {"key": local_master_key}}

      # The MongoDB namespace (db.collection) used to store
      # the encryption data keys.
      key_vault_namespace = "encryption.__pymongoTestKeyVault"
      key_vault_db_name, key_vault_coll_name = key_vault_namespace.split(".", 1)

      # The MongoClient used to read/write application data.
      client = MongoClient()
      coll = client.test.coll
      # Clear old data
      coll.drop()

      # Set up the key vault (key_vault_namespace) for this example.
      key_vault = client[key_vault_db_name][key_vault_coll_name]
      # Ensure that two data keys cannot share the same keyAltName.
      key_vault.drop()
      key_vault.create_index(
          "keyAltNames",
          unique=True,
          partialFilterExpression={"keyAltNames": {"$exists": True}})

      client_encryption = ClientEncryption(
          kms_providers,
          key_vault_namespace,
          # The MongoClient to use for reading/writing to the key vault.
          # This can be the same MongoClient used by the main application.
          client,
          # The CodecOptions class used for encrypting and decrypting.
          # This should be the same CodecOptions instance you have configured
          # on MongoClient, Database, or Collection.
          coll.codec_options)

      # Create a new data key for the encryptedField.
      data_key_id = client_encryption.create_data_key(
          'local', key_alt_names=['pymongo_encryption_example_3'])

      # Explicitly encrypt a field:
      encrypted_field = client_encryption.encrypt(
          "123456789",
          Algorithm.AEAD_AES_256_CBC_HMAC_SHA_512_Deterministic,
          key_id=data_key_id)
      coll.insert_one({"encryptedField": encrypted_field})
      doc = coll.find_one()
      print('Encrypted document: %s' % (doc,))

      # Explicitly decrypt the field:
      doc["encryptedField"] = client_encryption.decrypt(doc["encryptedField"])
      print('Decrypted document: %s' % (doc,))

      # Cleanup resources.
      client_encryption.close()
      client.close()


  if __name__ == "__main__":
      main()
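The examples call ``os.urandom(96)`` on every run, which works only because they also drop the key vault each time: data keys created under one master key cannot be used with a different one. For local experiments that need to survive a restart, the key has to be stored and reloaded. The sketch below shows one way to do that with a file. The helper name and the file location are hypothetical, and a plain file is assumed to be acceptable only for local testing, not as a production key store.

```python
import os
import tempfile

# The examples generate a 96-byte key for the "local" KMS provider.
LOCAL_MASTER_KEY_SIZE = 96


def load_or_create_master_key(path):
    """Return the local master key stored at path, creating it on first use.

    Hypothetical helper for local testing only.
    """
    try:
        with open(path, "rb") as key_file:
            key = key_file.read()
    except FileNotFoundError:
        key = os.urandom(LOCAL_MASTER_KEY_SIZE)
        # O_EXCL refuses to overwrite a file that appeared in the meantime;
        # mode 0o600 requests owner-only permissions where supported.
        flags = (os.O_WRONLY | os.O_CREAT | os.O_EXCL
                 | getattr(os, "O_BINARY", 0))
        fd = os.open(path, flags, 0o600)
        with os.fdopen(fd, "wb") as key_file:
            key_file.write(key)
    if len(key) != LOCAL_MASTER_KEY_SIZE:
        raise ValueError("expected a %d-byte master key, got %d bytes"
                         % (LOCAL_MASTER_KEY_SIZE, len(key)))
    return key


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp_dir:
        key_path = os.path.join(tmp_dir, "master-key.bin")
        first = load_or_create_master_key(key_path)
        second = load_or_create_master_key(key_path)
        print("same key on reload:", first == second)
```

The returned bytes would take the place of ``local_master_key`` in ``kms_providers = {"local": {"key": local_master_key}}``.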
Explicit Encryption with Automatic Decryption
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Although automatic encryption requires MongoDB 4.2 enterprise or a
MongoDB 4.2 Atlas cluster, automatic *decryption* is supported for all users.
To configure automatic *decryption* without automatic *encryption* set
``bypass_auto_encryption=True`` in
:class:`~pymongo.encryption_options.AutoEncryptionOpts`::

  import os

  from pymongo import MongoClient
  from pymongo.encryption import (Algorithm,
                                  ClientEncryption)
  from pymongo.encryption_options import AutoEncryptionOpts


  def main():
      # This must be the same master key that was used to create
      # the encryption key.
      local_master_key = os.urandom(96)
      kms_providers = {"local": {"key": local_master_key}}

      # The MongoDB namespace (db.collection) used to store
      # the encryption data keys.
      key_vault_namespace = "encryption.__pymongoTestKeyVault"
      key_vault_db_name, key_vault_coll_name = key_vault_namespace.split(".", 1)

      # bypass_auto_encryption=True disables automatic encryption but keeps
      # the automatic _decryption_ behavior. bypass_auto_encryption will
      # also disable spawning mongocryptd.
      auto_encryption_opts = AutoEncryptionOpts(
          kms_providers, key_vault_namespace, bypass_auto_encryption=True)

      client = MongoClient(auto_encryption_opts=auto_encryption_opts)
      coll = client.test.coll
      # Clear old data
      coll.drop()

      # Set up the key vault (key_vault_namespace) for this example.
      key_vault = client[key_vault_db_name][key_vault_coll_name]
      # Ensure that two data keys cannot share the same keyAltName.
      key_vault.drop()
      key_vault.create_index(
          "keyAltNames",
          unique=True,
          partialFilterExpression={"keyAltNames": {"$exists": True}})

      client_encryption = ClientEncryption(
          kms_providers,
          key_vault_namespace,
          # The MongoClient to use for reading/writing to the key vault.
          # This can be the same MongoClient used by the main application.
          client,
          # The CodecOptions class used for encrypting and decrypting.
          # This should be the same CodecOptions instance you have configured
          # on MongoClient, Database, or Collection.
          coll.codec_options)

      # Create a new data key for the encryptedField.
      data_key_id = client_encryption.create_data_key(
          'local', key_alt_names=['pymongo_encryption_example_4'])

      # Explicitly encrypt a field:
      encrypted_field = client_encryption.encrypt(
          "123456789",
          Algorithm.AEAD_AES_256_CBC_HMAC_SHA_512_Deterministic,
          key_alt_name='pymongo_encryption_example_4')
      coll.insert_one({"encryptedField": encrypted_field})
      # Automatically decrypts any encrypted fields.
      doc = coll.find_one()
      print('Decrypted document: %s' % (doc,))
      unencrypted_coll = MongoClient().test.coll
      print('Encrypted document: %s' % (unencrypted_coll.find_one(),))

      # Cleanup resources.
      client_encryption.close()
      client.close()


  if __name__ == "__main__":
      main()


@ -31,3 +31,4 @@ MongoDB, you can start it like so:
server_selection
tailable
tls
encryption


@ -25,6 +25,9 @@ everything you need to know to use **PyMongo**.
:doc:`examples/tls`
Using PyMongo with TLS / SSL.
:doc:`examples/encryption`
Using PyMongo with client side encryption.
:doc:`faq`
Some questions that come up often.


@ -12,11 +12,7 @@
# See the License for the specific language governing permissions and
# limitations under the License.
"""Support for explicit client side encryption.
**Support for client side encryption is in beta. Backwards-breaking changes
may be made before the final release.**
"""
"""Support for explicit client-side field level encryption."""
import contextlib
import os
@ -35,7 +31,7 @@ except ImportError:
_HAVE_PYMONGOCRYPT = False
MongoCryptCallback = object
from bson import _bson_to_dict, _dict_to_bson, decode, encode
from bson import _dict_to_bson, decode, encode
from bson.codec_options import CodecOptions
from bson.binary import (Binary,
STANDARD,
@ -204,13 +200,13 @@ class _EncryptionIO(MongoCryptCallback):
:Returns:
The _id of the inserted data key document.
"""
# insert does not return the inserted _id when given a RawBSONDocument.
doc = _bson_to_dict(data_key, _DATA_KEY_OPTS)
if not isinstance(doc.get('_id'), uuid.UUID):
raise TypeError(
'data_key _id must be a bson.binary.Binary with subtype 4')
res = self.key_vault_coll.insert_one(doc)
return Binary(res.inserted_id.bytes, subtype=UUID_SUBTYPE)
raw_doc = RawBSONDocument(data_key)
data_key_id = raw_doc.get('_id')
if not isinstance(data_key_id, uuid.UUID):
raise TypeError('data_key _id must be a UUID')
self.key_vault_coll.insert_one(raw_doc)
return Binary(data_key_id.bytes, subtype=UUID_SUBTYPE)
def bson_encode(self, doc):
"""Encode a document to BSON.
@ -338,11 +334,11 @@ class Algorithm(object):
class ClientEncryption(object):
"""Explicit client side encryption."""
"""Explicit client-side field level encryption."""
def __init__(self, kms_providers, key_vault_namespace, key_vault_client,
codec_options):
"""Explicit client side encryption.
"""Explicit client-side field level encryption.
The ClientEncryption class encapsulates explicit operations on a key
vault collection that cannot be done directly on a MongoClient. Similar
@ -353,8 +349,7 @@ class ClientEncryption(object):
creating data keys. It does not provide an API to query keys from the
key vault collection, as this can be done directly on the MongoClient.
.. note:: Support for client side encryption is in beta.
Backwards-breaking changes may be made before the final release.
See :ref:`explicit-client-side-encryption` for an example.
:Parameters:
- `kms_providers`: Map of KMS provider options. Two KMS providers
@ -377,14 +372,17 @@ class ClientEncryption(object):
containing the `key_vault_namespace` collection.
- `codec_options`: An instance of
:class:`~bson.codec_options.CodecOptions` to use when encoding a
value for encryption and decoding the decrypted BSON value.
value for encryption and decoding the decrypted BSON value. This
should be the same CodecOptions instance configured on the
MongoClient, Database, or Collection used to access application
data.
.. versionadded:: 3.9
"""
if not _HAVE_PYMONGOCRYPT:
raise ConfigurationError(
"client side encryption requires the pymongocrypt library: "
"install a compatible version with: "
"client-side field level encryption requires the pymongocrypt "
"library: install a compatible version with: "
"python -m pip install 'pymongo[encryption]'")
if not isinstance(codec_options, CodecOptions):


@ -12,11 +12,7 @@
# See the License for the specific language governing permissions and
# limitations under the License.
"""Support for automatic client side encryption.
**Support for client side encryption is in beta. Backwards-breaking changes
may be made before the final release.**
"""
"""Support for automatic client-side field level encryption."""
import copy
@ -30,7 +26,7 @@ from pymongo.errors import ConfigurationError
class AutoEncryptionOpts(object):
"""Options to configure automatic encryption."""
"""Options to configure automatic client-side field level encryption."""
def __init__(self, kms_providers, key_vault_namespace,
key_vault_client=None, schema_map=None,
@ -39,21 +35,21 @@ class AutoEncryptionOpts(object):
mongocryptd_bypass_spawn=False,
mongocryptd_spawn_path='mongocryptd',
mongocryptd_spawn_args=None):
"""Options to configure automatic encryption.
"""Options to configure automatic client-side field level encryption.
Automatic encryption is an enterprise only feature that only
applies to operations on a collection. Automatic encryption is not
Automatic client-side field level encryption requires MongoDB 4.2
enterprise or a MongoDB 4.2 Atlas cluster. Automatic encryption is not
supported for operations on a database or view and will result in
error. To bypass automatic encryption (but enable automatic
decryption), set ``bypass_auto_encryption=True`` in
AutoEncryptionOpts.
error.
Explicit encryption/decryption and automatic decryption is a
community feature. A MongoClient configured with
bypassAutoEncryption=true will still automatically decrypt.
Although automatic encryption requires MongoDB 4.2 enterprise or a
MongoDB 4.2 Atlas cluster, automatic *decryption* is supported for all
users. To configure automatic *decryption* without automatic
*encryption* set ``bypass_auto_encryption=True``. Explicit
encryption and explicit decryption are also supported for all users
with the :class:`~pymongo.encryption.ClientEncryption` class.
.. note:: Support for client side encryption is in beta.
Backwards-breaking changes may be made before the final release.
See :ref:`automatic-client-side-encryption` for an example.
:Parameters:
- `kms_providers`: Map of KMS provider options. Two KMS providers


@ -481,9 +481,8 @@ class MongoClient(common.BaseObject):
- `auto_encryption_opts`: A
:class:`~pymongo.encryption_options.AutoEncryptionOpts` which
configures this client to automatically encrypt collection commands
and automatically decrypt results. **Support for client side
encryption is in beta. Backwards-breaking changes may be made
before the final release.**
and automatically decrypt results. See
:ref:`automatic-client-side-encryption` for an example.
.. mongodoc:: connections


@ -318,7 +318,7 @@ ext_modules = [Extension('bson._cbson',
'bson/buffer.c'])]
extras_require = {
'encryption': ['pymongocrypt'], # For client side field level encryption.
'encryption': ['pymongocrypt<2.0.0'],
'snappy': ['python-snappy'],
'zstd': ['zstandard'],
}