PYTHON-767 Support JSON strict mode $date output
PYTHON-1039 Support JSON strict mode $numberLong output
PYTHON-1103 Support JSON strict mode UUID output
PYTHON-1111 Support custom document class in loads
PYTHON-1111 Support tz_aware and tzinfo in loads
Refactor milliseconds to datetime conversions
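For illustration only, a minimal sketch of a milliseconds/datetime conversion
using integer arithmetic. It assumes the value is milliseconds since the Unix
epoch in UTC; the helper names millis_to_datetime and datetime_to_millis are
hypothetical and are not taken from the library.

```python
import datetime

# Unix epoch as naive and timezone-aware datetimes.
EPOCH_NAIVE = datetime.datetime(1970, 1, 1)
EPOCH_AWARE = datetime.datetime(1970, 1, 1, tzinfo=datetime.timezone.utc)


def millis_to_datetime(millis, tz_aware=False):
    """Convert integer milliseconds since the Unix epoch to a datetime.

    Hypothetical helper for illustration; not the library's function.
    """
    # Adding a timedelta keeps the arithmetic in integers, so no float
    # rounding is involved.
    delta = datetime.timedelta(milliseconds=millis)
    return (EPOCH_AWARE if tz_aware else EPOCH_NAIVE) + delta


def datetime_to_millis(dt):
    """Convert a datetime to milliseconds (naive values are treated as UTC)."""
    if dt.tzinfo is not None:
        dt = dt.astimezone(datetime.timezone.utc).replace(tzinfo=None)
    delta = dt - EPOCH_NAIVE
    # Integer arithmetic keeps millisecond precision exact.
    return (delta.days * 86400 + delta.seconds) * 1000 + delta.microseconds // 1000
```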
This change resolves four issues:
PYTHON-826 The new codec_options submodule is moved from pymongo to bson.
PYTHON-827 Use codec_options in BSON APIs.
Functions and methods of the bson module that accepted the options as_class,
tz_aware, and uuid_subtype now accept a codec_options parameter instead.
For example, the function definition for bson.decode_all changes from this:

    def decode_all(data, as_class=dict, tz_aware=True,
                   uuid_subtype=OLD_UUID_SUBTYPE)

to:

    def decode_all(data, codec_options=CodecOptions())
The following functions are changed:
- decode_all
- decode_iter
- decode_file_iter
The following methods are changed:
- BSON.encode
- BSON.decode
This is a breaking change for any application that uses the BSON API directly
and passes a non-default value for any of these parameters. Applications that
rely on the default values for these options need no changes, and their
behavior remains the same.
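The pattern can be sketched as follows. This is a toy stand-in, not the
library's code: the CodecOptions namedtuple, its defaults, and the
build_document helper are assumptions made for illustration, and uuid_subtype
is left out for brevity.

```python
import collections
import datetime

# Hypothetical stand-in for the options object. The field names follow the
# parameters named above; the defaults are assumptions.
CodecOptions = collections.namedtuple("CodecOptions", ["as_class", "tz_aware"])
DEFAULT_OPTIONS = CodecOptions(as_class=dict, tz_aware=True)


def build_document(pairs, codec_options=DEFAULT_OPTIONS):
    """Toy 'decoder': build a document from already-parsed (key, value) pairs."""
    doc = codec_options.as_class()
    for key, value in pairs:
        if isinstance(value, datetime.datetime) and codec_options.tz_aware:
            # Mark naive datetimes as UTC when tz_aware is requested.
            value = value.replace(tzinfo=datetime.timezone.utc)
        doc[key] = value
    return doc


pairs = [("when", datetime.datetime(2014, 1, 1)), ("n", 1)]

# Callers that relied on the defaults pass nothing extra.
default_doc = build_document(pairs)

# Callers that used to pass individual keyword arguments now bundle them.
custom = CodecOptions(as_class=collections.OrderedDict, tz_aware=False)
custom_doc = build_document(pairs, codec_options=custom)
```

A single options argument keeps the function signatures stable when further
options are added later.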
PYTHON-828 Internal BSON module changes to support CodecOptions
The pure Python BSON module passes around a CodecOptions instance instead of
as_class, tz_aware, and uuid_subtype. C extensions pass these values around in
a struct.
PYTHON-801 Rename uuid_subtype to uuid_representation.
When decoding large collections of BSON documents, the Python dict
representation is costly in both time and space, so it's sometimes useful to
generate and consume the documents iteratively. This patch adds two new
functions to do that: decode_iter and decode_file_iter. The first is given all
the BSON data but yields one document at a time, while the second reads only
enough from a file object to yield one document at a time (to avoid reading in
an entire file).
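The two iteration strategies can be sketched with plain generators. This is not
the code from the patch: it only splits the input into raw per-document byte
strings using the little-endian int32 size prefix that every BSON document
starts with, and leaves the actual decoding out. The function names
iter_raw_documents and iter_raw_documents_from_file are hypothetical.

```python
import io
import struct


def iter_raw_documents(data):
    """Yield each BSON document in ``data`` as bytes, one at a time."""
    position = 0
    end = len(data)
    while position < end:
        if end - position < 4:
            raise ValueError("truncated length prefix")
        # The size prefix counts the whole document, including itself.
        (size,) = struct.unpack_from("<i", data, position)
        if size < 5 or position + size > end:
            raise ValueError("invalid document size")
        yield data[position:position + size]
        position += size


def iter_raw_documents_from_file(file_obj):
    """Yield raw BSON documents from a binary file without reading it all."""
    while True:
        prefix = file_obj.read(4)
        if not prefix:
            return  # clean end of file
        if len(prefix) < 4:
            raise ValueError("truncated length prefix")
        (size,) = struct.unpack("<i", prefix)
        if size < 5:
            raise ValueError("invalid document size")
        # Read only the remainder of this one document.
        body = file_obj.read(size - 4)
        if len(body) < size - 4:
            raise ValueError("truncated document")
        yield prefix + body


# Two hand-built documents: {} and {"a": 1} (int32 value).
EMPTY = b"\x05\x00\x00\x00\x00"
ONE = b"\x0c\x00\x00\x00" + b"\x10" + b"a\x00" + b"\x01\x00\x00\x00" + b"\x00"

from_bytes = list(iter_raw_documents(EMPTY + ONE))
from_file = list(iter_raw_documents_from_file(io.BytesIO(EMPTY + ONE)))
```

Because both functions are generators, a caller that processes and discards
each document never holds more than one decoded document at a time.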
This change provides performance improvements for decoding
most types in pure Python. As with the previous changes
for encoding, the biggest improvements are seen when decoding
BSON arrays to Python lists: over 150% using PyPy.
This is just a cleanup of the existing decoder. I tried
using a namedtuple but that imposed up to a 17% perf hit
(a regular tuple imposed no measurable perf hit). We may
be able to avoid that problem with a new API that accepts
decoder options in a specific class instead of creating the
instance in the decoder itself.
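A rough way to compare the two containers is a timeit micro-benchmark like the
sketch below. The Options namedtuple and its field names are illustrative
only, the benchmark measures field reads rather than the decoder itself, and
the result depends on the interpreter and version, so it will not necessarily
reproduce the figure quoted above.

```python
import collections
import timeit

# Hypothetical option container; the field names are illustrative only.
Options = collections.namedtuple(
    "Options", ["as_class", "tz_aware", "uuid_subtype"]
)

named = Options(dict, True, 3)
plain = (dict, True, 3)


def read_named(opts=named):
    # Each field read goes through a descriptor on the namedtuple class.
    return opts.as_class, opts.tz_aware, opts.uuid_subtype


def read_plain(opts=plain):
    # A regular tuple is unpacked positionally in one step.
    as_class, tz_aware, uuid_subtype = opts
    return as_class, tz_aware, uuid_subtype


def compare(number=200000):
    """Return (namedtuple_seconds, tuple_seconds) for ``number`` reads each."""
    return (
        timeit.timeit(read_named, number=number),
        timeit.timeit(read_plain, number=number),
    )
```

Calling compare() returns the two timings in seconds; run it several times and
compare the minimums to reduce noise.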
This is the first step in rewriting the pure Python BSON
module. These changes provide measurable improvements for
all types including up to a 95% improvement in encoding
performance for lists/tuples.