Document content encoding differences with Requests (#1416)

* Document content encoding differences with Requests
* Apply suggestions from code review
* Tweak copy for Windows-1252

Co-authored-by: Jamie Hewland <jamie.hewland@hpe.com>
2020-12-06 01:16:59 +01:00 · 2020-12-06 01:16:59 +01:00 · 7ea6019c70
commit 7ea6019c70
parent 28cbe77676
1 changed files with 6 additions and 0 deletions
--- a/docs/compatibility.md
+++ b/docs/compatibility.md
@@ -31,6 +31,12 @@ httpx.post(..., data={"message": "Hello, world"})
 If you're using a type checking tool such as `mypy`, you'll see warnings if you use text/byte content with the `data` argument.
 However, for compatibility reasons with `requests`, we do still handle the case where `data=...` is used with raw binary and text contents.

+## Content encoding
+
+HTTPX uses `utf-8` for encoding `str` request bodies. For example, when using `content=<str>`, the request body will be encoded to `utf-8` before being sent over the wire. This differs from Requests, which uses `latin1`. If you need a different encoding, pass encoded bytes explicitly, e.g. `content=<str>.encode("latin1")`.
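To make the difference concrete, here is a minimal sketch using only the standard library. It shows the bytes that each encoding produces for the same `str`; the sample text is an arbitrary choice for illustration.

```python
# Illustrative sample text containing one non-ASCII character.
text = "café"

# Bytes produced by UTF-8, the encoding the note above describes for HTTPX.
utf8_body = text.encode("utf-8")

# Bytes produced by Latin-1, the encoding the note above attributes to Requests.
latin1_body = text.encode("latin1")

print(utf8_body)    # b'caf\xc3\xa9'
print(latin1_body)  # b'caf\xe9'
```

Passing `latin1_body` as `content=...` would then send the Latin-1 bytes unchanged, since bytes are not re-encoded.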
+
+For response bodies, if the server didn't send an explicit encoding, HTTPX will do its best to figure out an appropriate one. Unlike Requests, which uses the `chardet` library, HTTPX relies on a simpler fallback strategy (basically attempting UTF-8, and falling back to Windows-1252 if that fails). This strategy should be robust enough to handle the vast majority of use cases.
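The fallback idea can be sketched in a few lines of plain Python. This is only an illustration of "try UTF-8, else Windows-1252", not HTTPX's actual implementation; `decode_with_fallback` is a hypothetical helper name, and replacing undecodable bytes is an assumption made here to keep the sketch from raising.

```python
def decode_with_fallback(body: bytes) -> str:
    """Hypothetical helper: try UTF-8 first, then fall back to Windows-1252."""
    try:
        return body.decode("utf-8")
    except UnicodeDecodeError:
        # Python's cp1252 codec leaves a few byte values undefined, so
        # substitute a replacement character instead of raising again.
        return body.decode("cp1252", errors="replace")


print(decode_with_fallback("café".encode("utf-8")))  # valid UTF-8 input
print(decode_with_fallback(b"caf\xe9"))              # invalid UTF-8, decoded as Windows-1252
```

Both calls print `café`: the first input is valid UTF-8, while the second is not and is therefore decoded as Windows-1252.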
+
 ## Status Codes

 In our documentation we prefer the uppercased versions, such as `codes.NOT_FOUND`, but also provide lower-cased versions for API compatibility with `requests`.