Description
Feature or enhancement
RFC 4648 describes two alphabets for Base64 (standard and urlsafe) and two alphabets for Base32 (standard and hexadecimal). Python also implements three variants of Base85 (Ascii85 is more complex than this, but it can be built on top of Base85). A number of other formats are based on BaseXX encoding with alternative alphabets.
So I suggest adding an alphabet parameter to several binascii functions. They can be used in the implementation of the base64 module or directly by users implementing alternative formats.
We can then remove the recently added functions b2a_z85() and a2b_z85(): they are equivalent to b2a_base85() and a2b_base85() with an alternative alphabet. Base64 with alternative alphabets will also be more efficient for large data. Incidentally, this also fixes #145968.
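To illustrate the efficiency point, here is a rough sketch of how an alternative Base64 alphabet can be emulated today: encode with the standard alphabet, then remap every output byte in a second pass. The helper name `b64encode_with_alphabet` is hypothetical and only for illustration. A native alphabet parameter would make the second pass over the output unnecessary.

```python
import base64

STANDARD = b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
URLSAFE = b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_"


def b64encode_with_alphabet(data: bytes, alphabet: bytes) -> bytes:
    """Hypothetical helper: Base64-encode, then translate to `alphabet`."""
    if len(alphabet) != 64:
        raise ValueError("alphabet must contain exactly 64 characters")
    encoded = base64.b64encode(data)  # first pass: standard alphabet
    # Second pass over the whole output; the '=' padding is left untouched
    # because it is not part of the standard alphabet.
    return encoded.translate(bytes.maketrans(STANDARD, alphabet))


print(b64encode_with_alphabet(b"\xfb\xff", URLSAFE))
```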
For encoding functions, we can simply pass a bytes object containing all the alphabet characters. Decoding functions need a reverse table of length 256 which maps each byte to its index in the alphabet or to a special invalid value. We can provide a function which creates such a table from the alphabet.
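A minimal pure-Python sketch of such a table-building function follows. The name `make_reverse_table` and the sentinel `INVALID = 0xFF` are assumptions made for illustration, not part of any existing binascii API.

```python
INVALID = 0xFF  # hypothetical sentinel for bytes that are not in the alphabet


def make_reverse_table(alphabet: bytes) -> bytes:
    """Return a 256-byte table mapping each byte value to its alphabet index."""
    if len(set(alphabet)) != len(alphabet):
        raise ValueError("alphabet contains duplicate characters")
    if len(alphabet) > INVALID:
        raise ValueError("alphabet is too long for a one-byte index")
    table = bytearray([INVALID]) * 256
    for index, char in enumerate(alphabet):
        table[char] = index
    return bytes(table)


URLSAFE = b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_"
table = make_reverse_table(URLSAFE)
```

A decoder can then look up `table[byte]` for each input byte and reject the input when it sees the sentinel.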
Alternatively, we can create it automatically from the passed alphabet argument and cache the result. This is a less flexible but more user-friendly interface. It adds some overhead for small input data, because the hash of the 64- or 85-byte object has to be calculated, but for large data this is insignificant.