Bytes() Initializer Adding An Additional Byte?
I initialize a UTF-8 encoded string in Python 3: bytes('\xc2', encoding='utf-8', errors='strict'), but on writing it out I get two bytes! >>> s = bytes('\xc2', encoding='u
Solution 1:
The string "\xc2" contains a single Unicode code point, U+00C2 (which can also be written as "Â"). That code point is two bytes long when encoded with the utf-8 encoding: b'\xc3\x82'. If you were expecting the single byte b'\xc2', you probably want a different encoding, such as "latin-1":
>>> s = bytes("\xc2", encoding="latin-1", errors="strict")
>>> s
b'\xc2'
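To see the difference side by side, here is a minimal sketch that encodes the same one-character string both ways. The variable names (text, utf8_bytes, latin1_bytes) are only illustrative, and str.encode() is used as the equivalent of the bytes() constructor call from the question:

```python
text = "\xc2"  # a one-character str holding U+00C2 ("Â")

utf8_bytes = text.encode("utf-8")      # same result as bytes(text, encoding="utf-8")
latin1_bytes = text.encode("latin-1")  # same result as bytes(text, encoding="latin-1")

print(len(text), hex(ord(text)))        # 1 0xc2
print(utf8_bytes, len(utf8_bytes))      # b'\xc3\x82' 2
print(latin1_bytes, len(latin1_bytes))  # b'\xc2' 1
```

UTF-8 needs two bytes for every code point from U+0080 through U+07FF, while latin-1 maps code points U+0000 through U+00FF one-to-one onto single byte values, which is why only the latin-1 result matches the escape in the literal.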
If you are really creating "\xc2" directly from a literal, though, there's no need to go through the bytes constructor to get a bytes instance. Just use the b prefix on the literal to create the bytes directly:
s = b"\xc2"
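Since the question was about how many bytes end up being written out, here is a small check, assuming an in-memory io.BytesIO buffer as a stand-in for a file opened in binary mode (the names buffer and written are just for illustration):

```python
import io

s = b"\xc2"                # bytes literal: exactly one byte, no encoding step

buffer = io.BytesIO()      # in-memory stand-in for a file opened in "wb" mode
written = buffer.write(s)  # write() returns the number of bytes written

print(written)             # 1
print(buffer.getvalue())   # b'\xc2'
print(s == bytes("\xc2", encoding="latin-1"))  # True
```

The literal and the latin-1 constructor call produce the same single byte, so either form writes one byte.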