A memo on decoding email headers, using the sender (From) address as an example.
from email.header import decode_header
decoded = decode_header(message['from'])
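# Only the first part is decoded here, and it must be an encoded word:
# for an unencoded header, decode_header returns a str with charset None,
# so this line raises.
outstr = decoded[0][0].decode(decoded[0][1])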
# sample: decoded[0][0]
# b'LinkedIn\xe3\x82\xb3\xe3\x83\xb3\xe3\x82\xbf\xe3\x82\xaf\xe3\x83\x88'
# sample: decoded[0][1]
# utf-8
# sample: outstr
# LinkedInコンタクト
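The snippet above decodes only the first part of the header. As a minimal sketch of handling every part, including unencoded ones, something like the following could work. The helper name decode_header_value and the ASCII fallback for parts without a charset are assumptions for illustration, not part of the original memo.

```python
from email.header import decode_header


def decode_header_value(raw):
    """Join every (value, charset) part from decode_header into one str."""
    pieces = []
    for value, charset in decode_header(raw):
        if isinstance(value, bytes):
            # Assumption: treat parts with no declared charset as ASCII.
            pieces.append(value.decode(charset or "ascii", errors="replace"))
        else:
            # A header with no encoded words comes back as a plain str.
            pieces.append(value)
    return "".join(pieces)


# Q-encoded form of the sample bytes shown above.
sample = "=?utf-8?q?LinkedIn=E3=82=B3=E3=83=B3=E3=82=BF=E3=82=AF=E3=83=88?="
print(decode_header_value(sample))
```

The standard library can also do the joining itself with str(make_header(decode_header(raw))), where make_header comes from email.header. The loop above is spelled out only to show what each part looks like.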
Here are two ways to print the result.
The second one, which uses sys.stdout.encoding (and needs import sys), seems more standard, but it raised exceptions during debugging, so I went with the first one, which specifies UTF-8 explicitly, to see how it behaves.
print(i, outstr.encode('utf-8', errors='replace').decode('utf-8'))
print(outstr.encode(sys.stdout.encoding, errors='replace').decode(sys.stdout.encoding))
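Both print lines use the same round trip: encode with errors='replace', then decode again, so that characters the target encoding cannot represent become '?' instead of raising. A small sketch of that idea as a helper follows. The name to_printable and the UTF-8 fallback when stdout reports no encoding are assumptions for illustration.

```python
import sys


def to_printable(text, encoding=None):
    """Replace characters the target encoding cannot represent with '?'."""
    # Assumption: fall back to UTF-8 if stdout has no usable encoding.
    target = encoding or getattr(sys.stdout, "encoding", None) or "utf-8"
    return text.encode(target, errors="replace").decode(target)


# Forcing ASCII shows the replacement: the five katakana become '?'.
print(to_printable("LinkedInコンタクト", "ascii"))
```

Calling to_printable(outstr) with no second argument targets whatever encoding the current console reports.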
def print_decoded_item(i, instr):
    # Fall back to the raw header value if decoding fails.
    outstr = instr
    try:
        decoded = decode_header(instr)
        outstr = decoded[0][0].decode(decoded[0][1])
    except Exception as e:
        pass
    # Printing can also fail on some consoles, so guard it separately.
    try:
        print(i, outstr.encode('utf-8', errors='replace').decode('utf-8'))
    except Exception as e:
        pass