python - Python3 - ascii/utf-8/iso-8859-1 can't decode byte 0xe5 (Swedish characters) -

March 15, 2012

I've tried `io`, `repr()`, etc.; none of them work!

The problem is inputting `å` (`\xe5`):

(none of these work)

    import sys
    print(sys.stdin.read(1))

    import io, sys
    sys.stdin = io.TextIOWrapper(sys.stdin.detach(), errors='replace', encoding='iso-8859-1', newline='\n')
    print(sys.stdin.read(1))

    import sys
    x = sys.stdin.buffer.read(1)
    print(x.decode('utf-8'))

All of them give me: `UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe5 in position 0: unexpected end of data`
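To make the error concrete, here is a minimal sketch of what happens to the single byte `0xe5`. It assumes the terminal sends `å` as one ISO-8859-1 byte, which is what the error message suggests; the variable names are only illustrative.

```python
# The single byte a Latin-1 terminal sends for "å" (assumption based on the error).
raw = b"\xe5"

# In UTF-8, 0xe5 is the lead byte of a three-byte sequence, so a lone 0xe5
# is an incomplete character and strict decoding fails.
try:
    raw.decode("utf-8")
    utf8_error = None
except UnicodeDecodeError as exc:
    utf8_error = exc.reason

# Decoding with the encoding the terminal actually used succeeds.
latin1_text = raw.decode("iso-8859-1")

# The same character encoded as UTF-8 takes two bytes instead of one.
utf8_bytes = "\u00e5".encode("utf-8")

print(utf8_error)
print(latin1_text == "\u00e5")
print(utf8_bytes)
```

So the byte itself is fine; it just is not UTF-8, and the decoder has to be told which encoding the terminal is really using.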

I also tried starting Python after `export PYTHONIOENCODING=utf-8`; that doesn't work either.

Now, here's where I'm at:

    import sys, codecs
    sys.stdout = codecs.getwriter("utf-8")(sys.stdout.detach())
    sys.stdin = codecs.getwriter("utf-8")(sys.stdin.detach())

    x = sys.stdin.read(1)

    print(x.decode('utf-8', 'replace'))

This gives me: ï¿½
It's close...

How can I take `\xe5` and turn it into `å` in my console, without breaking `input()` as well? Because this solution breaks it.

Note: I know this has been asked before, but none of those answers solve it, not even `io`.

Some info about my system:

    os.environ['LANG'] == 'C'
    sys.getdefaultencoding() == 'utf-8'
    sys.stdout.encoding == 'ANSI_X3.4-1968'
    sys.stdin.encoding == 'ANSI_X3.4-1968'
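For anyone who wants to gather the same information on their own machine, a small sketch follows. The helper name `describe_encodings` is made up for this example. Note that `ANSI_X3.4-1968` is simply another name for ASCII, which is why a byte above `0x7f` cannot be decoded with these settings.

```python
import codecs
import locale
import os
import sys


def describe_encodings():
    """Collect the settings that decide how terminal bytes are decoded."""
    return {
        "LANG": os.environ.get("LANG"),
        "LC_ALL": os.environ.get("LC_ALL"),
        "PYTHONIOENCODING": os.environ.get("PYTHONIOENCODING"),
        "default_encoding": sys.getdefaultencoding(),
        "preferred_encoding": locale.getpreferredencoding(False),
        # getattr guards against streams that were replaced or are missing.
        "stdin_encoding": getattr(sys.stdin, "encoding", None),
        "stdout_encoding": getattr(sys.stdout, "encoding", None),
    }


info = describe_encodings()
for name, value in info.items():
    print(f"{name} = {value!r}")

# The odd-looking name reported above resolves to Python's plain ASCII codec.
ansi_codec_name = codecs.lookup("ANSI_X3.4-1968").name
print(ansi_codec_name)
```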

My OS: Arch Linux, running xterm.
Running `locale -a` gives me: C | POSIX | sv_SE.utf8

I've followed these:

(and about 50 more)

Solution (sort of, still breaks `input()`):

    sys.stdout = codecs.getwriter("latin-1")(sys.stdout.detach())
    sys.stdin = codecs.getwriter("latin-1")(sys.stdin.detach())

    x = sys.stdin.read(1)

    print(x.decode('latin-1', 'replace'))
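One likely reason the snippet above needs `x.decode(...)` and upsets line-based reads is that a writer is installed on `sys.stdin`, so reads fall through to the raw byte stream. The following is only a sketch of the alternative, a decoding text layer over the bytes. It uses an in-memory `io.BytesIO` as a hypothetical stand-in for `sys.stdin.buffer` so it can run anywhere; whether `input()` behaves on a real terminal still depends on the terminal and locale.

```python
import io

# Hypothetical stand-in for sys.stdin.buffer: the bytes a Latin-1 terminal
# would send for "å" plus Enter, followed by a second line.
fake_stdin_buffer = io.BytesIO(b"\xe5\nabc\n")

# Wrap the byte stream in a decoding text layer. In a real script the first
# argument would be sys.stdin.buffer (assumption: stdin was not detached).
text_stdin = io.TextIOWrapper(
    fake_stdin_buffer, encoding="iso-8859-1", newline="\n"
)

first_char = text_stdin.read(1)       # already a str, no .decode() needed
rest_of_line = text_stdin.readline()  # the newline that followed the character
next_line = text_stdin.readline()     # line-based reads keep working

print(ascii(first_char), ascii(rest_of_line), ascii(next_line))
```

Because the wrapper returns `str`, code that reads whole lines sees ordinary text rather than bytes.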

You are running in xterm, which does not support UTF-8 by default. Run it as `xterm -u8` or use `uxterm` to fix that.

The other way to work around this is to use a different locale; set your locale to Latin-1, for example:

    export LANG=sv_SE.ISO-8859-1

But then you are limited to 256 codepoints, versus the full range of the Unicode standard (1,114,112 codepoints).
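The size difference is easy to check from Python itself. This is just an illustrative sketch; the euro sign is picked as an arbitrary example of a character outside Latin-1.

```python
import sys

# Every possible byte value maps to exactly one Latin-1 character.
all_latin1 = bytes(range(256)).decode("iso-8859-1")
latin1_size = len(all_latin1)

# Unicode code points run from 0 to sys.maxunicode inclusive.
unicode_size = sys.maxunicode + 1

# Anything above U+00FF, such as the euro sign, cannot be encoded as Latin-1.
try:
    "\u20ac".encode("iso-8859-1")
    euro_fits = True
except UnicodeEncodeError:
    euro_fits = False

print(latin1_size, unicode_size, euro_fits)
```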

Note that Python 2 never decoded the input; writing out what was read from the terminal looks fine because the raw bytes that were read are interpreted by the terminal in the same locale, so reading and writing Latin-1 bytes works fine. That's not quite the same as processing Unicode data, however.
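The difference between passing bytes through and actually processing text can be sketched in Python 3 with explicit `bytes` objects, which behave much like Python 2's undecoded input. The example uses the UTF-8 bytes for `å` as an assumed input.

```python
# Bytes as they might arrive from a UTF-8 terminal for the character "å".
raw = "\u00e5".encode("utf-8")

# Echoing the bytes back unchanged "works": the terminal reassembles them.
echoed = raw

# But byte-level operations do not understand the character.
byte_length = len(raw)   # counts bytes, not characters
byte_upper = raw.upper() # only ASCII letters are changed, so nothing happens

# After decoding, the same operations act on the actual character.
text = raw.decode("utf-8")
text_length = len(text)
text_upper = text.upper()

print(byte_length, text_length, byte_upper == raw, ascii(text_upper))
```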
