Hello, I am in the process of learning Python 3 for the purposes of NLP.
I am trying to work with a .txt file that has non-ASCII characters. In the exercise I have to demonstrate the differences in the length of documents. My code looks like this:
I understand what the lines do, and I checked the solved exercise, which is the same as this, but for some reason it won't run.
I get the following error message:
Traceback (most recent call last):
  File "C:...my folders...", line 56, in <module>
  File "C:\Users\user\AppData\Local\Programs\Python\Python37-32\lib\encodings\cp1250.py", line 23, in decode
UnicodeDecodeError: 'charmap' codec can't decode byte 0x90 in position 88422: character maps to <undefined>
Byte 0x90 is not a printable character. In cp1250, the codec named in your traceback, that byte value is unassigned, and in ISO 8859-1 and Unicode it falls in the C1 control range.
Some control characters, such as NL, CR, and TAB, I would expect to decode properly even though they are not printable, since they control how text is laid out. But 0x90 is a device control code with no standard meaning in text, so the decoder rejected it.
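To see this without the original file, here is a minimal sketch that reproduces the same kind of error. The byte strings are made-up examples, not taken from your .txt. The second half shows one common way a 0x90 byte ends up in a text file: as the second byte of a UTF-8 encoded character. That is only a guess about your file.

```python
# Hypothetical bytes for illustration; 0x90 has no mapping in cp1250.
raw = b"caf\x90"

try:
    raw.decode("cp1250")
except UnicodeDecodeError as exc:
    message = str(exc)
    print(message)

# U+0150 (O with double acute) encodes to two bytes in UTF-8,
# and the second one is 0x90.
sample = "\u0150".encode("utf-8")
print(sample)

# Decoding those bytes as UTF-8 gives the original character back.
roundtrip = sample.decode("utf-8")
print(ascii(roundtrip))
```

If your file is UTF-8, reading it as cp1250 would fail in exactly this way as soon as it reaches such a character.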
posted 2 months ago
Okay, I am not sure what you are saying. Does this mean that the code is right? There shouldn't be any problems with the .txt file either. What can I do?
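One thing worth trying is to pass the encoding to open() explicitly instead of relying on the Windows default, which is cp1250 on your machine according to the traceback. The sketch below assumes the file is UTF-8, which the thread does not confirm. The file name and sample text are hypothetical stand-ins for the exercise file.

```python
import os
import tempfile

# Hypothetical stand-in for the exercise's .txt file, written as UTF-8.
text = "\u0150szi es\u0151\n"
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write(text)

# Name the encoding explicitly rather than using the platform default.
with open(path, encoding="utf-8") as f:
    content = f.read()

# Fallback when the real encoding is unknown: undecodable bytes are
# replaced with U+FFFD instead of raising UnicodeDecodeError.
with open(path, encoding="cp1250", errors="replace") as f:
    lossy = f.read()

print(len(content), len(lossy))
```

Note that the lossy read gives a different character count than the correct one, so for an exercise about document length it matters that the encoding is right, not just that the error goes away.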