can't read desktop files in Python because of default encoding 1252 in Microsoft Windows 10

 
Greenhorn
Posts: 2
Hi,

Recently, I updated Windows and came across an issue with encoding. For example, I create a file in Notepad and save it as UTF-8. However, Python does not seem to recognise it (or the mapping does not occur?) and returns an empty string (two single quotes) when I try to read it.

Example

f= open (r'Tests', 'w+', encoding = 'UTF-8')
file = f.read()
file
# ''
If I remove encoding = 'utf-8', the encoding returns to the default one of Microsoft, 1252.

Example

f= open (r'Tests')
f
# <_io.TextIOWrapper name='Tests' mode='r' encoding='cp1252'>
Please note that I never had the same issue before the update. I was able to open and read any files from my desktop.

The strange thing is that this text file is saved as a Python file but with an error (..log is not UTF-8 encoded)

I would really appreciate it if you could help me. I have tried everything (e.g. using the full path) but nothing works. PS. I have also thought about rolling back to an earlier version of Microsoft Windows 10, but that is not possible since the update occurred more than 10 days ago.

FYI

Version: Microsoft Windows 10 20H2
Python: 3.7
Jupyter notebook

Best,
Maria
 
Saloon Keeper
Posts: 24207
You can make your code more readable by wrapping it in Code tags (the Code button on our message editor).

You're working with apples and oranges. The first example opens a file, reads its contents into "file", then prints what was read (so "file" holds the text that was read, not the file itself).

The second example doesn't read anything, so it prints the object rendition of the file itself, not text read from the file.
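
To make the comparison concrete, here is a minimal sketch. The temporary file is only a stand-in for the 'Tests' file in the question, and the variable names are mine. It also shows something separate from any encoding question: Python's 'w+' mode truncates the file when it is opened, so a read() right after it returns an empty string whatever the encoding is.

```python
import os
import tempfile

# Hypothetical scratch file standing in for the 'Tests' file from the question.
path = os.path.join(tempfile.mkdtemp(), "Tests")
with open(path, "w", encoding="utf-8") as f:
    f.write("hello from Notepad\n")

# 'w+' truncates the file on open, so there is nothing left to read.
with open(path, "w+", encoding="utf-8") as f:
    after_w_plus = f.read()

# Write the text again, then open read-only: read() returns the contents.
with open(path, "w", encoding="utf-8") as f:
    f.write("hello from Notepad\n")
with open(path, "r", encoding="utf-8") as f:
    wrapper_repr = repr(f)  # what the second example displays
    after_r = f.read()      # what the first example was trying to get

print(repr(after_w_plus))   # ''
print(repr(after_r))        # 'hello from Notepad\n'
print(wrapper_repr)         # the TextIOWrapper object, not the file's text
```

If the goal is only to read the file, mode 'r' (the default) is the one to use.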

In any event, as far as I can think, the differences between UTF-8 and CP1252 only become apparent when you work with characters outside the original ASCII set, especially characters longer than one byte. There's also no "code page" marker likely to be found in text files, so as long as it's simple text, either one would probably work. International characters are a different matter, of course.
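
A small sketch of that point, using sample strings I made up: plain ASCII text comes out as the same bytes in both encodings, while accented or currency characters do not. The last line just reports the default that open() falls back to on a given machine when no encoding is passed.

```python
import locale

plain = "simple text"
accented = "caf\u00e9 \u20ac5"  # "café €5", written with escapes to stay ASCII-only

# Pure ASCII text produces identical bytes in both encodings.
same_for_ascii = plain.encode("utf-8") == plain.encode("cp1252")

# Non-ASCII characters take one byte each in CP1252 but several in UTF-8.
utf8_bytes = accented.encode("utf-8")
cp1252_bytes = accented.encode("cp1252")

# Decoding these UTF-8 bytes as CP1252 does not raise; it yields garbled text.
misread = utf8_bytes.decode("cp1252")

print(same_for_ascii)
print(len(utf8_bytes), len(cp1252_bytes))
print(misread == accented)

# The encoding open() uses by default when none is given (machine-dependent).
print(locale.getpreferredencoding(False))
```

So for a file that contains anything beyond plain ASCII, passing encoding='utf-8' explicitly is the safer habit on Windows.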
 
Maria Tsilimos
Greenhorn
Posts: 2
Dear Tim Holloway,

thanks so much for offering to help me. Much appreciated.

After having spent a lot of hours on this, I realised it was OneDrive that caused the problem. I removed it and now Jupyter can read the desktop files.

thanks for your time!

all the best,
Maria
 