Encoding Error? Atom-specific


#1

Hi,

I’m a Python noob. I’m trying to follow a tutorial, but I came across this error. I think it is Atom-specific, because I tried the same script in IDLE and it worked fine. I really want to work with Atom, as I like it more than the other IDEs I tried (Eclipse and PyCharm).

I hope you can help me with what I think is an “encoding” error.


#2
  1. What happens when you open the command prompt and type python C:\Programming\Codes\Py\BS4_Tut_1.py?

  2. You can find the encoding on the right-hand side of the status bar. What does it say?


#3

The encoding says UTF-8


#4

What package are you using to execute the code? The fault may be in how it’s handling the file or sending that to the interpreter.


#5

it’s script 3.14.1.

What would you suggest apart from this if this is the culprit?


#6

In order to figure out why this is happening, we have to test the other options and see what’s different. As far as I know, atom-runner functions identically to script, so any difference in output would be surprising. hydrogen functions differently as I understand it, so it might work when the others don’t.


#7

You’re right about one thing: atom-runner behaves like script; they generated the same error. I see that hydrogen is cool because you can run the code inline, but I couldn’t get it to work, and I can’t find enough information. I just guessed that you execute it with some combination of a modifier key (CTRL, ALT, or SHIFT) and ENTER, because in the demo he appears to execute it without opening any menu. Is there a prerequisite for this to run? I’m getting this error. Should I be installing IPython, Jupyter, or both? I tried Google and YouTube for some info but couldn’t find anything that helps.


#8

The installation instructions for hydrogen can be accessed through the readme on the package page and repo. You do have to have Jupyter installed. Since Python includes pip by default, you can use pip install jupyter. Python support is part of the core of Jupyter, so everything should just work.


#9

Hahaha… Good job!!! Works for now… Thanks a lot, DamnedScholar, for helping me get through this. I hope I’m not going to come across another problem on the same topic.


#10

Okay, so we have a confirmation of my earlier hypothesis. script and atom-runner both give a command behind the scenes that sends the file to python. In your case, this isn’t working; the reason is unknown, but has something to do with encoding. python works, but the Atom packages fail in the same manner. hydrogen probably functions by processing the contents of the buffer instead of dealing with the saved file, and since it doesn’t have to deal with encoding, it’s not vulnerable to whatever is causing the hangup with the other packages.

Now if only I knew what could be causing an encoding issue when the command is issued by Atom, but not when it’s issued by you on the command line.

You can also embed a terminal in Atom and use python from there if something comes up where hydrogen doesn’t do the job.


#11

Now if only I knew what could be causing an encoding issue when the command is issued by Atom, but not when it’s issued by you on the command line.

One thing to check is what exactly we mean by python. For example, it’s easy to have both Python 2 and 3 installed in parallel. It’s also possible to have a distribution like Anaconda installed alongside the official one, sandboxed with its own set of add-on packages.
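A quick way to compare the two situations is to run the same small script once from the command prompt and once through the Atom package, then compare the output. This is only a sketch; the helper name describe_interpreter is made up for illustration.

```python
import platform
import sys


def describe_interpreter():
    """Return basic facts that identify the Python interpreter running this script."""
    return {
        "executable": sys.executable,  # path of the interpreter binary
        "version": platform.python_version(),  # e.g. a 2.x or 3.x version string
        "implementation": platform.python_implementation(),
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print("{}: {}".format(key, value))
```

If the executable path or version differs between the two runs, Atom is not launching the same python as the command prompt.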


Also, the thing that fails is inside the data-extraction library Beautiful Soup. So the encoding problem could very well be the encoding of the content at said URL, rather than the encoding of the file.

The error is in cp1252.py, so Windows-1252 (or CP1252), which is a Windows encoding. The bad character is \u1d90.
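To see why that character is a problem for this codec, here is a small check, assuming the failure happens when text is encoded for output (the thread does not show the full traceback). The helper can_encode is a made-up name.

```python
def can_encode(text, codec):
    """Return True if every character in text can be represented in codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


bad_char = "\u1d90"  # the character reported in the error

print(can_encode(bad_char, "cp1252"))  # False: cp1252 has no slot for it
print(can_encode(bad_char, "utf-8"))   # True: UTF-8 covers all of Unicode
```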


It’s possible that either urllib or bs4 tries to infer the codec of the content at pythonprograming.net from the local environment, and that the local environment changes when launching from inside Atom.
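One way to test that guess is to print the encodings Python picks up from its environment, once from the command prompt and once from inside Atom. This is a sketch, not a diagnosis; environment_encodings is a hypothetical helper name.

```python
import locale
import sys


def environment_encodings():
    """Collect the encodings Python derives from the local environment."""
    return {
        # Encoding used when print() writes to standard output
        "stdout": getattr(sys.stdout, "encoding", None),
        # Default used by open() and similar calls when none is given
        "preferred": locale.getpreferredencoding(False),
        # Encoding used for file names
        "filesystem": sys.getfilesystemencoding(),
    }


if __name__ == "__main__":
    for name, value in environment_encodings().items():
        print("{}: {}".format(name, value))
```

If stdout shows cp1252 in one environment and UTF-8 in the other, that would fit the error.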

In that situation I’d try to see if it’s possible to force UTF-8, either on read() or in the BeautifulSoup constructor.
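Here is a rough sketch of what forcing UTF-8 could look like, using only the standard library. The bytes and the in-memory buffer are stand-ins for the real download and the real output stream, and the helper names are made up. On the Beautiful Soup side, the constructor accepts a from_encoding argument, which I have not tried against this page.

```python
import io


def decode_page(raw_bytes):
    """Decode downloaded bytes as UTF-8 instead of relying on local defaults."""
    return raw_bytes.decode("utf-8", errors="replace")


def utf8_writer(binary_stream):
    """Wrap a binary stream so text written to it is always encoded as UTF-8."""
    return io.TextIOWrapper(binary_stream, encoding="utf-8")


# Stand-in for the bytes returned by read() on the opened URL (assumption)
raw = "caf\u00e9 \u1d90".encode("utf-8")
text = decode_page(raw)

# Stand-in for sys.stdout.buffer, so the sketch does not touch the real console
buffer = io.BytesIO()
writer = utf8_writer(buffer)
writer.write(text)
writer.flush()
```

In a real script the wrapper would go around sys.stdout.buffer, so that print() no longer depends on whatever encoding Atom hands to the process.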