beautifulsoup error - Printable Version

beautifulsoup error - rudolphyaber - Feb-21-2019

I'm using Python 3.6 on Windows 7 and following the directions on this page -> https://medium.freecodecamp.org/how-to-scrape-websites-with-python-and-beautifulsoup-5946935d93fe but I keep getting errors. The current one is: AttributeError: 'NoneType' object has no attribute 'text'

Here's my code:

# import libraries
import urllib.request
from bs4 import BeautifulSoup

# specify the url
quote_page = 'https://www.bloomberg.com/quote/SPX:IND'

# query the website and return the html to the variable 'page'
page = urllib.request.urlopen(quote_page)

# parse the html using Beautiful Soup and store it in the variable 'soup'
# (the built-in parser is named 'html.parser', not 'html.parse')
soup = BeautifulSoup(page, 'html.parser')

# find the <h1> with class 'name' (the keyword argument is 'attrs', not 'attr')
name_box = soup.find('h1', attrs={'class': 'name'})

# strip() removes leading and trailing whitespace
name = name_box.text.strip()
print(name)

RE: beautifulsoup error - metulburr - Feb-21-2019

They might be putting up a recaptcha. Do you get this if you just do

print(soup.title)

Output:
<title>Bloomberg - Are you a robot?</title>

I tried their site with Selenium and it takes me straight to a reCAPTCHA for human verification.
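
The AttributeError in the first post fits this explanation: find() returns None when nothing matches, so calling .text on the result fails. Here is a minimal sketch of a check for that case, run against two made-up HTML strings rather than the live site. The names BLOCKED_HTML, EXPECTED_HTML and extract_name are hypothetical and only for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the page a bot-protected site sends back.
BLOCKED_HTML = """
<html><head><title>Bloomberg - Are you a robot?</title></head>
<body><p>Please verify you are human.</p></body></html>
"""

# Hypothetical stand-in for the page the tutorial expected to receive.
EXPECTED_HTML = """
<html><head><title>Quote page</title></head>
<body><h1 class="name"> SPX Index </h1></body></html>
"""


def extract_name(html):
    """Return the stripped text of <h1 class="name">, or None if it is absent."""
    soup = BeautifulSoup(html, "html.parser")
    name_box = soup.find("h1", attrs={"class": "name"})
    if name_box is None:
        # find() returns None when nothing matches, so report the page
        # title instead of calling .text on None.
        title = soup.title.string if soup.title else "(no title)"
        print("No <h1 class='name'> found; page title is:", title)
        return None
    return name_box.text.strip()


if __name__ == "__main__":
    print(extract_name(BLOCKED_HTML))
    print(extract_name(EXPECTED_HTML))
```

If the printed title looks like a robot check, the problem is the response from the site rather than the parsing code.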

RE: beautifulsoup error - rudolphyaber - Feb-21-2019

(Feb-21-2019, 07:05 PM)metulburr Wrote: They might be putting up a recaptcha. Do you get this if you just do
print(soup.title)
Output:
<title>Bloomberg - Are you a robot?</title>
I tried their site with Selenium and it takes me straight to a reCAPTCHA for human verification.

I changed the code to the same as yours and I got the same thing. Maybe I should just find another tutorial? What I'm trying to accomplish can be found at -> Python Forum -> Web Development -> browser table value counting

Thanks.

RE: beautifulsoup error - metulburr - Feb-21-2019

(Feb-21-2019, 07:47 PM)rudolphyaber Wrote: Maybe I should just find another tutorial?

That website looks like it might be harder to parse. Try a different website, especially if you are new to this.

RE: beautifulsoup error - snippsat - Feb-22-2019

(Feb-21-2019, 07:47 PM)rudolphyaber Wrote: Maybe I should just find another tutorial?

I have a couple of tutorials here that are up to date.
Web-Scraping part-1
Web-Scraping part-2

That Bloomberg site is not easy to parse (it has changed since the tutorial you looked at was written).
They have an API with Python support, but it needs the C++ SDK.

Quote:
> python -m pip install --index-url=https://bloomberg.bintray.com/pip/simple blpapi
Prebuilt binaries are provided for Python 2.7, 3.5, 3.6 and 3.7 for Windows, in both 32 and 64 bits.
A source package is also provided for other platforms/Python versions.
A local installation of the C++ API is required both for importing the blpapi module in Python and for building the module from sources, if needed.

For stocks there are better free APIs, like Alpha Vantage.
That site is fine for practising how to get data from an API; use Requests (not urllib) to fetch the data, e.g. as JSON.
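
A rough sketch of what fetching JSON with Requests could look like. The endpoint URL, the GLOBAL_QUOTE function name and the "Global Quote" / "05. price" response keys are assumptions about Alpha Vantage's API and should be checked against their documentation. The helper names build_params, parse_price and fetch_price are made up for this example.

```python
import requests

# Assumed endpoint; verify against the Alpha Vantage documentation.
API_URL = "https://www.alphavantage.co/query"


def build_params(symbol, api_key):
    """Build the query parameters for a single-quote request (assumed names)."""
    return {"function": "GLOBAL_QUOTE", "symbol": symbol, "apikey": api_key}


def parse_price(payload):
    """Pull the price out of the decoded JSON, or return None if it is missing."""
    # The key names below are assumptions about the response layout.
    quote = payload.get("Global Quote") or {}
    price = quote.get("05. price")
    return float(price) if price is not None else None


def fetch_price(symbol, api_key):
    """Request a quote over HTTP and return the price as a float (or None)."""
    response = requests.get(API_URL, params=build_params(symbol, api_key), timeout=10)
    response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error page
    return parse_price(response.json())
```

Calling fetch_price('IBM', 'YOUR_API_KEY') would make the actual request. Because the API returns JSON, no HTML parsing is needed, and a missing key shows up as None instead of an AttributeError.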

RE: beautifulsoup error - rudolphyaber - Feb-25-2019

Thanks. I didn't actually need stock data; that was just the website the tutorial used for teaching.

RE: beautifulsoup error - metulburr - Feb-25-2019

Some sites have specific methods to throw off bots. It's usually better to start with a simple site and work your way up. In one program I wrote with Selenium, half of the code was just there to handle the site's anti-bot measures.

RE: beautifulsoup error - heiner55 - May-26-2019

If you need quotes, look here:
https://python-forum.io/Thread-Exchange-Quotes