*Beginner* web scraping/Beautiful Soup help
Python Forum (https://python-forum.io), Forum: Web Scraping & Web Development, Thread: /thread-32212.html

*Beginner* web scraping/Beautiful Soup help - 7ken8 - Jan-28-2021

Hello all!

I am trying to scrape a table of reviews from an album's Wikipedia page, using Beautiful Soup and requests. I have become stuck trying to visualise this. It is the "Critical Reception" table on the page for the Ed Sheeran 2017 album "÷". When I inspect it, it says it is a 'wikitable floatright', but I cannot understand what kind of data the words are.

https://en.wikipedia.org/wiki/%C3%B7_(album)

My code so far has been:

    import requests
    from bs4 import BeautifulSoup
    soup = BeautifulSoup(response.text)
    url1 = "÷ (album) - Wikipedia"
    s = requests.Session()
    response = s.get(url1, timeout = 10)
    response
    right_table = soup.find('table', {"class": 'wikitablefloatright'})
    header = [th.text.rstrip() for th in right_table [0].find_all('th')]
    print(header)
    print('------')
    print(len(header))

The final cell raises: 'NoneType' object is not subscriptable.

Here is the inspection for the table.

Let me know if anything is unclear - I am a beginner.

Many thanks,

RE: *Beginner* web scraping/Beautiful Soup help - buran - Jan-28-2021

There are multiple issues with the code you posted, to the extent that it will never run, nor produce the error you mention. Here is a working version:

    import requests
    from bs4 import BeautifulSoup

    url = "https://en.wikipedia.org/wiki/%C3%B7_(album)"
    response = requests.get(url, timeout = 10)
    soup = BeautifulSoup(response.text, 'html.parser')
    right_table = soup.find('table', {'class': 'wikitable floatright'})
    header = [th.text.rstrip() for th in right_table.find_all('th')]
    print(header)
    print('------')
    print(len(header))

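For readers wondering where the 'NoneType' message comes from: `soup.find()` returns `None` when nothing matches, and 'wikitablefloatright' (without the space) matches no table, so indexing the result with `[0]` fails. The sketch below is one way to guard against that. It is not from the thread: the sample HTML, the function name `table_headers`, and the use of a CSS selector via `select_one` are illustrative assumptions, and the inline HTML stands in for the downloaded page so the example runs offline.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the downloaded page, so the example runs offline.
SAMPLE_HTML = """
<table class="wikitable floatright">
  <tr><th>Source</th><th>Rating</th></tr>
  <tr><td>Example Magazine</td><td>4/5</td></tr>
</table>
"""


def table_headers(html, selector="table.wikitable.floatright"):
    """Return the header texts of the first matching table, or [] if none matches."""
    soup = BeautifulSoup(html, "html.parser")
    # select_one returns None (like find) when the selector matches nothing.
    table = soup.select_one(selector)
    if table is None:
        return []
    return [th.get_text(strip=True) for th in table.find_all("th")]


# Matches: the table carries both classes, "wikitable" and "floatright".
headers = table_headers(SAMPLE_HTML)
print(headers)

# Does not match: "wikitablefloatright" is a single, non-existent class name.
missing = table_headers(SAMPLE_HTML, "table.wikitablefloatright")
print(missing)
```

The selector `table.wikitable.floatright` matches on each class separately, so it should keep working if the order of the classes in the page changes, whereas the string 'wikitable floatright' passed to `find` is compared against the class attribute as written.
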
RE: *Beginner* web scraping/Beautiful Soup help - 7ken8 - Jan-28-2021

(Jan-28-2021, 10:28 AM)7ken8 Wrote: Hello all!

Hi Buran,

Thank you for your help. I was just wondering how, in that box, I can present the td values for the scores. Some publications' scores are shown as images (img), but on inspection the image descriptions do show the stars out of five. How can I present these as figures?

Thanks

PS thank you for the tag notes.
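
One possible way to turn the star images into numbers is sketched below. It is not from the thread and rests on assumptions: that each rating image carries a description such as "4/5 stars" in its `alt` attribute, and that every review row has the source in the first `td` and the score in the second. The sample HTML, the pattern `STAR_PATTERN`, and the function name `review_scores` are made up for illustration, so check them against what the inspector actually shows.

```python
import re

from bs4 import BeautifulSoup

# Hypothetical rows mimicking the review table; the real markup may differ.
SAMPLE_HTML = """
<table class="wikitable floatright">
  <tr><th>Source</th><th>Rating</th></tr>
  <tr><td>Example Magazine</td>
      <td><span title="4/5 stars"><img alt="4/5 stars" src="x.png"></span></td></tr>
  <tr><td>Example Daily</td><td>B+</td></tr>
</table>
"""

# Matches labels such as "4/5 stars" or "3.5/5 stars".
STAR_PATTERN = re.compile(r"(\d+(?:\.\d+)?)\s*/\s*(\d+)")


def review_scores(html):
    """Return (source, label, score, out_of) tuples, one per review row."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.select_one("table.wikitable.floatright")
    if table is None:
        return []
    rows = []
    for tr in table.find_all("tr"):
        cells = tr.find_all("td")
        if len(cells) < 2:
            continue  # header rows contain th cells only
        source = cells[0].get_text(strip=True)
        img = cells[1].find("img")
        # Prefer the image description; fall back to the cell's visible text.
        label = img.get("alt", "") if img else cells[1].get_text(strip=True)
        match = STAR_PATTERN.search(label)
        score = float(match.group(1)) if match else None
        out_of = int(match.group(2)) if match else None
        rows.append((source, label, score, out_of))
    return rows


scores = review_scores(SAMPLE_HTML)
for row in scores:
    print(row)
```

Scores that are not expressed as "x/y" (letter grades, for example) are kept as text in `label` with `score` left as `None`, so nothing is dropped silently.
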