How get table element
Python Forum (https://python-forum.io), Forum: Web Scraping & Web Development, Thread: /thread-23384.html
How get table element - zinho - Dec-26-2019

Hi,
How can I get the table with the results?

    import requests
    from bs4 import BeautifulSoup

    page = requests.get('https://chess24.com/en/watch/live-tournaments/world-rapid-championship-2019/4/1/5')
    if page.status_code == requests.codes.ok:
        bs = BeautifulSoup(page.text, 'lxml')
        tabela = bs.find('table', {'class':'items'})
        print(tabela)

RE: How get table element - snippsat - Dec-26-2019

You cannot get anything from this site this way; this is a standard problem with pages that use a lot of JavaScript. Look at Web-scraping part-2.
As we have an okay player in my country, here is a Notebook that does a lot of this task, in this example getting the standings table. It also brings in Pandas to get the table more easily. When using a Notebook in JupyterLab, the table view gets a lot nicer.

RE: How get table element - zinho - Dec-27-2019

Hi,
I found a solution, but how do I get each row like a table?

    import requests
    from bs4 import BeautifulSoup
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options

    url = 'https://chess24.com/en/watch/live-tournaments/world-rapid-championship-2019/4/1/5'
    driver = webdriver.Firefox()
    driver.get(url)
    parent_element = driver.find_element_by_css_selector('#tabTournamentGamesworld-rapid-championship-2019 > div.tournamentStandings.tournamentDataContainer > div > div.gridView.tournamentTable.nativeScroll > div > div > table')
    # find all tr children (rows) of the table element
    child = parent_element.find_elements_by_css_selector('tr')
    lin = []
    for i in child:
        lin.append(i.text)
        #print(i.text)
    print(lin)

RE: How get table element - snippsat - Dec-27-2019

Running my code outside of the Notebook.
    from selenium import webdriver
    from bs4 import BeautifulSoup
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.common.keys import Keys
    import pandas as pd
    import time

    #--| Setup
    options = Options()
    #options.add_argument("--headless")
    browser = webdriver.Chrome(executable_path=r'chromedriver.exe', options=options)

    #--| Parse or automation
    browser.get('https://chess24.com/en/watch/live-tournaments/world-rapid-championship-2019/4/1/5')
    #browser.implicitly_wait(5)
    # Wait for the JavaScript to render before reading page_source
    time.sleep(4)
    soup = BeautifulSoup(browser.page_source, 'lxml')
    title = soup.select('h2.title')
    print(title[0].text)
    print('-'*50)

    # Get table: read_html returns a list of DataFrames, one per table found
    df = pd.read_html(browser.page_source, header=None)
    standings = df[2]
    standings.columns = ["Rank", "Name", "Score", "Rating"]
    print(standings.head(10))
zinho Wrote: Hi, I found a solution, but how do I get each row like a table?

It is a lot more work to extract a table with your own scraping; I have done it many times in the past. Now I mostly use Pandas for getting tables; as you see, it makes the task a lot easier. You get a correctly formatted table back both in a Notebook and, as shown above, from the command line.

    # We have an okay player in my country
    print(standings.loc[[0]])
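For comparison, here is a rough sketch of the "own scraping" route mentioned above, using only the standard library. It assumes the rendered table HTML is already available as a string (for example from the browser's page source); the `TableRows` class name and the sample rows are made up for illustration and are not data from the site.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text chunks of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


# Made-up stand-in for the rendered table HTML (hypothetical rows).
sample_html = """
<table class="items">
  <tr><th>Rank</th><th>Name</th><th>Score</th><th>Rating</th></tr>
  <tr><td>1</td><td>Player A</td><td>4.0</td><td>2800</td></tr>
  <tr><td>2</td><td>Player B</td><td>3.5</td><td>2750</td></tr>
</table>
"""

parser = TableRows()
parser.feed(sample_html)
for row in parser.rows:
    print(row)
```

Each row comes back as a list of cell strings, which is the "row like a table" shape asked about. Nested tables, colspan and type conversion are not handled here, which is part of why Pandas is the easier choice.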
RE: How get table element - zinho - Dec-27-2019

Hi snippsat,
Perfect, works like a charm. Thank you!!

RE: How get table element - snippsat - Dec-28-2019

Now that the Rapid Championship is finished, I can show the table in the Notebook.
[Image: mycva4.png]
The code to add the Unicode emoji is this:

    # We have an okay player in my country
    print('-'*50)
    champ = standings.loc[[0]].Name
    champ = champ.to_string()
    champ = ' '.join(champ.split()[-2:])
    print(f'{champ.upper():\N{sports medal}^29}')
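A small sketch of what the last lines do, without Pandas or a browser. The starting string is a hypothetical stand-in for what `to_string()` gives for a one-row Series (index label, then the name); "Player One" is a placeholder, not a result from the table. The medal is passed in through a variable here, which is a swapped-in variation on the inline `\N{sports medal}` used in the post.

```python
# Hypothetical stand-in for champ.to_string() on a one-row Series:
# the index label followed by the name.
champ = "0    Player One"

# Keep only the last two whitespace-separated words (the name).
champ = " ".join(champ.split()[-2:])

# Format spec "<fill>^<width>": fill character, ^ to center, width 29.
medal = "\N{SPORTS MEDAL}"
banner = f"{champ.upper():{medal}^29}"
print(banner)
```

The `[-2:]` slice assumes the name is exactly two words; a longer name would be cut and a one-word name would pick up the index label.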