Thread: BeautifulSoup: Error while extracting a value from an HTML table (Python Forum, https://python-forum.io, /thread-20635.html)
BeautifulSoup: Error while extracting a value from an HTML table - kawasso - Aug-22-2019

Hi all,

From the HTML below:

    <div class="card-body">
      <div class="table-responsive">
        <table class="table table__group table-sm table-hover">
          <tr> <td>Trading currency</td> <td><strong>EUR</strong></td> </tr>
          <tr> <td>Price multiplier</td> <td><strong>1</strong></td> </tr>
          <tr> <td>Quantity notation</td> <td><strong>Number of units</strong></td> </tr>
          <tr> <td>Shares outstanding</td> <td><strong>872,308,162</strong></td> </tr>
          <tr> <td>Trading group</td> <td><strong>P0</strong></td> </tr>
          <tr> <td>Trading type</td> <td><strong>Continuous</strong></td> </tr>

I would like to extract the value 872,308,162.

    from bs4 import BeautifulSoup
    import requests, io
    import pandas as pd

    timestamp = pd.datetime.today().strftime('%Y%m%d-&H&M&S')

    links_df = pd.read_excel(r'myfolder\myfile.xlsx', sheetname='Sheet1')
    links_Df = links_df[(links_df['Country'] == 'PT')]

    results = pd.DataFrame(columns=['ISIN', 'N Shares', 'Link'])

    for ISIN in links_df.ISIN:
        link='https://live.euronext.com/en/product/equities/=' + ISIN + '-XLIS/market-information'
        shares = soup.find('td', {'Shares outstanding'}).contents
        results = results.append({'ISIN': ISIN, 'N Shares': shares, 'Link': link}, ignore_index=True)
        print(ISIN +": " + shares)

    results.to_csv(r'myfolder\myoutputfile' + timestamp + 'csv', index=False)
    print('Finish')

The error I get is

Could you please guide me on this?
RE: BeautifulSoup: Error while extracting a value from an HTML table - fishhook - Aug-23-2019

Quote: shares = soup.find('td', {'Shares outstanding'}).contents

I am sorry, but I could not find an argument of type set([]) anywhere in the documentation for BS's find(). Can you show me where it is?

RE: BeautifulSoup: Error while extracting a value from an HTML table - snippsat - Aug-23-2019

First, you must always tell BS which parser to use; here it is html.parser, which comes with Python. You cannot search the way you are trying to with {'Shares outstanding'}. Here is an example.

    from bs4 import BeautifulSoup

    html = '''\
    <div class="card-body">
      <div class="table-responsive">
        <table class="table table__group table-sm table-hover">
          <tr> <td>Trading currency</td> <td><strong>EUR</strong></td> </tr>
          <tr> <td>Price multiplier</td> <td><strong>1</strong></td> </tr>
          <tr> <td>Quantity notation</td> <td><strong>Number of units</strong></td> </tr>
          <tr> <td>Shares outstanding</td> <td><strong>872,308,162</strong></td> </tr>
          <tr> <td>Trading group</td> <td><strong>P0</strong></td> </tr>
          <tr> <td>Trading type</td> <td><strong>Continuous</strong></td> </tr><div class="g-recaptcha" data-sitekey="VALUE_TO_RETURN"></div>'''

    # Parse with the parser from the standard library
    soup = BeautifulSoup(html, 'html.parser')
    # Locate the table by its full class string
    table = soup.find('table', class_="table table__group table-sm table-hover")
    # "Shares outstanding" is the fourth row, so its value is the fourth <strong> (index 3)
    shares = table.find_all('strong')[3]
    print(f'Shares outstanding: {shares.text}')
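Picking the fourth <strong> by position works for this snippet, but it would break if the rows were ever reordered. A minimal sketch of looking the row up by its label instead, assuming the label cell contains exactly the text "Shares outstanding" and the value sits in the next <td> of the same row (the variable names are only illustrative, not taken from the thread):

```python
from bs4 import BeautifulSoup

# Shortened copy of the table from the thread
html = """
<table class="table table__group table-sm table-hover">
  <tr> <td>Trading currency</td> <td><strong>EUR</strong></td> </tr>
  <tr> <td>Shares outstanding</td> <td><strong>872,308,162</strong></td> </tr>
  <tr> <td>Trading group</td> <td><strong>P0</strong></td> </tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# Match the <td> whose text is exactly the label we want
label_cell = soup.find("td", string="Shares outstanding")

shares_text = None
if label_cell is not None:
    # The value is in the next <td> sibling (whitespace text nodes are skipped)
    value_cell = label_cell.find_next_sibling("td")
    if value_cell is not None:
        shares_text = value_cell.get_text(strip=True)

print(shares_text)
```

The None checks matter in a loop over many pages: if one page has no such row, the script records None instead of crashing on an attribute lookup.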
RE: BeautifulSoup: Error while extracting a value from an HTML table - kawasso - Aug-25-2019

(Aug-23-2019, 11:24 AM) snippsat wrote: First, you must always tell BS which parser to use; here

Thanks. I am not sure I fully understand why it is not possible to look for "Shares outstanding" directly, but the solution provided works great.

Cheers
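On the "why" question: as far as I can tell, find() treats a second argument that is not a dict as a filter on the class attribute, so {'Shares outstanding'} ends up searching for a <td> with that CSS class rather than a <td> with that text. No such cell exists, find() returns None, and the .contents access then fails. The sketch below shows that, and then one way to make lookups by label possible by reading the whole table into a dict. It assumes every row has exactly two cells (label, value); table_to_dict is a hypothetical helper, not part of BeautifulSoup.

```python
from bs4 import BeautifulSoup

# Shortened copy of the table from the thread
html = """
<table class="table table__group table-sm table-hover">
  <tr> <td>Trading currency</td> <td><strong>EUR</strong></td> </tr>
  <tr> <td>Shares outstanding</td> <td><strong>872,308,162</strong></td> </tr>
  <tr> <td>Trading type</td> <td><strong>Continuous</strong></td> </tr>
</table>
"""


def table_to_dict(markup):
    """Hypothetical helper: map each row's label cell to its value cell."""
    soup = BeautifulSoup(markup, "html.parser")
    data = {}
    for row in soup.find_all("tr"):
        cells = row.find_all("td")
        if len(cells) == 2:  # assumption: label/value rows only
            data[cells[0].get_text(strip=True)] = cells[1].get_text(strip=True)
    return data


# The original call: to my understanding the set is used as a class filter,
# so nothing in this table matches it
missing = BeautifulSoup(html, "html.parser").find("td", {"Shares outstanding"})
print(missing)

info = table_to_dict(html)
# Strip the thousands separators before converting to a number
shares = int(info["Shares outstanding"].replace(",", ""))
print(shares)
```

Keeping the value as an int rather than the raw string may also be more convenient when the result is written to a CSV and processed later.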