Thread: html data cell attribute issue - Python Forum (https://python-forum.io), Forum: Web Scraping & Web Development
html data cell attribute issue - delahug - May-28-2020

Hi all,

An attribute that I need to use to identify a <td> contains the keyword 'data'...

    for cell in row.find_all('td',data-ending_ = 'RPR'):
    SyntaxError: keyword can't be an expression

Is there a way around this?

RE: html data cell attribute issue - Larz60+ - May-29-2020

You need to provide more information. The URL, and either an XPath or a CSS selector, would be handy.

RE: html data cell attribute issue - delahug - May-30-2020

(May-29-2020, 02:17 AM)Larz60+ Wrote: you need to provide more information.

Thanks for your time. Here's my table data cell:

    <td class="rp-horseTable__spanNarrow" data-ending="RPR" data-test-selector="full-result-rpr">85<!----</td>

Unsurprisingly, there are other data cells with this class name. But the other attributes(?), which would uniquely identify this cell, can't seem to be referenced in Python because they have the keyword(?) data in their names...

RE: html data cell attribute issue - Larz60+ - May-30-2020

What is the URL?

RE: html data cell attribute issue - snippsat - May-30-2020

delahug Wrote: for cell in row.find_all('td',data-ending_ = 'RPR'):

You cannot pass data-ending_ = 'RPR' as a keyword argument for this attribute; here you have to use a dictionary in the search. The keyword form does work for the class attribute: class_="rp-horseTable__spanNarrow". Quick test.
    from bs4 import BeautifulSoup

    html = '''\
    <td class="rp-horseTable__spanNarrow" data-ending="RPR" data-test-selector="full-result-rpr">85<!----</td>'''
    soup = BeautifulSoup(html, 'lxml')

Usage test:

    >>> td_tag = soup.find('td')
    >>> td_tag.attrs
    {'class': ['rp-horseTable__spanNarrow'], 'data-ending': 'RPR', 'data-test-selector': 'full-result-rpr'}

    # Search with data-ending
    >>> td_tag = soup.find('td', {'data-ending': 'RPR'})
    >>> td_tag
    <td class="rp-horseTable__spanNarrow" data-ending="RPR" data-test-selector="full-result-rpr">85</td>

    # Search with class
    >>> td_tag = soup.find('td', class_="rp-horseTable__spanNarrow")
    >>> td_tag
    <td class="rp-horseTable__spanNarrow" data-ending="RPR" data-test-selector="full-result-rpr">85</td>

    # Get text and attributes
    >>> td_tag.text
    '85'
    >>> td_tag.get('data-ending')
    'RPR'
    >>> td_tag.get('class')
    ['rp-horseTable__spanNarrow']

RE: html data cell attribute issue - delahug - May-31-2020

(May-30-2020, 04:01 PM)snippsat Wrote: delahug Wrote: for cell in row.find_all('td',data-ending_ = 'RPR'):
Can not add data-ending
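
For reference, here is a minimal sketch of the dictionary search applied to the original find_all loop, alongside a CSS attribute selector as an alternative. The table row below is a made-up fragment modelled on the cell posted in the thread (the second cell with data-ending="OR" is an assumption for illustration), and html.parser is used instead of lxml only so the example has no extra dependency.

```python
from bs4 import BeautifulSoup

# Hypothetical row modelled on the <td> posted in the thread; the "OR"
# cell is invented here to show that only the matching cell is returned.
html = """
<table><tr>
  <td class="rp-horseTable__spanNarrow" data-ending="OR">80</td>
  <td class="rp-horseTable__spanNarrow" data-ending="RPR">85</td>
</tr></table>
"""
soup = BeautifulSoup(html, "html.parser")
row = soup.find("tr")

# data-ending='RPR' cannot be written as a keyword argument because
# Python reads the hyphen as subtraction, so the attribute name goes
# into a dictionary passed as attrs instead.
by_attrs = row.find_all("td", attrs={"data-ending": "RPR"})

# Alternative: a CSS attribute selector matches the same cell.
by_css = row.select('td[data-ending="RPR"]')

# Collect the text of every matching cell.
rpr_values = [cell.get_text(strip=True) for cell in by_attrs]
css_values = [cell.get_text(strip=True) for cell in by_css]
print(rpr_values, css_values)
```

Both approaches select on the attribute itself, so they keep working even though other cells share the same class name.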