Python Forum
html data cell attribute issue - Printable Version



html data cell attribute issue - delahug - May-28-2020

Hi all,

An attribute that I need to use to identify a <td> contains the keyword 'data'...

for cell in row.find_all('td',data-ending_ = 'RPR'):
SyntaxError: keyword can't be an expression

Is there a way around this?


RE: html data cell attribute issue - Larz60+ - May-29-2020

You need to provide more information.
The URL, and either an XPath or a CSS selector, would be handy.


RE: html data cell attribute issue - delahug - May-30-2020

Thanks for your time.

Here's my table data cell:

<td class="rp-horseTable__spanNarrow" data-ending="RPR" data-test-selector="full-result-rpr">85<!----</td>

Unsurprisingly, there are other data cells with this class name. But the other attributes (?), which would uniquely identify this cell, can't seem to be referenced in Python because they have the keyword (?) data in their names...


RE: html data cell attribute issue - Larz60+ - May-30-2020

What is the URL?


RE: html data cell attribute issue - snippsat - May-30-2020

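delahug Wrote: for cell in row.find_all('td',data-ending_ = 'RPR'):
You cannot pass data-ending_ = 'RPR' as a keyword argument. The problem is the hyphen, not the word data: a hyphen is not allowed in a Python name, so Python reads data-ending_ as the expression data - ending_, hence the SyntaxError. For this attribute you have to use a dictionary in the search.
Keyword arguments do work for the class attribute, written as class_="rp-horseTable__spanNarrow".
Quick test: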
from bs4 import BeautifulSoup

html = '''\
<td class="rp-horseTable__spanNarrow" data-ending="RPR" data-test-selector="full-result-rpr">85<!----</td>'''

soup = BeautifulSoup(html, 'lxml')
Usage test:
>>> td_tag = soup.find('td')
>>> td_tag.attrs
{'class': ['rp-horseTable__spanNarrow'],
 'data-ending': 'RPR',
 'data-test-selector': 'full-result-rpr'}

>>> # Search with data-ending
>>> td_tag = soup.find('td', {'data-ending': 'RPR'})
>>> td_tag
<td class="rp-horseTable__spanNarrow" data-ending="RPR" data-test-selector="full-result-rpr">85</td>

>>> # Search with class
>>> td_tag = soup.find('td', class_="rp-horseTable__spanNarrow")
>>> td_tag
<td class="rp-horseTable__spanNarrow" data-ending="RPR" data-test-selector="full-result-rpr">85</td>

>>> # Get text and attributes
>>> td_tag.text
'85'
>>> td_tag.get('data-ending')
'RPR'
>>> td_tag.get('class')
['rp-horseTable__spanNarrow']
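
As a further sketch (not taken from the real page), the same dictionary search can be dropped into the row loop from the first post, and a CSS attribute selector, which Larz60+ asked about, should pick out the same cells. The two-row table below is made up, including the data-ending="OR" cells and the numbers; only the shape of the RPR cell comes from this thread. It uses the built-in html.parser instead of lxml so that nothing beyond bs4 has to be installed.

```python
from bs4 import BeautifulSoup

# Hypothetical table for illustration; the "OR" cells and all values are invented.
html = """
<table>
  <tr>
    <td class="rp-horseTable__spanNarrow" data-ending="OR">80</td>
    <td class="rp-horseTable__spanNarrow" data-ending="RPR">85</td>
  </tr>
  <tr>
    <td class="rp-horseTable__spanNarrow" data-ending="OR">78</td>
    <td class="rp-horseTable__spanNarrow" data-ending="RPR">83</td>
  </tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# 1) attrs dictionary: the hyphenated name is just a string key,
#    so Python never tries to parse it as an expression.
by_attrs = []
for row in soup.find_all("tr"):
    for cell in row.find_all("td", attrs={"data-ending": "RPR"}):
        by_attrs.append(cell.get_text(strip=True))

# 2) CSS attribute selector: matches every <td> whose data-ending is "RPR".
by_css = [cell.get_text(strip=True) for cell in soup.select('td[data-ending="RPR"]')]

print(by_attrs)  # ['85', '83']
print(by_css)    # ['85', '83']
```

Both forms sidestep the keyword-argument restriction. The selector is handy when you want to combine conditions, for example td.rp-horseTable__spanNarrow[data-ending="RPR"] to match on class and attribute together.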



RE: html data cell attribute issue - delahug - May-31-2020

Thumbs Up