Beautiful soup and tags
Python Forum (https://python-forum.io), Web Scraping & Web Development, Thread: Beautiful soup and tags (/thread-19583.html)
RE: Beautiful soup and tags - snippsat - Jul-08-2019

(Jul-08-2019, 12:16 PM)starter_student Wrote: and now there is no error but the output file is empty just with headers

That's because your parsing or something else is wrong. Test in small steps: put in print() calls and try things out in the REPL. store_details = {} should be outside of the loop. The HTML code you posted is just a mess. Here is how you can test HTML code outside of a website.

[python]
import csv

from bs4 import BeautifulSoup

# Small HTML sample, so the parsing can be tested without hitting a website
html = '''\
<div id="storelist">
  <ul>
    <li>Coffee</li>
    <li>Tea</li>
    <li>Milk</li>
  </ul>
</div>'''

# For the real site, fetch the page instead of using the sample:
#import requests
#URL = "http:www.abc.com"
#r = requests.get(URL)
soup = BeautifulSoup(html, 'lxml')
table = soup.find('div', id="storelist")
print(table) # Test print

# Create the dict once, outside of the loop, so every row is kept
store_details = {}
for row in table.find_all('li'):
    store_details[row.text] = f'<{row.text}> parsed for site'

filename = 'store_details_tab.csv'
# newline='' stops the csv module from writing blank lines on Windows
with open(filename, 'w', newline='') as f:
    w = csv.DictWriter(f, ['Coffee', 'Tea', 'Milk'])
    w.writeheader()
    w.writerow(store_details)
[/python]

In csv:
Coffee,Tea,Milk
<Coffee> parsed for site,<Tea> parsed for site,<Milk> parsed for site
RE: Beautiful soup and tags - starter_student - Jul-08-2019

(Jul-08-2019, 02:15 PM)snippsat Wrote: (Jul-08-2019, 12:16 PM)starter_student Wrote: and now there is no error but the output file is empty just with headers
That's because your parsing or something else is wrong.

Thanks for this approach, it helped me understand some things. The HTML code was just a sample. Here is the right structure, with a nested div:

[html]
<div id ="storelist" class>
  <ul>
    <li id ="00021455" class>
      <div class ="wr-store-details">
        <p> name </p>
        <span class ="address dc">Street 2</span>
        <span class ="city">LA</span>
      </div>
    </li>
    <li>
    </li>
    .
    .
    .
  </ul>
</div>
[/html]
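For a nested structure like the one above, one possible approach is to loop over the li tags and pull each field out of the inner div. This is only a minimal sketch under a few assumptions: the HTML string is a cleaned-up copy of the posted sample, the function name parse_stores and the column names id, name, address and city are made up for illustration, and 'html.parser' is used in place of 'lxml' so that nothing beyond bs4 needs to be installed.

```python
import csv
import io

from bs4 import BeautifulSoup

# Cleaned-up copy of the posted structure (assumption: the real page looks like this)
html = '''\
<div id="storelist">
  <ul>
    <li id="00021455">
      <div class="wr-store-details">
        <p> name </p>
        <span class="address dc">Street 2</span>
        <span class="city">LA</span>
      </div>
    </li>
    <li>
    </li>
  </ul>
</div>'''

# Hypothetical column names, chosen only for this example
FIELDS = ['id', 'name', 'address', 'city']


def text_of(tag):
    """Return the stripped text of a tag, or '' if the tag was not found."""
    return tag.get_text(strip=True) if tag is not None else ''


def parse_stores(html):
    """Return one dict per <li> that contains a wr-store-details div."""
    soup = BeautifulSoup(html, 'html.parser')
    store_list = soup.find('div', id='storelist')
    stores = []
    if store_list is None:
        return stores
    for li in store_list.find_all('li'):
        details = li.find('div', class_='wr-store-details')
        if details is None:
            continue  # skip empty <li> tags
        stores.append({
            'id': li.get('id', ''),
            'name': text_of(details.find('p')),
            # class_='address' also matches class="address dc"
            'address': text_of(details.find('span', class_='address')),
            'city': text_of(details.find('span', class_='city')),
        })
    return stores


stores = parse_stores(html)
print(stores)  # Test print

# Write to an in-memory buffer here; swap in open(filename, 'w', newline='') for a file
buffer = io.StringIO()
w = csv.DictWriter(buffer, FIELDS)
w.writeheader()
w.writerows(stores)
print(buffer.getvalue())
```

Each store becomes one row instead of one column, so the header stays the same no matter how many li tags the page has. If the real page differs from the sample, the test print should show quickly which find() call comes back empty.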