Python Forum
not getting image src in my BeautifulSoup csv file
In the Python shell, I get the image src values like this:
image link:https://images-na.ssl-images-amazon.com/images/I/41oJQTxCbZL._AC_US40_.jpg
image link:https://images-na.ssl-images-amazon.com/images/I/4152DCmmGFL._AC_US40_.jpg
image link:https://images-na.ssl-images-amazon.com/images/I/41ayV4UraXL._AC_US40_.jpg
image link:https://images-na.ssl-images-amazon.com/images/I/310z8LQ%2BoYL._AC_US40_.jpg
image link:https://images-na.ssl-images-amazon.com/images/G/01/x-locale/common/transparent-pixel._V192234675_.gif
But in my CSV file, the image src looks like this:
<img alt="" src="https://images-na.ssl-images-amazon.com/images/G/01/x-locale/common/transparent-pixel._V192234675_.gif"/> 
The Python shell shows multiple image URLs for each item, but my CSV file contains only one image URL per item, and it is wrapped in the HTML tag. The product title, price, and rating are written to the CSV correctly; only the image URLs are incomplete. Here is an example of the final output I get in the Python shell:

product_link: https://www.amazon.com/gp/slredirect/picassoRedirect.html/ref=pa_sp_btf_aps_sr_pg1_1?ie=UTF8&adId=A002532917E3JT34GS1DE&url=%2FWireless-Vssoplor-Portable-Computer-Computer-Black%2Fdp%2FB07RLYJJBX%2Fref%3Dsr_1_22_sspa%3Fcrid%3D22TI4BA3RLK5J%26dchild%3D1%26keywords%3Dwireless%2Bmouse%26qid%3D1599517835%26sprefix%3Dw%252Caps%252C528%26sr%3D8-22-spons%26psc%3D1&qualifier=1600050591&id=4126203954910776&widgetName=sp_btf

product_title: Wireless Mouse, Vssoplor 2.4G Slim Portable Computer Mice with Nano Receiver for Notebook, PC, Laptop, Computer-Black and Sapphire Blue

product_price:  $10.99 

product_rating: 2,262 ratings


image link:https://images-na.ssl-images-amazon.com/images/I/41oJQTxCbZL._AC_US40_.jpg
image link:https://images-na.ssl-images-amazon.com/images/I/4152DCmmGFL._AC_US40_.jpg
image link:https://images-na.ssl-images-amazon.com/images/I/41ayV4UraXL._AC_US40_.jpg
image link:https://images-na.ssl-images-amazon.com/images/I/310z8LQ%2BoYL._AC_US40_.jpg
image link:https://images-na.ssl-images-amazon.com/images/G/01/x-locale/common/transparent-pixel._V192234675_.gif  
Here is my full code:
import csv
import io

import requests
from bs4 import BeautifulSoup

# headers, proxies and auth are assumed to be defined earlier in the script (not shown here).
for page_num in range(1):
    url = "https://www.amazon.com/s?k=wireless+mouse&page={}&crid=22TI4BA3RLK5J&qid=1599517835&sprefix=w%2Caps%2C528&ref=sr_pg_2".format(page_num)
    r = requests.get(url, headers=headers, proxies=proxies, auth=auth).text
    soup = BeautifulSoup(r, 'lxml')

    container = soup.find_all('h2', {'class': 'a-size-mini a-spacing-none a-color-base s-line-clamp-2'})
    for containers in container:
        product_link = f"https://www.amazon.com{containers.find('a')['href']}"
        # print(f"page_number:{url}\n\nproduct_link:{product_link}")

        # Scrape the details page of each product.
        details_page = requests.get(product_link, headers=headers, proxies=proxies, auth=auth).text
        dpsoup = BeautifulSoup(details_page, 'lxml')

        title = dpsoup.find('span', id='productTitle')
        if title is not None:
            title = title.text.strip()
        else:
            title = None
        rating = dpsoup.find('span', id='acrCustomerReviewText')
        if rating is not None:
            rating = rating.text
        else:
            rating = None
        price = dpsoup.find('span', class_='a-size-mini twisterSwatchPrice')
        if price is not None:
            price = price.text
        else:
            price = None
        print(f'\nproduct_link: {product_link}\n\nproduct_title: {title}\n\nproduct_price: {price}\n\nproduct_rating: {rating}\n\n')

        # Scrape the src of every gallery image.
        for url in dpsoup.select('span.a-button-text > img')[3:10]:
            print(f"image link:{url['src']}")
            with io.open("amazon.csv", "a", encoding="utf-8") as f:
                writeFile = csv.writer(f)
                writeFile.writerow([url, product_link, title, rating, price])
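One possible explanation for the HTML in the CSV: the row passed to the CSV writer contains the `<img>` tag object itself, and `csv.writer` stores its string form, whereas the `print` call reads the tag's `src` attribute. The sketch below shows one way to collect the `src` strings first and write a single row per product. The helper names `image_sources` and `append_product_row`, the `" | "` separator, and the example.com URLs are illustrative assumptions, and plain dicts stand in for BeautifulSoup tags so the example runs without a network request.

```python
import csv


def image_sources(img_tags):
    """Return the src strings of the given <img> tags, skipping tags without one.

    Works with any object that supports .get('src'), which BeautifulSoup
    Tag objects do.
    """
    return [tag.get("src") for tag in img_tags if tag.get("src")]


def append_product_row(path, product_link, title, rating, price, image_urls):
    """Append one CSV row per product; all image URLs share a single cell."""
    # newline="" is what the csv module recommends to avoid blank rows on Windows.
    with open(path, "a", encoding="utf-8", newline="") as f:
        writer = csv.writer(f)
        writer.writerow([product_link, title, rating, price, " | ".join(image_urls)])


# Stand-in data (assumption): plain dicts mimic the tags that
# dpsoup.select('span.a-button-text > img') would return.
sample_tags = [
    {"alt": "", "src": "https://example.com/a.jpg"},
    {"alt": "", "src": "https://example.com/b.jpg"},
    {"alt": ""},  # no src attribute, so it is skipped
]
urls = image_sources(sample_tags)
```

In the loop above, this would amount to calling `image_sources(dpsoup.select('span.a-button-text > img')[3:10])` once per product and then `append_product_row("amazon.csv", product_link, title, rating, price, urls)`, instead of writing a row inside the image loop.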


Messages In This Thread
not getting image src in my BeautifulSoup csv file - by farhan275 - Sep-14-2020, 01:26 PM
