Python Forum
Failure in web scraping by Beautiful Soup
#5
You can run document.cookie in your console to read all the cookies accessible from that location.

Quote:document.write("Not all needed JavaScript methods are supported.<BR>");
Quote:<noscript>JavaScript must be enabled in order to view this page.</noscript>

It's possible they changed their site to rely on JavaScript. If so, that would stop requests in its tracks, because requests only downloads the HTML and never executes scripts. In that case you will need Selenium to accomplish your task instead. It doesn't have to be the main page: any portion of the information you are after could be loaded via JavaScript. I often have to change my scripts as admins change the HTML or add JavaScript to deter bots.
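
One rough way to check for this is to look for a <noscript> fallback in the HTML that requests gets back, like the one quoted above. The sketch below uses only the standard library; the names NoscriptFinder and noscript_messages are made up for illustration, and the sample string just reuses the quoted message. A <noscript> tag is only a hint that the site expects JavaScript, not proof that the data you want is loaded by it.

```python
from html.parser import HTMLParser


class NoscriptFinder(HTMLParser):
    """Collect the text found inside <noscript> elements (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self._depth = 0      # > 0 while the parser is inside a <noscript> element
        self.messages = []   # non-empty text fragments seen inside <noscript>

    def handle_starttag(self, tag, attrs):
        if tag == "noscript":
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "noscript" and self._depth > 0:
            self._depth -= 1

    def handle_data(self, data):
        if self._depth > 0 and data.strip():
            self.messages.append(data.strip())


def noscript_messages(html):
    """Return the list of <noscript> fallback messages found in an HTML string."""
    finder = NoscriptFinder()
    finder.feed(html)
    finder.close()
    return finder.messages


# Sample input built from the message quoted earlier in this post.
sample = (
    "<html><body>"
    "<noscript>JavaScript must be enabled in order to view this page.</noscript>"
    "</body></html>"
)
print(noscript_messages(sample))
```

In a real script you would pass in the text of the response you already fetched instead of the sample string. An empty list does not rule JavaScript out, since a page can load data with scripts without having any <noscript> fallback.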

It is also entirely possible they have detected your bot, since you are making a fair number of requests (one request per entry / per page). On the server side they can rate-limit with iptables to greatly reduce the request volume allowed per source.
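
If request volume is the problem, spacing the requests out is a cheap thing to try. This is only a minimal sketch: fetch_politely and its parameters are hypothetical names, the one-second default is an arbitrary guess rather than anything the site documents, and fetch is whatever function you already use to download one page.

```python
import time


def fetch_politely(urls, fetch, delay_seconds=1.0):
    """Call fetch(url) for each URL, pausing between consecutive requests."""
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            time.sleep(delay_seconds)  # pause before every request except the first
        results.append(fetch(url))
    return results
```

With requests it could be called as fetch_politely(urls, lambda u: requests.get(u).text, delay_seconds=2). Passing the download function in keeps the pacing logic separate from the HTTP library, so the same helper works if you later switch to Selenium.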

However, based on your first post, I would suggest that either cookies or JavaScript is the issue.

By the way, Selenium does get the HTML without much hassle:

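import time

from selenium import webdriver

race_day_url = 'https://racing.hkjc.com/racing/info/meeting/Results/English/Local/'

# Launch a real browser so the page's JavaScript actually runs
browser = webdriver.Firefox()
browser.get(race_day_url)
time.sleep(3)  # crude wait to give the scripts time to render
print(browser.page_source)
browser.quit()  # close the browser when done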


Messages In This Thread
RE: Failure in web scraping by Beautiful Soup - by metulburr - Mar-23-2019, 12:36 PM
