You can run
document.cookie
in your console to read all the cookies accessible from that location.

Quote:
document.write("Not all needed JavaScript methods are supported.<BR>");
Quote:
<noscript>JavaScript must be enabled in order to view this page.</noscript>
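If document.cookie does show cookies in the console, you can replay them from a script. A minimal sketch (the helper name is mine, not from the original post):

```python
def parse_cookie_string(cookie_str: str) -> dict:
    """Turn the 'name1=value1; name2=value2' string that
    document.cookie returns into a plain dict, e.g. to pass
    as the cookies= argument of a requests call."""
    cookies = {}
    for pair in cookie_str.split(';'):
        if '=' in pair:
            name, _, value = pair.strip().partition('=')
            cookies[name] = value
    return cookies
```

You would paste the console output in as the string, then send those cookies with every request so the server sees the same session a browser would.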
It's possible they changed their site to include JavaScript. If so, that would stop your requests in their tracks, and you'd need Selenium to accomplish your task instead. It doesn't have to be the main page; any portion of the information you're scraping could be generated via JavaScript. I often have to change my scripts when admins change the HTML or add JavaScript to deter bots.
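One quick way to tell: fetch the page with plain requests and look for markers like the ones quoted above. A rough heuristic sketch (the function name is mine):

```python
def needs_javascript(html: str) -> bool:
    """Heuristic check on a raw HTML response: if the page carries
    a <noscript> warning or the 'JavaScript must be enabled' text,
    the real content is rendered client-side and a plain HTTP
    fetch will never see it."""
    lowered = html.lower()
    return '<noscript>' in lowered or 'javascript must be enabled' in lowered

# Usage sketch (requests assumed available):
# resp = requests.get(race_day_url, headers={'User-Agent': 'Mozilla/5.0'})
# if needs_javascript(resp.text):
#     print('Switch to Selenium')
```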
It is also entirely possible they have detected your bot, since you're making a fair number of requests (one per entry / per page). They can rate-limit in iptables to greatly reduce the allowed request volume per source.
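If rate-limiting is the problem, spacing your requests out client-side usually helps. A hedged sketch (polite_get is a hypothetical helper, not a library function; the delays are guesses you'd tune):

```python
import random
import time

def polite_get(session, url, min_delay=2.0, max_delay=5.0):
    """Sleep a randomized interval before each request so the
    per-source request rate stays low and the traffic pattern
    looks less like a bot hammering the server."""
    time.sleep(random.uniform(min_delay, max_delay))
    return session.get(url)

# Usage sketch (requests assumed available):
# s = requests.Session()
# for url in result_urls:
#     page = polite_get(s, url)
```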
However, based on your first post, I would suggest cookies or JavaScript is the issue.
By the way, using Selenium does get you the HTML without much hassle:
from selenium import webdriver
import time

race_day_url = 'https://racing.hkjc.com/racing/info/meeting/Results/English/Local/'
browser = webdriver.Firefox()
browser.get(race_day_url)
time.sleep(3)  # give the page's JavaScript a moment to render
print(browser.page_source)