Web Scraping Sportsbook Websites - Printable Version

+- Python Forum (https://python-forum.io)
+-- Forum: Python Coding (https://python-forum.io/forum-7.html)
+--- Forum: Web Scraping & Web Development (https://python-forum.io/forum-13.html)
+--- Thread: Web Scraping Sportsbook Websites (/thread-25200.html)
RE: Web Scraping Sportsbook Websites - Khuber79 - Mar-27-2020

Oh, so then I would not need to send click commands? It would automatically expand all JavaScript on the page? That would help a lot, since I could see a ton of bugs arising from sending click commands. I tried, but got an error; not sure I am implementing it correctly:

import requests
import csv
from bs4 import BeautifulSoup
import urllib.request
import random
import re
from selenium import webdriver
import time

chrome_path = r"C:\Users\user\Desktop\chromedriver.exe"

Urls = []
Teams = ''

with open('M:\SportsBooks3.csv') as csvfile:
    readCSV = csv.reader(csvfile, delimiter=',')
    for row in readCSV:
        Urls.append(row)

FD_web = webdriver.Chrome(chrome_path)
FD_web.get(str(Urls[2])[2:-2])

# MapURL Test
FD_web.get(FD_web.mapurl)
time.sleep(2)
source = FD_web.page_source
soup = BeautifulSoup(source, 'lxml')
print(soup)  # Soup should be all expanded HTML

Error:
RE: Web Scraping Sportsbook Websites - Larz60+ - Mar-27-2020

The error is on this line:

FD_web.get(FD_web.mapurl)

Python doesn't know what mapurl is. It has nothing to do with extracting page_source.

RE: Web Scraping Sportsbook Websites - Khuber79 - Mar-27-2020

I am not grasping that line of your code. I assume browser is a variable that is set to your webdriver, like mine is FD_web? What is 'self' in this? And what is mapurl? Sorry for the newbie question, I'm just getting a handle on Python. I tried googling it and came up with nothing.

browser.get(self.mapurl)

RE: Web Scraping Sportsbook Websites - Larz60+ - Mar-27-2020

It's line 24 of your last post: https://python-forum.io/Thread-Web-Scraping-Sportsbook-Websites?pid=108462#pid108462

FD_web.get(FD_web.mapurl)
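[Editor's note] A side point on the earlier code: csv.reader yields each row as a list, so Urls ends up a list of lists, and the str(Urls[2])[2:-2] call only works by slicing off the "['" and "']" text of the list's repr. Indexing the first cell is cleaner and doesn't break if a URL contains quotes. A minimal sketch (the sample URLs are made up):

```python
import csv
import io

# Three one-column rows, like the M:\SportsBooks3.csv file in the post above.
sample = "https://example.com/a\nhttps://example.com/b\nhttps://example.com/c\n"

# csv.reader returns each row as a list of cells, so urls is a list of lists.
urls = [row for row in csv.reader(io.StringIO(sample))]

print(str(urls[2])[2:-2])  # the repr-slicing workaround -> https://example.com/c
print(urls[2][0])          # plain indexing of the first cell, same result
```

Both lines print the same URL, but urls[2][0] says what it means.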
RE: Web Scraping Sportsbook Websites - Khuber79 - Mar-27-2020

(Mar-26-2020, 03:44 AM)Larz60+ Wrote: I often use selenium to expand all of the JavaScript, then switch to Beautifulsoup

I realize my error, I just do not understand this post. I was wondering if you could break down this code for me, in particular the first line.

RE: Web Scraping Sportsbook Websites - Larz60+ - Mar-27-2020

You should just use this part (I'm a bit confused about how you manipulate your pages):

soup = BeautifulSoup(DK_Content_FT, 'lxml')
# and/or
soup1 = BeautifulSoup(DK_Content_HT, 'lxml')

RE: Web Scraping Sportsbook Websites - Khuber79 - Mar-30-2020

Ok, I took a couple of days off from looking at this to do some coding with logical statements comparing the data once I have it downloaded from the sites. Now I'm revisiting this. I realized my error and installed lxml. I assume you just use that instead of html.parser; I will research the benefits of one over the other.

Is there any method to just expand all of the HTML data without sending click commands to the websites? Take this page as an example: https://nj.unibet.com/sports/#filter/football/1006230344

If you click on "FULL TIME", "HALF", "ASIAN LINES", etc., the HTML page gets populated with additional data for everything in the list for that game. I thought you were originally saying that lxml would do this, but that does not appear to be the case when I downloaded that page using lxml and wrote it to a text document. If you inspect the page that I linked above:
The elements that I am trying to expand are all using this:

<li class="KambiBC-bet-offer-category KambiBC-collapsible-container">
    <header class="KambiBC-bet-offer-category__header" data-touch-feedback="true">
        <h2 class="KambiBC-bet-offer-category__title">Full Time</h2>
        <div class="KambiBC-header-meta-wrapper">
            <div class="KambiBC-bet-offer-category__bet-offer-count">12</div>
        </div>
    </header>
</li>

So basically I am trying to automatically expand this element:

<li class="KambiBC-bet-offer-category KambiBC-collapsible-container">

RE: Web Scraping Sportsbook Websites - Whitesox1 - Mar-17-2021

I saw this old post and was looking to do the same thing. I was wondering if you were ever able to figure this out for scraping the live in-play odds spreads.
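[Editor's note] To close the loop on the open question: lxml and html.parser are only parser backends for BeautifulSoup; neither executes JavaScript, so the collapsed markets never appear in the downloaded HTML no matter which parser is used. One approach is to let selenium click every collapsible header first and only then hand page_source to BeautifulSoup. The sketch below is hedged: the expand_categories helper and its CSS selector are assumptions based on the snippet above, not tested against the live Unibet page, and "css selector" is the literal value of selenium's By.CSS_SELECTOR.

```python
from bs4 import BeautifulSoup, FeatureNotFound


def expand_categories(driver, selector="li.KambiBC-collapsible-container header"):
    """Click every collapsible category header, then parse the expanded page.

    `driver` is any selenium WebDriver. The selector is taken from the
    markup quoted above and may need adjusting against the real site.
    """
    for header in driver.find_elements("css selector", selector):
        try:
            header.click()  # expands one "FULL TIME"/"HALF"/... section
        except Exception:
            pass            # already expanded, hidden, or gone stale
    try:
        return BeautifulSoup(driver.page_source, "lxml")
    except FeatureNotFound:
        return BeautifulSoup(driver.page_source, "html.parser")


# Usage sketch (assumes chromedriver is installed and on PATH):
# from selenium import webdriver
# driver = webdriver.Chrome()
# driver.get("https://nj.unibet.com/sports/#filter/football/1006230344")
# soup = expand_categories(driver)
# for title in soup.find_all("h2", class_="KambiBC-bet-offer-category__title"):
#     print(title.get_text(strip=True))
```

Clicking may still need waits (time.sleep or selenium's explicit waits) between the clicks and reading page_source, since the extra rows are inserted asynchronously.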