Web Scraping Sportsbook Websites - Printable Version

+- Python Forum (https://python-forum.io)
+-- Forum: Python Coding (https://python-forum.io/forum-7.html)
+--- Forum: Web Scraping & Web Development (https://python-forum.io/forum-13.html)
+--- Thread: Web Scraping Sportsbook Websites (/thread-25200.html)
RE: Web Scraping Sportsbook Websites - Khuber79 - Mar-27-2020

Oh, so then I would not need to send click commands? It would automatically expand all JavaScript on the page? That would help a lot, since I could see a ton of bugs arising from sending click commands. I tried, but got an error; not sure I am implementing it correctly:

import requests
import csv
from bs4 import BeautifulSoup
import urllib.request
import random
import re
from selenium import webdriver
import time

chrome_path = r"C:\Users\user\Desktop\chromedriver.exe"

Urls = []
Teams = ''

with open('M:\SportsBooks3.csv') as csvfile:
    readCSV = csv.reader(csvfile, delimiter=',')
    for row in readCSV:
        Urls.append(row)

FD_web = webdriver.Chrome(chrome_path)
FD_web.get(str(Urls[2])[2:-2])

# MapURL Test
FD_web.get(FD_web.mapurl)
time.sleep(2)
source = FD_web.page_source
soup = BeautifulSoup(source, 'lxml')
print(soup)  # Soup should be all expanded HTML

Error:
RE: Web Scraping Sportsbook Websites - Larz60+ - Mar-27-2020

The error is on this line:

FD_web.get(FD_web.mapurl)

Python doesn't know what mapurl is. It has nothing to do with extracting page_source.

RE: Web Scraping Sportsbook Websites - Khuber79 - Mar-27-2020

I am not grasping that line of your code. I assume browser is a variable that is set to your webdriver, like mine is FD_web? What is 'self' in this? And what is mapurl? Sorry for the newbie question, I'm just getting a handle on Python. I tried googling it and came up with nothing.

browser.get(self.mapurl)

RE: Web Scraping Sportsbook Websites - Larz60+ - Mar-27-2020

It's line 24 of your last post: https://python-forum.io/Thread-Web-Scraping-Sportsbook-Websites?pid=108462#pid108462

FD_web.get(FD_web.mapurl)
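[Editor's note] A side point on the earlier code: csv.reader yields each row as a list, so Urls ends up a list of lists, and the str(Urls[2])[2:-2] call only works by slicing off the "['" and "']" text of the list's repr. Indexing the first cell is cleaner and doesn't break if a URL contains quotes. A minimal sketch (the sample URLs are made up):

```python
import csv
import io

# Three one-column rows, like the M:\SportsBooks3.csv file in the post above.
sample = "https://example.com/a\nhttps://example.com/b\nhttps://example.com/c\n"

# csv.reader returns each row as a list of cells, so urls is a list of lists.
urls = [row for row in csv.reader(io.StringIO(sample))]

print(str(urls[2])[2:-2])  # the repr-slicing workaround -> https://example.com/c
print(urls[2][0])          # plain indexing of the first cell, same result
```

Both lines print the same URL, but urls[2][0] says what it means.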
RE: Web Scraping Sportsbook Websites - Khuber79 - Mar-27-2020

(Mar-26-2020, 03:44 AM)Larz60+ Wrote: I often use selenium to expand all of the JavaScript, then switch to Beautifulsoup

I realize my error, I just do not understand this post. I was wondering if you could break down this code for me, in particular the first line.

RE: Web Scraping Sportsbook Websites - Larz60+ - Mar-27-2020

You should just use this part (I'm a bit confused about how you manipulate your pages):

soup = BeautifulSoup(DK_Content_FT, 'lxml')
# and/or
soup1 = BeautifulSoup(DK_Content_HT, 'lxml')

RE: Web Scraping Sportsbook Websites - Khuber79 - Mar-30-2020

Ok, I took a couple of days off from looking at this to do some coding with logical statements comparing the data once I have it downloaded from the sites. Now I'm revisiting this. I realized my error and installed lxml. I assume you just use that instead of html.parser; I will research the benefits of one over the other.

Is there any method to just expand all of the HTML data without sending click commands to the websites? Take this page as an example: https://nj.unibet.com/sports/#filter/football/1006230344

If you click on "FULL TIME", "HALF", "ASIAN LINES", etc., the HTML page gets populated with additional data for everything in the list for that game. I thought you were originally saying that lxml would do this, but that does not appear to be the case when I downloaded that page using lxml and wrote it to a text document. If you inspect the page that I linked above:
The elements that I am trying to expand are all using this:

<li class="KambiBC-bet-offer-category KambiBC-collapsible-container">
    <header class="KambiBC-bet-offer-category__header" data-touch-feedback="true">
        <h2 class="KambiBC-bet-offer-category__title">Full Time</h2>
        <div class="KambiBC-header-meta-wrapper">
            <div class="KambiBC-bet-offer-category__bet-offer-count">12</div>
        </div>
    </header>
</li>

So basically I am trying to automatically expand this element:

<li class="KambiBC-bet-offer-category KambiBC-collapsible-container">

RE: Web Scraping Sportsbook Websites - Whitesox1 - Mar-17-2021

I saw this old post and was looking to do the same thing. I was wondering if you were ever able to figure this out for scraping the live in-play odds spreads.
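[Editor's note] To close the loop on the open question: lxml and html.parser are only parser backends for BeautifulSoup; neither executes JavaScript, so the collapsed markets never appear in the downloaded HTML no matter which parser is used. One approach is to let selenium click every collapsible header first and only then hand page_source to BeautifulSoup. The sketch below is hedged: the expand_categories helper and its CSS selector are assumptions based on the snippet above, not tested against the live Unibet page, and "css selector" is the literal value of selenium's By.CSS_SELECTOR.

```python
from bs4 import BeautifulSoup, FeatureNotFound


def expand_categories(driver, selector="li.KambiBC-collapsible-container header"):
    """Click every collapsible category header, then parse the expanded page.

    `driver` is any selenium WebDriver. The selector is taken from the
    markup quoted above and may need adjusting against the real site.
    """
    for header in driver.find_elements("css selector", selector):
        try:
            header.click()  # expands one "FULL TIME"/"HALF"/... section
        except Exception:
            pass            # already expanded, hidden, or gone stale
    try:
        return BeautifulSoup(driver.page_source, "lxml")
    except FeatureNotFound:
        return BeautifulSoup(driver.page_source, "html.parser")


# Usage sketch (assumes chromedriver is installed and on PATH):
# from selenium import webdriver
# driver = webdriver.Chrome()
# driver.get("https://nj.unibet.com/sports/#filter/football/1006230344")
# soup = expand_categories(driver)
# for title in soup.find_all("h2", class_="KambiBC-bet-offer-category__title"):
#     print(title.get_text(strip=True))
```

Clicking may still need waits (time.sleep or selenium's explicit waits) between the clicks and reading page_source, since the extra rows are inserted asynchronously.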