OK,
I'm running on fumes lately, so start with the following:
I won't be able to test, so if you have a problem, do not go on to the next step. Just let me know which step fails,
any error messages (verbatim), and what you think might be wrong.
Continue only after the problem has been fixed.
This post will be the go-to point for the entire process. As I add steps, they will be added here, so save the URL.
- From the Windows Start menu, click on Control Panel.
- If necessary, enlarge the window and choose View by: Large icons.
- Select Programs and Features.
- Scroll through the list until you find Python 3.6.5 (or your latest Python 3.6 version).
- Double-click it and follow the uninstall instructions.
- Right-click on the post #32 (or whatever number you see in the upper right corner of this post), click Copy Link Location, open Notepad or a similar window, click in the window and press Ctrl-V. Save this URL where you can find it if you have to return to this post.
- Navigate to the following URL and follow snippsat's instructions for installing python (and cmder if you haven't already installed same):
Part1: https://python-forum.io/Thread-Basic-Par...er-Windows
Part2: https://python-forum.io/Thread-Basic-Par...ight=cmder
Install the latest version, which is 3.6.5. Make sure you install to C:\Python365, and make sure you check the option to add Python to PATH.
- Continue here
- On the 'O' drive, do the following:
- Use Explorer to navigate to the 'O' drive.
- Create a directory named 'python'.
- Important: Open PyCharm. If there are any projects named WellInfo, remove them from PyCharm by clicking on the 'X' in the upper right corner of the project icon.
- Click on Create New Project
- In the top location Box, enter: O:\python\WellInfo
- Expand the arrow to the left of Project Interpreter: New Virtualenv environment
- Click on Existing Interpreter
- Under 'Existing Interpreter', click on the gear at the far right
- Select 'Add Local'
- Click on System Interpreter
- If the Interpreter window does not show C:\Python365\python.exe, either select that interpreter from the pull-down list or, if it is not there, use the '...' button to navigate to C:\Python365\python.exe and click OK
- Click Create
- Click close on Tip of day window (if it shows up)
- With WellInfo Highlighted in Left Pane, from top menu select File-->Settings
- Expand sub menu of Project: WellInfo
- Select Project Interpreter and make sure it shows Python 36 C:\Python365\python.exe
- In the package list below, make sure the following packages are installed: beautifulsoup4, lxml, requests.
- If any are missing, click + on the right, type the package name (make sure it's highlighted in the left pane), and click Install Package.
- Repeat for all missing packages.
- Now in left pane, click on Project Structure
- Right click on O:\python\WellInfo
- Click on new folder and add data
- Right click on O:\python\WellInfo again
- Click on new folder and add src
- Now highlight data, right click and select new folder
- Add command_files
- highlight data again, right click and select new folder
- Add completions
- Highlight src and click on Sources button next to Mark as:
- Click OK
- Continue here; we need a few more directories that I missed:
- Right click on data directory
- Click New-->directory and add reports
- Right click on data directory
- Click New-->directory and add html
- Right click on src
- Choose New-->Python File
- Name it CheckInternet (don't type the .py, it's added for you)
- Copy (do not type) the following code by double-clicking on any line of code and pressing Ctrl-C
import socket


class CheckInternet:
    def __init__(self):
        self.internet_available = False

    def check_availability(self):
        self.internet_available = False
        if socket.gethostbyname(socket.gethostname()) != '127.0.0.1':
            self.internet_available = True
        return self.internet_available


def testit():
    ci = CheckInternet()
    print('Please turn internet OFF, then press Enter')
    input()
    ci.check_availability()
    print(f'ci.internet_available: {ci.internet_available}')
    if not ci.internet_available:
        print(' Off test successful')
    else:
        print(' Off test failed')

    print('Please turn internet ON, then press Enter')
    input()
    ci.check_availability()
    print(f'ci.internet_available: {ci.internet_available}')
    if ci.internet_available:
        print(' On test successful')
    else:
        print(' On test failed')


if __name__ == '__main__':
    testit()
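If the hostname lookup in CheckInternet reports the wrong state on your machine, one possible alternative is to try opening a TCP connection to a well-known host. This is only a minimal sketch and is not part of the module above: the function name internet_available, the default host 8.8.8.8 (Google public DNS), port 53, and the 3 second timeout are all assumptions you may want to change.

```python
import socket


def internet_available(host="8.8.8.8", port=53, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout.

    The host, port and timeout defaults are assumptions for illustration,
    not values taken from the CheckInternet module.
    """
    try:
        # create_connection raises OSError (including timeouts) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling internet_available() with no arguments uses the assumed defaults; a firewall that blocks outbound port 53 would make it report False even with a working connection.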
- Right click on src
- Choose New-->Python File
- Name it FetchCompletions (don't type the .py, it's added for you)
- Copy (do not type) the following code by double-clicking on any line of code and pressing Ctrl-C
import requests
from bs4 import BeautifulSoup
from pathlib import Path
import CheckInternet
import sys


class GetCompletions:
    def __init__(self, infile):
        self.check_network = CheckInternet.CheckInternet()
        self.homepath = Path('.')
        self.rootpath = self.homepath / '..'
        self.datapath = self.rootpath / 'data'
        self.commandpath = self.datapath / 'command_files'
        self.completionspath = self.datapath / 'completions'
        self.htmlpath = self.datapath / 'html'
        self.reportspath = self.datapath / 'reports'

        if self.check_network.check_availability():
            # use: Api_May_27_2018.txt for testing
            # self.infilename = 'Api_May_27_2018.txt'
            self.infilename = input('Please enter api filename: ')

            self.infile = self.commandpath / self.infilename
            self.api = []

            with self.infile.open() as f:
                for line in f:
                    self.api.append(line.strip())

            self.fields = ['Spud Date', 'Total Depth', 'IP Oil Bbls', 'Reservoir Class', 'Completion Date',
                           'Plug Back', 'IP Gas Mcf', 'TD Formation', 'Formation', 'IP Water Bbls']
            self.get_all_pages()
            self.parse_and_save(getpdfs=True)
        else:
            print('Internet access required, and not found.')
            print('Please make Internet available and try again')

    def get_url(self):
        for entry in self.api:
            print("http://wogcc.state.wy.us/wyocomp.cfm?nAPI={}".format(entry[3:10]))
            yield (entry, "http://wogcc.state.wy.us/wyocomp.cfm?nAPI={}".format(entry[3:10]))

    def get_all_pages(self):
        for entry, url in self.get_url():
            print('Fetching main page for entry: {}'.format(entry))
            response = requests.get(url)
            if response.status_code == 200:
                filename = self.htmlpath / 'api_{}.html'.format(entry)
                with filename.open('w') as f:
                    f.write(response.text)
            else:
                print('error downloading {}'.format(entry))

    def parse_and_save(self, getpdfs=False):
        filelist = [file for file in self.htmlpath.iterdir() if file.is_file()]
        for file in filelist:
            with file.open('r') as f:
                soup = BeautifulSoup(f.read(), 'lxml')
            if getpdfs:
                links = soup.find_all('a')
                for link in links:
                    url = link['href']
                    if 'www' in url:
                        continue
                    print('downloading pdf at: {}'.format(url))
                    p = url.index('=')
                    response = requests.get(url, stream=True, allow_redirects=False)
                    if response.status_code == 200:
                        try:
                            header_info = response.headers['Content-Disposition']
                            idx = header_info.index('filename')
                            filename = self.completionspath / header_info[idx+9:]
                        except ValueError:
                            filename = self.completionspath / 'comp{}.pdf'.format(url[p + 1:])
                            print("couldn't locate filename for {} will use: {}".format(file, filename))
                        except KeyError:
                            filename = self.completionspath / 'comp{}.pdf'.format(url[p + 1:])
                            print('got KeyError on {}, response.headers = {}'.format(file, response.headers))
                            print('will use name: {}'.format(filename))
                            print(response.headers)
                        with filename.open('wb') as f:
                            f.write(response.content)
            sfname = self.reportspath / 'summary_{}.txt'.format((file.name.split('_'))[1].split('.')[0][3:10])
            tds = soup.find_all('td')
            with sfname.open('w') as f:
                for td in tds:
                    if td.text:
                        if any(field in td.text for field in self.fields):
                            f.write('{}\n'.format(td.text))
            # Delete html file when finished
            file.unlink()


if __name__ == '__main__':
    GetCompletions('apis.txt')
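The string slicing in FetchCompletions is easy to misread, so here is a small standalone sketch of the three name computations it performs: the page URL built from entry[3:10], the pdf name taken from the Content-Disposition header, and the summary file name. The helper names are hypothetical and do not exist in the module, the API number 4901234567 used in the usage note is a made-up placeholder, and stripping quotes and trailing parameters from the header value is an addition the original code does not do.

```python
WOGCC_URL = "http://wogcc.state.wy.us/wyocomp.cfm?nAPI={}"  # same pattern as get_url()


def completion_url(api_entry):
    """Build the page URL from characters 3..9 of the API number (entry[3:10])."""
    return WOGCC_URL.format(api_entry[3:10])


def filename_from_header(content_disposition, fallback):
    """Return the text after 'filename=' in a Content-Disposition value.

    Hypothetical helper: unlike the module, it also drops surrounding
    quotes and anything after a ';', and returns fallback if no name is found.
    """
    marker = "filename="
    idx = content_disposition.find(marker)
    if idx == -1:
        return fallback
    name = content_disposition[idx + len(marker):].split(";")[0].strip().strip('"')
    return name or fallback


def summary_name(html_name):
    """Mirror the summary file naming: api_<number>.html -> summary_<number[3:10]>.txt"""
    number = html_name.split("_")[1].split(".")[0]
    return "summary_{}.txt".format(number[3:10])
```

With the placeholder number, completion_url('4901234567') ends in nAPI=1234567 and summary_name('api_4901234567.html') gives summary_1234567.txt, which shows that both names come from the same seven characters of the API number.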
- Right click on PyCharm code area for this module and click paste
- Click File-->Save All
- On Left pane, expand data directory.
- Right click on command_files and select New-->File
- Type Api_May_27_2018.txt and click OK
- Add some api numbers (real ones) without quotes, one per line
- When done, click File-->Save All
- Click on FetchCompletions.py tab
- With cursor anywhere in code window:
- On Top Menu click Run-->Run and select FetchCompletions
When done, your pdf files will be in the completions directory, and
a simple run report will be in the reports directory.
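If a run fails with a missing directory or an ImportError, it may help to check the project layout and packages from Python rather than clicking back through PyCharm. The following is a rough sketch under assumptions: the function names are hypothetical, the folder list is taken from the steps above, and bs4 is the import name of the beautifulsoup4 package.

```python
import importlib.util
from pathlib import Path

# Folders created in the PyCharm steps, relative to the project root
SUBDIRS = ["src", "data/command_files", "data/completions", "data/html", "data/reports"]
# Import names for beautifulsoup4, lxml and requests
MODULES = ["bs4", "lxml", "requests"]


def ensure_layout(root):
    """Create any missing project folders and return all folders found under root."""
    root = Path(root)
    for sub in SUBDIRS:
        (root / sub).mkdir(parents=True, exist_ok=True)
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_dir())


def missing_modules(names=MODULES):
    """Return the module names that cannot be found by the current interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

For this project the call would be ensure_layout(r'O:\python\WellInfo'), and missing_modules() should return an empty list when run with the C:\Python365\python.exe interpreter after the package steps are done.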
This concludes setup for the project.
Do these steps (8-40) and let me know when done.
This will need some tweaking later, after I finish my move and have internet reestablished at the new location (which is more remote, so I am hoping I can still get broadband).
Movers will be here Thursday. I have already moved most of the items that I didn't want them to break.