Python Forum
how to get data in web - Printable Version


how to get data in web - yimchiwai - Mar-29-2019

import requests
import urllib.request
from bs4 import BeautifulSoup
from flask import Flask

with requests.session()as c:
    url="https://cplus.hit.com.hk/enquiry/vesselScheduleEnquiryAction.do"
    vesselName='WAN+HAI+102'
    ETBFrom='29-03-2019'
    ETBTo='11-04-2019'
    query='%E6%90%9C%E7%B4%A2'
    c.get(url)


    new_url = url+"?vesselName="+vesselName+"&ETBFrom="+ETBFrom+"&ETBTo="+ETBTo+"&query="+query

    print(new_url)

    r = urllib.request.urlopen(new_url)
    soup = BeautifulSoup(r,"html.parser")
I am a beginner. Brother, can you help? What is the next step to get the data from the web page below?
https://cplus.hit.com.hk/enquiry/vesselScheduleEnquiryAction.do?vesselName=WAN+HAI+102&ETBFrom=29-03-2019&ETBTo=11-04-2019&query=%E6%90%9C%E7%B4%A2


RE: how to get data in web - nilamo - Mar-29-2019

Looks like you're already getting data from that url.
Are you getting errors?


RE: how to get data in web - snippsat - Mar-29-2019

You are not actually using requests.session, and when you have Requests there is no need to use urllib.
Here is a version with the unused imports removed that, as an example, gets the first row of data from the URL.
import requests
from bs4 import BeautifulSoup

url="https://cplus.hit.com.hk/enquiry/vesselScheduleEnquiryAction.do"
vesselName='WAN+HAI+102'
ETBFrom='29-03-2019'
ETBTo='11-04-2019'
query='%E6%90%9C%E7%B4%A2'

#new_url = url+"?vesselName="+vesselName+"&ETBFrom="+ETBFrom+"&ETBTo="+ETBTo+"&query="+query
# Better
new_url = f'{url}?vesselName={vesselName}&ETBFrom={ETBFrom}&ETBTo={ETBTo}&query={query}'
response = requests.get(new_url)
soup = BeautifulSoup(response.content, 'html.parser')
vessel = soup.find('tr', class_="nobgcolorvalue")
for item in vessel.find_all('td'):
    print(item.text.strip())
Output:
1 WAN HAI 102 S174 HIT4 2019-04-06 2019-04-07
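
As a side note on building the URL: the query values above are already percent-encoded by hand. A minimal sketch of letting the standard library do the encoding instead, assuming the plain values are the vessel name with spaces and the Chinese word for "search" (the names BASE_URL and params are made up for this example):

```python
from urllib.parse import urlencode

BASE_URL = "https://cplus.hit.com.hk/enquiry/vesselScheduleEnquiryAction.do"

# Plain, human-readable values; urlencode() does the escaping.
# Spaces become '+', and non-ASCII text is percent-encoded as UTF-8.
params = {
    "vesselName": "WAN HAI 102",
    "ETBFrom": "29-03-2019",
    "ETBTo": "11-04-2019",
    "query": "搜索",  # assumed to be the label of the page's search button
}

new_url = f"{BASE_URL}?{urlencode(params)}"
print(new_url)
```

This should print the same URL as in the first post. Requests can do the same encoding if you pass the dictionary as requests.get(url, params=params), so the manual string building is not needed at all.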



RE: how to get data in web - yimchiwai - Mar-29-2019

Dear Brother,

Thanks. How about record 2 through the last record?

Many thanks.


RE: how to get data in web - snippsat - Mar-29-2019

(Mar-29-2019, 04:47 PM)yimchiwai Wrote: Thanks. How about record 2 through the last record?
You have to inspect the page; it is normal to use the Chrome/Firefox developer tools to look for the values you need.
E.g., one way to get the second record:
vessel = soup.find('tr', class_="colorvalue")
See also: Web-Scraping part-1


RE: how to get data in web - yimchiwai - Mar-29-2019

(Mar-29-2019, 06:07 PM)snippsat Wrote: E.g., one way to get the second record:
vessel = soup.find('tr', class_="colorvalue")
soup.find only gets the first row. Do I need to use soup.find_all?


RE: how to get data in web - snippsat - Mar-29-2019

(Mar-29-2019, 06:29 PM)yimchiwai Wrote: soup.find only gets the first row. Do I need to use soup.find_all?
You have to try it yourself.
E.g., to get both records:
vessel = soup.find_all('td', class_="body")[1]
for item in vessel.find_all('td')[6:-3]:
    print(item.text.strip())
Output:
1 WAN HAI 102 S174 HIT4 2019-04-06 2019-04-07 2 WAN HAI 102 W174 HIT4 2019-04-06 2019-04-07
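
To make the difference between find and find_all concrete, here is a small sketch that runs without touching the site. The HTML is a made-up, simplified stand-in for the real page; the class names nobgcolorvalue and colorvalue come from the posts above, but that every data row carries one of the two is an assumption.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified markup; the real page has more columns and nesting.
SAMPLE_HTML = """
<table>
  <tr class="header"><td>No.</td><td>Vessel</td><td>Voyage</td></tr>
  <tr class="nobgcolorvalue"><td>1</td><td>WAN HAI 102</td><td>S174</td></tr>
  <tr class="colorvalue"><td>2</td><td>WAN HAI 102</td><td>W174</td></tr>
  <tr class="nobgcolorvalue"><td>3</td><td>WAN HAI 102</td><td>S175</td></tr>
</table>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# find() returns only the first matching tag.
first_row = soup.find("tr", class_="nobgcolorvalue")

# find_all() returns every match; a list for class_ matches either class.
rows = soup.find_all("tr", class_=["nobgcolorvalue", "colorvalue"])

# One list of cell texts per row, in page order.
records = [[td.get_text(strip=True) for td in row.find_all("td")] for row in rows]

for rec in records:
    print(rec)
```

Selecting rows by class like this avoids the [6:-3] slice, which depends on the exact number of header and footer cells on the page.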



RE: how to get data in web - yimchiwai - Mar-30-2019

import requests
from bs4 import BeautifulSoup
 
url="https://cplus.hit.com.hk/enquiry/vesselScheduleEnquiryAction.do"
vesselName='WAN+HAI+102'
ETBFrom='29-03-2019'
ETBTo='11-05-2019'
query='%E6%90%9C%E7%B4%A2'
 
new_url = f'{url}?vesselName={vesselName}&ETBFrom={ETBFrom}&ETBTo={ETBTo}&query={query}'
response = requests.get(new_url)
soup = BeautifulSoup(response.content, 'html.parser')
vessel = soup.find_all('td', class_="body")[1]
record=[]
for item in vessel.find_all('td')[6:-3]:
    record.append(item.text.strip())

for i in range(0,len(record),6):
    print(record[i:i+6])
Output:
['1', 'WAN HAI 102', 'S174', 'HIT4', '2019-04-06', '2019-04-07']
['2', 'WAN HAI 102', 'W174', 'HIT4', '2019-04-06', '2019-04-07']
['3', 'WAN HAI 102', 'S175', 'HIT4', '2019-04-20', '2019-04-21']
['4', 'WAN HAI 102', 'W175', 'HIT4', '2019-04-20', '2019-04-21']
['5', 'WAN HAI 102', 'S176', 'HIT4', '2019-05-04', '2019-05-05']
['6', 'WAN HAI 102', 'W176', 'HIT4', '2019-05-04', '2019-05-05']

Thanks so much, brother!

Any suggestions?


RE: how to get data in web - snippsat - Mar-30-2019

(Mar-30-2019, 05:06 AM)yimchiwai Wrote: any suggestion?
You can build a new list of lists to give the data some structure; then you can access an individual record by its index.
new_record = []
for i in range(0,len(record),6):
    new_record.append(record[i:i+6])
Test:
>>> new_record
[['1', 'WAN HAI 102', 'S174', 'HIT4', '2019-04-06', '2019-04-07'],
 ['2', 'WAN HAI 102', 'W174', 'HIT4', '2019-04-06', '2019-04-07'],
 ['3', 'WAN HAI 102', 'S175', 'HIT4', '2019-04-20', '2019-04-21'],
 ['4', 'WAN HAI 102', 'W175', 'HIT4', '2019-04-20', '2019-04-21'],
 ['5', 'WAN HAI 102', 'S176', 'HIT4', '2019-05-04', '2019-05-05'],
 ['6', 'WAN HAI 102', 'W176', 'HIT4', '2019-05-04', '2019-05-05']]
>>> new_record[2]
['3', 'WAN HAI 102', 'S175', 'HIT4', '2019-04-20', '2019-04-21']
>>> new_record[5]
['6', 'WAN HAI 102', 'W176', 'HIT4', '2019-05-04', '2019-05-05']



RE: how to get data in web - yimchiwai - Mar-30-2019

(Mar-30-2019, 07:37 AM)snippsat Wrote: You can build a new list of lists to give the data some structure; then you can access an individual record by its index.

Brother, how do I access the third field of record 1 (S174)?
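
One possible way to do this, as a sketch: index twice, first for the record and then for the field. The data below is copied from the output above and shortened to two records. The namedtuple and its field names (Row, seq, vessel, voyage, terminal, first_date, second_date) are guesses made up for this example, not names taken from the site.

```python
from collections import namedtuple

# Flat list of cell texts, as built by the scraping loop in the thread
# (shortened here to two records).
record = [
    '1', 'WAN HAI 102', 'S174', 'HIT4', '2019-04-06', '2019-04-07',
    '2', 'WAN HAI 102', 'W174', 'HIT4', '2019-04-06', '2019-04-07',
]

FIELDS_PER_ROW = 6
new_record = [record[i:i + FIELDS_PER_ROW]
              for i in range(0, len(record), FIELDS_PER_ROW)]

# Indexes start at 0: record 1 is new_record[0], its third field is [2].
print(new_record[0][2])

# Optional: give the fields names so the code reads better.
# The field names are hypothetical labels for the six columns.
Row = namedtuple("Row", ["seq", "vessel", "voyage", "terminal",
                         "first_date", "second_date"])
rows = [Row(*fields) for fields in new_record]
print(rows[0].voyage)
```

Both print statements give S174. The plain index form works with the new_record list exactly as built earlier in the thread; the named version only helps readability.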