How to scrape an HTML table only after its data loads using Python Requests?
I am trying to learn data scraping using Python and have been using the Requests and BeautifulSoup4 libraries. This works for normal websites. But when I tried to get data out of websites where the table data loads after a delay, I found that I get an empty table. An example is this webpage.
The script I've tried is the routine one:
    import requests
    from bs4 import BeautifulSoup

    response = requests.get("http://www.oddsportal.com/soccer/england/premier-league/everton-arsenal-tnwxil2o#over-under;2")
    soup = BeautifulSoup(response.text, "html.parser")
    content = soup.find('div', {'id': 'odds-data-portal'})
The data loads into the table odds-data-portal in the page, but the code doesn't give me that. How can I make sure the table is loaded with data first?
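You need to use Selenium to get the HTML, because the table is filled in by JavaScript that Requests never runs. You can, though, continue to use BeautifulSoup to parse it, as follows: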
    from bs4 import BeautifulSoup
    from operator import itemgetter
    from selenium import webdriver

    url = "http://www.oddsportal.com/soccer/england/premier-league/everton-arsenal-tnwxil2o#over-under;2"

    # Let a real browser load the page so its JavaScript can fill the table
    browser = webdriver.Firefox()
    browser.get(url)

    # Parse the rendered HTML rather than the raw server response
    soup = BeautifulSoup(browser.page_source, "html.parser")
    data_table = soup.find('div', {'id': 'odds-data-table'})

    for div in data_table.find_all_next('div', class_='table-container'):
        row = div.find_all(['span', 'strong'])

        # Skip containers with no cells, then print the cells reordered
        if len(row):
            print(','.join(cell.get_text(strip=True) for cell in itemgetter(0, 4, 3, 2, 1)(row)))
This would display:
    Over/Under +0.5,(8),1.04,11.91,95.5%
    Over/Under +0.75,(1),1.04,10.00,94.2%
    Over/Under +1,(1),1.04,11.00,95.0%
    Over/Under +1.25,(2),1.13,5.88,94.8%
    Over/Under +1.5,(9),1.21,4.31,94.7%
    Over/Under +1.75,(2),1.25,3.93,94.8%
    Over/Under +2,(2),1.31,3.58,95.9%
    Over/Under +2.25,(4),1.52,2.59,95.7%
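The itemgetter(0, 4, 3, 2, 1) call in the script is what puts the cells into the column order shown in this output. Here is a minimal sketch of that reordering step on its own, using plain strings instead of parsed tags. The order of the cells in `row` is an assumption, worked backwards from the first printed line; it is not taken from the page itself.

```python
from operator import itemgetter

# Made-up cell texts for one row, in the order they are ASSUMED to
# appear in the page (inferred from the printed output, not verified).
row = ["Over/Under +0.5", "95.5%", "11.91", "1.04", "(8)"]

# itemgetter(0, 4, 3, 2, 1) builds a callable that picks those indices,
# in that order, and returns them as a tuple.
reorder = itemgetter(0, 4, 3, 2, 1)

line = ",".join(reorder(row))
print(line)
```

If the page lays its cells out differently, only the index tuple needs to change.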
Update: as suggested by @jroddynamite, to run headless, PhantomJS can be used instead of Firefox. To do this:
1. Download the PhantomJS Windows binary.
2. Extract the phantomjs.exe executable and ensure it is in your PATH.
3. Change the following line:

    browser = webdriver.PhantomJS()
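On the original question of making sure the table has loaded first: browser.get() returns once the page itself has loaded, but data fetched by JavaScript afterwards may arrive a little later. One way to handle that is to poll until the rows show up. The sketch below is a generic helper using only the standard library; `wait_until` is a hypothetical name, and the check for 'table-container' in the page source is an assumption about what signals "data loaded" on this page. Selenium also ships its own explicit-wait helper, WebDriverWait, for the same purpose.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds
    have passed without success. (Hypothetical helper, not part of
    Selenium or BeautifulSoup.)
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1f s" % timeout)
        time.sleep(poll)


# Possible use with the Selenium session from the answer (assumed check):
#   browser.get(url)
#   wait_until(lambda: 'table-container' in browser.page_source)
#   soup = BeautifulSoup(browser.page_source, "html.parser")
```

Polling with a timeout is preferable to a fixed time.sleep(), since it continues as soon as the data is there and fails loudly if it never arrives.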