How to scrape an HTML table only after its data loads using Python Requests?
I am trying to learn data scraping using Python, and have been using the Requests and BeautifulSoup4 libraries. This works well for normal websites. But when I tried to get data out of websites where the table data loads after some delay, I found that I get an empty table. An example would be this webpage.
The script I've tried is a fairly routine one.
    import requests
    from bs4 import BeautifulSoup

    response = requests.get("http://www.oddsportal.com/soccer/england/premier-league/everton-arsenal-tnwxil2o#over-under;2")
    soup = BeautifulSoup(response.text, "html.parser")
    content = soup.find('div', {'id': 'odds-data-portal'})

The data loads into the table odds-data-portal on the page, but the code doesn't give me that. How can I make sure the table is loaded with data first, and then get it?
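You will need to use Selenium to get the HTML, because Requests only returns the initial page source and does not run the JavaScript that fills in the table. You can still use BeautifulSoup to parse it, as follows: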
    from operator import itemgetter

    from bs4 import BeautifulSoup
    from selenium import webdriver

    url = "http://www.oddsportal.com/soccer/england/premier-league/everton-arsenal-tnwxil2o#over-under;2"

    # Let a real browser load the page so its JavaScript can populate the table
    browser = webdriver.Firefox()
    browser.get(url)
    soup = BeautifulSoup(browser.page_source, "html.parser")

    data_table = soup.find('div', {'id': 'odds-data-table'})

    for div in data_table.find_all_next('div', class_='table-container'):
        row = div.find_all(['span', 'strong'])
        if len(row):
            # Reorder the cells of each row before printing them
            print(','.join(cell.get_text(strip=True) for cell in itemgetter(0, 4, 3, 2, 1)(row)))

This will display:
    over/under +0.5,(8),1.04,11.91,95.5%
    over/under +0.75,(1),1.04,10.00,94.2%
    over/under +1,(1),1.04,11.00,95.0%
    over/under +1.25,(2),1.13,5.88,94.8%
    over/under +1.5,(9),1.21,4.31,94.7%
    over/under +1.75,(2),1.25,3.93,94.8%
    over/under +2,(2),1.31,3.58,95.9%
    over/under +2.25,(4),1.52,2.59,95.7%

Update: as suggested by @jroddynamite, to run this headless, PhantomJS can be used instead of Firefox. To do this:
Download the PhantomJS Windows binary.
Extract the phantomjs.exe executable and ensure it is in your PATH.
Change the following line:
    browser = webdriver.PhantomJS()
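
If the table is still empty when page_source is read, an explicit wait may help. Below is a minimal sketch rather than a tested recipe for this site. It assumes the div.table-container elements only appear once the odds have loaded, and the names fetch_rendered_rows, reorder_cells, timeout and headless are made up for illustration. It uses Firefox's own headless mode, since newer Selenium releases no longer include a PhantomJS driver.

```python
from operator import itemgetter


def reorder_cells(cells):
    """Reorder one row's cells as in the answer above: positions 0, 4, 3, 2, 1."""
    return list(itemgetter(0, 4, 3, 2, 1)(cells))


def fetch_rendered_rows(url, timeout=15, headless=True):
    """Load url in Firefox, wait for the table rows, and return them as lists of text.

    Hypothetical helper: the CSS selector and the 15 second default are assumptions.
    """
    # Imported here so reorder_cells can be used without Selenium or a browser installed
    from bs4 import BeautifulSoup
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.firefox.options import Options
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = Options()
    if headless:
        options.add_argument("-headless")  # run Firefox without opening a window

    browser = webdriver.Firefox(options=options)
    try:
        browser.get(url)
        # Block until at least one row container exists, or raise TimeoutException
        WebDriverWait(browser, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, "div.table-container"))
        )
        soup = BeautifulSoup(browser.page_source, "html.parser")
    finally:
        browser.quit()  # always close the browser, even if the wait times out

    data_table = soup.find("div", {"id": "odds-data-table"})
    if data_table is None:
        return []

    rows = []
    for div in data_table.find_all_next("div", class_="table-container"):
        cells = [cell.get_text(strip=True) for cell in div.find_all(["span", "strong"])]
        if len(cells) >= 5:  # skip containers that do not hold a full row
            rows.append(reorder_cells(cells))
    return rows
```

Calling fetch_rendered_rows(url) with the URL from the question should then give one list per row, which can be printed with ','.join(row). If the wait times out, the selector probably does not match how the page marks its loaded rows and would need adjusting.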